Control Engineering Series Editor William S. Levine Department of Electrical and Computer Engineering University of Maryland College Park, MD 20742-3285 USA
Editorial Advisory Board Okko Bosgra Delft University The Netherlands
William Powers Ford Motor Company (retired) USA
Graham Goodwin University of Newcastle Australia
Mark Spong University of Illinois Urbana-Champaign USA
Petar Kokotovic University of California Santa Barbara USA
Manfred Morari ETH Zurich Switzerland
Iori Hashimoto Kyoto University Kyoto Japan
Huaguang Zhang Derong Liu
Fuzzy Modeling and Fuzzy Control
Birkhäuser Boston • Basel • Berlin
Huaguang Zhang School of Information Science and Engineering Northeastern University Shenyang, Liaoning 110004 People's Republic of China
Derong Liu Department of Electrical and Computer Engineering University of Illinois at Chicago Chicago, IL 60607 U.S.A.
Mathematics Subject Classification: 93C42, 93-02 Library of Congress Control Number: 2006933001 ISBN-10 0-8176-4491-1 ISBN-13 978-0-8176-4491-8
e-ISBN-10 0-8176-4539-7 e-ISBN-13 978-0-8176-4539-7
Printed on acid-free paper. ©2006 Birkhäuser Boston. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Birkhäuser Boston, c/o Springer Science+Business Media LLC, 233 Spring Street, New York, NY, 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. 9 8 7 6 5 4 3 2 1 www.birkhauser.com
(MP)
To Liqin and Pinjia (HGZ)
To Connie and Emilie (DRL)
Contents

Preface

1 Fuzzy Set Theory and Rough Set Theory
  1.1 Classical Set Theory
  1.2 Fuzzy Set Theory
  1.3 Rough Set Theory
  1.4 Summary
  Bibliography

2 Identification of the Takagi-Sugeno Fuzzy Model
  2.1 Introduction
  2.2 Description of the T-S Fuzzy Model
  2.3 An Off-Line Fuzzy Identification Algorithm
  2.4 An Identification Approach with Less Computational Burden
  2.5 Identification Approach for the Generalized T-S Fuzzy Model
  2.6 Summary
  Bibliography

3 Fuzzy Model Identification Based on Rough Set Data Analysis
  3.1 Introduction
  3.2 Preliminaries
  3.3 Input Structure Identification
  3.4 Fuzzy Relation Model Identification
  3.5 ANN Modeling Based on Rough Sets
  3.6 Summary
  Bibliography

4 Identification of the Fuzzy Hyperbolic Model
  4.1 Introduction
  4.2 Fuzzy Hyperbolic Model
  4.3 Generalized Fuzzy Hyperbolic Model
  4.4 Summary
  Bibliography

5 Basic Methods for Fuzzy Inference and Control
  5.1 Introduction
  5.2 Design of a Simple Fuzzy Control System
  5.3 Parameters and Responses of the Simple Fuzzy Control System
  5.4 Fuzzy Self-Tuning Control
  5.5 Simulation Comparison Under Disturbances
  5.6 Robustness of a Fuzzy Self-Tuning Control System
  5.7 Automatic Generation of a Fuzzy State-Action Table
  5.8 Summary
  Bibliography

6 Fuzzy Inference and Control Methods Involving Two Kinds of Uncertainties
  6.1 Introduction
  6.2 Historical Overview and Problem Description
  6.3 Definitions of Several Basic Concepts
  6.4 The Function CF and the Overall Point-Valued THFDP Algorithm
  6.5 Fuzzy Decision-Making of Composite Rules
  6.6 Numerical Examples
  6.7 Fuzzy Control Methods Involving Two Kinds of Uncertainties
  6.8 Summary
  Bibliography

7 Fuzzy Control Schemes via a Fuzzy Performance Evaluator
  7.1 Introduction
  7.2 Fundamentals of a Fuzzy Control Scheme via FPE
  7.3 Fuzzy Adaptive Control Scheme via FPE
  7.4 Fuzzy State Feedback Control Scheme via FPE
  7.5 Fuzzy Control of Nonlinear Systems with Time-Delays via FPE
  7.6 Summary
  Bibliography

8 Multivariable Predictive Control Based on the T-S Fuzzy Model
  8.1 Introduction
  8.2 Preliminaries
  8.3 Equivalent Transformation of the Fuzzy Model
  8.4 Predictive Control Law for Multivariable Processes
  8.5 Stability of a Fuzzy Generalized Predictive Control System
  8.6 Other Performance Analysis
  8.7 Fuzzy Generalized Predictive Control of a Boiler-Turbine Unit
  8.8 Comparison of Fuzzy Predictive Control and Conventional Control
  8.9 Robustness of Fuzzy Generalized Predictive Control System
  8.10 Fuzzy Modeling of Operators' Control Rules with Application
  8.11 Summary
  Bibliography

9 Adaptive Control Methods Based on Fuzzy Basis Function Vectors
  9.1 Introduction
  9.2 Notation and Preliminaries
  9.3 Design of an Adaptive Controller Based on Fuzzy Basis Function Vectors for Multivariable Square Nonlinear Systems
  9.4 Design of an Adaptive Controller Based on Fuzzy Basis Function Vectors for Multivariable Nonsquare Nonlinear Systems
  9.5 Numerical Example
  9.6 Summary
  Bibliography

10 Controller Design Based on the Fuzzy Hyperbolic Model
  10.1 Introduction
  10.2 Stable Controller Design by Pole-Placement Method
  10.3 Nonlinear $H_2$ Optimal Controller Design
  10.4 $H_\infty$ Controller Design
  10.5 Control of Nonlinear Time-Delay Systems with Uncertainties
  10.6 Summary
  Bibliography

11 Fuzzy $H_\infty$ Filter Design for Nonlinear Discrete-Time Systems with Multiple Time-Delays
  11.1 Introduction
  11.2 Modeling of Nonlinear Systems Using the T-S Fuzzy System
  11.3 Fuzzy $H_\infty$ Filtering Analysis Based on the T-S Fuzzy Model
  11.4 Fuzzy $H_\infty$ Filter Design
  11.5 Simulation Example
  11.6 Summary
  Bibliography

12 Chaotification of the Fuzzy Hyperbolic Model
  12.1 Introduction
  12.2 Chaotification by the Impulsive Control Method
  12.3 Chaotification by the Inverse Optimal Control Method
  12.4 Chaotification of the Original System
  12.5 Summary
  Bibliography

13 Feedforward Fuzzy Control Approach Using the Fourier Integral
  13.1 Introduction
  13.2 Problem Formation
  13.3 System Description and Assumptions
  13.4 FSMC Feedback Control Law
  13.5 Adaptive Feedforward Controller Design in the Fourier Space
  13.6 Convergence Conditions of the Global Closed-Loop System
  13.7 Simulation and Comparisons
  13.8 Summary
  Bibliography

Index
Preface

In the present book we concern ourselves exclusively with fuzzy modeling and fuzzy control. Fuzzy logic methodology has proven effective for dealing with complex nonlinear systems with uncertainties that are otherwise difficult to model. Fuzzy rule-based technology has been used in many practical applications, especially in consumer products. However, for complex nonlinear systems, it is not adequate just to control them well with a few fuzzy rules. It is necessary to understand more thoroughly the theory of fuzzy modeling and fuzzy control, and this need motivates the present book.

We present a systematic framework for fuzzy modeling and fuzzy control of nonlinear systems with uncertainties. Based on three types of fuzzy models, i.e., the Mamdani fuzzy model, the Takagi-Sugeno (T-S) fuzzy model, and the fuzzy hyperbolic model (FHM), a number of the most important issues in fuzzy control systems are addressed. These include fuzzy modeling, fuzzy inference, stability analysis, systematic design framework, robustness, and optimality.

The Mamdani fuzzy model is the first working model of fuzzy control systems. It constructs a bridge between the operator's knowledge and IF-THEN rules by fuzzy logic. However, it is difficult to analyze the stability of the Mamdani fuzzy model in theory, which limits further applications to complex nonlinear systems. For the Mamdani fuzzy model, we provide a basic procedure for fuzzy controller design. Also, we analyze the relationship between parameters and control performance. Furthermore, we propose a fuzzy self-tuning control algorithm. The stability and robustness of the proposed fuzzy control system are analyzed in detail. Moreover, concerning the credibility of fuzzy rules, we propose a new fuzzy inference method with two kinds of uncertainties. Accordingly, the control strategy based on two kinds of uncertainties is established.
The T-S fuzzy model starts a new era of rigorous theoretical analysis for fuzzy modeling and control. The universal approximation theory establishes the theoretical foundation for fuzzy modeling. Complex nonlinear systems, which can be modeled by the T-S fuzzy model, can be viewed as a combination of some local linear models. Thus, the complex control task can be divided into several simple local tasks. The complexity and existence of solutions for fuzzy controller design depend on the number and characteristics of local models. For the T-S fuzzy model, we establish a systematic controller design framework for control schemes such as fuzzy model-based generalized predictive control, a fuzzy adaptive control scheme based on fuzzy basis function vectors, a fuzzy control scheme based on a fuzzy performance evaluator, and fuzzy sliding-mode control. In addition, we address the problem of designing an efficient filter for signal estimation in nonlinear discrete-time systems with multiple time delays via the T-S fuzzy model. An approach for designing robust $H_\infty$ fuzzy filters is also provided.

In contrast to the T-S fuzzy model, the FHM is a global fuzzy model whose fuzzy rules are easy to understand. These fuzzy rules can be converted to an overall function with hyperbolic form according to specific fuzzy inference and fuzzy membership functions. A number of fuzzy control schemes are developed for the FHM by taking advantage of nonlinear control systems theory and modern control theory. For the FHM, we establish sufficient conditions for global asymptotic stability. Also, we present the $H_\infty$ and $H_2$ control algorithms based on optimal control theory. Furthermore, we extend the results to nonlinear time-delay systems with parameter uncertainties. The fuzzy hyperbolic guaranteed cost-control scheme is obtained. In order to make the nonlinear system produce the expected chaotic state, we model the original system with the FHM first. Then we design a fuzzy controller based on the FHM to produce expected chaos in the sense of Devaney.

Although fuzzy systems have been proven to be effective for modeling of nonlinear systems, the data-driven identification of fuzzy models alone sometimes leads to complex and unrealistic models. Typically, this will lead to over-parameterization of the model, high dimension, and rule explosion. So we give careful consideration to the questions concerning model complexity, model precision, and computing time. We apply rough sets data analysis (RSDA) to Mamdani fuzzy modeling. Especially for the input structure identification, RSDA is applied to simplify the premise structure using a rough information measure.
Furthermore, we apply artificial neural networks and the genetic algorithm to optimize the structure and parameters of fuzzy models. Moreover, we propose a generalized T-S fuzzy model and a generalized FHM, and establish the universal approximation theory for them.

This book is intended for graduate students and researchers in electrical engineering, computer engineering, computer science, physical sciences, and any of the engineering disciplines, who are interested in the theory and applications of fuzzy logic systems in the modeling and control of nonlinear dynamical systems. It is assumed that the reader has a background in linear algebra, matrix theory, and control theory.

The book is thematically divided into three parts. Part 1 of the book (Chapters 1-4) deals with the modeling of nonlinear dynamical systems using fuzzy logic. Three fuzzy models for nonlinear dynamical systems are introduced, including the Takagi-Sugeno fuzzy model, the fuzzy model based on rough reasoning, and the FHM. Techniques for choosing the fuzzy model structure and identifying fuzzy model parameters are developed in each case and are elaborated in detail.

Part 2 of the book (Chapters 5-9) is concerned with fuzzy inference and control techniques. The basic fuzzy inference and control techniques involve the construction of fuzzy rules that contain only a single kind of fuzziness expressed using the usual IF-THEN rules. By quantifying the strength of confirmation of fuzzy rules, we introduce fuzzy inference and control techniques involving two kinds of fuzziness expressed using IF-THEN rules with a given strength of confirmation. In addition, several other advanced fuzzy control approaches are introduced, including fuzzy performance evaluator-based methods, generalized predictive fuzzy control methods, and adaptive control methods based on fuzzy basis functions.

Part 3 of the book (Chapters 10-13) covers several advanced topics in fuzzy control ranging from $H_\infty$ controller and filter design to chaotification of fuzzy systems and feedforward fuzzy control of nonlinear systems using Fourier integrals. Stable controllers for the FHM are developed based on $H_\infty$ theory and Lyapunov stability theory. Also, $H_\infty$ filter design techniques are developed for noise cancellation and signal estimation for nonlinear systems with or without delays and with unknown bounded disturbances. The chaotification of a nonlinear system is achieved by first chaotifying a fuzzy system that is modeled after the nonlinear system. Adaptive feedforward control schemes using Fourier integrals are developed for improving the tracking performance of closed-loop nonlinear control systems.

A great deal of the material presented in this book is based on research that we conducted with several colleagues and former students, including M. Li, H. L. Liang, S. X. Lun, Y. B. Quan, Q. Y. Sun, G. Wang, Z. L. Wang, Z. S. Wang, J. Yang, and M. J. Zhang. We appreciate the efforts of X. R. Liu, Y. H. Luo, and Y. T. Wei in typing and correcting the manuscript.

Huaguang Zhang, Shenyang, China
Derong Liu, Chicago, USA
Fuzzy Modeling and Fuzzy Control
Chapter 1
Fuzzy Set Theory and Rough Set Theory

In daily life, we use the information we obtain to understand our surroundings, to learn new things, and to make plans for the future. Over the years, we have developed the ability to reason on the basis of evidence in order to achieve our goals. However, since we are restricted by our ability to perceive the world, we find ourselves always confronted by uncertainties about how good our inferences are. Uncertainties are one of the sources from which our errors stem, since we do not know the exact information about our environment. In general, uncertainties result from both the measurement method used when we gain new knowledge and the natural language by which we communicate with others.

To deal with the problem of uncertainty, the theory of probability has been established and has been successfully applied to many areas of science. However, in spite of its success, probability theory is not capable of capturing uncertainties in all manifestations. In particular, probability theory is not capable of capturing uncertainties resulting from the vagueness of linguistic terms in natural language, such as "tall," "warm," "very warm," "rapidly increasing," and the like. As a result, some new uncertainty theories capable of dealing with imprecision and vagueness have been developed. These theories include fuzzy set theory and rough set theory.

This chapter provides introductory material to be used throughout the present book. Fundamental concepts and properties of classical set theory will be reviewed first. Fundamental concepts and principles of fuzzy set theory that are particularly useful in fuzzy modeling and fuzzy control will then be introduced. Finally, basic concepts of rough set theory will be summarized to conclude this chapter.
1.1 Classical Set Theory
1.1.1 Basic Concepts and Notation

A set is a collection of things that can be distinguished from one another as individual elements sharing some common properties. Each individual element in this collection is called a member, or an element, of the set. Throughout this book, we use uppercase letters $A, B, C, \ldots, X, Y, Z$ to denote sets and lowercase letters $a, b, c, \ldots, x, y, z$ to denote elements of a set. If an individual element $a$ belongs to a set $A$, we write this "belonging to" relationship using the notation $a \in A$. The symbol "$\in$" is read as "is an element of." If an element $a$ is not a member of a set $A$, we express this fact by writing $a \notin A$. In classical set theory, there are only two possible relations between an individual element $a$ and a set $A$; that is, either $a \in A$ or $a \notin A$.

The universal set is the set that consists of all the individual elements of interest in a given application. We usually use the letter $U$ to denote the universal set. The empty set, also called the null set, is the set that contains no elements and is denoted by the symbol $\emptyset$. For any set $A$, we say that $\emptyset \subseteq A$ for mathematical convenience.

Assume that $A$ and $B$ are sets. If every member of set $A$ is also a member of set $B$, then $A$ is called a subset of $B$. We use $A \subseteq B$ to represent the fact that $A$ is a subset of $B$. For any set $A$, we have $A \subseteq U$ with the understanding that $A$ and $U$ are from the same application. If $A \subseteq B$ and $B \subseteq A$, then $A$ and $B$ are called equal sets and their equality relationship is represented by $A = B$. To indicate that $A$ and $B$ are not equal, we write $A \neq B$. If both $A \subseteq B$ and $A \neq B$, then $B$ contains at least one individual element that is not an element of $A$. In this case, $A$ is called a proper subset of $B$, and this relationship is represented by the expression $A \subset B$. The set that consists of all possible subsets of a given set $A$ is called the power set of $A$ and is denoted by the symbol $P(A)$.
The complement, or absolute complement, of a given set $A$, denoted by the expression $\neg A$, is the set of all elements in the universal set $U$ that are not in $A$. More precisely, $\neg A = \{x \mid x \in U \text{ and } x \notin A\}$.

The union of set $A$ and set $B$, represented by the expression $A \cup B$, is the set containing all elements belonging either to $A$, to $B$, or to both. More specifically, $A \cup B = \{x \mid x \in A \text{ or } x \in B\}$.

The intersection of set $A$ and set $B$, denoted by $A \cap B$, is the set containing all elements belonging to both set $A$ and set $B$ simultaneously, i.e., $A \cap B = \{x \mid x \in A \text{ and } x \in B\}$.

The difference of set $A$ and set $B$ is the set that consists of all elements of $A$ that do not belong to $B$. The difference set is represented by the expression $A - B$. Formally, $A - B = \{x \mid x \in A \text{ and } x \notin B\}$.
1.1.2 Representations of Classical Sets

The three most common methods to represent or describe a set are the list method, the rule method, and the characteristic function method (also called the membership function method).

Using the list method, we represent a set by enumerating its elements and enclosing them in a pair of braces. For example, $A = \{1, 2, 3, 4, 5\}$.

Using the rule method, a set, say $C$, can be represented in a way that stipulates a rule whereby we can form the desired set: $C = \{x \mid P(x)\}$, where $P(x)$ expresses a property that element $x$ has. This representation indicates that the set $C$ is constituted by elements that all share the property $P$. For example, $C = \{x \mid x \text{ is an integer}\}$.

Let $A$ be a subset of the universal set $U$ and let $x \in U$. Then the characteristic function of $A$, denoted by $\chi_A$, is defined by the following rule:

$$\chi_A(x) = \begin{cases} 1, & \text{if } x \in A; \\ 0, & \text{if } x \notin A. \end{cases}$$

For example, suppose that $U$ is the set of all nonnegative real numbers and $A$ is the set of real numbers from 5 to 10. Then $A$ is a subset of $U$ whose characteristic function is defined for each $x$ by the following rule:

$$\chi_A(x) = \begin{cases} 1, & \text{if } 5 \le x \le 10; \\ 0, & \text{otherwise.} \end{cases}$$
Some useful properties of the characteristic function are listed as follows:

$$\chi_{\neg A}(x) = 1 - \chi_A(x),$$
$$\chi_{A \cap B}(x) = \min[\chi_A(x), \chi_B(x)],$$
$$\chi_{A \cup B}(x) = \max[\chi_A(x), \chi_B(x)],$$

where $\neg A$ is the complement of $A$.
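The characteristic-function rules above can be tried out in a few lines of Python. This is only a sketch: the helper `characteristic` and the second set $B = [8, 12]$ are illustrative assumptions, not taken from the text; $A = [5, 10]$ is the example given above.

```python
def characteristic(predicate):
    """Build chi_S for the crisp set S = {x | predicate(x)} (hypothetical helper)."""
    return lambda x: 1 if predicate(x) else 0

# A = [5, 10] is the example from the text; B = [8, 12] is an assumed second set.
chi_A = characteristic(lambda x: 5 <= x <= 10)
chi_B = characteristic(lambda x: 8 <= x <= 12)

def chi_not_A(x):
    # Complement: chi_{not A}(x) = 1 - chi_A(x)
    return 1 - chi_A(x)

def chi_A_and_B(x):
    # Intersection: chi_{A and B}(x) = min[chi_A(x), chi_B(x)]
    return min(chi_A(x), chi_B(x))

def chi_A_or_B(x):
    # Union: chi_{A or B}(x) = max[chi_A(x), chi_B(x)]
    return max(chi_A(x), chi_B(x))

# Tabulate the four functions on a few sample points of U.
samples = [0, 5, 7.5, 9, 10, 11, 20]
rows = [(x, chi_A(x), chi_not_A(x), chi_A_and_B(x), chi_A_or_B(x)) for x in samples]
```

Because every value is either 0 or 1, `min` and `max` behave exactly like logical AND and OR here, which is why the same formulas can later be reused for fuzzy sets.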
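1.1.3 Basic Properties of Classical Set Operations

Involution: $\neg\neg A = A$.
Commutativity: $A \cap B = B \cap A$, $A \cup B = B \cup A$.
Associativity: $A \cap (B \cap C) = (A \cap B) \cap C$, $A \cup (B \cup C) = (A \cup B) \cup C$.
Distributivity: $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$, $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.
Idempotency: $A \cap A = A$, $A \cup A = A$.
Absorption: $A \cap (A \cup B) = A$, $A \cup (A \cap B) = A$.
Absorption by $\emptyset$ and $U$: $A \cup U = U$, $A \cap \emptyset = \emptyset$.
Identity: $A \cap U = A$, $A \cup \emptyset = A$.
Law of contradiction: $A \cap \neg A = \emptyset$.
Law of excluded middle: $A \cup \neg A = U$.
De Morgan's law: $\neg(A \cap B) = \neg A \cup \neg B$, $\neg(A \cup B) = \neg A \cap \neg B$.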
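A handful of the identities listed above can be spot-checked with Python's built-in `set` type. The universe and the three sets below are arbitrary illustrative choices (assumptions), and a check on one example is of course not a proof.

```python
# Small finite universe and three arbitrary subsets (illustrative assumptions).
U = set(range(10))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {0, 4, 6, 8}

def complement(S):
    """Absolute complement of S relative to the universal set U."""
    return U - S

# Each entry evaluates one identity on the sample sets above.
checks = {
    "involution": complement(complement(A)) == A,
    "distributivity": A & (B | C) == (A & B) | (A & C),
    "absorption": A & (A | B) == A,
    "law of contradiction": A & complement(A) == set(),
    "law of excluded middle": A | complement(A) == U,
    "De Morgan (intersection)": complement(A & B) == complement(A) | complement(B),
    "De Morgan (union)": complement(A | B) == complement(A) & complement(B),
}

for name, holds in checks.items():
    print(f"{name}: {holds}")
```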
1.1.4 Other Concepts

The Cartesian product of two arbitrary sets $A$ and $B$, denoted by $A \times B$, is the set of all possible ordered pairs constructed in such a way that the first element in each pair is a member of $A$ and the second element is a member of $B$. It is formally defined by the equation $A \times B = \{(a, b) \mid a \in A \text{ and } b \in B\}$. The order of the ordered pairs cannot be exchanged. In general, $(a, b) \neq (b, a)$ and $A \times B \neq B \times A$.

A set $A \subseteq U$ is said to be convex if and only if for any $r, s \in A$ and any $\lambda \in [0, 1]$, $\lambda r + (1 - \lambda) s \in A$.
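The Cartesian product and the fact that order matters can be illustrated with `itertools.product` from the Python standard library. The two small sets are assumptions chosen only for the illustration.

```python
from itertools import product

# Two small illustrative sets (assumptions, not from the text).
A = {1, 2}
B = {"a", "b", "c"}

# A x B: every ordered pair whose first entry is from A and second from B.
AxB = set(product(A, B))
# B x A: the same pairs with the order of the entries swapped.
BxA = set(product(B, A))

print(sorted(AxB))
print(AxB == BxA)
```

The two products have the same number of pairs but share no element, since $(1, \text{a})$ and $(\text{a}, 1)$ are different ordered pairs.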
1.2 Fuzzy Set Theory

Fuzzy set theory by itself is a huge field that includes fuzzy measure theory, fuzzy topology, fuzzy algebra, fuzzy analysis, etc. Only a small portion of fuzzy set theory has been applied to engineering problems. In this section, we will introduce concepts and principles of fuzzy set theory [2] that are useful in fuzzy modeling and control [1,5,11,12].
1.2.1 Fundamental Concepts of Fuzzy Set Theory

The overview of classical set theory in the preceding section emphasizes one of its central assumptions: the boundaries of a classical set are required to be drawn precisely and, therefore, set membership is determined with complete certainty. An individual is either definitely a member of the set or definitely not a member of it. However, most sets and propositions are not so neatly characterized in reality. For example, the set of tall people is a set whose exact boundary cannot be precisely determined. To overcome this limitation of classical set theory, the concept of a fuzzy set was introduced [13].

Let $U$ be the universe of discourse or the universal set. A fuzzy set $A$ in $U$ is characterized by a membership function $\mu_A(x)$ that takes values in the interval $[0, 1]$. Therefore, a fuzzy set is a generalization of a classical set by allowing the membership function to take values in the interval $[0, 1]$ instead of just 0 and 1. In other words, the membership function of a classical set can only take two values, 0 and 1, whereas the membership function of a fuzzy set is a continuous function with its range given by $[0, 1]$. We see from the definition that there is nothing "fuzzy" about a fuzzy set; it is simply a set with a continuous membership function. In contrast to fuzzy sets, a set defined in the classical sense in Section 1.1 is also sometimes referred to as a crisp set.

A fuzzy set $A$ in $U$ may be represented as a set of ordered pairs of a generic element $x$ and its membership value; that is, $A = \{(x, \mu_A(x)) \mid x \in U\}$. When $U$ is continuous (for example, $U = \mathbb{R}$), a fuzzy set $A$ is commonly written as $A = \int_U \mu_A(x)/x$, which denotes the collection of all points $x \in U$ with the associated membership function $\mu_A(x)$. On the other hand, when $U$ is discrete, $A$ is commonly written as $A = \sum_U \mu_A(x)/x$, which denotes the collection of all points $x \in U$ with the associated membership function $\mu_A(x)$.
[Figure 1.2.1: Membership function of the fuzzy set "numbers close to 3." Vertical axis: $\mu_{\mathrm{NCT3}}$, from 0 to 1.0; horizontal axis: 0 to 10.]

Example 1.2.1. Let $U$ be the integers from 0 to 10, i.e., $U = \{0, 1, 2, \ldots, 10\}$. Then the fuzzy set "numbers close to 3" may be defined as (using the summation notation)

$$\{\mathrm{NCT3}\} = \sum_U \mu_{\mathrm{NCT3}}(x)/x = 0.1/0 + 0.5/1 + 0.8/2 + 1/3 + 0.8/4 + 0.5/5 + 0.1/6.$$
That is to say, 3 belongs to the fuzzy set "numbers close to 3" with degree of 1, 2 and 4 with degree of 0.8, 1 and 5 with degree of 0.5, 0 and 6 with degree of 0.1, and 7, 8, 9, and 10 with degree of 0. See Figure 1.2.1 for an illustration. □

Example 1.2.2. Let $U$ be the interval $[0, 120]$ representing the age of ordinary humans. Then we may define fuzzy sets "young" and "old" as (using the integral notation)

$$\{\text{young}\} = \int_U \mu_Y(x)/x = \int_0^{25} 1/x + \int_{25}^{120} \mu_Y(x)/x, \qquad \{\text{old}\} = \int_U \mu_O(x)/x = \int_{50}^{120} \mu_O(x)/x,$$

where $\mu_Y(x)$ is a decreasing function of $x - 25$ on $(25, 120]$ and $\mu_O(x)$ is an increasing function of $x - 50$ on $[50, 120]$. See Figure 1.2.2 for illustrations of the two fuzzy sets. □
The support of a fuzzy set $A$ in the universe of discourse $U$ is a crisp set (i.e., a classical set) that contains all the elements of $U$ that have nonzero membership values in $A$, i.e.,

$$\mathrm{Supp}(A) = \{x \in U \mid \mu_A(x) > 0\},$$

where $\mathrm{Supp}(A)$ denotes the support of fuzzy set $A$. For example, the support of the fuzzy set "numbers close to 3" in Figure 1.2.1 is the set of integers $\{0, 1, 2, 3, 4, 5, 6\}$. If the support of a fuzzy set is empty, it is called an empty fuzzy set.
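A discrete fuzzy set such as "numbers close to 3" can be held in a Python dictionary that maps each element to its membership value. The sketch below uses the values of Example 1.2.1; the names `nct3`, `mu`, and `support` are assumptions introduced for illustration.

```python
# Universe of discourse U = {0, 1, ..., 10} from Example 1.2.1.
U = range(11)

# "Numbers close to 3": element -> membership value; elements not listed have degree 0.
nct3 = {0: 0.1, 1: 0.5, 2: 0.8, 3: 1.0, 4: 0.8, 5: 0.5, 6: 0.1}

def mu(fuzzy_set, x):
    """Membership value of x, defaulting to 0 for elements not stored."""
    return fuzzy_set.get(x, 0.0)

def support(fuzzy_set, universe):
    """Crisp set of all elements of the universe with nonzero membership."""
    return {x for x in universe if mu(fuzzy_set, x) > 0}

print(support(nct3, U))
```

Storing only the nonzero entries mirrors the summation notation, which also omits terms with membership 0.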
[Figure 1.2.2: Membership functions of the fuzzy sets "young" and "old." Horizontal axis: 0 to 120.]

A fuzzy singleton is a fuzzy set whose support is a single point in $U$.

An $\alpha$-cut of a fuzzy set $A$ is a crisp set $A_\alpha$ that contains all elements of $U$ that have membership values in $A$ greater than or equal to $\alpha$, i.e.,

$$A_\alpha = \{x \in U \mid \mu_A(x) \ge \alpha\}.$$
For example, for $\alpha = 0.5$, the $\alpha$-cut of the fuzzy set "numbers close to 3" is the crisp set $\{1, 2, 3, 4, 5\}$.

When the universe of discourse $U$ is the $n$-dimensional Euclidean space $\mathbb{R}^n$, the convexity of classical sets can be generalized to fuzzy sets. A fuzzy set $A$ is said to be convex if and only if its $\alpha$-cut $A_\alpha$ is a convex set for any $\alpha$ in the interval $(0, 1]$.

Let $A$ and $B$ be fuzzy sets defined in the same universe of discourse $U$. We say $A$ and $B$ are equal if and only if $\mu_A(x) = \mu_B(x)$ for all $x \in U$. We say that $B$ is contained in $A$, denoted by $B \subseteq A$, if and only if $\mu_A(x) \ge \mu_B(x)$ for all $x \in U$. The complement of a fuzzy set $A$ is a fuzzy set $\neg A$ in $U$ whose membership function is defined as $\mu_{\neg A}(x) = 1 - \mu_A(x)$ for all $x \in U$.
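The $\alpha$-cut, containment, and complement definitions translate directly into code for a discrete universe. The following is a minimal sketch; the function names and the second fuzzy set `very_close` are assumptions made only for the illustration, while `nct3` repeats Example 1.2.1.

```python
U = range(11)
nct3 = {0: 0.1, 1: 0.5, 2: 0.8, 3: 1.0, 4: 0.8, 5: 0.5, 6: 0.1}
# A narrower fuzzy set, assumed here only to illustrate containment.
very_close = {2: 0.5, 3: 1.0, 4: 0.5}

def alpha_cut(fuzzy_set, universe, alpha):
    """Crisp set of elements whose membership is at least alpha."""
    return {x for x in universe if fuzzy_set.get(x, 0.0) >= alpha}

def is_contained(b, a, universe):
    """True if b is contained in a, i.e. mu_a(x) >= mu_b(x) for every x."""
    return all(a.get(x, 0.0) >= b.get(x, 0.0) for x in universe)

def complement(fuzzy_set, universe):
    """Fuzzy complement: mu_{not A}(x) = 1 - mu_A(x) for every x in the universe."""
    return {x: 1.0 - fuzzy_set.get(x, 0.0) for x in universe}

print(alpha_cut(nct3, U, 0.5))
print(is_contained(very_close, nct3, U))
```

Note that the complement is stored for every element of the universe, because elements with degree 0 in $A$ have degree 1 in $\neg A$.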
The union of fuzzy sets $A$ and $B$ is a fuzzy set in $U$, denoted by $A \cup B$, whose membership function is defined as

$$\mu_{A \cup B}(x) = \max[\mu_A(x), \mu_B(x)].$$

The intersection of fuzzy sets $A$ and $B$ is a fuzzy set $A \cap B$ in $U$ with membership function given by

$$\mu_{A \cap B}(x) = \min[\mu_A(x), \mu_B(x)].$$

With the operations of complement, union, and intersection defined above, many of the basic identities of classical set theory can be extended to fuzzy sets, except for the law of excluded middle and the law of contradiction.

Example 1.2.3. Let us return to Example 1.2.1. Let $A = \{\mathrm{NCT3}\}$, i.e.,

$$A = 0.1/0 + 0.5/1 + 0.8/2 + 1/3 + 0.8/4 + 0.5/5 + 0.1/6.$$

We have

$$\neg A = 0.9/0 + 0.5/1 + 0.2/2 + 0.2/4 + 0.5/5 + 0.9/6 + 1/7 + 1/8 + 1/9 + 1/10,$$

$$A \cup \neg A = 0.9/0 + 0.5/1 + 0.8/2 + 1/3 + 0.8/4 + 0.5/5 + 0.9/6 + 1/7 + 1/8 + 1/9 + 1/10.$$

From the above equations, we can see that the law of excluded middle of classical set theory does not hold for fuzzy sets under fuzzy union and fuzzy complement. For example, $\mu_A(2) = 0.8$. We have $\mu_{\neg A}(2) = 0.2$ and $\mu_{A \cup \neg A}(2) = \max[0.8, 0.2] = 0.8$. This means that the element 2 is not a member of $A \cup \neg A$ with full membership and the law of excluded middle is violated, i.e., in this case, $A \cup \neg A \neq U$. Since

$$A \cap \neg A = 0.1/0 + 0.5/1 + 0.2/2 + 0.2/4 + 0.5/5 + 0.1/6,$$

it is clear that the law of contradiction $A \cap \neg A = \emptyset$ of classical set theory does not hold for fuzzy sets under fuzzy intersection and fuzzy complement. For example, we have $\mu_{A \cap \neg A}(2) = \min[0.8, 0.2] = 0.2$. This implies that the element 2 is a member of $A \cap \neg A$ with degree of 0.2 and not with degree of 0 as demanded by the law of contradiction. In this case, $A \cap \neg A \neq \emptyset$. □
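The numbers in Example 1.2.3 can be reproduced with a short Python sketch. The dictionary representation and the variable names are assumptions; `round` is used only to suppress floating-point noise in `1 - 0.8`.

```python
U = range(11)
A = {0: 0.1, 1: 0.5, 2: 0.8, 3: 1.0, 4: 0.8, 5: 0.5, 6: 0.1}

def mu_A(x):
    """Membership of x in A, with degree 0 for elements not listed."""
    return A.get(x, 0.0)

# Fuzzy complement, union, and intersection, evaluated pointwise on U.
not_A = {x: round(1.0 - mu_A(x), 10) for x in U}
A_or_not_A = {x: max(mu_A(x), not_A[x]) for x in U}
A_and_not_A = {x: min(mu_A(x), not_A[x]) for x in U}

# The classical laws would require these to be 1 everywhere and 0 everywhere.
excluded_middle_holds = all(v == 1.0 for v in A_or_not_A.values())
contradiction_holds = all(v == 0.0 for v in A_and_not_A.values())

print(A_or_not_A[2], A_and_not_A[2])
print(excluded_middle_holds, contradiction_holds)
```

Both flags come out false for this fuzzy set, which is the failure of the two classical laws described in the example; they hold only at elements whose membership is exactly 0 or 1, such as 3 and 7 through 10.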
1.2.2 Membership Functions

As already mentioned, one of the principal motivations for introducing fuzzy sets is to represent imprecise concepts. Because an individual's membership in a fuzzy set may admit some uncertainty, we say that its membership is a matter of degree of association. Accordingly, a person is a member of the set "tall people" to the degree to which he or she meets the operating concept of "tall." Alternatively, we can say that the degree of membership of an individual in a fuzzy set expresses the degree of compatibility of the individual with the concept represented by the fuzzy set.
[Figure 1.2.3: Triangular membership function.]
Each fuzzy set is uniquely defined by a membership function. The concept of membership function is very important in fuzzy set theory. Naturally, the immediate question is how to determine the membership function for a given fuzzy set.

There are two approaches to determining a membership function. The first approach is to use the knowledge of human experts. Because fuzzy sets are often used to formulate human knowledge, membership functions represent a part of human knowledge. Usually, this approach can only give a rough formula of the membership function and fine-tuning is required. The second approach is to use data collected from various sensors to determine the membership function. Specifically, we first specify the structure of the membership function and then fine-tune the parameters of the membership function based on the data.

Next, we describe several frequently used membership functions: triangular membership function, normal distribution membership function, and trapezoidal membership function.

Triangular membership function:

$$\mu_A(x) = \begin{cases} \dfrac{x - a}{b - a}, & a < x \ \ldots \\ \ \ldots \\ 0, & \ldots \end{cases}$$

$\ldots\ x > d.$

This membership function is illustrated in Figure 1.2.5.
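The triangular and trapezoidal shapes named above can be sketched in Python as follows. Only the rising branch $(x - a)/(b - a)$ and the upper breakpoint $d$ appear in the text; the falling branches, the parameter `c` of the triangle, and the function names are assumptions based on the usual textbook forms, and both functions assume strictly increasing breakpoints.

```python
def triangular(x, a, b, c):
    """Assumed triangular shape: 0 outside (a, c), peak value 1 at b (requires a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising branch, as in the text
    return (c - x) / (c - b)       # falling branch (assumed form)

def trapezoidal(x, a, b, c, d):
    """Assumed trapezoidal shape: 0 outside (a, d), plateau of 1 on [b, c] (requires a < b <= c < d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    if x <= c:
        return 1.0                 # plateau
    return (d - x) / (d - c)       # falling edge

print(triangular(1, 0, 2, 4), trapezoidal(5, 0, 2, 4, 6))
```

In the data-driven approach described above, the breakpoints `a`, `b`, `c`, and `d` would be the parameters that are fine-tuned once this structure has been fixed.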
1.2.3 Fuzzy Relations

Let $X$ and $Y$ be two arbitrary classical sets. The Cartesian product of $X$ and $Y$, denoted by $X \times Y$, is the (nonfuzzy) set of all ordered pairs $(x, y)$ with $x \in X$ and $y \in Y$; that is, $X \times Y = \{(x, y) \mid x \in X \text{ and } y \in Y\}$. Note that the order in which $X$ and $Y$ appear is important; that is, if $X \neq Y$, then $X \times Y \neq Y \times X$. In general, the Cartesian product of arbitrary $n$ nonfuzzy sets $X_1, X_2, \ldots, X_n$, denoted by $X_1 \times X_2 \times \cdots \times X_n$, is the nonfuzzy set of all $n$-tuples $(x_1, x_2, \ldots, x_n)$ with $x_i \in X_i$ for $i \in \{1, 2, \ldots, n\}$; that is,

$$X_1 \times X_2 \times \cdots \times X_n = \{(x_1, x_2, \ldots, x_n) \mid x_1 \in X_1, x_2 \in X_2, \ldots, x_n \in X_n\}.$$

A nonfuzzy relation among nonfuzzy sets $X_1, X_2, \ldots, X_n$ is a subset of the Cartesian product $X_1 \times X_2 \times \cdots \times X_n$. If we use $Q(X_1, X_2, \ldots, X_n)$ to denote a relation among $X_1, X_2, \ldots, X_n$, then

$$Q(X_1, X_2, \ldots, X_n) \subseteq X_1 \times X_2 \times \cdots \times X_n.$$

As a special case, a relation, or a "binary relation," between the nonfuzzy sets $X$ and $Y$ is a subset of the Cartesian product $X \times Y$.

Example 1.2.4. Let $X = \{1, 2, 3\}$ and $Y = \{2, 3, 4\}$. A relation between $X$ and $Y$ is a subset of $X \times Y$. For example, let $Q(X, Y)$ be a relation named "the first element is no smaller than the second element"; then

$$Q(X, Y) = \{(2, 2), (3, 2), (3, 3)\}.$$ □
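The relation of Example 1.2.4 can be generated by filtering the Cartesian product. This is a small sketch using `itertools.product`; the names `Q` and `mu_Q` are illustrative assumptions, with `mu_Q` playing the role of the 0/1 characteristic function of the relation.

```python
from itertools import product

X = {1, 2, 3}
Y = {2, 3, 4}

# "The first element is no smaller than the second element": keep pairs with x >= y.
Q = {(x, y) for x, y in product(X, Y) if x >= y}

def mu_Q(x, y):
    """Characteristic function of the crisp relation Q: 1 if (x, y) is in Q, else 0."""
    return 1 if (x, y) in Q else 0

print(sorted(Q))
```

Because `Q` is an ordinary Python set of pairs, the usual set operations apply to it unchanged, in line with the remark that a relation is itself a set.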
Because a relation itself is a set, all of the basic set operations can be applied to it without modification. Also, we can use the following membership function to represent a relation:

$$\mu_Q(x_1, x_2, \ldots, x_n) = \begin{cases} 1, & \text{if } (x_1, x_2, \ldots, x_n) \in Q(X_1, X_2, \ldots, X_n); \\ 0, & \text{otherwise.} \end{cases}$$

A fuzzy relation is a fuzzy set defined in the Cartesian product of crisp sets $X_1, X_2, \ldots, X_n$. A fuzzy relation $Q$ in $X_1 \times X_2 \times \cdots \times X_n$ is defined as the fuzzy set

$$Q = \{((x_1, \ldots, x_n), \mu_Q(x_1, \ldots, x_n)) \mid (x_1, \ldots, x_n) \in X_1 \times \cdots \times X_n\},$$

where $\mu_Q : X_1 \times X_2 \times \cdots \times X_n \to [0, 1]$. As a special case, a binary fuzzy relation is a fuzzy set defined in the Cartesian product of two crisp sets. A binary fuzzy relation on a finite Cartesian product is usually represented by a fuzzy relational matrix, which is a matrix whose elements are membership values of corresponding pairs belonging to the fuzzy relation. If for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$, we have $r_{ij} \in [0, 1]$, then the matrix $R = (r_{ij})_{n \times m}$ is called a fuzzy relational matrix.
Section 1.2 Fuzzy Set Theory
Let $R = (r_{ij})_{n \times m}$ and $S = (s_{ij})_{n \times m}$ be two fuzzy relational matrices. Then the operations of union, intersection, and complement are:
$$R \cup S = (r_{ij} \vee s_{ij})_{n \times m}, \qquad R \cap S = (r_{ij} \wedge s_{ij})_{n \times m}, \qquad \bar{R} = (1 - r_{ij})_{n \times m},$$
where $\vee$ denotes the maximum operation (i.e., $r_{ij} \vee s_{ij} = \max\{r_{ij}, s_{ij}\}$) and $\wedge$ denotes the minimum operation. We say that $R$ is contained in $S$, denoted by $R \subseteq S$, if and only if $r_{ij} \le s_{ij}$ for all $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$. If $R \subseteq S$ and $S \subseteq R$, we say that $R$ is equal to $S$.
Example 1.2.5. Let $X$ = {Chicago, Houston, New York} and $Y$ = {Los Angeles, New York}. We want to define the relational concept "close in distance" between two sets of cities. The fuzzy relation can be represented by the following fuzzy relational matrix $P(X, Y)$:
                        Y
X              Los Angeles    New York
Chicago            0.2           0.7
Houston            0.35          0.3
New York           0.1           1
□
1.2.4 Projections and Cylindric Extensions
Starting from a crisp relation that is defined in the product space of two sets, the concepts of projection and cylindric extension can be defined. For example, consider the set $A = \{(x, y) \in \mathbb{R}^2 \mid (x-1)^2 + (y-1)^2 \le 1\}$, which is a relation in $X \times Y = \mathbb{R}^2$. Then the projection of $A$ on $X$ is $A_1 = [0, 2] \subset X$, and the projection of $A$ on $Y$ is $A_2 = [0, 2] \subset Y$; see Figure 1.2.6. The cylindric extension of $A_1$ to $X \times Y = \mathbb{R}^2$ is $A_{1E} = [0, 2] \times (-\infty, \infty) \subset \mathbb{R}^2$. These concepts can be extended to fuzzy relations. Let $Q$ be a fuzzy relation in $X_1 \times \cdots \times X_n$ and $\{i_1, \ldots, i_k\}$ be a subsequence of $\{1, 2, \ldots, n\}$. Then the projection of $Q$ on $X_{i_1} \times \cdots \times X_{i_k}$ is a fuzzy relation $Q_P$ in $X_{i_1} \times \cdots \times X_{i_k}$ defined by the membership function
$$\mu_{Q_P}(x_{i_1}, \ldots, x_{i_k}) = \max_{x_{j_1}, \ldots, x_{j_{(n-k)}}} \mu_Q(x_1, \ldots, x_n),$$
where $\{x_{j_1}, \ldots, x_{j_{(n-k)}}\}$ is the complement of $\{x_{i_1}, \ldots, x_{i_k}\}$ with respect to $\{x_1, \ldots, x_n\}$. As a special case, if $Q$ is a binary fuzzy relation in $X \times Y$, then the projection of $Q$ on $X$, denoted by $Q_1$, is a fuzzy set in $X$ defined by
$$\mu_{Q_1}(x) = \max_{y \in Y} \mu_Q(x, y).$$
Note that the above formula is still valid if $Q$ is a crisp relation. For example, if the crisp relation is the $A$ in Figure 1.2.6, then its projection $Q_1$ by the formula is
Figure 1.2.6: Projections and cylindric extensions of a relation (where $A_{1E}$ = shaded region).
equal to the $A_1$ in Figure 1.2.6. Hence, the projection of a fuzzy relation is a natural extension of the projection of a crisp relation.
Example 1.2.6. By definition, the projections of the fuzzy relation in Example 1.2.5 on $X$ and $Y$ are the fuzzy sets
$$Q_1 = \frac{0.7}{\text{Chicago}} + \frac{0.35}{\text{Houston}} + \frac{1}{\text{NY}} \quad \text{and} \quad Q_2 = \frac{0.35}{\text{LA}} + \frac{1}{\text{NY}},$$
respectively.
□
The projection constrains a fuzzy relation to a subspace; on the other hand, the cylindric extension extends a fuzzy relation (or fuzzy set) from a subspace to the whole space. Let $Q_P$ be a fuzzy relation in $X_{i_1} \times \cdots \times X_{i_k}$ and $\{i_1, \ldots, i_k\}$ be a subsequence of $\{1, 2, \ldots, n\}$. Then the cylindric extension of $Q_P$ to $X_1 \times \cdots \times X_n$ is a fuzzy relation $Q_{PE}$ in $X_1 \times \cdots \times X_n$ defined by
$$\mu_{Q_{PE}}(x_1, \ldots, x_n) = \mu_{Q_P}(x_{i_1}, \ldots, x_{i_k}).$$
As a special case, if $Q_1$ is a fuzzy set in $X$, then the cylindric extension of $Q_1$ to $X \times Y$ is a fuzzy relation $Q_{1E}$ in $X \times Y$ defined by
$$\mu_{Q_{1E}}(x, y) = \mu_{Q_1}(x).$$
The definition of cylindric extensions is also valid for crisp relations.
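Example 1.2.7. Consider the projections $Q_1$ and $Q_2$ in Example 1.2.6. According to the definition of cylindric extensions, their cylindric extensions to $X \times Y$ are
$$Q_{1E} = \frac{0.7}{(\text{Chicago, LA})} + \frac{0.7}{(\text{Chicago, NY})} + \frac{0.35}{(\text{Houston, LA})} + \frac{0.35}{(\text{Houston, NY})} + \frac{1}{(\text{NY, LA})} + \frac{1}{(\text{NY, NY})},$$
$$Q_{2E} = \frac{0.35}{(\text{Chicago, LA})} + \frac{0.35}{(\text{Houston, LA})} + \frac{0.35}{(\text{NY, LA})} + \frac{1}{(\text{Chicago, NY})} + \frac{1}{(\text{Houston, NY})} + \frac{1}{(\text{NY, NY})}.$$
□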
From Examples 1.2.6 and 1.2.7, we see that when we take the projection of a fuzzy relation and then cylindrically extend it, we obtain a fuzzy relation that is larger than the original one.
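The observation above can be checked numerically. The following is a minimal sketch, assuming the relational matrix of Example 1.2.5 is stored as a list of rows; the helper names (`project_on_x`, `extend_from_x`, `contains`, and so on) are our own and only illustrate the definitions of projection, cylindric extension, and containment.

```python
# Relational matrix P(X, Y) of Example 1.2.5.
# Rows: Chicago, Houston, New York. Columns: Los Angeles, New York.
P = [[0.2, 0.7],
     [0.35, 0.3],
     [0.1, 1.0]]

def project_on_x(rel):
    """Projection on X: maximum over y of each row."""
    return [max(row) for row in rel]

def project_on_y(rel):
    """Projection on Y: maximum over x of each column."""
    return [max(col) for col in zip(*rel)]

def extend_from_x(q1, n_cols):
    """Cylindric extension of a fuzzy set in X to X x Y."""
    return [[v] * n_cols for v in q1]

def extend_from_y(q2, n_rows):
    """Cylindric extension of a fuzzy set in Y to X x Y."""
    return [list(q2) for _ in range(n_rows)]

def contains(big, small):
    """True if `small` is contained in `big` (elementwise <=)."""
    return all(s <= b for rb, rs in zip(big, small) for b, s in zip(rb, rs))

Q1 = project_on_x(P)
Q2 = project_on_y(P)
Q1E = extend_from_x(Q1, n_cols=2)
Q2E = extend_from_y(Q2, n_rows=3)

print(Q1, Q2)
print(contains(Q1E, P), contains(Q2E, P))
```

Both containment checks succeed because each extended entry is a maximum taken over a set that includes the original entry.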
1.2.5 Composition of Fuzzy Relations
The composition of fuzzy relations $P(X, Y)$ and $Q(Y, Z)$, denoted by $P \circ Q$, is defined as a fuzzy relation in $X \times Z$ whose membership function is given by
$$\mu_{P \circ Q}(x, z) = \max_{y \in Y} \min[\mu_P(x, y), \mu_Q(y, z)],$$
where $(x, z) \in X \times Z$.
Example 1.2.8. Let $X$ and $Y$ be defined as in Example 1.2.5. Define another set $Z$ = {Detroit, Philadelphia}. Let $P(X, Y)$ denote the fuzzy relation "close in distance." Define the fuzzy relation "far away from each other" in $Y \times Z$, denoted by $Q(Y, Z)$, by the fuzzy relational matrix $Q(Y, Z)$
0.9 0.25
We can use fuzzy relational matrices and the matrix product to compute $P \circ Q$: write out each element of the matrix product $PQ$, but treat each multiplication as a min operation and each addition as a max operation. We get $P \circ Q$
0.2 0.35 0.1
0.7 0.3 1
0.9 0.25
0.25 0.35 0.25
0.2 0.35 0.1
□
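The "matrix product with min and max" recipe can be sketched as follows. This is an illustration only: `max_min_compose` is a hypothetical helper name, and the one-column matrix `Q_col` with entries 0.9 and 0.25 is an assumption made here so that the sketch stays small; it is not the full relation $Q(Y, Z)$ of the example.

```python
def max_min_compose(P, Q):
    """Max-min composition: (P o Q)[i][k] = max_j min(P[i][j], Q[j][k])."""
    return [[max(min(p, q) for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

# "Close in distance" matrix from Example 1.2.5 (rows: Chicago, Houston, New York)
P = [[0.2, 0.7],
     [0.35, 0.3],
     [0.1, 1.0]]

# Assumed single column of relational values (rows: Los Angeles, New York)
Q_col = [[0.9],
         [0.25]]

print(max_min_compose(P, Q_col))  # [[0.25], [0.35], [0.25]]
```

For the first row, for instance, the entry is max(min(0.2, 0.9), min(0.7, 0.25)) = max(0.2, 0.25) = 0.25.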
1.2.6 The Extension Principle
The extension principle is a basic identity that allows the domain of a function to be extended from crisp points in $X$ to fuzzy sets in $X$. More specifically, let $f : X \to Y$ be a function from crisp set $X$ to crisp set $Y$. Suppose that a fuzzy set $A$
in $X$ is given and we want to determine a fuzzy set $B = f(A)$ in $Y$ which is induced by $f$. If $f$ is a one-to-one mapping, then we can define
$$\mu_B(y) = \mu_A[f^{-1}(y)], \qquad y \in Y,$$
where $f^{-1}(y)$ is the inverse of $f$, i.e., $f[f^{-1}(y)] = y$. If $f$ is not one-to-one, then an ambiguity arises when two or more distinct points in $X$ with different membership values in $A$ are mapped to the same point in $Y$. For example, we may have $f(x_1) = f(x_2) = y$ but $x_1 \neq x_2$ and $\mu_A(x_1) \neq \mu_A(x_2)$. Thus, the right-hand side of the above equation may take two different values, $\mu_A[x_1 = f^{-1}(y)]$ or $\mu_A[x_2 = f^{-1}(y)]$. To resolve this ambiguity, we assign the larger one of the two membership values to $\mu_B(y)$. In general, the membership function for $B$ is defined as
$$\mu_B(y) = \max_{x \in f^{-1}(y)} \mu_A(x), \qquad y \in Y,$$
where $f^{-1}(y)$ denotes the set of all points $x \in X$ such that $f(x) = y$. The above expression is called the extension principle.
Example 1.2.9. Let $X = \{-5, -4, \ldots, 0, 1, \ldots, 5\}$ and $f(x) = x^2$. Let {PS} denote the fuzzy set "positively small" in $X$ defined by
$$\{PS\} = \frac{0.4}{-3} + \frac{0.5}{-2} + \frac{0.7}{-1} + \frac{0.8}{0} + \frac{1}{1} + \frac{0.9}{2} + \frac{0.7}{3} + \frac{0.5}{4}.$$
Then, we have
$$f(\{PS\}) = \frac{0.8}{0} + \frac{1}{1} + \frac{0.9}{4} + \frac{0.7}{9} + \frac{0.5}{16}.$$
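The extension principle for a discrete fuzzy set can be sketched as below. The function name `extend` and the dictionary representation are our own choices. The membership values follow Example 1.2.9, but the assignment of 0.4, 0.5, and 0.7 to the points -3, -2, and -1 is an assumption on our part; the values at 0 through 4 are fixed by the stated result.

```python
def extend(f, A):
    """Extension principle: mu_B(y) = max of mu_A(x) over all x with f(x) = y."""
    B = {}
    for x, mu in A.items():
        y = f(x)
        B[y] = max(B.get(y, 0.0), mu)
    return B

# "Positively small" on X = {-5, ..., 5}; points not listed have membership 0.
# The supports -3, -2, -1 are assumed (see the note above).
PS = {-3: 0.4, -2: 0.5, -1: 0.7, 0: 0.8, 1: 1.0, 2: 0.9, 3: 0.7, 4: 0.5}

B = extend(lambda x: x * x, PS)
print(sorted(B.items()))
```

Here the points -2 and 2 both map to 4, so the membership of 4 is max(0.5, 0.9) = 0.9, which is the ambiguity-resolution rule described above.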
1.2.7 Basic Concept of Fuzzy Systems
A fuzzy rule base consists of a collection of fuzzy IF-THEN rules in the following form:
$$R^l: \ \text{IF } x_1 \text{ is } F_1^l, \ldots, \text{ and } x_n \text{ is } F_n^l, \ \text{THEN } y \text{ is } G^l \qquad (1.2.1)$$
where $F_i^l$ and $G^l$ are fuzzy sets in $U_i \subset \mathbb{R}$ and $V \subset \mathbb{R}$, respectively, and $x = (x_1, \ldots, x_n)^T \in U_1 \times \cdots \times U_n$ and $y \in V$ are linguistic variables. Let $M$ be the number of fuzzy IF-THEN rules in the form of (1.2.1) in the fuzzy rule base; that is, $l = 1, 2, \ldots, M$ in (1.2.1). $x$ and $y$ are the input and output to the fuzzy logic system, respectively. Without loss of generality, we consider multi-input single-output fuzzy logic systems, because a multi-input multi-output system can always be decomposed into a group of multi-input single-output systems.
A. Product-Inference Rule
Fuzzy inference is sometimes called fuzzy reasoning or approximate reasoning. It is used in a fuzzy rule to determine the outcome from the given input information. Fuzzy rules represent control strategy or modeling knowledge/experiences. When
specific information is assigned to input variables in the rule antecedent, fuzzy inference is needed to calculate the outcome for the output variable(s) in the rule consequent. A fuzzy IF-THEN rule (1.2.1) is interpreted as a fuzzy implication $F_1^l \times \cdots \times F_n^l \to G^l$ in $U \times V$. Let a fuzzy set $A'$ in $U$ be the input to the fuzzy inference engine; then each fuzzy IF-THEN rule (1.2.1) determines a fuzzy set $B^l$ in $V$. That is,
$$\mu_{B^l}(y) = \sup_{x \in U} \left[ \mu_{F_1^l \times \cdots \times F_n^l \to G^l}(x, y)\, \mu_{A'}(x) \right]. \qquad (1.2.2)$$
A fuzzy IF-THEN rule (1.2.1) can be interpreted in a number of ways, such as the mini-operation rule, the product-inference rule, and the max-min rule. Here, we show the most commonly used interpretation of the fuzzy IF-THEN rule, i.e., the product-inference rule, defined as:
$$\mu_{F_1^l \times \cdots \times F_n^l \to G^l}(x, y) = \mu_{F_1^l}(x_1) \cdots \mu_{F_n^l}(x_n)\, \mu_{G^l}(y).$$
B. Singleton Fuzzifier
The fuzzifier maps crisp points in $U$ to fuzzy sets in $U$. The singleton fuzzifier is defined as follows:
$$\mu_{A'}(x) = \begin{cases} 1, & x = x^*; \\ 0, & \text{otherwise,} \end{cases}$$
which means that $A'$ is a fuzzy singleton with support $x^*$, that is, $\mu_{A'}(x) = 1$ for $x = x^*$ and $\mu_{A'}(x) = 0$ for all other $x \in U$ with $x \neq x^*$.
C. Gravity Center Defuzzifier
Defuzzification is a mathematical process used to convert a fuzzy set or fuzzy sets to a crisp point. It is a necessary step because fuzzy sets generated by fuzzy inference in fuzzy rules must be somehow mathematically combined to come up with one single number as the output of a fuzzy controller or model. After all, an actuator for control systems can accept only one value as its input signal. There are some common defuzzification techniques: mean of maximum, gravity center, linear method, etc. Here, we show the most popular defuzzification technique, the gravity center defuzzifier, defined as:
$$y = \frac{\sum_{l=1}^{M} \bar{y}^l \, \mu_{B^l}(\bar{y}^l)}{\sum_{l=1}^{M} \mu_{B^l}(\bar{y}^l)},$$
where $\bar{y}^l$ is the center of the fuzzy set $G^l$, that is, the point in $V$ at which $\mu_{G^l}(y)$ achieves its maximum value, and $\mu_{B^l}(\bar{y}^l)$ is given by (1.2.2).
Example 1.2.10. Let $\{y\} = \{0.1/2 + 0.8/3 + 1.0/4 + 0.8/5 + 0.1/6\}$. Using the gravity center defuzzifier method, we have
$$y = \frac{0.1 \times 2 + 0.8 \times 3 + 1.0 \times 4 + 0.8 \times 5 + 0.1 \times 6}{0.1 + 0.8 + 1.0 + 0.8 + 0.1} = 4.$$
□
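The calculation in Example 1.2.10 can be reproduced with a short sketch. The function name `gravity_center` and its argument names are our own; the code simply evaluates the weighted average of the centers with the membership values as weights.

```python
def gravity_center(centers, heights):
    """Weighted average of the centers, weighted by their membership values."""
    total = sum(heights)
    if total == 0:
        raise ValueError("all membership values are zero; output is undefined")
    return sum(c * h for c, h in zip(centers, heights)) / total

# Data of Example 1.2.10: 0.1/2 + 0.8/3 + 1.0/4 + 0.8/5 + 0.1/6
y = gravity_center([2, 3, 4, 5, 6], [0.1, 0.8, 1.0, 0.8, 0.1])
print(round(y, 6))
```

The guard against a zero denominator is a design choice of this sketch: it corresponds to the case in which no rule fires, for which the formula gives no output.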
1.2.8 Fuzzy Logic and Fuzzy Reasoning
Logic is the study of methods and principles of reasoning, where reasoning means obtaining new propositions from existing propositions. In classical logic, propositions are required to be either true or false; that is, the truth value of a proposition is either 0 or 1. Fuzzy logic generalizes classical two-valued logic by allowing the truth value of a proposition to be any number in [0,1]. This generalization allows us to perform fuzzy reasoning, also called approximate reasoning; that is, deducing imprecise conclusions (fuzzy propositions) from a collection of imprecise premises (fuzzy propositions). In this section, we first introduce some basic concepts and principles in classical logic and then study their generalizations to fuzzy logic.
In classical logic, the relationships between propositions are usually represented by a truth table. The fundamental truth tables for conjunction $\wedge$ (i.e., the logic "AND" operation), disjunction $\vee$ (i.e., the logic "OR" operation), implication $\to$, equivalence $\leftrightarrow$, and negation $\neg$ are shown in Table 1.2.1, where the symbols T and F denote true and false, respectively. Given $n$ basic propositions $p_1, p_2, \ldots, p_n$, a new proposition can be defined by a function that assigns a particular truth value to the new proposition for each combination of truth values of the given propositions. The new proposition is usually called a logic function. Because $n$ propositions can assume $2^n$ possible combinations of truth values, there are $2^{2^n}$ possible logic functions defined by $n$ propositions. Because $2^{2^n}$ is a huge number for large $n$, a key issue in classical logic is to express all the logic functions with only a few basic logic operations; such basic logic operations constitute a complete set of primitives. The most commonly used complete set of primitives is negation $\neg$, conjunction $\wedge$, and disjunction $\vee$.
By combining $\neg$, $\vee$ and $\wedge$ in appropriate algebraic expressions, referred to as logic formulas, we can form any other logic function. Logic formulas are defined as follows:
(a) The truth values 0 (F) and 1 (T) are logic formulas;
(b) If $p$ is a proposition, then $p$ and $\neg p$ are logic formulas;
(c) If $p$ and $q$ are logic formulas, then $p \vee q$ and $p \wedge q$ are also logic formulas;
(d) The only logic formulas are those defined by (a)-(c).
When the proposition represented by a logic formula is always true regardless of the truth values of the basic propositions involved in the formula, it is called a tautology; when it is always false, it is called a contradiction. Various forms of tautologies can be used for making deductive inferences. They are referred to as inference rules.
Table 1.2.1: Truth table of five operations
p    q    p ∧ q    p ∨ q    p → q    p ↔ q    ¬p
T    T      T        T        T        T       F
T    F      F        T        F        F       F
F    T      F        T        T        F       T
F    F      F        F        T        T       T
The most commonly used inference rules are those listed in Table 1.2.2. In the table, each tautology involves one or two propositions that are used to make a deductive inference. For example, "constructive dilemma" involves the following two propositions: (1) $(p \to q) \wedge (r \to s)$; and (2) $p \vee r$. From these two propositions, we can deduce a new proposition given by $q \vee s$. Sometimes, however, the premises of an inference have a logical form to which no inference rule can be applied. Here is an illustration of such a situation:
(1) $\neg(p \vee q)$
(2) $r \to q$
In the above, to reach the conclusion, we need to dissolve a negated disjunction and to detach the implication, for which the basic inference rules shown in Table 1.2.2 cannot be applied. To conduct inferences like this one, we will have to use the "rules of replacement," with which we can change the original forms of our premises so that our basic inference rules can be applied to them. Table 1.2.3 shows a set of rules of replacement. Returning to our inference above, we can see that the first premise $\neg(p \vee q)$ may be rewritten as its logically equivalent (synonymous) expression
$$\neg p \wedge \neg q$$
and the inference now proceeds easily.
Before we start a survey on the various inference rules in fuzzy reasoning, we will make some remarks about knowledge representation. The fundamental knowledge representation unit in fuzzy reasoning is the notion of linguistic variables. In our daily life, words are often used to describe variables. For example, in the sentence "the speed of the car is fast," the word "fast" is used to describe the variable "the speed of the car." Roughly speaking, if a variable can take words in natural languages as its values, it is called a linguistic variable. Now, the question is how to formulate these words in mathematical terms. Here we use fuzzy sets to characterize words. In the fuzzy set theory literature, a formal definition of linguistic variables is usually employed [13,14], given as follows.
Definition 1.2.1. A linguistic variable is characterized by $(X, T, V, M)$, where $X$ is the name of the linguistic variable; $T$ is the set of linguistic values that $X$ can take; $V$ is the actual physical domain in which the linguistic variable $X$ takes its quantitative (crisp) values; and $M$ is a semantic rule that relates each linguistic value in $T$ with a fuzzy set in $V$. □
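Table 1.2.2: Basic inference forms
Conjunction:            (1) $p$; (2) $q$; $\therefore p \wedge q$
Addition:               (1) $p$; $\therefore p \vee q$
Simplification:         (1) $p \wedge q$; $\therefore p$
Disjunctive Syllogism:  (1) $p \vee q$; (2) $\neg p$; $\therefore q$
Modus Ponens:           (1) $p \to q$; (2) $p$; $\therefore q$
Constructive Dilemma:   (1) $(p \to q) \wedge (r \to s)$; (2) $p \vee r$; $\therefore q \vee s$
Hypothetical Syllogism: (1) $p \to q$; (2) $q \to r$; $\therefore p \to r$
Modus Tollens:          (1) $p \to q$; (2) $\neg q$; $\therefore \neg p$
Destructive Dilemma:    (1) $(p \to q) \wedge (r \to s)$; (2) $\neg q \vee \neg s$; $\therefore \neg p \vee \neg r$
Absorption:             (1) $p \to q$; $\therefore p \to (p \wedge q)$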
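That an inference form is valid means that the implication "premises imply conclusion" is a tautology, and for a handful of propositions this can be checked by enumerating the truth table. The sketch below does so for a few of the inference forms; `implies` and `is_tautology` are hypothetical helper names of our own, and the last formula (affirming the consequent) is included only as a contrast, since it is not a valid form.

```python
from itertools import product

def implies(a, b):
    """Classical implication: a -> b is false only when a is true and b is false."""
    return (not a) or b

def is_tautology(formula, n_vars):
    """True if `formula` holds for every assignment of truth values."""
    return all(formula(*vals) for vals in product([True, False], repeat=n_vars))

modus_ponens = lambda p, q: implies(implies(p, q) and p, q)
modus_tollens = lambda p, q: implies(implies(p, q) and (not q), not p)
hypothetical_syllogism = lambda p, q, r: implies(
    implies(p, q) and implies(q, r), implies(p, r))
constructive_dilemma = lambda p, q, r, s: implies(
    implies(p, q) and implies(r, s) and (p or r), q or s)
# Not a valid inference form: from p -> q and q, conclude p
affirming_consequent = lambda p, q: implies(implies(p, q) and q, p)

print(is_tautology(modus_ponens, 2))
print(is_tautology(constructive_dilemma, 4))
print(is_tautology(affirming_consequent, 2))
```

The enumeration visits all $2^n$ assignments, so it is practical only for small $n$, which is all that the basic inference forms require.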
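Table 1.2.3: Rules of replacement
Involution (Double Negation):  $p \Leftrightarrow \neg\neg p$
Commutativity:                 $(p \vee q) \Leftrightarrow (q \vee p)$;  $(p \wedge q) \Leftrightarrow (q \wedge p)$
Associativity:                 $[(p \vee q) \vee r] \Leftrightarrow [p \vee (q \vee r)]$;  $[(p \wedge q) \wedge r] \Leftrightarrow [p \wedge (q \wedge r)]$
De Morgan's law:               $\neg(p \wedge q) \Leftrightarrow \neg p \vee \neg q$;  $\neg(p \vee q) \Leftrightarrow \neg p \wedge \neg q$
Distributivity:                $[p \vee (q \wedge r)] \Leftrightarrow [(p \vee q) \wedge (p \vee r)]$;  $[p \wedge (q \vee r)] \Leftrightarrow [(p \wedge q) \vee (p \wedge r)]$
Equivalence:                   $(p \leftrightarrow q) \Leftrightarrow [(p \wedge q) \vee (\neg p \wedge \neg q)]$;  $(p \leftrightarrow q) \Leftrightarrow [(p \to q) \wedge (q \to p)]$
Contraposition:                $(p \to q) \Leftrightarrow (\neg q \to \neg p)$
Implication:                   $(p \to q) \Leftrightarrow (\neg p \vee q)$
Exportation:                   $[p \to (q \to r)] \Leftrightarrow [(p \wedge q) \to r]$
Idempotency:                   $(p \wedge p) \Leftrightarrow p$;  $(p \vee p) \Leftrightarrow p$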
The concept of linguistic variables is very important because linguistic variables are the most fundamental elements in human knowledge representation. When we use sensors to measure a variable, we get numbers as measured values; when we ask human experts to evaluate a variable, we get words to describe values. For example, when we use a radar gun to measure the speed of a car, it gives us numbers like 35 mph, 43 mph, etc. When we ask a human to tell us about the speed of a car, he/she often tells us in words like "it is slow," "it is fast," etc. Hence, by introducing the concept of linguistic variables, we are able to formulate vague descriptions in natural languages using precise mathematical terms. This is the first step to incorporate human knowledge into engineering systems in a systematic and efficient manner.
With the concept of linguistic variables, we are able to take words as values of linguistic variables. In our daily life, we often use more than one word to describe a variable. For example, if we view the speed of a car as a linguistic variable, then its values might be "not slow," "very slow," "slightly fast," "more or less medium," etc. In general, the value of a linguistic variable is a composite term $x = x_1 x_2 \cdots x_n$ that is a concatenation of atomic terms $x_1, x_2, \ldots, x_n$. These atomic terms may be classified into three groups:
1) Primary terms, which are labels of fuzzy sets, such as "slow," "medium," "fast," etc.
2) Complement "not" and connections "and" and "or."
3) Hedges, such as "very," "slightly," "more or less," etc.
The terms "not," "and," and "or" have been studied in preceding sections. Our task now is to characterize hedges. Although in its everyday use the hedge "very"
does not have a well-defined meaning, in essence it acts as an intensifier. In this spirit, we have the following definition for the two most commonly used hedges: "very" and "more or less." Let $A$ be a fuzzy set in $X$. Then "very $A$" is defined as a fuzzy set in $X$ with the membership function given by
$$\mu_{\text{very } A}(x) = [\mu_A(x)]^2,$$
and "more or less $A$" is a fuzzy set in $X$ with the membership function given by
$$\mu_{\text{more or less } A}(x) = [\mu_A(x)]^{1/2}.$$
Example 1.2.11. Let $X = \{1, 2, 3, 4, 5\}$ and the fuzzy set "small" be defined as
$$\{\text{small}\} = \frac{1}{1} + \frac{0.8}{2} + \frac{0.6}{3} + \frac{0.4}{4} + \frac{0.2}{5}.$$
Then, according to the above definitions, we have
$$\text{very small} = \frac{1}{1} + \frac{0.64}{2} + \frac{0.36}{3} + \frac{0.16}{4} + \frac{0.04}{5},$$
$$\text{very very small} = \frac{1}{1} + \frac{0.4096}{2} + \frac{0.1296}{3} + \frac{0.0256}{4} + \frac{0.0016}{5},$$
$$\text{more or less small} = \frac{1}{1} + \frac{0.8944}{2} + \frac{0.7746}{3} + \frac{0.6325}{4} + \frac{0.4472}{5}.$$
□
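The two hedges amount to squaring and taking the square root of each membership value. A minimal sketch that reproduces the numbers of Example 1.2.11 is given below; the function names `very` and `more_or_less` and the dictionary representation of a discrete fuzzy set are our own choices, and the values are rounded to four decimals only for display.

```python
import math

small = {1: 1.0, 2: 0.8, 3: 0.6, 4: 0.4, 5: 0.2}

def very(A):
    """Hedge "very": square each membership value."""
    return {x: mu ** 2 for x, mu in A.items()}

def more_or_less(A):
    """Hedge "more or less": square root of each membership value."""
    return {x: math.sqrt(mu) for x, mu in A.items()}

def rounded(A, digits=4):
    """Round membership values for display."""
    return {x: round(mu, digits) for x, mu in A.items()}

print(rounded(very(small)))
print(rounded(very(very(small))))
print(rounded(more_or_less(small)))
```

Hedges compose by function application, so "very very small" is simply `very(very(small))`.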
A fuzzy IF-THEN rule is the basic unit for capturing knowledge in many fuzzy systems. A fuzzy rule has two components: an IF-part (referred to as the antecedent) and a THEN-part (referred to as the consequent). Such a fuzzy rule can be expressed as
IF (antecedent), THEN (consequent).
The antecedent and the consequent are both fuzzy propositions. The antecedent describes a condition, and the consequent describes a conclusion that can be drawn when the condition holds. There are two types of fuzzy propositions: atomic fuzzy propositions and compound fuzzy propositions. An atomic fuzzy proposition is a single statement "$x$ is $A$," where $x$ is a linguistic variable and $A$ is a linguistic value of $x$ (i.e., $A$ is a fuzzy set defined in the physical domain of $x$). A compound fuzzy proposition is a composition of atomic fuzzy propositions using the connectives "and," "or," and "not," which represent fuzzy intersection, fuzzy union, and fuzzy complement, respectively. For example, if $x$ represents the speed of a car, then the following are fuzzy propositions (the first three are atomic fuzzy propositions and the last three are compound fuzzy propositions):
$x$ is $S$
$x$ is $M$
$x$ is $F$
$x$ is $S$ or $x$ is not $M$
$x$ is not $S$ and $x$ is not $M$
($x$ is $S$ and $x$ is not $F$) or $x$ is $M$
where $S$, $M$ and $F$ denote the fuzzy sets "slow," "medium," and "fast," respectively.
Note that in a compound fuzzy proposition, the atomic fuzzy propositions can be independent; that is, the $x$'s in each atomic fuzzy proposition of a compound fuzzy proposition can be different variables. Actually, the linguistic variables in a compound fuzzy proposition are in general not the same. For example, let $x$ be the speed of a car and $y = \dot{x}$ be the acceleration of the car. Then if we define the fuzzy set large ($L$) for the acceleration, the following is a compound fuzzy proposition:
$x$ is $F$ and $y$ is $L$.
Therefore, compound fuzzy propositions should be understood as fuzzy relations. Next, we discuss how to determine the membership functions of these fuzzy relations.
For connective "and," use fuzzy intersections. Specifically, let $x$ and $y$ be linguistic variables in the physical domains $X$ and $Y$, and $A$ and $B$ be fuzzy sets in $X$ and $Y$, respectively; then the compound fuzzy proposition "$x$ is $A$ and $y$ is $B$" is interpreted as the fuzzy relation $A \cap B$ in $X \times Y$ with membership function
$$\mu_{A \cap B}(x, y) = \min[\mu_A(x), \mu_B(y)].$$
For connective "or," use fuzzy unions. Specifically, the compound fuzzy proposition "$x$ is $A$ or $y$ is $B$" is interpreted as the fuzzy relation $A \cup B$ in $X \times Y$ with membership function
$$\mu_{A \cup B}(x, y) = \max[\mu_A(x), \mu_B(y)].$$
For connective "not," use fuzzy complements. That is, replace "not $A$" by $\bar{A}$. The membership function in this case is
$$\mu_{\bar{A}}(x) = 1 - \mu_A(x).$$
Because fuzzy propositions are interpreted as fuzzy relations, the remaining question is how to interpret the IF-THEN rules. In classical propositional calculus, the expression "IF $p$, THEN $q$" is written as $p \to q$ with the implication $\to$ regarded as a connective defined in Table 1.2.1, where $p$ and $q$ are propositional variables whose values are either true (T) or false (F). From Table 1.2.1, we see that if both $p$ and $q$ are true or both are false, then $p \to q$ is true; if $p$ is true and $q$ is false, then $p \to q$ is false; and, if $p$ is false and $q$ is true, then $p \to q$ is true. Hence, $p \to q$ is equivalent to $\neg p \vee q$ and $(p \wedge q) \vee \neg p$ in the sense that they share the same truth table (Table 1.2.1) as $p \to q$, where $\neg$, $\vee$ and $\wedge$ represent the classical logic operations "not," "or," and "and," respectively. Because fuzzy IF-THEN rules can be viewed as replacing $p$ and $q$ with fuzzy propositions, we can interpret them by replacing the $\neg$, $\vee$ and $\wedge$ operations with fuzzy complement, fuzzy union, and fuzzy intersection, respectively. We rewrite "IF (antecedent), THEN (consequent)" as "IF (FP$_1$), THEN (FP$_2$)" and assume that FP$_1$ is a fuzzy relation defined in $X = X_1 \times \cdots \times X_n$, FP$_2$ is a fuzzy relation defined in $Y = Y_1 \times \cdots \times Y_m$, and $x$ and $y$ are linguistic variables in $X$ and $Y$, respectively. The fuzzy IF-THEN rule "IF (FP$_1$), THEN (FP$_2$)" can be interpreted as a fuzzy relation $Q_Z$ in $X \times Y$ with the membership function:
$$\mu_{Q_Z}(x, y) = \max\{\min[\mu_{FP_1}(x), \mu_{FP_2}(y)],\ 1 - \mu_{FP_1}(x)\}.$$
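The membership function above can be turned into a relational matrix and then composed with an input fuzzy set. The sketch below is an illustration under our own naming (`zadeh_implication` and `max_min` are hypothetical helpers), on the domain {1, 2, 3, 4, 5} with illustrative fuzzy sets "small," "large," and "somewhat small" stored as lists of membership values.

```python
def zadeh_implication(fp1, fp2):
    """Relation matrix with entries max(min(mu1(x), mu2(y)), 1 - mu1(x))."""
    return [[max(min(a, b), 1.0 - a) for b in fp2] for a in fp1]

def max_min(vec, rel):
    """Max-min composition of a fuzzy set (row vector) with a relation matrix."""
    return [max(min(v, r) for v, r in zip(vec, col)) for col in zip(*rel)]

# Membership values on the domain {1, 2, 3, 4, 5}
small = [1.0, 0.5, 0.0, 0.0, 0.0]
large = [0.0, 0.0, 0.0, 0.5, 1.0]
somewhat_small = [1.0, 0.4, 0.2, 0.0, 0.0]

# Rule "IF x is small, THEN y is large" as a fuzzy relation
R = zadeh_implication(small, large)
# Input "x is somewhat small" composed with the rule
conclusion = max_min(somewhat_small, R)

for row in R:
    print(row)
print(conclusion)  # [0.4, 0.4, 0.4, 0.5, 1.0]
```

Rows of `R` for which the antecedent membership is 0 consist entirely of ones: when the antecedent does not hold at all, the rule places no restriction on $y$.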
In fuzzy reasoning, two inference rules are of major importance, i.e., the compositional rule of inference and the generalized modus ponens. The first rule uses a fuzzy relation to represent explicitly the connection between two fuzzy propositions, while the second uses an IF-THEN rule that implicitly represents a fuzzy relation. The generalized modus ponens has the following symbolic inference scheme:
premise 1: IF $x$ is $P_1$, THEN $y$ is $P_2$;
premise 2: $x$ is $Q_1$;
conclusion: $y$ is $Q_2$;
where $x$ and $y$ are linguistic variables, $P_1$ and $Q_1$ are linguistic values of $x$, and $P_2$ and $Q_2$ are linguistic values of $y$; that is, $P_1$, $Q_1$, $P_2$ and $Q_2$ are fuzzy sets defined in the physical domains of $x$ and $y$. Consider the following example:
premise 1: IF the tomato is red, THEN the tomato is ripe.
premise 2: The tomato is very red.
conclusion: The tomato is very ripe.
The symbolic name "tomato" stands for the real world object tomato. In this example $x$ and $y$ are the same. Red and ripe are symbolic names for properties, corresponding to $P_1$ and $P_2$. The meaning of the symbol "red" is described by a fuzzy set. Due to the representation of the properties in terms of fuzzy sets, a conclusion can be derived even when the input is "very red" instead of "red." In fuzzy set theory the membership functions representing the meaning of "red" and "very red" will overlap each other, i.e., there are many values in the domain that have membership degrees greater than zero in both fuzzy sets.
Example 1.2.12. Let the discrete domain $X = Y = \{1, 2, 3, 4, 5\}$. Suppose that {large} = 0.5/4 + 1/5, {small} = 1/1 + 0.5/2, and {somewhat small} = 1/1 + 0.4/2 + 0.2/3. Denote {somewhat small} = {SWS}. Let premise 1 be "IF $x$ is small, THEN $y$ is large," and premise 2 be "$x$ is somewhat small." We now derive
the conclusion for $y$ given these two premises. According to the interpretation of fuzzy rules, the fuzzy relational matrix of premise 1 can be determined from
$$\mu_{\text{IF } x \text{ is small, THEN } y \text{ is large}}(x, y) = \max\{\min[\mu_{\text{small}}(x), \mu_{\text{large}}(y)],\ 1 - \mu_{\text{small}}(x)\}$$
and can be expressed as
$$R = \begin{bmatrix} 0 & 0 & 0 & 0.5 & 1 \\ 0.5 & 0.5 & 0.5 & 0.5 & 0.5 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix}.$$
In this case the conclusion is represented by $\mu_{SWS} \circ R$:
$$\mu_{\text{conclusion}} = \mu_{SWS} \circ R = (1 \ \ 0.4 \ \ 0.2 \ \ 0 \ \ 0) \circ \begin{bmatrix} 0 & 0 & 0 & 0.5 & 1 \\ 0.5 & 0.5 & 0.5 & 0.5 & 0.5 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix} = (0.4 \ \ 0.4 \ \ 0.4 \ \ 0.5 \ \ 1).$$
Hence, the conclusion can be expressed as {conclusion} = 0.4/1 + 0.4/2 + 0.4/3 + 0.5/4 + 1/5, which can be interpreted as the linguistic conclusion "$y$ is somewhat large." Such a process clearly emulates the inference process in the human mind. □
The compositional rule of inference can be considered as a special case of the generalized modus ponens. Its general symbolic form is
premise 1: $x$ is $Q_1$;
premise 2: $x R y$;
conclusion: $y$ is $Q_2$.
Here $x R y$ reads as "$x$ has relation $R$ with $y$" and its meaning is represented as a fuzzy relation $\mu_R$. Hence, instead of the IF-THEN rule, there is a fuzzy relation $R$. An example of the compositional rule of inference is
premise 1: $x$ is a small number;
premise 2: $x$ is somewhat smaller than $y$;
conclusion: $y$ is a very small number.
In the above, "somewhat smaller than" is the fuzzy relation $R$.
Example 1.2.13. Consider the discrete domains $X = Y = \{1, 2, 3, 4\}$. Suppose that {small} = 1/1 + 0.6/2 + 0.2/3 and the binary relation "approximately equal" {AE} = 1/((1,1) + (2,2) + (3,3) + (4,4)) + 0.5/((1,2) + (2,1) + (2,3) + (3,2) + (3,4) + (4,3)). In this case $\mu_{\text{small}} \circ \mu_{AE}$ represents the conclusion for $y$ and can be expressed as the max-min product of their relational matrices. Thus,
$$\mu_{\text{conclusion}} = (1 \ \ 0.6 \ \ 0.2 \ \ 0) \circ \begin{bmatrix} 1 & 0.5 & 0 & 0 \\ 0.5 & 1 & 0.5 & 0 \\ 0 & 0.5 & 1 & 0.5 \\ 0 & 0 & 0.5 & 1 \end{bmatrix} = (1 \ \ 0.6 \ \ 0.5 \ \ 0.2).$$
Hence, the conclusion is the fuzzy set {MLS} = 1/1 + 0.6/2 + 0.5/3 + 0.2/4, which can be denoted by the linguistic label "$y$ is more or less small." □
There is another fuzzy conditional statement often used in the design of fuzzy adaptive control systems. Its general symbolic form is:
IF $x$ is $A$, THEN $y$ is $B$, ELSE $y$ is $C$.
This fuzzy conditional statement can be expressed as:
$$(x \text{ is } A \to y \text{ is } B) \vee (x \text{ is not } A \to y \text{ is } C), \qquad (1.2.3)$$
where $x$ and $y$ are linguistic variables in $X$ and $Y$, respectively, $A$ is a linguistic value of $x$, and $B$ and $C$ are linguistic values of $y$. In fact, (1.2.3) can be considered as a fuzzy relation $R$ on $X \times Y$,
$$R(x, y) = (x \text{ is } A \wedge y \text{ is } B) \vee (x \text{ is not } A \wedge y \text{ is } C).$$
The Cartesian product form is
$$R = (A \times B) + (\bar{A} \times C),$$
where "$+$" and "$\times$" denote the disjunction (union) and conjunction (intersection) operations, respectively, of fuzzy relations.
Example 1.2.14. Let the discrete domains $X = Y = \{1, 2, 3\}$. Assume that
$$A = \{\text{small}\} = \frac{1}{1} + \frac{0.4}{2}; \qquad B = \{\text{large}\} = \frac{0.4}{2} + \frac{1}{3}; \qquad C = \{\text{not large}\} = \frac{1}{1} + \frac{0.6}{2}.$$
The fuzzy conditional statement is: "IF $x$ is small, THEN $y$ is large, ELSE $y$ is not large." The conclusion corresponding to "$x$ is very small" can be derived as follows.
$$A' = \{\text{very small}\} = \{\text{small}\}^2 = \frac{1}{1} + \frac{0.16}{2},$$
$$R = (A \times B) + (\bar{A} \times C) = \begin{bmatrix} 0 & 0.4 & 1 \\ 0 & 0.4 & 0.4 \\ 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 0.6 & 0.6 & 0 \\ 1 & 0.6 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0.4 & 1 \\ 0.6 & 0.6 & 0.4 \\ 1 & 0.6 & 0 \end{bmatrix}.$$
Therefore, $B' = A' \circ R = (0.16 \ \ 0.4 \ \ 1)$, i.e., "IF $x$ is very small, THEN $y$ is very large." □
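The IF-THEN-ELSE relation can be assembled from the same elementary operations: a Cartesian product taken with min, a union taken with max, and a max-min composition with the input. The following sketch recomputes the relation and the conclusion for three-point fuzzy sets "small," "large," and "not large"; all function names are our own, and the membership vectors are stated in the comments.

```python
def complement(A):
    """Fuzzy complement: 1 - mu."""
    return [1.0 - a for a in A]

def cartesian_min(A, B):
    """Cartesian product A x B as a relation matrix (conjunction by min)."""
    return [[min(a, b) for b in B] for a in A]

def union(R, S):
    """Union of two relation matrices (disjunction by max)."""
    return [[max(r, s) for r, s in zip(rr, rs)] for rr, rs in zip(R, S)]

def max_min(vec, rel):
    """Max-min composition of a fuzzy set with a relation matrix."""
    return [max(min(v, r) for v, r in zip(vec, col)) for col in zip(*rel)]

A = [1.0, 0.4, 0.0]   # small     = 1/1 + 0.4/2
B = [0.0, 0.4, 1.0]   # large     = 0.4/2 + 1/3
C = [1.0, 0.6, 0.0]   # not large = 1/1 + 0.6/2

# R = (A x B) + (not-A x C)
R = union(cartesian_min(A, B), cartesian_min(complement(A), C))

A_very = [a ** 2 for a in A]   # very small = 1/1 + 0.16/2
B_prime = max_min(A_very, R)

print([[round(v, 2) for v in row] for row in R])
print([round(v, 2) for v in B_prime])
```

The values are rounded only for display, since squaring 0.4 in floating-point arithmetic does not give exactly 0.16.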
The fuzzy conditional statement "IF $x$ is $A$, THEN $y$ is $B$, ELSE $y$ is $C$" has only one condition. We call it a single-fuzzy conditional statement. If a fuzzy conditional statement has more than one condition, we call it a multifuzzy conditional statement. A multifuzzy conditional statement has the following symbolic form:
IF $x$ is $A_1$, THEN $y$ is $B_1$; IF $x$ is $A_2$, THEN $y$ is $B_2$; ...; IF $x$ is $A_n$, THEN $y$ is $B_n$;
where $x$ and $y$ are linguistic variables in $X$ and $Y$, respectively, $A_1, A_2, \ldots, A_n$ are linguistic values of $x$, and $B_1, B_2, \ldots, B_n$ are linguistic values of $y$. The multifuzzy conditional statement denotes a fuzzy relation $R$ on $X \times Y$,
$$R = (A_1 \times B_1) + (A_2 \times B_2) + \cdots + (A_n \times B_n),$$
with membership function given by
$$\mu_R(x, y) = \max_{1 \le i \le n} \{\min[\mu_{A_i}(x), \mu_{B_i}(y)]\}.$$
... $n(m + 1)$), for each group of vectors $(x_{1k}, x_{2k}, \ldots, x_{mk}, y_k)$, calculate
Section 2.3 An Off-Line Fuzzy Identification Algorithm
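$$H_k = (W_k^1, \ldots, W_k^n, W_k^1 x_{1k}, \ldots, W_k^n x_{1k}, \ldots, W_k^1 x_{mk}, \ldots, W_k^n x_{mk}), \qquad k = 1, 2, \ldots, L, \qquad (2.3.4)$$
where
$$W_k^i = \frac{G_k^i}{\sum_{j=1}^{n} G_k^j}$$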
and $G_k^i$ denotes the truth value of the $i$th rule from the $k$th group of vectors by (2.2.5).
Step 2: Set initial values of parameters for $k = 0$ as $\theta_0 = 0$ and $S_0 = \alpha I$, where $\alpha$ takes a large value (a large power of 10 in this chapter) and $I$ is the identity matrix.
Step 3: Use the recursive least-squares algorithm to calculate [2]
$$F_k = \frac{S_{k-1} H_k^T}{1 + H_k S_{k-1} H_k^T}, \qquad (2.3.5)$$
$$S_k = S_{k-1} - F_k H_k S_{k-1}, \qquad (2.3.6)$$
$$\theta_k = \theta_{k-1} + F_k (y_k - H_k \theta_{k-1}), \qquad (2.3.7)$$
where $S_k$ is the covariance matrix, $F_k$ is the gain vector, $\theta_k$ is the parameter vector to be identified, and $H_k$ is the data row vector.
Step 4: $k + 1 \to k$. If $k < L$, go to Step 3; otherwise, stop. $\theta_L$ is the vector of optimal consequent parameters.
Example 2.3.1. For the fuzzy model of Example 2.2.2, we add Gaussian white noise with variance 0.5. The following results can be obtained with consequent parameter identification from 200 input-output data pairs:
$R^1$: IF $x$ is as in Figure 2.2.2 (a), THEN $y = 0.554x + 2.3663$;
$R^2$: IF $x$ is as in Figure 2.2.2 (b), THEN $y = 0.2345x + 8.6705$;
$R^3$: IF $x$ is as in Figure 2.2.2 (c), THEN $y = 0.2997x + 3.1702$.
Figure 2.3.2 describes the original input-output data and the identification result. If there is no noise, the identification result and the original data will be identical. □
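The recursion of Steps 2 to 4 can be sketched in plain Python as follows. This is a minimal illustration, not the identification program used for Example 2.3.1: the function name `rls_identify`, the default `alpha=1e6`, and the single-rule test data (a noise-free line $y = 0.5x + 2$ with data row $H_k = (1, x_k)$, i.e., one rule whose normalized truth value is 1) are all assumptions made here.

```python
def rls_identify(H_rows, y_vals, alpha=1e6):
    """Recursive least squares: returns the parameter estimate after all data."""
    n = len(H_rows[0])
    theta = [0.0] * n                               # theta_0 = 0
    S = [[alpha if i == j else 0.0 for j in range(n)] for i in range(n)]  # S_0 = alpha*I
    for H, y in zip(H_rows, y_vals):
        # S_{k-1} H_k^T (column vector)
        SH = [sum(S[i][j] * H[j] for j in range(n)) for i in range(n)]
        denom = 1.0 + sum(H[i] * SH[i] for i in range(n))
        F = [v / denom for v in SH]                 # gain vector F_k
        # H_k S_{k-1} (row vector)
        HS = [sum(H[i] * S[i][j] for i in range(n)) for j in range(n)]
        # S_k = S_{k-1} - F_k H_k S_{k-1}
        S = [[S[i][j] - F[i] * HS[j] for j in range(n)] for i in range(n)]
        # theta_k = theta_{k-1} + F_k (y_k - H_k theta_{k-1})
        err = y - sum(H[i] * theta[i] for i in range(n))
        theta = [theta[i] + F[i] * err for i in range(n)]
    return theta

# Assumed noise-free test data: y = 0.5 x + 2, data row H_k = (1, x_k)
xs = [float(k) for k in range(20)]
H_rows = [[1.0, x] for x in xs]
y_vals = [0.5 * x + 2.0 for x in xs]

theta = rls_identify(H_rows, y_vals)
print([round(t, 3) for t in theta])
```

A large `alpha` expresses little confidence in the initial estimate $\theta_0 = 0$, so the first few data rows dominate the early updates; with noise-free data the estimate settles close to the true intercept and slope.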
2.3.2 Premise Parameter Identification Three kinds of fuzzy subsets need to be considered, i.e., {small} (Figure 2.3.3 (a)), {medium} (Figure 2.3.3 (b)), and {large} (Figure 2.3.3 (c)). Their membership functions are of convex type and formed by piecewise linear functions as shown in Figure 2.3.3. In the figure, pi,p2, • • • ,P8 are the premise parameter values corresponding to membership function's "turning points." At those points, the degree of membership is either zero or one. There are two premise parameters in fuzzy subsets {small} or {large} and four premise parameters in subset {medium}. When
Chapter 2. Identification of the Takagi-Sugeno Fuzzy Model
Figure 2.3.2: The identification result (circles) and the original data (dashed lines).
Figure 2.3.3: Three kinds of membership functions: (a) {small}, (b) {medium}, (c) {large}.
Figure 2.3.4: Example of completeness.
input/output data are given, the problem of premise parameter identification is a nonlinear programming problem of minimizing the performance index, which can be solved by the complex method in optimization. In this section, the optimum premise parameters will be obtained by nonlinear programming. In the search process for optimum premise parameters, three conditions must be satisfied.
(i) A premise parameter cannot go beyond the domain of definition of the premise variable.
(ii) The completeness of each premise variable must be maintained. Completeness requires that a fuzzy model can provide a corresponding output value in all cases. Completeness can be explained using Figure 2.3.4. Figure 2.3.4 (a) is incomplete because when $2 < x_i < 3$ there is no corresponding fuzzy space and the corresponding model output value is undetermined.
(iii) The value range of each premise variable $x_i$ is divided in only two forms, as in Figure 2.3.5. In Figure 2.3.5 (a), premise variable $x_i$ is divided by two fuzzy subsets, i.e., {small} and {large}. In Figure 2.3.5 (b), it is divided by {small}, {medium$_1$}, {medium$_2$}, ..., {medium$_{g-1}$}, and {large}, i.e., a total of $g + 1$ fuzzy subsets ($g > 2$). In this case, premise parameters must satisfy Pi 0 (z = 1, 2, ..., n) for the required /. Therefore n
f{^)
m
n
n
m
Xj-b)
^0.
m
E n exp
E n A){x,)
Xn —6^.
In summary, using the Stone-Weierstrass Theorem and the fact that Y is a set of real continuous functions on X, we have proved Theorem 2.5.1. □ From the above theorem we can also conclude that the generalized T-S fuzzy model is a generalization of both the common T-S fuzzy model and the fuzzy basis function network [9,11].
2.5.2 Parameter Identification Algorithm Based on the GA
In this subsection we discuss how to derive the optimal structure and parameters of the generalized T-S fuzzy model based on the genetic algorithm (GA). The outline of the algorithm is shown in Figure 2.5.4. The parameter identification algorithm is summarized as follows.
(1) Coding. If the number of fuzzy rules is n and the number of input variables is m, then we have
$$ f(x) = \frac{\sum_{i=1}^{n} \left( p_0^i + \sum_{k=1}^{m} p_k^i x_k \right) \prod_{j=1}^{m} \exp(\cdots)}{\sum_{i=1}^{n} \prod_{j=1}^{m} \exp(\cdots)}. $$
We can see from the above equation that there are n × (4m + 1) independent variables to be identified. In this chapter we choose both binary coding and
real coding. The matrix coding of the generalized T-S fuzzy model stacks, for each fuzzy rule i, the binary flag $\sigma_i$ and the 4m + 1 real-valued parameters of that rule in one row of a matrix with n rows.
In the matrix coding of the generalized T-S fuzzy model, the first row represents the coding of fuzzy rule 1, the second row represents the coding of
Figure 2.5.4: The outline of the genetic algorithm (coding and generating initial population; calculation of parameters; reproduction; crossover; mutation; test of the stopping criterion, repeating the loop if it is not satisfied; stop).
fuzzy rule 2, ..., and the nth row represents the coding of fuzzy rule n. $\sigma_i$ (i = 1, 2, ..., n) is a binary number: $\sigma_i = 0$ represents that fuzzy rule i does not exist; otherwise, fuzzy rule i exists. All other parameters take real values.
(2) Evaluation of the generalized T-S fuzzy model. The evaluation of the generalized T-S fuzzy model involves both accuracy and complexity. We use the quadratic sum of errors e to represent the model accuracy. A smaller e indicates a higher accuracy. We use the number of fuzzy rules $M_{TS}$ to represent the model complexity. A smaller $M_{TS}$ implies a lower complexity. Based on the above analysis, the following definition is used to represent the individual fitness value of the chromosome (in the GA):
$$ g(T) = \omega_e \frac{1}{e} + \omega_M \frac{1}{M_{TS}}, $$
where g(T) represents the individual adaptability, $\omega_e$ and $\omega_M$ are weight coefficients to be prespecified, and the number of fuzzy rules is $M_{TS} = \sum_{i=1}^{n} \sigma_i$.
(3) Crossover and mutation. The crossover between generations is governed by the crossover rate $p_c$. When crossing, we randomly choose a submatrix of the individual matrix. Then, the elements at the same positions of the two submatrices are exchanged to generate new individuals. Mutation is governed by the mutation rate $p_m$, which is constant for each gene.
(4) Stopping conditions. If a prespecified stopping condition is satisfied, the process ends. Our stopping condition is determined by the number of generations. After the process ends, the fitness value of each individual is calculated. We use the values of the individual whose fitness value is the largest as the optimal parameters of the generalized T-S model, in which $\sigma_i = 0$ represents that fuzzy rule i does not exist and otherwise fuzzy rule i exists.
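The evaluation in step (2) and the final selection in step (4) can be sketched as follows. This is only an illustration, assuming that the fitness has the weighted form $g = \omega_e / e + \omega_M / M_{TS}$ and that each individual carries one binary existence flag per rule; the names `rule_count` and `fitness`, the default weights, and the sample population are hypothetical.

```python
def rule_count(flags):
    """Number of active fuzzy rules, i.e. the sum of the binary flags."""
    return sum(1 for flag in flags if flag == 1)


def fitness(sq_error, flags, w_e=1.0, w_m=1.0):
    """Weighted fitness: a smaller error and fewer rules both raise the value.

    Assumed form: g = w_e / e + w_m / M_TS (weights are illustrative).
    """
    m_ts = rule_count(flags)
    if sq_error <= 0 or m_ts == 0:
        raise ValueError("need a positive error and at least one active rule")
    return w_e / sq_error + w_m / m_ts


# Hypothetical population: each individual has rule flags and its squared error.
population = [
    {"flags": [1, 1, 1, 0], "sq_error": 0.5},
    {"flags": [1, 1, 0, 0], "sq_error": 0.8},
]

# After the last generation, keep the individual with the largest fitness.
best = max(population, key=lambda ind: fitness(ind["sq_error"], ind["flags"]))
print(best["flags"])
```

The weights trade accuracy against complexity; with equal weights the first individual wins here because its lower error outweighs its extra rule.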
2.6 Summary
In this chapter, we developed an off-line identification method that transforms the input-output process data into a fuzzy Takagi-Sugeno (T-S) model with high accuracy. Then, an on-line identification algorithm was presented for the parameters and structures of the T-S fuzzy model in representing nonlinear dynamical systems with time delays. Finally, the genetic algorithm was used to identify the parameters and structures of a new dynamic fuzzy model that was proved to be a universal approximator.
Bibliography
[1] O. Hecker, O. Nelles, O. Moseler, "Nonlinear system identification and predictive control of a heat exchanger based on local linear fuzzy models," Proc. of the American Control Conference, Albuquerque, NM, June 1997, pp. 3294-3298.
[2] L. Ljung, System Identification: Theory for the User, 2nd Edition, Upper Saddle River, NJ: Prentice Hall, 1997.
[3] Y. B. Quan, Fuzzy Modeling and Control of Nonlinear Systems, Ph.D. Dissertation, Northeastern University, Shenyang, China, 2001.
[4] T. Robertson, F. T. Wright, and R. L. Dykstra, Order Restricted Statistical Inference, New York: Wiley, 1988.
[5] W. Rudin, Principles of Mathematical Analysis, New York: McGraw-Hill, 1953.
[6] M. Sugeno and G. Kang, "Structure identification of fuzzy model," Fuzzy Sets and Systems, vol. 28, pp. 15-33, Oct. 1988.
[7] T. Takagi and M. Sugeno, "Fuzzy identification of systems and its application to modeling and control," IEEE Transactions on Systems, Man, and Cybernetics, vol. 15, pp. 116-132, Jan. 1985.
[8] S. Tong, J. Tang, and T. Wang, "Fuzzy adaptive control of multivariable nonlinear systems," Fuzzy Sets and Systems, vol. 111, pp. 153-167, Apr. 2000.
[9] L. Wang, A Course in Fuzzy Systems and Control, Upper Saddle River, NJ: Prentice Hall, 1997.
[10] L. Wang and R. Langari, "A decomposition approach for fuzzy systems identification," Proceedings of the 34th Conference on Decision and Control, New Orleans, LA, Dec. 1995, pp. 261-266.
[11] L.-X. Wang and J. M. Mendel, "Fuzzy basis functions, universal approximation, and orthogonal least-squares learning," IEEE Transactions on Neural Networks, vol. 3, pp. 807-814, Sept. 1992.
[12] H. Zhang and Z. Bien, "Adaptive fuzzy control of MIMO nonlinear systems," Fuzzy Sets and Systems, vol. 115, pp. 191-204, Oct. 2000.
[13] H. Zhang, L. Cai, and Z. Bien, "A fuzzy basis function vector-based multivariable adaptive fuzzy controller for nonlinear systems," IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, vol. 30, pp. 210-217, 2000.
[14] H. Zhang, L. Cai, and Z. Bien, "A fuzzy basis function vector-based multivariable adaptive fuzzy controller for nonlinear systems," IEEE Transactions on Systems, Man, and Cybernetics, vol. 30, pp. 210-217, Feb. 2000.
[15] H. Zhang, L. Cai, and Z. Bien, "A multivariable generalized predictive control approach based on T-S fuzzy model," Journal of Intelligent and Fuzzy Systems, vol. 9, pp. 169-189, Sept. 2000.
[16] H. Zhang and L. Chen, "A technique for handling fuzzy decision-making problems concerning two kinds of uncertainty," Cybernetics and Systems, vol. 22, pp. 681-698, Nov.-Dec. 1991.
[17] H. Zhang and Y. B. Quan, "Modeling, identification, and control of a class of nonlinear systems," IEEE Transactions on Fuzzy Systems, vol. 9, pp. 349-354, Apr. 2001.
Chapter 3
Fuzzy Model Identification Based on Rough Set Data Analysis
3.1 Introduction
It is an open problem to model nonlinear systems with uncertainties. In Chapter 2, we developed an identification algorithm based on the Takagi-Sugeno fuzzy model. The fuzzy modeling procedure in Chapter 2 can be divided into three steps: premise structure identification, premise parameter identification, and consequent parameter identification. The premise structure identification is done in two phases: (1) identify the input structure, i.e., select the significant input variables among all possible input candidates; (2) assign fuzzy membership functions. In Chapter 2, we introduced an identification algorithm which included both phases in a single unified process. We can also deal with them in two separate processes. In this chapter, we will address both structure and parameter identification problems using a new data analysis method, the rough set data analysis (RSDA). The contents of this chapter are organized as follows. The basic concepts are introduced first in Section 3.2. Then a novel input structure identification algorithm is developed in Section 3.3, where a new rough information measure is defined to identify the input structure. Furthermore, a fuzzy relation model is constructed using the RSDA in Section 3.4. In order to overcome the shortcoming that rough sets are not suitable for dealing with continuous values, a rough-ANN (artificial neural network) hybrid model is developed in Section 3.5. Finally, we provide a summary for this chapter in Section 3.6.
3.2 Preliminaries
In this chapter, rough set data analysis (RSDA), data filtering, and fuzzy c-means clustering (FCM) will be applied to our modeling procedure. In this section, we introduce the basic concepts of these three topics. For details, please see [4], [16], and [2], respectively.
3.2.1 Rough Set Data Analysis
The essence of the rough set approach relies on the approximation of incomplete or imprecise information by means of completely and precisely known pieces of information. The theoretical foundations of rough sets were introduced in Chapter 1. Rough set data analysis (RSDA) [3,4] is a symbolic approach to discovering which attributes are relevant for data description or prediction. With RSDA, we can discover the significant attributes and the dependencies among the attributes in a decision table. Consider the following information system
$$ I = (U, \Omega, V_q, f_q)_{q \in \Omega}, \tag{3.2.1} $$
where U denotes a set of objects with cardinality |U| = n, $\Omega$ is a finite set of attributes, $V_q$ is the set of attribute values, and $f_q$ is the information function defined as $f_q : U \to V_q$. With each $Q \subseteq \Omega$ we associate an equivalence relation $R_Q$ on U. The equivalence classes induced by $R_Q$ are denoted by $U/R_Q$. If $x \in U$, $[x]_{R_Q}$ is the equivalence class of $R_Q$ containing x. Suppose that $U/R_Q = \{U_1, U_2, \ldots, U_n\}$; then for all $x, y \in U_i$, $1 \le i \le n$, we have $f_q(x) = f_q(y)$ for all $q \in Q$, and $[x]_{R_Q} = [y]_{R_Q}$.
Definition 3.2.1 (cf. [3]). Suppose that $x, y \in U$ and $Q \subseteq \Omega$. The indiscernibility relation between x and y is defined by $R_Q$ as
$$ x R_Q y \iff (\forall q \in Q)(f_q(x) = f_q(y)). $$
□
Definition 3.2.2 (cf. [3]). Suppose that $P, Q \subseteq \Omega$. We say that P is dependent on Q, denoted by $Q \to P$, if every class of $U/R_P$ is a union of classes of $U/R_Q$. □
In other words, $Q \to P$ means that the classification of U induced by $R_P$ can be predicted by the classification induced by $R_Q$. Each dependence $Q \to P$ leads to a rule as follows: Let $Q = \{q_1, q_2, \ldots, q_n\}$ and $P = \{p_1, p_2, \ldots, p_k\}$. For each $t = \{t_1, t_2, \ldots, t_n\}$, where $t_i \in V_{q_i}$, there is a unique determining set $s = \{s_1, s_2, \ldots, s_k\}$, where $s_i \in V_{p_i}$, such that, for all $x \in U$,
if $(f_{q_1}(x) = t_1, \ldots, f_{q_n}(x) = t_n)$, then $(f_{p_1}(x) = s_1, \ldots, f_{p_k}(x) = s_k)$.
It is of particular interest in RSDA to find the set Q which has the least number of attributes and still has Q —^ P. A set with this property is called a minimal determining set for P.
Definition 3.2.3 (cf. [3]). A set Q is a minimal determining set for P if $Q \to P$ and P is not dependent on R for all $R \subsetneq Q$. □
In order to measure the degree of the dependence $Q \to P$, a measure of the prediction quality, or approximation quality, is introduced in [3] as follows:
$$ \gamma(Q \to P) = \frac{\sum_{X \in U/R_P} |\underline{R_Q}X|}{|U|}, \tag{3.2.2} $$
where $\underline{R_Q}X$ is the lower approximation of X by Q ($R_Q$-lower approximation), $0 \le \gamma(Q \to P) \le 1$, and $\underline{R_Q}X = \{x \in U \mid [x]_{R_Q} \subseteq X\}$. $\underline{R_Q}X$ is the set of all elements of X that are correctly classified with respect to the attributes in Q, and $\gamma(Q \to P)$ is the ratio of the number of elements that can be correctly assigned to the classes of $U/R_P$ based on the attributes in Q to the total number of elements of U. A larger $\gamma(Q \to P)$ means better prediction quality. Note that $Q \to P$ implies $\gamma(Q \to P) = 1$ and that $\gamma(Q \to P) \ne 1$ means P is not dependent on Q ($Q \not\to P$).
Definition 3.2.4 (cf. [16]). The rough membership function of element a to set X is defined as
$$ \mu_X(a) = \frac{|[a]_R \cap X|}{|[a]_R|}, \tag{3.2.3} $$
where $[a]_R$ is the equivalence class of R containing a. □
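As a small illustration of Definition 3.2.4, the following sketch computes the rough membership of an element, assuming the standard form $\mu_X(a) = |[a]_R \cap X| / |[a]_R|$. The equivalence relation is represented by a key function (two objects are equivalent when their keys agree); the names `equivalence_class` and `rough_membership` and the toy universe are hypothetical.

```python
def equivalence_class(universe, key, a):
    """All objects that the relation cannot distinguish from a."""
    return {x for x in universe if key(x) == key(a)}


def rough_membership(universe, key, target, a):
    """Fraction of the equivalence class of a that lies inside the target set."""
    cls = equivalence_class(universe, key, a)
    return len(cls & set(target)) / len(cls)


# Toy universe: objects 1..6, indiscernible when they agree modulo 3.
universe = {1, 2, 3, 4, 5, 6}
target = {1, 2, 4}

def by_remainder(x):
    return x % 3

print(rough_membership(universe, by_remainder, target, 1))  # class {1, 4}
print(rough_membership(universe, by_remainder, target, 2))  # class {2, 5}
print(rough_membership(universe, by_remainder, target, 3))  # class {3, 6}
```

The three calls cover the cases of full, partial, and zero membership, which correspond to an element lying in the lower approximation, the boundary region, and outside the upper approximation of the target set.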
It is clear that $0 \le \mu_X(a) \le 1$. The rough membership function is derived directly from the raw data, so no prior assumption is needed. We now use a simple example to illustrate the procedure.
Example 3.2.1. Following the notation introduced in (3.2.1), an information system $I_1$ is defined as follows:
$U = \{x_1, x_2, \ldots, x_{10}\}$;
$\Omega$ = {Color (C), Density (D), Volume (V), Weight (W)};
condition attribute set: {Color, Density, Volume};
decision attribute set: {Weight};
$V_C$ = {1 (Red), 2 (Yellow), 3 (Green)};
$V_D$ = {1 (Low), 2 (Middle), 3 (High), 4 (Very High)};
$V_V$ = {1 (Small), 2 (Big)};
$V_W$ = {1 (Light), 2 (Middle), 3 (Heavy)};
$f_q$ ($q \in \Omega$) is an information function (for example, if the color of the object $x_2$ is yellow, we get $f_C(x_2) = 2$).
The information system $I_1$ can be described by a decision table as in Table 3.2.1. The equivalence classes induced by the condition equivalence relations $R_C$, $R_D$, $R_V$ are as follows:
$U/R_C = \{\{x_1, x_3, x_7, x_9\}, \{x_2, x_4, x_5, x_{10}\}, \{x_6, x_8\}\}$;
$U/R_D = \{\{x_1, x_3, x_6\}, \{x_4, x_5, x_7\}, \{x_2, x_8\}, \{x_9, x_{10}\}\}$;
$U/R_V = \{\{x_3, x_4, x_6\}, \{x_1, x_2, x_5, x_7, x_8, x_9, x_{10}\}\}$.
Table 3.2.1: Decision table
          C    D    V    W
x_1       1    1    2    2
x_2       2    3    2    3
x_3       1    1    1    1
x_4       2    2    1    1
x_5       2    2    2    2
x_6       3    1    1    1
x_7       1    2    2    2
x_8       3    3    2    3
x_9       1    4    2    3
x_10      2    4    2    3
The decision equivalence classes are:
$U/R_W = \{X_1, X_2, X_3\} = \{\{x_3, x_4, x_6\}, \{x_1, x_5, x_7\}, \{x_2, x_8, x_9, x_{10}\}\}$.
Now we analyze which condition attributes are key to the decision attribute (Weight). Let Q = {C, D, V} and P = {W}. We have:
$U/R_Q = \{\{x_1\}, \{x_2\}, \{x_3\}, \{x_4\}, \{x_5\}, \{x_6\}, \{x_7\}, \{x_8\}, \{x_9\}, \{x_{10}\}\}$;
$\underline{R_Q}X_1 = \{x_3, x_4, x_6\}$;
$\underline{R_Q}X_2 = \{x_1, x_5, x_7\}$;
$\underline{R_Q}X_3 = \{x_2, x_8, x_9, x_{10}\}$.
It is easy to get $\gamma(Q \to P) = 1$ by (3.2.2), which means that {C, D, V} $\to$ {W}.
Let Q = {C, D, V} − {C} = {D, V} and P = {W}. We have:
$U/R_Q = \{\{x_1\}, \{x_2, x_8\}, \{x_3, x_6\}, \{x_4\}, \{x_5, x_7\}, \{x_9, x_{10}\}\}$;
$\underline{R_Q}X_1 = \{x_3, x_4, x_6\}$;
$\underline{R_Q}X_2 = \{x_1, x_5, x_7\}$;
$\underline{R_Q}X_3 = \{x_2, x_8, x_9, x_{10}\}$.
It is easy to get $\gamma(Q \to P) = 1$, which means that {D, V} $\to$ {W}.
Let Q = {D, V} − {D} = {V} and P = {W}. We have:
$U/R_Q = \{\{x_3, x_4, x_6\}, \{x_1, x_2, x_5, x_7, x_8, x_9, x_{10}\}\}$;
$\underline{R_Q}X_1 = \{x_3, x_4, x_6\}$;
$\underline{R_Q}X_2 = \emptyset$;
$\underline{R_Q}X_3 = \emptyset$.
It is easy to get $\gamma(Q \to P) = 0.3$, which means that {V} $\not\to$ {W}.
Let Q = {D} and P = {W}. We have:
$U/R_Q = \{\{x_1, x_3, x_6\}, \{x_4, x_5, x_7\}, \{x_2, x_8\}, \{x_9, x_{10}\}\}$;
$\underline{R_Q}X_1 = \emptyset$;
$\underline{R_Q}X_2 = \emptyset$;
$\underline{R_Q}X_3 = \{x_2, x_8, x_9, x_{10}\}$.
Table 3.2.2: Decision table after reduction
               D    V    W
x_1            1    2    2
x_2, x_8       3    2    3
x_3, x_6       1    1    1
x_4            2    1    1
x_5, x_7       2    2    2
x_9, x_10      4    2    3

Table 3.2.3: Decision table after reduction
                    D    V    W
x_1, x_5, x_7       *    2    2
x_2, x_8            3    2    3
x_3, x_4, x_6       *    1    1
x_9, x_10           4    2    3
(* means the attribute values are not important)
It is easy to get $\gamma(Q \to P) = 0.4$, which means that {D} $\not\to$ {W}. So the minimal determining set for {W} is {D, V}. In other words, the attributes "Density" and "Volume" decide the "Weight." This is consistent with our general knowledge in physics. The decision table after reduction is as in Table 3.2.2. The last step is to reduce the attribute values. The key attribute values are extracted in the reduction process. The results after the reduction process are shown in Table 3.2.3. □
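The computations of Example 3.2.1 can be checked with a short script. The sketch below encodes Table 3.2.1, builds the partitions $U/R_Q$, and evaluates the approximation quality (3.2.2); the brute-force search over attribute subsets is our own illustrative choice, not an algorithm given in the text, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Decision table copied from Table 3.2.1: values of (C, D, V, W) per object.
ROWS = {
    "x1": (1, 1, 2, 2), "x2": (2, 3, 2, 3), "x3": (1, 1, 1, 1),
    "x4": (2, 2, 1, 1), "x5": (2, 2, 2, 2), "x6": (3, 1, 1, 1),
    "x7": (1, 2, 2, 2), "x8": (3, 3, 2, 3), "x9": (1, 4, 2, 3),
    "x10": (2, 4, 2, 3),
}
COLUMNS = ("C", "D", "V", "W")
TABLE = {obj: dict(zip(COLUMNS, vals)) for obj, vals in ROWS.items()}


def partition(table, attrs):
    """Equivalence classes of objects that agree on every attribute in attrs."""
    groups = defaultdict(set)
    for obj, row in table.items():
        groups[tuple(row[a] for a in attrs)].add(obj)
    return list(groups.values())


def gamma(table, q, p):
    """Approximation quality: share of objects in lower approximations."""
    q_classes = partition(table, q)
    positive = sum(len(c) for x in partition(table, p)
                   for c in q_classes if c <= x)
    return positive / len(table)


def minimal_determining_sets(table, conditions, decision):
    """Brute force: subsets with gamma = 1 that contain no smaller such subset."""
    found = []
    for size in range(1, len(conditions) + 1):
        for q in combinations(conditions, size):
            if any(set(f) <= set(q) for f in found):
                continue  # a smaller determining set is already inside q
            if gamma(table, q, decision) == 1.0:
                found.append(q)
    return found


for q in (("C", "D", "V"), ("D", "V"), ("V",), ("D",)):
    print(q, gamma(TABLE, q, ("W",)))
print(minimal_determining_sets(TABLE, ("C", "D", "V"), ("W",)))
```

The exhaustive search is exponential in the number of attributes and is only meant for tables of this size; it returns the single set {D, V} here, in agreement with the example.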
3.2.2 Data Filtering
In Section 3.2.1, we introduced $\gamma(Q \to P)$ in (3.2.2) to measure the prediction quality. If $\gamma(Q \to P) = 1$, the prediction is perfect and $Q \to P$. Otherwise, $\gamma(Q \to P) < 1$. However, a perfect or high prediction quality cannot guarantee that the rule is valid. If, for example, the rough set method discovers a rule $Q \to P$ that is based on only a few observations, one might call it a "casual rule." The approximation quality in general varies from rule to rule. For example, in Table 3.2.2, the case $x_1$ leads to the following rule:
If D = 1 and V = 2 then W = 2.
The rule is based on only one case, $x_1$. Therefore, its validity is doubtful and the rule may be due to chance. We obtain the decision table as in Table 3.2.3 by attribute
value reduction. The first row of the decision table leads to the following rule:
If D ≠ 3, D ≠ 4 and V = 2 then W = 2.
The rule is based on three cases $x_1$, $x_5$, and $x_7$. From the point of view of statistics, the significance of the rule is improved. Reference [4] showed that one effective way to increase the significance is to reduce the granularity of information by using appropriate data filters on the set $V_q$, which may reduce the number of classes of $R_Q$ while at the same time keeping the dependence information. Therefore, data filtering can be used as a preprocessing step of RSDA. Reference [4] developed a simple data filtering procedure which is compatible with the rough set approach and which may result in an improved significance of rules. The main tool is the "binary information system." Considering the information system I as in (3.2.1), the associated binary information system $I^B$ is defined as follows:
$$ I^B = (U, \Omega^B, \{0, 1\}, f_q^B)_{q \in \Omega^B}. \tag{3.2.4} $$
In these information systems, every attribute has exactly two values. Roughly speaking, we obtain a binary system $I^B$ from an information system I by replacing a nonbinary attribute q with a set of attributes, each of which corresponds to an attribute value of q. The associated information function has value 1 if and only if x has this value under $f_q$. In the process of binarization no information is lost. Indeed, the information is shifted from attribute values to the attributes. The data filtering procedure is described next. Let us consider $Q \to P$, and choose some $m \in Q$. Suppose that m leads to the binary attributes $m_0, \ldots, m_k$. For each $t \in \{f_P(x) \mid x \in U\}$ do the following:
Step 1: Find the binary attributes $m_i$ for which
$$ (\forall x \in U)(f^B_{m_i}(x) = 1 \Rightarrow f_P(x) = t). $$
If there is no such $m_i$, go to Step 3.
Step 2: Build their union within m in the following sense: If, for example, $m_{i_0}, \ldots, m_{i_k}$ satisfy the condition above, then we define a new binary attribute $m_{i_0 \cdots i_k}$ by
$$ f^B_{m_{i_0 \cdots i_k}}(x) = f^B_{m_{i_0}}(x) \vee \cdots \vee f^B_{m_{i_k}}(x), $$
and simultaneously replace $m_{i_0}, \ldots, m_{i_k}$ by $m_{i_0 \cdots i_k}$.
Step 3: Collect the resulting binary attributes in m to arrive at the filtered attribute.
In the next example, Example 3.2.1 is used again to explain the binary information system and the data filtering.
Example 3.2.2. In this example, we only convert the condition attribute values in Table 3.2.2 to binary values. The result is shown in Table 3.2.4. Now we explain the data filtering process step by step as follows.
Table 3.2.4: Decision table of binary information system
                       D                   V
               D_1  D_2  D_3  D_4     V_1  V_2     W
x_1             1    0    0    0       0    1      2
x_2, x_8        0    0    1    0       0    1      3
x_3, x_6        1    0    0    0       1    0      1
x_4             0    1    0    0       1    0      1
x_5, x_7        0    1    0    0       0    1      2
x_9, x_10       0    0    0    1       0    1      3
Step 1: We find that $D_3$ and $D_4$ satisfy
$$ (\forall x \in U)(f^B_{D_3}(x) = 1 \Rightarrow f_W(x) = 3), $$
$$ (\forall x \in U)(f^B_{D_4}(x) = 1 \Rightarrow f_W(x) = 3). $$
Step 2:
$$ f^B_{D_{34}}(x) = f^B_{D_3}(x) \vee f^B_{D_4}(x). $$
Step 3: The filtered binary attributes are collected as $D = \{D_1, D_2, D_{34}\}$ and $V = \{V_1, V_2\}$. The filtered decision table is given in Table 3.2.5. In this table, the attribute values are expressed using columns (e.g., the four values of D are expressed using four columns in Table 3.2.4). Some new attributes with binary values are introduced. Therefore, we can use the same RSDA reduction algorithm for attributes to reduce the attribute values. The reduced decision table is shown in Table 3.2.6. The binary value decision table, Table 3.2.6, can be converted to the nonbinary value decision table as in Table 3.2.7, in which only three rules are needed to describe the information system of Table 3.2.1. The reduction results in Table 3.2.7 are consistent with those in Table 3.2.3. This shows that the dependence information is not lost in the data filtering process. More details about the basic process of RSDA and data filtering can be found in [3], [4], [16]. □
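The filtering steps of Example 3.2.2 can be sketched in a few lines. The code below is an illustrative reading of Steps 1 to 3, not the implementation of [4]: attribute values that each determine the same decision value are merged into one binary attribute, and all other values keep their own binary attribute. The names `filter_attribute` and `binary_columns` are hypothetical, and the rows are copied from Table 3.2.2.

```python
# Rows of Table 3.2.2: (D, V, W) for each group of indiscernible objects.
ROWS = [(1, 2, 2), (3, 2, 3), (1, 1, 1), (2, 1, 1), (2, 2, 2), (4, 2, 3)]
D_VALUES = [r[0] for r in ROWS]
V_VALUES = [r[1] for r in ROWS]
W_VALUES = [r[2] for r in ROWS]


def filter_attribute(values, decisions):
    """Group the values of one attribute into filtered binary attributes.

    Step 1: find values v such that value == v always implies one decision t.
    Step 2: merge all such values that share the same t.
    Step 3: collect merged groups and the remaining single values.
    """
    implied = {}
    for v, w in zip(values, decisions):
        implied.setdefault(v, set()).add(w)
    merged, singles = {}, []
    for v in sorted(implied):
        if len(implied[v]) == 1:
            (t,) = implied[v]
            merged.setdefault(t, []).append(v)
        else:
            singles.append((v,))
    return sorted([tuple(vs) for vs in merged.values()] + singles)


def binary_columns(values, groups):
    """One 0/1 entry per group: 1 when the row's value belongs to the group."""
    return [[1 if v in g else 0 for g in groups] for v in values]


d_groups = filter_attribute(D_VALUES, W_VALUES)
v_groups = filter_attribute(V_VALUES, W_VALUES)
print(d_groups)  # groups standing for D_1, D_2, D_34
print(v_groups)  # groups standing for V_1, V_2
print(binary_columns(D_VALUES, d_groups))
```

The columns produced for D reproduce the $D_1$, $D_2$, $D_{34}$ columns of Table 3.2.5, since the values 3 and 4 both imply W = 3 and are therefore merged.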
3.2.3 Fuzzy C-Means Clustering Algorithm
Fuzzy c-means (FCM) is a method of clustering that allows one piece of data to belong to two or more clusters. The technique is frequently used in pattern recognition and was originally introduced by Bezdek [2] in 1981 as an improvement to earlier clustering methods. It is based on minimization of the following objective function:
$$ J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} \mu_{ij}^{m} \, \|x_i - c_j\|^2, \tag{3.2.5} $$
where m is any real number greater than 1, N is the number of data points, C is the number of clusters, $\mu_{ij}$ is the degree of membership of $x_i$ in the cluster j, $x_i$ is the
Table 3.2.5: The filtered decision table
                    D                 V
               D_1  D_2  D_34     V_1  V_2     W
x_1             1    0    0        0    1      2
x_2, x_8        0    0    1        0    1      3
x_3, x_6        1    0    0        1    0      1
x_4             0    1    0        1    0      1
x_5, x_7        0    1    0        0    1      2
x_9, x_10       0    0    1        0    1      3
Table 3.2.6: Decision table after reduction
                D           V
               D_34     V_1  V_2     W
x_1             0        0    1      2
x_2, x_8        1        0    1      3
x_3, x_6        0        1    0      1
x_4             0        1    0      1
x_5, x_7        0        0    1      2
x_9, x_10       1        0    1      3
Table 3.2.7: Nonbinary decision table
                             D      V    W
x_1, x_5, x_7                *      2    2
x_2, x_8, x_9, x_10          3,4    2    3
x_3, x_4, x_6                *      1    1
(* means the attribute values are not important.)
ith element of the d-dimensional measured data, $c_j$ is the d-dimensional center of the cluster, and $\|\cdot\|$ is any norm expressing the similarity between any measured data and the center. Fuzzy partitioning is carried out through an iterative optimization of the objective function shown above, with the update of the memberships $\mu_{ij}$ and the cluster centers $c_j$ by:
$$ \mu_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \dfrac{\|x_i - c_j\|}{\|x_i - c_k\|} \right)^{\frac{2}{m-1}}}, \tag{3.2.6} $$
$$ c_j = \frac{\sum_{i=1}^{N} \mu_{ij}^{m} x_i}{\sum_{i=1}^{N} \mu_{ij}^{m}}. \tag{3.2.7} $$
This iteration will stop when $\max_{ij} |\mu_{ij}^{(k+1)} - \mu_{ij}^{(k)}| < \varepsilon$, where $\varepsilon$ is a termination criterion between 0 and 1, and k is the iteration step. This procedure converges to a local minimum or a saddle point of $J_m$. The algorithm is composed of the following steps:
Step 1: Consider a set of N data points (feature vectors) to be clustered, $X = \{x_1, x_2, \ldots, x_N\}$.
Step 2: Assume that the number of clusters, or classes, C ($2 \le C < N$), is known.
Step 3: Choose an appropriate level of cluster fuzziness m.
Step 4: Initialize the (N × C) sized membership matrix $\mu$ to random values such that $\mu_{ij} \in [0, 1]$ and $\sum_{j=1}^{C} \mu_{ij} = 1$.
Step 5: Calculate the cluster centers $c_j$ using (3.2.7), for j = 1, ..., C.
Step 6: Update the fuzzy membership matrix $\mu = [\mu_{ij}]$ according to (3.2.6).
Step 7: Repeat from Step 5 until $\|\mu^{(k+1)} - \mu^{(k)}\|$ is less than the given termination criterion $\varepsilon$.
This algorithm is the classical FCM algorithm. It is better than the classical k-means algorithm at avoiding local minima, but it is not immune to the problem. Some improved algorithms are introduced in [8]. The FCM algorithm is included in the Fuzzy Logic Toolbox of MATLAB. In this chapter, we will use the toolbox to discretize the continuous attribute values.
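Steps 1 to 7 above can be sketched in plain Python. This is a minimal illustration that assumes the standard update formulas (3.2.6) and (3.2.7) with the Euclidean norm; it is not the toolbox routine mentioned in the text. The function name `fcm`, the seed, the iteration limit, and the handling of a point that coincides with a center are our own assumptions.

```python
import math
import random


def fcm(points, n_clusters, m=2.0, eps=1e-6, max_iter=200, seed=0):
    """Fuzzy c-means on a list of equal-length tuples; returns (centers, mu)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Step 4: random memberships, each row normalized to sum to one.
    mu = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        mu.append([v / total for v in row])
    centers = []
    for _ in range(max_iter):
        # Step 5: centers as membership-weighted means, as in (3.2.7).
        centers = []
        for j in range(n_clusters):
            weights = [mu[i][j] ** m for i in range(n)]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[k] for w, p in zip(weights, points)) / total
                for k in range(dim)))
        # Step 6: membership update, as in (3.2.6).
        power = 2.0 / (m - 1.0)
        new_mu = []
        for p in points:
            dist = [math.dist(p, c) for c in centers]
            zero = [j for j, d in enumerate(dist) if d == 0.0]
            if zero:  # assumption: a point on a center belongs fully to it
                row = [1.0 / len(zero) if j in zero else 0.0
                       for j in range(n_clusters)]
            else:
                row = [1.0 / sum((dist[j] / dist[k]) ** power
                                 for k in range(n_clusters))
                       for j in range(n_clusters)]
            new_mu.append(row)
        # Step 7: stop when the largest membership change is below eps.
        change = max(abs(a - b) for r_new, r_old in zip(new_mu, mu)
                     for a, b in zip(r_new, r_old))
        mu = new_mu
        if change < eps:
            break
    return centers, mu


data = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
centers, memberships = fcm(data, n_clusters=2)
print(sorted(round(c[0], 2) for c in centers))
```

For discretizing a continuous attribute, each value can then be assigned to the cluster in which its membership is largest; that assignment rule is a common choice and is stated here as an assumption.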
3.3 Input Structure Identification
3.3.1 Problem Description
We are concerned with modeling a complex, poorly defined nonlinear system with hundreds of possible inputs and millions of input-output data pairs. Because we do
Table 3.3.1: M input-output data pairs
x_1        x_2        ...    x_N        y
x_1^1      x_2^1      ...    x_N^1      y^1
x_1^2      x_2^2      ...    x_N^2      y^2
...        ...        ...    ...        ...
x_1^M      x_2^M      ...    x_N^M      y^M
not know the relationship between the input and the output, any element that may affect the output will be regarded as a possible input. Because the complexity of the model depends to some extent on the number of input variables, it is necessary to identify the input structure before any identification method can be applied. The fundamental problem is to identify the input structure of an unknown system from a sequence of input-output data pairs. In this section, we will consider the problem with the following four characteristics: the relationship between the input variables and the output is known to be nonlinear, some of the input variables may not be related to the output, some of the input variables may be related to others in unknown manners, and all of the inputs may be noisy. Our objective is to eliminate spurious inputs and find the significant inputs to decrease the complexity of the model and increase the accuracy as much as possible. The complexity of solving such a problem depends on many factors, such as a priori system knowledge, completeness of data, and the required model form and accuracy. For a multiple input single output (MISO) system
$$ y = F(x_1, x_2, \ldots, x_N), \tag{3.3.1} $$
$x_1, x_2, \ldots, x_N$ are possible input variables and y is the output variable. Suppose that we have collected M groups of input-output data pairs, as shown in Table 3.3.1. Our objective in this section is to develop a new method based on RSDA to find a small number of significant inputs $x_r, \ldots, x_t$ that can be used to construct a nonlinear function of the form
$$ y = F(x_r, \ldots, x_t). \tag{3.3.2} $$
There are two existing methods that are widely used for input structure identification: forward selection [12] and backward selection [18]. Many modeling techniques, such as artificial neural networks (ANN), genetic algorithms (GA), statistics, etc., can be employed. However, all these techniques are computationally expensive, and the expense typically increases dramatically as the number of inputs or data points increases, leading to too many parameters to be tuned. In addition, the local minima problem is an open, unsolved problem. Rough set (RS) theory is a relatively new soft computing tool for dealing with vagueness and uncertainty. It has been combined with other techniques including ANN, GA and fuzzy sets [1, 15] to reduce their computational complexity. Our algorithm developed in this section will be based on backward selection, and it is less computationally complex than ANN and GA.
In references [13] and [14], a fast identification method based on fuzzy curves and surfaces is proposed. Our simulation results in Example 3.3.2 will show that the results obtained using the identification algorithm based on RSDA developed in [9] seem to be more reasonable than those of [13]. Before discussing details of the algorithm in [9], we state the relationship between input structure identification and RSDA. From the point of view of RS, $\{x_1, \ldots, x_N\}$ in (3.3.1) is regarded as the set of condition attributes Q and {y} is considered as the decision attribute P. Table 3.3.1 can be converted into a decision table by many methods. The main interest of RSDA is to find the least number of attributes with $Q \to P$. A set with this property is called a minimal determining set for P. For (3.3.1), from the point of view of RSDA, the input structure identification problem is equivalent to searching for the minimal determining set $\{x_r, \ldots, x_t\}$. There are two key problems, listed as follows.
(i) How to deal with continuous attributes? In real systems, most input variables are continuous. Because RSDA can only deal with discrete data, we should first discretize the continuous input variables. However, there is no uniform discretization method for continuous variables that fits all cases. In fact, the discretization method will affect the analysis results. It is thus a challenging problem to develop an algorithm which depends less on the discretization of input variables. In this section, the values of continuous variables will be divided into several intervals of equal width. Though such a simple method will not lead to the best prediction results in most cases, it does retain some important information. We will discuss the details in Section 3.3.3.
(ii) How to evaluate the prediction quality? In Section 3.2.1, $\gamma(Q \to P)$ is introduced to measure the prediction quality. It is a classical measure of prediction quality which is simple and easy to understand. But it is not a robust prediction measure. Sometimes its results are not credible because it is easily disturbed by many other factors, such as noise, the data discretization process, etc. Therefore, it is important to develop a robust prediction measure for RSDA. In [5], two information measures for uncertain data systems are introduced. We will develop a novel measure based on these results.
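The equal-width discretization mentioned in (i) can be sketched as follows. The function name `equal_width_bins`, the optional range arguments, and the convention that the right end point falls into the last interval are assumptions made for illustration.

```python
def equal_width_bins(values, n_bins, low=None, high=None):
    """Map each value to the index (0 .. n_bins - 1) of its equal-width interval."""
    low = min(values) if low is None else low
    high = max(values) if high is None else high
    if high <= low or n_bins < 1:
        raise ValueError("need high > low and at least one bin")
    width = (high - low) / n_bins
    labels = []
    for v in values:
        k = int((v - low) / width)
        # Clip so that v == high (or values outside the range) stay in range.
        labels.append(min(max(k, 0), n_bins - 1))
    return labels


# Sixteen intervals on [-4, 4], i.e. intervals of width 0.5.
labels = equal_width_bins([-4.0, -3.6, 0.0, 3.9, 4.0], 16, low=-4.0, high=4.0)
print(labels)
```

Passing the range explicitly keeps the interval boundaries fixed across data sets; when it is omitted, the observed minimum and maximum are used instead.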
3.3.2 Rough Information Measures
We first introduce two information measures defined by Düntsch et al. in [5]. The measures combine complexity of analysis with statistical uncertainty. Consider the information system (3.2.1). Denote the condition equivalence classes as $U/R_Q = \{X_1, X_2, \ldots, X_t\}$ and the decision equivalence classes as $U/R_d = \{Y_1, Y_2, \ldots, Y_s\}$, respectively. The partition induced by $R_{Q \cup d} = R_Q \cap R_d$ consists of the nonempty sets in $\{X_i \cap Y_j : 1 \le i \le t, 1 \le j \le s\}$, and its associated parameters are defined by
$$ \nu_{ij} = \frac{|X_i \cap Y_j|}{n}, \qquad \pi_i = \frac{|X_i|}{n}, \qquad \eta_{ij} = \frac{|X_i \cap Y_j|}{|X_i|}, $$
where n is the cardinality of U. It is clear that $\nu_{ij} = \pi_i \eta_{ij}$ and
$$ \sum_{i=1}^{t} \pi_i = \sum_{j=1}^{s} \eta_{ij} = 1. $$
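The parameters above are simple relative frequencies and can be computed directly from class labels. The sketch below assumes the definitions $\nu_{ij} = |X_i \cap Y_j|/n$, $\pi_i = |X_i|/n$, and $\eta_{ij} = |X_i \cap Y_j|/|X_i|$; the function name `partition_parameters` is hypothetical, and the sample labels are the D and W columns of Table 3.2.1.

```python
from collections import Counter


def partition_parameters(cond_labels, dec_labels):
    """Return (pi, eta, nu) as dictionaries keyed by class labels.

    cond_labels[k] names the condition class X_i of object k,
    dec_labels[k] names its decision class Y_j.
    """
    n = len(cond_labels)
    size_x = Counter(cond_labels)
    size_xy = Counter(zip(cond_labels, dec_labels))
    pi = {i: c / n for i, c in size_x.items()}
    nu = {ij: c / n for ij, c in size_xy.items()}
    eta = {(i, j): c / size_x[i] for (i, j), c in size_xy.items()}
    return pi, eta, nu


# Condition attribute D and decision attribute W from Table 3.2.1.
D = [1, 3, 1, 2, 2, 1, 2, 3, 4, 4]
W = [2, 3, 1, 1, 2, 1, 2, 3, 3, 3]
pi, eta, nu = partition_parameters(D, W)
print(pi)
print(eta)
```

Only nonempty intersections appear as keys, which matches the remark that the induced partition consists of the nonempty sets $X_i \cap Y_j$.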
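Now, we introduce the first information measure $H^{loc}$ as:
$$ H^{loc}(Q \to d) = \sum_{i=1}^{t} \pi_i \log_2\left(\frac{1}{\pi_i}\right) + \sum_{i=1}^{t} \sum_{j=1}^{s} \nu_{ij} \log_2\left(\frac{1}{\eta_{ij}}\right). \tag{3.3.3} $$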
This measure is based on the assumption that the structure and amount of uncertainty can be estimated by the interaction of d and Q. In fact, it is difficult to estimate the uncertainty. Another measure was proposed in [5], which considers both the deterministic rules of $Q \to d$ and the nondeterministic rules. Suppose that $V = \{X_1, X_2, \ldots, X_k\}$ is the set of the deterministic part of $Q \to d$. Define
[...] 1/n, otherwise;
and
$$ H^{det}(Q \to d) = \sum_{X_i \in V} \pi_i \log_2\left(\frac{1}{\pi_i}\right) + |U - V| \frac{1}{n} \log_2(n). \tag{3.3.4} $$
For $X_i \cap Y_j \ne \emptyset$, [...] $\omega H^{loc}(Q \to d) + (1 - \omega) H^{det}(Q \to d)$.
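Thus,
$$ H(Q \to d, \omega) = \underbrace{(1 - \omega) \sum_{X_i \in V} \pi_i \log_2\left(\frac{1}{\pi_i}\right)}_{\text{knowledge}} + \underbrace{(1 - \omega)\, |U - V|\, \frac{1}{n} \log_2(n)}_{\text{guessing}} + \underbrace{\omega \sum_{i=1}^{t} \pi_i \left( \sum_{j=1}^{s} \eta_{ij} \log_2\left(\frac{1}{\eta_{ij}}\right) + \log_2 \frac{1}{\pi_i} \right)}_{\text{probability}}. \tag{3.3.6} $$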
$H(Q \to d, \omega)$ is composed of three parts. The first part comes from the deterministic part. The second and third parts come from nondeterministic rules, which combine the randomness with probability characteristics. In order to compare the prediction effects of different attribute sets, we define the normalized information measure as follows. If $H(d) = \log_2(n)$, then
$$ S(Q \to d, \omega) = \begin{cases} 1, & R_Q = R_d; \\ 0, & \text{otherwise.} \end{cases} $$
Otherwise, if $H(d) < \log_2(n)$, then
$$ S(Q \to d, \omega) = 1 - \frac{H(Q \to d, \omega) - H(d)}{\log_2(n) - H(d)}. \tag{3.3.7} $$
If $S(Q \to d, \omega) = 1$, the prediction results are the best. If $S(Q \to d, \omega) = 0$, the results are the worst.
3.3.3 Identification Algorithm
In the previous subsection, we showed that $S(Q \to d, \omega)$ may be used as a measure of prediction success. Larger values of S mean better prediction. However, because the partitions of continuous variables are not optimal, it is not enough to consider S only. We cannot judge effectively whether the prediction is good or not by S alone. Therefore, the relative values of S are considered. Let $Q = \{x_1, x_2, \ldots, x_n\}$, $d = \{y\}$, and $Q_i = Q - \{x_i\}$. We define $S_{\max}$ and $S_{\min}$ as follows:
$$ S_{\max} = S(Q_k \to d) = \max\{S(Q_i \to d) \mid x_i \in Q\}, $$
$$ S_{\min} = S(Q_l \to d) = \min\{S(Q_i \to d) \mid x_i \in Q\}. $$
$S_{\max} = S(Q_k \to d)$ means that the prediction is better than the others if $x_k$ is deleted. $S_{\min} = S(Q_l \to d)$ means that the worst prediction happens when $x_l$ is excluded
from the set of condition attributes (or, in other words, $x_l$ should be included in the set of condition attributes if $S(Q_l \to d)$ reaches the minimum value). If $x_k$ is not related to the output y, $S_{\max}$ will be much larger than $S_{\min}$. So we define a relative coefficient $\alpha$ as
$$ \alpha = \frac{S_{\min}}{S_{\max}} = \frac{S(Q_l \to d)}{S(Q_k \to d)}. \tag{3.3.8} $$
$x_k$ is considered to be an insignificant input when $\alpha < \alpha_0$ ($0 < \alpha_0 < 1$). Usually, we take $\alpha_0 = 0.5$. The algorithm includes two parts, deleting spurious inputs and deleting interrelated inputs. Suppose that there are n inputs $x_1, x_2, \ldots, x_n$ and one output y. Part I of the algorithm is stated as follows.
Part I.
Step 1: Let $Q = \{x_1, x_2, \ldots, x_n\}$, $d = \{y\}$, and $Q_i = Q - \{x_i\}$ ($1 \le i \le n$);
Step 2: For each $x_i \in Q$, calculate $S(Q_i \to d)$;
Step 3: $S_{\max} = S(Q_k \to d) = \max\{S(Q_i \to d) \mid x_i \in Q\}$ ($1 \le k \le n$), $S_{\min} = S(Q_l \to d) = \min\{S(Q_i \to d) \mid x_i \in Q\}$ ($1 \le l \le n$);
Step 4: If $\alpha = S_{\min}/S_{\max} < \alpha_0$, then $Q = Q - \{x_k\}$. Go to Step 2;
Step 5: End.
In the end, Q becomes a set of significant inputs. These inputs may be interrelated. In Part II, we will discover all interrelated inputs in Q. Part II of the algorithm is summarized as follows.
Part II.
Step 1: Set i = 1;
Step 2: If i > |Q| then go to Step 8, else set j = 1;
Step 3: If j > |Q| then go to Step 7;
Step 4: If i = j then go to Step 6;
Step 5: Set $C = \{x_i\}$ and $D = \{x_j\}$, and compute $S(i, j) = S(C \to D)$;
Step 6: Set j = j + 1, go to Step 3;
Step 7: Set i = i + 1, go to Step 2;
Step 8: End.
In Part II, we calculate all S(i, j). If $x_i = f(x_j)$, S(j, i) will be much larger than the others. It is therefore easy to find the interrelated variables by evaluating S(j, i). We then compare S(i, j) and S(j, i) to decide which of the inputs $x_i$ or $x_j$ to keep in this case. If S(j, i) > S(i, j), we will keep $x_j$ because it is more important than $x_i$. The reason is as follows. Suppose that $x_i = f(x_j)$ and its inverse function $x_j = f^{-1}(x_i)$ exists in the given interval, which means $x_i \Leftrightarrow x_j$. In this case, whichever input is deleted, the result will not be changed because $x_i$ and $x_j$ have the same significance. Otherwise, if $x_j = f^{-1}(x_i)$ does not exist in the given interval, it means $x_j \Rightarrow x_i$ only. In this case, $x_j$ is more significant because $x_i$ does not include all information about $x_j$.
This principle leads to different results from those of [13] which will be explained in more detail in Example 3.3.2.
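The two parts of the algorithm can be sketched in Python as follows. This is only an illustrative sketch: the measure $S(C \to D)$ is passed in as a caller-supplied function (its computation from the data table is not reproduced here), and the names `delete_spurious_inputs`, `pairwise_measures`, and `toy_S` are hypothetical. Stopping when all scores are zero is an added assumption.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

# Hypothetical signature: S(C, D) returns the measure S(C -> D) computed
# elsewhere from the discretized data table.
Measure = Callable[[FrozenSet[str], FrozenSet[str]], float]


def delete_spurious_inputs(inputs: List[str], output: str, S: Measure,
                           alpha0: float = 0.5) -> List[str]:
    """Part I: drop the input x_k with the largest S(Q_k -> d) while alpha < alpha0."""
    Q = list(inputs)
    d = frozenset([output])
    while len(Q) > 1:
        # Steps 2-3: S(Q_i -> d) for Q_i = Q - {x_i}, then S_max and S_min.
        scores = {x: S(frozenset(Q) - {x}, d) for x in Q}
        x_k = max(Q, key=lambda x: scores[x])
        s_max, s_min = scores[x_k], min(scores.values())
        if s_max <= 0:
            break  # assumption: nothing to compare when every score is zero
        # Step 4: alpha = S_min / S_max decides whether x_k is spurious.
        if s_min / s_max >= alpha0:
            break
        Q.remove(x_k)
    return Q


def pairwise_measures(Q: List[str], S: Measure) -> Dict[Tuple[str, str], float]:
    """Part II: S(i, j) = S({x_i} -> {x_j}) for every ordered pair with i != j."""
    return {(xi, xj): S(frozenset([xi]), frozenset([xj]))
            for xi in Q for xj in Q if xi != xj}


# Toy stand-in for S: the output depends only on x1 and x2, so removing
# x3 or x4 leaves a large score and removing x1 or x2 leaves a small one.
def toy_S(C: FrozenSet[str], D: FrozenSet[str]) -> float:
    return 1.0 if {"x1", "x2"} <= C else 0.2


kept = delete_spurious_inputs(["x1", "x2", "x3", "x4"], "y", toy_S)
pairs = pairwise_measures(kept, toy_S)
```

With the toy measure, `kept` retains only `x1` and `x2`; the dictionary returned by `pairwise_measures` plays the role of the $S(i, j)$ table that Part II inspects.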
3.3.4 Performance Analysis of Noise Rejection

The reason RSDA is able to reject noise is similar to the reason that digital signals have better noise rejection performance than analog signals. Suppose that the random noise is defined in the interval $[-d, d]$, $d > 0$. The continuous attribute value can be divided into several intervals, and one of these intervals is $[a, b]$. The probability $p(x)$ that the attribute value will not be changed by noise is as follows.

a) When $d > (b - a)$,
$$p(x) = \frac{b - a}{2d}, \quad x \in [a, b].$$

b) When $d < (b - a)/2$,
$$p(x) = \begin{cases} \dfrac{x - a}{2d} + \dfrac{1}{2}, & x \in [a, a + d); \\[4pt] 1, & x \in [a + d, b - d]; \\[4pt] \dfrac{b - x}{2d} + \dfrac{1}{2}, & x \in (b - d, b]. \end{cases}$$
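c) When $(b - a)/2 < d < b - a$,
$$p(x) = \begin{cases} \dfrac{x - a}{2d} + \dfrac{1}{2}, & x \in [a, b - d); \\[4pt] \dfrac{b - a}{2d}, & x \in [b - d, a + d]; \\[4pt] \dfrac{b - x}{2d} + \dfrac{1}{2}, & x \in (a + d, b]. \end{cases}$$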
It is clear that the performance of noise rejection is related to {b — a). Larger intervals imply better noise rejection performance of system. But if the interval is too large, we will lose too much information about the system. Thus, it is important to choose a proper partition that can satisfy the noise rejection performance without losing too much information about the system.
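The three cases above can be checked with a single expression, assuming the noise is uniformly distributed on $[-d, d]$ (which is what the piecewise formulas imply): the probability is the length of the overlap between $[x - d, x + d]$ and $[a, b]$, divided by $2d$. The function name `prob_unchanged` is hypothetical.

```python
def prob_unchanged(x: float, a: float, b: float, d: float) -> float:
    """P(x + noise stays in [a, b]) for noise uniform on [-d, d], with a <= x <= b."""
    lo = max(a, x - d)   # left end of the part of [x-d, x+d] inside [a, b]
    hi = min(b, x + d)   # right end of that part
    return (hi - lo) / (2.0 * d)


# Interval [0, 1]: one sample point for each regime of d.
p_wide = prob_unchanged(0.5, 0.0, 1.0, 2.0)    # d > b - a
p_edge = prob_unchanged(0.1, 0.0, 1.0, 0.25)   # d < (b - a)/2, near the left end
p_core = prob_unchanged(0.5, 0.0, 1.0, 0.25)   # d < (b - a)/2, interior point
p_mid = prob_unchanged(0.5, 0.0, 1.0, 0.75)    # (b - a)/2 < d < b - a
```

Widening the interval $[a, b]$ for a fixed $d$ raises every one of these values, which is the trade-off between noise rejection and information loss described above.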
3.3.5 Numerical Examples

Example 3.3.1. Consider the following system
$$y = x_1 x_2 + 1.5x| + 5\sin(x_3), \qquad x_2 = 1 - x_1. \tag{3.3.9}$$
Assume that there are seven possible inputs $\{x_1, x_2, \ldots, x_7\}$, each defined on $[-4, 4]$. First, we randomly generate 3000 data points in the six-dimensional input space of $\{x_1, x_3, x_4, x_5, x_6, x_7\}$ and calculate $x_2$ and $y$ using (3.3.9). Second, we arrange these data pairs as in Table 3.3.1. Each input variable value is divided into 16 intervals of equal size, and the output variable value is divided into 10 intervals of equal size. Let $Q = \{x_1, x_2, x_3, x_4, x_5, x_6, x_7\}$, $d = \{y\}$, and $Q_i = Q - \{x_i\}$. The process of reduction is shown in Table 3.3.2, where $\alpha$ is calculated by (3.3.8).
Table 3.3.2: The reduction process of input variables. Entries are $S(Q_i \to d, \omega)$, scaled by a power of 10, with $\omega = 0.8$; X marks a deleted attribute.

  i        I       II      III     IV      V       VI
  x_1      0       0.05    0.44    5.85    24.3    10.7
  x_2      0       0.04    0.48    6.06    25.0    X
  x_3      0       0.27    2.72    6.17    14.2    15.4
  x_4      0.02    0.34    4.68    24.2    X       X
  x_5      0.02    0.30    4.81    X       X       X
  x_6      0.02    0.36    X       X       X       X
  x_7      0.04    X       X       X       X       X
  alpha    0       0.11    0.09    0.24    0.57    0.69
Table 3.3.3: Reduction of dependent variables. Entries are $S(i, j) = S(x_i \to x_j)$ (row $x_i$, column $x_j$).

  S(i,j)   x_1      x_2      x_3
  x_1      1        0.2574   0.0955
  x_2      0.2427   1        0.0956
  x_3      0.0955   0.0955   1
In column I of Table 3.3.2, since $S(Q_7 \to d, \omega)$ is the largest, $x_7$ will be deleted from $Q$, and we have $Q = Q_7$. In column II, $x_6$ will be deleted from $Q$. In each column, 'X' represents an attribute that has been deleted from $Q$. We continue this procedure until column V where we obtain a large value for S. If we choose $\alpha_0 = 0.5$, we can determine that $x_1$, $x_2$ and $x_3$ are the most important input variables. The results are consistent with the function in (3.3.9). In Table 3.3.3, we discover that $x_1$ and $x_2$ are related, and therefore function (3.3.9) can be expressed by $x_1$ and $x_3$.

In order to check the performance for noise rejection, we add random noise to (3.3.9). Let $x_1 = x_1 + \delta_1$ and $x_3 = x_3 + \delta_3$. The width of the intervals of $x_1$ and $x_3$ is 0.5. We discuss the problem in the following two cases.

Case I: $\delta_1, \delta_3 \in [-0.2, 0.2]$. The process of reduction is shown in Table 3.3.4. The results in Table 3.3.4 are consistent with Table 3.3.2.

Case II: $\delta_1, \delta_3 \in [-1, 1]$. The process of reduction is shown in Table 3.3.5. The important input variable, $x_3$, is deleted this time because of the large noise. •

Example 3.3.2. Consider the following system
$$x_5 = \cos(x_1), \tag{3.3.10}$$
$$x_3 = \sin(x_4 x_4), \tag{3.3.11}$$
$$x_2 = 0.5(\sin(x_6) + R),$$
$$y = \sin(6x_1)\sin(4x_2)\sin(2x_3)\sin(5.4x_4)\sin(4x_5)\sin(3.5x_6),$$
where $x_i \in (-1, 1)$ for $i = 1, 2, \ldots, 9$ are possible input variables and $R \in (-1, 1)$
is a random noise. Example 3.3.2 is more complex than Example 3.3.1. In this example, we will show how to find the dependence between attributes. Suppose that there are nine possible inputs, $\{x_1, x_2, \ldots, x_9\}$. We randomly take 3000 data points in the nine-dimensional input space of $\{x_1, x_2, \ldots, x_9\}$ and calculate the output $y$. Each input and output variable value is divided into five intervals of equal size. First, we eliminate the insignificant input variables. From Table 3.3.6, we can see that $\{x_1, x_2, x_3, x_4, x_5, x_6\}$ are significant input variables when $\alpha_0 = 0.5$. Then, we begin to find the dependent variables. $S(x_1 \to x_5)$ and $S(x_4 \to x_3)$ stand out in Table 3.3.7, implying that $x_5$ ($x_3$) is related to $x_1$ ($x_4$). Because of the noise $R$, we cannot determine the dependency between $x_2$ and $x_6$. Because $S(x_1 \to x_5) > S(x_5 \to x_1)$ and $S(x_4 \to x_3) > S(x_3 \to x_4)$, we consider that $x_1$ and $x_4$ are more important than $x_5$ and $x_3$. The significant inputs are $x_1$, $x_2$, $x_4$, and $x_6$. The result in [13] shows that $x_2$, $x_3$, $x_5$,
Table 3.3.4: The reduction process of input variables under noise (Case I). Entries are $S(Q_i \to d, \omega)$, scaled by a power of 10, with $\omega = 0.8$; X marks a deleted attribute.

  i        I       II      III     IV      V       VI
  x_1      0       0.05    0.57    6.04    23.7    10.6
  x_2      0       0.05    0.48    5.33    25.2    X
  x_3      0.02    0.30    2.64    7.16    13.8    15.0
  x_4      0.02    0.40    X       X       X       X
  x_5      0.01    0.28    4.20    23.5    X       X
  x_6      0.05    X       X       X       X       X
  x_7      0.02    0.37    4.22    X       X       X
  alpha    0       0.14    0.11    0.27    0.56    0.71
Table 3.3.5: The reduction process of input variables under noise (Case II). Entries are $S(Q_i \to d, \omega)$, scaled by a power of 10, with $\omega = 0.8$; X marks a deleted attribute.

  i        I       II      III     IV      V       VI
  x_1      0       0.02    0.41    4.27    7.28    15.7
  x_2      0       0.02    0.31    3.56    7.72    15.1
  x_3      0.02    0.24    X       X       X       X
  x_4      0.01    0.19    2.46    7.09    13.8    X
  x_5      0.01    0.19    2.80    7.46    X       X
  x_6      0.01    0.18    2.93    X       X       X
  x_7      0.02    X       4.22    X       X       X
  alpha    0       0.09    0.08    0.48    0.53    0.962
and $x_6$ are significant inputs, which is different from the present result. In fact, $x_4$ ($x_1$) cannot be represented by $x_3$ ($x_5$) because the inverse functions of (3.3.10) and (3.3.11) do not exist in the given intervals. Therefore, our result is more reasonable than that of [13] in this case. This result can also be explained by ANN (artificial neural network) training. When we use an ANN with a 4-15-1 structure and different inputs to train on the raw data, the ANN with inputs $x_1$, $x_2$, $x_4$, and $x_6$ converges much faster than the ANN with inputs $x_2$, $x_3$, $x_5$, and $x_6$. •
Table 3.3.6: The reduction process of input variables. Entries are $S(Q_i \to d, \omega)$, scaled by a power of 10, with $\omega = 0.8$; X marks a deleted attribute.

  i        I       II      III     IV
  x_1      0.34    1.93    6.78    17.54
  x_2      0.51    2.27    7.91    19.02
  x_3      0.27    1.46    5.87    16.05
  x_4      0.36    1.74    6.78    17.40
  x_5      0.28    1.54    6.11    16.48
  x_6      0.60    2.84    9.60    21.09
  x_7      0.77    3.57    X       X
  x_8      0.85    3.47    12.12   X
  x_9      0.88    X       X       X
  alpha    0.31    0.42    0.48    0.77
Table 3.3.7: Reduction of dependent variables. Entries are $S(i, j) = S(x_i \to x_j)$ (row $x_i$, column $x_j$).

  S(i,j)   x_1     x_2     x_3     x_4     x_5     x_6
  x_1      1       0.152   0.151   0.150   0.350   0.150
  x_2      0.156   1       0.156   0.156   0.157   0.165
  x_3      0.153   0.154   1       0.175   0.151   0.153
  x_4      0.150   0.152   0.359   1       0.151   0.150
  x_5      0.175   0.155   0.154   0.154   1       0.154
  x_6      0.150   0.160   0.151   0.150   0.151   1
3.4 Fuzzy Relation Model Identification

3.4.1 Preliminaries

First, we summarize the relationships and differences between rough rules and fuzzy rules. Consider the following $i$th rough rule and $i$th fuzzy rule:
Rough rule $R^i$: IF $x_1 = A_{i1}$, $x_2 = A_{i2}$, ..., $x_n = A_{in}$, THEN $y = B_i$;
Fuzzy rule $R^i$: IF $x_1$ is $A_{i1}$, $x_2$ is $A_{i2}$, ..., $x_n$ is $A_{in}$, THEN $y$ is $B_i$.

In rough rules, $A_{ij}$ ($1 \le j \le n$) and $B_i$ are certain values. For continuous values, the value corresponds to an interval, and the separating point between two neighboring intervals is a certain value. This is also called a "hard partition." On the other hand, in fuzzy rules, $A_{ij}$ ($1 \le j \le n$) and $B_i$ are fuzzy sets. Fuzzy sets are defined by a fuzzy membership function, which is also called a "soft partition." Rough rules are hard partitions and they are extracted by RSDA (rough set data analysis), which is data driven and requires no outside information. Fuzzy rules, however, depend on the fuzzy membership function, which must be defined before fuzzy rules can be extracted.

In real-world problems, most of the attribute values are continuous. There are three cases in rule matching:
Case I: The input matches the premises of the rule, and the output is consistent with the consequence of the rule;
Case II: The input matches the premises of the rule, but the output is not consistent with the consequence of the rule;
Case III: Neither the input nor the output matches the rule.

Case I is the ideal case and we can get the results directly. For Cases II and III, we should further consider the following two cases: (a) The knowledge is not included in the raw data. This case should be added to the raw system; (b) The knowledge is included in the raw data, but the rule cannot be expressed exactly because of unreasonable discretization results.

To summarize, there is a close relationship between rough rules and fuzzy rules. We can extract rough rules from the raw data by RSDA. Furthermore, fuzzy rules may be derived from rough rules by choosing reasonable fuzzy membership functions. Therefore, RSDA builds a bridge between the fuzzy model and the raw data.
A fuzzy model identification algorithm based on RSDA has been developed in [10].
3.4.2 Fuzzy Model Identification

Consider an information system that can be described by the following $m$ fuzzy rules:

IF $x_1$ is $A_{i1}$, $x_2$ is $A_{i2}$, ..., and $x_n$ is $A_{in}$, THEN $y$ is $B_i$.

For such a fuzzy relationship model, the identification procedure will include: the premise structure identification, the premise parameters identification, and the consequent parameters identification. In fact, the premise structure identification and the premise parameters identification amount to finding the optimal partition of the input space. RSDA can be applied to find important attribute values, which are very important for the input space partition. The consequent parameters can be obtained from the rough rules directly. Suppose that the value of input variable $x$ is divided into five intervals as in Figure 3.4.1. The clustering algorithm, such as the fuzzy c-means algorithm, may be
Figure 3.4.1: Rough intervals.
Figure 3.4.2: Fuzzy intervals in CASE I.
applied to obtain the intervals: $[a_1, a_2)$, $[a_2, a_3)$, $[a_3, a_4)$, $[a_4, a_5)$, $[a_5, a_6]$. We use the numbers 1 to 5 to denote the five intervals. Now we discuss how to identify the premise structures and parameters of fuzzy rules. The trapezoid and triangle fuzzy membership functions are chosen in the fuzzy model. The following two cases are considered in the identification procedure.

Case I: The significant attribute values are not in order (no two of them are adjacent). Suppose that the attribute values 1, 3, 5 are extracted by RSDA as the key attribute values. The input space can be described by three fuzzy subsets as in Figure 3.4.2.

Case II: Some important attribute values are in order (adjacent). Suppose that the attribute values 2, 3, 5 are key attribute values. Because the attribute values 2 and 3 are both key attribute values, they cannot be merged directly. Therefore, we define $\mu(a_3) = 0.5$ and choose $x_1$ to be larger than the maximum value among those cases whose attribute value is 2 and $x_2$ to be less than the minimum value among those cases whose attribute value is 3. The results can be described as in Figure 3.4.3. The detailed modeling procedure will be explained in the next subsection.
Figure 3.4.3: Fuzzy intervals in CASE II.
3.4.3 Simulation

In this subsection, we use the fuzzy model of the last subsection to analyze rock slope stability. Rock slope stability estimation is an important activity in the design and construction of slope engineering and open pit mine excavation. Because the geological data obtained are often uncertain and fuzzy, it is difficult to estimate slope stability, and traditional methods are not very effective. Feng et al. succeeded in using an artificial neural network (ANN) to analyze this problem in [6]. But the time for ANN training is often very long, and the relationship between the parameters and the model is not easy to explain because the ANN model is a black box model.

The raw input-output data pairs include 82 slope cases as in Table 3.4.1, which consist of 44 failure slopes and 38 stable slopes. There are six condition attributes $Q = \{\gamma, C, \phi, \psi_f, H, \gamma_u\}$ and two decision attributes $P = \{SF, SS\}$, where
$\gamma$: unit weight of the rock (kN/m^3);
$C$: cohesion of the rock (kPa);
$\phi$: inherent friction of the rock (degree);
$\psi_f$: slope angle (degree);
$H$: slope height (m);
$\gamma_u$: void pressure ratio;
$SF$: safety factor;
$SS$: stability.

All attributes except $SS$ are continuous-valued variables. In particular, the decision attribute $SF$ is continuous. Such a system cannot be modeled by the traditional RSDA directly. The model of the safety factor $SF$ will be addressed in detail in Section 3.5. In this subsection, we build a fuzzy model for the stability of the rock slope. The rock slope stability $SS$ is described by two values: stable (T) or failure (F). The algorithm in this section will be applied for modeling. In order to compare with [6], we use the same 71 cases to train the model and another 11 cases to verify the model. The detailed modeling procedure is as follows.
Table 3.4.1: The 82 cases data for rock slope stability analysis [6]

  No.  gamma   C       phi     psi_f   H       gamma_u  SF      SS
  1    12.00   0.0     30      35      8.00    0        0.86    failure
  2    23.47   0.0     32      37      214.00  0        1.08    failure
  3    16.00   70.00   20      40      115.0   0        1.00    failure
  4    20.41   24.91   13      22      10.67   0.35     1.40    stable
  5    19.63   11.97   20      22      12.19   0.405    1.35    failure
  6    21.82   8.62    32      28      12.80   0.49     1.03    failure
  7    20.41   33.52   11      16      45.72   0.20     1.28    failure
  8    18.84   15.32   30      25      10.67   0.38     1.63    stable
  9    18.84   0       20      20      im      0.45     1.05    failure
  10   25.00   120.0   45      53      120.0   0        1.30    stable
  11   25.00   55      36      45      239.0   0.25     1.71    stable
  12   25.0    63      32      44.5    239.0   0.25     1.49    stable
  13   25.0    63      32      46      300     0.25     1.45    stable
  14   25.0    48      40      45      330     0.25     1.62    stable
  15   31.3    68.6    37      47.5    262.5   0.25     1.20    failure
  16   31.3    68.6    37      47      270     0.25     1.20    failure
  17   31.3    58.8    35.5    47.5    438.5   0.25     1.20    failure
  18   31.3    58.8    35.5    47.5    502.7   0.25     1.20    failure
  19   31.3    68.0    37      47      360.5   0.25     1.20    failure
  20   31.3    68.0    37      8       305.5   0.25     1.20    failure
  21   18.68   26.34   15      35      8.23    0        1.11    failure
  22   16.50   11.49   0       30      3.66    0        1.00    failure
  23   18.84   14.36   25      20      30.50   0        1.875   stable
  24   18.84   57.46   20      20      30.50   0        2.045   stable
  25   28.84   29.42   35      35      100.0   0        1.78    stable
  26   28.84   39.23   38      35      100.0   0        1.99    stable
  27   20.60   16.28   26.5    30      40.0    0        1.25    failure
  28   14.8    0       17      20      50      0        1.13    failure
  29   14.00   11.97   26      30      88      0        1.02    failure
  30   21.43   0       20      20      61.00   0.50     1.03    failure
  31   19.06   11.71   28      35      21.00   0.11     1.09    failure
  32   18.84   14.36   25      20      30.50   0.45     1.11    failure
  33   21.51   6.94    30      31      76.8    0.38     1.01    failure
  34   14.00   11.97   26      30      88.00   0.45     0.625   failure
  35   18.00   24.00   30.15   45      20.00   0.12     1.12    failure
  36   23.00   0.0     20      20.00   100.0   0.30     1.20    failure
  37   22.40   100     45      45      15.0    0.25     1.80    failure
  38   22.40   10.00   35      45      10.00   0.40     0.90    failure
  39   20.00   20.00   36      45      50.00   0.50     0.83    failure
  40   20.00   0.00    36      45      50.0    0.25     0.79    failure
  41   20.00   0.00    36      45      8.0     0.50     0.67    failure
  42   22.00   0.00    40      33      8.0     0.35     1.45    failure
  43   24.00   0.00    40      33      8.0     0.30     1.58    failure
  44   20.00   0.00    24.5    20      8.00    0.35     1.37    stable
  45   18.00   5.0     30.0    20      8.00    0.30     2.05    stable
  46   27.00   40.0    35.0    43      420     0.25     1.15    failure
  47   27.00   50.0    40.0    42      407     0.25     1.44    stable
  48   27.00   35.0    35.0    42      359.0   0.25     1.27    stable
  49   27.00   37.50   35.0    37.8    320     0.25     1.24    stable
  50   27.00   32.0    33      42.6    301     0.25     1.16    failure
  51   27.00   32.0    33      42.4    289     0.25     1.30    stable
  52   27.30   14      31      41      110     0.25     1.249   stable
  53   27.30   31.50   29.7    41      135     0        1.245   stable
  54   27.3    16.8    28      50      90.5    0        1.252   stable
  55   27.3    26      31      50      92      0        1.246   stable
  56   27.3    10      39      41      551     0        1.434   stable
  57   27.3    10      39      40      470     0        1.418   stable
  58   25.0    46      35      47      443     0        1.28    stable
  59   25.0    46      35      44      435     0        1.37    stable
  60   25.0    46      35      46      432     0        1.23    stable
  61   26.0    150     45      30      200     0        1.20    stable
  62   18.5    25.0    0       30      6.0     0        1.09    failure
  63   18.5    12.0    0       30      6.0     0        0.78    failure
  64   22.4    10.0    35      30      10.0    0        2.00    stable
  65   21.40   10.00   30.34   30      20.0    0        1.70    stable
  66   22.00   20.0    36      45      50.0    0        1.02    failure
  67   22.00   0.0     36      45      50.0    0        0.89    stable
  68   12.00   0.0     30      45      4.0     0        1.46    stable
  69   12.00   0.0     30      45      8.0     0        0.80    failure
  70   12.00   0.0     30      45      4.0     0        1.44    stable
  71   31.3    68.0    37      49      200.5   0        1.2     failure
  72   20.00   20.0    36      45      50      0        0.96    failure
  73   27.00   40.0    35      47.1    292.0   0        1.15    failure
  74   25.0    46.0    35      50      284.0   0        1.34    stable
  75   31.3    68      37      46      366.0   0        1.20    failure
  76   25.0    46      36      44.5    299.0   0        1.55    stable
  77   27.3    10      39      40      480.0   0        1.45    stable
  78   25.0    46      35      46      393.0   0        1.31    stable
  79   25.0    48      40      49      330.0   0        1.49    stable
  80   31.3    68.6    37      47      305     0        1.20    failure
  81   25.0    55      36      45.5    299     0        1.52    stable
  82   31.3    68.0    37      47      213.0   0        1.20    failure
Step 1: Discretization. First, we use the fuzzy clustering algorithm to divide each attribute into several classes. Then the data table is converted into a decision table. After checking the consistency of the decision table, we get the intervals for each attribute as in Table 3.4.2.

Step 2: RSDA. The data filtering algorithm in [4] is applied to find the significant attribute values. Table 3.4.3 shows the reduction results. After data filtering, the number of condition attributes is still 6 ($\gamma$, $C$, $\phi$, $\psi_f$, $H$, $\gamma_u$); no condition attribute can be removed. Because all attributes in [6] were determined by the analysis and experience of experts, it is reasonable to believe that none of the attributes is redundant. All of them are very important in determining the stability of the rock slope system. The number of significant attribute values, however, decreases dramatically: the 29 attribute values in Table 3.4.2 are reduced to 13 significant attribute values in Table 3.4.3. So the complexity of the "IF-THEN" rules is decreased. In the end, 47 rough rules are extracted to describe the stability of the rock slope using six attributes and 13 significant attribute values.
Table 3.4.2: The partitions after eliminating the inconsistent rules

  Attribute   Intervals
  gamma       [0, 17.2), [17.2, 21), [21, 24.5), [24.5, 26.5), [26.5, 30), [30, 35]
  C           [0, 3), [3, 18.2), [18.2, 43), [43, 80), [80, 150]
  phi         [0, 8), [8, 22), [22, 34), [34, 43), [43, 50]
  psi_f       [5, 27), [27, 39), [39, 60]
  H           [0, 7.8), [7.8, 25), [25, 160), [160, 291), [291, 330), [330, 390), [390, 600]
  gamma_u     [0, 0.05], [0.05, 0.32], [0.32, 0.6]
Table 3.4.3: The significant attributes and partitions after reduction

  Attribute   Attribute Value   Interval
  gamma       2, 4, 5, 6        [17.2, 21), [24.5, 26.5), [26.5, 30), [30, 35]
  C           1, 3              [0, 3), [18.2, 43)
  phi         2, 3              [8, 22), [22, 34)
  psi_f       2                 [27, 39)
  H           2, 4, 6           [7.8, 25), [160, 291), [330, 390)
  gamma_u     2                 [0.05, 0.32]
Step 3: Identify the fuzzy premise structure and parameters. The input space can be divided as in Figure 3.4.4 based on the 13 significant attribute values. Finally, we use the 11 test cases to verify the fuzzy model. All prediction results are consistent with the raw data.
Figure 3.4.4: The fuzzy membership function curves of condition attributes (panels include $\gamma$ (kN/m^3) and $C$ (kPa)).
3.5 ANN Modeling Based on Rough Sets

3.5.1 Problem Definition

RSDA is a tool for analyzing the relationship and dependence in data. But the RS model is intended to deal with structural discrete (qualitative) data. Continuous (quantitative) aspects are only of secondary interest [3,4].
ANN is an AI tool. Funahashi proved that any continuous function can be approximated by an ANN with three layers [7]. Both quantitative and qualitative information can be dealt with using ANN. But training an ANN is always a time-consuming procedure. In order to approximate a complex system, the structure of the ANN may have to be very complex. In this case, an effective method is to divide the single network into several simpler subnetworks. The output is the combination of the subnetwork outputs. The two key problems are how to partition the system into subnetworks and how to combine the subnetwork outputs. Sakar et al. used a fuzzy integral to combine the outputs [17]. But calculating a fuzzy integral is a sophisticated process, and the fuzzy membership function must be assigned manually. Even if the system can be modeled by ANN, the model is still a black box: the meaning of the parameters in the ANN is not transparent.

A technique to combine the ANN with RSDA has been developed in [11]. Rules are extracted from data by RSDA. The system can be divided into several subsystems by fuzzy clustering based on decision attributes. Then the outputs of the subsystems are combined by a rough membership function. In contrast to a fuzzy integral, a rough membership function is obtained from data, which makes it an objective parameter. In general, it is easier to calculate a rough membership degree than a fuzzy integral.
3.5.2 Matching Degree and Fitting Degree

Before further discussion, we introduce some definitions. Suppose that $m$ ($Q \to P$) rules are extracted from data by RSDA. The $i$th rule is

$R^i$: IF $f_{q_1}(x) = t_{1i}, \ldots, f_{q_n}(x) = t_{ni}$, THEN $f_{p_1}(x) = s_{1i}, \ldots, f_{p_k}(x) = s_{ki}$,

where $t_{ji} \in V_{q_j}$ and $s_{ji} \in V_{p_j}$. For an input vector $IN = \{in_1, in_2, \ldots, in_n\}$, function $g_i(k)$ is defined as
$$g_i(k) = \begin{cases} 1, & in_k = t_{ki}; \\ 0, & \text{otherwise}, \end{cases} \tag{3.5.1}$$
$k = 1, 2, \ldots, n$, $i = 1, 2, \ldots, m$. The matching degree of input vector $IN$ to the $i$th rule is:
$$d_i = \frac{1}{n}\sum_{k=1}^{n} g_i(k). \tag{3.5.2}$$
Rules are extracted from the raw data. Different rules have different reliability. The rough membership function can reflect the reliability of rules in a sense. From Definition 3.2.4, the rough membership degree of the $j$th attribute value of the $i$th rule to the decision equivalence class $X$ is:
$$\mu_X(t_{ji}) = \frac{|[t_{ji}] \cap X|}{|[t_{ji}]|}, \tag{3.5.3}$$
where $[t_{ji}]$ denotes the set of cases whose attribute $q_j$ takes the value $t_{ji}$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$, and $X = [s_{1i}, s_{2i}, \ldots, s_{ki}]_P$.
A large value of $\mu_X(t_{ji})$ implies a large possibility of getting the decision from $t_{ji}$. If $\mu_X(t_{ji}) = 1$, then the consequence is obtained directly. The fitting degree $\mu_i$ of the input vector $IN$ to the $i$th rule is defined as:
$$\mu_i = \max_{1 \le k \le n} \{\mu_X(t_{ki})\, g_i(k)\}. \tag{3.5.4}$$
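The matching degree and the fitting degree can be illustrated with a short Python sketch. It assumes that $g_i(k)$ is the indicator of whether the $k$th input value equals the $k$th premise value of the rule, and that the rough membership degrees $\mu_X(t_{ki})$ have already been computed from data; the function names and the sample numbers are hypothetical.

```python
from typing import List, Sequence


def g(premise: Sequence[int], inputs: Sequence[int]) -> List[int]:
    """Assumed indicator g_i(k): 1 where the input value matches the premise value."""
    return [1 if t == v else 0 for t, v in zip(premise, inputs)]


def matching_degree(premise: Sequence[int], inputs: Sequence[int]) -> float:
    """d_i: fraction of premise attribute values matched by the input vector."""
    hits = g(premise, inputs)
    return sum(hits) / len(hits)


def fitting_degree(premise: Sequence[int], inputs: Sequence[int],
                   rough_membership: Sequence[float]) -> float:
    """mu_i: largest rough membership degree among the matched premise values."""
    return max(mu * hit for mu, hit in zip(rough_membership, g(premise, inputs)))


# Hypothetical rule with three condition attributes (discretized values).
rule_premise = (1, 3, 2)
rule_membership = (0.5, 1.0, 0.8)   # assumed mu_X(t_ki) for this rule
input_vector = (1, 2, 2)            # second attribute does not match

d_i = matching_degree(rule_premise, input_vector)
mu_i = fitting_degree(rule_premise, input_vector, rule_membership)
```

Here two of the three premise values match, so `d_i` is 2/3, while `mu_i` is 0.8 because the attribute with membership degree 1.0 is the one that fails to match.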
... > 0 is a positive constant. Next, we define the fuzzy rule base for the fuzzy hyperbolic model.

Definition 4.2.1. Given a plant with $n$ state variables $x = (x_1, \ldots, x_n)^T$ and $p$ input variables $u = (u_1, \ldots, u_p)^T$, we call the fuzzy rule base the fuzzy hyperbolic rule base if it satisfies the following conditions [13,14]:

(1) The fuzzy rule is given as follows:

IF $x_1$ is $F_{x_1}$, ..., $x_n$ is $F_{x_n}$, $u_1$ is $F_{u_1}$, ..., and $u_p$ is $F_{u_p}$, THEN $\dot{x}_l = \pm c_{x_1} \pm \cdots \pm c_{x_n} \pm c_{u_1} \pm \cdots \pm c_{u_p}$, $l = 1, \ldots, n$,

where $F_{x_i}$ ($i = 1, \ldots, n$) and $F_{u_j}$ ($j = 1, \ldots, p$) are fuzzy sets of $x_i$ and $u_j$, which include $P_z$ (positive) and $N_z$ (negative); $c_{x_i}$ ($i = 1, \ldots, n$) and $c_{u_j}$ ($j = 1, \ldots, p$) are positive constants corresponding to $F_{x_i}$ and $F_{u_j}$; and "$\pm$" stands for either the plus or the minus sign. The actual signs in the THEN-part are determined in the following manner: if in the IF-part the term characterizing $F_{x_i}$ ($F_{u_j}$) is $P_z$, then in the THEN-part $c_{x_i}$ ($c_{u_j}$) appears with a plus sign; otherwise, $c_{x_i}$ ($c_{u_j}$) appears with a minus sign.
Section 4.2 Fuzzy Hyperbolic Model
(2) The state variables and the input variables in the IF-part and the constant terms in the THEN-part are all optional. The constant terms $c_{x_i}$ ($c_{u_j}$) in the THEN-part must correspond to $F_{x_i}$ ($F_{u_j}$) in the IF-part; i.e., if the term $F_{x_i}$ ($F_{u_j}$) exists in the IF-part, $c_{x_i}$ ($c_{u_j}$) must appear in the THEN-part; otherwise, $c_{x_i}$ ($c_{u_j}$) does not appear.

(3) If $\dot{x}_l$ is in the THEN-part, and $m$ ($m \le n$) fuzzy variables (including state variables and input variables) appear in the IF-part, then $\dot{x}_l$ corresponds to a total of $2^m$ fuzzy rules; that is, all the possible $P_z$ and $N_z$ combinations of state variables and input variables in the IF-part, and all the sign combinations of constants in the THEN-part.

We use the following fuzzy model to represent a complex multiple-input multiple-output continuous system:
$$\begin{pmatrix} \dot{x}_1 \\ \vdots \\ \dot{x}_n \end{pmatrix} = A \begin{pmatrix} \tanh(k_1 x_1) \\ \vdots \\ \tanh(k_n x_n) \end{pmatrix} + B \begin{pmatrix} u_1 \\ \vdots \\ u_p \end{pmatrix}, \tag{4.2.2}$$
where $x = (x_1, \ldots, x_n)^T$ is the state vector, $u = (u_1, \ldots, u_p)^T$ is the input vector, $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times p}$, and $k_i$ ($i = 1, \ldots, n$) are positive constants from fuzzy membership functions defined by (4.2.1). Define $K_x = \mathrm{diag}(k_1, \ldots, k_n)$. Then (4.2.2) can be abbreviated to:
$$\dot{x} = A \tanh(K_x x) + Bu. \tag{4.2.3}$$
We call (4.2.3) a fuzzy hyperbolic model (FHM). •
In the following we will show that the fuzzy hyperbolic model can easily be derived from linguistic information concerning the plant. The following theorem explains how a fuzzy hyperbolic model is constructed.

Theorem 4.2.1. Given the fuzzy rule base of Definition 4.2.1, and the membership functions of $P_z$ (positive) and $N_z$ (negative) in the form of (4.2.1), we can always derive the following model:
$$\dot{x} = A \tanh(K_x x) + Bu, \tag{4.2.4}$$
where $K_x = \mathrm{diag}(k_{x_1}, \ldots, k_{x_n})$ and $A$ and $B$ are constant matrices.

Proof. For any $\dot{x}_l$ ($l = 1, \ldots, n$), assume that there are $m$ ($m \le n$) state variables and $q$ ($q \le p$) input variables in the IF-part. Applying the product-inference rule, singleton fuzzifier, and the gravity center defuzzifier to the fuzzy rule, we have:
$$\dot{x}_l = F/G,$$
Chapter 4. Identification of the Fuzzy Hyperbolic Model
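where
$$F = (c_{x_1} + \cdots + c_{x_m} + c_{u_1} + \cdots + c_{u_q})\, \mu_{P_{x_1}}(x_1) \cdots \mu_{P_{x_m}}(x_m)\, \mu_{P_{u_1}}(u_1) \cdots \mu_{P_{u_q}}(u_q) + \cdots$$
$$= \sum_{i=1}^{m} (c_{x_i}\mu_{P_{x_i}} - c_{x_i}\mu_{N_{x_i}}) \prod_{j=1, j \ne i}^{m} (\mu_{P_{x_j}} + \mu_{N_{x_j}}) \prod_{j=1}^{q} (\mu_{P_{u_j}} + \mu_{N_{u_j}}) + \sum_{i=1}^{q} (c_{u_i}\mu_{P_{u_i}} - c_{u_i}\mu_{N_{u_i}}) \prod_{j=1}^{m} (\mu_{P_{x_j}} + \mu_{N_{x_j}}) \prod_{j=1, j \ne i}^{q} (\mu_{P_{u_j}} + \mu_{N_{u_j}}),$$
$$G = \mu_{P_{x_1}}(x_1) \cdots \mu_{P_{x_m}}(x_m)\, \mu_{P_{u_1}}(u_1) \cdots \mu_{P_{u_q}}(u_q) + \cdots = \prod_{i=1}^{m} (\mu_{P_{x_i}} + \mu_{N_{x_i}}) \prod_{j=1}^{q} (\mu_{P_{u_j}} + \mu_{N_{u_j}}).$$
Thus,
$$\dot{x}_l = \sum_{i=1}^{m} c_{x_i} \frac{e^{k_{x_i} x_i} - e^{-k_{x_i} x_i}}{e^{k_{x_i} x_i} + e^{-k_{x_i} x_i}} + \sum_{j=1}^{q} c_{u_j} \frac{e^{k_{u_j} u_j} - e^{-k_{u_j} u_j}}{e^{k_{u_j} u_j} + e^{-k_{u_j} u_j}} = \sum_{i=1}^{m} c_{x_i} \tanh(k_{x_i} x_i) + \sum_{j=1}^{q} c_{u_j} \tanh(k_{u_j} u_j).$$
Define $a_l = (c_{x_1}, \ldots, c_{x_m}, 0, \ldots, 0)$ and $b_l = (c_{u_1}, \ldots, c_{u_q}, 0, \ldots, 0)$. We have $\dot{x}_l = a_l \tanh(K_x x) + b_l \tanh(K_u u)$. The above equation means that $\dot{x}_l$ is a linear combination of $\tanh(K_x x)$ and $\tanh(K_u u)$. For all $\dot{x}_l$ ($l = 1, \ldots, n$), we have:
$$\dot{x} = A \tanh(K_x x) + B_u \tanh(K_u u), \tag{4.2.5}$$
where $A = (a_1^T, a_2^T, \ldots, a_n^T)^T$ and $B_u = (b_1^T, b_2^T, \ldots, b_n^T)^T$. Consequently, we can obtain the FHM (4.2.2) by linearizing (4.2.5) in $u$; namely, $\dot{x} = A \tanh(K_x x) + Bu$, where $B = B_u K_u$. □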
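The key step of the proof, that product inference with centre-average defuzzification over all sign patterns collapses to a sum of hyperbolic tangents, can be checked numerically. The sketch below assumes that the membership functions in (4.2.1) have the Gaussian form $\mu_{P}(z) = e^{-(z-k)^2/2}$ and $\mu_{N}(z) = e^{-(z+k)^2/2}$; this form, the function names, and the sample numbers are assumptions made for illustration, and state and input variables are treated uniformly as a list `z`.

```python
import itertools
import math
from typing import Sequence


def mu_pos(z: float, k: float) -> float:
    # Assumed "positive" membership function: exp(-(z - k)^2 / 2).
    return math.exp(-0.5 * (z - k) ** 2)


def mu_neg(z: float, k: float) -> float:
    # Assumed "negative" membership function: exp(-(z + k)^2 / 2).
    return math.exp(-0.5 * (z + k) ** 2)


def rule_base_output(z: Sequence[float], c: Sequence[float],
                     k: Sequence[float]) -> float:
    """F/G: weighted average of the consequents of all 2^m rules."""
    num = 0.0
    den = 0.0
    for signs in itertools.product((1, -1), repeat=len(z)):
        weight = 1.0       # product of memberships in the IF-part
        consequent = 0.0   # signed sum of constants in the THEN-part
        for s, zi, ci, ki in zip(signs, z, c, k):
            weight *= mu_pos(zi, ki) if s > 0 else mu_neg(zi, ki)
            consequent += s * ci
        num += consequent * weight
        den += weight
    return num / den


def hyperbolic_output(z: Sequence[float], c: Sequence[float],
                      k: Sequence[float]) -> float:
    """Closed form: sum of c_i * tanh(k_i * z_i)."""
    return sum(ci * math.tanh(ki * zi) for zi, ci, ki in zip(z, c, k))


z_sample = (0.3, -1.2, 0.7)
c_sample = (4.0, 8.0, 1.5)
k_sample = (0.4, 0.2, 1.0)
gap = abs(rule_base_output(z_sample, c_sample, k_sample)
          - hyperbolic_output(z_sample, c_sample, k_sample))
```

Under the assumed membership functions the two computations agree to rounding error, because $(\mu_P - \mu_N)/(\mu_P + \mu_N) = \tanh(kz)$ for each variable.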
From the definition of FHM, it is clear that the FHM is a novel fuzzy model with a very simple structure. The state matrix of this model is a matrix hyperbolic function of state variables, and the input matrix is a linear constant matrix. In the next subsection, we will discuss the characteristics of the FHM.
4.2.2 Characteristics of the Fuzzy Hyperbolic Model

There are some distinguishing characteristics of the FHM, which are summarized as follows:
(1) The FHM is a nonlinear model. Unlike the T-S fuzzy model, which is a combination of local linear models, the FHM is a global model.
(2) Because the FHM is a global model, we can design a global optimal controller and analyze the stability of the closed-loop system. If $x$ is located in a small neighborhood of the origin, then we have $\tanh(kx) \approx kx$. Linear control theory can therefore be applied to the FHM.
(3) The identification of the structure and antecedent parameters of each fuzzy rule is not needed, which greatly reduces the computational burden and complexity. The FHM is suitable for modeling complex plants.
(4) The FHM can easily be derived from known linguistic information. We can easily construct an FHM if we know some linguistic information about the relationship between the derivatives of the state variables and the state variables (input variables).
(5) Similar to the T-S fuzzy model, we can design a neural network model to identify the model parameters of the FHM.
4.2.3 Neural Network Implementation of the FHM

In this subsection, we will prove that the FHM can also be viewed as a neural network model [13]. First we give the structure of the neural network. The proposed network is a three-layer feedforward neural network. In this structure, the input and output nodes of the network represent state variables and derivatives of state variables, respectively. The number of hidden layer nodes is the same as that of input nodes, and there are no cross links between input nodes and hidden nodes. The activation function of the hidden layer is the hyperbolic tangent function, and the activation of the output layer is the linear function. Figure 4.2.1 shows the proposed neural network, in which $x$ is the state vector, $u$ is the input vector, and $k_i$ ($i = 1, \ldots, n$), $g_j$ ($j = 1, \ldots, p$), $c_{ij}$ ($i = 1, \ldots, n$, $j = 1, \ldots, n$), and $d_{ij}$ ($i = 1, \ldots, n$, $j = 1, \ldots, p$) are the weights to be adjusted. If we set $f_1(x) = \tanh(x)$ and $f_2(x) = x$, then we can easily derive the following state-space model from Figure 4.2.1:
$$\dot{x} = A \tanh(K_x x) + B \tanh(K_u u), \tag{4.2.6}$$
where $K_x = \mathrm{diag}(k_1, \ldots, k_n)$, $K_u = \mathrm{diag}(g_1, \ldots, g_p)$, and $A$ and $B$ are constant
Figure 4.2.1: The network structure of the FHM.
matrices composed of $c_{ij}$ and $d_{ij}$. We can see that (4.2.6) is the same as (4.2.4). Thus, we have derived the neural network implementation of the FHM.

Remark 4.2.1. Unlike other neural networks, the initial values of the network weights cannot be chosen randomly. Because the model is actually a fuzzy model, the initial values of the network weights of the model should be chosen by expert experience. The weights can be adapted with the common error back-propagation learning algorithm (the BP algorithm) [11] or other learning algorithms for feedforward neural networks. The BP learning algorithm of the network weights can be described by:
$$c_{ij}(t + 1) = c_{ij}(t) - \alpha(\hat{\dot{x}}_i - \dot{x}_i) \tanh(k_j(t) x_j),$$
$$k_j(t + 1) = k_j(t) - \alpha \sum_{i=1}^{n} (\hat{\dot{x}}_i - \dot{x}_i)\, c_{ij}(t)\, x_j \tanh'(k_j(t) x_j), \tag{4.2.7}$$
where $\alpha > 0$ is the learning rate, $\hat{\dot{x}}_i$ is the model output, $\dot{x}_i$ is the actual output of the plant, and $\tanh'(x)$ is the derivative of $\tanh(x)$. In the next subsection, we will apply the above BP learning algorithm in some examples. •

The above model cannot approximate every real plant to any degree of accuracy because of the odd function characteristic of the FHM. However, in Chapter 10, it will be shown that the designed controller based on this model can stabilize the real plant with good performance. Next, we study how to construct the FHM.
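One BP update of the weights can be sketched in plain Python as a gradient-descent step on the half squared error between the model output and the plant output. For brevity the sketch covers only the state part of the model, $\hat{\dot{x}}_i = \sum_j c_{ij}\tanh(k_j x_j)$; the function names, the sample state, and the target values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def fhm_output(C: Sequence[Sequence[float]], k: Sequence[float],
               x: Sequence[float]) -> List[float]:
    """Model output: row i is sum_j C[i][j] * tanh(k[j] * x[j])."""
    h = [math.tanh(kj * xj) for kj, xj in zip(k, x)]
    return [sum(cij * hj for cij, hj in zip(row, h)) for row in C]


def bp_step(C: Sequence[Sequence[float]], k: Sequence[float],
            x: Sequence[float], target: Sequence[float],
            lr: float) -> Tuple[List[List[float]], List[float]]:
    """One gradient step on E = 0.5 * sum_i (model_i - target_i)^2."""
    h = [math.tanh(kj * xj) for kj, xj in zip(k, x)]
    err = [yi - ti for yi, ti in zip(fhm_output(C, k, x), target)]
    n_out, n_hid = len(C), len(k)
    # dE/dk_j = sum_i err_i * c_ij * x_j * tanh'(k_j x_j), tanh' = 1 - tanh^2
    new_k = [k[j] - lr * sum(err[i] * C[i][j] for i in range(n_out))
             * x[j] * (1.0 - h[j] ** 2) for j in range(n_hid)]
    # dE/dc_ij = err_i * tanh(k_j x_j)
    new_C = [[C[i][j] - lr * err[i] * h[j] for j in range(n_hid)]
             for i in range(n_out)]
    return new_C, new_k


def half_squared_error(C, k, x, target) -> float:
    return 0.5 * sum((yi - ti) ** 2
                     for yi, ti in zip(fhm_output(C, k, x), target))


# Expert-chosen starting weights (illustrative), one training sample.
C0 = [[0.0, 4.0], [0.0, 0.0]]
k0 = [0.4, 0.2]
x_sample = [0.3, 0.5]
target = [0.5, -0.3]

loss_before = half_squared_error(C0, k0, x_sample, target)
C1, k1 = bp_step(C0, k0, x_sample, target, lr=0.05)
loss_after = half_squared_error(C1, k1, x_sample, target)
```

For a sufficiently small learning rate the error on the sample decreases after the step; starting from expert-chosen weights, as the remark recommends, only changes `C0` and `k0`.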
4.2.4 Modeling Process

We now investigate methods for modeling the FHM. Fuzzy systems are knowledge-based systems constructed from human knowledge in the form of fuzzy IF-THEN rules. An important contribution of fuzzy systems theory is that it provides a systematic procedure for transforming a knowledge base
into a nonlinear mapping [10]. The FHM can be constructed from linguistic information concerning the plant. On the other hand, since the FHM can also be viewed as a neural network model, we can choose initial values of the network weights of the model by expert experience and then optimize model parameters by using the BP learning algorithm.
Incorporating Linguistic Information

To show that the FHM can easily be derived from linguistic information concerning the plant, let us use an example to illustrate the modeling process.

Example 4.2.1. Consider the inverted pendulum system depicted in Figure 4.2.2 [1,2]. $x_1(t)$ denotes the pendulum's angle and $x_2(t)$ denotes its angular velocity. $g = 9.8$ m/sec^2 is the acceleration due to gravity, $M$ is the mass of the cart, $m$ is the mass of the pole, $2l$ is the pole's length, and $u$ is the control force. The system's dynamical equations are:
$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = F(x_1, x_2) + G(x_1, x_2)u,$$
where
$$F(x_1, x_2) = \frac{g \sin x_1 - \dfrac{m l x_2^2 \cos x_1 \sin x_1}{m + M}}{l\left(\dfrac{4}{3} - \dfrac{m \cos^2 x_1}{m + M}\right)}, \qquad G(x_1, x_2) = \frac{\dfrac{\cos x_1}{m + M}}{l\left(\dfrac{4}{3} - \dfrac{m \cos^2 x_1}{m + M}\right)}.$$
In this section, we set $m = 0.2$ kg, $M = 1$ kg and $l = 0.5$ m. After analyzing the system we can derive the following verbal knowledge: (1) $\dot{x}_1$ is related to $x_2$; and (2) $u$ is related to $\dot{x}_2$. We can construct the following fuzzy rule
Figure 4.2.2: Inverted pendulum system.
base according to the above knowledge:
$R^1$: IF $x_2$ is $P_{x_2}$, THEN $\dot{x}_1 = 4$;
$R^2$: IF $x_2$ is $N_{x_2}$, THEN $\dot{x}_1 = -4$;
$R^3$: IF $u$ is $P_u$, THEN $\dot{x}_2 = 8$;
$R^4$: IF $u$ is $N_u$, THEN $\dot{x}_2 = -8$;
where 4 and 8 are constants chosen by experience. The next step is to define the membership functions of the fuzzy sets. Similarly, we choose $k_{x_1} = 0.4$ rad/sec, $k_{x_2} = 0.2$ rad/sec, and $k_u = 1$ N by experience. Then we can derive the following equation from Theorem 4.2.1:
$$\dot{x}_1 = 4 \tanh(0.2 x_2), \qquad \dot{x}_2 = 8 \tanh(u). \tag{4.2.8}$$
The final step is to linearize (4.2.8) in $u$ to obtain:
$$\dot{x} = A \tanh(K_x x) + Bu, \tag{4.2.9}$$
which is the FHM of the inverted pendulum system, where
$$A = \begin{pmatrix} 0 & 4 \\ 0 & 0 \end{pmatrix}, \qquad K_x = \begin{pmatrix} 0.4 & 0 \\ 0 & 0.2 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 \\ 8 \end{pmatrix}.$$
This example shows how to construct a fuzzy hyperbolic model from linguistic information. Next, we optimize the parameters of the FHM. •
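The model (4.2.9) is easy to evaluate numerically. The following is a minimal sketch, assuming $A = [[0, 4], [0, 0]]$ and $B = [0, 8]^T$ (the values implied by (4.2.8) once $\tanh(u)$ is replaced by $u$) and $K_x = \mathrm{diag}(0.4, 0.2)$; the forward-Euler helper `simulate` and its step size are illustrative choices, not part of the text.

```python
import numpy as np

# Parameters of Example 4.2.1. A and B are assumed from (4.2.8) with tanh(u) ~ u.
A = np.array([[0.0, 4.0],
              [0.0, 0.0]])
K_x = np.diag([0.4, 0.2])
B = np.array([0.0, 8.0])


def fhm_rhs(x, u):
    """Right-hand side of (4.2.9): x_dot = A tanh(K_x x) + B u."""
    return A @ np.tanh(K_x @ x) + B * u


def simulate(x0, u_of_t, dt=0.001, steps=1000):
    """Forward-Euler integration of the FHM (hypothetical helper, chosen for brevity)."""
    x = np.asarray(x0, dtype=float)
    trajectory = [x.copy()]
    for i in range(steps):
        x = x + dt * fhm_rhs(x, u_of_t(i * dt))
        trajectory.append(x.copy())
    return np.array(trajectory)


# With zero input and zero angular velocity the model stays at rest.
trajectory = simulate([0.1, 0.0], lambda t: 0.0)
```

A higher-order integrator would normally be preferred for longer horizons; Euler is used here only to keep the sketch short.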
Using the BP Learning Algorithm
In the previous subsection, we sought to construct a fuzzy hyperbolic model from linguistic information concerning the plant. Here, however, we use the BP learning algorithm to determine parameters that perform the best approximation (i.e., make the model as close to the plant as possible). While the BP learning algorithm tries to pick the best parameters, there is no guarantee that it will succeed in achieving the best approximation. So we first choose the initial network weight values of the model by expert experience and then optimize the model parameters by using the BP learning algorithm. Example 4.2.2. Starting with the model obtained in Example 4.2.1, if we choose $x_1 = \pi\sin(t)/10$ and $x_2 = \pi\cos(t)/10$ [7], after 1000 learning steps of the BP learning algorithm, we obtain the new model parameters as:
$$A = \begin{bmatrix} 0.01 & -0.24 \\ 1.26 & 0 \end{bmatrix}, \qquad K_x = \begin{bmatrix} 1.02 & 0 \\ 0 & 0.47 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ 0.12 \end{bmatrix}.$$
Simulation results are shown in Figures 4.2.3 and 4.2.4. □
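The idea of starting from expert-chosen values and then refining them by gradient descent can be illustrated on a one-dimensional toy problem. The sketch below is not the authors' BP implementation: it fits a scalar model $\hat y = a\tanh(kx)$ to synthetic data with hand-written gradients, and the data, learning rate and step count are all assumptions made for illustration.

```python
import numpy as np


def loss_and_grads(a, k, x, y):
    """Mean squared error of y_hat = a * tanh(k * x) and its gradients w.r.t. a and k."""
    t = np.tanh(k * x)
    err = a * t - y
    loss = np.mean(err ** 2)
    grad_a = np.mean(2.0 * err * t)
    grad_k = np.mean(2.0 * err * a * (1.0 - t ** 2) * x)
    return loss, grad_a, grad_k


# Synthetic "plant" data (hypothetical): y = 2 tanh(0.5 x).
x = np.linspace(-2.0, 2.0, 41)
y = 2.0 * np.tanh(0.5 * x)

# Initial guess standing in for the values chosen by expert experience.
a, k = 1.0, 1.0
initial_loss, _, _ = loss_and_grads(a, k, x, y)

learning_rate = 0.05
for _ in range(500):
    _, grad_a, grad_k = loss_and_grads(a, k, x, y)
    a -= learning_rate * grad_a
    k -= learning_rate * grad_k

final_loss, _, _ = loss_and_grads(a, k, x, y)
```

As the text notes, such a procedure lowers the error but does not guarantee the best approximation, which is why a sensible starting point matters.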
Figure 4.2.3: Comparison of the inverted pendulum's angle (dotted line) and the FHM's angle (solid line) after 1000 steps of learning, starting from the same initial condition $x_0 = \{20^\circ, 0.5\}$.
Figure 4.2.4: Comparison of the inverted pendulum's angular velocity (dotted line) and the FHM's angular velocity (solid line) after 1000 steps of learning, starting from the same initial condition $x_0 = \{20^\circ, 0.5\}$.
4.3 Generalized Fuzzy Hyperbolic Model
The fuzzy hyperbolic model (FHM) is a nonlinear model that is suitable for representing nonlinear dynamic properties. It is easier to design a stable and optimal controller based on the FHM than on other models such as the T-S fuzzy model (see Chapter 10). However, due to its structure, the FHM cannot approximate every well-behaved continuous nonlinear function to an arbitrary degree of accuracy. In other words, it is not a universal approximator. In this section, we extend the results of the previous section and develop a generalized fuzzy hyperbolic model (GFHM). A GFHM can be expressed as the sum of an FHM with generalized input variables and a constant matrix. The state matrix of the model is the hyperbolic function of generalized state variables. Furthermore, we prove that the GFHM is a universal approximator. Finally, we present a technique for identifying the GFHM.
4.3.1 Definition of the Generalized Fuzzy Hyperbolic Model
In Section 4.2, the membership functions of the $z$th input variable $x_z$ of the FHM, $P_z$ and $N_z$, are defined as:
$$\mu_{P_z}(x_z) = e^{-\frac{1}{2}(x_z - k_z)^2}, \qquad \mu_{N_z}(x_z) = e^{-\frac{1}{2}(x_z + k_z)^2}, \tag{4.3.1}$$
where $k_z > 0$. We can see that only two fuzzy sets are used to represent the input variable, and that the fuzzy sets cannot cover the whole input space. It is therefore impossible for the model to be a universal approximator. Now, we define new variables $\bar x_i$ by transforming the input variable $x_z$ as follows:
$$\bar x_i = x_z - d_i, \tag{4.3.2}$$
where $i = 1, \dots, w$ ($w$ is a positive integer) and $d_i$ is a constant. We call the input variables after transformation, $\bar x_i = x_z - d_i$ ($i = 1, \dots, w$), generalized input variables. We can see that after the transformation of $x_z$, the fuzzy sets may cover the whole input space if $w$ is large enough. Before defining the GFHM, we first give the definition of generalized input variables and the generalized fuzzy hyperbolic rule base.
Definition 4.3.1 (cf. [15]). Given a plant with $n$ input variables $x_1(t), \dots, x_n(t)$, define the generalized input variables as follows:
$$\bar x_1 = x_1 - d_{11},\ \dots,\ \bar x_{w_1} = x_1 - d_{1w_1},\ \bar x_{w_1+1} = x_2 - d_{21},\ \dots,\ \bar x_m = x_n - d_{nw_n},$$
where $m = \sum_{i=1}^{n} w_i$ is the number of generalized input variables, $w_z$ ($z = 1, \dots, n$) represent the number of transformations associated with each $x_z$, and $d_{zj}$ ($z = 1, \dots, n$, $j = 1, \dots, w_z$) are constants that define the transformations. □
Definition 4.3.2 (cf. [15]). Given a plant with $n$ input variables $x_1(t), \dots, x_n(t)$ and an output variable $y$, define the generalized input variables as in Definition 4.3.1. We call the fuzzy rule base the generalized fuzzy hyperbolic rule base if it satisfies the following conditions:
(1) The $l$th fuzzy rule takes the following form ($l = 1, \dots, 2^m$):
$R^l$: IF $(x_1 - d_{11})$ is $F_{x_{11}}$, ..., $(x_1 - d_{1w_1})$ is $F_{x_{1w_1}}$, $(x_2 - d_{21})$ is $F_{x_{21}}$, ..., $(x_2 - d_{2w_2})$ is $F_{x_{2w_2}}$, ..., and $(x_n - d_{nw_n})$ is $F_{x_{nw_n}}$, THEN
$$y^l = c_{F_{11}} + \cdots + c_{F_{1w_1}} + c_{F_{21}} + \cdots + c_{F_{2w_2}} + \cdots + c_{F_{nw_n}},$$
where $w_z$ ($z = 1, \dots, n$) represent the number of transformations associated with each $x_z$, and $d_{zj}$ ($z = 1, \dots, n$, $j = 1, \dots, w_z$) are constants that define the transformations, $F_{x_{zj}}$ are fuzzy sets of $x_z - d_{zj}$ which include subsets $P_z$ (positive) and $N_z$ (negative), and $c_{F_{zj}}$ are constants corresponding to $F_{x_{zj}}$.
(2) The constants $c_{F_{zj}}$ ($z = 1, \dots, n$, $j = 1, \dots, w_z$) in the THEN-part correspond to $F_{x_{zj}}$ in the IF-part. That is, if there is $F_{x_{zj}}$ in the IF-part, $c_{F_{zj}}$ must appear in the THEN-part; otherwise, $c_{F_{zj}}$ does not appear in the THEN-part.
(3) There are $s = 2^m$ fuzzy rules in the rule base, where $m = \sum_{i=1}^{n} w_i$; that is, all the possible $P_z$ and $N_z$ combinations of input variables in the IF-part and all the linear combinations of constants in the THEN-part. □
In the sequel, the generalized input variable $\bar x_i$ will be replaced by $x_i$ to simplify notation.
Theorem 4.3.1. For a multiple-input single-output system, $y = f(x_1, x_2, \dots, x_n)$, define the generalized input variables as in Definition 4.3.1 and the generalized fuzzy hyperbolic rule base as in Definition 4.3.2, respectively, and define the membership functions of the generalized input variables $P_z$ and $N_z$ as in (4.3.1). We can then
derive the following model:
$$y = \sum_{i=1}^{m} \frac{c_{P_i} e^{k_i x_i} + c_{N_i} e^{-k_i x_i}}{e^{k_i x_i} + e^{-k_i x_i}} = \sum_{i=1}^{m} p_i + \sum_{i=1}^{m} q_i \frac{e^{k_i x_i} - e^{-k_i x_i}}{e^{k_i x_i} + e^{-k_i x_i}} = P + Q\tanh(K_x x), \tag{4.3.3}$$
where $p_i = (c_{P_i} + c_{N_i})/2$, $q_i = (c_{P_i} - c_{N_i})/2$, $P = \sum_{i=1}^{m} p_i$, $Q = [q_1, \dots, q_m]$ is a constant matrix, $x_i$ ($i = 1, \dots, m$, $m = \sum_{z=1}^{n} w_z$) is the generalized input variable after the linear transformation of $x_z$ ($z = 1, \dots, n$), $\tanh(K_x x)$ is defined as $\tanh(K_x x) = [\tanh(k_1 x_1), \dots, \tanh(k_m x_m)]^T$, and $K_x = \mathrm{diag}[k_1, \dots, k_m]$. We call (4.3.3) the generalized fuzzy hyperbolic model (GFHM).
Proof. By applying the product-inference rule, the singleton fuzzifier, and the center of gravity defuzzifier to the generalized fuzzy hyperbolic rule base, we have:
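$$y = \frac{u}{v},$$
where
$$u = (c_{P_1} + \cdots + c_{P_m})\,\mu_{P_1} \cdots \mu_{P_m} + (c_{N_1} + c_{P_2} + \cdots + c_{P_m})\,\mu_{N_1}\mu_{P_2} \cdots \mu_{P_m} + \cdots + (c_{N_1} + \cdots + c_{N_m})\,\mu_{N_1} \cdots \mu_{N_m},$$
$$v = \mu_{P_1}\mu_{P_2} \cdots \mu_{P_m} + \mu_{N_1}\mu_{P_2} \cdots \mu_{P_m} + \cdots + \mu_{N_1}\mu_{N_2} \cdots \mu_{N_m}.$$
Then,
$$y = \sum_{i=1}^{m} \frac{c_{P_i}\mu_{P_i} + c_{N_i}\mu_{N_i}}{\mu_{P_i} + \mu_{N_i}} = \sum_{i=1}^{m} \frac{c_{P_i} e^{-\frac{1}{2}(x_i - k_i)^2} + c_{N_i} e^{-\frac{1}{2}(x_i + k_i)^2}}{e^{-\frac{1}{2}(x_i - k_i)^2} + e^{-\frac{1}{2}(x_i + k_i)^2}}. \tag{4.3.4}$$
From (4.3.4), we have
$$y = \sum_{i=1}^{m} \frac{c_{P_i} e^{k_i x_i} + c_{N_i} e^{-k_i x_i}}{e^{k_i x_i} + e^{-k_i x_i}}.$$
Let $p_i = (c_{P_i} + c_{N_i})/2$ and $q_i = (c_{P_i} - c_{N_i})/2$. We have
$$y = \sum_{i=1}^{m} p_i + \sum_{i=1}^{m} q_i \frac{e^{k_i x_i} - e^{-k_i x_i}}{e^{k_i x_i} + e^{-k_i x_i}} = P + Q\tanh(K_x x),$$
which is the same as (4.3.3). □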
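The identity established in the proof can be checked numerically. The sketch below enumerates all $2^m$ rules explicitly (product inference with Gaussian memberships of the form $e^{-\frac{1}{2}(x \mp k)^2}$, center-of-gravity defuzzification) and compares the result with the closed form $P + Q\tanh(K_x x)$. The function names and the sample values are hypothetical and serve only as an illustration.

```python
import itertools

import numpy as np


def gfhm_from_rules(x, k, c_p, c_n):
    """Evaluate the rule base directly: sum over all 2^m P/N combinations."""
    m = len(x)
    mu_p = np.exp(-0.5 * (x - k) ** 2)
    mu_n = np.exp(-0.5 * (x + k) ** 2)
    numerator = 0.0
    denominator = 0.0
    for choice in itertools.product((True, False), repeat=m):
        is_positive = np.array(choice)
        # Product inference: multiply the memberships selected by this rule.
        weight = np.prod(np.where(is_positive, mu_p, mu_n))
        # THEN-part: sum of the constants matching the IF-part.
        consequent = np.sum(np.where(is_positive, c_p, c_n))
        numerator += weight * consequent
        denominator += weight
    return numerator / denominator


def gfhm_closed_form(x, k, c_p, c_n):
    """Closed form y = P + Q tanh(K_x x) with p_i, q_i as defined in the theorem."""
    p = (c_p + c_n) / 2.0
    q = (c_p - c_n) / 2.0
    return p.sum() + q @ np.tanh(k * x)


# Arbitrary sample values (hypothetical).
x = np.array([0.3, -1.2, 0.7])
k = np.array([1.0, 0.5, 2.0])
c_p = np.array([1.5, -0.4, 2.0])
c_n = np.array([-1.0, 0.8, 0.3])

y_rules = gfhm_from_rules(x, k, c_p, c_n)
y_closed = gfhm_closed_form(x, k, c_p, c_n)
```

The direct enumeration costs $O(2^m)$ operations, whereas the closed form is linear in $m$, which is the practical benefit of the theorem.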
Remark 4.3.1. The differences between the GFHM and the FHM are summarized as follows: (1) The input variables of the GFHM are generalized input variables, which are transformed from the original input variables. (2) After the linear transformation of $x$, we may increase the number of fuzzy rules until the model approximates a nonlinear function to an arbitrary accuracy. (3) $c_{P_i}$ and $c_{N_i}$ need not be opposite numbers, and we can choose them arbitrarily. □
From the above description, we can see that the GFHM is a generalization of the FHM.
4.3.2 Distinguishing Characteristics of the GFHM There are some distinguishing characteristics of the GFHM that are summarized as follows: (1) The GFHM is a nonlinear model. Unlike the T-S fuzzy model, which is a combination of local linear models, the GFHM is a global nonlinear model. (2) The GFHM is a fuzzy model that can easily be derived from known linguistic information. (3) Similar to the T-S fuzzy model, we can design a neural network model to identify the model parameters. Our goal of extending the FHM to the GFHM is to develop a new fuzzy model that can uniformly approximate any nonlinear function over a compact set.
4.3.3 Approximation Capability of the GFHM
Next, we will show that the GFHM can uniformly approximate any nonlinear function over $U$ to any degree of accuracy if $U$ is compact; that is, the GFHM is a universal approximator. We will prove that the generalized fuzzy hyperbolic model is a universal approximator using the Stone-Weierstrass Theorem (see Lemma 2.5.1 for details).
Theorem 4.3.2. Let $Y$ be the set of all generalized fuzzy hyperbolic models given by Theorem 4.3.1. For any given real continuous function $g$ on the compact set $U \subset \mathbb{R}^n$ and an arbitrary $\varepsilon > 0$, there exists $f \in Y$ such that
$$\sup_{x \in U} |g(x) - f(x)| < \varepsilon. \tag{4.3.5}$$
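Proof. First, we prove that $(Y, d_\infty)$ is an algebra. Let $f_1, f_2 \in Y$. We can write them as
$$f_1(x) = \sum_{i_1=1}^{m_1} \frac{c_{P_{i_1}} e^{k_{i_1} x_{i_1}} + c_{N_{i_1}} e^{-k_{i_1} x_{i_1}}}{e^{k_{i_1} x_{i_1}} + e^{-k_{i_1} x_{i_1}}}, \tag{4.3.6}$$
$$f_2(x) = \sum_{i_2=1}^{m_2} \frac{c_{P_{i_2}} e^{k_{i_2} x_{i_2}} + c_{N_{i_2}} e^{-k_{i_2} x_{i_2}}}{e^{k_{i_2} x_{i_2}} + e^{-k_{i_2} x_{i_2}}}. \tag{4.3.7}$$
We have
$$f_1(x) + f_2(x) = \sum_{i_1=1}^{m_1} \frac{c_{P_{i_1}} e^{k_{i_1} x_{i_1}} + c_{N_{i_1}} e^{-k_{i_1} x_{i_1}}}{e^{k_{i_1} x_{i_1}} + e^{-k_{i_1} x_{i_1}}} + \sum_{i_2=1}^{m_2} \frac{c_{P_{i_2}} e^{k_{i_2} x_{i_2}} + c_{N_{i_2}} e^{-k_{i_2} x_{i_2}}}{e^{k_{i_2} x_{i_2}} + e^{-k_{i_2} x_{i_2}}}. \tag{4.3.8}$$
It is easy to see that (4.3.8) has the same form as (4.3.3), that is, $f_1 + f_2 \in Y$. In the same way, we can obtain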
$$f_1(x) f_2(x) = \sum_{i_1=1}^{m_1} \sum_{i_2=1}^{m_2} \frac{\left(c_{P_{i_1}} e^{k_{i_1} x_{i_1}} + c_{N_{i_1}} e^{-k_{i_1} x_{i_1}}\right)\left(c_{P_{i_2}} e^{k_{i_2} x_{i_2}} + c_{N_{i_2}} e^{-k_{i_2} x_{i_2}}\right)}{\left(e^{k_{i_1} x_{i_1}} + e^{-k_{i_1} x_{i_1}}\right)\left(e^{k_{i_2} x_{i_2}} + e^{-k_{i_2} x_{i_2}}\right)}.$$
[...] $c_{P_i} > 0$ and $c_{N_i} > 0$ ($i = 1, \dots, m$), that is, any $f \in Y$ with $c_{P_i} > 0$ and $c_{N_i} > 0$ serves as the required $f$. From (4.3.3), it is obvious that $Y$ is a set of real continuous functions on $U$. The universal approximation theorem is therefore a direct consequence of the Stone-Weierstrass theorem. □
4.3.4 Identification of the GFHM
There are two main tasks in designing a fuzzy rule-based system (FRBS). One is to select fuzzy operators for inference. The other is to obtain an accurate knowledge base comprising the knowledge that is known about the problem to be solved. The latter is more important and more difficult. For the GFHM proposed in this section, we do not need to identify the premise structure of each fuzzy rule. The problem therefore reduces to determining the consequent parameters automatically. The genetic algorithm (GA) was introduced by Holland [5]. GA is an exploratory search and optimization method that simulates the evolutionary process in nature and in genetics. In this subsection, based on the method proposed in [3], in which GA and evolution strategies (ES) are used together to obtain the optimization result, we introduce a variable length real matrix coding scheme in which "chromosomes" in each generation are matrices that do not necessarily have the same number of rows. With this encoding method, after the optimization by GA and ES, we establish both the best structure of the GFHM and the best parameter values. First, we compare the complexity of the GFHM and T-S fuzzy model (Chapter 2 or [9]) identification methods. The results are given in Table 4.3.1.
Table 4.3.1: Comparison of the complexity of two identification methods
Model | T-S fuzzy model | GFHM
Number of input variables | $n$ | $n$
Number of fuzzy subsets of the $i$th input variable | $m_i$ | $m_i$
Fuzzy membership function | $e^{-a(x - k_x)^2}$ | $e^{-\frac{1}{2}(x - k_x)^2}$
Number of rule bases | $\prod_{i=1}^{n} m_i$ | $2^{\sum_{i=1}^{n} m_i/2}$
Number of unknown parameters | $\sum_{i=1}^{n} 2m_i + \prod_{i=1}^{n} m_i (n + 1)$ | $4\sum_{i=1}^{n} m_i/2$
From Table 4.3.1, we can conclude that the number of rule bases in the GFHM is larger than that in the T-S fuzzy model, but the number of unknown parameters of the GFHM is much smaller. Because the structure is known for all rules, the complexity of the identification depends on the number of unknown parameters. Let us consider the extreme situation as follows:
$$\lim_{n \to \infty,\ m_i \to \infty} \frac{4\sum_{i=1}^{n} m_i/2}{\sum_{i=1}^{n} 2m_i + \prod_{i=1}^{n} m_i (n + 1)} = 0.$$
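The limit can be made concrete by counting parameters for a few problem sizes. The sketch below assumes the two parameter counts read from Table 4.3.1, namely $\sum_i 2m_i + \prod_i m_i (n+1)$ for the T-S fuzzy model and $4\sum_i m_i/2$ for the GFHM; the function names are hypothetical.

```python
from math import prod


def ts_parameter_count(m):
    """Unknown parameters of the T-S model: sum(2 m_i) + prod(m_i) * (n + 1)."""
    n = len(m)
    return sum(2 * m_i for m_i in m) + prod(m) * (n + 1)


def gfhm_parameter_count(m):
    """Unknown parameters of the GFHM: 4 * sum(m_i / 2), which equals 2 * sum(m_i)."""
    return 2 * sum(m)


# Ratio GFHM / T-S for n inputs with four fuzzy subsets each.
ratios = {}
for n in (2, 4, 6):
    m = [4] * n
    ratios[n] = gfhm_parameter_count(m) / ts_parameter_count(m)
```

For four fuzzy subsets per input, the ratio falls from 0.25 at $n = 2$ to below 0.002 at $n = 6$, because the product term in the T-S count grows exponentially with $n$ while the GFHM count grows linearly.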
In some sense, the complexity of the GFHM identification is much less than that of the T-S fuzzy model identification. As the number of input variables and the number of fuzzy subsets increase, the computational complexity relative to the T-S fuzzy model is greatly reduced. Therefore, it is reasonable to think that the GFHM is more suitable for describing complex tasks than the traditional T-S fuzzy model. Now, we discuss how to identify the parameters of the GFHM by GA and ES. Typically, there are five basic components in a GA: (1) A genetic representation or encoding of chromosomes for potential solutions to a problem; (2) A method to create an initial population; (3) A fitness (or objective) function to evaluate each chromosome; (4) Operators such as crossover and mutation to perform an evolutionary process; (5) Working parameters such as population size, probabilities of applying genetic operators, and termination criteria. We will present this scheme in detail as follows.
(1) Encoding. If the number of generalized input variables is $m$, then we have
$$y = \sum_{i=1}^{m} \frac{c_{P_i} e^{k_i x_i} + c_{N_i} e^{-k_i x_i}}{e^{k_i x_i} + e^{-k_i x_i}},$$
where $m = \sum_{i=1}^{n} w_i$. We can see from the above equation that there are $4m$ independent variables that need to be identified. In this section, we use matrix encoding as follows:
$$\begin{bmatrix} d_{11} & k_{11} & c_{P_{11}} & c_{N_{11}} \\ d_{12} & k_{12} & c_{P_{12}} & c_{N_{12}} \\ \vdots & \vdots & \vdots & \vdots \\ d_{1w_1} & k_{1w_1} & c_{P_{1w_1}} & c_{N_{1w_1}} \\ \vdots & \vdots & \vdots & \vdots \\ d_{n1} & k_{n1} & c_{P_{n1}} & c_{N_{n1}} \\ \vdots & \vdots & \vdots & \vdots \\ d_{nw_n} & k_{nw_n} & c_{P_{nw_n}} & c_{N_{nw_n}} \end{bmatrix} \tag{4.3.11}$$
where $w_i \in [w_i^-, w_i^+]$ is an integer chosen randomly and $w_i^-$, $w_i^+$ are constants. In order to determine both the best structure of the GFHM and the best parameter values at the same time, we will use a variable length matrix encoding scheme, i.e., chromosomes in each generation are matrices that do not necessarily have the same number of rows.
(2) Evaluation of the GFHM. The evaluation of the GFHM involves both accuracy and complexity. We use the quadratic sum of $e$, the output error between the GFHM and a practical plant, to represent the model accuracy. A smaller value of $e$ implies higher accuracy of the GFHM. We use $m$ to denote the number of generalized input variables, which indicates the model complexity. A smaller value of $m$ indicates lower complexity of the model. Based on the above analysis, the following definition is used to represent the individual fitness value of the chromosomes:
imwi 1 + e^ fitness^)
imw2 m
= — # ^ ^ ,
(4.3.12)
where $g(i)$ represents the individual adaptability, $\mathit{fitness}(i)$ is the normalized fitness of the $i$th individual, $e$ is the identification error of each chromosome, $m = \sum_{i=1}^{n} w_i$, and $w_1, w_2$ are initial weight values.
(3) Crossover. We use a max-min linear crossover operator. First, select $m_c$ pairs of individuals from the original population according to percentage $\delta_1$, where $m_c = \delta_1 N_p/2$. The crossover probability is $p_c$. We adopt the roulette wheel selection method, i.e., for an arbitrary random number $r$ in the interval $(0, 1)$, let
$$\sigma(i) = \sigma(i - 1) + \mathit{fitness}(i - 1), \qquad \sigma(0) = 0, \quad i = 1, 2, \dots, N_p,$$
in which $N_p$ is the population size. If $\sigma(i) > r$, put the $i$th individual into the mating pool. The crossover proceeds as follows: select two submatrices of the same dimension from two different parent chromosomes. Suppose that $c^t = (c_1, c_2, c_3, c_4)$ is a row from the first submatrix and $c'^t = (c'_1, c'_2, c'_3, c'_4)$ is a row from the other submatrix. Then the following four offspring are generated:
$$c_1^{t+1} = \lambda_1 c^t + \lambda_2 c'^t,$$
$$c_2^{t+1} = \lambda_2 c^t + \lambda_1 c'^t,$$
$$c_3^{t+1} = \{\min(c_k, c'_k),\ k = 1, 2, 3, 4\},$$
$$c_4^{t+1} = \{\max(c_k, c'_k),\ k = 1, 2, 3, 4\}.$$
Choose the two best of the above four offspring as the resulting descendants, where $\lambda_1 > 0$, $\lambda_2 > 0$, $\lambda_1 + \lambda_2 < 2$.
(4) Mutation. We use Michalewicz's [6] nonuniform mutation operation. Select $n$ individuals from the population after the crossover according to percentage $\delta_2$, where $n = \delta_2 N_p$. Assume that $c^t = (c_1, c_2, c_3, c_4)$ is a chromosome, the element $c_k$ is selected for this mutation, and $c_k \in [c_{kl}, c_{kr}]$. The resulting individual is $c^{t+1} = \{c_i^{t+1} \mid \text{if } i = k,\ c_i^{t+1} = c'_k;\ \text{else } c_i^{t+1} = c_i,\ i = 1, 2, 3, 4\}$, where
$$c'_k = \begin{cases} c_k + \Delta(t, c_{kr} - c_k), & \beta = 0; \\ c_k - \Delta(t, c_k - c_{kl}), & \beta = 1; \end{cases} \qquad \Delta(t, y) = y\left(1 - r^{(1 - t/T)^b}\right),$$
where $\beta$ is a random number that takes values of zero or one, $r$ is a random number in the interval $[0, 1]$, $T$ is the maximum number of generations, $k \in \{1, 2, 3, 4\}$, and $b$ is a parameter chosen by the user.
(5) Evolution strategy. After a GA generation is determined, the evolution strategy (ES) will be applied on a percentage, $\delta$, of the best individuals existing in the current generation. The operation is as follows:
MCT(^)
where $p$ is the relative frequency of successful mutations (after which chromosomes are changed), $c$ is a constant and has normal distribution, and $\sigma$ takes the value of 1 in the first ES generation.
(6) Stopping condition. If a predetermined stopping condition is satisfied, the process ends. In this subsection, the stopping condition is the number of generations. After the process ends, the fitness value of each individual is calculated. We use the values of the individual whose fitness value is the greatest as the optimal parameters of the GFHM.
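The crossover and mutation operators of steps (3) and (4) can be sketched on a single chromosome row $(d, k, c_P, c_N)$. The code below is a simplified illustration under stated assumptions: the second arithmetic offspring is taken as $\lambda_2 c + \lambda_1 c'$, the mutation step uses the standard Michalewicz form $\Delta(t, y) = y(1 - r^{(1 - t/T)^b})$, and all names and numerical values are hypothetical.

```python
import numpy as np


def max_min_crossover(c, c_prime, lam1, lam2):
    """Return the four candidate offspring of two parent rows."""
    c = np.asarray(c, dtype=float)
    c_prime = np.asarray(c_prime, dtype=float)
    return [
        lam1 * c + lam2 * c_prime,   # arithmetic combination
        lam2 * c + lam1 * c_prime,   # mirrored combination (assumed form)
        np.minimum(c, c_prime),      # element-wise minimum
        np.maximum(c, c_prime),      # element-wise maximum
    ]


def nonuniform_delta(t, y, T, b, r):
    """Delta(t, y) = y * (1 - r ** ((1 - t / T) ** b)); shrinks to 0 as t approaches T."""
    return y * (1.0 - r ** ((1.0 - t / T) ** b))


def nonuniform_mutation(c, k, lower, upper, t, T, b, rng):
    """Mutate gene k of row c while keeping it inside [lower, upper]."""
    c = np.array(c, dtype=float)
    r = rng.random()            # random number in [0, 1)
    beta = rng.integers(0, 2)   # random bit: 0 or 1
    if beta == 0:
        c[k] += nonuniform_delta(t, upper - c[k], T, b, r)
    else:
        c[k] -= nonuniform_delta(t, c[k] - lower, T, b, r)
    return c


rng = np.random.default_rng(0)
parent_a = [0.5, 1.0, 2.0, -1.0]
parent_b = [-0.5, 2.0, 1.0, 3.0]
offspring = max_min_crossover(parent_a, parent_b, lam1=0.7, lam2=0.3)
mutated = nonuniform_mutation(parent_a, k=2, lower=-5.0, upper=5.0,
                              t=10, T=200, b=2.0, rng=rng)
```

Selecting the two best of the four offspring requires evaluating the fitness of each candidate, which is omitted here.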
4.3.5 Numerical Examples
To demonstrate the effectiveness of the modeling method proposed in this section, the following three examples are considered. Example 4.3.1. Consider
$$y = \exp(\sin x_1) + 5\ln(x_2/\exp(x_2) + 1), \qquad -2 < x_1, x_2 < 2,$$
where $x_1$, $x_2$ are input variables, and $y$ is the output variable. In the simulation experiment, we generate 1681 pairs of numbers $\{(x_1(k), x_2(k)),\ k = 1, 2, \dots, 1681\}$ that are uniformly distributed in the input space and have the function value $y$ as
the training targets. The parameters in the training process are chosen as: $w_i^- = 1$, $w_i^+ = 6$, $i = 1, 2$, $\delta_1 = 0.7$, $\delta_2 = 0.1$, with the initial population size $N = 200$, selection rate $p_s = 1$, crossover rate $p_c = 0.8$, mutation rate $p_m = 0.1$, weight coefficients $\omega_c = 0.4$ and $\omega_M = 0.6$. After 200 generations, we can derive the following GFHM parameters: $w_1 = 4$, $w_2 = 4$,
Input | $d$ | $k$ | $c_P$ | $c_N$
$x_1$ | -1.0164 | 1.9268 | 1.1051 | -2.2238
$x_1$ | 1.0222 | 1.9268 | 2.2291 | -1.0930
$x_1$ | -1.5883 | 2.6630 | 2.3344 | -1.8342
$x_1$ | 1.5991 | 2.6630 | -2.3212 | 1.8715
$x_2$ | -1.4251 | 1.5232 | 15.7150 | 18.3885
$x_2$ | 1.4251 | 1.5232 | 18.3885 | -15.7150
$x_2$ | 0.2268 | 1.4811 | -8.6240 | -16.6595
$x_2$ | -0.2268 | 1.4811 | 16.6595 | -8.6240
i.e., there are four generalized input variables corresponding to each of $x_1$ and $x_2$. The above matrix shows the parameters of the FHM in the form of (4.3.11). The elements in the first column of the above parameter matrix are the transformation constants of the generalized input variables; the elements in the second column are the diagonal elements of $K_x$; the elements in the third and fourth columns are the conclusion parameters in the THEN-part. Once the parameter matrix of the FHM in the form of (4.3.11) is obtained, we can derive the IF-THEN rules of the FHM as in Definition 4.3.2. The optimal fitness value is $g_{\max} = 39.72$. The parameters of the corresponding individual are chosen as the optimal parameters of the GFHM. In the optimal GFHM, the maximum identification error is
$$\max_k \{|y(k) - \hat y(k)|\} = 0.0712.$$
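To make the role of the parameter matrix concrete, the sketch below evaluates a GFHM from rows of the form $(d, k, c_P, c_N)$ as in (4.3.11). It is an illustration only: the small matrix `params`, the index list `var_index` (which raw input each row transforms) and the 41 by 41 sampling grid (one way to obtain 1681 uniformly spaced points) are assumptions, not the values identified in the example.

```python
import numpy as np


def gfhm_output(x, var_index, params):
    """Evaluate y = sum_i p_i + sum_i q_i tanh(k_i (x_z - d_i)) from rows [d, k, c_P, c_N]."""
    params = np.asarray(params, dtype=float)
    d, k, c_p, c_n = params.T
    raw = np.asarray(x, dtype=float)[var_index]   # raw input used by each row
    p = (c_p + c_n) / 2.0
    q = (c_p - c_n) / 2.0
    return p.sum() + q @ np.tanh(k * (raw - d))


# Hypothetical chromosome: two rows for x_1, one row for x_2.
var_index = [0, 0, 1]
params = [
    [-1.0, 1.5, 1.0, -1.0],
    [1.0, 1.5, 0.5, 0.5],
    [0.0, 2.0, 3.0, -2.0],
]

# A 41 x 41 grid over [-2, 2]^2 gives 1681 sample points.
grid = np.linspace(-2.0, 2.0, 41)
samples = np.array([(a, b) for a in grid for b in grid])
outputs = np.array([gfhm_output(s, var_index, params) for s in samples])
```

The maximum identification error reported in the text would then be the largest absolute difference between such outputs and the target values over all sample points.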
The output of the fuzzy hyperbolic model is drawn in Figure 4.3.1. The error curve between the model and the real nonlinear function is depicted in Figure 4.3.2. For the same example, Table 4.3.2 gives some comparison results between the method used by Tanaka [9] and the method proposed in this chapter. We can conclude from Table 4.3.2 that the method proposed in this chapter can greatly improve the identification accuracy, can significantly reduce the number of fuzzy rules and the complexity of the model, and is more suitable for control applications. □
Example 4.3.2. Consider [12]
$$y = 1 + 0.5x_1 + 5\sin(\pi x_2), \qquad 0 < x_1, x_2 < 2.$$
In the simulation experiment, we generate 100 pairs of numbers $\{(x_1(k), x_2(k)),\ k = 1, 2, \dots, 100\}$
Figure 4.3.1: The output of the fuzzy hyperbolic model.
Figure 4.3.2: The error curve between the model and the real nonlinear function.
Table 4.3.2: Comparison of two modeling and identification methods
Methods | Membership functions | Encoding methods | Maximum errors
Tanaka [9] | Trapezoid | S-expression in LISP language | 0.0865
This section | Gauss type | Matrix encoding | 0.0712
that are uniformly distributed in the input space with function value $y$ as training targets. The parameters in the training process are chosen as: $w_i^- = 1$, $w_i^+ = 6$, $i = 1, 2$, $\delta_1 = 0.7$, $\delta_2 = 0.1$, initial population size $N = 200$, selection rate $p_s = 1$, crossover rate $p_c = 0.8$, mutation rate $p_m = 0.1$, weight coefficients $\omega_c = 0.4$ and $\omega_M = 0.6$. After 200 generations, we can get the following GFHM parameters: $w_1 = 1$, $w_2 = 4$, [...]
Figure 4.3.3: The output of the fuzzy hyperbolic model.
Figure 4.3.4: The error curve between the model and the real nonlinear function.
Table 4.3.3: Comparison of two modeling and identification methods
Methods | Membership functions | Encoding methods | Maximum errors
B. Wu [12] | Gauss type | Real encoding | 0.18
This section | Gauss type | Matrix encoding | 0.0969
[...] rate $p_s = 1$, crossover rate $p_c = 0.8$, mutation rate $p_m = 0.1$, weight coefficients $\omega_c = 0.4$ and $\omega_M = 0.6$. After 200 generations, we have the following GFHM parameters: $w_1 = 4$, $w_2 = 4$, $w_3 = 4$, [...]
$D_i = \{(m_j, t_j),\ 1 \le j \le p\}$, $d_i = (v_i, \bar d_i)$, $r_j$ is the fuzzy quantifier corresponding to the symptom $m_j$ in $M$, $t_j$ is the fuzzy quantifier corresponding to the symptom $m_j$ in $D_i$, $v_i$ is the name of a disease in $d_i$, and $\bar d_i$ is the fuzzy quantifier corresponding to the disease $v_i$. Assume that the matching threshold value is $\lambda_0$, and the knowledge base contains a set of fuzzy production rules. By using the matrix representation method, $M$ and $D_i$ can be represented by the matrices consisting of grades of membership of the respective fuzzy quantifiers, $\mathbf{M}$ and $\mathbf{D}_i$, i.e., $\mathbf{M} = [r_1, r_2, \dots, r_p]$, $\mathbf{D}_i = [t_1, t_2, \dots, t_p]$, and $d_i$ can be represented by an intensity $\bar d_i$ which is a fuzzy number or a fuzzy quantifier. From (6.3.4), the degree of similarity $SM(\mathbf{M}, \mathbf{D}_i)$ between $\mathbf{M}$ and $\mathbf{D}_i$ can be calculated. If $SM(\mathbf{M}, \mathbf{D}_i) > \lambda_0$, we execute the rule $R^i$. Otherwise, the rule $R^i$ will not be executed.
Now, as an example, let us reconsider Problem 6.3.1 to see how to represent the knowledge and match the patterns. From (6.3.5), we get
$$D_1 = \{(m_1, t_1), (m_2, t_2), (m_3, t_3)\}, \qquad M = \{(m_1, r_1), (m_2, r_2), (m_3, r_3)\}, \qquad F = \{m_1, m_2, m_3\},$$
$$\mathbf{D}_1 = [t_1, t_2, t_3] = [VS, RS, VS] = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0.03 & 0 \\ 0 & 0.11 & 0 \\ 0.07 & 0.83 & 0.07 \\ 0.37 & 0.83 & 0.37 \\ 1 & 0.11 & 1 \end{bmatrix}$$
and
$$\mathbf{M} = [r_1, r_2, r_3] = [VS, SE, RS] = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0.03 \\ 0 & 0.04 & 0.11 \\ 0.07 & 0.136 & 0.83 \\ 0.37 & 0.99 & 0.83 \\ 1 & 0.98 & 0.11 \end{bmatrix},$$
where $d_1 = (v_1, VS)$, $\bar d_1 = VS$ and $v_1$ may stand for acute enteritis. The intensity $VS$ denotes very severe. By using the similarity measure (6.3.4), we get
$$SM(\mathbf{M}, \mathbf{D}_1) = [SM(r_1, t_1) + SM(r_2, t_2) + SM(r_3, t_3)]/3 = (1 + 0.4845 + 0.5399)/3 = 0.6748.$$
If $\lambda_0 = 0.65$, this rule would be selected.
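The similarity value can be reproduced with a few lines of code. Formula (6.3.4) is not restated in this passage, so the sketch below assumes a Euclidean measure of the form $SM(r, t) = 1 - \|r - t\|_2/\sqrt{n}$ for vectors of length $n$, averaged over the columns of the two matrices; this assumed form gives the column values 1, 0.4845 and 0.5399 and the overall value 0.6748 quoted above. The function names are hypothetical.

```python
import numpy as np

# Membership-grade vectors of the fuzzy quantifiers used in Problem 6.3.1.
VS = np.array([0.0, 0.0, 0.0, 0.07, 0.37, 1.0])
RS = np.array([0.0, 0.03, 0.11, 0.83, 0.83, 0.11])
SE = np.array([0.0, 0.0, 0.04, 0.136, 0.99, 0.98])


def sm_column(r, t):
    """Assumed Euclidean similarity of two quantifier vectors: 1 - ||r - t|| / sqrt(n)."""
    return 1.0 - np.linalg.norm(r - t) / np.sqrt(len(r))


def sm_matrix(observation, antecedent):
    """Average the column-wise similarities of an observation and a rule antecedent."""
    return float(np.mean([sm_column(r, t) for r, t in zip(observation, antecedent)]))


D1 = [VS, RS, VS]   # antecedent of the rule
M = [VS, SE, RS]    # observation
similarity = sm_matrix(M, D1)
rule_selected = similarity > 0.65   # threshold lambda_0 = 0.65
```

With this measure, identical quantifiers give a similarity of exactly 1, which matches the first term of the sum in the text.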
6.3.3 Modification Function ($M_{CF}$)
In the proposed THFDP, a rule, $R^i$: IF $D_i$, THEN $d_i$, is to be executed with the use of a function $M_{CF}$ that modifies the consequent $d_i$ of the rule $R^i$. The function $M_{CF}$ depends on $SM$, and its construction is subjective. That is, one would, if required, adjust the form of each $M_{CF}$ based on expert experience, historical data, or the specific context of the problem concerned, so that the system functions as closely to the real situation as possible. There could be many forms of modification functions, and we present two of them which will be used in this chapter.
(1) Intensity value reduction form. This is analogous to the Compositional Rule of Inference (CRI) proposed by Zadeh [20]. We simply multiply the fuzzy intensity value $\bar d_i$ of the consequent $d_i$ by $SM(M, D_i)$ for a given observation $M$ to obtain:
$$\bar d'_i = \bar d_i \otimes \widetilde{SM}_i, \tag{6.3.6}$$
where $\otimes$ denotes the fuzzy arithmetic multiplication operation (cf. Section 1.2.9). In the above, $\widetilde{SM}_i$ denotes a fuzzy number obtained by fuzzifying the crisp number $SM(M, D_i)$. It can be interpreted as a possibility distribution over the real line. When $M$ is exactly matched with $D_i$ for the rule $R^i$, we will have $SM(M, D_i) = 1$ and the deduced consequent $\bar d'_i = \bar d_i$.
(2) Intensity value increment form [20]. The definition and function of this kind of $M_{CF}$ can be seen clearly from (6.3.7), which involves the division operation of fuzzy numbers:
$$\bar d''_i = \min\{1, \bar d_i \oslash \widetilde{SM}_i\}, \tag{6.3.7}$$
where $\oslash$ denotes the fuzzy arithmetic division operation (cf. Section 1.2.9). We observe that small values of $SM(M, D_i)$ increase the fuzzy intensity value $\bar d''_i$. Obviously, when $M$ and $D_i$ are exactly matched for the rule $R^i$, we have $\bar d''_i = \bar d_i$.
6.4 The Function $C_F$ and the Overall Point-Valued THFDP Algorithm
In THFDP, the function $C_F$ is viewed as the strength of confirmation of each fuzzy rule by the system or experts. A rule will be firmly believed to be true if $C_F = 1$ and will be regarded as false if $C_F = 0$. The larger the value of $C_F$ is, the more the rule should be believed in. The most obvious advantage of setting the parameter $C_F$ is that it is convenient for experts or operators to weigh and adjust the relative effects among the rules during fuzzy decision-making according to the experts' experience and the operating information of the particular system. For example, suppose there exists a rule base containing $n$ different rules. If the second rule in the rule base has been verified to be especially effective for a long time, then we may set $C_{F2} = 1.0$. If the third rule has not been verified in practice, we may set $C_{F3} = 0.4$. Given a fact and a rule base, how to make the most reasonable decision is the central task of the THFDP. Now, let us show how the value of $C_F$ affects the results of decision-making by observing the following example:
Rule: IF $\{m_1, ES\}$ THEN $\{v_1, RS\}$ ($C_F = 1.0$)
Observation: $\{m_1, RS\}$ (6.4.1)
Consequent: $d'_1 = ?$
where $m_1$ stands for a kind of symptom, and $v_1$ for a kind of disease. From (6.4.1), we can conclude intuitively the following three points. (1) The larger the value of $SM$ between the observation and the antecedent (pattern) of the rule is, the closer $\bar d'_1$ is to $\bar d_1$. (2) If a patient has a rather severe $m_1$, not an especially severe $m_1$, it may be concluded that the patient has unspecific disease $v_1$. The conclusion is deduced under the precondition of $C_F = 1$. In other words, it is only appropriate when the rule is absolutely true.
(3) Supposing that $C_F$ is not equal to 1 in the example, which means that the rule is not fully believed in, then a certain deviation will exist in the conclusion deduced above. The smaller $C_F$ is, the larger the deviation between the deduced conclusion and the appropriate conclusion will be. There are two directions for the deviation: it either lightens or deepens the diagnosis for the patient. In order to reduce the above deviation and make a more reasonable deduction, two modifiers $M_{01}$ and $M_{02}$ will be adopted, by which the deduced consequent will be modified according to the value of $C_F$.
L. A. Zadeh pointed out early that uncertainty of information in the knowledge base of any question-answering system induces some uncertainty in the validity of its conclusion [21]. In this chapter, as there exists uncertainty in both the rules and the observations, the deduced consequent may be stated in the form of a fuzzy interval or a fuzzy number, rather than a crisp number. We refer to [4] to choose $M_{01}(C_F) = C_F$ and $M_{02}(C_F) = 1 \oslash C_F$. Hence, the final deduced consequent intensity $\bar d''_i$ is expressed as a fuzzy range between $\bar d'_i \otimes M_{01}(C_F)$ and $\bar d'_i \otimes M_{02}(C_F)$.
Now assume that we have a knowledge base (KB) with $n$ rules: $R^i$, $i = 1, 2, \dots, n$. Further, assume that we have $m$ observations in our fact base (FB): $M_j$, $j = 1, 2, \dots, m$. The overall point-valued THFDP algorithm consists of the following six steps.
Step 1: Select appropriate similarity measure and modification functions. (i) Select an appropriate $SM$ among different $SM$s. (ii) Choose a reasonable $M_{CF}$. (iii) Set the threshold $\lambda_0$ of $SM$. (iv) Let $i = 1$, $j = 1$.
Step 2: Match patterns. Match $M_j$ and the antecedent $D_i$ of the rule $R^i$ and calculate the value of $\lambda_{ij} = SM(M_j, D_i)$.
Step 3: Execute a rule. (i) If $\lambda_{ij} > \lambda_0$, then the rule is executed and go to Step 4. (ii) If rule $R^i$ cannot be executed and $i \ne n$, let $i = i + 1$, go to Step 2; otherwise if $j = m$ and $i = n$, then go to Step 5; otherwise let $j = j + 1$, $i = 1$, go to Step 2.
Step 4: Deduce the consequent. (i) The first fuzzy consequent $d'_i$ is inferred according to both $M_{CF}$ and $d_i$, i.e.,
$$\bar d'_i = M_{CF}(\bar d_i), \tag{6.4.2}$$
where $\bar d'_i$ and $\bar d_i$ are the fuzzy intensity values of $d'_i$ and $d_i$, respectively. (ii) The final fuzzy consequent $d''_i$ is deduced according to the modifiers $M_{01}$ and $M_{02}$ as well as $d'_i$, i.e., $\bar d''_i = \{M_{01} \otimes \bar d'_i,\ M_{02} \otimes \bar d'_i\}$. More explicitly,
$$M_{01} \otimes \bar d'_i \le \bar d''_i \le M_{02} \otimes \bar d'_i, \tag{6.4.3}$$
where $\bar d''_i$ is the fuzzy intensity value of $d''_i$. (iii) Put the deduced consequent $d''_i$ into the consequent base (CB). (iv) If both $i = n$ and $j = m$, go to Step 5; otherwise, if $i \ne n$, then let $i = i + 1$ and go to Step 2; otherwise, let $i = 1$ and $j = j + 1$, go to Step 2.
Step 5: Combine the consequents. If there exists more than one $d''_i$ in the CB, the rule $R^i$ has been executed more than once. Then combine these consequents using the maximum operator, which corresponds to the effect of the linguistic connective OR.
Step 6: Determine whether the results are satisfactory. If not, go to Step 1; otherwise, display the final results.
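The control flow of Steps 2 to 5 can be sketched as follows. This is a deliberately simplified illustration, not the authors' algorithm: fuzzy intensities are replaced by crisp scalars (so $\otimes$ and $\oslash$ become ordinary multiplication and division), the reduction form (6.3.6) is used as $M_{CF}$, the similarities $\lambda_{ij}$ are assumed to be precomputed, and the rule data are hypothetical.

```python
def thfdp_point_valued(similarities, rules, threshold):
    """Return {rule name: (low, high)} for every rule executed at least once.

    similarities[j][i] holds lambda_ij = SM(M_j, D_i); each rule is a dict with a
    crisp 'intensity' and a confidence factor 'cf' in (0, 1].
    """
    consequent_base = {}
    for sm_row in similarities:                 # one row per observation M_j
        for rule, sm in zip(rules, sm_row):     # one entry per rule R^i
            if sm <= threshold:                 # Step 3: rule is not executed
                continue
            reduced = rule["intensity"] * sm    # Step 4(i): reduction form (6.3.6)
            low = reduced * rule["cf"]          # M01(C_F) = C_F
            high = reduced / rule["cf"]         # M02(C_F) = 1 / C_F
            # Step 5: combine repeated consequents of a rule with the maximum operator.
            old_low, old_high = consequent_base.get(rule["name"], (low, high))
            consequent_base[rule["name"]] = (max(old_low, low), max(old_high, high))
    return consequent_base


# Hypothetical data: two rules, one observation, threshold lambda_0 = 0.65.
rules = [
    {"name": "v1", "intensity": 0.8, "cf": 0.60},
    {"name": "v6", "intensity": 1.0, "cf": 0.97},
]
similarities = [[0.5028, 0.76]]
result = thfdp_point_valued(similarities, rules, threshold=0.65)
```

With a similarity of 0.76 and $C_F = 0.97$, the crisp interval is roughly $[0.74, 0.78]$; a full implementation would carry fuzzy numbers through the same steps instead of scalars.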
6.5 Fuzzy Decision-Making of Composite Rules
We have discussed the case of simple rules such as $R^i$: IF $D_i$, THEN $d_i$. How to extend this algorithm to the case of composite rules will be discussed in this section.
6.5.1 "OR" Composition
A composition rule "OR" is given in the following form:
IF $D_{i1}$ or $D_{i2}$ or $\cdots$ or $D_{ik}$, THEN $d_i$ ($C_{Fi} = \delta_i$). (6.5.1)
The rule in (6.5.1) can be decomposed into $k$ simple rules: IF $D_{i1}$ THEN $d_i$ ($C_{Fi} = \delta_i$) or $\cdots$ or IF $D_{ik}$ THEN $d_i$ ($C_{Fi} = \delta_i$), which can be treated as individual simple rules, respectively.
6.5.2 "AND" Composition
A composition rule "AND" is given in the following form:
IF $D_{i1}$ and $D_{i2}$ and $\cdots$ and $D_{ik}$, THEN $d_i$ ($C_{Fi} = \delta_i$). (6.5.2)
The overall $SM$ for the composition rule (6.5.2) can be determined by averaging $SM$s over all the corresponding pairs of $(\mathbf{D}_{is}, \mathbf{M}_{js})$, $s = 1, 2, \dots, k$, i.e.,
$$SM = \lambda_{ij} = \operatorname*{avg}_{s}\, SM_s(\mathbf{D}_{is}, \mathbf{M}_{js}), \qquad s = 1, 2, \dots, k, \tag{6.5.3}$$
where $\mathbf{D}_{is}$ and $\mathbf{M}_{js}$ are the vector representations of $D_{is}$ and $M_{js}$, respectively, and we suppose that the observations $M_{js}$ have the following form: $M_{j1}$ and $M_{j2}$ and $\cdots$ and $M_{jk}$.
Therefore, by means of the decomposition, THFDP can deal with many kinds of rule antecedent portions without any difficulty, i.e., THFDP can be used in extensive areas of decision-making problems.
6.6 Numerical Examples
In this section, some examples are provided to illustrate how to use the THFDP scheme. Example 6.6.1. Consider Problem 6.3.1 of Section 6.3.2 where $M$ has been matched with $D_1$. Now, let us deduce the final consequent
$$\bar d''_1 = \bar d_1 \otimes \widetilde{SM}_1 \otimes M_{01} = [0, 0, 0, 0.07, 0.37, 1]^T \otimes \widetilde{0.6748} \otimes \tilde{1} = \widetilde{0.65} = \text{about } 0.65.$$
Here "about $N$" means a fuzzy number $\tilde{N}$. As a result, the patient has rather severe $v_1$ according to the Euclidean $SM$. □
Example 6.6.2. Consider Problem 6.3.1. Assume that F = {mi, m2, m3,7714,777.5, mQ},M = {(mi,5E),(m2,A^^),(m3,0.64),(m4,A^^),(m5,C/5),(m6,5L)}, Ao = 0.65, R':
IF {(mi, 5 ^ ) , (m2, 5L), (ma, NE), (m4, US), (ms, A^^), (me, US)}, THEN {vi.RS) {CFI = 0.60); R^: IF {(mi, FL), (m2, A^^), (ma, VS), (m4, A^^), (m5, RS), (mg, SL)}, THEN {v2, VS) {CF2 = 0.99); R^ : IF {(mi, ATE), (m2, ES), (ms, A^^;), (7714, US), {m^^NE), (mg, VL)}, THEN {vs, VS) {CF3 = 0.90); i^4: IF {(mi, T/L), (m2, A^^), (ms, A^^), (m4, ES), (ms, NE), (me, 5L)}, THEN {v^,ES) ( C F 4 = 0.79); i^^: IF {{mi, NE),{m2,NE),{ms,NE),(m^,SE),{ms,SL), {me, SL)}, THEN (7;5,i^S') ( C F 5 = 0.94); R^: IF {(mi, VL), {m2,NE), (ms, i?5), (m4, AT^), {m^, US), (me, 5 E ) } , THEN {ve, ES) {CFG = 0.97). By using the matrix representation method we have
M
0 0 0.04 0.136 0.99 0.98
1 0 0 0 0 0
0 0.04 0.15 0.87 0.79 0.04
Di
0 0 0.04 0.136 0.99 0.98
0.27 1 0.27 0.05 0.01 0
1 0 0 0 0 0
0 0.15 1 0.53 0.08 0
1 0 0 0.15 1 0 0 0.53 0 0.08 0 0
0.27 1 0.27 0.05 0.01 0
0 1 0 015 1 0 0 0.53 0 0.08 0 0
0 0 0.04 0.136 0.99 0.98
Do
1 0 0 0 0 0
D.
DA
Da
0 0 0.04 0.136 0.99 0.98
0.27 1 0.27 0.05 0.01 0
1 0 0 0 0 0
0 0.15 1 0 0.53 0.08 0
1 1 0.53 0.08 0 0 0 0 0 0
1 0 0 0 0 0
1 0 1 0.27 0 0 0 1 0 0 0 0.27 0 0 0 0.05 0 0 0 0.01 0 1 0 0
0 0 0 0 0 1
1 0.53 0.08 0 0 0
=
D.
0 0 0 0.07 0.37 1
1 0 0 0 0 0 1 0.53 0.08 0 0 0
0 0 0.04 0.136 0.98 0.98 0 0.53 0.11 0.83 0.83 0.11
1 0 0 0 0 0
1 0 0 0 0 0
0
0.27 1 0.27 0.05 0.01 0
0.27 1 0.27 0.05 0.01 0
0 0.15 1 0.53 0.08 0
0 0 0.04 0.136 0.99 0.98
We can calculate from (6.3.4) that
$SM(M, D_1) = 0.5028$, $SM(M, D_2) = 0.5482$, $SM(M, D_3) = 0.412$,
$SM(M, D_4) = 0.5725$, $SM(M, D_5) = 0.571$, $SM(M, D_6) = 0.76$.
Because $SM(M, D_6) > \lambda_0$, we get $d_6' = [\min b, \max b]$, where $\min b = [0, 0, 0, 0, 0, 1]^T \odot 0.76 \odot 0.97 = \text{about } 0.74$ and $\max b = [0, 0, 0, 0, 0, 1]^T \odot 0.76 \odot 0.97 = \text{about } 0.78$. Referring to Table 6.3.1, we conclude that the patient has rather severe $v_6$ by the Euclidean SM. □
Example 6.6.3. Now, let us take a look at an example from commerce. We are given:
Rule: IF sales forecast S is high and inventory level I is low, THEN production level P should be high ($CF = 1.0$, $\lambda_0 = 0.65$).
Observation: Sales forecast S is rather high and inventory level I is low.
The problem can be represented as follows:
Rule: IF {(S, high), (I, low)}, THEN (P, high) ($CF = 1.0$).
Observation: M = {(S, rather high), (I, low)}.
For convenience, we use a division of fuzzy quantifiers similar to that of Problem 6.3.1. They are given by
especially high = $[0, 0, 0, 0, 0, 1]^T$,
very high = $[0, 0, 0, 0.07, 0.37, 1]^T$,
high = $[0, 0, 0.04, 0.136, 0.99, 0.98]^T$,
rather high = $[0, 0.03, 0.11, 0.83, 0.83, 0.11]^T$,
low = $[0, 0.15, 1, 0.53, 0.08, 0]^T$,
rather low = $[0.27, 1, 0.27, 0.05, 0.01, 0]^T$,
very low = $[1, 0.53, 0.08, 0, 0, 0]^T$,
especially low = $[1, 0, 0, 0, 0, 0]^T$.
By using the matrix representation method, we obtain
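$$M = \begin{bmatrix} 0 & 0 \\ 0.03 & 0.15 \\ 0.11 & 1 \\ 0.83 & 0.53 \\ 0.83 & 0.08 \\ 0.11 & 0 \end{bmatrix}, \qquad D = \begin{bmatrix} 0 & 0 \\ 0 & 0.15 \\ 0.04 & 1 \\ 0.136 & 0.53 \\ 0.99 & 0.08 \\ 0.98 & 0 \end{bmatrix}.$$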
We can calculate that $SM(M, D) = 0.77 > \lambda_0$. When using the Euclidean SM, we get $d' = \text{about } 0.67$. The conclusion is that the production level P should be rather high. □
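The similarity value quoted in Example 6.6.3 can be reproduced with a short Python sketch. It assumes that the Euclidean SM of (6.3.4), which is not restated in this section, has the same form as (6.7.2), namely one minus the root-mean-square difference of the two vectors, and that the pairwise SMs are averaged as in (6.5.3). The function names `euclidean_sm` and `and_rule_sm` are illustrative only.

```python
import math

# Fuzzy quantifier vectors as listed in Example 6.6.3.
HIGH = [0, 0, 0.04, 0.136, 0.99, 0.98]
RATHER_HIGH = [0, 0.03, 0.11, 0.83, 0.83, 0.11]
LOW = [0, 0.15, 1, 0.53, 0.08, 0]


def euclidean_sm(a, b):
    """Assumed form of the Euclidean SM: 1 minus the RMS difference of two vectors."""
    n = len(a)
    return 1.0 - math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / n)


def and_rule_sm(observation, antecedent):
    """Average the per-attribute SMs of an AND rule, as in (6.5.3)."""
    sms = [euclidean_sm(m, d) for m, d in zip(observation, antecedent)]
    return sum(sms) / len(sms)


M = [RATHER_HIGH, LOW]  # observation: (S, rather high), (I, low)
D = [HIGH, LOW]         # rule antecedent: (S, high), (I, low)
sm = and_rule_sm(M, D)
print(round(sm, 2))     # 0.77
```

Under these assumptions the inventory attribute matches exactly (SM of 1), so the overall value is driven by the gap between "rather high" and "high" for the sales forecast.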
6.7 Fuzzy Control Methods Involving Two Kinds of Uncertainties
In previous sections, we discussed a fuzzy inference and decision-making method for constructing an expert system. After suitable processing, it can be applied to the design of fuzzy control systems. This section uses the eight fuzzy quantifiers listed in Table 6.7.1, which are distinct fuzzy subsets of the unit interval [0,1]. We define them as complete trust (CT), extraordinary trust (ET), very trust (VT), trust (T), basic trust (BT), rather trust (RT), less trust (LT), and no trust (NT), respectively, and take their membership functions to be normally distributed. Evidently, Table 6.3.1 and Table 6.7.1 are essentially identical; the difference lies in the meaning of the fuzzy quantifiers in the two tables. The notation $CF_i$ is still viewed as the strength of confirmation of the ith rule. For example, $CF_i = 0.98$ means that the rule $R^i$ is ET, whereas $CF_i = 0.5$ means that the rule $R^i$ is RT. Taking the influence of CF into account, we give the general formulation of a fuzzy control rule as follows:
$$\text{IF } E \text{ is } E_i,\ E_c \text{ is } E_{ci},\ \text{THEN } U \text{ is } U_i \ (CF_i = \delta_i), \quad i = 1, 2, \ldots, n, \tag{6.7.1}$$
where $n = m_1 \times m_2$, $m_1$ is the number of the fuzzy subsets of $E$, and $m_2$ is the number of the fuzzy subsets of $E_c$.
The fuzzy control method in this section is similar to the conventional fuzzy control method. Hence, we still use the following three steps:
Step 1: Fuzzification of information.
Step 2: Fuzzy inference and comprehensive evaluation of fuzzy decision-making.
Step 3: Defuzzification.
Table 6.7.1: The division of fuzzy quantifiers
Fuzzy quantifier | Numerical interval | Membership function
CT (complete trust) | [1.00, 1.00] | $\mu_1(x) = 1$ if $x = 1$; $\mu_1(x) = 0$ if $x \neq 1$
ET (extraordinary trust) | [0.95, 0.99] | $\mu_2(x) = 1 - \exp\{-[0.125/\lvert 0.97 - x\rvert]^{2.5}\}$, $x \in (0,1)$
VT (very trust) | [0.80, 0.94] | $\mu_3(x) = 1 - \exp\{-[0.125/\lvert 0.87 - x\rvert]^{2.5}\}$, $x \in (0,1)$
T (trust) | [0.61, 0.79] | $\mu_4(x) = 1 - \exp\{-[0.125/\lvert 0.70 - x\rvert]^{2.5}\}$, $x \in (0,1)$
BT (basic trust) | [0.31, 0.60] | $\mu_5(x) = 1 - \exp\{-[0.125/\lvert 0.46 - x\rvert]^{2.5}\}$, $x \in (0,1)$
RT (rather trust) | [0.11, 0.30] | $\mu_6(x) = 1 - \exp\{-[0.125/\lvert 0.20 - x\rvert]^{2.5}\}$, $x \in (0,1)$
LT (less trust) | [0.01, 0.10] | $\mu_7(x) = 1 - \exp\{-[0.125/\lvert 0.06 - x\rvert]^{2.5}\}$, $x \in (0,1)$
NT (no trust) | [0.00, 0.00] | $\mu_8(x) = 1$ if $x = 0$; $\mu_8(x) = 0$ if $x \neq 0$
Please refer to Sections 5.2.3 and 5.2.5 for the contents of Step 1 and Step 3, respectively. In this section, we only consider Step 2, i.e., how to perform fuzzy inference and fuzzy decision-making.
Before fuzzy inference, we must construct a knowledge base consisting of many fuzzy rules such as (6.7.1). A knowledge base describes the knowledge and experience of experts and operators. Assume that the fuzzy rules have been constructed, and that the fuzzy subsets M and N, whose universes of discourse correspond to the error and change-of-error, respectively, have been measured. Our objectives are to obtain a consequent (i.e., the fuzzy subset of the control action) and to determine its format. From (6.7.1), we know that each fuzzy rule is associated with a value $CF_i$. Therefore, the conventional fuzzy inference method, which constructs a fuzzy relation matrix, cannot be used directly. In theory, if the fuzzy subsets M and N are identical to the antecedents $E_i$ and $E_{ci}$ of a fuzzy rule, the obtained fuzzy subset of the control action is $U_i$ with $CF_i$. But M and N are often not identical to $E_i$ and $E_{ci}$ in practice. When M and N are close to $E_i$ and $E_{ci}$, the consequent $U_i$ can still be referred to with a certain degree. We call $U_i$ the referential fuzzy subset of the consequent set, and still adopt the Euclidean SM formula. Then the SM between M and $E_i$ is
$$SM_1(M, E_i) = 1 - \left[\frac{\sum_{k=1}^{m_1}\left(\mu_M(e_k) - \mu_{E_i}(e_k)\right)^2}{m_1}\right]^{1/2}. \tag{6.7.2}$$
Similarly, the SM between N and $E_{ci}$ is
$$SM_2(N, E_{ci}) = 1 - \left[\frac{\sum_{k=1}^{m_2}\left(\mu_N(\dot{e}_k) - \mu_{E_{ci}}(\dot{e}_k)\right)^2}{m_2}\right]^{1/2}, \tag{6.7.3}$$
where $e_k$ and $\dot{e}_k$ denote the kth elements of the error and change-of-error in the universe of discourse, and $\mu_M(\cdot)$, $\mu_N(\cdot)$, $\mu_{E_i}(\cdot)$, and $\mu_{E_{ci}}(\cdot)$ denote the grades of membership of the fuzzy subsets M, N, $E_i$, and $E_{ci}$, respectively. The SM between two fuzzy subsets can be obtained from the above formulas, while the SM between the pair M, N and the pair $E_i$, $E_{ci}$ can be calculated using (6.5.3), i.e.,
$$SM(i) = \operatorname{avg}\left(SM_1(M, E_i), SM_2(N, E_{ci})\right). \tag{6.7.4}$$
Obviously, the closer $SM(i)$ is to 1, the more credible $U_i$ is. Suppose that the threshold $\lambda_0$ is given. When $SM(i) \ge \lambda_0$, we execute the fuzzy rule, and the consequent $U_i$ is taken as a referential fuzzy subset of the consequent set. Otherwise, when $SM(i) < \lambda_0$, the fuzzy rule cannot be executed because the measured subsets are far away from the antecedent of the rule. Using the same method, we compare the subsets with each fuzzy rule of the knowledge base. We then obtain a consequent set consisting of $p$ ($\le n$) referential fuzzy subsets.
Comprehensive evaluation is the process that deals with the $p$ ($\le n$) referential fuzzy subsets of a consequent set in order to draw a credible conclusion. Because $CF_i$ and $SM(i)$ differ from rule to rule, the p referential fuzzy subsets also have different credibility. We define a credible factor as $SM(i) \cdot CF_i$ [12]. Thus, consequents with larger values of $SM(i) \cdot CF_i$ should carry more weight. Because all referential fuzzy subsets share the same universe of discourse, we use the weighted average method to evaluate them. The formula is given as follows:
$$U = \frac{\sum_{i=1}^{p} SM(i) \cdot CF_i \cdot U_i}{\sum_{i=1}^{p} SM(i) \cdot CF_i}. \tag{6.7.5}$$
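The inference and evaluation steps of (6.7.2) to (6.7.5) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the rule base, the measured subsets, and the names `euclidean_sm` and `infer` are hypothetical, and the threshold test is taken to be $SM(i) \ge \lambda_0$.

```python
import math


def euclidean_sm(observed, antecedent):
    """(6.7.2)/(6.7.3): 1 minus the RMS difference of membership grades."""
    n = len(observed)
    sq = sum((a - b) ** 2 for a, b in zip(observed, antecedent))
    return 1.0 - math.sqrt(sq / n)


def infer(M, N, rules, threshold):
    """Return the weighted-average consequent (6.7.5), or None if no rule fires.

    Each rule is a dict with keys 'E', 'Ec', 'U' (lists of grades) and 'CF'.
    """
    fired = []
    for rule in rules:
        # (6.7.4): average of the two similarity measures.
        sm = 0.5 * (euclidean_sm(M, rule["E"]) + euclidean_sm(N, rule["Ec"]))
        if sm >= threshold:
            # Weight the referential subset by its credible factor SM(i) * CF_i.
            fired.append((sm * rule["CF"], rule["U"]))
    if not fired:
        return None
    total = sum(w for w, _ in fired)
    size = len(fired[0][1])
    return [sum(w * u[k] for w, u in fired) / total for k in range(size)]


# Hypothetical data, chosen only to exercise the code.
rules = [
    {"E": [1.0, 0.5, 0.0], "Ec": [0.0, 0.5, 1.0], "U": [1.0, 0.5, 0.0], "CF": 0.9},
    {"E": [0.0, 0.5, 1.0], "Ec": [1.0, 0.5, 0.0], "U": [0.0, 0.5, 1.0], "CF": 1.0},
]
M = [0.9, 0.6, 0.1]
N = [0.1, 0.5, 0.9]
U = infer(M, N, rules, threshold=0.65)
print(U)
```

With this data only the first rule passes the threshold, so the result coincides with that rule's consequent; when several rules fire, their consequents are blended in proportion to their credible factors.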
Example 6.7.1. Assume that the fuzzy subsets M and N and the antecedents of each fuzzy rule in the knowledge base are given. After checking whether $SM(i)$ is larger than the threshold $\lambda_0$, we get the referential fuzzy subsets of two control actions, $U_{i1}$ and $U_{i2}$, as follows:
$$U_{i1}(PM) = 0.5/3 + 1/4, \qquad U_{i2}(PB) = 0.5/1 + 1/2 + 0.5/3,$$
where $a/k$ denotes the grade of membership $a$ at element $k$. If their credible factors are 0.91 and 0.83, respectively, the result of comprehensive evaluation is
$$U = \frac{(0.5 \times 0.91/3 + 1 \times 0.91/4) + (0.5 \times 0.83/1 + 1 \times 0.83/2 + 0.5 \times 0.83/3)}{0.91 + 0.83} = 0.239/1 + 0.477/2 + 0.5/3 + 0.523/4. \quad □$$
Figure 6.7.1 illustrates the procedure of the fuzzy inference and comprehensive evaluation method in this section.
Example 6.7.2. Using the above fuzzy control method, we study the steam temperature control system for a steam power plant (which was studied in Chapter 5, Section 5.5). The architecture of this control system is displayed in Figure 6.7.2 (cf. [24]). After research, analysis, and reference to Table 5.2.4, we obtain the 56 fuzzy production rules in Table 6.7.2, which also gives the value $CF_i$ of each fuzzy rule. Suppose that the threshold $\lambda_0 = 0.65$. When the disturbance d is a 4 mA step signal, the dynamic response curve of the fuzzy control system is shown in Figure 6.7.3, where ESI denotes the integrated square of error, i.e., $ESI = \int e^2\,dt$ (defined in Chapter 5). For comparison, we also depict the response of the conventional cascade PID control system in the same figure. □
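The arithmetic of Example 6.7.1 can be checked with a few lines of Python. The sketch stores each referential fuzzy subset as a map from element to grade (a representation chosen here for convenience) and applies the weighted average (6.7.5) element by element.

```python
# Referential fuzzy subsets from Example 6.7.1 as {element: grade} maps.
U_i1 = {3: 0.5, 4: 1.0}
U_i2 = {1: 0.5, 2: 1.0, 3: 0.5}
weights = [0.91, 0.83]  # credible factors SM(i) * CF_i
subsets = [U_i1, U_i2]

total = sum(weights)
elements = sorted(set().union(*subsets))
# Weighted average per element; a missing element counts as grade 0.
U = {
    k: sum(w * s.get(k, 0.0) for w, s in zip(weights, subsets)) / total
    for k in elements
}
for k in elements:
    print(k, round(U[k], 3))
```

The rounded grades agree with the values 0.239, 0.477, 0.5, and 0.523 stated in the example.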
[Flowchart. Boxes, in order: construct the knowledge base and obtain the fuzzy subsets M and N; set i = 1; calculate $SM_1(M, E_i)$ and $SM_2(N, E_{ci})$; compute $SM(i) = \operatorname{avg}(SM_1(M, E_i), SM_2(N, E_{ci}))$; on the Yes branch of the threshold test, the consequent $U_i$ is a referential fuzzy subset of the consequent set; set i = i + 1 and repeat while rules remain; finally, evaluate each referential fuzzy subset by (6.7.5) and get the final consequent U.]
Figure 6.7.1: The procedure of fuzzy inference and comprehensive evaluation method.
[Block diagram. Recoverable labels: disturbance d, reference value of the steam temperature, fuzzy controller, steam temperature T, and two blocks labeled 0.1.]
Figure 6.7.2: The simple block diagram of a steam temperature control system.
Table 6.7.2: The look-up table involving two kinds of uncertainties
   | NB | NM | NS | O | PS | PM | PB
NB | PB (1.0) | PB (1.0) | PB (0.95) | PM (1.0) | PM (0.98) | PM (1.0) | NS (0.95)
NM | PB (1.0) | PB (0.97) | PM (0.95) | PM (0.95) | PM (0.95) | PS (0.95) | NS (0.9)
NS | PB (0.95) | PM (0.9) | PM (0.9) | PS (1.0) | PS (0.9) | NS (0.85) | NS (0.9)
NO | PB (0.85) | PM (0.9) | PS (1.0) | O (1.0) | NS (0.95) | NM (0.9) | NM (0.92)
PO | PM (0.92) | PM (0.9) | PS (0.95) | O (1.0) | NS (0.95) | NM (0.95) | NB (0.85)
PS | PS (0.9) | PS (0.95) | NS (0.95) | NS (1.0) | NM (0.95) | NS (0.95) | NB (0.95)
PM | PS (0.95) | NS (0.85) | NM (0.97) | NM (1.0) | NM (0.98) | NS (0.98) | NB (1.0)
PB | PS (0.95) | NM (0.85) | NM (0.95) | NM (0.95) | NB (1.0) | NB (1.0) | NB (1.0)
[Plots. Panel (a): response versus t (sec), 0 to 800 sec, vertical axis from 253 to 303, comparing the conventional cascade PID control method (ESI = 1071) with our method (ESI = 10.872). Panel (b): a second pair of curves for the same two methods versus t (sec), 0 to 800 sec.]
Figure 6.7.3: Performance comparison of two control systems when the disturbance d = 4 mA.
6.8 Summary
We have provided details for the THFDP technique which can deal with two kinds of uncertainties, and given examples to illustrate how to use it. The examples are from the fields of medical diagnosis, commerce, and control engineering. The development and design principle of this chapter can be extended to a variety of engineering applications.
Bibliography
[1] A. Basu, A. K. Majumdar, and S. Sinha, "An expert system approach to control system design and analysis," IEEE Transactions on Systems, Man and Cybernetics, vol. 18, no. 5, pp. 685-694, Sept.-Oct. 1988.
[2] C. E. Bozdag, C. Kahraman, and D. Ruan, "Fuzzy group decision making for selection among computer integrated manufacturing systems," Computers in Industry, vol. 51, no. 1, pp. 13-29, May 2003.
[3] B. G. Buchanan and E. H. Shortliffe, Eds., Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project, Reading, MA: Addison-Wesley, 1984.
[4] S. M. Chen, "A new approach to handling fuzzy decision-making problems," IEEE Transactions on Systems, Man and Cybernetics, vol. 18, no. 6, pp. 1012-1016, Nov.-Dec. 1988.
[5] F. Chiclana, F. Herrera, and E. Herrera-Viedma, "Integrating multiplicative preference relations in a multipurpose decision-making model based on fuzzy preference relations," Fuzzy Sets and Systems, vol. 122, no. 2, pp. 277-291, Sept. 2001.
[6] T. Fujita and S. Iwamoto, "An optimistic decision-making in fuzzy environment," Applied Mathematics and Computation, vol. 120, no. 1-3, pp. 123-137, May 2001.
[7] M. B. Gorzalczany, "A method of inference in approximate reasoning based on interval-valued fuzzy sets," Fuzzy Sets and Systems, vol. 21, no. 1, pp. 1-17, Jan. 1987.
[8] S. J. Henkind and M. C. Harrison, "An analysis of four uncertainty calculi," IEEE Transactions on Systems, Man and Cybernetics, vol. 18, no. 5, pp. 700-714, Sept.-Oct. 1988.
[9] D. H. Hong and C. H. Choi, "Multicriteria fuzzy decision-making problems based on vague set theory," Fuzzy Sets and Systems, vol. 114, no. 1, pp. 103-113, Aug. 2000.
[10] K. Huan, Introduction to Expert Systems, Nanjing, China: Southeast University Press, 1988.
[11] H. J. Huang and F. S. Wang, "Fuzzy decision-making design of chemical plant using mixed-integer hybrid differential evolution," Computers and Chemical Engineering, vol. 26, no. 12, pp. 1649-1660, Dec. 2002.
[12] K. J. Hunt, D. Sbarbaro, R. Zbikowski, and P. J. Gawthrop, "Neural networks for control systems-A survey," Automatica, vol. 28, no. 6, pp. 1083-1112, Nov. 1992.
[13] P. M. Larsen, "Industrial applications of fuzzy logic control," International Journal of Man-Machine Studies, vol. 12, no. 1, pp. 3-10, 1980.
[14] N. S. Lee, Y. L. Grize, and K. Dehnad, "Quantitative models for reasoning under uncertainty in knowledge-based expert system," International Journal of Intelligent Systems, vol. 2, no. 1, pp. 15-38, 1987.
[15] D. Li, "Fuzzy multiattribute decision-making models and methods with incomplete preference information," Fuzzy Sets and Systems, vol. 106, no. 2, pp. 113-119, Sept. 1999.
[16] J. Lu, H. Zhang, and L. Chen, "A fuzzy control approach concerning with rule's confidence," Control and Decision, vol. 7, no. 3, pp. 225-228, 1992. (in Chinese)
[17] E. H. Mamdani, "Advances in the linguistic synthesis of fuzzy controllers," International Journal of Man-Machine Studies, vol. 8, no. 6, pp. 669-678, 1976.
[18] K. M. Passino and S. Yurkovich, Fuzzy Control, Reading, MA: Addison-Wesley, 1998.
[19] Y. Tsukamoto, "An approach to fuzzy reasoning method," in Advances in Fuzzy Set Theory and Applications, Edited by M. M. Gupta, R. K. Ragade, and R. R. Yager, Amsterdam: North-Holland, 1979.
[20] L. A. Zadeh, "Outline of a new approach to the analysis of complex systems and decision processes," IEEE Transactions on Systems, Man and Cybernetics, vol. 3, no. 1, pp. 28-44, 1973.
[21] L. A. Zadeh, "The role of fuzzy logic in the management of uncertainty in expert systems," Fuzzy Sets and Systems, vol. 11, no. 1-3, pp. 197-198, 1983.
[22] H. Zhang and L. Chen, "A technique for handling fuzzy decision-making problems concerning two kinds of uncertainty," Cybernetics and Systems, vol. 22, no. 11, pp. 681-698, 1991.
[23] H. Zhang and L. Chen, "A fuzzy decision-making technique with two kinds of uncertainty," Science in China, Series A: Mathematics Physics Astronomy, vol. 34, no. 12, pp. 1508-1518, 1991.
[24] H. Zhang and W. Xu, "Modern control theory applied to 200MW boiler: Turbine unit control," Proceedings of the IFAC Symposium on Power Systems and Power Plant Control, Beijing, China, Aug. 1986, pp. 330-336.
[25] H. J. Zimmermann, Fuzzy Set Theory and Its Applications, 4th Ed., Boston, MA: Kluwer Academic Publishers, 2001.
[26] K. Zou and Y. Xu, Fuzzy Systems and Expert Systems, Sichuan, China: Southwest Jiaotong University Press, 1989.
[27] R. Zwick, E. Carlstein, and D. V. Budescu, "Measures of similarity among fuzzy concepts: A comparative analysis," International Journal of Approximate Reasoning, vol. 1, no. 2, pp. 221-242, Apr. 1987.
Chapter 7
Fuzzy Control Schemes via a Fuzzy Performance Evaluator
7.1 Introduction
In practical control systems, the plants are typically nonlinear and subject to uncertainty, and designing a stable controller for such plants is difficult. In the last few years, fuzzy control of nonlinear and uncertain systems has been an active research area, and some significant results have been achieved in [4,7,10,12-17]. Fuzzy control can be divided into model-based methods and model-free methods according to whether a fuzzy model is needed. For model-based control methods, the theoretical foundation is the universal approximation theory. Most researchers use fuzzy logic systems as approximators for nonlinear and uncertain systems or controllers, and use the Lyapunov second method to analyze the stability of fuzzy logic systems. Considering the influence of both fuzzy logic system approximation error and external disturbance, fuzzy robust control schemes have been addressed in [4,7]. A fuzzy basis function vector-based adaptive control scheme for multiple-input multiple-output (MIMO) systems with square and nonsquare nonlinearity is proposed in [16,17]. An observer-based adaptive fuzzy-neural control scheme is proposed in [7]. In those studies, the upper bound of the external disturbance must be known. The disturbance attenuation term is determined from this known upper bound, which leads to a conservative design scheme. Recently, the $H_\infty$ control problem for nonlinear systems has attracted a great deal of attention. An advantage of this approach is that it can attenuate the effects of various uncertainties (e.g., structured parametric uncertainty and unstructured disturbance). Reference [7] proposed an $H_\infty$ control scheme for a class of nonlinear systems. The control law can be determined by solving a Hamilton-Jacobi-Bellman (HJB) equation, which is the nonlinear version of the Riccati equation. In fact, such an approach is hardly practicable because no analytical solutions can be obtained except for some special cases.
In this chapter, several novel fuzzy control schemes via a fuzzy performance evaluator (FPE) will be developed. For the T-S fuzzy model, a fuzzy adaptive control scheme and a fuzzy static feedback control scheme are developed based on FPE [10]. First, a fuzzy model is employed to represent a nonlinear system. Then, based on the fuzzy model, an FPE is developed to predict and overcome the matching errors and disturbances. Finally, an $H_\infty$ controller is obtained via FPE. This chapter is organized as follows. In Section 7.2, the fundamentals of a fuzzy control scheme via FPE are introduced. In Section 7.3, a fuzzy adaptive control scheme via FPE is analyzed in detail. In Section 7.4, a simple fuzzy control scheme via FPE is discussed for the fuzzy dynamic model. In Section 7.5, the results of Section 7.4 are extended to a class of nonlinear systems with uncertain time delays. In Section 7.6, the chapter concludes with a few pertinent remarks.
7.2 Fundamentals of a Fuzzy Control Scheme via FPE
The structure of a fuzzy control via FPE is described in Figure 7.2.1.
[Block diagram with blocks Plant, FPE, Controller, and Reference model, and two operating modes: 1. Test signal input (evaluation), 2. Closed-loop.]
Figure 7.2.1: The block diagram of a tracking control scheme based on FPE.
The FPE is like an observer. The basic idea is that if the fuzzy model describes the real plant very well, the state errors between the real plant and the FPE will stay in a small range. If the fuzzy model is the same as the real plant, the error will be zero. Because the modeling error is unavoidable, we use the disturbance attenuation term $v$ to make the error as small as possible. We define the state error performance evaluation index as $J = \int_0^{t_f} \|e(t)\|^2\,dt$, consistent with the index $J_{FPE}$ used in Section 7.3.5. If $v$ exists and $J$ is small enough, the performance of a closed-loop system based on the fuzzy model will be good. The procedure to calculate $v$ and $J$ is called the evaluation procedure. Because the calculation procedure only reads signals from the original system, no damage is done to the system; it is therefore also called a nondestructive debugging method.
Based on the above analysis, we discuss the design procedure in two steps. We first show how to design the FPE with $H_\infty$ tracking performance. Then we prove that the same controller will stabilize the closed-loop system if the FPE achieves $H_\infty$ tracking performance. In this chapter, we solve the following three problems: (1) how to design an FPE based on a fuzzy model in order to guarantee small state errors; (2) how to determine the relationship between the parameters of the FPE and those of the controller; and (3) how to determine the relationship between the performance of the FPE and that of the closed-loop system.
7.3 Fuzzy Adaptive Control Scheme via FPE
In the 1990s, Wang presented the fuzzy basis function (FBF) model and its modeling theory in [12,14]. The FBF model is proved to be a universal approximator, which is the theoretical foundation of fuzzy adaptive control based on FBF. Many fuzzy adaptive control results have been proposed based on FBF. But the universal approximation theory can only guarantee that the error is bounded; it cannot ensure that the error converges to zero. Also, it is very difficult to determine a priori the upper bound of the modeling error. Furthermore, a fuzzy adaptive controller has many parameters that are difficult to adjust, such as the parameters of the fuzzy membership functions, the initial values, and the adaptation rates. Because of these problems, many fuzzy adaptive control schemes are difficult to apply in practical control systems. In this section, we present a fuzzy adaptive control scheme via FPE, which offers a systematic tuning scheme for the parameters of a fuzzy adaptive control system.
7.3.1 Problem Formulation
Consider the following nth-order nonlinear dynamical system:
$$y^{(n)} = f(x) + g(x)u + d, \tag{7.3.1}$$
or
$$\dot{x} = Ax + B[f(x) + g(x)u + d], \qquad y = Cx, \tag{7.3.2}$$
where
$$A = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}, \qquad C^T = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix},$$
$x \in \mathbb{R}^n$ is a vector of states, $d \in \mathbb{R}$ is the external bounded disturbance with $|d| < \varepsilon$, and $u \in \mathbb{R}$ and $y \in \mathbb{R}$ are the control input and system output, respectively. The functions $f(x)$ and $g(x)$ are unknown nonlinear functions. The system state vector $x$ is assumed to be measurable.
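To make the structure of (7.3.2) concrete, the following Python sketch builds the chain-of-integrators matrices A, B, and C for a given order n and evaluates the right-hand side of the state equation. The functions `companion_form` and `state_derivative`, and the particular f and g in the example, are illustrative assumptions rather than part of the original text.

```python
def companion_form(n):
    """Build A (n x n), B (length n), and C (length n) for the form in (7.3.2)."""
    A = [[1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n)]
    B = [0.0] * (n - 1) + [1.0]
    C = [1.0] + [0.0] * (n - 1)
    return A, B, C


def state_derivative(x, u, d, f, g):
    """Evaluate x_dot = A x + B [f(x) + g(x) u + d]."""
    A, B, _ = companion_form(len(x))
    drive = f(x) + g(x) * u + d
    return [
        sum(a * xi for a, xi in zip(row, x)) + b * drive
        for row, b in zip(A, B)
    ]


# Hypothetical second-order example: f(x) = -x1, g(x) = 2.
xdot = state_derivative(
    [1.0, 3.0], u=0.5, d=0.0, f=lambda x: -x[0], g=lambda x: 2.0
)
print(xdot)  # [3.0, 0.0]
```

The first n - 1 rows simply shift the state, so the nonlinearity, the input, and the disturbance enter only through the last row.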
7.3.2 Preliminaries and Notation
The basic configuration of fuzzy logic systems (FLS) consists of m fuzzy IF-THEN rules and a fuzzy inference engine. The ith rule is as follows:
$R^i$: IF $x_1$ is $A_1^i$, ..., $x_n$ is $A_n^i$, THEN $y$ is $B^i$,
where $A_1^i, A_2^i, \ldots, A_n^i$ and $B^i$ are fuzzy sets. By using product inference, center-average defuzzification, and a singleton fuzzifier, the output of a fuzzy logic system can be expressed as [13]
$$y(x) = \frac{\sum_{i=1}^{m} y^i \left(\prod_{j=1}^{n} \mu_{A_j^i}(x_j)\right)}{\sum_{i=1}^{m} \left(\prod_{j=1}^{n} \mu_{A_j^i}(x_j)\right)} = \theta^T \psi(x), \tag{7.3.3}$$
where $x = [x_1, x_2, \ldots, x_n]^T$, $\mu_{A_j^i}(x_j)$ is the fuzzy membership function value of the fuzzy variable $x_j$, $y^i$ is the point at which $\mu_{B^i}(y^i) = 1$, $\theta^T = [y^1, y^2, \ldots, y^m]$ is the vector of adjustable parameters, $\psi^T(x) = [\psi^1(x), \psi^2(x), \ldots, \psi^m(x)]$ is the fuzzy basis function vector, and
$$\psi^i(x) = \frac{\prod_{j=1}^{n} \mu_{A_j^i}(x_j)}{\sum_{i=1}^{m} \left(\prod_{j=1}^{n} \mu_{A_j^i}(x_j)\right)}. \tag{7.3.4}$$
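The output formula (7.3.3) and the basis functions (7.3.4) can be illustrated with a small Python sketch. The Gaussian membership functions, the three-rule system, and the parameter values below are hypothetical choices made only for this illustration; `math.prod` requires Python 3.8 or later.

```python
import math


def gaussian(center, width):
    """Return a Gaussian membership function (an illustrative choice)."""
    return lambda x: math.exp(-((x - center) / width) ** 2)


def basis_vector(x, rule_mfs):
    """Fuzzy basis functions (7.3.4): normalized products of membership grades."""
    firing = [math.prod(mf(xj) for mf, xj in zip(mfs, x)) for mfs in rule_mfs]
    total = sum(firing)
    return [w / total for w in firing]


def fls_output(x, rule_mfs, theta):
    """FLS output (7.3.3): inner product of theta and the basis vector."""
    psi = basis_vector(x, rule_mfs)
    return sum(t * p for t, p in zip(theta, psi))


# Hypothetical two-input system with three rules.
rule_mfs = [
    [gaussian(-1.0, 1.0), gaussian(-1.0, 1.0)],
    [gaussian(0.0, 1.0), gaussian(0.0, 1.0)],
    [gaussian(1.0, 1.0), gaussian(1.0, 1.0)],
]
theta = [-2.0, 0.0, 2.0]  # consequent centers y^i
y0 = fls_output([0.0, 0.0], rule_mfs, theta)
print(y0)
```

Because the basis functions are normalized, they sum to one for any input, and the output is a convex combination of the consequent centers in `theta`. This is what makes the model linear in the adjustable parameters.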
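We can construct an FPE based on the above FLS. Let $\hat{f}(\hat{x}|\theta_f) = \theta_f^T \psi(\hat{x})$ and $\hat{g}(\hat{x}|\theta_g) = \theta_g^T \psi(\hat{x})$, where $\hat{x}$ is the state vector of the FPE, and $\hat{f}(\hat{x}|\theta_f)$ and $\hat{g}(\hat{x}|\theta_g)$ are used to approximate the nonlinear functions $f(x)$ and $g(x)$ in (7.3.1) or (7.3.2). We first introduce the following assumptions and lemmas.
Definition 7.3.1. The norms of a vector $x$ are defined as
$$\|x\| = \sqrt{x^T x}, \qquad \|x_t\|_{2\delta} = \left(\int_0^t e^{-\delta(t-\tau)} x^T(\tau) x(\tau)\,d\tau\right)^{1/2}, \qquad \text{and} \qquad \|x_t\|_2 = \left(\int_0^t x^T(\tau) x(\tau)\,d\tau\right)^{1/2}.$$
We say that $x(t) \in L_2$ if $\|x\|_2 = \left(\int_0^{\infty} x^T(\tau) x(\tau)\,d\tau\right)^{1/2}$ exists. □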
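Assumption 7.3.1. Let $x$ and $\hat{x}$ belong to compact sets $U_x$ and $U_{\hat{x}}$, respectively, where $U_x = \{x \in \mathbb{R}^n : \|x\| \le m_x < \infty\}$, $U_{\hat{x}} = \{\hat{x} \in \mathbb{R}^n : \|\hat{x}\| \le m_{\hat{x}} < \infty\}$, and $m_x$ and $m_{\hat{x}}$ are design parameters. It is known a priori that the optimal parameter vectors $\theta_f^*$ and $\theta_g^*$ lie in some convex regions $M_{\theta_f}$ and $M_{\theta_g}$:
$$\theta_f^* = \arg\min_{\theta_f \in M_{\theta_f}} \left[\sup_{x \in U_x,\ \hat{x} \in U_{\hat{x}}} \left|f(x) - \hat{f}(\hat{x}|\theta_f)\right|\right], \tag{7.3.5}$$
$$\theta_g^* = \arg\min_{\theta_g \in M_{\theta_g}} \left[\sup_{x \in U_x,\ \hat{x} \in U_{\hat{x}}} \left|g(x) - \hat{g}(\hat{x}|\theta_g)\right|\right], \tag{7.3.6}$$
where $M_{\theta_f} = \{\theta_f \in \mathbb{R}^m : \|\theta_f\| \le m_{\theta_f}\}$, $M_{\theta_g} = \{\theta_g \in \mathbb{R}^m : \|\theta_g\| \le m_{\theta_g}\}$, and $m_{\theta_f}$ and $m_{\theta_g}$ are constants. □
Assumption 7.3.2. $\hat{g}(\hat{x}|\theta_g)$ is bounded away from zero. □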
Lemma 7.3.1 (cf. [5]). Consider the linear time-invariant system
$$\dot{x}(t) = Ax(t) + Bu(t), \qquad x(0) = x_0, \tag{7.3.7}$$
where x{t) G W, u{t) G R, A G R^> Vm
J '
^ = x-Xm, q =
X
(7.3.27)
XfYl')
il=y-
Vm,
where $y_m$ is the reference output. The design procedure of an adaptive fuzzy controller is summarized in the following theorem.
Theorem 7.3.2. Consider the nonlinear system described by (7.3.1) or (7.3.2) that satisfies Assumptions 7.3.1 and 7.3.2. Suppose that the control law is
$$u = \frac{1}{\hat{g}(\hat{x}|\theta_g)}\left[-\hat{f}(\hat{x}|\theta_f) + v + y_m^{(n)} - K_c(\hat{x} - x_m)\right] \tag{7.3.28}$$
with the adaptive FPE (7.3.11) and the adaptive laws (7.3.9) and (7.3.10). If $A - BK_c$ is a Hurwitz matrix, then all the signals in the closed-loop system are bounded. The norm of the tracking error $\|\xi\| \to 0$ when $w \in L_2[0, \infty)$.
Proof. Substituting (7.3.27) into (7.3.11), we obtain
$$\dot{x}_m + \dot{\hat{\xi}} = A\hat{\xi} + Ax_m + B[\hat{f}(\hat{x}) + \hat{g}(\hat{x})u - v] + K_oC(x - \hat{x}), \qquad \hat{\xi}_1 = C(\hat{x} - x_m) = C\hat{\xi}. \tag{7.3.29}$$
Substituting $\dot{x}_m = Ax_m + By_m^{(n)}$ into (7.3.29), we get
$$\dot{\hat{\xi}} = A\hat{\xi} + B[\hat{f}(\hat{x}) + \hat{g}(\hat{x})u - v - y_m^{(n)}] + K_oCe. \tag{7.3.30}$$
Using control law (7.3.28), we have
$$\dot{\hat{\xi}} = (A - BK_c)\hat{\xi} + K_oCe.$$
From Theorem 7.3.1, we get $e \in L_\infty$. According to Lemma 7.3.1, because $A - BK_c$ is a Hurwitz matrix, we have $\|\hat{\xi}\|$
< Aoe-^o* ||e(0)|| +
'-h==
Ht)hs
•
(7.3.31)
Therefore, $\hat{\xi} \in L_\infty$. From $\xi = e + \hat{\xi}$, it follows that $\xi \in L_\infty$. Because $x_m$, $\hat{\xi}$, and $\xi$ are bounded, $\hat{x} = \hat{\xi} + x_m$ and $x = \xi + x_m$ are bounded too. If $w \in L_2[0, \infty)$, then it follows from (7.3.17) that $\lim_{t \to \infty} \|e\| = 0$. According to (7.3.31), we can obtain $\lim_{t \to \infty} \|\hat{\xi}\| = 0$ and $\lim_{t \to \infty} \|\xi\| = 0$. □
According to the above analysis, we can conclude the following for such a class of nonlinear systems.
(1) According to Theorem 7.3.1, we can design the disturbance attenuation term $v$ so that the state errors between the FPE and the real system satisfy $H_\infty$ tracking performance with the prescribed $\gamma$.
(2) Most of the parameters in the controller (except $K_c$) are the same as the FPE's parameters, so the controller parameters can be tuned indirectly through the FPE.
(3) The closed-loop control performance depends on the tracking performance of the FPE. A better tracking performance of the FPE leads to a better closed-loop control performance.
This design method is valuable in practice: we can estimate controller performance by testing the FPE's performance, and the FPE test is nondestructive to the system. From the above analysis, a design procedure for a fuzzy adaptive control algorithm is proposed as follows.
Design procedure for a fuzzy adaptive control algorithm:
(1) Select fuzzy membership functions for the nonlinear system (7.3.1) and construct the fuzzy logic system (7.3.3).
(2) Select $K_o$ and $K_c$ such that the matrices $A - K_oC$ and $A - BK_c$ are Hurwitz. It is important that the real parts of the eigenvalues of $A - K_oC$ are less than those of $A - BK_c$, so that the states of the FPE converge faster than those of the closed-loop system.
(3) Select $\gamma$, $\gamma_1$, $\gamma_2$, $m_{\theta_f}$, and $m_{\theta_g}$, and solve LMI (7.3.16) to obtain $K_v$.
(4) Solve the FPE (7.3.11).
(5) Solve the adaptive control law (7.3.28).
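Step (2) of the design procedure can be checked numerically. The sketch below does this for the second-order case, using for concreteness the gains that appear later in Table 7.3.1. It assumes that $K_o = [30, 225]$ acts as a column vector and $K_c = [144, 24]$ as a row vector, which is the only reading for which the products $K_oC$ and $BK_c$ are 2 x 2 matrices. The helper names are illustrative.

```python
import cmath


def eig2(m):
    """Eigenvalues of a 2x2 matrix via its characteristic polynomial."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + root) / 2, (tr - root) / 2)


def is_hurwitz(m):
    """True if every eigenvalue has a strictly negative real part."""
    return all(ev.real < 0 for ev in eig2(m))


# Second-order form of (7.3.2): A = [[0, 1], [0, 0]], B = [0, 1]^T, C = [1, 0].
Ko = (30.0, 225.0)  # FPE gain (assumed column vector)
Kc = (144.0, 24.0)  # controller gain (assumed row vector)

A_minus_KoC = [[-Ko[0], 1.0], [-Ko[1], 0.0]]
A_minus_BKc = [[0.0, 1.0], [-Kc[0], -Kc[1]]]

fpe_eigs = eig2(A_minus_KoC)
ctl_eigs = eig2(A_minus_BKc)
print(fpe_eigs, ctl_eigs)
print(is_hurwitz(A_minus_KoC) and is_hurwitz(A_minus_BKc))
# The FPE is faster if its slowest eigenvalue lies to the left of the controller's.
print(max(e.real for e in fpe_eigs) < max(e.real for e in ctl_eigs))
```

Under these assumptions both matrices have a repeated real eigenvalue, at -15 for the FPE and at -12 for the controller, so both are Hurwitz and the FPE error dynamics are the faster of the two.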
7.3.5 Simulation
To verify the validity of the proposed control scheme, the dynamics of an inverted pendulum on a cart are simulated. The state equation of the inverted pendulum is given by
$$\dot{x} = Ax + B[f(x) + g(x)u + d], \qquad y = Cx,$$
where
$$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & 0 \end{bmatrix},$$
$$f(x) = \frac{m l x_2^2 \sin x_1 \cos x_1 - (M + m) g \sin x_1}{m l \cos^2 x_1 - \frac{4}{3} l (M + m)}, \qquad g(x) = \frac{-\cos x_1}{m l \cos^2 x_1 - \frac{4}{3} l (M + m)},$$
$x = [x_1, x_2]^T$, $x_1$ denotes the angle (rad) of the pendulum, $x_2$ is the angular velocity (rad/sec), $M$ is the mass (kg) of the cart, $m$ is the mass (kg) of the pendulum, $g = 9.8$ m/sec$^2$ is the acceleration due to gravity, $l$ is the half-length of the pendulum, $u$ is the force (N) applied to the cart, and $d$ is the external disturbance. The pendulum parameters are chosen as $M = 1$ kg, $m = 0.1$ kg, and $l = 0.5$ m, and $d(t)$ is chosen as a square wave with amplitude $\pm 0.1$ and period $2\pi$. Our control objective is to make the state $x_1$ track the reference trajectory $y_m = (\pi/30)\sin(t)$. The fuzzy membership functions for the FPE states $\hat{x}_j$ ($j = 1, 2$) are given as
$\mu_{A_j^1}(\hat{x}_j) = 1/(1 + \exp(5 \times (\hat{x}_j + 0.75)))$,
$\mu_{A_j^2}(\hat{x}_j) = \exp(-(\hat{x}_j + 0.5)^2)$,
$\mu_{A_j^3}(\hat{x}_j) = \exp(-(\hat{x}_j + 0.25)^2)$,
$\mu_{A_j^4}(\hat{x}_j) = \exp(-\hat{x}_j^2)$,
$\mu_{A_j^5}(\hat{x}_j) = \exp(-(\hat{x}_j - 0.25)^2)$,
$\mu_{A_j^6}(\hat{x}_j) = \exp(-(\hat{x}_j - 0.5)^2)$,
$\mu_{A_j^7}(\hat{x}_j) = 1/(1 + \exp(-5 \times (\hat{x}_j - 0.75)))$.
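The pendulum nonlinearities and the seven membership functions above can be written out as a Python sketch. It assumes the standard cart-pole model in which the garbled coefficient in the denominators is $\frac{4}{3}l$, and it forms the 49 rule firing strengths by product inference over the 7 x 7 grid. The function names are illustrative.

```python
import math

M_CART, M_POLE, HALF_LEN, GRAV = 1.0, 0.1, 0.5, 9.8  # values from the text


def _den(x1):
    """Common denominator of f and g (assumes the 4/3 * l coefficient)."""
    return (M_POLE * HALF_LEN * math.cos(x1) ** 2
            - (4.0 / 3.0) * HALF_LEN * (M_CART + M_POLE))


def f(x1, x2):
    num = (M_POLE * HALF_LEN * x2 ** 2 * math.sin(x1) * math.cos(x1)
           - (M_CART + M_POLE) * GRAV * math.sin(x1))
    return num / _den(x1)


def g(x1):
    return -math.cos(x1) / _den(x1)


def memberships(z):
    """The seven membership grades for one FPE state."""
    return [
        1.0 / (1.0 + math.exp(5.0 * (z + 0.75))),
        math.exp(-(z + 0.5) ** 2),
        math.exp(-(z + 0.25) ** 2),
        math.exp(-z ** 2),
        math.exp(-(z - 0.25) ** 2),
        math.exp(-(z - 0.5) ** 2),
        1.0 / (1.0 + math.exp(-5.0 * (z - 0.75))),
    ]


def basis_vector(x1_hat, x2_hat):
    """49 normalized rule firing strengths (product inference, 7 x 7 grid)."""
    firing = [a * b for a in memberships(x1_hat) for b in memberships(x2_hat)]
    total = sum(firing)
    return [w / total for w in firing]


psi = basis_vector(0.1, -0.2)
print(len(psi), round(sum(psi), 6))
print(round(g(0.0), 4))
```

With the stated parameters the input gain at the upright position evaluates to about 1.4634, and $f$ vanishes there, as expected for an equilibrium of the unforced pendulum.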
Because the FPE has two states and each state has seven fuzzy subsets, there are 49 fuzzy rules in total in the model. All parameters are given in Table 7.3.1. $K_o$ and $K_c$ are chosen such that $A - K_oC$ and $A - BK_c$ are Hurwitz matrices. In order to guarantee that the FPE converges faster than the controller, the eigenvalues of $A - K_oC$ should be less than 3 to 5 times the eigenvalues of $A - BK_c$.

Table 7.3.1: The initial parameters ($i = 1, 2, \ldots, 49$)
$\gamma_1$ | $\gamma_2$ | $m_{\theta_f}$ | $m_{\theta_g}$ | $\theta_{fi}(0)$ | $\theta_{gi}(0)$ | $K_o$ | $K_c$
1 | 1 | 16 | 1 | 0 | 1 | [30, 225] | [144, 24]

The simulation procedure can be divided into two steps.
Step 1: FPE performance test (open-loop test). Choose the exciting input $u = 0.5[\sin(2t) + \cos(20t)]$, so that the pendulum runs in the permissible range. The simulation stop time $t_f$ is chosen as 10 sec. Define the performance evaluation index as
$$J_{FPE} = \int_0^{t_f} \|e(\tau)\|^2\,d\tau.$$
The test results for different $\gamma$ are given in Table 7.3.2.

Table 7.3.2: The performance index for different attenuation coefficients
$\gamma$ | $K_v$ | $J_{FPE}$ (open-loop) | $J$ (closed-loop)
0.18 | [217.7192, -9.6149] | 16.0791 | 219.7978
0.16 | [218.0377, -10.6487] | 11.5628 | 225.8274
0.1 | [219.7577, -16.2312] | 5.3612 | 0.6361
0.06 | [222.8155, -26.1556] | 2.1526 | 0.3568
0.02 | [238.1044, -75.7775] | 0.2643 | 0.2193
0.006 | [291.6156, -249.4542] | 0.0246 | 0.1809
0.002 | [444.5048, -745.6735] | 0.0028 | 0.1712

Now we define acceptable performance as $|x_1 - \hat{x}_1| \le 0.0873$ rad (or 5 degrees) and $|x_2 - \hat{x}_2| \le 0.5$ rad/s, so that the acceptable performance evaluation index value should be less than
$$\int_0^{t_f} \|e(\tau)\|^2\,d\tau = \int_0^{t_f} (0.0873^2 + 0.5^2)\,d\tau = 2.5762.$$
From Table 7.3.2, by looking at the values of $J_{FPE}$, we may estimate that the closed-loop system will achieve satisfactory performance when $\gamma \le 0.06$. Further estimation results can be obtained by a closed-loop test.
Step 2: Closed-loop test. Define the output tracking error $\xi_1 = x_1 - y_m$ and the closed-loop performance index $J = \int_0^{t_f} \xi_1^2(\tau)\,d\tau$. The closed-loop test results for different $\gamma$ are given in Table 7.3.2. The table shows a positive relationship between the performance in the open-loop test and that in the closed-loop test.
We chose $\gamma = 0.02$, 0.1, and 0.16, which correspond to "good," "normal," and "bad" performance of the FPE, respectively, to verify the above analysis. In the open-loop test, the state error trajectories between the FPE and the real system are shown in Figures 7.3.1-7.3.3. These figures show that the FPE can track the real system very well when a good $\gamma$ is chosen: the matching error and external disturbance are attenuated effectively. But when $\gamma = 0.16$, the performance evaluation index is higher than the acceptable performance evaluation index, and by our analysis the closed-loop system's performance will not be good. The closed-loop simulation results are shown in Figures 7.3.4-7.3.6, respectively.

Figure 7.3.1: Trajectories of states and errors with $\gamma = 0.02$ (open-loop test).
Figure 7.3.2: Trajectories of states and errors with $\gamma = 0.1$ (open-loop test).
Figure 7.3.3: Trajectories of the states and errors with $\gamma = 0.16$ (open-loop test).
Figure 7.3.4: Trajectories of the states, control input and disturbance attenuation with $\gamma = 0.02$ (closed-loop test).
Figure 7.3.5: Trajectories of the states, control input and disturbance attenuation with $\gamma = 0.1$ (closed-loop test).
Figure 7.3.6: Trajectories of the states, control input and disturbance attenuation with $\gamma = 0.16$ (closed-loop test).

The closed-loop system's performance is consistent with the FPE's performance. For $\gamma = 0.16$, where the theory predicts poor closed-loop performance, the simulation result (Figure 7.3.6) verifies the prediction. The FPE performance test is therefore significant for predicting the performance of the closed-loop system. Figure 7.3.7 shows the tracking error trajectories of the pendulum angle for different $\gamma$. The simulation results in these cases demonstrate that if the state trajectories of the FPE track the actual states well, then the actual states can also track the reference signal well.
Figure 7.3.7: Error trajectories of the pendulum angle.
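The acceptable index value quoted for the open-loop test can be reproduced numerically. The sketch below is only an illustration: it approximates the integral of the squared error norm with the trapezoidal rule, and the function name, the step size, and the decaying demo error are assumptions, not part of the original simulation.

```python
import math


def squared_error_index(errors, dt):
    """Trapezoidal approximation of J = integral of ||e(tau)||^2 over the run."""
    sq = [sum(c * c for c in e) for e in errors]
    return dt * (sum(sq) - 0.5 * (sq[0] + sq[-1]))


# Acceptable bound: errors held at 0.0873 rad and 0.5 rad/s for t_f = 10 s.
t_f, dt = 10.0, 0.01  # dt is an assumed sampling step
n = int(round(t_f / dt)) + 1
j_acceptable = squared_error_index([(0.0873, 0.5)] * n, dt)

# Hypothetical decaying estimation error, used only to exercise the function.
demo_errors = [(0.05 * math.exp(-k * dt), 0.3 * math.exp(-k * dt)) for k in range(n)]
j_demo = squared_error_index(demo_errors, dt)

print(round(j_acceptable, 4))   # 2.5762
print(j_demo < j_acceptable)    # True
```

An index computed from logged FPE errors can then be compared directly against `j_acceptable` before running the closed-loop test.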
7.4 Fuzzy State Feedback Control Scheme via FPE

In the previous section, we discussed a fuzzy adaptive control scheme via FPE based on a fuzzy basis function (FBF), which shows excellent control performance. However, its adaptive law is rather complex. In this section, a new robust fuzzy controller will be developed based on a fuzzy dynamic model, for which the control law is much simpler. The control results will be compared with those of a parallel distributed compensation (PDC) scheme.
7.4.1 Problem Formulation

Consider a class of nonlinear systems as follows:

\[ \dot{x}(t) = A(x(t)) + Bu(t) + w_0(t), \qquad y(t) = C(x(t)), \tag{7.4.1} \]

where $x(t) = [x_1(t), \ldots, x_n(t)]^T \in \mathbb{R}^n$, $u(t) = [u_1(t), \ldots, u_m(t)]^T \in \mathbb{R}^m$, $A(x(t))$ and $C(x(t))$ are unknown nonlinear functions, $B \in \mathbb{R}^{n \times m}$ is known and independent of $x$, $B^TB$ is nonsingular, $w_0(t) = [w_{01}(t), \ldots, w_{0n}(t)]^T \in \mathbb{R}^n$ is a bounded external disturbance vector, and $y(t) = [y_1(t), \ldots, y_p(t)]^T \in \mathbb{R}^p$ is the output vector. We assume that all states of the system are measurable.

We apply a fuzzy dynamic model with $L$ IF-THEN rules to describe such a nonlinear system, where the $i$th fuzzy rule is as follows:

IF $x_1$ is $F_{i1}$, ..., and $x_n$ is $F_{in}$, THEN

\[ \dot{x} = A_i x + Bu, \qquad y = C_i x, \tag{7.4.2} \]

where $i = 1, 2, \ldots, L$, $F_{ij}$ are fuzzy sets, and $A_i \in \mathbb{R}^{n \times n}$. By using product inference, center-average defuzzification and a singleton fuzzifier, the output of the fuzzy logic system can be expressed as [13]

\[ \dot{x} = \sum_{i=1}^{L} h_i(x) A_i x + Bu, \tag{7.4.3} \]

\[ y = \sum_{i=1}^{L} h_i(x) C_i x, \tag{7.4.4} \]

where $\mu_i(x) = \prod_{j=1}^{n} F_{ij}(x_j)$, $h_i(x) = \mu_i(x) / \sum_{j=1}^{L} \mu_j(x)$, and $F_{ij}(x_j)$ denotes the fuzzy membership function value of the fuzzy variable $x_j$. By (7.4.3) and (7.4.4), (7.4.1) can be rewritten as follows:

\[
\begin{aligned}
\dot{x} &= A(x) + Bu + w_0 \\
&= \sum_{i=1}^{L} h_i(x) A_i x + Bu + A(x) - \sum_{i=1}^{L} h_i(x) A_i x + w_0 \\
&= \sum_{i=1}^{L} h_i(x) A_i x + Bu + \Delta A(x) + w_0,
\end{aligned}
\tag{7.4.5}
\]

\[ y = C(x) = \sum_{i=1}^{L} h_i(x) C_i x + \Delta C(x). \tag{7.4.6} \]

Suppose that $\Delta A(x) = A(x) - \sum_{i=1}^{L} h_i(x) A_i x$ and $\Delta C(x) = C(x) - \sum_{i=1}^{L} h_i(x) C_i x$ are bounded modeling errors. We use $w_1 = w_0 + \Delta A(x)$ to denote the combined external disturbance and modeling error. Thus, (7.4.5) can be rewritten as follows:

\[ \dot{x} = \sum_{i=1}^{L} h_i(x) A_i x + Bu + w_1. \tag{7.4.7} \]
Consider a reference model given by

\[ \dot{x}_m = A_r x_m + B r, \]

where $x_m$ denotes a reference state vector, $A_r$ denotes a specific asymptotically stable matrix, $r$ denotes a bounded reference input, and $B$ is the same as in (7.4.1). Our goal is to make the state of (7.4.7) track the state of the reference model.
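The blending of local linear models by normalized firing strengths can be sketched in a few lines. This is a minimal illustration only: the Gaussian membership shape, the two-rule example, and all numerical values are assumptions made for the sketch and are not taken from the text.

```python
import math


def gaussian(x, center, width):
    """Hypothetical Gaussian membership function F_ij."""
    return math.exp(-((x - center) / width) ** 2)


def firing_weights(x, rules):
    """rules[i][j] = (center, width) of F_ij; returns the normalized h_i(x)."""
    mu = [math.prod(gaussian(xj, c, w) for xj, (c, w) in zip(x, rule))
          for rule in rules]
    total = sum(mu)
    return [m / total for m in mu]


def matvec(a, v):
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def blended_dynamics(x, u, rules, a_list, b):
    """Evaluate sum_i h_i(x) A_i x + B u for the fuzzy dynamic model."""
    h = firing_weights(x, rules)
    dx = [0.0] * len(x)
    for hi, a in zip(h, a_list):
        ax = matvec(a, x)
        dx = [d + hi * v for d, v in zip(dx, ax)]
    bu = matvec(b, u)
    return [d + v for d, v in zip(dx, bu)], h


# Two assumed rules on a two-state system with one input.
rules = [[(-1.0, 1.0), (0.0, 1.0)], [(1.0, 1.0), (0.0, 1.0)]]
a_list = [[[0.0, 1.0], [-1.0, -1.0]], [[0.0, 1.0], [-3.0, -1.0]]]
b = [[0.0], [1.0]]
dx, h = blended_dynamics([0.0, 1.0], [2.0], rules, a_list, b)
print(h, dx)
```

At the chosen test point both rules fire equally, so the blended model is the average of the two local models.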
7.4.2 Design of a Fuzzy Performance Evaluator

According to fuzzy model (7.4.2), construct an FPE with $L$ fuzzy rules to evaluate the validity of the fuzzy model. The $i$th IF-THEN rule is written as follows:

IF $x_1$ is $F_{i1}$, ..., and $x_n$ is $F_{in}$, THEN

\[ \dot{\hat{x}} = A_i \hat{x} + B(u + v) + M_i(y - \hat{y}), \qquad \hat{y} = C_i \hat{x}, \]

where $\hat{x}$ denotes the state variable of the FPE, $v = -K_v(x - \hat{x})$ is applied to attenuate the external disturbance and modeling errors, $\hat{y}$ denotes the output of the FPE, and $M_i$ ($i = 1, 2, \ldots, L$) are performance evaluator gains. The overall FPE is given by

\[ \dot{\hat{x}} = \sum_{i=1}^{L} h_i(x) \left( A_i \hat{x} + M_i(y - \hat{y}) \right) + B(u + v). \tag{7.4.8} \]

For convenience, define $A(x) = \sum_{i=1}^{L} h_i(x) A_i$, $M(x) = \sum_{i,j=1}^{L} h_i(x) h_j(x) M_i C_j$, and $d = \sum_{i=1}^{L} h_i(x) M_i \Delta C(x)$. Therefore, (7.4.8) can be rewritten as

\[ \dot{\hat{x}} = A(x)\hat{x} + B(u + v) + M(x)(x - \hat{x}) + d. \tag{7.4.9} \]

Let us consider the following performance evaluator index:

\[ J = \int_0^{t_f} (x - \hat{x})^T (x - \hat{x}) \, dt, \tag{7.4.10} \]

which shows the tracking performance of the FPE. In the following analysis, we can estimate the performance of the closed-loop system via $J$ to some extent. Define the performance evaluation error as

\[ e = x - \hat{x}. \tag{7.4.11} \]

Differentiating (7.4.11) and using (7.4.7) and (7.4.8), we get

\[
\begin{aligned}
\dot{e} &= \dot{x} - \dot{\hat{x}} \\
&= \sum_{i=1}^{L} h_i(x) A_i x + Bu + w_1 - \sum_{i,j=1}^{L} h_i(x) h_j(x) \left( A_i \hat{x} + M_i C_j (x - \hat{x}) \right) - B(u + v) - d \\
&= \sum_{i,j=1}^{L} h_i(x) h_j(x) (A_i - M_i C_j + BK_v) e + w_1 - d.
\end{aligned}
\tag{7.4.12}
\]

Let $\bar{A}_{ij} = A_i - M_i C_j$, $\tilde{A}_{ij} = \bar{A}_{ij} + BK_v$, and $w = w_1 - d$. Then, (7.4.12) can be rewritten as

\[ \dot{e} = \sum_{i,j=1}^{L} h_i(x) h_j(x) \tilde{A}_{ij} e + w. \tag{7.4.13} \]

In the following, we discuss how to design $M_i$ and $v$ to make the state tracking error satisfy the following $H_\infty$ performance index:

\[ \int_0^{t_f} e^T(t) e(t) \, dt \le e^T(0) P e(0) + \gamma^2 \int_0^{t_f} w^T(t) w(t) \, dt. \tag{7.4.14} \]
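For each local linear model, we can obtain $M_i$ by assigning the eigenvalues of $A_i - M_i C_i$ to desired values. The disturbance attenuation term $v$ can be designed via Theorem 7.4.1.

Theorem 7.4.1. Consider the error equation of the FPE (7.4.13). If the matrix $P = P^T = Q^{-1} > 0$ and the matrix $Y$ are the common solutions of the following LMIs ($i, j = 1, 2, \ldots, L$):

\[
\begin{bmatrix}
Q\bar{A}_{ij}^T + \bar{A}_{ij} Q + Y^T B^T + BY + \dfrac{1}{\gamma^2} I & Q \\
Q & -I
\end{bmatrix} < 0, \tag{7.4.15}
\]

then the $H_\infty$ performance (7.4.14) is guaranteed with the linear gain matrix $K_v = YQ^{-1}$.

Proof. Consider a Lyapunov function $V = e^T P e$. Differentiating the function $V$ with respect to time $t$, we get

\[
\dot{V} = \dot{e}^T P e + e^T P \dot{e}
= \sum_{i=1}^{L} h_i(x) \sum_{j=1}^{L} h_j(x) \, e^T \left( \tilde{A}_{ij}^T P + P \tilde{A}_{ij} \right) e + 2 w^T P e.
\]

The performance index $\int_0^{t_f} e^T e \, dt$ is evaluated as follows:

\[
\begin{aligned}
\int_0^{t_f} e^T e \, dt
&= e^T(0) P e(0) - e^T(t_f) P e(t_f) + \int_0^{t_f} \left[ e^T e + \frac{d}{dt}\left( e^T P e \right) \right] dt \\
&\le e^T(0) P e(0) + \int_0^{t_f} \left[ e^T e + \frac{d}{dt}\left( e^T P e \right) \right] dt \\
&= e^T(0) P e(0) + \int_0^{t_f} e^T \left[ \sum_{i=1}^{L} h_i(x) \sum_{j=1}^{L} h_j(x) \left( \tilde{A}_{ij}^T P + P \tilde{A}_{ij} \right) \right] e \, dt + \int_0^{t_f} \left[ e^T e + 2 w^T P e \right] dt \\
&= e^T(0) P e(0) + \int_0^{t_f} e^T \left[ \sum_{i=1}^{L} h_i(x) \sum_{j=1}^{L} h_j(x) \left( \tilde{A}_{ij}^T P + P \tilde{A}_{ij} \right) \right] e \, dt \\
&\quad + \int_0^{t_f} \left[ e^T e - \left( \gamma w - \frac{1}{\gamma} P e \right)^T \left( \gamma w - \frac{1}{\gamma} P e \right) + \gamma^2 w^T w + \frac{1}{\gamma^2} e^T P^T P e \right] dt \\
&\le e^T(0) P e(0) + \int_0^{t_f} \left( e^T \left[ \sum_{i=1}^{L} h_i(x) \sum_{j=1}^{L} h_j(x) \left( \tilde{A}_{ij}^T P + P \tilde{A}_{ij} \right) \right] e + e^T e + \gamma^2 w^T w + \frac{1}{\gamma^2} e^T P^T P e \right) dt \\
&= \int_0^{t_f} e^T \left[ \sum_{i=1}^{L} h_i(x) \sum_{j=1}^{L} h_j(x) \left( I + \tilde{A}_{ij}^T P + P \tilde{A}_{ij} + \frac{1}{\gamma^2} P^T P \right) \right] e \, dt + e^T(0) P e(0) + \gamma^2 \int_0^{t_f} w^T w \, dt.
\end{aligned}
\]

To show (7.4.14), we need to show that the first term on the right-hand side of the above expression is negative. According to the definition of $\tilde{A}_{ij}$, we have

\[
I + \tilde{A}_{ij}^T P + P \tilde{A}_{ij} + \frac{1}{\gamma^2} P^T P
= I + \bar{A}_{ij}^T P + P \bar{A}_{ij} + K_v^T B^T P + P B K_v + \frac{1}{\gamma^2} P^T P. \tag{7.4.16}
\]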
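The step from the matrix expression in (7.4.16) to an LMI in $Q$ and $Y$ can be sketched as follows. This is a standard congruence-plus-Schur-complement argument written under the assumptions $Q = P^{-1}$, $P = P^T$, $Y = K_v Q$, and $\bar{A}_{ij} = A_i - M_i C_j$; it is offered as a reading aid and is not a quotation of the original proof.

```latex
\begin{align*}
& I + \bar{A}_{ij}^T P + P\bar{A}_{ij} + K_v^T B^T P + P B K_v + \gamma^{-2} P P < 0 \\
\iff\; & QQ + Q\bar{A}_{ij}^T + \bar{A}_{ij} Q + Y^T B^T + B Y + \gamma^{-2} I < 0
  && \text{(pre- and postmultiply by } Q\text{)} \\
\iff\; &
\begin{bmatrix}
Q\bar{A}_{ij}^T + \bar{A}_{ij} Q + Y^T B^T + B Y + \gamma^{-2} I & Q \\
Q & -I
\end{bmatrix} < 0
  && \text{(Schur complement of } -I\text{)}
\end{align*}
```

Once $Q$ and $Y$ are found by an LMI solver, the gain follows as $K_v = YQ^{-1}$.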
0 with the linear gain Kic = YiQ~^^, I ^ e^edt < £^{0)P5e{0) + / ^ p^5^5dt. Jo Jo
(1A21)
Proof. Consider a Lyapunov function V{e^ x, t) = e^Pss. Differentiating the function with respect to time t, we obtain V = e^Pse + e' Pss L
r L
T
Y^h,{x){A-BK,,
.T
Pse + s' Ps
Y,h,{x){A,-BKic)
S^Pse + e^PsS
P6+P6 Zhi{x){A,-BKie)
Ps 0
Ps 't
Define the performance index as J = /
(e^e — p^S^S) dt. Then
Jo
J< r
(e^e - P^^^ 0 and a matrix Yy satisfying the following LMIs: [A2iiQ Q] 0
Q {A2ij -h A2jif Q
7/
[iA2ij + A2ji) Q 2T 0 0 7/
(7.5.13)
0 and T = T"^ > 0 and matrices Yi satisfying the following LMIs: On QA2i Q ®i3
QiA2i + A2jf Q
[A2iQ Q] T 0 0 pi
[{A2i + A2j)Q Q] 2T 0 0 pI/2
(7.5.26)
x^{t)PMi5{t)
+ 5^{t)Mf
Px{t),
we have V-i
Premultiplying and postmultiplying (7.5.36) by Q obtain QA^, + AuQ + Y;^Bf + B,Yi + + 7 ^ + -M^M^ 1 - p p
and let Yi = KicQ, we
{A2iQT-^QAl)
+ - Q Q < 0, p
and QAi^^AuQ
+ r / 5 / + B,Yi
+ 0
0
\^.
l-(3
+
p
-MiMf (7.5.37)
2.7520;
Mlow {X2),
X2 < 2.7520;
Mhigh {X2),
X2 > 2 . 7 5 2 0 ;
a;2-2.7520 4.7052-2.7520'
X2 < 2.7520; 2.7520 <X2< 4.7052; X2 > 4.7052.
The FPE model is designed as:

Rule 1: IF $x_2(t)$ is {low}, THEN $\delta\dot{x}(t) = A_{11}\,\delta x(t) + A_{12}\,\delta x(t - \tau) + B_1\,\delta[u(t) + v] + M_1(x_2(t) - \hat{x}_2(t))$,
Rule 2: IF $x_2(t)$ is {middle}, THEN $\delta\dot{x}(t) = A_{21}\,\delta x(t) + A_{22}\,\delta x(t - \tau) + B_2\,\delta[u(t) + v] + M_2(x_2(t) - \hat{x}_2(t))$,
Rule 3: IF $x_2(t)$ is {high}, THEN $\delta\dot{x}(t) = A_{31}\,\delta x(t) + A_{32}\,\delta x(t - \tau) + B_3\,\delta[u(t) + v] + M_3(x_2(t) - \hat{x}_2(t))$,

where $v = K_v\,\delta x(t)$ is the disturbance attenuation term. From (7.5.23), we design the control law $u(t) = u_f(t) - v(t)$. The fuzzy control input $u_f(t)$ is composed of the following three rules:

Rule 1: IF $x_2(t)$ is {low}, THEN $u_f(t) = K_{1c}\,\delta x(t)$,
Rule 2: IF $x_2(t)$ is {middle}, THEN $u_f(t) = K_{2c}\,\delta x(t)$,
Rule 3: IF $x_2(t)$ is {high}, THEN $u_f(t) = K_{3c}\,\delta x(t)$.

The parameters are selected as follows.
(1) All the poles of the three subsystems of the FPE are assigned at $-10 \pm 7i$. We get the feedback gain vectors $M_1 = [-86.2514, 17.6284]^T$, $M_2 = [-17.1158, 19.5660]^T$, and $M_3 = [-2.6938, 16.4558]^T$.
(2) Solve $K_v$ with $\gamma = 0.1$ from Theorem 7.5.1. We get $K_v = [-32.3290, -119.8857]$.
(3) Solve $K_{ic}$ with $\rho = 0.2$ from Theorem 7.5.2. We get $K_{1c} = [-0.1983, 1.024]$, $K_{2c} = [-173.9, 50977]$, $K_{3c} = [-1455.7, 49884.1]$.

Consider the following two cases: without external disturbance and with external disturbance. The simulation results will be compared with those in [3].

Case I: Without external disturbance. The simulation results of the FPE control scheme are shown in Figures 7.5.1-7.5.3, and the simulation results of the PDC control scheme in [3] are shown in Figures 7.5.4 and 7.5.5. The results show that the performance of our control scheme is superior to that of [3].

Case II: With external disturbance. We added white noise with amplitudes $\pm 0.0001$ and $\pm 0.01$ to $x_1$ and $x_2$, respectively. The tracking curves of our control scheme and that of [3] are shown in Figures 7.5.6 and 7.5.7, respectively. These simulation results show the validity of the control scheme via FPE. They also show that the stability and robustness of the control scheme via FPE are superior to those of the traditional PDC control scheme.
Figure 7.5.1: Trajectories of the closed-loop system states via FPE.
$n_a + 1$,

\[
B^{[k]} =
\begin{cases}
q^{k} \left[ B - \sum_{j=0}^{k-1} B_j q^{-j} \right], & k = 1, 2, \ldots, n_b, \\
0, & k \ge n_b + 1.
\end{cases}
\tag{8.3.10}
\]
□
From Definition 8.3.1, we see that $F$, $B$, $F^{[k]}$, $B^{[k]}$ are matrix polynomials of $q^{-1}$; $F_j$ and $B_j$ denote the coefficient matrices of $q^{-j}$ of the polynomials $F$ and $B$, respectively.

Definition 8.3.2. The uncertain term $\Lambda_0(t)$ is a function of the input $u(t)$ as follows:

\[ \Lambda_0(t) = f(u(t), u(t-1)) \approx k_0 + k_1 u(t) + k_2 u(t-1), \tag{8.3.11} \]

where

\[
\begin{aligned}
k_0 &= f(0, 0), \\
k_1 &= \left. \frac{\partial f}{\partial u(t)} \right|_{u(t)=0} \approx \frac{\Lambda_0(t) - \Lambda_0(t-1)}{u(t) - u(t-1)}, \\
k_2 &= \left. \frac{\partial f}{\partial u(t-1)} \right|_{u(t-1)=0} \approx \frac{\Lambda_0(t) - \Lambda_0(t-1)}{u(t-1) - u(t-2)}.
\end{aligned}
\tag{8.3.12}
\]
□

For a slowly time-varying industrial process like the boiler-turbine system studied in this chapter, the term $k_0 \approx 0$ in Definition 8.3.2.

Definition 8.3.3. The matrix polynomials $Y_{k-1}$ and $U_{k-1}$ and the matrix $H_{k-1}$ are defined by the following recursive relationship:

\[
\begin{aligned}
Y_{k-1} &= F^{[k-1]} + \sum_{j=0}^{k-2} F_j Y_{k-2-j} && (k = 2, 3, \ldots), \\
U_{k-1} &= B^{[k]} + \sum_{j=0}^{k-2} F_j U_{k-2-j} && (k = 2, 3, \ldots), \\
H_1 &= B_1 + F_0 H_0 + k_2, \\
H_{k-1} &= B_{k-1} + \sum_{j=0}^{k-2} F_j H_{k-2-j} && (k = 3, 4, \ldots).
\end{aligned}
\tag{8.3.13}
\]
□

We are now in a position to establish the next theorem.

Theorem 8.3.1. The $k$-step ahead prediction of the system in (8.3.8) can be expressed as

\[ y^*(t+k) = S(t+k) + H_0 \Delta u(t+k-1) + H_1 \Delta u(t+k-2) + \cdots + H_{k-1} \Delta u(t), \tag{8.3.14} \]

\[ S(t+k) = Y_{k-1} y(t) + U_{k-1} \Delta u(t-1). \tag{8.3.15} \]

Proof. Based on (8.3.8) and (8.3.10), the future 1-step ahead output is

\[
\begin{aligned}
y(t+1) &= F y(t) + B \Delta u(t) + \Delta\Lambda_0(t) + \xi(t+1) \\
&= F y(t) + B^{[1]} \Delta u(t-1) + B_0 \Delta u(t) + \Delta\Lambda_0(t) + \xi(t+1) \\
&= Y_0 y(t) + U_0 \Delta u(t-1) + H_0 \Delta u(t) + \xi(t+1),
\end{aligned}
\tag{8.3.16}
\]

where $Y_0 = F$, $U_0 = B^{[1]} + k_2$, and $H_0 = B_0 + k_1$. In (8.3.16), $\xi(t+1)$ is an unmeasurable component in the future, so that the 1-step prediction is clearly

\[ y^*(t+1) = S(t+1) + H_0 \Delta u(t), \tag{8.3.17} \]

where $S(t+1)$ is the predictive information that can be estimated at time $t$ and is denoted as

\[ S(t+1) = Y_0 y(t) + U_0 \Delta u(t-1). \tag{8.3.18} \]

The future 2-step ahead output is

\[
\begin{aligned}
y(t+2) &= F y(t+1) + B \Delta u(t+1) + \xi(t+2) + \Delta\Lambda_0(t+1) \\
&= F_0 \left[ Y_0 y(t) + U_0 \Delta u(t-1) + H_0 \Delta u(t) + \xi(t+1) \right] + F^{[1]} y(t) \\
&\quad + B_0 \Delta u(t+1) + B_1 \Delta u(t) + B^{[2]} \Delta u(t-1) + \xi(t+2) + \Delta\Lambda_0(t+1) \\
&= \left[ F^{[1]} + F_0 Y_0 \right] y(t) + \left[ B^{[2]} + F_0 U_0 \right] \Delta u(t-1) + \left[ B_0 + k_1 \right] \Delta u(t+1) \\
&\quad + \left[ B_1 + F_0 H_0 + k_2 \right] \Delta u(t) + \left[ F_0 \xi(t+1) + \xi(t+2) \right] \\
&= Y_1 y(t) + U_1 \Delta u(t-1) + H_0 \Delta u(t+1) + H_1 \Delta u(t) + \left[ F_0 \xi(t+1) + \xi(t+2) \right],
\end{aligned}
\tag{8.3.19}
\]

where

\[ Y_1 = F^{[1]} + F_0 Y_0, \quad U_1 = B^{[2]} + F_0 U_0, \quad H_0 = B_0 + k_1, \quad H_1 = B_1 + F_0 H_0 + k_2. \tag{8.3.20} \]

In (8.3.19), $\xi(t+2)$ is an unmeasurable component in the future, so that the 2-step prediction is clearly

\[ y^*(t+2) = S(t+2) + H_0 \Delta u(t+1) + H_1 \Delta u(t), \tag{8.3.21} \]

where $S(t+2)$ is the predictive information that can be estimated at time $t$ and is denoted as

\[ S(t+2) = Y_1 y(t) + U_1 \Delta u(t-1). \tag{8.3.22} \]

In a similar manner, the $k$-step ahead prediction can be written as (8.3.14) and (8.3.15). This completes the proof of the theorem. ∎

Theorem 8.3.1 indicates that the output prediction of a multivariable plant consists of two parts: one is $S(t+k)$, which can be estimated at time $t$, and the other depends on future control actions yet to be determined.
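The recursion for the gains $H_k$ that multiply the future control increments can be checked on a scalar example. The sketch below is an illustration under stated assumptions: it treats a single-input single-output model $y(t+1) = F y(t) + B \Delta u(t)$ with scalar polynomial coefficients, sets $k_1 = k_2 = 0$, and uses made-up coefficient values; in that special case the gains should coincide with the response to a unit pulse in $\Delta u$.

```python
def coeff(poly, j):
    """Coefficient of q^-j, zero beyond the polynomial degree."""
    return poly[j] if j < len(poly) else 0.0


def prediction_gains(f, b, k1, k2, horizon):
    """H_0 .. H_{horizon-1} for scalar F = sum f[j] q^-j and B = sum b[j] q^-j."""
    h = [coeff(b, 0) + k1]                      # H_0 = B_0 + k_1
    for n in range(1, horizon):
        val = coeff(b, n) + sum(coeff(f, j) * h[n - 1 - j] for j in range(n))
        if n == 1:
            val += k2                           # H_1 = B_1 + F_0 H_0 + k_2
        h.append(val)
    return h


def pulse_response(f, b, horizon):
    """Simulate y(t+1) = F y(t) + B du(t) for du(0) = 1, du(t) = 0 otherwise."""
    y = [0.0]
    du = [1.0] + [0.0] * horizon
    for t in range(horizon):
        val = sum(fj * y[t - j] for j, fj in enumerate(f) if t - j >= 0)
        val += sum(bj * du[t - j] for j, bj in enumerate(b) if t - j >= 0)
        y.append(val)
    return y[1:]


f = [1.5, -0.56]   # assumed F_0, F_1
b = [0.2, 0.1]     # assumed B_0, B_1
gains = prediction_gains(f, b, 0.0, 0.0, 5)
response = pulse_response(f, b, 5)
print(gains)
print(response)
```

The two printed lists agree, which is a quick sanity check on the indexing of the recursion before moving to the matrix case.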
8.4 Predictive Control Law for Multivariable Processes
Given the uniformly bounded reference trajectory $y_r(t)$, the objective is to design a controller that minimizes both the generalized output tracking error variance and the input energy consumption as follows:

\[
J = E\left\{ \sum_{j=1}^{N} \left[ y(t+j) - y_r(t+j) \right]^T P(j) \left[ y(t+j) - y_r(t+j) \right] + \sum_{j=1}^{N_u} \Delta u^T(t+j-d)\, Q(j)\, \Delta u(t+j-d) \right\}, \tag{8.4.1}
\]

where $\Delta u(t+j-d) = [\Delta u_1(t+j-d_1), \Delta u_2(t+j-d_2), \ldots, \Delta u_m(t+j-d_m)]^T$ ($t \ge \max_{j=1,\ldots,m}\{d_j\}$), $d_j$ ($j = 1, 2, \ldots, m$) are the input time delays, and $N$, $N_u$ are respectively the prediction horizon and the control increment horizon of a multivariable process. The values $P(j)$ and $Q(j)$ are the weighting sequences of the tracking error and the control increment, respectively, which are in general taken as constant matrices and denoted by

\[ P(j) = \mathrm{diag}(p_1(j), p_2(j), \ldots, p_m(j)) \quad (j = 1, 2, \ldots, N), \tag{8.4.2} \]

and

\[ Q(j) = \mathrm{diag}(q_1(j), q_2(j), \ldots, q_m(j)) \quad (j = 1, 2, \ldots, N_u). \tag{8.4.3} \]

$y_r(t+j)$ is the vector of the reference trajectory at the future time instant $t+j$ and is denoted by

\[ y_r(t+j) = [y_{r1}(t+j), y_{r2}(t+j), \ldots, y_{rm}(t+j)]^T, \tag{8.4.4} \]

where $y_{ri}(t+j)$ ($i = 1, 2, \ldots, m$) is the $i$th reference sequence at time instant $t+j$. $y_{ri}(t+j)$ is obtainable from the simple first-order time-lag model

\[ y_{ri}(t+j) = \alpha_i y_{ri}(t+j-1) + (1 - \alpha_i)\,\omega_i, \tag{8.4.5} \]

where $\omega_i$ is the $i$th output set-point of a multivariable system and $\alpha_i$ ($0 < \alpha_i < 1$) stands for the adjustable parameter of the $i$th reference trajectory. By minimizing the cost function in (8.4.1), the predictive control law can easily be obtained as

\[ \Delta U = \left[ H^T P H + Q \right]^{-1} H^T P (y_r - S), \tag{8.4.6} \]

where $\Delta U = [\Delta u^T(t), \Delta u^T(t+1), \ldots, \Delta u^T(t+N_u-1)]^T$ is the control increment vector of dimension $mN_u$; $S = [S^T(t+1), S^T(t+2), \ldots, S^T(t+N)]^T$ stands for the output prediction vector of dimension $mN$, which can be estimated at time $t$; and $H$ is a matrix of dimension $mN \times mN_u$ given by

\[
H = \begin{bmatrix}
H_0 & 0 & \cdots & 0 \\
H_1 & H_0 & \cdots & 0 \\
H_2 & H_1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
H_{N-1} & H_{N-2} & \cdots & H_{N-N_u}
\end{bmatrix}
\tag{8.4.7}
\]

with $H_i \in \mathbb{R}^{m \times m}$. Denote the first $m$ rows of $\left[ H^T P H + Q \right]^{-1} H^T P$ in (8.4.6) as the matrix $R_g$, i.e.,

\[ R_g = [R_{g1} \;\; R_{g2} \;\; \cdots \;\; R_{gN}], \tag{8.4.8} \]

where $R_{gi}$ is a matrix of dimension $m \times m$. Therefore, we can obtain from (8.4.6):

\[ \Delta u(t) = R_g [y_r - S] = \sum_{i=1}^{N} R_{gi}\, y_r(t+i) - \sum_{i=1}^{N} R_{gi}\, S(t+i). \tag{8.4.9} \]

From (8.3.15), we have

\[ \Delta u(t) = R(q)\, y_r(t) - R_y(q^{-1})\, y(t) - R_u(q^{-1})\, \Delta u(t-1), \tag{8.4.10} \]

where

\[
\begin{aligned}
\Delta u(t) &= u(t) - u(t-1), \\
R(q) &= \sum_{i=1}^{N} R_{gi}\, q^{i}, \\
R_y(q^{-1}) &= \sum_{i=1}^{N} R_{gi}\, Y_{i-1}, \\
R_u(q^{-1}) &= \sum_{i=1}^{N} R_{gi}\, U_{i-1}.
\end{aligned}
\tag{8.4.11}
\]
8.5 Stability of a Fuzzy Generalized Predictive Control System
The following notation will be used for matrices $A$ and $B$.
(a) $\rho(A)$ denotes the spectral radius of the matrix $A$.
(b) $|A|$ denotes the modulus of the matrix $A$, i.e., the matrix whose elements are the moduli of the elements of $A$.
(c) $\lambda(A)$ denotes an eigenvalue of the matrix $A$.
(d) $A \le B$ means $a_{ij} \le b_{ij}$ for all $i$ and $j$, where $A = (a_{ij})$ and $B = (b_{ij})$.

From (8.4.10), we can derive

\[ u(t) = R_{yr}(q^{-1})\, y_r(t) - R_{yy}(q^{-1})\, y(t), \tag{8.5.1} \]

where $R_{yr}(q^{-1})$ and $R_{yy}(q^{-1})$ are two transfer function matrices, which relate to the matrix polynomials in (8.4.11). Then (8.3.7) can be expressed as

\[ \left[ A(q^{-1}) + B(q^{-1}) R_{yy}(q^{-1}) q^{-1} \right] y(t) = B(q^{-1}) R_{yr}(q^{-1})\, y_r(t-1) + \Lambda_0(t-1) + \xi(t)/\Delta. \tag{8.5.2} \]

(8.5.2) can be transformed into the following state-space equations:

\[ x(k+1) = A_s x(k) + B_s y_r(k), \qquad y(k) = C_s x(k), \tag{8.5.3} \]

where $A_s = A_{ss} + \Delta A_s$, $A_{ss}$ is a nonsingular matrix with an appropriate dimension, and $\Delta A_s$ is the uncertain term from $\xi(t)$.

Definition 8.5.1. For two $n \times n$ matrices $A$ and $B$, $A \ge B$ denotes an elementwise inequality. A family of interval matrices is defined as [9]:

\[ \mathcal{A}(\underline{A}, \overline{A}) = \left\{ A \in \mathbb{R}^{n \times n} : \underline{A} \le A \le \overline{A} \right\}. \]
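A simple sufficient test for a family of interval matrices can be sketched using only the notation introduced above. The sketch relies on a standard fact rather than on any theorem of this chapter: every member $A$ of the family satisfies $|A| \le U$ elementwise, where $U$ is the elementwise maximum of $|\underline{A}|$ and $|\overline{A}|$, so $\rho(A) \le \|A\|_\infty \le \|U\|_\infty$. The function names and the numerical bounds are assumptions chosen for illustration.

```python
def majorant(lower, upper):
    """Elementwise U = max(|lower|, |upper|), which bounds |A| for every member."""
    return [[max(abs(lo), abs(up)) for lo, up in zip(row_lo, row_up)]
            for row_lo, row_up in zip(lower, upper)]


def max_row_sum(matrix):
    """Induced infinity norm of a matrix with nonnegative entries."""
    return max(sum(row) for row in matrix)


def interval_family_is_schur_stable(lower, upper):
    """Sufficient (not necessary) discrete-time test: ||U||_inf < 1.

    True means every matrix in the interval has spectral radius below one.
    False means this particular test is inconclusive.
    """
    return max_row_sum(majorant(lower, upper)) < 1.0


lower = [[0.1, -0.3], [0.0, 0.2]]   # assumed lower bound matrix
upper = [[0.4, 0.2], [0.3, 0.5]]    # assumed upper bound matrix
print(interval_family_is_schur_stable(lower, upper))
```

The test is conservative; a sharper check would bound $\rho(U)$ itself instead of its infinity norm.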
Figure 8.9.2: The response of the fuzzy generalized predictive control system (solid line) and the boiler-follow-mode unit (squares) when the plant's inertia is decreased.
8.10 Fuzzy Modeling of Operators' Control Rules with Application

Traditionally, fuzzy control rules are obtained by summarizing the fuzzy information that people receive from the controlled plant and from operating experience. In practice, however, skilled human operators can often control certain equipment well but find it very difficult to state a corresponding fuzzy control rule. Usually, operators can only provide a coarse, imperfect fuzzy control rule. It is difficult for manual operators to express a control strategy in words, even though they understand it. At the same time, the manual operator applies a fuzzy quality criterion to the fuzzy control rules, which is easy to understand but difficult to express. It is therefore very meaningful to study methods for automatically obtaining fuzzy control rules from manual control data. In this section, we present a new method to form a set of fuzzy control rules by means of fuzzy modeling of human operators' control actions. From the fuzzy model we can also formulate the corresponding fuzzy control state action table as well as the look-up table automatically. Its effectiveness will be shown by a simulation study.

8.10.1 The Control of Boiler-Turbine Unit System Based on a Control Function Model

In modern industrial production, a lot of equipment still depends on operators' manual control actions. The aim of this section is to utilize input/output data of a controlled plant to simulate the operators' manual control function and derive a fuzzy control rule model. Thus, we turn manual control rules into an automatic controller. In this section, fuzzy control rules are identified for fuzzy generalized predictive control of the boiler-turbine unit [3,16,20]. The composition of such a control system is shown in Figure 8.10.1. The controlled plant is a coal-fired boiler-turbine unit in a 125 MW thermal power plant. The model of the controlled plant is given in (8.7.2). Considering the disturbance of white noise in an actual system, we add a disturbance term to (8.7.1). If the sampling interval is 20 seconds, then we can obtain the CARIMA model of the boiler-turbine unit system by taking the z-transform of (8.7.1) and (8.7.2) as follows:

\[
A(q^{-1}) \begin{bmatrix} N_E(t) \\ P_T(t) \end{bmatrix}
= B(q^{-1}) \begin{bmatrix} U_T(t-1) \\ U_B(t-1) \end{bmatrix} + \xi(t)/\Delta, \tag{8.10.1}
\]
A{q-')
+ B{q-')
-2.5442 0
-h
0 2.4099
-0.8332 0
0 -0.5186
-0.5846 -0.4661
0.0248 1 -0.5810 J
1. 2798 - 0 .457 1
]^-^ + +
' 0.0914 0
r -1.5304 _| 0.8146
-0.016 8 1 -0.0419 q-'
+
-2.2940 0
1^-^ +
0 1.9352
0 0.0007 \ci-\
-0.0030 1 0.0003 J q-'
-0.361 0 0.0726)
0.0031 " 0.0001 q-'.
If $N = 5$, $N_u = 2$, $p_1(j) = 1$, $p_2(j) = 1$, $q_1(i) = 0$, $q_2(i) = 1$ ($j = 1, 2, \ldots, N$; $i = 1, 2, \ldots, N_u$), we obtain the parameters of the closed-loop control system as follows:

Figure 8.10.1: The block diagram of a generalized predictive control system.

Figure 8.10.2: The response of the generalized predictive control system when $N_E$ undergoes a 10% step change.
-4.222 12.417
Ry{q-')
-5.813 6.629
+
-9.7371 -21.0264
+
-0.3532 0.7334
11.4562 -26.3919
-5.3379 -9.9886
8.8346 -13.9215 3.3299 -6.9967
1.1661 -2.4414
-0.0016 0.0033
(8.10.2)
and Ru{q-')
1.1781 -4.3903 1.2318 -2.5552
0.0672 -0.2217 -0.0122 0.0253
-2.9842 6.6617
0.1508 -0.3173 (8.10.3)
We assume that $N_E$ undergoes a 10% step change. The response of the fuzzy generalized predictive control for the boiler-turbine unit system is shown in Figure 8.10.2. We obtain 50 groups of data from the response curves of the system in Figure 8.10.2 with a sampling period of 20 seconds, as shown in Table 8.10.1. We utilize the first kind of fuzzy identification method in Chapter 2 to get the control function model as follows:
Table 8.10.1: The response data of fuzzy generalized predictive control (FGPC)

 k    e1(k)          e2(k)          U_T(k)    U_B(k)
 1    -0.100×10^2    0.000          0.000     0.000
 2    -0.100×10^2    0.000          4.744     7.879
 3    -0.702×10^1    -0.167×10^1    6.262     12.376
 4    -0.563×10^1    -0.221×10^1    8.066     14.633
 5    -0.413×10^1    -0.226×10^1    9.839     14.869
 6    -0.256×10^1    -0.203×10^1    10.853    14.420
 7    -0.144×10^1    -0.151×10^1    11.724    13.446
 8    -0.491×10^0    -0.101×10^1    12.114    12.415
 9    0.747×10^-1    -0.568×10^0    12.215    11.567
 10   0.357×10^0     -0.232×10^0    12.146    10.927
 11   0.455×10^0     -0.261×10^-1   12.936    10.532
 12   0.414×10^0     0.821×10^-1    11.689    10.321
 13   0.320×10^0     0.112×10^0     11.433    10.237
 14   0.211×10^0     0.962×10^-1    11.196    10.226
 15   0.114×10^0     0.601×10^-1    10.991    10.243
 16   0.436×10^-1    0.200×10^-1    10.819    10.261
 17   -0.110×10^-?   -0.132×10^-1   10.677    10.266
 18   -0.207×10^-1   -0.354×10^-1   10.559    10.254
 19   -0.251×10^-1   -0.464×10^-1   10.461    10.228
 20   -0.201×10^-1   -0.483×10^-1   10.378    10.192
 21   -0.113×10^-1   -0.440×10^-1   10.306    10.152
 22   -0.217×10^-2   -0.365×10^-1   10.243    10.114
 23   0.484×10^-2    -0.279×10^-1   10.189    10.080
 24   0.936×10^-2    -0.200×10^-1   10.142    10.051
 25   0.112×10^-1    -0.133×10^-1   10.102    10.030
 26   0.111×10^-1    -0.839×10^-2   10.069    10.014
 27   0.968×10^-2    -0.489×10^-2   10.042    10.004
 28   0.771×10^-2    -0.267×10^-2   10.021    9.997
 29   0.568×10^-2    -0.138×10^-2   10.005    9.993
 30   0.377×10^-2    -0.618×10^-2   9.992     9.991
 31   0.229×10^-2    -0.223×10^-2   9.984     9.991
 32   0.123×10^-2    -0.172×10^-?   9.978     9.991
 33   0.437×10^-2    0.164×10^-2    9.975     9.992
 34   0.286×10^-?    0.315×10^-2    9.973     9.992
 35   -0.141×10^-2   0.393×10^-2    9.973     9.993
 36   -0.360×10^-2   0.566×10^-2    9.973     9.994
 37   -0.351×10^-2   0.635×10^-2    9.974     9.996
 38   -0.401×10^-2   0.732×10^-2    9.976     9.997
 39   -0.351×10^-2   0.734×10^-2    9.978     9.998
 40   -0.416×10^-2   0.765×10^-2    9.980     9.999
 41   -0.371×10^-2   0.731×10^-2    9.982     10.000
 42   -0.326×10^-2   0.635×10^-2    9.983     10.001
 43   -0.463×10^-2   0.637×10^-2    9.985     10.002
 44   -0.362×10^-2   0.521×10^-2    9.987     10.003
 45   -0.346×10^-2   0.420×10^-2    9.988     10.004
 46   -0.412×10^-2   0.385×10^-2    9.990     10.004
 47   -0.312×10^-2   0.271×10^-2    9.991     10.005
 48   -0.335×10^-2   0.225×10^-2    9.992     10.006
 49   -0.302×10^-2   0.168×10^-2    9.993     10.006
 50   -0.254×10^-2   0.137×10^-2    9.994     10.006

Figure 8.10.3: Fuzzy model of the control function, panels (a)-(d).
IF ei{k - 1) is as in Figure 8.10.3 (a), THEN Urik) = -0.69ei(A; - 2) + 0.22ei(A: - 4) + 0.32C/T(A: - 1) + O.bSUrik - 2) -OmWrik - 4) + 0.72e2(A: - 1) - 1.71e2(A: - 2) + 0.39e2(A: - 4); IF ei{k - 1) is as in Figure 8.10.3 (b), THEN Urik) = 2.73ei(A: - 1) - 2.5Sei{k - 2) + 1.09ei(A: - 4) + 4,S2UT{k - 1) -3.4C/T(A:-2);
IF e2{k - 1) is as in Figure 8.10.3 (c), THEN Usik) - -1.116ei(A:-l)-1.31ei(A:-2) + 2.41e2(A:-l)-0.0761e2(A;-4) -h 0A27UB{k - 3)-h 0.0546C/B(A^ - 4);
IF e2{k - 1) is as in Figure 8.10.3 (d), THEN Usik) = -0.245ei(A:-2)+0.232ei(/c-3)+0.244ei(/c-4)-0.773e2(A:-l)
-^imSUsik
- 1) + O.UUsik - 2);
where $e_1(k-i) = N_E(k-i) - 10.0$ ($i = 1, 2, 3, 4$) is the power deviation at time $(k-i)$ and $e_2(k-i) = P_T(k-i) - 0.0$ ($i = 1, 2, 3, 4$) is the pressure deviation at time $(k-i)$.
8.10.2 Simulation Studies

We utilize the control action model above to control the load system. When $N_E$ increases with a step change of 4%, the response of the system is shown in Figure 8.10.4. Figure 8.10.4 is similar to Figure 8.10.2, which shows the validity of the control function modeling method.

Figure 8.10.4: The response of the system based on a control function model.
8.11 Summary

A multivariable fuzzy model-based generalized predictive control (FGPC) approach is developed by means of the principle of Clarke's GPC. The simulation study has shown that this approach tracks load changes faster, gives a steadier dynamic pressure response, and is less sensitive to external disturbances than the conventional boiler-follow load control system, as shown in Figures 8.8.1-8.8.3. Using the feedback configuration of the multivariable FGPC system developed in this chapter, the computational cost and storage requirement can effectively be reduced. By using the present interval matrix analysis method, the stability of the system can immediately be checked. The present multivariable fuzzy control system also has the advantage that the design procedure and the tuning of the controller parameters are simple to understand and implement. It can effectively control a multivariable nonlinear plant, especially one with a large time-delay and time-varying parameters. Some problems remain unsolved, such as how to improve the proposed algorithm so that it can be used for systems with faster dynamics.
Bibliography

[1] J. Buckley, "Sugeno-type controllers are universal controllers," Fuzzy Sets and Systems, vol. 52, no. 2, pp. 299-303, 1993.
[2] S. Cao, "Analysis and design for a class of complex control systems. Part II: Fuzzy controller design," Automatica, vol. 33, pp. 1029-1039, 1997.
[3] L. Chen, Automatic Control Principle for the Thermal Process and Its Applications, China Power Industry Press, China, 1991. (in Chinese)
[4] D. W. Clarke, C. Mohtadi, and P. S. Tuffs, "Generalized predictive control. Part I: The basic algorithm," Automatica, vol. 23, no. 1, pp. 137-148, 1987.
[5] C. Dai, Linear Algebra in Control Systems, Southeastern University Press, Nanjing, China, 1993. (in Chinese)
[6] T. Hansan, T. Fevzullah, and Y. Nejat, "Neural generalized predictive control: Robotic manipulators with cubic and sinusoidal trajectory," Proc. XII International Turkish Symposium on Artificial Intelligence and Neural Networks, Tainn, Turkey, July 2003, pp. 124-132.
[7] O. Hecker, "Nonlinear system identification and predictive control of a heat exchanger," Proc. American Control Conference, 1997, pp. 3294-3298.
[8] J. Rawlings, "Tutorial overview of model predictive control," IEEE Control Systems Magazine, vol. 20, no. 1, pp. 38-52, 2000.
[9] M. E. Sezer and D. D. Siljak, "Stability of interval matrices," IEEE Transactions on Automatic Control, vol. 39, pp. 368-371, 1994.
[10] J. T. Spooner, "Direct adaptive fuzzy control for a class of discrete-time systems," Proc. American Control Conference, 1997, pp. 1814-1818.
[11] K. Tanaka, "Robust stabilization of a class of uncertain nonlinear systems via fuzzy control: Quadratic stability, control theory and linear matrix inequalities," IEEE Transactions on Fuzzy Systems, vol. 4, no. 1, pp. 1-14, 1996.
[12] K. Tanaka and M. Sugeno, "Stability analysis and design of fuzzy control systems," Fuzzy Sets and Systems, vol. 45, no. 1, pp. 135-156, 1992.
[13] S. Tong, J. Tang, and T. Wang, "Fuzzy adaptive control of multivariable nonlinear systems," Fuzzy Sets and Systems, vol. 111, pp. 153-167, 2000.
[14] J. Waller, J. Hu, and K. Hirasawa, "Nonlinear model predictive control utilizing a neuro-fuzzy predictor," Proc. IEEE Conference on Systems, Man, and Cybernetics, Nashville, TN, 2000, pp. 3459-3464.
[15] L. Wang, "Design and analysis of fuzzy identifiers of nonlinear dynamic systems," IEEE Transactions on Automatic Control, vol. 40, no. 1, pp. 111-117, 1995.
[16] H. Zhang, "Fuzzy generalized predictive control and its application," ACTA Automatica Sinica, vol. 19, no. 1, pp. 9-17, 1993. (in Chinese)
[17] H. Zhang and L. Cai, "Multivariable fuzzy generalized predictive control for general nonlinear SISO systems," Cybernetics and Systems, vol. 33, no. 1, pp. 69-99, 2002.
[18] H. Zhang, L. Cai, and Z. Bien, "A fuzzy basis function vector-based multivariable adaptive fuzzy controller for nonlinear systems," IEEE Transactions on Systems, Man, and Cybernetics, vol. 30, no. 1, pp. 210-217, Feb. 2000.
[19] H. Zhang, L. Cai, and Z. Bien, "A multivariable generalized predictive control approach based on T-S fuzzy model," Journal of Intelligent and Fuzzy Systems, vol. 9, no. 3, pp. 169-190, 2000.
[20] H. Zhang and T. Chai, "Fuzzy modeling of operators' control rules and its application," ACTA Automatica Sinica, vol. 20, no. 3, pp. 308-315, 1994. (in Chinese)
Chapter 9

Adaptive Control Methods Based on Fuzzy Basis Function Vectors

9.1 Introduction
In recent years, there have been some attempts to design fuzzy controllers and explain their performance based on a variety of nonlinear control theories. Kiriakidis et al. [6] studied quadratic stability analysis methods in which the Takagi-Sugeno (T-S) model was analyzed as a linear system subject to a class of nonlinear perturbations. However, it is sometimes difficult to determine a positive definite matrix that solves the Lyapunov equation. A robust controller for the T-S fuzzy model was presented in [5], and stability and robustness analysis results were also established. The main result of [5] concerns the global stability of the closed-loop system and its robustness with respect to unstructured uncertainty, which may include modeling errors and disturbances. The main limitation is that the unstructured uncertainty in the system must be relatively small compared to the inputs and outputs. Recently, model-reference adaptive control based on fuzzy basis function networks has been proposed as an alternative method to solve the above problems [7, 8, 11], but the emphasis has been placed on single-input single-output (SISO) plants.

In this chapter, an adaptive control scheme based on a fuzzy basis function vector is developed for multiple-input multiple-output (MIMO) nonlinear systems [1, 2, 12-14]. In the present scheme, a nonlinear system is first linearized and then treated as a partially known system. The partially known dynamics are used to design a nominal feedback controller to stabilize the nominal plant, and a robust controller is designed based on fuzzy basis function vectors to compensate for the effects of system uncertainties. A fuzzy basis function vector is introduced in this chapter to learn the upper bound of the system uncertainties, and its output is used as the parameters of the robust controller. By proper design of a Lyapunov function, which consists of the output tracking error and the robust controller's parameter matching error, we prove the stability of the closed-loop nonlinear control system and show results for robustness analysis of the system with respect to unknown dynamics.

This chapter is organized as follows. In Section 9.2, we give some notation and recall some preliminaries used throughout the present chapter. In Sections 9.3 and 9.4, we consider the adaptive control of MIMO square nonlinear systems and MIMO nonsquare nonlinear systems. In Section 9.5, a numerical example is simulated to illustrate the effectiveness of the present scheme. Section 9.6 summarizes the whole chapter.
9.2 Notation and Preliminaries
In this section, we will first establish some notation used throughout this chapter. Next, we will introduce some preliminaries for the chapter.
9.2.1 Notation
Let $\mathbb{R}$ denote the set of real numbers and let $\mathbb{R}^n$ denote real $n$-dimensional space. If $X \in \mathbb{R}^n$, then $X^T = (x_1, x_2, \ldots, x_n)$ denotes the transpose of $X$. Let $\mathbb{R}^{n \times m}$ denote the set of $n \times m$ real matrices. If $A = [a_{ij}] \in \mathbb{R}^{n \times n}$, then $A^T$ and $A^{-1}$ denote the transpose and inverse of matrix $A$, respectively. If A G R"^>B,
(9.3.15)
If $q(\xi, \eta)$ is Lipschitz in $\xi$, then $|q(\xi, \eta) - q(0, \eta)| \le k_2 |\xi|$ for some positive $k_2$. By using this condition, if $|\eta| > \delta$, we now have
(2) the matrix $C$ should be chosen such that the polynomial $S_j$ is Hurwitz about $S_i$, $i = 1, 2, \ldots, m$; and (3) the matrix $\theta$ in (9.3.46) is updated by the following adaptive mechanism:
$$\dot{\theta} = \varepsilon\,\phi(X)\,|E^TC^T|\,\|C^*E_0^{-1}(X)\|, \qquad (9.3.47)$$
where $\varepsilon > 0$ and the initial matrix $\theta$ is arbitrary with positive entries;
then the output tracking error vector e converges asymptotically to the zero vector. Proof. Consider the following Lyapunov function
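where
$$\tilde{\theta} = \theta^* - \theta, \qquad \dot{\tilde{\theta}} = -\dot{\theta}.$$
Then
$$\dot{V} = S^T\dot{S} + \frac{1}{\varepsilon}\,\mathrm{tr}\big[\tilde{\theta}^T\dot{\tilde{\theta}}\big]. \qquad (9.3.48)$$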
The first term in (9.3.48) is
s'^s = E'^C^CE = E^C'^C[AE
+ "iHEoiXy^pit)
= E'^C'^CAE
+ E'^C^C-^Eo{X)-'^p{t)
+ E'^ C'^ C^
=
+ E^C'^C^Eo{X)-^p{t)
+
E^C'^CAE
X EQ{X){C^)-^[-CAE
+
^Eo{X)-^Ui]
- ^ign{S^)
EQ{X)-'^UI
E'^C'^C^EO{X)-^
\\C^Eo{X)-^\\e'^^{X)\
= E'^C'^C{X)
X \E'^C'^\ = - t r \0*^^{X)
\E'^C'^\
||C*£;O"1(X)||1
•\-£-\i\0'^£r](X)
\\C^E^^{X)l\ \E^C^\ \\C^Eo\X)\\'\
x\\C^E^HX)\\].
+tr[0^(l){X)
\E^C^\
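Therefore, we have
$$\begin{aligned}
\dot{V} &= E^TC^TC^*E_0(X)^{-1}p(t) - \mathrm{tr}\big[\theta^{*T}\phi(X)\,|E^TC^T|\,\|C^*E_0^{-1}(X)\|\big] \\
&= E^TC^TC^*E_0(X)^{-1}p(t) - |E^TC^T|\,\|C^*E_0^{-1}(X)\|\,\theta^{*T}\phi(X) \\
&\le |E^TC^T|\,\|C^*E_0^{-1}(X)\|\,\big(|p(t)| - \theta^{*T}\phi(X)\big)
\end{aligned}$$
9.4 Adaptive Control of Multivariable Nonsquare Nonlinear Systems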
where $S_x \subset \mathbb{R}^n$ is some compact set of allowed state trajectories. As before, an ideal state feedback linearizing control law can be obtained by
$$U^*(t) = \big(J^T(X,t)J(X,t)\big)^{-1}J^T(X,t)\,(-B + \nu), \qquad (9.4.3)$$
and for convenience, the references to $X$ and $t$ are dropped in (9.4.3).
The rest of this subsection is similar to that of Section 9.3.1, and the detailed procedure is omitted here.
9.4.2 Fuzzy System Formation
From (9.4.2) we can obtain
$$E(X)Y = F(X) + U,$$
where
$$E(X) = \big(J^T(X,t)J(X,t)\big)^{-1}J^T(X,t), \qquad F(X) = \big(J^T(X,t)J(X,t)\big)^{-1}J^T(X,t)B(X,t).$$
J(X, t) and F{X) are assumed to be bounded by the following unknown positive function Pi{X) and vector Qi{X), i.e., 0 0 and the initial matrix 6 G M^>^i(^^.06
Figure 9.5.2: The response curve of the output $y_1(t)$.
Figure 9.5.3: The response curve of the output $y_2(t)$.
9.6 Summary
The FBFV has many advantages: (1) it is a universal approximator; (2) it can be determined from given linguistic rules or generated from numerical input-output pairs; (3) it characterizes both local and global properties; and (4) it can be viewed as a nonlinear or a linear function, depending on the method used. Therefore, an FBFV is well suited to approximating any continuous function. In this chapter, an adaptive control scheme is proposed for both MIMO square and MIMO nonsquare nonlinear systems based on the FBFV method. The theoretical analysis demonstrates that the FBFV can be used to learn the MIMO nonlinear system uncertainty bounds in the Lyapunov sense, and that an FBFV-based adaptive hybrid controller can be designed to eliminate the effects of dynamical uncertainties and guarantee that the output tracking errors converge asymptotically to zero. In addition, the scheme has better robustness with respect to unstructured uncertainty. A numerical example shows that this method is strongly robust, converges fast, and is easy to design and use.
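As a concrete illustration of the normalization behind advantages (1) and (3), the following minimal sketch builds a fuzzy basis function vector from Gaussian membership functions in the style of [11]. The function name `fbfv`, the rule centers, and the widths are illustrative assumptions, not values taken from this chapter.

```python
import math

def fbfv(x, centers, widths):
    """Return the fuzzy basis function vector phi(x).

    Assumed form (product inference, Gaussian memberships):
    phi_j(x) = prod_i mu_ij(x_i) / sum_k prod_i mu_ik(x_i).
    """
    strengths = []
    for c_j, w_j in zip(centers, widths):
        s = 1.0
        for x_i, c, w in zip(x, c_j, w_j):
            s *= math.exp(-((x_i - c) / w) ** 2)  # Gaussian membership grade
        strengths.append(s)
    total = sum(strengths)
    return [s / total for s in strengths]

# Hypothetical rule base: three rules over a two-dimensional input.
centers = [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]
widths = [(1.0, 1.0), (1.0, 1.0), (1.0, 1.0)]
phi = fbfv((0.2, -0.1), centers, widths)
```

The entries of `phi` are positive and sum to one, and the rule whose center is nearest the input dominates. An estimate of the form $\theta^T\phi(x)$ is linear in the adjustable matrix $\theta$, which is what makes a Lyapunov-based adaptive law tractable.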
Bibliography
[1] W. Chen and H. Zhang, "Input/output linearization for nonlinear systems with uncertainties and disturbances using TDC," Cybernetics and Systems, vol. 28, no. 7, pp. 625-634, 1997.
[2] W. Chen, H. Zhang, and C. Yin, "An output tracking control for nonlinear systems with uncertainties and disturbances using time delay control," Cybernetica, vol. XL, no. 3, pp. 229-237, 1997.
[3] D. Driankov, Advances in Fuzzy Control, Berlin: Springer-Verlag, 1998.
[4] A. Isidori, Nonlinear Control Systems: An Introduction, Berlin: Springer-Verlag, 1989.
[5] T. A. Johansen, "Fuzzy model based control: stability, robustness, and performance issues," IEEE Transactions on Fuzzy Systems, vol. 2, no. 3, pp. 221-234, 1994.
[6] K. Kiriakidis, A. Grivas, and A. Tzes, "Quadratic stability analysis of the Takagi-Sugeno fuzzy model," Fuzzy Sets and Systems, vol. 98, pp. 1-14, 1998.
[7] Z. Man and X. Yu, "An adaptive control using fuzzy basis function expansions for a class of nonlinear systems," Journal of Intelligent and Robotic Systems, vol. 21, pp. 257-275, 1998.
[8] Z. Man, X. Yu, and Q. Ha, "Adaptive control using fuzzy basis function expansion for SISO linearizable nonlinear systems," Proceedings of 2nd Asian Control Conference, Seoul, Korea, July 1997, pp. 695-698.
[9] R. K. Miller and A. N. Michel, Ordinary Differential Equations, New York: Academic Press, 1982.
[10] J. T. Spooner and K. M. Passino, "Stable adaptive control using fuzzy systems and neural networks," IEEE Transactions on Fuzzy Systems, vol. 4, no. 3, pp. 339-359, 1996.
[11] L. Wang and J. Mendel, "Fuzzy basis function, universal approximation, and orthogonal least-squares learning," IEEE Transactions on Neural Networks, vol. 3, no. 5, pp. 807-814, 1992.
[12] H. Zhang and Z. Bien, "Adaptive fuzzy control of MIMO nonlinear systems," Fuzzy Sets and Systems, vol. 115, pp. 191-204, 2000.
[13] H. Zhang, L. Cai, and Z. Bien, "A fuzzy basis function vector-based multivariable adaptive controller for nonlinear systems," IEEE Transactions on Systems, Man, and Cybernetics-Part B, vol. 30, no. 1, pp. 210-217, 2000.
[14] H. Zhang and X. He, Fuzzy Adaptive Control Theory and Its Applications, Beijing, China: Beijing University of Aeronautics and Aerospace Press, 2002. (in Chinese)
Chapter 10
Controller Design Based on the Fuzzy Hyperbolic Model
10.1 Introduction
Fuzzy systems are naturally nonlinear. As the theory of fuzzy systems and the theory of nonlinear systems are not completely developed, universal control laws cannot easily be obtained for fuzzy/nonlinear control systems. However, it may be possible to design special controllers for a class of fuzzy/nonlinear systems. In this chapter, we introduce several techniques for controller design of nonlinear systems based on the fuzzy hyperbolic model. For the fuzzy hyperbolic models, we first extend the well-known pole-placement method in linear control system theory to the hyperbolic case, and design a stable controller in the hyperbolic function form. Furthermore, based on optimal control theory, $H_\infty$ theory, and nonlinear control system theory, we develop a nonlinear $H_2$ optimal controller and an $H_\infty$ controller [11-13, 18, 19]. A sufficient condition for the global asymptotic stability of the overall system is also established.
In recent years, the problem of designing controllers with guaranteed cost for uncertain systems with time-delay has attracted the attention of a number of researchers [1, 6, 9, 10, 14, 16]. This chapter will also study fuzzy hyperbolic control with guaranteed cost for nonlinear continuous-time systems with parameter uncertainties. Some sufficient conditions are provided for the construction of a fuzzy hyperbolic guaranteed cost controller via state feedback. These conditions are given in terms of the feasibility of linear matrix inequalities (LMIs).
This chapter is organized as follows. In Section 10.2, a stable controller in the hyperbolic function form is designed. In Sections 10.3 and 10.4, a nonlinear $H_2$ optimal controller and an $H_\infty$ controller are developed, respectively. In Section 10.5, fuzzy hyperbolic control with guaranteed cost for nonlinear continuous-time systems with parameter uncertainties is studied.
10.2 Stable Controller Design by Pole-Placement Method
In Chapter 4, we derived a novel fuzzy model, the fuzzy hyperbolic model (FHM), as follows:
$$\dot{x} = A\tanh(K_x x) + Bu. \qquad (10.2.1)$$
In this section we design a controller $u$ that is used to control nonlinear systems in the form of the FHM such that some control objectives are satisfied. We are interested in designing a controller in the form
$$u = H\tanh(K_x x), \qquad (10.2.2)$$
where $H$ is a constant matrix. Such a controller has several important properties:
(a) Because of the characteristic of the tanh function, the controller is bounded for arbitrary input variables $x$, which fits the fact that variables of real physical systems are always bounded.
(b) With no control ($u = 0$), the nonlinear system in the form of (10.2.1) becomes $\dot{x} = A\tanh(K_x x)$.
(c) The control input $u$ can easily be represented by linguistic IF-THEN rules, that is, it is a fuzzy controller. According to Chapter 4, the proposed control law can be described by the following fuzzy rules:
$R^1$: IF $x_1$ is $F_{Px_1}$, $x_2$ is $F_{Px_2}$, ..., and $x_n$ is $F_{Px_n}$, THEN $u = c_{Px_1} + c_{Px_2} + \cdots + c_{Px_n}$;
$R^2$: IF $x_1$ is $F_{Px_1}$, $x_2$ is $F_{Px_2}$, ..., and $x_n$ is $F_{Nx_n}$, THEN $u = c_{Px_1} + c_{Px_2} + \cdots - c_{Nx_n}$;
$\vdots$
$R^{2^n}$: IF $x_1$ is $F_{Nx_1}$, $x_2$ is $F_{Nx_2}$, ..., and $x_n$ is $F_{Nx_n}$, THEN $u = -c_{Nx_1} - c_{Nx_2} - \cdots - c_{Nx_n}$,
where $F_{Px_i}$ and $F_{Nx_i}$ ($i = 1, \ldots, n$) are fuzzy sets corresponding to $x_i$, which include $P_z$ (Positive) and $N_z$ (Negative) in the form of (4.2.1), respectively; $c_{Px_i}$ and $c_{Nx_i}$ ($i = 1, \ldots, n$) are positive constants corresponding to $F_{Px_i}$ and $F_{Nx_i}$, respectively. There are $2^n$ fuzzy rules, that is, all the possible $P_z$ and $N_z$ combinations of input variables in the IF-part, and all the linear combinations of constants in the THEN-part.
(d) The initial values of the controller can be obtained by knowledge or determined by experience.
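Property (c) can be checked numerically. The sketch below enumerates all $2^n$ rules and compares the defuzzified output with $\sum_i c_i\tanh(k_i x_i)$. It assumes Gaussian membership functions $\mu_P(x_i) = e^{-(x_i - k_i)^2/2}$ and $\mu_N(x_i) = e^{-(x_i + k_i)^2/2}$, product inference, weighted-average defuzzification, and $c_{Px_i} = c_{Nx_i} = c_i$. These are assumptions about the form of (4.2.1), which is not reproduced in this section, and the function names are illustrative.

```python
import math
from itertools import product

def rule_base_output(x, k, c):
    """Weighted-average output of the 2^n Positive/Negative rules."""
    num = 0.0
    den = 0.0
    for signs in product((+1, -1), repeat=len(x)):  # +1: Positive, -1: Negative
        weight = 1.0
        consequent = 0.0
        for x_i, k_i, c_i, s in zip(x, k, c, signs):
            weight *= math.exp(-0.5 * (x_i - s * k_i) ** 2)  # assumed Gaussian set
            consequent += s * c_i                             # +c for P, -c for N
        num += weight * consequent
        den += weight
    return num / den

def tanh_controller(x, k, c):
    """Closed form: one row of u = H tanh(K_x x) with H = [c_1, ..., c_n]."""
    return sum(c_i * math.tanh(k_i * x_i) for x_i, k_i, c_i in zip(x, k, c))

x = (0.7, -1.3, 0.2)
k = (0.4, 0.2, 1.0)
c = (2.0, 0.5, 1.5)
gap = abs(rule_base_output(x, k, c) - tanh_controller(x, k, c))
```

Under these assumptions the two outputs agree to rounding error, and the closed form makes property (a) visible: the control magnitude can never exceed $\sum_i c_i$, however large the state becomes.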
With the controller (10.2.2), the closed-loop system can be described by
$$\dot{x} = (A + BH)\tanh(K_x x). \qquad (10.2.3)$$
In this section, we present a design procedure for determining $H$ to guarantee global asymptotic stability and robustness of the closed-loop system. This design procedure extends the well-known pole-placement method in linear control system theory to the hyperbolic case. The following definitions and theorems are required for the derivation of our main result in this section. We use the notation $P > 0$ ($P \ge 0$) to indicate that the matrix $P$ is positive definite (nonnegative definite).
Definition 10.2.1 (cf. [7]). A square matrix $A$ is called diagonally stable if there exist a matrix $Q > 0$ and a diagonal matrix $P > 0$ such that
$$PA + A^TP = -Q. \qquad (10.2.4)$$
□
In other words, a diagonally stable matrix $A$ satisfies a Lyapunov equation with a diagonal matrix $P$. Obviously, if $A$ satisfies (10.2.4) with a pair of positive definite matrices $P$ and $Q$, then the linear system $\dot{x} = Ax$ is globally asymptotically stable. Furthermore, if $A$ satisfies (10.2.4) with a diagonal positive definite matrix $P$, then the system $\dot{x} = Ax$ is robustly stable, namely, it retains its stability for a large set of perturbations.
Definition 10.2.2 (cf. [7]). Let $S_c$ be the set of all functions $f(\cdot): \mathbb{R} \to \mathbb{R}$ satisfying: (1) $f$ is continuous; (2) $f(0) = 0$, and for all other $x \in \mathbb{R}$, $f(x)x > 0$; (3) $\int_0^x f(y)\,dy \to \infty$ as $|x| \to \infty$, where $|\cdot|$ denotes the absolute value. □
Lemma 10.2.1 (cf. [7]). Consider the nonlinear system
$$\dot{x} = Af(x), \qquad (10.2.5)$$
where $f(x) = (f_1(x_1), \ldots, f_n(x_n))^T$ with $f_i(\cdot) \in S_c$ for $i = 1, \ldots, n$. If $A$ is diagonally stable, then $x = 0$ is a globally asymptotically stable equilibrium point of (10.2.5). □
Theorem 10.2.1 (cf. [19]). Consider the fuzzy hyperbolic model in (10.2.3). If there exists a matrix $H$ such that $A + BH$ is diagonally stable, then $x = 0$ is a globally asymptotically stable equilibrium point of the closed-loop system (10.2.3).
Proof. The result follows immediately from Lemma 10.2.1 with $f_i(x_i) = \tanh(k_i x_i)$. □
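The following sketch illustrates Lemma 10.2.1 and Theorem 10.2.1 on a hypothetical $2 \times 2$ closed-loop matrix. It first checks diagonal stability for a chosen diagonal $P$ by forming $Q = -(PA + A^TP)$, then integrates $\dot{x} = A_{cl}\tanh(K_x x)$ with a forward Euler step. The matrix `A_cl`, the gains in `k`, the initial state, and the step size are illustrative assumptions, not values from the text.

```python
import math

def lyapunov_q(P_diag, A):
    """Return Q = -(P A + A^T P) for a diagonal P and a 2x2 matrix A."""
    p1, p2 = P_diag
    PA = [[p1 * A[0][0], p1 * A[0][1]],
          [p2 * A[1][0], p2 * A[1][1]]]
    return [[-(PA[i][j] + PA[j][i]) for j in range(2)] for i in range(2)]

def is_positive_definite_2x2(Q):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return Q[0][0] > 0 and Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0] > 0

def simulate(A, k, x0, dt=1e-3, steps=30000):
    """Forward-Euler integration of x' = A tanh(K_x x)."""
    x = list(x0)
    for _ in range(steps):
        f = [math.tanh(k[0] * x[0]), math.tanh(k[1] * x[1])]
        dx = [A[0][0] * f[0] + A[0][1] * f[1],
              A[1][0] * f[0] + A[1][1] * f[1]]
        x = [x[0] + dt * dx[0], x[1] + dt * dx[1]]
    return x

A_cl = [[-1.0, 2.0], [-1.0, -3.0]]  # hypothetical A + BH
P = (1.0, 2.0)                       # candidate diagonal P > 0
Q = lyapunov_q(P, A_cl)
x_final = simulate(A_cl, k=(0.4, 0.2), x0=(5.0, -3.0))
```

For this choice $Q = \mathrm{diag}(2, 12)$ is positive definite, so `A_cl` is diagonally stable, and the simulated state decays toward the origin as the theorem predicts. Finding a suitable $H$ and $P$ in general is a matrix feasibility problem; the check above only verifies a candidate.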
Theorem 10.2.1 reduces the controller design problem to determining whether there exists a matrix $H$ such that $A + BH$ is diagonally stable and, if it exists, to finding this $H$. We can use MATLAB to solve for $H$ effectively [5].
Example 10.2.1. Consider the FHM for the inverted pendulum derived in Example 4.2.1 and design a stable controller with the pole-placement method. To satisfy the conditions of Theorem 10.2.1 we must find a diagonal matrix $P > 0$ and a matrix $Q > 0$ such that
$$P(A + BH) + (A + BH)^TP = -Q. \qquad (10.2.6)$$
Because the dimension of the system in this example is low ($n = 2$ and $p = 1$, so $H$ is a $1 \times 2$ matrix), the simplest way to do this is to substitute $H = [h_1, h_2]$, $P = \mathrm{diag}[p_1, p_2]$, and the model's parameters (4.2.9) into (10.2.6). We have:
$$Q = -\begin{bmatrix} 0 & 4p_1 + 8p_2h_1 \\ 4p_1 + 8p_2h_1 & 16p_2h_2 \end{bmatrix}.$$
Obviously, we can never make $Q$ positive definite, because its first diagonal entry is zero. Hence, Theorem 10.2.1 cannot be applied directly. However, in this specific example, we can overcome the problem by using a nonnegative definite matrix $Q$ instead of a positive definite one, and apply the invariant set approach to prove closed-loop stability. If we take $h_2$
Figure 10.2.2: The response curves of the angle $x_1(t)$ when the controller is applied to the real inverted pendulum system (solid lines, controller designed in this section; dotted lines, controller designed in [3]).
Figure 10.2.3: The response curves of the angle $x_1(t)$ when the parameters of the inverted pendulum are changed.
with the same initial conditions (solid lines). From the simulation results we can see that the controller can stabilize the FHM from any initial condition and that, because of the difference between the FHM and the real system, the controller may cause the system to oscillate a bit when it is applied to the real plant. Also in Figures 10.2.1 and 10.2.2 we plot the response curves of $x_1(t)$ with the controller designed in [3] (dotted lines). In [3], the multi-input continuous-time system was first described by a fuzzy dynamic model. Then a linear feedback control law was constructed to stabilize the fuzzy dynamic model, where a set of positive definite matrices $P$ was obtained by solving a set of Riccati equations. A piecewise smooth quadratic Lyapunov function was also used in [3] for the design and stability analysis of the feedback control law. To test the robustness of the FHM, we change the parameters of the real inverted pendulum to $m = 0.4$ kg, $M = 2$ kg, and $l = 0.45$ m. Figure 10.2.3 depicts the response curves of the closed-loop system with the new parameters. We can see that even when the parameters of the plant change, the control performance is still satisfactory, which indicates that the proposed method is effective and robust. The above simulation results show that, even with parameters obtained by experience, the dynamic response of the fuzzy system in this section is not worse than that of [3]. The controller parameters that need to be adjusted are far fewer than those in [3], and the design procedure is much easier too. □
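The step in Example 10.2.1 from "Q cannot be positive definite" to a nonnegative definite choice can be made concrete. The sketch below evaluates the matrix $Q$ of (10.2.6) for given $p_1, p_2, h_1, h_2$ and simulates the resulting closed loop. The entries of `A` and `B` are inferred from the expression for $Q$ in the example; the gains in `k` and the particular choice $h_1 = -p_1/(2p_2)$, $h_2 = -1$ are illustrative assumptions, since (4.2.9) and the values used by the authors are not reproduced in this section.

```python
import math

A = [[0.0, 4.0], [0.0, 0.0]]  # inferred from the entries of Q, not quoted from (4.2.9)
B = [0.0, 8.0]

def q_matrix(p1, p2, h1, h2):
    """Q = -(P(A+BH) + (A+BH)^T P) with P = diag(p1, p2), H = [h1, h2]."""
    Acl = [[A[0][0] + B[0] * h1, A[0][1] + B[0] * h2],
           [A[1][0] + B[1] * h1, A[1][1] + B[1] * h2]]
    PA = [[p1 * Acl[0][0], p1 * Acl[0][1]],
          [p2 * Acl[1][0], p2 * Acl[1][1]]]
    return [[-(PA[i][j] + PA[j][i]) for j in range(2)] for i in range(2)]

def simulate(h1, h2, k, x0, dt=1e-3, steps=20000):
    """Forward-Euler integration of x' = A tanh(Kx) + B u with u = H tanh(Kx)."""
    x1, x2 = x0
    for _ in range(steps):
        f1, f2 = math.tanh(k[0] * x1), math.tanh(k[1] * x2)
        u = h1 * f1 + h2 * f2
        dx1 = A[0][0] * f1 + A[0][1] * f2 + B[0] * u
        dx2 = A[1][0] * f1 + A[1][1] * f2 + B[1] * u
        x1, x2 = x1 + dt * dx1, x2 + dt * dx2
    return x1, x2

p1, p2 = 1.0, 1.0
h1, h2 = -p1 / (2.0 * p2), -1.0     # cancels the off-diagonal entries of Q
Q = q_matrix(p1, p2, h1, h2)
x_final = simulate(h1, h2, k=(1.0, 1.0), x0=(1.5, 0.0))
```

With this choice $Q = \mathrm{diag}(0, 16p_2)$, which is nonnegative definite but not positive definite, so the Lyapunov derivative vanishes wherever $x_2 = 0$. The simulated trajectory still converges to the origin, which is the behavior the invariant set argument is meant to establish.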
10.3 Nonlinear $H_2$ Optimal Controller Design
In this section, our goal is to extend the linear quadratic optimal control theory to the following nonlinear system:
$$\dot{x}(t) = Af(x(t)) + Bu(t), \qquad (10.3.1)$$
where $f(x) = (f_1(x_1), \ldots, f_n(x_n))^T$ ($f_i(\cdot) \in S_c$). We define a nonlinear cost function:
$$J(x_0, t_0, u) = \int_{t_0}^{\infty}\big(f^T(x(t))Qf(x(t)) + u^T(t)Ru(t)\big)\,dt, \qquad (10.3.2)$$
where $Q$ and $R$ are symmetric positive definite matrices. Our task is to find $u$ so that $J$ becomes the minimal cost $J_{\min}$.
Definition 10.3.1. The set of matrices $[A, B, Q, R]$ is called diagonally optimal if there exists a diagonal matrix $P > 0$ such that
$$PA + A^TP - PBR^{-1}B^TP + Q = 0. \qquad (10.3.3)$$
In other words, the Riccati equation (10.3.3) has a solution $P$ that is positive definite and diagonal. □
Now we can state the main theorem of this section, which pertains to a large class of nonlinear systems in which the control appears linearly.
Theorem 10.3.1. Consider the nonlinear system (10.3.1) and the nonlinear cost function (10.3.2). If $[A, B, Q, R]$ is diagonally optimal, then the optimal controller is given by
$$u^*(t) = -R^{-1}B^TPf(x(t)), \qquad (10.3.4)$$
where $P = \mathrm{diag}(p_1, \ldots, p_n)$ is a positive definite diagonal matrix. If we assume $J_{\min}(x_0, t_0) = \min_{u(t)}\{J(x_0, t_0, u)\}$, we have
$$J_{\min}(x_0, t_0) = 2\sum_{i=1}^{n} p_i\int_0^{x_i(t_0)} f_i(\tau)\,d\tau.$$
Proof. The first step is to show that $J_{\min}$ satisfies the Hamilton-Jacobi-Bellman (HJB) equation [8]:
$$-\frac{\partial J_{\min}}{\partial t} = \min_{u(t)}\left\{ f^T(x)Qf(x) + u^TRu + \left[\frac{\partial J_{\min}}{\partial x}\right]^T\dot{x} \right\}. \qquad (10.3.5)$$
The left-hand side of the equation is obviously zero. To estimate the right-hand side, we begin by calculating:
$$\left[\frac{\partial J_{\min}}{\partial x}\right]^T\dot{x} = 2f^T(x)P\big(Af(x) + Bu\big) = f^T(x)(PA + A^TP)f(x) + 2f^T(x)PBu.$$
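Let $H = f^T(x)Qf(x) + u^TRu + [\partial J_{\min}/\partial x]^T\dot{x}$. We have:
$$\begin{aligned}
H &= f^T(x)(Q + PA + A^TP)f(x) + 2f^T(x)PBu + u^TRu \\
&= f^T(x)(Q + PA + A^TP - PBR^{-1}B^TP)f(x) + \big(u + R^{-1}B^TPf(x)\big)^TR\big(u + R^{-1}B^TPf(x)\big).
\end{aligned}$$
Because the first term in the above equation is zero due to the Riccati equation, we have
$$H = \big(u + R^{-1}B^TPf(x)\big)^TR\big(u + R^{-1}B^TPf(x)\big).$$
Because $R$ is positive definite, $H$ attains a unique minimum value when $u = u^*(t) = -R^{-1}B^TPf(x(t))$. Thus,
$$\min_{u(t)}\left\{ f^T(x)Qf(x) + u^TRu + \left[\frac{\partial J_{\min}}{\partial x}\right]^T\dot{x} \right\} = f^T(x)Qf(x) + u^{*T}Ru^* + \left[\frac{\partial J_{\min}}{\partial x}\right]^T\dot{x}\,\Big|_{u = u^*} = 0,$$
i.e., (10.3.5) is satisfied. The next step is to calculate the value of the nonlinear cost function when $u^*(t)$ is applied:
$$J(x_0, t_0, u^*) = \int_{t_0}^{\infty}\big(f^T(x(t))Qf(x(t)) + u^{*T}(t)Ru^*(t)\big)\,dt = -\int_{t_0}^{\infty}\left[\frac{\partial J_{\min}}{\partial x}\right]^T\dot{x}\,dt = -\int_{t_0}^{\infty}\frac{dJ_{\min}}{dt}\,dt.$$
Because the closed-loop system is globally asymptotically stable, we have $x(\infty) = 0$. Therefore, $J(x_0, t_0, u^*) = J_{\min}(x_0)$. The theorem in [7] implies that $J_{\min}(x_0)$ is the optimal cost and $u^*(t)$ is the optimal controller. Setting $\partial H/\partial u = 0$, we have
$$2Ru^* + B^T\frac{\partial J_{\min}}{\partial x} = 0,$$
i.e.,
$$u^* = -\frac{1}{2}R^{-1}B^T\frac{\partial J_{\min}}{\partial x}. \qquad (10.3.6)$$
Comparing (10.3.6) and (10.3.4), we have
$$\frac{\partial J_{\min}}{\partial x} = 2Pf(x).$$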
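Thus,
$$\begin{aligned}
J_{\min}(x_0, t_0) &= -\int_{t_0}^{\infty}\left[\frac{\partial J_{\min}}{\partial x}\right]^T\dot{x}\,dt = -\int_{t_0}^{\infty} 2f^T(x)P\dot{x}\,dt \\
&= -\int_{x(t_0)}^{0} 2f^T(\tau)P\,d\tau = \int_{0}^{x(t_0)} 2f^T(\tau)P\,d\tau = 2\sum_{i=1}^{n} p_i\int_0^{x_i(t_0)} f_i(\tau)\,d\tau,
\end{aligned}$$
which completes the proof of the theorem. □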
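Theorem 10.3.1 can be sanity-checked in the scalar case, where every positive $P$ is trivially diagonal. The sketch below solves the scalar form of (10.3.3), applies $u^* = -r^{-1}bp\,f(x)$ with $f = \tanh$, integrates the cost (10.3.2) numerically along the closed-loop trajectory, and compares the result with $2p\int_0^{x_0}\tanh(\tau)\,d\tau = 2p\ln\cosh x_0$. The values $a = b = q = r = 1$ and $x_0 = 1.5$, the horizon, and the function names are illustrative assumptions.

```python
import math

def riccati_scalar(a, b, q, r):
    """Positive root p of 2*a*p - (b*b/r)*p*p + q = 0, the scalar case of (10.3.3)."""
    c = b * b / r
    return (a + math.sqrt(a * a + c * q)) / c

def closed_loop_cost(a, b, q, r, x0, dt=1e-3, steps=40000):
    """Integrate x' = a f(x) + b u* with RK4 and the running cost with the trapezoid rule."""
    p = riccati_scalar(a, b, q, r)

    def control(x):
        return -(b * p / r) * math.tanh(x)

    def rhs(x):
        return a * math.tanh(x) + b * control(x)

    def running_cost(x):
        return q * math.tanh(x) ** 2 + r * control(x) ** 2

    x, cost = x0, 0.0
    for _ in range(steps):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * dt * k1)
        k3 = rhs(x + 0.5 * dt * k2)
        k4 = rhs(x + dt * k3)
        x_next = x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        cost += 0.5 * dt * (running_cost(x) + running_cost(x_next))
        x = x_next
    return p, cost

p, cost = closed_loop_cost(a=1.0, b=1.0, q=1.0, r=1.0, x0=1.5)
predicted = 2.0 * p * math.log(math.cosh(1.5))
```

For these values the Riccati root is $p = 1 + \sqrt{2}$, and the numerically integrated cost should match the closed-form prediction up to discretization and truncation error of the finite horizon.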
Because $\tanh(K_x x) \in S_c$, for the FHM, if we define the following cost function:
$$J(x_0, t_0, u) = \int_{t_0}^{\infty}\big[\tanh^T(K_x x(t))\,Q\tanh(K_x x(t)) + u^T(t)Ru(t)\big]\,dt,$$
where $Q$ and $R$ are positive definite matrices, the optimal controller is given by
$$u^*(t) = -R^{-1}B^TP\tanh(K_x x),$$
where $P$ is a diagonal positive definite matrix satisfying (10.3.3). Next, we use two numerical examples to show the effectiveness of the control scheme.
Example 10.3.1. Consider the model given in Example 4.2.1, and design an optimal controller. Because $p = 1$, assume $R$ is a scalar $r > 0$. Substituting $P = \mathrm{diag}(p_1, p_2)$ into (10.3.3) we get:
$$Q = \begin{bmatrix} 0 & -4p_1 \\ -4p_1 & 64r^{-1}p_2^2 \end{bmatrix}.$$
Just as in Example 10.2.1, here we cannot find $Q > 0$ such that (10.3.3) holds either. However, we can overcome this difficulty by introducing the following coordinate transformation:
$$y_1 = x_1 + x_2, \qquad y_2 = x_2.$$
In the new coordinates, the model is given by
$$\dot{y} = A_y\tanh(K_y y) + B_y u, \qquad (10.3.7)$$
where
$$A_y = \begin{bmatrix} 0 & 4 \\ 0 & 0 \end{bmatrix}, \qquad B_y = \begin{bmatrix} 8 \\ 8 \end{bmatrix}, \qquad K_y = \begin{bmatrix} 0.4 & 0 \\ 0 & 0.2 \end{bmatrix},$$
and the subscript $y$ designates the new coordinates. Now, if we can design an optimal hyperbolic controller for (10.3.7), then in the closed-loop system, $y_1$ and $y_2$ will converge asymptotically to zero, and so will $x_1$ and $x_2$. Hence, substituting (10.3.7) into (10.3.3) and setting $r = 16p_2$, we have:
$$Q = \begin{bmatrix} 4p_1^2/p_2 & 0 \\ 0 & 4p_2 \end{bmatrix}.$$
Obviously $Q$ is positive definite. Now we can derive the optimal controller:
$$\begin{aligned}
u^* &= -r^{-1}B_y^TP\big(\tanh(0.4y_1)\ \ \tanh(0.2y_2)\big)^T \\
&= -\frac{1}{2p_2}\big(p_1\tanh(0.4y_1) + p_2\tanh(0.2y_2)\big) \\
&= -\frac{1}{2p_2}\big(p_1\tanh(0.4(x_1 + x_2)) + p_2\tanh(0.2x_2)\big).
\end{aligned}$$
Figure 10.3.1 shows the results of the optimal controller with $p_1 = 300$ and $p_2 = 100$ applied to the real inverted pendulum system with initial positions $\{x_1(0), x_2(0)\} = \{20^\circ, 0\}, \{45^\circ, 0\}, \{89^\circ, 0\}$. □
Figure 10.3.1: The response curves of angle $x_1(t)$ when the optimal controller is applied to the real inverted pendulum.
The controller designed in this section involves hyperbolic functions of the state variables. We can describe the controller with linguistic information. That is, the controller is also a kind of fuzzy controller.
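The algebra of Example 10.3.1 can be replayed numerically. The sketch below evaluates $Q = -(PA_y + A_y^TP) + PB_yr^{-1}B_y^TP$ from (10.3.3) with $r = 16p_2$ and the values $p_1 = 300$, $p_2 = 100$ quoted for Figure 10.3.1, and then evaluates the resulting control law at one state. The matrices are as read from (10.3.7); the helper names and the sample state are illustrative assumptions.

```python
import math

A_y = [[0.0, 4.0], [0.0, 0.0]]
B_y = [8.0, 8.0]
K_y = (0.4, 0.2)

def riccati_q(p1, p2, r):
    """Solve (10.3.3) for Q with P = diag(p1, p2) and scalar R = r."""
    P = (p1, p2)
    PA = [[P[i] * A_y[i][j] for j in range(2)] for i in range(2)]
    PB = [P[0] * B_y[0], P[1] * B_y[1]]
    return [[-(PA[i][j] + PA[j][i]) + PB[i] * PB[j] / r for j in range(2)]
            for i in range(2)]

def optimal_control(x1, x2, p1, p2, r):
    """u* = -r^{-1} B_y^T P tanh(K_y y) with y1 = x1 + x2, y2 = x2."""
    y1, y2 = x1 + x2, x2
    return -(B_y[0] * p1 * math.tanh(K_y[0] * y1)
             + B_y[1] * p2 * math.tanh(K_y[1] * y2)) / r

p1, p2 = 300.0, 100.0
r = 16.0 * p2
Q = riccati_q(p1, p2, r)
u = optimal_control(0.5, 0.0, p1, p2, r)   # hypothetical sample state
```

The choice $r = 16p_2$ is exactly what cancels the off-diagonal entries of $Q$, leaving a diagonal matrix with positive entries. With the quoted values the control reduces to $u^* = -1.5\tanh(0.4(x_1 + x_2)) - 0.5\tanh(0.2x_2)$.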
10.4 $H_\infty$ Controller Design
In this section we mainly focus on $H_\infty$ controller design for the following nonlinear system:
$$\dot{x} = Af(x(t)) + Bu(t) + Dw(t), \qquad x(0) = x_0,$$
where $x(t) = [x_1(t), x_2(t), \ldots, x_n(t)]^T \in \mathbb{R}^n$ denotes the state vector; $A \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{n \times p}$ are the system matrix and input matrix, respectively; $f(\cdot)$ denotes the hyperbolic function of the state variables, that is,
$$f(x(t)) = \tanh(K_x x) := [\tanh(k_1x_1), \ldots, \tanh(k_nx_n)]^T,$$
$K_x = \mathrm{diag}[k_1, \ldots, k_n]$; $u = (u_1, u_2, \ldots, u_p)^T$ denotes the input vector; $w = (w_1, w_2, \ldots, w_m)^T$ is an unknown bounded disturbance of the system; and $D \in \mathbb{R}^{n \times m}$ is the disturbance matrix. Define a nonlinear cost function:
$$J(x_0, t_0, u) = \int_0^{\infty}\big[f^T(x(t))Qf(x(t)) + u^T(t)Ru(t) - w^T(t)Sw(t)\big]\,dt,$$
where $Q$, $R$, $S$ are symmetric positive definite constant matrices. Our objective is to find the controller
where {x(r)}^^o denotes the change of state x from x{^) to x(t), and a bounded function Z(xo), such that sup J(iz* ,W) cx). Then for any bounded w J{u\w)
0. The following results hold for any $\varepsilon > 0$:
$$MFE + E^TF^TM^T \le \varepsilon MM^T + \varepsilon^{-1}E^TE.$$
□
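The inequality follows from completing a square. A short derivation is sketched below, assuming the uncertainty matrix satisfies $F^TF \le I$; this is the usual hypothesis for a bound of this type and is an assumption here, because the hypothesis is not visible in the text above.

```latex
\begin{aligned}
0 &\le \left(\varepsilon^{1/2}M^T - \varepsilon^{-1/2}FE\right)^T
       \left(\varepsilon^{1/2}M^T - \varepsilon^{-1/2}FE\right) \\
  &= \varepsilon MM^T - MFE - E^TF^TM^T + \varepsilon^{-1}E^TF^TFE, \\
\text{so}\quad MFE + E^TF^TM^T
  &\le \varepsilon MM^T + \varepsilon^{-1}E^TF^TFE
   \le \varepsilon MM^T + \varepsilon^{-1}E^TE .
\end{aligned}
```

The free parameter $\varepsilon$ trades the weight on $MM^T$ against the weight on $E^TE$, which is why it appears as a decision variable in the matrix inequality of the next theorem.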
The following theorem can be established.
Theorem 10.5.1. For the nonlinear system (10.5.1) with the associated cost function (10.5.4), if there exist a positive scalar $\varepsilon > 0$, a positive definite diagonal matrix $X > 0$, and a positive definite matrix $S > 0$ such that the matrix inequality
$$\begin{bmatrix}
\Theta + \varepsilon MM^T & * & * & * & * & * \\
SA_1^T & -S & * & * & * & * \\
N_1X + N_3F & N_2S & -\varepsilon I & * & * & * \\
X & 0 & 0 & -(1-\beta)S & * & * \\
X & 0 & 0 & 0 & -Q^{-1} & * \\
F & 0 & 0 & 0 & 0 & -R^{-1}
\end{bmatrix} < 0 \qquad (10.5.7)$$
holds, then the control law $u(t) = K\tanh(k_x x(t))$ is a fuzzy hyperbolic guaranteed cost controller and
$$J_0 = 2\sum_{i=1}^{n}\frac{p_i}{k_i}\ln(\cosh k_ix_i(0)) + \frac{1}{1-\beta}\int_{-h(0)}^{0}\tanh^T(k_x x(s))H\tanh(k_x x(s))\,ds,$$
where $\Theta = AX + XA^T + BF + F^TB^T$, $F = KX$, and $*$ denotes the entries induced by symmetry.
Proof. Choose the following Lyapunov function for the system (10.5.6):
$$V(t) = 2\sum_{i=1}^{n}\frac{p_i}{k_i}\ln(\cosh k_ix_i) + \frac{1}{1-\beta}\int_{t-h(t)}^{t}\tanh^T(k_x x(s))H\tanh(k_x x(s))\,ds, \qquad (10.5.8)$$
where $x_i$ is the $i$th element of $x$, $k_i$ is the $i$th diagonal element of $k_x$, and $H$ is a positive definite matrix. Here, $k_i > 0$ and $p_i > 0$. Because $\cosh(k_ix_i) = (e^{k_ix_i} + e^{-k_ix_i})/2 \ge (e^{k_ix_i})^{1/2}(e^{-k_ix_i})^{1/2} = 1$, $k_i > 0$, and $p_i > 0$, we know that $V(t) > 0$ for all $x$ and $V(t) \to \infty$ as $\|x\| \to \infty$, where $\|\cdot\|$ denotes a vector norm. Along the trajectories of system (10.5.6), the time derivative of $V(t)$ is given by
$$\begin{aligned}
\dot{V} &= 2\sum_{i=1}^{n} p_i\tanh(k_ix_i)\dot{x}_i + \alpha\tanh^T(k_x x)H\tanh(k_x x) - \alpha\big(1 - \dot{h}(t)\big)\tanh^T(k_x x_h)H\tanh(k_x x_h) \\
&= 2\tanh^T(k_x x)P\dot{x} + \alpha\tanh^T(k_x x)H\tanh(k_x x) - \alpha\big(1 - \dot{h}(t)\big)\tanh^T(k_x x_h)H\tanh(k_x x_h)
\end{aligned}$$