Lecture Notes in Control and Information Sciences Editor: M. Thoma
252
Springer London Berlin Heidelberg New York Barcelona Hong Kong Milan Paris Santa Clara Singapore Tokyo
Murti V. Salapaka and Mohammed Dahleh
Multiple Objective Control Synthesis
With 17 Figures
Springer
Series Advisory Board
A. Bensoussan · M.J. Grimble · P. Kokotovic · A.B. Kurzhanski · H. Kwakernaak · J.L. Massey · M. Morari

Authors
Murti V. Salapaka, PhD
Department of Electrical Engineering, Iowa State University, Ames, Iowa 50011, USA
Mohammed Dahleh, PhD
Department of Mechanical Engineering, University of California, Santa Barbara, CA 93106, USA
ISBN 1-85233-256-5 Springer-Verlag London Berlin Heidelberg British Library Cataloguing in Publication Data Salapaka, Murti V. Multiple objective control synthesis. - (Lecture notes in control and information sciences ; 252) 1.Automatic control - Mathematical models I.Title II.Dahleh, Mohammed 629.8'312 ISBN 1852332565 Library of Congress Cataloging-in-PublicationData A catalog record for this book is available from the Library of Congress Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers. © Springer-Verlag London Limited 2000 Printed in Great Britain The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use. The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made. Typesetting: Camera ready by authors Printed and bound at the Athenmum Press Ltd., Gateshead, Tyne & Wear 69/3830-543210 Printed on acid-free paper SPIN 10746705
To my parents: Shri. S. Prasada Rao and Smt. S. Subhadra (Murti V. Salapaka)
To my wife: Marie Dahleh (Mohammed Dahleh)
Preface
Many control design tasks which arise from engineering objectives can be effectively addressed by an equivalent convex optimization problem. The significance of this step stems from efficient computational tools that are available for solving such problems. However, in most cases the resulting convex optimization problem is infinite dimensional. Thus effective finite dimensional convex approximations are needed to complete the control design task. Researchers have employed advanced mathematical tools to exploit and expose the structure of the resulting convex optimization problem with the objective of obtaining computable ways of obtaining the controller. One of the striking insights obtained by such tools was that certain optimal control problems are equivalent to finite dimensional programming problems. Thus seemingly infinite dimensional problems can be converted to finite dimensional problems. Even when the convex optimization problem is truly infinite dimensional, or when a finite dimensional characterization is not established, the philosophy that has emerged is to establish finite dimensional approximations to the infinite dimensional problem which guarantee the optimal performance within any prespecified tolerance. Here too researchers have borrowed advanced mathematical tools to exploit the underlying structure of the problem. One of the difficulties faced by a researcher in this area is the lack of a source where a comprehensive treatment of the tools employed is given. In this book we attempt to fill this gap by developing various topological and functional analytic tools that are commonly used in the formulation and solution of a class of optimal control problems.

Efficient design techniques for multi-objective controllers are necessary because often a single measure fails to capture the design performance objective. The standard H2, H∞ and ℓ1 designs are incapable of handling such multi-objective concerns because they optimize a single measure, which is no guarantee of performance with respect to some other measure. An important subclass of multi-objective problems is the class of problems for synthesizing optimal controllers which guarantee performance with respect to both the H2 measure and relevant time domain measures. The H2/ℓ1 problem is an example which falls in this class of multi-objective problems, where the objective is the design of controllers which optimally reject white noise while guaranteeing stability margins with respect to uncertainty. We apply the developed mathematical tools to such problems, where the H2 measure and time domain measures on the performance of the closed loop system can be incorporated in a natural manner.
Organization of the Book
This book can be divided into two parts. The first part constitutes Chapters 1, 2 and 3, where the mathematical machinery is developed. In the second part (Chapters 4, 5, 6, 7, 8 and 9) various control design problems are formulated and solved.

In Chapter 1 we introduce the basic topological concepts. The importance of continuity and compactness with regard to optimization is established. We take a top-down approach where the spaces described have more and more structure as one proceeds through the chapter. In Chapter 2 functions on vector spaces and weak topologies are studied. Motivation for why weak topologies are important is elucidated. The Banach-Alaoglu result on compactness of bounded sets in weak topologies is proven. Important results on sublinear functions are given which prove to be instrumental in studying convex sets and functions. The chapter on convex analysis is the culmination of the mathematical treatment given, where we establish the Kuhn-Tucker-Lagrange duality result.

Chapter 4 develops a paradigm where control design objectives can be stated precisely and in an effective manner. The Youla parameterization of all closed-loop maps achievable via stabilizing controllers is developed. Chapters 5 and 6 study single-input single-output systems: in Chapter 5 the ℓ1 norm of the closed loop is minimized while keeping its H2 norm below a prespecified value. In Chapter 6 a weighted combination of the H2 norm of the closed loop and various other relevant time domain measures is minimized over all stabilizing controllers. Exact solutions to the problems formulated are given and continuity of the solutions with respect to change in parameters is established. Even though these problems address single-input single-output systems, they serve to highlight the nature of mixed objective problems involving the two norm of the closed loop and time domain measures.

In Chapter 7 the square case of the H2/ℓ1 problem is studied. It is shown that the problem is equivalent to a single finite dimensional quadratic programming problem. In Chapter 8 the interplay of the H2 and the ℓ1 norms of the closed loop in the general multiple-input multiple-output setting is studied. It is shown that controllers can be designed to achieve performance within any given tolerance of the optimal performance via finite dimensional quadratic programming. The design methodology avoids many problems associated with zero-interpolation based methods. Chapter 9 tackles a non-convex problem where the H2 norm of the closed loop is minimized while guaranteeing a specified level of ℓ1 performance for a collection of plants in a certain class. It is shown that this robust performance problem can be solved via a simplex-like procedure.
Acknowledgements

We would like to thank the students at the University of California at Santa Barbara who made valuable suggestions at various stages of the book. In particular we would like to thank Srinivasa Salapaka, who proofread the first four chapters of the book. We would like to acknowledge Petar Kokotovic for the encouragement he provided in publishing this book. The methodology on multiple objective problems in the book was largely shaped in collaboration with Petros Voulgaris. The results presented in Chapter 8 were obtained in collaboration with Mustafa Khammash. The results presented in Chapter 9 were obtained in collaboration with Antonio Vicino and Alberto Tesi. We would like to acknowledge the support of NSF and AFOSR during the period in which this manuscript was written.
Notation
{}              The empty set.
(X, τ)          The set X endowed with the topology τ.
(X, d)          The set X endowed with the metric d.
(X, ‖·‖)        The set X endowed with the norm ‖·‖.
R               The real number system.
R^n             The n dimensional Euclidean space.
|x|_p           The p-norm of the vector x ∈ R^n defined as |x|_p := (∑_{i=1}^{n} |x_i|^p)^{1/p}.
|x|_1           The 1-norm of the vector x ∈ R^n.
|x|_2           The 2-norm of the vector x ∈ R^n.
x̂(λ)            The λ-transform of a right sided real sequence x = (x(k))_{k=0}^∞ defined as x̂(λ) := ∑_{k=0}^{∞} x(k) λ^k.
ℓ               The vector space of sequences.
ℓ^{m×n}         The vector space of matrix sequences of size m × n.
ℓ_1             The Banach space of right sided absolutely summable real sequences with the norm given by ‖x‖_1 := ∑_{k=0}^{∞} |x(k)|.
ℓ_1^{m×n}       The Banach space of matrix valued right sided real sequences with the norm ‖X‖_1 := max_{1≤i≤m} ∑_{j=1}^{n} ‖X_{ij}‖_1.
ℓ_∞             The Banach space of bounded right sided real sequences.
ℓ_∞^{m×n}       The corresponding Banach space of matrix valued right sided real sequences.
c_0             The Banach space of right sided real sequences that converge to zero.
c_0^{m×n}       The corresponding Banach space of matrix valued right sided real sequences.
ℓ_2             The Banach space of square summable right sided real sequences.
⟨x, x*⟩         The value of the bounded linear functional x* at x ∈ X.
                The weak star topology on X* induced by X.
T*              The adjoint operator of T : X → Y, which maps Y* to X*.
int(A)          The interior of the set A.
D               The closed unit disc in the complex plane.
A^T             The transpose of the matrix A.
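As an illustration of the norms and the λ-transform listed above, the following sketch (added here for illustration; the sample vector and sequence are arbitrary choices, not from the text) evaluates them for finite-length data.

```python
import numpy as np

def p_norm(x, p):
    """|x|_p = (sum_i |x_i|^p)^(1/p) for a vector x in R^n."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

def l1_norm(seq):
    """||x||_1 = sum_k |x(k)| for a right-sided absolutely summable sequence."""
    return sum(abs(v) for v in seq)

def lambda_transform(seq, lam):
    """x_hat(lambda) = sum_k x(k) * lambda^k, evaluated at the point lam."""
    return sum(v * lam ** k for k, v in enumerate(seq))

x = np.array([3.0, -4.0, 1.0])                 # an arbitrary vector in R^3
print(p_norm(x, 1), p_norm(x, 2), np.max(np.abs(x)))   # 1-, 2-, and infinity-norms

h = [1.0, 0.5, 0.25, 0.125]                    # a finite-length right-sided sequence
print(l1_norm(h), lambda_transform(h, 0.5))
```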
Contents
1. Topology
   1.1 Sets
   1.2 General Topology
   1.3 Metric Topology
   1.4 Normed Linear (Vector) Spaces
   1.5 Finite Dimensional Spaces
   1.6 Extrema of Real Valued Functions

2. Functions on Vector Spaces
   2.1 Sublinear Functions
   2.2 Dual Spaces
   2.3 Weak Topologies
   2.4 ℓp Spaces

3. Convex Analysis
   3.1 Convex Sets and Convex Maps
   3.2 Separation of Disjoint Convex Sets
   3.3 Convex Optimization
       3.3.1 Minimum Distance to a Convex Set
       3.3.2 Kuhn-Tucker Theorem

4. Paradigm for Control Design
   4.1 Notation and Preliminaries
   4.2 Interconnection of Systems
       4.2.1 Interconnection of FDLTIC Systems

5. SISO ℓ1/H2 Problem
   5.1 Problem Formulation
   5.2 Optimal Solutions and their Properties
       5.2.1 Existence of a Solution
       5.2.2 Structure of Optimal Solutions
       5.2.3 An A priori Bound on the Length of Any Optimal Solution
   5.3 Uniqueness and Continuity of the Solution
       5.3.1 Uniqueness of the Optimal Solution
       5.3.2 Continuity of the Optimal Solution
   5.4 An Example
   5.5 Summary

6. A Composite Performance Measure
   6.1 Problem Formulation
       6.1.1 Relation to Pareto Optimality
   6.2 Properties of the Optimal Solution
       6.2.1 Existence of a Solution
       6.2.2 Structure of Optimal Solutions
       6.2.3 An A priori Bound on the Length of Any Optimal Solution
   6.3 An Example
   6.4 Continuity of the Optimal Solution
   6.5 Summary

7. MIMO Design: The Square Case
   7.1 Preliminaries
   7.2 The Combination Problem
       7.2.1 Square Case
   7.3 The Mixed Problem
       7.3.1 The Approximate Problem
       7.3.2 Relation between the Approximate and the Mixed Problem
   7.4 An Illustrative Example
       7.4.1 Standard ℓ1 Solution
       7.4.2 Solution of the Mixed Problem
   7.5 Summary
   7.6 Appendix
       7.6.1 Interpolation Conditions
       7.6.2 Existence of a Solution for the Combination Problem
       7.6.3 Results on the Mixed Problem

8. Multiple-input Multiple-output Systems
   8.1 Problem Statement
   8.2 Converging Lower and Upper Bounds
       8.2.1 Converging Lower Bounds
       8.2.2 Converging Upper Bounds
   8.3 Summary

9. Robust Performance
   9.1 Robust Stability and Robust Performance
       9.1.1 Robust Stability
       9.1.2 Robust Performance
   9.2 Problem Formulation
       9.2.1 Delay Augmentation Approach
       9.2.2 Finitely Many Variables Approach
   9.3 Quadratic Programming
   9.4 Problem Solution
   9.5 Summary

References

Index
1. Topology
In this chapter we lay down the foundations of the mathematical structure required for optimization methods for vector spaces. We start with a terse introduction to sets. No attempt is made to provide an axiomatic description of set theory. We appeal to the intuitive notion of sets as being a collection of objects. The reader is introduced to the axiom of choice and Zorn's lemma. The meat of this chapter is the section on general topology, where we study topological sets with the bare minimum of structure on the sets. The concepts of convergence, continuity and compactness are presented in this general setting. The next two sections discuss metric and normed vector spaces. Normed vector spaces will be studied in greater detail in the next chapter. The section on finite dimensional spaces summarizes some important properties enjoyed by finite dimensional spaces that are lacking in infinite dimensional spaces. In the last section of this chapter we study extrema of real valued functions. It is shown that compact sets and various forms of continuity play a pivotal role in the existence of extrema. Thus they form the focus of optimization. It is shown that the norm topology does not have an abundance of compact sets.

Elementary knowledge of real analysis is assumed. Definitions for well known operations like the union and intersection of sets are not provided. Familiarity with countable sets, uncountable sets and the real number system is assumed. Except for these topics this chapter is self contained. However, the material presented in this chapter will be more transparent to a reader who has been exposed to analysis concepts taught in an undergraduate course (the first seven chapters of [1] is sufficient background).
1.1 Sets
Here, we do not attempt to provide the axiomatic development of sets; rather we appeal to the intuitive notion of a set as being a collection of objects. If A and B are two sets then A × B is another set defined as A × B := {(a, b) : a ∈ A and b ∈ B}. A binary relation on a set X is a subset T of X × X.
The relation is usually denoted by a symbol ≺ and we say x ≺ y if and only if (x, y) belongs to T. An order is a binary relation which is transitive (x ≺ y and y ≺ z implies that x ≺ z), reflexive (x ≺ x) and antisymmetric (x ≺ y and y ≺ x implies that x is the same element as y). A relation without the antisymmetric property is called a preorder. An element x is called a majorant of the subset Y if for all y in Y, y ≺ x. An element x is called a minorant of the subset Y if for all y in Y, x ≺ y. The set is said to be totally ordered if either x ≺ y or y ≺ x for any two elements x and y of X. Furthermore, it is well ordered if every subset Y of X has a minorant in the set Y. A directed set X is a set with a preorder such that every pair (x, y) of X has a majorant. We say that (X, ≺) is inductively ordered if each totally ordered subset of X (in the order induced from X) has a majorant in X. We denote the collection of all subsets of a set X by χ(X) and the empty set by {}. For sets A and B, we define A \ B as the collection of all elements of A which are not in B. The axiom of choice states that it is possible to choose an element from any nonempty subset of a set. We now state this axiom precisely.

Axiom 1.1 (Axiom of choice) Given any set X there exists a function c such that
c : (χ(X) \ {}) → X.
c is called the choice function.
In the intuitive description of a set, thought of as a collection of elements, it is difficult to explain the role of the axiom of choice. In the axiomatic description of set theory it can be shown that the axiom of choice is independent of the axioms used in developing set theory. Thus, different mathematics results based on whether the axiom of choice is accepted to be true or not. The axiom of choice results in important theorems (such as the Hahn-Banach theorem). There seems to be no other alternative in establishing such key results and thus the validity of the axiom of choice is generally accepted.

Lemma 1.1.1 (Zorn's lemma). Every inductively ordered set X has an element x such that if y is in X and x ≺ y then y = x. x is called the maximal element.

It can be shown that Zorn's lemma is equivalent to the axiom of choice. This lemma will be used in obtaining important results.
1.2 General Topology

Definition 1.2.1 (Topology). A topology on a set X is a collection of subsets τ of X with the following properties.
1. Any union of sets in τ belongs to τ.
2. Any finite intersection of sets in τ belongs to τ.
3. {} and X belong to τ.

We say that (X, τ) is a topological set and that τ consists of the open subsets of X. A subset F of a topological set (X, τ) is said to be closed if X \ F is open. A subset Y in a topological set (X, τ) is a neighbourhood of the point x in X if there is an open subset A of Y such that x ∈ A. For each subset Y of X we define the closure of Y as the intersection of all the closed sets that contain Y. The closure of the set Y is denoted by Y⁻.

Theorem 1.2.1. For a topological set (X, τ), the following assertions are true.
1. The union of any collection of open sets is open and the union of a finite collection of closed sets is closed.
2. The intersection of any collection of closed sets is closed and the intersection of a finite collection of open sets is open.
3. If Y ⊂ X then z ∈ Y⁻ if and only if for every neighbourhood A of z, Y ∩ A ≠ {}.

Proof. (3.) A⁻ = ∩{F : F ⊃ A and F is a closed set} = ∩{X \ G : X \ G ⊃ A and G ∈ τ}. Therefore z0 belongs to A⁻ if and only if z0 ∈ X \ G for every open set G which is such that X \ G ⊃ A. Therefore z0 is in A⁻ if and only if z0 ∈ X \ G for every open set G which is such that G ∩ A = {}. This is equivalent to the statement that z0 is in A⁻ if and only if for all open G which contain z0, G ∩ A ≠ {}. We leave the rest of the proof to the reader. □

Definition 1.2.2 (Relative topology). If (X, τ) is a topological set and Y is a subset of X then the relative topology on Y is a collection of sets of the form Y ∩ A where A belongs to τ.

It follows from the definition of relative topology that a set is closed in the relative topology on Y if and only if it has the form Y ∩ F where F is closed in (X, τ).

Definition 1.2.3 (Interior of a set). Let Y be a subset of a topological set (X, τ). A point z is called an interior point of Y if there is an open set A ∈ τ such that z ∈ A with A ⊂ Y. The collection of all interior points of the subset Y is called the interior of the set Y and is denoted by int(Y).

Given a topological set (X, τ) and a point x ∈ X we define the neighbourhood filter N(x) associated with x by

{N : N ⊂ X such that there exists U ∈ τ, x ∈ U and U ⊂ N},    (1.1)
which is a collection of neighbourhoods of x. We say that for any two sets A and B in N(x), A ≺ B if A ⊃ B. It can be shown that the relation ≺ defined above is reflexive and transitive. Also, given A and B in N(x), A ∩ B is in N(x) and is contained in A and B. Therefore, any two elements in N(x) have a majorant. This implies that N(x) with the relation ≺ is a directed set. If σ and τ are two topologies on X then we say that σ is a stronger topology than τ, or τ is a weaker topology than σ, if τ ⊂ σ.

Lemma 1.2.1. Let {τj : j ∈ Λ} be a collection of topologies on a set X indexed by the set Λ. Then there exists a weakest topology which contains all τj for j in Λ. There also exists a strongest topology which is contained in all τj for j in Λ.

Proof. Let τint be the collection of sets which are in all τj for j in Λ. Then it is clear that τint defines a topology on X. Also, τint is contained in all τj for j in Λ. It is also easy to see that τint is the strongest such topology. Let T be the collection of all topologies which are stronger than τj for all j in Λ. This collection is not empty because the topology in which all subsets of X are declared open (called the discrete topology) is in T. We know that there exists a strongest topology τout which is contained in all topologies in T. This is the weakest topology which contains all τj for j in Λ. □

Definition 1.2.4 (Subbase, Base, Neighbourhood base). Let σ be a collection of subsets of X. Then σ is a subbase for the topology τ if τ is the weakest topology, amongst topologies on X, that contains σ. σ is a basis for τ if all sets in τ are unions of elements in σ. σ is a neighbourhood basis for an element z in (X, τ) if for every set A in the neighbourhood filter of z (given by N(z)) there exists a set in both N(z) and σ which is a subset of A.
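On a finite set the defining properties of Definition 1.2.1 can be verified exhaustively. The following sketch (added here for illustration; the particular set and the two candidate collections are assumptions, not taken from the text) checks the three properties directly.

```python
from itertools import chain, combinations

def is_topology(X, tau):
    """Check the three properties of Definition 1.2.1 for a finite set X."""
    tau = set(map(frozenset, tau))
    X = frozenset(X)
    if frozenset() not in tau or X not in tau:
        return False
    # closure under (finite) unions and finite intersections
    for r in range(1, len(tau) + 1):
        for sets in combinations(tau, r):
            if frozenset(chain.from_iterable(sets)) not in tau:
                return False
            inter = X
            for s in sets:
                inter &= s
            if inter not in tau:
                return False
    return True

X = {1, 2, 3}
print(is_topology(X, [set(), {1}, {1, 2}, X]))       # True: a nested chain of open sets
print(is_topology(X, [set(), {1}, {2}, X]))          # False: the union {1, 2} is missing
```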
Lemma 1.2.2. The following assertions are true.
1. Let σ be a collection of subsets of a set X and let τ(σ) be the topology for which σ is a subbasis. Then every set in τ(σ) is a union of sets which are finite intersections of sets in σ, or the null set, or the set X.
2. A collection of open sets σ in a topological set (X, τ) is a basis for τ if σ forms a neighbourhood basis for all elements in X.
3. A collection of open sets σ in a topological set (X, τ) is a subbasis for τ if the collection of sets formed by finite intersections of sets in σ forms a neighbourhood basis for all elements in X.

Proof. (1.) Every topology that contains σ contains all the sets which are unions of sets formed by finite intersections of sets in σ. This follows from the definition of a topology on a set. Also, all sets which are unions of sets formed
by finite intersection of sets from ~r together with the null set and the set X define a topology on X. (2.) Let A E r. Then for every x in A, A is in Af(z) where A/'(x) is the neighbourhood filter of x. As the collection of sets cr forms a neighbourhood basis for all points in X, it follows that there exists a set B~ such that Bz E ~r M A/'(z) and B~ C A. It is evident that A = UxEA B:c, Therefore, A can be written as a union of sets from ~r. As A is an arbitrary set in r it follows that that cr forms a basis for the topology r. (3.) Let A E r be such that A :/= X and A :/= {}. Then for every z in A , A is in Af(z) where Af(z) is the neighbourhood filter of x. It follows that there exists a set Bx which is a finite intersection of sets from ~r such that B~ E Af(z) and B~ C A. It is evident that A = UxEA B~. Therefore, A can be written as a union of sets from the collection of sets formed by finite intersections of sets from ~r. As A is an arbitrary set in r it follows that cr forms a subbasis for the topology r. [] D e f i n i t i o n 1.2.5 ( N e t s ) . A net in a set X is a pair (A, k) where A is a directed set and k is a map from A into X. A net is also denoted as {x~}x~A where x~ = k(A). We also say that x;~ is a net in X on A. The net x~ is eventually in the set A if there exists A0 E A such that if A0 -< A then xx E A. The net xx is frequently in the set A if for every A0 E A there exists a A E A such t h a t A0 -~ A and xx E A. D e f i n i t i o n 1.2.6 ( C o n v e r g e n c e ) . Let x;~ be a net in a topological set (X, r). We say that x:~ -+ xo if x~ is eventually in N for all N E Af(xo). We also say that xx converges to xo in the topology r. Another notation used is limx,x to represent xo and xo is said to be the limit of the net xx.x~ is said to be a convergent net if there exists a xo E X such that xx -+ xo. It is possible that a net converges to more than a single element. This is not the case for the Hausdorff topology which is defined below. D e f i n i t i o n 1.2.7 ( H a u s d o r f f t o p o l o g y ) . For a set X, r is a Hausdorff topology if for all elements x and y in X with x ~s y there exist sets A E r and B E r such that x E A, y E B and A M B is empty. D e f i n i t i o n 1.2.8 ( D e n s e n e s s , S e p a r a b i l i t y , A x i o m o f c o u n t a b i l i t y ) . A subset Y of a topological set ( X , r ) is dense in (X, r) if Y - -- X. The topological set (X, 7") is called separable if it has a countable dense subset. The topological set (X, r) is said to satisfy the first axiom of countability if for every x in X there exists a countable number of open sets A,~(x) such that any neighbourhood of x contains atleast one of them. A net defined on the set of integers in X is called a sequence. In most results the concept of a net is superfluous and can be replaced by the notion of a sequence if the topological set satisfies the first axiom of countability. However,
the concept of nets allows for more general results and the proofs of many results become easier to establish.

Lemma 1.2.3. Let (X, τ) be a topological set and let A be a subset of X. Then z0 is in the closure of A if and only if there is a net xλ in A such that xλ → z0.
Proof. From T h e o r e m 1.2.1 we know that if A C X, then z0 E A - if and only if for every neighbourhood G of ~:0, A f l G :/= {}. (r Suppose there is a net x;~ in A such that xx --+ z0. T h e n for any open set G which contains z0 (and therefore is in A/'(z0) ), there exists a A0 such that A0 -< A implies t h a t x~ E G. However, every such xx is in A because xx is a net in A. Therefore G M A =~ {} which implies t h a t z0 E A - . (=:~) Suppose x0 E A - and N E A/'(z0). Then there exists G E r such t h a t G C N and xo E G. Therefore for every N E Af(zo), N N A r {}. Let ZN belong to N n A. xN is a net in A defined on the directed set A/'(xo) (we have shown before that Af(x0) is a directed set with A -< B if and only if A D B). Given any N E A/'(zo), let Ao = N. If A0 -< A then zx E N because N = A0 D A and xx E A. Thus ZN is a net in A such t h a t ZN --+ z0. This proves the lemma. [] Example 1.2.1. It is not true that if z0 is in the closure of a set A then there exists a sequence in A which converges to z0. Consider any uncountable set X and define the topology on X by r := {{}, X, all sets which have countable complements}. T h e n the closed sets are given by {{},X, all countable sets}. Let A := X \ { z 0 } where z0 is any element in X. Then A is open and X is the only closed set that contains A. Therefore A - = X. Let zn be any sequence in A. T h e set X \ { x n : n > 1} is an open set which contains x0 and therefore is a neighbourhood of x0. However z,~ is never in this neighbourhood and therefore zn 74 z0. This example illustrates that to fully describe topological concepts in terms of convergence, the use of nets is indispensable. D e f i n i t i o n 1.2.9 ( C o n t i n u i t y ) . Let (X, r) and (Y, c~) be topological sets. A function f : (X, 7-) -+ (V, or) is continuous if for every set C Err, f - 1 (G) E 7".
f⁻¹ : (Y, σ) → (X, τ) is defined as

f⁻¹(B) := {x ∈ X : f(x) ∈ B}.
f is said to be continuous at a point x0 if for every neighbourhood G of f(x0), f⁻¹(G) is a neighbourhood of x0.

Lemma 1.2.4. The following statements are equivalent.
1. f : (X, τ) → (Y, σ) is continuous at every point x ∈ X.
2. f : (X, τ) → (Y, σ) is a continuous function.
3. If xλ is a net such that xλ → x0 in (X, τ) then f(xλ) → f(x0) in (Y, σ).
Proof. (1 =~ 2) Suppose f : (X, r) -+ (Y, c~) is continuous at every point z E X. Let G E a and let x E f-l(G). As G is a neighbourhood of f(x) we have t h a t f - l ( G ) is a neighbourhood of x. This implies t h a t for every x E f-a(G) there exists a set A~ in r containing x such t h a t A~ C f-l(G). It is easy to show that U{A~ : z E f-I(G)} = f-I(G). Therefore f - l ( G ) (being a union of open sets) is open. As G is an arbitrary open set in Y we have shown that f is continuous. (2 ==> 1) Suppose f is a continuous function. Let N be a neighbourhood of f ( x ) for some z E X. Then there exists a set G E cr such that f(z) E G and G C N. From continuity o f f we have that f - l ( G ) E r. As f - l ( N ) D f - I ( G ) and f - l ( G ) is open and contains x we have that f - l ( N ) is a neighbourhood of x. Therefore we have shown that for any neighbourhood N of f ( x ) , f - 1 (N) is a neighbourhood of x. As x was chosen arbitrarily we have t h a t f is continuous at every point z E X. (2 ==r 3) Suppose f is continuous. Given any neighbourhood N of f(xo) we know that there exists a set G E ~r such that f(x0) E G and G C N. From continuity of f we know t h a t f - a ( G ) E 7". Also, x0 E f-l(G). Therefore f - l ( G ) is a neighbourhood of x0. As xx ~ x0 we know that there exists a A0 such t h a t A0 -~ A implies t h a t x~ is in f-l(G). Therefore there exists a A0 such that A0 -~ A implies t h a t f(x~) E G C N. As the neighbourhood N of f(xo) was chosen arbitrarily we have shown t h a t f ( x ~ ) --+ f(zo). (3 ::~ 2) Suppose f is not continuous. Then there exists x0 E X such that f is not continuous at x0 (from 2 ::r 1). This implies t h a t there exists a neighbourhood N of f(xo) such that f - a (N) is not a neighbourhood of x0. Therefore given M E A/'(x0) there exists x M E M such t h a t X M ~ f - l ( N ) . XM is a net in X on the directed set A/'(x0) where the binary relation is given by A -~ B if and only if A D B. It is clear t h a t XM --r x0. However, f(xM) ~ f(xo) because f(zM) E Y \ N for all M E A/'(x0) and g is a neighbourhood of f(xo). This proves the lemma. [] For a m a p f between sets X and Y and A C X we define
f(A) := {y ∈ Y : y = f(z) for some z ∈ A}.

The set f(X) is called the range of f and is denoted by range(f).

Lemma 1.2.5. If f : (X, τX) → (Y, τY) is a continuous map then f⁻¹(B) is closed if B is closed in Y.

Proof. As B is closed in Y it follows that Y \ B is open. From continuity of f it follows that f⁻¹(Y \ B) is open. However, f⁻¹(Y \ B) = X \ f⁻¹(B). Therefore, f⁻¹(B) is closed. □
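Definition 1.2.9 characterizes continuity through preimages of open sets, and on finite topological sets this can be checked directly. The following sketch (added here for illustration; the finite spaces and the two maps are assumptions made up for the example) implements that check.

```python
def preimage(f, B, X):
    """f^{-1}(B) = {x in X : f(x) in B}."""
    return frozenset(x for x in X if f[x] in B)

def is_continuous(f, X, tau_X, tau_Y):
    """Definition 1.2.9: f is continuous if f^{-1}(G) is open in X for every open G in Y."""
    tau_X = set(map(frozenset, tau_X))
    return all(preimage(f, frozenset(G), X) in tau_X for G in tau_Y)

X = {1, 2, 3}
Y = {'a', 'b'}
tau_X = [set(), {1}, {1, 2}, X]
tau_Y = [set(), {'a'}, Y]
f = {1: 'a', 2: 'a', 3: 'b'}     # f^{-1}({'a'}) = {1, 2}, which is open in X
g = {1: 'b', 2: 'a', 3: 'a'}     # g^{-1}({'a'}) = {2, 3}, which is not open in X
print(is_continuous(f, X, tau_X, tau_Y))   # True
print(is_continuous(g, X, tau_X, tau_Y))   # False
```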
Definition 1.2.10 (Initial topology). Let {fj : j ∈ J} be a collection of functions, indexed by the set J, on a set X such that fj : X → Yj where Yj is a topological set with topology τj. The weakest topology on X which makes all functions fj continuous is called the initial topology on X induced by fj.

Lemma 1.2.6. Let {fj : j ∈ J} be a collection of functions on a set X, indexed by the set J, such that fj : X → Yj where Yj is a topological set with topology τj. The collection of sets of the form
{fj⁻¹(A) : A ∈ τj for some j ∈ J},
forms a subbasis for the initial topology on X.

Proof. It is clear that any topology on X which makes all the functions fj continuous must contain all the sets from σ, where σ := {fj⁻¹(A) : A ∈ τj for some j ∈ J}, otherwise at least one of the fj will be discontinuous. Therefore, the initial topology also contains σ. Let τ denote the topology for which σ forms the subbasis. As τ is the weakest topology that contains σ and the initial topology contains σ, it follows that the initial topology is stronger than τ. Also, if X is endowed with the topology τ then the functions fj are continuous. As the initial topology is the weakest topology on X which makes fj continuous for all j, it follows that τ is stronger than the initial topology. Thus τ is the initial topology. □
collection of functions {fj : j E J}, indexed by the set J, where fj : X Then a net x~ in ( X , r ) on A converges to xo in X if and only if fj(x~) converges to fj(xo) in rj for all j in J.
(Yj,Tj).
Proof. (=:>) Suppose the net x~ in X on A converges to x0. Then, because the topology on X is the initial topology (which makes all fj continuous) f j ( x x ) --+ fj(xo) for all j 9 J. (r Suppose for all j 9 J, fj(x~) ~ fj(xo). Let B be a neighbourhood of so. Then, as := { f 7 1 ( A ) : A 9
for some j 9 d ) ,
is a subbasis for the initial topology on X, B contains a set C which is a finite intersection of sets from ~ such t h a t z0 9 C. Without loss of generality assume that k C = Oi=l{f[-l(Ai) : where A~ 9 r~}.
Note that as x0 9 C it follows that Ai is an open set containing fi(xo) for all i = 1 , . . . , k. As fi(x~) ~ fi(xo) it follows that there exist Ai such that if ,~i "< ,k then fi(xx) 9 Ai for all i = 1 , . . . , k. Let ,~0 represent the m a j o r a n t of the set { A t , . . . , A k } . This implies that if A0 -< A then fi(xx) 9 Ai for all
1.2 General Topology
9
i = 1 , . . . , k . Therefore, if A0 -< )~ then xx E f~-l(A~) for all i = 1 , . . . , k which implies that zx E n~=i{f~-l(Ai)} C t3. This implies that z~ is eventually in B. As B is an arbitrary neighbourhood of z0 it follows that zx --+ z0. [] D e f i n i t i o n 1.2.11 ( I s o m o r p h i s m ) . A function f : X --4 Y is an isomorphism if it is a one-to-one and onto function. D e f i n i t i o n 1.2.12 ( H o m e o m o r p h i s m ) . A function f : (X, r~) --~ (Y, ry) is a homeomorphism if it is a one-to-one and an onto function which is continuous with its inverse also continuous. Two topological sets which have a homeomorphism between them are said to be topologically identical. D e f i n i t i o n 1.2.13 ( F i l t e r s ) . A filter Y: in a set X is collection of subsets of X which has the following properties.
1. { } r 2. X E.T. 3. A C B and A E .T implies that B E Y:. 4. A E .T and B E Y: implies that A n B E .T. D e f i n i t i o n 1.2.14 ( U l t r a f i l t e r ) . An ultrafilter in a set X is a filter in X with the additional property that no other filter in X properly contains it. L e m m a 1.2.8. For a filter g in a set X, the following statements are equivalent.
1. g is an ultrafilter in X. 2. For every set A C X either A E g or X \ A E g. Proof. (2 ~ 1) Suppose g is not an ultrafilter in X. Then there exists a filter ~" in X and a subset A of X such that A E Y" but A ~ g with the property that if B E g then B E }-. If X \ A E g then X \ A E 9v and because .T is a filter, ( X \ A ) A A E 9r. This would imply that {} E T . Therefore X \ A ~ g. Therefore we have shown that both A and X \ A are not in the filter g. (1 => 2) Suppose g is a filter and there exists a set Y C X such that Y ~ g and X \ Y ~ ~. Let A be any set which belongs to the filter g. Then ( X \ Y ) n A # {} because otherwise A C Y and as ~7 is a filter Y E ~. Let .T:={C:
there e x i s t s A E g s u c h t h a t
CD(X\Y)
AA}.
As ( X \ Y ) n A ~ {} for every A E g it is clear that {} ~ .T. It is clear that X E 3v. If C1 E .T then there exists a s e t A1 E g s u c h t h a t C'1 D ( X \ Y ) A A 1 . If C2 D Ci then C2 D ( X \ Y ) n Ai and therefore C~ E ~ . If Ci and C'2 both belong to .T then there exist sets Ai and A2 both in g such that C1 D ( X \ Y ) O Ai and C2 D ( X \ Y ) n A2. As Ai VIA2 E g and Ci AC2 D ( X \ Y ) f q (A1 AA2) it follows that Cl OC2 E .T. This proves that .T is a filter. Note that if B E G then B E .T. Also, X \ Y E .T whereas X \ Y f[ g. Thus .T properly contains G. Therefore g is not an ultrafilter, which proves the lemma. []
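On a finite set the filter axioms of Definition 1.2.13 and the ultrafilter criterion of Lemma 1.2.8 can be checked exhaustively. The following sketch (added here for illustration; the set and the two collections are assumptions chosen only for the example) does exactly that.

```python
from itertools import chain, combinations

def powerset(X):
    X = list(X)
    return [frozenset(c) for c in chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))]

def is_filter(X, F):
    """Definition 1.2.13 on a finite set X."""
    X, F = frozenset(X), set(map(frozenset, F))
    if frozenset() in F or X not in F:
        return False
    for A in F:
        for B in powerset(X):
            if A <= B and B not in F:      # closed under supersets
                return False
        for B in F:
            if A & B not in F:             # closed under finite intersections
                return False
    return True

def is_ultrafilter(X, F):
    """Lemma 1.2.8: a filter F is an ultrafilter iff A in F or X \\ A in F for every A."""
    X, F = frozenset(X), set(map(frozenset, F))
    return is_filter(X, F) and all(A in F or X - A in F for A in powerset(X))

X = {1, 2, 3}
principal = [A for A in powerset(X) if 1 in A]           # all subsets containing 1
supersets = [frozenset({1, 2}), frozenset(X)]            # supersets of {1, 2}
print(is_filter(X, principal), is_ultrafilter(X, principal))   # True True
print(is_filter(X, supersets), is_ultrafilter(X, supersets))   # True False
```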
Lemma 1.2.9. Every filter in a set X is contained in an ultrafilter in X.
Proof. Suppose G is a filter in X. Consider
P := {7i : 7-I is a filter in X which contains ~}. Let the binary relation on P be given by -< where B -< .4 if and only if B C .A. Let Q be a totally ordered subset of P. Let N denote the union of all the elements of Q. Then it can be shown that N E P and t h a t N is a m a j o r a n t for Q. This implies t h a t P is inductively ordered. From Zorn's l e m m a ( L e m m a 1.1.1) P has a m a x i m a l element ~'. It follows t h a t 5r is an ultrafilter that contains ~. [] D e f i n i t i o n 1.2.15 ( S u b n e t s ) . Let ( A , i ) be a net in X on A. A subnet o f ( A , i ) is a net ( M , j ) in X with a f u n c t i o n h : M --+ A such that j = i(h) and f o r every X0 E A there exists a ~o E M with A0 -< h(~) if ~o -'< ~. We also say that yz on M is a subnet of the net xx on A if there exists a function h : M --+ A such that xh(Z) = yZ and f o r every Ao E A there exists a fie E M with Ao ~ h(13) if /3o -< ~3. The definition of subnets seems involved. However, the definition of a subsequence of a sequence will bring out the similarity between subnets and subsequences. L e m m a 1.2.10. Let xx be a net in ( X , r) on A. Let yp be a subnet of x;~ on M. If x~ -+ xo then y~ --+ xo. Proof. Let N E A/'(x0). As xx --+ z0 we know that there exists a A0 E A such t h a t )~0 -< )~ implies that xx E N. As yz is a subnet of x~ there exists a function h : M --+ A and /30 E M such that/30 -- Y where Y is a set then f ( x x ) is a universal net in Y on A. Proof. Let B C Y. Then as xx is a universal net on X it follows that xx is eventually either in f - l ( B ) or in X \ f - I ( B ) = f-I(Y\B). Therefore, f ( z x ) is eventually either in B or in Y \ B . Therefore, f ( x x ) is a universal net in Y. []
Theorem 1.2.2. Every net has a universal subnet.
Proof. Let xλ be a net in X on Λ. Consider the set G := {A : A ⊂ X and xλ is eventually in A}. It is evident that G is a filter. From Lemma 1.2.9 we know that there exists an ultrafilter F which contains G. We will show now that xλ is frequently in every set in F. Suppose there exists a set F0 ∈ F and a λ0 ∈ Λ such that if λ0 ≺ λ then xλ ∉ F0. Therefore {xλ : λ0 ≺ λ} ∩ F0 = {}. The set {xλ : λ0 ≺ λ} belongs to G and therefore it belongs to F. F0 belongs to F and, as F is a filter, it follows that {xλ : λ0 ≺ λ} ∩ F0 belongs to F, which contradicts the fact that {} is not in F. Thus xλ is frequently in every set in F. [...]

1.4 Normed Linear (Vector) Spaces

A norm ‖·‖ on a vector space X is a real valued function with the following properties.
1. ‖x‖ ≥ 0 and ‖x‖ = 0 if and only if x = 0.
2. ‖αx‖ = |α| ‖x‖ for any scalar α and vector x in X.
3. ‖x + y‖ ≤ ‖x‖ + ‖y‖.
It is clear that a norm on a vector space X induces a metric on X defined by d(x, y) := ‖x − y‖ for elements x and y in X. Therefore, a norm also induces a topology (called the norm topology) which is the metric topology with the metric defined as above. The normed topological space X with a norm ‖·‖ is denoted by (X, ‖·‖). Also, note that because a normed space is a metric space, sequences suffice to describe convergence.

Example 1.4.1. Consider the set Rⁿ which is defined as Rⁿ := {(x1, x2, ..., xn) : xi ∈ R for i = 1, ..., n}.
Rⁿ is a vector space with the real numbers as the scalars. Many different norms can be defined on Rⁿ. An important class of norms on Rⁿ is defined by the p-norm which is given by

|x|_p := (∑_{i=1}^{n} |x_i|^p)^{1/p}
where x = (x1, x2, ..., xn) and p is an integer such that 1 ≤ p < ∞. The two norm (p = 2) and the one norm (p = 1) are of particular interest. Another important norm on Rⁿ is the ∞-norm which is defined as

|x|_∞ := max_{1≤i≤n} |x_i|.

[...] 3. There exists K > 0 such that ‖T(x)‖_Y ≤ K ‖x‖_X.
in X such t h a t IIT(x,~)lly > nllx,~ll x . Let zn : = n - ~ x "
T h e n IIz,~llA- = 1
and therefore z , converges to 0 in X. However, IIT(z,~)l I > 1 for all n which implies t h a t T(zn) does not converge to zero. This contradicts the fact t h a t
Zn "-+0. (3 ::~ 2) Suppose there exists a real n u m b e r I( such t h a t IIT(x)lly 0 such t h a t [ r - e , r + ~ ] C A~ o. From the definition of r it follows t h a t [a, r - c] a d m i t s a finite subcover from {Ax}. This subcover together with A,xo forms a finite subcover of [a, r + ~]. This contradicts the definition of r. T h u s r > 1. This proves the l e m m a . [] Now we prove the well known Hiene-Borel theorem. T h e o r e m 1.5.1 ( H i e n e - B o r e l ) . 11) is compact.
Every closed bounded subset C, of (R n, 1.
Proof. Because C is bounded there exist b o u n d e d closed intervals Ik for k = 1 , . . . , n in R such t h a t C C H~=llk. As Ik are c o m p a c t (see L e m m a 1.5.1), it follows from T h e o r e m 1.2.4 (Tychonoff's t h e o r e m ) t h a t II'~=llk is c o m p a c t . As C is a closed subset of Hr~=lIk it follows from L e m m a 1.2.12 t h a t C is compact. [] L e l n m a 1.5.2. Every linear map from ( R n, l" I1) to any normed space (X, I[" Ilx) is continuous.
Pro@ Let ei be the n a t u r a l basis for /~n where ei is the n-tuple with 1 in the i th place and zeros elsewhere. Suppose, zk --+ z0 in ( R '~, I" I1) t h a t is, n k rt ]xk -- X0]l --+ 0. If xk = E i = I ai ei and x0 = ~--~i=1 a~ then this implies t h a t aik --+ a ~ for all i = 1 , . . . , n. Now, if T : (R n, I" 12) -+ (X, I1" IIx) is linear then IIT(xk)- Z(x0)ll = IlT(zk- x0)ll = IIr(~i~-0(ag- a ~ max1 0 there exists y,, 9 Y and bm 9 B such that y,~ + 2-'~bm = x. This implies that for any x 9 X and m > 0 there exists y,,, 9 Y, bm 9 B such that Ilym - xll -- 2-mllbmll _< 2 - m . Therefore, X = Y - . But Y - is Y itself because Y is finite dimensional (see Corollary 1.5.2). Therefore, X is finite dimensional. [] This L e m m a indicates the scarceness of compact sets in the norm topology. As we will see in the next section compactness is essential in optimization and this will lead us to define lcss restrictive topologies in the next chapter.
1.6 E x t r e m a of Real Valued F u n c t i o n s
1.6 Extrema
of Real
Valued
23
Functions
In this section we provide characterizations for functions which allow for extrema to exist. It is shown that cornpactness of sets and continuity properties of functions play an important role for cxtrema to exist. D e f i n i t i o n 1.6.1 ( L o c a l e x t r e m a ) . Let (X, II' IIx) be a normed vector space and let f : D --+ R be a real valued function defined on a subset D of X . An element xo in X is a local minimum if there exists a neighbourhood N o f x o such that for all x E N M D, f(xo) < f ( x ) . xo is a strict m i n i m u m if for all x e g f3 D, f(xo) < f ( x ) . Local maxima are defined analogously. Local extrema refers to either local minima or local maxima. T h e o r e m 1.6.1. If (X, r) is a topological compact set and f : (X, 7-) ---> R is a real valued continuous function then there exists elements Xo and xl in X such that f ( x o ) f ( y ) for all y E X. Pro@ Let tt := inf{f(x) : x E X}. Then from the definition of infimum for every positive integer n there exists an element z,~ such that f(~,~) ~} for a real is open. Then f is a lower semicontinuous function, f is upper semicontinuous if - f is lower semicontinuous. For A a directed set and rx a net in R on A we use the notation lira infrx to represent liminfrx:=
sup (xoin[ ,ko•A
r;~), A
24
1. Topology
and we use lira sup r~ to represent lim supr,x := inf ( s u p r x ) . AoEA
\Ao-~ A
L e m m a 1.6.1. Let (X, r) be a topological compact set and f : (X, r) --+ R. f is lower semicontinuous if and only if f ( l i m xx) _< l i m i n f f ( x A ) ,
for every convergent net x~. f is upper semicontinuous if and only if f(limzx) > limsupf(z~),
for every convergent net xx. Proof. (==~) Suppose f is a lower semicontinuous function and suppose x,x is a convergent net with limxx = x0. Choose any real number t such that t < f(xo). As f is lower semicontinuous it follows that the set { z : f ( x ) > t} is in r. Note that x0 is in this set and therefore {x : f ( x ) > t} is a neighbourhood of x0. As xx --+ x0 it follows that xx is eventually in this set. Therefore, there exists a )q such that if AI -~ )~ then f(x~) > t which implies that inf f ( z ~ ) > t. Therefore, l i m i n f f ( z ~ ) = sup inf f ( x ~ ) > t. This is true )~ 1 "~ A
--
AoEA
Ao-~A
--
for any t < f(x0) and therefore lim i n f f ( x x ) > f(zo). (r Suppose, for every convergent net xx, f ( l i m xx) < lira i n f f ( z x ) . Consider any set F := {x : f ( z ) < t} where t E R. Suppose, x0 E F - . Then from L e m m a 1.2.3 it follows that there exists a net x~ in X on a directed set A such that x~ --+ x0. From the assumption we have f ( x 0 ) < l i m i n f f ( x x ) _< t. Therefore, z0 E F which implies that F is closed. It follows t h a t {x : f ( x ) > t} = X \ F is open. Thus we have shown that f is lower semicontinuous if and only if f ( l i m z~) < lim i n f f ( x ~ ) . T h e rest of the l e m m a is left as an exercise. [] C o r o l l a r y 1.6.1. If (X, r) is a topological compact set and f : (X, r) -~ R
is a real valued lower semicontinuous function then there exists an element xo in X such that f(xo) < f(y) for all y E X. Similarly, if f is upper semicontinuous then there exist an element xl in X such that f ( x l ) > f ( y ) for all y E X. Proof. Follows from L e m m a 1.6.1 and arguments similar to one used in proving T h e o r e m 1.6.1. [] It is clear that the topology of a set is vital in determining whether ext r e m a for a function exist or not. In m a n y cases the function is a measure of a physical quantity which needs to be minimized or maximized on the given set. We have seen in the previous section that the norm topology is particularly restrictive for infinite dimensional spaces because of the dearth of compact sets in this topology (a norm bounded ball is not c o m p a c t in the norm topology; see T h e o r e m 1.5.2). Therefore, for infinite dimensional spaces
1.6 Extrema of Real Valued Functions
25
it is worthwhile to study relevant topologies other than the n o r m topology. We do this in the next chapter. T h e fact t h a t the derivative of a real valued function f : R --+ R vanishes when it has a local m a x i m a or a local m i n i m a is a classical result. Now, we generalize this result. D e f i n i t i o n 1.6.3 ( G a t e a u x d e r i v a t i v e ) . Let X be a vector space with an open subset D and let (Y, ][-[IY) be a normed vector space with a map, f : D ---+ Y defined. For an element x E X and h E X , f is said to be gateaux differentiable at x with increment h if there exists an clement fh (x) E Y such that
i i f ( x + a h ) - Ot f(m) - ~ A ( m ) l i t
~ 0 a,~ ~ ~ 0.
f h ( x ) is called the gateaux derivative of f at x in the direction h. If f is gateaux differentiable at x with all increments h E X then f is said to be gateaux differentiable at x. If f is gateaux differentiable at all x E X then f is gateaux differentiable. Note t h a t for a differentiable function f : R --+ R the notion of the g a t e a u x derivative and the ordinary derivative are tire same. T h e o r e m 1.6.2. Let f : ( X , I1' IIx) -+ :~ be a real valued gateaux dif:erentiablc function on a normed vector ,space (X, I1' IIx). A,, element Xo in X is a local extrema only i f f h ( X o ) = 0 f o r all h E X .
Proof. Suppose at x0 there is a local minima. T h e n there exists an e > 0 such t h a t Ilzllx _ ~ implies t h a t f ( x o + z) - f ( x o ) >_ O. Therefore, if a > 0 and [lahllx < e then f ( x o -t- a h ) - f ( x o ) > 0. Letting c~ --+ 0 while keeping cr positive we see t h a t f h ( x ) >_ O. Similarly, note t h a t if cr < 0 and II~hllx _ then f ( x o + ah) - f ( x o ) < 0. Letting a --+ 0 while keeping a negative we see t h a t fh(xo) - c o . Then U is a real valued sublinear function such that for all x in X , U(x) g(x) for all x in X. We will prove the existence of L by proving t h a t ~" is inductively ordered and then a p p l y i n g Zorn's l e m m a to it. Let ~ be a totally ordered subset of.T. We will show t h a t ~ has a m a x i m a l element. Define for w in X M ( w ) := i n f { g ( w ) : g E G}.
30
2. Functions on Vector Spaces
It is clear t h a t g -< M for any g E ~. Also note t h a t for any element f in ~" and for any element w in X, 0 = f(w - w) < f(w) + f ( - w ) . Therefore, f(w) >_ - f ( - w ) >_ - S ( - w ) > - o o . Therefore, for any element w E X, M(w) > - ~ . As M(w) _g(x) +g(y) > g(x + y), and i f g -< h then h(x) + g(y) >_ h(x) + h(y) >_ h(x + y). In either case there exists a function in ~ such t h a t h(x) + g(y) >_-h(x + y). It follows t h a t
M(x) + M ( y ) = inf{h(x) + 9 ( y ) : h,g E ~} > inf{h(x § y) : h E ~}
= M(x + y). T h u s M is a real valued sublinear function. It is clear t h a t M(x) ----IIx011x. Proof. Let S : X --+ R be defined by S ( x ) := Ilxllx for all x 9 x . T h e n S is a real valued sublinear function. From L e m m a 2.1.3 we know t h a t there exists a linear function L : X --+ R such t h a t L(x) < S ( x ) for all z 9 X a n d L(xo) = S(xo). T h i s implies t h a t IILII = x and Z(x0) = II~011x. Denote Z by x~. This proves the t h e o r e m . []
34
2. Functions on Vector Spaces
T h e o r e m 2.2.3. Let (X, I I IIx) and (Y, I1" IIY) be normed vector spaces. Let X • Y be endowed with the product norm. Then there exists an isometric isomorphism between ( X x Y)* and X* • Y* with the norm on X* x Y* defined by
II(x*, y')ll := max(llx'll, Ily'll}, where x* E X* and y* E Y*. Proof. Let f be a b o u n d e d linear real valued function on X • Y. T h e n f r o m the linearity of f we have for any (x, y) E X • Y, f ( ( x , y)) = f ( ( x , 0) + (0, y)) = f ( ( x , 0)) + f ( ( 0 , y)).
(2.5)
It is clear that x* a function on X defined by x*(x) = f ( ( x , 0)) is linear. Similarly, y* a function on Y defined by y*(y) = f ( ( 0 , y)) is linear. Also, from equation (2.5) we have ]x*(x) + y*(y)l = I f ( x , y ) l _< Ilfll(llxllx + Ilyllr) for all (x,y) ~ X • In particular, Ix'(x)l _< Ilfll Ilxllx which implies
that IIx'll _< Ilfll. Similarly, it max{llx*ll, IlY*II) < Ilfll.
can
be shown that IlY*II _< Ilfll and therefore
Given any e > 0 we know t h a t there exists xr E X and yr E Y such t h a t Ilxdlx + IlY, IIY < 1 and If(x,y)l >_ I l f l l - r Therefore, for every e > 0 there exists xr E X and y, E Y such t h a t I I x , l l x + l l Y , llr < 1 and I x * ( x ) + y * ( x ) l > Ilfll- r which implies t h a t IIx'll IIx, llx + Ily*ll Ily, llr > Ilfll- r T h u s ,
max{llx'll, Ily'll}(llx,llx + Ily, llYI) > Ilfll - r As IIx,llx + IlY, IIY < 1 and e > 0 is a r b i t r a r y it follows t h a t max{llx*ll, IlY*II) > Ilfll. T h u s we have established t h a t the m a p i : (X • Y)* --+ X* • Y* which takes f E (X x Y)* into (x*, y*) E X* x Y* as defined above is isometric. T h e fact t h a t i is isomorphic is left to the reader to prove. [] D e f i n i t i o n 2.2.2 ( S e c o n d d u a l s p a c e ) . Let (X, I1' I[x) be a normed vector space with (X*, I1' II) as its dual. The set of all bounaed linear functions from X* to R is the second dual of X. It is denoted as X**. Every element x in X can be t h o u g h t of as a m a p f r o m X* into R. T h u s every element x E X can be identified with an element in X**. For any x* in X* let (J(x))(x*) := x*(x). Note t h a t (J(x))(x~ + x~) = (x*1 + x~)(x) = x*l(x)+x~(x ) = (g(x))(x*l)+(g(x))(x~) where x E X, x~ E X* and x i E X*. Also, (g(x))(c~x*) = ((~x*)(x) = c~x*(x) = c~(J(x))(x*). Therefore, it follows that J(x) is a linear m a p from X* to R. It is also a b o u n d e d linear function on X*. Indeed, for x E X and x* E X*, I(J(x))(x*)l = Ix*(x)l < I]x*lI I]xilx. Therefore, s u p { ( g ( x ) ) ( x * ) : IIx'll < 1} < Ilxllx.
(26)
Thus we have established t h a t
IIJ(x)ll_ Ilxllx.
(2.7)
In fact we will show, using a Hahn–Banach result, that ||J(x)|| = ||x||_X. We call the map J : X → X** the canonical map from X to X**.

Let (X, ||·||_X) be a normed vector space with X* as its dual. We use the symmetric notation ⟨x, x*⟩ to denote x*(x). With this notation we have (J(x))(x*) = ⟨x*, J(x)⟩ = ⟨x, x*⟩. We will call ⟨·,·⟩ the bilinear form on X.

Definition 2.2.3 (Adjoint map). Let A : (X, ||·||_X) → (Y, ||·||_Y) be a bounded linear map from a normed vector space X to a normed vector space Y. The adjoint of the map A, denoted by A*, is a map from Y* to X* defined by

⟨x, A*(y*)⟩_X := ⟨A(x), y*⟩_Y for all x ∈ X and for all y* ∈ Y*,

where ⟨·,·⟩_X is the bilinear form on X and ⟨·,·⟩_Y is the bilinear form on Y.

Theorem 2.2.4. For any element x₀ of a normed linear space (X, ||·||_X) there exists an element x₀* in X* such that ||x₀*|| = 1 and ⟨x₀, x₀*⟩ = ||x₀||_X. Also, if J : X → X** is the canonical map then ||J(x)|| = ||x||_X.

Proof. Note that ||·||_X : X → ℝ is a sublinear function on X. Applying Lemma 2.1.3 we know that there exists a linear function L : X → ℝ such that L(x) ≤ ||x||_X for all x in X and L(x₀) = ||x₀||_X; denote L by x₀*. We have established earlier that ||J(x₀)|| ≤ ||x₀||_X. However, ||J(x₀)|| = sup{⟨x*, J(x₀)⟩ : ||x*|| ≤ 1} ≥ ⟨x₀, x₀*⟩ = ||x₀||_X. This proves that ||J(x)|| = ||x||_X for all x in X. □

Theorem 2.2.5. Let (X, ||·||_X) and (Y, ||·||_Y) be normed vector spaces with a linear map A defined from X to Y. Let A* : Y* → X* denote the adjoint of A. Then ||A|| = ||A*||, that is,

sup{||A(x)||_Y : x ∈ X and ||x||_X ≤ 1} = sup{||A*(y*)|| : y* ∈ Y* and ||y*|| ≤ 1}.

2.3 Weak Topologies

Lemma 2.3.1. A net {x_α*} in X* converges to x₀* in the weak-star topology if and only if ⟨x, x_α*⟩ → ⟨x, x₀*⟩ in ℝ for all x in X.

Proof. Follows from Lemma 1.2.7. □

Theorem 2.3.1 (Banach–Alaoglu). Let (X, ||·||_X) be a normed vector space with X* as its dual. The set

B* := {x* : ||x*|| ≤ M}   (2.8)
is compact in the weak-star topology for any M ∈ ℝ.

Proof. Let x_α* be a universal net in B* on the directed set A. For any x in X, (J(x))(x_α*) = ⟨x, x_α*⟩ is a universal net in ℝ (see Lemma 1.2.11) and furthermore, |⟨x, x_α*⟩| ≤ M||x||_X. Therefore ⟨x, x_α*⟩ is a universal net in the compact set B := {r ∈ ℝ : −M||x||_X ≤ r ≤ M||x||_X}, and hence ⟨x, x_α*⟩ → f(x) with f(x) ∈ B (see Theorem 1.2.3). Let x and y be any two elements in X. Let ⟨x, x_α*⟩ → f(x) and ⟨y, x_α*⟩ → f(y). Also, it is true that ⟨x + y, x_α*⟩ → f(x + y). This implies that ⟨x, x_α*⟩ + ⟨y, x_α*⟩ → f(x + y). Therefore, f(x + y) = f(x) + f(y). It can be shown in a similar manner that f(αx) = αf(x) for any real number α and x in X. This implies that f : X → ℝ is linear. As f(x) is in B it follows that ||f|| := sup{|f(x)| : ||x||_X ≤ 1} ≤ M. Therefore, f belongs to B* and by definition ⟨x, x_α*⟩ → f(x) for all x in X. Hence, from Lemma 2.3.1 it follows that x_α* → f in the weak-star topology. Thus, every universal net in B* is convergent in the weak-star topology, which implies that B* is compact in the weak-star topology (see Theorem 1.2.3). □

This shows that the weak-star topology is important because, unlike the norm topology, the unit norm ball is compact in it. However, it is not true that every sequence in a compact topological space has a convergent subsequence (an example is given later). The following result guarantees existence of a convergent subsequence when the space is separable.

Theorem 2.3.2. Let (X, ||·||_X) be a separable normed vector space with X* as its dual. Then every sequence in {x* : ||x*|| ≤ M} has a convergent subsequence in the weak-star topology, where M ∈ ℝ.

Proof. Left to the reader. □

Next we present a result on the compactness of the norm ball of X in the weak topology.

Theorem 2.3.3. Let (X, ||·||_X) be a normed vector space with X* as its dual. The set

B := {x : ||x||_X ≤ 1}   (2.9)

is compact in the weak topology if and only if X is reflexive.
2.4 ℓ_p Spaces

In this section we study the vector space of sequences with different norms imposed on it. We denote the space of sequences by ℓ. Therefore, every element of ℓ is a function from the set of integers to the real numbers; ℓ = {x | x : I → ℝ}. If α ∈ ℝ and x ∈ ℓ then we define αx by (αx)(i) = αx(i). For x ∈ ℓ and y ∈ ℓ we define x + y by (x + y)(i) = x(i) + y(i). It is evident that ℓ with the scalar multiplication and vector addition as defined above is a vector space. Often, for an element x ∈ ℓ and i ∈ I, x(i) is denoted by x_i. For an element x in ℓ define

||x||_p := ( Σ_{i=−∞}^{∞} |x_i|^p )^{1/p}  and  ||x||_∞ := sup_{−∞<i<∞} |x_i|,

where 0 < p < ∞ is a real number.

Definition 2.4.1 (ℓ_p spaces). Let p be a real number such that 0 < p < ∞. The ℓ_p spaces are defined by

ℓ_p := {x ∈ ℓ : ||x||_p < ∞}.

We will restrict our attention to right-sided sequences, i.e. to elements of ℓ which map negative integers to zero. This is a matter of convenience and the results in this section are valid in general.

Definition 2.4.2. c₀ is a subset of ℓ such that for every x ∈ c₀, lim_{i→∞} x_i = 0.

Now we show that ℓ_p spaces are nested.

Lemma 2.4.1. If 0 < p < q < ∞ then ℓ_p ⊂ c₀ and ℓ_p ⊂ ℓ_q.
Proof. We leave the first part of the proof to the reader. Suppose x ∈ ℓ_p. Then Σ_{i=0}^{∞} |x_i|^p < ∞. Therefore, there exists an integer N such that i > N implies that |x_i| < 1. Note that if p ≤ q then for i > N, |x_i|^q ≤ |x_i|^p. Therefore,

Σ_{i=0}^{∞} |x_i|^q = Σ_{i=0}^{N} |x_i|^q + Σ_{i=N+1}^{∞} |x_i|^q ≤ Σ_{i=0}^{N} |x_i|^q + Σ_{i=N+1}^{∞} |x_i|^p < ∞.

Therefore x ∈ ℓ_q. This proves the lemma. □
Lemma 2.4.2 (Hölder's inequality). Let 1 < p < ∞ and let q satisfy 1/p + 1/q = 1. If x ∈ ℓ_p and y ∈ ℓ_q then

Σ_{i=0}^{∞} |x_i y_i| ≤ ||x||_p ||y||_q.

Proof. We first show that for any r ≥ 0, t ≥ 0 and 0 ≤ λ ≤ 1,

r^λ t^{1−λ} ≤ λr + (1 − λ)t.   (2.11)

Define f(h) := h^λ − λh − (1 − λ) for h ≥ 0. Then f′(h) = λ(h^{λ−1} − 1), so that f′(h) > 0 for 0 < h < 1 and f′(h) < 0 for h > 1. This implies that f is a strictly increasing function in the region 0 < h < 1 and therefore f(h) < f(1) = 0 if 0 < h < 1. Similarly, in the region h > 1, f is a strictly decreasing function and f(h) < f(1) = 0 if h > 1. This implies that f(h) ≤ 0 if h ≥ 0 and the equality holds only if h = 1. In other words h^λ ≤ λh + 1 − λ, with equality only if h = 1. If t > 0 then by substituting r/t for h we have the inequality in (2.11). If t = 0 then the inequality in (2.11) clearly holds.

Let

r = |x_i|^p / ||x||_p^p,  λ = 1/p,  t = |y_i|^q / ||y||_q^q  and  1 − λ = 1/q.

From inequality (2.11) we have

|x_i y_i| / (||x||_p ||y||_q) ≤ (1/p) |x_i|^p / ||x||_p^p + (1/q) |y_i|^q / ||y||_q^q.

Therefore,

Σ_{i=0}^{∞} |x_i y_i| / (||x||_p ||y||_q) ≤ (1/p) Σ_{i=0}^{∞} |x_i|^p / ||x||_p^p + (1/q) Σ_{i=0}^{∞} |y_i|^q / ||y||_q^q = 1/p + 1/q = 1.

This proves the lemma. □
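The nesting lemma and Hölder's inequality are easy to check numerically on truncated sequences. The following short Python sketch (an illustration added here, not part of the original development; the sequences and exponents are arbitrary choices) verifies both on a pair of decaying right-sided sequences.

```python
import numpy as np

# Numerical illustration (not from the text) of Holder's inequality and the
# nesting of the l_p spaces, using finite right-sided sequences.
rng = np.random.default_rng(0)
x = rng.standard_normal(50) * 0.5 ** np.arange(50)   # a rapidly decaying sequence
y = rng.standard_normal(50) * 0.8 ** np.arange(50)

def lp_norm(z, p):
    """l_p norm of a finite sequence; p = np.inf gives the supremum norm."""
    if np.isinf(p):
        return np.max(np.abs(z))
    return np.sum(np.abs(z) ** p) ** (1.0 / p)

p, q = 3.0, 1.5                                       # conjugate exponents: 1/p + 1/q = 1
lhs = np.sum(np.abs(x * y))
rhs = lp_norm(x, p) * lp_norm(y, q)
print(f"Holder:  sum|x_i y_i| = {lhs:.6f} <= ||x||_p ||y||_q = {rhs:.6f}")

# Nesting: for p < q the sequence norms satisfy ||x||_q <= ||x||_p,
# so a sequence in l_p also lies in l_q.
for p_small, q_large in [(1, 2), (2, 4), (1, np.inf)]:
    print(f"||x||_{q_large} = {lp_norm(x, q_large):.6f} <= ||x||_{p_small} = {lp_norm(x, p_small):.6f}")
```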
Lemma 2.4.3 (Minkowski's inequality). Suppose that 1 ≤ p ≤ ∞. If x and y are elements of ℓ_p then ||x + y||_p ≤ ||x||_p + ||y||_p.

Given any ε > 0 there exists an integer K such that for all n, m ≥ K and for every N,

Σ_{i=0}^{N} |x^n(i) − x^m(i)|^p ≤ ε.

Letting m → ∞ we obtain, for n ≥ K and every N,

Σ_{i=0}^{N} |x^n(i) − x(i)|^p ≤ ε.

This proves that ||x^n − x||_p → 0 as n → ∞. Thus we have established that if 1 ≤ p < ∞ then (ℓ_p, ||·||_p) is a Banach space. If p = ∞ and {x^n} is a Cauchy sequence in ℓ_∞, then given any ε there exists an N such that for every i and for any n, m ≥ N,

|x^n(i) − x^m(i)| ≤ ε.   (2.13)

This implies that {x^n(i)} is a Cauchy sequence in ℝ and we suppose it converges to x(i). Letting m → ∞ in (2.13) we see that given any ε > 0 there exists an N such that if n ≥ N then

|x^n(i) − x(i)| ≤ ε for every i.

This proves that x^n → x in the ||·||_∞ norm. Thus we have established that if 1 ≤ p ≤ ∞ then (ℓ_p, ||·||_p) is a Banach space.

(2) The proof is identical to the proof for 1 ≤ p ≤ ∞. ℓ_p is not a Banach space for 0 < p < 1 because Minkowski's inequality does not hold. □
Zgso, (co, I1 I1~)
is separable. Proof. Let A be a subset of g such t h a t for every an integer N such t h a t if i > N then x(i) = 0 i = 0 , . . . , o0. We will show t h a t A is dense in (gp, in fp. Given any e > 0 there exists an integer M
element x E A, there exists and x(i) is rational for all I1" lip)- Let x be any element such t h a t
Σ_{i=M+1}^{∞} |x(i)|^p ≤ ε.

≥ ||x*|| − ε. As ||x||_p ≤ 1 and ε is any arbitrary positive number it follows that ||y||_q ≥ ||x*|| and therefore ||y||_q = ||x*||. Thus we have shown that there exists a map F : ℓ_p* → ℓ_q, defined by F(x*) := {x*(δ_i)}, such that F is isometric and also x*(x) = Σ_{i=0}^{∞} x(i) F(x*)(i). It is clear that F is one to one. We will now show that F is onto. Indeed, let z := {z(i)} be any element in ℓ_q. Let f : ℓ_p → ℝ be defined by f(x) = Σ_{i=0}^{∞} x(i) z(i). It is clear that f is linear and |f(x)| = |Σ_{i=0}^{∞} x(i) z(i)| ≤
ε > 0 such that B(x, ε) := {y : ||x − y||_X < ε} ⊂ A.

x ≥ y if x − y ∈ P. We write x > 0 if x ∈ int(P). Similarly x ≤ y if x − y ∈ −P =: N, and x < 0 if x ∈ int(N). Given a vector space X with positive cone P, the positive cone in X*, P^⊕, is defined as

P^⊕ := {x* ∈ X* : ⟨x, x*⟩ ≥ 0 for all x ∈ P}.

Example 3.1.1. Consider the real number system ℝ. The set
P := {x : x is nonnegative},
defines a cone in ℝ. It also induces a relation ≥ on ℝ where for any two elements x and y in ℝ, x ≥ y if and only if x − y ∈ P. The convex cone P with the relation ≥ defines a positive cone on ℝ.

Definition 3.1.5 (Convex maps). Let X be a vector space and Z be a vector space with positive cone P. A mapping G : X → Z is convex if G(tx + (1 − t)y) ≤ tG(x) + (1 − t)G(y) for all x, y in X and all 0 ≤ t ≤ 1.

There exist δ > 0 and M ∈ ℝ such that if (x, y) ∈ ℝ² and ||(x, y)|| ≤ δ then f((x, y)) ≤ M; that is, there exists δ > 0 such that B(0, δ) := {r ∈ ℝ² : ||r|| ≤ δ} ⊂ A. Thus we have shown that f is bounded above by M in the neighbourhood B(0, δ) of 0. Given any 0 < ε < 1 in ℝ let x ∈ B(0, εδ) := {r ∈ ℝ² : ||r|| ≤ εδ}. This implies that (1/ε)x ∈ B(0, δ) and therefore

f(x) = f(ε((1/ε)x) + (1 − ε)0) ≤ εf((1/ε)x) + (1 − ε)f(0) ≤ εM + (1 − ε)f(0).

Therefore, f(x) − f(0) ≤ ε(M − f(0)). A similar argument applied to −(1/ε)x shows that f(x) − f(0) ≥ −ε(M − f(0)). Thus we have shown that given any 0 < ε < 1 there exists a neighbourhood B(0, εδ) of 0 such that if x is in this neighbourhood then |f(x) − f(0)| ≤ ε(M − f(0)). Therefore, f is continuous at 0. □
3.2 Separation of Disjoint Convex Sets

Consider the vector space ℝ². The equation of a line (see Figure 3.3) in ℝ² is given by

m₁x₁ + m₂x₂ = c,

where m₁, m₂ and c are constants. The graph of a line is given by the set

L = {(x₁, x₂) | m₁x₁ + m₂x₂ = c},

which can be written as

L = {x ∈ ℝ² | ⟨x, x*⟩ = c},   (3.1)

where x* = (m₁, m₂). Note that if m₂ = 0 then we have a vertical line. We now generalize the concept of a line in ℝ² to normed vector spaces.
Fig. 3.3. Separation of ℝ² into half spaces by a line L.
Definition 3.2.1 (Linear variety, Hyperplanes). A subset V of a vector space X is a linear variety if there exists an element x_v in X such that

V = x_v + M := {x : x = x_v + m for some m ∈ M},

where M is a subspace of X. A subset H of X is called a hyperplane if it is a linear variety which is proper (i.e. there exists an element x₀ in X which is not in H) and maximal (i.e. if H₁ is another linear variety which contains H then H₁ = X).

The line L defined earlier is a hyperplane in ℝ².

Theorem 3.2.1. H is a hyperplane in X if and only if there exists a nonzero linear function f : X → ℝ such that H := {x : f(x) = c}, where c ∈ ℝ.

Proof. (⇒) Let H be a hyperplane in X. Then there exists x₀ in X such that H = x₀ + M := {x₀ + m : m ∈ M} where M is a proper subspace of X which is maximal. Let x₁ in X be such that x₁ ∉ M. It is clear that the set

M ⊕ ℝx₁ := {m + y : m ∈ M and y = αx₁ for some α ∈ ℝ}

is equal to X (because it is a subspace which contains M and M is maximal). It is also clear that for any x in X there exist unique elements f(x) ∈ ℝ and m_x ∈ M such that x = m_x + f(x)x₁.
We will now show that f is linear. Let x and y be elements in X, then
x = rnz + f ( x ) x l a n d y = m y + f ( y ) x l . T h e r e f o r e x + y = ( f ( x ) + f ( y ) ) x l . F r o m the definition a n d uniqueness of f ( x x + y = m x + y + f ( x + y ) x l with f ( x + y ) = f ( x ) + f ( y ) and mx+y can be shown s i m i l a r l y t h a t f ( a x ) = a f ( x ) where a E R a n d x
(rnx + my) + + y) we have = m , + m y . It
is a n y e l e m e n t in X. Thus, f is linear. It is clear t h a t M = {x : f ( x ) = 0} a n d therefore tt = {x : f ( x ) = e} where c : = f(xo). (r Let f : X ~ R be a nonzero linear m a p a n d let g : = {x : f ( x ) = c} for s o m e c in R. Define M : = { x : f ( x ) = 0}. As f is nonzero t h e r e exists an e l e m e n t x0 in X such t h a t x0 ~ M. N o t e t h a t x E H if a n d o n l y if f ( x ) = c which is true if a n d only if f ( x - f(-~o)x0) = 0. Therefore, x E H if a n d only if x - ] - - ~ x 0 E M. Thus, H = f - ~ x 0 + M which i m p l i e s H is a linear variety. Now we show t h a t H is a p r o p e r m a x i m a l linear variety. I n d e e d , H is p r o p e r because x0 ~ M. Let n E N where N is a s u b s p a c e which c o n t a i n s M a n d n ~ M. As f ( n ) # 0 we have f ( x - /,,-~, n) = 0 for all x in X which jtn)
.
implies that x n E M for all x in X. T h u s x = m + n for s o m e m E M. As N is a s u b s p a c e which c o n t a i n s M a n d n E N it follows t h a t x E N. As x E X was chosen a r b i t r a r i l y it follows t h a t N = X. T h u s we have e s t a b l i s h e d t h a t M is a p r o p e r m a x i m a l subspace. T h i s proves t h a t H is a hyperplane. [] W i t h this t h e o r e m we have recovered the f a m i l i a r d e s c r i p t i o n as given in e q u a t i o n (3.1) for h y p e r p l a n e s . 3 . 2 . 1 . Let H be a hyperplane in a vector space X such that 0 is not in H. Then there exists a unique nonzero linear map f : X -+ R such that H = { x : f ( x ) = 1}.
Corollary
Proof. F r o m T h e o r e m 3.2.1 it is clear t h a t t h e r e exists a l i n e a r m a p f l : X -+ R a n d c E R such t h a t H = {x : f l ( x ) = c}. As 0 ~ H it is n o t p o s s i b l e t h a t e = 0. Let f : X -+ R be defined b y f ( x ) = ~ f l ( x ) for any x in X. T h e n it follows t h a t H = { x : f ( x ) = 1}. We will show t h a t f is unique. Let g : X --+ R be a n o t h e r n o n z e r o linear f u n c t i o n such t h a t H = {x : g(x) = 1}. Let h be an a r b i t r a r y e l e m e n t in H. T h e n it is clear t h a t for any x in X , f ( x + ( 1 - f ( x ) ) h ) = 1 which i m p l i e s t h a t x + (1 - f ( x ) ) h E H. Therefore, g ( x + (1 - f ( x ) ) h ) = 1 from which it follows t h a t g(x) = f ( x ) for any x in X (because g(h) = 1). T h u s , f is unique. [] For t h e p u r p o s e s of the discussion below we will a s s u m e t h a t c, m l a n d m 2 which d e s c r i b e t h e line L in F i g u r e 3.1 are all n o n n e g a t i v e . T h e r e s u l t s for o t h e r cases will be s i m i l a r . C o n s i d e r the region A in F i g u r e 3.1 which is t h e region "below" t h e line L. As i l l u s t r a t e d earlier, L = {x : < x, x* > = c} where x* = ( m l , m 2 ) . C o n s i d e r a n y p o i n t x = ( x l , x 2 ) in region A. Such a p o i n t lies "below" t h e line L. T h u s if x ' = ( x l , x~) d e n o t e s the p o i n t on t h e line L which has t h e s a m e first c o o r d i n a t e as t h a t of x t h e n x '2 >_ x2 . As x ' is on the line L it follows t h a t < x ' , z * > = mix1 + m2x'2 = c. As m2 > 0 it follows
that < x, x* > = r n l z l + rnex2 _< c. In a similar manner it can be established that if < x, x* > < c then z lies "below" the line L, that is x C A. Thus the region A is given by the set {z :< x , z * >_< c} which is termed the negative half space of L. In an analogous manner it can be shown that the region B (which is the region "above" the line L) is described by {x :< x, x* >_> e}. This set is termed the positive half space of L. Thus the line L separates R e into two halves; a positive and a negative half. We generalize the concept of half spaces for an arbitrary normed vector space. D e f i n i t i o n 3.2.2 ( H a l f s p a c e s ) . Let (X, II" IIx) be a normed linear space and let z* : X --4 R be a bounded linear functon on X . Let S1 : = { x E X : < z , x ' >
c} are open and the sets {x 9 X : f ( x ) < c} and {x 9 X : f ( x ) >_ c} are closed in the norm topology for any c in R. Proof. The proof is left to the reader. [] It is intuitively clear that in R e if two convex sets C1 and Ce do not intersect then there exists a line in R e which separates the two sets (see Figure 3.4). In other words there exists x* in (R2) * and a constant c in R such that C1 lies on the positive half space of the line L = {x I < x, x* > = c} and C~ lies in the negative half space of L. T h a t is
C1 C {.v :< x,x* > > c}, and C2 c {z :< z, z* > < c}. The main focus of this section is to generalize this result to disjoint convex sets in a general normed vector space. In this regard we can immediately establish the following result. Suppose (X, II-IIx) is a normed vector space and let B := {xlllzll < 1} be the unit norm ball such that xo ~ i n t ( B ) . T h e n it is possible to separate B and z0 by a hyperplane. Indeed from T h e o r e m 2.2.4, it follows that there exists z* in X* such that IIz'll _ 1 and < z0, z" > =
IIz011x > 1. As IIx'll < 1 and Ilzitx < 1 it follows that < x, ~* > _ IIx'llll~llx < II~llx < X < II~011x for all x 9 Jut(B). Thus
Fig. 3.4. Separation of convex sets in ℝ².
int(~) c {x :< x,x" > < II~ollx), whereas
x0 E L := {~ :< ~, ~" > - - [Ix011x} Thus we have shown that it is possible to separate the interior of a unit norm ball and any element which does not belong to the interior of the unit n o r m ball. Minkowski's function is a norm like functional associated with a convex set which allows us to use a similar argument as developed above to separate disjoint convex sets. L e m m a 3.2.2 ( M i n k o w s k i ' s f u n c t i o n ) . Let K be a convex subset of a normed linear space (X,[[. [Ix) such that 0 E i n t ( K ) . For any x E X let p(z) := inf{A E R : $ > 0, such that z E M~'},
where A/~" :=- {X : X = Ak for some k E K } . Then p is a real valued continuous sublinear function which is non-negative. Proof. It is clear that p is non-negative. As 0 E i n t ( I ( ) , there exists an a > 0 in R such that if for any k E X , ]lkl[x < a implies that k E h'. Let x be ax any element in X such that ]lxi]x # O. Then [ ] ~ I i x < a and therefore 211~llx~ " Thus, for any z E X , p(x) < 211~llx < ~. a a shown that p is real valued and non-negative. e
Thus
we have
Let a E R be such that a > 0. Then
p(ax) = inf{A E R : A > 0 and ax E AN} =inf{a~
ER: ~ >0 andxE
~I(}
=
As 0 E i n t ( K ) it follows that p(0) = 0. Let z and y be elements in X. Note that ifp(z) < 1 then z E K. Indeed, if p(z) < 1 then there exists A E R and k E K such that 0 < A < 1 with z = Ak = A k + ( 1 - A ) 0 . As k and 0 both are in K and K is convex it follows that z is in K. Given any e > 0 let r , E R and ry E R be such that p(z) < r, < p(x) + ~ and p(y) < ry < p(y) + ~. As, 1 >
p(z) = P(r-~) we know that ~
E K. Similarly, ~
E K. Let
r := r~ + r~. From the convexity of K, Lr_ r ~z + r~ r yry E K. This implies that
l ( x +y) 9 K and therefore z + y 9 vii. Thus, p(x +y) 0 is arbitrary it follows that p(x + y) < c} ff C {z 9 X :< x,z* > < c} V C {z 9 X :< z, z* > = c},
where K denotes the closure of the set I( in the norm topology. Proof. We will first, prove the theorem when 0 9 i n t ( K ) . Let V = x0 + N where x0 is an element in X and N is a subspace of X. Let
M=N•Rxo
:={n+y:n 9
andy=ax0
for s o m e c ~ 9
For any element m in M let m = nm + f ( m ) x o . We now show that nm and f ( x ) are unique. Let m = nl + axo = n2 + ~xo where a and /3 in R with --n I a # /3 and nl and n2 are elements in M. Then it follows that x0 = n 2~_# and therefore z0 is in N. As N is a subspace we have - x 0 9 N. This imphes that 0 9 V which is not true because i n t ( K ) f3 V = {} and 0 9 i n t ( K ) . Thus nm and f ( m ) are unique for every m in M. Thus f defines a function on M. It can also be shown that f is a linear function. Note that
V = {m 9 M : f(m) =1}.
For any x in X let
p(x) = inf{A E R : x E AK}. p is the Minkowski's function of the convex set K. As int(K) ;3 V = {} it follows t h a t for all v E V, p(v) _> 1. Therefore, if v is in M and f(v) = 1 then p(v) > 1. For all m E M, with f ( m ) > 0, f(f-~-~}) = 1 which implies t h a t for all m in M with f ( m ) > 0, p ( ] - - ~ ) >_ 1. F r o m the sublinearity and the non-negativity of p (see L e m m a 3.2.2) it follows t h a t p(m) > f ( m ) for all m in M. From T h e o r e m 2.1.2 we know t h a t there exists a linear function F : X R such t h a t F(m) = f ( m ) for all m in M and F(x) ~_ p(x) for all x in X. It is clear t h a t F is continuous because p is continuous and F(x) ~ p(x) for all x in X. We r e n a m e F as x*. T h e n it is clear t h a t for every element k in int(K), p(k) < 1 and therefore < k, x* > < 1. Also, for every element k in K , < k,x* > < p(k) < ]. As F(v) = f(v) --- 1 for every element v in V it follows t h a t < v,x* > = 1 on V. Also, x* :~ 0 as V is not e m p t y . T h u s we have established the t h e o r e m when 0 E int(K). For the m o r e general case if k0 is in int(K) then let h " := { k -
k0: k E K } , and let Y' := {v - k o : v E V}.
W i t h these definitions we have t h a t 0 E int(K') and V' N int(K') = {} (as V M int(K) = {}). A p p l y i n g the result to K ' and V' the t h e o r e m in the general case follows easily with c : = < k0, x* > . [] 3 . 2 . 2 ( S e p a r a t i o n o f a p o i n t a n d a c o n v e x s e t ) . Let K be a convex subset of a normed linear space (X, [[. [Ix) with int(K) # {}. Let xo in X be such that xo q~ int(K). Then there exists a nonzero x* E X* such that Corollary
for allk i n i n t ( K ) and > < < x 0 , x * > for allk in K.
Proof. Let V := {x0}. T h e n V is a linear variety such t h a t V M int(K) = {}. From T h e o r e m 3.2.2 it follows t h a t there exists x* in X* and c E R such t h a t < x 0 , x* > = c, < k, x* > < c f o r a l l k i n i n t ( K ) and < k , x * > < c f o r a l l k in K . This proves the corollary. [] C o r o l l a r y 3.2.3. Let K be a convex subset of a normed linear space (X, I[. [ix) with int(K) r {}. Let xo E bd(K) := If \ int(K) := {x E K : x int(K)}. Then there exists a nonzero x* E X* such that < xo,x* > = s u p { < k,x* >: k E K } .
Proof. From Corollary 3.2.2 we know t h a t there exists x* in X* such t h a t
< <x0, x*>
for a l l k E I f .
(3.2)
As x0 E K we know that there exists a sequence {x,~} in K such that Ilx0 x,~[[x --+ O. From continuity of x* it follows that < x,~,x* >--+< xo, x* > . From equation 3.2 we conclude that < x0,x* > = s u p { < k,x* >: k E K}. [] The following corollary is often referred to as the Eidelheit separation result. C o r o l l a r y 3.2.4 ( S e p a r a t i o n o f d i s j o i n t c o n v e x s e t s ) . Let I(1 and K2 be convex subsets of a normed linear space (X, I1' IIx). Let int(I~x) 5s {} and suppose int(Ka) N K2 = {}. Then there exists a nonzero x* in X* such that s u p { < x,x* >: x E K1} _< inf{< x,x* >: x E K2}.
Proof. Let K : = [ ( 1 - I~'2 :---- {kl - k2 : kl E [ ( 1 and k2 E K2}. As int(/(1) f3 I(2 = {} it follows that 0 ~ i n t ( K ) . Also, i n t ( K ) 5s {} because int(Ifl)fqI(2 = {} and int(K1) 5s {}. From Corollary 3.2.2 there exists a nonzero x* in X* such that < k, x* > < 0 for all k in K. This implies that for any kl in h'l and for any k2 in Ks, < kl, x* > < < k~,x* > . This proves the corollary. []
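The separation results of this section are easy to visualize numerically. The sketch below (an added illustration, not part of the text; the two balls are arbitrary choices) builds a separating functional for two disjoint Euclidean balls in ℝ² along the segment joining their centres and checks that the support value of the first set does not exceed the infimum over the second, as in Corollary 3.2.4.

```python
import numpy as np

# Illustration (not from the text): separating two disjoint Euclidean balls in R^2.
c1, r1 = np.array([0.0, 0.0]), 1.0          # K1 = ball(c1, r1)
c2, r2 = np.array([3.0, 0.0]), 1.0          # K2 = ball(c2, r2)

# A separating functional can be taken along the segment joining the centres.
xstar = (c2 - c1) / np.linalg.norm(c2 - c1)

# For a Euclidean ball(c, r) and ||x*|| = 1:  sup over the ball of <k, x*> = <c, x*> + r.
sup_K1 = c1 @ xstar + r1
inf_K2 = c2 @ xstar - r2
print(f"sup over K1 = {sup_K1:.3f} <= inf over K2 = {inf_K2:.3f}")   # 1.0 <= 2.0
```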
3.3 Convex Optimization

The problem that is the subject of the rest of the chapter is the following problem:

μ = inf f(x) subject to x ∈ Ω,
where f : Ω → ℝ is a convex function on a convex subset Ω of a vector space X. Such a problem is called a convex optimization problem.

Lemma 3.3.1. Let f : (X, ||·||_X) → ℝ be a convex function and let Ω be a convex subset of X. If there exists a neighbourhood N in Ω of ω₀, where ω₀ ∈ Ω, such that for all ω ∈ N, f(ω₀) ≤ f(ω), then f(ω₀) ≤ f(ω) for all ω in Ω.

Fig. 3.5. Support hyperplane to a convex set K. The figure also illustrates the fact that the minimum distance from a point x to a convex set K is the maximum of the distances of the point from the supporting hyperplanes of the convex set.
Note that the minimum distance of a point x = (x₁, x₂) from a line L (see Figure 3.3) in ℝ², given by

inf_{y∈L} ||y − x||,

is equal to

(m₁x₁ + m₂x₂ − c)/||x*|| = (⟨x, x*⟩ − c)/||x*||,

where the equation of the line is given by m₁x₁ + m₂x₂ = c and x* = (m₁, m₂).

Definition 3.3.2 (Support functional). Let K be a nonempty convex subset of a normed linear space (X, ||·||_X). The support functional h_K : X* → ℝ ∪ {∞} is defined by

h_K(x*) := sup{⟨k, x*⟩ : k ∈ K}, for any x* in X*.
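As a numerical illustration (added here; the set and the point are arbitrary choices, not from the text), the following Python sketch computes the distance from a point to a box K in ℝ² both directly, by projecting onto K, and through the support functional, by maximizing ⟨x₀, x*⟩ − h_K(x*) over unit directions; the two values agree.

```python
import numpy as np

# K is the box [-1, 1]^2; its support functional is h_K(x*) = |x*_1| + |x*_2|.
def h_K(xstar):
    return np.sum(np.abs(xstar), axis=-1)

x0 = np.array([3.0, 0.5])                      # a point outside K

# Distance computed directly: project x0 onto the box.
proj = np.clip(x0, -1.0, 1.0)
dist_direct = np.linalg.norm(x0 - proj)

# Distance via the support functional: dist = max { <x0, x*> - h_K(x*) : ||x*||_2 <= 1 }.
# Since x0 is outside K the maximum is attained on the unit circle, so a fine grid
# of unit directions suffices for this illustration.
thetas = np.linspace(0.0, 2.0 * np.pi, 100001)
dirs = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)
dist_dual = np.max(dirs @ x0 - h_K(dirs))

print(f"direct distance = {dist_direct:.4f}, via support functional = {dist_dual:.4f}")
```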
Note t h a t for all k E K, < k,x* > < hg(x*). T h u s it is clear t h a t the h y p e r p l a n e L = {< x , x * > = hk(x'-)} divides the vector space into two halves such t h a t the convex set K lies entirely in one half of the space (see Figure 3.5). By inspection it Call be seen t h a t in R ~, the m i n i m u m distance of a point x from a convex set K is equal to the m a x i m u m of the distances of the point x from the s u p p o r t i n g hyperplanes. As the distance of the point x from the s u p p o r t i n g h y p e r p l a n e associated with h g ( x * ) is given by
= IIz0- k0llx IIz;ll.
Proof. We will first show that sup{< Xo,Z* > - h K ( x * ) : IIx*ll _< 1} >_ •. Let S : X --+ R be a real valued sublinear function defined by S ( z ) = for any z in X. Let Z := x 0 -
Ilxllx
K : = {y : there exists k E K with y = z0 - k}.
F r o m L e m m a 3.3.6 we know t h a t there exists x ; in X* such t h a t Ilx;[I _< 1 and i n f { < z,x• > : z E Z} = inf{iiziix : z E Z} = inf{ll~:0- kllx : k E K} = p. However, i n f { < z,x; > : z E Z} = i n f { < x0 - k , x ; >: k E h'} =<xo,z;>+inf{-:kEK}
= < x0, x; > - h u ( ~ ; ) .
Thus we have established that there exists z~ in X* such that IIz~ll ~ 1 and p = < z0, x~ > - h K ( z ~ ) . This implies that
sup{< ~0,z" > - h K ( ~ ' ) : I1~'11 _< X} > ~. Now we will show that/1 _> sup{< x0, z'* > - - h K ( x ' ) : I]:c*ll < l}. Let ,v" in X* be such that ]]x'll < 1. Then for any k 6 K, lifo - kllx >_ I < ~o - k, ~" > I >_ < xo, x" > - < k, ~* >
Therefore, inf{I]xo- kllx : k E K } >_ < xo, x* > - h K ( x ' ) . This holds for any x" in X* which satisfies IIz'll sup{< zo, x" > - h K ( Z ' )
: IIx'll S 1}.
However, we have established earlier that there exists x~ E X* such that ]]z~l I __< 1, and/~ = < zo, x~ > - h K (z~). Therefore,
= max{< ~o,x" > - h K ( x ' ) :
I1~*11 _< 1),
where we ha.ve replaced the term sup by max in the right hand side of the equation. Let k0 in I~ and x~ in X" be such that I1~;11_ 1 and = lifo - k l l x = < ~0, x~ > - h K ( ~ ) .
From the definition of hK we have that < k0, x~ > _< hK(x~) which implies that < z0 - k0, ~=; > ~ < x0,m~ > -hz,-(x~) = ~ - IIx0 - k011x. As IIx~ll ~ 1 it follows that < Xo - ko,z~ > > IIx0 - k0[lx I1~*11. Thus < zo - k o , x ; > = Ilz0 - k011x 11~'11. This proves the theorem. [] 3.3.2
Kuhn–Tucker Theorem

Consider the convex optimization problem

w(z) = inf f(x) subject to x ∈ Ω, g(x) ≤ z.

We will obtain information about w(0) by analyzing w(z). We have shown that w(z) is a decreasing function of z (see Lemma 3.3.4) and that it is a convex function (see Lemma 3.3.3). It can be visualized as illustrated in Figure 3.6. As w(z) is a decreasing function it is evident that the tangent to the curve at (0, w(0)) has a negative slope (see Figure 3.6). Thus the tangent can be characterized by a line L with the equation:
Fig. 3.6. Illustration of w(z).
w(z) + ⟨z, z*⟩ = c, where z* > 0. Also, note that if we change the coordinates such that L becomes the horizontal axis and its perpendicular the vertical axis, with the origin at (0, w(0)) (see Figure 3.6), then the function w(z) achieves its minimum at the new origin. In the new coordinate system the vertical coordinate of the curve w(z) is given by the distance of (z, w(z)) from the line L. This distance is given by

s(z) = (w(z) + ⟨z, z*⟩ − c) / ||(1, z*)||.

Thus s(z) achieves its minimum at z = 0. This implies that

w(0) = min_{z∈Z} {w(z) + ⟨z, z*⟩}
     = min_{z∈Z} { inf{f(x) : x ∈ Ω, g(x) ≤ z} + ⟨z, z*⟩ }
     = inf{f(x) + ⟨z, z*⟩ : x ∈ Ω, z ∈ Z, g(x) ≤ z}
     ≥ inf{f(x) + ⟨g(x), z*⟩ : x ∈ Ω}.

The first inequality is true because z* > 0 and g(x) ≤ z. Also,

inf{f(x) + ⟨z, z*⟩ : x ∈ Ω, z ∈ Z, g(x) ≤ z} ≤ inf{f(x) + ⟨g(x), z*⟩ : x ∈ Ω},

because g(x) ≤ g(x) is true for every x ∈ Ω. Thus we have

w(0) = inf{f(x) + ⟨g(x), z*⟩ : x ∈ Ω}.

Note that the above equation states that a constrained optimization problem, given by the problem statement of w(0), can be converted to an unconstrained optimization problem as given by the right hand side of the above equation. We make these arguments more precise in the rest of this subsection.
Fig. 3.7. Figure for Lemma 3.3.7.
Lemma 3.3.7. Let (X, ||·||_X) and (Z, ||·||_Z) be normed vector spaces with Ω a convex subset of X. Let P be a positive convex cone defined in Z. Let Z* denote the dual space of Z with the positive cone P^⊕ associated with P. Let f : Ω → ℝ be a real valued convex functional and g : X → Z be a convex mapping. Define

μ₀ := inf{f(x) : g(x) ≤ 0, x ∈ Ω}.   (3.4)

Suppose there exists x₁ ∈ Ω such that g(x₁) < 0, and suppose μ₀ is finite. Then there exists z₀* ≥ 0 such that

μ₀ = inf{f(x) + ⟨g(x), z₀*⟩ : x ∈ Ω}.   (3.5)

Furthermore, if there exists x₀ such that g(x₀) ≤ 0 and μ₀ = f(x₀) + ⟨g(x₀), z₀*⟩, then ⟨g(x₀), z₀*⟩ = 0 and x₀ minimizes f(x) + ⟨g(x), z₀*⟩ over all x ∈ Ω.

This implies that for any z ∈ Z, ⟨z, z*⟩ ≥ 0. As Z is a vector space (which implies −⟨z, z*⟩ ≥ 0) it follows that
< z, z" > = 0 for all z E Z. T h u s z* = 0. This contradicts (z, s) :/: (0, 0) and therefore, s > 0. 7,* Let z~ = -Y" Dividing inequality (3.7) by s we have k < z,z~ > + r > - for all ( z , r ) E A and
(3.12)
8
dividing inequality (3.8) by s we have k < z,z~ > + r _< - for all ( z , r ) E B.
(3.13)
8
In particular, as (z, p0) E B for all z E - P it follows from inequality (3.13) that k < z,z~ > < - - p 0 for a l l z E - P . s This implies t h a t < z,z~ > _< 0 for all z E - P . Indeed, if for s o m e Zl E - P , < -~ 1 , z 0* > > 0 then < c~zl,z* >--+ oo as a -4 oo which contradicts the fact that < a z l , z* > is bounded above by k_ $ _/10. T h u s we conclude t h a t
z; e P c . Also, as (g(x), .f(x)) for x E 12 is in A it follows from (3.12) t h a t
< g(x), z~) > +f(x) > _k for all x E 12 and
(3.14)
8
as (0,p0) E B it folllows from (3.13) t h a t k ]~o < - for all (z, r) E/3.
(3.15)
8
F r o m inequalities (3.14) and (3.15) we conclude t h a t i n f { < g(x), z~ > +f(x) : x E 12} _> t o .
(3.16)
Suppose x E 12 and g(x) < 0 (i.e. x is feasible), then
f(x)+ < g(x), z~ >< f(x),
(3.17)
because ~o -* E p C . Therefore, we have i n f { f ( x ) + < g(x),z; >: x E 12} < i n f { f ( x ) + < g(x),z; >
: . e 12,a(.) < o} _< inf{f(x) : x E 12,g(x) _< f(~o) = t~o. T h e first inequality follows from e q u a t i o n (3.18) and the second inequality l S true because z~ E p e and g(zo) = 0. []
Lemma 3.3.8. Let X be a Banach space, Ω be a convex subset of X, Y be a finite dimensional normed space, and Z be a normed space with positive cone P. Let Z* denote the dual space of Z with the positive cone P^⊕. Let f : Ω → ℝ be a real valued convex functional, g : X → Z be a convex mapping, H : X → Y be an affine linear map and 0 ∈ int({y ∈ Y : H(x) = y for some x ∈ Ω}). Define

μ₀ := inf{f(x) : g(x) ≤ 0, H(x) = 0, x ∈ Ω}.   (3.19)

Suppose there exists x₁ ∈ Ω such that g(x₁) < 0 and H(x₁) = 0, and suppose μ₀ is finite. Then there exist z₀* ≥ 0 and y₀* such that

μ₀ = inf{f(x) + ⟨g(x), z₀*⟩ + ⟨H(x), y₀*⟩ : x ∈ Ω}.   (3.20)
Proof. Let 121 : = { x :
x E 12, g ( x ) = O } .
A p p l y i n g L e m m a 3.3.7 to 121 we know t h a t there exists z; E p e such t h a t P0 = i n f { f ( x ) + < g(x),z~ >: x E 12I}.
(3.21)
Consider the convex subset, H(Y2) := { y E Y
: y=
H(x) for s o m e x E
12}
of Y. For y E H(12) define
k(y) := i n f { f ( x ) + < g ( x ) , z ; >: x E I"2, g ( x ) = y}. We now show t h a t k is convex. Suppose y,y' E H(12) and :c,x ~ are such t h a t H ( x ) = y and H(x') = y'. Suppose, 0 < A < 1. We have, A ( f ( x ) + < g(x), z; >) -{- (1 - A ) ( I ( x ' ) + < g ( x ' ) , z; > ) >__ f()~x + (1 - )~)z') q- < g()~x q(1 ,~)x'), z~ >>_ k(Ay + (1 - A)y'). (the first inequality follows f r o m the convexity of f and g. T h e second inequality is true because H ( A x + (1 A)x') = Ay + (1 - )~)y'.) T a k i n g i n f i m u m on the left hand side we o b t a i n )~k(y) + (1 - )~)k(y') >_ k(Ay + (1 - A)y'). This proves t h a t k is a convex function. We now show t h a t k : H(12) --+ R (i.e. we show t h a t k(y) > - o c for all y E H(12)). As, 0 E int[H(12)] we know t h a t there exists a n , > 0 such t h a t if IlYll < e then y E H(12). Take any y E H(12) such t h a t y r 0. Choose A, b/ such t h a t -
)~ =
e 2-~
and y~
= -)~Y"
This implies t h a t y~ E H(12). Let, B = X--~S" We have (1 -
+ B y = 0.
Therefore, from convexity of the function k we have
~k(y) + (1 - ~)k(y') > k(O) = #o. Note that tt0 > - c r by assumption. Therefore, k(y) > - o a . Note, that for all y E H(12), k(y) < oc. This proves that k is a real valued function. Let [k, H(12)] be defined as given below [k,H(12)] :-- {(r,y) E R • Y : y E H(12), k(y) 0 such that y E H(12) and Ilyll < 5 implies that Ik(y) - k(0)l < ,'.
This means that if y E H(12) and IlYll + < H ( x ) , y >: x E g2 } and the maximum is achieved for some z~ > O, z~ E Z*, yo E Y . Furthermore if the infimum in (3.28) is achieved by some xo E 12 then < g ( x 0 ) , z ; > + < H(xo),Y0 > = 0,
(3.30)
and xo m i n i m i z e s f ( x ) + < g(x), z~ > + < H(x), Yo >,
over all x E 12.
(3.31)
Proof. Given any z* ≥ 0, y ∈ Y we have

inf_{x∈Ω} {f(x) + ⟨g(x), z*⟩ + ⟨H(x), y⟩} ≤ inf{f(x) + ⟨g(x), z*⟩ + ⟨H(x), y⟩ : x ∈ Ω, g(x) ≤ 0, H(x) = 0} ≤ inf{f(x) : g(x) ≤ 0, H(x) = 0} = μ₀.

Therefore it follows that max{φ(z*, y) : z* ≥ 0, y ∈ Y} ≤ μ₀. From Lemma 3.3.8 we know that there exist z₀* ∈ Z*, z₀* ≥ 0, y₀ ∈ Y such that μ₀ = φ(z₀*, y₀). This proves (3.29).
Suppose there exists x₀ ∈ Ω with H(x₀) = 0 and g(x₀) ≤ 0 which achieves the infimum in (3.28). Then

f(x₀) = μ₀ ≤ f(x₀) + ⟨g(x₀), z₀*⟩ + ⟨H(x₀), y₀⟩ ≤ f(x₀).

Therefore we have ⟨g(x₀), z₀*⟩ + ⟨H(x₀), y₀⟩ = 0 and μ₀ = f(x₀) + ⟨g(x₀), z₀*⟩ + ⟨H(x₀), y₀⟩. This proves the theorem. □

We refer to (3.28) as the Primal problem and (3.29) as the Dual problem.

Corollary 3.3.1 (Sensitivity). Let X, Y, Z, f, H, g, Ω be as in Theorem 3.3.2. Let x₀ be the solution to the problem
minimize f(x) subject to x E 12, H ( x ) = O, g(x) + < t[(xo),yo > < f ( x ) + < g(x) - zo,z~ > + < H ( x ) , y o > . In particular we have
f ( x o ) + < g(xo) -- zo, z; > + < tt(xo),Yo > f(Xl)-]-
< .q(Xl) -- Zo,Z; • + < ~ t ( X l ) , Y 0
> .
From T h e o r e m 3.3.2 we know that < g(xo) - z0, z~ > + < H(xo), Yo > = 0 and H ( x l ) = 0. This implies f(xo)--f(xl)
_~< g ( X l ) - - z 0 , z ~ > _~< z l - - z 0 , z ~ > .
A similar argument gives the other inequality. This proves the corollary.
[]
4. P a r a d i g m
for Control
Design
We present notions of stability, causality and well-posedness of interconnections of systems. The main part of this chapter focusses on the parametrization of all closed loop maps that are achievable through stabilizing controllers.
4.1 Notation
and
Preliminaries
We will generalize the gp space that we introduced in Chapter 2. Let t~ denote the space of all vector-valued real sequences taking values on positive integers that is g~ = {x: x = ( x l , x 2 , . . . , x n ) with xi 6 gp}. For any x in tm let
[xi(k)] v
[[x[[v = \k=O
Ilxll~
=
sup
1 _< p < oo and
i=1
max
Ix~(k)l
where x = ( x l , x = , . . . , x n ) and xi = (xi(O),xi(1),...) with zi(k) E R. Let
e~ :-- {xlx ~ e~, II~llp< oo}. All the results that were established for gp spaces in Section 2.4 hold for the ( q , II.llv) spaces. We often refer to e" as a signal-space. Let g;'• denote the spaces of m x n matrices with each element of the matrix in s Let Pk denote the truncation operator on gmxn which is defined by
Pk(x(O), x(1), x ( 2 ) , . . . ) = (x(0), x ( 1 ) , . . . , x(k), O, 0,...). Let S denote the shift map from t~ to ~ defined by
S(x(O), x(1), x ( 2 ) , . . . ) = (0, x(0), x(1), x(2), x ( 3 ) , . . . ) . D e f i n i t i o n 4.1.1 ( C a u s a l i t y ) . A linear map 7- : ~
causal if P t T = PtTPt for all t.
-4 (m is said to be
T is strictly causal if P_t T = P_t T P_{t−1} for all t, where P_t is the truncation operator.

Definition 4.1.2 (Time invariance). A map T : ℓⁿ → ℓᵐ is time invariant if ST = TS, where S is the shift operator.

Let T be a linear map from (ℓ_p^n, ||·||_p) to (ℓ_p^m, ||·||_p). The p-induced norm of T is defined as

||T||_{p-ind} := sup_{||x||_p ≠ 0} ||Tx||_p / ||x||_p.

We often refer to a map from a signal space ℓⁿ to another signal space ℓᵐ as a system.

Definition 4.1.3 (Stability). A linear map T : (X, ||·||_X) → (Y, ||·||_Y) is said to be stable if it is bounded. T : (ℓ_p^n, ||·||_p) → (ℓ_p^m, ||·||_p) is said to be ℓ_p stable if it is bounded.

Example 4.1.1. Let T : ℓ → ℓ be a linear operator such that y = Tu is defined as
\[
\begin{pmatrix} y(0) \\ y(1) \\ y(2) \\ \vdots \end{pmatrix} =
\begin{pmatrix} T(0) & 0 & 0 & \cdots \\ T(1) & T(0) & 0 & \cdots \\ T(2) & T(1) & T(0) & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}
\begin{pmatrix} u(0) \\ u(1) \\ u(2) \\ \vdots \end{pmatrix},
\]

where y = (y(0), y(1), y(2), . . .), u = (u(0), u(1), u(2), . . .) and T(j) ∈ ℝ for all j = 0, 1, . . . . Let t be any positive integer. Then it can be verified that P_t T P_t u = P_t T u for any u ∈ ℓ. Thus T is causal. If T(0) = 0 then it can be verified that P_t T P_{t−1} = P_t T. In this case T is strictly causal. It also follows that for any u ∈ ℓ, STu = TSu. Thus T is a time invariant map.
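A finite-dimensional sketch of the structure just described (added for illustration; the impulse response values are arbitrary): a lower-triangular Toeplitz matrix represents a causal, time-invariant convolution map, and the truncation and shift operators behave as claimed.

```python
import numpy as np

# Illustration (not from the text): a causal, time-invariant convolution map on
# truncated sequences, represented by a lower-triangular Toeplitz matrix.
N = 6
T_imp = np.array([1.0, -0.5, 0.25, 0.0, 0.1, 0.0])      # impulse response T(0), T(1), ...

# Convolution matrix: T[i, j] = T_imp[i - j] for i >= j, and 0 otherwise.
T = np.zeros((N, N))
for i in range(N):
    for j in range(i + 1):
        T[i, j] = T_imp[i - j]

def P(t, x):
    """Truncation operator P_t: keep samples 0..t, zero out the rest."""
    y = np.zeros_like(x)
    y[: t + 1] = x[: t + 1]
    return y

S = np.eye(N, k=-1)                                       # shift operator (one-step delay)

u = np.random.default_rng(1).standard_normal(N)
t = 3
print("causality  P_t T P_t u == P_t T u :", np.allclose(P(t, T @ P(t, u)), P(t, T @ u)))
print("time inv.  S T u == T S u         :", np.allclose(S @ T @ u, T @ S @ u))
```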
Definition 4.1.4 (Convolution maps). T : ℓⁿ → ℓᵐ is a linear, time invariant, causal, convolution map if and only if y = Tu is given by

\[
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix} =
\begin{pmatrix} T_{11} & T_{12} & \cdots & T_{1n} \\ T_{21} & T_{22} & \cdots & T_{2n} \\ \vdots & \vdots & & \vdots \\ T_{m1} & T_{m2} & \cdots & T_{mn} \end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix},
\tag{4.1}
\]

where y = (y₁, y₂, . . . , y_m) ∈ ℓᵐ and u = (u₁, u₂, . . . , uₙ) ∈ ℓⁿ, and each T_{ij} : ℓ → ℓ is described by

\[
T_{ij}\,u_j =
\begin{pmatrix} T_{ij}(0) & 0 & 0 & \cdots \\ T_{ij}(1) & T_{ij}(0) & 0 & \cdots \\ T_{ij}(2) & T_{ij}(1) & T_{ij}(0) & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}
\begin{pmatrix} u_j(0) \\ u_j(1) \\ u_j(2) \\ \vdots \end{pmatrix},
\]
with T_{ij}(k) ∈ ℝ for all k = 0, 1, . . . . The sequence {T_{ij}(k)}_{k=0}^{∞} is also called the impulse response of the system T_{ij}. The linear map T_{ij} can be identified with the sequence {T_{ij}(k)}_{k=0}^{∞} ∈ ℓ. Thus, with some abuse of notation, T_{ij} can denote either the map or the sequence {T_{ij}(k)}, depending on the context; the same convention is used for T. The operation given by Equation (4.1) is often written as y_i = T_{ij} ∗ u_j.

Lemma 4.1.1. Let T : ℓ → ℓ be a linear, time invariant, causal, convolution map. Let {T(k)} denote its impulse response. Then

||T||_{∞-ind} = Σ_{k=0}^{∞} |T(k)|.

Proof. Note that for any u ∈ ℓ_∞ with ||u||_∞ ≤ 1, |(Tu)(t)| = |Σ_{k=0}^{t} T(k)u(t − k)| ≤ Σ_{k=0}^{∞} |T(k)|, so that ||T||_{∞-ind} ≤ Σ_{k=0}^{∞} |T(k)|. Conversely, given any t, the input u defined by u(t − k) = sgn(T(k)) for 0 ≤ k ≤ t satisfies ||u||_∞ ≤ 1 and (Tu)(t) = Σ_{k=0}^{t} |T(k)|; letting t → ∞ shows that ||T||_{∞-ind} ≥ Σ_{k=0}^{∞} |T(k)|. This proves the lemma. □
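Lemma 4.1.1 can be checked numerically on a finite horizon: the worst-case bounded input is the sign pattern of the time-reversed impulse response. The sketch below (an added illustration; the impulse response is an arbitrary choice) compares the output peak it produces with the ℓ1 norm of the impulse response.

```python
import numpy as np

# Illustration of Lemma 4.1.1 on a finite horizon (not from the text).
T_imp = np.array([0.8, -0.4, 0.3, -0.1, 0.05])        # impulse response of a causal LTI map
N = len(T_imp)

ell1 = np.sum(np.abs(T_imp))                           # claimed infinity-induced norm

# Worst-case input of magnitude one: u(k) = sign(T(t - k)), which maximizes |y(t)| at t = N - 1.
u_worst = np.sign(T_imp[::-1])
y_peak = np.sum(T_imp[::-1] * u_worst)                 # y(t) = sum_k T(t - k) u(k)

print(f"l1 norm of impulse response : {ell1:.4f}")
print(f"peak output for worst input : {y_peak:.4f}")   # equals the l1 norm
```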
Therefore the infimum in (5.5) is a minimum.
Proof. We denote the feasible set of our problem by Φ := {φ ∈ ℓ₁ : Aφ = b and ⟨φ, φ⟩ ≤ γ}. ν_γ < ∞ because γ > μ_∞, and therefore the feasible set is not empty. Let B := {φ ∈ ℓ₁ : ||φ||₁ ≤ ν_γ + 1}. It is clear that

ν_γ = inf_{φ ∈ Φ ∩ B} ||φ||₁.

Therefore, given i > 0 there exists φᵢ ∈ Φ ∩ B such that ||φᵢ||₁ ≤ ν_γ + 1/i. B is a bounded set in ℓ₁ = c₀*. It follows from the Banach–Alaoglu result (see Theorem 2.3.1) that B is W(ℓ₁, c₀) compact. Using the fact that c₀ is separable we know (see Theorem 2.3.2) that there exists a subsequence {φ_{i_k}} of {φᵢ} and φ₀ ∈ B such that {φ_{i_k}} → φ₀ in the W(ℓ₁, c₀) sense, that is, for all v in c₀,

⟨v, φ_{i_k}⟩ → ⟨v, φ₀⟩ as k → ∞.   (5.6)

Let the jth row of A be denoted by a_j and the jth element of b by b_j. Then, as a_j ∈ c₀, we have

⟨a_j, φ_{i_k}⟩ → ⟨a_j, φ₀⟩ as k → ∞.   (5.7)

As A(φ_{i_k}) = b we have ⟨a_j, φ_{i_k}⟩ = b_j for all k and for all j, which implies ⟨a_j, φ₀⟩ = b_j for all j. Therefore we have A(φ₀) = b. As ℓ₂ ⊂ c₀ we have from (5.6) that for all v in ℓ₂,

⟨v, φ_{i_k}⟩ → ⟨v, φ₀⟩ as k → ∞,   (5.8)

which shows that φ_{i_k} → φ₀ in W(ℓ₂, ℓ₂). Also, from the construction of the φ_{i_k} we know that ||φ_{i_k}||₂ ≤ √γ. From Theorem 2.3.1 (Banach–Alaoglu theorem) we conclude that ⟨φ₀, φ₀⟩ ≤ γ, and therefore we have shown that φ₀ ∈ Φ. Recall that the φ_{i_k} were chosen so that ||φ_{i_k}||₁ ≤ ν_γ + 1/i_k for all k. Therefore ||φ₀||₁ ≤ ν_γ. As φ₀ ∈ Φ (which is the feasible set) we have ||φ₀||₁ = ν_γ. This proves the theorem. □

5.2.2 Structure of Optimal Solutions
where
~(y,, y2) := ~ f {11r
+ yl(< r r > - ~ ) + < b -
Ar
Y2 >}.
Proof. We will apply Theorem 3.3.2 (Kuhn-Tucker-Lagrange duality theorem) to get the result. Let X, $?, Y, Z in Theorem 3.3.2 correspond to ~ea,gl, R" and R respectively. Let g(r : = < r r > -% H ( r := b - Ar With this notation we have Z* = R. A has full range which implies 0 E int[range(H)]. 7 > p~o and therefore their exists r such that < r r > - 7 < 0 and H(r = 0. Therefore all the conditions of Theorem 3.3.2 are satisfied. From Theorem 3.3.2 (KuhnTucker-Lagrange duality theorem) we have u-y:
max
inf{Jlr162162162
yt )_O,y2ER'* CElx
This proves the lemma. The right hand side of (5.9) is the dual problem.
}. []
5.2 Optimal Solutions and their Properties Lemma
The dual problem is given by :
5.2.2.
m a x { ~ ( y l , Y2) : Yl >_ 0 and Y2 E tt'~},
where ~(Yl,
89
(5.10)
Y2) is
r162
>}
+ y x ( < r 1 6 2> - 7 ) + < b, y2 > - < r
v(i) is defined by v(i) := A* y2(i). Proof. Let Ya _> 0, Y2 E R '~. It is clear that inf {114)111+ Yl(< 4),8 > - 7 ) +
CEll
=
< b - A t , y2 >}
inf {114)111 + Y l ( < 4), 4) ~> --7)-}- < b, .I]2 > - < r v > } .
Suppose 4) E gl and there exists i such t h a t r < 0 then define 4)1 E gi such t h a t 4)l(j) = 4)(j) for all j # i and r -- 0. Therefore we have IIr + yl(< r > -7)+ < b, y2 > - < 4),v > k II4111~+ y l ( < 4)1,4)1 > --7)+ < b, y2 > - < 4)1,v >. This shows that we can restrict 4) in the infimization to satisfy r >_ O. This proves the lemma. [] T h e following theorem is the main result, of this subsection. It shows t h a t any solution of (5.5) is a finite impulse response sequence. 5.2.2. Define T:={4) E gi : there exists L* with r L" }. The dual of the problem is given by.
= 0 if i >
Theorem
max{~(yl,Y2) : Yl > 0, y2 E /in},
(5.11)
where ~(Yl, Y2) is r
inf
{llr162162
y2>-}.
v(i) defined by v(O = A'y2(i). Also, any optimal solution r of (5.5) belongs to T. Proof. Let y~ >_ 0, y~ E R '~ be the solution to max yI >>O,y2ER"
r162
+ y~(< 4),r > - 7 ) + < b
m4),y2 >}. * At
It is easy to show t h a t there exists L* such that v'Y(i) := (A y2)(i) satisfies Iv*r(i)l < 1 if/>_ L'. If 4)(i)v'~ ( i) > 0 for all i then, 114)11~ + y ; ( < r 1 6 2 > - 7 ) + < b,y~ > - < r ~ > (x) =
~--~{1r
+ y~ (r
2- r
- y~7+ < y~,b >
i=0 oo
= Elr
- vW(i)) + y~(r
~ } - Y1~7+ < y~, b >
- v'~(i)) + y~(r
2}
i=0 L*
= E{r i=0
+
)_~ { r 1 6 2
2} - Y ~ 7 + < yg,b > .
i=L*+I
Suppose [v"(i)[ < 1. Then we have, r
- vY(i)) + y~(r
2> 0
and equals zero only if r = 0. Therefore, in the infimization we can restrict r = 0 whenever IvY(i)[ < 1. As IvY(i)[ < 1 for all i _> L" it follows t h a t we can restrict r to 7- in the infimization. In T h e o r e m 5.2.1 we showed t h a t there exists a solution r to the primal. From T h e o r e m 3.3.2 (Kuhn-TuckerLagrange duality theorem) we have t h a t r minimizes
Iir
+y~(< r162> -7)+
< b, y'~ > - < r v y >,
over all r in gl. From the previous discussion it follows t h a t r proves the theorem.
G 7-. This []
5.2.3 An A Priori Bound on the Length of Any Optimal Solution
In this subsection we give an a priori b o u n d on the length of any solution to (5.5). First we establish the following three lemmas. Lemma Ar
5.2.3. Let 7
inf
r162
#~,
:=
ml
Ar
inf
r162162
IIr
and Vy : =
y:, y~ represent a dual solution as obtained in (5.10).
I[r
Then y~ < M.y where My : = --
m~ .
~[ -- tt oo
Proof. Let 7 > 3'1 > # ~ and vy I '-.-
Ar
inf
r 1 6 2 ~"h
IIr
Let Yly, Y2Yrepresent
a dual solution as o b t a i n e d in (5.10). From Corollary 3.3.1 (a sensitivity result) we have < 7 -- 3'1, Y~ > ~ Vy, -- /JAr ~ Vy, _< m l , which implies t h a t y~ < '~: . This holds for all 7 > 71 > /loo. Therefore -"7-')'1 My : = y -m;,ur is an a priori b o u n d on y~. This proves the lemma. [] 5.2.4. Let r be a solution of the primal (5.5). Let y~, y~ represent the corresponding dual solution as obtained in (5.10). Let v y : = A*y~ then,
Lemma
y~r
= . - ( i2) - 1 i f vY(i) > 1 _ ~'(i)+1 --
=
2
0
i f vY(i) < - 1 i f IvY(i)[_< 1. 2rn xy/'~
Also, IlvY[[oo 0 then as r minimizes L(r we have r = 0. We have already shown that if [v'r(i)l < 1 then r = 0. Therefore, Y7r -= 0 if Iv~(i)l _< 1. Suppose v~(i) > 1 then it is easy to show that there exists r such that r > 0 and r - v'r(i)) + y7(r 2 < 0. As any optimal minimizes n(r we know that r ~ ( i ) ) - v ~ (i))+y7 (r 2 < 0, which implies r > 0 and therefore 1 - v'r(i) + 2y7r ) = 0. This implies that Y7r = ,'(i)-~ Similarly the result follows when v~(i) < - 1 . Therefore, 2 " I[v~[l~ < 2M~llr
< ~
--
--
IIr
"y--,u~o
follows from the fact that < r is an a priori upper bound on
r
< ~"F+I. --
"Y--P,
o o -
The last inequality
> < 7. This implies that c~-r := 2"~'v~ + 1 This proves the lemma. [] --
~ - P o o
IIv lloo.
--
L e m m a 5.2.5. If y~ E R n is such that IIA*y?llo~ < a n then there exists a positive integer L* independent of y2 such that I(A* y2)(i)l < 1 for all i >_ L*.
Proof. Define Zl
A)=
Z2
Z3 . . .
L" ' L.' L.' ' L. . . \
Zn 1
I
z3 ... z. /
A*L : R n -+ R L+I. With this definition we have A~o = A*. Let Y2 E R'* be such that [IA*y211~ < a~. Choose any L such that L > ( n - 1). As zl,i = 1 , . . . , n are distinct A~, has full column rank. A~ can be regarded as a linear map taking (R n, ]].]]1) --4 (R L+I, [I-[[oo). As A~ has full column rank we can define the left inverse of A~,, (A~,) -l which takes (R L+I, ]].1]o~) --+ (R", [I.[]l). Let the induced norm of (A~,) -l be given by H (A~) -l Ho~,l. Y2 e R n is such that IIA * y2l[oo < _ a-y and therefore IIALy2II+ * _< a~. It follows that, Ily2lll = - o e .
Therefore we can restrict v in the maximization to satisfy Ilvll~ ___ 1. From arguments similar to that of the proof of Theorem 5.2.3 r = 0 whenever Iv(i)[ < 1. Therefore the infimum term is zero whenever Ilvll~ _< 1. This implies that the dual problem reduces to: max < C 1, v >, ven,,~ge(a'),llvl[~__ O, y~ E R '~ be a solution in (5.10). If y~ > 0 then
the solution r
of (5.5) is unique.
Proof. Let L ( C ) : = I1r + y~'(< r162 > - 7 ) + < b - AC, y~ >. From Theorem 3.3.2 (Kuhn-Tucker-Lagrange duality theorem) we know that C0 minimizes L(C), C E ~1. If y~ > 0 then it is easy to show that L(C) is strictly convex in ~I- From the L e m m a 3.3.2 it follows that r is unique. This proves the lemma. [] The main result of this subsection is now presented. Theorem
5.3.1. Define S := {r : Ar = b and 11r
= u ~ } , m2 := inf < CES
r C > - The following is true: 1) I f 7 > rn2 then problem (5.5) is equivalent to the standard ~x problem
whose solution is possibly nonunique. 2) If #o~ < 7 < rn~ then the solution to (5.5) is unique. Proof. Suppose m2 < 7 then there exists r E s such that Ar = b, I1r = u ~ and < Cl, Cl > 5 ")'. This implies that t,~ = inf IIC111 < uoo. T h e Ar
~b,r162
other inequality is obvious. This proves 1). Let/zoo < 7 < m2 and suppose yl~ = 0 then we have shown in L e m m a 5 . 3 . 1 that v7 = u ~ . Therefore there exists Cl e el such that [1r = u ~ , Ar = b and < r162 > < 7 < m2. This implies that r E S and < r162 > < m2 which is a contradiction. Therefore y~ > 0. From L e m m a 5.3.2 we know that C0 is unique. This proves 2). [] The above theorem shows that in the region where the constraint level on the 7"/2 is essentially of interest (i.e., active) the optimal solution is unique.
5.3.2 Continuity of the Optimal Solution
Following is a theorem which shows that the ~1 norm of the optimal solution is continuous with respect to changes in the constraint level 7. Theorem
5.3.2. Let v.y :=
Ar162
inf
~?
116111- Then v.~ is a continuous
function of 7 on (/aoo, oo). Proof. If 7 E ( ~ , ~ ) then it is obvious that "~ ~ i n t { d o m ( v ~ ) } where dom(v~) := {7 : - o o < v-~ < oo} is the domain of v-~. From L e m m a 3.3.3 we know that v~ is a convex function of 7. The theorem follows from the fact that every convex function is continuous in the interior of its domain (see L e m m a 3.1.2). [] Now we prove that the optimal solution is continuous with respect to changes in the constraint level in the region where the optimal is unique. Theorem
Then r
5.3.3. Let poo < 7 ~ m2. Let 6~ represent the solution of
-4 6~ in the norm topology if 7k --4 7.
Proof. Let ml :=
min
ACmb, 7/2. Let L* represent the upper bound on the length of 6~- Then as the upperbound is nonincreasing (see Theorem 5.2.3) we can assume that r E R L ' . L e t B := { x : x E R L" : ]]x]]l _< m l } then we have 6-~k E B. Therefore there exists a subsequence Ok, of 6~k and 61 such that 6k, ~ 61 as i -4 cx~ in (R L" , 11.111).
(5.16)
It is clear as in the proof of T h e o r e m 5.2.1 that A61 = b as ACk, = b for all i. Also,
11611t22 _< 7. This implies that 61 is a feasible element in the problem of v-y. From T h e o r e m 5.3.2 we have ]]6k,]]1 -4 v~. From (5.16) we have ]]61]]1 = v~. From uniqueness of the optimal solution we have 61 = 6~. From uniqueness of the o p t i m a l it also follows that 6 ~ -4 6~. This proves the theorem. []
5.4 An Example

In this section we illustrate the theory developed in the previous sections with an example. Consider the SISO plant

P̂(λ) = 1 / (λ − 2),
(5.17)
where we are interested in the sensitivity m a p r := ( I - P K ) - I . Using Youla p a r a m e t r i z a t i o n we get that all achievable transfer functions are given by r = (I - / 5 / ~ - ) - ~ = 1 - (A - 1)0 where ~) is a stable m a p . T h e m a t r i x A and b are given by 1
1
A:(1,2,22
. . . . ),
b = 1.
It is easy to check t h a t for this p r o b l e m ~r162:= inf{llr
: r 9 t?~ and Ar : b} : 0.75,
and rnl :--
I1r
inf
with the o p t i m a l solution r r
=
~
= 1.5,
given by
0.75.t --~-A .
t----O
P e r f o r m i n g a s t a n d a r d gl o p t i m i z a t i o n [3] we obtain u ~ :-- i n f { l l r
: r 9 g~ and A r = b} = i
and m2 :=
inf
Ar162
11r
= x,
with the o p t i m a l solution r = 1. We choose the constraint level to be 0.95. Therefore, ~-y = 2ml,/~ + 1 = 15.62. For this e x a m p l e n = 1 and zl = 1 L" the a priori b o u n ~ o n the length of the o p t i m a l is chosen to satisfy max
k=l,...,n
Izkl L" II (A~) -t I1~,1 a~ < 1.
(5.18)
where L is any positive integer such t h a t L > (n - 1). We choose L = 0 and therefore AL = 1 and ]] (A~) -~ Hob,l= 1. We choose L* = 4 which satisfies (5.18). Therefore, the o p t i m a l solution r satisfies r = 0 if i > 4. T h e p r o b l e m reduces to the following finite dimensional convex o p t i m i z a t i o n problem: u~ =
rain
3 {~"~ ]r
: r 9 R4},
1 1 1 where AL. = (1, 2,2~, 2~). We obtain (using M a t l a b O p t i m i z a t i o n T o o l b o x ) the o p t i m a l solution r to be:
φ₀(λ) = 0.9732 + 0.0535λ.
Fig. 5.3. The ℓ1 and the H2 norms of the optimal closed loop for various values of γ are plotted. The x axis can be read as the square of the H2 norm or the value of γ. The y axis shows the ℓ1 norm.

Therefore we have ||φ₀||₁ = 1.0267 and ||φ₀||₂² ≈ 0.95. The same computation was carried out for various values of the constraint level, γ ∈ [0.75, 1]. The tradeoff curve between the ℓ1 and the H2 norms of the optimal solution is given in Figure 5.3. For all values of γ in the chosen range the square of the H2 norm of the optimal closed loop was equal to the constraint level γ. Although when the constraint level γ equals 0.75 the optimal closed loop map is an infinite impulse response sequence, the optimal closed loop map has very few nonzero terms in its impulse response even for values of γ very close to 0.75. For example, with γ = 0.755 the optimal closed loop map is given by:
φ₀(λ) = 0.7708 + 0.3632λ + 0.1596λ² + 0.0578λ³ + 0.0065λ⁴.
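The finite dimensional problem in this example is small enough to solve directly. The sketch below (added for illustration; it uses scipy rather than the Matlab Optimization Toolbox mentioned above, and the positive/negative split of φ is just one convenient way to handle the ℓ1 objective) recomputes the optimal FIR closed loop for the constraint level γ = 0.95 with L* = 4; sweeping γ over [0.75, 1] traces the tradeoff curve of Figure 5.3.

```python
import numpy as np
from scipy.optimize import minimize

# l1/H2 problem of this example: minimize ||phi||_1 over phi in R^5 (L* = 4)
# subject to  A phi = b  with  A = (1, 1/2, 1/4, 1/8, 1/16),  b = 1,
# and         ||phi||_2^2 <= gamma.
A = 0.5 ** np.arange(5)
b = 1.0
gamma = 0.95

# Split phi = p - m with p, m >= 0 so that ||phi||_1 = sum(p + m) is smooth.
def obj(x):
    return np.sum(x)

cons = [
    {"type": "eq",   "fun": lambda x: A @ (x[:5] - x[5:]) - b},
    {"type": "ineq", "fun": lambda x: gamma - np.sum((x[:5] - x[5:]) ** 2)},
]
x0 = np.concatenate([np.ones(5) / 5, np.zeros(5)])
res = minimize(obj, x0, method="SLSQP", bounds=[(0, None)] * 10, constraints=cons)

phi = res.x[:5] - res.x[5:]
print("phi         :", np.round(phi, 4))        # approximately (0.9732, 0.0535, 0, 0, 0)
print("l1 norm     :", np.round(np.sum(np.abs(phi)), 4))
print("||phi||_2^2 :", np.round(phi @ phi, 4))
```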
As a final remark, we can use the structure of this example to illustrate that the optimal unconstrained 7/2 solution can have 7/2 norm much smaller than the 7/2 norm of the o p t i m a l gl (unconstrained) solution. Hence, minimizing only the s norm, which is an upper bound on the 7/2 norm, m a y require substantial sacrifices in terms of 7-/2 performance. Indeed, instead of the P used in the example before, consider the plant /5~(A) -- A - a where now a is a zero in the unit disk (i.e., lal < 1) and very close to the unit circle (i.e., ]a] ~ 1). Then the optimal unconstrained 7/2 norm given by
(b_a (A_a A_a*)^{-1} b_a)^{1/2} = (1 − |a|²)^{1/2},

where A_a = (1, a, a², . . .) and b_a = 1 (see [3] for details), is close to 0. On the other hand, for the optimal ℓ1 unconstrained solution φ we have φ = 1, which has H2 norm equal to 1. Therefore minimizing only with respect to ℓ1 may have undesirable H2 performance.
5.5 Summary

In this chapter the mixed ℓ1/H2 problem for the SISO discrete time case was solved. The problem was reduced to a finite dimensional convex optimization problem with an a priori determined dimension. The region of the constraint level in which the optimal solution is unique was determined, and it was shown that in this region the optimal solution is continuous with respect to changes in the constraint level on the H2 norm. A duality theorem and a sensitivity result were used.
6. A Composite Performance Measure
This chapter studies a "mixed" objective problem of minimizing a composite measure of the ~1, 7t2, and/?r norms together with the ~ norm of the step response of the closed loop . This performance index can be used to generate Pareto optimal solutions with respect to the individual measures. The problem is analysed for the discrete time, single-input single-output (SISO), linear time invariant systems. It is shown via the Lagrange duality theory that the problem can be reduced to a convex optimization problem with a priori known dimension. In addition, continuity of the unique optimal solution with respect to changes in the coefficients of the linear combination is established.
6.1 Problem Formulation
Let wl be the unit step input i.e., wl = (1, 1,...). The problem of interest can be stated as: Given cl > 0, c2 > 0, c3 > 0, and c4 > 0 obtain a solution to the following mixed objective problem:
:= =
inf
{elll~lll + e=ll~ll22 + e3110* wxll~ + e4ILr
inf
{c~llr
~b Achievable
CEel, Ar
e=11r = +c311r wxll~ + e41lr
} (6,1)
The assumptions made on the plant are the same assumptions that were made in Chapter 5. The definitions of achievability, the matrix A and the vector b are as given in Chapter 5. We define f : s --+ R by,
Y(r := cxllr
+ c211r
+ c311r wall~ + c411r
which is the objective functional in the optimization given by (6.1). In the following sections we will study the existence, structure and computation of the optimal solution. Before we initiate our study towards these goals it is worthwhile to point out certain connections between the cost under consideration and the notion of Pareto optimality.
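For a finite impulse response closed loop, the individual terms entering (6.1) are straightforward to evaluate. The sketch below (an added illustration; the sequence and weights are arbitrary choices, and the fourth term of the measure is omitted here) computes the ℓ1 norm, the squared ℓ2 norm, and the peak of the step response ||φ ∗ w₁||_∞ for a given φ.

```python
import numpy as np

# Illustration: evaluating the first three terms of the composite measure (6.1)
# for a finite impulse response sequence phi (an arbitrary example choice).
phi = np.array([0.9, 0.2, -0.15, 0.05])
c1, c2, c3 = 0.4, 0.3, 0.3                      # example weights (the c4 term is omitted)

l1 = np.sum(np.abs(phi))                         # ||phi||_1
l2sq = np.sum(phi ** 2)                          # ||phi||_2^2
step_peak = np.max(np.abs(np.cumsum(phi)))       # ||phi * w1||_inf with w1 = (1, 1, 1, ...)

print(f"||phi||_1 = {l1:.4f},  ||phi||_2^2 = {l2sq:.4f},  ||phi*w1||_inf = {step_peak:.4f}")
print(f"partial composite cost = {c1 * l1 + c2 * l2sq + c3 * step_peak:.4f}")
```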
6.1.1 Relation to Pareto Optimality
T h e notion of Pareto o p t i m a l i t y can be stated as follows (see for e x a m p l e , [22]). Given a set of rn n o n n e g a t i v e functionals 7i, i -- l , . . . , m on a n o r m e d linear space X , a point x0 E X is Pareto o p t i m a l with respect to the vector valued criterion 7 := ( f l , . . . , f m ) if there does not exist any x E X such that 7 i ( x ) _< f i ( x 0 ) for all i E { 1 , . . . , m } and f i ( x ) < 7~(x0) for s o m e i E {1,...,m}. Under certain conditions the set of all Pareto o p t i m a l solutions can be generated by solving a m i n i m i z a t i o n of weighted s u m of the functionals as the following t h e o r e m indicates. T h e o r e m 6.1.1. [23] Let X be a normed linear space and each nonnegative functional 7i be convex. Also let {TI
s,.. := {e e ~ " : c, _> O, ~ c ,
= 1),
i=1
and for each e E R m consider the following scalar valued optimization: inf s
ciTi(x).
xEX i=1
If xo E X is Pareto optimal with respect to the vector valued criterion -](x), then there exists some c E Sra such that xo solves the above minimization. Conversely, given c E Sin, if the above minimization has at most one solution xo then xo is Pareto optimal with respect to 7(x). [] In the next section we show t h a t there is a unique solution r to P r o b lem (6.1). F u r t h e r m o r e , since u is assumed to be a scalar, there is a unique o p t i m a l q E ~1. Hence, in view of the a f o r e m e n t i o n e d t h e o r e m we have t h a t if we restrict our a t t e n t i o n to p a r a m e t e r s el,c2,c3 and c4 such t h a t (Cl,C2, C3, C4) E ~4 := {(c1,c2,e3, c4) : Cl JrC24-c34-c4 = 1, c,,c2, c3, c4 > 0}, we will produce a set of P a r e t o o p t i m a l solutions with respect to the vector valued function f ( q ) := ( l l h - u , q{ll, lib - u , qll2 2, {{(h - u , q) , w , l l ~ , Ilh - u , q l l ~ ) =: (71 (q), 72 (q), 73 (q), f 4 ( q ) ) . where q E ~l. Thus, if r is the o p t i m a l solution for P r o b l e m (6.1) with a corresponding qo for some given (el, c2, c3, c4) E Z'4, then there does not exist a preferable alternative r with r = h - u * q for some q E tl such t h a t f i ( q ) < fi(qo) for all i E { 1 , . . . , 4 } and f i ( q ) < fi(qo) for s o m e i E {1,...,4}. As a final note we m e n t i o n t h a t if (cl, c2, c3, c4) do not satisfy cl + c2 + c3 + c4 = 1 then we can define a new set of p a r a m e t e r s ~1, ~2, ~3 and 54 by cl = cl c2 = c~ 53 = c~ and 54 = c~ c1+c:+c3+c4 ' c1+c2+ca+c4 ' cl+c2+ca+c4 ct+c2+ca+c4 with 51 + 52 + ~3 + ~4 = 1. T h e s e new p a r a m e t e r s would yield the s a m e o p t i m a l solution as with (cl, c2, c3, c4).
6.2 Properties of the Optimal Solution

In the first part of this section we show that Problem (6.1) always has a solution. In the second part we show that any solution to Problem (6.1) is a finite impulse response sequence and in the third we give an a priori bound on the length.

6.2.1 Existence of a Solution

Here we show that a solution to (6.1) always exists.

Theorem 6.2.1. There exists $\phi^0 \in \Phi$ such that
$$f(\phi^0) = \inf_{\phi \in \Phi} \left\{ c_1\|\phi\|_1 + c_2\|\phi\|_2^2 + c_3\|\phi * w_1\|_\infty + c_4\|\phi\|_\infty \right\} = \nu,$$
where $\Phi := \{\phi \in \ell_1 : A\phi = b\}$. Therefore the infimum in (6.1) is a minimum.
Proof. We denote the feasible set of our problem by $\Phi := \{\phi \in \ell_1 : A\phi = b\}$. Let
$$B := \left\{ \phi \in \ell_1 : c_1\|\phi\|_1 + c_2\|\phi\|_2^2 + c_3\|\phi * w_1\|_\infty + c_4\|\phi\|_\infty \le \nu + 1 \right\}.$$
It is clear that
$$\nu = \inf_{\phi \in \Phi \cap B} \left\{ c_1\|\phi\|_1 + c_2\|\phi\|_2^2 + c_3\|\phi * w_1\|_\infty + c_4\|\phi\|_\infty \right\}.$$
Therefore, given any integer $k > 0$ there exists $\phi^k \in \Phi \cap B$ such that
$$c_1\|\phi^k\|_1 + c_2\|\phi^k\|_2^2 + c_3\|\phi^k * w_1\|_\infty + c_4\|\phi^k\|_\infty < \nu + \frac{1}{k}.$$
Let
$$\bar B := \{\phi \in \ell_1 : c_1\|\phi\|_1 \le \nu + 1\}.$$
$\bar B$ is a bounded set in $\ell_1 = (c_0)^*$. It follows from the Banach-Alaoglu result (see Theorem 2.3.1) that $\bar B$ is $W(\ell_1, c_0)$ compact. Using the fact that $c_0$ is separable and that $\{\phi^k\}$ is a sequence in $\bar B$ we know that there exists a subsequence $\{\phi^{k_i}\}$ of $\{\phi^k\}$ and $\phi^0 \in \bar B$ such that $\phi^{k_i} \to \phi^0$ in the $W(\ell_1, c_0)$ sense, that is for all $v$ in $c_0$
$$\langle v, \phi^{k_i} \rangle \to \langle v, \phi^0 \rangle \quad \text{as } k_i \to \infty. \qquad (6.2)$$
Let the jth row of A be denoted by aj and the jth element of b be given by bj . Then as aj 9 Co we have,
aj, r As A(r < aj, r that r
~"'~ as k -+ oo for all j = 1 , 2 , . . . , n.
(6.3)
= b we have < aj,r > = by for all k and for all j which implies >---=bj for all j. Therefore we have A(r = b from which it follows 9 4. This gives us ct I1r162
+ca]lr
* wlll~+eallr
> -
From (6.2) we can deduce t h a t for all t, r (t) converges to r An easy consequence of this is we have for all N as k tends to co, ~tN=o{ellr + c~(r (t)) 2 } + e3 max0l ~,-----.
max O (n - 1). As zi,i = 1 , . . . , n are distinct A~ has full column rank. A~, can be regarded as a linear map taking (R '~, ]1.111) -'~ ( RL+I , I].II~). As A~ has full column rank we can define the left inverse of A~,, (A~,) -t which takes (R L+I, [I.1[~) --+ ( Rn, I1.111). Let the induced norm of (m~,) -~ be given by II (A~,) -I I1~,1. Y E R n is such that m _ ~. It follows that, IIA*yll~o - L*. This prove s the lemma. [7 We now s u m m a r i z e the main result of this subsection T h e o r e m 6.2.3. The unique solution r of the primal (6.1) is such that r = 0 if i > L* where L* given in Lemma 6.2.3 can be determined a
priori. Proof. Let Y0 be the dual solution to (6.1) and let v0 := A*yo. From L e m m a 6.2.2 we know that ]lv0[l~ < a where a = cl + c3 + e4 + 2 ~ f ( h ) . Applying L e m m a 6.2.3 we conclude that there exists L* (which can be determined a priori) such that Ivo(i)l < el if i > L*. Therefore, r = 0 if i > L*. This proves the theorem. [] The above theorem shows that the Problem (6.1) is a finite dimensional convex minimization problem. Such problems can be solved efficiently using standard numerical methods. At this point we would like to make a few remarks. It should be clear t h a t the uniqueness property of the optimal solution is due the the non-zero coefficient c2. This makes the problem strictly convex. T h e finite impulse response property of the optimal solution is due to the nonzero cl. Also, it should be noted t h a t in the case where c3 a n d / o r c4 are allowed to be zero, all of the previous results apply by setting respectively c3 a n d / o r c4 to zero in the appropriate expressions for the upper bounds.
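The preceding results show that the search for the optimal $\phi$ can be confined to a finite dimensional space. A minimal numerical sketch of the resulting computation is given below; the interpolation data $A$, $b$, the weights, and the truncation length are placeholders chosen for illustration (they are not the data of the example in the next section), and a general purpose solver is used in place of a dedicated convex programming code.

```python
import numpy as np
from scipy.optimize import minimize

L = 10                                        # assumed a priori FIR length bound
lam0 = 0.5                                    # assumed interpolation point
A = np.array([[lam0 ** t for t in range(L)]]) # one interpolation row, A phi = b
b = np.array([1.0])
c1, c2, c3, c4 = 1.0, 1.0, 1.0, 1.0

def f(phi):
    # nonsmooth convex objective; in practice an LP/QP reformulation is preferable
    step = np.cumsum(phi)                     # phi * w1 restricted to the support
    return (c1 * np.sum(np.abs(phi)) + c2 * np.sum(phi ** 2)
            + c3 * np.max(np.abs(step)) + c4 * np.max(np.abs(phi)))

cons = {"type": "eq", "fun": lambda phi: A @ phi - b}
res = minimize(f, x0=np.zeros(L), constraints=[cons], method="SLSQP")
print(res.fun, res.x)
```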
6.3 An Example
In this section we illustrate the theory developed in the previous sections with an example taken from [14] and also considered in C h a p t e r 5. Consider the SISO plant,
=
- 5'1
(6.11)
where we are interested in the sensitivity m a p r := ( I - P K ) -1. Using Youla parametrization we get that all achievable transfer functions are given by
$$\hat\phi = (I - \hat P \hat K)^{-1} = \hat h - \hat u\,\hat q,$$
where $\hat q$ is a stable map. Therefore, $h = 1$ and $u = \lambda - \tfrac{1}{2}$. The matrix $A$ and the vector $b$ are given by
$$A = \left(1, \tfrac{1}{2}, \tfrac{1}{2^2}, \ldots\right), \qquad b = 1.$$
We consider the case where $c_1 = 1$, $c_2 = 1$, $c_3 = 1$ and $c_4 = 1$. Therefore, $\alpha = c_1 + c_3 + c_4 + 2\sqrt{c_2}\,f(h) = 11$. For this example $n = 1$ and $z_1 = \tfrac{1}{2}$. The a priori bound $L^*$ on the length of the optimal is chosen to satisfy
$$\max_{k=1,\ldots,n} |z_k|^{L^*} \left\| (A'_{L^*})^{-1} \right\|_{\infty,1} \alpha < c_1.$$
4. The problem reduces to the following finite dimensional convex optimization problem: 3
v = Amicn=l{ E ( l r
+ 0 K then [ A ( x ) - f ( x ) l < e for any x E S
=~ fk(x) > f ( x ) -- e 3> f(xo) -- e for any x E S
=~ min fk(z) > f(z0) -- c ~ES
lim minfk(z) k-+oo xE S
=r
> f ( x o ) -- ~.
As c is a r b i t r a r y we have k--roolim~ l ~ ]'k (X) _> f(Xo). This proves the l e m m a
[]
Theorem 6.4.1. Let $c_1^k \in [a_1,b_1]$, $c_2^k \in [a_2,b_2]$, $c_3^k \in [a_3,b_3]$ and $c_4^k \in [a_4,b_4]$ where $a_1 > 0$, $a_2 > 0$, $a_3 > 0$, $a_4 > 0$. Let $\phi_k$ be the unique solution to the problem
$$\nu_k := \min_{A\phi = b}\ c_1^k\|\phi\|_1 + c_2^k\|\phi\|_2^2 + c_3^k\|\phi * w_1\|_\infty + c_4^k\|\phi\|_\infty, \qquad (6.13)$$
and let $\phi^0$ be the solution to the problem
$$\nu := \min_{A\phi = b}\ c_1\|\phi\|_1 + c_2\|\phi\|_2^2 + c_3\|\phi * w_1\|_\infty + c_4\|\phi\|_\infty. \qquad (6.14)$$
If $c_1^k \to c_1$, $c_2^k \to c_2$, $c_3^k \to c_3$ and $c_4^k \to c_4$, then $\phi_k \to \phi^0$.

Proof. We prove this theorem in three parts; first we show that we can restrict the proof to a finite dimensional space, second we show that $\nu_k \to \nu$ and finally we show that $\phi_k \to \phi^0$. Let $y_k$ represent the dual solution of (6.13) and let $v_k := A^* y_k$. Let $f_k(\phi) := c_1^k\|\phi\|_1 + c_2^k\|\phi\|_2^2 + c_3^k\|\phi * w_1\|_\infty + c_4^k\|\phi\|_\infty$ and $f(\phi) := c_1\|\phi\|_1 + c_2\|\phi\|_2^2 + c_3\|\phi * w_1\|_\infty + c_4\|\phi\|_\infty$. Let $\alpha_k$, the upper bound on $\|v_k\|_\infty$, be as given by Lemma 6.2.2. Therefore,
$$\alpha_k = c_1^k + c_3^k + c_4^k + 2\sqrt{c_2^k}\, f_k(h).$$
Since the parameters lie in the given intervals, the $\alpha_k$ are uniformly bounded, and $L'$ can be chosen a priori (with $L' > n-1$) such that
$$\max_{i=1,\ldots,n} |z_i|^{L'} \left\| (A'_{L'})^{-1} \right\|_{\infty,1} \alpha_k < c_1^k$$
for all k. From arguments similar to that of L e m m a 6.2.3 and Theorem 6.2.3 it follows that ek(i) = 0 if i > L* for all k. Therefore we can assume that ek
9 R L" 9
Now, we prove that uk -+ u. Let r ul := min bll]r Ar
+ b2]]r
be the solution of the problem
+ ba]]r * Wl][oo -t- b4]]r
As Clk _< bl, c~ < be, ca~ _< ba and c~ _< b4 we have that uk < ul for all k. Therefore, for any k we have c~]]r ]]r I~ ~+ c 3 1k1 r * w~ll~+c~llCkll~ _< vl which implies I1r x i= 1,...,nu j = 1 . . . . , ny < or2,I F ijk)'~ > = < H, I F ijtr176> for
{
k = 0,...,
o'u,(Ao) + o'v,(),o) - 1
and
f i=n,,+l,.
.,n~ j = nu + 1,...,n~ < ~, Gaiqt > = < H, Ga,qt > for J q = l , . . . , n w = < H,G~jpt > p= 1,...,nz t =0,1,2,... Furthermore, F ijkx~ Ga,qt and Gzjpt are matrix sequences in g~~x'~'~
I
Proof. Follows easily from Theorem 7.1.2, equation (7.4) and the fact that H and R are real matrix sequences. The fact about sequences in ~ , x , ~ is shown in the Appendix. [] We assume without loss of generality that Y ijk)'~ is a real sequence. Further, we define I,l u
ny
H,F,5 o> and XofiAuv i=1
j=l
Cz is the total number of zero interpolation conditions. The following problem /]0,1 = ,i~ Ai~fv~blr {llr
(7.5)
is the standard multiple input multiple output ~ ' x n ~ problem. In [5] it is shown that this problem for a square plant has a solution, possibly nonunique but the solution is a finite impulse response m a t r i x sequence. Let #0,2 := ~ A{l14511~}'~mev inf ~ble --
(7.6)
which is the standard 7t2 problem. The solution to this problem is unique and is an infinite impulse response sequence. We now collect all the assumptions made (which will be assumed throughout this chapter) for easy reference. A s s u m p t i o n 2 ~r has normal rank n~, and ~/ has normal rank ny.
Assumption 3 $U$ and $V$ have no zeros which lie on the unit circle, that is $\Lambda_{UV} \subset \mathrm{int}(\mathcal{D})$.

Assumption 4 $F^{ijk\lambda_0}$ is a real sequence.

7.2 The Combination Problem
In this section we state and solve the combination problem. We first make the problem statement precise. Next we show the existence of an o p t i m a l solution. We then solve the problem for the square case. Finally, we study the nonsquare case. Let Nw := { 1 , . . . , n ~ } and let Nz := { 1 , . . . , n z ) . Let M, Y and M Y be subsets of N~ • N~ such that the intersection between any two of these sets is e m p t y and their union is N~ • Nw. Let ~pq and Cpq be given positive constants for (p,q) E M N U M and for (p,q) E M N U N respectively. The problem of interest is the following: Given a plant G solve the following optimization
problem:
$$\nu := \inf_{\Phi\ \mathrm{Achievable}} \left\{ \sum_{(p,q)\in (MN)\cup M} \bar c_{pq}\|\Phi_{pq}\|_2^2 + \sum_{(p,q)\in (MN)\cup N} c_{pq}\|\Phi_{pq}\|_1 \right\}. \qquad (7.7)$$
Note that for all $(p,q) \in M$ only the $\mathcal{H}_2$ norm of $\Phi_{pq}$ appears in the objective, for all $(p,q) \in N$ only the $\ell_1$ norm of $\Phi_{pq}$ appears in the objective and for all $(p,q) \in MN$ a combination of the $\mathcal{H}_2$ and the $\ell_1$ norm of $\Phi_{pq}$ appears in the objective. For notational convenience we define the objective functional $f$ by
$$f(\Phi) := \sum_{(p,q)\in (MN)\cup M} \bar c_{pq}\|\Phi_{pq}\|_2^2 + \sum_{(p,q)\in (MN)\cup N} c_{pq}\|\Phi_{pq}\|_1,$$
which is the objective functional being minimized. As can be seen, the objective functional of the combination problem constitutes a weighted sum of the square of the $\mathcal{H}_2$ norm and the $\ell_1$ norm of individual elements $\Phi_{pq}$ of the closed loop map $\Phi$. Note that with this type of functional the overall $\mathcal{H}_2$ norm of the closed loop as well as $\ell_1$ norms of individual rows can be incorporated as special cases. For technical reasons explained in the sequel we define the space
$$\mathcal{A} := \{\Phi \in \ell_2^{n_z \times n_w} : \Phi_{pq} \in \ell_1 \text{ for all } (p,q) \in (MN)\cup N\}.$$
The following set is an extension of O O~ := {~ 9 ,4 : 4~ satisfies the zero and the rank interpolation conditions). Note that 69 is the set {45 9 g~" • 9 r satisfies the zero and the rank interpolation conditions). Also, note that when M is e m p t y then O : O~. Finally, we define the following optimization problem v~ := inf f(45). CEO~
Now, we show that a solution to (7.8) always exists.
(7.8)
Lemma 7.2.1. There exists $\Phi^0 \in \Theta_e$ such that
$$\nu_e = \sum_{(p,q)\in (MN)\cup M} \bar c_{pq}\|\Phi^0_{pq}\|_2^2 + \sum_{(p,q)\in (MN)\cup N} c_{pq}\|\Phi^0_{pq}\|_1.$$
Therefore, the infimum in (7.8) is a minimum.
Proof. See Appendix. []

7.2.1 Square Case

Here, we solve the combination problem for the square case. Throughout this subsection the following assumption holds:

Assumption 5 $n_u = n_z$ and $n_y = n_w$.
In the sequel $y \in R^{c_z}$ is indexed by $ijk\lambda_0$ where $i, j, k, \lambda_0$ vary as in the zero interpolation conditions. The following lemma gives the dual problem for the square case.

Lemma 7.2.2.
$$\nu_e = \max\{\varphi(y) : y \in R^{c_z}\}, \qquad (7.9)$$
where $\varphi(y) := \inf_{\Phi \in \mathcal{A}} L(\Phi)$ and
$$L(\Phi) := \sum_{(p,q)\in (MN)\cup M} \bar c_{pq}\|\Phi_{pq}\|_2^2 + \sum_{(p,q)\in (MN)\cup N} c_{pq}\|\Phi_{pq}\|_1 + \sum_{i,j,k,\lambda_0} y_{ijk\lambda_0}\left( b^{ijk\lambda_0} - \langle F^{ijk\lambda_0}, \Phi \rangle \right).$$
Proof. We will apply Theorem 3.3.2 (Kuhn-Tucker-Lagrange duality theorem) to get the result. Let X, s Y,Z in T h e o r e m 3.3.2 correspond to A,.A, R e ' , R respectively. Let 7 = ue + 1 and let g : .,4 -+ R be given by g((P) := f ( ~ ) - 7. Let t I : , A --~ R e` be given by tIijkXo(r
: = biJ k A o _ < FiJk)'~
> ,
We index the equality constraints of [_[/by ijkAo where i, j, k, A0 vary as in the zero interpolation conditions. In [3] it is shown that the m a p H__ is onto R c" (it is shown that the zero interpolation conditions are independent). This means that 0 E i n t ( R a n g e ( H ) ) . From L e m m a 7.2.l wc know that there exists a (/il E .A such that /_/(~1) = 0 (that is (P~ satisfies the zero interpolation conditions) and f((P~) = ue which implies that g ( ~ l ) = - 1 < 0. Thus all the conditions of T h e o r e m 3.3.2 (Kuhn-Tucker-Lagrange duality theorem) are saitisfied. The l e m m a follows by applying T h e o r e m 3.3.2 to (7.8). [] For notational convenience we define the functionals Zpq E ~1 by
$$Z_{pq}(t) := \sum_{i,j,k,\lambda_0} y_{ijk\lambda_0}\, F^{ijk\lambda_0}_{pq}(t).$$
In what follows we show that the dual problem is in fact a finite dimensional convex p r o g r a m m i n g problem. A bound on its dimension is also furnished.
Theorem 7.2.1. It is true that $\nu = \nu_e$. $\nu_e$ can be obtained by solving the following problem:
$$\max \left\{ \sum_{(p,q)\in M}\sum_{t=0}^{\infty} -\bar c_{pq}\,\Phi_{pq}(t)^2 + \sum_{(p,q)\in MN}\sum_{t=0}^{\infty} -\bar c_{pq}\,\Phi_{pq}(t)^2 + \sum_{i,j,k,\lambda_0} y_{ijk\lambda_0} b^{ijk\lambda_0} \right\}$$
subject to $y \in R^{c_z}$, $\Phi_{pq} \in \ell_1$ for all $(p,q) \in MN \cup M$,
$$\left.\begin{aligned} -c_{pq} \le Z_{pq}(t) \le c_{pq} &\quad \text{if } (p,q) \in N, \\ 2\bar c_{pq}\Phi_{pq}(t) = Z_{pq}(t) - c_{pq} &\quad \text{if } (p,q) \in MN \text{ and } Z_{pq}(t) > c_{pq}, \\ \phantom{2\bar c_{pq}\Phi_{pq}(t)} = Z_{pq}(t) + c_{pq} &\quad \text{if } (p,q) \in MN \text{ and } Z_{pq}(t) < -c_{pq}, \\ \phantom{2\bar c_{pq}\Phi_{pq}(t)} = 0 &\quad \text{if } (p,q) \in MN \text{ and } |Z_{pq}(t)| \le c_{pq}, \\ 2\bar c_{pq}\Phi_{pq}(t) = Z_{pq}(t) &\quad \text{if } (p,q) \in M, \end{aligned}\right\} \qquad (\mathrm{I})$$
for all $t = 0, 1, 2, \ldots$. Furthermore, it holds that the infimum in (7.7) is a minimum, and $\Phi^0$ is a solution of (7.8) if and only if it is a solution of (7.7). In addition, $\Phi^0_{pq}$ is unique for all $(p,q) \in (MN) \cup M$.
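Before turning to the proof, note that the coefficientwise relations in (I) define an explicit map from the dual quantities $Z_{pq}(t)$ to the optimal $\Phi^0_{pq}(t)$. The following sketch implements this map for an index pair in $MN$; the numerical values are illustrative assumptions, not data from the text.

```python
import numpy as np

def phi_from_Z(Z, c_bar, c):
    """Coefficientwise map of condition (I) for (p,q) in MN:
    2*c_bar*phi(t) = Z(t) - c if Z(t) >  c,
                   = Z(t) + c if Z(t) < -c,
                   = 0        if |Z(t)| <= c."""
    Z = np.asarray(Z, dtype=float)
    return np.where(Z > c, (Z - c) / (2 * c_bar),
                    np.where(Z < -c, (Z + c) / (2 * c_bar), 0.0))

print(phi_from_Z([0.3, -1.2, 2.5, 0.0], c_bar=1.0, c=0.5))
```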
Proof. In Lemma 7.2.1 we showed that an optimal solution ~b~ always exists for problem (7.8). From Theorem 3.3.2 (Kuhn-Tucker-Lagrange duality theorem) we know that if y 9 R ~" is optimal for the dual problem then r minimizes L(~) where L(~) =
~pqll~qllfl +
E
E
Cpqll~Pq}[1
(p,q)E( M N)tJM (p,q)E (M N)v N -[- E YiJk)~~176 ) " i,j,k,Ao
Thus, r176 minimizes
(~v~q(t) ~ - zpq(t)r
+
-z,q(t)r
~
(e~r
~ + c~l~(t)l
(p,q)EMN
(p,q)EM
+
~_, ( c ~ l r
z,~(t)~(t)) +
(p,q)eN
Therefore, if (p, q) 9 M then ~~
E
YijkAo bijkA~ i,j,k ,)~o
minimizes
~pq~p~(t) 2 - z~(t)%~(t), which is strictly convex in ~pq(t) and therefore ~~ is unique. Differentiating the above function with respect to ~pq(t) and equating tile result to zero we conclude that if (p, q) E M then
2~pq%~
= zpq(t).
As Zpq E gl we have that for all (p, q) E M, ~Oq E gl. If (p, q) E M N then ~~ minimizes
~pqCpq(t) 2 + Cpql,~pq(t)l- zp~(t)r which is strictly convex and therefore the minimizer is unique. This also implies that if (p, q) E M N then ~b~ satisfies conditions stipulated in (I).
Suppose $(p,q) \in N$. Then $\Phi^0_{pq}$
minimizes
7(epq(t)) := {cpqlepq(t)t- zpq(t)ep~(t)}. Note that if Zpq(t)q~pq(t) < 0 then -f(r the optimal minimizes
> O. But, 7(0) = 0. Therefore,
ep~ (t)(e~sgn(Z~ (t)) - Z~ (t)), over all ~pq(t) 6 R such that Zpq(t)~)pq(t) >_ O. Now, if IZpq(t)l > Cpq then given any K > 0 we can choose r 6 R that satisfies Zpq(t)~pq(t) > 0 and "-f(~pq(t)) < -A" and thus the infimum value would be - o o . Therefore, we can restrict Zrq(t ) in the maximization of the dual to satisfy IZpq(t)l < Cpq. If, IZpq(t)l < cpq then -f(r >_ 0 for any qbpq(t) E R t h a t satisfies Zpq(t)~pq(t) ~> 0 and is equal to zero only for r = 0. Therefore, we conclude that if (p, q) 6 N then we can restrict Zpq(t) in the m a x i m i z a t i o n of the dual to satisfy IZpq(t)l _ Cpq and that the o p t i m a l r176 minimizes -f(g'pq(t)) with a m i n i m u m value of zero. It also follows that if IZpq(t)l < cpq then r E R that minimizes f(~pq(t)) is equal to zero. The expression for ue follows by substituting the value of r obtained in the above discussion for various indices in the functional L(r Note that O = Oe f'l t?7"Il z X t ' l . w . But in the previous steps we have shown by construction that the optimal solution to (7.8), ~b~ is such t h a t Op0q 6 el for all (p, q) 6 M. This means that 4' ~ E O. From the above discussion the theorem follows easily. [] Note that the above theorem demonstrates that the problem at hand is finite dimensional. Indeed, at an optimal point y0, r the constraint IZp~ I _ Cpq will be satisfied for sufficiently large t since Z~q~ E el. Thus, ~pq(t)O __-- 0 for (p,q) 6 M N U N and large t i.e., ~p0q is F I R for (p,q) E M N U N. T h e following lemmas provide a way to compute a priori bounds on the dimension of the problem. 7.2.3. Let q~o be a solution to the primal problem (7.7) and let yO, ZOq be solutions to the dual. Then the following is true: -Cpq < Z~ c~, Z~ + Cpq i f (p, q) e M N and Z~ < -Cpq, 0 i f (p,q) 6 M N and IZ~ffq(t)l_< cpq, = Z~ i f (p, q) E M. 4'~ is unique for all (p, q) in ( M N ) U M. Also, there exists an a priori bound a such that IIZ~ _< ~ for all (p,q) E N~ x N,o. Lemma
Proof. The first part of the l e m m a follows from the arguments used in Theorem 7.2.1. We now determine an a priori bound. For all (p, q) E M N the following is true:
< Cpq + ~ y ( H ) where the last inequality follows since H is a feasible solution and hence cvqll~Oqllx L'.
Proof. For notational convenience we index Fpq ijkx~ and bijk~~ where ijkAo vary as in the zero interpolation conditions by Fvq and bn respectively where n = 1 , . . . , cz. T h e vector in R r whose n th element is given by b'* is denoted by b. We interpret Fpq as a cx~ x 1 column vector equal to -
(F;~ (0), F ; ~ ( 1 ) , . . . ) ' . With this notation = ( F;1, FL, . . . ,
)y,
where Zpq is viewed as a infinite column vector with the t th element equal to Zpq(t). Therefore the condition
IIZpqlloo < a for all (p,q) E N, x iV,,,, is equvalent to the condition
IIA'YlI~
< ,~,
where
F!~.
F~2
..
F:~
:
:
:
A !
F~I. : ~k
F;; :
:
E rl,l z fl, ~o
T h e m a t r i x A := (A') ~ is the m a t r i x which has c~ rows each for one zero interpolation condition. If 9 E ~ " • is stringed out into a vector as below: r
~11
'
then A ~ = b gives the zero interpolation conditions. It is known t h a t the zero interpolation conditions are independent and therefore A has full row rank 9 Equivalently, A' has full c o l u m n rank. Choose L > c~ such t h a t D E R e` • with rows from the first L rows of A' is invertible. Consider D as a m a p f r o m (R c', ].]1) to (R ~', [.l~). Now, as y = O-1Dy we have [Y]I = [D-1Dy]I %q then --Cpq : 2-CpqCpq(~) -- Zpq(t) < Cpq,
because Cpq > O. Similarly it can be checked that all the other conditions of (II1) are satisfied. This implies that P > u. Suppose, y E R c', Zpq(t) (determnined by y) and 4~pq(t) satisfy condition (Ili) of Theorem 7.2.2. Let q)pq(t) be defined as follows: 2-dpqqSpq(t) = Z p q ( t ) - epq if (p,q) C M N and Zpq(t) > Cpq, 2"dpqOpq(t) = Zpq(t) + Cpq if (p, q)E M N and Zvq(t ) < -%q, Opq(t) = 0 if (p,q) E i N and IZpq(t)[ < %q, for all 0 < t < L* (i.e ~vq(t) satisfies constraints (II)). Suppose, (p, q) E M N and Zpq(t) > Cpq then 0 ~__ 2"Cpq~pq(t) = Zpq(t) - Cpq ~__ 2-dpq~vq(t ).
Therefore, --2
2
--~pq (4) >__--~pq (t).
Similarly, the above condition follows for other indices. Thus, given variables satisfying (III) we have constructed variables satisfying (II) which achieve a greater objective value. This proves that ~ _< u. This proves the theorem [] Thus, we have shown that the problem (7.7) for a square plant is equivalent to the finite dimensional quadratic programming problem of Theorem 7.2.2 with the dimension known a priori. Such types of programming are well studied in the literature and efficient numerical methods are available (e.g., [22]). We should point, out that the sum ~-'~4~o 4~vq 1 Zpq(t)2 appearing in the quadratic program above is a quadratic function of yijk),o with coefficients of the form < pijk)~o >, which can be readily computed pq ~-prst-~o pq The solution procedure consists of solving the quadratic program of Theo0 and r176 with t = 0, 9 9", L* rem 7.2.2 to obtain the optimal variables Yijkxo for all (p,q) E M N . The latter set completely determines (pOq for all (p, q) E M N . From Yijk~o~ the optimal ~p0q for (p, q) E m can be computed as CpOq 1 0 = 2~pq Z~q (see Lemma 7.2.3). The quadratic program of Theorem 7.2.2 0 for (p, q) E N. Nontheless, does not yield immediately any information on ~p~ 0 for (p, q) E N can be easily obtained o n c e q)pq u for (p, q) E M N U M are ~pq found through the following (finite dimensional) optimization:
$$\begin{aligned}
\text{minimize} \quad & \sum_{(p,q)\in N} c_{pq}\|\Phi_{pq}\|_1 \\
\text{subject to} \quad & \Phi_{pq} \in R^{L^*}, \\
& \sum_{(p,q)\in N} \langle F^{ijk\lambda_0}_{pq}, \Phi_{pq} \rangle = b^{ijk\lambda_0} - \sum_{(p,q)\in (MN)\cup M} \langle F^{ijk\lambda_0}_{pq}, \Phi^0_{pq} \rangle.
\end{aligned}$$
This problem can be readily solved via linear programming [3]. From the developments above it follows that the structure of an optimal solution ~p0 to the primal problem (7.7) has in general an infinite iiupulse response (IIR). The parts of ~0 however that are contained in the cost via 0 ,s with (p,q) in M N U N, are always FIR. their/71 norm i.e., the ~pq Finally, it should be noted that the optimal solution has certain properties related to the notion of Pareto optimality (e.g., [22]). In particular, from the uniqueness properties of r it is clear that there is no other feasible cp such that 1]4ipqlI2 < ]l~p~ for some (p,q) E M N U M while tl~ppqlll < IIr176 or, conversely, there is no 4) such that Ilgipqlll < IIr176 for some (p, q) 6 M N U M while ll4ipqll2 < [Iq~p~
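A minimal sketch of such a linear program is given below. The interpolation functional, its length, and the right hand side are illustrative placeholders; the standard variable splitting $\Phi = \Phi^{+} - \Phi^{-}$ with $\Phi^{\pm} \ge 0$ is used to render the $\ell_1$ objective linear.

```python
import numpy as np
from scipy.optimize import linprog

L = 8                                           # assumed FIR length
F = np.array([[0.5 ** t for t in range(L)]])    # assumed interpolation functional
rhs = np.array([1.0])                           # assumed right hand side

# variables: x = [phi_plus, phi_minus], phi = phi_plus - phi_minus, both >= 0
c = np.ones(2 * L)                              # sum(phi_plus + phi_minus) = ||phi||_1
A_eq = np.hstack([F, -F])                       # <F, phi_plus - phi_minus> = rhs
res = linprog(c, A_eq=A_eq, b_eq=rhs,
              bounds=[(0, None)] * (2 * L), method="highs")
phi_opt = res.x[:L] - res.x[L:]
print(res.fun, phi_opt)
```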
7.3 The Mixed Problem
In this section we make the statement for the mixed problem precise. We solve the mixed problem via a related problem called the approximate problem. For both the mixed and the approximate problems the following notation is relevant: Let N~ := { 1 , . . . , n w } and let Nz := { 1 , . . . , n z } . Let S be a given subset of Nz. S corresponds to those rows of the closed loop which have some part constrained in the/71 norm. We denote the cardinality of S by c,~. Let Np for p 6 S be a subset of N,o. Np characterizes the part of the pth row of the closed loop that is constrained in the el norm. The (positive) scalars 7p for p 6 S represent the/?1 constraint level on the pth r o w . It is assumed that 7p > tJ0.1. Finally, 7 6 R c~ is a vector which has 7p for p 6 S as its elements. We define a set F, C ~ " x,~ of feasible solutions as follows: q~ 6 ~ " • is in F-~ if and only if it satisfies the following conditions: a)
~
II~pqlll < ~p for all p E S,
qENp
b) 4i 6 0
(i.e 9 is an achievable closed loop map).
r is said to be feasible if r 6 _r'~. Let M_M_be a given subset of N~ • N~o. The problem statements for the mixed and the approximate problems are now presented. Given a plant G the mixed problem is the following optimization: p-~ := inf { Z @6F.~
IlCpqll~}.
(7.11)
(p,q)EM
Given a plant G the approximate problem of order ~ is the following optimization: p76 := ver,inf(
~ (p,q)6M
II~pqll2~+~--~ ~ II~pqll~}.
(7.12)
p68 q 6 N n
W e will further assume that for all (p,q) 6 Nz • Nw the component qSpq appears in the ~i constraint or in the objective function or in both. Note that
M__M_is the set of transfer function pairs whose two norms have to be minimized in the problem. The problem is set lip so that one can include the constraint of a complete row in the closed loop m a p (P or part of a row. This way we can easily incorporate constraints of the form I](pI]l < 1 which is cquivalent to each row having one norrn less than 1. Also, the 7/, norm of (P can be included in the cost as a special case. We also define the following sets which help in isolating various cases in the dual formulation: N := Ui6s(i, Ni), which is set of indices (i, j) such that cPij occur in the/~1 constraint,
M N := M A N , which is the set of indices (i, j) such that 4'ij occurs in the g~ constraint and its two norm appears in the objective, M := M \ M N , which is the set of indices (i, j) such that two norm of Oij occurs in the objectivc but it does not appear in tile t?l constraint and N := N \ M N , which is the set of indices (i, j) such that ~ i j o c c u r s ill the gl constraint but its two norm does not appear in the objective. With this notation we have, M = ( M N ) U M and N = ( M N ) U N. We assumc that M N U M U N equals N~ x N~. This implies that for all (p, q) 6 Nz • N~ r appears in the i71 constraint or in the objective function or in both. We define Sm : g ? ' x ~ __4 R and ]a~ : g ? ' x " ~ --4 R by
Sm(r := E
IlCpqllg=
(p,q)6-M"
~
Ilepqll ,
(p,q)E(M N ) u M
and (p,q)6 M (p ,q)6 M Nu M
p6S qeNp (p,q)@ M Nu N
which are the objective functions of the mixed and the a p p r o x i m a t e problems respectively. We make the following assumption. Assumption
6 The plant is square i.e., n~ = nu and ny = n~.
We now solve the a p p r o x i m a t e problem and later we give the relation of the mixed problem to the a p p r o x i m a t e problem.
7.3.1 The Approximate Problem

In this subsection we study the approximate problem of order $\delta$. This problem is very similar to the combination problem, and the techniques used in solving the combination problem are often identical to the ones used in solving the approximate problem. We state many facts without proof; these facts can be easily deduced in ways similar to the ones used in the solution of the combination problem. The importance of this problem comes from its connection to the mixed problem. As in the combination problem, we define for notational convenience
$$Z_{pq}(t) := \sum_{i,j,k,\lambda_0} y_{ijk\lambda_0}\, F^{ijk\lambda_0}_{pq}(t).$$
T h e o r e m 7.3.1. There exists q5~ 9 F~ ,such that 5 (p,q)E M N o M
(p,q)E M N u N
Therefore, the infimum in (7.12) is a minimum. Moreover, the following it is true that p~ equals max
Z (p,q)EM
oo t=0
E
Z
(p,q)EMN
t=0
bi j k A ~
i,j,k,,ko
pES
subject to y E R c ' , $ v q E g l for all (p,q) E M N U M ,
-(5 + ~p) < Izpq(t)l < (5 + yp) if (p, q) 9 N 2Ovq(t ) = Zvq(t ) - (5 + yv) = Z p q ( t ) -t- (5 -t- y p )
=0 = Zpq(t)
i f (p,q) 9 M N , Zvq(t ) > (5 + yv), i f (p, q) 9 M N , Zpq(t) < - ( 5 + Yv)' i f (p, q) 9 MN, IZpq(t)l < (5 + ~p),
] (IV)
if (p, q) 9 M, for all t = 0 , 1 , 2 , . . . .
In addition, the optimal ~5~ is unique for all (p, q) E (M N) 0 M. Proof. The proof follows by utilizing results analogous to Lemmas 2 and 3, and similar arguments to Theorem 7.2.1. [] To get an analogous result to Lemma 7.2.3 it is clear that we have to get an a priori bound on the dual variable ~. L e m m a 7.3.1. Let (p0,1 denote a solution of the standard gt problem (7.5).
f~a(q5~ is the objective of the approximate problem evaluated at a solution of the standard ~1 problem. If (~-~, y'~) is the solution to the approximate problem as given in Theorem 7.3.1 then ~p _< fa~(~~ ~'p - - 120,1
for all p E S .
Proof. Take any c 6 R such that uo,1 < c < minTp. p6s Let -t o 6 R c" be given by 7v~ = c. Let 6 pwo :=
inf f ~ ( ~ ) .
4'fi F.,o
Then from Corollary 3.3.1 we have,
> < ~ 5o - ,..~5 < ~ o5 < f ~ ( ~ o , 1 ) .
< v - v~ Therefore, pfS
As (Tp - c) > 0 we have < f~((po,1) yp _ - for a l l p 9
-~
"/p
--
c
This holds for all c > /I0,1and therefore the l e m m a follows. Now we state the l e m m a analogous to L e m m a 7.2.3.
[]
L e m m a 7.3.2. Let q5~ be a solution to the primal problem (7.12) and let ~o, yO, zOq be solutions to the dual. Then the following is true:
- ( 5 + ~p) O, y E R e` ,~pq(t) E R L" f o r all (p, q) 9 M N , --(~ + ~p) < Zpq(t) = < Fl,~bl > A- < F2,~b2 > = 1. As n= = nz and nw = ny the system is square and rank interpolation conditions are absent. First we solve the standard multiple input multiple output el problem for the given system G.
Fig. 7.1. A two input single output example
7.4.1 Standard $\ell_1$ Solution
In this subsection we are interested in solving the following optimization:
$$\nu_{0,1} = \inf_{\Phi\ \mathrm{Achievable}} \|\Phi\|_1 = \inf_{\langle F_1, \Phi_1 \rangle + \langle F_2, \Phi_2 \rangle = 1} \|\Phi\|_1.$$
We refer the reader to section 12.1.2 of [3] for the the theory used to solve this problem. It can be easily verified that the above problem reduces to the following finite dimensional linear program:
min
it,
",r subject to 1
r
+ E~+(t)
= u for i = 1,2,
+ r
t=O
ct(o) - r
+ ~'(~t(1) - ~i-(1)) - (r r - 89 ~ ; ( 1 ) ) = 1,
r
> o.
Using the linear programming software of MATLAB we obtain that an optimal is given by
o0,__ This implies that u0,1 : 0.5.
7.4.2 Solution of the Mixed Problem

In this subsection we are interested in solving the following optimization for the given system in Figure 7.1:
m := ~ Ai~vab,e{ll~l122 : I1~111 + y - y 1 subject to ~1_>0, Y2 >_ O, y E R.
- Y~}
In keeping with the notation defined in earlier section we denote
Z := yF, that is Z1 = yF1 and Z2 = yF2, 1oI (~) := ii~ii122 + ii~iii 2 + o.iII~iI11 + o.III~2111. Therefore, we have, f ~ = 1 2 + 0 + 0 . 1 + 0 = 1.1 and fo.1 ( ~ 0 j ) = (0.5)2+ (0.5)2+0.1(0.5+0.5) = 0.6. Let ~ and y~ be the solution to the dual problem stated above and let ~ be the solution to the primal. We define L : ~ x l _+ R
by 5 ( r
I[(P[[22 + +(0.i + fflf)l](~l[]1-Jr-(0.I 4-ff;)I]~2[[I-- < Z;,(Pl > -
. From Theorem 3.3.2 (Kuhn-Tucker-Lagrange duality theorem) it follows that (P~ minimizes L((/)) over all (/i E g~• This implies that ~ ( t ) minimizes,
9 1(t) 2 4- (0.1 4- yT)l~1(t)l-
z?(t)~1(t),
(7.13)
over all ~ l ( t ) E R. We can discard the ~ l ( t ) ~ R which have an opposite sign to that of Z~(t) because then - < Z~, r >_> 0. Therefore ~7(t) minimizes,
r (t) ~ + ~ (t)((0.1 + ~ ) s g ~ ( z ? (t)) - z? (t)), over all (/il(t) E R which satisfy assume that Z~(t) >_O. Then r Ol(t) 2 4- r (t)((0.1 4- ff~) -
O~(t)Z~(t) >_ O. Without loss of generality minimizes,
Z'~(t)),
over all ~l(t) E R which are positive. Now if (0.1+y~) _> Z'~(t) then objective is always positive as ~i I (t) is restrained to be positive and therefore the minimizer Or(t) is forced to be equal to zero. If 0 < (0.1 4- ff~) < Z'~(t) then the coefficient of ~(t) in the objective is negative and therefore we can do better than achieving a zero objective value and this forces 4~7(t) > 0. With this knowledge we can now differentiate the unconstrained objective function in (7.13) to get 2(P'~(t) = Z'~(t)- (0.1 + ~ ) . Similarly, if Z~(t) < - ( 0 . 1 +ff~) < 0 then 2(V~(t) = Z~(t) 4- (0.1 4- y~). In any case the following holds:
IZ?(t)l ___0.1 4- ~7 + 21r
___0.1 + fO.1(~0,1) 4- 211~111. 7p -- U0,1
The second inequality follows from Lemma 7.3.1. From the fact that I1(/i~111 < 1 we have 0.6 IZlr(t)[ < 0.1 + 1 - 0.-----~4- 2 = 3.3. Note that Z'~(t) ly~l ___3.3. Now, IZ?(t)l =
= y'rFl(t). As IZ~'(0)l _< 3.3 it follows that ly~Fl(0)l = 1
ly'~Fl(t)l < 3.3 Fl(t) = 3 . 3 ~ .
This implies that we can determine
a priori L'~ such that if t > L~ then
IZ~(t)l ~ 0.1 _~ 0.1 + ~1~, which will imply that ~l~(t) = 0 if t :> L~. L~ = 5 does satisfy this requirement. A similar development holds for 4)5. From Theorem 7.3.2 we have that the dual can be written as: 2
#o., =
5
max{E E-r
2
+ y- E
i = l t=O
~i}
i=l
subject to y E R ~, y > O y E R,~i E R6 i = l,2 Zi(t) ~_ (0.1 + Yi) for i = 1,2. for all t = 0, 1 , 2 , . . . , 6 .
- - ( 0 . 1 + Yi) ~ 2(Pi(/) --
Using MATLAB software we obtain that the optimal ~7 which is unique for this example is given by r
= 0.3972 + 0.1732)~ + 0.0617()~) 2 + 0.0058()~) 3,
Therefore, $f_{0.1}(\Phi^\gamma) = 0.5109$, $\|\Phi^\gamma\|_1 = 0.6379$ and $\|\Phi^\gamma\|_2 = 0.6191$. This implies from Theorem 7.3.3 that if $\Phi^0$ represents the solution of the mixed problem then
$$|f_m(\Phi^0) - f_m(\Phi^\gamma)| \le 0.2.$$
This completes the example.
7.5 Summary

In this chapter we considered two related problems of MIMO controller design which incorporate the $\mathcal{H}_2$ and the $\ell_1$ norms of input-output maps constituting the closed loop directly in their definitions. In the first problem, termed the combination problem, a positive linear combination of the square of the $\mathcal{H}_2$ norms and the $\ell_1$ norms of the input-output maps was minimized over all stabilizing controllers. It was shown that, for the 1-block case, the optimal is possibly IIR and the solution can be nonunique. However, it was shown that the problem can be solved exactly via a finite dimensional quadratic optimization problem and a linear programming problem of a priori known dimensions. In the second problem, termed the mixed problem, the $\mathcal{H}_2$ performance of the closed loop is minimized subject to an $\ell_1$ constraint. It was shown that suboptimal solutions within any given tolerance of the optimal value can be obtained via the solution to a related combination problem.
7.6 Appendix

7.6.1 Interpolation Conditions
We analyse here in some detail the zero interpolation conditions. ;1xn, and /33" E ~?~xl we have Given a l ~ ~
~,(~)4(~)b~ (~) : ~ ~ ~,~ ( ~ ) ~ ( ~ ) ~
(~)
p=l q=l
= ~ ~ ~(~,~ 9~
9 ~jq)(t)~ ~
p=l q=l t=0
=~ ~
~,~(t - l ) ( ~ , , ~ ) ( ~ ) ~
p=l q=l t=0 l=0
=E E
aip(t - l) E fljq(l - S)Opq(S)At
p=lq:l flz
t=0 l=0
ntv
OO
OO
s=0 (:~
: p=lq----1 EE E E E t=O l=O 8----0 Therefore, it follows t h a t ~Iz
n w
=
~,p(t - t)~j~(l - s)~p~(~) (~,)(k) ~:~o p = l q = l t = 0 1=0 s=O
= EEEEEcqP(t-l)fljq(1-s)~pq(s)(At)(k)x=,~o
"
p=l q=l 8----0 t=O l-=O
Define $F^{ijk\lambda_0} \in \ell_\infty^{n_z \times n_w}$ by
$$F^{ijk\lambda_0}_{pq}(s) := \sum_{l=0}^{\infty}\sum_{t=0}^{\infty} \alpha_{ip}(t-l)\,\beta_{jq}(l-s)\,\left(\lambda^t\right)^{(k)}\Big|_{\lambda=\lambda_0}. \qquad (7.14)$$
It can be easily verified that for any $\Phi \in \ell_1^{n_z \times n_w}$,
$$\left(\hat\alpha_i \hat\Phi \hat\beta_j\right)^{(k)}(\lambda_0) = \langle \Phi, F^{ijk\lambda_0} \rangle. \qquad (7.15)$$
Proof of Lemma 7.1.1 : We first show that if IAI < 1 then there exists an integer T such t h a t t > T implies I(,kt)kl _< l(At+l)k I where t is an integer. Let T be any integer such t h a t T > ~ and let t > T be an integer then
I(,V)(k)l
- I(At+l)(k)l = I(t)(t - 1 ) , . . . , (t - k + 1)At-k I - I ( t + 1 ) ( t ) , . . . , (t - k -t- 2)At-k+1[ = (t)(t -- 1 ) , . . . , (t -- k + 2)1,~1'-k. =
=
{t - k + 1 - (t + 1)IAI} - 1),..., (t - k + 2)IXl '-k. {t(1 -I,Xl) - k -I,Xl + 1} (t)(t 1) .... , (t - k + 2)1~1'-k(1 (t)(t
{t- (~)} >0.
- I,Xl).
134
7. MIMO Design: The Square Case
Suppose s is an integer such t h a t s > T where T is as defined above. F r o m (7.3) we have
IFSq
(~)1 =
~,(t
t)~(t
s)(~*)(k)
1=0 t = 0
A=Ao
A=Ao
Ll=s
A=Ao
t=s (21o
To, which m e a n s t h a t we can choose T0 > T such t h a t
IF~Jk~~
< e for all s >
To.
This proves the l e m m a . [] T h e the elements G , ~ , q t and G i 3 j p t c o r r e s p o n d i n g to the rank conditions can be defined as [3]:
a~,q,(O :=
,~;(t
r
.
c~,,,(t):
.
....
'
0
...
0
.
.
.
~'](t - 0-..
}p'" ,-o~.
9
.
.
0
,
.
,
9
.
.
O
.
.
.
7.6 A p p e n d i x
135
As 6~i and ~j are polynomial vectors we have that Ga,qt and G~jpt are in ~1, Xnw .
7.6.2 E x i s t e n c e o f a S o l u t i o n f o r t h e C o m b i n a t i o n P r o b l e m Here we show that a solution to (7.7) always exists. Proof of Lemma 7.2.1 : As
f(~) :=
pqllpql122+
E (p,q)EMNuM
epqll pqlll, (p,q)6MNuN
we have r'e-- inf { f ( 4 ~ ) : f ( r
= bijkx~
(7.17)
II'P~qllX < C for all (p,q) 9 M N U N,
(7.18)
I](~qll2 2 < U for all (p, q) 9 M.
(7.19)
From (7.18) we conclude that for all (p,q) 9 M N U N, ~ q belongs to a bounded set in (co)*. From the Banach-Alaoglu result (see Theorem 2.3.1) and separability of co we conclude that for all (p, q) 9 M N U N there exists a subsequence {~pq nk } of {45~q} and (Ppq 0 such that {~pq ,~ } --4 ~pq 0 in the W((co)*, co) topology. This implies that < "/), ~)p~ >--4"< ~), ~p0q >
for all v 9 co.
(7.20)
Similarly, we conclude that for all (p, q) 9 M there exists a subsequence { 4 ~ ' } of {4~pq -k } and Ovq 0 such that { ~ ' } --+ ~p0q in the W((/~)*,ts) topoiogy. This implies that
136
7. MIMO Design: The Square Case 0 < v , S ; ; ' >--+< v,Svq > for all v 9 e2.
(7.21)
Thus, we have defined ~0 9 .4 by the limits (7.20) and (7.21). Note that
< FiJk~~
>':
E
< FPjk~~
> "
(p,q)EN, • Therefore, it follows from (7.17), (7.21), (7.20) and Lemma 7.1.1 that
< FiJk;%,qb 0 > = biJ k)~o. Similarly, the rank interpolation conditions are also satisfied by S ~ From the above discussion it follows that S ~ is in Oe and therefore, f(~b0) __
cpqlir
~
-cpql['Pvqll2 0 2+
~ (p ,q)r ( M N )u M
_> lie"
(v,q ) e N
From (7.20) and (7.21) it follows that for all t 9 R and for all (p, q) 9 Nz • N~,.
S ; ; ' (t) -+ ~~
This implies that for all T as s ~ T
tin0
(v,q)E(MN)uM T E ( t=0
(p,q)E(MN)uN
E "Cpql~Opq(~)[2 -4- E -~pq[~Oq(,)[2) (p,q)E(MN)uM (p,q)e(MN)uN
We have from (7.16) that for all s and T T
~--~( t=0
~ ~,~ql~;;'(t)l:~+
(p,q)e(MN)uM
~
1 cvqlS;;'(t)l)
% v~ 4- - - .
(p,q)E(MN)uN
Ilk"
(7.22)
In (7.22) first letting s -+ cr and then letting T ~ cr we have that f ( S ~ < ue. This proves the lemma. [] 7.6.3 R e s u l t s o n t h e M i x e d P r o b l e m
Proof of Theorem 7.3.4 : From Theorem 7.3.3 we have that
f# (4,") < ~w + -11,.;,11. n
This implies that there exists a constant CI such that II~;qll.~ < c1, for all (p, q) E. (MN)tJ M. From the Banach-Alaoglu (see Theorem 2.3.1) result we conclude that there exists a subsequence { ~ } of { ~ q } and S~ E e2 such that 0 in the W((s
e2) topology for all (p, q) E ( M N ) LJ M.
7.6 Appendix
137
This implies that for all v E g2 and for all (p, q) E (MN) tO M,
--~< V,4pq o > V,4pq
(7.23)
as k -.), oo.
4 "~ E F~ for every k therefore [14pq Ill < "/p for all p E S.
qs Nr nk
_..=o
We conclude that there exists a subsequence { 4 ~ ' } of {4vq } and 4pq E ~ , x,,~ such that for all v E co and for all (p, q) E (MN) U N, _---0 < U, 4p~' >---+< V, 4pq > as s ~ 0(3.
(7.24)
From the uniqueness of the limit for all (p, q) E ((MN) U M) M ((MN) U N), 4pq-~: 4pq.O Thus, for every (p, q) 6 N~ x Nw we have a sequence { 4 ~ ' } which converges to 4p0q in the W((t2)*,g2) topology (note convergence in W((co)*, co) implies convergence in W((~2)*, t2)). For all s we have
114 3.112
n, IIP.(HS2-U2,Q"..,v2)II== 0, for all nm > n. Taking the limit as m goes to infinity we have
ilP.(HS= _ U 2 , Q 0 ,
VS)ll~ _
0. rn--~ oo
It follows t h a t [[HS2 _ U 2 , Q 0 , V2I[~ _< limoo vn.,. T h u s we have shown t h a t limm-~oo v,~.~ = u. Since vn is a m o n o t o n i c a l l y increasing sequence, it follows t h a t u n / z u. It is clear from L e m m a 8.1.1 t h a t ~S2.o := HS2 _ U s , Q 0 , V 2 is unique. If(P ss'n := P,~(H s2 - U s * Q'~ * V s) then from the discussion above it follows t h a t v , , , = 11~22,~11~ converges to v = I1r176 ~. Also, O22'n"(t) converges to r 1 7 6 It follows f r o m L e m m a 8.2.1 t h a t II~ 2 2 ' " ~ - ~22'~
- ~ o as m
-+ ~ .
From L e m m a 8.1.1 we also have t h a t if ~-2 and ~,s have full n o r m a l column and row ranks respectively then Q0 is unique. F r o m the uniqueness of Q0 it follows t h a t the original sequence, {r converges to ~22,o in the two norm. This proves the theorem. [] 8.2.2 Converging
Upper
Bounds
Let v '~(7) be defined by inf
IIH 2 2 _ u 2 , Q 9 v211~
subject to lJ H l l - U 1 * Q * V i i i 1 ~ 7
(8.7)
IIQII~ n.
T h e following t h e o r e m shows t h a t {vn(7)} defines a sequence of upper b o u n d s to v ( 7 ) which converge to v(7). Theorem
8.2.2.
F o r all n, u n ( 7 ) >_ v'*+1(7) :> v(7). Also,
~"(7) "~ ~(7).
144
8. Multiple-input Multiple-output Systems ~uXny
Proof. It is clear that urn(7) _> u'~+1(7) because any Q in s which satisfies the constraints in the problem definition of u'~(7) will satisfy the constraints in the problem definition of u '~+1 (7). For the same reason we also have u ' ( 7 ) _> u(7) for all relevant n. Thus {u n (7)} is a decreasing sequence of real numbers bounded below by u(7). It can be shown that u(7 ) is a continuous function of 7 (see Theorem 6.5 in [15]). Given e > 0 choose 6 > 0 such that - ( 7 - 5) - - ( 7 ) < ~.
(8.8)
Such a 6 exists from the continuity of u(7) in 7- Let Q'Y-~ be a solution to the problem u(7 - 5) which is guaranteed to exist from Theorem 8.2.1. Let M be large enough so that m ~ M implies that
IIIH 22 - U 2 , Pm(Q ~-~) , V2II~ - ] ] H 2 2 - U 2 * Q'Y-~ 9 v211~l < -~ and (8.9) 6
IlIHll-U1,pm(Q~-~),V1111
-IIHI1-UI,Q~-~,V'II~I
< ~.
(8.10)
As Q'~-a is a solution to the problem u(7 - 5) it is also true that IIH22 - U ~ * Q"-~ * V2ll~ -- ~'(7 - 5), IIHx~ - u x * Q ~ - a * v i i i 1 _< 7 - 6 and IlQ~-allx ___~. From the above and equations (8.9), (8.10) it follows that for all m >_ M,
IIH22
- u 2 * Pm(Q "~-~) ,
E
v211~ - ~(7 - 5) nu or n~ + nv > ny then the range space of.A is not finite dimensional. In this case #~ (D) is solved by converting it to a square problem.
This is done by the Delay Augmentation Method. We give a brief description of this method here (for a detailed discussion see [3]). Let S denote a unit shift, that is,
S(r
z(1), z(2),...) = (0, x(O), z(1) .... ),
and S T denotes a T th order shift. Suppose, that the Youla parametrization of the plant yields H in gl"xn~, U in e~~xn', and V in gl , where n z = t n~ + n~ and n~o = nw + no. Partition, 0 into i
0=
02
i
l
i
,
where U 1 in g~" xn~. Similarly, partition I~" into (Q1, V~) where V 1 in gl Let (P and H be partitioned according to the following equation:
(r r
11 ~12~
~11 H~2~
We augment 0 and ~" by following
(
U1
~ 2 2 j = (/:/ul f i 2 2 j - ( ~ r 2 ) Qll (tY' Q2) . N th
9
(9.5)
order delays and augment Qll a,s given by the
~ll,N ~12,N~_(IJ11 /t12
01 0
011 012"~(~'1 ~"2 (9.6)
or equivalently, CN := H _ 0 ~ 0 9 r r
We define 69(D, N) the feasible set for the delay augmented problem to be the set p2x2J ~N = f[
{~U
E ~1
~fNQQN with O stable and IID-Ir
< 7}-
We define the Delay Augmentation problem of order N by /z~v(D) := inf{l(#N): #N E O ( D , N ) ) .
(9.7)
where
l(~) := 11~221122+ a(ll(O-~O)lll~
+
II(D-I~D)~II1).
This is a square problem and can be solved via finite dimensional quadratic programming. Let the delay augmented problem corresponding to (9.4) be given by P~v := inf #~v(D).
DE'D
We will now show that P~N converges to #a from below.
(9.8)
L e m m a 9.2.1. lim P~N = /~
where the limit on the left hand side of the equation above exists. Proof. It can be shown that O(D) C O(D, N + 1) C O(D, N) for all integers N. Therefore, for a given D and for a]l N #~v(D) _< #~N+,(D) and p~v(D) __el4 V X >_ M. This, contradicts the fact that #~N(Do) --+ IJ~(Do) as N -+ exp. This proves the lemma. [] Notice that for a given D in 7) we have the following: (
D-I~D =
q~lt
(d2/dl)r
(dt/d2)C'2t
~/'22
"
We denote d~/dl by s and therefore we have
D-lC~D =
(l/s)e_~l e.~2 2 "
Thus, #~v (D) can be obtained by solving the following problem. 2 Achievable subject to
I1' :11
0 Note that if we denote #T(D) by 7(s) then ~T __ inf 7(s) =: "/opt. sER +
(QP2)
9.3 Quadratic Programming
Consider the following quadratic programming problem
$$\min_x\ \tfrac{1}{2} x'Cx - p'x \quad \text{subject to} \quad Ax \le b,\quad Hx = e,\quad x \ge 0, \qquad (QP)$$
where A in R m~x'~l, H in R m2xnl has full row rank, and C is positive semidefinite. We are interested in obtaining necessary and sufficient conditions for x0 to be optimal for ( Q P ) . The following theorem gives such conditions. Theorem
9.3.1. C o n s i d e r the quadratic p r o g r a m m i n g problem, ( Q P ) . xo is optimal f o r the problem i f and only i f there exist yo in R m l , u in R m~, A in R m:, v in R nl such that xo, yo, u, v, ~ satisfy the following conditions p = C x o + A~u + H ~ A - v e -= H x o b = A x o + Yo 0 = u~yo 0 ---- ~)IX 0
xo >_O, yo >_O,u>_O,v >_O. Proof. (=v) Suppose, x0 is optimal for the problem ( Q P ) . This implies that x0 satisfies the conditions e : H x o , A x o - b < 0 and x0 > 0. From Theorem 3.3.2 (Kuhn-Tucker-Lagrange duality result) we know-that there exists u in R rn~ , )~ in R m2' v in R '~ with u > 0 and v > 0 such that x0 minimizes L ( x ) where 1
~
L ( x ) := ~ x C x -
p x + u ' ( A x - b) + A ' ( H x -
e) - v ' x .
This implies that d L(x)
= Cxo - p + A'u +
- v = O.
2~-X 0
Also, from Theorem 3.3.2 we know that u ' ( A x o - b) + A ' ( H x o - e) - v ' x o = O.
As x0 satisfies e - H x o = 0 we have u ' ( A x o - b) - v~xo = O. However, x0 satisfies A x o - b ~ 0 and x0 > 0. Therefore we conclude that u ' ( A x o - b) = z/x0 = 0. The necessity of the conditions given in the theorem s t a t e m e n t for x0 to be optimal is established by defining Yo = b - A x o .
9.3 Quadratic Programming
155
(r Suppose, for a given x0 there exist vectors A, y0, u, v which satisfy the conditions given in the theorem statement. Let x in R ~' be any element which satisfies the constraints of ( Q P ) . Let f(.) denote the objective function of ( Q P ) . We have 1
,
1
,
f ( x ) - f ( x o ) - -~x C x - 7 x o C x o - p ' ( x - xo) = .~(xl
_ X o ) ' C ( x - xo) + x ' C x o - xoCxo'
- p ' ( x - xo)
1
-- -~(x - X o ) ' C ( x - x o ) + ( x - x o ) ' ( p - A ' u - H A + v)
-p'(x > -(x
- xo)
- X o ) ' A ' u - ( x - x o ) ' f t ' A + (x - x o ) ' v
-- - u ' ( ( A x
- b) - ( A x o - b)) - A ' H ( x - xo) + (x - Xo)'v
= -u'((Ax
- b) + Yo) - A ' H ( x - Xo) + ( x -
xo)'v
"- u'(b - A x ) + v' x >_ 0
This proves the theorem. [] The above theorem shows that the solution of a convex quadratic programming problem as given in ( Q P ) is equivalent to the search of a vector (x, u, v, y, X) which satisfies the following conditions:
A 0 H0
(i)
0 I 0 0
=
p
,
(9.10)
v ' x + u ' y = 0,
(9.11)
(x u v y) _> 0.
(9.12)
Also, note that if conditions (9.10) and (9.11) are satisfied then the objective function f(.) of ( Q P ) is given by: 1
I
1
!
f(x) = !
= ~x (p-Au-H'A+v)-p'x I ,A x - IA, 1 ,v = - - ~Ip ,x - "~u ,~ H x + -~x
(9.13)
ii 11 ii ii = - - ~ p x - -~u (b - y) - -~:~ e + -~v 1
,
:
--~p x -
=
-~P
1, z -
1 , -~b u
1 , lv, x 1 , ~eAq~ --k ~ u y
1, -~b u
1, ~e A
(9.14)
156
9. Robust Performance
Define b :=
and x :=
v
. Let the m a t r i x in equation (9.10) be
denoted by A. Also, we assume t h a t A in R m• has full row rank (i.e. it has rank m). Note t h a t the objective function f(.) of ( Q P ) is given by
f=(_l
1 0 0 - 89
~ : CtX
(see equation (9.14)). In this section, whenever, we refer to x we assume that it is in tire form (x u v y A)' where the variables x , u , v , y and A are as defined in T h e o r e m 9.3.1. We call zi and yi primal variables. We call vi the dual variable associated with the primal variable xi and ui as the dual variable associated with the primal variable yi. Before we characterize the set of elcmemts which satisfy equations (9.10), (9.11) and (9.12), we give the following definitions. D e f i n i t i o n 9.3.1 ( F e a s i b l e s o l u t i o n ) . A n e l e m e n t x in R n is called feasible if it satisfies equations (9.10), (9.11) and (9.15). The set of all such elements is denoted by 5 . Note that a primal variable and its corresponding dual variable both c a n n o t be nonzero in a feasible solution, because of (9.11) and (9.12). D e f i n i t i o n 9 . 3 . 2 ( B a s i c s o l u t i o n ) . Let B be a m x m submatrix f o r m e d f r o m the columns of A such that B is invertible. Then, x u : = B - I b defines a basic solution of A x = b. Such a solution will have n - m components equal to zero corresponding to the columns of A not in [3. These components are called the non-basic variables. The rn components that correspond to the columns of B are called basic variables. D e f i n i t i o n 9 . 3 . 3 ( B a s i c f e a s i b l e s o l u t i o n ) . A n e l e m e n t x in R n is called basic feasible solution if it is basic and feasible. Theorem solution.
9.3.2.
If iP is not e m p t y then it has at least one basic feasible
Proof. Let ai denote the i th c o l u m n of A. Let z be a feasible solution and let the i th element of the vector z be denoted by zi. Also, let z be p a r t i t i o n e d as (x z u z v z yZ Az), where the variables x z, u z, v ~, y~, A z correpond to variables x, u, v, y, A in T h e o r e m 9.3.1 indexed by z. For simplicity assume t h a t the first p c o m p o n e n t s of z are nonzero while the rest are zero. This m e a n s t h a t Zlal + z2a2 + . . . + zpap = b, and z is such t h a t (u~)'y z + (vZ)'x z = 0 and (x ~ u z v ~ y~)' >_ 0.
9.4 Problem Solution
157
If a l , . . . , ap are independent columns then p < m because A has rank m. This implies t h a t z is a basic solution. Suppose, ax,. 9 av are dependent. T h e n there exists a in R" with at least one strictly positive element such that aiai + a 2 a 2 + , . . + a p a p = O,
with the last n - p c o m p o n e n t s equal to zero. Let c :=
zi
min - {i:a,>0} ~i
Let t : = (z - e~). This implies t h a t A t = A ( z - c~) = b because A a = 0. Also, note t h a t if zi = 0 then ti = 0. T h e condition ( u ~ ) ' y z + ( v Z ) ' x ~ = 0 is equivalent to uiz Yi~ = v ~ x zi = 0 (because (x ~ u s v ~ yZ), > 0). This m e a n s t h a t u i~y t = vixitt = 0. Also, if zl > 0 then t~ >_ 0. T h u s t is a feasible solution. From the definition of e, t will have a t m o s t p - 1 nonzero c o m p o n e n t s . T h u s from a feasible solution which had p nonzero c o m p o n e n t s we have created a feasible solution which has p - 1 nonzero c o m p o n e n t s . This process can be repeated until the n u m b e r of strictly positive c o m p o n e n t s is less than or equal to m and the corresponding columns of A are linearly independent (i.e. until the feasible solution is also basic). This concludes the proof. [] In the next section we exploit T h e o r e m 9.3.2 to solve the robust perform a n c e problem wc have formulated.
9.4
Problem
Solution
We saw ill Section 9.2 t h a t converging upper and lower b o u n d s to p as defined in (9.2) can be obtained by solving problems which can be cast into the following form: 1
,
min -~x C a - p ' ( s ) x subject to All(S) A 1 2 ~ A21 A22 J x < b Hx~e x>0
(Qp(s))
with "/opt : =
inf 7(s),
sER+
where "/opt is the p r o b l e m of interest. Note t h a t A l l (s) has the s t r u c t u r e given by
Alx(S'
158
9. Robust Performance
and s > 0. Using the results obtained in the previous section we know that the above problem has a solution if and only if there exists x, u, v, y, )~ which satisfy the following constraints:
)
0 0
0 I 0 0
=
,
(9.15)
v' x + u' y = O,
(9.16)
( x u v y ) _> 0.
(9.17)
Using the structure of A(s) the constraints given by equation (9.15) can be rearranged as given below: 0 ~
0
9 ,-s
0 * **
x2
b2
0.**
ul
pl(s)
9
*
*
****
~
9
*
*
****
"-~
_
where the entries denoted by * do not depcnd on s. We denote the m a t r i x on the left hand side by A(s), the vector on the rightmost side of the equation by b(s) and (xl,x2, ul,u2, x_,)~)' by x (note that ~ is the last element in x). We have also shown that if (QP(s)) has a finite value for some fixed s then there exists a basic feasible solution z of A ( s ) x = b(s). Note that the lower bound given by (QP1) and the upper bound given by (QP2) (which are of the form (Qp(s)) always have a finite value. Thus we will assume that (QP(s)) has a finite value for all relevant s. Also note that f(.) is given by f ( x ) ----C'(S)X. Suppose, for some fixed value so > 0 we have obtained a basic feasible solution of (QP(so)), given by z, o. Note that because of condition (9.11) one can choose the m a t r i x B(so) in R mxm where B(so) is the associated matrix with the basic solution z~o (see Definition 9.3.2) such that if a column corresponding to a dual (primal) variable is included in B(so) then the column associated with the primal (dual) variable is not in B(so). We call the rn independent columns of B(so) as the optimal basis associated with z, o. Our intention is to characterize the set of reals 0 < s such that (QP(s)) has a basic feasible solution which has the same optimal basis as the o p t i m a l basis of Zso. The way we have chosen the optimal basis for so guarantees that the condition (9.11) is satisfied if we generate a basic solution using the same columns for a value of s different from so (because the product vixi = uiyi = 0 will always be true if a solution is generated with the an optimal basis). We introduce some notation now. We assumc that A(s) is a m • n m a t r i x with m _< n.
9.4 Problem Solution
159
Given an indexing set of m positive integers J = { j l , - . . , j m } , the notation B j ( s ) denotes the matrix formed by those columns of A(s) indexed by the elements of ,7. An indexing set is said to be basis-index if the rn x rn m a t r i x Bff(s) is invertible and is an optimal-basis-index if B T(s ) is an o p t i m a l basis for the problem (QP(s)). The vector cn in R l x ' ' consists of entries of c corresponding to the basic variables whereas CD is the 1 • (n - m) vector corresponding to the nonbasic variables. Let fl be defined as /9 := B ~ 1 =
D e f i n i t i o n 9.4.1. Let so > O. Let ,70 be an optimal basis index for the problem (QP(so)). Define X j o ( . ) : R -+ R m, the solution function w.r.t J as follows
~joCs) : :
B-1 joCS)b(8)
1 if B -Jo(S) exists. Otherwise this function is given a value O. We assume throughout this chapter that xl and x2 are basic variables. 9.4.1. Let so > O. Let,70 be an optimal-basis-index with xB as the basic feasible solution for the problem (QP(so)). Suppose ul and u2 are basic variables in the optimal solution. Define B := Bffo(s0) and let fl := B -1 Then B flo (s) is invertible if and only
Theorem
a(s) := det(I4 + S Y B - 1 X ) # 0
(o00 0o)
where
X =
, S=
0
0
sos
0
so-s
0
0
'
Y
=
,o, ,
Ira(s) # 0 then
(xsCs))1 (~8(s))2 Xjo(S) = xB(,) - [~1 ~2 ~3 a4] R(~) (xBCs))3 (xsCs))4 where R(s) := (14 + S Y B - 1 X ) - I S Ira(s) = 0 then Xflo(S) = 0.
and ZB(S) := B - l b ( s ) .
(14
0).
160
9. Robust Performance
Proof. Let B := BJo(SO). As z1,x2,ul and u2 are basic variables in the o p t i m a l we have, 8-
0
0
0
0
so
~o-'
0
0
0
0
so - s
0
0
0
0
$0 3
(/4
0) =: B + X S Y .
$0 $
Therefore, det(Bjo (S)) = d e t [ B ( I + B - 1 X S Y ) ] = d e t ( B ) d e t ( I + B - 1 X S Y )
det(B)det(I4 + SYB-IX). Note that c~(s) = det(14+ SYB-IX) and therefore, it is clear that the inverse of Bjo(S) exists if and only if ~(s) ~ O. Assuming c~(s) ~ 0 we find an expression for Bj0 (s) as follows:
Bjo(S) = B - ' ( I + X S Y B - ~ ) -~ = B-I[I - (I+XSYB-1)-IXSYB-1 ] = B-1 _ B-1X(I4 + S Y B - 1 X ) - I S Y B -1. Now, X j o (s) = B)lo ( s ) b ( s ) _- [B-1 _ B - 1 X (I4 + S Y B - 1 X ) - I S Y B-1]b(s) = B - l b ( s ) - B-1X(I4 + S Y B - I X ) - I S Y B - l b ( s )
[
( xs( sl) ~
S
=B-lb(s)_[fllfl2fl3~4](i4+SrB-1X)-i
(zs(s))2 (xB(s))3 (xB(s))4
(xB(s))~ = x~Cs) - [Z 1 Z 2 Z 3 Z 4 ] RCs)
(x~Cs))2 (x~Cs))~ (xBCs))~
'
where we have defined R(s) := (I4 + S Y B - 1 X ) - I S and x n ( s ) :-- B - l b ( s ) . An expression for the 4 x 4 m a t r i x R(s) can be found easily. Note t h a t if c~(s) = 0 then BJo(S) is not invertible and by definition it follows t h a t
XjoC8) = 0.
[]
D e f i n i t i o n 9.4.2. Given so > O. Let Jo be an optimal-basis-index for the problem (QP(so) ). Define
Reg(Jo) := {s :> 0 : ~(s) r 0, ( X j o ( S ) ) i > 0 for all i : 1 , . . . , m } . Note t h a t X J o ( S ) is a rational function o f s and therefore Reg(flo) is a union of closed intervals except for the roots of a ( s ) = 0. D e t e r m i n i n g Reg(Jo) is therefore an easy task. 9.4.2. Let So > 0 and let Jo be an optimal-basis-index with XB as the basic feasible solution for the problem (QP(so)). Suppose ul and u2
Theorem
=
9.4 Problem Solution
161
are basic variables in the optimal solution. Then B j o ( s ) is an optimal basis for (QP(s)) if and only if s in Reg(Jo). Suppose, s in Reg(/To) then the objective value of (QP(s)) is given by
-(x.(s))l-
720(8) _m. c T ( s ) X B ( 8 )
_ cT(s)
[~1 ~2 f13 ]~4] R(8)
(XB(8))2
(x,(s))3 _(x,(s))4.
Proof. Suppose s in Reg(/To). Then, B/T ~(s) has linearly independent columns (because a(s) # 0). As x/To(s ) := Bjo(S)b(s) we know that X/To(S ) is a basic solution. X/To (s) is a feasible solution because (X/To(S))i >_ 0 for all i = 1 , . . . , m . If s > 0 is such that s ~ Reg(/To) then either feasibility is lost or the columns of B/T ~(s) are not independent. This proves the first part of the theorem. If the solution is optimal for (QP(s)) then the optimal objective value is given by
"~/To(S)= c~ (s)x s (s) - (XB(S))I
= c~(s){xB(s) - [~1 ~2 ~3 ~4] R(s)
(xB(s)),(~(s))2 } (~B(s))~_ (zB(S))ll = e~(s)~(s) - c~(s) [~' ~2 ~ ~ ] R(s) (xB(s))~| (~B(s))~/ 9
(~(s))~J This proves the theoren. [] We now present the following theorem which gives a way to compute 7opt. T h e o r e m 9.4.3. There exists a finite set of basis indices /T0,/T1,...,/Tt
such that R + = U~=lReg(/Tk ). Furthermore if fk :=
min
~eR~g(/Tk)
7/Tk (s)
then 7opt = min fk.
k=0,...,l
Proof. The proof is iterative: Step 1) Let Sl > 0. Find ,71, Reg(/Ti) and fl where /T1 is the optimal basis-index for (QP(sl)). Note that Reg(fll) is a finite union of closed intervals except for a finite number of points which can be determined,
Step 2) Suppose we have reached the (k - 1)th step. If t.)k-ll Reg ( J p ) ----R + then stop and the theorem is true with l = k - 1. Otherwise choose any s in R + - U~'-~Reg(ffp) and perform step 1 with sl = s. This procedure has to terminate because for any s in R + there exists a basic feasible solution and there are only finite number of basis-index sets. [] We have assumed t h a t for (QP(so)) the optimal is such t h a t ul, us are there in the basis (we assume that xl and x2 are always in the o p t i m a l basis). This might not be so. In t h a t case the expressions can be easily modified and they will be simpler t h a n the ones derived.
9.5 Summary

A problem which incorporates $\mathcal{H}_2$ nominal performance and $\ell_1$ robust performance was formulated. It was shown that this problem can be solved via quadratic programming using sensitivity techniques.
References
1. W. Rudin. Principles of Mathematical Analysis. McGraw-Hill, Inc., 1976.
2. C. Chen. Linear System Theory and Design. Holt, Rinehart and Winston, Inc., New York, 1984.
3. M. A. Dahleh and I. J. Diaz-Bobillo. Control of Uncertain Systems: A Linear Programming Approach. Prentice Hall, Englewood Cliffs, New Jersey, 1995.
4. J. C. Doyle, K. Glover, P. Khargonekar, and B. A. Francis. State space solutions to standard H2 and H∞ control problems. IEEE Trans. Automat. Control, 34, no. 8, pp. 831-847, 1989.
5. M. A. Dahleh and J. B. Pearson. ℓ1 optimal feedback controllers for MIMO discrete-time systems. IEEE Trans. Automat. Control, 32, no. 4, pp. 314-322, 1987.
6. J. C. Doyle, K. Zhou, and B. Bodenheimer. Optimal control with mixed H2 and H∞ performance objectives. In Proceedings of the American Control Conference, Vol. 3, pp. 2065-2070, Pittsburgh, PA, June 1989.
7. P. P. Khargonekar and M. A. Rotea. Mixed H2/H∞ control: a convex optimization approach. IEEE Trans. Automat. Control, 36, no. 7, pp. 824-837, 1991.
8. N. Elia, M. A. Dahleh, and I. J. Diaz-Bobillo. Controller design via infinite dimensional linear programming. In Proceedings of the American Control Conference, Vol. 3, pp. 2165-2169, San Francisco, California, June 1993.
9. M. Sznaier. Mixed ℓ1/H∞ controllers for MIMO discrete time systems. In Proceedings of the IEEE Conference on Decision and Control, pp. 3187-3191, Orlando, Florida, December 1994.
10. H. Rotstein and A. Sideris. H∞ optimization with time domain constraints. IEEE Trans. Automat. Control, 39, pp. 762-770, 1994.
11. X. Chen and J. Wen. A linear matrix inequality approach to discrete-time ℓ1/H∞ control problems. In Proceedings of the IEEE Conference on Decision and Control, pp. 3670-3675, New Orleans, LA, December 1995.
12. N. Elia and M. A. Dahleh. ℓ1 minimization with magnitude constraints in the frequency domain. Journal of Optimization Theory and its Applications, 93:27-52, 1997.
13. N. Elia, P. M. Young, and M. A. Dahleh. Multiobjective control via infinite dimensional LMI optimization. In Proceedings of the Allerton Conference on Communication, Control and Computing, pp. 186-195, Urbana, IL, 1995.
14. P. Voulgaris. Optimal H2/ℓ1 control via duality theory. IEEE Trans. Automat. Control, 40, no. 11, pp. 1881-1888, 1995.
15. M. V. Salapaka, M. Dahleh, and P. Voulgaris. Mixed objective control synthesis: optimal ℓ1/H2 control. SIAM Journal on Control and Optimization, 35, no. 5:1672-1689, 1997.
164
References
16. M. V. Salapaka, P. Voulgaris, and M. Dahleh. SISO controller design to minimize a positive combination of the ~1 and the 7t2 norms. Automatica, 33 no. 3:387-391, 1997. 17. M. V. Salapaka, P. Voulgaris, and M. Dahleh. Controller design to optimize a composite performance measure. Journal of Optimization Theory and its Applications, 91 no. 1:91-113, 1996. 18. M. V. Salapaka, M. Khammash, and M. Dahleh. Solution of mimo 7t2/ Q problem without zero interpolation. In Proceedings of the IEEE Conference on Decision and Control. pp: 1546-1551, San Diego, CA, December 1997. 19. P. M. Young and M. A. Dahleh. Infinite dimensional convex optimization in optimal and robust control. IEEE Trans. Automat. Control, 12, 1997. 20. N. Elia and M. A. Dahleh. Controller design with multiple objectives. IEEE Trans. Automat. Control, 42, no. 5:596-613, 1997. 21. M. V. Salapaka, M. Dahleh, and P. Voulgaris. Mimo optimal control design: the interplay of the 7/2 and the el norms. IEEE Trans. Automat. Control, 43, no. 10:1374-1388, 1998. 22. S. P. Boyd and C. H. Barratt. Linear Controller Design: Limits (9] Performance. Prentice Hall, Englewood Cliffs, New Jersey, 1991. 23. N. O. D. Cuhna and E. Polak. Constrained minimization under vector valued criteria in finite dimensional spaces. J. Math. Annal. and Appl., 19:pp 103-124, 1967. 24. M. Khammash. Solution of the l l mimo control problem without zero interpolation. In Proceedings of the IEEE Conference on Decision and Control. pp: 4040-4045, Kobe, Japan, December 1996. 25. J. C. Doyle. Analysis of feedback systems with structured uncertainty. In IEE Proceedings. Vol. 129-D(6), pp. 242-250, November 1982. 26. M. H. K h a m m a s h and J. B. Pearson. Robust disturbance rejection in g l optimal control systems. Systems and Control letters, 14, no. 2:pp. 93-101, 1990. 27. M. H. K h a m m a s h and J. B. Pearson. Performance robustness of discrete-time systems with structured uncertainty. I E E E Trans. Automat. Control, 36, no. 4:pp. 398-412, 1991. 28. M. H. K h a m m a s h and J. B. Pearson. Analysis and design for robust performance with structured uncertainty. Systems and Control letters, 20, no. 3:pp. 179-187, 1993. 29. M. H. Khammash. Synthesis of globally optimal controllers for robust performance to unstructured uncertainty. IEEE Trans. Automat. Control, 41:189-198, 1996:
Index
(X, ‖·‖), 17  (X, T), 3  (X, d), 14  ⟨·, ·⟩, 35  ‖·‖, 16  ∅, 2
A\B, 2  A×B, 1  B(X, Y), 32  B∆p, 148  B∆LTV, 147  B∆NL, 147  ∆NL, 146  ∆LTV, 146
c0, 38  f⁻¹(B), 6  hK(x*), 57  int(Y), 3  bd(K), 54  N(x), 3
X*, 33  X**, 34  Ȳ, 3  [f, Ω], 56  W(X, X*), 36
PS, 46  Pk, 69  S, 69  ℓp, 38  ℓp^(m×n), 69
λ-transforms, 72  lim inf rn, 23  lim sup rn, 24  ρ, 149  ρ(D), 149
1-block problem, 113  4-block problem, 113  H∞, 83
achievable, 112  adjoint map, 35  affine linear map, 18  approximate problem, 123  axiom of choice, 1, 2  axiom of countability, 5
Banach space, 18  Banach-Alaoglu, 37, 44, 87, 88, 101, 135, 136, 142  base, 4  basic feasible solution, 156  basic solution, 156  basis, 18  basis-index, 159  bilinear form, 35  bounded linear operators, 32  bounded map, 19
canonical map, 35 Cauchy sequence, 15 causality, 69 closed, 3
closed loop map, 76 closure, 3 combination problem, 115 compactness, 11 completeness, 15 composite performance measure, 99 cones, 46 continuity, 6 controllability, 73 convergence, 5 convex combination, 46 convex maps, 47 convex optimization problem, 55 convex sets, 45 convolution maps, 70 coprime, 73 dcf, 74 delay augmentation, 149 denseness, 5 detectability, 73 dimension, 18 directed set, 2 dual, 68 dual spaces, 33 dual variable, 156 Eidelheit separation, 55 epigraph, 56 eventually, 5 FDLTIC, 72 feasible solution, 156 filter, 9 finite dimensional system, 72 finite dimensional vector space, 18 FIR, 84 frequently, 5 Gateaux derivative, 25 Hahn-Banach, 33 half spaces, 51 Hausdorff topology, 5 Heine-Borel theorem, 20-22 Hölder's inequality, 39 homeomorphism, 9 hyperplanes, 49 inductively ordered, 2 initial topology, 8 interior, 3 isometric maps, 15 isomorphism, 9
Kuhn-Tucker-Lagrange duality, 67, 88, 90, 94, 102, 116, 117, 130, 131, 154 lcf, 74 linear independence, 18 linear map, 18 linear variety, 49 local extrema, 23 local maxima, 23 local minimum, 23 Luenberger controller, 79 majorant, 2 maximal element, 2 metric, 14 metric topology, 14 MIMO systems, 139 minimal realization, 73 minimum distance from a convex set, 59 Minkowski's function, 52 Minkowski's inequality, 39 minorant, 2 mixed problem, 123 neighbourhood, 3 neighbourhood base, 4 neighbourhood filter, 3 nets, 5 non-basic variable, 156 non-square, 113 norm topology, 17 normal rank, 74 normed vector space, 16 observability, 73 open, 3 open cover, 11 optimal basis, 158 optimal-basis-index, 159 order, 2 Pareto optimality, 100 poles, 74 positive cones, 46 positively homogeneous, 27 preorder, 2 primal, 68 primal variable, 156 product normed spaces, 17 product set, 13 product topology, 13 projection, 13
quadratic programming, 154 range, 7 rank interpolation conditions, 113 rcf, 74 real valued, 27 reflexive, 35 relative topology, 3 robust performance, 147 robust stability, 146 second dual space, 34 semicontinuity, 23 sensitivity, 68, 90 separability, 5 separation of a point and a convex set, 54 separation of disjoint convex sets, 55 sequence, 5 shift map, 69 signal-space, 69 SISO ℓ1/H2 problem, 83 Smith-McMillan form, 74 square plant, 113 stability, 70 stability of closed loop maps, 76 stabilizability, 73 stabilizing controller, 76 state space, 73 stronger topology, 4 strongest topology, 4 subadditive function, 27 subbase, 4
sublinear functions, 27 subnets, 10 subspace, 16 support-functional, 57 system, 70 time invariance, 70 topology, 2 totally ordered, 2 transitive, 2 truncation operator, 69 Tychonoff's theorem, 13, 20 ultrafilter, 9 unimodular matrices, 73 unit in ℓ1, 75 universal nets, 10 vector space, 16 weak topology, 36 weak-star topology, 36 weaker topology, 4 weakest topology, 4 well ordered, 2 well-posed, 76 Youla parameter, 81 Youla parametrization, 81 zero interpolation conditions, 113 zeros, 74 Zorn, 2, 10, 29, 30