THEORY OF MULTIOBJECTIVE OPTIMIZATION
This is Volume 176 in MATHEMATICS IN SCIENCE AND ENGINEERING A Series of Monographs and Textbooks Edited by RICHARD BELLMAN, University of Southern California The complete listing of books in this series is available from the Publisher upon request.
YOSHIKAZU SAWARAGI Faculty of Science The Kyoto Sangyo University Kyoto, Japan
HIROTAKA NAKAYAMA Department of Applied Mathematics Konan University Kobe, Japan
TETSUZO TANINO Department of Mechanical Engineering II Tohoku University Sendai, Japan
Academic Press, Inc. (Harcourt Brace Jovanovich, Publishers)
Orlando San Diego New York London Toronto Montreal Sydney Tokyo
COPYRIGHT © 1985, BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER. ACADEMIC PRESS, INC. Orlando, Florida 32887
United Kingdom Edition published by ACADEMIC PRESS INC. (LONDON) LTD. 24-28 Oval Road, London NW1 7DX
Library of Congress Cataloging in Publication Data Sawaragi, Yoshikazu, Date. Theory of multiobjective optimization. Includes index. 1. Mathematical optimization. I. Nakayama, Hirotaka, Date. II. Tanino, Tetsuzo. III. Title. QA402.5.S28 1985 519 84-14501 ISBN 0-12-620370-9 (alk. paper)
PRINTED IN THE UNITED STATES OF AMERICA
To our wives Atsumi, Teruyo, and Yuko
CONTENTS

Preface ix
Notation and Symbols xi

1 INTRODUCTION

2 MATHEMATICAL PRELIMINARIES
2.1 Elements of Convex Analysis 6
2.2 Point-to-Set Maps 21
2.3 Preference Orders and Domination Structures 25

3 SOLUTION CONCEPTS AND SOME PROPERTIES OF SOLUTIONS
3.1 Solution Concepts 32
3.2 Existence and External Stability of Efficient Solutions 47
3.3 Connectedness of Efficient Sets 66
3.4 Characterization of Efficiency and Proper Efficiency 70
3.5 Kuhn-Tucker Conditions for Multiobjective Programming 89

4 STABILITY
4.1 Families of Multiobjective Optimization Problems 92
4.2 Stability for Perturbation of the Sets of Feasible Solutions 94
4.3 Stability for Perturbation of the Domination Structure 107
4.4 Stability in the Decision Space 119
4.5 Stability of Properly Efficient Solutions 122

5 LAGRANGE DUALITY
5.1 Linear Cases 127
5.2 Duality in Nonlinear Multiobjective Optimization 137
5.3 Geometric Consideration of Duality 148

6 CONJUGATE DUALITY
6.1 Conjugate Duality Based on Efficiency 167
6.2 Conjugate Duality Based on Weak Efficiency 190
6.3 Conjugate Duality Based on Strong Supremum and Infimum 201

7 METHODOLOGY
7.1 Utility and Value Theory 210
7.2 Stochastic Dominance 244
7.3 Multiobjective Programming Methods 252

References 281
Index 293
PREFACE
This book presents in a comprehensive manner some salient theoretical aspects of multiobjective optimization. The authors had been involved in the special research project Environmental Science, sponsored by the Education Ministry of Japan, for more than a decade since 1970. Through these research activities, we became aware that the important thing is not merely to eliminate pollutants after they are discharged, but to create a good environment from a holistic viewpoint. What, then, is a good environment? There are many factors: physical, chemical, biological, economic, social, and so on. In addition, to make the matter more difficult, there appear to be many conflicting values. Systems-scientific methodology seems effective for treating such a multiplicity of values. Its main concern is how to trade off these values. One of the major approaches is multiobjective optimization. Another is multiattribute utility analysis. The importance of these research themes has been widely recognized in theory and practice. Above all, the workshops at South Carolina in 1972 and at IIASA in 1975 provided remarkable incentives to this field of research. Since then, much active research has been observed all over the world. Although a number of books in this field have been published in recent years, they focus primarily on methodology. In spite of their importance, however, theoretical aspects of multiobjective optimization have never been dealt with in a unified way.

In Chapter 1 (Introduction), fundamental notions in multiobjective decision making and its historical background are briefly explained. Throughout this chapter, readers can grasp the purpose and scope of this volume. Chapters 2-6 are the core of the book and are concerned with the mathematical theories in multiobjective optimization of existence, necessary and sufficient conditions for efficient solutions, characterization of efficient solutions, stability, and duality. Some of them are still developing, but we have tried to describe them in as unified a way as possible. Chapter 7 treats methodology, including utility/value theory, stochastic dominance, and multiobjective programming methods. We emphasized critical points of these methods rather than giving a mere introduction. We hope that this approach will have a positive impact on the future development of these areas.

The intended readers of this book are senior undergraduate students, graduate students, and specialists of decision making theory and mathematical programming, whose research fields are applied mathematics, electrical engineering, mechanical engineering, control engineering, economics, management sciences, operations research, and systems science. The book is self-contained so that it might be used either for reference and self-study or as a classroom text; only an elementary knowledge of linear algebra and mathematical programming is assumed. Finally, we would like to note that we were motivated to write this book by a recommendation of the late Richard Bellman.
NOTATION AND SYMBOLS
x := y             x is defined as y
x ∈ S              x is a member of the set S
x ∉ S              x is not a member of the set S
S^c                complement of the set S
cl S               closure of the set S
int S              interior of the set S
∂S                 boundary of the set S
S ⊂ T, T ⊃ S       S is a subset of the set T
S ∪ T              union of two sets S and T
S ∩ T              intersection of two sets S and T
S \ T              difference between S and T, i.e., S ∩ T^c
S + T              sum of two sets S and T, i.e., S + T := {s + t : s ∈ S and t ∈ T}
S × T              Cartesian product of the sets S and T, i.e., S × T := {(s, t) : s ∈ S and t ∈ T}
R^n                n-dimensional Euclidean space. A vector x in R^n is written x = (x_1, x_2, …, x_n), but when used in matrix calculations it is represented as a column vector. The corresponding row vector is x^T = (x_1, x_2, …, x_n). When A = (a_ij) is an m × n matrix with entry a_ij in its ith row and jth column, the product Ax is the vector y ∈ R^m with components y_i = Σ_{j=1}^n a_ij x_j, i = 1, …, m.
R^p_+              nonnegative orthant of R^p, i.e., R^p_+ := {y ∈ R^p : y_i ≥ 0 for i = 1, …, p}
int R^p_+          positive orthant of R^p, i.e., {y ∈ R^p : y_i > 0 for i = 1, …, p}
S^p                S^p := {y ∈ R^p_+ : Σ_{i=1}^p y_i = 1}

For x, y ∈ R^p and a pointed cone K of R^p with int K ≠ ∅,
x ≧_K y    is defined as    x − y ∈ K,
x ≥_K y    is defined as    x − y ∈ K \ {0},
x >_K y    is defined as    x − y ∈ int K.
In particular, in case of K = R^p_+, the subscript K is omitted, namely,
x ≧ y : x_i ≥ y_i for all i = 1, …, p;
x ≥ y : x_i ≥ y_i for all i = 1, …, p, and x ≠ y;
x > y : x_i > y_i for all i = 1, …, p.

x ≻ y              x is preferred to y
x ∼ y              x is indifferent to y
⟨x, y⟩             inner product of the vectors x and y in the Euclidean space
‖x‖                Euclidean norm of the vector x in the Euclidean space
co S               convex hull of the set S
T(S, y)            tangent cone of the set S at y
P(S)               projecting cone of S
0⁺Y                recession cone of the set Y
Y⁺                 extended recession cone of the set Y
ℰ(Y, D)            efficient set of the set Y with respect to the domination structure (cone) D
𝒫(Y, D)            properly efficient set (in the sense of Benson) of the set Y with respect to the domination cone D
D-epi W            D-epigraph of the (point-to-set) map W
δ(·|S)             indicator function of the set S
f′(x; d)           one-sided directional derivative of the function f at x
∇f(x)              gradient of the function f at x, i.e., if f: R^n → R, ∇f(x) := (∂f(x)/∂x_1, ∂f(x)/∂x_2, …, ∂f(x)/∂x_n)
f*                 conjugate map (or function) of the function f
F*                 conjugate map of the point-to-set map F
F** (f**)          biconjugate map of F (of f)
∂f(x)              subdifferential of the (vector-valued) function f at x
∂F(x; y)           subdifferential of the point-to-set map F at (x; y)
Max_D Y (Min_D Y)  set of the D-maximal (D-minimal) points of the set Y, i.e., Max_D Y := ℰ(Y, −D) (Min_D Y := ℰ(Y, D)). In case of D = R^p_+, in particular, Max Y := ℰ(Y, −R^p_+) (Min Y := ℰ(Y, R^p_+))
w-Max_D Y (w-Min_D Y)  set of the weak D-maximal (D-minimal) points of the set Y. In case of D = R^p_+, in particular, the subscript D is omitted.
w-Sup Y (w-Inf Y)  set of the weak supremal (infimal) points of the set Y
w-F*               weak conjugate map of the point-to-set map F
w-F**              weak biconjugate map of the point-to-set map F
w-∂F(x)            weak subdifferential of the point-to-set map F at x
w-L                weak Lagrangian
max Y (min Y)      strong maximum (minimum) of the set Y
sup Y (inf Y)      strong supremum (infimum) of the set Y
f*                 strong conjugate of the vector-valued function f
∂f(x)              strong subdifferential of the vector-valued function f at x
1 INTRODUCTION
Every day we encounter various kinds of decision making problems as managers, designers, administrative officers, mere individuals, and so on. In these problems, the final decision is usually made through several steps, even though they sometimes might not be perceived explicitly. Figure 1.1 shows a conceptual model of the decision making process. It implies that the final decision is made through three major models: the structure model, the impact model, and the evaluation model. By structure modeling, we mean constructing a model in order to know the structure of the problem: what the problem is, which factors comprise the problem, how they interrelate, and so on. Through this process, the objective of the problem and alternatives to perform it are specified. Hereafter, we shall use the notation O for the objective and X for the set of alternatives, which is supposed to be a subset of an n-dimensional vector space. If we positively know the consequence caused by an alternative, the decision making is said to be under certainty, whereas if we cannot know a sure result because of some uncertain factor(s), the decision making is said to be under uncertainty. Furthermore, if we objectively or subjectively know the probability distribution of the possible consequences caused by an alternative, the decision making is said to be under risk. Even though the final objective might be a single entity, we encounter, in general, many subobjectives O_i on the way to the final objective. In this book, we shall consider decision making problems with multiple objectives. Interpretive structural modeling (ISM) (Warfield [W5]) can be applied effectively in order to obtain a hierarchical structure of the objectives. In order to solve our decision making problem by some systems-analytical methods, we usually require that degrees of objectives be represented in numerical terms, which may be of multiple kinds even for one objective.
Fig. 1.1. Conceptual model of the decision making process.
In order to exclude subjective value judgment at this stage, we restrict these numerical terms to physical measures (for example, money, weight, length, and time). As such a performance index, or criterion, for the objective O_i, an objective function f_i: X → R^1 is introduced, where R^1 denotes one-dimensional Euclidean space. The value f_i(x) indicates how much impact is given on the objective O_i by performing an alternative x. Impact modeling is performed to identify these objective functions from various viewpoints such as physical, chemical, biological, social, economic, and so on. For convenience of mathematical treatment, we assume in this book that a smaller value for each objective function is preferred to a larger one. Now we can formulate our decision making problems as a multiobjective optimization problem:

(P)   Minimize  f(x) = (f_1(x), …, f_p(x))   over  x ∈ X.

This kind of problem is also called a vector optimization problem. In some cases, some of the objective functions are required to be maintained under given levels prior to minimizing other objective functions. Denoting these objective functions by g_j(x), we require that

g_j(x) ≤ 0,   j = 1, …, m,

which, for convenience, is also supposed to represent some other technical constraints. Such a function g_j(x) is generally called a constraint function in this book. According to the situation, we will consider either the problem (P) itself or (P) accompanied by the constraints g_j(x) ≤ 0 (j = 1, …, m). Of course, an equality constraint h_k(x) = 0 can be embedded within the two inequalities h_k(x) ≤ 0 and −h_k(x) ≤ 0, and, hence, it does not appear explicitly in this book. Unlike traditional mathematical programming with a single objective function, an optimal solution in the sense of one that minimizes all the objective functions simultaneously does not necessarily exist in multiobjective optimization problems, and, hence, we are troubled with conflicts among objectives in decision making problems with multiple objectives. The final decision should be made by taking the total balance of objectives into account. Therefore, a new problem of value judgment called value trade-off arises. Evaluation modeling treats this problem that is peculiar to decision
making with multiple objectives. Here we assume a decision maker who is responsible for the final decision. In some cases, there may be many decision makers, in which cases the decision making problems are called group decision problems. We will consider cases with a single decision maker in this book. The decision maker's value is usually represented by saying whether or not an alternative x is preferred to another alternative x′, or equivalently whether or not f(x) is preferred to f(x′). In other words, the decision maker's value is represented by some binary relation defined over X or f(X). Since such a binary relation representing the decision maker's preference usually becomes an order, it is called a preference order. In this book, the decision maker's preference order is supposed to be defined on the so-called criteria space Y, which includes the set f(X). Several kinds of preference orders are possible; sometimes, the decision maker cannot judge whether or not f(x) is preferred to f(x′). Roughly speaking, an order that admits incomparability for some pair of objects is called a partial order, whereas an order requiring comparability for every pair of objects is called a weak order (or total order). In practice, we often observe a partial order for the decision maker's preference. Unfortunately, however, an optimal solution in the sense of one that is most preferred with respect to the order does not necessarily exist for partial orders. Instead of strict optimality, we introduce in multiobjective optimization the notion of efficiency. A vector f(x) is said to be efficient if there is no f(x′) (x′ ∈ X) preferred to f(x) with respect to the preference order. The final decision is usually made among the set of efficient solutions. One approach to evaluation modeling is to find a scalar-valued function U(f_1, …, f_p) representing the decision maker's preference, which is called a preference function in this book.
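The efficiency notion just introduced can be made concrete for a finite set of objective vectors under the componentwise order, where smaller values are preferred. The following is a minimal sketch, not from the book; the sample vectors are arbitrary:

```python
# Sketch: efficient (nondominated) vectors of a finite set under the
# componentwise order, assuming smaller objective values are preferred.

def dominates(a, b):
    """True if a is preferred to b: a <= b componentwise and a != b."""
    return all(ai <= bi for ai, bi in zip(a, b)) and a != b

def efficient(vectors):
    """Vectors not dominated by any other vector in the set."""
    return [y for y in vectors if not any(dominates(z, y) for z in vectors)]

Y = [(1, 5), (2, 2), (4, 1), (3, 3), (4, 4)]
print(efficient(Y))   # [(1, 5), (2, 2), (4, 1)]
```

Each surviving vector represents a trade-off: within Y, improving one of its components necessarily worsens another.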
A preference function in decision making under risk is called a utility function, whereas one in decision making under certainty is called a value function. The theory regarding existence, uniqueness, and practical representation of such a utility or value function is called utility and value theory. Once we obtain such a preference function, our problem reduces to traditional mathematical programming:

Maximize  U(f_1(x), …, f_p(x))   over  x ∈ X.
Another popular approach is interactive programming, which performs the solution search and the evaluation modeling jointly. In this approach, the solution is searched for without identifying the preference function, by iteratively eliciting some local information on the decision maker's preference.

Kuhn and Tucker [K10] first gave some interesting results concerning multiobjective optimization in 1951. Since then, research in this field has made remarkable progress both theoretically and practically. Some of the earliest attempts to obtain conditions for efficiency were carried out by Kuhn and Tucker [K10], and Arrow et al. [A5]. Their research has been inherited by Da Cunha and Polak [D1], Neustadt [N14], Ritter [R4-R6], Smale [S10, S11], Aubin [A7], and others. After the 1970s, practical methodology such as utility and value analysis and interactive programming methods has been actively researched as a tool for supporting decision making, and many books and conference proceedings on this topic have been published. (See, for example, Lee [L1], Cochrane and Zeleny [C12], Keeney and Raiffa [K6], Leitmann and Marzollo [L3], Leitmann [L2], Wilhelm [W15], Zeleny [Z4-Z6], Thiriez and Zionts [T14], Zionts [Z7], Starr and Zeleny [S13], Nijkamp and Delft [N18], Cohon [C13], Hwang and Masud [H17], Salukvadze [S3], Fandel and Gal [F2], Rietveld [R3], Hwang and Yoon [H18], Morse [M5], Goicoechea et al. [G8], Hansen [H3], Chankong and Haimes [C6], and Grauer and Wierzbicki [G10].) On the other hand, duality and stability, which play an important role in traditional mathematical programming, have been extended to multiobjective optimization since the late 1970s. Isermann [I5-I7] developed multiobjective duality in the linear case, while results for nonlinear cases have been given by Schonfeld [S6], Rosinger [R10], Guglielmo [G12], Tanino and Sawaragi [T9, T11], Mazzoleni [M3], Bitran [B13], Brumelle [B21], Corley [C16], Jahn [J1], Kawasaki [K2, K3], Luc [L10], Nakayama [N5], and others. Stability for multiobjective optimization has been developed by Naccache [N2] and Tanino and Sawaragi [T10].

This book is mainly concerned with some of the theoretical aspects of multiobjective optimization; in particular, we focus on existence, necessary/sufficient conditions, stability, Lagrange duality, and conjugate duality for efficient solutions. In addition, some methodology such as utility and value theory and interactive programming methods will also be discussed. Chapter 2 is devoted to some mathematical preliminaries.
The first section gives a brief review of the elements of convex analysis that play an important role not only in traditional mathematical programming but also in multiobjective optimization. The second section describes point-to-set maps, which play a very important role in the theory of multiobjective optimization, since the efficient solutions usually constitute a set. The concepts of continuity and convexity of point-to-set maps are introduced. These concepts are fundamental for existence and necessary/sufficient conditions for efficient solutions. The third section is concerned with a brief explanation of preference orders and domination structures. Chapter 3 begins with the introduction of several possible concepts for solutions in multiobjective optimization. Above all, efficient solutions will be the subject of primary consideration in subsequent theories. Next, some properties of efficient solutions, such as existence, external stability, connectedness, and necessary/sufficient conditions, will be discussed.
Chapter 4 develops the stability theory in multiobjective optimization. The factors that specify multiobjective optimization problems are the objective functions with some constraints and the preference structure of the decision maker. Therefore, the stability of the solution set for perturbations of those factors is considered by using the concept of continuity of the solution map (which is a point-to-set map). A number of illustrative examples will be shown. Chapter 5 will be devoted to the duality theory in multiobjective optimization. Duality is a fruitful result in traditional mathematical programming and is very useful both theoretically and practically. Consequently, it is quite interesting to extend the duality theory to the case of multiobjective optimization. In this chapter, the duality of linear multiobjective optimization will be introduced. Next, a more general duality theory for nonlinear cases will be discussed in parallel with the case of ordinary convex programming. Given a convex multiobjective programming problem, some new concepts such as the primal map, the dual map, and the vector-valued Lagrangian will be defined. The Lagrange multiplier theorem, the saddle-point theorem, and the duality theorem will be obtained via geometric consideration. Chapter 6 will develop the conjugate duality theory in multiobjective optimization. Rockafellar [R8] constructed a general and comprehensive duality theory for an ordinary convex optimization problem by embedding it in a family of perturbed problems and associating a dual problem with it by using conjugate functions. In this chapter, we shall introduce the concept of conjugate maps, which are extensions of conjugate functions, and develop duality for convex multiobjective optimization problems. In addition, recent results on this topic from several approaches will be introduced. Chapter 7 deals with the methodology for practical implementation of multiobjective optimization.
Although various methods for multiobjective optimization have been developed, we shall focus on some of the main approaches (for example, utility and value theory and interactive programming methods). However, our aim is not merely to introduce these methods, but to clarify the noticeable points about them. In the first section, the utility and value theory will be treated. There, we shall introduce several results on existence, uniqueness, and representation of preference functions that represent the decision maker's preference. Several practical implications of these methods, in particular, von Neumann-Morgenstern utility functions, will also be discussed along with clarification of some of their controversial points. In the next section, we shall discuss interactive programming methods, showing some typical methods as examples. Desirable properties of this approach will be clarified throughout the discussion.
2 MATHEMATICAL PRELIMINARIES
This chapter is devoted to mathematical preliminaries. First, some fundamental results in convex analysis, mainly following Rockafellar [R7], are reviewed. Second, the concepts of continuity and convexity of vector-valued point-to-set maps are introduced. Finally, the concepts of preference orders, domination structures, and nondominated solutions are introduced to provide a way of specifying solutions for multiobjective optimization problems.

2.1 Elements of Convex Analysis
As is well known, convexity plays an important role in the theory of optimization with a single objective. A number of books dealing with convex optimization in both finite and infinite dimensional spaces have been published (Rockafellar [R7], Stoer and Witzgall [S15], Holmes [H12], Bazaraa and Shetty [B3], Ekeland and Temam [E1], Ioffe and Tihomirov [I4], Barbu and Precupanu [B1], and Ponstein [P2]). Since they are also fundamental in the theory of multiobjective optimization, some elementary results are summarized below. All spaces considered are finite-dimensional Euclidean spaces. The fundamental text for this section is "Convex Analysis" by Rockafellar [R7], and the reader might refer to it for details.
2.1.1 Convex Sets

In this subsection, we will first look at convex sets, which are fundamental in convex analysis.

Definition 2.1.1 (Convex Set)
A subset X of R^n is said to be convex if
αx¹ + (1 − α)x² ∈ X   for any x¹, x² ∈ X and any α ∈ [0, 1].
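The definition can be exercised numerically: a convex combination of the vertices of a triangle lands back inside the triangle, which barycentric coordinates confirm. A small sketch follows; the triangle, the weights, and the helper function are my own illustration, not from the book:

```python
# Sketch: a convex combination of triangle vertices lies in the triangle,
# verified via barycentric coordinates (all weights >= 0, summing to 1).

def barycentric(p, a, b, c):
    """Barycentric coordinates of p with respect to triangle (a, b, c)."""
    (x, y), (xa, ya), (xb, yb), (xc, yc) = p, a, b, c
    det = (yb - yc) * (xa - xc) + (xc - xb) * (ya - yc)
    l1 = ((yb - yc) * (x - xc) + (xc - xb) * (y - yc)) / det
    l2 = ((yc - ya) * (x - xc) + (xa - xc) * (y - yc)) / det
    return l1, l2, 1.0 - l1 - l2

A, B, C = (0.0, 0.0), (4.0, 0.0), (0.0, 4.0)
# Convex combination with weights 0.2, 0.3, 0.5:
p = (0.2 * A[0] + 0.3 * B[0] + 0.5 * C[0],
     0.2 * A[1] + 0.3 * B[1] + 0.5 * C[1])
w = barycentric(p, A, B, C)
print(w)                          # recovers the weights (0.2, 0.3, 0.5)
print(all(wi >= 0 for wi in w))   # nonnegative, so p lies in the triangle
```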
Proposition 2.1.1
A subset X of R^n is convex if and only if xⁱ ∈ X (i = 1, …, k), αᵢ ≥ 0 (i = 1, …, k), and Σᵢ₌₁ᵏ αᵢ = 1 imply Σᵢ₌₁ᵏ αᵢxⁱ ∈ X; that is, if and only if it contains all convex combinations of its elements.

Definition 2.1.2 (Convex Hull)
The intersection of all the convex sets containing a given subset X of R^n is called the convex hull of X and is denoted by co X.

Proposition 2.1.2
(i) co X is the unique smallest convex set containing X.
(ii) co X = {Σᵢ₌₁ᵏ αᵢxⁱ : αᵢ ≥ 0, Σᵢ₌₁ᵏ αᵢ = 1, xⁱ ∈ X (i = 1, …, k)}; that is, co X consists of all the convex combinations of the elements of X.

Definition 2.1.3 (Cone and Convex Cone)
A subset K of R^n is called a cone if αx ∈ K whenever x ∈ K and α > 0. Moreover, a cone K is said to be a convex cone when it is also convex.

Proposition 2.1.3
A set K in R^n is a convex cone if and only if (i) x ∈ K and α > 0 imply αx ∈ K, and (ii) x¹, x² ∈ K imply x¹ + x² ∈ K.

Definition 2.1.4 (Pointed Cone and Acute Cone)
A cone K ⊂ R^n is said to be pointed if −x ∉ K when x ≠ 0 and x ∈ K. In other words, K is pointed if it does not contain any nontrivial subspaces. It is said to be acute if there is an open halfspace (see Definition 2.1.16)

H⁺ = {x ∈ R^n : ⟨x, x*⟩ > 0},   x* ≠ 0,

such that cl K ⊂ H⁺ ∪ {0}.
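Both definitions are easy to probe numerically for the nonnegative orthant K = R²₊, which is pointed and, with x* = (1, 1), acute. A toy grid-sampling sketch of my own (not from the book):

```python
# Sketch: check pointedness and acuteness of K = R^2_+ on sampled vectors.

def in_K(x):                      # K = nonnegative orthant of R^2
    return x[0] >= 0 and x[1] >= 0

samples = [(i / 4.0, j / 4.0) for i in range(-8, 9) for j in range(-8, 9)]

# Pointed: no nonzero x with both x in K and -x in K.
pointed = not any(in_K(x) and in_K((-x[0], -x[1])) and x != (0.0, 0.0)
                  for x in samples)

# Acute with x* = (1, 1): every nonzero sample in K satisfies <x, x*> > 0,
# i.e. K \ {0} lies in the open halfspace {x : x[0] + x[1] > 0}.
acute = all(x[0] + x[1] > 0
            for x in samples if in_K(x) and x != (0.0, 0.0))

print(pointed, acute)   # True True
```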
Proposition 2.1.4
When K is a convex cone, it is acute if and only if cl K is pointed. Of course, acuteness is a stronger concept than pointedness. If K is closed and convex, these concepts coincide.
Definition 2.1.5 (Polar and Strict Polar)
For a subset X in R^n, its positive polar X° is defined by
X° = {x* ∈ R^n : ⟨x, x*⟩ ≥ 0 for any x ∈ X}.
The strict positive polar Xˢ° of X is defined by
Xˢ° = {x* ∈ R^n : ⟨x, x*⟩ > 0 for any nonzero x ∈ X}.

Proposition 2.1.5
Let X, X₁, and X₂ be sets in R^n. Then
(i) the polar X° is a closed convex cone,
(ii) the strict polar Xˢ° is a convex cone,
(iii) X° = (cl X)°,
(iv) if X is open, Xˢ° ∪ {0} = X°,
(v) X₁ ⊂ X₂ implies X₂° ⊂ X₁° and X₂ˢ° ⊂ X₁ˢ°, and
(vi) if K is a nonempty convex cone, K°° = cl K.
Proposition 2.1.6
Let K₁ and K₂ be cones in R^n. Then
(i) if K₁ and K₂ are nonempty,
(K₁ + K₂)° = K₁° ∩ K₂° = (K₁ ∪ K₂)°,
(ii) (K₁ ∩ K₂)° ⊃ K₁° + K₂° = co(K₁° ∪ K₂°),
(iii) if K₁ and K₂ are closed convex with nonempty intersection,
(K₁ ∩ K₂)° = cl(K₁° + K₂°) = cl co(K₁° ∪ K₂°).
If ri K₁ ∩ ri K₂ ≠ ∅, in addition, then
(K₁ ∩ K₂)° = K₁° + K₂°.
Furthermore, the last equality holds also for convex polyhedral cones (Definition 2.1.7) K₁ and K₂.
Proposition 2.1.7†
Let K be a cone in R^n. Then
(i) int K° ≠ ∅ if and only if K is acute,
(ii) when K is acute, int K° = (cl K)ˢ°.
Proof. (i) Suppose first that int K° ≠ ∅. Let x* ≠ 0 in int K° and H⁺ = {x ∈ R^n : ⟨x, x*⟩ > 0}. We now show that cl K ⊂ H⁺ ∪ {0}. Suppose, to the contrary, that ⟨x, x*⟩ ≤ 0 for some nonzero x ∈ cl K. Since x* ∈ int K°, x* − αx ∈ K° for sufficiently small α > 0. Since K° = (cl K)° [Proposition 2.1.5 (iii)],
⟨x, x* − αx⟩ ≥ 0.
However, this implies that ⟨x, x*⟩ ≥ α⟨x, x⟩ > 0, which contradicts the hypothesis ⟨x, x*⟩ ≤ 0. Hence, cl K ⊂ H⁺ ∪ {0}. Conversely, suppose that K is acute, i.e., cl K ⊂ H⁺ ∪ {0} = {x ∈ R^n : ⟨x, x*⟩ > 0} ∪ {0} for some x* ≠ 0 ∈ R^n. We now prove that x* ∈ int K°. If we suppose the contrary, there exist sequences {x*ᵏ} ⊂ R^n and {xᵏ} ⊂ K such that
x*ᵏ → x*   and   ⟨xᵏ, x*ᵏ⟩ < 0   for all k.
We may assume without loss of generality that ‖xᵏ‖ = 1 for all k and so xᵏ → x for some x ∈ cl K with ‖x‖ = 1. Clearly then ⟨xᵏ, x*ᵏ⟩ → ⟨x, x*⟩ ≤ 0. However, this contradicts the fact that cl K ⊂ H⁺ ∪ {0}.
(ii) The inclusive relation int K° ⊂ (cl K)ˢ° holds whether K is acute or not. In fact, let x* ∈ int K°, and suppose that x* ∉ (cl K)ˢ°. Then there exists nonzero x′ ∈ cl K such that ⟨x′, x*⟩ ≤ 0. Since x* − αx′ ∈ K° = (cl K)° for sufficiently small α > 0, then
⟨x′, x*⟩ = ⟨x′, x* − αx′⟩ + α⟨x′, x′⟩ > 0,
which is a contradiction. To prove the converse, let x* ∉ int K°. Then there exist sequences {x*ᵏ} ⊂ R^n and {xᵏ} ⊂ K such that
x*ᵏ → x*   and   ⟨xᵏ, x*ᵏ⟩ < 0   for all k.
We may assume without loss of generality that ‖xᵏ‖ = 1 and so xᵏ → x for some x ∈ cl K with ‖x‖ = 1. Then clearly ⟨xᵏ, x*ᵏ⟩ → ⟨x, x*⟩ ≤ 0. This implies that x* ∉ (cl K)ˢ°, which is what we wished to prove.

† Yu [Y1].
Proposition 2.1.8†
Let K₁ be a nonempty, closed, pointed convex cone and K₂ be a nonempty, closed, convex cone. Then
(−K₁) ∩ K₂ = {0}
if and only if K₁ˢ° ∩ K₂° ≠ ∅.
Proof. First suppose that there exists x* ∈ K₁ˢ° ∩ K₂°. Then, for any nonzero x ∈ (−K₁) ∩ K₂,
⟨x, x*⟩ < 0   and   ⟨x, x*⟩ ≥ 0,
which is impossible. Therefore (−K₁) ∩ K₂ = {0}. (Note that 0 is always contained in (−K₁) ∩ K₂ because both K₁ and K₂ are nonempty closed cones.) Conversely, suppose that K₁ˢ° ∩ K₂° = ∅. Then, by the separation theorem of convex sets (Theorem 2.1.1), there exist x ≠ 0 ∈ R^n and β ∈ R such that
⟨x, x*⟩ ≤ β   for any   x* ∈ K₁ˢ°,
⟨x, x*⟩ ≥ β   for any   x* ∈ K₂°.
Since both K₁ˢ° and K₂° are cones, β should be equal to 0. Hence,
x ∈ [−(K₁ˢ°)°] ∩ (K₂°)°.
From Propositions 2.1.5 and 2.1.7,
(K₁ˢ°)° = (int K₁°)° = (K₁°)° = cl K₁ = K₁,
and from Proposition 2.1.5 (vi), (K₂°)° = cl K₂ = K₂. Therefore, x ∈ (−K₁) ∩ K₂, i.e., (−K₁) ∩ K₂ ≠ {0}.
Definition 2.1.6 (Recession Cone)
Let X be a convex set in R^n. Its recession cone 0⁺X is defined by
0⁺X = {x′ ∈ R^n : x + αx′ ∈ X for any x ∈ X and α > 0}.
Of course, 0⁺X is a convex cone containing the origin.

† Borwein [B17] and Bitran and Magnanti [B14].
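As a concrete instance of Definition 2.1.6, the closed convex set X = {(u, v) : v ≥ u²} has recession cone {(0, b) : b ≥ 0}. The sketch below gathers numerical evidence for the defining condition x + αx′ ∈ X over sampled points and step sizes; the example set is mine, not the book's:

```python
# Sketch: recession directions of X = {(u, v) : v >= u^2}, a closed convex
# set whose recession cone is {(0, b) : b >= 0}.

def in_X(p):
    return p[1] >= p[0] ** 2

def is_recession_direction(d, points, alphas):
    """Check x + a*d in X for sampled x in X and a > 0 (necessary evidence)."""
    return all(in_X((x[0] + a * d[0], x[1] + a * d[1]))
               for x in points for a in alphas)

pts = [(u, u * u + s) for u in (-2.0, 0.0, 1.5) for s in (0.0, 1.0)]
alphas = (0.5, 1.0, 10.0, 100.0)

print(is_recession_direction((0.0, 1.0), pts, alphas))  # True: straight up
print(is_recession_direction((1.0, 1.0), pts, alphas))  # False: leaves X
```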
Proposition 2.1.9
If {Xᵢ : i ∈ I} is an arbitrary collection of closed convex sets in R^n whose intersection is not empty, then
0⁺(⋂_{i∈I} Xᵢ) = ⋂_{i∈I} 0⁺Xᵢ.
Proposition 2.1.10
A nonempty closed convex set X in R^n is bounded if and only if its recession cone 0⁺X = {0}.

Proposition 2.1.11
Let X₁ and X₂ be nonempty closed convex sets in R^n. If 0⁺X₁ ∩ (−0⁺X₂) = {0}, then the set X₁ + X₂ is closed, and
0⁺(X₁ + X₂) = 0⁺X₁ + 0⁺X₂.

Definition 2.1.7 (Polyhedral Convex Set and Polyhedral Convex Cone)
A set X in R^n is said to be a polyhedral convex set if it can be expressed as the intersection of some finite collection of closed halfspaces (see Definition 2.1.16), i.e., if
X = {x : ⟨x, bⁱ⟩ ≤ βᵢ, i = 1, …, m}
for some bⁱ ∈ R^n and βᵢ ∈ R (i = 1, …, m).
Proposition 2.1.25
A closed convex function f is the pointwise supremum of the collection of all affine functions h such that h ≤ f.

Definition 2.1.20 (Conjugate Function and Biconjugate Function)
Let f be a convex function from R^n to [−∞, +∞]. The function f* on R^n defined by
f*(x*) = sup{⟨x, x*⟩ − f(x) : x ∈ R^n}
is called the conjugate function of f. The conjugate of f*, i.e., the function f** on R^n defined by
f**(x) = sup{⟨x, x*⟩ − f*(x*) : x* ∈ R^n},
is called the biconjugate function of f.

Proposition 2.1.26
Let f be a convex function. The conjugate function f* is a closed proper convex function if and only if f is proper. Moreover, (cl f)* = f* and f** = cl f. Thus, the conjugacy operation f ↦ f* induces a symmetric one-to-one correspondence in the class of all closed proper convex functions on R^n.
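The conjugate can be explored numerically. For f(x) = x²/2 the conjugate is f*(x*) = (x*)²/2, so f is self-conjugate; a grid-based supremum reproduces this. The discretization is my own sketch, not from the book:

```python
# Sketch: numeric conjugate f*(y) = sup_x { x*y - f(x) } over a grid,
# for f(x) = x^2 / 2, whose exact conjugate is f*(y) = y^2 / 2.

def conjugate(f, grid):
    return lambda y: max(x * y - f(x) for x in grid)

f = lambda x: 0.5 * x * x
grid = [i / 100.0 for i in range(-500, 501)]   # x in [-5, 5]
f_star = conjugate(f, grid)

print(f_star(1.0))   # 0.5  (supremum attained at x = 1.0, on the grid)
print(f_star(2.0))   # 2.0  (supremum attained at x = 2.0)
```

Note that the grid values also satisfy Fenchel's inequality f(x) + f*(y) ≥ xy, stated just below as Proposition 2.1.27.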
Proposition 2.1.27 (Fenchel's Inequality)
Let f be a proper convex function. Then
f(x) + f*(x*) ≥ ⟨x, x*⟩   for any x and x*.

Definition 2.1.21 (Support Function)
Let X be a convex set in R^n. The conjugate function of the indicator function δ(·|X) of X, which is given by
δ*(x*|X) = sup{⟨x, x*⟩ : x ∈ X},
is called the support function of X.
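As a worked instance of this definition, the support function of the box X = [−1, 1]² is δ*(y|X) = |y₁| + |y₂|, and a numeric supremum over sampled points of X agrees. The box example is mine, not from the book:

```python
# Sketch: support function of the box X = [-1, 1]^2, computed as a numeric
# sup over sampled points of X; exactly, it equals |y1| + |y2|.

def support(points, y):
    return max(p[0] * y[0] + p[1] * y[1] for p in points)

box = [(i / 10.0, j / 10.0) for i in range(-10, 11) for j in range(-10, 11)]

print(support(box, (3.0, -2.0)))   # 5.0 = |3| + |-2|, attained at (1, -1)
```

Doubling the argument doubles the value, consistent with the positive homogeneity asserted in Proposition 2.1.28.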
Proposition 2.1.28
If X is a nonempty, closed convex set in R^n, then δ*(·|X) is a closed, proper convex function that is positively homogeneous. Conversely, if γ is a positively homogeneous, closed, proper convex function, it defines a nonempty closed convex set X′ by
X′ = {x′ ∈ R^n : ⟨x, x′⟩ ≤ γ(x) for all x ∈ R^n}.

2.1.5 Subgradients of Convex Functions
Differentiability facilitates the analysis of optimization problems. For convex functions, even when they are not differentiable, we may consider subgradients in place of the nonexistent gradients.

Definition 2.1.22 (One-Sided Directional Derivative)

Let f be a function from R^n to [−∞, +∞], and let x be a point where f is finite. The one-sided directional derivative of f at x with respect to a vector d is defined to be the limit

f′(x; d) = lim_{t↓0} [f(x + td) − f(x)]/t,

if it exists.

Remark 2.1.2
If f is actually differentiable at x, then

f′(x; d) = <∇f(x), d> for any d,

where ∇f(x) is the gradient of f at x.

Definition 2.1.23
(Subgradient)

Let f be a convex function from R^n to [−∞, +∞]. A vector x* ∈ R^n is said to be a subgradient of f at x if

f(x′) ≧ f(x) + <x*, x′ − x> for any x′ ∈ R^n.

The set of all subgradients of f at x is called the subdifferential of f at x and is denoted by ∂f(x). If ∂f(x) is not empty, f is said to be subdifferentiable at x.

Remark 2.1.3
When f is finite at x, we have the following geometric interpretation of the subgradient: x* is a subgradient of f at x if and only if (x*, −1) ∈ R^{n+1} is a normal vector to a supporting hyperplane of epi f at (x, f(x)).
Proposition 2.1.29

Let f be a convex function from R^n to [−∞, +∞]. Then

f(x̄) = min{f(x) : x ∈ R^n}

if and only if 0 ∈ ∂f(x̄).
Proposition 2.1.30

Let f be a convex function and suppose that f(x) is finite. Then x* is a subgradient of f at x if and only if

f′(x; d) ≧ <x*, d> for any d.
Proposition 2.1.31
Iffis a proper convex function, then iJf(x) is nonempty for x and f'(x; d) = sup{<x*, d) : x*
E
E
ri(domf),
iJf(x)} = c5*(d I iJf(x)).
Moreover, iJf(x) is a nonempty bounded set if and only if x which case f'(x ; d) is finite for every d.
E
int(domf), in
Corollary 2.1.2

If f is a finite convex function on R^n, then at each point x the subdifferential ∂f(x) is a nonempty, closed, bounded convex set and

f′(x; d) = max{<x*, d> : x* ∈ ∂f(x)}.
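A one-dimensional sketch (ours, not from the text): f(x) = |x| is a finite convex function on R, its subdifferential at 0 is ∂f(0) = [−1, 1], and f′(0; d) = |d| is the maximum in Corollary 2.1.2. The check below tests the subgradient inequality of Definition 2.1.23 on a grid of trial points.

```python
def is_subgradient(f, x, xstar, test_points, tol=1e-12):
    # subgradient inequality of Definition 2.1.23: f(x') >= f(x) + x* (x' - x)
    return all(f(xp) >= f(x) + xstar * (xp - x) - tol for xp in test_points)

f = abs
pts = [i / 10.0 for i in range(-50, 51)]

# every x* in [-1, 1] satisfies the inequality at x = 0 ...
assert all(is_subgradient(f, 0.0, s / 10.0, pts) for s in range(-10, 11))
# ... while x* = 1.2 does not (the inequality fails at any x' > 0)
assert not is_subgradient(f, 0.0, 1.2, pts)
# 0 is a subgradient at 0, consistent with Proposition 2.1.29: 0 minimizes |x|
assert is_subgradient(f, 0.0, 0.0, pts) and min(map(f, pts)) == f(0.0)
# f'(0; d) = max{x* d : x* in [-1, 1]} = |d|
d = 2.0
assert max(s / 10.0 * d for s in range(-10, 11)) == abs(d)
```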
Proposition 2.1.32

Let f be a convex function and x be a point at which f is finite. If f is differentiable at x, then ∇f(x) is the unique subgradient of f at x, so that in particular

f(x′) ≧ f(x) + <∇f(x), x′ − x> for any x′.

For a sum f1 + f2 + ... + fm of proper convex functions,

∂(f1 + f2 + ... + fm)(x) ⊃ ∂f1(x) + ∂f2(x) + ... + ∂fm(x) for any x.

If the convex sets ri(dom f_i) (i = 1, ..., m) have a point in common, then equality actually holds for any x.
2.2 Point-To-Set Maps
This section deals with the concept of point-to-set maps and their properties of continuity and convexity.

2.2.1 Point-To-Set Maps and Optimization

A point-to-set map F from a set X into a set Y is a map that associates a subset of Y with each point of X. Equivalently, F can be viewed as a function from the set X into the power set 2^Y. Some authors use other names, such as correspondence and multifunction, instead of point-to-set map. In addition to their intrinsic mathematical interest, the study of such maps has been motivated by numerous applications in different fields. Readers interested in the history and the state of the art of the study of point-to-set maps may refer to Huard [H15]. The use of point-to-set maps in the theory of optimization can be seen in several areas. The first is the application of fixed point theorems in the fields of economics and game theory. The most prominent result is related to the existence of an equilibrium or a saddle point. (See, for example, Debreu [D3].) The second application is concerned with the representation of iterative solving methods, or algorithms, for mathematical programming problems. (See, for example, Zangwill [Z2].) The third application, which is closely related to the results in this book, is the study of the stability of the optimal value of a program (or of the set of optimal solutions) when the problem data depend on a parameter (Berge [B11], Dantzig et al. [D2], Evans and Gould [E2], Hogan [H11], and Fiacco [F4]). For other applications, see Huard [H14]. In a multiobjective optimization problem, it is rather difficult to obtain a unique optimal solution. Solving the problem often leads to a solution set.
Thus, if the problem has a parameter, the solution set defines a point-to-set map from the parameter space into the objective (or decision) space. Some point-to-set maps of this type will be investigated in this book.

2.2.2 Continuity of Point-To-Set Maps
A point-to-set map from a set X into a set Y is a function from X into the power set 2^Y. We can consider the concept of continuity if a certain topology (or distance) is defined on the power set 2^Y. Moreover, noting that the inclusion relation between subsets of Y induces a partial order on 2^Y, we may also consider concepts corresponding to the upper and lower semicontinuity of real-valued functions. Thus, in the theory of point-to-set maps, two kinds of continuity have been studied. Though they have been described and analyzed in a number of different settings, we follow the definitions of Hogan [H11] in this book. Several similar definitions of continuity, and the relationships between them, are summed up in Delahaye and Denel [D4]. In the following, the sets X and Y are assumed to be subsets of finite-dimensional Euclidean spaces for simplicity.

Definition 2.2.1 (Lower Semicontinuity, Upper Semicontinuity, and Continuity)

If F is a point-to-set map from a set X into a set Y, then F is said to be

(1) lower semicontinuous (l.s.c.) at a point x ∈ X if {x^k} ⊂ X, x^k → x, and y ∈ F(x) all imply the existence of an integer m and a sequence {y^k} ⊂ Y such that y^k ∈ F(x^k) for k ≧ m and y^k → y;
(2) upper semicontinuous (u.s.c.) at a point x ∈ X if {x^k} ⊂ X, x^k → x, y^k ∈ F(x^k), and y^k → y all imply that y ∈ F(x);
(3) continuous at a point x ∈ X if it is both l.s.c. and u.s.c. at x; and
(4) l.s.c. (resp. u.s.c., continuous) on X′ ⊂ X if it is l.s.c. (resp. u.s.c., continuous) at every x ∈ X′.

Definition 2.2.2
(Uniform Compactness)

A point-to-set map F from a set X to a set Y is said to be uniformly compact near a point x ∈ X if there is a neighborhood N of x such that the closure of the set ∪_{x′∈N} F(x′) is compact.

Figure 2.1a shows a map that is lower semicontinuous but not upper semicontinuous at x. On the other hand, part b of the figure shows a map that is upper semicontinuous but not lower semicontinuous at x. We frequently encounter the case in which the feasible set is described by inequalities in a mathematical programming problem. Some properties of maps determined by inequalities have been investigated.
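To make Definition 2.2.1 concrete, here is a small sketch of ours (not from the book): the map with F(0) = [0, 1] and F(x) = {0} otherwise is upper but not lower semicontinuous at x = 0, much like the map of Fig. 2.1b. Sets are discretized so the two conditions can be checked mechanically.

```python
def F(x):
    # the image "inflates" at x = 0: F(0) = {0, 0.1, ..., 1}, F(x) = {0} otherwise
    return [i / 10.0 for i in range(11)] if x == 0.0 else [0.0]

xs = [1.0 / k for k in range(1, 200)]   # a sequence x^k -> 0
# u.s.c. at 0: any convergent y^k with y^k in F(x^k) = {0} has limit 0, and 0 in F(0)
assert 0.0 in F(0.0)
# l.s.c. fails at 0: y = 1 belongs to F(0), but every y^k in F(x^k) stays at distance 1
assert 1.0 in F(0.0)
assert all(min(abs(y - 1.0) for y in F(x)) == 1.0 for x in xs)
```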
Fig. 2.1. (a) Map that is lower semicontinuous but not upper semicontinuous at x. (b) Map that is upper semicontinuous but not lower semicontinuous at x.
Let

X(u) = {x ∈ X : g(x, u) ≦ 0},

where g is a function from X × U to R^m.
Proposition 2.2.1

If each component of g is lower semicontinuous on X × {u} and if the set X is closed, then the map X(·) is upper semicontinuous at u.
Proposition 2.2.2

If the set X is convex, if each component of g is continuous on X × {u} and convex in x for each fixed u ∈ U, and if there exists an x̄ ∈ X such that g(x̄, u) < 0, then the map X(·) is lower semicontinuous at u.

2.2.3 Convexity of Point-To-Set Maps
In this subsection, we extend the convexity concept for functions to point-to-set maps whose values are subsets of a finite-dimensional Euclidean space.

Definition 2.2.3 (Cone Epigraph of a Point-To-Set Map)†

Let F be a point-to-set map from R^n into R^p and D be a convex cone in R^p. The set

{(x, y) : x ∈ R^n, y ∈ R^p, y ∈ F(x) + D}

is called the D-epigraph of F and is denoted by D-epi F.

† See Definition 2.1.10.
Definition 2.2.4 (Cone Convexity and Cone Closedness of a Point-To-Set Map)

Let F be a point-to-set map from R^n into R^p and let D be a convex cone in R^p; then F is said to be D-convex (resp. D-closed) if D-epi F is convex (resp. closed) as a subset of R^n × R^p.

Proposition 2.2.3

Let D be a convex cone containing 0 in R^p and let F be a point-to-set map from R^n into R^p; then F is D-convex if and only if

αF(x^1) + (1 − α)F(x^2) ⊂ F(αx^1 + (1 − α)x^2) + D,  (*)

for all x^1, x^2 ∈ R^n and all α ∈ [0, 1].

Proof. First, assume that F is D-convex, and let x^1, x^2 ∈ R^n, y^1 ∈ F(x^1), y^2 ∈ F(x^2), and α ∈ [0, 1]. Since (x^i, y^i) ∈ D-epi F (i = 1, 2), then

α(x^1, y^1) + (1 − α)(x^2, y^2) ∈ D-epi F,

i.e.,

αy^1 + (1 − α)y^2 ∈ F(αx^1 + (1 − α)x^2) + D.

Next, suppose conversely that the relationship (*) holds. Let (x^i, y^i) ∈ D-epi F (i = 1, 2) and α ∈ [0, 1]. Since y^i ∈ F(x^i) + D for i = 1, 2,

αy^1 + (1 − α)y^2 ∈ αF(x^1) + (1 − α)F(x^2) + D ⊂ F(αx^1 + (1 − α)x^2) + D.

Hence

α(x^1, y^1) + (1 − α)(x^2, y^2) ∈ D-epi F,

which is what we wished to prove.

Definition 2.2.5
(Cone-Concavity of a Point-To-Set Map)
Let F be a point-to-set map from R^n into R^p, and let D be a convex cone in R^p; then F is said to be D-concave if

F(αx^1 + (1 − α)x^2) ⊂ αF(x^1) + (1 − α)F(x^2) + D

for all x^1, x^2 ∈ R^n and all α ∈ [0, 1].
Remark 2.2.1

Note that the D-concavity of a point-to-set map F is not equivalent to the (−D)-convexity of F (see Fig. 2.2).
Fig. 2.2. (a) (−R+)-convex map and (b) R+-concave map.

2.3 Preference Orders and Domination Structures
In ordinary single-objective optimization problems, the meaning of optimality is clear. It usually means the maximization or the minimization of a certain objective function under given constraints. In multiobjective optimization problems, on the other hand, it is not clear. Let us consider the case in which there are a finite number of objective functions, each of which is to be minimized. If there exists a feasible solution, action, or alternative that minimizes all of the objective functions simultaneously, we will have no objection to adopting it as the optimal solution. However, we can rarely expect the existence of such an optimal solution, since the objectives usually conflict with one another. The objectives, therefore, must be traded off. Thus, in multiobjective optimization problems, the preference attitudes of the decision maker play an essential role in specifying the meaning of optimality or desirability. They are very often represented as binary relations on the objective space and are called preference orders. This section is devoted to preference orders and their representation by point-to-set maps (also known as domination structures).

2.3.1 Preference Orders

A preference order represents a preference attitude of the decision maker in the objective space. It is a binary relation on a set Y = f(X) ⊂ R^p, where f is the vector-valued objective function and X the feasible decision set. The basic
binary relation ≻ means strict preference; that is, y ≻ z for y, z ∈ Y implies that the result (objective value) y is preferred to z. From this, we may define two other binary relations ∼ and ≿ by

y ∼ z if and only if not y ≻ z and not z ≻ y,

y ≿ z if and only if y ≻ z or y ∼ z.

The relation ∼ is called indifference (read y ∼ z as "y is indifferent to z"), and ≿ is called preference-indifference (read y ≿ z as "z is not preferred to y"). The binary relations that we use as preference or indifference relations have certain properties. A list of some of these properties is as follows (Fishburn [F7], [F8]): A binary relation R on a set Y is

(1) reflexive if yRy for every y ∈ Y,
(2) irreflexive if not yRy for every y ∈ Y,
(3) symmetric if yRz ⇒ zRy for every y, z ∈ Y,
(4) asymmetric if yRz ⇒ not zRy for every y, z ∈ Y,
(5) antisymmetric if (yRz, zRy) ⇒ y = z for every y, z ∈ Y,
(6) transitive if (yRz, zRw) ⇒ yRw for every y, z, w ∈ Y,
(7) negatively transitive if (not yRz, not zRw) ⇒ not yRw for every y, z, w ∈ Y,
(8) connected or complete if yRz or zRy (possibly both) for every y, z ∈ Y, and
(9) weakly connected if y ≠ z ⇒ (yRz or zRy) throughout Y.

Lemma 2.3.1
Let R be a binary relation on Y. (i) If R is transitive and irreflexive, it is asymmetric. (ii) If R is negatively transitive and asymmetric, it is transitive. (iii) If R is transitive, irreflexive, and weakly connected, it is negatively transitive. Definition 2.3.1
(Strict Partial Order, Weak Order, and Total Order)
A binary relation R on a set Y is said to be (i) a strict partial order if R is irreflexive and transitive, (ii) a weak order if R is asymmetric and negatively transitive, and (iii) a total order if R is irreflexive, transitive, and weakly connected. Proposition 2.3.1
If R is a total order, then it is a weak order; and if R is a weak order, then it is a strict partial order.
In this book, the preference order is usually assumed to be at least a strict partial order; that is, irreflexivity of preference (y is not preferred to itself) and transitivity of preference (if y is preferred to z and z is preferred to w, then y is preferred to w) will be supposed.

Given the preference order ≻ (or ≿) on a set Y, it is a very interesting problem to find some real-valued function u on Y that represents the order ≻ as

u(y) > u(z) if and only if y ≻ z for all y, z ∈ Y.

Definition 2.3.2 (Preference Functions)†

Given the preference order ≻ on a set Y, a real-valued function u on Y such that

u(y) > u(z) if and only if y ≻ z for all y, z ∈ Y

is called a preference function.

Remark 2.3.1
Note the following relationships:

y ∼ z ⇔ (not y ≻ z and not z ≻ y) ⇒ u(y) = u(z),

y ≿ z ⇔ (y ≻ z or y ∼ z) ⇒ u(y) ≧ u(z).
With regard to the existence of the preference function, the following result is fundamental (Fishburn [F5]). Lemma 2.3.2
If ≻ on Y is a weak order, then ∼ is an equivalence relation (i.e., reflexive, symmetric, and transitive).
Theorem 2.3.1
If >- on Y is a weak order and vi-. (the set of equivalence classes of Y under ~) is countable, then there is a preference function u on Y. The utility theory (especially multiattribute utility theory) will be explained in a later chapter. (See Fishburn [F7] and Keeney and Raiffa [K6] for details.) Given a preference order, what kind of solutions should be searched for? t In some literature preference functions are also called utility functions or value functions. In this book, however, we use the term preference functions to include both of them. See Section 7.1 for more details.
Definition 2.3.3 (Efficiency)
Let Y be a feasible set in the objective space R^p, and let ≻ be a preference order on Y. Then an element ŷ ∈ Y is said to be an efficient (noninferior) element of Y with respect to the order ≻ if there does not exist an element y ∈ Y such that y ≻ ŷ. The set of all efficient elements is denoted by ℰ(Y, ≻). That is,

ℰ(Y, ≻) = {ŷ ∈ Y : there is no y ∈ Y such that y ≻ ŷ}.
Our aim in a multiobjective optimization problem might be to find the set of efficient elements (usually not a singleton).

Remark 2.3.2

The set of efficient elements in the decision space X may be defined in an analogous way as

ℰ(X, f⁻¹(≻)) = {x̂ ∈ X : there is no x ∈ X such that f(x) ≻ f(x̂)},

where f⁻¹(≻) is the order on X induced by ≻ as x f⁻¹(≻) x′ if and only if f(x) ≻ f(x′).

2.3.2 Domination Structures
Preference orders (and, more generally, binary relations) on a set Y can be represented by a point-to-set map from Y to Y. In fact, a binary relation may be considered to be a subset of the product set Y × Y, and so it can be regarded as the graph of a point-to-set map from Y to Y. Namely, we identify the preference order ≻ with the graph of the point-to-set map P:

P(y) = {y′ ∈ Y : y ≻ y′} [P(y) is the set of the elements in Y less preferred than y],

graph P := {(y, y′) ∈ Y × Y : y′ ∈ P(y)} = {(y, y′) ∈ Y × Y : y ≻ y′}.

This viewpoint on preference orders is taken, for example, by Ruys and Weddepohl [R11] (though the preference-indifference relation ≿ is considered instead of P). Another way of representing preference orders by point-to-set maps is the concept of domination structures, as suggested by Yu [Y1]. For each y ∈ Y ⊂ R^p, we define the set of domination factors

D(y) := {d ∈ R^p : y ≻ y + d} ∪ {0}.

This means that a deviation d ∈ D(y) from y is less preferred than the original y. Then the point-to-set map D from Y to R^p clearly represents the given preference order. We call D the domination structure.
Since the feasible set Y is not always fixed, it is more convenient to define the domination structure on a sufficiently large set that includes Y. In this book, D is defined on the whole objective space R^p for simplicity. The point-to-set map D may then have the following properties (the numbers in parentheses correspond to those for the properties of preference orders listed in Subsection 2.3.1). The domination structure D(·) is said to be

(4) asymmetric if d ∈ D(y), d ≠ 0 ⇒ −d ∉ D(y + d) for all y, i.e., if y ∈ y′ + D(y′), y′ ∈ y + D(y) ⇒ y = y′;
(6) transitive if d ∈ D(y), d′ ∈ D(y + d) ⇒ d + d′ ∈ D(y) for all y, i.e., if y ∈ y′ + D(y′), y′ ∈ y″ + D(y″) ⇒ y ∈ y″ + D(y″);
(7) negatively transitive if d ∉ D(y), d′ ∉ D(y + d) ⇒ d + d′ ∉ D(y) for all y.

Efficiency can be redefined with a domination structure.

Definition 2.3.3′ (Efficiency)
Given a set Y in R^p and a domination structure D(·), the set of efficient elements is defined by

ℰ(Y, D) = {ŷ ∈ Y : there is no y ≠ ŷ, y ∈ Y, such that ŷ ∈ y + D(y)}.

This set ℰ(Y, D) is called the efficient set.

Remark 2.3.3

We can induce a domination structure D′ on X from a given domination structure D on Y as follows:
D′(x) = {d′ ∈ R^n : f(x + d′) ∈ f(x) + D(f(x))\{0}} ∪ {0}.

If we denote D′ simply by f⁻¹(D), the set of efficient solutions in the decision space is

{x : f(x) ∈ ℰ(Y, D)} = ℰ(X, f⁻¹(D)).

For example, if f is linear [i.e., f = C (a p × n matrix)] and if D = R^p_+, then

D′(x) = {d′ : Cd′ ≥ 0} ∪ {0}.
The most important and interesting special case of domination structures is when D(·) is a constant point-to-set map, particularly when D(y) is a constant cone for all y. When D(y) = D (a cone),

(4) asymmetry [d ∈ D, d ≠ 0 ⇒ −d ∉ D] ⇔ D is pointed (Definition 2.1.4),
(6) transitivity [d, d′ ∈ D ⇒ d + d′ ∈ D] ⇔ D is convex.
Thus, pointed convex cones are often used for defining domination structures. We usually write y ≦_D y′ for y, y′ ∈ R^p if and only if y′ − y ∈ D for a convex cone D in R^p. Also, y ≤_D y′ means that y′ − y ∈ D but y − y′ ∉ D. When D is pointed, y ≤_D y′ if and only if y′ − y ∈ D\{0}. When D = R^p_+, the subscript is omitted, as in ≦ or ≤. In other words,

y ≦ y′ if and only if y_i ≦ y′_i for all i = 1, ..., p,
y ≤ y′ if and only if y ≦ y′ and y ≠ y′,
y < y′ if and only if y_i < y′_i for all i = 1, ..., p.

Typical preference orders and their domination cones include the following:

(i) Pareto order: ≻ = ≤, i.e., D = R^p_+.
(ii) Weak Pareto order: ≻ = <, i.e., D = {d ∈ R^p : d > 0} ∪ {0}.
(iii) Lexicographic order: y ≻ y′ if and only if there exists k ∈ {1, ..., p} such that y_i = y′_i for all i < k and y_k < y′_k, i.e.,

D = {d ∈ R^p : there exists a number k ∈ {1, ..., p} such that d_i = 0 for all i < k and d_k > 0} ∪ {0}.

Note that this D is a cone, but it is neither open nor closed (Fig. 2.3).
(iv) Order by a polyhedral convex cone: D = {d ∈ R^p : Ad ≥ 0} with an l × p matrix A. Here we suppose that Ker A = {y : Ay = 0} = {0} in order for D to be pointed. Then this order is essentially reduced to the Pareto order, as is shown in the following lemma.

Lemma 2.3.4

Let Y be a set in R^p, A be an l × p matrix with Ker A = {0}, and D = {d ∈ R^p : Ad ≥ 0}. Then

ℰ(Y, D) = {y ∈ Y : Ay ∈ ℰ(AY, R^l_+)}.

Proof. Let ŷ ∈ Y; ŷ ∉ ℰ(Y, D) if and only if there exists y ∈ Y such that ŷ − y ∈ D\{0}. Since Ker A = {0}, ŷ − y ∈ D\{0} if and only if Aŷ ≥ Ay and Aŷ ≠ Ay, i.e., if and only if Aŷ ∉ ℰ(AY, R^l_+).

Fig. 2.3. Domination structure of lexicographic order.
3
SOLUTION CONCEPTS AND SOME PROPERTIES OF SOLUTIONS
In this chapter we discuss solution concepts for multiobjective optimization problems and investigate some fundamental properties of solutions. First, efficiency and proper efficiency are introduced as solution concepts. In the second section, existence and external stability of efficient solutions are discussed. The third section is devoted to conditions for the connectedness of efficient sets. Three kinds of characterization of efficient or properly efficient solutions (scalarization, best approximations, and constraint problems) are given in the fourth section. The final section deals with the Kuhn-Tucker conditions for multiobjective problems.

3.1 Solution Concepts
As we have already mentioned, the concept of optimal solutions to multiobjective optimization problems is not trivial and is in itself debatable. It is closely related to the preference attitudes of the decision makers. The most fundamental solution concept is that of efficient solutions (also called nondominated solutions or noninferior solutions) with respect to the domination structure of the decision maker, which is discussed in the first subsection. We also discuss another, slightly restricted concept (proper efficiency) in the second subsection.

3.1.1 Efficient Solutions
In this section we consider the multiobjective optimization problem

(P)  minimize f(x) = (f_1(x), f_2(x), ..., f_p(x))  subject to x ∈ X ⊂ R^n.
Let

Y = f(X) = {y : y = f(x), x ∈ X}.
A domination structure representing a preference attitude of the decision maker is supposed to be given as a point-to-set map D from Y into R^p. Though it might be natural to suppose that D(y) ⊃ R^p_+, since a smaller value is preferred for each f_i (i = 1, ..., p), we do not assume this, in order to deal with more general multiobjective problems.

Definition 3.1.1
(Efficient Solution)
A point x̄ ∈ X is said to be an efficient solution to the multiobjective optimization problem (P) with respect to the domination structure D if f(x̄) ∈ ℰ(Y, D); that is, if there is no x ∈ X such that f(x̄) ∈ f(x) + D(f(x)) and f(x̄) ≠ f(x) (i.e., such that f(x̄) ∈ f(x) + D(f(x))\{0}).

The following proposition is immediate.

Proposition 3.1.1
Given two domination structures D1 and D2, D1 is said to be included by D2 if D1(y) ⊂ D2(y) for all y ∈ Y. In this case,

ℰ(Y, D2) ⊂ ℰ(Y, D1).
Many interesting cases of efficient solutions are obtained when D is a constant point-to-set map whose value is a constant (convex) cone. In such cases, we identify the map (domination structure) with the cone D. Then x̄ ∈ X is an efficient solution to the problem (P) if and only if there is no x ∈ X such that f(x̄) − f(x) ∈ D\{0}; namely, x̄ is efficient if and only if

(f(X) − f(x̄)) ∩ (−D) = {0}.

If D is an open cone, it does not contain 0. However, we consider D(y) = D ∪ {0} and call D itself the domination structure in this book. As a matter of fact, it does not matter so much whether D contains 0 or not, since the set D\{0} is used for the definition of efficient solutions.

Remark 3.1.1
Given a closed convex cone D, some authors call x̄ a weakly efficient solution to the problem (P) if f(x̄) ∈ ℰ(Y, int D), i.e., if (f(X) − f(x̄)) ∩ (−int D) = ∅ (Nieuwenhuis [N15, N16] and Corley [C15]). Weakly efficient solutions are often useful, since they are completely characterized by scalarization (see Corollary 3.4.1 later).

The following propositions are often very useful.

Proposition 3.1.2

Let D be a nonempty cone containing 0. Then

ℰ(Y, D) ⊃ ℰ(Y + D, D),

with equality holding if D is pointed and convex.

Proof. The result is trivial if Y is empty, so we assume otherwise. First suppose y ∈ ℰ(Y + D, D) but y ∉ ℰ(Y, D). If y ∉ Y, there exist y′ ∈ Y and nonzero d ∈ D such that y = y′ + d. Since 0 ∈ D, Y ⊂ Y + D. Hence y ∉ ℰ(Y + D, D), which is a contradiction. If y ∈ Y, we directly have a similar contradiction. Next suppose that D is pointed and convex, y ∈ ℰ(Y, D), but y ∉ ℰ(Y + D, D). Then there exists a y′ ∈ Y + D with y − y′ = d′ ∈ D\{0}. Then y′ = y″ + d″ with y″ ∈ Y, d″ ∈ D. Hence y = y″ + (d′ + d″) and d′ + d″ ∈ D, since D is a convex cone. Since D is pointed, d′ + d″ ≠ 0, and so y ∉ ℰ(Y, D), which leads to a contradiction. This completes the proof of the proposition.
Remark 3.1.2

It is clear that the pointedness of D cannot be eliminated in the inclusion ℰ(Y, D) ⊂ ℰ(Y + D, D).
The convexity of D is also essential. In fact, let Y and D be as in Fig. 3.1, where D is pointed but not convex. Then (1, 1) ∈ ℰ(Y, D). However,

(1, 1) = (0, 0) + (1/2, 0) + (1/2, 1) ∈ Y + D + D.

Hence (1, 1) ∉ ℰ(Y + D, D) (see Fig. 3.1).
Fig. 3.1. ℰ(Y, D) ⊄ ℰ(Y + D, D). (Remark 3.1.2.)
Proposition 3.1.3

Let Y1 and Y2 be two sets in R^p, and let D be a constant domination structure on R^p (a constant cone, for example). Then

ℰ(Y1 + Y2, D) ⊂ ℰ(Y1, D) + ℰ(Y2, D).

Proof. Let y ∈ ℰ(Y1 + Y2, D). Then y = y^1 + y^2 for some y^1 ∈ Y1 and y^2 ∈ Y2. We show that y^1 ∈ ℰ(Y1, D). If we suppose the contrary, then there exist ȳ ∈ Y1 and nonzero d ∈ D such that y^1 = ȳ + d. Then y = y^1 + y^2 = ȳ + y^2 + d and ȳ + y^2 ∈ Y1 + Y2, which contradicts the assumption y ∈ ℰ(Y1 + Y2, D). Similarly we can prove that y^2 ∈ ℰ(Y2, D). Therefore, y ∈ ℰ(Y1, D) + ℰ(Y2, D).

Remark 3.1.3

The converse inclusion of Proposition 3.1.3 does not always hold. For example, let
Y1 = Y2 = {(y_1, y_2) : (y_1)^2 + (y_2)^2 ≦ 1} ⊂ R^2

and D = R^2_+. Then

y^1 = (−1, 0) ∈ ℰ(Y1, D) and y^2 = (0, −1) ∈ ℰ(Y2, D).

However,

y^1 + y^2 = (−1, −1) > (−√2, −√2) = (−√2/2, −√2/2) + (−√2/2, −√2/2) ∈ Y1 + Y2.
Proposition 3.1.4

Let Y be a set in R^p, D be a cone in R^p, and α be a positive real number. Then

ℰ(αY, D) = αℰ(Y, D).
Proof. The proof of this proposition is easy and is therefore left to the reader.

The most fundamental kind of efficient solution is obtained when D is the nonnegative orthant R^p_+ = {y ∈ R^p : y ≧ 0}; it is usually called a Pareto optimal solution or a noninferior solution.

Definition 3.1.2
(Pareto Optimal Solution)

A point x̄ ∈ X is said to be a Pareto optimal solution (or noninferior solution (Zadeh [Z1])) to the problem (P) if there is no x ∈ X such that f(x) ≤ f(x̄).

A number of theoretical papers concerning multiobjective optimization are related to the Pareto optimal solution. In some cases a slightly weaker solution concept than Pareto optimality is used. It is called weak Pareto optimality, and it corresponds to the case in which the domination cone D\{0} is equal to the positive orthant {y ∈ R^p : y > 0}.

Definition 3.1.3 (Weak Pareto Optimal Solution)†

A point x̄ ∈ X is said to be a weak Pareto optimal solution to the problem (P) if there is no x ∈ X such that f(x) < f(x̄).
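For a finite feasible set, Definitions 3.1.2 and 3.1.3 can be checked directly. The sketch below uses toy data of ours and exhibits a point that is weakly Pareto optimal but not Pareto optimal (it ties another point in one objective).

```python
points = {
    "a": (1.0, 3.0),
    "b": (2.0, 2.0),
    "c": (1.0, 4.0),   # dominated by "a" with a tie in f1: not Pareto optimal
    "d": (3.0, 3.0),   # strictly dominated by "b"
}

def leq_ne(y, z):   # y <= z componentwise and y != z   (the order "≤")
    return all(yi <= zi for yi, zi in zip(y, z)) and y != z

def lt(y, z):       # y < z componentwise               (the order "<")
    return all(yi < zi for yi, zi in zip(y, z))

pareto      = {k for k, z in points.items() if not any(leq_ne(y, z) for y in points.values())}
weak_pareto = {k for k, z in points.items() if not any(lt(y, z) for y in points.values())}

assert pareto == {"a", "b"}
assert weak_pareto == {"a", "b", "c"}   # "c" survives: nothing beats it in every component
```

As the inclusion of the cones suggests, every Pareto optimal point is weakly Pareto optimal, but not conversely.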
3.1.2 Properly Efficient Solutions

This subsection is devoted to another, slightly strengthened solution concept: proper efficiency. As can be understood from the definitions (particularly from Geoffrion's definition), proper efficiency eliminates unbounded trade-offs between the objectives. It was originally introduced by Kuhn and Tucker [K10], and later followed by Klinger [K7], Geoffrion [G5], Tamura and Arai [T3], and White [W9] for usual vector optimization problems with the domination cone R^p_+. Borwein [B16, B17], Benson [B7, B9], and Henig [H7] dealt with more general closed convex cones as the domination cone. Hence, in this subsection, the domination cone D is assumed to be a nontrivial closed convex cone in R^p unless otherwise noted (Definition 3.1.8 later).

Definition 3.1.4 (Tangent Cone)

Let S ⊂ R^p and y ∈ S. The tangent cone to S at y, denoted by T(S, y), is the set of limits of the form h = lim_{k→∞} t_k(y^k − y), where {t_k} is a sequence of nonnegative real numbers and {y^k} is a sequence in S with limit y.

† cf. Remark 3.1.1.
Remark 3.1.4

The tangent cone T(S, y) is always a closed cone.

Definition 3.1.5 (Borwein's Proper Efficiency)†

A point x̄ ∈ X is said to be a properly efficient solution of the multiobjective optimization problem (P) if

T(Y + D, f(x̄)) ∩ (−D) = {0}.

Proposition 3.1.5

If a point x̄ ∈ X is a properly efficient solution of (P) by the definition of Borwein, then it is also an efficient solution of (P).
Proof. If x̄ is not efficient, there exists a nonzero vector d ∈ D such that d = f(x̄) − y for some y ∈ Y. Let d^k = (1 − 1/k)d ∈ D and t_k = k for k = 1, 2, .... Then

y + d^k = f(x̄) − d + (1 − 1/k)d = f(x̄) − (1/k)d → f(x̄) as k → ∞,

and

t_k(y + d^k − f(x̄)) = k(−d + (1 − 1/k)d) = −d → −d as k → ∞.

Hence T(Y + D, f(x̄)) ∩ (−D) ≠ {0}, and x̄ is not properly efficient in the sense of Borwein.

For example, let X = {(x_1, x_2) : (x_1)^2 + (x_2)^2 ≦ 1} ⊂ R^2, f_1(x) = x_1, f_2(x) = x_2, and D = R^2_+. Then (−1, 0) and (0, −1) are efficient solutions but not properly efficient solutions (in the sense of Borwein) (see Fig. 3.2).

Definition 3.1.6
(Projecting Cone)
Let S ⊂ R^p. The projecting cone of S, denoted by P(S), is the set of all points h of the form h = αy, where α is a nonnegative real number and y ∈ S. The projecting cone is also known as the cone generated by S.

Definition 3.1.7
(Benson's Proper Efficiency)‡

A point x̄ ∈ X is said to be a properly efficient solution of the problem (P) if

cl P(Y + D − f(x̄)) ∩ (−D) = {0}.

† Borwein [B16].
‡ Benson [B7].
Fig. 3.2. Efficient but not properly efficient solutions.
Since T(Y + D, f(x̄)) ⊂ cl P(Y + D − f(x̄)), Benson's proper efficiency strengthens Borwein's proper efficiency. The converse, however, does not always hold (see Example 3.1.1 later).

Lemma 3.1.1

Let S be a convex set and y ∈ S. Then

T(S, y) = cl P(S − y),

which is a closed convex cone.

Proof. It is obvious that cl P(S − y) is a closed convex cone. So it suffices to show that cl P(S − y) ⊂ T(S, y), since the converse follows directly from the definitions. Since T(S, y) is closed, we need only prove that P(S − y) ⊂ T(S, y). Let h ∈ P(S − y). Then h = β(y′ − y) for some β ≧ 0 and y′ ∈ S. Let

y^k = [1 − (1/k)]y + (1/k)y′

and t_k = βk ≧ 0. Then t_k(y^k − y) = β(y′ − y). Hence y^k ∈ S from the convexity of S, and y^k → y as k → ∞. Thus h ∈ T(S, y), and the proof is completed.
Theorem 3.1.1
If x̄ is a properly efficient solution of the problem (P) in the sense of Benson, it is also a properly efficient solution of (P) in the sense of Borwein. If X is a convex set and if f is a D-convex function on X (see Definition 2.1.15), then the converse also holds; that is, Borwein's proper efficiency is equivalent to Benson's proper efficiency.

Proof. If D is a convex cone and f is D-convex on the convex set X, the set Y + D is a convex set (Proposition 2.1.21). Hence cl P(Y + D − f(x̄)) = T(Y + D, f(x̄)) by Lemma 3.1.1. Therefore, proper efficiency in the sense of Borwein is equivalent to that in the sense of Benson.

The following definition of proper efficiency by Henig does not require the domination cone D to be closed.
Definition 3.1.8 (Henig's Proper Efficiency)†

(i) A point x̄ ∈ X is said to be a global properly efficient solution of (P) if

f(x̄) ∈ ℰ(Y, D′)

for some convex cone D′ with D\{0} ⊂ int D′.

(ii) A point x̄ ∈ X is a local properly efficient solution of (P) if, for every ε > 0, there exists a convex cone D′ with D\{0} ⊂ int D′ such that

f(x̄) ∈ ℰ((Y + D) ∩ (f(x̄) + εB), D′),

where B is the closed unit ball in R^p.

These definitions are essentially the same as Benson's and Borwein's, respectively.

Theorem 3.1.2
If D is closed and acute, then Henig's global (respectively, local) proper efficiency is equivalent to Benson's (respectively, Borwein's) proper efficiency.

Theorem 3.1.3
Any global properly efficient solution is also locally properly efficient. Conversely, if D is closed and acute, and if

Y+ ∩ (−D) = {0},

† Henig [H7].
then local proper efficiency is equivalent to global proper efficiency. Here Y⁺ is an extension of the recession cone of Y and is defined by

Y⁺ = {y′ : there exist sequences {α_k} ⊂ R and {y^k} ⊂ Y such that α_k > 0, α_k → 0, and α_k y^k → y′}.

(The condition Y⁺ ∩ (−D) = {0} will later be called the D-boundedness of Y. See Definition 3.2.4.) The proofs of the above two theorems are rather long, and so they are omitted here. The reader may refer to Henig [H7].

When D = R^p_+, Geoffrion's definition of properly efficient solutions is well known.
Definition 3.1.9  (Geoffrion's Proper Efficiency)†

When D = R^p_+, a point x̄ is said to be a properly efficient solution of (P) if it is efficient and if there is some real M > 0 such that, for each i and each x ∈ X satisfying f_i(x) < f_i(x̄), there exists at least one j such that f_j(x̄) < f_j(x) and

(f_i(x̄) − f_i(x)) / (f_j(x) − f_j(x̄)) ≤ M.

† Geoffrion [G5].

Theorem 3.1.4

When D = R^p_+, Geoffrion's proper efficiency is equivalent to Benson's proper efficiency, and therefore is stronger than Borwein's proper efficiency.
Proof  Geoffrion ⇒ Benson: Suppose that x̄ is efficient but not properly efficient in the sense of Benson; namely, that there exists a nonzero vector d ∈ cl P(Y + R^p_+ − f(x̄)) ∩ (−R^p_+). Without loss of generality, we may assume that d₁ < −1 and d_i ≤ 0 (i = 2, …, p). Let

t_k (f(x^k) + r^k − f(x̄)) → d,

where r^k ∈ R^p_+, t_k > 0, and x^k ∈ X. By choosing a subsequence we can assume that

I = {i : f_i(x^k) > f_i(x̄)}

is constant for all k (and nonempty since x̄ is efficient). Take a positive number M. Then there exists some number k₀ such that for k ≥ k₀

f₁(x^k) − f₁(x̄) < −1/(2t_k)

and

f_i(x^k) − f_i(x̄) ≤ 1/(2Mt_k)   (i = 2, …, p).

Then for all i ∈ I, we have (for k ≥ k₀)

0 < f_i(x^k) − f_i(x̄) ≤ 1/(2Mt_k),

and for k ≥ k₀,

(f₁(x̄) − f₁(x^k)) / (f_i(x^k) − f_i(x̄)) > (1/(2t_k)) / (1/(2Mt_k)) = M.

Thus x̄ is not properly efficient in the sense of Geoffrion.

Benson ⇒ Geoffrion: Suppose that x̄ is efficient but not properly efficient in the sense of Geoffrion. Let {M_k} be an unbounded sequence of positive real numbers. Then, by reordering the objective functions if necessary, we can assume that for each M_k there exists an x^k ∈ X such that f₁(x^k) < f₁(x̄) and

(f₁(x̄) − f₁(x^k)) / (f_i(x^k) − f_i(x̄)) > M_k

for all i = 2, …, p such that f_i(x^k) > f_i(x̄). By choosing a subsequence of {M_k}, if necessary, we can assume that

I = {i : f_i(x^k) > f_i(x̄)}

is constant for all k (and nonempty since x̄ is efficient). For each k let

t_k = (f₁(x̄) − f₁(x^k))⁻¹.

Then t_k is positive for all k. Let

r_i^k = 0                    for i = 1 or i ∈ I,
r_i^k = f_i(x̄) − f_i(x^k)    for i ≠ 1, i ∉ I.

Clearly r^k ∈ R^p_+, and let

d_i = lim_{k→∞} t_k [f_i(x^k) + r_i^k − f_i(x̄)].

Clearly d₁ = −1, and d_i = 0 when i ≠ 1 and i ∉ I. Since {M_k} is an unbounded sequence of positive real numbers, we have d_i = 0 for i ∈ I. Thus,

d = (−1, 0, …, 0) ≠ 0   and   d ∈ cl P(f(X) + R^p_+ − f(x̄)) ∩ (−R^p_+).

Hence x̄ is not a properly efficient solution of (P) in the sense of Benson, as was to be proved.

Paying attention to multiobjective programming problems, we may consider another type of proper efficiency that was defined earlier than the preceding ones. A multiobjective programming problem is defined as
(P′)   minimize  f(x) = (f₁(x), …, f_p(x))
       subject to  x ∈ X = {x : g(x) = (g₁(x), …, g_m(x)) ≤ 0}.

That is, X is supposed to be specified by m inequality constraints. Here, all the functions f_i and g_j are assumed to be continuously differentiable.

Definition 3.1.10  (Kuhn–Tucker's Proper Efficiency)†

A point x̄ is said to be a properly efficient solution of the problem (P′) if it is efficient and if there is no h ∈ R^n such that

⟨∇f_i(x̄), h⟩ ≤ 0  for all i = 1, …, p,
⟨∇f_j(x̄), h⟩ < 0  for some j,
⟨∇g_j(x̄), h⟩ ≤ 0  for all j ∈ J(x̄) := {j : g_j(x̄) = 0}.

… (i) ⇒ (iv) follows from Theorem 3.2.9 or Theorem 3.2.10. (ii) ⇒ (iii) follows from Theorem 3.2.11. Finally, (iii) ⇒ (ii) and (iv) ⇒ (ii) are trivial.
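The external stability referred to here is the domination property Y ⊂ ℰ(Y) + D: every feasible point is dominated by (or equal to) some efficient point. For a finite set and D = R^p_+ this can be verified exhaustively; a minimal sketch with illustrative names, not from the text:

```python
def efficient_set(Y):
    """E(Y) for a finite set Y with domination cone D = R^p_+."""
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and a != b
    return [y for y in Y if not any(dominates(z, y) for z in Y)]

def externally_stable(Y):
    """Check Y ⊂ E(Y) + R^p_+: every point lies above some efficient point."""
    E = efficient_set(Y)
    return all(any(all(ei <= yi for ei, yi in zip(e, y)) for e in E) for y in Y)

Y = [(0, 3), (1, 1), (3, 0), (2, 2), (3, 3)]
assert efficient_set(Y) == [(0, 3), (1, 1), (3, 0)]
assert externally_stable(Y)   # finite sets are always externally stable
```

Externally *unstable* sets require infinitely many points (e.g., an open face of dominated limit points), which is why the counterexamples promised below live in continuous feasible sets.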
Some examples of externally unstable sets will be given later in Chapter 4, in connection with the perturbation stability of efficient sets.

3.3  Connectedness of Efficient Sets
In this section we discuss the connectedness of efficient sets in multiobjective optimization problems. Throughout this section, the domination structure is given by a constant pointed closed convex cone D. The following results are mainly due to Naccache [N1].

Definition 3.3.1  (Separatedness and Connectedness)

A set A ⊂ R^p is said to be separated if it can be written as A = A₁ ∪ A₂ with two nonempty sets A₁ and A₂ such that

cl A₁ ∩ A₂ = A₁ ∩ cl A₂ = ∅.

Equivalently, A is separated if there exist two open sets O₁ and O₂ such that A ⊂ O₁ ∪ O₂, A ∩ O₁ ≠ ∅, A ∩ O₂ ≠ ∅, and A ∩ O₁ ∩ O₂ = ∅. If A is not separated, then A is said to be connected.
The following two lemmas are well known and useful in this section.

Lemma 3.3.1

If a set A is connected and A ⊂ B ⊂ cl A, then the set B is also connected.

Lemma 3.3.2

If {A_i : i ∈ I} is a family of connected sets such that ⋂_{i∈I} A_i ≠ ∅, then ⋃_{i∈I} A_i is connected.
3.3
67
CONNECTEDNESS OF EFFICIENT SETS
Now let
{Y E
Sell, Y) =
Y: 0 such that
y t
Naccache [NI].
a E R.
E
y(oe) - D.
3.3
CONNECTEDNESS OF EFFICIENT SETS
69
If this is not true, then we can apply the standard separation theorem to the ray {y - ad: a > O} and the convex cone - D to deduce the existence of /l E RP\{O} such that
0
0
and
0
for any a sufficiently small. Thus (I;1)jk
(j =F k).
We define
In this case
Iy;i
=
1(I~I;1y);i
~
I(I;1 y);i
+ Ct11(I;1Y)jl)!(P
- 1)a.
Hi
Hence
where M(a) = 1 + l/a. Let SP be the set as follows: SP = {Jt E RP:
J/i
= 1, Jti
> 0 (i = 1, ... , P)}.
For each u ESP and a E A, the (Jt, a)-norm of y E RP is defined by Ilyl ~
and the (Jt,
00 )-norm
=
IIJt
(0 yll~,
is defined by Ilyll~
where Jt (0 Y = (Jt1Y1'" ·,JtpY p)·
= IIJt
(0
yiLt>,
I~
3.4
CHARACTERIZATION OF EFFICIENCY AND PROPER EFFICIENCY 79
In this subsection ji E RP will designate a fixed but arbitrary point such that ji ~ y. This point will serve as the ideal point. Define A(Il, IX, Y)
= {,Y E
II; -
Y:
jil ~
=
y
inf{lIy - jil ~:
E
Y}},
i.e., A(Jl, IX, Y) is the set of best approximations to ji out of Y in the (J.L, IX)norm. Moreover, denote by dey) the set Upd~"EA A(Jl, IX, Y), i.e., dey) is the set of all best approximations to ji out of Y in the (Jl, IX)-norm for some u E SP and IX EA. Definition 3.4.1
(Compromise Solution)
A point ji in dey) is called the compromise solution (in the objective space) to the multiobjective optimization problem (P). We now investigate the relationships between dey) and c&"(Y) (or &'(Y». Theorem 3.4.8
If Yis R~ -closed and the family of norms PI-P3, then
{11·110:: IX E
A} satisfies Properties
dey) c &,(Y) c c&"(Y) c cl d(Y). Proof Since &'(Y) c c&"(Y), we will prove that dey) c ,qIl(Y) and c&"(Y) c cl dey). (d(Y) c &'(Y»: First, let ; E deY). Then for some Jl E SP and IX E A,
liP -
jil ~
~
Ily -
jil ~
for all
y
E
Y.
Suppose to the contrary that PIf &'(Y). Then there exist sequences {Pk} c R, {yk} c Y, {dk } C R~ such that Pk > 0 and for
some nonzero d
E R~.
There are two cases for {Pk}: bounded and unbounded. If {Pk} is bounded we may assume, without loss of generality (by taking a subsequence if necessary) that Pk -> Po for some Po ~ O. If Po = 0, noting that yk + dk - P~ ji - y, we have -d E (ji - Y + R~t = R~, which is a contradiction. If Po > 0, + dk - Y -> -dlPo =1= 0 E -R~ as k -> 00. Since Y + R~ is closed, P- dlPo E Y + R~, i.e., there exists yO E Ysuch that y 2 yo. However, this implies that
t
liP -
.vl ~
> Ilyo - jil ~
by Property P3 and leads to a contradiction.
80
If
3
SOLUTION CONCEPTS AND SOME PROPERTIES OF SOLUTIONS
{fJk} is unbounded we may assume that, fJk --+ 00 as k --+ - P--+0. If Pi = .vi for some i, then d, = O. In fact,
I + dk
fJk(Y; + df - .vi) ~ fJk(Yi + df - .vi)
=
00.
Hence
fJkdf ~ 0
for this i and so - d, ~ O. Thus, there exists a p > 0 such that
o ~ y - dlfJ - Y ~ y - Y
for all
fJ ~
p.
Then, from Property P3, for all fJ ~ p. liP - dlfJ - ylI~ < liP - yll~ Since fJk --+ 00, then fJk ~ Pfor all k sufficiently large. Hence,
III + dk
-
YII~
< liP - jil ~
for some k sufficiently large. In fact,
III + dk - YII~ = III + dk - P + dlfJk + P- dlfJk - jil ~ k ~ III + d - 911~ + Ildl ~/fJk + 119 - dlfJk - yll~ and the first and second terms of the right-hand side converge to 0 as k --+ Thus, we have
III -
yll~
00.
< 119 - YII~
for some k sufficiently large, since I + dk - ji ~ I - ji ~ O. This is also a contradiction and therefore 9 E .9(Y). (tS'(Y) c cl d(Y)): Let 9 E tS'(Y). We will prove that for each 8 > 0, there exists y(t) E d(Y) such that lilt) - 91100 < 8. First, there exists y' > y such that lIy - Plloo < s for any y E(Y' - R~) n Y. In fact, if no such y' exists, then there exists a sequence {r} c RP such that > y, --+ 9 and for each k, there is a IE (1 - R~) n Y such that 11/- Pll oo ~ 8. Since Y + R~ is closed and Y is R~-bounded, we may assume without loss of generality that I --+ r" + dO with yO E Yand dO ~ 0. Then IIYo + dO - ylloo ~ 8.
r
r
However, r" + dO E (y - R~) n (Y + R~) = {9}, since y E tS'{Y). Thus, we have a contradiction. Now, since ji ~ 9 < y', there exist some p E SP and fJ > 0 such that y' - Y = fJ(1/PI' ... , 1/p p ) '
Then i = 1, .. . ,p.
3.4
CHARACTERIZATION OF EFFICIENCY AND PROPER EFFICIENCY 81
Hence, 119 - yll~ < {3. Let y(a) be a best approximation to y out of Y in the (11, a)-norm. Since Y + R~ is closed, y(a) exists. Then, from Property PI, lIy(a) - yll~
~ M(a)lly(a) - YII~
~
M(a)119 -
YII~.
From Property P2, for all a sufficiently large, lIy(a) - YII~
~ {3,
whence for all
i
= 1, ... , p,
i.e., y(a) ~ y'. Thus y(a) E (y' - R~) n Y, and so if we put l£) = y(a) for sufficiently large a, there I!y(£) - 91100 < e. This completes the proof of the theorem. Remark 3.4.2
The proof of Theorem 3.4.8 suggests that, in order to generate the efficient set, the value of a should be taken large. However, as has been established as Theorem 3.4.6, it suffices to use the so-called [I-norm [i.e. a = 1 in the norm of Example 3.4.3(i)] only if Yis R~ -convex. Possibly, the magnitude of a that is required can be related to a measure of the nonconvexity of Y. Remark 3.4.3
(i) In Theorem 3.4.8, the converse relationship t&'(Y) necessarily hold. In fact, if
::>
cl d(Y) does not
and Y = (-1, -1), then 9 = (0,0) ¢ tS'(Y), but 9 E cl t&'(Y) c cl d(Y) (see Fig. 3.14). (ii) On the other hand, the relationship d(Y) ::> &(Y) holds when y < y and we use the family of norms (ii) or (iii) in Example 3.4.3. This fact will be shown later in Theorem 3.4.10. In subsection 3.4.1 we studied the characterization of efficient sets by scalarization. If we use the [I-norm in Example 3.4.1(i), then 8(11, Y) = A(I1, 1, Y).
Namely, the procedure of best approximation by the [I-norm is of particular interest, though it cannot generate all the efficient points without convexity
82
3
SOLUTION CONCEPTS AND SOME PROPERTIES OF SOLUTIONS
. ..
I\'
Y '.
-1,
•
.
I I
.
,
y
to
.
.
\ •
'
..
.. - - - - 1 - - - - - - - - -
Y
-1
Fig. 3.14. III' - yll~k
=
III' - Y + Y-
=
liP -
yll~
+ ~OC
LJli~
k
i
jil ~k
+ [m~x{Jl,(y~
-
- PI)]'
the second term of the last equation is negative and therefore
IIY + dk
-
jil ~k
< IIY - jil ~k'
Yi)}
86
3
SOLUTION CONCEPTS AND SOME PROPERTIES OF SOLUTIONS
Similarly, when we use the family of norms (iii),
Ily + dk
-
jil ~k
=
III;.l(.u 0 (y - ji)) + Ia~l(.u
0 dk)lloo·
Since
11/- jil ~k
= III;.l(.u
< Ily - jil ~k
0 (y - ji)) + I;.l(.u 0 (/- y))lloo
III;.l(.u 0 (y - ji))lloo
=
and
we have Hence
Now, since Ildkll:::' = fJ for any k, we may assume that {dk } converges to some with Ilall:::' = fJ. Then, taking the limit of
a
Ily + dk - YII:::'
~ M(Ctk)lly
+ dk -
jil ~k
< M(Ctk)lly - jil ~k'
we have
Ily +
a
a- jill:::' ~
Ily - jill:::'.
a
This implies that ~ 0, since y + ~ ji and .u 0 (y - ji) = (fJ, ... ,fJ). On the other hand, from the definition of {dk } , dE cl P(Y + R~ - Y). However, this contradicts the assumption y E &'(Y) and the proof of the theorem is complted.
3.4.3
Characterization as Solution of Constraint Problems
Another method of characterizing Pareto optimal solutions is to solve kth objective constraint problems associated with the original multiobjective optimization problem. The kth objective constraint problem is formulated by taking the kth objective functionj, as the objective function and letting all the other objective functionsfj U # k) be inequality constraints (Benson and Morin [BIO], Haimes et al. [H2], Chankong and Haimes [C5, C6]). That is, given the original multiobjective optimization problem (P)
minimize f(x) = (fl(X), ... ,fix)
subject to
x
E
X
c
R",
3.4
CHARACTERIZATION OF EFFICIENCY AND PROPER EFFICIENCY 87
the kth objective constraint problem can be defined as a scalar optimization problem minimize fk(X) subject to J;(x) ~ ei ,
i = 1, ... , p,
i
#- k, x
E
X,
where s = (e1 , ••• , ep ) E RP. A similar type of equality-constrained problems was studied by Lin [L6-L8]. The following theorems provide the relationships between the Pareto optimal solutions of the original multiobjective optimization problem (P) and the solutions of the kth objective constraint problems (Pk(e)). Theorem 3.4.11
x
x
A point is a Pareto optimal solution of (P) if and only if solves (Pk(e)) for every k = 1, ... , P, where e = f(x).
x x x
Proof If is not a Pareto optimal solution of (P), there exists x E X such that J;(x) ~ J;(x) with the strict inequality holding for at least one k. This implies that does not solve (pk(e)). Conversely if does not solve (Pk(e)) for some k, then there exists x E X such thatf,.(x) f,.(x). Hence x is a Pareto optimal solution of (P).
As has been already stated, every properly efficient solution (in the sense of Geoffrion) is a Pareto optimal solution, but not vice versa in general. Benson and Morin showed that the converse holds when the constraint problems are stable as scalar optimization problems. Definition 3.4.2
(Perturbation Function)
The perturbation function (Pk(e)) is defined as wk(u)
= inf{fk(x) :J;(x) -
W k: RP-1 --+
ei ~ u., i
R
= [ - 00,
+ 00] associated with
= 1, ... , k - 1, k + 1, ... , p,
X EX},
88
3
SOLUTION CONCEPTS AND SOME PROPERTIES OF SOLUTIONS
Definition 3.4.3
(Stability)
(Pie)) is said to be stable when Wk(O) is finite and when there exists a scalar M > 0 such that, for all u '# 0,
[wk(O) -
wk(u)J/llull
~
M.
Theorem 3.4.13
Assume that f1 (x), ... ,fr,(x) are convex functions on the nonempty convex set X and that x is a Pareto optimal (i.e. efficient) solution of the multiobjective optimization problem (P). Then x is a properly efficient solution of (P) if and only if (Pk(e)) is stable for each k = 1,... , p, where e
= f(x).
Proof Assume first that (Pk(e)) is stable, and then define the dual problem of (Pk(e)) by (Ok(e))
CPk
= max{min[J,.(X) + I'~O
XEX
L j1.U;(x) -
ii'k
ei)]}
where j1. is a (p - I)-vector of dual variables. Since (Pk(e)) is stable, adapting a result of Geoffrion [G6, Theorem 3], we find that (Ok(e)) has an optimal solution Po and x is an optimal solution for the problem (P~.I')
miminize J,.(x)
+L
Po; 0 (i = I, ... , p) and ~j ~ 0 U E lex)) such that p
L iti V/;(x) + L:
i= 1
~j Vgix)
jeJ(x)
=0
Letting ~j = 0 for j rt: lex), we can immediately establish the theorem. Corollary 3.5.1
x
If the Kuhn-Tucker constraint qualification is satisfied at E X, a necessary condition for x to be a properly efficient solution to (P) in the sense of Geoffrion is the condition in Theorem 3.5.1. Proof Immediate from Theorem 3.1.6 and 3.5.1. Theorem 3.5.2
If every function /; (i = 1, ... , p) and gj U = 1, ... , m) is convex, the condition in Theorem 3.5.1 is also sufficient for E X to be a properly efficient solution to (P).
x
Proof The condition in Theorem 3.5.1 implies that f(x) E S(P, Y) (see Section 3.4.1) and hence, from Theorem 3.4.1, x is a properly efficient solution of (P). Theorem 3.5.3
Suppose that (P) satisfies the Kuhn-Tucker constraint qualification at Then, a necessary condition for x to be a weak Pareto optimal solution to (P) is that there exist it E RP and ~ E B" such that
x EX. (i) (ii) (iii)
(it, Vf(x» + 0,
{(d1, d2): d 1 = 0, d 2 ~ O}
if Yl
{(d1, d 2): d 1 ?; 0, d 2 > O}
if Y2 = 0.
= 0, Y2
Then, N1(u) = {(O,O)}
N1(0)
for
= {(Yl'Y2): 0 ~
u
> 0,
Yl ~ 1, Y2 = O}.
#- 0,
106
4
STABILITY
-. -1
.
.
'. .
.. " -1 . ':
o
o -2 Y
Y1 = Y2 ~ 0
= (0,
0)
-----t----d1 o
Y2 = 0, Y1 'i' 0
Y2
~~__cI,....-
Y2
.....- - Y1
--¥--_-o-_ _..L-_ Y1 o 1/2
u=o Fig. 4.10, Example 4.2.12.
4.3
STABILITY FOR PERTURBATION OF DOMINATION STRUCTURE 107
..
.
---L_ _
_....-=:~
o
Yl
..
'
.
,
,;
-1------......-Yl o 1
u>O
u=O Fig. 4.11. Example 4.2.13.
4.3
Stability for Perturbation of the Domination Structure
This section deals with the stability of the solution set from the second point of view described in Section 4.1. In other words, we will investigate the continuity of the point-to-set map N z , which is defined on the space V by parameter vectors v that specify the domination structure of the decision maker. This point of view is peculiar to multiobjective optimization problems. The aim of the decision maker in ordinary optimization problems is to maximize (or minimize) scalar objective functions and so is quite clear; hence, there is no room for such consideration. On the other hand, in multiobjective optimization problems, the domination structure of the decision maker is a very important and essential factor. Hence, it is quite natural to consider that it is perturbed according to changes of environmental factors or of feelings of the decision maker. Thus, the study of the stability of the solution set for perturbations of the domination structure seems to be necessary and interesting.
108
4
STABILITY
Throughout this section, the set of feasible solutions in the objective space is fixed as Y and the domination structure D is considered to be a point-to-set map from Y x V into RP. As has been already noted in Section 4.1, the pointto-set map N z from V into RP is defined by Niv) = 6'(Y, D(·, v)).
Some sufficient conditions for the continuity of this point-to-set map N z will be investigated in this section. 4.3.1
Upper Semicontinuity of the Map N 2
First, sufficient conditions under which the map N z is upper semicontinuous are obtained as follows. Theorem 4.3.1
The map N z is upper semicontinuous at are satisfied: (i) (ii) (iii) (iv)
vE V if the following conditions
Y is a closed set; the map D(y, . ) is lower semicontinuous at v for each fixed y E Y; D(y, v) is a convex set for all y E Y and every v E V near v; the relationship Nz(v) = 6'(Y, int D(·, v)) holds.
Proof Let and We must show that y E Nz(v). Since Y is a closed set, then assume to the contrary that y does not belong to the set
y E Y. Hence, if we
(condition (iv)),
Niv) = 6'(Y, int D(·, v))
then there exist aYE Y and a nonzero d e int D(y, v) such that y = y
Y' -
+ d. Let
k = 1,2, ....
y = d",
Then, in view of Lemma 4.2.1, dk
E
D(y, Vk)
except for a finite number of k's, since
dk
--+
d E int D(y, v)
and the point-to-set map D(y, . ) is lower semicontinuous at that and for some
v. This implies k,
4.3 STABILITY FOR PERTURBATION OF DOMINATION STRUCTURE 109 V2
V2
-+---'---.j-"'----+-
-v
Vt
-1
-1
v > 0
v=O
Fig. 4.12. Example 4.3.1.
which contradicts the assumption yk E Nitf). Therefore y must be contained in N 2(v). Namely, N 2 is upper semicontinuous at v and the proof is completed. Some examples are given in the following to illustrate the necessity of each condition in the theorem. Example 4.3.1
(Closedness of the Set Y-Fig. 4.12)
Let
v =0,
v = {v E R: v ~ O},
for all Y E Y. Then, N2(v) = {(Yl'Y2): (Yl)2 N 2(0) = {(Yl'Y2): (Yl)2 Example 4.3.2
+ (Y2)2 = + (Y2)2 =
1, Yl ~ 0, Y2 ~ vy,}
for
v> 0,
1, Yl ~ 0,Y2 < O}.
(Lower Semicontinuity of the Map D(y, .) at v-Fig. 4.13)
Let
v=
{v
E
R: v > O},
Y = {(Yl,Y2): (Yl)2 D(y, v) = R~
v = 1,
+ (Y2)2
for all Y E Y
~ 11} C R 2,
if v > 0, v ¥- 1, for all y.'
110
4
STABILITY Y2
V2
D 1
~.-.-:
--'---~--
.
dl
o
-1
-1
v =1
v > 0, v F 1
v = 1
Fig. 4.13. Example 4.3.2.
Then, Nz(v) = {(Yl'YZ): (Yl)Z + (YZ)Z = 1, Yl ~ 0, Yz ~ O}
v > 0, v ¥= 1,
for
= {(Yl' Yz): (Yl)Z + (Yz)Z =
Nz{l) Example 4.3.3
1, Yl ~ 0, Yz ~ Yl}'
(Convexity of the Set D(y, v)-Fig. 4.14)
Let
fJ=l,
V={vER:O 0,
Example 4.3.6 (Boundedness of the Set Y-Fig.4.17)
In Example 4.3.5, we replace Y by a new set defined by
Then, for
v> 0,
,
114
4
STABILITY
-1
0
-------+---
Yl
.1 ' . ' "
. I
' .
. .: ,', D(vl, . '. ',I "
'. .. '.
Fig. 4.17. Example 4.3.6.
(External Stability of N 2(v), Upper Semicontinuity of D(·, v) on Y for v near v-Fig. 4.18)
Example 4.3.7
Let
v=
{v
Y
{y: 0
=
lin: n is a positive integer} u {O},
=
~
y
I}
~
{
R,
s kin -
{d : 0 ~ d D(y, v) =
c
v = 0,
y}
(k - l)ln < y ~ kin,
for
k = 1",., n,
{O}
y
for
=
0
if v = lin.
Then,
= {O},
N2(v)
for
N 2(O) = {y: 0
kin -y , 0
(k - 1lin
,
-'d
lin
0,
v # 1,
Y E Y.
Then,
Nz(v)
=
{(YI' Yz): (Yt)Z + (yzf = 1, YI ~ 0, YI - Yz ~ O} for
v > 0, v # 1,
N z(1) = {(YI' Yz): (YI)Z
+ (Yz)Z
=
1, YI ~ 0, Yz ~ O}.
118
4
STABILITY
d2
.
..
. . . . "
,
:
'
1,
• • s • : • ~.
.'
d1
o
0
d1
-2v
Y2 ~ v o 0
v=o
Y2
Y2 1 .
. .. .'.::" ... . . . . : " "
..
...
.' 0. . ~ .
'. '.
'~
.:
"
.
~
.
".-
"..
" . : '
"
.... ".
0..
. . ...' .. . .. .... : . " ': .. ; . . ,
.. --1-----......-Y1 o
..
"
",
v=O
0< v < 1 Fig. 4.22. Example 4.3.11.
Example 4.3.11
(Upper Semicontinuity of the Map D(·, f;) on Y-Fig.4.22)
Let
v=
{v E R : 0
~
v
~
1},
f;
= 0,
Z Y = {(Yl'YZ):O ~ Yl ~ 1, 0 ~ Yz ~ 1} c R , if Y2 ~ D, V > 0, if Y2 < v, D > 0, if Y2 > 0, if Y2
= O.
4.4
Then, Nz(v) = {(Yt,Yz): Yt ~ 0,
for Nz(O)
119
STABILITY IN THE DECISION SPACE
°
; y. Then, since yk - yk --> Y- Y E int D, l - yk E int D for sufficiently large k. Hence, l ¢ tS'(Y(u k), int D) ::J tS'(Y(u k), D) ::J Ql (Uk), which contradicts the assumption. Therefore y E Ql(U). This completes the proof. Indispensability of conditions (i) and (ii) in Theorem 4.5.1 can be understood by Examples 4.2.1, 4.2.2, and 4.2.5. Though the second condition is indispensable, it may be too restrictive. For example, let Y(u) = {(Yl' Y2): (Ylf D
=
+ (Y2)2
~ I}
for all
u
E
U = R,
R~.
Then, Ql(U) = {(Yl'Y2):(Yl)2
+ (Y2)2
~ l'Yl
< O,Y2 < O}
for all
u,
124
4
STABILITY
and so Ql is not upper semicontinuous at any u E U, despite the fact that this example is quite natural. Theorem 4.5.2
The map Ql is lower semicontinuous at U E U if the following conditions are satisfied: (i) the map Y is continuous at u; (ii) Y is uniformly compact near u; (iii) Y(u) is D-closed for every U E U near U. Proof Let
and Since Yis lower semicontinuous at U, there exist a number m and a sequence {l} such that l E Y(uk ) for k ~ m and l ~ y. Since Y(uk ) is bounded (condition (ii)) and D-closed (condition (iii))for all k sufficiently large, l E Y(uk) C I!(Y(u k)) + D c cl Ql(Uk) + D by Theorems 3.2.10 and 3.2.11. Thus, we may assume that there exist sequences {yk} and {1} such that yk E cl Ql(Uk), 1 E Ql(Uk), l - yk E D, and Ilyk -111 < 11k, for k ~ m. Since Y is uniformly compact near {1} may be assumed to converge to some y. Since Y is upper semicontinuous at U, y E y(u). Since
u,
lit - yll
~
lit -111 + 111 - yll < 11k + 111 - YII,
yk ~ Yas k ~ 00. Hence l - yk ~ Y - Y E D, as D is closed. Since y E Ql(U), Y = y. Thus, we have established that 1 ~ Yand 1 E Ql(Uk) for k ~ m. This completes the proof. In view of Examples 4.2.6-4.2.9, each condition of the theorem cannot be dispenced with. Theorem 4.5.3
The map Q2 is upper semicontinuous at are satisfied:
vE V if the following conditions
(i) Y is a closed set; (ii) the map int D is lower semicontinuous at (iii) Q2(V) = I!(Y, int D(v)).
v;
Proof Let
and
4.5
125
STABILITY OF PROPERLY EFFICIENT SOLUTIONS
Since Y is a closed set, y E Y. If we suppose that there exists y E Y such that
y-
y ¢ Q2(V) =
G(Y, int D(v)),
Y E int D(v).
Since Y' - y --+ y - y #- 0, in view of Lemma 4.2.1, Y' - y E D(tf)\{O} for all k sufficiently large. Then, for those k, Y' ¢ Q2(Vk), which leads to a contradiction. Hence y E Qz{v), and the proof is complete. Indispensability of each condition can be understood by Examples 4.2.1, 4.3.2, and 4.3.4. Theorem 4.5.4
The map satisfied:
Q2
is lower semicontinuous at
vE V if the following condition is
(i) the map D is upper semicontinuous at V. Proof Let and
Y E Qz{v).
Then, in view of Theorem 3.1.2, there exists a convex cone D' such that
y E G(Y, D')
and
D(v)\{O} c: int D'.
We will prove that D(vk)\{O} c: int D' for all k sufficientlylarge. If we suppose the contrary, then, by considering a subsequence if necessary, we may take a sequence {dk } such that dk E D(v"),
IId k ll
=1
and
dk ¢ int D'
for all
k
= 1,2, ....
a
We may assume without loss of generality that {d k } converges to some with Iia II = 1. Then, E D(v), since D is upper semicontinuous at v. On the other hand, ¢ int D' since d k ¢ int D' for any k. This contradicts the fact that D(v)\{O} c: int D'. Therefore, D(tf)\{O} c: int D' for all k sufficiently large. Hence y E Q2(tf) = &'(Y, D(vk)) forthose k by Theorem 3.1.2, and the proof of the theorem is completed.
a
a
Examples 4.3.5, 4.3.6, and 4.3.10 illustrate that the condition (i) is essential to the theorem. Remark 4.5.1
Comparing Theorems 4.5.1-4.5.3 with Theorems 4.2.1, 4.2.2, and 4.3.1, we can conclude that the conditions stated in the theorems in this subsection are also sufficient for the upper and lower semicontinuity of the maps N 1
126
4
STABILITY
and N 2' respectively, if Q; is replaced by N; (i = 1,2). However, the condition in Theorem 4.5.4 is not sufficient for the lower semicontinuity of N 2 as can be seen by Examples 4.3.5 and 4.3.6. Remark 4.5.2
In a linear multiobjective optimization problem, proper efficiency is equivalent to efficiency (Theorem 3.1.7). Therefore, the results in this subsection are valid for the stability of the efficient set in linear problems.
5
LAGRANGE DUALITY
This chapter will be devoted to Lagrange duality for efficient solutions in multiobjective optimization where the domination structure is supposed to be a pointed closed convex cone. As was seen in the previous chapters, efficient solutions correspond to minimal (maximal) solutions in ordinary mathematical programming. Therefore, for convenience in this chapter we define Dminimizing as finding efficient solutions with respect to a cone ordering SD; in particular, we define minimizing for cases with the cone ordering with R~. Similarly we use the notation Minj, Y for representing &'(Y, SD); in particular, D-maximization and Maxj, are used in a we define Min Y as &'(Y, SR~)' similar fashion for the cone ordering S_ D' The first section gives several results on duality in linear multiobjective programming, which are mainly due to Gale et al. [G2], Isermann [15-17], and Nakayama [N6]. The next section develops duality in nonlinear multiobjective optimization in parallel with ordinary convex programming, which was originally developed by Tanino and Sawaragi [T7, T9]. Some of modified versions are due to Nakayama [N5]. The third section considers duality from a geometric viewpoint. A geometric meaning of the vector-valued Lagrangian function will be given. Other duality theorems will be derived from geometric consideration. The main parts of this section are due to Nakayama [N5, N7]. Earlier results along a similar line can be found in Nieuwenhuis [N15] and Jahn [11].
5.1
Linear Cases
Possibly the first work on duality for multiobjective optimization was given by Gale et al. [G2] for linear cases. They considered the following matrix optimization problem. 127
128
5
LAGRANGE DUALITY
Let D, Q, M, and N be pointed convex polyhedral cones in RP, R'", R", and R', respectively. This means, in particular, that int DO ¥= 0. In what follows, we shall suppose that int N ¥= 0. Furthermore, we shall identify the set of all m x n matrices with R'"":". This notation is also valid for matrices with other dimensions. Define f+:= {K
E
Rpx':KN cD},
{K
E
RPx , : K(int N) c D\{O}}.
$"+
Then, the order
~
:=
for p x r matrices is introduced as follows:
:it+,
if and only if K 1
-
K2
E
if and only if K 1
-
K2
E $"+.
The following problem is a simple extension of that given by Gale et al. [G2]: Minimize Cx ~DKy,
Ax
K
subject to
x
~QBy,
~MO,
Here, A E B"?", BE R"?". C E RPX", K E RPX', x E R", and y ERr. The dual problem associated with problem (PGKT) is then given by (D GKT)
Maximize
K
subject to
Remark 5.1 .1
The problems (P GKT) and (D GKT) represent a class of matrix optimization that includes vector optimization as a special case. In fact, in the case where Band K are vectors and y is a positive scalar, problem (P G KT) reduces to the usual formulation of vector optimization problems; that is, Minimize Cx ~Dk,
k Ax ~Q b,
subject to X
~MO.
The dual problem associated with the problem (PKGT) then becomes Maximize
k
subject to
Before proceeding to duality for the stated problem, we will prove the theorem of alternative in a generalized form.
5.1
129
LINEAR CASES
Lemma 5.1.1
For a matrix A
E
R" x" and a convex cone S c R", set AS:= {Ax:x
E
S}.
Then
Proof Easy. Lemma 5.1.2 (Farkas Lemma)
In order that N O.
Proof The given A condition $> BTA ¢ N°\ {OJ for any A E QO n ( - AM)O $>
Ax
~ Q
b for some x
$>
Ax
~
By for some x ~
Q
~
M
0 (from Lemma 5.1.2) M
0 and y > N O.
= By)
130
5
LAGRANGE DUALITY
Lemma 5.1.4
K is an (i)
efficient solution to problem (P GKT) if and only if
Ky
Cx ~D
for some X E M and y E int
N such that Ax
~Q
By, and
(ii) CX::;D Ky for no x EM and yEN such that Ax ~Q By. Similarly, K is an efficient solution to the problem (D G KT) , if and only if " ') (11
> KATflBTi" A =w
for some i (i')
E
QO and ji
BTA ~N'
E
int DO such that ATi ~Mo CTji, and
KTfl
for no A E QO and fl
E
DO such that ATA ~Mo CTfl.
Proof We shall prove here the first part of the lemma, since the second one follows readily in a similar fashion.
if Suppose that K satisfying condition (i) is not a solution to problem (P GKT)' Then there exists a matrix K' such that K' ::;
K
and Cx ~DK'y
for some X ~M 0 and y >N 0 such that Ax ~Q CX ~D K'y ::;D Ky, which contradicts condition (ii).
By.
Hence, we have
only if Suppose, to the contrary, that K does not satisfy condition (ii). Then there exist some x' E M and y' E N such that and By taking condition (i) into account, it follows that C(x
+ x') ::;D K(y + y')
for x + x' E M and y + y' E int N such that A(x + x') ~Q B(Y + y'). Take a vector d' E D\{O} in such a way that d' ::;D K(y + y') - C(x + x'). Choose a matrix fo..K E .ff+ such that fo..K(Y + y') = d'. In fact, for some vector e of N°
5.1
131
LINEAR CASES
with <e, y + i) = 1, a possible 11K is given by 11K where di is the ith component of d'. Then
CK -
I1K)(y
+ y')
.= (d'le, d~
e, ... , d~e?,
~D C(X + x'),
and K'
which implies that
:=
K-
11K ::; K,
K cannot be a solution to
the problem
(PGKT)'
A duality between the problems (PGKT) and (D o KT ) can be stated as follows.
Theorem 5.1.1 (Gale-Kuhn- Tucker Duality) (i) A matrix K is an efficient solution to problem (P OKT) if and only if it is a solution to problem (D OKT) ' (ii) If K is an efficient solution to problem (POKT) for some X ~M 0 and Y>N 0, then we have Ky = Cx. (iii) If K is a feasible solution to problem (P OKT) for some X ~M 0 and y >N 0 and is also a feasible solution to problem (D GKT ) for some J: E QO and {1 E int DO, then R is an efficient solution to both problems (PGKT) and (D o KT ) ·
Proof (i) It is easily shown that, for any two convex cones Sand T (S x
n° = so x
TO.
Now with a help of Lemma 5.1.3 we have for some
XE M
and y E int N
~T Ji (BT, - K)
[A]
~w 0
for no
such that
(AT,
-CT{~J
~MoO
(i') of Lemma 5.1.4.
(A, Ji)
E
QO x DO
132
5
LAGRANGE DUALITY
Similarly, (ii') of Lemma 5.1.4 - [
~~T]i
and
~MOXN'[
R~T]jl
for some
i
e QO
jl e int DO
_(-C,K{;] ~DO (-A, B{;] ~Q
for no
such that
(x, y) e M x N
0
_ (ii) of Lemma 5.1.4.
On the other hand, statements (ii) and (iii) follow immediately from Lemma 5.1.4. This completes the proof. Another formulation for vector optimization with more reciprocity was suggested by Kornbluth [K8]:
(Pd D-minimize (OK)
D-maximize
Cx
subject to
Ax ~Q By,
x
~M
0, Y >N O.
BT A
Theorem 5.1.2 (Kornbluth Duality)
x
There exists an efficient solution to problem (P K ) for some y = y if and only if there exists an efficient solution Ato problem (OK) for some J.1 = ft.. Proof See Kornbluth [K8].
The following relationship between problems (P GKT) and (P d (resp. (D GKT ) and (D K )) was revealed by Rodder [R9]. Theorem 5.1 .3
x,
x
(i) If (K, y) solv.es problem (P GKT), then is an efficient solution to problem (P K ) for y = y. (ii) If is an efficient solution to problem (P K ) for y = y, then there exists a matrix K such that (K, x, y) solves problem (P GKT)' (iii) Statements analogous to (i) and (ii) hold for problems (D G KT) and
x
(OK)'
5.1
133
LINEAR CASES
Proof (i) is obvious. Since (iii) is dual to (i) and (ii), we shall prove (ii) here. Suppose that x̂ is an efficient solution to problem (P_K). It is readily shown that there exists μ̂ ∈ D° such that
⟨μ̂, Cx̂⟩ ≤ ⟨μ̂, Cx⟩ for all x ≥_M 0 such that Ax ≥_Q Bŷ.
Considering the dual problem associated with this scalarized linear programming problem, it follows that there exists λ̂ ∈ Q° such that
ŷ^T B^T λ̂ ≥ ŷ^T B^T λ for any λ ∈ Q° such that A^T λ ≤_{M°} C^T μ̂.
According to the well-known duality theorem in linear programming, we have ⟨μ̂, Cx̂⟩ = ŷ^T B^T λ̂. This condition implies that
K̂ŷ = Cx̂ and μ̂^T K̂ = λ̂^T B^T
have a common solution K̂ (see, for example, Penrose [P1]). Hence, it follows immediately from (iii) of Theorem 5.1.1 that K̂ is an efficient solution to problem (P_GKT). This completes the proof.

On the other hand, Isermann [I6] has given a more attractive formulation without auxiliary parameters such as ŷ and μ̂. In the following, we shall consider it in an extended form. Let ℒ° be the class of p × m matrices Λ for which there exists μ ∈ int D° such that Λ^T μ ∈ Q° and A^T Λ^T μ ≤_{M°} C^T μ. The primal and dual problems are then defined as
(P_I) D-minimize {Cx : x ∈ X}, where X := {x ∈ M : Ax ≥_Q b},
(D_I) D-maximize {Λb : Λ ∈ ℒ°}.

The following duality properties hold for these problems. Here, we denote by Min_D(P_I) (Max_D(D_I)) the set of D-minimal (D-maximal) points in the objective space of problem (P_I) (problem (D_I)).
Theorem 5.1.4† (i) Λb ≱_D Cx for all (Λ, x) ∈ ℒ° × X. (ii) Suppose that Λ̂ ∈ ℒ° and x̂ ∈ X satisfy Λ̂b = Cx̂. Then Λ̂ is a D-maximal solution to the dual problem (D_I), and x̂ is a D-minimal solution to the primal problem (P_I). (iii) Min_D(P_I) = Max_D(D_I).

Proof (i) Suppose, to the contrary, that there exist some x̂ ∈ X and Λ̂ ∈ ℒ° such that
Λ̂b ≥_D Cx̂.  (5.1.2)
Note here by definition that there exists μ̂ ∈ int D° such that Λ̂^T μ̂ ∈ Q° and A^T Λ̂^T μ̂ ≤_{M°} C^T μ̂. Scalarizing (5.1.2) by μ̂ and using x̂ ∈ M, Ax̂ ≥_Q b then leads to a contradiction.
(iii) ... this implies the existence of solutions μ̂ + μ′ ∈ int D° and λ̂ + λ′ ∈ Q° to Eqs. (5.1.6), which is a contradiction. Rewriting Eqs. (5.1.6), we may see that there exist no λ ∈ Q° and no μ ∈ D° for which the corresponding inclusion built from (−A, b) and (−C, Λ̂b) lies in M° × R₊ \ {0}. Thus, from Lemma 5.1.5 below, there exists (x, α) ∈ int(M° × R₊)° = int(M × R₊) satisfying
−Ax + bα ∈ −Q and −Cx + Λ̂bα ∈ −D.
Since α > 0, setting x̄ = x/α we finally have
Ax̄ ≥_Q b and Cx̄ ≤_D Λ̂b.  (5.1.7)
By using result (i), the last relation reduces to
Cx̄ = Λ̂b.  (5.1.8)
From result (ii), relations (5.1.7) and (5.1.8) imply that x̄ is a D-minimal solution to the primal problem. This completes the proof.

The following lemma, which was used in the proof of Theorem 5.1.4, is a simple extension of that due to Gale [G1] and also a special case of Lemma 5.1.3 with M = {0}.
Lemma 5.1.5 Either (I) the corresponding alternative system holds for some λ ∈ Q°, or (II) By ∈ −Q for some y ∈ int N, but never both.
5.2 Duality in Nonlinear Multiobjective Optimization

This section is concerned with duality in the nonlinear multiobjective optimization problem formulated as follows:

(P) D-minimize {f(x) : x ∈ X}, where X := {x ∈ X′ : g(x) ≤_Q 0, X′ ⊂ R^n}.

Since efficient solutions correspond to minimal solutions in ordinary mathematical programming, it is hard to derive duality for efficient solutions in the absence of a compactness condition. Throughout this section, we impose the following assumptions:

(i) X′ is a nonempty compact convex set;
(ii) D and Q are pointed closed convex cones with nonempty interiors of R^p and R^m, respectively;
(iii) f is continuous and D-convex;
(iv) g is continuous and Q-convex.

Under these assumptions, it can be readily shown that for every u ∈ R^m, both sets

X(u) := {x ∈ X′ : g(x) ≤_Q u}

and

Y(u) := f[X(u)] := {y ∈ R^p : y = f(x), x ∈ X′, g(x) ≤_Q u}  (5.2.1)

are compact, X(u) is convex, and Y(u) is D-convex. Although these conditions might seem too strong, something like them is more or less inevitable as long as we consider efficient solutions. Several authors have recently attempted to relax the compactness condition by using the notion of inf instead of min, which will be restated later.
5.2.1 Perturbation (or Primal) Map and Lagrange Multiplier Theorem

Let us consider the primal problem (P) by embedding it in a family of perturbed problems with Y(u) given by (5.2.1):

(P_u) D-minimize Y(u).

Clearly, primal problem (P) is identical to problem (P_u) with u = 0. Now define the set
Γ := {u ∈ R^m : X(u) ≠ ∅}.
It is known that the set Γ is convex (see, for example, Luenberger [L14]).

Definition 5.2.1 (Perturbation (or Primal) Map) The point-to-set map W: Γ → R^p defined by
W(u) := Min_D Y(u)
is called a perturbation (or primal) map.

Observe that the perturbation map corresponds to the perturbation (or primal) function
w(u) = min{f(x) : x ∈ X′, g(x) ≤ u}
in ordinary mathematical programming. Obviously, the original problem (P) can be regarded as determining the set W(0) and f^{−1}[W(0)] ∩ X. In the following we shall investigate the properties of W.

Proposition 5.2.1
For every u ∈ Γ, W(u) + D = Y(u) + D.

Proof Note first that W(u) = Min_D Y(u) ⊂ Y(u). On the other hand, since Y(u) is D-compact, Theorem 3.2.9 yields Y(u) ⊂ W(u) + D, from which
Y(u) + D ⊂ W(u) + D + D = W(u) + D,
because D is a convex cone. This completes the proof.

Proposition 5.2.2 For each u ∈ Γ, W(u) is a D-convex set in R^p.

Proof Immediate from Proposition 5.2.1 and the D-convexity of Y(u).
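On a finite surrogate of X′ the objects above are easy to compute directly. The following is a minimal sketch with assumed toy data (not from the book; D = R²₊, a scalar constraint): it builds Y(u), the perturbation map W(u) = Min_D Y(u), and checks the external-stability inclusion Y(u) ⊂ W(u) + D used in the proof of Proposition 5.2.1.

```python
# Assumed toy data (D = R^2_+, scalar constraint level u); not from the book.
def d_min(pts):
    dom = lambda a, b: all(i <= j for i, j in zip(a, b)) and a != b
    return [p for p in pts if not any(dom(q, p) for q in pts)]

f = {0: (3, 0), 1: (2, 1), 2: (1, 2), 3: (0, 3)}   # objective values f(x)
g = {0: -1, 1: 0, 2: 1, 3: 2}                      # constraint values g(x)

def Y(u):                      # attainable objective values under g(x) <= u
    return [f[x] for x in f if g[x] <= u]

def W(u):                      # D-minimal values: the perturbation map
    return d_min(Y(u))

def in_W_plus_D(y, u):         # y ∈ W(u) + D, i.e. some w ∈ W(u) with w <= y
    return any(all(wi <= yi for wi, yi in zip(w, y)) for w in W(u))

print(W(0))                                    # -> [(3, 0), (2, 1)]
print(all(in_W_plus_D(y, 1) for y in Y(1)))    # -> True: Y(u) ⊂ W(u) + D
```

For finite (hence D-compact) Y(u), external stability always holds, which is exactly the role Theorem 3.2.9 plays in the proof above.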
Proposition 5.2.3 The map W is D-monotone on Γ; namely,
W(u¹) ⊂ W(u²) + D
for any u¹, u² ∈ Γ such that u¹ ≤_Q u².

Proof Obviously, Y(u¹) ⊂ Y(u²) whenever u¹ ≤_Q u². Hence
W(u¹) ⊂ Y(u¹) ⊂ Y(u²) ⊂ W(u²) + D.

Proposition 5.2.4 W is a D-convex point-to-set map on Γ.

Proof In view of Proposition 5.2.1, it suffices to show that
αY(u¹) + (1 − α)Y(u²) ⊂ Y(αu¹ + (1 − α)u²) + D
for any u¹, u² ∈ Γ and any α ∈ [0, 1]. If we suppose that y ∈ αY(u¹) + (1 − α)Y(u²), then there exist x¹, x² ∈ X′ such that g(x¹) ≤_Q u¹, g(x²) ≤_Q u², and
y = αf(x¹) + (1 − α)f(x²).
Since X′ is a convex set, αx¹ + (1 − α)x² ∈ X′. Furthermore, from the Q-convexity of g,
g(αx¹ + (1 − α)x²) ≤_Q αg(x¹) + (1 − α)g(x²) ≤_Q αu¹ + (1 − α)u²,
and, thus,
αx¹ + (1 − α)x² ∈ X(αu¹ + (1 − α)u²),
which implies
f(αx¹ + (1 − α)x²) ∈ Y(αu¹ + (1 − α)u²).
On the other hand, from the D-convexity of f,
αf(x¹) + (1 − α)f(x²) ∈ f(αx¹ + (1 − α)x²) + D.
Finally we have
y ∈ Y(αu¹ + (1 − α)u²) + D.
This completes the proof of the proposition.

Remark 5.2.1 Propositions 5.2.3 and 5.2.4 correspond to the fact that the primal function w in ordinary mathematical programming is monotonically nonincreasing and convex.
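The D-monotonicity of Proposition 5.2.3 can be observed numerically. A self-contained sketch on assumed toy data (not from the book; D = R²₊): relaxing the constraint level from u¹ to u² ≥ u¹ leaves every old efficient value inside W(u²) + D.

```python
# Toy check of Proposition 5.2.3 (assumed data, D = R^2_+): W(u1) ⊂ W(u2) + D.
def d_min(pts):
    dom = lambda a, b: all(i <= j for i, j in zip(a, b)) and a != b
    return [p for p in pts if not any(dom(q, p) for q in pts)]

f = {0: (4, 0), 1: (2, 1), 2: (1, 3), 3: (0, 5)}
g = {0: 2, 1: 1, 2: 0, 3: -1}

def W(u):
    return d_min([f[x] for x in f if g[x] <= u])

def below(w_set, y):               # y ∈ w_set + D ?
    return any(all(wi <= yi for wi, yi in zip(w, y)) for w in w_set)

u1, u2 = 0, 2                      # u1 <= u2
print(all(below(W(u2), y) for y in W(u1)))   # -> True
```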
It is well known that, in scalar convex optimization, the convexity of w ensures that by adding an appropriate linear functional ⟨λ, u⟩ to w(u), the resulting combination w(u) + ⟨λ, u⟩ is minimized at u = 0 (see, for example, Luenberger [L14]). Geometrically speaking, this appropriate linear functional ⟨λ, u⟩ leads to a supporting hyperplane of epi w at u = 0. In an analogous manner, the D-convexity of the point-to-set map W ensures that, if an appropriate linear vector-valued functional Λu is added to W(u), there will exist no point of W(u) + Λu that dominates a given point of W(0). This linear vector-valued functional leads geometrically to a supporting conical variety (i.e., a translation of a cone) of W(0) at the given point. Such a geometric meaning will be discussed in more detail in the next subsection. We now have a Lagrange multiplier theorem for multiobjective optimization.

Theorem 5.2.1 If x̂ is a properly efficient solution to problem (P), and if Slater's constraint qualification holds (i.e., there exists x′ ∈ X′ such that g(x′) <_Q 0), then there exists a p × m matrix Λ ∈ ℒ such that f(x̂) ∈ Min_D{f(x) + Λg(x) : x ∈ X′}.

... Making ‖α‖ sufficiently large and taking Λ ∈ ℒ such that μ̂^T Λ = λ̂, we obtain the relation
⟨μ̂, Λg(x̂)⟩ − ⟨μ̂, Λ̂g(x̂)⟩ > 0,
which contradicts relation (5.2.4). Thus, g(x̂) ≤_Q 0.
Using this result, Λg(x̂) ≤_D 0 for Λ ∈ ℒ. On the other hand, substituting Λ = 0 into Eq. (5.2.3) yields Λ̂g(x̂) ≱_D 0. Finally, we have Λ̂g(x̂) = 0.

Sufficiency. Since Λg(x̂) ∈ −D for any Λ ∈ ℒ as long as g(x̂) ≤_Q 0, it follows that
Max_D{Λg(x̂) : Λ ∈ ℒ} = {0}.
Thus, from Λ̂g(x̂) = 0 we have L(x̂, Λ̂) ∈ Max_D{f(x̂) + Λg(x̂) : Λ ∈ ℒ}. This result and condition (i) imply that the pair (x̂, Λ̂) is a saddle point for L(x, Λ).

Corollary 5.2.1 Suppose that x̂ is a properly efficient solution to problem (P) and that Slater's constraint qualification is satisfied. Then there exists a p × m matrix Λ̂ ∈ ℒ such that (x̂, Λ̂) is a saddle point for the vector-valued Lagrangian function L(x, Λ).

Proof Immediate from Theorems 5.2.1 and 5.2.2.

Thus, we have verified that a properly efficient solution to problem (P), together with a suitable matrix, gives a saddle point for the vector-valued Lagrangian function under the convexity assumptions and the appropriate regularity conditions. Conversely, a saddle point provides a sufficient condition for optimality in problem (P).

Theorem 5.2.3 If (x̂, Λ̂) ∈ X′ × ℒ is a saddle point for the vector-valued Lagrangian function L(x, Λ), then x̂ is an efficient solution to problem (P).

Proof Suppose that x̂ is not an efficient solution to problem (P); namely, that there exists x ∈ X such that f(x) ≤_D f(x̂). Since g(x) ≤_Q 0 and Λ̂ ∈ ℒ yield Λ̂g(x) ∈ −D, we finally have f(x) + Λ̂g(x) ≤_D f(x̂), which contradicts L(x̂, Λ̂) ∈ Min_D{L(x, Λ̂) : x ∈ X′}.
5.2.3
Dual Map and Duality Theory
Recall that the dual function in ordinary convex optimization is defined by ¢(A) = inf{f(x)
+ (A., g(x» : x
EX'}.
To begin with, we define a point-to-set map corresponding to the dual function.
144
5
Definition 5.2.4
LAGRANGE DUALITY
(Dual map)
Define for any 1\. E 2, 0(1\.) = {L(x, A): x
E
X'} = {f(x)
+ 1\.g(x): x
E
X'}
and (1\.) = Min n O(1\.).
The point-to-set map : 2
-+
RP is called a dual map.
Under the terminology, we can define a dual problem associated with the primal problem (P) as follows (Tanino and Sawaragi [T9]): D-maximize
U (1\.).
hEY
In the following, we summarize several properties of dual maps.

Proposition 5.2.5 For each Λ ∈ ℒ, Φ(Λ) is a D-convex set in R^p.

Proof Since the maps f and g are D-convex and Q-convex, respectively, the map L(·, Λ) is D-convex over X′ for each fixed Λ ∈ ℒ. Hence Ω(Λ) is a compact D-convex set in R^p, for X′ is a compact convex set. Therefore,
Φ(Λ) + D = Ω(Λ) + D
by Theorem 3.2.9, and thus Φ(Λ) is also D-convex.

Proposition 5.2.6 Φ is a D-concave point-to-set map on ℒ. Namely, for any Λ¹, Λ² ∈ ℒ and any α ∈ [0, 1],
Φ(αΛ¹ + (1 − α)Λ²) ⊂ αΦ(Λ¹) + (1 − α)Φ(Λ²) + D.

Proof Note that Theorem 3.2.9 yields
Min_D A ⊂ Min_D B + D
for any two sets A and B such that A ⊂ B and B is compact. From this and the relation
{α(f(x) + Λ¹g(x)) + (1 − α)(f(x) + Λ²g(x)) : x ∈ X′}
⊂ α{f(x) + Λ¹g(x) : x ∈ X′} + (1 − α){f(x) + Λ²g(x) : x ∈ X′},
we have
Φ(αΛ¹ + (1 − α)Λ²) = Min_D{f(x) + (αΛ¹ + (1 − α)Λ²)g(x) : x ∈ X′}
= Min_D{α(f(x) + Λ¹g(x)) + (1 − α)(f(x) + Λ²g(x)) : x ∈ X′}
⊂ Min_D[α{f(x) + Λ¹g(x) : x ∈ X′} + (1 − α){f(x) + Λ²g(x) : x ∈ X′}] + D.
Hence, in view of Proposition 3.1.3,
Φ(αΛ¹ + (1 − α)Λ²) ⊂ Min_D α{f(x) + Λ¹g(x) : x ∈ X′} + Min_D(1 − α){f(x) + Λ²g(x) : x ∈ X′} + D
= α Min_D{f(x) + Λ¹g(x) : x ∈ X′} + (1 − α) Min_D{f(x) + Λ²g(x) : x ∈ X′} + D
= αΦ(Λ¹) + (1 − α)Φ(Λ²) + D.

Remark 5.2.3
Proposition 5.2.6 is an extension of the fact that the dual function φ(λ) is concave.

We can now establish the following relationship between the dual map Φ(Λ) and the primal map W(u), which is an extension of the following relation between the dual function φ(λ) and the primal function w(u):
φ(λ) = inf{w(u) + ⟨λ, u⟩ : u ∈ Γ}.

Proposition 5.2.7 The following relation holds:
Φ(Λ) = Min_D ⋃_{u∈Γ} (W(u) + Λu).
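Proposition 5.2.7 can be checked on a small discrete surrogate. In the sketch below (assumed toy data, not from the book; D = R²₊, scalar perturbation u, and Λ a 2 × 1 matrix with nonnegative entries so that ΛQ ⊂ D), the finite set of attained constraint levels stands in for Γ, and both sides of the relation coincide.

```python
# Toy verification of Φ(Λ) = Min_D ⋃_u (W(u) + Λu) on assumed discrete data.
def d_min(pts):
    dom = lambda a, b: all(x <= y for x, y in zip(a, b)) and a != b
    return {p for p in pts if not any(dom(q, p) for q in pts)}

f = {0: (3, 0), 1: (1, 1), 2: (0, 3)}
g = {0: 1, 1: 0, 2: -1}
L = (1, 1)                       # Λ as a 2x1 matrix acting on scalar u

omega = [(f[x][0] + L[0] * g[x], f[x][1] + L[1] * g[x]) for x in f]
phi = d_min(omega)               # Φ(Λ) = Min_D Ω(Λ)

union = []
for u in sorted(set(g.values())):            # attained levels stand in for Γ
    Yu = [f[x] for x in f if g[x] <= u]
    for w in d_min(Yu):                      # W(u) = Min_D Y(u)
        union.append((w[0] + L[0] * u, w[1] + L[1] * u))
rhs = d_min(union)

print(phi == rhs)   # -> True on this instance
```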
Proof Let
y¹ = f(x¹) + Λg(x¹)
for any x¹ ∈ X′. Then, letting u¹ = g(x¹),
y¹ = f(x¹) + Λu¹.
Note here that f(x¹) ∈ W(u¹) + D, because f(x¹) ∈ Y(u¹). Hence
y¹ ∈ W(u¹) + Λu¹ + D ⊂ ⋃_{u∈Γ} (W(u) + Λu) + D,
from which
Ω(Λ) ⊂ ⋃_{u∈Γ} (W(u) + Λu) + D,
and
Ω(Λ) + D ⊂ ⋃_{u∈Γ} (W(u) + Λu) + D.
Conversely, suppose that
y¹ ∈ W(u¹) + Λu¹
for some u¹ ∈ Γ, namely, that
y¹ − Λu¹ ∈ Min_D Y(u¹).
Thus y¹ − Λu¹ = f(x¹) for some x¹ ∈ X′ such that g(x¹) ≤_Q u¹. Then, for Λ ∈ ℒ, Λg(x¹) ≤_D Λu¹, and hence
y¹ ∈ L(x¹, Λ) + D ⊂ Ω(Λ) + D.
Therefore,
⋃_{u∈Γ} (W(u) + Λu) ⊂ Ω(Λ) + D,
and thus
⋃_{u∈Γ} (W(u) + Λu) + D ⊂ Ω(Λ) + D.
Finally, we have
Ω(Λ) + D = ⋃_{u∈Γ} (W(u) + Λu) + D,
and hence
Min_D(Ω(Λ) + D) = Min_D(⋃_{u∈Γ} (W(u) + Λu) + D),
which establishes the proposition, because in general Min_D A = Min_D(A + D) whenever D is a pointed convex cone (cf. Proposition 3.1.2).
Remark 5.2.4 Recall that Theorem 5.2.1 implies that, given a properly efficient solution x̂, f(x̂) ∈ Φ(Λ) for some Λ ∈ ℒ. Hence, we may see from Proposition 5.2.7 that
f(x̂) ∈ Min_D ⋃_{u∈Γ} (W(u) + Λu)
for some Λ ∈ ℒ. This is an extension of a property of the primal function w in ordinary convex optimization stated in Subsection 5.2.1.

The following theorems represent some properties of efficient solutions to the primal problem (P) in connection with the dual map, and so might be considered as duality theorems for multiobjective optimization.

Theorem 5.2.4 (Weak Duality) For any x ∈ X, Λ ∈ ℒ, and y ∈ Φ(Λ),
y ≱_D f(x).

Proof For any y with the property
y ≱_D f(x) + Λg(x) for all x ∈ X′,
the result of the theorem follows immediately from Lemma 2.3.3, because for any x ∈ X and Λ ∈ ℒ, Λg(x) ≤_D 0.
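Weak duality is easy to observe on a finite surrogate. The sketch below uses assumed toy data (not from the book; D = R²₊, Q = R₊, so ℒ is represented by nonnegative 2 × 1 matrices): for several choices of Λ, no point of Φ(Λ) dominates a feasible objective value from above.

```python
# Toy check of Theorem 5.2.4 (assumed data): no y ∈ Φ(Λ) satisfies y ≥_D f(x).
def dominates(a, b):  # a ≤_D b with a ≠ b, D = R^2_+
    return all(x <= y for x, y in zip(a, b)) and a != b

f = {0: (3, 0), 1: (1, 1), 2: (0, 3)}
g = {0: 1, 1: 0, 2: -1}
feasible = [x for x in f if g[x] <= 0]

violations = 0
for L in [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]:   # Λ with ΛQ ⊂ D
    omega = [(f[x][0] + L[0] * g[x], f[x][1] + L[1] * g[x]) for x in f]
    phi = [p for p in omega if not any(dominates(q, p) for q in omega)]
    for y in phi:
        for x in feasible:
            if dominates(f[x], y):    # y ≥_D f(x) means f(x) ≤ y, f(x) ≠ y
                violations += 1
print(violations)   # -> 0
```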
Theorem 5.2.5 (i) Suppose that x̂ ∈ X, Λ̂ ∈ ℒ, and f(x̂) ∈ Φ(Λ̂). Then ŷ = f(x̂) is an efficient point of the primal problem (P) and also of the dual problem (D_TS). (ii) Suppose that x̂ is a properly efficient solution to problem (P) and that Slater's constraint qualification is satisfied. Then
f(x̂) ∈ Max_D ⋃_{Λ∈ℒ} Φ(Λ).

Proof (i) If f(x̂) is not an efficient point of problem (P), then there exists x ∈ X such that f(x) ≤_D f(x̂). Since g(x) ≤_Q 0 and Λ̂ ∈ ℒ yield Λ̂g(x) ∈ −D, we finally have f(x) + Λ̂g(x) ≤_D f(x̂), which contradicts f(x̂) ∈ Min_D{f(x) + Λ̂g(x) : x ∈ X′}. Hence f(x̂) is an efficient point of problem (P). Furthermore, suppose that f(x̂) ≤_D ŷ for some ŷ ∈ ⋃_{Λ∈ℒ} Φ(Λ). Let Λ̂ ∈ ℒ be such that ŷ ∈ Φ(Λ̂). Then, since Λ̂g(x̂) ≤_D 0, we have f(x̂) + Λ̂g(x̂) ≤_D ŷ, which contradicts ŷ ∈ Min_D{f(x) + Λ̂g(x) : x ∈ X′}. Therefore f(x̂) is also an efficient point of the dual problem (D_TS). (ii) Theorem 5.2.1 ensures that there exists a p × m matrix Λ̂ ∈ ℒ such that f(x̂) ∈ Φ(Λ̂), which leads to the conclusion via result (i).

5.3 Geometric Consideration of Duality

5.3.1 Geometric Meaning of Vector-Valued Lagrangian Functions
In Theorem 5.2.1, we established that for a properly efficient solution x̂ to problem (P) there exists a p × m matrix Λ such that ΛQ ⊂ D and
f(x) + Λg(x) ≰_D f(x̂) + Λg(x̂) for all x ∈ X′
under an appropriate regularity condition such as Slater's constraint qualification. The theorem in ordinary scalar convex optimization that corresponds to this theorem asserts the existence of a supporting hyperplane for epi w at (0, f(x̂)). On the other hand, in multiobjective optimization, for the D-epigraph of W(u), defined by
D-epi W := {(u, y) : u ∈ Γ, y ∈ R^p, y ∈ W(u) + D},
Theorem 5.2.1 geometrically implies the existence of a supporting conical variety (i.e., a translation of a cone) supporting D-epi W at (0, f(x̂)). We shall discuss this in more detail. All assumptions on f, g, D, and Q of the previous section are inherited in this section. As is readily shown, D-epi W is a closed convex set in R^m × R^p under our assumptions. Hereafter we shall assume that the pointed closed convex cone D is polyhedral. Then, as is well known, there exists a matrix M₁ such that
D = {y ∈ R^p : M₁y ≥ 0}.
Since D is pointed, M₁ has full rank p. We now have the following lemma.

Lemma 5.3.1 For a given Λ ∈ ℒ, the following two conditions are equivalent to each other:
(i) f(x) + Λg(x) ≰_D f(x̂) + Λg(x̂) for all x ∈ X′;  (5.3.1)
(ii) M₁f(x) + M₁Λg(x) ≰ M₁f(x̂) + M₁Λg(x̂) for all x ∈ X′.

Proof Immediate from Lemma 2.3.4.

Let M₂ := M₁Λ and define a cone K in R^m × R^p by
K := {(u, y) : M₁y + M₂u ≤ 0, u ∈ R^m, y ∈ R^p}.  (5.3.2)

For this cone K, the following properties hold. Let the y-intercept of K in R^m × R^p be denoted by Y_K, namely
Y_K := {y : (0, y) ∈ K, 0 ∈ R^m, y ∈ R^p};
then from the definition of M₁ we have Y_K = −D. Furthermore, let l(K) denote the lineality space of K, i.e., −K ∩ K. Then, since l(K) accords with Ker(M₁, M₂), it is m-dimensional. For
l(K) = {(u, y) ∈ R^m × R^p : M₁y + M₂u = 0} = {(u, y) ∈ R^m × R^p : y + Λu = 0},
because the s × p matrix M₁ has the maximal rank p. In addition, since the row vectors of M₁ are generators of D° and ΛQ ⊂ D, every row vector of M₂ is in Q°. We can now establish, with the help of Lemma 5.3.1, a geometric interpretation of Theorem 5.2.1. We say the cone K supports D-epi W at (ū, ȳ) if
K ∩ ((D-epi W) − (ū, ȳ)) ⊂ l(K).
Clearly, K supports D-epi W at (ū, ȳ) if and only if
M₁(y − ȳ) + M₂(u − ū) ≰ 0 for all (u, y) ∈ D-epi W,
or equivalently,
y + Λu ≰_D ȳ + Λū for all (u, y) ∈ D-epi W.
Therefore, we can conclude that Theorem 5.2.1 asserts the existence of supporting conical varieties (a translation of K) for D-epi W at (g(x̂), f(x̂)) (see Fig. 5.1).
Fig. 5.1. Geometric interpretation of vector-valued Lagrangian.
Remark 5.3.1 Observe that in ordinary scalar optimization, the supporting conical varieties become half-spaces, because the lineality is identical to the dimension of g. This accords with the well-known result.

We now investigate a relationship between supporting hyperplanes and supporting conical varieties. Let H(λ, μ : γ) be a hyperplane in R^m × R^p with the normal (λ, μ) such that
H(λ, μ : γ) := {(u, y) : ⟨λ, u⟩ + ⟨μ, y⟩ = γ}.
H⁻ and H̄⁻ are similarly defined by replacing the equality with ≤ (resp., <). In particular, let H(λ, μ) denote the supporting hyperplane for D-epi W with the inner normal (λ, μ); that is,
H(λ, μ) := H(λ, μ : γ̄), where γ̄ = sup{γ : H⁺(λ, μ : γ) ⊃ D-epi W}.  (5.3.3)

Lemma 5.3.2 The lineality space of the cone K given by (5.3.2) with M₂ = M₁Λ is included in the hyperplane H(λ, μ : 0) if and only if the matrix Λ satisfies Λ^T μ = λ.

Proof Since the s × p matrix M₁ has the maximal rank, M₁(y + Λu) = 0 is equivalent to y + Λu = 0. Hence, for M₂ = M₁Λ, ... the equivalence (ii) ⇔ (iv) (Theorem 5.2.2) and the implications (ii) or (iv) ⇒ (i) (Theorem 5.2.3 and Lemma 5.3.1) have already been shown, and hence we prove the remaining parts.
(iv) ⇒ (iii) Since the equivalence of (ii) and (iv) yields Λg(x̂) = 0, we have
f(x̂) = f(x̂) + Λg(x̂) = L(x̂, Λ) ∈ Min_D{f(x) + Λg(x) : x ∈ X′} = Φ(Λ).
That x̂ ∈ X is immediate.

(iii) ⇒ (iv) Note that for any y ∈ Φ(Λ) and any x ∈ X′,
y ≱_D f(x) + Λg(x).  (5.3.4)
Since f(x̂) ∈ Φ(Λ), putting x = x̂ into relation (5.3.4) with y = f(x̂) yields Λg(x̂) ≰_D 0. On the other hand, Λg(x̂) ≤_D 0 for Λ ∈ ℒ, because g(x̂) ≤_Q 0. Therefore, Λg(x̂) = 0 and
L(x̂, Λ) = f(x̂) ∈ Φ(Λ) := Min_D{L(x, Λ) : x ∈ X′}.

(i) ⇒ (iv) Under our assumptions on M₁, M₂, and Λ, if a conical variety of the cone K supports D-epi W at (0, f(x̂)), then for any (u, y) ∈ D-epi W,
M₁(y − f(x̂)) + M₂(u − 0) ≰ 0,
or equivalently, by Lemma 5.3.1,
y − f(x̂) + Λu ≱_D 0.
Since (g(x̂), f(x̂)) ∈ D-epi W, we have Λg(x̂) ≰_D 0. By an argument similar to that in the proof of (iii) ⇒ (iv), we can conclude Λg(x̂) = 0. Therefore, since (g(x), f(x)) ∈ D-epi W for any x ∈ X′, we have
f(x̂) = f(x̂) + Λg(x̂) ∈ Min_D{f(x) + Λg(x) : x ∈ X′}.

5.3.2 Another Geometric Approach to Duality
Now we shall take another approach to duality in multiobjective optimization. As in the previous section, the convexity assumptions on f and g are also imposed here, but X′ is not necessarily compact. Define
G := {(u, y) ∈ R^m × R^p : y ≥_D f(x), u ≥_Q g(x), for some x ∈ X′},
Y_G := {y : (0, y) ∈ G, 0 ∈ R^m, y ∈ R^p}.
We restate the primal problem as follows:
(P) D-minimize {f(x) : x ∈ X}, where X := {x ∈ X′ : g(x) ≤_Q 0, X′ ⊂ R^n}.
Associated with this primal problem, we shall consider the following two kinds of dual problems:
(D_S) D-maximize ⋃_{Λ∈ℒ} Y_{S(Λ)},
where
Y_{S(Λ)} := {y ∈ R^p : f(x) + Λg(x) ≰_D y for all x ∈ X′},
and
(D_H) D-maximize ⋃_{μ∈int D°, λ∈Q°} Y_{H(λ,μ)},
where
Y_{H(λ,μ)} := {y ∈ R^p : ⟨μ, f(x)⟩ + ⟨λ, g(x)⟩ ≥ ⟨μ, y⟩ for all x ∈ X′}.
The following results assert weak duality theorems for these problems.

Theorem 5.3.2 (i) For any y ∈ ⋃_{Λ∈ℒ} Y_{S(Λ)} and for any x ∈ X, y ≱_D f(x). (ii) For any y ∈ ⋃_{μ∈int D°, λ∈Q°} Y_{H(λ,μ)} and for any x ∈ X, y ≱_D f(x).

Proof Part (i) follows in a similar fashion to Theorem 5.2.4. To show part (ii), suppose that for some μ ∈ int D°, some λ ∈ Q°, and all x ∈ X′,
... we have h(u, y) ≥ 0 for all (u, y) ∈ G, and h(ū, ȳ) < 0 for sufficiently large ε > 0. Note that since μ¹ ∈ ∂D° and μ⁰ ∈ int D°, εμ¹ + μ⁰ ∈ int D° for any ε > 0. Therefore, the hyperplane H(ελ¹ + λ⁰, εμ¹ + μ⁰ : εγ¹ + γ⁰) with a sufficiently large ε > 0 is the desired one. This completes the proof.

Proposition 5.3.1 Suppose that G is closed and that there is at least one properly efficient solution to the primal problem. Then, under the condition of Slater's constraint qualification,
(Y_G)ᶜ ⊂ ⋃_{μ∈int D°, λ∈Q°} Y_{H(λ,μ)} ⊂ ⋃_{Λ∈ℒ} Y_{S(Λ)} ⊂ cl((Y_G)ᶜ).
Proof Let ŷ = f(x̂) be a properly efficient point in the objective space of the primal problem. Then there exists μ̄ ∈ int D° such that
⟨μ̄, ŷ⟩ ≤ ⟨μ̄, f(x)⟩ for all x ∈ X,
so that y ∈ Y_G; ... we finally have y ∈ cl Y_G. This completes the proof.
Theorem 5.3.4† Suppose that Y_G is a nonempty convex subset of R^p. Then the following hold:
(i) If inf{⟨μ̄, y⟩ : y ∈ Y_G} is finite for some μ̄ ∈ int D°, then J-normality implies N-normality.
(ii) If 0⁺(cl Y_G) = D, then N-normality implies J-normality.

Proof (i) Since cl Y_G ⊂ Y_{ΔG} is always true, we shall show the reverse inclusion. Define hyperplanes and associated half-spaces in the objective space as follows:
h′(μ : β) := ⟨μ, y⟩ − β,
H′(μ : β) := {y ∈ R^p : h′(μ : β) = 0},
H′⁺(μ : β) := {y ∈ R^p : h′(μ : β) ≥ 0},
H̄′⁺(μ : β) := {y ∈ R^p : h′(μ : β) > 0},
and
β̄ := inf{⟨μ̄, y⟩ : y ∈ Y_G}.

† Borwein and Nieuwenhuis [B19].
Here, H′⁻ and H̄′⁻ are defined by reversing the inequalities of H′⁺ and H̄′⁺, respectively. Notice that, under the condition, by the same reasoning as in Lemma 5.3.5,
cl Y_G = ⋂_{μ∈int D°} H′⁺(μ : β_μ).
Hence, if ȳ ∉ cl Y_G, there exists μ̂ ∈ int D° such that
⟨μ̂, y⟩ > ⟨μ̂, ȳ⟩ for any y ∈ cl Y_G.
Thus, ȳ_j < f_j(0) for the other j's; there exist T ∈ R^{p×n} and ȳ ∈ R^p satisfying
ȳ ∈ −f*(T).
Hereafter, for simplicity, we assume without loss of generality that f(0) = 0 (see Proposition 6.1.2). From the definition of T,
Y(T) := {f(x) − Tx ∈ R^p : x ∈ R^n} ⊂ R^p₊.
We may also assume without loss of generality that ȳ₁ < 0, ȳ₂ > 0, ..., ȳ_p > 0. Take a positive number ε such that
ε < min{−ȳ₁/4, ȳ₂/2, ..., ȳ_p/2}.
If f(x) = +∞ for all x ≠ 0, then −f*(T) = {0} for all T, and so f**(0) = {0}. Hence, assume that f(x) ≠ +∞ for some x ≠ 0. Then, since f(0) = 0 and f is convex, there exists some x̄ ≠ 0 such that f_i(x̄) < +∞, i = 1, ..., p. Define a vector t ∈ R^p by
t_i = ȳ_i/2 for i = 1, and t_i = ȳ_i for i = 2, ..., p,
6.1
CONJUGATE DUALITY BASED ON EFFICIENCY
let ΔT = (α₁t, ..., α_n t) with Σ_{j=1}^n α_j x̄_j = −1/2, and let
ỹ = f(x̄) − (T + ΔT)x̄.
Then
ỹ_i = f_i(x̄) − ⟨T_i, x̄⟩ − Σ_{j=1}^n α_j x̄_j t_i, i = 1, ..., p,
and hence ỹ ∈ Y(T + ΔT). Moreover, ±t ∉ R^p₊, Y(T + ΔT) is R^p₊-closed from the assumption, and
Y(T + ΔT) ⊂ Y(T) + {ΔTx : x ∈ R^n} ⊂ R^p₊ + {βt : β ∈ R}.
Therefore
0⁺(Y(T + ΔT) + R^p₊) ∩ (−R^p₊) = {0};
that is, Y(T + ΔT) is R^p₊-bounded. Thus, from Theorem 3.2.10, there exists ŷ ∈ Min(Y(T + ΔT)) = −f*(T + ΔT) such that ŷ ≤ ỹ. Since Y(T + ΔT) ⊂ R^p₊ + {βt : β ∈ R}, ŷ ≥ βt for some β ∈ R. Since
ȳ₁/2 > ỹ₁ ≥ ŷ₁ ≥ βt₁ = βȳ₁/2,
we have β > 1. On the other hand, ŷ_i ≥ βt_i = βȳ_i for i = 2, ..., p, and hence β < 2. Then, from β > 1, ŷ_i > ȳ_i for i = 2, ..., p, and ŷ₁ > ȳ₁. Thus, we have proved that ŷ dominates ȳ, which is a contradiction.
α > 0. Then T + ΔT ∈ ∂f(x̄). In fact, if we suppose to the contrary that T + ΔT ∉ ∂f(x̄), there exists x′ ∈ dom f such that
f(x′) − (T + ΔT)x′ ≤ f(x̄) − (T + ΔT)x̄;
i.e.,
f(x′) − Tx′ ≤ f(x̄) − Tx̄ + ΔT(x′ − x̄).
Here
(ΔT(x′ − x̄))_i = −α⟨x̂, x′ − x̄⟩ ≤ 0,
whence
f(x′) − Tx′ ≤ f(x̄) − Tx̄,
which contradicts T ∈ ∂f(x̄). Therefore T + ΔT ∈ ∂f(x̄); that is,
f(x̄) − (T + ΔT)x̄ ∈ −f*(T + ΔT).
Moreover,
f(x̄) − (T + ΔT)x̄ = f(x̄) − Tx̄ + α⟨x̂, x̄⟩ → +∞ as α → +∞.
Hence,
f**(0) = Max ⋃_T [−f*(T)] = ∅,
as was to be proved.
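With p = 1 the biconjugation machinery above reduces to ordinary Fenchel conjugation, for which f**(0) = f(0) when f is closed convex. The following grid-based sanity check is illustrative only (the finite grids stand in for R; not from the book):

```python
# Scalar sanity check of biconjugation at 0: for convex f, f**(0) = f(0).
def f(x):
    return x * x                       # closed convex, f(0) = 0

grid = [i / 100.0 for i in range(-200, 201)]   # surrogate for R

def f_star(t):                         # f*(t) = sup_x { t x - f(x) }
    return max(t * x - f(x) for x in grid)

ts = [i / 100.0 for i in range(-200, 201)]
f_bistar_0 = max(-f_star(t) for t in ts)       # f**(0) = sup_t { -f*(t) }
print(f_bistar_0)                              # -> 0.0, equal to f(0)
```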
6.1.3 Multiobjective Optimization Problems and Duality
In this subsection we consider a multiobjective optimization problem
(P) minimize f(x) over x ∈ R^n,
where f is an extended real vector-valued function from R^n to R^p ∪ {+∞}, and +∞ is the imaginary point every component of which is +∞. In other words, (P) is the problem of finding x̂ ∈ R^n such that
f(x̂) ∈ Min{f(x) ∈ R^p : x ∈ R^n}.
We denote the set Min{f(x) ∈ R^p : x ∈ R^n} simply by Min(P), and every x̂ satisfying the above relationship is called a solution of problem (P). If some component of f(x) is +∞, then x is regarded as infeasible; hence, we may take no account of it and so may assume that f(x) = +∞. We analyze problem (P) by embedding it in a family of perturbed problems. The space of perturbations is assumed to be R^m for simplicity. Thus we consider a function φ: R^n × R^m → R^p ∪ {+∞} such that φ(x, 0) = f(x), and a family of perturbed problems
(P_u) minimize φ(x, u) over x ∈ R^n,
or
(P_u) find x̂ ∈ R^n such that φ(x̂, u) ∈ Min{φ(x, u) ∈ R^p : x ∈ R^n}.
Let φ* be the conjugate map of φ, i.e.,
φ*(T, Λ) = Max{Tx + Λu − φ(x, u) ∈ R^p : x ∈ R^n, u ∈ R^m}.
We consider the following problem as the dual problem of problem (P) with respect to the given perturbations:
(D) find Λ̂ ∈ R^{p×m} such that −φ*(0, Λ̂) ∩ Max ⋃_Λ (−φ*(0, Λ)) ≠ ∅.
This problem may be written formally as
(D) maximize −φ*(0, Λ) over Λ.
However, it is not an ordinary multiobjective optimization problem, since −φ*(0, Λ) is not a function but rather a point-to-set map. The set
Max ⋃_Λ (−φ*(0, Λ)) will be simply denoted by Max(D), and every Λ̂ satisfying the above relationship will be called a solution of problem (D). The first result is the so-called weak duality theorem.

Proposition 6.1.12 For any x ∈ R^n and Λ ∈ R^{p×m},
φ(x, 0) ∉ −φ*(0, Λ) − (R^p₊ \ {0}).

Moreover, if φ is convex, {φ(x, u) ∈ R^p : x ∈ R^n} ≠ ∅ for every u in some neighborhood of 0, and W*(Λ) can be characterized completely by scalarization for each Λ, then problem (P) is stable. Here we should note that we may take a vector subgradient, as pointed out in Remark 6.1.4.

Proof Consider the point-to-set map Y from R^m to R^p defined by
Y(u) = {φ(x, u) ∈ R^p : x ∈ R^n}.
From the assumptions, Y is a convex point-to-set map and Y(u) ≠ ∅ in some neighborhood of 0. Noting that W(u) ⊂ Y(u) for every u, we can prove the proposition in a manner quite similar to the proof of Proposition 6.1.7.

In view of Proposition 6.1.6(ii), problem (P) is stable if and only if
Min(P) = W(0) ⊂ W**(0) = Max(D).
Thus we have the following theorem, which is a generalization of the strong duality theorem in scalar optimization.

Theorem 6.1.1 (i) Problem (P) is stable if and only if, for each solution x̂ of (P), there exists a solution Λ̂ of the dual problem (D) such that
φ(x̂, 0) ∈ −φ*(0, Λ̂), i.e., f(x̂) ∈ −W*(Λ̂),
or equivalently
(0, Λ̂) ∈ ∂φ(x̂, 0), i.e., Λ̂ ∈ ∂W(0; f(x̂)).
(ii) Conversely, if x̂ ∈ R^n and Λ̂ ∈ R^{p×m} satisfy the above relationship, then x̂ is a solution of (P) and Λ̂ is a solution of (D).

Proof These results are obvious from the previous discussions.
6.1.4 Lagrangians and Saddle Points

In this subsection we define the Lagrangian and its saddle points for problem (P) and investigate their properties.

Definition 6.1.4 (Lagrangian) The point-to-set map L: R^n × R^{p×m} → R^p defined by
−L(x, Λ) = Max{Λu − φ(x, u) ∈ R^p : u ∈ R^m},
i.e.,
L(x, Λ) = Min{φ(x, u) − Λu ∈ R^p : u ∈ R^m},
is called the Lagrangian of the multiobjective optimization problem (P) relative to the given perturbations. We can write L(x, Λ) = −φ_x*(Λ), where φ_x denotes the function u → φ(x, u) for a fixed x ∈ R^n, and φ_x* denotes the conjugate map of φ_x.

Definition 6.1.5 (Saddle Point) A point (x̂, Λ̂) ∈ R^n × R^{p×m} is called a saddle point of L if
L(x̂, Λ̂) ∩ [Max ⋃_Λ L(x̂, Λ)] ∩ [Min ⋃_x L(x, Λ̂)] ≠ ∅.
It will be useful to express problems (P) and (D) in terms of the map L.

Proposition 6.1.14 For each Λ ∈ R^{p×m}, either −L(x, Λ) is externally stable, i.e.,
{Λu − φ(x, u) ∈ R^p : u ∈ R^m} ⊂ −L(x, Λ) − R^p₊ for every x ∈ R^n,
or {Λu − φ(x, u) ∈ R^p : u ∈ R^m} is (−R^p₊)-convex and (−R^p₊)-unbounded or empty for every x. Then
φ*(0, Λ) = Max ⋃_x [−L(x, Λ)],
and, hence, the dual problem (D) can be written formally as
(D) Max ⋃_Λ Min ⋃_x L(x, Λ).

Proof In the former case, in view of Corollary 6.1.3,
Max ⋃_x [−L(x, Λ)] = Max ⋃_x Max{Λu − φ(x, u) ∈ R^p : u ∈ R^m}
= Max ⋃_x {Λu − φ(x, u) ∈ R^p : u ∈ R^m}
= Max{0x + Λu − φ(x, u) ∈ R^p : x ∈ R^n, u ∈ R^m}
= φ*(0, Λ).
On the other hand, in the latter case, both ⋃_x [−L(x, Λ)] and φ*(0, Λ) are empty. Hence,
Max(D) = Max ⋃_Λ [−φ*(0, Λ)] = Max ⋃_Λ (−Max ⋃_x [−L(x, Λ)]) = Max ⋃_Λ [Min ⋃_x L(x, Λ)].
Since L(x, Λ) = −φ_x*(Λ),
Max ⋃_Λ L(x, Λ) = Max ⋃_Λ [−φ_x*(Λ)] = φ_x**(0).
Hence, we can directly apply Propositions 6.1.10 and 6.1.11 to obtain the following result.

Proposition 6.1.15 Suppose that φ(x, ·) is convex for each fixed x ∈ R^n, and that (i) when f(x) = φ(x, 0) is finite, each φ_j(x, ·) is subdifferentiable at 0, and the set {φ(x, u) − Λu ∈ R^p : u ∈ R^m} is R^p₊-closed for each Λ ∈ R^{p×m}; (ii) when f(x) = φ(x, 0) = +∞, φ_x is closed and subdifferentiable at each u ∈ dom φ_x. Then in case (i), Max ⋃_Λ L(x, Λ) = {φ_x(0)} = {f(x)}, and in case (ii), Max ⋃_Λ L(x, Λ) = ∅. Thus the primal problem (P) can be written formally as
(P) Min ⋃_x Max ⋃_Λ L(x, Λ).
These expressions of problems (P) and (D) suggest that a saddle point of the Lagrangian is closely connected with a pair of solutions to problems (P) and (D).

Theorem 6.1.2 Suppose that the assumptions of both Propositions 6.1.14 and 6.1.15 are satisfied. Then the following statements are equivalent to each other:
(i) (x̂, Λ̂) is a saddle point of L;
(ii) x̂ is a solution of problem (P), Λ̂ is a solution of problem (D), and φ(x̂, 0) ∈ −φ*(0, Λ̂).

Proof This theorem is obvious from Propositions 6.1.12, 6.1.14, and 6.1.15.
CONJUGATE DUAU1Y BASED ON EFFICIENCY
185
Theorem 6.1.3
Suppose that the hypothesis of Theorem 6.1.2 holds and that the problem E R" is a solution of (P) if and only if there exists A E RPx m such that (x, A) is a saddle point of the Lagrangian L.
(P) is stable. Then
x
Results similar to the above theorems for convex problems can be obtained with the vector dual variable (cf. Remark 6.1.4). The details can be seen in Tanino and Sawaragi [Ttl]. 6.1.5
Duality in Multiobjective Programming
In this subsection we apply the results obtained in the preceding subsections to multiobjective programming. The results will be essentially the same as those obtained in Chapter 5. We consider a multiobjective programming problem
(P) minimize f(x) subject to g(x) ≤ 0
as a primal problem, where f and g are vector-valued functions on R^n of dimension p and m, respectively. Of course we assume the existence of feasible solutions; i.e., let X = {x ∈ R^n : g(x) ≤ 0} ≠ ∅. If we define a vector-valued function f̄: R^n → R^p ∪ {+∞} by
f̄(x) = f(x) + (δ(x | X), ..., δ(x | X))^T,
i.e., by
f̄(x) = f(x) if g(x) ≤ 0, and f̄(x) = +∞ otherwise,
problem (P) can be rewritten as
(P) minimize f̄(x).
A family of perturbed problems is provided by the function
φ(x, u) = f(x) if g(x) ≤ u, and φ(x, u) = +∞ otherwise,
with the m-dimensional parameter vector u. Then the perturbation map W is
W(u) = Min{φ(x, u) ∈ R^p : x ∈ R^n} = Min{f(x) : x ∈ R^n, g(x) ≤ u}.
In this subsection we use the vector dual variable to define the dual problem and the Lagrangian, because we thereby obtain simpler expressions for them. For λ ∈ R^m,
−φ*(0, λ) = Min{φ(x, u) − ⟨λ, u⟩ ∈ R^p : x ∈ R^n, u ∈ R^m}
= Min{f(x) − ⟨λ, u⟩ : x ∈ R^n, u ∈ R^m, g(x) ≤ u}.
Setting u = g(x) + v, we have
−φ*(0, λ) = Min{f(x) − ⟨λ, g(x)⟩ − ⟨λ, v⟩ : x ∈ R^n, v ≥ 0}.
Hence, unless λ ≤ 0, −φ*(0, λ) = ∅. Moreover, for λ ≤ 0,
−φ*(0, λ) = Min{f(x) − ⟨λ, g(x)⟩ : x ∈ R^n}.
Thus, the dual problem with the vector dual variable can be written as
(D) Max ⋃_{λ≤0} Min{f(x) − ⟨λ, g(x)⟩ : x ∈ R^n}.
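For p = 1 the dual above is ordinary Lagrangian duality (with multiplier −λ ≥ 0). The following brute-force sketch uses an assumed toy instance of my own, with finite grids standing in for R and for {λ ≤ 0}; it recovers the primal optimum with no duality gap.

```python
# Toy instance (p = 1) of  Max over λ ≤ 0  of  Min over x of  f(x) − λ g(x):
# minimize f(x) = x^2 subject to g(x) = 1 − x ≤ 0 (primal optimum f(1) = 1).
def q(lam, grid):
    """Dual value Min_x { f(x) − λ g(x) } on a finite grid."""
    return min(x * x - lam * (1.0 - x) for x in grid)

grid = [i / 100.0 for i in range(-300, 301)]   # surrogate for R
lams = [-i * 0.05 for i in range(101)]         # surrogate for {λ ≤ 0}
dual_opt = max(q(lam, grid) for lam in lams)
print(dual_opt)   # -> 1.0, matching the primal optimum (attained at λ = −2)
```

Every dual value q(λ) is a lower bound on the feasible objective values (weak duality); convexity plus Slater's condition makes the best bound tight here.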
Theorem 6.1.1 provides the following duality theorem for multiobjective programming problems.

Theorem 6.1.4 (i) Problem (P) is stable if and only if, for each solution x̂ of (P), there exists λ̂ ∈ R^m with λ̂ ≤ 0 such that λ̂ is a solution of the dual problem (D),
f(x̂) ∈ Min{f(x) − ⟨λ̂, g(x)⟩ : x ∈ R^n},
and ⟨λ̂, g(x̂)⟩ = 0.

... Let Y = {y : y > 0} ∪ {0} ⊂ R ⊂ R̄. Then
(i) Min Y = {0};
(ii) Inf Y = ∅ by Gros's definition;
(iii) Inf Y = {(−∞, ...)} by Ponstein's definition.
Unfortunately, both definitions of infimum, by Gros and by Ponstein, might not be adequate for defining the conjugate, as is shown in the following example.

Example 6.1.2

Let f(x) = [x, −x]ᵀ for x ∈ R. Then, for T = [t₁, t₂]ᵀ ∈ R²ˣ¹,

−f*(T) = Inf{[(1 − t₁)x, −(1 + t₂)x]ᵀ : x ∈ R}.

We can easily check that

∪_T (−f*(T)) = (R²\(R²₊ ∪ (−R²₊))) ∪ {0}  by Gros's definition,

and similarly by Ponstein's definition. Hence f̄**(0), under either Gros's or Ponstein's definition, does not contain f̄(0) = [0, 0]ᵀ.

6.2 Conjugate Duality Based on Weak Efficiency
As has been shown in the previous section, the duality result is not always reflexive if we use the vector dual variable (i.e., W(0) ≠ W**(0) in general). Recently Kawasaki [K2, K3] gave some new notions and fundamental tools and results needed for developing a reflexive duality theory for multiobjective optimization problems. His approach is based on weak efficiency instead of the usual efficiency. Weak efficiency is easier to deal with than efficiency, since it is directly related to scalarization. (See Corollary 3.4.1.) Moreover, his primal problems are also defined by vector-valued point-to-set maps (called relations in his papers). These two facts are undesirable from the practical point of view in multiobjective optimization. However, his results enrich the conjugate duality theory and so are well worth paying attention to. Hence, in this section, we give an overview of Kawasaki's conjugate duality theory. In order to avoid making this section longer than needed, we omit all the proofs of theorems, propositions, and lemmas. If the reader is interested, please see the original papers. We should also note that some notations and terms in this section differ from the original ones. In particular, we often add the modifier weak so as not to confuse the concepts of Sections 6.1 and 6.2.

6.2.1 Weak Supremum and Infimum
In this subsection, the definition and some fundamental properties of weak supremum and infimum will be given. The p-dimensional point consisting of a common element α ∈ R is simply denoted by α. Let R̄ᵖ ... Of course, −∞ ...

... Φ_a(xᵃ, ·) is finite and continuous at 0. Then (P) is weakly stable.

Proposition 6.2.7

If Φ ∈ P, then the following three conditions are equivalent to each other:

(i) (P) is normal;
(ii) (D) is normal;
(iii) w-Inf(P) = w-Sup(D), and this set is a nonempty subset of Rᵖ.

Proposition 6.2.8

(i) For any λ ∈ Rᵐ and y ∈ R̄ᵖ, y ∈ −w-Φ*(0, λ) ∩ w-Sup(D) ∩ Rᵖ if and only if λ ∈ w-∂(w-W**)(0; y).
(ii) Suppose that Φ ∈ P. For any x ∈ Rⁿ and y ∈ R̄ᵖ, y ∈ Φ(x, 0) ∩ w-Inf(P) ∩ Rᵖ if and only if x ∈ w-∂(w-V**)(0; −y).
† Cf. Lemma 6.1.2.
‡ Cf. Ekeland and Temam [E1, Definition III.2.1].
§ Cf. Definition 6.1.3.
¶ Cf. Ekeland and Temam [E1, Proposition III.2.1].
Proposition 6.2.9

If Φ ∈ P, the following three conditions are equivalent to each other:

(i) (P) is weakly stable;
(ii) W(0) is a nonempty subset of Rᵖ, and W is weakly subdifferentiable at (0; y) for any y ∈ W(0);
(iii) w-Inf(P) = w-Sup(D) = w-Max(D), and this is a nonempty subset of Rᵖ, where w-Max(D) = w-Max[∪_λ (−w-Φ*(0, λ) − K)].

The main strong duality theorem is given as follows.

Theorem 6.2.3

Suppose that Φ ∈ P and that the primal problem (P) is weakly stable. Then the dual problem (D) has at least one solution. If x̄ ∈ Rⁿ is a solution of (P), then there exists a solution λ̄ of (D) such that

Φ(x̄, 0) ∩ (−w-Φ*(0, λ̄)) ≠ ∅,

or equivalently

(0, λ̄) ∈ w-∂Φ(x̄, 0).

Conversely, if x̄ ∈ Rⁿ and λ̄ ∈ Rᵐ satisfy the above relationship, then x̄ is a solution of (P), λ̄ is a solution of (D), and

Φ(x̄, 0) ∩ (−w-Φ*(0, λ̄)) ⊂ w-Inf(P) ∩ w-Sup(D) ∩ Rᵖ.
Remark 6.2.2

The above duality theorem satisfies a reflexivity. In fact, we can exchange the roles of (P) and (D) in Theorem 6.2.3. Moreover, as is shown in the following proposition, the primal problem (P) can be reduced to an ordinary scalar optimization problem when Φ is scalarizable.

Proposition 6.2.10

If Φ is scalarizable, then x̄ ∈ Rⁿ is a solution of (P) if and only if there exists a ∈ Rᵖ such that x̄ is a solution of the following scalar optimization problem (P_a):

(P_a)  minimize Φ_a(x, 0).

† Cf. Theorem 6.1.1.
6.2.5 Weak Lagrangians and Weak Saddle Points
In this subsection, we shall introduce a weak Lagrangian of the primal problem (P) relative to the given perturbations Φ and clarify the relationship between the pairs of solutions of (P) and (D) and weak saddle points of the weak Lagrangian.

Definition 6.2.11 (Weak Lagrangian)†

The point-to-set map w-L: Rⁿ × Rᵐ → R̄ᵖ defined by

w-L(x, λ) = w-Inf[∪_u (Φ(x, u) − ⟨λ, u⟩)]  for x ∈ Rⁿ, λ ∈ Rᵐ

is called the weak Lagrangian of the primal problem (P) relative to the given perturbations.

Definition 6.2.12
(Weak Saddle Point)‡

A point (x̄, λ̄) ∈ Rⁿ × Rᵐ is called a weak saddle point of the weak Lagrangian w-L if there exists y ∈ w-L(x̄, λ̄) ∩ Rᵖ such that

w-L(x̄, λ) ≦ y ≦ w-L(x, λ̄)  for all x ∈ Rⁿ and λ ∈ Rᵐ.

Let

w-Inf w-Sup w-L = w-Inf ∪_x [w-Sup ∪_λ w-L(x, λ)],
w-Sup w-Inf w-L = w-Sup ∪_λ [w-Inf ∪_x w-L(x, λ)].

Proposition 6.2.11

w-Sup w-Inf w-L ≦ w-Inf w-Sup w-L.

Proposition 6.2.12

The following two conditions are equivalent to each other:

(i) (x̄, λ̄) is a weak saddle point of w-L, and y ∈ w-L(x̄, λ̄) satisfies w-L(x̄, λ) ≦ y ≦ w-L(x, λ̄) for all x ∈ Rⁿ and λ ∈ Rᵐ;
(ii) y ∈ w-Inf[∪_x w-L(x, λ̄)] ∩ w-Sup[∪_λ w-L(x̄, λ)] ∩ Rᵖ.

† Cf. Definition 6.1.4.
‡ Cf. Definition 6.1.5.
Proposition 6.2.13

(x̄, λ̄) ∈ Rⁿ × Rᵐ is a weak saddle point of w-L if and only if there exists a ∈ Rᵖ such that (x̄, λ̄) is a saddle point of the scalarized function (w-L)_a.

The primal problem and the dual problem can be represented by the weak Lagrangian as follows.

Proposition 6.2.14

(i) −w-Φ*(0, λ) = w-Inf ∪_x w-L(x, λ) for all λ ∈ Rᵐ, and hence w-Sup(D) = w-Sup w-Inf w-L;
(ii) if Φ ∈ P, then Φ(x, 0) = w-Sup ∪_λ w-L(x, λ) for all x ∈ Rⁿ, and hence w-Inf(P) = w-Inf w-Sup w-L.

Theorem 6.2.4

If Φ ∈ P, then the following conditions are equivalent to each other:

(i) (x̄, λ̄) ∈ Rⁿ × Rᵐ is a weak saddle point of the weak Lagrangian;
(ii) x̄ and λ̄ are solutions of (P) and (D), respectively, and ...

7.1 Utility and Value Theory

... (Y, ≻). A preference function is defined as an isomorphic (or homomorphic) function u: (Y, ≻) → (R¹, >). In examining the existence of such a preference function, it is meaningful to know the structure of the set of real numbers. Among several characteristics of the set of real numbers, the structure of order and addition is essential and plays an important role for our aim. Let Re denote the set of real numbers with the natural order >. By this notation we distinguish such a set of primitive real numbers from R¹ with its topological properties. The first noticeable point is that the natural order is a total order (Definition 2.3.1). Further, let Ra denote the set of rational numbers. It is well known that the rational numbers are countable and dense in Ra itself; i.e., for any α, β ∈ Ra,

α > β ⇒ α > γ > β  for some γ ∈ Ra.
This encourages us to score each element of a totally ordered and countable set Y = {y¹, y², ..., yᵏ, ...} by a function u: Y → Ra constructed as follows: for any positive integer n,

(i) u(y¹) = 0;
(ii) yⁿ⁺¹ ≻ yᵏ for all k, 1 ≦ k ≦ n ⇒ u(yⁿ⁺¹) = n, and
    yⁿ⁺¹ ≺ yᵏ for all k, 1 ≦ k ≦ n ⇒ u(yⁿ⁺¹) = −n.

Otherwise, since there are integers i and j (1 ≦ i, j ≦ n) such that yⁱ ≻ yⁿ⁺¹ ≻ yʲ and no yᵏ (1 ≦ k ≦ n) lies between them, the denseness of Ra allows us to choose u(yⁿ⁺¹) ∈ Ra with u(yⁱ) > u(yⁿ⁺¹) > u(yʲ).
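This inductive scoring can be sketched in Python (the comparator interface and the midpoint choice of rational are our implementation details; any rational strictly between the nearest already-scored neighbors would do):

```python
from fractions import Fraction

def score(seq, less_than):
    """Assign rational values u(y) to a countable totally ordered set,
    following the inductive construction sketched in the text."""
    u = {}
    for n, y in enumerate(seq):
        if n == 0:
            u[y] = Fraction(0)                      # (i)  u(y^1) = 0
        elif all(less_than(p, y) for p in u):       # above everything seen so far
            u[y] = Fraction(n)
        elif all(less_than(y, p) for p in u):       # below everything seen so far
            u[y] = Fraction(-n)
        else:                                       # squeeze between the neighbors:
            lo = max(v for p, v in u.items() if less_than(p, y))
            hi = min(v for p, v in u.items() if less_than(y, p))
            u[y] = (lo + hi) / 2                    # denseness of the rationals
    return u

u = score([3, 1, 4, 2], lambda a, b: a < b)
assert all((u[a] > u[b]) == (a > b) for a in u for b in u)   # order is preserved
```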
This observation yields Theorem 7.1.1 below. For an uncountable set Y, we make use of the denseness of Ra in Re; that is, for any α, β ∈ Re,

α > β ⇒ α > γ > β  for some γ ∈ Ra.

Of course, it is required that Y have a subset that is countable and order-dense in Y. (See Theorem 7.1.2 below.) It should be noted here that the denseness of Ra does not imply its connectedness or continuity. In fact, there
are infinitely many gaps† in Ra. Similarly, there might be a gap in an uncountable set Y with a subset countable and order-dense in Y. In defining a preference function Y → R¹, special attention must be paid to such gaps. (For a proof of Theorem 7.1.2 below see, for example, Krantz et al. [K9] and Fishburn [F7].) Now recall that the set of real numbers is completed by making up these gaps. To this end, irrational numbers are introduced, which produces the continuity property of Re. In addition to the order structure, an operation on Re called addition, +, is defined by assigning a + b (which is again in Re) to any a, b ∈ Re, with the following properties:

(1) (a + b) + c = a + (b + c);
(2) a + b = b + a;
(3) given a, b ∈ Re, there exists c ∈ Re such that a + c = b;
(4) a < b ⇒ a + c < b + c for any c ∈ Re;

where property (4) connects the order and the addition on Re and is called the monotonicity of addition. An additive structure stands for a set together with the operation of addition defined on the set. The additive structure on the ordered set (Re, >) leads to the well-known Archimedean property: for given a > 0 and b > 0, there exists an integer n such that na > b. In order to get a measurable preference function for (Y, ≻), some notions corresponding to the addition and the Archimedean property will be necessary (cf. Theorem 7.1.4 and Remark 7.1.4 below). In the following, we shall list several existence theorems for preference functions without proof. (For more details see, for example, Krantz et al. [K9], Fishburn [F7], and Keeney and Raiffa [K6].)

† For a totally ordered set (A, ≻), a cut of A is a partition A = A′ ∪ A″, where A′ and A″ are disjoint subsets of A such that for any a′ ∈ A′ and a″ ∈ A″ we have a″ ≻ a′. For such a cut, the following three cases are possible: (i) the lower part A′ has a greatest element and the upper part A″ has a least element; (ii) the lower part A′ has no greatest element and the upper part A″ has no least element; (iii) either A′ has a greatest element and A″ has no least element, or A′ has no greatest element and A″ has a least element. For any cut of a discrete set, case (i) necessarily occurs. On the other hand, for any cut of a dense set either case (ii) or (iii) is possible, whereas case (i) is never possible. In case (ii), we say that the cut has a gap. For the set of rational numbers Ra, infinitely many cuts with gaps are possible; for example, consider Ra = {α ∈ Ra | α² ≦ 2} ∪ {α ∈ Ra | α² > 2}. Irrational numbers are introduced to make up for such gaps in Ra, so that only case (iii) arises for any cut in the set of real numbers, which is defined as the set consisting of both Ra and the set of irrational numbers (Dedekind cut).
7.1.1.1 Value Functions
Theorem 7.1.1

If (Y, ≻) is a totally ordered set and if Y is countable, then there exists a value function u: Y → R¹ such that for every y¹, y² ∈ Y,

y¹ ≻ y² ⇔ u(y¹) > u(y²).

Such a value function is unique up to a monotonically increasing transformation.†

Definition 7.1.3 (Order Dense)

Let ≻ be an order relation. Then a subset Z of Y is said to be ≻-order dense in Y if for any y¹, y² ∈ Y\Z there exists z ∈ Z such that

y¹ ≻ y² ⇒ y¹ ≻ z and z ≻ y².

Theorem 7.1.2

If (Y, ≻) is a totally ordered set and if Y has a subset Z countable and ≻-order dense in Y, then there exists a value function u: Y → R¹ such that for every y¹, y² ∈ Y,

y¹ ≻ y² ⇔ u(y¹) > u(y²).

Such a value function is unique up to monotonically increasing transformations.

Remark 7.1.2
In the case of (Y, ≻) being a weakly ordered set, we define the order ≻′ on the set of equivalence classes Y/∼ by

a ≻′ b,  a, b ∈ Y/∼  ⇔  x ≻ y,  x ∈ a,  y ∈ b.

† In the following, it is important to keep in mind what meaning the numerical values in (Re, >) isomorphic or homomorphic to (Y, ≻) have. If these numbers make sense only for their orders, they are said to be ordinal. Clearly, since their order relations are unchanged by any monotonically increasing transformation, the ordinal scale is rather defined as the one that is unique up to a monotonically increasing transformation. Numerical values that make sense for their differences are said to be measurable or cardinal. There are two kinds of measurable scales. The first is the interval scale, unique up to a positive affine transformation, which implies that both the order among the numerical values and the ratio (u(y¹) − u(y²))/(v(y¹) − v(y²)) are invariant under a positive affine transformation v of u; that is, v = α + βu for β > 0. This relationship also says that the interval scale has arbitrariness of unit and origin, as can be seen in the example of the measurement of temperature (Celsius and Fahrenheit). The second is the ratio scale, unique up to a positive similar transformation, which implies that both the order among the numerical values and the ratio u(y)/v(y) are invariant under a positive similar transformation v of u; that is, v = βu for β > 0. The ratio scale leaves arbitrariness of unit, as in the examples of length and weight. In general, two preference functions u and v with the relationship u = α + βv, β > 0, are said to be equivalent as cardinal measures.
Then, since ≻′ is clearly a total order, Theorems 7.1.1 and 7.1.2 can easily be modified for a weakly ordered set (Y, ≻). Theorem 7.1.1 holds for a weakly ordered set (Y, ≻) if "Y is countable" is replaced with "the set of equivalence classes Y/∼ is countable." Theorem 7.1.2 holds for a weakly ordered set (Y, ≻) if "(Y, ≻) has a subset countable and ≻-order dense in Y" is replaced with "the set of equivalence classes Y/∼ has a subset countable and ≻′-order dense in Y/∼."

In order to reduce the selection of a maximal element of (Y, ≻) to the maximization of the preference-representing function u(·), continuity of u(·) is desirable: it guarantees the existence of a maximal element under appropriate conditions and permits the application of various optimization techniques. A mild sufficient condition for continuity of value functions was derived by Debreu [D3].

Theorem 7.1.3
Suppose that the preference ≻ is defined on a subset Y of the p-dimensional Euclidean space Rᵖ with the natural topology τ. Let τ′ be the topology induced by τ on Y. Then, if Y is connected with respect to the topology τ′ and if for any y′ ∈ Y,

Y′ := {y ∈ Y : y ≻ y′} ∈ τ′  and  Y″ := {y ∈ Y : y ≺ y′} ∈ τ′,

then there exists a continuous value function u such that for any y¹, y² ∈ Y,

y¹ ≻ y² ⇔ u(y¹) > u(y²).
Such a value function is unique up to a monotonically increasing function.
7.1.1.2 Measurable Value Functions

For a long time there have been many controversies with regard to measurable (cardinal) value functions. (See, for example, Fishburn [F7].) One of the reasons for rejecting them in traditional economics is that value functions with ordinal measure provide a sufficiently effective tool for many economic analyses. However, measurable value functions would be expected to show great power as a tool for decision making, rather than for mere analysis using ordinal value functions.

Definition 7.1.4 (Preference-Difference Order)

Define ≻* as a binary relation on Y × Y reflecting the order over differences of the preference ≻ on Y; i.e., for any y¹, y², y³, y⁴ ∈ Y, y¹y² ≻* y³y⁴ implies that the strength of preference for y¹ over y² is greater than the strength of preference for y³ over y⁴.
Theorem 7.1.4†

Let ≻* denote a preference-difference order as in Definition 7.1.4. Suppose the following conditions hold:

(C1) ≻* is a weak order.
(C2) y¹y² ≻* y³y⁴ ⇒ y⁴y³ ≻* y²y¹.
(C3) y¹y² ≻* ȳ¹ȳ² and y²y³ ≻* ȳ²ȳ³ ⇒ y¹y³ ≻* ȳ¹ȳ³.
(C4) y¹y² ≻* y³y⁴ ≻* y¹y¹ ⇒ y¹y⁵ ∼* y³y⁴ ∼* y⁶y² for some y⁵, y⁶ ∈ Y.
(C5) If {y¹, y², ..., yᵏ, ...} is a bounded standard sequence,‡ i.e., for every k, yᵏ⁺¹yᵏ ∼* y²y¹ but not y²y¹ ∼* y¹y¹, and ȳy¹ ≻* yᵏy¹ ≻* y̲y¹ for some ȳ, y̲ ∈ Y and for all k, then the sequence is finite.

Then there exists a function u: Y → R¹ such that for any y¹, y², y³, and y⁴ in Y,

y¹y² ≻* y³y⁴ ⇔ u(y¹) − u(y²) > u(y³) − u(y⁴).

The function u(·) is unique up to a positive affine transformation.

Remark 7.1.3
Since the value function u(·) in Theorem 7.1.4 is unique up to a positive affine transformation, it is a cardinal measure. Hence it is called a measurable value function.
Remark 7.1.4
For a precise proof see, for example, Krantz et al. [K9]. Fishburn [F7] gave another proof by using the fact that conditions (C1)-(C5) yield an additive structure for (Y × Y, ≻*), whence, according to Theorem 7.1.15 below, there exists an additive value function representing the preference ≻* over the two attributes Y × Y such that

y¹y² ≻* y³y⁴ ⇔ u₁(y¹) + u₂(y²) > u₁(y³) + u₂(y⁴).

A measurable value function u(·) can be obtained by setting u(y) = u₁(y) − u₂(y). In referring to Theorem 7.1.15, note that condition (C3) corresponds to the Thomsen condition, condition (C4) to the solvability condition, and condition (C5) to the Archimedean condition.

† Krantz et al. [K9].
‡ In this case, a standard sequence {y¹, y², ..., yᵏ, ...} is defined as one such that y³y¹ = 2y²y¹, ..., yᵏy¹ = (k − 1)y²y¹. In addition, the sequence is said to be bounded if there exists a ȳ ∈ Y such that ȳ ≻ yᵏ for all k.
7.1.1.3 von Neumann-Morgenstern Utility Functions†

In decision making under risk, an alternative does not necessarily produce exactly one consequence; rather, it may cause several consequences with some probability distribution. Therefore, risky alternatives are identified with probability measures over Y. Note that any convex combination of probability measures is also a probability measure. For general spaces other than linear spaces, roughly speaking, such operations as convex combination are usually called mixture operations. A set with a mixture operation is called a mixture set.

Definition 7.1.5 (Mixture Set)‡

A set S is said to be a mixture set if S is closed under the operation on S assigning αa + (1 − α)b to each a, b ∈ S and each α ∈ [0, 1] (i.e., αa + (1 − α)b is again in S), and if the operation satisfies, for any a, b ∈ S and any α, β ∈ [0, 1],

(i) 1a + 0b = a;
(ii) αa + (1 − α)b = (1 − α)b + αa;
(iii) α[βa + (1 − β)b] + (1 − α)b = αβa + (1 − αβ)b.
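Conditions (i)-(iii) can be verified mechanically for the standard example of a mixture set, the probability vectors on a finite outcome set; a sketch:

```python
import numpy as np

def mix(alpha, a, b):
    """The mixture operation alpha*a + (1 - alpha)*b on probability vectors."""
    return alpha * a + (1.0 - alpha) * b

rng = np.random.default_rng(0)
a, b = rng.dirichlet(np.ones(4)), rng.dirichlet(np.ones(4))
alpha, beta = 0.3, 0.6

assert np.allclose(mix(1.0, a, b), a)                        # (i)   1a + 0b = a
assert np.allclose(mix(alpha, a, b), mix(1 - alpha, b, a))   # (ii)  commutativity
assert np.allclose(mix(alpha, mix(beta, a, b), b),           # (iii) reduction of
                   mix(alpha * beta, a, b))                  #       compound mixtures
assert np.isclose(mix(alpha, a, b).sum(), 1.0)               # closure
```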
The following theorems are due to Fishburn [F7], who weakened the conditions of Herstein and Milnor [H10].

Theorem 7.1.5
Let S be a mixture set. Then there exists an affine utility function u: S → R¹ representing the preference ≻ over S, namely, for any a, b ∈ S,

a ≻ b ⇔ u(a) > u(b),
u(αa + (1 − α)b) = αu(a) + (1 − α)u(b),    (7.1.1)

if and only if the following hold:

(C1) ≻ is a weak order;
(C2) a ≻ b ⇒ αa + (1 − α)c ≻ αb + (1 − α)c for any c ∈ S and any α ∈ (0, 1);
(C3) a ≻ b ≻ c ⇒ αa + (1 − α)c ≻ b ≻ βa + (1 − β)c for some α, β ∈ (0, 1).

This utility function is unique up to a positive linear transformation.

† von Neumann and Morgenstern [N13] and Luce and Suppes [L13].
‡ Herstein and Milnor [H10].
Remark 7.1.5

Condition (C2) is called the independence condition, whereas (C3) is called the continuity condition.

Now we turn to our decision problem. Let Y = Rᵖ, and denote the set of probability measures on Y by Ŷ. Here, Ŷ is considered as the set of risky prospects, including Y as the set of sure prospects. Note that Ŷ is a mixture set. The well-known expected utility principle follows from this observation through Theorem 7.1.5. For example, consider a case with discrete random variables on Y. Suppose that there exists a utility function u(·) on the set Ŷ of simple probability measures. Then we consider the utility function restricted to Y by setting u(y) = u(p) for p ∈ Ŷ such that p(y) = 1. As is well known, for a simple probability measure there exists some finite number r such that {y : p(y) > 0} = {y¹, y², ..., yʳ}. Therefore, using the relationship (7.1.1) successively for pⁱ ∈ Ŷ such that pⁱ(yⁱ) = 1,

u(p) = u(∑ᵢ₌₁ʳ p(yⁱ)pⁱ) = ∑ᵢ₌₁ʳ p(yⁱ)u(yⁱ),

which represents the expected utility of the risky prospect p. This discussion can be readily extended to cases with general probability measures.

Definition 7.1.6
(Expected Utility)
Corresponding to a risky outcome p, p(y) denotes a probability distribution for discrete y E Y, and a probability density function for continuous y E Y. The expected utility of p is defined by
E(p, u) = ∑_{y∈Y} p(y)u(y)     for discrete y,
E(p, u) = ∫_{y∈Y} p(y)u(y) dy  for continuous y.
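In the discrete case the definition is just a probability-weighted sum; a minimal sketch (the particular utility function below is a hypothetical example, concave and hence risk averse, so by Jensen's inequality the sure expectation gets the higher expected utility):

```python
import math

def expected_utility(lottery, u):
    """E(p, u) = sum over y of p(y) * u(y), for a discrete lottery
    given as a dict {outcome: probability}."""
    return sum(prob * u(y) for y, prob in lottery.items())

u = lambda y: 1.0 - math.exp(-0.01 * y)   # hypothetical concave utility

p = {0.0: 0.5, 100.0: 0.5}                # fifty-fifty lottery
q = {50.0: 1.0}                           # its expected monetary value for sure

assert expected_utility(q, u) > expected_utility(p, u)
```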
Remark 7.1.6
Since, in general, a probability density function does not necessarily exist, sometimes a cumulative probability distribution P(y) is used for defining the expected utility with the help of the Lebesgue-Stieltjes integral:

E(u, P) = ∫_{y∈Y} u(y) dP(y).
Note that if a cumulative distribution P(y) is absolutely continuous, then a corresponding density function exists.

Theorem 7.1.6

Suppose that Y is a subset of Rᵖ and that Ŷ represents the set of risky prospects, represented by probability measures over Y. Then for a given (Ŷ, ≻) there exists a utility function u(·): Y → R¹ such that for any p, q ∈ Ŷ,

p ≻ q ⇔ E(p, u) > E(q, u),

if and only if the following hold:

(C1) ≻ is a weak order;
(C2) p ≻ q ⇒ αp + (1 − α)r ≻ αq + (1 − α)r for any r ∈ Ŷ and any α ∈ (0, 1);
(C3) p ≻ q ≻ r ⇒ αp + (1 − α)r ≻ q ≻ βp + (1 − β)r for some α, β ∈ (0, 1).

This utility function is unique up to a positive affine transformation.

Remark 7.1.7
In practice, we often use the term lottery or prospect instead of probability measure. Letting Y be a subset of R¹ throughout this chapter, we denote by [y¹, y², ..., yᵏ : α₁, α₂, ..., α_k] a lottery yielding the gain yⁱ ∈ Y with probability αᵢ (i = 1, ..., k). Let the preference ≻ be a weak order. For y* ≽ y ≽ y⁰ (y* ≻ y⁰), a von Neumann-Morgenstern utility function can be identified by finding a probability α such that the sure prospect y is indifferent to the lottery [y*, y⁰ : α, 1 − α]. Letting y* be the best level and y⁰ the worst, with u(y*) = 1 and u(y⁰) = 0, we can assign a number in [0, 1] to the utility u(y) for any y. When it is difficult to find such a probability α, we find a sure prospect y indifferent to the fifty-fifty lottery [y′, y″ : 0.5, 0.5]. Then we have u(y) = (u(y′) + u(y″))/2. Starting with a pair y′ and y″ with known utilities u(y′) and u(y″) (for example, y* and y⁰), we can get a utility u(y) for any y (midpoint method).
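The midpoint method can be simulated by supplying the decision maker's indifference answers from an assumed underlying "true" utility (here √y, a purely hypothetical choice); the assessed points must then reproduce that utility up to normalization:

```python
import math

true_u = lambda y: math.sqrt(y)        # assumed underlying utility on [0, 100]
inv_u = lambda v: v * v                # its inverse

def midpoint(y_lo, y_hi):
    """Sure prospect indifferent to the fifty-fifty lottery [y_lo, y_hi: .5, .5]."""
    return inv_u(0.5 * (true_u(y_lo) + true_u(y_hi)))

y0, y1 = 0.0, 100.0                    # worst and best levels: u(y0) = 0, u(y1) = 1
assessed = {y0: 0.0, y1: 1.0}
y_half = midpoint(y0, y1)
assessed[y_half] = 0.5
assessed[midpoint(y0, y_half)] = 0.25
assessed[midpoint(y_half, y1)] = 0.75

for y, v in assessed.items():          # the normalized true utility agrees everywhere
    assert abs(true_u(y) / true_u(y1) - v) < 1e-9
```

Repeating the bisection on each subinterval refines the assessed curve to any desired resolution.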
Remark 7.1.8
Let Y be a subset of R¹, and let Ŷ be the set of lotteries over Y. Then a sure prospect ŷ is called a certainty equivalent of a lottery p ∈ Ŷ if ŷ is indifferent to p. The von Neumann-Morgenstern utility functions give information on the attitude of the decision maker toward risk. The decision maker's preference is
said to be risk averse if, for any nondegenerate lottery p, he/she prefers the expected (monetary) value ȳ (= ∑ y p(y)) to the lottery itself; that is, u(ȳ) > u(p). On the other hand, a preference that prefers lotteries to their expectations is said to be risk prone, and risk neutral if lotteries are indifferent to their expectations. It can be readily shown that risk-averse (resp. risk-prone) utility functions monotonically increasing over Y are concave (resp. convex). (See, for example, Keeney and Raiffa [K6].) The difference ȳ − ŷ, where ŷ is the certainty equivalent of p, is called a risk premium; it shows how much of a decrease in gain below the expectation the decision maker will accept in order to avoid risk. Risk premiums reflect the degree of risk aversion. Since the form of a von Neumann-Morgenstern utility function itself also reflects the degree of risk aversion, a measure of risk aversion can be defined in terms of utility functions (Arrow [A3], Pratt [P5]); for a monotonically increasing utility function u it is given by

r(y) := −u″(y)/u′(y) = −(d/dy)[log u′(y)],

which is called the local risk aversion at y. It is easily shown that

r(y) ≡ c > 0 ⇔ u(y) ∼ −exp[−cy],
r(y) ≡ 0 ⇔ u(y) ∼ y,
r(y) ≡ c < 0 ⇔ u(y) ∼ exp[−cy],

where ∼ represents the equivalence of utility functions. (See, for example, Keeney and Raiffa [K6].) These relationships are often used in determining utility functions.

7.1.2
Decomposed Representation for Multiattribute Preference Functions
This section is concerned with some decomposed representations for multiattribute preference functions. Our main concerns are how much information is needed for a decomposed representation and how the whole preference function is decomposed. Keeney and Raiffa [K6] gave an excellent development of this theme. In the following, we take a geometric approach that is slightly different from theirs.
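Before turning to several attributes, the single-attribute risk notions just reviewed can be illustrated numerically. A finite-difference sketch confirms that the local risk aversion r(y) = −u″(y)/u′(y) of the exponential utility u(y) = −exp(−cy) is constantly c (the step size and test points are arbitrary choices):

```python
import math

def local_risk_aversion(u, y, h=1e-4):
    """Finite-difference estimate of r(y) = -u''(y) / u'(y)."""
    u1 = (u(y + h) - u(y - h)) / (2 * h)              # central difference for u'
    u2 = (u(y + h) - 2 * u(y) + u(y - h)) / h**2      # central difference for u''
    return -u2 / u1

c = 0.5
u_exp = lambda y: -math.exp(-c * y)                   # constantly risk averse
for y in (0.0, 1.0, 2.0):
    assert abs(local_risk_aversion(u_exp, y) - c) < 1e-5

u_lin = lambda y: 2.0 * y + 1.0                       # risk neutral: r identically 0
assert abs(local_risk_aversion(u_lin, 1.0)) < 1e-6
```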
7.1.2.1 Decomposability

First, we investigate conditions under which the overall preference function can be decomposed in terms of its conditional preference functions. For
simplicity, for a vector y = (y₁, y₂, ..., y_n), we use abbreviated notations such as

y_{i−} = (y₁, ..., y_{i−1}, y_{i+1}, ..., y_n),
y_{ij−} = (y₁, ..., y_{i−1}, y_{i+1}, ..., y_{j−1}, y_{j+1}, ..., y_n).

Similarly,

Y_{i−} = Y₁ × ⋯ × Y_{i−1} × Y_{i+1} × ⋯ × Y_n,
Y_{ij−} = Y₁ × ⋯ × Y_{i−1} × Y_{i+1} × ⋯ × Y_{j−1} × Y_{j+1} × ⋯ × Y_n.
Definition 7.1.7 (Conditional Preference)
For a given preference ≽ over the multiple attributes Y₁ × Y₂ × ⋯ × Y_n, the conditional preference ≽_{ȳ_{i−}} over Y_i at ȳ_{i−} means the preference induced by ≽ on the subset Y_{ȳ_{i−}} := {y = (y₁, ..., y_n) | y_i ∈ Y_i, y_j = ȳ_j for j ∈ {1, ..., n}\{i}}. Similarly, for a given overall preference function u(y₁, y₂, ..., y_n) over Y₁ × Y₂ × ⋯ × Y_n, the conditional preference function u_i(·; ȳ_{i−}) over Y_i at a fixed ȳ_{i−} is defined by u_i(y_i; ȳ_{i−}) = u(y_i, ȳ_{i−}).
Definition 7.1.8 (Preferential Independence)
The attribute Y_i is said to be preferentially independent of Y_{i−} if the conditional preference ≽_{ȳ_{i−}} over Y_i is not affected by the level ȳ_{i−} of Y_{i−}; namely, if

y_i¹ ≽_{ȳ_{i−}} y_i²  for some ȳ_{i−} ∈ Y_{i−}  ⇒  y_i¹ ≽_{y_{i−}} y_i²  for all y_{i−} ∈ Y_{i−},

or equivalently, if

(y_i¹, ȳ_{i−}) ≽ (y_i², ȳ_{i−})  for some ȳ_{i−} ∈ Y_{i−}  ⇒  (y_i¹, y_{i−}) ≽ (y_i², y_{i−})  for all y_{i−} ∈ Y_{i−}.
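On finite grids, Definition 7.1.8 can be checked by brute force; the two-attribute functions below are hypothetical examples (a multiplicative one, preferentially independent for positive levels, and a squared-difference one that is not):

```python
def pref_indep_of_rest(u, Y1, Y2):
    """True if the ranking of Y1 induced by u(., y2) is the same for every
    fixed level y2, i.e., Y1 is preferentially independent of Y2 on the grid."""
    rankings = {tuple(sorted(Y1, key=lambda y1: u(y1, y2))) for y2 in Y2}
    return len(rankings) == 1

Y1 = (1.0, 2.0, 3.0)
Y2 = (0.5, 1.0, 4.0)

assert pref_indep_of_rest(lambda a, b: a * b, Y1, Y2)             # independent
assert not pref_indep_of_rest(lambda a, b: (a - b) ** 2, Y1, Y2)  # not independent
```

The same brute-force idea extends to n attributes by fixing all but one coordinate.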
Theorem 7.1.7
Suppose that there exists an order-preserving preference function u(y₁, ..., y_n) for the preference structure (∏ᵢ₌₁ⁿ Y_i, ≽); namely,

(y₁, ..., y_n) ≻ (y₁′, ..., y_n′) ⇔ u(y₁, ..., y_n) > u(y₁′, ..., y_n′).

Then the preference function u(·) is decomposable in terms of the conditional preference function over each attribute; namely, there exists a conditional
preference function u_i(y_i) for each Y_i at some point b_{i−} ∈ Y_{i−} (i = 1, ..., n) and a strictly increasing† one-to-one function Φ: Rⁿ → R such that

u(y₁, ..., y_n) = Φ(u₁(y₁), ..., u_n(y_n))    (7.1.2)

if and only if each attribute Y_i is preferentially independent of Y_{i−}. Moreover, such a function Φ is unique up to an increasing transformation.

Proof. (only if) Suppose that there exists a conditional preference function u_i(y_i) for each attribute Y_i at some b_{i−} ∈ Y_{i−} and a one-to-one increasing function Φ with the properties of Eq. (7.1.2). Then

y_i ≻_{b_{i−}} y_i′
⇒ u_i(y_i) > u_i(y_i′)
⇒ (a¹, ..., a^{i−1}, u_i(y_i), a^{i+1}, ..., aⁿ) ≥ (a¹, ..., a^{i−1}, u_i(y_i′), a^{i+1}, ..., aⁿ) for any aᵏ ∈ R¹ (k ≠ i)
⇒ (u₁(y₁), ..., u_{i−1}(y_{i−1}), u_i(y_i), u_{i+1}(y_{i+1}), ..., u_n(y_n)) ≥ (u₁(y₁), ..., u_{i−1}(y_{i−1}), u_i(y_i′), u_{i+1}(y_{i+1}), ..., u_n(y_n)) for every y_{i−} ∈ Y_{i−}
⇒ Φ(u₁(y₁), ..., u_i(y_i), ..., u_n(y_n)) > Φ(u₁(y₁), ..., u_i(y_i′), ..., u_n(y_n)) for every y_{i−} ∈ Y_{i−}.

This relationship shows that ≻_{b_{i−}} is not affected by the level b_{i−} of Y_{i−}, which implies that Y_i is preferentially independent of Y_{i−}.

(if) Let u_i(y_i) denote the conditional preference function over Y_i at b_{i−} ∈ Y_{i−}; that is, u_i(y_i) = u(b₁, ..., b_{i−1}, y_i, b_{i+1}, ..., b_n). Then we can show that a function Φ: Rⁿ → R of the form of Eq. (7.1.2) is well defined, as follows. Since Y_i is preferentially independent of Y_{i−}, ≻_{b_{i−}} is not affected by b_{i−}, whereby we simply denote ≻_{b_{i−}} by ≻_{−i}. Then we have

(a₁, a₂, ..., a_n) ∼ (a₁′, a₂′, ..., a_n′), ...

† A function Φ: Rⁿ → R is said to be strictly increasing if (a₁, ..., a_n) ≥ (a₁′, ..., a_n′) ⇒ Φ(a₁, ..., a_n) > Φ(a₁′, ..., a_n′).
and hence

u_i(a_i) = u_i(a_i′) (i = 1, ..., n) ⇔ u(a) = u(a′).    (7.1.4)

The relation ⇒ in Eq. (7.1.4) guarantees the existence of a function Φ with Eq. (7.1.2), and ...    (7.1.5)

Consider a particular case in which all conditional preference functions over the attribute Y are equivalent to each other independently of the level of Z; namely, all conditional preference functions u(·, z) over Y become the same if their origins and units are set in common. More formally, let u(y|z) be the normalized conditional preference function over Y at z defined by

u(y|z) = (u(y, z) − u(y⁰, z))/(u(y*, z) − u(y⁰, z))    (7.1.6)

for given y⁰ and y* ∈ Y (y⁰ and y* may be set, respectively, as the worst level and the best level of Y, if any), for which we have u(y⁰|z) = 0 and u(y*|z) = 1 for all z ∈ Z. Then the special case under consideration requires that all these normalized conditional preference functions become identical with each other regardless of the level z. Under such a circumstance, therefore, since these normalized conditional preference functions are not affected by the level z ∈ Z, they can be represented by u(y|z⁰). We define u₁(y) = u(y|z⁰). Then, letting u(y⁰, z⁰) = 0, we have

k₁u₁(y) = u(y, z⁰),  k₁ = u(y*, z⁰).    (7.1.7)
Henceforth, for convenience, in this book we refer to this special independence condition on the conditional preference functions as cardinal independence.

Definition 7.1.9 (Cardinal Independence)

The attribute Y is said to be cardinally independent of the attribute Z if all conditional preference functions over Y are equivalent to each other as cardinal measures independently of the level of Z.

Theorem 7.1.8
The attributes Y and Z are cardinally independent of each other if and only if

u(y, z) = k₁u₁(y) + k₂u₂(z) + k₁₂u₁(y)u₂(z).    (7.1.8)

Here, for (y⁰, z⁰) and (y*, z*) such that (y*, z⁰) ≻ (y⁰, z⁰) and (y⁰, z*) ≻ (y⁰, z⁰), we normalize each preference function as

(1) u(y⁰, z⁰) = 0, u(y*, z*) = 1;
(2) u₁(y⁰) = 0, u₁(y*) = 1;
(3) u₂(z⁰) = 0, u₂(z*) = 1.
Then we have (i) k₁ = u(y*, z⁰), (ii) k₂ = u(y⁰, z*), (iii) k₁₂ = 1 − k₁ − k₂.

Proof. It has been seen that if the attribute Y is cardinally independent of Z, then Eq. (7.1.7) follows through the normalization Eq. (7.1.6) for each conditional preference function over Y at every z ∈ Z and u(y⁰, z⁰) = 0. Similarly, if Z is cardinally independent of Y, then for u₂(z) := u(z|y⁰) with

u(z|y) = (u(y, z) − u(y, z⁰))/(u(y, z*) − u(y, z⁰))
(7.1.9)
and u(y⁰, z⁰) = 0, we have k₂u₂(z) = u(y⁰, z) and k₂ = u(y⁰, z*).
Note here that from Eq. (7.1.6),

u(y, z) = u(y⁰, z) + [u(y*, z) − u(y⁰, z)]u₁(y),    (7.1.10)

and similarly from (7.1.9),

u(y, z) = u(y, z⁰) + [u(y, z*) − u(y, z⁰)]u₂(z).    (7.1.11)

Hence, substituting Eq. (7.1.11) with y = y* into (7.1.10),

u(y, z) = k₂u₂(z) + [k₁ + (1 − k₁)u₂(z) − k₂u₂(z)]u₁(y)
        = k₁u₁(y) + k₂u₂(z) + (1 − k₁ − k₂)u₁(y)u₂(z).

The reverse is obvious.

As a particular case, with k₁ + k₂ = 1, we have the additive decomposition

u(y, z) = k₁u₁(y) + k₂u₂(z).    (7.1.12)

Let us consider the geometric meaning of this case in more detail. Recall that the cardinal independence condition admits arbitrariness of both the origin and the unit for each conditional preference function. If the preference function u(y, z) has an additively decomposed representation [Eq. (7.1.12)], then it is easily seen that each conditional preference function over Y has a common unit and differs only in origin. Conversely, if each conditional preference function over Y has a common unit and differs only in origin, then for all y ∈ Y and z ∈ Z,

u(y, z) = α(z) + u(y, z⁰).    (7.1.13)

Substituting u(y⁰, z⁰) = 0 into Eq. (7.1.13), we have α(z) = u(y⁰, z), whence

u(y, z) = u(y⁰, z) + u(y, z⁰).    (7.1.14)
(7.1.14)
226
7
METHODOLOGY
Equation (7.1.14) yields Eq. (7.1.12). It should be noted here that if each conditional preference function over Y has only the arbitrariness of origin, then the one over Z automatically has the same property. The above discussion leads to the following theorem. Definition 7.1.10
(Additive Independence)
The attributes Y and Z are said to be additive independent, if each conditional preference function over either Y or Z has a common unit and leaves only the arbitrariness of origin. Remark 7.1.9
As stated above, the condition of additive independence is symmetric, whereas the condition of cardinal independence is not so. Namely, even if Y is cardinally independent of Z, Z is not necessarily cardinally independent of Y. Theorem 7.1.9
The attributes Yand Z are additive independent if and only if u(y, z)
= k1Ul(Y) + k2U2(Z).
Here, for (yO, ZO) and (y*, z*) such that (y*, z°»o. (yO, ZO) and (yO, z*) >(yo, ZO), we normalize each preference function as (1) u(yO, zo) = 0, u(y*, z*) = 1, (2) Ul(YO) = 0, Ul(Y*) = 1, (3) U2(ZO) = 0, U2(Z*) = 1.
Then we have (i) k₁ = u(y*, z⁰), (ii) k₂ = u(y⁰, z*), (iii) k₁ + k₂ = 1.

Cardinal Independence in Decision Making under Risk. Now we shall consider practical implications of the stated independence conditions. In the following, we shall use the notation Ỹ for the set of all probability measures over Y, which also stands for the set of risky outcomes over Y. Recall that the expected preference principle of von Neumann and Morgenstern asserts that in decision making under risk there exists a utility function u such that

p ≻ q  ⟺  ∫ u(y) dP(y) > ∫ u(y) dQ(y),
7.1  UTILITY AND VALUE THEORY  227

where P(y) and Q(y) represent the probability distribution functions corresponding to the risky outcomes p and q, respectively. Moreover, an important thing is that such a utility function is unique up to a positive affine transformation. Note that, according to von Neumann-Morgenstern utility theory, utility functions that are equivalent to each other as cardinal measures reflect the same preference order over risky outcomes. Therefore, if the conditional preference over risky outcomes (Ỹ, z) does not depend on the level z, then the conditional utility functions over Y are equivalent as cardinal measures independently of z. As a result, as long as we consider utility functions in the sense of von Neumann-Morgenstern, the condition of cardinal independence can be restated as a preference independence over risky outcomes.

Definition 7.1.11 (Utility Independence)

The attribute Y is utility independent of another attribute Z, if the preference over risky prospects (Ỹ, z) does not depend on z; namely, if

(ỹ¹, z⁰) ≻ (ỹ², z⁰)  for some z⁰ ∈ Z  ⟹  (ỹ¹, z) ≻ (ỹ², z)  for every z ∈ Z.
Let ≻* be a binary relation on Y* reflecting a weak order of preference difference for ≻ on Y. We have seen in Theorem 7.1.4 that some appropriate conditions ensure a measurable value function u(·) such that if y¹ ≻ y² and y³ ≻ y⁴, then

y¹y² ≻* y³y⁴  ⟺  u(y¹) − u(y²) > u(y³) − u(y⁴).    (7.1.16)

In addition, this measurable value function is unique up to a positive affine transformation; namely, two measurable value functions u and v are equivalent (i.e., reflect the same order ≻*) if and only if there exist α and positive β such that u = α + βv.

Therefore, it can be readily seen that the attribute Y is cardinally independent in the preference order ≻* of the attribute Z, if and only if the conditional order of ≻* on Y at z is not affected by z; namely,

(y¹, z⁰)(y², z⁰) ≻* (y³, z⁰)(y⁴, z⁰)  for some z⁰ ∈ Z
  ⟹  (y¹, z)(y², z) ≻* (y³, z)(y⁴, z)  for every z ∈ Z.    (7.1.17)
In other words, this condition implies that the order of the preference difference over Y is not affected by the level of Z.

Definition 7.1.13 (Weak Difference Independence)†

The attribute Y is said to be weak difference independent of the attribute Z, if, for any y¹, y², y³, and y⁴ over Y, the relation (7.1.17) holds.

As a corollary of Theorem 7.1.8, we have the following theorem.

Theorem 7.1.12†

In decision making under certainty with two attributes Y and Z, suppose that there exists a measurable value function u(y, z). Then u(y, z) is decomposable in the bilinear form of Theorem 7.1.8 if and only if the attributes Y and Z are preferentially independent and weak difference independent of each other.

Recall that, in Theorem 7.1.8, the additive decomposition is possible if and only if k₁ + k₂ = 1, where k₁ = u(y*, z⁰), k₂ = u(y⁰, z*), u(y*, z*) = 1, u(y⁰, z⁰) = 0, (y*, z⁰) ≻ (y⁰, z⁰), and (y⁰, z*) ≻ (y⁰, z⁰). Since y*, y⁰, z*, and z⁰ are introduced merely in order to define the unit and the origin, they can be arbitrarily chosen. Hence, letting some y¹, y², z¹, and z² act for y*, y⁰, z*, and z⁰, respectively,

k₁ + k₂ = 1  ⟺  u(y¹, z¹) − u(y², z¹) = u(y¹, z²) − u(y², z²),

which leads to the following theorem.

Theorem 7.1.13

The measurable value function u(y, z) is decomposable in the additive form of Theorem 7.1.9 if and only if, in addition to the assumptions of Theorem 7.1.12, the following holds:

(y¹, z¹)(y², z¹) ∼* (y¹, z²)(y², z²)    (7.1.18)

for arbitrarily chosen y¹, y² ∈ Y and z¹, z² ∈ Z such that

(y¹, z⁰) ≻ (y², z⁰)  for some z⁰ ∈ Z    (7.1.19)

and

(y⁰, z¹) ≻ (y⁰, z²)  for some y⁰ ∈ Y.    (7.1.20)

† Dyer and Sarin [D10]. Theorems 7.1.12 and 7.1.13 are from Dyer and Sarin [D10].
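The equal-difference criterion for k₁ + k₂ = 1 can be tested mechanically once numerical values of u are available. A minimal Python sketch, using two hypothetical value functions of our own choosing (one additive, one bilinear with k₁ + k₂ ≠ 1):

```python
# Quick numerical test of the equal-difference criterion: additivity
# requires u(y1,z1) - u(y2,z1) = u(y1,z2) - u(y2,z2) for all choices.
# Both value functions below are illustrative assumptions.

def diff_test(u, y1, y2, z1, z2, tol=1e-12):
    """True if the preference difference between y1 and y2 is the same
    at level z1 as at level z2."""
    return abs((u(y1, z1) - u(y2, z1)) - (u(y1, z2) - u(y2, z2))) < tol

u_add = lambda y, z: 0.4 * y + 0.6 * z                 # additive: k1 + k2 = 1
u_bil = lambda y, z: 0.4 * y + 0.3 * z + 0.3 * y * z   # bilinear: k1 + k2 != 1

samples = [(0.1, 0.9, 0.2, 0.7), (0.3, 0.5, 0.0, 1.0)]
add_ok = all(diff_test(u_add, *s) for s in samples)
bil_ok = all(diff_test(u_bil, *s) for s in samples)
print(add_ok, bil_ok)  # True False
```

For the bilinear function the difference u(y¹,z) − u(y²,z) carries a term 0.3z(y¹ − y²), so the test correctly fails.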
Remark 7.1.11

As a result of Theorem 7.1.13, it should be noted that if, under the assumption of weak difference independence, Eq. (7.1.18) holds for some y¹, y² ∈ Y and some z¹, z² ∈ Z satisfying Eqs. (7.1.19) and (7.1.20), then Eq. (7.1.18) holds for all y¹, y² ∈ Y and all z¹, z² ∈ Z satisfying Eqs. (7.1.19) and (7.1.20). In fact, the additive decomposition implies that each conditional measurable value function has a common unit, and hence the preference difference between two arbitrarily given points over Y × Z is identical independently of z.

Without the assumption of weak difference independence, the additive decomposition can be obtained as follows. Recall that additive independence implies that all conditional value functions over each attribute have a common unit and only their origins vary depending on the level of the other attribute. This implies, in other words, that the difference of the conditional preference function u(y|z) between the points (y¹, z) and (y², z) is not affected by z, whereby, according to Eq. (7.1.16), the difference of the intensity of the conditional preference over Y does not depend on the level of Z.

Definition 7.1.14 (Difference Independence)†

The attribute Y is said to be difference independent of Z, if for all y¹, y² ∈ Y such that (y¹, z⁰) ≻ (y², z⁰) for some z⁰ ∈ Z,

(y¹, z)(y², z) ∼* (y¹, z⁰)(y², z⁰)  for all z ∈ Z.

Theorem 7.1.14

Assume that a measurable value function u(y, z) exists for the given preference over Y × Z. Then u(y, z) is decomposable in the additive form if and only if the attributes Y and Z are preferentially independent of each other and Y is difference independent of Z.

Conditional measurable value functions may be calibrated by methods such as the direct rating method or the direct midpoint method (Fishburn [F6]). Once some conditional measurable value functions are assessed, the weak difference independence condition can easily be checked by comparing their normalized function forms. However, in cases where the assessment of conditional measurable value functions is not easy, or where the trade-off between one attribute and another can be made relatively easily, the trade-off can be used for eliciting cardinal information on the preference for the additive decomposition.

† Dyer and Sarin [D10].
Fig. 7.2. Thomsen condition.
The additive independence condition can be restated in connection with the trade-off between Y and Z as follows. In Fig. 7.2, since the points A and B are on an indifference curve I₁, the difference of the conditional preference over Y between y² and y¹ at z = z¹ equals that of the conditional preference over Z between z² and z¹ at y = y¹. These differences of conditional preferences may be represented by Δu₁^Y and Δu₁^Z in terms of the measurable value function. A similar argument can be made for the points C and D on the indifference curve I₂, with differences Δu₂^Y and Δu₂^Z. Now the additive independence condition asserts that the difference Δu₁^Y of the conditional preference over Y between y² and y¹ is not affected by the level of Z. The same is true for the other differences of conditional preferences. Therefore, the difference of conditional preference between the points P and Q can be represented by Δu₁^Y + Δu₂^Y for the attribute Y and Δu₁^Z + Δu₂^Z for the attribute Z. In view of the facts that Δu₁^Y = Δu₁^Z and Δu₂^Y = Δu₂^Z, it finally follows that the point P should be indifferent to the point Q. This observation leads to the well-known Thomsen condition.

Definition 7.1.15 (Thomsen Condition)†

A weakly ordered set (Y × Z, ≻) is said to satisfy the Thomsen condition, if for any y¹, y², y³ ∈ Y and z¹, z², z³ ∈ Z,

(y¹, z²) ∼ (y², z¹) and (y², z³) ∼ (y³, z²)  ⟹  (y¹, z³) ∼ (y³, z¹).

† Krantz et al. [K9].
Remark 7.1.12

The conditions of preference independence and Thomsen give sufficient power for the additive decomposability of value functions. However, some additional conditions are required for ensuring the existence of such an additively decomposed value function.

Definition 7.1.16 (Solvability Condition)

A weakly ordered set (Y × Z, ≻) is said to satisfy the solvability condition, if for any y′ ∈ Y and any z, z* ∈ Z there exists some y ∈ Y such that (y, z*) ∼ (y′, z).

Definition 7.1.17 (Archimedean Condition)

A weakly ordered set (Y × Z, ≻) is said to satisfy the Archimedean condition, if the following holds: if, for any z, z′ ∈ Z, the sequence {y¹, y², …, yᵏ, …} such that (yᵏ, z′) ∼ (yᵏ⁺¹, z) for all k is bounded, then it is finite.

Remark 7.1.13

Recall that the set of real numbers satisfies the continuity property and has a countable subset dense in it in connection with the structure of order and addition. The solvability and Archimedean conditions ensure similar properties for (Y × Z, ≻). (For details, see Krantz et al. [K9].)

Theorem 7.1.15†

Suppose the following conditions hold for (Y × Z, ≻):

(C1) ≻ is a weak order;
(C2) the attributes Y and Z are mutually preferentially independent;
(C3) the Thomsen condition holds;
(C4) the solvability condition holds;
(C5) the Archimedean condition holds.

Then there exists a value function u: Y × Z → R¹ which is decomposable in an additive form.

† Krantz et al. [K9].
Proof Since the precise proof is complicated, we shall sketch only an outline. Given y, z, y*, and z*, let ξ(y) and η(z) be solutions such that

(y*, ξ(y)) ∼ (y, z*),    (η(z), z*) ∼ (y*, z),

and let y¹ ∘ y² ∈ Y be a solution to

((y¹ ∘ y²), z*) ∼ (y¹, ξ(y²)).

A key idea of the proof is that for any bounded subset of (Y × Z, ≻), (Y, ≻_Y, ∘) is of additive structure; i.e., there exists a function u₁, unique up to a positive affine transformation, such that

y¹ ≻_Y y²  ⟺  u₁(y¹) > u₁(y²),
u₁(y¹ ∘ y²) = u₁(y¹) + u₁(y²).

Then the preference independence condition yields

(y¹, z¹) ≻ (y², z²)  ⟺  y¹ ∘ η(z¹) ≻_Y y² ∘ η(z²),

and hence, letting u₂(z) = u₁(η(z)),

(y¹, z¹) ≻ (y², z²)  ⟺  u₁(y¹) + u₂(z¹) > u₁(y²) + u₂(z²).

The theorem is established by extending u₁ and u₂ to the whole space (Y × Z, ≻).

For the additive decomposition of value functions, several methods have been developed for testing the Thomsen condition and assessing the value function simultaneously. We shall briefly introduce a method called the lock step method, based on the above proof (Keeney and Raiffa [K6]) (Fig. 7.3).

(1) Define y⁰ and z⁰ as the least preferred levels of Y and Z, respectively. Set u(y⁰, z⁰) = u₁(y⁰) = u₂(z⁰) = 0.
(2) Take y¹ in such a way that (y¹, z⁰) ≻ (y⁰, z⁰) and set u₁(y¹) = 1.
(3) Find z¹ such that (y⁰, z¹) ∼ (y¹, z⁰) (due to the solvability condition) and set u₂(z¹) = 1.
(4) Find y² and z² such that (y², z⁰) ∼ (y¹, z¹) ∼ (y⁰, z²) and set u₁(y²) = u₂(z²) = 2.
(5) Check whether or not (y¹, z²) ∼ (y², z¹), which is a part of the Thomsen condition. If the condition holds, go to the next step. Otherwise, we cannot have the additive decomposition.
(6) Find y³ and z³ such that (y³, z⁰) ∼ (y², z¹) ∼ (y¹, z²) ∼ (y⁰, z³) and set u₁(y³) = u₂(z³) = 3.
(7) Check whether or not (y³, z¹) ∼ (y², z²) ∼ (y¹, z³).
(8) Continue the same process for some appropriate number of sample points.
(9) Decide the function forms of u₁(y) and u₂(z) by some appropriate curve-fitting method and set u(y, z) = u₁(y) + u₂(z).
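The steps above can be sketched in code. In the Python sketch below, the decision maker is simulated by a hypothetical additive "true" value function v(y, z) = f(y) + g(z), where f, g, and all levels are illustrative assumptions; each indifference query is answered by solving v(·) = v(·) with bisection, and the partial Thomsen checks of steps (5) and (7) are performed along the way.

```python
# Sketch of the lock step method, steps (1)-(8), against a simulated
# decision maker.  f, g, and the chosen levels are assumptions made
# purely for illustration.

def f(y):                      # true single-attribute value over Y
    return y ** 0.5

def g(z):                      # true single-attribute value over Z
    return 1.5 * z

def v(y, z):                   # the simulated decision maker
    return f(y) + g(z)

def solve(fun, target, lo=0.0, hi=100.0):
    """Bisection: find x with fun(x) = target for an increasing fun."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if fun(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

y0 = z0 = 0.0                  # step (1): least preferred levels
ys = [y0, 1.0]                 # step (2): pick y1 with (y1,z0) > (y0,z0)
zs = [z0, solve(lambda t: v(y0, t), v(ys[1], z0))]   # step (3)

thomsen_ok = True
for k in range(2, 5):          # steps (4), (6), (8): u1(y^k) = u2(z^k) = k
    ys.append(solve(lambda t: v(t, z0), v(ys[k - 1], zs[1])))
    zs.append(solve(lambda t: v(y0, t), v(ys[1], zs[k - 1])))
    # steps (5)/(7): partial check of the Thomsen condition
    thomsen_ok &= abs(v(ys[1], zs[k]) - v(ys[k], zs[1])) < 1e-6

# step (9) would now curve-fit the points (ys[k], k) and (zs[k], k)
# to obtain u1 and u2, with u(y, z) = u1(y) + u2(z).
print(ys, zs, thomsen_ok)
```

Since the simulated preference really is additive, the Thomsen checks succeed at every step; replacing v by a non-additive function makes them fail at step (5).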
Fig. 7.3. Lock step method.
Interpolation Independence and Convex Dependence. Recall that the conditions of cardinal independence and additive independence impose a severe restriction upon interrelationships among conditional preference functions over each attribute. Now we shall relax the interrelationships. Bell [B4] suggested the condition of interpolation independence, which implies that each conditional preference function over an attribute is expressed by a certain convex combination of two other conditional preference functions over the attribute. A more general formulation, called the convex dependence, was developed by Tamura and Nakamura [T5].
Definition 7.1.18 (Interpolation Independence)

Suppose that there exist some values y⁰ and y* of Y and z⁰ and z* of Z such that for all y in Y and z in Z the decision maker strictly prefers (y*, z) to (y⁰, z) and (y, z*) to (y, z⁰). Then Y is said to be interpolation independent of (or 1-convex dependent on) Z, if each normalized conditional preference function u(y|z) over Y can be written as a convex combination of u(y|z⁰) and u(y|z*); that is, if for some α(z)

u(y|z) = α(z) u(y|z*) + (1 − α(z)) u(y|z⁰)    (7.1.21)

for any y ∈ Y and z ∈ Z.
Remark 7.1.14

As is readily seen, the notion of interpolation independence includes several other independence conditions as special cases. For example, if u(y|z⁰) = u(y|z*) and u(z|y⁰) = u(z|y*), the condition that Y is interpolation independent of Z reduces to the condition that Y is cardinally independent of Z. Note that, conversely, if Y is cardinally independent of Z, then not only is Y interpolation independent of Z, but Z is also interpolation independent of Y.

Theorem 7.1.16†

The attributes Y and Z are interpolation independent of each other if and only if

u(y, z) = k₁ u(y|z⁰) + k₂ u(z|y⁰) − k u(y|z⁰) u(z|y⁰) + (k − k₁) u(y|z⁰) u(z|y*)
          + (k − k₂) u(y|z*) u(z|y⁰) + (1 − k) u(y|z*) u(z|y*),    (7.1.22)

where k₁ = u(y*, z⁰), k₂ = u(y⁰, z*), and k is an independent constant.

Proof if: Suppose that Eq. (7.1.22) holds. Then Eq. (7.1.22) yields

u(y⁰, z) = k₂ u(z|y⁰)  and  u(y*, z) = k₁ + (1 − k₁) u(z|y*),

because u(y⁰|z) = 0 and u(y*|z) = 1 for any z ∈ Z. Hence,

u(y|z) = (u(y, z) − u(y⁰, z))/(u(y*, z) − u(y⁰, z))
       = [{k₁ − k u(z|y⁰) + (k − k₁) u(z|y*)} u(y|z⁰) + {(k − k₂) u(z|y⁰) + (1 − k) u(z|y*)} u(y|z*)]
         / [k₁ + (1 − k₁) u(z|y*) − k₂ u(z|y⁰)],

which is of the form of Eq. (7.1.21), showing that Y is interpolation independent of Z. Similarly, it can be shown that Z is interpolation independent of Y.

only if: Suppose that Y and Z are mutually interpolation independent. For the special case where u(y|z⁰) = u(y|z*) and u(z|y⁰) = u(z|y*), the condition of Y and Z being mutually interpolation independent implies that Y and Z are mutually cardinally independent, which leads to Eq. (7.1.22) with k = k₂. Therefore, for the remainder of the proof, we assume that u(y|z⁰) is not identical to u(y|z*) and that u(z|y⁰) is not identical to u(z|y*).

† Bell [B4].
From the definition of the normalized conditional preference function over Y, with u(y*, z⁰) = k₁ and u(y⁰, z*) = k₂ [cf. Eq. (7.1.6)], we have

u(y, z) = k₂ u(z|y⁰) + [k₁ − k₂ u(z|y⁰) + (1 − k₁) u(z|y*)] u(y|z),

from which the condition that Y be interpolation independent of Z produces

u(y, z) = k₂ u(z|y⁰) + G₂(z)[α₂(z) u(y|z*) + (1 − α₂(z)) u(y|z⁰)],    (7.1.23)

where

G₂(z) = k₁ − k₂ u(z|y⁰) + (1 − k₁) u(z|y*).

Similarly, from the condition that Z be interpolation independent of Y,

u(y, z) = k₁ u(y|z⁰) + G₁(y)[α₁(y) u(z|y*) + (1 − α₁(y)) u(z|y⁰)],    (7.1.24)

where

G₁(y) = k₂ − k₁ u(y|z⁰) + (1 − k₂) u(y|z*).

Fixing z at some z ≠ z⁰ in Eqs. (7.1.23) and (7.1.24), we may see that for some constants c₁, c₂, and c₃

G₁(y) α₁(y) = c₁ + c₂ u(y|z⁰) + c₃ u(y|z*).

Here c₁ = 0 and c₂ + c₃ = 1 − k₁, because G₁(y⁰) = k₂, α₁(y⁰) = 0, G₁(y*) = 1 − k₁, and α₁(y*) = 1. Finally, using a constant c instead of c₂,

G₁(y) α₁(y) = c u(y|z⁰) + (1 − k₁ − c) u(y|z*).

Similarly, for some constant d,

G₂(z) α₂(z) = d u(z|y⁰) + (1 − k₂ − d) u(z|y*).

Substituting them into Eqs. (7.1.23) and (7.1.24),

(k₁ − k₂ + c − d)(u(y|z⁰) − u(y|z*))(u(z|y⁰) − u(z|y*)) = 0,

whence k₁ + c = k₂ + d. Substituting G₂(z)α₂(z) with k = k₂ + d into Eq. (7.1.23), we finally have

u(y, z) = k₂ u(z|y⁰) + [k₁ − k₂ u(z|y⁰) + (1 − k₁) u(z|y*)] u(y|z⁰)
          + (u(y|z*) − u(y|z⁰))[(k − k₂) u(z|y⁰) + (1 − k) u(z|y*)]
        = k₁ u(y|z⁰) + k₂ u(z|y⁰) − k u(y|z⁰) u(z|y⁰) + (k − k₁) u(y|z⁰) u(z|y*)
          + (k − k₂) u(y|z*) u(z|y⁰) + (1 − k) u(y|z*) u(z|y*).
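As a sanity check of the decomposition just derived: when the conditional functions coincide, the six-term form of Eq. (7.1.22) must collapse to the bilinear form of Theorem 7.1.8 for any value of k. This is easy to confirm numerically; the conditional functions below are illustrative assumptions.

```python
# When u(y|z0) = u(y|z*) = u1(y) and u(z|y0) = u(z|y*) = u2(z), the
# interaction coefficients of (7.1.22) sum to (1 - k1 - k2) and the
# constant k drops out entirely.

def decomposition(u_y_z0, u_y_zs, u_z_y0, u_z_ys, k1, k2, k):
    """Right-hand side of Eq. (7.1.22)."""
    return (k1 * u_y_z0 + k2 * u_z_y0
            - k * u_y_z0 * u_z_y0
            + (k - k1) * u_y_z0 * u_z_ys
            + (k - k2) * u_y_zs * u_z_y0
            + (1 - k) * u_y_zs * u_z_ys)

k1, k2 = 0.6, 0.3
u1 = lambda y: y ** 2          # normalized: u1(0) = 0, u1(1) = 1
u2 = lambda z: z               # normalized: u2(0) = 0, u2(1) = 1

max_dev = 0.0
for k in (-1.0, 0.0, 0.5, 1.0, 2.0):   # k plays no role in this special case
    for i in range(11):
        for j in range(11):
            y, z = i / 10.0, j / 10.0
            lhs = decomposition(u1(y), u1(y), u2(z), u2(z), k1, k2, k)
            bilinear = k1 * u1(y) + k2 * u2(z) + (1 - k1 - k2) * u1(y) * u2(z)
            max_dev = max(max_dev, abs(lhs - bilinear))
print(max_dev)
```
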
Remark 7.1.15

The decomposed representation of Theorem 7.1.16 clearly includes the other representations stated above as special cases. It reduces to the one under bilateral independence (Fishburn [F9]) if (k₁ + k₂ − 1)k = k₁k₂, and to the one under generalized utility independence if k = 1. Further, if u(y|z⁰) = u(y|z*) and u(z|y⁰) = u(z|y*), it reduces to the one under cardinal independence, and if, in addition, k₁ + k₂ = 1, the additive decomposition results.

7.1.2.3 Assessment of Multiattribute Preference Functions

Although we have introduced some methods for assessing particular multiattribute utility functions, we shall summarize here an effective method based on conditional preference functions using a graphic display terminal of a computer. Consider cases with two attributes Y and Z for simplicity.

(1) Set the best levels y* and z* and the worst levels y⁰ and z⁰ for the attributes Y and Z, respectively.
(2) Assess some representative conditional preference functions over each attribute in the normalized form, i.e., in such a way that u(y⁰|z) = 0 and u(y*|z) = 1 for any z ∈ Z, and similarly for u(z|y).
(3) Depict graphs of the normalized conditional preference functions over each attribute on the graphic display.
(4) If all normalized conditional preference functions over Y are perceived identical, then it may be considered that the attribute Y is cardinally independent of Z. In addition, if Z is also cardinally independent of Y, then we will have in general the bilinear utility function. In this case, assess the scale constants k₁ and k₂ by asking the trade-off between Y and Z; i.e., in risky choices, ask the decision maker for probabilities α and β such that

(y*, z⁰) ∼ [(y*, z*), (y⁰, z⁰): α, 1 − α],
(y⁰, z*) ∼ [(y*, z*), (y⁰, z⁰): β, 1 − β],

from which k₁ = α and k₂ = β. If k₁ + k₂ = 1, then the utility function reduces to the additive form; otherwise it reduces to the bilinear form, in which the constant k is decided by solving 1 + k = (1 + kk₁)(1 + kk₂).
(5) If only Y is cardinally independent of Z, and Z is not of Y, then Z is interpolation independent of Y (see Remark 7.1.14), and hence both Y and Z are interpolation independent of each other. Go to step (7). If neither Y nor Z is cardinally independent of the other, go to step (6) for the check of interpolation independence.
(6) Set

u₁(y : α) = α u(y|z*) + (1 − α) u(y|z⁰).

For a sample level z, take ȳ appropriately (for example, ȳ = (y⁰ + y*)/2). Find the solution α̂ to the equation

u(ȳ, z) = α u(ȳ|z*) + (1 − α) u(ȳ|z⁰).

Compare the utility curves of u(y|z) and u₁(y : α̂). If we observe that these two curves are identical for several levels of z, then Y may be considered to be interpolation independent of Z. Check the conditional preference functions over Z in a similar way. Go to step (7) if both Y and Z are interpolation independent of each other; otherwise some other decomposed representation would be possible (for example, the nth convex dependence decomposition, Tamura and Nakamura [T5]).
(7) Assess the scale constants k₁ and k₂ in a similar way to step (4). Decide k directly from the utility at some representative point.

Owing to recent remarkable developments in microcomputers, the assessment method stated above is expected to be useful for practical problems in an interactive way with ease. For cases with more than two attributes, the consistency check of trade-off constants may be a problem. A suggestion using Saaty's method for cardinal ranking (Saaty [S1]) is given in Sawaragi et al. [S5].
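The equation in step (4), 1 + k = (1 + kk₁)(1 + kk₂), has a closed-form solution: expanding and discarding the trivial root k = 0 gives k = (1 − k₁ − k₂)/(k₁k₂). A minimal Python sketch, with the assessed constants k₁ and k₂ being assumed sample answers to the lottery questions:

```python
# Step (4) in code: obtain the multiplicative constant k from the
# assessed scale constants k1 = alpha and k2 = beta.  The values of
# k1 and k2 below are illustrative assumptions.

def scaling_constant(k1, k2):
    if abs(k1 + k2 - 1.0) < 1e-12:
        return 0.0            # additive case: no interaction term needed
    return (1.0 - k1 - k2) / (k1 * k2)

k1, k2 = 0.4, 0.3
k = scaling_constant(k1, k2)

# verify that k indeed satisfies 1 + k = (1 + k*k1)(1 + k*k2)
residual = abs((1 + k) - (1 + k * k1) * (1 + k * k2))
print(k, residual)
```

For k₁ = 0.4 and k₂ = 0.3 this yields k = 2.5, and the defining equation is satisfied exactly.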
7.1.3 Discussions

In the preceding subsections, we have introduced some of the representative theorems regarding existence, uniqueness, and decomposed representation for multiattribute preference functions. These theorems constitute a normative theory based on certain assumptions. In applying them to practical problems, therefore, it is meaningful to check the properness of these assumptions from a viewpoint of behavioral science and to know the limits of the theorems.

7.1.3.1 Allais Paradox

One of the most famous criticisms of the von Neumann-Morgenstern expected utility theory was made by Allais (see, for example, Allais and Hagen [A1]). We shall explain it using the example given by Kahneman and Tversky [K1]: Consider a pair of lotteries L₁, with a gain of 3000 for sure, and L₂ = [4000, 0: 0.80, 0.20], and another pair L₃ = [3000, 0: 0.25, 0.75]
and L₄ = [4000, 0: 0.20, 0.80]. According to an experiment by Kahneman and Tversky, many respondents preferred L₁ in the first pair and L₄ in the second pair. Note that these preferences are not consistent with the von Neumann-Morgenstern expected utility theory. In fact, for the preference over the first pair, a von Neumann-Morgenstern utility function u(·) with u(0) = 0 produces

u(3000) = 1.0 u(3000) + 0.0 u(0) > 0.8 u(4000) + 0.2 u(0) = 0.8 u(4000).    (7.1.25)

On the other hand, for the preference over the second pair,

0.2 u(4000) = 0.2 u(4000) + 0.8 u(0) > 0.25 u(3000) + 0.75 u(0) = 0.25 u(3000).    (7.1.26)

We have from Eq. (7.1.25) u(3000)/u(4000) > 0.8, whereas from Eq. (7.1.26) u(3000)/u(4000) < 0.8, which leads to a contradiction.

Now let us consider the reason why such a contradiction for the von Neumann-Morgenstern utility function arises. To this end, observe that the lottery L₃ can be regarded as a compound lottery [L₁, 0: 0.25, 0.75] and also L₄ as [L₂, 0: 0.25, 0.75]. The independence condition C2 of Theorem 7.1.6 implies that if we prefer L₁ to L₂, then we must prefer L₃ to L₄, which is violated by the preferences stated above. In many other reports as well (see, for example, Allais and Hagen [A1] and Machina [M1]), such a violation of the independence condition has been observed. Kahneman and Tversky considered that the stated preference, incompatible with the independence condition of von Neumann-Morgenstern expected utility theory, is caused by the certainty effect, which implies that people overweight outcomes that are considered certain, relative to outcomes which are merely probable.

Another effect violating the independence condition is observed in the following example. Consider a pair of lotteries L₅ = [3000, 0: 0.90, 0.10] and L₆ = [6000, 0: 0.45, 0.55], and another pair L₇ = [3000, 0: 0.002, 0.998] and L₈ = [6000, 0: 0.001, 0.999]. Most people would prefer L₅ to L₆ in the first pair, because, although both L₅ and L₆ have the same expected monetary value, the probability of winning with L₅ is much larger than that with L₆. In the second pair, on the other hand, the probabilities of winning are both minuscule. In this situation, where winning is almost improbable, most people would prefer L₈, with its larger gain, to L₇. This effect, suggested by Kahneman and Tversky, is called the get-rich-quick effect in the literature (Sakai [S2]). Regarding L₇ as a compound lottery [L₅, 0: 1/450, 449/450] and L₈ as [L₆, 0: 1/450, 449/450], we may see that the preceding example also violates the independence condition of Theorem 7.1.6.
Fig. 7.4. Weighting function for probability.

Observing further effects that violate the expected utility theory, namely the reflection effect† and the isolation effect‡, Kahneman and Tversky [K1] suggested a weighted value function model (prospect theory) for a simple probability measure p over Y:

u(p) = Σ_{y∈Y} π(p(y)) u(y),

where π(p) is a weighting function for probability with the property shown in Fig. 7.4. It has been reported that such a modification of the expected utility overcomes several drawbacks of the von Neumann-Morgenstern expected utility theory. For more details, see the original work by Kahneman and Tversky [K1].

7.1.3.2 Group Decisions and Equity
† The preference tendency is called the reflection effect if we observe that the preference over a pair of risky alternatives reverses when all the outcome values are reversed in sign. For example, risk aversion on the positive domain leads to risk proneness on the negative domain.
‡ A risky alternative can be reconsidered as a compound lottery in various ways; sometimes the preference changes with the way of decomposition. This phenomenon is called the isolation effect.

In group decisions, Arrow's well-known general impossibility theorem ([A2], [F8]) asserts that it is impossible in a general sense (under Arrow's five
conditions) to aggregate individuals' preference orders into a social one.† Numerous researchers have tried to overcome the difficulty pointed out by Arrow, but it is considered essentially impossible to escape the stated dilemma. The most controversial subject is the independence of irrelevant alternatives, which is imposed in order to avoid the measurability of preferences. It has been known, however, that even if each individual value (utility) can be evaluated quantitatively, there does not exist any social choice function without interpersonal comparison of values (utilities) (Sen [S7]).

A group Bayesian approach tries to find a group utility function by regarding the group (society) as a man having rationality consistent with the expected utility theory (Raiffa [R1]). Harsanyi [H4] showed that if both the individuals and the group satisfy the conditions of Theorem 7.1.6, and if the group is indifferent between two risky alternatives whenever all individuals are indifferent, then the group utility function should be a linear combination of the individual utilities. Note here that such linearity of the group utility function is mainly due to the independence condition of Theorem 7.1.6. In fact, the proof by Harsanyi proceeds as follows. Under the assumption, the group utility function is a function of the expected utilities of the individuals, and, based upon Theorem 7.1.6, it is a linear function over the set of risky prospects of individual expected utilities. We shall show this in more detail. Suppose that, for a risky alternative P, each individual i has an expected utility uᵢ. Hence, we identify P with the prospect (u₁, …, u_n). Similarly, identify O with (0, …, 0). Let Q be an uncertain prospect [P, O: α, 1 − α]. Then, clearly, Q = αP + (1 − α)O. According to Theorem 7.1.5, u(Q) = αu(P) + (1 − α)u(O), whereas u(Q) = u(αu₁, …, αu_n), u(O) = u(0, …, 0) = 0, and u(P) = u(u₁, …, u_n), which leads to the first-order homogeneity of u(u₁, …, u_n) for 0 ≤ α ≤ 1. In addition, similar arguments for O = αR + (1 − α)P with R = (ku₁, …, ku_n), where k = 1 − 1/α, and for P = αS + (1 − α)O with S = (ku₁, …, ku_n), where k = 1/α, yield the first-order homogeneity of u(u₁, …, u_n) for k < 0 and k > 1, respectively. The additivity of u(u₁, …, u_n) follows immediately by considering the prospects Sᵢ = (0, …, 1, …, 0) (i = 1, …, n) and T = [S₁, …, S_n: 1/n, …, 1/n].

We have seen that the independence condition of Theorem 7.1.6 plays a key role in deriving the linear group utility function via its linearity over uncertain prospects. However, Diamond [D5] criticized this point. Let us consider a problem with two individuals and two specific alternatives. Let the alternative A be (u₁ = 1, u₂ = 0) for sure and B be (u₁ = 0, u₂ = 1) for sure. Unless we discriminate between the two individuals, the group should be indifferent between A and B. Now mix these two alternatives with probability 0.5; that is, C = [A, A: 0.5, 0.5] and D = [A, B: 0.5, 0.5]. According to the independence condition (in fact, according to a weaker condition called the sure thing principle‡), the group should be indifferent between C and D as long as A is indifferent to B. However, D seems preferable to C in the sense that in D both individuals have a possibility of getting the prize, whereas in C the second person has no possibility of getting the prize. Therefore, Diamond claimed, the group does not accept the expected utility theory.

In addition, Keeney [K4] pointed out that as long as the group utility function is viewed as a function of individual expected utilities, it cannot necessarily take equity into account. For example, consider a problem with two individuals and two alternatives: the alternative P results in either (u₁ = 1, u₂ = 0) with probability one-half or (u₁ = 0, u₂ = 1) with probability one-half; the alternative Q yields either (u₁ = 1, u₂ = 1) or (u₁ = 0, u₂ = 0), each with probability one-half.

† Define a social choice function F by a functional relationship between the tuple of individual preferences (P¹, …, Pⁿ) and the social preference P; namely, P = F(P¹, …, Pⁿ). Arrow's general impossibility theorem says that, under the assumption that there are at least two persons in the group and three alternatives, there is no social choice function satisfying the following five conditions:
(C1) The social preference P is a weak order.
(C2) The domain of F must include all possible combinations of individual orderings (unrestricted domain).
(C3) For any pair x, x′ in X, if xPⁱx′ for all i, then xPx′ (Pareto principle).
(C4) Let P and P′ be the social binary relations determined by F corresponding, respectively, to two sets of individual preferences (P¹, …, Pⁿ) and (P¹′, …, Pⁿ′). If, for all pairs of alternatives x, x′ in a subset S of X, xPⁱx′ if and only if xPⁱ′x′ for all i, then the greatest set of S with respect to P is identical to the one with respect to P′ (independence of irrelevant alternatives).
(C5) There is no individual i such that xPⁱx′ implies xPx′ for every pair of elements x, x′ in X (nondictatorship).
For each alternative, both individual expected utilities are 0.5, whence the linear group utility function does not distinguish Q from P. However, from the viewpoint of equity, Q seems preferable to P. This phenomenon may be considered a natural consequence of the fact that the group utility function, as a function of individual expected utilities, derives from an individualism in the sense that the group is indifferent between two alternatives whenever all individuals are indifferent between them.

‡ For any p, q ∈ Ỹ, p ∼ q ⟹ αp + (1 − α)r ∼ αq + (1 − α)r for any r ∈ Ỹ and α ∈ (0, 1).

As is readily seen from the above discussion, it is hard to construct a group utility function consistent with both the expected utility theory and equity. Sen [S7] proposed that the social welfare function should depend not only on the mean value of individual utilities but also on some measure of inequality (dispersion) of these individual utilities. Based on a similar idea, Nakayama et al. [N10] suggested the extended contributive rule (ECR) method, combining the utilitarian index (a linear combination of individual utilities) and a certain egalitarian index (the variance of individual utilities), whose aim is to help mutual understanding of the group members toward a consensus rather than to obtain a social welfare index. Some kinds of utility functions that incorporate equity are discussed in Keeney [K5], Bodily [B5], and others. As to group decisions in terms of parametric domination structures, see Tanino et al. [T12, T13], and from the viewpoint of fuzzy set theory, see Tanino [T8].
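Keeney's example can be put in numbers. The index below, which subtracts a variance-based inequality term from the utilitarian mean, is a toy illustration in the spirit of combining a utilitarian and an egalitarian term; the specific formula is our own assumption, not the actual ECR method of Nakayama et al.

```python
# Keeney's equity example: a linear group utility of individual
# expected utilities ties P and Q, while an inequality-penalizing
# index (illustrative assumption) separates them.

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# alternatives as lotteries over individual-utility vectors
P = [((1.0, 0.0), 0.5), ((0.0, 1.0), 0.5)]
Q = [((1.0, 1.0), 0.5), ((0.0, 0.0), 0.5)]

def individual_expected(alt):
    n = len(alt[0][0])
    return tuple(sum(p * v[i] for v, p in alt) for i in range(n))

def toy_index(alt, w=1.0):
    """Utilitarian mean of expected utilities minus w times the
    expected within-outcome inequality (variance)."""
    utilitarian = mean(individual_expected(alt))
    egalitarian = sum(p * variance(v) for v, p in alt)
    return utilitarian - w * egalitarian

print(individual_expected(P), individual_expected(Q))  # both (0.5, 0.5)
print(toy_index(P), toy_index(Q))  # the equity term favors Q
```

Both alternatives give each individual an expected utility of 0.5, so any function of individual expected utilities alone cannot separate them; the within-outcome variance term does.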
7.1.3.3 Measurability of von Neumann-Morgenstern Utility Function
It is readily seen that the measurability of von Neumann-Morgenstern utility functions strongly depends on the independence condition of Theorem 7.1.5. From the viewpoint of behavioral science, however, it has been observed that the independence condition may be violated in many practical cases, where a sure prospect cannot necessarily be considered an uncertain prospect with probability one. Therefore, a utility function obtained by measurement methods based on the certainty equivalent does not necessarily reflect the preference correctly. More strongly, several researchers such as Arrow [A3], Baumol [B2], Luce and Raiffa [L12], and Dyer and Sarin [D9] argued that von Neumann-Morgenstern utility functions do not measure the intensity of preference, though they evaluate the decision maker's attitude toward risk. Recently, some attempts at evaluating the preference intensity and the risk attitude separately have been reported (Dyer and Sarin [D10] and Sarin [S4]). In the previous section, we discussed a direct measurement of preference intensity and decomposed representations of measurable multiattribute value functions.

7.1.3.4 Utility Functions Nonlinear in the Probability
We may see that the expected utility theory of von Neumann-Morgenstern is in a sense a linear model for preference in risky choices. Of course, the real world is generally nonlinear. Whether or not the expected utility theory fits reality as a first-order approximation depends on the degree of this nonlinearity; it would still remain valid for many practical problems, keeping the stated limits in mind. Finally, we would like to mention that some devices for expected utility theory without the independence condition have been developed recently (Fishburn [F10, F11] and Machina [M1]); these are nonlinear in probability while keeping measurability. For a review of recent developments in utility and value theory, see Farquhar [F3].
7.2 Stochastic Dominance
We have observed in the previous sections that the expected utility theory plays a major role in risky decision-making problems, while at the same time there are some difficulties in assessing utility functions. The theory of stochastic dominance has been developed for ranking risky alternatives without complete knowledge of the utility function. In the following, we suppose that our risky decisions follow the expected utility theory of von Neumann-Morgenstern, but that we have only partial information on the utility function of the decision maker. For simplicity, we identify risky alternatives with cumulative probability distributions over a positive scalar attribute Y. Hence, for a risky alternative F, the expected utility can be represented by

E(u, F) = ∫_0^∞ u(y) dF(y).
Definition 7.2.1
Consider utility functions that are bounded and appropriately smooth on [0, ∞). Let u′, u″, and u‴ denote the first, second, and third derivatives of the utility function u, respectively. Then the classes of utility functions U_1, U_2, and U_3 are defined as

(i) U_1 := {u | u ∈ C^1, u′(y) > 0, ∀y ∈ [0, ∞)},
(ii) U_2 := {u | u ∈ C^2, u ∈ U_1, u″(y) < 0, ∀y ∈ [0, ∞)},
(iii) U_3 := {u | u ∈ C^3, u ∈ U_2, u‴(y) > 0, ∀y ∈ [0, ∞)}.
Remark 7.2.1

As is readily recognized, U_1 is the class of utility functions for which the decision maker's preference is strictly increasing over outcomes. Furthermore, U_2 is the class of utility functions that are in U_1 and risk averse. Finally, it is readily seen that decreasing risk averse utility functions belong to the class U_3.

Definition 7.2.2 (Stochastic Domination)

For i = 1, 2, 3,

F >_i G if and only if E(u, F) > E(u, G), ∀u ∈ U_i.

We refer to >_1 as the first-degree stochastic dominance (FSD), >_2 as the second-degree stochastic dominance (SSD), and >_3 as the third-degree stochastic dominance (TSD).
These domination relationships induce partial orderings over risky prospects. Therefore, although stochastic dominance cannot in general be expected to lead to the best alternative by itself, it may be helpful in narrowing down the alternative set.

Definition 7.2.3

Let P be the set of all right-continuous distribution functions F with F(0) = 0. For each F ∈ P, we define F^i as

F^1(y) := F(y),  F^{i+1}(y) := ∫_0^y F^i(t) dt,  y ∈ [0, ∞).
Remark 7.2.2

The right continuity of F and G in P implies that F^2 ≠ G^2 when F ≠ G. Therefore, F = G follows whenever F^n = G^n for some n ≥ 2. In what follows, the statement "G(y) ≥ F(y), ∀y ∈ [0, ∞)" is understood to mean that G(y) ≥ F(y) for all y ∈ [0, ∞), with G(y) > F(y) on at least some interval of [0, ∞). The first two parts of the following theorem are due to Hadar and Russell [H1], and the last part is due to Whitmore [W10]. (See also Fishburn and Vickson [F13].)
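As a numerical illustration (the grid and the example distributions below are hypothetical, not from the text), the iterated integrals F^{n+1}(y) = ∫_0^y F^n(t) dt of Definition 7.2.3 can be approximated on a grid and used to test the pointwise comparison G^n(y) ≥ F^n(y):

```python
def iterate_cdf(cdf_vals, dy):
    """One step F^n -> F^{n+1} by the trapezoidal rule on an equispaced grid."""
    out = [0.0]
    for i in range(1, len(cdf_vals)):
        out.append(out[-1] + 0.5 * (cdf_vals[i - 1] + cdf_vals[i]) * dy)
    return out

def dominates(f_cdf, g_cdf, dy, degree):
    """Check G^degree(y) >= F^degree(y) at all grid points, strict somewhere."""
    fn, gn = list(f_cdf), list(g_cdf)
    for _ in range(degree - 1):
        fn, gn = iterate_cdf(fn, dy), iterate_cdf(gn, dy)
    return (all(g >= f - 1e-12 for f, g in zip(fn, gn))
            and any(g > f + 1e-12 for f, g in zip(fn, gn)))

# Example: F uniform on [1, 2], G uniform on [0, 1]; F dominates G.
dy = 0.01
grid = [i * dy for i in range(301)]
F = [min(max(y - 1.0, 0.0), 1.0) for y in grid]
G = [min(y, 1.0) for y in grid]
print(dominates(F, G, dy, 1))  # True: first-degree dominance
print(dominates(F, G, dy, 2))  # True: FSD implies SSD
```

The tolerance 1e-12 guards against rounding noise in the trapezoidal sums.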
Theorem 7.2.1

The following hold:

(i) F >_1 G if and only if G^1(y) ≥ F^1(y) for all y ∈ [0, ∞),
(ii) F >_2 G if and only if G^2(y) ≥ F^2(y) for all y ∈ [0, ∞),
(iii) F >_3 G if and only if μ_F ≥ μ_G and G^3(y) ≥ F^3(y) for all y ∈ [0, ∞),

where μ_F denotes the mean value of F, given by μ_F := ∫_0^∞ y dF(y).

Proof. Let D^i = F^i − G^i (i = 1, 2, 3). Then, since u(y) is bounded over [0, ∞),

E(u, F) − E(u, G) = ∫_0^∞ u(y) dD^1(y) = [u(y)D^1(y)]_0^∞ − ∫_0^∞ u′(y)D^1(y) dy = −∫_0^∞ u′(y) dD^2(y).

When F^1 and G^1 intersect, part (i) gives no information on the domination between F and G. In such a case, if F^2 is pointwise smaller than G^2, then F is preferred to G in terms of expected utilities as long as the decision maker's preference is monotonically
Fig. 7.5. F >_1 G.
increasing over outcomes and risk averse (Fig. 7.6). Finally, in the case of F^2 and G^2 intersecting at a point, if the decision maker's preference is decreasing risk averse, we have a domination between F and G by checking the mean values together with F^3 and G^3.

Remark 7.2.4

As is well known, EV-dominance, by which alternatives with larger expected value and smaller variance are preferred, is not necessarily adequate for decision problems if we take the preference of the decision maker into account. (See, for example, Keeney and Raiffa [K6].) Moreover, if μ_F = μ_G, then F >_2 G (F >_3 G) implies σ_F < σ_G (resp. σ_F ≤ σ_G). Clearly, μ_F = μ_G and F >_1 G are inconsistent. On the other hand, if μ_F > μ_G, then F >_i G (i = 1, 2, 3) is neither necessary nor sufficient for σ_F ≤ σ_G. In view of
Fig. 7.6. F >_2 G.
these facts, stochastic dominance seems sufficiently adequate for reducing the alternative set in the light of the decision maker's preference.

7.2.1 Multivariate Stochastic Dominance

We now consider stochastic dominance for decision problems with multiple attributes; in particular, with two attributes Y × Z for simplicity.

Definition 7.2.4 (Multivariate Risk Aversion)†
If for any y1, y2, z1, and z2 with y1 < y2 and z1 < z2 a decision maker prefers the risky alternative giving an even chance of (y1, z2) or (y2, z1) to the one giving an even chance of (y1, z1) or (y2, z2), then his preference is said to be multivariate risk averse.

Lemma 7.2.1†
Let the utility function of the decision maker be smooth enough. Then the decision maker's preference is multivariate risk averse if and only if

∂²u(y, z)/∂y ∂z < 0  for all y and z.

Proof. The lemma follows immediately from

u(y2, z2) − u(y2, z1) − u(y1, z2) + u(y1, z1) = ∫_{y1}^{y2} ∫_{z1}^{z2} (∂²u(y, z)/∂y ∂z) dz dy,

since multivariate risk aversion is equivalent to the left-hand side being negative for all y1 < y2 and z1 < z2.
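A minimal numeric check of this definition (the utility function and outcome values below are chosen purely for illustration): for a u with negative cross derivative, the "mixed" even-chance lottery on (y1, z2) and (y2, z1) yields the higher expected utility.

```python
def u(y, z):
    # u_yz = -0.5 < 0 on this domain: multivariate risk averse (Lemma 7.2.1)
    return y + z - 0.5 * y * z

def expected_u(outcomes):
    """Expected utility of an even-chance lottery over (y, z) pairs."""
    return sum(u(y, z) for y, z in outcomes) / len(outcomes)

y1, y2, z1, z2 = 0.0, 1.0, 0.0, 1.0
mixed = expected_u([(y1, z2), (y2, z1)])        # spreads the good outcomes
concentrated = expected_u([(y1, z1), (y2, z2)])  # all-or-nothing lottery
print(mixed > concentrated)  # True
```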
The following theorem is originally due to Levy and Paroush [L4].

Theorem 7.2.2

Let F(y, z) and G(y, z) be two cumulative probability distribution functions absolutely continuous on R²₊, with F1(y), F2(z) and G1(y), G2(z) as marginals. In addition, suppose that the utility functions u(y, z) under consideration are smooth enough and that lim_{y→∞} u(y, z) and lim_{z→∞} u(y, z) exist for any (y, z) ∈ R²₊. Then, for decision makers whose preferences are increasing over outcomes and multivariate risk averse,

F >_MRA G if and only if F(y, z) < G(y, z) for all (y, z) ∈ R²₊.

Here the notation >_MRA represents stochastic dominance for the class of utility functions that are multivariate risk averse.

† Definition 7.2.4 and Lemma 7.2.1 are from Richard [R2].
Proof. (if) Letting D(y, z) = F(y, z) − G(y, z), D_yz = ∂²D/∂y ∂z, u_y = ∂u/∂y, etc., we have

E(u, F) − E(u, G) = ∫_0^∞ ∫_0^∞ u(y, z) D_yz(y, z) dy dz
 = ∫_0^∞ [u(y, z) D_z(y, z)]_{y=0}^{y=∞} dz − ∫_0^∞ ∫_0^∞ D_z(y, z) u_y(y, z) dy dz,

and we treat the two terms on the right-hand side separately. Set d(y, z) = f(y, z) − g(y, z), where f(y, z) and g(y, z) are the density functions associated with F(y, z) and G(y, z), respectively. Then

d2(z) := ∫_0^∞ d(y, z) dy = lim_{y→∞} ∫_0^y d(t1, z) dt1 = lim_{y→∞} D_z(y, z),

and D2(z) = ∫_0^z d2(t2) dt2 = F2(z) − G2(z). Therefore,

first term = ∫_0^∞ {lim_{y→∞} u(y, z) d2(z)} dz
 = lim_{y→∞} ∫_0^∞ u(y, z) d2(z) dz
 = −lim_{y→∞} ∫_0^∞ D2(z) u_z(y, z) dz.

Hence,

second term = −∫_0^∞ [D(y, z) u_y(y, z)]_{z=0}^{z=∞} dy + ∫_0^∞ ∫_0^∞ D(y, z) u_yz(y, z) dy dz
 = −lim_{z→∞} ∫_0^∞ D1(y) u_y(y, z) dy + ∫_0^∞ ∫_0^∞ D(y, z) u_yz(y, z) dy dz.

Since u_y(y, z) > 0, u_z(y, z) > 0, and u_yz(y, z) < 0 for all (y, z) ∈ R²₊, F >_MRA G holds as long as D(y, z) < 0 for any (y, z) ∈ R²₊.

(only if) Provided that F(ȳ, z̄) ≥ G(ȳ, z̄) for some (ȳ, z̄), a contradiction results from using the utility function

u(y, z) = y + z  for all (y, z) such that either y < ȳ or z < z̄,
u(y, z) = y + z + αyz  for all (y, z) such that both ȳ ≤ y < ȳ + h and z̄ ≤ z < z̄ + h,
u(y, z) = y + z + K  otherwise,

where h is a sufficiently small positive scalar and K = (ȳ + h) + (z̄ + h) + α(ȳ + h)(z̄ + h).
Corollary 7.2.1†

If the utility function u(y, z) is of the additive form

u(y, z) = k1 u1(y) + k2 u2(z),

then, for the dominance >_MFSD defined for the class of increasing utility functions,

F >_MFSD G if and only if F1(y) ≤ G1(y) for all y ∈ (0, ∞) and F2(z) ≤ G2(z) for all z ∈ (0, ∞).

Remark 7.2.5

If F and G are probabilistically independent, the preference between F and G depends only on their marginal distributions. Then the utility function becomes of an additive form (cf. Theorem 7.1.1). Therefore, if F and G are probabilistically independent, >_MFSD can be judged on the basis of their marginals alone.

7.2.2 A Method via Value Functions

In the practical assessment of multiattribute utility functions, decision makers tend to be confused by judging their preference on the basis of lotteries. For such cases, we can make use of multiattribute value functions. Measurable value functions are assessed on the basis of information on the intensity of preference, without using any lotteries. Therefore, value functions themselves do not represent the decision maker's attitude toward risk, and hence their direct applications are limited to deterministic (nonrisky) problems. However, by

† Levy and Paroush [L4].
making use of value functions as inputs to utility functions, several merits appear in risky situations (Keeney and Raiffa [K6] and Bodily [B15]). In this situation, when we have only partial knowledge about a multiattribute value function v(y1, ..., yp) and a utility function u(v) over the value, stochastic dominance is introduced in a similar way as in the single-attribute case.

Definition 7.2.5

Letting v(0, 0) = 0 and u(0) = 0,

U_1^v := {u = u(v) | u′ > 0, ∂v/∂y > 0, ∂v/∂z > 0, ∀(y, z) ∈ R²₊},
U_2^v := {u = u(v) | u ∈ U_1^v, u″ < 0, ∂²v/∂y² < 0, ∂²v/∂z² < 0, ∀(y, z) ∈ R²₊}.
Remark 7.2.6

It is well known that the conditions ∂²v/∂y² < 0 and ∂²v/∂z² < 0 mean decreasing marginal value. Therefore, when the value v is increasing over outcomes and its marginal value in each attribute is decreasing, the risk averse utility functions over v form the class U_2^v.

Define the cumulative distribution F(v) over the value induced from F(y, z), where v_z(v; y) denotes the solution in z of v(y, z) = v. Then the expected utility can be written as

E(u(y, z), F(y, z)) = ∫_0^∞ ∫_0^∞ u(y, z) d²F(y, z) = ∫_0^∞ u(v) dF(v).
Definition 7.2.6

Define a multivariate stochastic dominance >_i^v for i = 1, 2 as

F(y, z) >_i^v G(y, z) if and only if E(u(y, z), F(y, z)) > E(u(y, z), G(y, z)), ∀u ∈ U_i^v.

Note here that F(y, z) >_i^v G(y, z) is equivalent to F(v) >_i G(v). Then we have the following theorem, in which the first part is due to Huang et al. [H13] and the second to Takeguchi and Akashi [T1]. For the proofs, see their original papers.
Theorem 7.2.3

(i) Suppose that for all y ∈ (0, ∞),
(a) G1(y) > F1(y),
(b) ∂G2|1(z|y)/∂y ...,
(c) ...
Then F(y, z) >_1^v G(y, z).

(ii) Suppose, in place of (c) above,
(c′) ∫_0^z G2|1(t|y) dt > ∫_0^z F2|1(t|y) dt for all (y, z) ∈ R²₊.
Then, under the conditions (a), (b), and (c′), F(y, z) >_2^v G(y, z).
7.2.3 Discussion

In order for the stochastic dominance theory to be valid for practical problems, it is necessary to know the probability distributions of the alternatives, which might seem too restrictive. Recall, however, that the expected utility theory requires information on both the probability distribution and the utility function. Since the examination of multivariate stochastic dominance is rather difficult in practice, the theory would be expected to be especially effective for single-attribute cases with many alternatives. In fact, stochastic dominance theory has been extensively developed in portfolio selection. Another application, to the equal distribution of public risk, is also possible: as is well known, risk measurement via von Neumann-Morgenstern utility functions is applicable to inequality measurement as well. (See, for example, Atkinson [A6] and Bodily [B15].) A discussion of this topic is found in Nakayama et al. [N12]. As to statistical dominance for problems with incomplete information on the state of nature, see Fishburn [F5]; regarding statistical-stochastic dominance, see Takeguchi and Akashi [T2].
7.3 Multiobjective Programming Methods

Utility/value analysis tries to represent the decision maker's preference as a utility/value function. To this end, as was seen in the previous sections, some restrictive assumptions are imposed on the decision maker's preference
structure. However, if our aim is mainly to find a solution to the given problem, it is not always necessary to grasp the global preference structure of the decision maker. Interactive programming methods have been developed for obtaining a solution by utilizing only partial information about the decision maker's preference. We now discuss several kinds of interactive programming methods. Our aim, however, is not merely to introduce these methods but to discuss their practical implications. For surveys of further methods see, for example, Hwang and Masud [H17] and Chankong and Haimes [C6].
7.3.1 Goal Programming

The term goal programming was originally proposed by Charnes and Cooper [C7]. Its purpose is to find a solution as close as possible to a given goal or target that is, in some cases (but not necessarily), unattainable. Let f̄ = (f̄1, f̄2, ..., f̄p) denote such a goal. Then the general formulation of goal programming is given by

(GP)  Minimize d(f(x), f̄) subject to x ∈ X,  (7.3.1)
where d(·) is a distance function associated with a chosen metric.

Lemma 7.3.1

Let y⁺ and y⁻ be vectors of R^p, and consider the following problem:

(GP′)  Minimize g(y⁺, y⁻)
  subject to f(x) + y⁺ − y⁻ = f̄,
    y⁺, y⁻ ≥ 0,
    x ∈ X.

Suppose that the function g is monotonically increasing with respect to the elements of y⁺ and y⁻, and strictly monotonically increasing with respect to at least either y_i⁺ or y_i⁻ for each i (1 ≤ i ≤ p). Then the solution ȳ⁺ and ȳ⁻ to the preceding problem satisfies

ȳ_i⁺ ȳ_i⁻ = 0,  i = 1, 2, ..., p.  (7.3.2)
Proof. Suppose, to the contrary, that the solution ȳ⁺, ȳ⁻, and x̄ to the problem under consideration does not satisfy relation (7.3.2). Then, for some i (1 ≤ i ≤ p), both ȳ_i⁺ and ȳ_i⁻ are positive. Hence there exists a positive number δ such that

ŷ_i⁺ := ȳ_i⁺ − δ > 0,  ŷ_i⁻ := ȳ_i⁻ − δ > 0.

Letting ŷ⁺ denote the vector such that ŷ_j⁺ = ȳ_j⁺ for j ≠ i, with ŷ_i⁺ as above, and defining ŷ⁻ in a similar way, we have

f(x̄) + ŷ⁺ − ŷ⁻ = f̄.

It then follows from our assumption that g(ŷ⁺, ŷ⁻) < g(ȳ⁺, ȳ⁻), which contradicts the fact that ȳ⁺ and ȳ⁻ are the minimal solution to the given problem.

Remark 7.3.1

According to Lemma 7.3.1, for the solutions ȳ⁺, ȳ⁻, and x̄ to the minimization problem in Lemma 7.3.1, we have

ȳ_i⁺ = f̄_i − f_i(x̄)  if f_i(x̄) ≤ f̄_i,
ȳ_i⁻ = f_i(x̄) − f̄_i  if f_i(x̄) ≥ f̄_i.

Therefore, if less of the ith criterion is preferred to more, then ȳ_i⁺ and ȳ_i⁻ represent the overattainment and the underattainment of the ith criterion, respectively.
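The complementarity relation (7.3.2) and the interpretation in Remark 7.3.1 can be sketched directly; the function name and the data below are hypothetical, not from the text:

```python
def deviations(f_x, f_bar):
    """Return (y_plus, y_minus) as in Remark 7.3.1:
    y_i+ = max(f_bar_i - f_i, 0), y_i- = max(f_i - f_bar_i, 0)."""
    y_plus = [max(g - f, 0.0) for f, g in zip(f_x, f_bar)]
    y_minus = [max(f - g, 0.0) for f, g in zip(f_x, f_bar)]
    return y_plus, y_minus

f_x = [3.0, 1.0, 2.0]    # criterion values f(x) at some feasible x
f_bar = [2.0, 2.0, 2.0]  # the goal
yp, ym = deviations(f_x, f_bar)
print(yp)  # [0.0, 1.0, 0.0]  (overattainment: f_i below the goal)
print(ym)  # [1.0, 0.0, 0.0]  (underattainment: f_i above the goal)
print(all(p * m == 0.0 for p, m in zip(yp, ym)))  # complementarity (7.3.2)
```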
Remark 7.3.2

The following particular forms of g(·) are often used in practice:

(i) g(y⁺, y⁻) = Σ_{i=1}^p (y_i⁺ + y_i⁻),
(ii) g(y⁺, y⁻) = Σ_{i=1}^p y_i⁺,
(iii) g(y⁺, y⁻) = Σ_{i=1}^p y_i⁻.
In case (i), the function g becomes identical to the l₁-norm of f̄ − f(x). This form is often used when the most desirable solution should satisfy f(x) = f̄. On the other hand, when less of each objective function is preferred to more, case (iii) applies; note that case (ii) does not make sense under such a circumstance. In case (iii), if there exists a feasible solution x ∈ X such that f(x) ≤ f̄, the optimal value of problem (GP′) becomes 0. Therefore, problem (GP′) with the objective function g of case (iii) is equivalent to

Find x ∈ X such that f(x) ≤ f̄.

Also in case (iii), if there exists no feasible solution x ∈ X such that f(x) ≤ f̄, then problem (GP′) reduces to minimizing the l₁-norm of f(x) − f̄ over X. In such a circumstance, it is better to consider f̄ an aspiration level rather than a goal. For cases in which more of each objective function is preferred to less, a similar discussion is possible for case (ii). Throughout the preceding considerations, it should be noted that goal programming can treat not only optimization but also satisficing.

Remark 7.3.3
If the goal f̄ is unattainable, the distance between f(x) and f̄ can be considered to represent a measure of the regret resulting from the unattainability of f̄. As a more general distance function, Charnes and Cooper suggested using Minkowski's metric

d(f(x), f̄) = (Σ_{i=1}^p (μ_i |f̄_i − f_i(x)|)^α)^{1/α},  μ_i ≥ 0.

In particular,

for α = 1:  d(f(x), f̄) = Σ_{i=1}^p μ_i |f_i(x) − f̄_i|  (l₁-norm),
for α = 2:  d(f(x), f̄) = (Σ_{i=1}^p μ_i |f_i(x) − f̄_i|²)^{1/2}  (l₂-norm),
for α = ∞:  d(f(x), f̄) = max_{1≤i≤p} μ_i |f_i(x) − f̄_i|  (l∞-norm or Tchebyshev norm).
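A small sketch of these weighted metrics (the function name and data are illustrative; with unit weights the three cases reduce to the plain l₁, l₂, and Tchebyshev norms):

```python
def goal_distance(f_x, f_bar, mu, alpha):
    """Weighted Minkowski distance between f(x) and the goal f_bar;
    alpha = float('inf') gives the Tchebyshev norm."""
    terms = [m * abs(fi - gi) for fi, gi, m in zip(f_x, f_bar, mu)]
    if alpha == float('inf'):
        return max(terms)
    return sum(t ** alpha for t in terms) ** (1.0 / alpha)

f_x, f_bar, mu = [3.0, 4.0], [0.0, 0.0], [1.0, 1.0]
print(goal_distance(f_x, f_bar, mu, 1))             # 7.0 (l1)
print(goal_distance(f_x, f_bar, mu, 2))             # 5.0 (l2)
print(goal_distance(f_x, f_bar, mu, float('inf')))  # 4.0 (Tchebyshev)
```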
Several years later, the solution that minimizes such a regret was re-suggested as a compromise solution (Zeleny [Z3]). Although there is no essential difference between them, the terminology of goal programming has been used mainly for problems with the formulation stated in Remark 7.3.2, whereas that of the compromise solution is used for the formulation with more general distance functions. For more details and extensions, refer to
Charnes and Cooper [C8], Hwang and Masud [H17], Ignizio [I1, I2], and Lee [L1].

7.3.1.1 Discussion. There have been many applications of goal programming to practical problems (see, for example, Charnes and Cooper [C8], Hwang and Masud [H17], Ignizio [I1], and Lee [L1]). However, in order for decision makers to accept the solution obtained by the method, it should reflect their value judgment. Although goal programming is recognized to be very easy to handle, it has the following two major drawbacks:

(i) How do we choose the distance function?
(ii) How do we decide the weighting vector (μ1, ..., μp)?

We may notice that the distance function is not necessarily required to approximate the decision maker's preference globally, but it should do so at least locally, near the solution. In many applications so far, the l₁-distance (or either y⁺ or y⁻ in some cases) has been used as the distance function. As a result, in order to get a better approximation of the preference of decision makers, several devices have been developed, such as solving the goal programming problem iteratively (see, for example, Ignizio [I1]).

Problem (ii) is more serious. In earlier days, no rule was suggested for deciding the weights, and hence systems analysts adopted weights based on their experience. In 1972, Dyer [D7] suggested a method for deciding the weights through interaction with decision makers. Consider a case in which less of each objective function is preferred to more. In view of Remark 7.3.2, a formulation of goal programming may be given by

(IGP)  Minimize μᵀy⁻
  subject to f(x) + y⁺ − y⁻ = f̄,
    y⁺, y⁻ ≥ 0,
    x ∈ X.

Dyer's idea originates from the observation that the distance function of problem (IGP) gives a piecewise linear approximation of an additive separable value function V(f(x)). Therefore, he suggested setting

μ_i = (∂V/∂f_i)/(∂V/∂f_j).

In Dyer's interactive goal programming, decision makers are requested to supply these weights as marginal rates of substitution with respect to their preferences. With this interaction, the goal programming problem is solved iteratively until the iteration converges. This interaction, however, suffers from the difficulty that assessing the marginal rate of substitution seems to be beyond human ability; this will be discussed in more detail in the next subsection.
7.3.2 Interactive Optimization Methods

Although a number of interactive optimization methods have been developed for solving multiobjective programming problems, we discuss two of them here, typical in the historical and philosophical sense: namely, the interactive Frank-Wolfe method (Geoffrion et al. [G7]) and the surrogate worth trade-off method (Haimes et al. [H2]). For references to other methods, see, for example, Chankong and Haimes [C5], Goicoechea et al. [G8], and Hwang and Masud [H17].

7.3.2.1 Interactive Frank-Wolfe Method
The Frank-Wolfe method, originally suggested for solving quadratic programming problems, can be applied to the general convex programming problem

(CP)  Maximize V(x) subject to x ∈ X.

It is known that the method converges to a solution, with a good initial convergence rate, under the conditions that V is a real-valued differentiable pseudoconcave function of the n-vector x = (x1, x2, ..., xn) of real variables and that the set X ⊂ R^n is convex and compact (Zangwill [Z2]). Its algorithm is given by

Step 0. Choose x^1 ∈ X and set k = 1.
Step 1. Determine the search direction d^k = y^k − x^k, where y^k is a solution of

Maximize ∇V(x^k)·y over y ∈ X.

Step 2. Determine the step length α_k that solves

Maximize V(x^k + αd^k) over α ∈ [0, 1].

If V(x^k + α_k d^k) ≤ V(x^k), then stop. Otherwise, let x^{k+1} = x^k + α_k d^k, replace k by k + 1, and go to Step 1.
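The iteration above can be sketched in a self-contained way for a known concave objective. Everything below is an illustrative assumption, not the interactive setting of the text: the quadratic V, the box constraint X = [0, 1]², and the closed-form exact line search (available only because this V is quadratic).

```python
def grad_V(x):
    # V(x) = -(x1 - 0.3)^2 - (x2 - 0.7)^2, maximized over the box at (0.3, 0.7)
    return [-2.0 * (x[0] - 0.3), -2.0 * (x[1] - 0.7)]

def frank_wolfe(x, iters=50):
    for _ in range(iters):
        g = grad_V(x)
        # Step 1: the linear subproblem max_y g.y over the box picks a vertex.
        y = [1.0 if gi > 0.0 else 0.0 for gi in g]
        d = [yi - xi for yi, xi in zip(y, x)]
        gd = sum(gi * di for gi, di in zip(g, d))
        if gd <= 1e-12:            # no ascent direction left: stop
            break
        # Step 2: exact line search; closed form since dV/da = g.d - 2a|d|^2.
        dd = sum(di * di for di in d)
        a = min(1.0, gd / (2.0 * dd))
        x = [xi + a * di for xi, di in zip(x, d)]
    return x

x = frank_wolfe([0.0, 0.0])
print([round(v, 2) for v in x])  # -> [0.3, 0.7]
```

In the interactive version, the gradient information in Step 1 would instead come from the decision maker's MRS answers, and Step 2 from his choice along the search direction.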
In multiobjective programming, we take as the total objective function the decision maker's value function V(f1, ..., fp) over the objective space, which is not known explicitly. Note that

∇_x V(f(x^k)) = Σ_{i=1}^p (∂V(f(x^k))/∂f_i) ∇_x f_i(x^k).  (7.3.3)

The term ∂V/∂f_i is the so-called marginal value with respect to f_i, whose evaluation has been supposed to be impossible in traditional economics. However, as in traditional economics, it also suffices for us to use the
marginal rate of substitution (MRS) instead of the marginal value for our present purpose. Let A = (f̄1, ..., f̄p) be indifferent to B = (f̄1, ..., f̄_i + Δf_i, ..., f̄_j − Δf_j, ..., f̄p); namely, in order to keep the indifference relation between A and B, the improvement of f_j by Δf_j is required to compensate for the sacrifice of f_i by Δf_i. Then the marginal rate of substitution m_ij of f_i for f_j at f̄ is defined as the limit of the rate of substitution Δf_j/Δf_i as B tends to A. That is,

m_ij = lim_{B→A} (Δf_j/Δf_i).

As is readily seen, the marginal rate of substitution and the marginal value are related by

m_ij = (∂V/∂f_i)/(∂V/∂f_j).

Hence, relation (7.3.3) can be rewritten as

∇_x V(f(x^k)) = (∂V(f(x^k))/∂f_s) Σ_{i=1}^p m_is(f(x^k)) ∇_x f_i(x^k).

Note here that the term ∂V(f^k)/∂f_s does not affect the solution of the maximization problem (CP), so long as s is selected in such a way that this term becomes negative at each iteration. Therefore, in using the Frank-Wolfe method for our multiobjective problem in an interactive way, the decision maker is requested to answer his MRS at Step 1 and the most preferred point along the search direction at Step 2.

7.3.2.1.1 Discussion. The interactive Frank-Wolfe method raises the following questions:

(i) Can we guarantee the differentiability and quasiconcavity of V(f1, ..., fp), so that the method may converge?
(ii) How is the MRS of the decision maker assessed?
(iii) Can the decision maker perform the line search easily?
As to question (i), Wehrung [W6] proved the convergence of the method under appropriate conditions on the decision maker's preference rather than on the value function V(f): the preference is (1) complete, (2) transitive, (3) continuous in the sense that both the set of f weakly preferred to f̄ and the set of f to which f̄ is weakly preferred are closed for any f̄, (4) convex in the sense that for any f^1 and f^2 with f^1 preferred to f^2, αf^1 + (1 − α)f^2 is weakly preferred to f^2 for all α ∈ [0, 1), and (5) there exists a unique MRS at each point (f1, ..., fp).

It should be noted that the convexity of preference may be interpreted to mean that the preference is discriminating enough to exclude thick indifference curves (Debreu [D3]). The conditions of the convexity of preference and the unique existence of the MRS will be seen later to be very strong in many practical situations. Observe also that checking these conditions is laborious, even if we accept them.

Question (ii) is serious. We may easily become aware of difficulties in estimating the MRS of decision makers. Although some devices for obtaining the MRS have been developed (Dyer [D7]), it has been observed that many decision makers cannot easily answer their MRS (Wallenius [W2] and Cohon and Marks [C14]). In fact, in giving an MRS m_ij, decision makers are requested to specify the amount Δf_i by which they agree to sacrifice f_i in exchange for improving f_j by a given small amount Δf_j, so as to keep the indifference relation. It has been observed in many cases that such a judgment troubles decision makers very much (see, for example, Wallenius [W2]). Furthermore, recall that the MRS is given as a limit of a ratio of finite differences. Strictly speaking, such a limit operation, implied by the term marginal, seems beyond human ability. As is widely perceived, human beings cannot recognize changes smaller than a certain threshold. This threshold of human recognition is called the just noticeable difference (JND). Therefore, we cannot make the amount Δf_j arbitrarily small. This observation shows that the MRS obtained through interaction with decision makers is only an approximation (possibly a rough one) to the correct one. The effect of this evaluation error in the MRS on the convergence of the method was analyzed by Dyer [D8], who concluded that the method is stable against such errors. However, we may see in Fig. 7.7 that some of the underlying assumptions in Dyer's conclusion (in particular, the exact line search) are too strong in actual cases.

If the JND of the attribute X is x_B − x_A, then in evaluating the MRS we cannot move closer to the point A than the point B, which implies that the MRS represents the direction d. When the difference between A and C is within the JND along the direction d, the decision maker answers that A is the most preferred point along that direction. Namely, the optimization process terminates at the point A, which differs from the true optimum (if any) and possibly even from a satisfactory solution.
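The finite-difference character of the MRS noted above can be illustrated for a known value function. This is a hypothetical example: in the interactive method the ratio is elicited from the decision maker, not computed.

```python
def V(f):
    # An illustrative additive value function over two criteria.
    return -2.0 * f[0] - 1.0 * f[1]

def mrs(V, f, i, j, h=1e-6):
    """Central finite-difference approximation of m_ij = (dV/df_i)/(dV/df_j)."""
    def partial(k):
        fp = list(f); fp[k] += h
        fm = list(f); fm[k] -= h
        return (V(fp) - V(fm)) / (2.0 * h)
    return partial(i) / partial(j)

print(mrs(V, [1.0, 1.0], 0, 1))  # approximately 2.0: one unit of f_0 trades for two of f_1
```

Shrinking h below the "JND" of the assessor is exactly what the text argues a human respondent cannot do.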
Fig. 7.7.
Note here that, in the presence of the JND, indifference curves in the usual sense cannot even exist (and hence neither can the MRS), because the JND causes intransitive indifference. For more details on the JND and intransitive indifference, see Chipman [C9], Luce [L11], and Tversky [T15].

Question (iii) is not as severe as question (ii). Recall that we cannot expect an exact line search even in traditional mathematical programming with explicitly known objective functions; hence, for convergence it suffices to find a point more preferable than f(x^k) along the direction. Even so, it is not very easy for decision makers to answer which point along the direction is more preferable, because this requires them to rank vectors, that is, to evaluate multiple criteria simultaneously. In particular, the situation becomes severe as we come close to the solution. Furthermore, poor line searches cause poor rates of convergence. Wehrung [W7] suggested a method that selects among a predetermined finite number of equal subintervals of the step-length interval on the basis of the Fibonacci search method, the golden section method, the bisection method, and other methods. On the other hand, Oppenheimer [O1] suggested the iterative use of a proxy function of known form as a local approximation of V(f(x)). As a proxy function, he suggested the exponential sum
P1(f) = −Σ_{i=1}^p a_i exp(−w_i f_i)

or the power sum

P2(f) = −Σ_{i=1}^p a_i f_i^{α_i}.
The parameters a_i, w_i, and α_i are decided, for P1 for example, by solving the equation

m_ij = (∂P(f)/∂f_i)/(∂P(f)/∂f_j).

Using Oppenheimer's method, the number of assessments of the decision maker's preference is expected to decrease. In general, however, the convergence (and the convergence rate) would be sensitive to the form of the proxy function and to the assessed MRS.
The Surrogate Worth Trade-Off Method
If the less of each objective functionj, is preferred to the more, it is natural to consider the decision maker's preference to be monotonous; namely,
Then, according to Theorem 3.1.1, the most preferable solution with respect to the decision maker's preference should belong to the set of efficient solutions. In the Geoffrion-Dyer-Feinberg method, the obtained solution may be proved theoretically to be an efficient solution under some appropriate conditions such as the convexity of the preference, the criteria j., and the set X. In practice, however, this cannot necessarily be expected due to several kinds of errors including human factors. Haimes et al. [H2] suggested a method, called the surrogate worth trade-off (SWT) method, in which the search is limited on the efficient set without using decision makers' MRS explicitly. Before stating the method, note the following property of the preferentially optimal solution. Assumption 7.3.1
The set of efficient solutions in the objective space defines a smooth surface E in R^p.

Definition 7.3.1 (Efficiency Trade-Off Ratio)

Let (t1, ..., t_{p−1}, 1) denote the direction ratio of the outer normal of E at an efficient point f̄. Then t_j gives information on the trade-off between f_p and f_j at the point f̄ for keeping efficiency. Therefore, t_j is called the efficiency trade-off ratio between f_p and f_j.
Lemma 7.3.2

Let f⁰ = f(x⁰) be a preferentially optimal solution, and write m_i = m_ip (i = 1, ..., p − 1) for the MRS of the decision maker. We then have

f⁰ ∈ E,  m_i(f⁰) = t_i(f⁰)  (i = 1, ..., p − 1).

Furthermore, under the convexity of the preference, of each objective function, and of the set X, the preceding condition is also sufficient for the preferential optimality of f⁰.

Proof. With the help of a value function representing the decision maker's preference, the lemma follows from the well-known Kuhn-Tucker theorem (Haimes et al. [H2]). See also Nakayama [N3] for a proof in terms of the preference relation.

The algorithm of the SWT method is summarized as follows:

Step 1. Generate a representative subset of efficient solutions. To this end, the auxiliary optimization stated in Section 3.4.3 is recommended for appropriately chosen vectors (ε1, ..., ε_{p−1}), where the reference objective function is taken as f_p without loss of generality.

Step 2. The decision maker is asked to score each efficient solution obtained. Let λ_j be the Lagrange multiplier associated with the constraint f_j(x) ≤ ε_j in the auxiliary optimization at Step 1. Under the hypothesis of smoothness of the efficient surface E, λ_j gives information on the efficiency trade-off between f_p and f_j at the solution f(x̄). We ask the decision maker how much he would like to improve f_p by λ_j units per one-unit degradation of f_j at f(x̄), the other criteria being fixed at f_s(x̄) (s ≠ j, p), and to score the degree of desire with its sign. Here the plus sign indicates the case in which the improvement of f_p by λ_j produced by the unit sacrifice of f_j falls short (i.e., m_j > λ_j), while the minus sign is associated with the opposite case (m_j < λ_j). If he is indifferent about the trade, the score is zero (m_j = λ_j). Figure 7.8 justifies this manner of scoring. These scores are recorded as the surrogate worth W_pj. The same procedure is carried out for all representative efficient solutions obtained at Step 1.

Step 3. An optimal solution x̄ can be obtained as one satisfying

W_pj(f(x̄)) = 0 for all j ≠ p.  (7.3.4)

If there is no such solution among the representative efficient solutions, then we determine the functional forms of W_pj(f) by some appropriate method, such as multiple regression analysis. After that, we can get a most preferable solution by solving Eq. (7.3.4).

7.3.2.2.1 Discussion. We may easily see several difficulties in the SWT method; they include the smoothness of the efficient surface, generating the
[Figure: the efficient curve of f_p versus f_j (upper panel), and the surrogate worth W_pj plotted against f_j, passing through zero at the preferred level (lower panel).]
Fig. 7.8. SWT method.
representative set of efficient solutions, scoring the surrogate worth, and performing the multiple regression and solving the system of nonlinear equations W_pj(f(x)) = 0. We shall discuss these points below.

The smoothness of the efficient surface is guaranteed only in limited cases, for example, when each objective function is smooth and strictly convex. If the efficient surface is not smooth, we instead have the following condition for the preferential optimality of f^0:

    −(m_1, ..., m_{p−1}, 1) ∈ (T(f(X), f^0))°
(see, for example, Nakayama [N3]). In this case, the Lagrange multiplier vector (λ_1, ..., λ_{p−1}, 1) obtained by the auxiliary optimization at step 1 is merely one element of (T(f(X), f^0))°, and hence m_j = λ_j (j = 1, ..., p − 1) cannot be expected in general.

We need to solve an auxiliary nonlinear optimization problem in order to get each efficient solution. In many practical problems this step requires much computational effort, and hence the smallest possible number of representative efficient solutions is desirable. However, if the number is too small, the most preferable solution might not lie in the given representative efficient set. This case, therefore, charges us with the extra effort of multiple regression and of solving the nonlinear equations W_pj = 0.

In the original version of the SWT method, the surrogate worth is required to be only an ordinal measure. However, since ordinal scores do not allow meaningful comparison of the differences among them, the subsequent multiple regression and the nonlinear equations W_pj(f(x)) = 0 become very difficult to handle. In Chankong and Haimes [C6], the surrogate worth is supposed to be scored on a cardinal measure; the consistency of the surrogate worth as a cardinal measure then becomes a new problem.

In cases with only a few objective functions, a solution satisfying W_pj(f(x)) = 0 (j = 1, ..., p − 1) may be found relatively easily within the given representative efficient set. In these lucky cases, we need not invoke multiple regression analysis to get the function forms of W_pj(f), nor solve the system of nonlinear equations W_pj(f) = 0. In general, however (particularly with a large number of objective functions), such a lucky case is hopeless, and the stated extra efforts are needed. The difficulty of these operations has been seen in the preceding discussion of the ordinal surrogate worth. In order to overcome some of these difficulties, several modifications have been developed.
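As a concrete illustration of the auxiliary optimization of step 1, the following sketch generates representative efficient solutions of a two-objective problem by the ε-constraint scalarization, minimizing the reference objective subject to a bound on the other. The instance (f_1, f_2), the grid, and the constraint levels are our illustrative assumptions, not part of the SWT method itself; a grid search stands in for a real nonlinear programming solver.

```python
# Illustrative sketch of step 1 of the SWT method: generate representative
# efficient solutions by the epsilon-constraint scalarization
#     minimize f_p(x)  subject to  f_j(x) <= eps_j.
# The two-objective instance (f1, f2) below is a made-up example.

def f1(x):
    return x ** 2          # first criterion

def f2(x):
    return (x - 2.0) ** 2  # reference criterion f_p

def eps_constraint_min(eps, grid):
    """Minimize f2 over the grid subject to f1(x) <= eps."""
    feasible = [x for x in grid if f1(x) <= eps]
    return min(feasible, key=f2) if feasible else None

grid = [i / 1000.0 for i in range(0, 2001)]   # x in [0, 2]
representatives = []
for eps in (0.25, 1.0, 2.25):                 # chosen constraint levels
    x = eps_constraint_min(eps, grid)
    representatives.append((eps, x, f1(x), f2(x)))

for eps, x, v1, v2 in representatives:
    print(f"eps={eps}: x={x:.3f}, f1={v1:.3f}, f2={v2:.3f}")
```

Each constraint binds at the solution, so every reported point lies on the efficient curve; in a real application the Lagrange multiplier of the binding constraint supplies the trade-off λ_j shown to the decision maker.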
Note that in the SWT method, since the search is restricted to the efficient surface, the independent variable is the (p − 1)-dimensional vector (f_1, ..., f_{p−1}). Let the efficient surface be represented by the equation E(f_1, ..., f_p) = 0. If we have f_p = w(f_1, ..., f_{p−1}) by solving this equation, then the value function restricted to the efficient surface E is

    V̂(f_1, ..., f_{p−1}) := V(f_1, ..., f_{p−1}, w(f_1, ..., f_{p−1})),

so that

    V̂_j = V_j + V_p w_j    (j = 1, ..., p − 1),    (7.3.4)

where V̂_j = ∂V̂/∂f_j, V_j = ∂V/∂f_j, and w_j = ∂w/∂f_j. On the other hand, on the efficient surface E,

    E_j + E_p w_j = 0    (j = 1, ..., p − 1),

where E_j = ∂E/∂f_j. Therefore, making use of the fact that m_j = V_j/V_p and t_j = E_j/E_p, we have, under the monotonicity of the preference,

    sgn V̂_j = sgn(m_j − t_j)    (j = 1, ..., p − 1).    (7.3.5)
In order to decrease the number of representative efficient solutions, Chankong and Haimes [C4] suggested an interactive SWT method on the basis of relation (7.3.5). Instead of the pregiven set of representative efficient solutions, they improve the search point (which is an efficient solution) iteratively using the information of the surrogate worth at the point; namely,
    W_pj(f(x^k)) > 0 (<)  ⟹  ε_j^{k+1} = ε_j^k + Δf_j (−Δf_j)
for sufficiently small Δf_j (j = 1, ..., p − 1). Observe that this method corresponds to the simple ascent method for the reduced value function V̂. If we use the MRS of the decision maker explicitly, the steepest ascent direction of the reduced value function may be obtained through relation (7.3.4). It should be noted that, in general, the simple ascent method has a poorer rate of convergence than the steepest ascent method. Furthermore, the method of selecting the magnitude of Δf_j becomes a problem: if the value of Δf_j is too small, we will get a poor rate of convergence. On the other hand, in this method we need neither invoke multiple regression analysis nor solve the equations W_pj(f(x)) = 0.

According to our experience, graphical information is more understandable for decision makers than merely numerical information. In particular, the efficient curve of f_p versus f_j provides global information on the trade-off for efficiency, and therefore the decision maker can indicate his preferred solution on the displayed curve much more easily than on the basis of the efficiency trade-off ratio alone, which provides only local information on the trade-off for efficiency. Recall that the SWT method searches for a most preferable solution in the space (f_1, ..., f_{p−1}) by solving auxiliary optimization problems to get efficient solutions. We may then think of using the relaxation method (coordinatewise optimization) in that space. Nakayama et al. [N11] suggested the graphical interactive relaxation method based on this consideration, in which the decision maker is requested only to indicate his preferred solution on the displayed efficient curve between a pair of objective functions. The pair of objective functions is changed cyclically while the other objective functions are fixed at the levels of the preferred solution answered so far (see Fig. 7.9).
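The interactive update above can be sketched as a sign-driven ascent on the constraint level. Here the decision maker's sign-only response is simulated by an assumed value function; the problem instance, the value function, and the step size are all illustrative assumptions of ours.

```python
# Illustrative sketch of the interactive SWT iteration: the constraint
# level eps is raised or lowered by a fixed step according to the sign of
# the decision maker's surrogate-worth answer. The answer is simulated
# here by an assumed value function (a stand-in for the real decision
# maker); the instance f1(x) = x**2, f2(x) = (2 - x)**2 is made up.

def pareto_f2(eps):
    """f2-value of the efficient solution of min f2 s.t. f1 <= eps."""
    x = min(eps ** 0.5, 2.0)
    return (2.0 - x) ** 2

def value(f1_level, f2_level):
    """Assumed decision-maker value function (to be maximized)."""
    return -(f1_level + 2.0 * f2_level)

def surrogate_worth(eps, h=1e-6):
    """Simulated sign-only answer: +1 if slightly relaxing the
    constraint level is desirable to the decision maker, else -1."""
    up = value(eps + h, pareto_f2(eps + h))
    down = value(eps - h, pareto_f2(eps - h))
    return 1 if up > down else -1

eps, step = 0.4, 0.01
for _ in range(500):              # simple ascent with a fixed step size
    eps += step * surrogate_worth(eps)

print(f"constraint level settles near eps = {eps:.2f}")
```

With the value function assumed here, the level settles near ε = 16/9, where the efficiency trade-off matches the simulated marginal rate of substitution; halving the step size slows convergence, which is exactly the difficulty noted in the text.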
In methods using graphic displays in an interactive way, the burden on the decision maker is much lighter than in the other methods stated above. However, much computational effort is usually needed to obtain each efficient curve. Moreover, the convergence property is established only for the case of smooth efficient surfaces.
[Figure: efficient curves of f_p versus f_2 and of f_p versus f_1.]
Fig. 7.9. IRM.

7.3.3 Interactive Satisficing Methods
As long as we take the approach of optimization, we cannot help requiring decision makers to make such demanding judgments as assessing marginal rates of substitution or ordering vectors. In order to decrease the burden on decision makers, interactive programming methods that take satisficing as the decision rule have been developed. We have seen that goal programming may be used in this way: the solution is simply a satisfactory one if the aspiration level is feasible, or one as close as possible to the aspiration levels if it is not.
Note here that satisficing was originally suggested from the observation of the limits of human cognitive ability and of the information available in decision making (Simon [S9]). In our problem formulation, however, we are supposed to be able to evaluate the values of the criteria functions by some means, although we also recognize the limits of human cognitive ability. For such problems, mere satisficing is not necessarily satisfactory to the decision maker as a decision rule. For example, consider a case in which a decision maker ordered two designers to design some industrial product. At almost the same time and at almost the same expense, the two designers completed their designs, both of which were satisfactory to the decision maker. However, one of the designs is superior to the other in terms of all criteria. In this case, there seems no doubt that the decision maker should adopt the superior one. This example leads us to use a solution x̄ which is satisfactory and which, in addition, guarantees that there is no other feasible solution superior to x̄ in terms of all criteria.

Definition 7.3.2 (Satisfactory Pareto Solution)  A solution that is satisfactory and (weakly) Pareto efficient is called a satisfactory (resp. weak) Pareto solution. In other words, when f̄ is an aspiration level of the decision maker, an alternative x̄ is said to be a satisfactory Pareto solution if

    f(x) ≰ f(x̄)  for all x ∈ X,  and  f(x̄) ≦ f̄.
A satisfactory weak Pareto solution can be defined analogously by replacing ≰ with ≮ in the first inequality. In general, satisfactory (weak) Pareto solutions constitute a subset of X. However, we can narrow down the set of satisfactory Pareto solutions and obtain a solution close to the preferentially optimal one by tightening the aspiration level, if necessary. On the other hand, even if there does not exist any satisfactory Pareto solution for the initial aspiration level, we can attain one by relaxing the aspiration level. The aspiration level is inherently fuzzy and flexible, and decision makers seem to change their aspiration levels according to circumstances. From this point of view, interactive satisficing methods provide a tool that supports decision makers in finding an appropriate satisfactory (weak) Pareto solution by adjustment of the aspiration level. Usually the aspiration level is changed in such a way that

    f̄^{k+1} = T ∘ P(f̄^k),
where f̄^k represents the aspiration level at the kth iteration. The operator P selects the Pareto solution nearest in some sense to the given aspiration level f̄^k. The operator T is the trade-off operator, which changes the kth aspiration level f̄^k if the decision maker does not compromise with the shown solution P(f̄^k). Of course, since P(f̄^k) is a Pareto solution, there exists no feasible solution that makes all criteria better than P(f̄^k), and the decision maker must trade off among the criteria if he wants to improve some of them. Based on this trade-off, a new aspiration level is decided as T ∘ P(f̄^k). The process is continued until the decision maker obtains an agreeable solution.
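The iteration f̄^{k+1} = T ∘ P(f̄^k) can be sketched as follows. Here P is realized by a weighted Tchebyshev (Min-Max) selection over a finite set of Pareto outcomes, and T by a fixed stand-in rule for the decision maker's trade-off; the outcome set, the ideal point, and the trade-off rule are all illustrative assumptions of ours.

```python
# Skeleton of an interactive satisficing iteration fbar^{k+1} = T(P(fbar^k)):
# P maps an aspiration level to the nearest Pareto outcome in the weighted
# Tchebyshev sense, and T is a stand-in for the decision maker's trade-off
# (relax criterion 1, demand more of criterion 2). All data are made up.

F_STAR = (0.0, 0.0)                                  # ideal point
CANDIDATES = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]    # Pareto outcomes in f(X)

def P(aspiration):
    """Nearest Pareto outcome in the weighted Tchebyshev sense,
    with weights mu_i = 1 / (aspiration_i - ideal_i)."""
    mu = [1.0 / (a - s) for a, s in zip(aspiration, F_STAR)]
    return min(CANDIDATES,
               key=lambda f: max(m * (fi - s)
                                 for m, fi, s in zip(mu, f, F_STAR)))

def T(outcome):
    """Stand-in trade-off operator: relax criterion 1 by 2.0 and
    ask for an improvement of 0.5 in criterion 2."""
    return (outcome[0] + 2.0, outcome[1] - 0.5)

aspiration = (1.5, 1.5)
shown = P(aspiration)
for _ in range(2):                 # two interactive rounds
    aspiration = T(shown)          # decision maker adjusts the aspiration
    shown = P(aspiration)          # analyst shows the nearest Pareto point
print(shown)
```

Each round moves the displayed Pareto outcome in the direction the simulated trade-off asks for; a real implementation would replace the candidate enumeration by a Min-Max optimization over X.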
7.3.3.1 STEM
STEM (Step Method) by Benayoun et al. [B6] was originally developed for solving linear multiobjective programming problems, and its extension to general nonlinear cases is straightforward (Shimizu [S8]). The following discussion is given for nonlinear multiobjective problems.

Let f* be an ideal point, and let us try to find a solution as close as possible to this ideal point. Although several distance functions are available, as in goal programming, the l∞-norm is usually recommended:

    Min_{x∈X} Max_{1≤i≤p} μ_i|f_i(x) − f_i*|
    subject to  f_i(x) − f_i* ≤ Δf_i    (i = 1, ..., p),

where Δf_i is an admissible deviation from the ideal level f_i*. Equivalently,

    Min ξ
    subject to  μ_i(f_i(x) − f_i*) ≤ ξ    (i = 1, ..., p),
                f_i(x) − f_i* ≤ Δf_i    (i = 1, ..., p),
                x ∈ X.
Note here that the aspiration level is given by f̄_i = f_i* + Δf_i. The solution to these problems is clearly a satisfactory weak Pareto solution if the aspiration level is feasible, namely, if f̄ ∈ f(X). However, taking into account the total balance among the levels of the objective functions at the obtained solution, decision makers are sometimes dissatisfied with some of them. More importantly, the aspiration level may happen to be infeasible. Under such a circumstance, some of the aspiration levels should be relaxed in order to get a satisfactory solution. In other words, the decision maker is now faced with a trade-off problem.
On the basis of the above considerations, STEM and its modifications provide a tool for getting a reasonable solution by changing the aspiration level in an interactive way.

Step 0. Set k = 1, and determine the ideal point f* and the weights μ_i in some appropriate way (see Remark 7.3.4 later). Find a solution x^k to the following problem:

    Min ξ
    subject to  μ_i(f_i(x) − f_i*) ≤ ξ    (i = 1, ..., p),
                x ∈ X.

Step 1. Seeing f(x^k) obtained above, the decision maker classifies the objective functions into two classes: those with satisfactory levels (index set S) and those with unsatisfactory levels (index set U). If there is no unsatisfactory f_i, the iteration terminates. Otherwise the decision maker relaxes the levels of the satisfactory objective functions, the amounts of relaxation being Δf_i.

Step 2. Solve

    Min ξ
    subject to  μ_i(f_i(x) − f_i*) ≤ ξ,    i ∈ U,
                f_i(x) − f_i* ≤ Δf_i,    i ∈ S,
                x ∈ X.

Set the solution to be x^{k+1} and return to step 1.
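The steps above can be sketched on a finite set of outcomes standing in for f(X); the classification of objectives into satisfactory and unsatisfactory ones (step 1) is simulated by a fixed choice, and all instance data are illustrative assumptions of ours.

```python
# Illustrative sketch of the STEM iteration over a finite set of outcomes
# standing in for f(X). The decision maker's classification of objectives
# into satisfactory (relaxed) and unsatisfactory ones is simulated by a
# fixed choice; all numbers are made up.

OUTCOMES = [(1.0, 5.0), (2.5, 2.5), (5.0, 1.0)]            # f(x), x in X
F_STAR = [min(f[i] for f in OUTCOMES) for i in range(2)]   # ideal point
MU = [1.0, 1.0]                        # common scale assumed for brevity

def step0(outcomes):
    """Min-Max problem: minimize max_i mu_i * (f_i - f_i*)."""
    return min(outcomes,
               key=lambda f: max(m * (fi - s)
                                 for m, fi, s in zip(MU, f, F_STAR)))

def step2(outcomes, relax):
    """Re-solve: objectives i in S are only constrained by
    f_i - f_i* <= delta_i; those in U stay in the Min-Max part."""
    feasible = [f for f in outcomes
                if all(f[i] - F_STAR[i] <= d for i, d in relax.items())]
    U = [i for i in range(len(MU)) if i not in relax]
    return min(feasible,
               key=lambda f: max(MU[i] * (f[i] - F_STAR[i]) for i in U))

first = step0(OUTCOMES)
# Step 1 (simulated): objective 0 is satisfactory; relax it by delta = 4.5.
second = step2(OUTCOMES, relax={0: 4.5})
print(first, second)
```

Relaxing the satisfied first objective lets the Min-Max step spend the slack on the unsatisfied second one, moving the shown outcome from (2.5, 2.5) to (5.0, 1.0).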
Remark 7.3.4  In using the l∞-type distance function, it is very important to set up a common scale for the objective functions. For example, if positive and negative values of the objective functions are mixed, or if their numerical orders of magnitude differ greatly from each other, then in the Min-Max problem some of the objective functions are sensitive and others are not. Note that the origin is set at the ideal point f*, and the weight μ_i plays the role of the scale factor. In STEM the ideal point f* is set by

    f_i* = Min_{x∈X} f_i(x) =: f_i(x_i*),

and the weight μ_i is given by normalizing

    w_i = ((f_i^max − f_i*) / f̃_i) (Σ_{j=1}^n c_ij²)^{−1/2},
[Figure: the payoff matrix; row k lists f_1(x_k*), f_2(x_k*), ..., f_p(x_k*), the values of all objectives at the minimizer x_k* of f_k, so that the diagonal entries are the ideal values f_k*.]
Fig. 7.10. Payoff matrix.
where c_ij is the coefficient of the ith objective function f_i(x) = Σ_j c_ij x_j, f_i^max is the maximum component of the ith row vector in the so-called payoff matrix given in Fig. 7.10, and f̃_i = f_i^max if f_i^max > 0; otherwise f̃_i = |f_i^min|.

7.3.3.1.1 Discussion. Intuitively speaking, the stated method relaxes the objective functions that attain their aspiration levels at an early stage and distributes the benefit produced by their sacrifice equally among the other objective functions; the process is continued until all objective functions become satisfactory. However, it is not made clear just how much sacrifice of some objective functions yields how much improvement in the others; this is left to the decision maker's guess, aided by the payoff matrix. Additionally, in many practical cases the decision maker desires to improve some of the objective functions much more strongly than he is willing to sacrifice others. Therefore, it seems more natural to ask decision makers how much they want to improve unsatisfactory objectives as well as how much they agree to sacrifice satisfactory ones. From this point of view, Wierzbicki [W13, W14] and Lewandowski and Grauer [L5] suggested a method called the reference point method, or dynamic interactive decision analysis support systems (DIDASS) (see also Grauer et al. [G9]). Independently of their work, Nakayama [N4, N8] developed a similar method called the satisficing trade-off method, which will be discussed in more detail later.
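For a small linear bi-objective instance (our own, for illustration), the payoff matrix of Fig. 7.10 and the STEM weights can be computed as follows. Here f_i^max is taken as the largest value of objective i appearing anywhere in the payoff matrix — the usual STEM convention — and the f̃_i rule follows the description in the text.

```python
# Illustrative computation of the STEM payoff matrix and weights for a
# small linear bi-objective instance f_i(x) = sum_j c[i][j] * x[j] over a
# finite set of feasible points (standing in for the LP vertices).

C = [[1.0, 2.0],        # coefficients of f_1
     [3.0, -1.0]]       # coefficients of f_2

X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # feasible points

def f(i, x):
    return sum(cij * xj for cij, xj in zip(C[i], x))

# Minimizer of each objective and the payoff matrix: row k lists the
# values of every objective at the minimizer x_k* of f_k.
minimizers = [min(X, key=lambda x: f(i, x)) for i in range(2)]
payoff = [[f(i, xk) for i in range(2)] for xk in minimizers]

f_star = [payoff[i][i] for i in range(2)]             # ideal values f_i*
f_max = [max(payoff[k][i] for k in range(2)) for i in range(2)]
f_tilde = [fm if fm > 0 else abs(min(payoff[k][i] for k in range(2)))
           for i, fm in enumerate(f_max)]

w = [((f_max[i] - f_star[i]) / f_tilde[i])
     * (sum(c * c for c in C[i]) ** 0.5) ** -1
     for i in range(2)]
mu = [wi / sum(w) for wi in w]                        # normalized weights

print("payoff =", payoff)
print("mu =", [round(m, 3) for m in mu])
```

Since the payoff matrix depends only on the problem data, it can indeed be computed once, before the interactive process starts, as Remark 7.3.5 below suggests.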
[Figure: the efficient curve and the ideal point f*, with the Min-Max solutions corresponding to different aspiration levels within a band of aspiration levels.]
Fig. 7.11. Satisficing trade-off method.
7.3.3.2 Satisficing Trade-Off Method
First, we note several important properties of the Min-Max problem (Fig. 7.11):

    (P∞)    Min_{x∈X} Max_{1≤i≤p} μ_i(f_i(x) − f_i*).
Lemma 7.3.3
For f_i* < Min{f_i(x): x ∈ X} (i = 1, ..., p), each Pareto solution can be obtained by solving the Min-Max problem (P∞) with appropriate positive weights μ_i (i = 1, ..., p).

Proof. Immediate from Theorem 3.4.9.

Lemma 7.3.4
Suppose that for any x ∈ X, f_i(x) > f_i* (i = 1, ..., p). If

    μ_i = 1/(f̄_i − f_i*)    (i = 1, ..., p),    (7.3.7)
then the solution x̂ to the Min-Max problem (P∞) is a satisfactory weak Pareto solution whenever the aspiration level f̄ is feasible.

(1) The ideal point f* should be chosen so that f_i(x) > f_i* for all x ∈ X. Otherwise, set f_i* to be sufficiently small.

(2) The weights μ_i (i = 1, ..., p) are automatically set by the ideal point f* and the aspiration level f̄. With the weight in Eq. (7.3.13), the value of μ_i(f_i* − f_i(x)) can be considered to represent the normalized degree of nonattainability of f_i(x) with respect to the ideal point f_i*. This relieves us of the need to pay extra attention to differences among the dimensions and numerical orders of the criteria.

(3) By solving the Min-Max problem under the preceding conditions, we get a satisfactory weak Pareto solution in the case in which the aspiration level f̄ is feasible, and a weak Pareto solution even in the case in which f̄ is infeasible. Interpreting this intuitively, when f̄ is feasible the obtained satisfactory weak Pareto solution is one that improves each criterion equally, in some sense, as much as possible; when f̄ is infeasible we get a weak Pareto solution nearest to the ideal point, sharing an equal amount of normalized sacrifice among the criteria. This practical
meaning encourages the decision maker to accept the obtained solution easily.

(4) At the stage of trade-off, the new aspiration level is set in such a way that

    f̄_i^{k+1} < f_i(x^k)    for any i ∈ I_I^k,
    f̄_i^{k+1} > f_i(x^k)    for any i ∈ I_R^k,

where I_I^k and I_R^k denote the index sets of the criteria to be improved and to be relaxed, respectively.
In this event, if f̄^{k+1} is feasible, then in view of Lemma 7.3.4 we obtain a satisfactory weak Pareto solution by solving the Min-Max problem. In setting the new aspiration level, therefore, it is desirable to take care that f̄^{k+1} becomes feasible.

(5) In order that the new aspiration level f̄^{k+1} may be feasible, the criteria f_i (i ∈ I_R^k) should be relaxed enough to compensate for the improvement of f_i(x^k) (i ∈ I_I^k). To make this trade-off succeed without solving a new Min-Max problem, we had better make use of sensitivity analysis based on Lemma 7.3.5. Since we already know that x^k is a Pareto solution, the feasibility of x^{k+1} can be checked by relation (7.3.14). The extra term there is introduced to make relation (7.3.14) available for nonconvex cases, in which relation (7.3.12) does not necessarily hold. Moreover, observe in view of Eq. (7.3.11) that

    λ_i μ_i    (i = 1, ..., p)    (7.3.15)

represents the efficient trade-off, which reflects the mutual effect of a change of each criterion restricted to the Pareto surface. Based on this information on the efficient trade-off, the decision maker can easily judge how much the criteria f_i for i ∈ I_I^k should be improved and how much the criteria f_i for i ∈ I_R^k should be relaxed. In particular, since a small relaxation of an f_j with λ_j = 0 cannot compensate for the improvement of f_i (i ∈ I_I^k) under the condition that f̄^{k+1} should be on the Pareto surface, we must relax at least one f_j with λ_j ≠ 0, or give a sufficiently large relaxation to an f_j with λ_j = 0.
Remark 7.3.4
The stated feasibility check is just for the purpose of decreasing the number of Min-Max problems that must be solved, and need not be performed strictly. In some cases several trials of trade-off fail to satisfy relation (7.3.14), and the decision maker tends to become tired of trading off again and again. In such circumstances we had better proceed to solving the Min-Max problem immediately, even if relation (7.3.14) does not hold. On the other hand, in cases that require a great deal of effort to solve Min-Max problems, a few retrials of the trade-off cannot be helped.
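The core computation of the method — the Min-Max problem (P∞) with the automatic weights μ_i = 1/(f̄_i − f_i*) of Eq. (7.3.7) — can be sketched over a finite outcome set standing in for f(X); the outcome set, the ideal point, and the aspiration levels below are illustrative assumptions of ours.

```python
# Sketch of the Min-Max problem (P_inf) of the satisficing trade-off
# method: the weights are set automatically from the ideal point f* and
# the current aspiration level, mu_i = 1/(fbar_i - f_i*), as in (7.3.7).
# The candidate outcome set is an illustrative stand-in for f(X).

OUTCOMES = [(1.0, 6.0), (2.0, 3.0), (4.0, 2.0), (6.0, 1.0)]
F_STAR = (0.0, 0.0)                      # ideal point, below all outcomes

def min_max_solution(aspiration):
    mu = [1.0 / (a - s) for a, s in zip(aspiration, F_STAR)]
    return min(OUTCOMES,
               key=lambda f: max(m * (fi - s)
                                 for m, fi, s in zip(mu, f, F_STAR)))

# A demanding aspiration on the first criterion yields a solution with a
# small f_1; relaxing it and tightening the second shifts the solution.
left = min_max_solution((2.0, 5.0))
right = min_max_solution((5.0, 2.0))
print(left, right)
```

No manual scaling of the two criteria is needed: the aspiration level itself supplies the weights, which is remark (2)'s point above.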
Remark 7.3.5
In some cases, decision makers want to know the global features of the feasible set while trading off. To this end, it is convenient to show the so-called payoff matrix, obtained by minimizing each objective function independently, which was introduced in STEM. Of course, since the payoff matrix does not change during the interactive process, systems analysts had better prepare it in advance of the decision process.

Remark 7.3.6
It may be possible to change the ideal point f* while the aspiration level f̄ is kept fixed during the decision process. This can be considered a dual approach to the satisficing trade-off method. However, if the initial aspiration level happens to be a Pareto solution, then we cannot get another Pareto solution by solving the Min-Max problem as long as the aspiration level is fixed. Hence, when the decision maker wants another solution, he must change the aspiration level. Taking such a case into account, the approach of changing the aspiration level seems better.
7.3.3.2.1 Discussion. As we have seen earlier, the solution obtained by the satisficing trade-off method, as well as by STEM, is not necessarily a Pareto solution but is guaranteed only to be a weak Pareto solution. For a weak Pareto solution there may exist another solution that improves some criteria while leaving the others unchanged, and hence weak Pareto solutions seem inadequate as decision-making solutions. We can get a Pareto solution from a weak Pareto solution as follows:
    (AP)    Maximize  Σ_{i=1}^p ε_i
            subject to  f_i(x) + ε_i = f_i(x̂)    (i = 1, ..., p),
                        ε_i ≥ 0    (i = 1, ..., p),
                        x ∈ X,
where x̂ is a (weak) Pareto solution. If all ε_i for the solution to (AP) are zero, then x̂ itself is a Pareto solution. If some ε_i ≠ 0, then the solution x to problem (AP) is a Pareto solution. However, in some practical problems it is very expensive to perform such an auxiliary optimization. In order to avoid such an additional scalar optimization problem, some researchers have suggested the use of certain kinds of
augmented norms (Dinkelbach and Isermann [D6]; Steuer and Choo [S14]). One typical example is the augmented weighted Tchebyshev norm

    S_α(f(x)) = Max_{1≤i≤p} μ_i(f_i(x) − f_i*) + α Σ_{i=1}^p μ_i(f_i(x) − f_i*).    (7.3.16)

Recall that, in general, in interactive satisficing methods the aspiration level is changed by f̄^{k+1} = T ∘ P(f̄^k), where P is an operator that finds the Pareto solution nearest in some sense to the aspiration level f̄^k. This can be done by using some scalarization function, and the sense of "nearest" depends strongly on what kind of scalarization function we use. We impose the following requirements on the scalarization functions:

1. They can cover all Pareto solutions.
2. Solutions to the auxiliary scalar optimization should be Pareto solutions.
3. Solutions to the auxiliary scalar optimization should be satisfactory if the aspiration level is feasible.

Since our decision-making solution is a satisfactory Pareto solution, the scalarization function should have the property that its maximization or minimization with appropriately chosen parameters can produce any Pareto solution, in order to be applicable no matter what the problem may be and no matter what the preference of the decision maker may be. Furthermore, the solution to the auxiliary scalar optimization should be nothing but a Pareto solution, because our aim is to find a satisfactory Pareto solution. Finally, if the aspiration level is feasible, then there exists a satisfactory Pareto solution; it would therefore make no sense to show the decision maker a nonsatisfactory solution when the aspiration level is feasible. These observations make the preceding three requirements reasonable.

Unfortunately, it should be noted that no scalarization function satisfies all of these requirements. It is well known that the l₁-norm cannot necessarily yield every Pareto solution if the problem is nonconvex. Even if we use an l_α-norm (1 < α < ∞), we cannot always get all Pareto solutions, depending on the degree of nonconvexity of the problem. In Fig. 7.12, we can see that the only norm (scalarization function) that can yield any Pareto solution in any problem is the weighted Tchebyshev norm. However, the weighted Tchebyshev norm violates requirement 2: it produces not only Pareto solutions but also weak Pareto solutions. For any 0 < α < ∞, the solution obtained by minimizing S_α is guaranteed to be a Pareto solution; however, it can easily be seen that the augmented norm S_α violates requirements 1 and 3 (Fig.
[Figure: level sets of the weighted Tchebyshev norm about the ideal point f* in the (f_1, f_2) plane.]
Fig. 7.12.
7.12). Recall here that the weighted Tchebyshev norm satisfies requirement 3 (Lemma 7.3.4).

7.3.4 Interactive Programming Methods as Man-Machine Interfaces
It should be noted that interactive programming methods have the character of a man-machine interface. It is therefore very important, in developing these interactive methods, to make the best use of the strong points of both man and machine. Machines (computers) are strong at routine, iterative computation and can treat large-scale and complex computations at high speed. On the other hand, men are good at global (though possibly rough) judgment, pattern recognition, flair, and learning. With these points in mind, we impose the following properties on desirable interactive multiobjective programming methods:
1. (easy) The way of trading-off is easy. In other words, decision makers can easily grasp the total balance among the objectives. 2. (simple) The judgment and operation required of decision makers is as simple as possible. 3. (understandable) The information shown to decision makers is as intuitive and understandable as possible.
4. (quick response) The response of the computer is as quick as possible.
5. (rapid convergence) The convergence to the final solution is rapid.
6. (explanatory) Decision makers can easily accept the obtained solution; in other words, they can understand why it is so and where it came from.
7. (learning effect) Decision makers can learn through the interaction process.

Interactive programming methods seem promising, in particular for design problems. In applying optimization techniques, however, we often encounter difficulties. For example, in structural design problems such as bridges, the function forms of some of the criteria cannot be obtained explicitly, and their values are usually obtained by complex structural analysis. Similarly, the values of the criteria in the design of camera lenses are obtained by ray-trace simulation; moreover, the number of criteria is sometimes over 100. From such a practical viewpoint, many existing interactive optimization methods require too many auxiliary optimizations during the whole interaction process. Moreover, some of them demand judgments of too high a degree from decision makers, such as the marginal rate of substitution, which seems to be beyond human ability. For the purpose of getting a solution, interactive satisficing methods seem advantageous, because they are usually simple and easy to carry out. However, the fact that we can obtain a solution with less information on the decision maker's preference is in some cases a demerit as well as a merit: in many decision problems there are several persons concerned, and hence it is important to achieve better mutual understanding and communication among them. To this end, methods that elicit the preference information of these people seem rather advantageous. This point is especially prominent for problems in the public sector. Since the real world is complex, it is not easy to conclude which method is the best.
Many practical applications should encourage us to refine the methodology further in the future.
REFERENCES
A1. Allais, M., and Hagen, O. (eds.), "Expected Utility Hypotheses and the Allais Paradox," Reidel, Dordrecht, Holland, 1979.
A2. Arrow, K. J., "Social Choice and Individual Values," Wiley, New York, 1957.
A3. Arrow, K. J., "Essays in the Theory of Risk-Bearing," Markham, Chicago, 1971.
A4. Arrow, K. J., Barankin, E. W., and Blackwell, D., Admissible points of convex sets, in "Contributions to the Theory of Games," H. W. Kuhn and A. W. Tucker (eds.), Vol. 2, Princeton Univ. Press, Princeton, New Jersey, 1953.
A5. Arrow, K. J., Hurwicz, L., and Uzawa, H., "Studies in Linear and Non-Linear Programming," Stanford Univ. Press, Stanford, California, 1958.
A6. Atkinson, A. B., On the measurement of inequality, J. Econom. Theory 2, 244-263 (1970).
A7. Aubin, J. P., "Mathematical Methods of Game and Economic Theory," North-Holland Publ., Amsterdam, 1979.
B1. Barbu, V., and Precupanu, T., "Convexity and Optimization in Banach Spaces," Sijthoff and Noordhoff, Alphen aan den Rijn, 1978.
B2. Baumol, W. H., The cardinal utility which is ordinal, Econom. J. 68, 665-672 (1958).
B3. Bazaraa, M. S., and Shetty, C. M., "Foundations of Optimization," Springer-Verlag, Berlin and New York, 1976.
B4. Bell, D. E., Multiattribute utility functions: decompositions using interpolation, Management Sci. 25, 744-753 (1979).
B5. Bell, D. E., Keeney, R. L., and Raiffa, H. (eds.), "Conflicting Objectives in Decisions," Wiley, New York, 1976.
B6. Benayoun, R., de Mongolfier, J., Tergny, J., and Larichev, O., Linear programming with multiple objective functions: STEP Method (STEM), Math. Programming 1, 366-375 (1971).
B7. Benson, H. P., An improved definition of proper efficiency for vector minimization with respect to cones, J. Math. Anal. Appl. 71, 232-241 (1979).
B8. Benson, H. P., On a domination property for vector maximization with respect to cones, J. Optim. Theory Appl. 39, 125-132 (1983); Errata Corrige, J. Optim. Theory Appl. 43, 477-479 (1984).
B9. Benson, H. P., Efficiency and proper efficiency in vector maximization with respect to cones, J. Math. Anal. Appl. 93, 273-289 (1983).
B10. Benson, H. P., and Morin, T. L., The vector maximization problem: proper efficiency and stability, SIAM J. Appl. Math. 32, 64-72 (1977).
B11. Berge, C., "Topological Spaces," MacMillan, New York, 1963.
B12. Berge, C., "Graphs and Hypergraphs," North-Holland Publ., Amsterdam, 1973.
B13. Bitran, G. R., Duality for nonlinear multiple-criteria optimization problems, J. Optim. Theory Appl. 35, 367-401 (1981).
B14. Bitran, G. R., and Magnanti, T. L., The structure of admissible points with respect to cone dominance, J. Optim. Theory Appl. 29, 573-614 (1979).
B15. Bodily, S., Stability, equality, balance and multivariate risk, in "Organizations: Multiple Agents with Multiple Criteria," J. N. Morse (ed.), Springer-Verlag, Berlin and New York, 1980.
B16. Borwein, J. M., Proper efficient points for maximizations with respect to cones, SIAM J. Control Optim. 15, 57-63 (1977).
B17. Borwein, J. M., The geometry of Pareto efficiency over cones, Math. Operationsforsch. und Statist. Ser. Optimization 11, 235-248 (1980).
B18. Borwein, J. M., On the existence of Pareto efficient points, Math. Oper. Res. 8, 64-73 (1983).
B19. Borwein, J. M., and Nieuwenhuis, J. W., Two kinds of normality in vector optimization, Math. Programming 28, 185-191 (1984).
B20. Bowman, V. J., Jr., On the relationship of the Tchebycheff norm and efficient frontiers of multiple-criteria objectives, in "Multiple Criteria Decision Making," H. Thiriez and S. Zionts (eds.), Springer-Verlag, Berlin and New York, 1976.
B21. Brumelle, S., Duality for multiple objective convex programs, Math. Oper. Res. 6, 159-172 (1981).
C1. Cesari, L., and Suryanarayana, M. B., Existence theorems for Pareto optimization in Banach spaces, Bull. Amer. Math. Soc. 82, 306-308 (1976).
C2. Cesari, L., and Suryanarayana, M. B., An existence theorem for Pareto problems, Nonlinear Anal. 2, 225-233 (1978).
C3. Cesari, L., and Suryanarayana, M. B., Existence theorems for Pareto optimization: multivalued and Banach space valued functionals, Trans. Amer. Math. Soc. 244, 37-65 (1978).
C4. Chankong, V., and Haimes, Y. Y., "The Interactive Surrogate Worth Trade-Off (ISWT) for Multiobjective Decision-Making," Springer-Verlag, Berlin and New York, 1978.
C5. Chankong, V., and Haimes, Y. Y., On the characterization of noninferior solutions of the vector optimization problem, Automatica-J. IFAC 18, 697-707 (1982).
C6. Chankong, V., and Haimes, Y. Y., "Multiobjective Decision Making: Theory and Methodology," North-Holland Publ., Amsterdam, 1983.
C7. Charnes, A., and Cooper, W., "Management Models and Industrial Applications of Linear Programming, Vol. I," Wiley, New York, 1961.
C8. Charnes, A., and Cooper, W. W., Goal programming and multiple objective optimization, J. Oper. Res. Soc. 1, 39-54 (1977).
C9. Chipman, J. S., Consumption theory without transitive indifference, in "Preferences, Utility, and Demand," J. S. Chipman, L. Hurwicz, M. J. Richter, and H. F. Sonnenschein (eds.), Harcourt Brace Jovanovich, New York, 1971.
REFERENCES
C10. Choo, E. D., and Atkins, D. R., Proper efficiency in nonconvex multicriteria programming, Math. Oper. Res. 8, 467-470 (1983).
C11. Clarke, F. H., "Optimization and Nonsmooth Analysis," Wiley, New York, 1983.
C12. Cochrane, J. L., and Zeleny, M., (eds.), "Multiple Criteria Decision Making," Univ. of South Carolina Press, Columbia, South Carolina, 1973.
C13. Cohon, J. L., "Multiobjective Programming and Planning," Academic Press, New York, 1978.
C14. Cohon, J. L., and Marks, D. H., A review and evaluation of multiobjective programming techniques, Water Resources Res. 11, 208-220 (1975).
C15. Corley, H. W., An existence result for maximizations with respect to cones, J. Optim. Theory Appl. 31, 277-281 (1980).
C16. Corley, H. W., Duality for maximizations with respect to cones, J. Math. Anal. Appl. 84, 560-568 (1981).
C17. Craven, B. D., Strong vector minimization and duality, Z. Angew. Math. Mech. 60, 1-5 (1980).
D1. Da Cunha, N. O., and Polak, E., Constrained minimization under vector-valued criteria in finite-dimensional spaces, J. Math. Anal. Appl. 19, 103-124 (1967).
D2. Dantzig, G. B., Folkman, J., and Shapiro, N., On the continuity of the minimum set of a continuous function, J. Math. Anal. Appl. 17, 519-548 (1967).
D3. Debreu, G., "Theory of Value," Wiley, New York, 1959.
D4. Delahaye, J. P., and Denel, J., The continuities of the point-to-set maps, definitions and equivalences, Math. Programming Stud. 10, 8-12 (1979).
D5. Diamond, P. A., Cardinal welfare, individualistic ethics, and interpersonal comparison of utility: comment, J. Polit. Econom. 75, 765-766 (1967).
D6. Dinkelbach, W., and Isermann, H., On decision making under multiple criteria and under incomplete information, in "Multiple Criteria Decision Making," J. L. Cochrane and M. Zeleny, (eds.), Univ. of South Carolina Press, Columbia, South Carolina, 1973.
D7. Dyer, J. S., Interactive goal programming, Management Sci. 19, 62-70 (1972).
D8. Dyer, J. S., The effect of errors in the estimation of the gradient on the Frank-Wolfe algorithm, with implications for interactive programming, Oper. Res. 22, 160-174 (1974).
D9. Dyer, J. S., and Sarin, R. K., Group preference aggregation rules based on strength of preference, Management Sci. 25, 822-831 (1979).
D10. Dyer, J. S., and Sarin, R. K., Measurable multiattribute value functions, Oper. Res. 27, 810-822 (1979).
E1. Ekeland, I., and Temam, R., "Convex Analysis and Variational Problems," North-Holland Publ., Amsterdam, 1976.
E2. Evans, J. P., and Gould, F. J., Stability in nonlinear programming, Oper. Res. 18, 107-118 (1970).
F1. Fan, K., On systems of linear inequalities, in "Linear Inequalities and Related Systems," H. W. Kuhn and A. W. Tucker, (eds.), Princeton Univ. Press, Princeton, New Jersey, 1956.
F2. Fandel, G., and Gal, T., (eds.), "Multiple Criteria Decision Making Theory and Applications, Proceedings, Bonn, 1979," Springer-Verlag, Berlin and New York, 1979.
F3. Farquhar, P. H., Research directions in multiattribute utility analysis, in "Essays and Surveys: Multiple Criteria Decision Making," P. Hansen, (ed.), Springer Publ., New York, 1983.
F4. Fiacco, A. V., "Introduction to Sensitivity and Stability Analysis in Nonlinear Programming," Academic Press, New York, 1983.
F5. Fishburn, P. C., "Decision and Value Theory," Wiley, New York, 1964.
F6. Fishburn, P. C., Methods of estimating additive utilities, Management Sci. 13, 435-453 (1967).
F7. Fishburn, P. C., "Utility Theory for Decision Making," Wiley, New York, 1970.
F8. Fishburn, P. C., "The Theory of Social Choice," Princeton Univ. Press, Princeton, New Jersey, 1973.
F9. Fishburn, P. C., Bernoullian utilities for multiple-factor situations, in "Multiple Criteria Decision Making," J. L. Cochrane and M. Zeleny, (eds.), Univ. of South Carolina Press, Columbia, South Carolina, 1973.
F10. Fishburn, P. C., Nontransitive measurable utility, J. Mathematical Psychology 26, 31-67 (1982).
F11. Fishburn, P. C., Transitive measurable utility, J. Econom. Theory 31, 293-317 (1983).
F12. Fishburn, P. C., and Keeney, R. L., Generalized utility independence and some implications, Oper. Res. 23, 928-940 (1975).
F13. Fishburn, P. C., and Vickson, R. G., Theoretical foundations of stochastic dominance, in "Stochastic Dominance," G. A. Whitmore and M. C. Findlay, (eds.), Lexington Books, Lexington, Massachusetts, 1978.
F14. French, S., Hartley, R., Thomas, L. C., and White, D. J., "Multi-Objective Decision Making," Academic Press, London, 1983.
G1. Gale, D., "The Theory of Linear Economic Models," McGraw-Hill, New York, 1960.
G2. Gale, D., Kuhn, H. W., and Tucker, A. W., Linear programming and the theory of games, in "Activity Analysis of Production and Allocation," T. C. Koopmans, (ed.), Wiley, New York, 1951.
G3. Gearhart, W. B., Compromise solutions and estimation of the noninferior set, J. Optim. Theory Appl. 28, 29-47 (1979).
G4. Gearhart, W. B., Characterization of properly efficient solutions by generalized scalarization methods, J. Optim. Theory Appl. 41, 491-502 (1983).
G5. Geoffrion, A. M., Proper efficiency and the theory of vector maximization, J. Math. Anal. Appl. 22, 618-630 (1968).
G6. Geoffrion, A. M., Duality in nonlinear programming: a simplified applications-oriented development, SIAM Rev. 13, 1-37 (1971).
G7. Geoffrion, A. M., Dyer, J. S., and Feinberg, A., An interactive approach for multicriterion optimization, with an application to the operation of an academic department, Management Sci. 19, 357-368 (1972).
G8. Goicoechea, A. G., Hansen, D. R., and Duckstein, L., "Multiobjective Decision Analysis with Engineering and Business Applications," Wiley, New York, 1982.
G9. Grauer, M., Lewandowski, A., and Wierzbicki, A. P., DIDASS-theory, implementation and experiences, in [G10].
G10. Grauer, M., and Wierzbicki, A. P., (eds.), "Interactive Decision Analysis: Proceedings of an International Workshop on Interactive Decision Analysis and Interpretative Computer Intelligence, IIASA, Laxenburg, Austria, 1983," Springer-Verlag, Berlin and New York, 1984.
G11. Gros, C., Generalization of Fenchel's duality theorem for convex vector optimization, European J. Oper. Res. 2, 368-376 (1978).
G12. di Guglielmo, F., Nonconvex duality in multiobjective optimization, Math. Oper. Res. 2, 285-291 (1977).
H1. Hadar, J., and Russell, W. R., Stochastic dominance and diversification, J. Econom. Theory 3, 288-305 (1971).
H2. Haimes, Y. Y., Hall, W. A., and Freedman, H. B., "Multiobjective Optimization in Water Resources Systems: The Surrogate Worth Trade-off Method," Elsevier Scientific, New York, 1975.
H3. Hansen, P., (ed.), "Essays and Surveys on Multiple Criteria Decision Making," Springer-Verlag, Berlin and New York, 1983.
H4. Harsanyi, J. C., Cardinal welfare, individualistic ethics, and interpersonal comparisons of utility, J. Polit. Economy 63, 309-321 (1955).
H5. Hartley, R., On cone-efficiency, cone-convexity, and cone-compactness, SIAM J. Appl. Math. 34, 211-222 (1978).
H6. Hazen, G. B., and Morin, T. L., Optimality conditions in nonconical multiple-objective programming, J. Optim. Theory Appl. 40, 25-60 (1983).
H7. Henig, M. I., Proper efficiency with respect to cones, J. Optim. Theory Appl. 36, 387-407 (1982).
H8. Henig, M. I., Existence and characterization of efficient decisions with respect to cones, Math. Programming 23, 111-116 (1982).
H9. Henig, M. I., The domination property in multicriterion optimization, The Israel Institute of Business Research, Working Paper No. 781/83 (1983).
H10. Herstein, I. N., and Milnor, J., An axiomatic approach to measurable utility, Econometrica 21, 291-297 (1953).
H11. Hogan, W. W., Point-to-set maps in mathematical programming, SIAM Rev. 15, 591-603 (1973).
H12. Holmes, R. B., "A Course on Optimization and Best Approximation," Springer-Verlag, Berlin and New York, 1972.
H13. Huang, C. C., Kira, D., and Vertinsky, I., Stochastic dominance rules for multiattribute utility functions, Rev. Econom. Stud. XLV, 611-615 (1978).
H14. Huard, P., (ed.), "Point-to-Set Maps and Mathematical Programming," Math. Programming Stud. 10 (1979).
H15. Huard, P., Background to point-to-set maps in mathematical programming, Math. Programming Stud. 10, 1-7 (1979).
H16. Hurwicz, L., Programming in linear spaces, in "Studies in Linear and Nonlinear Programming," K. J. Arrow, L. Hurwicz, and H. Uzawa, (eds.), Stanford Univ. Press, Stanford, California, 1958.
H17. Hwang, C. L., and Masud, A. S. M., "Multiple Objective Decision Making Methods and Applications," Springer-Verlag, Berlin and New York, 1979.
H18. Hwang, C. L., and Yoon, K., "Multiple Attribute Decision Making: Methods and Applications, State-of-the-Art Survey," Springer-Verlag, Berlin and New York, 1981.
I1. Ignizio, J. P., "Goal Programming and Extensions," Lexington Books, Lexington, Massachusetts, 1976.
I2. Ignizio, J. P., Generalized goal programming, Comput. Oper. Res. 10, 277-291 (1983).
I3. Ijiri, Y., "Management Goals and Accounting for Control," North-Holland Publ., Amsterdam, 1965.
I4. Ioffe, A. D., and Tihomirov, V. M., "Theory of Extremal Problems," North-Holland Publ., Amsterdam, 1979.
I5. Isermann, H., The relevance of duality in multiple-objective linear programming, in "Multiple Criteria Decision Making," M. K. Starr and M. Zeleny, (eds.), North-Holland Publ., Amsterdam, 1977.
I6. Isermann, H., On some relations between a dual pair of multiple objective linear programs, Z. Oper. Res. 22, 33-41 (1978).
I7. Isermann, H., Duality in multiple objective linear programming, in "Multiple Criteria Problem Solving," S. Zionts, (ed.), Springer-Verlag, Berlin and New York, 1978.
J1. Jahn, J., Duality in vector optimization, Math. Programming 25, 343-353 (1983).
J2. Jahn, J., Scalarization in vector optimization, Math. Programming 29, 203-218 (1984).
J3. Jurkiewicz, E., Stability of compromise solution in multicriteria decision-making problems, J. Optim. Theory Appl. 40, 77-83 (1983).
K1. Kahneman, D., and Tversky, A., Prospect theory: an analysis of decision under risk, Econometrica 47, 263-291 (1979).
K2. Kawasaki, H., Conjugate relations and weak subdifferentials, Math. Oper. Res. 6, 593-607 (1981).
K3. Kawasaki, H., A duality theorem in multiobjective nonlinear programming, Math. Oper. Res. 7, 95-110 (1982).
K4. Keeney, R. L., A group preference axiomatization with cardinal utility, Management Sci. 23, 140-145 (1976).
K5. Keeney, R. L., Equity and public risk, Oper. Res. 28, 527-534 (1980).
K6. Keeney, R. L., and Raiffa, H., "Decisions with Multiple Objectives: Preferences and Value Tradeoffs," Wiley, New York, 1976.
K7. Klinger, A., Improper solutions of the vector maximum problem, Oper. Res. 15, 570-572 (1967).
K8. Kornbluth, J. S. H., Duality, indifference and sensitivity analysis in multiple objective linear programming, Oper. Res. Quart. 25, 599-614 (1974).
K9. Krantz, D., Luce, D., Suppes, P., and Tversky, A., "Foundations of Measurement," Vol. I, Academic Press, New York, 1971.
K10. Kuhn, H. W., and Tucker, A. W., Nonlinear programming, in "Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability," Berkeley, California, pp. 481-492 (1951).
L1. Lee, S. M., "Goal Programming for Decision Analysis," Auerbach Publishers, Philadelphia, 1972.
L2. Leitmann, G., (ed.), "Multicriteria Decision Making and Differential Games," Plenum, New York, 1976.
L3. Leitmann, G., and Marzollo, A., (eds.), "Multicriteria Decision Making," Springer-Verlag, Berlin and New York, 1975.
L4. Levy, H., and Paroush, J., Toward multivariate efficiency criteria, J. Econom. Theory 7, 129-142 (1974).
L5. Lewandowski, A., and Grauer, M., The reference point optimization approach: methods of efficient implementation, IIASA WP-82-26 (1982).
L6. Lin, J. G., Multiple-objective problems: Pareto-optimal solutions by the method of proper equality constraints, IEEE Trans. Automatic Control AC-21, 641-650 (1976).
L7. Lin, J. G., Proper equality constraints and maximization of index vectors, J. Optim. Theory Appl. 20, 215-244 (1976).
L8. Lin, J. G., Multiple-objective optimization by a multiplier method of proper equality constraints - Part I: Theory, IEEE Trans. Automatic Control AC-24, 567-573 (1979).
L9. Luc, D. T., On the domination property in vector optimization, J. Optim. Theory Appl. 43, 327-330 (1984).
L10. Luc, D. T., Duality theory in multiobjective programming, J. Optim. Theory Appl. 43, 557-582 (1984).
L11. Luce, R. D., Semiorders and a theory of utility discrimination, Econometrica 24, 178-191 (1956).
L12. Luce, R. D., and Raiffa, H., "Games and Decisions: Introduction and Critical Survey," Wiley, New York, 1957.
L13. Luce, R. D., and Suppes, P., Preference, utility and subjective probability, in "Handbook of Mathematical Psychology III," R. D. Luce, R. R. Bush, and E. Galanter, (eds.), Wiley, New York, 1965.
L14. Luenberger, D. G., "Optimization by Vector Space Methods," Wiley, New York, 1969.
M1. Machina, M. J., Expected utility analysis without the independence axiom, Econometrica 50, 277-323 (1982).
M2. Mangasarian, O. L., "Nonlinear Programming," McGraw-Hill, New York, 1969.
M3. Mazzoleni, P., Duality and reciprocity for vector programming, European J. Oper. Res. 10, 42-50 (1980).
M4. Minami, M., Weak Pareto-optimal necessary conditions in a nondifferentiable multiobjective program on a Banach space, J. Optim. Theory Appl. 41, 451-461 (1983).
M5. Morse, J. N., (ed.), "Organizations: Multiple Agents with Multiple Criteria, Proceedings, University of Delaware, Newark, 1980," Springer-Verlag, Berlin and New York, 1981.
N1. Naccache, P. H., Connectedness of the set of nondominated outcomes in multicriteria optimization, J. Optim. Theory Appl. 25, 459-467 (1978).
N2. Naccache, P. H., Stability in multicriteria optimization, J. Math. Anal. Appl. 68, 441-453 (1979).
N3. Nakayama, H., Subjective programming in multi-criterion decision making, in [F2], 252-265 (1980).
N4. Nakayama, H., Proposal of satisficing trade-off method for multiobjective programming, Trans. Soc. Instrum. Control Eng. 20, 29-35 (1984), (in Japanese).
N5. Nakayama, H., Geometric consideration of duality in vector optimization, J. Optim. Theory Appl. 44, 625-655 (1984).
N6. Nakayama, H., Duality in linear vector optimization, IIASA Working Paper WP-84-86 (1984).
N7. Nakayama, H., Lagrange duality and its geometric interpretation, in "Proceedings of the Seminar on Mathematics of Multiobjective Optimization," CISM (to be published).
N8. Nakayama, H., and Furukawa, K., Satisficing trade-off method applied to structural design problems in civil engineering, Large Scale Systems (to be published).
N9. Nakayama, H., Takeguchi, T., and Sano, M., Interactive graphics for portfolio selection, in [H3], 280-289 (1983).
N10. Nakayama, H., Tanino, T., Matsumoto, K., Matsuo, H., Inoue, K., and Sawaragi, Y., Methodology for group decision support with an application to assessment of residential environment, IEEE Trans. Systems Man Cybernet. SMC-9, 477-485 (1979).
N11. Nakayama, H., Tanino, T., and Sawaragi, Y., An interactive optimization method in multicriteria decision making, IEEE Trans. Systems Man Cybernet. SMC-10, 163-169 (1980).
N12. Nakayama, H., Tanino, T., and Sawaragi, Y., Stochastic dominance for decision problems with multiple attributes and/or multiple decision makers, in "Proceedings of the Eighth Triennial World Congress of IFAC," pp. 1397-1402 (1982).
N13. von Neumann, J., and Morgenstern, O., "Theory of Games and Economic Behavior," Princeton Univ. Press, Princeton, New Jersey, 1947.
N14. Neustadt, L. W., "Optimization," Princeton Univ. Press, Princeton, New Jersey, 1976.
N15. Nieuwenhuis, J. W., Supremal points and generalized duality, Math. Operationsforsch. Statist. Ser. Optimization 11, 41-59 (1980).
N16. Nieuwenhuis, J. W., Some minimax theorems in vector-valued functions, J. Optim. Theory Appl. 40, 463-475 (1983).
N17. Nieuwenhuis, J. W., About Isermann duality, J. Optim. Theory Appl. 41, 481-490 (1983).
N18. Nijkamp, P., and van Delft, A., "Multi-criteria Analysis and Regional Decision Making," Martinus Nijhoff Social Sciences Division, Leiden, 1977.
O1. Oppenheimer, K. R., A proxy approach to multiattribute decision making, Management Sci. 24, 675-689 (1978).
P1. Penrose, R., A generalized inverse for matrices, Proc. Cambridge Philos. Soc. 51, 406-413 (1955).
P2. Ponstein, J., "Approaches to the Theory of Optimization," Cambridge Univ. Press, London and New York, 1980.
P3. Ponstein, J., On the dualization of multiobjective optimization problems, University of Groningen, Econometric Institute, Report 87 (OR-8201), 1982.
P4. Ponstein, J., Infima of sets, proper efficiency and multiobjective optimization, University of Groningen, Econometric Institute, Report 88 (OR-8202), 1982.
P5. Pratt, J. W., Risk aversion in the small and in the large, Econometrica 32, 122-136 (1964).
R1. Raiffa, H., "Decision Analysis: Introductory Lectures on Choices Under Uncertainty," Addison-Wesley, Reading, Massachusetts, 1968.
R2. Richard, S. F., Multivariate risk aversion, Management Sci. 22, 12-21 (1975).
R3. Rietveld, P., "Multiple Objective Decision Methods and Regional Planning," North-Holland Publ., Amsterdam, 1980.
R4. Ritter, K., Optimization theory in linear spaces I, Math. Ann. 182, 189-206 (1969).
R5. Ritter, K., Optimization theory in linear spaces II, Math. Ann. 183, 169-180 (1969).
R6. Ritter, K., Optimization theory in linear spaces III, Math. Ann. 184, 133-154 (1970).
R7. Rockafellar, R. T., "Convex Analysis," Princeton Univ. Press, Princeton, New Jersey, 1970.
R8. Rockafellar, R. T., Conjugate Duality and Optimization, in "CBMS Lecture Notes Series No. 16," SIAM, Philadelphia, 1974.
R9. Rodder, W., A generalized saddlepoint theory, European J. Oper. Res. 1, 55-59 (1977).
R10. Rosinger, E. E., Duality and alternative in multiobjective optimization, Proc. Amer. Math. Soc. 64, 307-312 (1977).
R11. Ruys, P. H. M., and Weddepohl, H. N., Economic theory and duality, in "Convex Analysis and Mathematical Economics," J. Kriens, (ed.), Springer-Verlag, Berlin and New York, 1979.
S1. Saaty, T. L., A scaling method for priorities in hierarchical structures, J. Mathematical Psychology 15, 234-281 (1977).
S2. Sakai, Y., "Economics under Uncertainty," Yuhikaku, 1982, (in Japanese).
S3. Salukvadze, M. E., "Vector-Valued Optimization Problems in Control Theory," Academic Press, New York, 1979.
S4. Sarin, R. K., Strength of preference and risky choice, Oper. Res. 30, 982-997 (1982).
S5. Sawaragi, Y., Nakayama, H., and Tanino, T., Modeling and management control of environmental systems, in "Optimization and Control of Dynamic Operational Research Models," S. G. Tzafestas, (ed.), North-Holland Publ., Amsterdam, 1982.
S6. Schonfeld, P., Some duality theorems for the non-linear vector maximum problem, Unternehmensforsch. 14, 51-63 (1970).
S7. Sen, A., "Collective Choice and Social Welfare," Holden-Day, San Francisco, California, 1970.
S8. Shimizu, K., "Theory of Multiobjective and Conflict," Kyoritsu Syuppan, 1982, (in Japanese).
S9. Simon, H., "Models of Man," Wiley, New York, 1957.
S10. Smale, S., Global analysis and economics I: Pareto optimum and a generalization of Morse theory, in "Dynamical Systems," M. Peixoto, (ed.), Academic Press, New York, 1973.
S11. Smale, S., Global analysis and economics III: Pareto optima and price equilibria, J. Math. Econom. 1, 213-221 (1974).
S12. Sposito, V. A., and David, H. T., A note on Farkas lemma over cone domains, SIAM J. Appl. Math. 22, 356-358 (1972).
S13. Starr, M. K., and Zeleny, M., (eds.), "Multiple Criteria Decision Making," North-Holland Publ., Amsterdam, 1977.
S14. Steuer, R. E., and Choo, E., An interactive weighted Tchebycheff procedure for multiple objective programming, Math. Programming 26, 326-344 (1983).
S15. Stoer, J., and Witzgall, C., "Convexity and Optimization in Finite Dimensions I," Springer-Verlag, Berlin and New York, 1970.
T1. Takeguchi, T., and Akashi, H., Multi-attribute decision making by stochastic dominance, Trans. Soc. Instrum. Control Eng. 19, 13-19 (1983), (in Japanese).
T2. Takeguchi, T., and Akashi, H., Analysis of decisions under risk with incomplete knowledge, IEEE Trans. Systems Man Cybernet. (to be published).
T3. Tamura, K., and Arai, S., On proper and improper efficient solutions of optimal problems with multicriteria, J. Optim. Theory Appl. 38, 191-205 (1982).
T4. Tamura, K., and Miura, S., Necessary and sufficient conditions for local and global nondominated solutions in decision problems with multiobjectives, J. Optim. Theory Appl. 28, 501-523 (1979).
T5. Tamura, H., and Nakamura, Y., Decompositions of multiattribute utility functions based on convex dependence, Oper. Res. 31, 488-506 (1983).
T6. Tang, Y., Conditions for constrained efficient solutions of multiobjective problems in Banach spaces, J. Math. Anal. Appl. 96, 505-519 (1983).
T7. Tanino, T., Saddle points and duality in multi-objective programming, Internat. J. Systems Sci. 13, 323-335 (1982).
T8. Tanino, T., Fuzzy preference orderings in group decision making, Fuzzy Sets and Systems 12, 117-131 (1984).
T9. Tanino, T., and Sawaragi, Y., Duality theory in multiobjective programming, J. Optim. Theory Appl. 27, 509-529 (1979).
T10. Tanino, T., and Sawaragi, Y., Stability of nondominated solutions in multicriteria decision-making, J. Optim. Theory Appl. 30, 229-253 (1980).
T11. Tanino, T., and Sawaragi, Y., Conjugate maps and duality in multiobjective optimization, J. Optim. Theory Appl. 31, 473-499 (1980).
T12. Tanino, T., Nakayama, H., and Sawaragi, Y., On methodology for group decision support, in "Organizations: Multiple Agents with Multiple Criteria," J. N. Morse, (ed.), Springer-Verlag, Berlin and New York, 1981.
T13. Tanino, T., Nakayama, H., and Sawaragi, Y., Parametric preference orderings in group decision making, in "Proceedings of the Eighth Triennial World Congress of IFAC," pp. 1749-1754 (1982).
T14. Thiriez, H., and Zionts, S., (eds.), "Multiple Criteria Decision Making, Jouy-en-Josas, France," Springer-Verlag, Berlin and New York, 1976.
T15. Tversky, A., Intransitivity of preferences, Psychological Rev. 76, 31-48 (1969).
W1. Wagner, D. H., Semicompactness with respect to a Euclidean cone, Canad. J. Math. 29, 29-36 (1977).
W2. Wallenius, J., Comparative evaluation of some interactive approaches to multicriterion optimization, Management Sci. 21, 1387-1396 (1975).
W3. Wan, Y. H., On local Pareto optima, J. Math. Econom. 2, 35-42 (1975).
W4. Warburton, A. R., Quasiconcave vector maximization: connectedness of the sets of Pareto-optimal and weak Pareto-optimal alternatives, J. Optim. Theory Appl. 40, 537-557 (1983).
W5. Warfield, J. N., "Societal Systems: Planning, Policy and Complexity," Wiley, New York, 1976.
W6. Wehrung, D. A., Interactive identification and optimization using a binary preference relation, Oper. Res. 26, 322-332 (1978).
W7. Wehrung, D. A., Multidimensional line search using a binary preference relation, Oper. Res. 27, 356-363 (1979).
W8. Wendell, R. E., and Lee, D. N., Efficiency in multiple objective optimization problems, Math. Programming 12, 406-414 (1977).
W9. White, D. J., "Optimality and Efficiency," Wiley, New York, 1982.
W10. Whitmore, G. A., Third-degree stochastic dominance, Amer. Econom. Rev. 60, 457-459 (1970).
W11. Whitmore, G. A., and Findlay, M. C., (eds.), "Stochastic Dominance," Lexington Books, Lexington, Massachusetts, 1978.
W12. Wierzbicki, A. P., Basic properties of scalarizing functionals for multiobjective optimization, Math. Operationsforsch. Statist. Ser. Optim. 8, 55-60 (1977).
W13. Wierzbicki, A., The use of reference objectives in multiobjective optimization, in "Multiple Criteria Decision Making Theory and Application," G. Fandel and T. Gal, (eds.), Springer-Verlag, Berlin and New York, 1980.
W14. Wierzbicki, A., A mathematical basis for satisficing decision making, in "Organizations: Multiple Agents with Multiple Criteria," J. Morse, (ed.), Springer-Verlag, Berlin and New York, 1981.
W15. Wilhelm, J., "Objectives and Multi-objective Decision Making under Uncertainty," Springer-Verlag, Berlin and New York, 1975.
Y1. Yu, P. L., Cone convexity, cone extreme points, and nondominated solutions in decision problems with multiple objectives, J. Optim. Theory Appl. 14, 319-377 (1974).
Y2. Yu, P. L., Second-order game problem: decision dynamics in gaming phenomena, J. Optim. Theory Appl. 27, 144-166 (1979).
Y3. Yu, P. L., and Leitmann, G., Compromise solutions, domination structures, and Salukvadze's solution, J. Optim. Theory Appl. 13, 362-378 (1974).
Y4. Yu, P. L., and Zeleny, M., The set of all nondominated solutions in linear cases and a multicriteria simplex method, J. Math. Anal. Appl. 49, 430-468 (1975).
Z1. Zadeh, L. A., Optimality and nonscalar-valued performance criteria, IEEE Trans. Automat. Control AC-8, 59-60 (1963).
Z2. Zangwill, W. I., "Nonlinear Programming: A Unified Approach," Prentice-Hall, Englewood Cliffs, New Jersey, 1969.
Z3. Zeleny, M., Compromise programming, in "Multiple Criteria Decision Making," J. L. Cochrane and M. Zeleny, (eds.), Univ. of South Carolina Press, Columbia, South Carolina, 1973.
Z4. Zeleny, M., "Linear Multi-objective Programming," Springer-Verlag, Berlin and New York, 1974.
Z5. Zeleny, M., (ed.), "Multiple Criteria Decision Making: Kyoto," Springer-Verlag, Berlin and New York, 1976.
Z6. Zeleny, M., "Multiple Criteria Decision Making," McGraw-Hill, New York, 1982.
Z7. Zionts, S., (ed.), "Multiple Criteria Problem Solving: Proceedings, Buffalo, NY 1977," Springer-Verlag, Berlin and New York, 1978.
Z8. Zionts, S., and Wallenius, J., An interactive programming method for solving the multiple criteria problem, Management Sci. 22, 652-663 (1976).
Z9. Zowe, J., A duality theorem for a convex programming problem in order-complete vector lattices, J. Math. Anal. Appl. 50, 273-287 (1975).
Z10. Zowe, J., The saddle-point theorem of Kuhn and Tucker in ordered vector spaces, J. Math. Anal. Appl. 57, 41-55 (1977).
INDEX
A
Acute cone, 7
Acyclicity, 48
Additive independence, 226
Additive utility independence, 228
Admissibility, 205
Allais paradox, 238
Antisymmetry, 26
Archimedean condition, 232
Archimedean property, 213
Asymmetry, 26
B
Benson's proper efficiency, 37
Best approximation, 77
Biconjugate function, 18
Biconjugate map, 168
Borwein's proper efficiency, 37
C
Cardinal independence, 224
Certainty equivalent, 219
Class P, 193
Closure of a convex function, 17
Completeness of a preference order, 26
Compromise solution, 79
Conditional preference, 221
Conditional preference function, 221
Cone, 7
Cone boundedness, 20
Cone closed point-to-set map, 24
Cone closedness of a set, 52
Cone compactness, 50
Cone concave point-to-set map, 24
Cone convex functions, 15
Cone convex point-to-set map, 24
Cone convex set, 13
Cone semicompactness, 50
Cone semicontinuity, 58
Conjugate duality, 5, 167, 190
Conjugate function, 18
Conjugate map, 168
Connectedness
  of a preference order, 26
  of a set, 66
Constraint problem, 86
Continuity of point-to-set map, 22
Convex analysis, 6
Convex cone, 7
Convex dependence, 234
Convex function, 14
Convex hull, 7
Convex set, 7
D
Decomposability, 220
Decomposed representation, 223
Difference independence, 230
Domination cone, 29
Domination structure, 28
Dual map, 144
Duality, 5, 137, 148, 179, 196, 205
E
Effective domain, 14
Efficiency, 28, 29
  Isermann, 206
  proper, 36
  weak, 33, 162
Efficiency trade-off ratio, 261
Efficient solution, 33
Epigraph, 13
Expected utility, 218
Extended recession cone, 40, 51
External stability, 47, 59
F
Farkas lemma, 129
Fenchel's inequality, 18
Finitely generated convex cone, 11
Finitely generated convex set, 11
G
Gale-Kuhn-Tucker duality, 131
fP-regularization, 194
Geoffrion's proper efficiency, 40
Goal programming, 253
Group decision, 240
H
Halfspace, 15
Henig's proper efficiency, 39
Homomorphism, 211
Hyperplane, 15
I
Ideal point, 77, 268, 273
Independence
  additive, 226
  additive utility, 228
  cardinal, 224
  difference, 230
  interpolation, 234
  of von Neumann-Morgenstern utility theory, 218, 242
  preferential, 221
  utility, 227
  weak difference, 229
Indicator function, 14, 18
Indifference, 26
Infimum, 188
Interactive Frank-Wolfe method, 257
Interactive goal programming, 256
Interactive optimization method, 257
Interactive relaxation method, 266
Interactive satisficing method, 266
Interactive surrogate worth trade-off method, 265
Internal stability, 60
Interpolation independence, 234
Irreflexivity, 26
Isermann duality, 134
Isermann efficiency, 206
Isomorphism, 211
J
Jensen's inequality, 14
J-normal, 159
J-stable, 159
Just noticeable difference, 259
K
Kernel, 60
Kornbluth duality, 132
Kuhn-Tucker condition, 89
Kuhn-Tucker constraint qualification, 42, 90
Kuhn-Tucker's proper efficiency, 42
L
Lagrange duality, 5, 127
Lagrangian, 183
Lexicographic order, 31
Local risk aversion, 220
Lock step method, 233
Lottery, 219
Lower semicontinuity, 22, 101, 112
M
Marginal rate of substitution (MRS), 256, 258
Measurable value function, 215
Mixture set, 217
Multiattribute preference function, 220, 237
Multiobjective optimization problem, 2
Multivariate risk aversion, 248
Multivariate stochastic dominance, 248
(Ii, IX)-norm, 78
N
Negative transitive, 26
Normal vector, 16
Normality, 158, 228
  J-normal, 159
  N-normal, 159
von Neumann-Morgenstern utility function, 217
O
One-sided directional derivative, 19
Order
  lexicographic, 31
  Pareto, 30
  preference, 3, 25
  preference difference, 215
  strict partial, 26
  total, 26, 214
  weak, 26, 214, 232
Order dense set, 241
P
Pareto order, 30
Pareto optimal solution, 36
Pay-off matrix, 270
Perturbation function, 87, 92
Perturbation map, 138
Pointed cone, 7
Point-to-set map, 21
Polar, 8
Polyhedral convex cone, 11
Polyhedral convex set, 11
Preference difference order, 215
Preference indifference, 26
Preference function, 3, 27, 211
Preference order, 3, 25
Preferential independence, 221
Projecting cone, 37
Proper convex function, 14
Proper efficiency, 36
  Benson's, 37
  Borwein's, 37
  Geoffrion's, 40
  Henig's, 39
  Kuhn-Tucker's, 42
Properly efficient solution, 36
R
Recession cone, 10
Reflexive, 26
Risk
  averse, 220
  neutral, 220
  premium, 220
  prone, 220
S
Saddle point, 142, 152, 183
Satisfactory Pareto solution, 267
Satisficing trade-off method, 271
Scalarizable point-to-set map, 194
Scalarization, 70
Scalarized function, 194
Separation
  of sets, 66
  theorem, 66
Slater's constraint qualification, 140, 159
Solution set map, 93
Solvability condition, 232
Stability, 88, 92, 94, 107, 119, 122
  external, 47, 59
  internal, 60
  J-stable, 159
Stable problem, 182
STEM, 268
Stochastic dominance, 244
  multivariate, 248
Strict partial order, 26
Strict polar, 8
Strict preference, 26
Strong biconjugate, 203
Strong closure, 203
Strong conjugate, 203
Strong infimum, 202
Strong Lagrangian, 208
Strong maximum, 202
Strong minimum, 202
Strong normality, 207
Strong saddle point, 208
Strong separation theorem, 16
Strong subgradient, 203
Strong supremum, 202
Subdifferentiable, 19, 172
Subdifferential, 19, 172
Subgradient, 19, 172
Support function, 18
Supporting hyperplane, 17
Supremum, 188
Surrogate worth trade-off method, 261
Symmetry, 26
T
Tangent cone, 36
Tchebyshev norm, 82, 255
Thomsen condition, 231
Total order, 26, 214
Transitivity, 26
U
Uniform compactness, 22
Upper semicontinuity, 22, 95, 108
Utility function, 3, 211, 217
Utility independence, 227
V
Value function, 3, 211, 214, 250
Vector optimization, 2
Vector-valued Lagrangian function, 142
von Neumann-Morgenstern utility function, 217, 243
W
Weak biconjugate map, 193
Weak conjugate map, 193
Weak connectedness, 26
Weak difference independence, 229
Weak duality, 147, 180
Weak efficiency, 33, 162
Weak infimum, 191
Weak Lagrangian, 200
Weak maximal set, 191
Weak minimal set, 191
Weak order, 26
Weak Pareto optimal solution, 36
Weak Pareto order, 30
Weak saddle point, 200
Weak stability, 198
Weak subdifferential, 195
Weak subgradient, 195
Weak supremum, 191
Z
Zerolike matrix, 204