Applications of Multi-Objective Evolutionary Algorithms
ADVANCES IN NATURAL COMPUTATION
Series Editor: Xin Yao (The University of Birmingham, UK)
Published
Vol. 2: Recent Advances in Simulated Evolution and Learning
Eds: Kay Chen Tan, Meng Hiot Lim, Xin Yao & Lipo Wang
Applications of Multi-Objective Evolutionary Algorithms
Advances in Natural Computation - Vol. 1
editors
Carlos A Coello Coello (CINVESTAV-IPN, Mexico)
Gary B Lamont (Air Force Institute of Technology, Wright-Patterson AFB, USA)
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
APPLICATIONS OF MULTI-OBJECTIVE EVOLUTIONARY ALGORITHMS Advances in Natural Computation — Vol. 1 Copyright © 2004 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-256-106-4
Printed in Singapore by World Scientific Printers (S) Pte Ltd
FOREWORD
Computer science is playing an increasingly important role in many other fields, such as biology, physics, chemistry, economics and sociology. Many developments in these classical fields rely crucially on advanced computing technologies. At the same time, these fields have inspired new ideas and novel paradigms of computing, such as evolutionary, neural, molecular and quantum computation. Natural computation refers to the study of computational systems that use ideas and draw inspiration from natural systems, whether they be biological, ecological, molecular, neural, quantum or social. World Scientific Publishing Co. publishes an exciting book series on "Advances in Natural Computation." The series aims to serve as a central source of reference for the theory and applications of natural computation, establishing the state of the art, disseminating the latest research discoveries, and providing potential textbooks to senior undergraduate and postgraduate students. The series publishes monographs, edited volumes and lecture notes on a wide range of topics in natural computation. Examples of possible topics include, but are not limited to, evolutionary algorithms, evolutionary games, evolutionary economics, evolvable hardware, neural networks, swarm intelligence, quantum computation, molecular computation, ecological informatics, etc. This volume, edited by Drs. Carlos A. Coello Coello and Gary B. Lamont, represents an excellent start to this leading book series. It addresses the fast-growing field of multi-objective optimization, especially its applications in many disciplines: from engineering design to spectroscopic data analysis, from groundwater monitoring to regional planning, from autonomous vehicle navigation to polymer extrusion, and from bioinformatics to computational
finance. It complements the journal papers^a beautifully and should be on the bookshelf of anyone who is interested in multi-objective optimization and its diverse applications. The two editors of this volume are both leading experts on evolutionary multi-objective optimization. They have done an outstanding job in putting together the most comprehensive book on the applications of evolutionary multi-objective optimization. I hope all readers will enjoy this volume as much as I do! The upcoming volume in this book series, on "Recent Advances in Simulated Evolution and Learning," will appear soon. If you are interested in writing or editing a volume for this book series, please get in touch with the Series Editor.

Xin Yao
Series Editor, Advances in Natural Computation
August 2004
^a Carlos A. Coello Coello, Special Issue on Evolutionary Multi-objective Optimization, IEEE Transactions on Evolutionary Computation, 7(2), April 2003.
PREFACE
The intent of this book is to present a variety of multi-objective problems (MOPs) which have been solved using multi-objective evolutionary algorithms (MOEAs). Due to obvious space constraints, the set of applications included in the book is relatively small. However, the editors believe that such a set is representative of the current trends among both researchers and practitioners across many disciplines. This book aims not only to present a representative sampling of real-world MOPs currently being tackled by practitioners, but also to provide some insights into the different aspects related to the use of MOEAs in real-world applications. The reader should find the material particularly useful in analyzing the pragmatic (and sometimes peculiar) point of view of each contributor regarding how to choose a certain MOEA and how to validate the results using metrics and statistics. Another aspect worth addressing is the limited variety of MOEAs adopted throughout this book, which is not as diverse as that presented in the literature. This indicates a certain degree of maturity within this research community, and at the same time defines some important current trends among practitioners. By reading the chapters, it is evident that certain MOEAs that some researchers in the field might consider "old-fashioned" (e.g., the Niched Pareto Genetic Algorithm) continue to be used by various practitioners. At the same time, it is evident that other "modern" MOEAs (e.g., the Non-dominated Sorting Genetic Algorithm II) with available software are becoming increasingly popular. As MOEA software evolves and incorporates an increasingly large variety of operators, generic MOEA software should become available. For example, such software is currently being integrated into various optimization packages incorporating a variety of search techniques. Of course, the MOEA discipline continues to evolve more sophisticated variants, hybridization techniques, unique methodologies depending on the problem domain, and use of efficient parallel computation, with application to an increasingly broad class of high-dimensional complex problems. The spectrum of real-world MOPs dealt with in this book includes, among others, aircraft design, robot planning, identification of interesting qualitative features in biological sequences, circuit design, production system control, city planning, ecological system management, and bioremediation of chemical pollution. Thus the organization of the book is structured around engineering, biology, chemistry, physics, and management disciplines. Throughout this book, the reader should find not only problems with different degrees of complexity, but also problems with different practical requirements, user constraints, and a variety of MOEA solution approaches. We would like to thank all the contributors for providing their insights regarding the use of MOEAs in solving real-world multi-objective problems. Without their serious consideration, contemplation, and devoted efforts, neither the discipline of MOEAs nor this book would have evolved as well as they have. Such activity makes MOEAs a viable approach to finding effective and efficient solutions to complex MOPs. Observe that the contributors come from many countries, reflecting the international interest in MOEA applications and the interdisciplinary nature of optimization research. As indicated, this book presents a collection of MOEA applications which provides the professional and the practitioner with direction toward achieving "good" results in their selected problem domain. For the beginner, the introductory chapter and the variety of MOEA application chapters should provide an understanding of generic MOPs and of MOEA parameter and operator selection leading to acceptable results.
For the expert, the variety of MOP applications generates a wider understanding of MOEA operator selection and insight into the path leading to problem solutions. Additional applications and theoretical MOEA papers can be found at the Evolutionary Multi-Objective Optimization (EMOO) Repository web site at http://delta.cs.cinvestav.mx/~ccoello/EMOO/ with mirrors at http://www.lania.mx/~ccoello/EMOO/ and at http://neo.lcc.uma.es/emoo/. As of mid-2004, the EMOO Repository
contained over 1700 bibliographic references, including more than 100 PhD theses, over 1000 conference papers, and 400 journal papers. Moreover, the EMOO Repository is continually being updated. The repository not only holds a large collection of bibliographic references (many of them available electronically), but also contains public-domain MOP and MOEA software, sample test problems, and other useful information that allows one to start working in this exciting research field. The general organization of the book is based on the types of applications considered. Chapter 1 provides some preliminary material intended for those not familiar with the basic concepts and terminology adopted in evolutionary multi-objective optimization. This first chapter also provides a brief description of each of the other 29 chapters that make up this book. These 29 chapters are divided into four parts. The first part is the largest and consists of engineering applications (e.g., civil, mechanical, aeronautical, and chemical engineering, among others); it includes chapters 2 to 13. The second part consists of scientific applications (e.g., computer science, bioinformatics and physics, among others) and includes chapters 14 to 19. The third part consists of industrial applications (e.g., design, manufacture, packing and scheduling, among others) and includes chapters 20 to 24. The fourth and last part consists of miscellaneous applications such as data mining, finance and management; it includes chapters 25 to 30. The first editor gratefully acknowledges the support obtained from CINVESTAV-IPN and from the NSF-CONACyT project 42435-Y. He also thanks Dr. Luis Gerardo de la Fraga for his continuous support and Erika Hernandez Luna for her valuable help during the preparation of this book. The second editor acknowledges the support of his graduate students, including Rick Day and Mark Kleeman.
We also acknowledge the use of an academic license of the Word2TeX™ converter (developed by Chikrii Softlab) to convert some of the chapters submitted in MS Word™ to LaTeX 2ε, which is the tool that we adopted to process the entire manuscript. The editors thank Steven Patt, from World Scientific, who was very professional, incredibly helpful, and always replied promptly to all of the editors' queries. The editors also thank Prof. Xin Yao for deciding to include this book within his Advances in Natural Computation series.
Finally, the editors would like to thank their spouses for letting them spend so many hours on the preparation of the manuscript. We also thank the many students, researchers and practitioners working in this field. They have contributed, directly and indirectly, a large variety of ideas that continue to expand this interdisciplinary field of multi-objective evolutionary computation. Their constant efforts and contributions have made possible this book and the innovative publications that we expect to see in the years to come.
Carlos A. Coello Coello
Gary B. Lamont
August 2004
CONTENTS
Foreword
Preface

1 An Introduction to Multi-Objective Evolutionary Algorithms and Their Applications 1.1 Introduction 1.2 Basic Concepts 1.3 Basic Operation of a MOEA 1.4 Classifying MOEAs 1.4.1 Aggregating Functions 1.4.2 Population-Based Approaches 1.4.3 Pareto-Based Approaches 1.5 MOEA Performance Measures 1.6 Design of MOEA Experiments 1.6.1 Reporting MOEA Computational Results 1.7 Layout of the Book 1.7.1 Part I: Engineering Applications 1.7.2 Part II: Scientific Applications 1.7.3 Part III: Industrial Applications 1.7.4 Part IV: Miscellaneous Applications 1.8 General Comments References
2 Applications of Multi-Objective Evolutionary Algorithms in Engineering Design 2.1 Introduction 2.2 Multi-Objective Evolutionary Algorithm 2.2.1 Algorithms
2.3 Examples 2.3.1 Design of a Welded Beam 2.3.2 Preliminary Design of Bulk Carrier 2.3.3 Design of Robust Airfoil 2.4 Summary and Conclusions References
3 Optimal Design of Industrial Electromagnetic Devices: A Multiobjective Evolutionary Approach 3.1 Introduction 3.2 The Algorithms 3.2.1 Non-Dominated Sorting Evolution Strategy Algorithm (NSESA) 3.2.1.1 Pareto Gradient Based Algorithms (PGBA) and Hybrid Strategies 3.2.1.2 Pareto Evolution Strategy Algorithm (PESTRA) 3.2.1.3 Multi Directional Evolution Strategy Algorithm (MDESTRA) 3.3 Case Studies 3.3.1 Shape Design of a Shielded Reactor 3.3.1.1 Direct Problem 3.3.1.2 Inverse Problem 3.3.1.3 Sample-and-Rank Approach 3.3.1.4 Optimization Results 3.3.2 Shape Design of an Inductor for Transverse-Flux Heating of a Non-Ferromagnetic Strip 3.3.2.1 Direct Problem 3.3.2.2 Inverse Problem 3.3.2.3 Sample-and-Rank Approach 3.3.2.4 Optimization Results 3.4 Conclusions References

4 Groundwater Monitoring Design: A Case Study Combining Epsilon Dominance Archiving and Automatic Parameterization for the NSGA-II 4.1 Introduction 4.2 Prior Work
4.3 Monitoring Test Case Problem 4.3.1 Test Case Overview 4.3.2 Problem Formulation 4.4 Overview of the ε-NSGA-II Approach 4.4.1 Searching with the NSGA-II 4.4.2 Archive Update 4.4.3 Injection and Termination 4.5 Results 4.6 Discussion 4.7 Conclusions References
5 Using a Particle Swarm Optimizer with a Multi-Objective Selection Scheme to Design Combinational Logic Circuits 5.1 Introduction 5.2 Problem Statement 5.3 Our Proposed Approach 5.4 Use of a Multi-Objective Approach 5.5 Comparison of Results 5.5.1 Example 1 5.5.2 Example 2 5.5.3 Example 3 5.5.4 Example 4 5.5.5 Example 5 5.5.6 Example 6 5.6 Conclusions and Future Work Acknowledgements References
6 Application of Multi-Objective Evolutionary Algorithms in Autonomous Vehicles Navigation 6.1 Introduction 6.2 Autonomous Vehicles 6.2.1 Experimental Setup 6.2.2 Vehicle Model 6.2.3 Relative Sensor Models 6.2.3.1 Steering Encoder 6.2.3.2 Wheel Encoder
6.2.4 Absolute Sensor Models 6.2.4.1 Global Positioning Systems 6.2.4.2 Inertial Measurement Unit 6.2.5 Simulation and Measurement of the Vehicle State 6.2.6 Prediction of the Vehicle State 6.3 Parameter Identification of Autonomous Vehicles 6.3.1 Problem Formulation 6.3.2 A General Framework for Searching Pareto-Optimal Solutions 6.3.3 Selection of a Single Solution by CoGM 6.4 Multi-Objective Optimization 6.4.1 Evaluation of Functions 6.4.1.1 Rank Function 6.4.1.2 Fitness Function 6.4.2 Search Methods 6.4.2.1 MCEA 6.4.2.2 MOGM 6.5 Application of Parameter Identification of an Autonomous Vehicle 6.6 Conclusions 6.7 Acknowledgement References
7 Automatic Control System Design via a Multiobjective Evolutionary Algorithm 7.1 Introduction 7.2 Performance Based Design Unification and Automation 7.2.1 The Overall Design Architecture 7.2.2 Control System Formulation 7.2.3 Performance Specifications 7.2.3.1 Stability 7.2.3.2 Step Response Specifications 7.2.3.3 Disturbance Rejection 7.2.3.4 Robust Stability 7.2.3.5 Actuator Saturation 7.2.3.6 Minimal Controller Order 7.3 An Evolutionary ULTIC Design Application 7.4 Conclusions References
8 The Use of Evolutionary Algorithms to Solve Practical Problems in Polymer Extrusion 8.1 Introduction 8.2 Polymer Extrusion 8.2.1 Single Screw Extrusion 8.2.2 Co-Rotating Twin-Screw Extrusion 8.2.3 Optimization Characteristics 8.3 Optimization Algorithm 8.3.1 Multi-Objective Optimization 8.3.2 Reduced Pareto Set Genetic Algorithm with Elitism (RPSGAe) 8.3.3 Travelling Salesman Problem 8.4 Results and Discussion 8.4.1 Single Screw Extrusion 8.4.2 Twin-Screw Extrusion 8.5 Conclusions Acknowledgments References

9 Evolutionary Multi-Objective Optimization of Trusses 9.1 Introduction 9.2 Related Work 9.3 ISPAES Algorithm 9.3.1 Inverted "ownership" 9.3.2 Shrinking the Objective Space 9.4 Optimization Examples 9.4.1 Optimization of a 49-bar Plane Truss 9.4.1.1 The 49-bar Plane Truss as a Single-Objective Optimization Problem with Constraints 9.4.1.2 The 49-bar Plane Truss as a Multi-Objective Optimization Problem with Constraints 9.4.2 Optimization of a 10-bar Plane Truss 9.4.2.1 The 10-bar Plane Truss as a Single-Objective Optimization Problem with Constraints 9.4.2.2 The 10-bar Plane Truss as a Multi-Objective Optimization Problem with Constraints 9.4.3 Optimization of a 72-bar 3D Structure
9.4.3.1 The 72-bar 3D Structure in Continuous Search Space as a Single-Objective Optimization Problem with Constraints 9.4.3.2 The 72-bar 3D Structure in Discrete Search Space as a Single-Objective Optimization Problem with Constraints 9.5 Final Remarks and Future Work Acknowledgments References

10 City and Regional Planning via a MOEA: Lessons Learned 10.1 The Traditional Approach 10.2 The MOEA Approach 10.3 City Planning: Provo and Orem 10.4 Regional Planning: The WFMR 10.5 Coordinating Regional and City Planning 10.6 Conclusions Acknowledgments References

11 A Multi-Objective Evolutionary Algorithm for the Covering Tour Problem 11.1 Introduction 11.2 The Covering Tour Problem 11.2.1 The Mono-Objective Covering Tour Problem 11.2.2 The Bi-Objective Covering Tour Problem 11.2.3 Optimization Methods 11.2.3.1 A Heuristic Method 11.2.3.2 An Exact Method 11.3 A Multi-Objective Evolutionary Algorithm for the Bi-Objective Covering Tour Problem 11.3.1 General Framework 11.3.2 Solution Coding 11.3.3 Genetic Operators 11.3.3.1 The Crossover Operator 11.3.3.2 The Mutation Operator 11.4 Computational Results 11.5 Conclusions and Outlooks
Acknowledgement References

12 A Computer Engineering Benchmark Application for Multiobjective Optimizers 12.1 Introduction 12.2 Packet Processor Design 12.2.1 Design Space Exploration 12.2.2 Basic Models and Methods 12.3 Software Architecture 12.3.1 General Considerations 12.3.2 Interface Description 12.4 Test Cases 12.4.1 Problem Instances 12.4.2 Simulation Results 12.5 Summary Acknowledgments References

13 Multiobjective Aerodynamic Design and Visualization of Supersonic Wings by Using Adaptive Range Multiobjective Genetic Algorithms 13.1 Introduction 13.2 Adaptive Range Multiobjective Genetic Algorithms 13.3 Multiobjective Aerodynamic Optimization 13.3.1 Formulation of Optimization 13.3.2 CFD Evaluation 13.3.3 Overview of Non-Dominated Solutions 13.4 Data Mining by Self-Organizing Map 13.4.1 Neural Network and SOM 13.4.2 Cluster Analysis 13.4.3 Visualization of Design Tradeoffs: SOM of Tradeoffs 13.4.4 Data Mining of Design Space: SOM of Design Variables 13.5 Conclusions Acknowledgements References
14 Applications of a Multi-Objective Genetic Algorithm in Chemical and Environmental Engineering 14.1 Introduction 14.2 Physical Problem 14.3 Genetic Algorithm 14.4 Problem Formulation 14.5 Conclusions References
15 Multi-Objective Spectroscopic Data Analysis of Inertial Confinement Fusion Implosion Cores: Plasma Gradient Determination 15.1 Introduction 15.2 Self-Consistent Analysis of Data from X-ray Images and Line Spectra 15.3 A Niched Pareto Genetic Algorithm for Multi-Objective Spectroscopic Data Analysis 15.4 Test Cases 15.5 Application to Direct-Drive Implosions at GEKKO XII 15.6 Application to Indirect-Drive Implosions at OMEGA 15.7 Conclusions Acknowledgments References

16 Application of Multiobjective Evolutionary Optimization Algorithms in Medicine 16.1 Introduction 16.2 Medical Image Processing 16.2.1 Medical Image Reconstruction 16.3 Computer Aided Diagnosis 16.3.1 Optimization of Diagnostic Classifiers 16.3.2 Rules-Based Atrial Disease Diagnosis 16.4 Treatment Planning 16.4.1 Brachytherapy 16.4.1.1 Dose Optimization for High Dose Rate Brachytherapy 16.4.1.2 Inverse Planning for HDR Brachytherapy 16.4.2 External Beam Radiotherapy
16.4.2.1 Geometrical Optimization of Beam Orientations 16.4.2.2 Intensity Modulated Beam Radiotherapy Dose Optimization 16.4.3 Cancer Chemotherapy 16.5 Data Mining 16.5.1 Partial Classification 16.5.2 Identification of Multiple Gene Subsets 16.6 Conclusions References
17 On Machine Learning with Multiobjective Genetic Optimization 17.1 Introduction 17.2 An Overview 17.2.1 Machine Learning 17.2.2 Generalization 17.2.3 Multiobjective Evolutionary Algorithms (MOEA) & Real-World Applications (RWA) 17.2.3.1 Achieving Diversity 17.2.3.2 Monitoring Convergence 17.2.3.3 Avoiding Local Convergence 17.3 Problem Formulation 17.4 MOEA for Partitioning 17.4.1 The Algorithm 17.4.2 Chromosome Representation 17.4.3 Genetic Operators 17.4.4 Constraints & Heuristics 17.4.5 Convergence 17.5 Results and Discussion 17.6 Summary & Future Work Acknowledgements References

18 Generalized Analysis of Promoters: A Method for DNA Sequence Description 18.1 Introduction 18.2 Generalized Clustering 18.3 Problem: Discovering Promoters in DNA Sequences
18.4 Biological Sequence Description Methods 18.5 Experimental Algorithm Evaluation 18.6 Concluding Remarks Appendix References
19 Multi-Objective Evolutionary Algorithms for Computer Science Applications 19.1 Introduction 19.2 Combinatorial MOP Functions 19.3 MOP NPC Examples 19.3.1 Multi-Objective Quadratic Assignment Problem 19.3.1.1 Literary QAP Definition 19.3.1.2 Mathematical QAP Definition 19.3.1.3 General mQAP 19.3.1.4 Mathematical mQAP 19.3.1.5 Mapping QAP to MOEA 19.3.2 MOEA mQAP Results and Analysis 19.3.2.1 Design of mQAP Experiments and Testing 19.3.2.2 QAP Analysis 19.3.3 Modified Multi-Objective Knapsack Problem (MMOKP) 19.3.4 MOEA MMOKP Testing and Analysis 19.4 MOEA BB Conjectures for NPC Problems 19.5 Future Directions References

20 Design of Fluid Power System Using a Multi Objective Genetic Algorithm 20.1 Introduction 20.2 The Multi-Objective Optimization Problem 20.3 Multi-Objective Genetic Algorithms 20.3.1 The Multi-Objective Struggle GA 20.3.2 Genome Representation 20.3.3 Similarity Measures 20.3.3.1 Attribute Based Distance Function 20.3.3.2 Phenotype Based Distance Function 20.3.3.3 Real Number Distance 20.3.3.4 Catalog Distance
20.3.3.5 Overall Distance 20.3.4 Crossover Operators 20.4 Fluid Power System Design 20.4.1 Optimization Results 20.5 Mixed Variable Design Problem 20.5.1 Component Catalogs 20.5.2 Optimization Results 20.6 Discussion and Conclusions References

21 Elimination of Exceptional Elements in Cellular Manufacturing Systems Using Multi-Objective Genetic Algorithms 21.1 Introduction 21.2 Multiple Objective Optimization 21.3 Development of the Multi-Objective Model for Elimination of EEs 21.3.1 Assumptions 21.3.2 The Set of Decision Criteria 21.3.3 Problem Formulation 21.3.3.1 Notation 21.3.3.2 The Objective Functions 21.3.3.3 The Constraints 21.3.3.4 The Multi-Objective Optimization Problem (MOP) 21.3.4 A Numerical Example 21.4 The Proposed MOGA 21.4.1 Pseudocode for the Proposed MOGA 21.4.2 Fitness Calculation 21.4.3 Selection 21.4.4 Recombination 21.4.5 Updating the Elite Set 21.4.6 Stopping Criteria 21.5 Parameter Setting 21.6 Experimentation 21.7 Conclusion References
22 Single-Objective and Multi-Objective Evolutionary Flowshop Scheduling 22.1 Introduction 22.2 Permutation Flowshop Scheduling Problems 22.3 Single-Objective Genetic Algorithms 22.3.1 Implementation of Genetic Algorithms 22.3.2 Comparison of Various Genetic Operations 22.3.3 Performance Evaluation of Genetic Algorithms 22.4 Multi-Objective Genetic Algorithms 22.4.1 NSGA-II Algorithm 22.4.2 Performance Evaluation of the NSGA-II Algorithm 22.4.3 Extensions to Multi-Objective Genetic Algorithms 22.5 Conclusions References

23 Evolutionary Operators Based on Elite Solutions for Bi-Objective Combinatorial Optimization 23.1 Introduction 23.2 MOCO Problems and Solution Sets 23.3 An Evolutionary Heuristic for Solving biCO Problems 23.3.1 Overview of the Heuristic 23.3.2 The Initial Population 23.3.3 Bound Sets and Admissible Areas 23.3.4 The Genetic Map 23.3.5 The Crossover Operator 23.3.6 The Path-Relinking Operator 23.3.7 The Local Search Operator 23.4 Application to Assignment and Knapsack Problems with Two Objectives 23.4.1 Problem Formulation 23.4.2 Experimental Protocol 23.5 Numerical Experiments with the Bi-Objective Assignment Problem 23.5.1 Minimal Complete Solution Sets and Initial Elite Solution Set 23.5.2 Our Results Compared with Those Existing in the Literature 23.6 Numerical Experiments with the Bi-Objective Knapsack Problem
23.6.1 Minimal Complete Solution Sets and the Initial Elite Solution Set 23.6.2 Our Results Compared with Those Existing in the Literature 23.7 Conclusion and Perspectives References

24 Multi-Objective Rectangular Packing Problem 24.1 Introduction 24.2 Formulation of Layout Problems 24.2.1 Definition of RP 24.2.2 Multi-Objective RP 24.3 Genetic Layout Optimization 24.3.1 Representations 24.3.1.1 Sequence-Pair 24.3.1.2 Encoding System 24.3.2 GA Operators 24.3.2.1 Placement-Based Partially Exchanging Crossover 24.3.2.2 Mutation Operator 24.4 Multi-Objective Optimization Problems by Genetic Algorithms and Neighborhood Cultivation GA 24.4.1 Multi-Objective Optimization Problems and Genetic Algorithm 24.4.2 Neighborhood Cultivation Genetic Algorithm 24.5 Numerical Examples 24.5.1 Parameters of GAs 24.5.2 Evaluation Methods 24.5.2.1 Sampling of the Pareto Frontier Lines of Intersection (ILI) 24.5.2.2 Maximum, Minimum and Average Values of Each Object of Derived Solutions (IMMA) 24.5.3 Results 24.5.3.1 Layout of the Solution 24.5.3.2 ami33 24.5.3.3 rdm500 24.6 Conclusion References
25 Multi-Objective Algorithms for Attribute Selection in Data Mining 25.1 Introduction 25.2 Attribute Selection 25.3 Multi-Objective Optimization 25.4 The Proposed Multi-Objective Methods for Attribute Selection 25.4.1 The Multi-Objective Genetic Algorithm (MOGA) 25.4.1.1 Individual Encoding 25.4.1.2 Fitness Function 25.4.1.3 Selection Methods and Genetic Operators 25.4.2 The Multi-Objective Forward Sequential Selection Method (MOFSS) 25.5 Computational Results 25.5.1 Results for the "Return All Non-Dominated Solutions" Approach 25.5.2 Results for the "Return the 'Best' Non-Dominated Solution" Approach 25.5.3 On the Effectiveness of the Criterion to Choose the "Best" Solution 25.6 Conclusions and Future Work References
26 Financial Applications of Multi-Objective Evolutionary Algorithms: Recent Developments and Future Research Directions 26.1 Introduction 26.2 A Justification for MOEAs in Financial Applications 26.3 Selected Financial Applications of MOEAs 26.3.1 Portfolio Selection Problems 26.3.2 Vederajan et al. 26.3.3 Lin et al. 26.3.4 Fieldsend & Singh 26.3.5 Schlottmann & Seese 26.4 Conclusion and Future Research Directions 26.5 Acknowledgement References
27 Evolutionary Multi-Objective Optimization Approach to Constructing Neural Network Ensembles for Regression 27.1 Introduction 27.2 Multi-Objective Optimization of Neural Networks 27.2.1 Parameter and Structure Representation of the Network 27.2.2 Objectives in Network Optimization 27.2.3 Mutation and Learning 27.2.4 Elitist Non-Dominated Sorting and Crowded Tournament Selection 27.3 Selecting Ensemble Members 27.4 Case Studies 27.4.1 Experimental Settings 27.4.2 Results on the Ackley Function 27.4.3 Results on the Mackey-Glass Function 27.5 Discussions and Conclusions Acknowledgements References

28 Optimizing Forecast Model Complexity Using Multi-Objective Evolutionary Algorithms 28.1 Introduction 28.2 Artificial Neural Networks 28.3 Optimal Model Complexity 28.3.1 Early Stopping 28.3.2 Weight Decay Regularization and Summed Penalty Terms 28.3.3 Node and Weight Addition/Deletion 28.3.4 Problems with These Methods 28.4 Using Evolutionary Algorithms to Discover the Complexity/Accuracy Trade-Off 28.4.1 Pareto Optimality 28.4.2 Extent, Resolution and Density of Estimated Pareto Set 28.4.3 The Use of EMOO 28.4.4 A General Model 28.4.4.1 mutate() 28.4.4.2 weightadjust()
28.4.4.3 unitadjust() 28.4.4.4 The Elite Archive 28.4.4.5 replace() 28.4.5 Implementation and Generalization 28.5 Empirical Validation 28.5.1 Data 28.5.2 Model Parameters 28.6 Results 28.7 Discussion Acknowledgments References

29 Even Flow Scheduling Problems in Forest Management 29.1 Benchmark Problem 29.1.1 Introduction 29.1.2 Methodology 29.1.3 Results and Discussion 29.1.3.1 Visual Interpretation 29.1.3.2 Performance Indices 29.1.3.3 Statistical Approaches 29.1.3.4 Implications for Forest Management Problems 29.2 Applying Single Objective Genetic Algorithms to a Real-World Problem 29.2.1 Introduction 29.2.2 Methodology 29.2.2.1 Input Data 29.2.2.2 Implementation 29.2.3 Results and Discussion 29.2.4 Conclusion 29.3 Applying NSGA-II: A Truly Bi-Objective Approach 29.3.1 Introduction 29.3.2 Methodology 29.3.3 Results 29.3.3.1 Effect of Encoding on the Spread and Pareto-Optimality 29.3.3.2 Comparing the Single and Multiple Objective Genetic Algorithm 29.3.3.3 Effect of Population Size on Solution Quality
29.3.3.4 Validity of the Plans 29.3.4 Conclusion 29.4 Speeding Up the Optimization Process 29.4.1 Introduction 29.4.2 Methodology 29.4.3 Results and Discussion 29.4.4 Conclusions Acknowledgements References
30 Using Diversity to Guide the Search in Multi-Objective Optimization 30.1 Introduction 30.2 Diversity in Multi-Objective Optimization 30.3 Maintaining Diversity in Multi-Objective Optimization 30.3.1 Weighted Vectors 30.3.2 Fitness Sharing 30.3.3 Crowding/Clustering Methods 30.3.4 Restricted Mating 30.3.5 Relaxed Forms of Dominance 30.3.6 Helper Objectives 30.3.7 Objective Oriented Heuristic Selection 30.3.8 Using Diversity to Guide the Search 30.4 The Two-Objective Space Allocation Problem 30.4.1 Problem Description 30.4.2 Measuring Diversity of Non-Dominated Sets 30.5 Using Diversity to Guide the Search 30.5.1 Diversity as a Helper Objective 30.5.2 Diversity to Control Exploration and Exploitation 30.5.3 The Population-Based Hybrid Annealing Algorithm 30.6 Experiments and Results 30.6.1 Experimental Setting 30.6.2 Discussion of Obtained Results 30.7 Summary References
727 727 729 730 731 732 732 733 733 735 735 736 736 737 739 740 740 741 742 744 744 745 747 748
Index
753
CHAPTER 1 AN INTRODUCTION TO MULTI-OBJECTIVE EVOLUTIONARY ALGORITHMS AND THEIR APPLICATIONS
Carlos A. Coello Coello1 and Gary B. Lamont 2 1 CINVESTAV-IPN Evolutionary Computation Group Dpto. de Ing. Elect./Secc. Computación Av. IPN No. 2508, Col. San Pedro Zacatenco Mexico, D.F. 07300, MEXICO E-mail:
[email protected] 2 Department of Electrical and Computer Engineering Graduate School of Engineering and Management Air Force Institute of Technology WPAFB, Dayton, Ohio, 45433, USA E-mail:
[email protected] This chapter provides the basic concepts necessary to understand the rest of this book. The introductory material provided here includes some basic mathematical definitions related to multi-objective optimization, a brief description of the most representative multi-objective evolutionary algorithms in current use and some of the most representative work on performance measures used to validate them. In the final part of this chapter, we provide a brief description of each of the chapters contained within this volume.
1.1. Introduction

Early analogies between the mechanism of natural selection and a learning (or optimization) process led to the development of the so-called "evolutionary algorithms" (EAs) [3], whose main goal is to simulate the evolutionary process in a computer. The use of EAs for optimization tasks has become very popular in the last few years, spanning virtually every application domain [22, 44, 25, 4].

Among the several emergent research areas in which EAs have become increasingly popular, multi-objective optimization has been one of the fastest growing in recent years [12]. A multi-objective optimization problem (MOP) differs from a single-objective optimization problem in that it contains several objectives that require simultaneous optimization. When optimizing a single-objective problem, the goal is the single best design solution. For multi-objective problems, with several (possibly conflicting) objectives, there is usually no single optimal solution; the decision maker is therefore required to select a solution from a finite set of trade-offs by making compromises. A suitable solution should provide acceptable performance over all objectives [40]. Many fields continue to address complex real-world multi-objective problems using search techniques developed within computer engineering, computer science, the decision sciences, and operations research [10].

The potential of evolutionary algorithms for solving multi-objective optimization problems was hinted at as early as the late 1960s by Rosenberg [47]. However, the first actual implementation of a multi-objective evolutionary algorithm (MOEA) was not produced until the mid-1980s [48, 49]. Since then, a considerable amount of research has been done in this area, now known as evolutionary multi-objective optimization (EMOO) [12]. The growing importance of this field is reflected by a significant increase (mainly during the last ten years) of technical papers in international conferences and peer-reviewed journals, special sessions in international conferences, and interest groups on the Internet [13].

The main motivation for using EAs to solve multi-objective optimization problems is that EAs deal simultaneously with a set of possible solutions (the so-called population), which allows us to find several members of the Pareto optimal set in a single run of the algorithm, instead of having to perform a series of separate runs, as is the case with traditional mathematical programming techniques [40].
Additionally, EAs are less susceptible to the shape or continuity of the Pareto front (e.g., they can easily deal with discontinuous and concave Pareto fronts), whereas these two issues are known problems for mathematical programming techniques [7, 18, 12, 61].

This monograph attempts to present an extensive variety of high-dimensional MOPs and their acceptable statistical solutions using MOEAs, as exercised by numerous researchers. The intent of our discussion, then, is to promote a wider understanding of MOEAs and an ability to use them in order to find "good" solutions in a wide spectrum of high-dimensional real-world applications.

Footnote: The first author maintains an EMOO repository with over 1700 bibliographical entries at http://delta.cs.cinvestav.mx/~ccoello/EMOO, with a mirror at http://www.lania.mx/~ccoello/EMOO/
1.2. Basic Concepts

In order to provide a common basis for understanding the rest of this book, we next provide a set of basic definitions normally adopted both in single-objective and in multi-objective optimization [12]:

Definition 1 (Global Minimum): Given a function $f : \Omega \subseteq S = \mathbb{R}^n \to \mathbb{R}$, $\Omega \neq \emptyset$, for $\vec{x} \in \Omega$ the value $f^* \triangleq f(\vec{x}^*) > -\infty$ is called a global minimum if and only if

    $\forall \vec{x} \in \Omega : \quad f(\vec{x}^*) \leq f(\vec{x})$

With these basic MOP definitions, we are now ready to delve into the structure of MOPs and the specifics of various MOEAs.

1.3. Basic Operation of a MOEA

The objective of a MOEA is to converge to the true Pareto front of a problem, which normally consists of a diverse set of points. MOPs (as a rule) can present an uncountable set of solutions, which when evaluated produce vectors whose components represent trade-offs in objective space. During MOEA execution, a "local" set of Pareto optimal solutions (with respect to the current MOEA generational population) is determined at each EA generation and termed P_current(t), where t represents the generation number. Many MOEA implementations also use a secondary population, storing all or some of the Pareto optimal solutions found through the generations [55]. This secondary population is termed P_known(t), also annotated with t (representing the completion of t generations) to reflect possible changes in its membership during MOEA execution. P_known(0) is defined as ∅ (the empty set), and P_known alone denotes the final, overall set of Pareto optimal solutions returned by a MOEA. Of course, the true Pareto optimal solution set (termed
P_true) is not explicitly known for MOPs of any difficulty. P_true is defined by the functions composing an MOP; it is fixed and does not change. P_current(t), P_known, and P_true are sets of MOEA genotypes whose phenotypes form a Pareto front. We term the associated Pareto fronts for these solution sets PF_current(t), PF_known, and PF_true. Thus, when using a MOEA to solve MOPs, one implicitly assumes that one of the following conditions holds: PF_known ⊆ PF_true, or that over some norm (Euclidean, RMS, etc.), PF_known ∈ [PF_true, PF_true + ε], where ε is a small value.

Generally speaking, a MOEA is an extension of an EA in which two main issues are considered:

• How to select individuals such that nondominated solutions are preferred over those that are dominated.
• How to maintain diversity, so as to keep in the population as many elements of the Pareto optimal set as possible.

Regarding selection, most current MOEAs use some form of Pareto ranking. This approach was originally proposed by Goldberg [25]; it sorts the population of an EA based on Pareto dominance, such that all nondominated individuals are assigned the same rank (or importance). The idea is that all nondominated individuals get the same probability of reproducing, and that this probability is higher than the one corresponding to dominated individuals. Although conceptually simple, there are several possible ways to implement a MOEA using Pareto ranking [18, 12].

The issue of how to maintain diversity in an EA has been addressed by an extensive number of researchers [39, 27]. The approaches proposed include fitness sharing and niching [19], clustering [54, 65], the use of geographically-based schemes to distribute solutions [36, 14, 13], and the use of entropy [32, 16], among others. Additionally, some researchers have also adopted mating restriction schemes [51, 63, 41].
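The Pareto-dominance test underlying these ranking schemes can be sketched in a few lines. This is an illustrative sketch of the standard definition for minimization, not code from any chapter of this book; the function names are ours:

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization):
    a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated(vectors):
    """Return the vectors not dominated by any other vector (the rank-1 front)."""
    return [v for v in vectors
            if not any(dominates(u, v) for u in vectors if u != v)]
```

Applied to a small population of objective vectors, `nondominated` returns exactly the trade-off solutions that Pareto ranking would place in the first rank.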
More recently, the use of relaxed forms of Pareto dominance has been adopted as a mechanism to encourage more exploration and, therefore, to provide more diversity. Among these mechanisms, ε-dominance has become increasingly popular, not only because of its effectiveness, but also because of its sound theoretical foundation [38].

In the last few years, the use of elitist schemes has also become common among MOEA researchers. Such schemes tend to consist of an external archive (normally called the "secondary population") that may interact in different ways with the main (or "primary") population of the MOEA. Besides storing the nondominated solutions found along the evolutionary process, secondary populations have also been used to improve the distribution of the solutions [35] and to regulate the selection pressure of a MOEA [65]. Alternatively, a few algorithms use a plus (+) selection mechanism, by which parents are combined with their offspring in a single population from which a subset of the "best" individuals is retained. The most popular of these algorithms is the Nondominated Sorting Genetic Algorithm-II (NSGA-II) [21].

1.4. Classifying MOEAs

There are several possible ways to classify MOEAs. The following taxonomy is perhaps the simplest and is based on the type of selection mechanism adopted:

• Aggregating Functions
• Population-based Approaches
• Pareto-based Approaches

We will briefly discuss each of them in the following subsections.

1.4.1. Aggregating Functions

Perhaps the most straightforward approach to dealing with multi-objective problems is to combine all the objectives into a single scalar value (e.g., by adding them together). These techniques are normally known as "aggregating functions", because they combine (or "aggregate") all the objectives of the problem into a single one. An example of this approach is a fitness function in which we aim to solve the following problem:

    $\min \sum_{i=1}^{k} w_i f_i(\vec{x})$    (4)

where $w_i \geq 0$ are the weighting coefficients representing the relative importance of the $k$ objective functions of our problem. It is usually assumed that

    $\sum_{i=1}^{k} w_i = 1$    (5)
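As a concrete illustration of equations (4)-(5), a linear weighted sum simply collapses the objective vector into one scalar fitness (a minimal sketch; the function name is ours):

```python
def weighted_sum(objectives, weights):
    """Scalarize a vector of objective values with nonnegative weights
    that sum to 1, as in equations (4)-(5)."""
    assert all(w >= 0 for w in weights)
    assert abs(sum(weights) - 1.0) < 1e-9  # equation (5)
    return sum(w * f for w, f in zip(weights, objectives))
```

For example, `weighted_sum([2.0, 4.0], [0.5, 0.5])` evaluates both objectives as equally important and yields the scalar 3.0, which a standard single-objective EA can then minimize directly.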
Aggregating functions may be linear (as in the previous example) or nonlinear [46, 59, 28]. Aggregating functions have been largely underestimated by MOEA researchers, mainly because of the well-known limitation of linear aggregating functions: they cannot generate non-convex portions of the Pareto front, regardless of the weight combination used [17]. Note, however, that nonlinear aggregating functions do not necessarily present such a limitation [12], and they have been quite successful in multi-objective combinatorial optimization [30].

1.4.2. Population-Based Approaches
In this type of approach, the population of an EA is used to diversify the search, but the concept of Pareto dominance is not directly incorporated into the selection process. The classical example of this sort of approach is the Vector Evaluated Genetic Algorithm (VEGA), proposed by Schaffer [49]. VEGA basically consists of a simple genetic algorithm with a modified selection mechanism. At each generation, a number of sub-populations are generated by performing proportional selection according to each objective function in turn. Thus, for a problem with k objectives, k sub-populations of size M/k each are generated (assuming a total population size of M). These sub-populations are then shuffled together to obtain a new population of size M, on which the genetic algorithm applies the crossover and mutation operators.

VEGA has several problems, of which the most serious is that its selection scheme is opposed to the concept of Pareto dominance. If, for example, an individual encodes a good compromise solution for all the objectives (i.e., a Pareto optimal solution) but is not the best in any of them, it will be discarded. Schaffer suggested some heuristics to deal with this problem: for example, a heuristic selection preference for nondominated individuals in each generation, to protect individuals that encode Pareto optimal solutions but are not the best in any single objective function. Also, crossbreeding among the "species" could be encouraged by adding mate selection heuristics instead of using the random mate selection of the traditional genetic algorithm. Nevertheless, the fact that Pareto dominance is not directly incorporated into the selection process remains the algorithm's main disadvantage.

One interesting aspect of VEGA is that, despite its drawbacks, it remains in current use by some researchers, mainly because it is appropriate for problems in which we want the selection process to be biased and in which we have to deal with a large number of objectives (e.g., when handling constraints as objectives in single-objective optimization [9], or when solving problems in which the objectives are conceptually identical [11]).
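Schaffer's selection step described above can be paraphrased in code. This is a simplified sketch, not Schaffer's implementation: for brevity we substitute truncation selection per objective for the proportional selection VEGA actually uses, and the names are ours:

```python
import random


def vega_select(population, objective_fns):
    """One VEGA-style selection step: for each of the k objectives, select the
    best M/k individuals according to that objective alone, then shuffle the
    sub-populations together into a new mating population of size ~M."""
    k = len(objective_fns)
    m = len(population) // k          # sub-population size M/k
    next_pop = []
    for f in objective_fns:
        # truncation selection on one objective (minimization)
        next_pop.extend(sorted(population, key=f)[:m])
    random.shuffle(next_pop)          # shuffle sub-populations together
    return next_pop
```

Note how an individual that is second-best on every objective can fail to enter any sub-population, which is exactly the compromise-solution weakness discussed above.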
1.4.3. Pareto-Based Approaches
Under this category we consider MOEAs that incorporate the concept of Pareto optimality into their selection mechanism. A wide variety of Pareto-based MOEAs have been proposed in the last few years, and it is not the intent of this section to provide a comprehensive survey of them, since such a review is available elsewhere [12]. Instead, this section provides a brief discussion of a relatively small set of Pareto-based MOEAs that are representative of the research being conducted in this area.

Goldberg's Pareto Ranking: Goldberg suggested moving the population toward PF_true by using a selection mechanism that favors solutions that are nondominated with respect to the current population [25]. He also suggested the use of fitness sharing and niching as a diversity maintenance mechanism [19].

Multi-Objective Genetic Algorithm (MOGA): Fonseca and Fleming [23] proposed a ranking approach different from Goldberg's scheme. In this case, each individual in the population is ranked based on how many other points dominate it. All the nondominated individuals in the population are assigned the same rank and obtain the same fitness, so that they all have the same probability of being selected. MOGA uses a niche-formation method to diversify the population, and a relatively simple methodology is proposed to compute the similarity threshold (called σ_share) that determines the radius of each niche.

The Nondominated Sorting Genetic Algorithm (NSGA): This method [53] is based on several layers of classification of the individuals, as suggested by Goldberg [25]. Before selection is performed, the population is ranked on the basis of nondomination: all nondominated individuals are classified into one category with a dummy fitness value, which is proportional to the population size, to provide an equal reproductive potential for these individuals. To maintain the diversity of the population, these classified individuals are shared with their dummy fitness values.
Then this group of classified individuals is ignored and another layer of nondominated individuals is considered. The process continues until all individuals in the population are classified. Stochastic remainder proportionate selection is adopted for this technique. Since individuals in the first front have the maximum fitness value, they always get more copies than the rest of the population. An offshoot of this approach, the NSGA-II [21], uses elitism and a crowded-comparison operator that ranks the population based on both Pareto dominance and region density. This crowded-comparison operator makes the NSGA-II considerably faster than its predecessor while producing very good results.

Niched Pareto Genetic Algorithm (NPGA): This method employs an interesting form of tournament selection called Pareto domination tournaments. Two members of the population are chosen at random, and each is compared to a subset of the population. If one is nondominated and the other is not, then the nondominated one is selected. If there is a tie (both are either dominated or nondominated), then fitness sharing decides the tourney results [28].

Strength Pareto Evolutionary Algorithm (SPEA): This method attempts to integrate ideas from different MOEAs [65]. The algorithm uses a "strength" value that is computed in a way similar to the MOGA ranking system. Each member of the population is assigned a fitness value according to the strengths of all nondominated solutions that dominate it. Diversity is maintained through the use of a clustering technique called the "average linkage method". A revision of this method, called SPEA2 [62], slightly adjusts the fitness strategy and uses nearest-neighbor techniques for clustering. In addition, archiving mechanism enhancements allow for the preservation of boundary solutions that are missed with SPEA.

Multi-Objective Messy Genetic Algorithm (MOMGA): This method extends the mGA [20] to solve multi-objective problems. The MOMGA [55] is an explicit building-block GA that produces all building blocks of a user-specified size. The algorithm has three phases: Initialization, Primordial, and Juxtapositional. The MOMGA-II was developed by Zydallis as an extension of the MOMGA [67], in order to expand the state of the art of explicit building-block MOEAs.
While there has been considerable research on single-objective explicit building-block EAs, this was a first attempt at using the concept for MOPs. Exponential growth of the population as the building-block size grows may be a disadvantage of this approach in some applications.
Multi-Objective Hierarchical Bayesian Optimization Algorithm (hBOA): This search technique is a conditional model builder. It expands the ideas of the compact genetic algorithm and the stud genetic algorithm. The hBOA defines a Bayesian model that represents "small" building blocks (BBs) reflecting genotypical epistasis using a hierarchical Bayesian network [45]. The mhBOA [31] is in essence a linkage-learning algorithm that extends the hBOA and attempts to define tight and loose linkages to building blocks in the chromosome over a Pareto front. In particular, this method uses a Bayesian network (a conditional probabilistic model) to guide the search toward a solution. A disadvantage of this algorithm is the time it takes to generate results for a relatively small number of linkages.

Pareto Archived Evolution Strategy (PAES): This method, formulated by Knowles and Corne [34], uses a (1+1) evolution strategy in which each parent generates one offspring through mutation. The method uses an archive of nondominated solutions to compare with individuals in the current population. For diversity, the algorithm overlays a grid on the search space and counts the number of solutions in each grid cell. A disadvantage of this method is its performance on disconnected Pareto fronts.

Micro-Genetic Algorithm for Multi-Objective Optimization: The micro-genetic algorithm was introduced by Coello Coello and Toscano Pulido [10] and, by definition, has a small population requiring a reinitialization technique. An initial random population flows into a population memory, which has two parts: a replaceable and a non-replaceable portion. The non-replaceable part provides population diversity. The replaceable portion, of course, changes at the end of each generation, where this population undergoes crossover and mutation. Using various elitist selection operators, the nondominated individuals compose the replaceable portion.
General Multi-Objective Program (GENMOP): This method is a parallel, real-valued MOEA initially used for bioremediation research [33]. The method archives all previous population members and ranks them; archived individuals with the highest ranks are used as a mating pool for the current generation. The method uses equivalence-class sharing for niching to allow for diversity in the mating pool. A disadvantage of this algorithm is the cost of Pareto ranking the archived individuals at each generation.
Other researchers have combined elements of these MOEAs to develop unique MOEAs for their specific problem domains, with excellent results.

1.5. MOEA Performance Measures

The use of performance measures (or metrics) allows a researcher or computational scientist to assess, in a quantitative way, the performance of an algorithm, and the MOEA field is no different. MOEA performance measures tend to focus on the phenotype (objective) domain when judging the accuracy of results. This differs from the practice of most operations researchers, who tend to use metrics in the genotype domain. But since there is an explicit mapping between the two domains, it matters little in which domain the metrics are defined [12, 57]. MOEA metrics can be used to measure final performance or to track the generational performance of the algorithm. The latter is important because it allows the researcher to manage the algorithm's convergence process during execution. This section presents a variety of MOEA metrics, though no attempt is made to be comprehensive. For a more detailed treatment of this topic, the interested reader should consult additional references [12, 60, 66].

Error Ratio (ER): This metric reports the fraction of vectors in PF_known that are not members of PF_true. It requires that the researcher know PF_true. The mathematical representation of this metric is shown in equation (6):

    $ER = \frac{\sum_{i=1}^{n} e_i}{n}$    (6)

where n is the number of vectors in PF_known, and e_i is 0 when the i-th vector is an element of PF_true, or 1 when it is not. Thus ER = 0 means that PF_known is the same as PF_true, while ER = 1 indicates that none of the points in PF_known are in PF_true.

Two Set Coverage (CS): This metric [60] compares the coverage of two competing sets, reporting the fraction of individuals in one set that are dominated by individuals of the other set. This metric does not require knowledge of PF_true. It is defined in equation (7):

    $CS(X', X'') \triangleq \frac{|\{a'' \in X'' ; \; \exists \, a' \in X' : a' \succeq a''\}|}{|X''|}$    (7)
where X′, X″ ⊆ X are two sets of phenotype decision vectors, and CS(X′, X″) is mapped to the interval [0, 1]. This means that CS(X′, X″) = 1 when X′ dominates or equals X″.

Generational Distance (GD): This metric was proposed by Van Veldhuizen and Lamont [56]. It reports how far, on average, PF_known is from PF_true, and it requires that the researcher know PF_true. It is mathematically defined in equation (8):

    $GD \triangleq \frac{\left( \sum_{i=1}^{n} d_i^{\,p} \right)^{1/p}}{n}$    (8)

where n is the number of vectors in PF_known, p = 2, and d_i is the Euclidean distance (in phenotype space) between each member and the closest member of PF_true. When GD = 0, PF_known = PF_true.

Hyperarea and Ratio (H, HR): These metrics, introduced by Zitzler and Thiele [64], define the area of coverage that PF_known has with respect to the objective space. For a two-objective MOEA, this equates to the summation of the areas of all rectangles bounded by the origin and (f_1(x), f_2(x)). Mathematically, this is described in equation (9):

    $H \triangleq \left\{ \bigcup_i a_i \;\middle|\; v_i \in PF_{known} \right\}$    (9)

where v_i is a nondominated vector in PF_known and a_i is the hyperarea calculated between the origin and vector v_i. If PF_known is not convex, however, the results can be misleading. It is also assumed in this model that the origin is (0, 0). The hyperarea ratio metric is defined in equation (10):

    $HR \triangleq \frac{H_1}{H_2}$    (10)

where H_1 is the hyperarea of PF_known and H_2 is the hyperarea of PF_true. This results in HR ≥ 1 for minimization problems and HR ≤ 1 for maximization problems. For either type of problem, PF_known = PF_true when HR = 1. This metric requires that the researcher know PF_true.
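Several of these accuracy metrics reduce to short computations once the fronts are represented as lists of objective vectors. A minimal sketch of the error ratio (equation (6)) and generational distance (equation (8)); the names `pf_known` and `pf_true` are ours, and minimization is assumed:

```python
import math


def error_ratio(pf_known, pf_true):
    """ER (eq. 6): fraction of PF_known vectors absent from PF_true."""
    true_set = set(map(tuple, pf_true))
    return sum(tuple(v) not in true_set for v in pf_known) / len(pf_known)


def generational_distance(pf_known, pf_true, p=2):
    """GD (eq. 8): p-norm of each known vector's Euclidean distance to its
    nearest true-front neighbour, divided by the number of known vectors."""
    def nearest(v):
        return min(math.dist(v, u) for u in pf_true)
    return sum(nearest(v) ** p for v in pf_known) ** (1 / p) / len(pf_known)
```

`math.dist` requires Python 3.8+. Both functions return 0 exactly when PF_known coincides with PF_true, matching the interpretations given above.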
Spacing (S): This metric was proposed by Schott [50]; it measures the distance variance of neighboring vectors in PF_known. Equations (11) and (12) define this metric:

    $S \triangleq \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} \left( \bar{d} - d_i \right)^2 }$    (11)

    $d_i = \min_j \left( |f_1^i(\vec{x}) - f_1^j(\vec{x})| + |f_2^i(\vec{x}) - f_2^j(\vec{x})| \right)$    (12)

where i, j = 1, ..., n, d̄ is the mean of all d_i, and n is the number of vectors in PF_known. When S = 0, all members are spaced evenly apart. This metric does not require the researcher to know PF_true.

Overall Nondominated Vector Generation Ratio (ONVGR): This metric takes the total number of nondominated vectors found during MOEA execution and divides it by the number of vectors in PF_true. It is defined in equation (13):

    $ONVGR \triangleq \frac{|PF_{known}|}{|PF_{true}|}$    (13)

When ONVGR = 1, this states only that the same number of points has been found in both PF_true and PF_known; it does not imply that PF_true = PF_known. This metric requires that the researcher know PF_true.

Progress Measure (RP): For single-objective EAs, Bäck [3] defines a metric that measures convergence velocity. This metric has been adapted to MOEAs [55], as reflected in equation (14):

    $RP \triangleq \ln \sqrt{ \frac{G_1}{G_T} }$    (14)

where G_1 is the generational distance for the first generation and G_T is the generational distance for generation T. Recall that generational distance, defined in equation (8), measures the average distance from PF_true to PF_known. This metric requires that the researcher know PF_true.
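Schott's spacing metric is equally direct to compute from a list of objective vectors. A sketch for the two-objective case of equation (12), with names of our choosing:

```python
import math


def spacing(pf_known):
    """S (eq. 11): sample standard deviation of the nearest-neighbour
    distances d_i, where each d_i uses the Manhattan distance of eq. (12)."""
    n = len(pf_known)
    d = [min(sum(abs(a - b) for a, b in zip(u, v))      # Manhattan distance
             for j, v in enumerate(pf_known) if j != i)  # nearest neighbour
         for i, u in enumerate(pf_known)]
    d_bar = sum(d) / n
    return math.sqrt(sum((d_bar - di) ** 2 for di in d) / (n - 1))
```

A perfectly evenly spaced front yields S = 0, while any unevenness produces a strictly positive value, matching the interpretation given above.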
Generational Nondominated Vector Generation (GNVG): This simple metric, introduced by Van Veldhuizen [55], lists the number of nondominated vectors produced at each generation. It is defined in equation (15):

    $GNVG \triangleq |PF_{current}(t)|$    (15)

This metric does not require that the researcher know PF_true.

Nondominated Vector Addition (NVA): This metric, also introduced by Van Veldhuizen [55], calculates the number of nondominated vectors gained or lost with respect to the previous PF_known generation. Equation (16) defines this metric:

    $NVA \triangleq |PF_{known}(t)| - |PF_{known}(t-1)|$    (16)

This metric can be misleading, however, when a new vector dominates two or more vectors from the previous generation. In addition, it may remain static over the course of several generations while new points are added that dominate others from the previous generation. This metric does not require that the researcher know PF_true.

As to which metrics are appropriate, it of course depends upon the MOEA application to the given MOP. Since in real-world applications the true Pareto front is unknown, relative metrics are usually selected. It is also worth observing that recent research has shed light on the limitations of unary metrics (i.e., performance measures that assign to each approximation of the Pareto optimal set a number reflecting a certain quality aspect) [66]. That study favors the use of binary metrics. As a consequence, it is expected that in the next few years MOEA researchers will eventually adopt binary metrics on a regular basis; today, however, the use of unary metrics (such as the error ratio and many of the others discussed in this section) is still common.

1.6. Design of MOEA Experiments

To conduct a thorough evaluation of the performance of any MOEA, a design of experiments, or methodology, should be common practice prior to testing and evaluating the search results. The main goal of MOEA research is the creation of an effective and efficient algorithm that renders
good solutions. But to achieve that goal, several smaller goals need to be addressed. These goals can be classified under two categories: effectiveness goals and efficiency goals. A report should list the effectiveness goals and the experimental design employed to validate that they are met, list the efficiency goals and the corresponding experimental design, and include a section on the computing environment for ease of repeatability.

Finding good solutions is the top priority for any MOEA application research; one therefore has to validate that the algorithm does indeed find good solutions. Benchmarks can also be used, and comparison with current MOP designs is appropriate for validation. Once a baseline set of runs is completed and analyzed, algorithm parameters can be tweaked to possibly improve effectiveness. The various application chapters in this text have attempted to adhere to such an experimental design and to follow the reporting techniques of the next section.

1.6.1. Reporting MOEA Computational Results
Before the advent of the "scientific method", many engineers and scientists merely used trial and error in an attempt to gain insight into a particular problem. The scientific method is the process by which engineers and scientists, collectively and over time, endeavor to construct an accurate (that is, reliable, consistent and non-arbitrary) representation of the world or of the problem they study. Recognizing that personal and cultural beliefs influence both our perception and our interpretation of natural phenomena, we aim, through the use of standard procedures and criteria, to minimize those influences when developing a conjecture, a theory or a qualification. In summary, the scientific method attempts to minimize the influence of bias or prejudice in the experimenter. Each application chapter reporting computational experiments attempts to follow this objective, since the chapters use computer-generated evidence to compare or rank competing MOEA software techniques and Pareto front solutions. Chapter authors consider various classical references that can direct computational experimentation [6, 15, 29]. According to Jackson et al. [29], the researcher should always keep in mind various elements identified under "What to Keep in Mind When Testing":

• Are the results presented statistically sufficient to justify the claims made?
• Is there sufficient detail to reproduce the results?
• When should a statistically-based experiment be done? Usually when a claim is made such as "this method is better (i.e., faster, more accurate, more efficient, easier to use, etc.)".
• Are the proper test problems being used?
• Are all relevant performance measures (efficiency, robustness, reliability, ease of use, etc.) addressed?
• Is enough information provided with respect to the architecture of the hardware being used?

The design of experiments should be organized. For example, one should discuss the input and output data, the identification of all parameters available during testing (for all tests the parameters are the same unless otherwise indicated), the random number generators and seeds, and other topics pertinent to the set of experiments. Following this general information, each individual experiment is presented with its objective and methodology identified. For each experiment, any parameter or environmental settings that differ from the generalized discussion are duly noted. Various statistical methods should be addressed, such as the mean, maximum, minimum, Student t-test, Kruskal-Wallis test, and others as appropriate for the computational experiment.

1.7. Layout of the Book

After presenting some basic concepts, terminology and a brief discussion of methodological aspects related to the use of MOEAs, we devote this last section to a brief discussion of each of the remaining chapters of this book. As indicated in the preface, these 29 chapters are divided into four application collections. The specific chapters that compose each of these parts are summarized in the following subsections. Note that many authors use specific MOEAs that are summarized in Section 1.4.
Also, observe that some of the metrics discussed in Section 1.5 are employed in statistical MOEA evaluation, using the experimental testing techniques of Section 1.6.

1.7.1. Part I: Engineering Applications
Considering that the use of MOEAs in engineering has been very extensive, this first part is the largest in the book; it comprises Chapters 2 to 13.
An Introduction to MOEAs and Their Applications
In Chapter 2, Ray adopts a scheme that handles objectives and constraints separately. Nondominance is used not only for selecting individuals, but also to handle constraints. The MOEA adopted in this work is the NSGA53 with elitism. The approach is applied to some engineering design problems (a welded beam, a bulk carrier and an airfoil). Farina and Di Barba apply in Chapter 3 several approaches to the design of industrial electromagnetic devices (the case studies consist of a magnetic reactor and an inductor for transverse-flux heating of a metal strip). The authors consider the use of the Non-dominated Sorting Evolutionary Strategy Algorithm (NSESA), a Pareto Gradient Based Algorithm (PGBA), a Pareto Evolution Strategy Algorithm (PESTRA), and a Multi Directional Evolution Strategy Algorithm (MDESTRA). In the end, they adopt hybrid approaches in which the NSESA is combined with both a deterministic and a local-global strategy. Reed and Devireddy use in Chapter 4 the NSGA-II21, enhanced with the ε-dominance archiving and automatic parameterization techniques38, to optimize groundwater monitoring networks. The authors indicate that the use of ε-dominance not only eliminated the empirical fine-tuning of the parameters of their MOEA, but also reduced the computational demands by more than 70% with respect to some of their previous work. In Chapter 5, Hernandez Luna and Coello Coello use a particle swarm optimizer with a population-based selection scheme (similar to that of VEGA49) to design combinational logic circuits. One of the relevant aspects of this work is that the problem to be solved is actually mono-objective. However, the use of a multi-objective selection scheme improves both the robustness and the quality of the results obtained. Furukawa et al. present in Chapter 6 the application of two MOEAs to sensor and vehicle parameter determination for successful autonomous vehicle navigation.
The MOEAs adopted are: (1) the Multi-objective Continuous Evolutionary Algorithm (MCEA) and (2) the Multi-Objective Gradient-based Method (MOGM). Due to space limitations, only the results produced by the MCEA are presented in the chapter, although the authors indicate that both MOEAs reach the same final results. It is worth noting the use of the so-called Center-of-Gravity Method (CoGM) to select a single solution from the Pareto optimal set produced by the MCEA. In Chapter 7, Tan and Li use a MOEA (a variation of the NSGA53) to design optimal unified linear time-invariant control (ULTIC) systems. The proposal consists of a methodology for performance-prioritized computer-aided control system design in which a MOEA toolbox previously designed by the authors is used as an optimization engine. An interesting aspect of this work is that the user is allowed to set his/her goals on-line (without having to restart the entire design cycle) and can visualize (in real time) the effect of such goal setting on the results. The proposed methodology is applied to a non-minimum phase plant control system. Gaspar-Cunha and Covas in Chapter 8 apply a MOEA to solve polymer extrusion problems. The authors optimize the performance of both single-screw and co-rotating twin-screw extruders. The MOEA adopted is called the Reduced Pareto Set Genetic Algorithm with Elitism (RPSGAe) and was previously proposed by the same authors24. An interesting aspect of this work is that the RPSGAe uses a clustering technique not to maintain diversity, as is normally done, but to reduce the number of Pareto optimal solutions. The problems solved are formulated as multi-objective traveling salesperson problems (i.e., they are actually dealing with multi-objective combinatorial optimization problems). In Chapter 9, Hernandez Aguirre and Botello Rionda propose an extension of the Pareto Archived Evolution Strategy (PAES)36 which is able to deal with both single-objective and multi-objective optimization problems. The proposed approach is called the Inverted and Shrinkable Pareto Archived Evolutionary Strategy (ISPAES), and it is used to solve several truss optimization problems (a common problem in structural and mechanical engineering). The main differences between ISPAES and PAES lie in the selection mechanism and the implementation of the adaptive grid. The test problems adopted include both single- and multiple-objective problems, as well as discrete and continuous search spaces. Balling presents in Chapter 10 an interesting application of MOEAs to city and regional planning.
The MOEA adopted uses the maximin fitness function previously proposed by the author5. The approach has been applied to plan the Wasatch Front Metropolitan Region in Utah (USA). An interesting aspect of this work is the discussion presented by the author regarding the reluctance of the authorities to actually implement some of the plans produced by the MOEA. The author attributes this reluctance both to the high number of (nondominated) plans produced (no scheme to incorporate the user's preferences8 was adopted by Balling) and to the psychological impact that this sort of (radically new) approach has on people. Jozefowiez et al. present in Chapter 11 a MOEA to solve the bi-objective
covering tour problem. The MOEA adopted is the NSGA-II21, and the results are compared with respect to an exact algorithm based on a branch-and-bound approach, which can be applied only to relatively small instances of the problem. The chapter also presents a thorough review of multi-objective routing problems reported in the specialized literature. Chapter 12, by Künzli et al., presents a benchmark problem in computer engineering (the design space exploration of packet processor architectures). Besides describing several details related to the proposed benchmark problem, the authors also describe the text-based interface developed by them, which is platform and programming language independent. This aims to facilitate the use of different MOEAs (across different platforms) to solve such a problem. In the last chapter of the first part (Chapter 13), Obayashi and Sasaki present the use of a MOEA for the aerodynamic design of supersonic wings. The MOEA adopted is the Adaptive Range Multiobjective Genetic Algorithm (ARMOGA), which is based on an approach originally developed by Arakawa and Hagiwara2. The multi-objective extensions are based on MOGA23. An interesting aspect of this work is the use of Self-Organizing Maps (SOMs) both to visualize trade-offs among the objectives of the problem and to perform some sort of data mining of the designs produced.

1.7.2. Part II: Scientific Applications
The second part of the book, which focuses on scientific applications of MOEAs, includes Chapters 14 to 19. In Chapter 14, Ray presents the use of a MOEA to optimize gas-solid separation devices used for particulate removal from air (namely, the design of cyclone separators and venturi scrubbers). The author used the NSGA53, mainly because of her previous experience with this algorithm. Mancini et al. present in Chapter 15 the use of a MOEA for an application in physics: the spectroscopic data analysis of inertial confinement fusion implosion cores, based on the self-consistent analysis of simultaneous narrow-band X-ray images and X-ray line spectra. The MOEA adopted is the Niched-Pareto Genetic Algorithm (NPGA)28. Chapter 16, by Lahanas, presents a survey of the use of MOEAs in medicine. The types of problems considered include medical image processing, computer-aided diagnosis, treatment planning, and data mining. In Chapter 17, Kumar describes the use of a MOEA in the solution of high-dimensional and complex domains of machine learning. The MOEA
is used as a pre-processor for partitioning these complex learning tasks into simpler domains that can then be solved using traditional machine learning approaches. The MOEA adopted is the Pareto Converging Genetic Algorithm (PCGA), which was proposed by the author37. Romero Zaliz et al. describe in Chapter 18 an approach for identifying interesting qualitative features in biological sequences. The approach is called Generalized Analysis of Promoters (GAP) and is based on the use of generalized clustering techniques, where the features being sought correspond to the solutions of a multi-objective optimization problem. A MOEA is then used to identify multiple promoter occurrences within genomic regulatory regions. The MOEA adopted is a Multi-Objective Scatter Search (MOSS) algorithm. Lamont et al. present in Chapter 19 an application of the multi-objective messy genetic algorithm-II (MOMGA-II) to two NP-complete problems: the multi-objective Quadratic Assignment Problem (mQAP) and the Modified Multi-objective Knapsack Problem (MMOKP).

1.7.3. Part III: Industrial Applications

The third part of the book, which focuses on real-world industrial applications of MOEAs, includes Chapters 20 to 24. In Chapter 20, Andersson uses a MOEA to design fluid power systems. The MOEA adopted is called the multi-objective struggle genetic algorithm (MOSGA) and was proposed by the same author1. The approach is further extended so that it can deal with mixed-variable design problems (i.e., with both continuous and discrete variables). Mansouri presents in Chapter 21 the application of a MOEA in cellular manufacturing systems. The problem tackled consists of deciding which parts to subcontract and which machines to duplicate in a cellular manufacturing system in which some exceptional elements exist. The MOEA adopted is the NSGA53. Chapter 22, by Ishibuchi and Shibata, presents the solution of flowshop scheduling problems (both single- and multi-objective) using genetic algorithms. The multi-objective instances are solved using the NSGA-II21. The authors recommend the use of mating restrictions and hybridization with local search in order to improve the performance of the MOEA adopted. Gandibleux et al. deal in Chapter 23 with multi-objective combinatorial optimization problems. The approach adopted in this case is peculiar, since it is a population-based heuristic that uses three operators: crossover, path-relinking, and local search on elite solutions. However, this approach differs from a MOEA in two main aspects: (1) it does not use Pareto ranking, and (2) it performs no directional searches to drive the approximation process. The authors apply their approach to the bi-objective assignment problem and to the bi-objective knapsack problem. In Chapter 24, Watanabe and Hiroyasu apply a MOEA to the solution of the multi-objective rectangular packing problem, which is a discrete combinatorial optimization problem that arises in many applications (e.g., truck packing and floor planning, among others). The MOEA adopted is the Neighborhood Cultivation Genetic Algorithm (NCGA), which was proposed by the authors58.

1.7.4. Part IV: Miscellaneous Applications
The fourth and last part of the book deals with miscellaneous applications of MOEAs in a variety of domains; it includes Chapters 25 to 30. Pappa et al. present in Chapter 25 the use of MOEAs to select attributes in data mining. The authors use two approaches that they previously proposed: (1) an elitist multi-objective genetic algorithm (which uses Pareto dominance) in which all the nondominated solutions found pass unaltered to the next generation42, and (2) a multi-objective forward sequential selection method43. In Chapter 26, Schlottmann and Seese present a fairly detailed survey of the use of MOEAs in portfolio management problems. The authors emphasize the importance of incorporating problem-specific knowledge into a MOEA so as to improve its performance in such financial applications. The authors also identify some other potential applications of MOEAs in finance. Chapter 27, by Jin et al., describes the application of a MOEA to the evolution of both the weights and the structure of neural networks used for regression and prediction. The MOEA adopted is the NSGA-II21, expanded with Lamarckian inheritance. The authors report that the MOEA successfully generates diverse neural network ensemble members, which significantly improves the regression accuracy, particularly in cases in which a single network is not able to predict reliably. In Chapter 28, Fieldsend and Singh use a MOEA to train neural networks used for time series forecasting. The MOEA adopted is a variation of PAES36. The most interesting aspect of this work is that the use of a multi-objective approach allows the user to get a good representation of the
complexity/accuracy trade-off of the problem being solved. This may lead to the selection of neural networks with very low complexity. Chapter 29, by Ducheyne et al., presents the application of MOEAs to forest management problems (particularly forest scheduling problems). Two MOEAs are studied by the authors: MOGA23 and the NSGA-II21. An interesting aspect of this work is the use of fitness inheritance52 to speed up the optimization process. Finally, in Chapter 30, Landa Silva and Burke propose the use of diversity measures to guide a MOEA's search. Such an approach is used to solve space allocation problems arising in academic institutions. The MOEA adopted is called the Population-based Hybrid Annealing Algorithm and was previously proposed by the same authors. In this approach, each individual is evolved by means of local search and a specialized mutation operator. This MOEA combines concepts from simulated annealing, tabu search, evolutionary algorithms and hill climbing.
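Nearly every chapter summarized above reports a set of nondominated solutions. As a generic point of reference (not any specific chapter's algorithm), Pareto dominance and a brute-force nondominated filter for minimization problems can be sketched as follows; the objective values are invented for illustration:

```python
# Generic sketch: Pareto dominance (minimization) and an O(n^2) nondominated
# filter, the core operation underlying the MOEAs surveyed in this chapter.

def dominates(a, b):
    """a Pareto-dominates b: no worse in every objective, strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated(points):
    """Return the points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# Invented bi-objective values (e.g., cost vs. weight of candidate designs).
front = nondominated([(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)])
print(front)  # the approximated trade-off (Pareto) set
```

Production MOEAs replace this quadratic filter with faster nondominated sorting, but the dominance relation itself is exactly the one shown.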
1.8. General Comments

As seen in the preceding presentation, this book includes a wide variety of applications of MOEAs. Nevertheless, if we consider the significant growth in the number of publications related to MOEAs over the last few years, it is likely that we will see more novel applications in the near future. In fact, there are still several areas in which applications of MOEAs are rare (e.g., computer vision, operating systems, compiler design, computer architecture, and business activities, among others). The application of MOEAs to increasingly challenging problems is triggering more research on MOEA algorithmic design, as well as influencing developmental trends. For example, the hybridization of MOEAs with other mechanisms (e.g., local search) may become standard practice in complex MOP application domains. This volume constitutes an initial attempt to collect a representative sample of contemporary MOEA applications, thus providing insight into their efficient and effective use. Of course, it is expected that more and more specialized monographs and textbooks will include the use of MOEAs in diverse problem domains, given the expanding understanding and utility of MOEA concepts in solving complex high-dimensional MOPs.
References

1. Johan Andersson. Multiobjective Optimization in Engineering Design - Applications to Fluid Power Systems. PhD thesis, Division of Fluid and Mechanical Engineering Systems, Department of Mechanical Engineering, Linköping University, Linköping, Sweden, 2001.
2. Masao Arakawa and Ichiro Hagiwara. Development of Adaptive Real Range (ARRange) Genetic Algorithms. JSME International Journal, Series C, 41(4):969-977, 1998.
3. Thomas Bäck. Evolutionary Algorithms in Theory and Practice. Oxford University Press, New York, 1996.
4. Thomas Bäck, David B. Fogel, and Zbigniew Michalewicz, editors. Handbook of Evolutionary Computation. Institute of Physics Publishing and Oxford University Press, New York, 1997.
5. Richard Balling. The Maximin Fitness Function; Multiobjective City and Regional Planning. In Carlos M. Fonseca, Peter J. Fleming, Eckart Zitzler, Kalyanmoy Deb, and Lothar Thiele, editors, Evolutionary Multi-Criterion Optimization. Second International Conference, EMO 2003, pages 1-15, Faro, Portugal, April 2003. Springer. Lecture Notes in Computer Science Vol. 2632.
6. Richard S. Barr, Bruce L. Golden, James P. Kelly, Mauricio G. C. Resende, and William R. Stewart, Jr. Designing and Reporting on Computational Experiments with Heuristic Methods. Journal of Heuristics, 1:9-32, 1995.
7. Carlos A. Coello Coello. A Comprehensive Survey of Evolutionary-Based Multiobjective Optimization Techniques. Knowledge and Information Systems. An International Journal, 1(3):269-308, August 1999.
8. Carlos A. Coello Coello. Handling Preferences in Evolutionary Multiobjective Optimization: A Survey. In 2000 Congress on Evolutionary Computation, volume 1, pages 30-37, Piscataway, New Jersey, July 2000. IEEE Service Center.
9. Carlos A. Coello Coello. Treating Constraints as Objectives for Single-Objective Evolutionary Optimization. Engineering Optimization, 32(3):275-308, 2000.
10. Carlos A. Coello Coello. A Short Tutorial on Evolutionary Multiobjective Optimization. In Eckart Zitzler, Kalyanmoy Deb, Lothar Thiele, Carlos A. Coello Coello, and David Corne, editors, First International Conference on Evolutionary Multi-Criterion Optimization, pages 21-40. Springer-Verlag. Lecture Notes in Computer Science No. 1993, 2001.
11. Carlos A. Coello Coello and Arturo Hernandez Aguirre. Design of Combinational Logic Circuits through an Evolutionary Multiobjective Optimization Approach. Artificial Intelligence for Engineering, Design, Analysis and Manufacture, 16(1):39-53, January 2002.
12. Carlos A. Coello Coello, David A. Van Veldhuizen, and Gary B. Lamont. Evolutionary Algorithms for Solving Multi-Objective Problems. Kluwer Academic Publishers, New York, May 2002.
13. David W. Corne, Nick R. Jerram, Joshua D. Knowles, and Martin J. Oates.
24
14.
15. 16.
17. 18. 19.
20. 21. 22. 23.
24.
Carlos A. Coello Coello and Gary B. Lamont
PESA-II: Region-based Selection in Evolutionary Multiobjective Optimization. In Lee Spector, Erik D. Goodman, Annie Wu, W.B. Langdon, HansMichael Voigt, Mitsuo Gen, Sandip Sen, Marco Dorigo, Shahram Pezeshk, Max H. Garzon, and Edmund Burke, editors, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'2001), pages 283-290, San Francisco, California, 2001. Morgan Kaufmann Publishers. David W. Corne, Joshua D. Knowles, and Martin J. Oates. The Pareto Envelope-based Selection Algorithm for Multiobjective Optimization. In Marc Schoenauer, Kalyanmoy Deb, Giinter Rudolph, Xin Yao, Evelyne Lutton, Juan Julian Merelo, and Hans-Paul Schwefel, editors, Proceedings of the Parallel Problem Solving from Nature VI Conference, pages 839-848, Paris, France, 2000. Springer. Lecture Notes in Computer Science No. 1917. H. Crowder, R. S. Demo, and J. H. Mulvey. On Reporting Computational Experiments with Mathematical Software. ACM Transactions on Mathematical Software, 5(2):193-203, June 1979. Xunxue Cui, Miao Li, and Tingjian Fang. Study of Population Diversity of Multiobjective Evolutionary Algorithm Based on Immune and Entropy Principles. In Proceedings of the Congress on Evolutionary Computation 2001 (CEC'2001), volume 2, pages 1316-1321, Piscataway, New Jersey, May 2001. IEEE Service Center. Indraneel Das and John Dennis. A Closer Look at Drawbacks of Minimizing Weighted Sums of Objectives for Pareto Set Generation in Multicriteria Optimization Problems. Structural Optimization, 14(l):63-69, 1997. Kalyanmoy Deb. Multi-Objective Optimization Using Evolutionary Algorithms. John Wiley & Sons, Chichester, UK, 2001. Kalyanmoy Deb and David E. Goldberg. An Investigation of Niche and Species Formation in Genetic Function Optimization. In J. David Schaffer, editor, Proceedings of the Third International Conference on Genetic Algorithms, pages 42-50, San Mateo, California, June 1989. George Mason University, Morgan Kaufmann Publishers. Kalyanmoy Deb and David E. 
Goldberg, mga in C: A Messy Genetic Algorithm in C. Technical Report 91008, Illinios Genetic Algorithms Laboratory (IlliGAL), September 1991. Kalyanmoy Deb, Amrit Pratap, Sameer Agarwal, and T. Meyarivan. A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2): 182-197, April 2002. David B. Fogel. Evolutionary Computation. Toward a New Philosophy of Machine Intelligence. The Institute of Electrical and Electronic Engineers, New York, 1995. Carlos M. Fonseca and Peter J. Fleming. Genetic Algorithms for Multiobjective Optimization: Formulation, Discussion and Generalization. In Stephanie Forrest, editor, Proceedings of the Fifth International Conference on Genetic Algorithms, pages 416-423, San Mateo, California, 1993. University of Illinois at Urbana-Champaign, Morgan Kauffman Publishers. Antonio Gaspar-Cunha and Jose A. Covas. RPSGAe-Reduced Pareto Set Genetic Algorithm: Application to Polymer Extrusion. In Xavier Gandibleux,
Marc Sevaux, Kenneth Sörensen, and Vincent T'kindt, editors, Metaheuristics for Multiobjective Optimisation, pages 221-249, Berlin, 2004. Springer. Lecture Notes in Economics and Mathematical Systems Vol. 535.
25. David E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley Publishing Company, Reading, Massachusetts, 1989.
26. P. Hajela and C. Y. Lin. Genetic search strategies in multicriterion optimal design. Structural Optimization, 4:99-107, 1992.
27. Jeffrey Horn. The Nature of Niching: Genetic Algorithms and the Evolution of Optimal, Cooperative Populations. PhD thesis, University of Illinois at Urbana-Champaign, Urbana, Illinois, 1997.
28. Jeffrey Horn, Nicholas Nafpliotis, and David E. Goldberg. A Niched Pareto Genetic Algorithm for Multiobjective Optimization. In Proceedings of the First IEEE Conference on Evolutionary Computation, IEEE World Congress on Computational Intelligence, volume 1, pages 82-87, Piscataway, New Jersey, June 1994. IEEE Service Center.
29. Richard H. F. Jackson, Paul T. Boggs, Stephen G. Nash, and Susan Powell. Guidelines for Reporting Results of Computational Experiments - Report of the Ad Hoc Committee. Mathematical Programming, 49:413-425, 1991.
30. Andrzej Jaszkiewicz. On the performance of multiple-objective genetic local search on the 0/1 knapsack problem - a comparative experiment. IEEE Transactions on Evolutionary Computation, 6(4):402-412, August 2002.
31. Nazan Khan. Bayesian optimization algorithms for multiobjective and hierarchically difficult problems. Master's thesis, University of Illinois at Urbana-Champaign, Urbana, IL, July 2003.
32. Hajime Kita, Yasuyuki Yabumoto, Naoki Mori, and Yoshikazu Nishikawa. Multi-Objective Optimization by Means of the Thermodynamical Genetic Algorithm. In Hans-Michael Voigt, Werner Ebeling, Ingo Rechenberg, and Hans-Paul Schwefel, editors, Parallel Problem Solving from Nature - PPSN IV, Lecture Notes in Computer Science, pages 504-512, Berlin, Germany, September 1996. Springer-Verlag.
33. Mark R. Knarr, Mark N. Goltz, Gary B. Lamont, and Junqi Huang. In Situ Bioremediation of Perchlorate-Contaminated Groundwater using a Multi-Objective Parallel Evolutionary Algorithm. In Congress on Evolutionary Computation (CEC'2003), volume 1, pages 1604-1611, Piscataway, New Jersey, December 2003. IEEE Service Center.
34. Joshua Knowles and David Corne. M-PAES: A Memetic Algorithm for Multiobjective Optimization. In 2000 Congress on Evolutionary Computation, volume 1, pages 325-332, Piscataway, New Jersey, July 2000. IEEE Service Center.
35. Joshua Knowles and David Corne. Properties of an Adaptive Archiving Algorithm for Storing Nondominated Vectors. IEEE Transactions on Evolutionary Computation, 7(2):100-116, April 2003.
36. Joshua D. Knowles and David W. Corne. Approximating the Nondominated Front Using the Pareto Archived Evolution Strategy. Evolutionary Computation, 8(2):149-172, 2000.
37. Rajeev Kumar and Peter Rockett. Improved Sampling of the Pareto-Front in Multiobjective Genetic Optimizations by Steady-State Evolution: A Pareto Converging Genetic Algorithm. Evolutionary Computation, 10(3):283-314, Fall 2002.
38. Marco Laumanns, Lothar Thiele, Kalyanmoy Deb, and Eckart Zitzler. Combining Convergence and Diversity in Evolutionary Multi-objective Optimization. Evolutionary Computation, 10(3):263-282, Fall 2002.
39. Samir W. Mahfoud. Niching Methods for Genetic Algorithms. PhD thesis, University of Illinois at Urbana-Champaign, Department of General Engineering, Urbana, Illinois, May 1995.
40. Kaisa M. Miettinen. Nonlinear Multiobjective Optimization. Kluwer Academic Publishers, Boston, Massachusetts, 1999.
41. Tadahiko Murata, Hisao Ishibuchi, and Mitsuo Gen. Specification of Genetic Search Directions in Cellular Multi-objective Genetic Algorithms. In Eckart Zitzler, Kalyanmoy Deb, Lothar Thiele, Carlos A. Coello Coello, and David Corne, editors, First International Conference on Evolutionary Multi-Criterion Optimization, pages 82-95. Springer-Verlag. Lecture Notes in Computer Science No. 1993, 2001.
42. Gisele L. Pappa, Alex A. Freitas, and Celso A. A. Kaestner. Attribute Selection with a Multiobjective Genetic Algorithm. In G. Bittencourt and G. L. Ramalho, editors, Proceedings of the 16th Brazilian Symposium on Artificial Intelligence (SBIA-2002), pages 280-290. Springer-Verlag. Lecture Notes in Artificial Intelligence Vol. 2507, 2002.
43. Gisele L. Pappa, Alex A. Freitas, and Celso A. A. Kaestner. A Multiobjective Genetic Algorithm for Attribute Selection. In Proceedings of the 4th International Conference on Recent Advances in Soft Computing (RASC-2002), pages 116-121, Nottingham, UK, December 2002. Nottingham Trent University.
44. Ian C. Parmee. Evolutionary and Adaptive Computing in Engineering Design. Springer, London, 2001.
45. Martin Pelikan and David E. Goldberg. Hierarchical problem solving and the Bayesian optimization algorithm. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2000), pages 267-274, 2000.
46. Jacques Periaux, Mourad Sefrioui, and Bertrand Mantel. GA Multiple Objective Optimization Strategies for Electromagnetic Backscattering. In D. Quagliarella, J. Periaux, C. Poloni, and G. Winter, editors, Genetic Algorithms and Evolution Strategies in Engineering and Computer Science. Recent Advances and Industrial Applications, chapter 11, pages 225-243. John Wiley and Sons, West Sussex, England, 1997.
47. R. S. Rosenberg. Simulation of genetic populations with biochemical properties. PhD thesis, University of Michigan, Ann Arbor, Michigan, 1967.
48. J. David Schaffer. Multiple Objective Optimization with Vector Evaluated Genetic Algorithms. PhD thesis, Vanderbilt University, 1984.
49. J. David Schaffer. Multiple Objective Optimization with Vector Evaluated Genetic Algorithms. In Genetic Algorithms and their Applications: Proceedings of the First International Conference on Genetic Algorithms, pages 93-100. Lawrence Erlbaum, 1985.
50. Jason R. Schott. Fault Tolerant Design Using Single and Multicriteria Genetic Algorithm Optimization. Master's thesis, Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, Cambridge, Massachusetts, May 1995.
51. K. J. Shaw and P. J. Fleming. Initial Study of Practical Multi-Objective Genetic Algorithms for Scheduling the Production of Chilled Ready Meals. In Proceedings of Mendel'96, the 2nd International Mendel Conference on Genetic Algorithms, Brno, Czech Republic, September 1996.
52. Robert E. Smith, Bruce A. Dike, and S. A. Stegmann. Fitness Inheritance in Genetic Algorithms. In Proceedings of the 1995 ACM Symposium on Applied Computing, pages 345-350, Nashville, Tennessee, USA, February 1995. ACM.
53. N. Srinivas and Kalyanmoy Deb. Multiobjective Optimization Using Nondominated Sorting in Genetic Algorithms. Evolutionary Computation, 2(3):221-248, Fall 1994.
54. Gregorio Toscano Pulido and Carlos A. Coello Coello. Using Clustering Techniques to Improve the Performance of a Particle Swarm Optimizer. In Kalyanmoy Deb et al., editors, Genetic and Evolutionary Computation - GECCO 2004. Proceedings of the Genetic and Evolutionary Computation Conference, pages 225-237, Seattle, Washington, USA, June 2004. Springer-Verlag, Lecture Notes in Computer Science Vol. 3102.
55. David A. Van Veldhuizen. Multiobjective Evolutionary Algorithms: Classifications, Analyses, and New Innovations. PhD thesis, Department of Electrical and Computer Engineering, Graduate School of Engineering, Air Force Institute of Technology, Wright-Patterson AFB, Ohio, May 1999.
56. David A. Van Veldhuizen and Gary B. Lamont. Evolutionary Computation and Convergence to a Pareto Front. In John R. Koza, editor, Late Breaking Papers at the Genetic Programming 1998 Conference, pages 221-228, Stanford University, California, July 1998. Stanford University Bookstore.
57. David A. Van Veldhuizen and Gary B. Lamont.
On Measuring Multiobjective Evolutionary Algorithm Performance. In 2000 Congress on Evolutionary Computation, volume 1, pages 204-211, Piscataway, New Jersey, July 2000. IEEE Service Center.
58. Shinya Watanabe, Tomoyuki Hiroyasu, and Mitsunori Miki. Neighborhood Cultivation Genetic Algorithm for Multi-Objective Optimization Problems. In Lipo Wang, Kay Chen Tan, Takeshi Furuhashi, Jong-Hwan Kim, and Xin Yao, editors, Proceedings of the 4th Asia-Pacific Conference on Simulated Evolution and Learning (SEAL'02), volume 1, pages 198-202, Orchid Country Club, Singapore, November 2002. Nanyang Technological University.
59. R. S. Zebulum, M. A. Pacheco, and M. Vellasco. A multi-objective optimisation methodology applied to the synthesis of low-power operational amplifiers. In Ivan Jorge Cheuri and Carlos Alberto dos Reis Filho, editors, Proceedings of the XIII International Conference in Microelectronics and Packaging, volume 1, pages 264-271, Curitiba, Brazil, August 1998.
60. Eckart Zitzler, Kalyanmoy Deb, and Lothar Thiele. Comparison of Multiobjective Evolutionary Algorithms: Empirical Results. Evolutionary Computation, 8(2):173-195, Summer 2000.
61. Eckart Zitzler, Marco Laumanns, and Stefan Bleuler. A Tutorial on Evolutionary Multiobjective Optimization. In Xavier Gandibleux, Marc Sevaux, Kenneth Sörensen, and Vincent T'kindt, editors, Metaheuristics for Multiobjective Optimisation, pages 3-37, Berlin, 2004. Springer. Lecture Notes in Economics and Mathematical Systems Vol. 535.
62. Eckart Zitzler, Marco Laumanns, and Lothar Thiele. SPEA2: Improving the Strength Pareto Evolutionary Algorithm. In K. Giannakoglou, D. Tsahalis, J. Periaux, P. Papailou, and T. Fogarty, editors, EUROGEN 2001. Evolutionary Methods for Design, Optimization and Control with Applications to Industrial Problems, Athens, Greece, September 2001.
63. Eckart Zitzler and Lothar Thiele. An Evolutionary Algorithm for Multiobjective Optimization: The Strength Pareto Approach. Technical Report 43, Computer Engineering and Communication Networks Lab (TIK), Swiss Federal Institute of Technology (ETH), Zurich, Switzerland, May 1998.
64. Eckart Zitzler and Lothar Thiele. Multiobjective Optimization Using Evolutionary Algorithms - A Comparative Study. In A. E. Eiben, editor, Parallel Problem Solving from Nature V, pages 292-301, Amsterdam, September 1998. Springer-Verlag.
65. Eckart Zitzler and Lothar Thiele. Multiobjective Evolutionary Algorithms: A Comparative Case Study and the Strength Pareto Approach. IEEE Transactions on Evolutionary Computation, 3(4):257-271, November 1999.
66. Eckart Zitzler, Lothar Thiele, Marco Laumanns, Carlos M. Fonseca, and Viviane Grunert da Fonseca. Performance Assessment of Multiobjective Optimizers: An Analysis and Review. IEEE Transactions on Evolutionary Computation, 7(2):117-132, April 2003.
67. Jesse Zydallis. Explicit Building-Block Multiobjective Genetic Algorithms: Theory, Analysis, and Development. PhD thesis, Air Force Institute of Technology, Wright-Patterson AFB, OH, March 2003.
68. Jesse B. Zydallis, David A. Van Veldhuizen, and Gary B. Lamont.
A Statistical Comparison of Multiobjective Evolutionary Algorithms Including the MOMGA-II. In Eckart Zitzler, Kalyanmoy Deb, Lothar Thiele, Carlos A. Coello Coello, and David Corne, editors, First International Conference on Evolutionary Multi-Criterion Optimization, pages 226-240. Springer-Verlag. Lecture Notes in Computer Science No. 1993, 2001.
CHAPTER 2

APPLICATIONS OF MULTI-OBJECTIVE EVOLUTIONARY ALGORITHMS IN ENGINEERING DESIGN
Tapabrata Ray
Temasek Laboratories, National University of Singapore
5 Sports Drive 2, Singapore 117508
E-mail: [email protected]

Engineering design is a multidisciplinary and multifaceted activity that requires simultaneous consideration of various design requirements and resource constraints. Such problems are inherently multiobjective in nature and involve highly nonlinear objectives and constraints, often with functional and slope discontinuities that limit the effective use of gradient-based optimization methods. Furthermore, in the absence of preference information among the objectives, the goal in a multi-objective optimization problem is to arrive at a set of Pareto optimal designs. Evolutionary algorithms are particularly attractive for such problems as they are essentially stochastic, zero-order methods that maintain a set of solutions as a population and improve them over generations. For an optimization algorithm to be an effective design tool, it should be computationally efficient and easy to use, with a minimal number of user inputs. A number of engineering design optimization examples are presented here and solved using a multi-objective evolutionary algorithm. The examples clearly demonstrate the benefits offered by multi-objective optimization and highlight the key features of the evolutionary algorithm.

2.1. Introduction

Real-life problems in design optimization involve maximization or minimization of multiple objectives, most of which are often in conflict. Unlike a single-objective optimization problem, where the aim is to find the best solution (which is often unique), the aim in a multiple-objective optimization problem is to arrive at a set of Pareto optimal designs. A design x* ∈ F is termed Pareto optimal if there does not exist another x ∈ F such that
f_i(x) ≤ f_i(x*) for all i = 1, ..., k objectives and f_j(x) < f_j(x*) for at least one j. Here, F denotes the feasible space (i.e., the region where the constraints are satisfied) and f_j(x) denotes the jth objective corresponding to the design x. If the design space is limited to M solutions instead of the entire F, the resulting set of solutions is termed nondominated. Since in practice all the solutions in F cannot be evaluated exhaustively, the goal of multiobjective optimization is to arrive at the set of nondominated solutions with the hope that it is sufficiently close to the set of Pareto solutions. Diversity among this set of solutions is also a desirable feature, as it allows a selection from a wider set of design alternatives. Classical gradient-based methods are not efficient for multiobjective problems, as they often lead to a single solution instead of a set of nondominated solutions, and multiple runs cannot guarantee reaching a different nondominated solution each time. Population-based methods like evolutionary algorithms are particularly attractive for such classes of problems, as they maintain a set of solutions as a population and improve them over time. Thus, they are capable of arriving at the set of nondominated solutions in a single run. The objective and constraint functions of a design optimization problem are typically highly nonlinear and computationally expensive to evaluate. These functions often possess functional and slope discontinuities that limit the efficient use of classical gradient-based optimization methods. Zero-order methods like evolutionary algorithms and their variants have an edge over gradient-based approaches in this respect. Evolutionary algorithms require the evaluation of numerous designs at every generation, and hence the total computational time required to solve a design optimization problem is usually high.
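The dominance relation defined above can be sketched in a few lines. This is an illustrative sketch rather than code from the chapter; `dominates` and `nondominated` are hypothetical helper names, and minimization of all objectives is assumed:

```python
def dominates(fa, fb):
    """True if objective vector fa Pareto-dominates fb (minimization):
    fa is no worse in every objective and strictly better in at least one."""
    return all(a <= b for a, b in zip(fa, fb)) and any(a < b for a, b in zip(fa, fb))

def nondominated(objectives):
    """Indices of solutions not dominated by any other member of the set."""
    return [i for i, fi in enumerate(objectives)
            if not any(dominates(fj, fi) for j, fj in enumerate(objectives) if j != i)]
```

For example, among the vectors (1, 2), (2, 1), (2, 2), and (3, 3), only the first two are mutually nondominated; the last two are each dominated.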
This is a cause for concern in real-life applications, and multiple processors and novel learning schemes are typically employed to contain the total computational time within affordable limits. Another important feature of a design optimization problem is the presence of a large number of constraints. Such constraints typically arise out of designers' preferences, resource limitations, physical laws, and performance or statutory requirements. The presence of constraints is known to significantly affect the performance of all optimization algorithms, including evolutionary search methods. There have been a number of approaches to handling constraints in the domain of mathematical programming, including penalty functions and their variants, repair methods, the use of decoders, separate treatment of constraints and objectives, and hybrid methods incorporating constraint satisfaction methods. An excellent description of various
constraint handling schemes appears in Michalewicz and Schoenauer1 and Coello2. An ideal constraint handling scheme should not require additional user inputs and should preferably avoid scaling and aggregation of constraint violations, while at the same time making the best use of all computed information. The variables of a design optimization problem usually have an underlying physical significance, and hence their range of variation can be decided a priori. The total number of variables in a design problem is usually large, and the variables may assume continuous, integer, or discrete values. This means that an optimization algorithm for engineering design should be able to deal with mixed variables, and its performance should not degrade greatly with an increase in problem size. Various aspects of engineering design optimization problems have been discussed by Rao3 and Deb4. Both of these texts focus more on modeling the problem and subsequently discuss gradient-based methods for its solution. A comprehensive discussion of various multiobjective optimization techniques is presented by Deb5 and Coello et al.6. Once again, these texts focus more on the various mechanisms within a multiobjective algorithm and their effects on the solution. The above discussion provides an overview of design optimization problems in general and outlines some of the features that an optimization algorithm should possess to effectively and efficiently solve these classes of problems. Section 2.2 provides the motivation and necessary details of the evolutionary algorithm that has been used in this study. Three design examples are discussed in detail in Section 2.3, while Section 2.4 summarizes and lists the major conclusions.

2.2. Multi-Objective Evolutionary Algorithm

The evolutionary algorithm presented in this text is designed to effectively and efficiently solve constrained, multiobjective problems from the domain of engineering design.
Unlike most of its counterparts, the algorithm handles objectives and constraints separately using two fitness measures. The fitness measures are derived through nondominance and hence do not rely on scaling and aggregation of constraint violations or objectives. Fundamentally, the algorithm is built upon the following generic notions:

• The algorithm drives the set of solutions towards feasibility first, before trying to improve an infeasible individual's objective values.
• A feasible solution is preferred over an infeasible solution.
• Between two feasible solutions, the one with a better nondominated
rank based on the objective matrix is preferred over the other.
• Between two infeasible solutions, the one with a lower nondominated rank based on the constraint matrix is preferred over the other.

Ray et al.7 first proposed the use of the nondominated rank of an individual to compare infeasible solutions. The Nondominated Sorting Genetic Algorithm (NSGA) introduced by Srinivas and Deb8 has been used in this study to rank the individuals. Although the process of nondominated sorting based on a constraint or objective matrix is computationally expensive, it certainly eliminates the need for scaling and weighting factors, which are otherwise required to derive a single scalar measure of fitness. Furthermore, the information of all constraint violations is used by the algorithm, rather than an aggregate or only the maximum violation as used by most penalty function based approaches. The details of the algorithm are explained in the context of a multi-objective, constrained minimization problem:

Minimize: f = [f_1(x) f_2(x) ... f_m(x)].    (1)

Subject to: g_i(x) ≥ a_i,  i = 1, 2, ..., q.    (2)

h_j(x) = b_j,  j = 1, 2, ..., r.    (3)

Here there are q inequality and r equality constraints, and x = [x_1 x_2 ... x_n] is the vector of n design variables. It is common practice to transform the equality constraints (with a tolerance δ) into a set of inequalities and use a unified formulation for all constraints: h_j(x) ≥ b_j − δ and −h_j(x) ≥ −b_j − δ. Thus the r equality constraints give rise to 2r inequalities, and the total number of inequalities for the problem is denoted by s, where s = q + 2r. For each individual, c denotes the constraint satisfaction vector c = [c_1 c_2 ... c_s], where

c_i = 0                     if satisfied,  i = 1, 2, ..., s
c_i = a_i − g_i(x)          if violated,   i = 1, 2, ..., q          (4)
c_i = b_i − δ − h_i(x)      if violated,   i = q + 1, ..., q + r
c_i = −b_i − δ + h_i(x)     if violated,   i = q + r + 1, ..., s.
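Equation (4) can be computed directly. The sketch below is illustrative (not from the chapter): it assumes the inequality constraints are given as g_i(x) ≥ a_i and that each equality h_j(x) = b_j is converted to the two inequalities shown above; `constraint_vector` is a hypothetical helper name:

```python
def constraint_vector(g_vals, a, h_vals, b, delta):
    """Build c = [c_1 ... c_s] of Eq. (4): 0 where a constraint is satisfied,
    the violation magnitude otherwise (s = q + 2r entries)."""
    c = [max(0.0, ai - gi) for gi, ai in zip(g_vals, a)]             # g_i(x) >= a_i
    c += [max(0.0, (bj - delta) - hj) for hj, bj in zip(h_vals, b)]  # h_j(x) >= b_j - delta
    c += [max(0.0, hj - (bj + delta)) for hj, bj in zip(h_vals, b)]  # h_j(x) <= b_j + delta
    return c
```

A satisfied constraint contributes 0, so a fully feasible individual has c equal to the zero vector, which is what the ranking scheme below exploits.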
For the above c_i's, c_i = 0 indicates that the ith constraint is satisfied, whereas c_i > 0 indicates a violation of the constraint. The CONSTRAINT matrix for a population of M individuals assumes the form

             [ c_11 c_12 ... c_1s ]
CONSTRAINT = [ c_21 c_22 ... c_2s ]    (5)
             [  ...               ]
             [ c_M1 c_M2 ... c_Ms ]

The objective matrix assumes the form

             [ f_11 f_12 ... f_1k ]
OBJECTIVE =  [ f_21 f_22 ... f_2k ]    (6)
             [  ...               ]
             [ f_M1 f_M2 ... f_Mk ]
In a population of M individuals, all nondominated individuals are assigned a rank of 1. The rank 1 individuals are removed from the population, and the new set of nondominated individuals is assigned a rank of 2. The process continues until every individual in the population is assigned a rank. Rank = 1 in the objective or the constraint matrix indicates that the individual is nondominated. It can be observed from the constraint matrix that when all the individuals in the population are infeasible, the Rank = 1 solutions are the best in terms of minimal constraint violation. Whenever there are one or more feasible individuals in the population, the feasible solutions assume the rank of 1. The pseudocode of the algorithm is presented below.

2.2.1. Algorithm
(1) t ← 0.
(2) Generate M individuals representing a population: Pop(t) = I_1, ..., I_M, uniformly in the parametric space.
(3) Evaluate each individual: compute the objectives and constraints, i.e., f_k(I_i) and c_j(I_i), for i = 1, 2, ..., M individuals, k = 1, 2, ..., P objectives and j = 1, 2, ..., S constraints.
(4) Identify Elites: E(t) ⊂ Pop(t), where E(t) is the set of elites. The remaining individuals are referred to as R(t), such that Pop(t) = E(t) ∪ R(t).
(5) Preserve the Elites: Pop(t + 1) = …
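The rank-peeling procedure described in Section 2.2 (assign rank 1 to all nondominated rows, remove them, repeat) applies equally to rows of the OBJECTIVE or the CONSTRAINT matrix. The following is an illustrative sketch with a hypothetical function name; minimization is assumed:

```python
def nondominated_rank(rows):
    """Peel nondominated fronts: rank[i] = 1 for nondominated rows, 2 for the
    next front once rank-1 rows are removed, and so on (minimization)."""
    def dom(a, b):
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
    rank, remaining, r = [0] * len(rows), set(range(len(rows))), 1
    while remaining:
        front = {i for i in remaining
                 if not any(dom(rows[j], rows[i]) for j in remaining if j != i)}
        for i in front:
            rank[i] = r
        remaining -= front
        r += 1
    return rank
```

Applied to the CONSTRAINT matrix of an all-infeasible population, rank 1 picks out the individuals with minimal constraint violation, exactly as described above.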
13 can be combined with a parameterization strategy for the NSGA-II14 to accomplish the following goals: (1) ensure that the algorithm will maintain diverse solutions, (2) eliminate the need for trial-and-error analysis of parameter settings (i.e., population size, crossover and mutation probabilities), and (3) allow users to sufficiently capture tradeoffs using a minimum number of design evaluations. A sufficiently quantified tradeoff can be defined as a subset of nondominated solutions that provides an adequate representation of the Pareto frontier and can be used to inform decision making. In this chapter, Section 4.2 overviews prior studies used in the development of the ε-NSGA-II. Section 4.3 discusses the groundwater monitoring
ε-Dominance Archiving and Automatic Parameterization
test case used to demonstrate the ε-NSGA-II. Sections 4.4 and 4.5 provide a more detailed description of the ε-NSGA-II and its performance on the groundwater monitoring test case, respectively. Sections 4.6 and 4.7 discuss how the ε-NSGA-II and future extensions of this work have significant potential to help environmental engineers address computationally intensive applications in which stakeholders must balance more than two performance objectives (i.e., high-order Pareto optimization problems).

4.2. Prior Work

The ε-NSGA-II combines the external archiving techniques recommended by Laumanns et al.12 with automatic parameterization techniques15,16 developed to eliminate trial-and-error analysis for setting the NSGA-II's parameters. A primary drawback of using EMO methods for environmental applications lies in the large costs associated with assessing performance (i.e., algorithmic reliability and solution quality). The common practice of assessing performance over a distribution of random seeds, as employed in the EMO literature, is often prohibitively expensive in terms of computational costs and the time that must be invested by users. The goal of the automated parameterization approaches developed by Reed et al.15 is to eliminate the need to assess algorithmic performance for a distribution of random number seeds and instead focus on the NSGA-II's reliability and efficiency for a single random seed. Reliability is addressed in the approach by adaptively increasing the size of the population. The method uses multiple runs in which the nondominated solutions are accumulated from searches performed using successively doubled population sizes. The runs (i.e., searches with successively doubled population sizes) continue until either the user-defined maximum run time is reached or sufficient solution accuracy has been attained.
The NSGA-II parameterization approach presented by Reed et al.14 was demonstrated on the same case study that will be discussed in Section 4.3 of this chapter. Their approach required a total of 38,000 function evaluations, an 80-percent reduction from prior published results4. Moreover, the method enabled Reed and Minsker7 to solve a 4-objective monitoring application (i.e., a high-order Pareto optimization problem), which represents a new problem class within the environmental literature that has historically been dismissed as intractable (e.g., see pages 197-198 of Sun17). Although Reed et al.14 helped to demonstrate how EMO can help environmental engineers step beyond 2-objective applications, the method
Patrick Reed and Venkat Devireddy
employs an inefficient form of archiving, fails to allow users to bias search towards important objectives, and does not take advantage of early run results to guide subsequent search. Reed et al.14 recommended offline analysis for accumulating nondominated solutions across multiple runs. Offline analysis can be viewed as an unbounded archive (i.e., the number of solutions stored in memory is not limited) of the nondominated solutions found by the NSGA-II in every generation of every run. Laumanns et al.12 highlight that unbounded archiving leads to memory and nondomination sorting inefficiencies. The ε-NSGA-II approach discussed in this chapter was specifically developed following Laumanns et al.'s theoretical recommendations for bounding archive size and improving solution diversity using the principle of ε-domination. ε-domination requires users to specify the precision with which they wish to quantify each objective. User-specified precisions can be used to bias search towards the regions of an application's objective space with the highest precision requirements (see Section 4.4 for more details). The ε-domination archive was used in this study to maintain a diverse representation of the Pareto optimal set; moreover, the archived solutions found with small populations are used to pre-condition search with larger populations and minimize the number of design evaluations required to solve an application. The reader should note that beyond the ε-NSGA-II, Deb et al.13 have also proposed an extension of the NSGA-II with an improved diversity operator, termed the Clustered NSGA-II (C-NSGA-II), as well as a steady-state ε-dominance multiobjective evolutionary algorithm (MOEA) that balances convergence speed and diversity. The C-NSGA-II replaces the crowding distance procedure with the clustering technique that was used in the Strength Pareto Evolutionary Algorithm18.
Though the C-NSGA-II's results were better than the NSGA-II's, the large computational time for the method's clustering algorithm eliminated it from consideration. The steady-state ε-MOEA13 helped to encourage our usage of ε-dominance archives in this chapter. We did not use the steady-state ε-MOEA itself because (1) the algorithm is limited to the real-coded representation and (2) the algorithm's small generation gap (i.e., only one population member is replaced during each iteration) limits its ability to take advantage of small population runs to reduce the overall number of function evaluations required to solve an application. Readers interested in other online adaptive strategies beyond the ε-NSGA-II should reference the micro-GA19 and its successor, micro-GA220, which use the concepts of small population sizing and automatic parameterization. Tan et
al.21 demonstrate a dynamic population sizing scheme for the incrementing multiobjective evolutionary algorithm (IMOEA). Kursawe22 and Abbas23 utilize online adaptive strategies to enhance Pareto optimization results attained from an evolution strategy and differential evolution, respectively.

4.3. Monitoring Test Case Problem

4.3.1. Test Case Overview

The ε-NSGA-II's performance is demonstrated on a 2-objective test case originally modeled by Reed et al.4. The test case is based on an actual site located at the Lawrence Livermore National Laboratory in Livermore, California, which has been historically contaminated with a large spill of the solvent perchloroethylene (PCE). The goal of this application is to monitor the site using groundwater monitoring wells (i.e., wells drilled to sample subsurface water) to track the PCE's migration. PCE is a human health concern because the solvent is known to cause cancer in exposed human beings. The site is undergoing long-term monitoring, in which groundwater samples are used to assess the effectiveness of current efforts to reduce the site's contamination. During this long-term monitoring phase, sampling and laboratory analysis can be a controlling factor in the costs of managing a site. The monitoring wells can sample from 1 to 3 locations along their vertical axis and have a minimum spacing of 60 m between wells in the horizontal plane. Quarterly sampling of the entire network of wells has a potential cost of over $70,000 annually for PCE testing alone, which could translate into millions of dollars because the site's life span will be several decades and potentially even centuries.

4.3.2. Problem Formulation
Equation (1) gives the multiobjective problem formulation for quantifying the tradeoff between minimizing sampling costs and maintaining a high-quality interpolated picture of the PCE contamination.

Minimize F(x_K) = [f_1(x_K), f_2(x_K)],  ∀K ∈ Ω

f_1(x_K) = Σ_{i=1}^{nwell} C_s(i) x_Ki    (1)

f_2(x_K) = Σ_{j=1}^{nest} (c*_all(u_j) − c*_est(u_j))²
F(x_K) is a vector-valued objective function whose components [f_1(x_K), f_2(x_K)] represent the cost and the squared relative estimation error (SREE), respectively, for the Kth monitoring scheme x_K taken from the collection of all possible sampling designs Ω. Equation (2) defines the binary decision variables representing the Kth monitoring scheme:

x_Ki = 1 if the ith well is sampled, 0 otherwise,  ∀K, i    (2)

If the ith well is sampled, it is assumed that all available locations along the vertical axis of that well will be sampled at a cost of C_s(i). C_s(i) ranged from $365 to $1095 for 1 to 3 samples analyzed for PCE solely. Sampling all available levels within each well reduces the size of Ω from 2^50 to 2^20, where 50 and 20 represent the total number of sampling locations and monitoring wells (nwell), respectively. Reducing the size of Ω enabled the entire decision space of this application to be enumerated. Enumeration was employed to identify the true Pareto frontier so that the performance of the ε-NSGA-II could be rigorously tested. In particular, the enumerated Pareto frontier is used in this chapter to show the algorithm's efficiency and reliability. The SREE objective provides a measure of how the interpolated picture of the plume, using data only from wells included in the Kth sampling plan, compares to the result attained using data from all available sampling locations. The measure is computed by summing the squared deviations between the PCE estimates using data from all available sampling locations, c*_all(u_j), and the estimates based on the Kth sampling plan, c*_est(u_j), at each location u_j in the interpolation domain. Each u_j specifies the coordinates of the jth grid point in the interpolation domain. The interpolation domain consisted of a total of 3300 grid points (nest in equation (1)).
The PCE estimates used in the calculation of the SREE for each sampling design were attained using a nonlinear spatial interpolation method (see Reed et al.4 for more details).
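Given a candidate sampling scheme, the two objectives of equation (1) reduce to a cost sum over the sampled wells and a sum of squared deviations over the interpolation grid. The sketch below is illustrative only: `monitoring_objectives` is a hypothetical name, and the PCE estimate vectors stand in for the site's nonlinear interpolation model:

```python
def monitoring_objectives(x, cost, c_all, c_est):
    """f1: sampling cost of binary scheme x (Eq. 1, first component);
    f2: squared relative estimation error, SREE (Eq. 1, second component)."""
    f1 = sum(cost[i] for i, xi in enumerate(x) if xi == 1)
    f2 = sum((ca - ce) ** 2 for ca, ce in zip(c_all, c_est))
    return f1, f2
```

In practice c_est would itself be recomputed by re-interpolating the plume from only the wells selected in x, which is the expensive step of the real application.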
4.4. Overview of the ε-NSGA-II Approach

The ε-NSGA-II algorithm proposed in this chapter aims at reducing the user interaction requirements and the computational complexity associated with solving multiobjective optimization problems. EMO algorithms require the user to specify the following parameters:
• Population size
• Run length
• Probability of crossover
• Probability of mutation
The specification of these parameters is typically done using multiple trial-and-error runs of an EMO algorithm, wasting user time and computational resources. The ε-NSGA-II enables the user to specify only the precision with which they want to quantify the Pareto optimal set; all other parameters are automatically specified within the algorithm. A brief description of the algorithm is given in Figure 4.1.
Fig. 4.1. Schematic overview of the ε-NSGA-II.
The proposed algorithm consists of three steps. The first step uses the NSGA-II with a starting population of 5 individuals to initiate EMO search. The initial population size is set arbitrarily small to ensure that the algorithm's initial search is done using a minimum number of function evaluations. Subsequent increases then adjust the population to a size appropriate to the problem's difficulty. In the second step, the ε-NSGA-II uses a fixed-size archive to store the nondominated solutions generated in every generation of the NSGA-II runs. The archive is updated using the concept of ε-dominance, which has the benefit of ensuring that
the archive maintains a diverse set of solutions. ε-dominance requires the user to define the precision with which they want to evaluate each objective (e.g., quantify costs in thousands, hundreds, or tens of dollars) by specifying an appropriate ε value for each objective. The third step checks whether the user-specified termination criteria are satisfied and the Pareto optimal set has been sufficiently quantified. If the criteria are not satisfied, the population size is doubled and search continues. When the population is increased, the initial population of the new run has solutions injected from the archive at the end of the previous run. The algorithm terminates if either a maximum user time is reached or doubling the population size fails to significantly increase the number of nondominated solutions found across two runs. The following sections discuss the ε-NSGA-II in greater detail.

4.4.1. Searching with the NSGA-II

The ε-NSGA-II was motivated by the authors' goal of minimizing the total number of function evaluations required to solve computationally intensive environmental applications and eliminating trial-and-error analysis for setting the NSGA-II's parameters. Population size has been the key parameter controlling the performance and efficiency of our prior applications4,7,14. The dynamic population sizing and injection approach applied in the ε-NSGA-II simply exploits computationally inexpensive small populations to expedite search, while increasing population size commensurate with problem difficulty to ensure that the Pareto optimal set can be reliably approximated. The initial population size N_0 is set to some arbitrary small value (e.g., 5), as it is expected that subsequent multi-population runs will adjust for an undersized population. A randomly selected subset of the solutions obtained using the small population sizes is injected into subsequent larger populations, aiding faster convergence to the Pareto front.
This can be viewed as using a series of "connected" NSGA-II runs that share results so that the Pareto optimal set can be reliably approximated. Computational savings should be viewed in two contexts: (1) the use of minimal population sizes and (2) the elimination of random seed analysis. Note that the number of times the population size is doubled varies with different random seeds, though exploiting search with small populations will, on average, dramatically reduce computational times. Moreover, our approach eliminates the need to repeatedly solve an application for a distribution of random seeds.
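The series of connected runs can be sketched as a driver loop. This is a rough illustration under stated assumptions, not the authors' implementation: `search` stands in for one full NSGA-II run (seeded from the archive), `merge` for the archive update, and `significant_growth` for the termination test; all names are hypothetical:

```python
def connected_runs(search, merge, significant_growth, n0=5, max_doublings=8):
    """Run NSGA-II with successively doubled population sizes, seeding each
    run from the accumulated archive, until the archive stops growing
    significantly between two successive runs."""
    archive, n, prev = [], n0, 0
    for _ in range(max_doublings):
        archive = merge(archive, search(n, archive))   # run seeded from archive
        if prev and not significant_growth(prev, len(archive)):
            break                                      # doubling no longer pays off
        prev, n = len(archive), 2 * n
    return archive
```

With stub callables in place of a real NSGA-II, the loop stops as soon as a doubled population fails to grow the nondominated set by more than the chosen percentage.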
The NSGA-II's remaining parameters are set automatically based on whether an application is being solved using a real or binary coding. The results shown in this chapter are for a binary-coded application where the NSGA-II's parameters are set following the approach recommended by Reed et al.14. The initial and all subsequent populations are allowed to search for a fixed run length, t. The run length is set using the domino convergence model developed by Thierens et al.24 such that each population is allowed to search for 2l generations, where l is the binary string length. For the monitoring application presented in this chapter, the maximum run length was specified to be 40 generations. The uniform crossover operator was used to minimize positional bias25, with the probability of crossover Pc set to 0.5 based on prior empirical results25 as well as a theoretical disruption boundary relation derived by Thierens26. The probability of mutation Pm is set to 1/N, where N is the current population size. This relationship is based on the recommendations of DeJong27 and Schaffer28 and has the advantage of increasing the diversity of small populations while preserving solutions in large population runs. The proposed ε-NSGA-II algorithm can easily be adapted to real-coded problems by using Deb's29 recommended run length (i.e., 250 generations), as well as his recommended settings for the crossover and mutation operators used in the real-coded version of the NSGA-II.

4.4.2. Archive Update

Recent studies in the EMO literature have highlighted the importance of balancing algorithmic convergence speed and solution diversity12,13. These studies have shown that the NSGA-II remains one of the fastest converging methods available, but its crowding operator fails to promote diversity on challenging EMO problems. The ε-NSGA-II overcomes this failure using ε-dominance archives12.
The ε-dominance archiving approach is particularly attractive for environmental applications because it allows the user to define the precision with which they want to quantify their tradeoffs, while bounding the size of the archive and maintaining a diverse set of solutions. Figure 4.2, adapted from Deb et al.13, illustrates the ε-dominance approach. The concept of ε-dominance requires the user to define the precision they want to use to evaluate each objective. The user-specified precision or tolerance vector ε defines a grid over a problem's objective space [see Figure 4.2], which biases the NSGA-II's search towards the portions of a
Fig. 4.2. Illustration of ε-dominance (adapted from Deb et al.13).
problem's objective space that have the highest precision requirements. Figure 4.2 illustrates how ε-domination allows decision makers to extend a solution's zone of domination based on their required precision for each objective (i.e., ε1 and ε2). Under traditional nondomination sorting, solution P dominates the region PECF, whereas under ε-domination the solution dominates the larger region ABCD. The ε-dominance archive improves the NSGA-II's ability to maintain a diverse set of nondominated solutions by allowing only one archive member per grid cell. When multiple nondominated points reside in a single grid cell, only the point closest to the lower left corner of the cell (assuming minimization) is added to the online archive, thereby ensuring convergence to the true Pareto optimal set12. For example, solution 1 in Figure 4.2 would be stored in the archive because it is closer to point G than solution 2. The archive is updated in every generation of the ε-NSGA-II runs with a diverse set of "ε-nondominated" solutions, which are guaranteed to be separated by a minimum distance of ε_i in the ith objective. The values specified for ε also directly impact the algorithm's convergence speed. A high-precision representation of the Pareto optimal set can be captured by specifying very small precision tolerances ε. Small precision tolerances will increase the number of Pareto optimal solutions that are ε-nondominated, increase the archive size, and increase population sizing requirements (30, p. 74). The ε-NSGA-II has the advantage of allowing users to dramatically reduce computation times by accepting a lower-resolution representation of the Pareto frontier (i.e., specifying higher values of ε). Note that lower-resolution approximate representations of the Pareto frontier can be helpful by reducing the number of designs decision makers must consider.
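The grid logic just described can be sketched as follows: each objective vector maps to a box index via floor(f_i/ε_i), one archive member is kept per box, and ties within a box are broken in favor of the point nearest the box's lower-left corner. This is an illustrative sketch (hypothetical names, minimization assumed); the box-level ε-dominance pruning of Laumanns et al.12 is omitted for brevity:

```python
import math

def box_index(f, eps):
    """Map an objective vector to its epsilon-grid cell (minimization)."""
    return tuple(math.floor(fi / ei) for fi, ei in zip(f, eps))

def update_archive(archive, f, eps):
    """Keep at most one solution per box, preferring the point closest to the
    box's lower-left corner (cf. solutions 1 and 2 in Figure 4.2)."""
    def corner_dist(g):
        return sum((gi - bi * ei) ** 2
                   for gi, bi, ei in zip(g, box_index(g, eps), eps))
    b = box_index(f, eps)
    if b not in archive or corner_dist(f) < corner_dist(archive[b]):
        archive[b] = f
    return archive
```

Because each grid cell holds at most one member, any two archived points differ by at least ε_i in some objective, which is the diversity guarantee cited above.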
In environmental applications, decision makers can benefit from a small set of Pareto optimal solutions that allow them to interpret the general shape or
inflection of their design tradeoffs to support a diminishing-returns analysis for their design criteria (e.g., how much can contaminant map uncertainty be reduced with additional groundwater samples?).

4.4.3. Injection and Termination
The ε-NSGA-II also seeks to speed convergence by pre-conditioning larger-population runs with the prior search results attained using small populations. In prior efforts14, any attempt to inject solutions found using small populations into subsequent runs made the NSGA-II prematurely converge to poor representations of the Pareto optimal set, especially for problems with more than 2 objectives. The ε-dominance archive's ability to preserve diversity plays a crucial role in overcoming this limitation. As described previously in Section 4.1, the ε-NSGA-II begins search with an initial population of 5 individuals run for 21 generations, from which the ε-nondominated solutions identified in this initial run are stored in the archive. A minimum of two successive runs must be used to determine if further search is justified. Search progress is rated in terms of a user-defined criterion that specifies the minimum percentage change, ΔND, in the number of ε-nondominated individuals found in two successive runs. For example, consider two successive runs of the ε-NSGA-II in which the first run uses a population of N sampling designs to evolve an ε-nondominated set composed of A individuals, while the second run uses a population of 2N designs to evolve an ε-nondominated set of K individuals. The results of these runs are used in equation (3) to determine which of the two following courses of action will be taken: (1) the population size is again doubled, resulting in 4N individuals to be used in an additional run of the ε-NSGA-II, or (2) the algorithm stops to allow the user to assess whether the ε-nondominated set has been quantified to sufficient accuracy.

    if ΔND < (|K - A| / A) * 100 then double N and continue search, else stop search    (3)
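The doubling-or-stop rule of equation (3) can be sketched as follows; the function and argument names (`continue_search`, `prev_count`, `curr_count`) are ours, chosen for illustration:

```python
def continue_search(prev_count, curr_count, delta_nd=10.0):
    """Decide whether to double the population size and keep searching.

    prev_count: number of epsilon-nondominated solutions from the run
                with population N; curr_count: from the run with 2N.
    delta_nd:   user-specified minimum percentage change (Delta_ND).
    """
    if prev_count == 0:
        return True  # no baseline yet; keep searching
    pct_change = abs(curr_count - prev_count) / prev_count * 100.0
    # Equation (3): if the observed change exceeds Delta_ND,
    # double N and continue; otherwise terminate.
    return delta_nd < pct_change
```

For example, growing from 20 to 25 solutions is a 25 percent change, so with ΔND = 10 percent the search continues; growing from 25 to 26 (4 percent) would terminate it.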
The archive at the end of each run contains ε-nondominated solutions that can be used to guide search in future runs and speed convergence to the Pareto front. This is achieved by injecting ε-nondominated solutions from the archive at the end of the run with population size N into the initial population of the next run, which has a population size 2N. Figure 4.3
illustrates the two scenarios that arise when the ε-NSGA-II injects solutions from the archive generated with a population size N into the initial generation of a run with a population size 2N. In the first scenario, shown in Figure 4.3a, the archive size A is smaller than the subsequent population size 2N. In this case, 100 percent of the ε-nondominated archive solutions are injected into the first generation of the subsequent run with 2N individuals. We have found that the number of injected solutions should be maximized to aid rapid convergence. The ε-dominance archive in combination with successive doubling of the population size guarantees that the ε-NSGA-II will maintain sufficient solution diversity. Figure 4.3b shows the second injection scenario, which occurs when the archive size A is greater than the next population size 2N. In this case, 2N ε-nondominated archive solutions are selected randomly and injected into the first generation of the next run, again maximizing the impact of injected solutions.
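The two injection scenarios can be sketched as below; `inject` and the placeholder `random_design` are hypothetical names introduced for illustration, not the authors' code:

```python
import random

def random_design():
    """Placeholder for generating a random monitoring design."""
    return object()

def inject(archive, next_pop_size, rng=random):
    """Seed the first generation of the next run (population 2N) with
    epsilon-nondominated archive members (scenarios of Figure 4.3)."""
    if len(archive) <= next_pop_size:
        # Scenario (a): inject 100 percent of the archive,
        # then fill the remaining slots with random designs.
        population = list(archive)
        population += [random_design()
                       for _ in range(next_pop_size - len(archive))]
    else:
        # Scenario (b): the archive exceeds 2N; inject a randomly
        # selected subset of size 2N.
        population = rng.sample(archive, next_pop_size)
    return population
```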
Fig. 4.3. Schematic representation of injection when (a) the archive size A is smaller than the next population size 2N and (b) the archive size A is larger than the next population size 2N.
4.5. Results

The following experiments were designed to validate that the ε-NSGA-II can efficiently and reliably approximate the monitoring test case's enumerated Pareto front [shown in Figure 4.4]. The experiments show the ε-NSGA-II's efficiency relative to the NSGA-II using both Deb's29 recommended parameter settings and the parameter settings recommended by Reed et al.14. An additional goal of these experiments is to clearly demonstrate how user termination and precision criteria impact the ε-NSGA-II's performance.
Fig. 4.4. The enumerated tradeoff between cost and mapping error, designated as SREE.
Table 4.10 compares the performance of the proposed algorithm with a fixed-population-size NSGA-II using Deb's recommended parameter settings and the adaptive population sizing approach of Reed et al.14. These experiments were designed to be a conservative performance test for the ε-NSGA-II by favoring the fixed and adaptive population NSGA-II runs. The runs based on Deb's and Reed's prior recommendations both use unbounded
offline archives, in which all identified nondominated solutions were stored in memory and used to generate the final results. Additionally, since the enumerated tradeoff is discrete with only 36 solutions, the ε-NSGA-II's injection has a reduced role in enhancing performance relative to problems with larger Pareto optimal sets. The problem was solved with 50 random seeds, and the average number of nondominated solutions obtained as well as the average number of function evaluations taken is reported. The NSGA-II parameter settings based on Deb's recommendations used a population size of 100 and a run length of 250 generations, while the probabilities of crossover and mutation were set to 0.5 and 0.01, respectively. Reed et al.14 recommended that users specify an initial population size equal to twice the number of Pareto optimal solutions they expect to find. For the monitoring case study solved in this chapter, Reed et al.14 used an initial population of 60. In this comparison, the initial population is set to 60 and the rest of the parameters are specified the same as those described in Section 4.1.

Table 4.10. Comparison of the proposed algorithm with the NSGA-II and the previous design methodology proposed by Reed et al.14.
                                     Deb et al.   Reed et al.14   ε-NSGA-II      ε-NSGA-II
                                                                  (ΔND = 10%)    (ΔND = 5%)
    Average no. of solutions found       30           32              25             27
    Min no. of function evaluations   25000        16800            1400           1400
    Ave. no. of function evaluations  25000        39889            7992          15512
    Max no. of function evaluations   25000        74400           25400          25400
The fixed population NSGA-II was able to find an average of 30 nondominated solutions using 25000 function evaluations. The adaptive population-sizing runs based on Reed et al.'s parameter settings took an average of nearly 40000 function evaluations to obtain an average of 32 nondominated solutions per run. As expected, the additional 15000 design evaluations are the primary reason why more nondominated solutions were found per run relative to the fixed population runs using Deb's settings. It should be noted that these runs seek an enumerated front that was quantified using 6-digit precision (10^-06). For the ε-NSGA-II runs, the cost [εcost] and SREE [εSREE] precision limits were set equal to 0.001 and 10^-05, respectively. In test runs performed for this chapter, it was observed that the SREE objective was the most sensitive to precision limits and that there were no substantial impacts on performance when εSREE was set to less than 10^-05 [see Figure 4.8]. The ε-NSGA-II required an average of 7992 function evaluations to obtain
an average of 25 solutions. Although the ε-NSGA-II found fewer solutions on average than the other methods, it generally found a more diverse set of solutions representing the full extent of the Cost-SREE tradeoff. The ε-NSGA-II results represent computational savings of 70 and 80 percent relative to using Deb's and Reed et al.'s recommended parameter settings, respectively. Moreover, it should be emphasized that the ε-dominance archive is dramatically more efficient than using offline analysis. Table 4.10 also shows that reducing the ε-NSGA-II's termination criterion ΔND does result in an increase in the number of nondominated solutions found, but on average the user must expend twice the number of function evaluations. Figure 4.5 shows a run result representative of the ε-NSGA-II's average performance, in which 6200 function evaluations were used to evolve 26 nondominated points (setting ΔND equal to 10 percent).
Fig. 4.5. Typical performance of the ε-NSGA-II when ΔND is set equal to 10 percent.
For this application, setting ΔND to 10 percent produces a sufficient representation of the Cost-SREE tradeoff; all subsequent results use this
termination criterion. Figures 4.6 and 4.7 graphically compare the results obtained from the best and the worst runs of the three tested variants of the NSGA-II. Figure 4.6 shows the best results obtained from the 50 runs of each version of the algorithm, measured in terms of the number of nondominated solutions found. All of the algorithms were able to capture the true front. The fixed-size NSGA-II as well as the ε-NSGA-II required 25000 function evaluations to capture the entire enumerated front, while the adaptive population NSGA-II using offline analysis described by Reed et al.14 used 74400 function evaluations.
Fig. 4.6. Comparison of the solutions obtained from the best run of each algorithm rated in terms of number of nondominated solutions found.
The worst results, defined in terms of the number of nondominated solutions obtained, are plotted in Figure 4.7. Initial review of the plot may lead the reader to conclude that the ε-NSGA-II did not perform as well as the other methods. This particular run of the ε-NSGA-II highlights a
potential side effect of using the ΔND termination criterion. In this case, doubling the population size from 10 to 20 failed to produce more than a 10 percent increase in the number of nondominated solutions found, leading to a premature end to the run. Figure 4.7 shows that the ε-NSGA-II found 18 solutions after just 1400 function evaluations. Moreover, the 18 solutions are distributed over the entire Cost-SREE tradeoff; if performance is measured in terms of solution diversity and convergence speed, the ε-NSGA-II's performance is far superior to the other methods. If users want a more accurate representation of the tradeoff, they would only have to lower the ΔND value and continue the run.
Fig. 4.7. Comparison of the solutions obtained from the worst run of each algorithm rated in terms of number of nondominated solutions found.
The values of εcost and εSREE indicate the degree of precision that the user expects to use when evaluating each of the two objectives. The user can bias the search towards a certain objective by increasing the precision requirements for that objective. For the Cost-SREE monitoring problem,
the SREE objective is the most sensitive to precision requirements. The effects of varying the value of εSREE on the number of solutions obtained by the ε-NSGA-II as well as the total number of function evaluations required are demonstrated in Figure 4.8 for an εcost of 0.001 and a ΔND of 10 percent. The value of εSREE is varied between 10^-03 and 10^-06. The results shown in Figure 4.8 are averaged over 50 random seeds. The number of nondominated solutions found by the ε-NSGA-II does not increase significantly as εSREE is decreased beyond 10^-05. Figure 4.8 indicates that users can attain approximations to the Cost-SREE tradeoff using fewer than 4000 function evaluations, which represents an order of magnitude decrease relative to our prior solution approaches for this problem.
Fig. 4.8. Average variation of the number of nondominated solutions found and the number of function evaluations required with different values of εSREE.
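The precision limits εcost and εSREE act as the grid widths of the ε-dominance archive12: each objective vector is mapped to an ε-box index, and the archive keeps at most one solution per box, which is what bounds the archive size as the precision is coarsened. A minimal sketch of that mapping (the function name `eps_box` is ours, not the chapter's):

```python
import math

def eps_box(objectives, epsilons):
    """Map an objective vector to its epsilon-box index vector.

    Two solutions falling in the same box are indistinguishable at the
    user's precision (e.g. eps_cost = 0.001, eps_SREE = 1e-5), so the
    archive retains only one of them.
    """
    return tuple(math.floor(f / e) for f, e in zip(objectives, epsilons))
```

Coarser epsilons collapse more solutions into the same box, bounding both the archive size and the set of alternatives decision makers must review.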
4.6. Discussion

The ε-NSGA-II gives users more direct control to balance their accuracy needs against the computational demands associated with evolving the Pareto frontiers for their applications. A key result presented in this chapter is that with as few as 1400 function evaluations, the algorithm was able to approximate the Cost-SREE tradeoff's general shape by identifying a diverse set of solutions along the entire extent of the curve. This approximate representation could be used by environmental decision makers to make reasonable assessments of the diminishing returns of using more than 35 samples [corresponding to a scaled cost of 0.6 in Figure 4.7]. The computational efficiency of the ε-NSGA-II will aid our future efforts in exploring the use of EMO to solve high-order Pareto optimization problems. Reed and Minsker7 introduced the value of considering more than 2 objectives for water resources and environmental design applications. High-order Pareto frontiers allow decision makers to better understand interactions between their objectives. As an example, Reed and Minsker's 4-objective monitoring application highlighted previously unknown objective conflicts that significantly impact the design of LTM systems. The ε-NSGA-II has significant potential for dramatically reducing the computational costs of evolving high-order Pareto fronts. The algorithm's ε-dominance archive will also enhance decision-making by bounding the size of the Pareto optimal set that stakeholders must consider. For monitoring applications, the value of mapping accuracy (SREE) can be visualized in space. Stakeholders can visualize members of the reduced set of solutions evolved by the ε-NSGA-II to better understand how their objectives impact designs and to exploit low cost improvements in their design objectives.

4.7. Conclusions

The ε-NSGA-II demonstrates how ε-dominance archiving can be combined with a parameterization strategy for the NSGA-II to accomplish the following goals: (1) ensure the algorithm will maintain diverse solutions, (2) eliminate the need for trial-and-error analysis of parameter settings (i.e., population size, crossover and mutation probabilities), and (3) allow users to sufficiently capture tradeoffs using a minimum number of design evaluations. A sufficiently quantified tradeoff can be defined as a subset of nondominated solutions that provides an adequate representation of the Pareto frontier and can be used to inform decision making. Results are presented for a 2-objective groundwater monitoring case study in which the archiving and parameterization techniques for the NSGA-II combined to reduce computational demands by more than 70 percent relative to prior published results. The methods of this chapter can be easily generalized to other multiobjective applications to minimize computational times as well as trial-and-error parameter analysis.

References

1. B. J. Ritzel, J. W. Eheart, and S. R. Ranjithan, Using genetic algorithms to solve a multiple objective groundwater pollution containment problem. Water Resources Research, 1994. 30(5): p. 1589-1603.
2. D. Halhal, G. A. Walters, D. Ouazar and D. A. Savic, Water network rehabilitation with structured messy genetic algorithm. Journal of Water Resources Planning and Management, 1997. 123(2): p. 137-146.
3. D. H. Loughlin, S. R. Ranjithan, J. W. Baugh Jr. and E. D. Brill Jr., Application of Genetic Algorithms for the Design of Ozone Control Strategies. Journal of the Air and Waste Management Association, 2000. 50: p. 1050-1063.
4. P. Reed, B. S. Minsker, and D. E. Goldberg, A multiobjective approach to cost effective long-term groundwater monitoring using an Elitist Nondominated Sorted Genetic Algorithm with historical data. Journal of Hydroinformatics, 2001. 3(2): p. 71-90.
5. M. A. Erickson, A. Mayer, and J. Horn, Multi-objective optimal design of groundwater remediation systems: application of the niched Pareto genetic algorithm (NPGA). Advances in Water Resources, 2002. 25(1): p. 51-56.
6. Z. Kapelan, D. A. Savic, and G. A. Walters, Multiobjective Sampling Design for Water Distribution Model Calibration. Journal of Water Resources Planning and Management, 2003. 129(6): p. 466-479.
7. P. Reed and B. S. Minsker, Striking the Balance: Long-Term Groundwater Monitoring Design for Conflicting Objectives. Journal of Water Resources Planning and Management, 2004. 130(2): p. 140-149.
8. Task Committee on Long-Term Groundwater Monitoring Design, Long-Term Groundwater Monitoring: The State of the Art. 2003, Reston, VA: American Society of Civil Engineers.
9. National Research Council, Environmental Cleanup at Navy Facilities: Adaptive Site Management. 2003, Washington, D.C.: The National Academies Press.
10. Department of Energy, DOE/EM-0563: A report to Congress on long-term stewardship: Volume I, Summary Report. 2001, Office of Environmental Management: Washington, D.C.
11. K. Deb, A. Pratap, S. Agarwal and T. Meyarivan, A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 2002. 6(2): p. 182-197.
12. M. Laumanns, L. Thiele, K. Deb and E. Zitzler, Combining Convergence and Diversity in Evolutionary Multiobjective Optimization. Evolutionary Computation, 2002. 10(2): p. 263-282.
13. K. Deb, M. Mohan, and S. Mishra, A Fast Multi-objective Evolutionary Algorithm for Finding Well-Spread Pareto-Optimal Solutions. In Fonseca et al., editors, Evolutionary Multi-Criterion Optimization, Second International Conference, EMO 2003, Faro, Portugal, 2003, Springer. Lecture Notes in Computer Science, Volume 2632: p. 222-236.
14. P. Reed, B. S. Minsker, and D. E. Goldberg, Simplifying Multiobjective Optimization: An Automated Design Methodology for the Nondominated Sorted Genetic Algorithm-II. Water Resources Research, 2003. 39(7): p. 1196, doi:10.1029/2002WR001483.
15. P. Reed, T. Ellsworth, and B. S. Minsker, Spatial Interpolation Methods for Nonstationary Plume Data. Ground Water, 2004. 42(2): p. 190-202.
16. V. Devireddy and P. Reed, An Efficient Design Methodology for the Nondominated Sorted Genetic Algorithm-II. In Late Breaking Papers within the Proceedings of the 2003 Genetic and Evolutionary Computation Conference (GECCO 2003). 2003, Chicago, IL: p. 67-71.
17. N.-Z. Sun, Inverse Problems in Groundwater Modeling. Theory and Applications of Transport in Porous Media, ed. J. Bear, Vol. 6. 1994, New York, NY: Kluwer Academic Publishers.
18. E. Zitzler, M. Laumanns, and L. Thiele, SPEA2: Improving the Strength Pareto Evolutionary Algorithm. 2001, Department of Electrical Engineering, Swiss Federal Institute of Technology: Zurich, Switzerland.
19. C. A. Coello Coello and G. Toscano Pulido, Multiobjective Optimization using a Micro-Genetic Algorithm. In Lee Spector et al., editors, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2001), San Francisco, California, 2001, Morgan Kaufmann Publishers: p. 274-282.
20. G. Toscano Pulido and C. A. Coello Coello, The Micro Genetic Algorithm 2: Towards Online Adaptation in Evolutionary Multiobjective Optimization. In Fonseca et al., editors, Evolutionary Multi-Criterion Optimization, Second International Conference, EMO 2003, Faro, Portugal, 2003, Springer. Lecture Notes in Computer Science, Volume 2632: p. 252-266.
21. K. C. Tan, T. H. Lee, and E. F. Khor, Evolutionary Algorithms with Dynamic Population Size and Local Exploration for Multiobjective Optimization. IEEE Transactions on Evolutionary Computation, 5(6): p. 565-588, December 2001.
22. F. Kursawe, A Variant of Evolution Strategies for Vector Optimization. In H.-P. Schwefel and R. Manner, editors, Parallel Problem Solving from Nature, 1st Workshop, PPSN I, Volume 496 of Lecture Notes in Computer Science, pages 193-197, Berlin, Germany, October 1991, Springer-Verlag.
23. H. A. Abbass, The Self-Adaptive Pareto Differential Evolution Algorithm. In Congress on Evolutionary Computation (CEC 2002), Volume 1, pages 831-836, Piscataway, New Jersey, May 2002, IEEE Service Center.
24. D. Thierens, D. E. Goldberg, and A. G. Pereira, Domino Convergence, Drift, and the Temporal-Salience Structure of Problems. In The 1998 IEEE International Conference on Evolutionary Computation. 1998: IEEE Press.
25. T. Back, D. Fogel, and Z. Michalewicz, Handbook of Evolutionary Computation. 2000, Bristol, UK.
26. D. Thierens, Analysis and design of genetic algorithms. 1995, Katholieke Universiteit Leuven: Leuven, Belgium.
27. K. DeJong, An analysis of the behavior of a class of genetic adaptive systems. 1975, University of Michigan: Ann Arbor, MI.
28. J. D. Schaffer, R. A. Caruana, L. J. Eshelman and R. Das, A study of control parameters affecting online performance of genetic algorithms for function optimization. In Proceedings of the Third International Conference on Genetic Algorithms. 1989: Morgan Kaufmann.
29. K. Deb, Multi-Objective Optimization using Evolutionary Algorithms. 2001, New York, NY: John Wiley & Sons.
30. N. Khan, Bayesian Optimization Algorithms for Multiobjective and Hierarchically Difficult Problems. Masters Thesis, 2003, University of Illinois at Urbana-Champaign: Urbana, IL.
CHAPTER 5 USING A PARTICLE SWARM OPTIMIZER WITH A MULTI-OBJECTIVE SELECTION SCHEME TO DESIGN COMBINATIONAL LOGIC CIRCUITS
Erika Hernandez Luna and Carlos A. Coello Coello CINVESTAV-IPN Evolutionary Computation Group Dpto. de Ing. Elect./Secc. Computacion Av. IPN No. 2508, Col. San Pedro Zacatenco Mexico, D.F. 07300, MEXICO E-mail: eluna@computacion.cs.cinvestav.mx
[email protected]

In this chapter, we propose the introduction of a multi-objective selection scheme in a particle swarm optimizer used for designing combinational logic circuits. The proposed selection scheme is based on the use of sub-populations to distribute the search effort in a better way within the particles of the population, so as to accelerate convergence while improving the robustness of the algorithm. For our study, we compare six PSO-based approaches, combining different encodings (integer and binary) with both single- and multi-objective selection schemes. The comparative study performed indicates that the use of a population-based approach combined with an integer encoding improves both the robustness and the quality of results of PSO when designing combinational logic circuits.

5.1. Introduction

The Particle Swarm Optimization (PSO) algorithm is a biologically-inspired technique originally proposed by James Kennedy and Russell Eberhart18,19. PSO has been successfully used as a (mainly nonlinear) optimization technique and has become increasingly popular mainly due to its simplicity (in terms of its implementation), its low computational cost and its good overall performance19. The main idea behind PSO is to simulate the movement of a flock of birds seeking food. In this simulation, the behavior of each individual
gets affected by both an individual and a social factor. Each individual (or particle) contains its current position in the search space as well as its velocity and the best position found by the individual so far19. As with many other biologically-inspired heuristics, PSO is a population-based approach that can be defined as P' = m(f(P)), where P is the population, which consists of a set of positions in search space, f is the fitness function that returns a vector of values indicating the goodness of each individual, and m is a manipulation function that generates a new population from the current population. Such a manipulation function is based on the behavioral model of insect colonies1. PSO can be seen as a distributed behavioral algorithm that performs (in its more general version) multidimensional search. In the simulation, the behavior of each individual is affected by either the best local or the best global individual. The approach uses a population of potential solutions (called "particles") and a measure of performance similar to the fitness value used with evolutionary algorithms. Also, the adjustments of individuals are analogous to the use of a crossover operator. However, this approach introduces the idea of flying potential solutions through hyperspace (used to accelerate convergence). Additionally, PSO allows individuals to benefit from their past experiences19.

In this chapter, we propose the use of a multi-objective selection scheme to design combinational circuits. Our approach is based on some of our previous research on circuit design using genetic algorithms6. The proposal consists of handling each of the matches between a solution generated by our PSO approach and the values specified by the truth table as equality constraints. To avoid the dimensionality problems associated with conventional multi-objective optimization techniques, we use a population-based approach similar to the Vector Evaluated Genetic Algorithm (VEGA)26.

5.2. Problem Statement

The main goal of logic circuit simplification is normally the minimization of the amount of hardware necessary to build a certain particular system, since less hardware will normally imply a lower final cost. The problem of interest to us consists of designing a circuit that performs a desired function (specified by a truth table), given a certain specified set of available logic gates. The complexity of a logic circuit is a function of the number of gates in the circuit. The complexity of a gate generally is a function of the number of inputs to it. Because a logic circuit is a realization (implementation) of
a Boolean function in hardware, reducing the number of literals in the function should reduce the number of inputs to each gate and the number of gates in the circuit, thus reducing the complexity of the circuit. Our overall measure of circuit optimality is the total number of gates used, regardless of their kind. This is approximately proportional to the total part cost of the circuit. Obviously, this sort of analysis must be performed only for fully functional circuits. Boolean functions can be simplified through algebraic manipulations. However, the process is tedious and requires considerable experience from the human designer to achieve compact circuits. As is well known, there are several standard graphical design aids such as Karnaugh Maps17,29, which are widely used by human designers. There are also other tools more suitable for computer implementation, such as the Quine-McCluskey method25,22, Espresso2 and MisII3. Evolutionary algorithms have been applied to the design of circuits of different types, and have been found very useful in a wide variety of applications due to their robustness and exploratory power. The area devoted to the study and application of evolutionary algorithms to the design of electronic circuits is called evolvable hardware27,16,30. This area has been subdivided by some authors into two sub-areas31: (1) intrinsic evolution, which deals with the design and validation of circuits directly in hardware; and (2) extrinsic evolution, which only deals with computer simulations of the circuits without reaching their actual implementation in hardware. Within extrinsic evolution, several types of heuristics have been applied to design combinational logic circuits, for example: genetic programming23,20,11,4, ant colony optimization10, genetic algorithms5, and, only recently, particle swarm optimization13,8.
Despite the drawbacks of classical combinational circuit design techniques, some of them can handle truth tables with hundreds of inputs, whereas evolutionary algorithms are restricted to relatively small truth tables23. However, the most interesting aspect of evolutionary design is the possibility of studying its emergent patterns23,5. The goals are, therefore, different when we design circuits using evolutionary algorithms. First, we aim to optimize circuits (using a certain metric) in a different way, and intuitively, we can think of producing novel designs (since there is no human intervention). Such novel designs have been shown in the past23,24,5,15.
Second, it would be extremely useful to extract design patterns from such evolutionary-generated solutions. This could lead to a practical design process in which a small (optimal) circuit is used as a building block to produce complex circuits. Such a divide-and-conquer approach has also been suggested in the past28,23.

5.3. Our Proposed Approach

The first important component of the algorithm proposed in this chapter is the representation adopted to encode a circuit. In our case, we used a bidimensional matrix as in our previous work5 (see Figure 5.1). More formally, we can say that any circuit can be represented as a bidimensional array of gates Sij, where j indicates the level of a gate, so that those gates closer to the inputs have lower values of j. (Level values are incremented from left to right in Figure 5.1.) For a fixed j, the index i varies with respect to the gates that are "next" to each other in the circuit, but without being necessarily connected. Each matrix element is a gate (there are 5 types of gates: AND, NOT, OR, XOR and WIREd) that receives its 2 inputs from any gate at the previous column, as shown in Figure 5.1. This sort of encoding was originally proposed by Louis21. The so-called "cartesian genetic programming"23 also adopts an encoding similar to the matrix previously described. Using the aforementioned matrix, a logic circuit can be encoded using either binary or integer strings. PSO, however, tends to deal with either binary or real-numbers representations. For our comparative study, we will adopt two integer representations: (1) Integer A: this encoding was proposed by Hu et al.14; (2) Integer B: this encoding is proposed by us.
In the PSO algorithm, the individual factor Pbest refers to the decisions that the individual has made so far and that have worked best (in terms of its performance measure). This value has an impact on its future decisions. Additionally, the social factor Nbest refers to the decisions that the other individuals (within a certain neighborhood) have made so far and that have worked best for them. This value will also affect the future decisions of the individuals in the given neighborhood.

d WIRE basically indicates a null operation, or in other words, the absence of a gate, and it is used just to keep regularity in the representation used by our approach, which otherwise would have to use variable-length strings.
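As a rough illustration of this matrix representation, the sketch below encodes each gate as a (type, input1, input2) triplet and evaluates the circuit column by column. The names and the convention that the last column's outputs are the circuit outputs are our assumptions for illustration, not the chapter's code:

```python
# Gate types; WIRE passes its first input through unchanged (null operation).
AND, NOT, OR, XOR, WIRE = range(5)

def evaluate(circuit, inputs):
    """Evaluate a circuit encoded as columns of (gate, src1, src2) triplets.

    Each gate reads its operands from the outputs of the previous column
    (column 0 reads the circuit inputs), as in Figure 5.1.
    """
    prev = list(inputs)
    for column in circuit:
        out = []
        for gate, s1, s2 in column:
            a, b = prev[s1], prev[s2]
            if gate == AND:
                out.append(a & b)
            elif gate == NOT:
                out.append(1 - a)   # NOT uses only its first input
            elif gate == OR:
                out.append(a | b)
            elif gate == XOR:
                out.append(a ^ b)
            else:                   # WIRE
                out.append(a)
        prev = out
    return prev
```

For instance, a single column `[(AND, 0, 1), (XOR, 0, 1)]` computes the carry and sum bits of a half adder from two input bits.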
Fig. 5.1. Encoding used for each of the matrix elements that represent a circuit.
Figure 5.2 shows the pseudocode of the PSO algorithm that we propose for the design of combinational logic circuits. Its main difference with respect to traditional PSO has to do with the update of the position of the particle in each of its dimensions (marked with ** in Figure 5.2). The main procedure for updating each dimension d of the particle for a traditional binary approach, an Integer A and an Integer B approach is shown next:

• Binary approach:
  if flip[sig(vd)] = 1 then
    Copy into the d position of the particle the value 1
  else
    Copy into the d position of the particle the value 0

• Integer A approach:
  if flip[sig(vd)] = 1 then
    Copy into the d position of the particle the corresponding value of Nbest.

• Integer B approach:
  if flip[sig(vd)] = 1 then
    Copy into the d position of the particle the value of Nbest
  else if flip[1 - sig(vd)] = 1 then
    Copy into the d position of the particle the corresponding value of Pbest.

Randomly initialize the population of particles, P.
Repeat {
  For each particle i in the population P {
    Compute the fitness of the particle P[i]
    If the fitness of P[i] is better than the fitness of the best
      particle found so far, update Pbest[i]
    Update the velocity of P[i] based on (Pbest[i] - P[i]) and (Nbest[i] - P[i])
    ** Update the position of the particle P[i]
  }
  Apply uniform mutation with a (user given) rate.
} Until reaching the stop condition

Fig. 5.2. Pseudocode of the PSO algorithm adopted in this work. Note the addition of a mutation operator.

In all cases, flip[p] returns 1 with a given probability p. The variable vd refers to the velocity of the particle in the d dimension. The function sig normalizes the variable vd and is defined as follows:
    sig(vd) = 1 / (1 + e^(-vd))    (1)

Both the Integer A and the Integer B approaches normalize the velocity of each dimension of the particle into the range 0 to 1, so that we can further determine (in a random way) whether we need to change the current position or not (this is done with the probability given by the velocity). If the change is required, then we copy to the particle the value of Nbest in the current position. Otherwise, the Integer A approach leaves the particle intact. When the change is not required, the Integer B approach checks
again whether it is necessary to change the current position, but now using a probability of 1 - vd, where vd is the current (normalized) velocity. If the change is required, then we copy to the particle the value of Pbest in the position that we are updating. Otherwise, we leave the particle intact. These two integer representations are exemplified in Figure 5.3. As in our previous work8, we introduce here a mutation operator in our PSO algorithm in order to improve its exploratory power, since this seems necessary when applying this approach to the design of circuits. Furthermore, in this case, the particles try to follow the same characteristics of Nbest and Pbest and could get stuck in their current position. Thus, the use of a mutation operator is vital in order to avoid this problem.
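The three per-dimension update rules can be collected into a single hedged sketch; the function and parameter names (`update_dim`, `scheme`, the `rng` stub) are ours, chosen only to mirror the binary, Integer A and Integer B behaviors described above:

```python
import math
import random

def sig(v):
    """Normalize a velocity into (0, 1), per equation (1)."""
    return 1.0 / (1.0 + math.exp(-v))

def flip(p, rng=random):
    """Return True with probability p (the flip[p] primitive)."""
    return rng.random() < p

def update_dim(x, v, nbest, pbest, scheme, rng=random):
    """Update one dimension of a particle.

    For 'binary', x is a bit; for the integer schemes, x holds a gate or
    connection value that may be copied from Nbest or Pbest.
    """
    p = sig(v)
    if scheme == "binary":
        return 1 if flip(p, rng) else 0
    if scheme == "integerA":
        return nbest if flip(p, rng) else x   # else: leave intact
    # Integer B: fall back on Pbest with probability 1 - sig(v)
    if flip(p, rng):
        return nbest
    if flip(1.0 - p, rng):
        return pbest
    return x
```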
Fig. 5.3. Example of the two integer representations used for our PSO algorithm.
5.4. Use of a Multi-Objective Approach

The objective function in our case is defined as in our previous work5: it is the total number of matches between the outputs produced by an encoded circuit and the intended values defined by the user in the truth table. For each match, we increase the value of the objective function by one. If the encoded circuit is feasible (i.e., it matches the truth table completely), then we add one (the so-called "bonus") for each WIRE present in the solution. Note, however, that in this case we use a multi-objective approach to assign fitness. The main idea behind our proposed approach is to use a population-based multi-objective optimization technique similar to VEGA26 to handle each of the outputs of a circuit as an objective (see Figure 5.4). In other words, we would have an optimization problem with m equality constraints, where m is the number of values (i.e., outputs) of the truth table that we aim to match. So, for example, a circuit with 3 inputs and a single output would have m = 2^3 = 8 values to match. At each generation, the population is split into m + 1 sub-populations, where m is defined as indicated before (we have to add one to consider also the objective function). Each sub-population optimizes a separate constraint (in this case, an output of the circuit). Therefore, the main mission of each sub-population is to match its corresponding output with the value indicated by the user in the truth table.
Fig. 5.4. Graphical representation of the selection scheme adopted: the old population is split into m + 1 sub-populations (one assigned to the objective function f(x) and one to each circuit output o_j(x)), the genetic operators are applied, and the new sub-populations are formed.
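The scheme of this section (count the matches, award the WIRE bonus only to feasible circuits, and assign fitness per sub-population) can be sketched in a few lines. This is an illustrative reading, not the authors' code; the flat-list data layout and all names (subpop_fitness, in_R, and so on) are our assumptions.

```python
def f(matches, total, n_wires):
    # f(X): number of matches; once the circuit is feasible (it matches
    # the whole truth table) a "bonus" of one per WIRE is added
    return matches + n_wires if matches == total else matches

def subpop_fitness(j, outputs, targets, n_wires, in_R):
    """Fitness of circuit X when evaluated in sub-population j.
    `outputs` are the m values produced by the encoded circuit,
    `targets` the corresponding truth-table values, and `in_R` marks
    membership in the sub-population R that must match all outputs."""
    v = sum(o != t for o, t in zip(outputs, targets))  # unmatched outputs
    if outputs[j] != targets[j]:
        return 0              # X fails the output this sub-population handles
    if v != 0 and in_R:
        return -v             # in R: penalize by the number of mismatches
    n_matches = len(targets) - v
    return f(n_matches, len(targets), n_wires)
```

For example, a circuit matching 2 of 3 targets scores 0 in a sub-population whose output it misses, −1 in R, and 2 elsewhere.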
The main issue here is how to handle the different situations that could arise. Our proposal is the following:

    if o_j(X) ≠ t_j           then  fitness(X) = 0
    else if v ≠ 0 and X ∈ R   then  fitness(X) = −v
    else                            fitness(X) = f(X)
where o_j(X) refers to the value of output j for the encoded circuit X; t_j is the value specified for output j in the truth table; v is the number of outputs that are not matched by the circuit X (v ≤ m); and R is the sub-population whose objective is to match all the output values from the
truth table. Finally, f(X) is the fitness function defined as:

    f(X) = h(X)            if X is infeasible
    f(X) = h(X) + w(X)     otherwise

In this equation, h(X) refers to the number of matches between the circuit X and the values defined in the truth table, and w(X) is the number of WIREs in the circuit X. As can be seen, the scheme adopted in this work is slightly different from the one used by our MGA reported in 6. The main reason for adopting this approach is that in our experiments, it produced more competitive results, improving in most cases the results obtained with our single-objective PSO, as we will see in the next section.

5.5. Comparison of Results

The truth tables used to validate our PSO approach were taken from the specialized literature. In our experimental study, we compared the following approaches: a binary multi-objective PSO approach (BMPSO), a multi-objective PSO approach using an integer A encoding (EAMPSO), a multi-objective PSO approach using an integer B encoding (EBMPSO), a binary single-objective PSO (BPSO), a single-objective PSO approach using integer A encoding (EAPSO), a single-objective PSO approach using integer B encoding (EBPSO) and the multi-objective genetic algorithm for circuit design (MGA) 6. For each of the examples shown, we performed 20 independent runs, and the available set of gates considered was the following: AND, OR, NOT, XOR and WIRE. We used a matrix of size 5 × 5 in all cases, except for the second example, for which a 6 × 6 matrix was adopted. The parameters adopted by both BPSO and BMPSO were the following: φ1 = φ2 = 0.8, Vmax = 3.0, mutation rate Pm = 0.1 and neighborhood size = 3. EAPSO, EAMPSO, EBPSO and EBMPSO used: φ1 = φ2 = 0.2, Vmax = 0.4, Pm = 0.1 and neighborhood size = 3. The MGA used Pm = 0.00667 and a crossover rate = 0.5 (as suggested in 6).

5.5.1. Example 1

Our first example has 4 inputs and 1 output, as shown in Table 5.11. The additional parameters adopted by each approach are shown in Table 5.12.
Note that we attempted to perform the same number of fitness function evaluations with all the approaches compared. In Table 5.13, we show a comparison of the results of all the approaches adopted. The best solution
Hernandez Luna and Coello Coello
S = (B + (D ⊕ A))' + (C ⊕ D(D ⊕ A))
Fig. 5.5. Diagram and Boolean expression corresponding to the best solution found by our multi-objective PSO approaches for example 1.
found for this example has 6 gates and is graphically shown in Figure 5.5. Note that both BMPSO and EBMPSO were able to find a circuit that uses one gate less than their single-objective counterparts (i.e., BPSO and EBPSO). Nevertheless, the average fitness values of both BMPSO and EBMPSO were lower than those of their single-objective counterparts. Also note that although EAMPSO was not able to improve the solutions obtained by EAPSO, its percentage of feasible circuits increased from 65% to 85%. Also, the average fitness of EAMPSO was 30.25, compared to the 26.75 value produced by EAPSO. In this example, the MGA did not perform too well when compared with any of our PSO versions. Its percentage of feasible circuits was low (35%) and it was not able to find the solution with only 6 gates produced by some of the PSO approaches. Another interesting fact was that EBPSO had the best average fitness (31.2), but was not able to produce circuits with 6 gates. EAMPSO, in contrast, had the second best average fitness (30.25), but was able to find circuits with only 6 gates 5% of the time. Thus, EAMPSO can be considered the best overall performer in this example. The Boolean expression corresponding to the best solution found by a human designer is: S = ((A ⊕ B) ⊕ ((AD)(B + C))) + ((A + C) + D)'. This solution has 9 gates and was generated using Karnaugh maps and Boolean algebra. This solution has been reported before in the specialized literature (see 7) and can be used as a reference to compare the results obtained by our PSO approach. The best solution found by our PSO approaches only requires 6 gates.

5.5.2. Example 2

Our second example has 4 inputs and 1 output and its truth table is shown in Table 5.14. The additional parameters adopted by each approach are shown in Table 5.15.
Table 5.11. Truth table for example 1 (inputs D, C, B, A; output S).
Table 5.12. Parameters adopted for example 1.

    Technique   Population size   Iterations   Fitness function evaluations
    MPSO        68                1,471        100,028
    PSO         50                2,000        100,000
    MGA         170               600          102,000
Table 5.13. Comparison of the results obtained by our multi-objective versions of PSO, our single-objective PSO versions, MGA and a human designer for the first example. b.s. = best solution.

    approach         gates b.s.   freq. b.s.   feas. circs.   avg. # gates   avg. fitn.   std. dev.
    BMPSO            8            5%           20%            22.8           18.2         6.622
    EAMPSO           6            5%           85%            10.75          30.25        6.680
    EBMPSO           6            5%           75%            12.75          28.25        7.953
    BPSO             9            15%          45%            19.1           21.9         7.887
    EAPSO            6            5%           65%            14.25          26.75        8.902
    EBPSO            7            30%          90%            9.8            31.2         5.616
    MGA              7            15%          35%            19.95          21.05        8.929
    Human designer   9            -            -              -              -            -
In Table 5.16, we show a comparison of the results of all the approaches adopted. The best solution found for this example has 6 gates and is graphically shown in Figure 5.6. Note that in this example, BPSO had a slightly
S = (C ⊕ D)(B ⊕ C) + ((B ⊕ A) ⊕ (C ⊕ D))
Fig. 5.6. Diagram and Boolean expression corresponding to the best solution found by our multi-objective PSO approaches for example 2.
better performance than BMPSO (both in terms of average fitness and in terms of the frequency with which the best solution was found). The two multi-objective algorithms that adopted an integer encoding (EAMPSO and EBMPSO) showed an excellent performance, being able to find a circuit with 6 gates (fitness of 35) in every single run (the standard deviation was zero). This performance is significantly better than that of the single-objective versions of these two algorithms (EAPSO and EBPSO). Again, the MGA did not perform too well when compared with any of our PSO versions. The MGA was not able to produce feasible circuits in all of its runs, and the best circuit was found only 30% of the time. In this case, both EAMPSO and EBMPSO were the best overall performers, with an average fitness of 35 and a standard deviation of zero. The Boolean expression corresponding to the best solution found by a human designer is: S = (A ⊕ B) ⊕ (C ⊕ D) + D'(CA) + B(A'D). This solution has 11 gates and was generated using Karnaugh maps and Boolean algebra. It is worth contrasting the best solution produced by the human designer with the best solution found by our PSO approaches, which only requires 6 gates.

5.5.3. Example 3

Our third example has 5 inputs and 1 output, as shown in Table 5.17. The additional parameters adopted by each approach are shown in Table 5.18. In Table 5.19, we show a comparison of the results of all the approaches adopted. The best solution found for this example has 7 gates and is graphically shown in Figure 5.7. In this case, none of the binary versions of PSO was able to produce feasible circuits, which exemplifies the usefulness of
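Since the printed Boolean expressions for example 2 are easy to mis-transcribe, one can sanity-check them by brute force over all 16 input combinations. The two functions below encode the expressions as reconstructed here; under that reading, the 6-gate PSO circuit and the 11-gate human design realize the same truth table.

```python
from itertools import product

def xor(a, b):
    return a ^ b

def pso_circuit(a, b, c, d):
    # 6-gate PSO solution as reconstructed above:
    # S = (C xor D)(B xor C) + ((B xor A) xor (C xor D))
    return (xor(c, d) & xor(b, c)) | xor(xor(b, a), xor(c, d))

def human_circuit(a, b, c, d):
    # 11-gate human design as reconstructed above:
    # S = (A xor B) xor (C xor D) + D'(CA) + B(A'D)
    return xor(xor(a, b), xor(c, d)) | ((1 - d) & c & a) | (b & (1 - a) & d)

# Both circuits realize the same 16-row truth table:
assert all(pso_circuit(a, b, c, d) == human_circuit(a, b, c, d)
           for a, b, c, d in product((0, 1), repeat=4))
```

The same brute-force check applies to any of the other examples once their expressions are transcribed.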
Table 5.14. Truth table for example 2 (inputs D, C, B, A; output S).

Table 5.15. Parameters adopted for example 2.

    Technique   Population size   Iterations   Fitness function evaluations
    MPSO        68                1,471        100,028
    PSO         50                2,000        100,000
    MGA         170               600          102,000

Table 5.16. Comparison of the results obtained by our multi-objective versions of PSO, our single-objective PSO versions, MGA and a human designer for the second example. b.s. = best solution.

    approach         gates b.s.   freq. b.s.   feas. circs.   avg. # gates   avg. fitn.   std. dev.
    BMPSO            6            65%          100%           6.9            34.1         1.3338
    EAMPSO           6            100%         100%           6              35           0
    EBMPSO           6            100%         100%           6              35           0
    BPSO             6            75%          100%           6.75           34.25        1.6181
    EAPSO            6            75%          95%            7.3            33.7         4.4615
    EBPSO            6            85%          100%           6.15           34.85        0.3664
    MGA              6            30%          90%            9.3            31.7         6.2669
    Human designer   11           -            -              -              -            -
adopting integer encodings in PSO. There were mixed results for the other approaches. Both EAMPSO and EAPSO found the best solution with the same frequency (15%), but EAMPSO found feasible circuits 40% of the time (versus 35% of EAPSO). In terms of average fitness both EAMPSO and EAPSO had similar results (41.7 vs. 40.45). Thus, we can conclude
S = (((E + D)(B ⊕ A))(C + (ED))) ⊕ B
Fig. 5.7. Diagram and Boolean expression corresponding to the best solution found by our multi-objective PSO approaches for example 3.
that EAMPSO was the best overall performer in this example. Interestingly, EBPSO had both the highest average fitness (41.9) and the highest percentage of feasible circuits (45%), but was not able to find a circuit with 7 gates. The MGA was able to find circuits with 7 gates, but both its percentage of feasible circuits (20%) and its average fitness (36) were low in comparison with the multi-objective PSO approaches. The Boolean expression corresponding to the best solution found by a human designer is: S = B(D'C + E'(D ⊕ C)) + A(DC + E(D ⊕ C)). This solution has 13 gates and was generated using Karnaugh maps and Boolean algebra. It is worth contrasting the best solution produced by the human designer with the best solution found by our PSO approaches, which only requires 7 gates.

5.5.4. Example 4

Our fourth example has 4 inputs and 2 outputs, as shown in Table 5.20. The additional parameters adopted by each approach are shown in Table 5.21. In Table 5.22, we show a comparison of the results of all the approaches adopted. The best solution found for this example has 7 gates and is graphically shown in Figure 5.8. In this case, BPSO produced considerably better results than its multi-objective counterpart (BMPSO), both in terms of average fitness (46.95 vs. 38.60) and in terms of the percentage of feasible circuits produced (95% vs. 50%). EAMPSO, however, was able to considerably improve the results produced by its single-objective counterpart (EAPSO), also in terms of both average fitness (49.25 vs. 43.55) and percentage of feasible circuits produced (100% vs. 70%). Note that both EBMPSO and EBPSO were able to find feasible circuits in all their runs and had similar average fitnesses (49.85 vs. 49.25), but the former converged more often
Table 5.17. Truth table for example 3 (inputs E, D, C, B, A; output S).

Table 5.18. Parameters adopted for example 3.

    Technique   Population size   Iterations   Fitness function evaluations
    MPSO        99                20,000       1,980,000
    PSO         50                39,600       1,980,000
    MGA         330               6,000        1,980,000
to the best solution found (90% vs. 60%). In fact, EBMPSO was the best overall performer in this example. Again, the MGA had a poor performance with respect to the PSO-based multi-objective approaches (EAMPSO and EBMPSO), although it had a better average fitness than both BMPSO and
Table 5.19. Comparison of the results obtained by our multi-objective versions of PSO, our single-objective PSO versions, MGA and a human designer for the third example. b.s. = best solution.

    approach         gates b.s.   freq. b.s.   feas. circs.   avg. # gates   avg. fitn.   std. dev.
    BMPSO            *            0%           0%             *              29.8         0.410
    EAMPSO           7            15%          40%            26.3           41.7         14.543
    EBMPSO           7            5%           20%            32.25          35.75        11.461
    BPSO             *            0%           0%             *              29.9         0.308
    EAPSO            7            15%          35%            27.55          40.45        14.529
    EBPSO            8            20%          45%            26.1           41.9         13.619
    MGA              7            5%           20%            32             36           13.322
    Human designer   13           -            -              -              -            -
Table 5.20. Truth table for example 4 (inputs D, C, B, A; outputs S0, S1).
EAPSO, and was also able to find the circuit of 7 gates generated by the PSO-based approaches. The Boolean expression corresponding to the best solution found by a human designer is: S0 = B'D' + C'A'(D' + B') and S1 = BD(A + C). This solution has 12 gates and was generated using Karnaugh maps and Boolean algebra. Note that the outputs were solved separately (as traditionally done when using Karnaugh maps). It is worth contrasting the best solution produced by the human designer with the best solution found by our PSO approaches, which only requires 7 gates.
S0 = ((CA)(B + D) + BD)'    S1 = (CA)(B + D)(BD)
Fig. 5.8. Diagram and Boolean expression corresponding to the best solution found by our multi-objective PSO approaches for example 4.

Table 5.21. Parameters adopted for example 4.

    Technique   Population size   Iterations   Fitness function evaluations
    MPSO        99                2,000        198,000
    PSO         50                4,000        200,000
    MGA         330               610          201,300

Table 5.22. Comparison of the results obtained by our multi-objective versions of PSO, our single-objective PSO versions, MGA and a human designer for the fourth example. b.s. = best solution.

    approach         gates b.s.   freq. b.s.   feas. circs.   avg. # gates   avg. fitn.   std. dev.
    BMPSO            7            10%          50%            18.4           38.6         11.210
    EAMPSO           7            65%          100%           7.75           49.25        1.333
    EBMPSO           7            90%          100%           7.15           49.85        0.489
    BPSO             7            30%          95%            10.05          46.95        4.330
    EAPSO            7            40%          70%            13.45          43.55        8.530
    EBPSO            7            60%          100%           7.75           49.25        1.160
    MGA              7            25%          75%            13.4           43.6         8.090
    Human designer   12           -            -              -              -            -
5.5.5. Example 5

Our fifth example has 4 inputs and 3 outputs, as shown in Table 5.23. The additional parameters adopted by each approach are shown in Table 5.24. In Table 5.25, we show a comparison of the results of all the approaches adopted. The best solution found for this example has 7 gates and is graphically shown in Figure 5.9. In this case, none of the binary versions of PSO was able to generate feasible circuits. Note that the performance of EAPSO
S0 = (AC ⊕ (B ⊕ D)) ⊕ ((D ⊕ AC) + (B ⊕ D))    S1 = AC ⊕ (B ⊕ D)    S2 = C ⊕ A
Fig. 5.9. Diagram and Boolean expression corresponding to the best solution found by our multi-objective PSO approaches for example 5.
was better than that of EAMPSO, both in terms of average fitness (55.85 vs. 53.30) and in terms of the frequency with which the best solution was found (10% vs. 5%). However, EBMPSO had a slightly better performance than EBPSO, both in terms of average fitness (58.90 vs. 58.75) and in terms of the frequency with which the best solution was found (35% vs. 15%). Nevertheless, EBPSO had a slightly better percentage of feasible circuits found than EBMPSO (70% vs. 65%). Although marginally, we conclude that EBMPSO was the best overall performer in this example. The MGA was not able to generate circuits with 7 gates, but it found feasible circuits more consistently than most of the PSO-based approaches. The Boolean expression corresponding to the best solution found by a human designer is: S0 = (AC)(B ⊕ D) + BD, S1 = C'(B ⊕ D) + C(A ⊕ (B ⊕ D)) and S2 = A ⊕ C. This solution has 11 gates and was generated using Karnaugh maps and Boolean algebra. Note that the outputs were solved separately. It is worth contrasting the best solution produced by the human designer with the best solution found by our PSO approaches, which only requires 7 gates.

5.5.6. Example 6

Our sixth example has 4 inputs and 4 outputs, as shown in Table 5.26. The additional parameters adopted by each approach are shown in Table 5.27. In Table 5.28, we show a comparison of the results of all the approaches adopted. The best solution found for this example has 7 gates
Table 5.23. Truth table for example 5 (inputs D, C, B, A; outputs S0, S1, S2).

Table 5.24. Parameters adopted for example 5.

    Technique   Population size   Iterations   Fitness function evaluations
    MPSO        147               5,000        735,000
    PSO         50                14,700       735,000
    MGA         490               1,500        735,000

Table 5.25. Comparison of the results obtained by our multi-objective versions of PSO, our single-objective PSO versions, MGA and a human designer for the fifth example. b.s. = best solution.

    approach         gates b.s.   freq. b.s.   feas. circs.   avg. # gates   avg. fitn.   std. dev.
    BMPSO            *            0%           0%             *              44.5         1.100
    EAMPSO           7            5%           45%            19.7           53.3         8.053
    EBMPSO           7            35%          65%            14.1           58.9         8.985
    BPSO             *            0%           0%             *              45.65        1.089
    EAPSO            7            10%          55%            17.15          55.85        8.610
    EBPSO            7            15%          70%            14.25          58.75        8.123
    MGA              8            10%          70%            15.9           57.1         7.490
    Human designer   11           -            -              -              -            -
and is graphically shown in Figure 5.10. In this case, none of the binary versions of PSO was able to produce feasible circuits. The performance of EAMPSO was considerably better than that of its single-objective counterpart (EAPSO) both in terms of frequency of the best solution found
S0 = (CA)(DB)    S1 = DB ⊕ (CA)(DB)    S2 = DA ⊕ BC    S3 = CA
Fig. 5.10. Diagram and Boolean expression corresponding to the best solution found by our multi-objective PSO approaches for example 6.
(30% vs. 10%) and in terms of the percentage of feasible circuits found (80% vs. 35%). EBMPSO also had a better performance than its single-objective counterpart (EBPSO), both in terms of the frequency of the best solution found (25% vs. 15%) and in terms of the percentage of feasible circuits found (75% vs. 35%). In this case, the MGA performed better than any of the PSO-based approaches, producing the highest average fitness (80.4) with the lowest number of fitness function evaluations. Thus, the MGA was the best overall performer in this example. The Boolean expression corresponding to the best solution found by a human designer is: S0 = (DC)(BA), S1 = (DB)(CA)', S2 = CB ⊕ DA and S3 = CA. This solution has 8 gates and was reported in 7, where a multi-objective genetic algorithm was used. It is worth noticing that the best solution found by our PSO approaches uses only 7 gates.

5.6. Conclusions and Future Work

In this chapter, we have introduced a population-based PSO approach (similar to VEGA 26) to design combinational logic circuits. We have also presented a study in which six PSO-based algorithms were compared (using both single- and multi-objective schemes and different encodings). Also, a population-based genetic algorithm (MGA) was included in the comparison, since we were interested in analyzing the effect of the search engine adopted on the quality and consistency of the results obtained. The results obtained clearly indicate that the population-based PSO approaches proposed perform better than the MGA.
Table 5.26. Truth table for example 6 (inputs D, C, B, A; outputs S0, S1, S2, S3).

Table 5.27. Parameters adopted for example 6.

    Technique   Population size   Iterations   Fitness function evaluations
    MPSO        195               5,000        975,000
    PSO         50                19,500       975,000
    MGA         650               500          325,000

Table 5.28. Comparison of the results obtained by our multi-objective versions of PSO, our single-objective PSO versions, MGA and a human designer for the sixth example. b.s. = best solution.

    approach         gates b.s.   freq. b.s.   feas. circs.   avg. # gates   avg. fitn.   std. dev.
    BMPSO            *            0%           0%             *              60.35        0.7452
    EAMPSO           7            30%          80%            11.8           77.2         7.7432
    EBMPSO           7            25%          75%            13.15          75.85        8.0934
    BPSO             *            0%           0%             *              60.75        0.6387
    EAPSO            7            10%          35%            21.2           67.8         8.9713
    EBPSO            7            15%          35%            22.05          66.95        8.64
    MGA              7            15%          100%           8.6            80.4         1.14
    Human designer   8            -            -              -              -            -
Within the six PSO-based techniques compared, it was clear that the approaches that adopted both a multi-objective selection scheme and an Integer B encoding 8 were the best overall performers. The results also suggest that the use of binary PSO for designing combinational logic circuits
is not advisable, since this sort of approach had difficulties even reaching the feasible region in some cases. An interesting outcome of our study is the finding that PSO acts as a better search engine than a genetic algorithm when adopting a population-based selection scheme for designing combinational logic circuits. As part of our future work, we are interested in exploring alternative encodings (e.g., graphs and trees) that have not been used so far with particle swarm optimizers 19. We are also interested in studying some alternative multi-objective selection schemes (e.g., Pareto ranking 12) in the context of combinational circuit design using PSO 9.

Acknowledgements

The first author acknowledges support from CONACyT through a scholarship to pursue graduate studies at the Computer Science Section of the Electrical Engineering Department at CINVESTAV-IPN. The second author gratefully acknowledges support from CONACyT through project 42435-Y.
References

1. Peter J. Angeline. Evolutionary optimization versus particle swarm optimization: philosophy and performance differences. In V. W. Porto, N. Saravanan, D. Waagen and A. E. Eiben, editors, Evolutionary Programming VII: Proceedings of the Seventh Annual Conference on Evolutionary Programming, pages 611-618. Springer, 1998.
2. R. K. Brayton, G. D. Hachtel, C. T. McMullen, and A. L. Sangiovanni-Vincentelli. Logic Minimization Algorithms for VLSI Synthesis. Kluwer Academic Publishers, Dordrecht, The Netherlands, 1984.
3. R. K. Brayton, R. Rudell, A. Sangiovanni-Vincentelli, and A. R. Wang. MIS: A multiple-level logic optimization system. IEEE Transactions on Computer-Aided Design, CAD-6(6):1062-1081, November 1987.
4. Bill P. Buckles, Arturo Hernandez Aguirre, and Carlos Coello Coello. Circuit design using genetic programming: An illustrative study. In Proceedings of the 10th NASA Symposium on VLSI Design, pages 4.1-1-4.1-10, Albuquerque, NM, 2002.
5. Carlos A. Coello Coello, Alan D. Christiansen, and Arturo Hernandez Aguirre. Use of Evolutionary Techniques to Automate the Design of Combinational Circuits. International Journal of Smart Engineering System Design, 2(4):299-314, June 2000.
6. Carlos A. Coello Coello and Arturo Hernandez Aguirre. Design of combinational logic circuits through an evolutionary multiobjective optimization approach. Artificial Intelligence for Engineering, Design, Analysis and Manufacture, 16(1):39-53, January 2002.
7. Carlos A. Coello Coello, Arturo Hernandez Aguirre, and Bill P. Buckles. Evolutionary Multiobjective Design of Combinational Logic Circuits. In Jason Lohn, Adrian Stoica, Didier Keymeulen, and Silvano Colombano, editors, Proceedings of the Second NASA/DoD Workshop on Evolvable Hardware, pages 161-170. IEEE Computer Society, Los Alamitos, California, July 2000.
8. Carlos A. Coello Coello, Erika Hernandez Luna, and Arturo Hernandez Aguirre. Use of particle swarm optimization to design combinational logic circuits. In Andy M. Tyrrell, Pauline C. Haddow and Jim Torresen, editors, Evolvable Systems: From Biology to Hardware. 5th International Conference, ICES 2003, pages 398-409, Trondheim, Norway, 2003. Springer, Lecture Notes in Computer Science Vol. 2606.
9. Carlos A. Coello Coello, David A. Van Veldhuizen, and Gary B. Lamont. Evolutionary Algorithms for Solving Multi-Objective Problems. Kluwer Academic Publishers, New York, May 2002. ISBN 0-3064-6762-3.
10. Carlos A. Coello Coello, Rosa Laura Zavala Gutierrez, Benito Mendoza Garcia, and Arturo Hernandez Aguirre. Automated Design of Combinational Logic Circuits using the Ant System. Engineering Optimization, 34(2):109-127, March 2002.
11. Edgar Galvan Lopez, Riccardo Poli, and Carlos A. Coello Coello. Reusing Code in Genetic Programming. In Genetic Programming, 7th European Conference, EuroGP'2004, pages 359-368, Coimbra, Portugal, April 2004. Springer, Lecture Notes in Computer Science Vol. 3003.
12. David E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison Wesley, Reading, MA, 1989.
13. Venu G. Gudise and Ganesh K. Venayagamoorthy. Evolving digital circuits using particle swarm. In Proceedings of the INNS-IEEE International Joint Conference on Neural Networks, pages 468-472, Portland, OR, USA, 2003.
14. Xiaohui Hu, Russell C. Eberhart, and Yuhui Shi. Swarm intelligence for permutation optimization: a case study on n-queens problem. In Proceedings of the IEEE Swarm Intelligence Symposium 2003 (SIS 2003), pages 243-246, Indianapolis, Indiana, USA, 2003.
15. Eduardo Islas Perez, Carlos A. Coello Coello, and Arturo Hernandez Aguirre. Extraction of Design Patterns from Evolutionary Algorithms using Case-Based Reasoning. In Yong Liu, Kiyoshi Tanaka, Masaya Iwata, Tetsuya Higuchi, and Moritoshi Yasunaga, editors, Evolvable Systems: From Biology to Hardware (ICES'2001), pages 244-255. Springer-Verlag, Lecture Notes in Computer Science No. 2210, October 2001.
16. Tatiana Kalganova. A new evolutionary hardware approach for logic design. In Annie S. Wu, editor, Proc. of the GECCO'99 Student Workshop, pages 360-361, Orlando, Florida, USA, 1999.
17. M. Karnaugh. A map method for synthesis of combinational logic circuits. Transactions of the AIEE, Communications and Electronics, 72(I):593-599, November 1953.
18. James Kennedy and Russell C. Eberhart. Particle Swarm Optimization. In Proceedings of the 1995 IEEE International Conference on Neural Networks, pages 1942-1948, Piscataway, New Jersey, 1995. IEEE Service Center.
19. James Kennedy and Russell C. Eberhart. Swarm Intelligence. Morgan Kaufmann Publishers, San Francisco, California, 2001.
20. John R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, Massachusetts, 1992.
21. Sushil J. Louis. Genetic Algorithms as a Computational Tool for Design. PhD thesis, Department of Computer Science, Indiana University, August 1993.
22. E. J. McCluskey. Minimization of Boolean functions. Bell Systems Technical Journal, 35(5):1417-1444, November 1956.
23. Julian F. Miller, Dominic Job, and Vesselin K. Vassilev. Principles in the Evolutionary Design of Digital Circuits—Part I. Genetic Programming and Evolvable Machines, 1(1/2):7-35, April 2000.
24. Julian F. Miller, Tatiana Kalganova, Natalia Lipnitskaya, and Dominic Job. The Genetic Algorithm as a Discovery Engine: Strange Circuits and New Principles. In Proceedings of the AISB Symposium on Creative Evolutionary Systems (CES'99), Edinburgh, Scotland, April 1999.
25. W. V. Quine. A way to simplify truth functions. American Mathematical Monthly, 62(9):627-631, 1955.
26. J. David Schaffer. Multiple Objective Optimization with Vector Evaluated Genetic Algorithms. In Genetic Algorithms and their Applications: Proceedings of the First International Conference on Genetic Algorithms, pages 93-100. Lawrence Erlbaum, 1985.
27. Timothy G. W. Gordon and Peter J. Bentley. On evolvable hardware. In Seppo J. Ovaska and Les M. Sztandera, editors, Soft Computing in Industrial Electronics, pages 279-323. Physica-Verlag, Heidelberg, 2002.
28. Jim Torresen. A Divide-and-Conquer Approach to Evolvable Hardware. In Moshe Sipper, Daniel Mange, and Andres Perez-Uribe, editors, Proceedings of the Second International Conference on Evolvable Systems (ICES'98), pages 57-65, Lausanne, Switzerland, 1998. Springer-Verlag.
29. E. W. Veitch. A Chart Method for Simplifying Boolean Functions. Proceedings of the ACM, pages 127-133, May 1952.
30. Xin Yao and Tetsuya Higuchi. Promises and Challenges of Evolvable Hardware. In Tetsuya Higuchi, Masaya Iwata, and W. Liu, editors, Proceedings of the First International Conference on Evolvable Systems: From Biology to Hardware (ICES'96), Lecture Notes in Computer Science, Vol. 1259, pages 55-78, Heidelberg, Germany, 1997. Springer-Verlag.
31. Ricardo S. Zebulum, M. A. Pacheco, and M. Vellasco. Evolvable Systems in Hardware Design: Taxonomy, Survey and Applications. In T. Higuchi and M. Iwata, editors, Proceedings of the First International Conference on Evolvable Systems (ICES'96), pages 344-358, Berlin, Germany, 1997. Springer-Verlag.
CHAPTER 6

APPLICATION OF MULTI-OBJECTIVE EVOLUTIONARY ALGORITHMS IN AUTONOMOUS VEHICLES NAVIGATION
Tomonari Furukawa*, Gamini Dissanayake† and Hugh F. Durrant-Whyte‡

*ARC Centre of Excellence in Autonomous Systems
School of Mechanical and Manufacturing Engineering
The University of New South Wales, Sydney 2052, Australia
E-mail: [email protected]

†Faculty of Engineering
The University of Technology, Sydney 2007, Australia
E-mail: [email protected]

‡Australian Centre for Field Robotics
The University of Sydney, Sydney 2006, Australia
E-mail: [email protected]

The successful navigation of an autonomous vehicle depends heavily on the accuracy of the parameters of the vehicle and sensor models, which are determined before the vehicle is in use. One of the main sources of error is the way parameters are currently determined prior to operation, partly because the currently accepted procedure for determining the parameters is not sufficiently accurate and partly because the parameters vary as the vehicle is driven. This chapter presents an application of multi-objective evolutionary algorithms to sensor and vehicle parameter determination for successful autonomous vehicle navigation. Following the multi-objective formulation, a general framework for multi-objective optimization, two types of search methods to find solutions efficiently, and a technique for selecting a final solution from the multiple solutions are proposed. The proposed parameter determination technique was applied to an autonomous vehicle developed by the authors, and an appropriate parameter set has been obtained.
T. Furukawa, et al.
6.1. Introduction Driving a vehicle in an unstructured outdoor environment, involving skillful manoeuvres, often exposes a human driver to significant danger. Autonomous vehicles, which carry a navigation system to provide the knowledge of vehicle position and trajectory and subsequently control the vehicle along a desired path, have received considerable attention in the last decade1 ~3. Sensors used in the navigation system can be classified into two types, namely absolute sensors and relative sensors. The absolute sensors directly measure the position and/or orientation of the vehicle with respect to its environment. The laser range finder that observes beacons present in the environment4, Inertial Measurement Unit (IMU) that measures the angular velocities and the accelerations of the vehicle in three orthogonal axes, Global Positioning System (GPS), compass and gyroscope belong to this class of sensors. The relative sensors, usually known as dead-reckoning sensors, measure the vehicle state internally from the vehicle drive train. The wheel and steering encoders are typical examples for this class. The role of dead-reckoning sensors in navigation, together with a kinematic model, is to predict the position and orientation of the vehicle. Absolute sensors are, meanwhile, employed to reset errors that inevitably accumulate due to the integration present in the prediction step. As absolute information is usually not available at high enough rates to be useful for control purposes, it is important that the dead-reckoning sensors provide accurate information between such updates. Despite the dramatic theoretical progress, the bottleneck for the successful vehicle autonomy is the inaccuracy of kinematic parameters and calibration factors used in the kinematic vehicle and sensor models. 
The parameters associated with the kinematic vehicle and sensor models are measured or computed only when the vehicle is designed or commissioned1, although the characteristics of electro-mechanical systems change gradually with time. Further, the encoders attached to the wheels and steering are calibrated using only specific manoeuvres, such as moving along straight lines and circular paths5,6, and correlating the distance travelled as measured by the encoders and an external measuring device, typically a tape measure. The solution to this accuracy problem is to compute the kinematic parameters and calibration constants using data gathered during the normal operation of an autonomous vehicle7. This makes it possible to check whether the parameters used in the navigation algorithms are accurate and
Application of MOEAs in Autonomous Vehicles Navigation
make any necessary changes without resorting to specific test manoeuvres or modifications to sensor configurations. This chapter describes an application of multi-objective evolutionary algorithms8-12 (MOEAs) to the identification of such kinematic parameters and calibration factors used in autonomous navigation. Since the difference between the vehicle state computed using the kinematic equations and the relative sensors and that computed using an absolute sensor is the criterion used to find the parameters, one may think of applying any of the conventional single-objective optimization methods13-20. The primary reason for the use of a multi-objective optimization technique stems from the fact that the difference is represented in terms of two different types of error, the position error and the orientation error, each of which is derived from a different set of sensor readings21-23. In accordance with the multi-objective problem formulation, a general framework for multi-objective optimization is presented, and, further, Multi-objective Continuous Evolutionary Algorithms (MCEAs) and a Multi-objective Gradient-based Method (MOGM) are proposed to solve this class of multi-objective optimization problems efficiently. The solution to a multi-objective optimization problem is a solution space rather than a single point, so the multi-objective optimization method results in finding Pareto-optimal solutions, which describe the solution space through their distribution. The Center-of-Gravity Method (CoGM) is thus proposed to select an appropriate final solution from the Pareto-optimal solutions. This chapter is organized as follows. Section 6.2 provides the background material on autonomous vehicles and describes the experimental setup used for obtaining the data. The parameter identification problem for autonomous vehicles is formulated as a multi-objective optimization problem in Section 6.3.
A general framework for finding Pareto-optimal solutions and the CoGM for selecting a final solution are also described. Section 6.4 describes the MCEA and MOGM, whereas the solution to the problem of parameter identification in autonomous vehicles is provided in Section 6.5. Conclusions are summarized in Section 6.6.

6.2. Autonomous Vehicles

6.2.1. Experimental Setup
Fig. 6.1 shows the vehicle used as a test bed for research into the navigation of autonomous vehicles. This is a rear-wheel-driven vehicle that is steered using an Ackermann-type steering linkage driving the front wheels. Four sensors
are mounted on the vehicle. As dead-reckoning sensors, an encoder fitted to the rear left wheel gives a measure of the vehicle's speed, and a linear variable differential transformer (LVDT) on the steering rack provides a measurement proportional to the steering angle. The encoder and the LVDT are read at a rate of 20 Hz. A Carrier Phase Differential GPS unit, with a rated accuracy of 0.02 m in position and 0.02 m/s in velocity when at least six satellites are in view, is used to measure the absolute position of the vehicle at a sample rate of 4 Hz. An inertial measurement unit comprising three orthogonal gyroscopes and three accelerometers is also mounted on the vehicle. In the work described in this chapter, only one of these gyroscopes is used, to measure the angular velocity of the vehicle about a vertical axis. The inertial measurement unit provides information at a sample rate of 125 Hz.
Fig. 6.1. Autonomous vehicle.
6.2.2. Vehicle Model

The kinematic model of a vehicle moving in the horizontal plane is shown in Fig. 6.2. The location of the vehicle is given by the state variables [x, y, φ], where x and y are the coordinates of the center of the rear axle, and φ is the orientation of the vehicle body as shown. The inputs that are used to control the vehicle are the velocity at the center of the rear axle, v, and the average
steering angle γ. The equations of motion for this vehicle at any time instant k are given by

ẋ(k) = v(k)·cos φ(k),
ẏ(k) = v(k)·sin φ(k),
φ̇(k) = (v(k)/l)·tan γ(k),    (1)

where l is the vehicle wheel base.
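Equation (1) can be integrated numerically to predict the vehicle trajectory from the dead-reckoning inputs. The following is a minimal sketch using simple Euler integration; the step size dt and the data layout are assumptions of this illustration, not part of the chapter:

```python
import math

def simulate(x, y, phi, v_seq, gamma_seq, l, dt):
    """Euler-integrate the kinematic model of Eq. (1).

    v_seq, gamma_seq: control inputs v(k) and gamma(k) per time step;
    l: wheel base; dt: integration step (an assumption of this sketch).
    Returns the trajectory [(x, y, phi), ...] including the initial state.
    """
    traj = [(x, y, phi)]
    for v, gamma in zip(v_seq, gamma_seq):
        x += v * math.cos(phi) * dt            # x_dot = v cos(phi)
        y += v * math.sin(phi) * dt            # y_dot = v sin(phi)
        phi += (v / l) * math.tan(gamma) * dt  # phi_dot = (v/l) tan(gamma)
        traj.append((x, y, phi))
    return traj
```

With zero steering angle the model reduces to straight-line motion, which is a convenient sanity check.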
Fig. 6.2. State and control of the vehicle.
6.2.3. Relative Sensor Models

6.2.3.1. Steering Encoder

The steering encoder measures the displacement of the steering rack pSTE(k), which is linearly proportional to the steering angle. The steering angle γ(k) can be expressed as

γ(k) = c1·pSTE(k) + c2,    (2)

where c1 and c2 are the gain and the offset of the encoder.

6.2.3.2. Wheel Encoder

The wheel encoder provides the angular position of the left rear wheel of the vehicle. The difference between successive position measurements can be used to determine the velocity of the left rear wheel. The velocity of the vehicle
v(k) is related to the velocity measured by the encoder pVEL(k) through the following kinematic transformation:

v(k) = c3·pVEL(k) + φ̇(k)·b,    (3)

where c3 is the gain of the encoder (b is the offset of the encoder wheel from the center of the rear axle; see Fig. 6.3). Note that the substitution of Eq. 1 into Eq. 3 introduces another velocity term v(k). Assembling the velocity terms, the resultant velocity v(k) is described as

v(k) = c3·pVEL(k) / (1 − (b/l)·tan γ(k)).    (4)
6.2.4. Absolute Sensor Models

6.2.4.1. Global Positioning Systems

The GPS mounted on the vehicle directly provides the absolute position [xGPS, yGPS] at which the sensor is mounted. The position of the vehicle obtained from the GPS sensor, [x̂, ŷ], is given by the following equations:

x̂(k) = xGPS(k) − r·cos{φ̂(k) + θ},
ŷ(k) = yGPS(k) − r·sin{φ̂(k) + θ},    (5)

where r and θ are the polar coordinates of the GPS unit with respect to the local coordinate frame on the vehicle, as shown in Fig. 6.3. Note that [x̂, ŷ, φ̂] represents the position/orientation obtained from the absolute sensor measurements.
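The sensor models of Eqs. (2), (4) and (5) translate directly into code; the following is a minimal sketch in which the function names and argument order are illustrative only:

```python
import math

def steering_angle(p_ste, c1, c2):
    # Eq. (2): steering-rack displacement -> steering angle
    return c1 * p_ste + c2

def vehicle_speed(p_vel, gamma, c3, b, l):
    # Eq. (4): encoder speed of the left rear wheel resolved to the
    # velocity at the center of the rear axle
    return c3 * p_vel / (1.0 - (b / l) * math.tan(gamma))

def position_from_gps(x_gps, y_gps, phi, r, theta):
    # Eq. (5): GPS antenna position -> rear-axle-center position
    return (x_gps - r * math.cos(phi + theta),
            y_gps - r * math.sin(phi + theta))
```

Note that with zero steering angle, Eq. (4) reduces to the plain gain relation v = c3·pVEL.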
Fig. 6.3. Location and notation of sensors mounted on the vehicle.
6.2.4.2. Inertial Measurement Unit

The rate of change of orientation of the vehicle, φ̇, is related to the reading of the gyroscope pINS by

φ̇(k) = pINS(k) + pOFF,    (6)

where the initial offset pOFF is computed from the average value of the gyroscope measurements obtained a priori when the vehicle is stationary:

pOFF = −(1/k0) Σ_{k=1}^{k0} pINS(k),    (7)

where k0 is the number of stationary samples.
6.2.5. Simulation and Measurement of the Vehicle State

The sensors described in the last section can be used to both simulate and measure the state of the vehicle at any instant. In the simulation, the state of the vehicle [x(k+1), y(k+1), φ(k+1)] at time instant k+1 is iteratively computed from the state of the vehicle [x(k), y(k), φ(k)] as shown in Fig. 6.4, so that the data to be prepared a priori are the control inputs [v(k), γ(k)] with respect to all time (k = 1, 2, ..., kf) and the initial state of the vehicle [x(0), y(0), φ(0)]. The measured state of the vehicle, [x̂(k), ŷ(k), φ̂(k)], can also be computed from the absolute sensor readings at all times. Since the data obtainable from the gyroscope is the rate of change of the vehicle orientation, the orientation φ̂(k+1) is computed iteratively from φ̂(k), given its initial state. The position and orientation errors between the simulated and the measured states are given by

fpos(x) = Σ_{i=1}^{np} Σ_{j=1}^{k'f} ( ||x̂(i·k'f + j) − x(i·k'f + j)||² + ||ŷ(i·k'f + j) − y(i·k'f + j)||² ),

fori(x) = Σ_{i=1}^{np} Σ_{j=1}^{k'f} ||φ̂(i·k'f + j) − φ(i·k'f + j)||²,    (9)
and

x(i·k'f) = x̂(i·k'f), y(i·k'f) = ŷ(i·k'f), φ(i·k'f) = φ̂(i·k'f), i = 1, ..., np,    (10)

where k'f is the number of iterations for each period which is used for further autonomous navigation, and np is the number of partitions of the vehicle operation. The total number of iterations is given by kf = k'f · np.
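The objectives of Eq. (9) translate directly into code; the state-sequence layout (sequences indexed by time step) is an assumption of this sketch:

```python
def objectives(x_s, y_s, phi_s, x_m, y_m, phi_m, kf_p, n_p):
    """Position error f_pos and orientation error f_ori of Eq. (9).

    (x_s, y_s, phi_s): simulated state sequences; (x_m, y_m, phi_m):
    state sequences derived from the absolute sensors. kf_p is k'_f,
    the number of iterations per period, and n_p the number of periods.
    """
    f_pos = f_ori = 0.0
    for i in range(1, n_p + 1):
        for j in range(1, kf_p + 1):
            k = i * kf_p + j
            f_pos += (x_s[k] - x_m[k]) ** 2 + (y_s[k] - y_m[k]) ** 2
            f_ori += (phi_s[k] - phi_m[k]) ** 2
    return f_pos, f_ori
```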
6.3.2. A General Framework for Searching Pareto-Optimal Solutions
Fig. 6.6 shows the flowchart of the framework of the multi-objective optimization proposed in this chapter. In order to find multiple solutions, the multi-objective optimization searches with λ multiple points, i.e.,

X(K) = {x1^K, ..., xλ^K} ∈ (ℝ^n)^λ,    (11)

where xi^K is the i-th search point at the K-th generation. The initial population, X(0), is generated randomly within a specified range [xmin, xmax]. Each objective function value fj(xi^K) is then calculated with each parameter set xi^K, finally yielding

F(K) = {f(x1^K), ..., f(xλ^K)}.    (12)

Unlike the other MOEAs, two scalar criteria are evaluated for each search point in the proposed framework. One is the rank in Pareto-optimality as usual,

Θ(K) = {θ(x1^K), ..., θ(xλ^K)},    (13)

where θ : ℝ^n → ℕ, and the other is a positive real-valued scalar objective function, or fitness, which is derived by taking the rank into account:

Φ(K) = {φ(x1^K), ..., φ(xλ^K)},    (14)

where φ : ℝ^n → ℝ+.
where 0 for 1 < i < m and f(x) < f(y) for all feasible y (assuming minimization). Three are the mechanisms taken from evolutionary multiobjective optimization that are more frequently incorporated into constraint-handling techniques 18: (1) Use of Pareto dominance as a selection criterion. Examples of this type of approach are given in 6>16>8. (2) Use of Pareto ranking 14 to assign fitness in such a way that nondominated individuals (i.e., feasible individuals in this case) are
Multi-Objective Optimization of Trusses
assigned a higher fitness value. Examples of this type of approach are given in 22,23,7.
(3) Split the population into subpopulations that are evaluated either with respect to the objective function or with respect to a single constraint of the problem. This is the selection mechanism adopted in the Vector Evaluated Genetic Algorithm (VEGA) 26. Examples of this type of approach are given in 29,20,9.
In order to sample the feasible region of the search space widely enough to reach the global optimum, it is necessary to maintain a balance between feasible and infeasible solutions. If this diversity is not maintained, the search will focus on only one area of the feasible region and will thus lead to a local optimum. A multiobjective optimization technique aims to find a set of trade-off solutions which are considered good in all the objectives to be optimized. In global nonlinear optimization, the main goal is to find the global optimum. Therefore, some changes must be made to those approaches in order to adapt them to this new goal. Our main concern is that feasibility takes precedence, in this case, over nondominance. Therefore, good "trade-off" solutions that are not feasible cannot be considered as good as bad "trade-off" solutions that are feasible. Furthermore, a mechanism to maintain diversity must normally be added to any evolutionary multiobjective optimization technique. Tied to the constraint-handling mechanism of ISPAES, we can find an enhanced selection operator. A desirable selection operator will provide a blend of feasible and infeasible individuals at any generation of the evolutionary process. Higher population diversity enhances exploration and prevents premature convergence. A robust evolutionary algorithm for constrained optimization will provide a selection mechanism with two clear objectives: to keep diversity, and to provide promising individuals (approaching the optimum). These goals are difficult to reach when the selection mechanism is driven by "greedy rules" that fail to cooperate. A poor selection mechanism could undermine the effort of the diversity mechanism if only best-and-feasible individuals are favored. Similarly, a poor diversity preservation mechanism could never provide interesting individuals to the Pareto dominance-based selection operator so as to create a promising blend of individuals for the next generation.
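The precedence of feasibility over nondominance can be expressed as a simple pairwise comparison. The following sketch illustrates the idea for a single objective; the 'violation' and 'f' fields are assumptions of this illustration, not the chapter's notation:

```python
def better(a, b):
    """Prefer a over b, with feasibility taking precedence.

    Each candidate is a dict with 'violation' (total constraint
    violation, 0 if feasible) and 'f' (objective value, minimized).
    """
    a_feas, b_feas = a['violation'] == 0, b['violation'] == 0
    if a_feas and not b_feas:
        return True                    # feasible beats infeasible
    if b_feas and not a_feas:
        return False
    if a_feas:                         # both feasible: compare objective
        return a['f'] < b['f']
    return a['violation'] < b['violation']   # both infeasible
```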
A. Hernandez and S. Botello
9.3. ISPAES Algorithm

All of the approaches discussed in the previous section have drawbacks that keep them from producing competitive results with respect to the constraint-handling techniques that represent the state of the art in evolutionary optimization. In a recent technical report 18, four of the existing techniques based on multiobjective optimization concepts (i.e., COMOGA 29, VEGA 9, MOGA 8 and NPGA 7) have been compared using Michalewicz's benchmark 19 and some additional engineering optimization problems. Although inconclusive, the results indicate that the use of Pareto dominance as a selection criterion gives better results than Pareto ranking or the use of a population-based approach. However, in all cases, the approaches analyzed are unable to reach the global optimum of problems with either high dimensionality, large feasible regions or many nonlinear equality constraints 18. In contrast, the approach proposed in this chapter uses Pareto dominance as the selection criterion but, unlike previous work in the area, a secondary population is used. The approach, which is a relatively simple extension of PAES 17, provides very good results, which are highly competitive with those generated with an approach that represents the state of the art in constrained evolutionary optimization. The structure of the ISPAES algorithm is shown in Figure 9.1. Notice the two loops operating over the Pareto set (in the external storage): the right loop aims for exploration of the search space, while the left loop aims for population diversity and exploitation. ISPAES has been implemented as an extension of the Pareto Archived Evolution Strategy (PAES) proposed by Knowles and Corne 17 for multiobjective optimization. PAES's main feature is the use of an adaptive grid on which the objective function space is located using a coordinate system. Such a grid is the diversity maintenance mechanism of PAES and constitutes the main feature of this algorithm.
The grid is created by bisecting k times the function space of dimension d (d is the number of objective functions of the problem; in our case, d is given by the total number of constraints plus one, that is, d = n + p + 1, where n is the number of inequality constraints and p is the number of equality constraints; the one added to this summation accounts for the original objective function of the problem). The control of 2^(kd) grid cells means the allocation of a large amount of physical memory for even small problems. For instance, 10 functions and 5 bisections of the space produce 2^50 cells. Thus, the first feature introduced
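The grid bookkeeping can be sketched as follows. This is a simplified illustration: PAES maintains the grid adaptively, whereas here the objective-space bounds are assumed known:

```python
def grid_location(point, lower, upper, k):
    """Locate an objective vector on a grid obtained by bisecting each
    of the d dimensions k times (2^k divisions per dimension)."""
    coords = []
    for p, lo, hi in zip(point, lower, upper):
        frac = (p - lo) / (hi - lo)                   # normalize to [0, 1]
        coords.append(min(int(frac * 2 ** k), 2 ** k - 1))
    return tuple(coords)

def n_cells(d, k):
    # total number of grid cells; e.g. d = 10, k = 5 gives 2^50 cells
    return 2 ** (k * d)
```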
Fig. 9.1. The logical structure of the ISPAES algorithm (flowchart: initial population; pick parent from less crowded area; mutation; test whether the child dominates the parent; add child to the Pareto set in the external storage).
in ISPAES is the "inverted" part of the algorithm that deals with this space-usage problem. ISPAES's fitness function is mainly driven by a feasibility criterion. Global information carried by the individuals surrounding the feasible region is used to concentrate the search effort on smaller areas as the evolutionary process takes place. In consequence, the search space being explored is "shrunk" over time. Eventually, upon termination, the size of the search space being inspected will be very small and will contain the desired solution (in the case of single-objective problems; for multi-objective problems, it will contain the feasible region). The main algorithm of ISPAES is shown in Figure 9.2. Its goal is the construction of the Pareto front, which is stored in an external memory (called file). The algorithm performs MaxNew loops, generating a child h from a random parent c in every loop. Therefore, the ISPAES algorithm introduced here is based on a (1+1)-ES. If the child is better than the
parent, that is, the child dominates its parent, then it is inserted in file and its position is recorded. A child is generated by introducing random mutations to the parent; thus, h = mutate(c) will alter a parent with increments whose standard deviation is governed by Equation 1.

maxsize: maximum size of file
c: current parent ∈ X (decision variable space)
h: child of c ∈ X
a_d: individual in file that dominates h
a_D: individual in file dominated by h
current: current number of individuals in file
cnew: number of individuals generated thus far
g: pick a new parent from the less densely populated region every g new individuals
r: shrink the search space every r new individuals

current = 1; cnew = 0;
c = newindividual(); add(c);
while cnew < MaxNew do
    h = mutate(c); cnew = cnew + 1;
    ...
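The control flow of this loop can be sketched as a (1+1)-ES with an external archive. Note that the crowding-based acceptance and the adaptive grid of PAES/ISPAES are simplified here to random choices, so this illustrates the flow only, not the full algorithm:

```python
import random

def dominates(a, b):
    # a dominates b under minimization
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def paes_loop(initial, mutate, evaluate, max_new, max_size):
    """Skeleton of a (1+1)-ES maintaining an external storage ('file')."""
    archive = [(initial, evaluate(initial))]
    parent = initial
    for _ in range(max_new):
        h = mutate(parent)                      # h = mutate(c)
        fh = evaluate(h)
        if any(dominates(fx, fh) for _, fx in archive):
            continue                            # child dominated: discard
        # remove archive members dominated by the child, then insert it
        archive = [(x, fx) for x, fx in archive if not dominates(fh, fx)]
        archive.append((h, fh))
        if len(archive) > max_size:             # pruning (crowding simplified)
            archive.pop(random.randrange(len(archive)))
        parent = random.choice(archive)[0]      # pick a new parent
    return archive
```

By construction, the archive stays mutually nondominated throughout the run.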
The validity of this criterion is immediate since the implication is the definition of the cover, and in the other direction, following the rules is equivalent to building a solution.

11.2.3. Optimization Methods
11.2.3.1. A Heuristic Method

In this section, we describe the heuristic designed by Gendreau et al. 13, which is used in our meta-heuristic. This heuristic combines the GENIUS heuristic for the TSP 12 with the PRIMAL1 set covering heuristic by Balas and Ho 1 for the set covering problem (SCP). The PRIMAL1 heuristic gradually includes nodes v in the solution according to a greedy criterion, in order to minimize a function f(cv, bv), where, at each step, cv is the cost of including node v ∈ V in the solution, and bv is the number of nodes of W covered by v. cv is expressed by the value of the minimum spanning tree built upon the edges defined by the vertices present in the solution and v. The minimum spanning tree is built using Prim's algorithm. The three following functions suggested by Balas and Ho are used:

(1) f(cv, bv) = cv / bv
(2) f(cv, bv) = cv / log2(bv)
(3) f(cv, bv) = cv

PRIMAL1 first applies criterion 1 in a greedy fashion until all the vertices of W are covered. Then, the nodes which cover an over-covered node of W are removed from the solution. After that, the solution is completed using criterion 2, and nodes which cover over-covered nodes of W are removed.
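The greedy inclusion step can be sketched as follows. The cost callable stands in for the minimum-spanning-tree cost used in the chapter, and all names are illustrative:

```python
def greedy_cover(candidates, to_cover, cost, criterion):
    """Greedy node selection in the spirit of PRIMAL1.

    candidates: {node: set of covered elements of W}; cost(sol, v): cost
    of adding v to the current solution (any callable in this sketch);
    criterion(c, b): one of the Balas-Ho functions, e.g. lambda c, b: c / b.
    """
    solution, covered = [], set()
    while covered < to_cover:
        best = min((v for v in candidates
                    if v not in solution and candidates[v] - covered),
                   key=lambda v: criterion(cost(solution, v),
                                           len(candidates[v] - covered)))
        solution.append(best)
        covered |= candidates[best]
    return solution
```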
N. Jozefowiez, F. Semet and E-G. Talbi
The process is iterated with criterion 3. A second solution is constructed by applying, this time, the criteria in the order 1, 3, 2. The best of these two solutions is retained. The following heuristic is run twice, as in PRIMAL1, with the two sequences of criteria. STEP 1 Initialization. Set H = ∅.

β^u_r(Δ) > 0 and β^l_r(Δ) > 0, denoted as upper and lower service curves, respectively. Initially, we specify all available processing units as our resource set R and associate the corresponding costs to them. For example, we may have the resources R = {ARM9, MEngine, Classifier, DSP, Cipher, LookUp, CheckSum, PowerPC}. During the allocation step (see Figure 12.2), we select those which will be in a specific architecture, i.e., if alloc(r) = 1, then resource r ∈ R will be implemented in the packet processor architecture. The upper and lower service curves specify the available computing units of a resource r in a relative measure, e.g., processor cycles or instructions. In particular, β^u(Δ) and β^l(Δ) are the maximum and minimum number of available processor cycles in any time interval of length Δ. In other words, the service curves of a resource determine the best-case and worst-case computing capabilities. For details, see, e.g., Thiele et al. 18.

Software Application and Binding

The purpose of a packet processor is to simultaneously process several streams of packets. For example, one stream may contain packets that store audio samples and another one contains packets from an FTP application. Whereas the different streams may be processed differently, each packet of a particular stream is processed identically, i.e., each packet is processed by the same sequence of tasks.

Definition 2: We define a set of streams s ∈ S and a set of tasks t ∈ T. To each stream s there is associated an ordered sequence of tasks V(s) = [t0, ..., tn]. Each packet of the stream is first processed by task t0 ∈ T, then successively by all other tasks until tn ∈ T.
As an example, we may have five streams S = {RTSend, NRTDecrypt, NRTEncrypt, RTRecv, NRTForward}. According to Figure 12.3, the packets of these streams, when entering the packet processor, undergo different sequences of tasks, i.e., the packets follow the paths shown. For example, for stream s = NRTForward we have the sequence of tasks V(s) = [LinkRX, VerifyIPHeader, ProcessIPHeader, Classify, RouteLookUp, ... , Schedule, LinkTx].

Definition 3: The mapping relation M ⊆ T × R defines all possible bindings of tasks, i.e., if (t, r) ∈ M, then task t could be executed on resource r. This execution of t for one packet would use w(r, t) > 0 computing units of
S. Künzli, S. Bleuler, L. Thiele and E. Zitzler
Fig. 12.3. Task graph of a packet processing application
r. The binding B ⊆ M of tasks to resources is a subset of the mapping such that every task t ∈ T is bound to exactly one allocated resource r ∈ R, alloc(r) = 1. We also write r = bind(t) in a functional notation. In a similar way as alloc describes the selection of architectural components, bind defines a selection of the possible mappings. Both alloc and bind will be encoded using an appropriate representation described later. The 'load' that a task t puts onto its resource r = bind(t) is denoted as w(r, t). Figure 12.4 represents an example of a mapping between tasks and resources. For example, task 'Classify' could be bound to resource 'ARM9' or 'DSP'. In a particular implementation of a packet processor we may have, for example, bind(Classify) = ARM9.

Definition 4: To each stream s ∈ S a fixed priority prio(s) > 0 is associated. There are no streams with equal priority.
A Computer Engineering Benchmark Application
Fig. 12.4. Example of a mapping of tasks to resources
In the benchmark application, we suppose that only preemptive fixed-priority scheduling is available on each resource. To this end, we need to associate to each stream s a fixed priority prio(s) > 0, i.e., all packets of s receive this priority. From all packets that wait to be executed in a memory, the run-time environment chooses for processing the one that has the highest priority among all waiting packets. If several packets from one stream are waiting, then it prefers those that are earlier in the task chain V(s).

Application Scenarios

A packet processor will be used in several, possibly conflicting application scenarios. Such a scenario is described by the properties of the input streams, the allowable end-to-end delay (deadline) for each stream, and the available total memory for all packets (the sum of all individual memories of the processing elements).

Definition 5: The properties of each stream s are described by upper and lower arrival curves α^u_s(Δ) and α^l_s(Δ). To each stream s ∈ S there is associated a maximal total packet memory m(s) > 0 and an end-to-end deadline d(s) > 0, denoting the maximal time by which any packet of the stream has to be processed by all associated tasks V(s) after its arrival. The upper and lower arrival curves specify upper and lower bounds on the number of packets that arrive at the packet processor. In particular, α^u_s(Δ) and α^l_s(Δ) are the maximum and minimum number of packets in any time interval of length Δ. For details, see, e.g., Thiele et al. 21.

Definition 6: The packet processor is evaluated for a set of scenarios
b ∈ B. The quantities of Definition 5 are defined for each scenario independently. In addition, whereas the allocation alloc defines a particular hardware architecture, the quantities that are specific to a software application are also specific to each scenario b ∈ B and must be determined independently, for example the binding bind of tasks to processing elements and the stream priorities prio.

Performance Analysis

It is not obvious how to determine, for any memory module, the maximum number of packets stored in it waiting to be processed at any point in time. Neither is it clear how to determine the maximum end-to-end delays experienced by the packets, since all packet flows share common resources. As the packets may flow from one resource to the next one, there may be intermediate bursts and packet jams, making the computation of the packet delays and the memory requirements non-trivial. Interestingly, there exists a computationally efficient method to derive worst-case estimates on the end-to-end delays of packets and the required memory for each computation and communication. In short, we construct a scheduling network and apply the real-time calculus (based on arrival and service curves) in order to derive the desired bounds. The description of this method is beyond the scope of this chapter but can be found in Thiele et al. 21,18,19. As we know for each scenario the delay and memory in comparison to the allowed values d(b, s) and m(b, s), we can increase the input traffic until the constraints are just about satisfied. In particular, we do not use the arrival curves α^u_(b,s) and α^l_(b,s) directly in the scheduling network, but the linearly scaled amounts ψ_b · α^u_(b,s) and ψ_b · α^l_(b,s), where the scaling factor ψ_b is different for each scenario. Now, binary search is applied to determine the maximal throughput such that the constraints on delay and memory are just about satisfied.
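The binary search over the scaling factor can be sketched as follows. The feasibility test is a hypothetical stand-in for the scheduling-network analysis and is assumed to be monotone in the scaling factor:

```python
def max_scaling_factor(feasible, tol=1e-6):
    """Largest psi such that feasible(psi) holds, assuming monotonicity:
    if psi is feasible, then every smaller psi is feasible too."""
    lo, hi = 0.0, 1.0
    while feasible(hi):            # double until an infeasible upper bound
        lo, hi = hi, 2.0 * hi
        if hi > 1e12:              # safeguard: throughput appears unbounded
            return hi
    while hi - lo > tol:           # standard bisection on the bracket
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo
```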
For the following discussion, it is sufficient to state the following fact:

• Given the specification of a packet processing design problem by the set of resources r ∈ R, the cost function for each resource cost(r), the service curves β^u_r and β^l_r, a set of streams s ∈ S, a set of application tasks t ∈ T, the ordered sequence of tasks for each stream V(s), and the computing requirement w(r, t) for task t on resource r;
• given a set of application scenarios b ∈ B with associated arrival curves α^u_(b,s) and α^l_(b,s) for each stream, and a maximum delay d(b, s) and memory m(b, s) for each stream;
• given a specific HW/SW architecture defined by the allocation of hardware resources alloc(r) and, for each scenario b, a specific priority prio(b, s) of each stream and a specific binding bind(b, t) of tasks t to resources;
• then we can determine, using the concepts of scheduling network, real-time calculus and binary search, the maximal scaling factor ψ_b such that under the input arrival curves ψ_b · α^u_(b,s) and ψ_b · α^l_(b,s) the maximal delay of each packet and the maximal number of stored packets are not larger than d(b, s) and m(b, s), respectively.

As a result, we can define the criteria for the optimization of packet processors.

Definition 7: The quality measures for packet processors are the associated cost, cost = Σ_{r∈R} alloc(r)·cost(r), and the throughput ψ_b for each scenario b ∈ B. These quantities can be computed from the specification of a HW/SW architecture, i.e., alloc(r), prio(b, s) and bind(b, t) for all streams s ∈ S and tasks t ∈ T. Now, the benchmark application is defined formally in terms of an optimization problem. In the following, we will describe the two aspects, representation and variation operators, that are specific to the evolutionary algorithm implementation.

Representation

Following Figure 12.2 and Definition 7, a specific HW/SW architecture is defined by alloc(r), prio(b, s) and bind(b, t) for all resources r ∈ R, streams s ∈ S and tasks t ∈ T. For the representation of architectures, we number the available resources from 1 to |R|; the tasks are numbered from 1 to |T|, and each stream is assigned a number between 1 and |S|. The allocation of resources can then be represented as a binary vector A ∈ {0, 1}^|R|, where A[i] = 1 denotes that resource i is allocated. To represent the binding of tasks to resources, we use a two-dimensional vector Z ∈ {1, ..., |R|}^(|B|×|T|), where for all scenarios b ∈ B it is stored which task is bound to which resource. Z[i][j] = k means that in scenario i task j is bound to resource k. Priorities of flows are represented as a two-dimensional vector P ∈ {1, ..., |S|}^(|B|×|S|), where we store the streams according to their priorities, e.g., P[i][j] = k means that in scenario i, stream k has priority j, with 1 being the highest priority. Obviously, not all possible encodings A,
Z, P represent feasible architectures. Therefore, a repair method has been developed that converts infeasible solutions into feasible ones.

Recombination

The first step in recombining two individuals is creating exact copies of the parent individuals. With probability 1 − Pcross, these individuals are returned as offspring and no recombination takes place. Otherwise, crossing over is performed on either the allocation, the task binding or the priority assignment of flows. With probability Pcross-alloc, a one-point crossover operation is applied to the allocation vectors A1 and A2 of the parents: first we randomly define the position j where to perform the crossover, then we create the allocation vector Anew1 for the first offspring as follows:

Anew1[i] = A1[i], if 1 ≤ i ≤ j,
Anew1[i] = A2[i], if j < i ≤ |R|.
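The one-point crossover on the allocation vectors can be sketched as follows; producing the second offspring as the mirror image of the first is an assumption of this sketch, since only Anew1 is given in the text:

```python
import random

def one_point_crossover_alloc(a1, a2):
    """One-point crossover on two allocation vectors in {0,1}^|R|."""
    assert len(a1) == len(a2)
    j = random.randrange(1, len(a1))   # crossover position, 1 <= j < |R|
    child1 = a1[:j] + a2[j:]           # A_new1[i] = A1[i] for i <= j,
    child2 = a2[:j] + a1[j:]           # A2[i] afterwards; child2 mirrored
    return child1, child2
```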
The size of the search space already exceeds 10^17 for the problem instance with 2 objectives, and it is even larger for the instances with 3 or 4 objectives. As an example, an approximated Pareto front for the 3-objective instance is shown in Figure 12.8; the front has been generated by the optimizer SPEA2 25. The x-axis shows the objective value corresponding to the throughput ψ_Load2 under Load 2 (as defined in Table 12.61), the y-axis shows the objective value corresponding to ψ_Load3, whereas the z-axis shows the normalized total cost of the allocated resources. The two example architectures shown in Figure 12.8 differ only in the allocation of the resource 'Cipher', which is a specialized hardware unit for the encryption and decryption of packets. The performance of the two architectures for the load scenario with real-time flows to be processed is more or less the same. However, the architecture with a cipher unit performs around 30 times better for the encryption/decryption scenario, at an increased cost for the cipher unit. So, a designer of a packet processor that should have the capability of encryption/decryption would go for the solution with a cipher unit (the solution on the left in Figure 12.8), whereas one would decide for the cheaper solution on the right if there is no need for encryption.

12.4.2. Simulation Results

To evaluate the difficulty of the proposed benchmark application, we compared the performance of four evolutionary multiobjective optimizers, namely SPEA2 25, NSGA-II 8, SEMO 15 and FEMO 15, on the three aforementioned problem instances.

Fig. 12.8. Two solution packet processor architectures annotated with loads on resources for the different loads specified in Table 12.61.

For each algorithm, 10 runs were performed using the parameter settings listed in Tables 12.62 and 12.63; these parameters were determined based on extensive, preliminary simulations. Furthermore, all objective functions were scaled such that the corresponding values lie within the interval [0, 1]. Note that all objectives are to be minimized, i.e., the performance values are reversed (smaller values correspond to better performance). The different runs were carried out on a Sun Ultra 60. A single run for 3 objectives with a population size of 150 individuals in conjunction with SPEA2 takes about
Table 12.62. Parameters for population size and duration of runs dependent on the number of objectives.

# of objectives | population size | # of generations
2 | 100 | 200
3 | 150 | 300
4 | 200 | 400

Table 12.63. Probabilities for mutation and crossover (cf. Section 12.2).

Mutation: Pmut = 0.8
Mutation -> Allocation: Pmut-alloc = 0.3, Pmut-alloc-zero = 0.5
Mutation -> Binding: Pmut-bind = 0.5
Crossover: Pcross = 0.5
Crossover -> Allocation: Pcross-alloc = 0.3
Crossover -> Binding: Pcross-bind = 0.5
20 minutes to complete. In the following we have used two binary performance measures for the comparison of the EMO techniques: (1) the additive ε-quality measure,28 and (2) the coverage measure.27 The ε-quality measure I_ε+(A, B) returns the maximum value d which can be subtracted from all objective values of all points in the solution set A, such that the solutions in the shifted set A' equal or dominate any solution in set B in terms of the objective values. If the value is negative, the solution set A entirely dominates the solution set B. Formally, this measure can be stated as follows:

I_ε+(A, B) = max_{b ∈ B} { min_{a ∈ A} { max_{1 ≤ i ≤ n} { a_i − b_i } } }

If the i-th parameter exceeds U_i(High), then CPAF = CPAF + w_i3; if it exceeds U_i(High2), then CPAF = CPAF − w_i4,
where the U_i are thresholds and the weights w_ij ∈ [0,1]. For the 48 parameters we have a total of 192 weights and 192 threshold parameters. CPAF is a level that determines the final diagnosis. After a statistical study, a subset of 32 rules and their associated 32 thresholds was selected that maximizes the discrimination power of the classifier. If the CPAF level is within a security interval [−F, F] then there is not enough certainty about the diagnosis and the case is left undiagnosed. The diagnosis is positive (a PAF patient) if CPAF > F and negative if CPAF < −F. The MO procedure uses two optimization objectives, the classification rate CR and the coverage level CL:

(1) CR = number of correctly diagnosed cases / number of diagnosed cases
(2) CL = number of diagnosed cases / total number of cases

Two PAF diagnosis cases have been considered:
• Optimization of the weights of decision rules given by an expert. In this case the chromosome length is 32.
• Optimization of the thresholds and weights of the decision rules given by an expert. In this case the chromosome length is 64.

Three MOEA algorithms were tested: the Strength Pareto Evolutionary
M. Lahanas
Algorithm SPEA,14 the Single Front Genetic Algorithm SFGA15 and the New Single Front Genetic Algorithm NSFGA.16 Mutation and crossover probabilities of 0.01 and 0.6, respectively, were used, and each algorithm evolved for 1000 generations with a population size of 200. NSFGA, SFGA and SPEA showed a similar performance. The best results were obtained when both thresholds and weights were optimized. The results are similar to other results using classic schemes, but the MO optimization leads to multiple solutions; for certain patients who suffer from other disorders, certain solutions could be more suitable.

16.4. Treatment Planning

Every year more than one million patients in the United States alone will be diagnosed with cancer. More than 500,000 of these will be treated with radiation therapy.17 Cancer cells have a smaller probability than healthy normal cells of surviving the radiation damage. The dose is the amount of energy deposited per unit of mass. The physical and biological characteristics of the patient anatomy and of the source, such as intensity and geometry, are used for the calculation of the dose function, i.e. the absorbed dose at a point in the treatment volume. The dose distribution specifies the corresponding three-dimensional non-negative scalar field. A dose distribution is possible if there is a source distribution which is able to generate it. A physician prescribes the so-called desired dose function, i.e. the absorbed dose as a function of the location in the body. The objectives of dose optimization are:

• Deliver a sufficiently high dose in the Planning Target Volume (PTV), which includes, besides the Gross Tumor Volume (GTV), an additional margin accounting for position inaccuracies, patient movements, etc.
• Protect the surrounding normal tissue (NT) and organs at risk (OARs) from excessive radiation. The dose should be smaller than a critical dose D_cr specific to each OAR.
Radiation oncologists use for the evaluation of dose distribution quality a cumulative dose-volume histogram (DVH) for each structure (PTV, NT or OARs), which displays the fraction of the structure that receives at least a specified dose level. The objectives are called DVH-based objectives
if expressed in terms of DVH-related values. The determination of the dose distribution for a given source distribution, the so-called forward problem, is possible and a unique solution exists. The inverse problem, i.e. the determination of the source distribution for a given dose distribution, is not always solvable, or the solution is not unique. Optimization algorithms are therefore used to minimize the difference between the desired and the obtained dose function.

16.4.1. Brachytherapy

High dose rate (HDR) brachytherapy is a treatment method for cancer where empty catheters are inserted within the tumor volume. A single 192Ir source is moved inside the catheters at discrete positions (source dwell positions, SDPs) using a computer-controlled machine. The dose optimization problem considers the determination of the n dwell times (or simply weights) during which the source is at rest and delivers radiation at each of the n dwell positions, resulting in a dose distribution which is as close as possible to the desired dose function. The range of n varies from 20 to 300. If the positions and number of catheters and the SDPs are given after the implantation of the catheters, we term the process postplanning. The optimization process to obtain an optimal dose distribution is called dose optimization. The additional determination of an optimal number of catheters and their positions, so-called inverse planning, is important, as a reduction of the number of catheters simplifies the treatment plan in terms of time and complexity, reduces the possibility of treatment errors and is less invasive for the patient. Dose optimization can be considered a special type of inverse planning where the positions and number of catheters and the SDPs are fixed.

16.4.1.1. Dose Optimization for High Dose Rate Brachytherapy

MO dose optimization for HDR brachytherapy was first applied by Lahanas et al18 using NPGA,12 NSGA19 and NRGA20 with a real encoding for the SDP weights.
A number of 3-5 DVH-derived objectives, depending on the number of OARs, was used. The results were superior to optimization results obtained with a commercial treatment planning system. More effective was the application of SPEA21 using dose variance-based objectives, which enables support from deterministic algorithms that provide 10-20 solutions with which the population is initialized. A faster optimization than with SPEA was possible using NSGA-II.22,23 Both SPEA and NSGA-II require support from a deterministic algorithm, which significantly improves the optimization results and the convergence speed. Pareto global optimal solutions can be obtained with L-BFGS, which makes it possible to evaluate the performance of MOEAs for the HDR dose optimization problem. The optimization with 100 individuals and 100 generations requires less than one minute, which is the time required to obtain a single solution with simulated annealing. NSGA-II was used for dose optimization with DVH-based objectives, for which deterministic algorithms cannot be used as multiple local minima exist.25 The DVH-based objectives provide a larger spectrum of solutions than the dose variance-based objectives. The archiving method of PAES26 was included and the algorithm was supported by L-BFGS solutions using variance-based objectives.23 SBX crossover27 and polynomial mutation28 were used. Best results were obtained for a crossover probability in the range 0.7-1.0 and a mutation probability of 0.001-0.01.

16.4.1.2. Inverse Planning for HDR Brachytherapy

The NSGA-II algorithm was applied to the HDR brachytherapy inverse planning problem,29 where the optimal position and number of catheters have to be found in addition to the dwell position weights of the selected catheters. A two-component chromosome is used. The first part W contains the dwell weight of each SDP for each catheter in a double-precision floating-point representation. The second part C is a binary string which represents which catheters have been selected: the so-called active catheters. The inverse planning algorithm is described by the following steps:
(1) Determine geometrically the set of all allowed catheters.
(2) Initialize individuals with solutions from a global optimization algorithm.
(3) Perform a selection based on constrained domination ranking.
(4) Perform an SBX crossover for the SDP weights chromosome and a one-point crossover for the catheter chromosome, with rescaled dwell times.
(5) Perform a polynomial mutation for the SDP weights chromosome and a flip mutation for the catheter chromosome, with rescaled dwell times.
(6) Perform a repair mechanism to keep the number of used catheters of each solution within a given range.
(7) Reset the scaling according to the number of active SDPs.
(8) Evaluate the dosimetry for each individual.
(9) If the termination criteria are satisfied, output the set of non-dominated archived solutions; else go to (3).

Inverse planning considers a range of solutions with different numbers of active SDPs. Therefore the dwell weights of the parents are divided, before crossover, by the number of active SDPs, to be independent of this number. After mutation, the weights of each offspring are multiplied by the number of SDPs in the active catheters encoded in the C chromosome. For dose optimization and inverse planning, decision making (DCM) tools are necessary to filter a single solution30,31 from the non-dominated set that best matches the goals of the treatment planner. Dose optimization and inverse planning with MOEAs, together with DCM tools, were implemented in the commercial Real-Time HDR prostate planning system SWIFT™ (Nucletron B.V., Veenendaal, The Netherlands), and patients are now treated by this system. A table listing the objective values, the DVHs for all OARs, the NT and the PTV of each solution is provided. Other parameters are D_90 (the dose that covers 90% of the PTV), V_150 (the percentage of the PTV that receives more than 150% of the prescription dose) and the extreme dose values. The entire table can be sorted on every such quantity, and solutions can be selected and highlighted by the treatment planner. Constraints can be applied, for example to show only solutions with a PTV coverage (i.e. the percentage of the PTV that receives at least 100% of the prescription dose) larger than a specified value. Solutions that do not satisfy the constraints are removed from the list. This reduces the number of solutions and simplifies the selection of an optimal solution. The DVHs of all selected solutions can be displayed and compared, see Fig. 16.1.
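The dwell-weight rescaling around the variation operators (steps 4-5 and 7) can be sketched as follows; the dict layout and helper names are illustrative assumptions, and the offspring here simply reuse the parents' active-SDP counts (in the full algorithm the count comes from the mutated C chromosome):

```python
def rescale_crossover(parent1, parent2, crossover):
    """Crossover on dwell weights, rescaled so that the weights are
    independent of the number of active SDPs.
    Parents are dicts with 'weights' and 'active_sdps' entries."""
    n1 = max(1, parent1["active_sdps"])
    n2 = max(1, parent2["active_sdps"])
    # Divide by the number of active SDPs before variation ...
    w1 = [w / n1 for w in parent1["weights"]]
    w2 = [w / n2 for w in parent2["weights"]]
    c1, c2 = crossover(w1, w2)
    # ... and multiply back by the offspring's active-SDP count afterwards.
    off1 = {"weights": [w * n1 for w in c1], "active_sdps": n1}
    off2 = {"weights": [w * n2 for w in c2], "active_sdps": n2}
    return off1, off2
```

With an identity "crossover" the round trip leaves the weights unchanged, which is exactly the invariant the rescaling is meant to guarantee.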
Other decision-making tools are projections of the Pareto front onto pairs of selected objectives. For M objectives the number of such projections is M(M−1)/2. The position of selected solutions can be seen in these projections. This helps to identify their position in the multidimensional Pareto front and to quantify the degree of correlation between the objectives and the possibilities provided by the non-dominated set. The Pareto front provides information such as the range of values for each objective and the trade-off
Fig. 16.1. Example of DVHs (a) for the PTV and (b) for the urethra of a representative set of non-dominated solutions. A single solution selected by a treatment planner is shown.
between other DVH derived quantities, see Fig. 16.2.
Fig. 16.2. Example of a trade-off between the percentage of the PTV that is covered with at least the prescribed dose, DVH(D_ref), and the percentage of volume with a dose higher than a critical dose limit (a) for the urethra and (b) for the rectum. For the urethra a rapidly increasing fraction receives an overdosage as the coverage of the PTV increases above 80%.
With MOEAs the best possible solution can be obtained, considering the objective functions and the implant geometry, and this increases the probability of treatment success.

16.4.2. External Beam Radiotherapy

In external beam radiotherapy, or teletherapy, high energy photon beams are emitted from a source on a rotating gantry, with the patient placed so that the tumor is at the center of the rotation axis. Haas et al32,33
proposed the use of MOEAs for the solution of the two main problems in radiotherapy treatment planning: first, find an optimal number of beams and their orientations; second, determine the optimum intensity distribution for each beam. The two problems are considered separately. A beam configuration is selected based on experience or using geometric methods; then the intensity distributions of these beams are optimized. In the last few years mostly SO (single-objective) algorithms have been proposed for the simultaneous solution of both problems.

16.4.2.1. Geometrical Optimization of Beam Orientations

The aim of beam orientation optimization is to find a configuration of beams such that a desired dose distribution can be achieved. A single beam would deposit a very high dose in the NT. Using more beams it is possible to increase the dose in the tumor while keeping the dose in the surrounding healthy tissue at a sufficiently low level, but the treatment complexity increases. The idea of using geometrical considerations in the cost function was first proposed by Haas et al,34 together with NPGA, to obtain an optimum beam configuration. Simplifications such as a limitation to 2D, using the most representative 2D computed tomography slice of the plan, have been used. The geometric objective functions to be minimized are:

(1) Difference between the area where all M beams overlap and the area of the PTA:

f_PTA = area(B_1 ∩ B_2 ∩ ... ∩ B_M) − area(PTA)    (6)
(2) Overlap area between each beam and the j-th OAR:

f_OAR_j = Σ_{i=1}^{M} p_i · area(B_i ∩ OAR_j)    (7)

p_i = { β(s_PTA − s_OAR)   if s_OAR < s_PTA
      { 1                  if s_OAR ≥ s_PTA    (8)

where s_PTA and s_OAR are distances shown in Fig. 16.3, and β is a parameter that favors beam entry points further away from OARs.

(3) Overlap from pairwise beam intersections, to minimize hot spots:

f_NT = Σ_{i=1}^{M−1} Σ_{j=i+1}^{M} area(B_i ∩ B_j)    (9)
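Approximating each beam and structure as a set of grid cells, the three geometric objectives can be sketched as follows; the set-based area representation and the function names are illustrative assumptions:

```python
def f_pta(beams, pta):
    """Difference between the area where all M beams overlap and the PTA area."""
    overlap = set.intersection(*beams)
    return len(overlap) - len(pta)

def f_oar(beams, oar, p):
    """Weighted overlap between each beam and one OAR; p[i] penalizes
    beams whose entry point lies close to the OAR."""
    return sum(p[i] * len(b & oar) for i, b in enumerate(beams))

def f_nt(beams):
    """Sum of pairwise beam intersections, a surrogate for hot spots
    in the normal tissue."""
    return sum(len(beams[i] & beams[j])
               for i in range(len(beams) - 1)
               for j in range(i + 1, len(beams)))
```

In a real planner the areas would come from the 2D slice geometry; here each set simply stands for the cells a beam covers.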
Fig. 16.3. Geometric parameters used for the solution of the beam orientation problem. The gantry angle θ of a field (beam) is shown. The patient body, including the normal tissue (NT), one organ at risk (OAR) and the planning target area (PTA), which includes the tumor, is shown.
An example of the geometry of a radiation field and the parameters used by Haas et al is shown in Fig. 16.3. An integer representation for the beam gantry angle was used. The length of each chromosome is equal to the number of beams involved in the plan. A particular solution, i.e. a chromosome, is represented as a vector C^T = (θ_1, ..., θ_M), where θ_i is the i-th individual beam gantry angle. For the integer representation an intermediate recombination is used,35 such that the parents C_P1 and C_P2 produce the offspring C_O:

C_O = round(C_P1 + γ(C_P2 − C_P1))    (10)

where γ is a random number in the interval [−0.25, 1.25].36 A mutation operator is used to introduce new beam angles into the population by generating integers that lie in the range [0°, ..., 359°]. The inclusion of problem-specific operators which attempt to replicate the approach followed by experienced treatment planners was important. One such operator is used to generate k equispaced beams, as this distribution reduces the area of overlap between the beams: one gantry angle from a particular chromosome is selected randomly and the k − 1 remaining beams are positioned evenly. A further mutation operator performs a local search by randomly shifting one of the selected beam gantry angles by a small amount (less than 15°).
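A minimal sketch of the two operators above (intermediate recombination with rounding, and the equispaced-beam operator); the wrap-around to the [0°, 359°] range and the function names are illustrative assumptions:

```python
import random

def intermediate_recombination(cp1, cp2, rng=random):
    """C_O = round(C_P1 + gamma * (C_P2 - C_P1)), gamma in [-0.25, 1.25],
    applied gene-wise to integer gantry angles (degrees)."""
    child = []
    for a1, a2 in zip(cp1, cp2):
        gamma = rng.uniform(-0.25, 1.25)
        child.append(round(a1 + gamma * (a2 - a1)) % 360)
    return child

def equispace_mutation(chromosome, rng=random):
    """Keep one randomly chosen gantry angle and place the remaining
    k - 1 beams evenly around the gantry circle (k equispaced beams)."""
    k = len(chromosome)
    start = rng.choice(chromosome)
    step = 360 // k
    return [(start + i * step) % 360 for i in range(k)]
```

Note that with identical parents the recombination is the identity, and the equispaced operator always yields beams separated by 360/k degrees regardless of which angle is kept.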
16.4.2.2. Intensity Modulated Beam Radiotherapy Dose Optimization

In IMRT each beam is divided into a number of small beamlets (bixels), see Fig. 16.4. The intensity of each beamlet can be adjusted individually. A sparse dose matrix is precalculated; it contains the dose value at each sampling point from each bixel at unit radiation intensity. The intensity (weight) of each beamlet has to be determined such that the produced dose distribution is "optimal". The number of parameters can be as large as 10,000.
Fig. 16.4. Principle of IMRT dose optimization. The contours of the body, the PTV and one OAR are shown. The problem is to determine the intensities of the tiny subdivisions (bixels) of each beam, so that the resulting dose distribution is optimal.
Lahanas et al used the NSGA-IIc algorithm37 for the optimization of the intensity distribution in IMRT, where the orientation and the number of beams are fixed.38 The dose variance-based objective functions are: for the PTV, the dose variance f_PTV around the prescription dose D_ref; for the NT, the sum of the squared dose values f_NT; and for each OAR, the variance f_OAR for dose values above a specific critical dose value D_cr^OAR:

f_PTV = (1/N_PTV) Σ_{j=1}^{N_PTV} (d_j^PTV − D_ref)²    (11)

f_NT = (1/N_NT) Σ_{j=1}^{N_NT} (d_j^NT)²    (12)

f_OAR = (1/N_OAR) Σ_{j=1}^{N_OAR} H(d_j^OAR − D_cr^OAR) (d_j^OAR − D_cr^OAR)²    (13)
H(x) is the Heaviside step function; d_j^PTV, d_j^NT and d_j^OAR are the calculated dose values at the j-th sampling point for the PTV, the NT and each OAR, respectively; N_PTV, N_NT and N_OAR are the corresponding numbers of sampling points. Depending on the number of OARs we have 3-6 objectives. For this multidimensional problem it was necessary to use supported solutions,39 i.e. solutions initialized by another optimization algorithm. Even if constraints can be used for some of the objectives, a large number of non-dominated solutions is required to obtain a representative set of the multidimensional Pareto front. An archive was used, similar to the PAES algorithm, in which all non-dominated solutions are archived. This allows the population size to be kept in the range 200-500 and the optimization time below one hour. Tests show that NSGA-IIc and SPEA alone are not able to produce high quality solutions: only a very small local Pareto optimal front can be found, far away from the very extended global Pareto front that can be obtained by the gradient-based optimization algorithm L-BFGS.24 Strong correlations exist between the optimization parameters, which could be the reason for the efficiency of the L-BFGS algorithm, which uses gradient information not available to the genetic algorithms. Using a fraction of solutions initialized by L-BFGS and an arithmetic crossover, NSGA-IIc is able to produce a representative set of non-dominated solutions in less time than is required by running the L-BFGS algorithm sequentially, each time with a different set of importance factors. Previous methods used in IMRT include simulated annealing,40 which is very slow, iterative approaches and filtered back-projection.41 The large number of objectives and the non-linear mapping from decision to objective space require a very large number of solutions to obtain a representative non-dominated set with SO optimization algorithms.44
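The three dose variance-based objectives can be sketched directly from Eqs. (11)-(13); the plain-list dose representation and the argument names are illustrative:

```python
def imrt_objectives(d_ptv, d_nt, d_oar, d_ref, d_cr):
    """Dose variance-based objectives f_PTV, f_NT and f_OAR.
    d_* are lists of calculated dose values at the sampling points."""
    # f_PTV: variance of the PTV dose around the prescription dose D_ref.
    f_ptv = sum((d - d_ref) ** 2 for d in d_ptv) / len(d_ptv)
    # f_NT: normalized sum of the squared dose values in the normal tissue.
    f_nt = sum(d ** 2 for d in d_nt) / len(d_nt)
    # f_OAR: variance above the critical dose D_cr (the Heaviside factor
    # simply restricts the sum to points exceeding D_cr).
    f_oar = sum((d - d_cr) ** 2 for d in d_oar if d > d_cr) / len(d_oar)
    return f_ptv, f_nt, f_oar
```

All three values are to be minimized; in the chapter's setup they are additionally scaled before being handed to the MOEA.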
The benefit of using MOEAs is the information about the trade-off between the objectives, which is essential for selecting an optimal solution. E. Schreibmann et al45 applied NSGA-IIc to IMRT inverse planning. The user specifies a minimum and maximum number of beams, usually 3-9, to be considered. Constraints can be applied by using the constraint domination relation. A two-component chromosome is used, with a part for weights and a part for beams, similar to inverse planning in brachytherapy. After mutation, L-BFGS is applied with 30 iterations to optimize the intensity
distributions of each solution. The number of iterations increases during the evolution. Clinically acceptable results can be obtained in one hour. More than 5000 archived solutions are obtained after 200 generations using a population size of 200 solutions. Arithmetic crossover is used with a random mixing parameter α ∈ [0,1], together with a flip mutation. Mutation and crossover probabilities of 0.01 and 0.9, respectively, are used.

16.4.3. Cancer Chemotherapy

Petrovski et al46 applied MOEAs to the cancer chemotherapy treatment problem. Anti-cancer drugs are given to a patient in n doses at times t_1, ..., t_n. Each dose is a cocktail of d drugs characterized by the concentrations C_ij, i = 1, ..., n, j = 1, ..., d. The problem is the optimization of the concentrations C_ij. The response of the tumor to the chemotherapy treatment is modelled analytically by:
dN/dt = N(t) [ λ ln(Θ/N(t)) − Σ_{j=1}^{d} κ_j Σ_{i=1}^{n} C_ij (H(t − t_i) − H(t − t_{i+1})) ]    (14)
where N(t) is the number of tumor cells at time t, λ and Θ are tumor growth parameters, H(t) is the Heaviside step function and κ_j denotes the efficacy of the j-th anticancer drug. The objectives of the MO optimization include: (1) maximization of tumor eradication, f_1(c). The solutions are subject to the following constraints:

• Maximum instantaneous dose C_max for each drug:

g_1(c) = {C_max,j − C_ij ≥ 0, ∀i ∈ [1, ..., n], ∀j ∈ [1, ..., d]}    (17)
• Maximum cumulative dose C_cum for each drug:

g_2(c) = {C_cum,j − Σ_{i=1}^{n} C_ij ≥ 0, ∀j ∈ [1, ..., d]}    (18)
• Maximum tumor size allowed:

g_3(c) = {N_max − N(t_i) ≥ 0, ∀i ∈ [1, ..., n]}    (19)
• Restriction of toxic side effects of the chemotherapy:

g_4(c) = {C_s-eff,k − Σ_{j=1}^{d} η_kj C_ij ≥ 0, ∀i ∈ [1, ..., n], ∀k ∈ [1, ..., m]}    (20)

where η_kj represents the risk of damaging the k-th organ or tissue by the j-th drug.

SPEA was used for the MO chemotherapy treatment optimization with a maximum of 10000 generations. The crossover probability was 0.6 and a large mutation probability of 0.1 was used. The population size was N = 50 and the external archive size of SPEA was 5. A binary encoding was used for the decision variables C_ij. Each individual is represented by n d-dimensional vectors C_ij, each of them with 4 bytes, corresponding to 25 possible concentration units for each drug. For the optimization, the constraints are added to the objective functions as penalties:

Σ_{j=1}^{m} P_j max{−g_j(c), 0}²    (21)

where the P_j are penalty parameters. For a breast cancer chemotherapy treatment case, d = 3 drugs were considered (Taxotere, Adriamycin and Cisplatinum) using n = 10. New treatment scenarios have been found by the application of SPEA. The representative set contains a number of treatment strategies, some of which were not found by SO optimization algorithms. This provides the therapists with a larger repertoire of treatment strategies, out of which the most suitable for certain cases can be used.

16.5. Data Mining

Knowledge discovery can be seen as the process of identifying novel, useful and understandable patterns in large data sets.
The goal of classification47 is to predict the value (the class) of a user-specified goal attribute based on the values of other attributes, the so-called predicting attributes. Classification rules can be considered a particular kind of prediction rules, where the rule antecedent ("IF part") contains a combination (typically a conjunction) of conditions on predicting attribute values, and the rule consequent ("THEN part") contains a predicted value for the goal attribute. Complete classification may be infeasible when there is a very large number of class attributes. Partial classification, known as nugget discovery, seeks to find patterns that represent a "strong" description of a particular class. The consequent is fixed to be a particular named class. Given a record t, antecedent(t) is true if t satisfies the predicate antecedent; similarly, consequent(t) is true if t satisfies the predicate consequent. The subsets defined by the antecedent or consequent are the sets of records for which the relevant predicate is true. Three sets of records are defined47: A = {t ∈ D | antecedent(t)}, i.e. the set of records defined by the antecedent; B = {t ∈ D | consequent(t)}, i.e. the set of records defined by the consequent; and C = {t ∈ D | antecedent(t) ∧ consequent(t)}. The cardinalities of these sets are a, b and c, respectively. The confidence conf(r) and the coverage cov(r) of a rule r are:

• conf(r) = c/a
• cov(r) = c/b

A strong rule may be defined as one that meets certain confidence and coverage thresholds, normally set by a user.

16.5.1. Partial Classification

Iglesia et al47 used NSGA-II for nugget discovery. An alternative algorithm, ARAC, which can deliver the Pareto global optimal front of all partial classification rules above a specified confidence/coverage threshold, was used for the analysis of the NSGA-II results. The objectives used are conf(r) and cov(r). The antecedent comprises a conjunction of Attribute Tests, ATs.
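The confidence and coverage of a rule follow directly from the sets A, B and C; a minimal sketch with illustrative predicate functions:

```python
def rule_metrics(records, antecedent, consequent):
    """Compute conf(r) = c/a and cov(r) = c/b for a partial
    classification rule, given predicate functions over records."""
    a = sum(1 for t in records if antecedent(t))                     # |A|
    b = sum(1 for t in records if consequent(t))                     # |B|
    c = sum(1 for t in records if antecedent(t) and consequent(t))   # |C|
    conf = c / a if a else 0.0
    cov = c / b if b else 0.0
    return conf, cov
```

A "strong rule" filter is then just a pair of threshold checks on the returned values.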
A binary-encoded string is used to represent the solution as a conjunctive rule. The first part of the string represents the m numeric attributes. Each numeric attribute is represented by a set of Gray-coded lower and upper limits
using 10 bits. For all attributes, when the data is loaded, the maximum and minimum values are calculated and stored. The second part of the string represents categorical attributes, with as many bits for each attribute as distinct values the categorical attribute can take. If a bit assigned to a categorical attribute is set to 0, then the corresponding label is included as an inequality in one of the conjuncts. To evaluate a solution, the bit string is first decoded, and the data in the database is scanned. For a database with n attributes, the ATs for nominal attributes can be expressed in various forms such as: AT_j = v, where v is a value from the domain of AT_j, for some 1 ≤ j ≤ n (a database record x meets this simple value test if x[AT_j] = v); or AT_j ≠ v, for some 1 ≤ j ≤ n (a record x meets this inequality test if x[AT_j] ≠ v). A decoded bit string corresponds to a rule of the following format: IF ...

Generalized Analysis of Promoters: A Method for DNA Sequence Description

to solve this problem. This method is a particularly attractive tool for solving such complex optimization problems because of its generality and its capability, stemming from the application of multimodal optimization procedures, to isolate local optima.

18.3. Problem: Discovering Promoters in DNA Sequences

Biological sequences, such as DNA or protein sequences, are a good example of the type of complex objects that may be described in terms of meaningful structural patterns. The availability of tools to discover these structures and to annotate the sequences on the basis of those discoveries would greatly improve the usefulness of these repositories, which currently rely on methods developed on the basis of computational efficiency and representation accuracy rather than in terms of the structural and functional properties deemed to be important by molecular biologists. An important example of biological sequences is the prokaryotic promoter data gathered and analyzed by many compilations10,9,16 that reveal the presence of two well conserved sequences or submotifs separated by variable distances, and a less conserved sequence. (The notions of proximity and neighborhood in feature space are application dependent.) The variability of the distance between submotifs and their fuzziness, in the sense that they present several mismatches, hinder the existence of a clear model of prokaryotic core promoters. The most representative promoters in E. coli (i.e. σ70 subunits) are described by the following conserved patterns:

(1) TTGACA: This pattern is a hexanucleotide conserved sequence whose middle nucleotide is located approximately 35 pairs of bases upstream of the transcription start site. The consensus sequence for this pattern is TTGACA, and the nucleotides reported in16 reveal the following nucleotide distribution: T_69 T_79 G_61 A_56 C_54 A_54, where, for instance, the first T is the most frequently seen nucleotide in the first position of the pattern and is present in 69% of the cases. This pattern is often called the -35 region.
(2) TATAAT: This pattern is also a hexanucleotide conserved sequence, whose middle nucleotide is located approximately 10 pairs of bases upstream of the transcription start site. The consensus sequence is TATAAT and the nucleotide distribution in this pattern is T_77 A_76 T_60 A_61 A_56 T_82; it is often called the -10 region.
(3) CAP Signal: In general, a pyrimidine (C or T) followed by a purine (A or G) composes the CAP signal. This signal constitutes the transcription start site (TSS) of a gene.
(4) Distance(TTGACA, TATAAT): The distance between the TTGACA and TATAAT consensus submotifs follows a data distribution between 15 and 21 pairs of bases. This distance is critical in holding the two sites at the appropriate separation for the geometry of RNA polymerase.10

The identification of the former RNA polymerase or promoter sites becomes crucial to detect gene activation or repression, through the way in which such promoters interact with different regulatory proteins (e.g. overlapping suggests repression and distances of approximately 40 base pairs suggest typical activation).
Moreover, combining the promoter sites with other regulatory sites37 can reveal different types of regulation: harboring RNA polymerase alone, RNA polymerase recruiting another regulatory protein, or cooperative regulation among more than one regulator.22 Different methods have been used to identify promoters,30,15,2,9 but several failed to perform accurate predictions because of their lack of flexibility, by using crisp instead of fuzzy models for the submotifs (e.g., TATAAT or TTGACA31), or by restricting distances between submotifs to fixed values (e.g., 17 base pairs12). The vagueness of the compound promoter motifs and the uncertainty of identifying which of those predicted sites correspond to a functional promoter can be completely resolved only by performing mutagenesis experiments.22 Thus more accurate and interpretable predictions would be useful in order to reduce the experimental costs and ease the researchers' work.

R. Romero Zaliz et al.

18.4. Biological Sequence Description Methods

In this paper we present results of the application of GAP to the discovery of interesting qualitative features in DNA sequences based on those ideas discussed in Section 18.2. The notion of an interesting feature is formally defined by means of a family of parameterized models M = {M_a} specified by domain experts27 who are interested in finding patterns such as epoch descriptors of individual or multiple DNA sequences. These idealized versions of prototypical models are the basis for a characterization of clusters as cohesive sets that is more general than their customary interpretation as "subsets of close points." To address the promoter prediction problem we take advantage of the ability to represent imprecise and incomplete motifs, the flexibility and interpretability of fuzzy set representations, and the ability of multi-objective genetic algorithms to obtain optimal solutions using different criteria. Our proposed method GAP represents each promoter submotif (i.e., the -10 and -35 regions and the distance that separates them) as fuzzy models, whose membership functions are learned from data distributions.13,21 In addition, as a generalized clustering method, GAP considers the quality of matching with each promoter submotif model (Q), as well as the size of the promoter extent (S), by means of the distance between submotifs, as the multiple objectives to be optimized. To do so, we used a Multi-objective Scatter Search (MOSS) optimization algorithm,18,8 which obtains a set of multiple and optimal promoter descriptions for each promoter region.
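A crude, crisp stand-in for matching a candidate site against the two conserved submotifs and the spacer model (GAP itself uses fuzzy membership functions learned from data; the frequency tables below are the -35 and -10 distributions quoted in Section 18.3, and the 10% background frequency for non-consensus bases is an assumption for illustration):

```python
# Per-position nucleotide frequencies (%) for the consensus hexamers.
MINUS_35 = [("T", 69), ("T", 79), ("G", 61), ("A", 56), ("C", 54), ("A", 54)]
MINUS_10 = [("T", 77), ("A", 76), ("T", 60), ("A", 61), ("A", 56), ("T", 82)]

def match_score(hexamer, model, background=10.0):
    """Average per-position frequency (in [0, 1]) of a 6-mer under a model."""
    total = 0.0
    for base, (consensus, freq) in zip(hexamer, model):
        total += freq / 100.0 if base == consensus else background / 100.0
    return total / len(model)

def score_promoter(seq, pos35, pos10):
    """Score a candidate promoter: both submotif matches, with the
    spacer constrained to the observed 15-21 bp range."""
    spacer = pos10 - (pos35 + 6)
    if not 15 <= spacer <= 21:
        return None  # violates the distance constraint
    q35 = match_score(seq[pos35:pos35 + 6], MINUS_35)
    q10 = match_score(seq[pos10:pos10 + 6], MINUS_10)
    return q35, q10
```

The two returned match qualities correspond, loosely, to the objectives Q_1 and Q_2 that MOSS maximizes per promoter region.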
Moreover, the former matching is also considered by MOSS as a multimodal problem, since there is more than one solution for each region. GAP, by using MOSS, outperforms other methods used for DNA motif discovery, such as Consensus/Patser, which is based on probabilistic weight matrices (see Section 18.5), and provides the desired trade-off between accurate and interpretable solutions, which is particularly desirable for end users. The extension of the original Scatter Search (SS) heuristic18 uses the DNA regions where promoters should be detected as inputs and finds all optimal
Generalized Analysis of Promoters: A Method for DNA Sequence Description
relationships among promoter submotifs and distance models. In order to extend the original SS algorithm to a multi-objective environment we need to introduce some concepts6,5. A multi-objective optimization problem is defined as:

Maximize Q_m(x, M_a),   m = 1, 2, ..., |M|;
subject to g_j(x) ≥ 0,   j = 1, 2, ..., J;
h_k(x) = 0,   k = 1, 2, ..., K;
x_i^(L) ≤ x_i ≤ x_i^(U),   i = 1, 2, ..., n.

where M_a is a generalized clustering model, |M| corresponds to the number of models, Q_m are the objectives to optimize, J is the number of inequality constraints, K is the number of equality constraints, and n is the number of decision variables. The last set of constraints restricts each decision variable x_i to take a value within a lower bound x_i^(L) and an upper bound x_i^(U). Specifically, we consider the following instantiations:

• |M| = 3. We have three models: M_a^1 and M_a^2 are the models for the TATAAT-box and TTGACA-box, respectively, and M_a^3 corresponds to the distance between these two boxes (recall Equations 1 and 2, and Figure 18.1).
• |Q| = 3. We have three objectives, consisting of maximizing the degree of matching to the fuzzy models (fuzzy membership): Q_1(x, M_a^1), Q_2(x, M_a^2) and Q_3(x, M_a^3).
• J = 1. We have just one constraint g_1: the distance between boxes can be no less than 15 and no more than 21 base pairs.
• K = 0. No equality constraints are needed.
• Only valid solutions are kept in each generation.
• The boxes cannot be located outside the searched sequence; that is, a box cannot start at a negative position or at a position greater than the length of the query sequence.

Definition 8: A solution x is said to dominate solution y (x ≺ y) if both conditions 1 and 2 are true: (1) the solution x is no worse than y in all objectives: f_i(x) ≥ f_i(y) for all i = 1, 2, ..., M; (2) the solution x is strictly better than y in at least one objective: f_j(x) > f_j(y) for at least one j ∈ {1, 2, ..., M}. If x dominates the solution y, it is also customary to say that y is dominated by x.
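Definition 8 translates directly into a small predicate. The following is an illustrative Python sketch (not code from the chapter), written for maximization as in the problem statement above:

```python
def dominates(x_obj, y_obj):
    """Return True if objective vector x_obj dominates y_obj (maximization).

    x dominates y when x is no worse than y in every objective and
    strictly better in at least one (Definition 8).
    """
    no_worse = all(xi >= yi for xi, yi in zip(x_obj, y_obj))
    strictly_better = any(xi > yi for xi, yi in zip(x_obj, y_obj))
    return no_worse and strictly_better
```

With this predicate, the nondominated set of a population is simply the set of members that no other member dominates.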
R. Romero Zaliz et al.
In order to code the algorithm, three different models were developed. Both submotif models were implemented by using their nucleotide consensus frequencies as discrete fuzzy sets, whose membership functions were learned from data distributions13. The first model, corresponding to the TATAAT-box, was formulated as:

M_a^1 = μ_tataat(x) = μ_1^1(x_1) ∪ ... ∪ μ_6^1(x_6)   (1)

where the discrete fuzzy set corresponding to the first nucleotide of the submotif T0.77 A0.76 T0.60 A0.61 A0.56 T0.82 was defined as μ_1^1(x_1) = A/0.08 + T/0.77 + G/0.12 + C/0.05, and the other fuzzy sets, corresponding to positions 2-6, were calculated in a similar way according to data distributions from16. The second model, corresponding to the TTGACA-box, was described as:

M_a^2 = μ_ttgaca(x) = μ_1^2(x_1) ∪ ... ∪ μ_6^2(x_6)   (2)

where the discrete fuzzy set corresponding to the first nucleotide of the submotif T0.69 T0.79 G0.51 A0.56 C0.54 A0.54 was defined as μ_1^2(x_1) = A/0.12 + T/0.69 + G/0.13 + C/0.06, and the other fuzzy sets, corresponding to positions 2-6, were calculated in a similar way according to data distributions from16. The union operation corresponds to fuzzy set operations21,13. The third model, i.e., the distance between the previous submotifs, was built as a fuzzy set whose triangular membership function M_a^3 (see Figure 18.1) was learned from data distributions9; it is centered at 17, where the best value (one) is achieved. Therefore, the objective functions Q_m correspond to the membership to the former fuzzy models M_a.
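To make the discrete fuzzy sets concrete, the sketch below (illustrative Python, not the chapter's implementation) evaluates the degree of matching of a hexamer against a submotif model. Only the position-1 frequencies come from the text above; the sets for positions 2-6 are hypothetical placeholders, not the learned values:

```python
# Position-1 discrete fuzzy set of the TATAAT-box (frequencies from the text);
# positions 2-6 use a hypothetical uniform placeholder set.
POS1_TATAAT = {"A": 0.08, "T": 0.77, "G": 0.12, "C": 0.05}
PLACEHOLDER = {"A": 0.25, "T": 0.25, "G": 0.25, "C": 0.25}
TATAAT_MODEL = [POS1_TATAAT] + [PLACEHOLDER] * 5

def match_degree(model, hexamer, combine=max):
    """Degree of matching of a 6-mer to a submotif model.  Each position is
    a discrete fuzzy set over {A, C, G, T}; `combine` aggregates the
    positional memberships (Equation 1 writes the model as a union of the
    positional sets, hence `max` here)."""
    return combine(fs[nt] for fs, nt in zip(model, hexamer))
```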
Fig. 18.1. Graphical representation of M_a^3

Combination Operator and Local Search. We used a block representation
to code each individual, where each block corresponds to one of the promoter submotifs (i.e., TATAAT-box or TTGACA-box). In particular, each block was represented by two integers, where the first number corresponds to the starting point of the submotif and the second one represents the size of the box (see Figure 18.2).

Fig. 18.2. Example of the representation of an individual: the genotype [(6,6)] [(29,6)] locates a TTGACA-box at character 6 and a TATAAT-box at character 29 of the phenotype sequence, with objective values f1 = 0.578595, f2 = 0.800000 and f3 = 1.000000

The combination process was implemented as
a one-point combine operator, where the point is always located between both blocks. For example, given chromosomes with two blocks A and B, and parents P = A1B1 and P' = A2B2, the corresponding siblings would be S = A1B2 and S' = A2B1. The local search was implemented as a search for nondominated solutions in a certain neighborhood. For example, a local search performed over the chromosome space involves a specified number of nucleotides located on the left or right sides of the blocks composing the chromosome. The selection process considers that a new mutated chromosome that dominates one of its parents will replace it, but if it becomes dominated by its ancestors no modification is performed. Otherwise, if the new individual is not dominated by the nondominated population found so far, it replaces its parent only if it is located in a less crowded region (see Figure 18.3).

Algorithm. We modified the original SS algorithm to allow multiple-objective solutions by adding the nondominance criterion to the solution ranking6. Thus, nondominated solutions were added to the set in any order, but dominated solutions were only added if no more nondominated solutions could be found. In addition to maintaining a good set of nondominated solutions, and to avoid one of the most common problems of multi-objective algorithms, namely multi-modality6, we also kept track of the diversity of
the available solutions through all generations. Finally, the initial populations were created randomly, and infeasible solutions, corresponding to out-of-range distances between promoter submotifs (g_1), were checked at each generation. Figure 18.4 illustrates the MOSS algorithm proposed in GAP.

1: Randomly select which block g in the representation of the individual c to apply local search.
2: Randomly select a number n in [-neighbor, neighbor] and move the block g by n nucleotides. Notice that it can be moved upstream or downstream. The resulting block will be g' and the resulting individual will be called c'.
3: if c' meets the restrictions then
4:   if c' dominates c then
5:     Replace c with c'
6:   end if
7:   if c' does not dominate c and c' is not dominated by c and c' is not dominated by any solution in the Non-Dominated set then
8:     Replace c with c' if crowd(c') < crowd(c).
9:   end if
10: end if

Fig. 18.3. Local search
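The one-point combine operator described above is tiny when blocks are kept as (start, size) pairs, as in Figure 18.2. An illustrative Python sketch, not the chapter's code:

```python
def combine(p, q):
    """One-point combine with the cut point fixed between the two blocks:
    parents P = A1 B1 and P' = A2 B2 yield siblings S = A1 B2 and
    S' = A2 B1.  An individual is [box_1_block, box_2_block], each block
    being a (start, size) pair as in Figure 18.2."""
    return [p[0], q[1]], [q[0], p[1]]
```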
18.5. Experimental Algorithm Evaluation

The GAP method was applied to a set of known promoter sequences reported in9. In this work, 261 promoter regions and 68 alternative solutions (multiple promoters) defined in9 for the corresponding sequences (totaling 329 regions) constituted the input of the method. To evaluate the performance of GAP, we first compare the obtained results with those retrieved by a typical DNA sequence analysis method, Consensus/Patser11. Then, we compare the ability of MOSS with that of two other Multiobjective Evolutionary Algorithms (MOEAs), i.e., the Strength Pareto Evolutionary Algorithm (SPEA)33 and the (μ + λ) Multi-Objective Evolutionary Algorithm (MuLambda)20. All of the former MOEA algorithms share the following properties:
• They store optimal solutions found during the search in an external set.
• They work with the concept of Pareto dominance to assign fitness values to the individuals of the population.
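The first shared property, an external set of nondominated solutions, amounts to a simple archive-update rule. The sketch below is illustrative Python (maximization, objective vectors as tuples), not the implementation of SPEA, MuLambda, or MOSS:

```python
def dominates(u, v):
    """u dominates v under maximization (Definition 8)."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

def update_archive(archive, candidate):
    """Insert `candidate` (an objective vector) into an external archive of
    nondominated vectors, discarding any archive member it dominates."""
    if any(dominates(member, candidate) for member in archive):
        return archive                      # candidate is dominated: reject it
    kept = [m for m in archive if not dominates(candidate, m)]
    kept.append(candidate)
    return kept
```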
1: Start with P = ∅. Use the generation method to build a solution and the local search method to improve it. If x ∉ P then add x to P, else reject x. Repeat until P has the user-specified size.
2: Create a reference set RefSet with b/2 nondominated solutions of P and the b/2 solutions of P most diverse from the other b/2. If there are not enough nondominated solutions to fill the b/2, complete the set with dominated solutions.
3: NewSolution ...

Fig. 18.4. MOSS algorithm

Table 6. Results for the training sequences (each row gives a promoter sequence with the TTGACA-box and TATAAT-box submotifs located by GAP, and whether the known promoter was found)

Table 7. Results for the training sequences (columns: sequence name, e.g. rrnABP2, str, trp, tyrT; ttgaca; tataat; promoter sequence; found)
CHAPTER 19

MULTI-OBJECTIVE EVOLUTIONARY ALGORITHMS FOR COMPUTER SCIENCE APPLICATIONS
Gary B. Lamont, Mark P. Kleeman, Richard O. Day
Genetic Computational Techniques Research Group
Department of Electrical and Computer Engineering
Air Force Institute of Technology
Wright-Patterson Air Force Base, Dayton, OH 45433, USA
E-mail: [email protected], mkleeman@afit.edu, [email protected]

In this chapter, we apply the multi-objective Messy Genetic Algorithm-II (MOMGA-II) to two NP-Complete multi-objective optimization problems. The MOMGA-II is a multi-objective version of the fast messy Genetic Algorithm (fmGA), an explicit building-block method. First, the MOMGA-II is used to determine 'good' formations of unmanned aerial vehicles (UAVs) in order to limit the amount of communication flow. The multi-objective Quadratic Assignment Problem (mQAP) is used to model the problem. Then, the MOMGA-II is applied to the Modified Multi-objective Knapsack Problem (MMOKP), a constrained, integer-based decision-variable MOP. The empirical results indicate that the MOMGA-II is an effective algorithm for similar NP-Complete problems of high complexity.
19.1. Introduction

Multi-objective evolutionary algorithms (MOEAs) are stochastic computational tools available to researchers for solving a variety of multi-objective problems (MOPs). Of course, there are many pedagogical polynomial-complexity MOPs that can be solved optimally using deterministic algorithms. However, there are also many real-world MOPs that are too computationally intensive for an optimal answer to be obtained in a reasonable amount of time. These problems are considered NP-complete (NPC) problems1. The use of MOEAs to find "good" solutions to these high-dimension, exponential-complexity problems is of great utility. We address the general category of NPC problems as defined in a MOP structure. Two NPC MOP examples are solved in depth using MOEAs in order to provide specific insight. Some generic comments are presented that address MOEA approaches for NPC combinatoric problems.

19.2. Combinatorial MOP Functions

Multi-objective Optimization Problems (MOPs), a variation of Combinatorial Optimization Problems, are a highly researched area in the computer science and operations research fields. MOPs generally model real-world problems better than their single-objective counterparts, as most real-world problems have competing objectives that need to be optimized. MOPs are used to solve many NP-Complete problems. The detailed notational symbology for these problems, as well as for Pareto optimality, is described in Chapter 1. Table 19.75 lists just a few of these NP-Complete (NPC) types of problems. In essence, NPC combinatoric MOP problems are constrained minimization problems with the additional constraint on x such that it is only able to take on discrete values (e.g., integers). The use of these combinatorial MOPs in any proposed MOEA test suite should also be considered. On one hand, EAs often employ specialized representations and operators when solving these NPC problems, which usually prevents a general comparison between various MOEA implementations. On the other hand, NPC problems' inherent difficulty should present desired algorithmic challenges and complement other test suite MOPs. Databases such as TSPLIB29, MP-Testdata32, and OR Library2 exist for these NP-Complete problems. On another note, the fitness landscapes of various NP-Complete problems vary over a wide range. For example, the knapsack problem reflects a somewhat smooth landscape, while the TSP problem exhibits a many-faceted landscape.
The latter is then more difficult to search for an "optimal" Pareto front. Other NP-Complete problem databases are also available25,6,34. As an example, for the multi-objective 0/1 knapsack problem with n

1 By an "NP" problem is of course meant one solvable in polynomial time by a nondeterministic Turing machine. "C" refers to the polynomial mapping of various NP combinatoric problems to each other; i.e., a complete set8.
Table 19.75. Possible Multi-objective NP-Complete Functions

NP-Complete Problem              | Example
Travelling Salesperson           | Min energy, time, and/or distance; Max expansion
Coloring                         | Min number of colors, number of each color
Set/Vertex Covering              | Min total cost, over-covering
Maximum Independent Set (Clique) | Max set size; Min geometry
Vehicle Routing                  | Min time, energy, and/or geometry
Scheduling                       | Min time, missed deadlines, waiting time, resource use
Layout                           | Min space, overlap, costs
NP-Complete Problem Combinations | Vehicle scheduling and routing
0/1 Knapsack - Bin Packing       | Max profit; Min weight
Quadratic Assignment             | Max flow; Min cost
knapsacks and m items, the objective is to maximize

f(x) = (f_1(x), ..., f_n(x))   (B.1)

where

f_i(x) = Σ_{j=1}^{m} p_{i,j} x_j   (B.2)

and where p_{i,j} is the profit of item j in knapsack i and x_j is 1 if item j is selected. The constraint is

Σ_{j=1}^{m} w_{i,j} x_j ≤ c_i for all i   (B.3)
where w_{i,j} is the weight of item j in knapsack i and c_i is the capacity of knapsack i.

19.3. MOP NPC Examples

In order to gain insight into applying multi-objective evolutionary algorithms (MOEAs) to NPC MOP problems, we discuss the multi-objective quadratic assignment problem and a modified MOP knapsack problem. This insight provides a general understanding of MOEA development for NPC MOPs.

19.3.1. Multi-Objective Quadratic Assignment Problem
The standard quadratic assignment problem (QAP) and the multi-objective quadratic assignment problem (mQAP) are NP-complete problems. Such
problems arise in real-world applications such as facilities placement, scheduling, data analysis, manufacturing, and resource use. Most QAP examples can be thought of as minimizing the product of two matrices, for example a distance matrix times a flow matrix cost objective. Many approaches to solving large-dimensional QAPs involve hybrid algorithms, including GA integration with local search methods such as Tabu search and simulated annealing. Here we examine the mQAP as mapped to a heterogeneous mix of unmanned aerial vehicles (UAVs) using a MOEA. Our model concentrates on minimizing communication flow and maximizing mission success by positioning UAVs in a selected position within a strict formation. Various experiments are conducted using a MOEA approach. The specific algorithm used was the multi-objective Messy Genetic Algorithm-II (MOMGA-II), an explicit building-block method. Solutions are then compared to deterministic results (where applicable). The symbolic problem description is initially discussed to provide problem domain insight. Regarding a specific application, consider UAVs flying in large groups. One possible scenario is to have a heterogeneous group of UAVs flying together to meet a specific objective. There could be some in the group that are doing reconnaissance and reporting the information for security purposes. In a large heterogeneous group such as this, one UAV's position with respect to the other UAVs is important. For example, it would be best to place some UAVs around the outside of the group in order to protect the group as a whole. It would also be advantageous to have the reconnaissance UAVs nearer to the ground in order to allow them to have an unobstructed field of view. While location in the formation for their particular part of the mission is important, they also need to be in a position where they can communicate effectively with other UAVs.
For example, the reconnaissance UAVs need to communicate coordinates to enable them to find their target. Other UAVs need to communicate with all of the other UAVs when they sense approaching aircraft, so that the group can take evasive action (like fish behavior). All of this communication may saturate one communication channel, so multiple communication channels are used. All of these channels of communication can also dictate where the best location in the group may be for each UAV type. The UAV communication and mission success problem is a natural extension of the mQAP. The mQAP comes from the quadratic assignment problem (QAP) and was introduced by Knowles and Corne17. The scalar quadratic assignment problem was introduced in 1957 by Koopmans and
Beckmann, when they used it to model a plant location problem3. It is defined as follows.

19.3.1.1. Literary QAP Definition

The QAP definition is based on a fixed number of locations, where each location is a fixed distance apart from the others. In addition to the locations, there are an equal number of facilities. Each facility has a fixed flow to each other facility. A solution consists of placing each facility in one and only one location. The goal is to place all facilities in such a way as to minimize the cost of the solution, where the cost is defined as the summation of each flow multiplied by the corresponding distance.

19.3.1.2. Mathematical QAP Definition

minimize_{π ∈ P(n)} C(π) = Σ_{i=1}^{n} Σ_{j=1}^{n} a_{ij} b_{π_i π_j}   (C.1)

where n is the number of objects/locations, a_{ij} is the distance between location i and location j, b_{ij} is the flow from object i to object j, and π_i gives the location of object i in permutation π ∈ P(n), where P(n) is the QAP search space, the set of all permutations of {1, 2, ..., n}18. This problem is not only NP-hard and NP-hard to approximate, but is almost intractable: it is generally considered to be impossible to solve optimally any QAP instance of size 20 or more within a reasonable time frame3,27.

19.3.1.3. General mQAP

The mQAP is similar to the scalar QAP, with the exception that there are multiple flow matrices, each needing to be minimized. For example, the UAVs may use one communication channel for passing reconnaissance information, another channel for target information, and yet another channel for status messages. The goal is to minimize all the communication flows between the UAVs.

19.3.1.4. Mathematical mQAP

The mQAP is defined in mathematical terms in Equations C.2 and C.3:
minimize{C(π)} = {C^1(π), C^2(π), ..., C^m(π)}   (C.2)

where

C^k(π) = Σ_{i=1}^{n} Σ_{j=1}^{n} a_{ij} b^k_{π_i π_j},   k ∈ 1..m   (C.3)
and where n is the number of objects/locations, a_{ij} is the distance between location i and location j, b^k_{ij} is the kth flow from object i to object j, π_i gives the location of object i in permutation π ∈ P(n), and 'minimize' means to obtain the Pareto front18. Much work has been done with respect to classifying solutions found in the fitness landscape of QAP instances. Knowles and Corne18 identified two metrics for use with the mQAP: diameter and entropy. The diameter of the population is defined by Bachelet1 and is shown in Equation C.4:
dmm(P), computed from the pairwise distances dist(π, μ) between population members   (C.4)
where dist(π, μ) is a distance measurement that measures the smallest number of two-swaps that need to be performed in order to transform one solution, π, into another solution, μ. The distance measure has a range of [0, n-1]. The entropy metric measures the dispersion of the solutions. It is shown in Equation C.5:

E(P) = -(1 / (n log n)) Σ_{i=1}^{n} Σ_{j=1}^{n} (n_{ij} / |P|) log(n_{ij} / |P|)   (C.5)

where n_{ij} is a measure of the number of times object i is assigned to location j in the population. Many approaches have been tried to solve the QAP. Researchers interested in finding the optimal solution can usually only do so for problems that are of size 20 or less; moreover, even problem sizes of 15 are considered to be difficult3. In cases where it is feasible to find the optimal solution (less than size 20), branch and bound methods are typically used10,28,3. Unfortunately, most real-world problems are larger than size 20 and thus require the employment of other solving methods in order to find a good solution in a reasonable time. For instance, the use of Ant Colonies has been explored
and is found to be effective when compared to other available heuristics7,31,21. Evolutionary algorithms have also been applied23,20,11,26. A good source where researchers compare the performances of different search methods when solving the QAP can be found in33,22.

19.3.1.5. Mapping QAP to MOEA

Table 19.76. Test Suite

Test Name            | Instance Category | # of locations | # of flows
KC10-2fl-[1,2,3]uni  | Uniform           | 10             | 2
KC20-2fl-[1,2,3]uni  | Uniform           | 20             | 2
KC30-3fl-[1,2,3]uni  | Uniform           | 30             | 3
KC10-2fl-[1,...,5]rl | Real-like         | 10             | 2
KC20-2fl-[1,...,5]rl | Real-like         | 20             | 2
KC30-3fl-[1,2,3]rl   | Real-like         | 30             | 3

Table 19.77. MOMGA-II settings

Parameter         | Value
GA-type           | fast messy GA
Representation    | Binary
Eras              | 10
BB Sizes          | 1-10
Pcut              | 2%
Psplice           | 100%
Pmutation         | 0%
String length     | 100, 200, 300
Total Generations | 100
Thresholding      | No
Tiebreaking       | No
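Before mapping the mQAP onto the MOEA, it may help to see the objective vector of Equations C.2 and C.3 spelled out in code. This is a minimal Python sketch (not the MOMGA-II implementation), assuming the distance matrix and the m flow matrices are given as nested lists and that perm[i] gives the location of object i:

```python
def mqap_costs(perm, dist, flows):
    """Objective vector (C^1(pi), ..., C^m(pi)) of Equation C.3:
    C^k(pi) = sum_i sum_j a_ij * b^k_{pi_i, pi_j},
    with `dist` the distance matrix a and `flows` the flow matrices b^1..b^m."""
    n = len(perm)
    return tuple(
        sum(dist[i][j] * flow[perm[i]][perm[j]]
            for i in range(n) for j in range(n))
        for flow in flows
    )
```

Each candidate permutation thus maps to an m-vector of communication costs, and Pareto dominance over these vectors drives the search.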
The Multi-objective messy Genetic Algorithm-II (MOMGA-II) program is based on the concept of the Building Block Hypothesis (BBH). The MOMGA-II is based on the earlier MOMGA algorithm38. The MOMGA implements a deterministic process to generate an enumeration of all possible BBs of a user-specified size for the initial population. This process is referred to as Partially Enumerative Initialization (PEI). Thus, the MOMGA explicitly uses these building blocks in combination to attempt to solve for the optimal solutions in multi-objective problems.
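PEI's deterministic enumeration can be sketched in a few lines. This is illustrative Python, assuming binary alleles and the usual messy-GA encoding of a gene as a (locus, allele) pair:

```python
from itertools import combinations, product

def partially_enumerative_init(string_length, bb_size, alleles=(0, 1)):
    """Sketch of Partially Enumerative Initialization (PEI): deterministically
    enumerate every building block of `bb_size` genes, a messy-GA gene being
    a (locus, allele) pair."""
    return [
        tuple(zip(loci, values))
        for loci in combinations(range(string_length), bb_size)
        for values in product(alleles, repeat=bb_size)
    ]
```

The population size this implies, C(string_length, bb_size) * |alleles|^bb_size, grows quickly with the BB size, which is exactly the bottleneck PCI and building-block filtering address below.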
The original messy GA consists of three distinct phases: the Initialization Phase, the Primordial Phase, and the Juxtapositional Phase. The MOMGA uses these concepts and extends them where necessary to handle k > 1 objective functions. In the initialization phase, the MOMGA produces all building blocks of a user-specified size. The primordial phase performs tournament selection on the population and reduces the population size if necessary. The population size is adjusted based on the percentage of "high" fitness BBs that exist. In some cases, the "lower" fitness BBs may be removed from the population to increase this percentage. In the juxtapositional phase, BBs are combined through the use of a cut-and-splice recombination operator. Cut and splice is a recombination (crossover) operator used with variable-string-length chromosomes. The cut-and-splice operator is used with tournament thresholding selection to generate the next population. A probabilistic approach is used in initializing the population of the fmGA. The approach is referred to as Probabilistically Complete Initialization (PCI)9. PCI initializes the population by creating a controlled number of BBs based on the user-specified BB size and string length. The fmGA's initial population size is smaller than that of the mGA (and the MOMGA by extension) and grows at a smaller rate, as a total enumeration of all BBs of a given size is not necessary. These BBs are then "filtered" through a Building Block Filtering (BBF) phase to probabilistically ensure that all of the desired good BBs from the initial population are retained in the population. The BBF approach effectively reduces the computational bottlenecks encountered with PEI by reducing the initial population size required to obtain "good" statistical results. The fmGA concludes by executing a number of juxtapositional phase generations in which the BBs are recombined to create strings of potentially better fitness.
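The cut-and-splice operator on variable-length chromosomes can be sketched as follows. This is illustrative Python, with the cut probability defaulting to the 2% of Table 19.77; the uniform choice of cut point is an assumption for illustration, not necessarily the fmGA's exact rule:

```python
import random

def cut_and_splice(p1, p2, p_cut=0.02, rng=random):
    """Cut-and-splice recombination on variable-length messy chromosomes
    (lists of genes).  Each parent is cut at a random interior point with
    probability p_cut (otherwise it stays whole), and the pieces are then
    spliced crosswise, so offspring lengths may differ from the parents'."""
    c1 = rng.randrange(1, len(p1)) if len(p1) > 1 and rng.random() < p_cut else len(p1)
    c2 = rng.randrange(1, len(p2)) if len(p2) > 1 and rng.random() < p_cut else len(p2)
    return p1[:c1] + p2[c2:], p2[:c2] + p1[c1:]
```

Note that the two offspring together always carry exactly the genes of the two parents; only the partition between them changes.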
The MOMGA-II mirrors the fast messy Genetic Algorithm (fmGA) and consists of the following phases: Initialization, Building Block Filtering, and Juxtapositional. It differs from the MOMGA in the initialization phase, which uses PCI instead of PEI and randomly creates the initial population, and in the primordial phase, which is replaced by the Building Block Filtering phase. Applying an MOEA to a class of MOPs containing few feasible points creates difficulties that the MOEA must overcome in order to generate any feasible points during the search. A random initialization of an MOEA's population may not generate any feasible points in a constrained MOP, and without any feasible solutions in the population one must question whether the MOEA can even conduct a worthwhile search. In problems where the feasible region is greatly restricted, it may be impossible to create a complete initial population of feasible solutions randomly; without feasible population members, any MOEA is destined to fail. Feasible population members contain the BBs necessary to generate good solutions. It is possible for an infeasible population member to contain a BB that is also present in a feasible solution, just as it is possible for mutation to generate a feasible population member from an infeasible one. Typically, however, feasible population members contain BBs that are not present in infeasible members, and evolutionary operators (EVOPs) applied to feasible members tend to yield better results than EVOPs applied to infeasible members. Therefore, it is critical to initialize and maintain a population of feasible individuals.

19.3.2. MOEA mQAP Results and Analysis

19.3.2.1. Design of mQAP Experiments and Testing

The goal of the experiments was to compare the MOMGA-II results with those of other programs that have solved the mQAP. A benchmark data set was needed for comparison purposes; the test suite chosen was created by Knowles 16, and for the smaller problems a deterministic search program was used to obtain definitive results. See Table 19.76 for a complete listing of the test suite problems. Table 19.77 lists the MOMGA-II default parameter settings used during the mQAP experiments. Building block sizes 1 through 10 were used, and each building block size was run in a separate iteration, or era, of the program. Population sizes were created using the Probabilistically Complete Initialization method referred to earlier in this chapter.
Specifically, the population for each era was determined using the population-sizing formula in Equation C.6:

PopSize = NumCopies x AlleleCombo x Choose(ProbLen, Order)    (C.6)

where PopSize is the population size, NumCopies is the desired number of duplicate copies of each allele combination, AlleleCombo is 2^Order with Order the building block size, and Choose(ProbLen, Order) is the binomial coefficient that takes the problem length and the building block size as its arguments. These settings were chosen based on the settings previously used when the MOMGA-II was applied to the multi-objective knapsack problem. Because of the extended time needed to generate data, other settings could not be evaluated as well; it is recommended that future experiments be run with different settings in order to determine the best settings for these particular problems. The MOMGA-II results are taken over 30 data runs. The MOMGA-II was run on a Beowulf PC cluster consisting of 32 dual-processor machines, each with 1 GB of memory and two 1-GHz Pentium III processors (using Red Hat Linux 7.3 and MPI 1.2.7.1). The MOMGA-II code was run in two different manners. One method started with a randomized competitive template and passed the improved competitive template on to larger building block sizes; the other used a separate competitive template for each building block size. The first method allows the algorithm to exploit "good" solutions as the run progresses, while the second allows the larger building block sizes to explore more of the search space. The MOMGA-II was run to generate a population with good (low) fitness values for the flows, and the non-dominated points were then found. After the unique Pareto points for each run were identified, the results were combined one at a time and pareto.enum was used to pull out the unique Pareto points for each round. A simple MATLAB program was then used to show how the data values improved as more runs were accumulated.

19.3.2.2. QAP Analysis

Table 19.78 compares our original results (competitive template passed to larger building block sizes) to those found by Knowles and Corne 18 and, where applicable, to the optimal results obtained with a simple program that enumerates all possible permutations. Abbreviations used in the table are as follows: Non-Dominated (ND), Diameter (Dia), Entropy (Ent), and Deterministic (Det).
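The population-sizing rule of Equation C.6 above is a direct product of three factors and can be sketched as a one-line function (names are illustrative):

```python
from math import comb

def pei_population_size(num_copies, prob_len, order):
    """Population size from Equation C.6:
    PopSize = NumCopies * 2**Order * C(ProbLen, Order).
    2**Order counts the allele combinations of a block of that order;
    C(ProbLen, Order) counts the position subsets it can occupy."""
    return num_copies * (2 ** order) * comb(prob_len, order)

print(pei_population_size(num_copies=1, prob_len=10, order=2))  # 1 * 4 * 45 = 180
```

The binomial factor dominates, which is why population sizes (and hence run times) grow quickly with both problem length and building block order.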
For all of the instances with 10 locations and 10 facilities, Knowles and Corne used a deterministic algorithm. For the instances with 20 locations and 20 facilities, they used local search measures employing 1000 local searches from each of 100 different weight vectors, and for the instances with 30 locations and 30 facilities they employed a similar local search measure using 1000 local searches from each of 105 different weight vectors 17.
Multi-Objective Evolutionary Algorithms for Computer Science Applications
Table 19.78. Comparison of QAP Results

                  Knowles Results          Our Results            PF_true Points    %
Test Name         #ND pts  Dia   Ent       #ND pts  Dia   Ent     EA      Det      Found
KC10-2fl-1uni          13    7   0.71           13    5   0.69     9       13       69
KC10-2fl-2uni           1    6   0.39            1    0   0        1        1      100
KC10-2fl-3uni         130    8   0.78          118    6   0.87    40      130       31
KC20-2fl-1uni          80   15   0.828          24   11   0.82     -        -        -
KC20-2fl-2uni          19   14   0.43          538   15   1.48     -        -        -
KC20-2fl-3uni         178   16   0.90           51   12   0.92     -        -        -
KC30-3fl-1uni         705   24   0.97          126   20   0.50     -        -        -
KC30-3fl-2uni         168   22   0.92           58   22   0.64     -        -        -
KC30-3fl-3uni        1257   24   0.96          155   20   0.56     -        -        -
KC10-2fl-1rl           58    8   0.68           44    5   0.61    21       58       36
KC10-2fl-2rl           15    7   0.49           10    5   0.56     5       15       33
KC10-2fl-3rl           55    8   0.62           36    6   0.71    23       55       42
KC10-2fl-4rl           53    8   0.58           34    4   0.67    24       53       45
KC10-2fl-5rl           49    8   0.63           45    6   0.69    36       49       73
KC20-2fl-1rl          541   15   0.63           17   12   0.73     -        -        -
KC20-2fl-2rl          842   14   0.6            12   11   0.76     -        -        -
KC20-2fl-3rl         1587   15   0.66           29   12   0.91     -        -        -
KC20-2fl-4rl         1217   15   0.51           25   10   0.18     -        -        -
KC30-3fl-1rl         1329   24   0.83          191   24   0.79     -        -        -
KC30-3fl-2rl         1924   24   0.86          183   24   0.77     -        -        -

(Deterministic results and percentages found are available only for the 10-location, 10-facility instances.)
Fig. 19.1. Pareto front found for the KC30-3fl-2rl test instance

Fig. 19.2. Pareto front found for the KC30-3fl-3uni test instance

Comparing the initial MOMGA-II results for the instances with 10 locations and 10 facilities shows that they did not match the Pareto optimal results (found deterministically). It can also be assumed that the MOMGA-II results for the problems with 20 and 30 locations and facilities did not find all the true Pareto front members. When compared with Knowles and Corne's test results, the MOMGA-II results may be deficient, depending on whether they indeed found true Pareto front points. Figures 19.1 and 19.2 illustrate results for 30 locations and 30 facilities. We then ran the MOMGA-II with exactly the same settings, but with the competitive template randomized for each building block size. Table 19.79 shows the outcome of those results with respect to our initial run and the optimal results. The old method refers to using the same competitive templates throughout all the building block sizes; the new method randomizes the competitive template before each building block size. By allowing more exploration of the search space, we were able to find more PF_true points. Figures 19.3 and 19.4 show some of the results of these runs. The results show that the new method performs much better than the old method on all instances except one. The one case where the old method performs better is when there is only a single solution point. These results show that, with the exception of one instance, the new method is more effective than
Table 19.79. Comparison of MOMGA-II Methods to PF_true

                               True Pareto Front Points
Test Name                Total PF_true   Old Method   Percent Found   New Method   Percent Found
KC10-2fl-1uni                  13             9            69             11            85
KC10-2fl-2uni                   1             1           100              0             0
KC10-2fl-3uni                 130            40            31            122            94
KC10-2fl-1rl                   58            21            36             56            97
KC10-2fl-2rl                   15             5            33             11            73
KC10-2fl-3rl                   55            23            42             50            91
KC10-2fl-4rl                   53            24            45             47            89
KC10-2fl-5rl                   49            36            73             49           100
Mean                                                      53.76                        78.49
Std. Dev.                                                 24.59                        32.75
Mean (w/o anomaly)                                        47.16                        89.70
Std. Dev. (w/o anomaly)                                   17.28                         8.82
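The percentage and mean rows of Table 19.79 follow directly from the raw counts, and can be reproduced with a few lines (the anomaly excluded below is the single-point KC10-2fl-2uni instance discussed later in the text):

```python
# (total PF_true, old-method found, new-method found) per instance,
# copied from Table 19.79
rows = {
    "KC10-2fl-1uni": (13, 9, 11),
    "KC10-2fl-2uni": (1, 1, 0),      # the anomaly: a single true point
    "KC10-2fl-3uni": (130, 40, 122),
    "KC10-2fl-1rl": (58, 21, 56),
    "KC10-2fl-2rl": (15, 5, 11),
    "KC10-2fl-3rl": (55, 23, 50),
    "KC10-2fl-4rl": (53, 24, 47),
    "KC10-2fl-5rl": (49, 36, 49),
}

def mean(xs):
    return sum(xs) / len(xs)

old_pct = [100 * old / total for total, old, _ in rows.values()]
new_pct = [100 * new / total for total, _, new in rows.values()]
print(round(mean(old_pct), 2), round(mean(new_pct), 2))  # 53.76 78.49

old_wo = [p for name, p in zip(rows, old_pct) if name != "KC10-2fl-2uni"]
new_wo = [p for name, p in zip(rows, new_pct) if name != "KC10-2fl-2uni"]
print(round(mean(old_wo), 2), round(mean(new_wo), 2))  # 47.16 89.7
```

The means in the table are computed from the unrounded percentages, which is why they differ slightly from the mean of the rounded column entries.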
Fig. 19.3. Comparison of MOMGA-II methods to optimal results on KC10-2fl-1rl test instance
the old method. This suggests that randomizing the competitive template allows the algorithm to explore the objective space more effectively and yields better results. It is believed that the MOMGA-II suffers from "speciation", which can be overcome by adding some competitive templates near the center of the Pareto front. See 13,14 for more detailed analysis of these results.
Fig. 19.4. Comparison of MOMGA-II methods to optimal results on KC10-2fl-1uni test instance
The results from Table 19.79 support these findings: whenever there were many points to find, the new method always found more than the old method. The old method performed better than the new method when there was only one point to find because both competitive templates point at the same location, directing the search in a single direction rather than dividing it into two. Since the new method does not pass this directed search on to the larger building block sizes, it starts at a disadvantage when trying to find one or two points. Additional experiments were conducted to see whether building block size plays a role in where the points are located along the Pareto front. We found that, on average, about twice as many large building blocks populate the outside of the Pareto front as the smaller building block sizes do. This is because more bits are set in the genotype domain, which allows for better solutions in the phenotype domain. These results support those discussed in Section 19.4 of this chapter.
More in-depth analysis of the results can be found in 5,13,14.

19.3.3. Modified Multi-Objective Knapsack Problem (MMOKP)

The generic multiple knapsack problem (MKP) consists of maximizing the profit of the items placed in all knapsacks while adhering to the maximum weight (capacity constraint) of each knapsack. The MOMGA-II is applied to the Modified Multi-objective Knapsack Problem (MMOKP), also a constrained MOP with integer-based decision variables. The formulation contains a large number of decision variables; MOEAs are well suited to problems of high dimensionality, and hence the MOMGA-II is suited to this application. The MMOKP is formulated with 100, 250, 500, and 750 items, integer-based decision variables, and real-valued fitness functions. The MMOKP formulation used here does not reflect the true multi-objective formulation of the multiple knapsack problem (MKP), because of the constraint that any item placed into one of the knapsacks must also be placed into all of the knapsacks. However, the MMOKP remains a good test problem owing to its large number of decision variables and the difficulty of generating solutions on the Pareto front. Many researchers have selected this MOP to test their non-explicit building-block MOEAs 12,15,19,30,35,36,37. The MMOKP has been selected here both because of the difficulty of finding good solutions to it and in order to evaluate the performance of an explicit BB-based MOEA on this MOP; since the MMOKP shares characteristics with other real-world MOPs, it is a good test problem to use. The specific MOMGA-II settings used are presented in Table 19.80. Results are taken over 30 data runs in order to compare the MOMGA-II with other MOEAs also executed over 30 data runs. The MOMGA-II was run on a Sun Ultra 10 with a single 440-MHz processor and 1024 MB of RAM, running the Solaris 8 operating system.

Table 19.80. MOMGA-II settings

Parameter            Value
Eras                 10
BB Sizes             1-10
P_cut                2%
P_splice             100%
String length        100, 250, 500, 750
Total Generations    100
The overall goal is to maximize the profit obtained from each of the knapsacks simultaneously while meeting the weight constraints imposed. The MOP formulation follows for m items and n knapsacks, where p_{i,j} is the profit of item j according to knapsack i, w_{i,j} is the weight of item j according to knapsack i, and c_i is the capacity of knapsack i. For the MMOKP with n knapsacks and m items, the objectives are to maximize

f(x) = (f_1(x), ..., f_n(x))    (C.7)

where

f_i(x) = Σ_{j=1}^{m} p_{i,j} x_j    (C.8)

and where x_j = 1 if item j is selected and 0 otherwise 37. The constraints are 19,37:

Σ_{j=1}^{m} w_{i,j} x_j ≤ c_i,  i = 1, ..., n.

The extremes are referred to as the endpoints of the curve or k-dimensional surface dictated by the k objective functions. The difficulty of generating the extreme points of the Pareto front is attributed to the necessary identification of multiple BBs of different sizes. Implicit BB-based MOEAs may only generate BBs of a single size, or may not be executed with a population size large enough to statistically generate the multiple good BBs of various sizes necessary to generate PF_true. Various examples can illustrate the effect that different BB sizes may have in finding various points in the ranked fronts and in PF_true. Since many MOEA researchers conduct their research efforts with implicit BB-based MOEAs, building block concepts and the effects of identifying good BBs are not readily noticeable. Through research conducted with the MOMGA-II and the theoretical development of population-sizing equations 38 based upon the Building Block Hypothesis, the need for different-sized BBs to generate PF_true becomes apparent. Many existing MOEAs are not effective at finding all of the points on the Pareto front and, more specifically, points at the endpoints or end sections of the Pareto front, when applied to test-suite and real-world MOPs 12,19,37. While generating any point on the Pareto front may be useful for real-world applications in which potential solutions have not been found, it would be even more useful if a researcher could generate a good distribution of points across the entire front; this has been identified by researchers using MOEAs as an important issue 4,12,19. A question that the MOEA community should answer is: why do various MOEAs fail to find the endpoints of the Pareto front, or, if they do find some of these points, why does this typically occur only with larger population sizes?
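The MMOKP objectives and constraints of Equations C.7 and C.8 can be evaluated directly from the shared 0/1 selection vector; a minimal evaluation sketch (function and argument names are illustrative) is:

```python
def evaluate(x, profits, weights, capacities):
    """Objectives and feasibility for the knapsack MOP of Eqs. C.7-C.8.

    x is the 0/1 item-selection vector shared by all knapsacks (the MMOKP
    restriction: an item is in every knapsack or in none); profits and
    weights are n x m matrices, capacities has length n.
    """
    n, m = len(capacities), len(x)
    f = [sum(profits[i][j] * x[j] for j in range(m)) for i in range(n)]
    feasible = all(
        sum(weights[i][j] * x[j] for j in range(m)) <= capacities[i]
        for i in range(n)
    )
    return f, feasible

f, ok = evaluate([1, 1], [[1, 2], [3, 4]], [[1, 1], [1, 1]], [2, 2])
print(f, ok)  # [3, 7] True
```

Because a single x vector must satisfy every knapsack's capacity simultaneously, the feasible region shrinks as n grows, which is the difficulty discussed earlier for randomly initialized populations.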
When using an explicit BB-based MOEA, the implication of Van Veldhuizen's theorem is that one must use a BB of the same order as the largest-order BB required to solve each of the functions in the MOP, leading to various conjectures 38.
19.5. Future Directions
The mQAP and the MMOKP are examples of NPC MOPs that are difficult to solve deterministically at relatively large problem sizes. Stochastic algorithms such as MOEAs take a long time to reach a "good" answer for a large number of locations simply because the solution space is so large and of exponential complexity. It is imperative to ensure that proper building block sizes are used in order to populate PF_known with enough members to get as close to PF_true as possible. Thus, in applying MOEAs to large-dimensional NPC MOPs, one should consider possible problem relaxation, analysis of building block structures, use of a variety of MOEAs and operators, parallel computation, and finally an extensive design of experiments with appropriate metric selection, parameter sensitivity analysis, and comparison. We plan to examine how chromosome sizing affects the mQAP results. By changing the bit representation, we can cut the chromosome size down from 10 bits per location to 4, which substantially reduces the genotype space relative to previous experiments and should produce better results, since the search space is smaller. This concept can also improve efficiency in solving other NP-complete MOPs.

References

1. Vincent Bachelet. Métaheuristiques Parallèles Hybrides: Application au Problème d'Affectation Quadratique. PhD thesis, Université des Sciences et Technologies de Lille, December 1999.
2. John E. Beasley. OR-Library. 12 May 2003 http://mscmga.ms.ic.ac.uk/info.html.
3. Eranda Cela. The Quadratic Assignment Problem: Theory and Algorithms. Kluwer Academic Publishers, Boston, MA, 1998.
4. Carlos A. Coello Coello, David A. Van Veldhuizen, and Gary B. Lamont. Evolutionary Algorithms for Solving Multi-Objective Problems. Kluwer Academic Publishers, New York, May 2002.
5. Richard O. Day, Mark P. Kleeman, and Gary B. Lamont. Solving the multiobjective quadratic assignment problem using a fast messy genetic algorithm. In Congress on Evolutionary Computation (CEC'2003), volume 4, pages 2277-2283, Piscataway, New Jersey, December 2003. IEEE Service Center.
6. Eranda Cela. QAPLIB - a quadratic assignment problem library. 8 June 2004 http://www.opt.math.tu-graz.ac.at/qaplib/.
7. L. M. Gambardella, E. D. Taillard, and M. Dorigo. Ant colonies for the quadratic assignment problem. Journal of the Operational Research Society, 50:167-176, 1999.
8. M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, San Francisco, 1979.
9. David E. Goldberg, Kalyanmoy Deb, Hillol Kargupta, and Georges Harik. Rapid, accurate optimization of difficult problems using fast messy genetic algorithms. In Stephanie Forrest, editor, Proceedings of the Fifth International Conference on Genetic Algorithms, pages 56-64. Morgan Kaufmann Publishers, 1993.
10. Peter Hahn, Nat Hall, and Thomas Grant. A branch-and-bound algorithm for the quadratic assignment problem based on the Hungarian method. European Journal of Operational Research, August 1998.
11. Jorng-Tzong Horng, Chien-Chin Chen, Baw-Jhiune Liu, and Cheng-Yen Kao. Resolution of quadratic assignment problems using an evolutionary algorithm. In Proceedings of the 2000 Congress on Evolutionary Computation, volume 2, pages 902-909. IEEE, 2000.
12. Andrzej Jaszkiewicz. On the performance of multiple-objective genetic local search on the 0/1 knapsack problem - a comparative experiment. IEEE Transactions on Evolutionary Computation, 6(4):402-412, August 2002.
13. Mark P. Kleeman. Optimization of heterogeneous UAV communications using the multiobjective quadratic assignment problem. Master's thesis, Air Force Institute of Technology, Wright-Patterson AFB, OH, March 2004.
14. Mark P. Kleeman, Richard O. Day, and Gary B. Lamont. Multi-objective evolutionary search performance with explicit building-block sizes for NPC problems. In Congress on Evolutionary Computation (CEC2004), volume 4, Piscataway, New Jersey, May 2004. IEEE Service Center.
15. Joshua Knowles and David Corne. M-PAES: A memetic algorithm for multiobjective optimization. In 2000 Congress on Evolutionary Computation, volume 1, pages 325-332, Piscataway, New Jersey, July 2000. IEEE Service Center.
16. Joshua Knowles and David Corne. Instance generators and test suites for the multiobjective quadratic assignment problem. Technical Report TR/IRIDIA/2002-25, IRIDIA, 2002. (Accepted for presentation/publication at the 2003 Evolutionary Multi-Criterion Optimization Conference (EMO 2003), Faro, Portugal.)
17. Joshua Knowles and David Corne. Towards landscape analyses to inform the design of hybrid local search for the multiobjective quadratic assignment problem. In A. Abraham, J. Ruiz del Solar, and M. Koppen, editors, Soft Computing Systems: Design, Management and Applications, pages 271-279, Amsterdam, 2002. IOS Press. ISBN 1-58603-297-6.
18. Joshua Knowles and David Corne. Instance generators and test suites for the multiobjective quadratic assignment problem. In Carlos Fonseca, Peter Fleming, Eckart Zitzler, Kalyanmoy Deb, and Lothar Thiele, editors, Evolutionary Multi-Criterion Optimization, Second International Conference, EMO 2003, Faro, Portugal, April 2003, Proceedings, number 2632 in LNCS, pages 295-310. Springer, 2003.
19. Marco Laumanns, Lothar Thiele, Eckart Zitzler, and Kalyanmoy Deb. Archiving with guaranteed convergence and diversity in multi-objective optimization. In W. B. Langdon et al., editors, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2002), pages 439-447, San Francisco, California, July 2002. Morgan Kaufmann Publishers.
20. In Lee, Riyaz Sikora, and Michael J. Shaw. A genetic algorithm-based approach to flexible flow-line scheduling with variable lot sizes. IEEE Transactions on Systems, Man and Cybernetics - Part B, 27:36-54, February 1997.
21. Vittorio Maniezzo and Alberto Colorni. The ant system applied to the quadratic assignment problem. IEEE Transactions on Knowledge and Data Engineering, 11:769-778, 1999.
22. Peter Merz and Bernd Freisleben. A comparison of memetic algorithms, tabu search, and ant colonies for the quadratic assignment problem. In Proceedings of the 1999 Congress on Evolutionary Computation (CEC 99), volume 3, pages 1999-2070. IEEE, 1999.
23. Peter Merz and Bernd Freisleben. Fitness landscape analysis and memetic algorithms for the quadratic assignment problem. IEEE Transactions on Evolutionary Computation, 4:337-352, 2000.
24. Zbigniew Michalewicz. Genetic Algorithms + Data Structures = Evolution Programs. Springer-Verlag, 2nd edition, 1994.
25. Arnold Neumaier. Global optimization test problems. 8 June 2004 http://www.mat.univie.ac.at/~neum/glopt/test.html.
26. Volker Nissen. Solving the quadratic assignment problem with clues from nature. IEEE Transactions on Neural Networks, 5:66-72, 1994.
27. Panos M. Pardalos and Henry Wolkowicz. Quadratic assignment and related problems. In Panos M. Pardalos and Henry Wolkowicz, editors, Proceedings of the DIMACS Workshop on Quadratic Assignment Problems, 1994.
28. K. G. Ramakrishnan, M. G. C. Resende, and P. M. Pardalos. A branch and bound algorithm for the quadratic assignment problem using a lower bound based on linear programming. In C. Floudas and P. M. Pardalos, editors, State of the Art in Global Optimization: Computational Methods and Applications. Kluwer Academic Publishers, 1995.
29. Gerhard Reinelt. TSPLIB. 4 May 2003 http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/.
30. Masatoshi Sakawa, Kosuke Kato, and Toshihiro Shibano. An interactive fuzzy satisficing method for multiobjective multidimensional 0-1 knapsack problems through genetic algorithms. In Proceedings of the 1996 International Conference on Evolutionary Computation (ICEC'96), pages 243-246, 1996.
31. Kwang Mong Sim and Weng Hong Sun. Multiple ant-colony optimization for network routing. In First International Symposium on Cyber Worlds (CW'02), volume 2241, pages 277-281. IEEE, 2002.
32. G. Skorobohatyj. MP-Testdata. 20 May 2003 http://elib.zib.de/pub/Packages/mp-testdata/.
33. Eric D. Taillard. Comparison of iterative searches for the quadratic assignment problem. Location Science, 3:87-105, 1995.
34. Ke Xu. BHOSLIB: Benchmarks with hidden optimum solutions for graph problems. 8 June 2004 http://www.nlsde.buaa.edu.cn/~kexu/benchmarks/graph-benchmarks.htm.
35. Eckart Zitzler. Evolutionary Algorithms for Multiobjective Optimization: Methods and Applications. PhD thesis, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland, November 1999.
36. Eckart Zitzler, Marco Laumanns, and Lothar Thiele. SPEA2: Improving the Strength Pareto Evolutionary Algorithm. Technical Report 103, Computer Engineering and Networks Laboratory (TIK), Swiss Federal Institute of Technology (ETH) Zurich, Gloriastrasse 35, CH-8092 Zurich, Switzerland, May 2001.
37. Eckart Zitzler and Lothar Thiele. Multiobjective evolutionary algorithms: A comparative case study and the Strength Pareto approach. IEEE Transactions on Evolutionary Computation, 3(4):257-271, November 1999.
38. Jesse Zydallis. Explicit Building-Block Multiobjective Genetic Algorithms: Theory, Analysis, and Development. PhD thesis, Air Force Institute of Technology, Wright-Patterson AFB, OH, March 2003.
CHAPTER 20

DESIGN OF FLUID POWER SYSTEMS USING A MULTI OBJECTIVE GENETIC ALGORITHM
Johan Andersson
Department of Mechanical Engineering, Linköping University
SE-581 83 Linköping, Sweden
E-mail: [email protected]

Within this chapter the multi-objective struggle genetic algorithm is employed to support the design of hydraulic actuation systems. Two concepts, a valve-controlled and a pump-controlled system for hydraulic actuation, are evaluated using Pareto optimization. The actuation systems are analyzed using comprehensive dynamic simulation models to which the optimization algorithm is coupled. The outcome of the Pareto optimization is a set of Pareto optimal solutions, which allows visualization of the trade-off between the objectives. Both systems are optimized, resulting in two Pareto fronts that visualize the trade-off between system performance and system cost. By comparing the two Pareto fronts, it can be seen under which preferences a valve system is to be preferred to a pump system; thus, optimization is employed in order to support concept selection. Furthermore, general design problems usually constitute a mixture of determining continuous parameters and selecting individual components from catalogs or databases. The optimization is therefore extended to handle a mixture of continuous parameters and discrete selections from catalogs. The valve-controlled system is studied again, this time with cylinders and valves arranged in hierarchical catalogs, resulting in a discrete Pareto optimal front.

20.1. Introduction

Design is an iterative feedback process in which the performance of the system is compared with the specification; see for example Pahl and Beitz12 and Rozenburg and Eekels13. Usually this is a manual process where the designer builds a prototype system, which is tested and modified until satisfactory. With the help of a simulation model, prototyping can be reduced to a minimum. If the desired behavior of the system can be described as a function of the design parameters and the simulation results, it is possible to introduce optimization as a tool to further help the designer reach an optimal solution. A design process that comprises simulation and optimization is presented in Andersson1 and depicted in Figure 20.1 below.
Fig. 20.1. A system design process including simulation and optimization.
The 'problem definition' in Figure 20.1 results in a requirements list, which is used to generate different solution principles/concepts. Once the concepts have reached a sufficient degree of refinement, modeling and simulation are employed to predict the properties of particular system solutions. Each solution is evaluated with the help of an objective function, which acts as a figure of merit. Optimization is then employed to automate the evaluation of system solutions and to generate new system proposals. The process continues until the optimization has converged and a set of optimal systems is found. One part of optimization is the evaluation of design proposals; the other is the generation of new and hopefully better designs. Thus, optimization consists of both analysis (evaluation) and synthesis (generation of new solutions). Often the first optimization run does not produce the final design. If the optimization does not converge to a desired system, the concept has to be modified or the problem reformulated, which results in new objectives. In Figure 20.1 this is visualized by the two outer loops back to 'generation of solution principles' and 'problem definition' respectively. Naturally, the activity 'generation of solution principles' produces a number of conceivable concepts, each of which is optimized. Each concept is thus brought to maximum performance, and optimization thereby provides a solid basis for concept selection. This will be illustrated later in a study of hydraulic actuation systems.

One essential aspect of using modeling and simulation is to understand the system we are designing. The other aspect is to understand our expectations on the system, and our priorities among the objectives. Both aspects are equally important. It is essential to engineering design to manage the dialog between specification and prototype. Often simulations confirm that what we wish for is unrealistic or ill-conceived; conversely, they can also reveal that our wishes are not imaginative enough. However, engineering design problems are often characterized by the presence of several conflicting objectives. When using optimization to support engineering design, these objectives are usually aggregated into one overall objective function, and optimization is then conducted with one optimal design as the result. Another way of handling multiple objectives is to employ the concept of Pareto optimality. The outcome of a Pareto optimization is a set of Pareto optimal solutions, which visualizes the trade-off between the objectives. In order to choose the final design, the decision-maker then has to trade the competing objectives against each other. General design problems also consist of a mixture of determining continuous parameters and selecting individual components from catalogs or databases. Thus, an optimization strategy suited to engineering design problems has to be able to handle a mixture of continuous parameters and discrete selections of components from catalogs. This chapter continues by describing a nomenclature for the general multi-objective design problem. Thereafter, multi-objective genetic algorithms are discussed and the proposed multi-objective struggle GA is described together with the genetic operators used.
The optimization method is then connected to the HOPSAN simulation program and applied to support the design of two concepts for hydraulic actuation; it is thereby shown how optimization can be employed to support concept selection. The simulation model is then extended to include component catalogs for valves and cylinders, the optimization strategy is modified accordingly, and the problem is solved as a mixed discrete/continuous optimization problem.

20.2. The Multi-Objective Optimization Problem

A general multi-objective design problem is expressed by Equation (B.1), where f_1(x), f_2(x), ..., f_k(x) are the k objective functions, (x_1, x_2, ..., x_n) are the n optimization parameters, and S ⊆ R^n is the solution or parameter space. Obtainable objective vectors, {F(x) | x ∈ S}, are denoted by Y. Y ⊆ R^k is usually referred to as the attribute space, and ∂Y is the boundary of Y. For a general design problem, F is non-linear and multi-modal, and S might be defined by non-linear constraints containing both continuous and discrete member variables.

min F(x) = [f_1(x), f_2(x), ..., f_k(x)]
s.t. x ∈ S,  x = (x_1, x_2, ..., x_n)    (B.1)

The Pareto subset of ∂Y is of particular interest to the rational decision-maker. The Pareto set is defined by Equation (B.2). Considering a minimization problem and two solution vectors x, y ∈ S, x is said to dominate y, denoted x ≻ y, if:

∀i ∈ {1, 2, ..., k}: f_i(x) ≤ f_i(y)  and  ∃j ∈ {1, 2, ..., k}: f_j(x) < f_j(y)    (B.2)

If the final solution is selected from the set of Pareto optimal solutions, there exists no solution that is better in all attributes. It is clear that any final design solution should preferably be a member of the Pareto optimal set: if a solution is not in the Pareto optimal set, it can be improved without degradation in any of the objectives, and thus it is not a rational choice, as long as the selection is based on the objectives only. The presented nomenclature is visualized in Figure 20.2 below.

20.3. Multi-Objective Genetic Algorithms

Genetic algorithms are modeled after the mechanisms of natural selection. Each optimization parameter x_i is encoded by a gene using an appropriate representation, such as a real number or a string of bits. The corresponding genes for all parameters x_1, ..., x_n form a chromosome capable of describing an individual design solution. A set of chromosomes representing several individual design solutions comprises a population, in which the most fit individuals are selected to reproduce. Mating is performed using crossover to combine genes from different parents to produce children. The children are inserted into the population and the procedure starts over again, thus creating an artificial Darwinian environment. For a general introduction to genetic algorithms, see the work by Goldberg8.
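The dominance relation of Equation (B.2) translates directly into code; a minimal sketch for a minimization problem, operating on objective vectors of equal length, is:

```python
def dominates(x, y):
    """Pareto dominance (Equation B.2), minimization sense:
    x dominates y if x is no worse than y in every objective
    and strictly better in at least one."""
    return (all(a <= b for a, b in zip(x, y))
            and any(a < b for a, b in zip(x, y)))

print(dominates((1, 2), (2, 2)))  # True: better in the first objective
print(dominates((1, 3), (2, 2)))  # False: the two vectors are incomparable
```

Note that dominance is only a partial order: two vectors can be mutually non-dominating, which is precisely why the outcome of a Pareto optimization is a set of solutions rather than a single optimum.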
Design of Fluid Power Systems Using a Multi Objective Genetic Algorithm
Fig. 20.2. Solution and attribute space nomenclature for a problem with two design variables and two objectives.
When the population of an ordinary genetic algorithm is evolving, it usually converges to one optimal point. It is however tempting to adjust the algorithm so that it spreads the population over the entire Pareto optimal front instead. As this idea is quite natural, there are many different types of multi-objective genetic algorithms. For a review of genetic algorithms applied to multi-objective optimization, readers are referred to the work done by Deb3. Literature surveys and comparative studies on multi-objective genetic algorithms are also provided by several other authors, see for example Coello4, Horn10 and Zitzler and Thiele15.

20.3.1. The Multi-Objective Struggle GA
In this chapter the multi-objective struggle genetic algorithm (MOSGA)1,3 is used for the Pareto optimization. MOSGA combines the struggle crowding genetic algorithm presented by Grueninger and Wallace9 with Pareto-based ranking as devised by Fonseca and Fleming7. As there is no single objective function to determine the fitness of the different individuals in a Pareto optimization, the ranking scheme presented by Fonseca and Fleming is employed, and the "degree of dominance" in attribute space is used to rank the population. Each individual is given a rank based on the number of individuals in the population that are preferred to it, i.e. for each individual the algorithm loops through the whole population counting the number of preferred individuals. "Preferred to" is implemented in a strict Pareto sense, according to equation (B.2), but one could also combine Pareto optimality with the satisfaction of objective goal levels, as discussed in ref. 7. The principle of the MOSGA algorithm is outlined below.
Johan Andersson
Step 1: Initialize the population.
Step 2: Select parents using uniform selection, i.e. each individual has the same probability of being chosen.
Step 3: Perform crossover and mutation to create a child.
Step 4: Calculate the rank of the new child.
Step 5: Find the individual in the entire population that is most similar to the child. Replace that individual with the new child if the child's ranking is better, or if the child dominates it.
Step 6: Update the ranking of the population if the child has been inserted.
Step 7: Perform steps 2-6 according to the population size.
Step 8: If the stop criterion is not met, go to step 2 and start a new generation.

Step 5 implies that the new child is only inserted into the population if it dominates the most similar individual, or if it has a lower ranking, i.e. a lower "degree of dominance". Since the ranking of the population does not consider the presence of the new child, it is possible for the child to dominate an individual and still have the same ranking. This restricted replacement scheme counteracts genetic drift and is the only mechanism needed in order to preserve population diversity. Furthermore, it does not need any specific parameter tuning. The replacement scheme also constitutes an extreme form of elitism, as the only way of replacing a non-dominated individual is to create a child that dominates it. The similarity of two individuals is measured using a distance function. The method has been tested with distance functions based upon the Euclidean distance in both attribute and parameter space. A mixed distance function combining both the attribute and parameter distance has been evaluated as well. The results presented here were obtained using an attribute-based distance function. An inherent property of the crowding method is the capability to identify and maintain multiple Pareto fronts, i.e. global and local Pareto fronts in multi-modal search spaces, see refs. 1, 2, 3.
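The loop above can be sketched in Python. This is a minimal illustration, not the chapter's implementation: `evaluate`, `crossover_mutate` and `distance` stand in for the problem-specific operators described elsewhere in the chapter, and all names are illustrative.

```python
import random

def dominates(a, b):
    """Pareto dominance per equation (B.2), for minimization."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def rank(objs, all_objs):
    """Fonseca-Fleming rank: number of population members preferred to objs."""
    return sum(dominates(other, objs) for other in all_objs)

def mosga(init_pop, evaluate, crossover_mutate, distance, generations):
    pop = [(ind, evaluate(ind)) for ind in init_pop]    # (genome, objectives)
    for _ in range(generations):                        # Step 8: new generation
        for _ in range(len(pop)):                       # Step 7: once per member
            (ma, _), (pa, _) = random.sample(pop, 2)    # Step 2: uniform selection
            child = crossover_mutate(ma, pa)            # Step 3
            c_objs = evaluate(child)
            objs = [o for _, o in pop]
            c_rank = rank(c_objs, objs)                 # Step 4
            # Step 5: locate the most similar individual (smallest distance)
            i = min(range(len(pop)), key=lambda j: distance(child, pop[j][0]))
            if c_rank < rank(pop[i][1], objs) or dominates(c_objs, pop[i][1]):
                pop[i] = (child, c_objs)                # restricted replacement
    return pop
```

On a toy bi-objective problem, e.g. minimizing (x², (x − 2)²) over a real-valued genome, the surviving population tends to spread along the Pareto set x ∈ [0, 2].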
In real-world applications, only parts of the true problem can be reflected in the formulation of the optimization problem. Therefore it is valuable to know about the existence of local optima, as they might possess other properties, such as robustness, that are important to the decision-maker but not reflected in the objective functions.
In single-objective optimization, niching techniques have been introduced in order to facilitate the identification of both global and local optima. As can be seen from the description of the method, there are no algorithm parameters that have to be set by the user. The only inputs are: population size, number of generations, genome representation, and crossover and mutation methods, as in every genetic algorithm.

20.3.2. Genome Representation
The genome encodes design variables in a form suitable for the GA to operate upon. Design variables may be values of parameters (real or integer) or represent individual components selected from catalogs or databases. Thus, the genome is a hybrid list of real numbers (for continuous parameters), integers and references to catalog selections, see Figure 20.3. A catalog could be either a straight list of elements, or the elements could be arranged in a hierarchy. Each element of a catalog represents an individual component. The characteristics of catalogs will be discussed further on and exemplified by the design example.

Fig. 20.3. Example of the genome encoding. The first two elements represent real variables (4.237 and 6.87e-3) and the last two elements catalog selections (the 12th element of the 1st catalog, and element 37).
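The hybrid genome of Figure 20.3 can be sketched as a plain list mixing real parameters and catalog references. The class and field names below are illustrative, not the chapter's implementation, and the catalog index of the last gene is an assumption.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class CatalogSelection:
    catalog: int  # index of the (sub-)catalog the component is taken from
    element: int  # position of the chosen element within that catalog

# A gene is either a real-valued parameter or a catalog reference.
Gene = Union[float, CatalogSelection]

# The example genome of Figure 20.3: two real variables followed by
# two catalog selections ("12th element, 1st catalog", then element 37).
genome: List[Gene] = [
    4.237,
    6.87e-3,
    CatalogSelection(catalog=1, element=12),
    CatalogSelection(catalog=2, element=37),  # catalog index assumed
]
```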
20.3.3. Similarity Measures

Speciating GAs require a measure of likeness between individuals, a so-called similarity measure. The similarity measure is usually based on a distance function that calculates the distance between two genomes. The similarity could be based on the distance in either the attribute space (between the objectives), the phenotype space (between the design parameters) or the genotype space (in the genome encoding). As direct encoding is used (not a conversion to a string of bits), a phenotype and a genotype distance function would yield the same result. It is shown in refs. 2 and 3
that the choice between an attribute-based and a parameter-based distance function might have a great influence on the outcome of the optimization. To summarize: an attribute space distance measure gives fast and precise convergence on the global Pareto optimal front, whereas a parameter-based distance function does not converge as fast, but has the advantage of identifying and maintaining both global and local Pareto optimal fronts.

20.3.3.1. Attribute Based Distance Function

One way of comparing two individual designs is to calculate their distance in attribute space. As we want the population to spread evenly over the Pareto front (in attribute space), it seems a good idea to use an attribute-based distance measure. The distance between two solutions (genomes) in attribute space is calculated using the normalized Euclidean distance, see equation (C.1).
\[ \mathrm{Distance}(a,b) = \sqrt{\frac{1}{k}\sum_{i=1}^{k}\left(\frac{f_i^{a}-f_i^{b}}{f_i^{\max}-f_i^{\min}}\right)^{2}} \tag{C.1} \]
Where f_i^a and f_i^b are the objective values of the i-th objective for a and b respectively, f_i^max and f_i^min are the maximum and the minimum of the i-th objective in the current population, and k is the number of objectives. Thus, the distance function will vary between 0, indicating that the individuals are identical, and 1 for the very extremes.

20.3.3.2. Phenotype Based Distance Function

Another way of calculating the distance between solutions is to use the distance in parameter (phenotype) space. As the genome might be a hybrid mixture of real numbers and catalog selections, we have to define different distance functions to work on different types of elements. The methods described here build on the framework presented by Senin et al.14. In order to obtain the similarity between two individuals, the distance between each design variable is calculated. The overall similarity is then obtained by summing up the distances for each design variable.

20.3.3.3. Real Number Distance

A natural distance measure between two real numbers is the normalized Euclidean distance, see equation (C.2).
\[ \mathrm{Distance}(a,b) = \sqrt{\left(\frac{a-b}{\text{max distance}}\right)^{2}} \tag{C.2} \]
Where a and b are the values of the two real numbers and max distance is the maximum possible distance between them (i.e. given by the search boundaries).

20.3.3.4. Catalog Distance

The distance between two catalog selections can be measured through relative positions in a catalog or a catalog hierarchy. The relative position is only meaningful if the catalog is ordered, see Figure 20.4.
Fig. 20.4. Examples of ordered and unordered catalogs.
The dimensionless distance between two elements within the same catalog is expressed by equation (C.3) and exemplified in Figure 20.5.

\[ \mathrm{Distance}(a,b) = \frac{|\,pos(a)-pos(b)\,|}{\text{max distance}} \tag{C.3} \]

Fig. 20.5. Distance evaluation for two elements of an ordered catalog.
For catalog hierarchies, equation (C.3) has to be generalized, as exemplified in Figure 20.6. For elements belonging to the same sub-catalog, the distance is evaluated using the relative position within that sub-catalog. Otherwise, the maximum length of the path connecting the different sub-catalogs is used. This implies that, for two given sub-catalogs, an element in one catalog is equally distant from every element in the other catalog. The length of the path is calculated as the maximal distance within the smallest common hierarchy. In both cases, the distance is normalized by dividing by the maximum distance (i.e. the catalog size).
Fig. 20.6. Exemplification of distances between different catalog elements in a hierarchical catalog.
20.3.3.5. Overall Distance So far, distance measures for individual design variables have been developed. An overall distance measure for comparing two genomes is obtained by aggregating the distances for the individual design variables, see equation C.4.
\[ \mathrm{Distance}(a,b) = \sum_{i=1}^{n}\mathrm{Distance}(DV_i) \tag{C.4} \]
Where a and b are the two designs being compared, and n is the number of design variables (DV) encoded by the genome. Thus, the phenotype distance between two individual designs is calculated by summing up the individual distances for each element of the genome.
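Equations (C.1)-(C.4) can be collected into a small sketch. All names are illustrative; the 1/k normalization in the attribute distance is an assumption based on the stated [0, 1] range, and the hierarchical-catalog path length of Figure 20.6 is omitted for brevity.

```python
def attribute_distance(fa, fb, fmin, fmax):
    """Equation (C.1): normalized Euclidean distance in attribute space,
    assuming the 1/k factor implied by the stated [0, 1] range."""
    k = len(fa)
    return (sum(((a - b) / (hi - lo)) ** 2
                for a, b, lo, hi in zip(fa, fb, fmin, fmax)) / k) ** 0.5

def real_distance(a, b, max_distance):
    """Equation (C.2): normalized distance between two real design variables."""
    return abs(a - b) / max_distance

def catalog_distance(pos_a, pos_b, max_distance):
    """Equation (C.3): relative positions within one ordered catalog,
    normalized by the catalog size."""
    return abs(pos_a - pos_b) / max_distance

def overall_distance(per_variable_distances):
    """Equation (C.4): sum over the n design variables of the genome."""
    return sum(per_variable_distances)
```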
20.3.4. Crossover Operators
As the genome is a hybrid mix of continuous variables and catalog selections, we define different operators to work on different types of elements. Uniform crossover is used, which implies that each element of the father's genome is crossed with the corresponding element from the mother's genome. For real numbers, BLX crossover6 is used, see the exemplification in Figure 20.7. For catalog selections, an analogous crossover scheme is employed, as illustrated in Figure 20.8.
Fig. 20.7. The outcome of a BLX crossover between two real numbers a and b is randomly selected from an interval of width 2d centered on the average M.
Fig. 20.8. An exemplification of the catalog crossover. The outcome of a crossover of individuals within the same catalog (a and b) are randomly selected from the interval between them. For individuals from different sub-catalogs (c and d) the outcome is randomly selected within the smallest common hierarchy.
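The two operators can be sketched as follows. This is an illustrative sketch: the interval-extension factor `alpha` is an assumed parameter of the BLX scheme, and the different-sub-catalog case of Figure 20.8 (selection within the smallest common hierarchy) is omitted.

```python
import random

def blx_crossover(a, b, alpha=0.5):
    """BLX crossover (cf. Figure 20.7): the child is drawn uniformly from
    the parents' interval, extended by alpha * |a - b| on each side."""
    lo, hi = min(a, b), max(a, b)
    d = alpha * (hi - lo)
    return random.uniform(lo - d, hi + d)

def catalog_crossover(pos_a, pos_b):
    """Catalog crossover (cf. Figure 20.8), same-sub-catalog case: the child
    is an element randomly selected from the interval between the parents."""
    lo, hi = min(pos_a, pos_b), max(pos_a, pos_b)
    return random.randint(lo, hi)
```

For parents in different sub-catalogs, the outcome would instead be drawn from within the smallest common hierarchy, as Figure 20.8 illustrates.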
20.4. Fluid Power System Design The objects of study are two different concepts of hydraulic actuation systems. Both systems consist of a hydraulic cylinder that is connected to a mass of 1000 kilograms. The objective is to follow a pulse in the position command with a small control error and simultaneously obtain low energy consumption. Naturally, these two objectives are in conflict with each other as low control error implies large acceleration which consumes more energy. The problem is thus to minimize both the control error and the energy consumption from a Pareto optimal perspective.
Fig. 20.9. The valve concept for hydraulic actuation.
Two different ways of controlling the cylinder are studied. In the first, more conventional system, the cylinder is controlled by a directional valve, which is powered from a constant pressure system. In the second concept, the cylinder is controlled by a servo pump. Thus, the systems have different properties. The valve concept is well suited for achieving a low control error, as the valve has a small mass and thus a very high bandwidth. On the other hand, the valve system is associated with higher losses, as the valve constantly throttles fluid to the tank. The different concepts have been modeled in the simulation package HOPSAN, see ref. 11. The system models are depicted in Figures 20.9 and 20.10 respectively.
The models of each component of the systems consist of a set of algebraic and differential equations considering effects such as friction, leakage and non-linearities, for example limited stroke distances and stroke speeds. HOPSAN uses a distributed simulation technique where each component contains its own numerical solver. The components are then connected using transmission line elements, as described in ref. 11. The distributed simulation technique has the advantage that the components are numerically separated from each other, which promotes stability. Furthermore, the computational time grows linearly with the size of the problem, which is not true for centralized solvers. The HOPSAN simulation software can be freely downloaded from the web. The valve system consists of the mass and the hydraulic cylinder, the directional valve and a P-controller to control the motion. The directional valve is powered by a constant pressure pump and an accumulator, which keeps the system pressure at a constant level. The optimization parameters are the sizes of the cylinder, the valve and the pump, the pressure level, and the feedback gain. Furthermore, a leakage parameter is added to both systems in order to guarantee sufficient damping. Thus, this problem consists of six optimization parameters and two objectives.
Fig. 20.10. The pump concept of hydraulic actuation.
The pump concept contains fewer components: the cylinder and the mass, the controller and the pump. A second order low-pass filter is added in order to model the dynamics of the pump. The pump system consists of only four optimization parameters. The performance of a relatively fast
pump system is depicted in Figure 20.11.
Fig. 20.11. Typical pulse response for a pump system.
20.4.1. Optimization Results

Both systems were optimized in order to simultaneously minimize the control error f_1 and the energy consumption f_2. The control error is obtained by integrating the absolute value of the control error and adding a penalty for overshoots, see equation (D.1). The energy consumption is calculated by integrating the hydraulic power, expressed as the pressure times the flow, see equation (D.2).

\[ f_1 = \int_0^4 |x_{ref} - x|\,dt + \alpha\left(\int_0^2 (x > x_{ref})\,dt + \int_2^4 (x < x_{ref})\,dt\right) \tag{D.1} \]

\[ f_2 = \int_0^4 q_{pump} \cdot p_{pump}\,dt \tag{D.2} \]
The optimization is conducted with a population size of 30 individuals over 200 generations. The parameters are real encoded and BLX crossover is used to produce new offspring, and the Euclidean distance in attribute space was used as the similarity measure.
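On a sampled simulation trajectory, the two objectives can be approximated numerically. This is a sketch under assumptions: the function and variable names are illustrative (not HOPSAN's interface), and the overshoot windows before and after an assumed command step-down at t = 2 s follow one plausible reading of equation (D.1).

```python
def control_error(t, x, x_ref, alpha=1.0, t_switch=2.0):
    """Objective f1, cf. equation (D.1): integrated absolute control error
    plus a penalty proportional to the time spent overshooting the command."""
    dt = t[1] - t[0]                      # assume a uniform time grid
    err = sum(abs(r - xi) for xi, r in zip(x, x_ref)) * dt
    over = sum(dt for ti, xi, r in zip(t, x, x_ref)
               if ti < t_switch and xi > r)
    under = sum(dt for ti, xi, r in zip(t, x, x_ref)
                if ti >= t_switch and xi < r)
    return err + alpha * (over + under)

def energy_consumption(t, q_pump, p_pump):
    """Objective f2, cf. equation (D.2): integral of flow times pressure."""
    dt = t[1] - t[0]
    return sum(q * p for q, p in zip(q_pump, p_pump)) * dt
```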
As a Pareto optimization searches for all non-dominated individuals, the final population will contain individuals with a very high control error, as they have low energy consumption. It is possible to obtain an energy consumption close to zero, if the cylinder does not move at all. However, these solutions are not of interest, as we want the system to follow the pulse. Therefore, a goal level/constraint on the control error is introduced. The optimization strategy is modified so that solutions below the goal level on the control error are always preferred to solutions that are above it regardless of their energy consumption. In this manner, the population is focused on the relevant part of the Pareto front. The obtained Pareto optimal fronts for both systems are depicted in Figure 20.12. In order to achieve fast systems, and thereby low control errors, large pumps and valves are chosen by the optimization strategy. A large pump delivers more fluid, which enables higher speed of the cylinder. However, bigger components consume more energy, which explains the shape of the Pareto fronts.
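The modified preference relation can be sketched as a small helper (illustrative names; the goal level is applied to the first objective, the control error, with plain Pareto dominance as the fallback):

```python
def preferred(fa, fb, goal):
    """Return True if objective vector fa is preferred to fb (minimization).
    A solution meeting the control-error goal (first objective) always beats
    one that does not; otherwise plain Pareto dominance, equation (B.2)."""
    a_ok, b_ok = fa[0] <= goal, fb[0] <= goal
    if a_ok != b_ok:
        return a_ok                      # the goal-satisfying one wins outright
    return all(x <= y for x, y in zip(fa, fb)) and \
           any(x < y for x, y in zip(fa, fb))
```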
Fig. 20.12. Pareto fronts showing the trade-off between energy consumption and control error for the two concepts. The graph on the right shows a slow pulse response, whereas the graph on the left shows a fast pulse response.
When the Pareto fronts for different concepts are drawn within the same graph, as in Figure 20.12, an overall Pareto optimal front could be obtained by identifying the non-dominated set from all Pareto optimal solutions obtained. It is then evident that the final design should preferably be on the overall Pareto front, which elucidates when it is rational to switch between concepts. The servo pump system consumes less energy and is preferred if a control error larger than 0.05 ms is acceptable. The servo valve system is fast but consumes more energy. If a lower control error than 0.05 ms is desired, the final design should preferably be a servo valve system. In order to choose the final design, the decision-maker has to select a concept and then study the trade-off between the control error and the energy consumption and select a solution point on the Pareto front. This application shows how Pareto optimization can be employed to support concept selection, by visualizing the pros and cons of each concept.
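Identifying the overall front from the concept fronts amounts to a non-dominated filter over their union, which can be sketched as follows (illustrative, minimization):

```python
def dominates(q, p):
    """Equation (B.2): q dominates p in a minimization sense."""
    return all(qi <= pi for qi, pi in zip(q, p)) and \
           any(qi < pi for qi, pi in zip(q, p))

def overall_pareto_front(*fronts):
    """Merge the concept fronts and keep only globally non-dominated points,
    giving the overall Pareto front discussed above."""
    pts = [p for front in fronts for p in front]
    return [p for p in pts if not any(dominates(q, p) for q in pts)]
```

Applied to the two concept fronts, the points that survive the filter indicate where it is rational to switch between the valve and pump concepts.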
20.5. Mixed Variable Design Problem

Real design problems usually involve a mixture of determining continuous parameters and selecting existing components from catalogs or databases, see Senin et al.14. Therefore, the multi-objective genetic algorithm has been extended to handle a mixture of continuous variables as well as discrete catalog selections. The object of study for the mixed variable design problem is the valve actuation system depicted in Figure 20.9. The objective is again to design a system with good controllability, but this time at low cost. When designing the system, cylinders and valves are selected from catalogs of existing components. To achieve good controllability we can choose a fast servo valve, which is more expensive than a slower proportional valve. Therefore, there is a trade-off between cost and controllability. The cost for a particular design is composed of the cost of the individual components as well as the cost induced by the energy consumption. Other parameters such as the control parameter, the leakage coefficient and the pump size have to be determined as well. Thus the problem is multi-objective with two objectives and five optimization variables, of which two are discrete catalog selections and three are continuous variables. For this optimization the pressure level is not an optimization parameter, as it is determined by the choice of the cylinder.
20.5.1. Component Catalogs For the catalog selections, catalogs of valves and cylinders have been included in the HOPSAN simulation program. For the directional valve, the choice is between a slow but inexpensive proportional valve or an expensive and fast servo valve. Valves from different suppliers have been arranged in two ordered sub-catalogs as depicted in Figure 20.13. The same structure applies to the cylinders as they are divided into sub-catalogs based on their maximum pressure level. The pressure in the system has to be controlled so that the maximum pressure for the cylinder is not exceeded. A low-pressure system is cheaper but has inferior performance compared to a high-pressure system. Each catalog element contains a complete description of that particular component, i.e. the parameters that describe the dynamics of the component, which is needed by the simulation model as well as information on cost and weight etc.
Fig. 20.13. The catalog of directional valves is divided into proportional valves and servo valves. Each sub-catalog is ordered based on the valve size. For each component, a set of parameters describing the component is stored together with information on cost and weight.
20.5.2. Optimization Results
The system has been optimized using a population of 40 individuals over 400 generations. In order to limit the Pareto front, a goal level on the control error was introduced for this problem as well. The results can be divided into three distinct regions depending on valve type and pressure level, see Figure 20.14. As can be seen from Figure 20.14, there is a trade-off between system performance (control error) and system cost. By accepting a higher cost, better performance can be achieved. The cheapest designs consist
Fig. 20.14. Optimization results. In (a) the obtained Pareto optimal front is shown in the objective space. Different regions have been identified based on valve and cylinder selections, which is shown in the parameter space in (b).
of small proportional valves and low-pressure cylinders. By choosing larger proportional valves and high-pressure cylinders, the performance can be increased at the expense of higher cost. If still better performance is desired, a servo valve has to be chosen, which is more expensive but has better dynamics. The continuous parameters, such as the control parameter, tend to smooth out the Pareto front. For a given valve and cylinder, different settings of the continuous parameters affect the pulse response. A faster response results in a lower control error, but also a higher energy consumption and thereby higher cost. Therefore, there is a local trade-off between cost and performance for each catalog selection.

20.6. Discussion and Conclusions

Modelling and simulation are very powerful tools that can support the engineering design process and facilitate a better and deeper understanding of the systems being developed. When an optimization strategy is connected to the simulation model, the knowledge acquisition process can be sped up further, as the optimization searches through the simulation model in an efficient manner. Furthermore, the optimization frequently identifies loopholes and shortcomings of the model, since it is unbiased in its search for an optimal design. As a system designer, or model developer, it is hard to conduct as thorough an inspection of the model as the optimization does. Thus even more information can be gathered from the simulation models if they are combined with an optimization strategy. As has been shown in this chapter, optimization also elucidates how the preferences among the objectives impact the final design. Thus optimization facilitates the understanding of the system being developed as well as of our expectations on the system and our priorities among the objectives. In this chapter the multi-objective struggle genetic algorithm is connected to the HOPSAN simulation program in order to support the design of fluid power systems. The method has been applied to two concepts of hydraulic actuation systems: a valve-controlled system and a pump-controlled system, which have been modeled in the HOPSAN simulation environment. Both systems were optimized in order to minimize the control error and the energy consumption. Naturally, these two objectives are in conflict with each other, and thus the resulting Pareto fronts visualize the trade-off between control error and energy consumption for each concept. The existence of the trade-off was known beforehand, but with the support of the Pareto optimization the trade-off could be quantified, and the performance of designs at different regions of the Pareto front could be visualized in order to point out the effects of the trade-off. When the Pareto optimal fronts for different concepts are drawn in the same graph, the advantages of the concepts are clearly elucidated. An overall Pareto optimal front can be obtained by identifying the non-dominated set from all Pareto optimal fronts. The rational choice is naturally to select the final design from this overall Pareto optimal set. Thus the decision-maker is advised which concept to choose depending on his or her preferences, and hence Pareto optimization can be a valuable support for concept selection. In this application it was recognized that the concepts had different properties, i.e.
one concept is faster but consumes more energy; however, it was not known under which preferences one concept was better than the other, i.e. where the Pareto fronts intersected. Therefore, Pareto optimization contributed to elucidating the benefits of the different concepts. The notion of an overall Pareto front, and thereby the support for concept selection, is one of the main contributions of this chapter. Subsequently, the method has been extended to handle the selection of individual components from catalogs, and thus the problem is transformed into a mixed discrete/continuous optimization problem. Component catalogs have therefore been added to the simulation program, where each catalog element contains all data needed by the simulation program as well as properties such as cost and weight. Furthermore, the GA has been extended with genomes with the ability to represent hierarchical catalogs as
well as operators for similarity measures and crossover between catalog elements. The valve-controlled system was again optimized, resulting in a discrete Pareto front that visualizes the trade-off between system cost and system performance based on discrete selections of valves and cylinders. For future work, the catalogs could be exchanged for databases, where each element could be extended to contain the entire simulation model for a particular component. These models could either be made by the system designer, or be provided by the supplier, in such a form that proprietary information is not jeopardized. In this way, the supplier does not only supply a component for the final system, but also the simulation model describing the component. Furthermore, optimization is transformed from being a system model operator to a system model creator.

References
1. Andersson J., Multiobjective Optimization in Engineering Design - Applications to Fluid Power Systems, Dissertation No. 675, Linköping Studies in Science and Technology, Linköping University, Linköping, Sweden, 2001.
2. Andersson J. and Krus P., "Multiobjective Optimization of Mixed Variable Design Problems", in Proceedings of the 1st International Conference on Evolutionary Multi-Criterion Optimization, Zitzler E. et al. (editors), Springer-Verlag, Lecture Notes in Computer Science No. 1993, pp. 624-638, 2001.
3. Andersson J. and Wallace D., "Pareto optimization using the struggle genetic crowding algorithm", Engineering Optimization, Vol. 34, No. 6, pp. 623-643, 2002.
4. Coello Coello C., An Empirical Study of Evolutionary Techniques for Multiobjective Optimization in Engineering Design, PhD thesis, Department of Computer Science, Tulane University, 1996.
5. Deb K., Multi-Objective Optimization using Evolutionary Algorithms, John Wiley and Sons Ltd, 2001.
6. Eshelman L. J. and Schaffer J. D., "Real-Coded Genetic Algorithms and Interval-Schemata", in Foundations of Genetic Algorithms 2, L. D. Whitley, Ed., San Mateo, CA, Morgan Kaufmann, pp. 187-202, 1993.
7. Fonseca C. M. and Fleming P. J., "Multiobjective optimization and multiple constraint handling with evolutionary algorithms - Part I: a unified formulation", IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, Vol. 28, pp. 26-37, 1998.
8. Goldberg D. E., Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading, 1989.
9. Grueninger T. and Wallace D., "Multi-modal optimization using genetic algorithms", Technical Report 96.02, CADlab, Massachusetts Institute of Technology, Cambridge, 1996.
10. Horn J., "Multicriterion decision making", in Handbook of Evolutionary Computation, T. Bäck, D. Fogel, and Z. Michalewicz, Eds., IOP Publishing Ltd and Oxford University Press, pp. F1.9:1 - F1.9:15, 1997.
11. Jansson A. and Krus P., Hopsan - a Simulation Package, User's Guide, Technical Report LITHIKPR-704, Dept. of Mech. Eng., Linköping University, Sweden, 1991. http://hydra.ikp.liu.se/hopsan.html.
12. Pahl G. and Beitz W., Engineering Design - A Systematic Approach, Springer-Verlag, London, 1996.
13. Roozenburg N. and Eekels J., Product Design: Fundamentals and Methods, John Wiley & Sons Inc, 1995.
14. Senin N., Wallace D. R., and Borland N., "Mixed continuous and discrete catalog-based design modeling and optimization", in Proceedings of the 1999 CIRP International Design Seminar, University of Twente, Enschede, The Netherlands, 1999.
15. Zitzler E. and Thiele L., "Multiobjective Evolutionary Algorithms: A Comparative Case Study and the Strength Pareto Approach", IEEE Transactions on Evolutionary Computation, Vol. 3, pp. 257-271, 1999.
CHAPTER 21 ELIMINATION OF EXCEPTIONAL ELEMENTS IN CELLULAR MANUFACTURING SYSTEMS USING MULTI-OBJECTIVE GENETIC ALGORITHMS
S. Afshin Mansouri Industrial Engineering Department Amirkabir University of Technology P.O.Box 15875-4413, Tehran, Iran E-mail:
[email protected] Cellular manufacturing is an application of group technology in order to exploit similarity of parts processing features in improvement of the productivity. Application of a cellular manufacturing system (CMS) is recommended for the mid-volume, mid-variety production environments where traditional job shop and flow shop systems are not technically and/or economically justifiable. In a CMS, collections of similar parts (part families) are processed on dedicated clusters of dissimilar machines or manufacturing processes (cells). A totally independent CMS with no intercellular parts movement can rarely be found due to the existence of exceptional elements (EEs). An EE is either a bottleneck machine allocated in a cell while is being required in the other cells at the same time, or a part in a family that requires processing capabilities of machines of the other cells. Despite the simplicity of production planning and control functions in a totally independent CMS, such an independence cannot be achieved without machine duplication and/or part subcontracting, which have their own side effects. These actions deteriorate some other performance aspects of the production system regarding cost, utilization and workload balance. In this chapter, tackling the EEs in a CMS is formulated as a multi-objective optimization problem (MOP) to simultaneously take into account optimization of four conflicting objectives regarding: intercellular movements, cost, utilization, and workload balance. Due to the complexity of the developed MOP, neither exact optimization techniques nor total enumeration are applicable for large problems. For this, a multi-objective genetic algorithm (MOGA) solution approach is proposed, which makes use of the non-dominated sorting idea in conjunction with an elitism scheme to provide the manufacturing system designers with a set of near Pareto-optimal solutions. Application 505
of the model and the solution approach to a number of test problems shows its suitability for real-world instances.
21.1. Introduction

The majority of manufacturing industries employ three basic designs to organize their production equipment: job shop, flow shop and cellular designs. A job shop process is characterized by the organization of similar equipment by function (such as milling, drilling, turning, forging, and assembly). As jobs flow from work centre to work centre, or department to department, a different type of operation is performed in each centre or department. Orders may follow similar or different paths through the plant, suggesting one or several dominant flows. The layout is intended to support a manufacturing environment in which there can be a great diversity of flow among products. Fig. 21.1 depicts a job shop design.
Fig. 21.1. A job shop design.
The flow shop is sometimes called a product layout because the products always flow through the same sequential steps of production. Fig. 21.2 shows a typical flow shop system.

Fig. 21.2. A flow shop design.

In a cellular manufacturing system, machines are divided into manufacturing cells, which are in turn dedicated to processing a group of similar parts called a part family. Cellular manufacturing strives to bring the benefits of mass production to high-variety, medium-to-low volume production. It has several benefits, such as reduced material handling, work-in-process inventory, setup time and manufacturing lead time, and simplified planning, routing and scheduling activities. Fig. 21.3 shows a cellular configuration. Each of the above-mentioned systems has its own rational range of application. Fig. 21.4 illustrates the relative positions of these systems in terms of production volume and product variety. Identification of part families and machine groups in the design of a CMS is commonly referred to as cell design/formation. Many solution approaches for the cell design problem have been proposed over the last three decades. Mansouri et al.1 and Offodile et al.2 provide comprehensive reviews of these approaches. There are occasions where not all of the machines/parts can be exclusively assigned to a machine cell/part family. These are known as Exceptional Elements (EEs). The EEs cause a number of problems in the operation of CMSs, e.g. intercellular part movements and imbalance of the workload across the cells. Dealing with the EEs has also been a subject of research.
Fig. 21.3. A cellular design.
Fig. 21.4. Relative position of the three manufacturing systems.
For instance, Logendran and Puvanunt3, Moattar-Husseini and Mansouri4, Shafer et al.5, Sule6 and Seifoddini7 develop solution approaches for this problem. Fig. 21.5(a) shows the initial machine-part incidence matrix for a problem with 4 machines and 6 parts. A "1" entry in the matrix indicates that there is a relationship between the associated part and machine, i.e. the part requires that particular machine in its process route. In Fig. 21.5(b), the sorted matrix is shown along with a decomposition scheme that separates all the machines and parts into two interdependent cells: Cell 1: {(M2, M1), (P1, P3, P6)}, and Cell 2: {(M4, M3), (P2, P4, P5)}, where M and P stand for Machine and Part, respectively. There are two exceptional parts (P1 and P4) and two exceptional (bottleneck) machines (M2 and M4) in the CMS proposed in Fig. 21.5(b).
Fig. 21.5. Initial and final machine-part incidence matrices.
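The decomposition of Fig. 21.5(b) can be checked mechanically. The following sketch flags a part as exceptional whenever its process route needs a machine outside its own cell, and collects those outside machines as the bottleneck set; the incidence data are an assumption, reconstructed from the chapter's numerical example.

```python
# Identifying exceptional elements (EEs) in the 4-machine, 6-part example.
# The incidence matrix below is an assumption reconstructed from the chapter.
incidence = {            # part -> machines required in its process route
    "P1": {"M1", "M2", "M4"},
    "P2": {"M3", "M4"},
    "P3": {"M1", "M2"},
    "P4": {"M2", "M3", "M4"},
    "P5": {"M3", "M4"},
    "P6": {"M1", "M2"},
}
cells = {                # cell -> (machines MC_k, part family HF_k)
    1: ({"M1", "M2"}, {"P1", "P3", "P6"}),
    2: ({"M3", "M4"}, {"P2", "P4", "P5"}),
}

def exceptional_elements(incidence, cells):
    """Return (exceptional parts, bottleneck machines)."""
    ep, bm = set(), set()
    for machines, parts in cells.values():
        for p in parts:
            outside = incidence[p] - machines  # machines needed outside own cell
            if outside:
                ep.add(p)
                bm |= outside
    return ep, bm

ep, bm = exceptional_elements(incidence, cells)
```

This reports P1 and P4 as exceptional parts and M2 and M4 as bottleneck machines, in agreement with Fig. 21.5(b).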
In deciding which parts to subcontract and which machines to duplicate, one should take into account the associated side effects, e.g. cost increment and utilization decrement. Any effort to decrease intercellular part movements (as a performance measure) may degrade other measures of performance. In other words, there are multiple objectives to be considered in tackling the EEs. Hence, the problem can be formulated as a multi-objective optimization problem (MOP). In this chapter, a MOP model is introduced for dealing with the EEs in a CMS, along with a MOGA-based solution approach to find locally non-dominated or near Pareto-optimal solutions. The remaining sections are organized as follows. An overview of multi-objective optimization is given in Section 2. Section 3 formulates the problem of dealing with the EEs as a MOP model. The developed MOGA-based solution approach for the model is introduced in Section 4, and subsequently its parameters are set in Section 5. Experiments on a number of test problems are conducted in Section 6. Finally, concluding remarks are summarized in Section 7.
21.2. Multiple Objective Optimization

A MOP can be defined as determining a vector of design variables within a feasible region to minimize a vector of objective functions that usually conflict with each other. Such a problem takes the form:

Minimize {f1(X), f2(X), ..., fm(X)} subject to g(X) ≤ 0   (B.1)

where X is the vector of decision variables, fi(X) is the ith objective function, and g(X) is a constraint vector. Usually, there is no single optimal solution to B.1, but rather a set of alternative solutions. These solutions are optimal in the wider sense that no other solutions in the search space are superior to them when all objectives are considered. A decision vector X is said to dominate a decision vector Y (also written as X ≻ Y) iff:

fi(X) ≤ fi(Y) for all i ∈ {1, 2, ..., m}, and   (B.2)

fi(X) < fi(Y) for at least one i ∈ {1, 2, ..., m}   (B.3)

There are various solution approaches for solving the MOP. Among the most widely adopted techniques are: sequential optimization, the ε-constraint method, the weighting method, goal programming, goal attainment, distance-based methods and direction-based methods. For a comprehensive study of these approaches, readers may refer to Szidarovszky et al.8. Evolutionary algorithms (EAs) seem particularly desirable for solving multi-objective optimization problems because they deal simultaneously with a set of possible solutions (the so-called population), which allows an entire set of Pareto-optimal solutions to be found in a single run of the algorithm, instead of having to perform a series of separate runs as in the case of traditional mathematical programming techniques. Additionally, EAs are less susceptible to the shape or continuity of the Pareto-optimal frontier, whereas these two issues are a real concern for mathematical programming techniques. However, EAs usually contain several parameters that need to be tuned for each particular application, which is in many cases highly time consuming. In addition, since EAs are stochastic optimizers, different runs tend to produce different results. Therefore, multiple runs of the same algorithm on a given problem are needed to describe its performance on that problem statistically. These are the most challenging issues in using EAs for solving MOPs. For a detailed discussion of the application of EAs to multi-objective optimization, see Coello et al.9 and Deb10.
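The dominance test of B.2 and B.3, and the non-dominated filtering built on it, can be stated in a few lines of code. A minimal sketch for a minimization MOP (the four objective vectors are illustrative only):

```python
# Pareto dominance (B.2)-(B.3) and a brute-force non-dominated filter.
def dominates(fx, fy):
    """fx dominates fy under minimization."""
    return (all(a <= b for a, b in zip(fx, fy))
            and any(a < b for a, b in zip(fx, fy)))

def pareto_front(points):
    """Keep only points not dominated by any other point."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

front = pareto_front([(1, 5), (2, 2), (3, 3), (5, 1)])  # (3, 3) is dominated
```

This brute-force filter costs O(n²) comparisons per population; MOGAs based on non-dominated sorting amortize the same test while ranking the population into successive fronts.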
21.3. Development of the Multi-Objective Model for Elimination of EEs

21.3.1. Assumptions

It is assumed that part subcontracting and machine duplication are the two possible alternatives for the elimination of EEs from a CMS, as proposed by Shafer et al.5. It is also assumed that partial subcontracting is not allowed; in other words, once it has been decided to subcontract a part, its whole demand must be supplied by subcontractors.

21.3.2. The Set of Decision Criteria

The following set of criteria is considered for the development of the MOP model:
• Minimizing intercellular parts movements,
• Minimizing total cost of machine duplication and part subcontracting,
• Minimizing under-utilization of machines in the system, and
• Minimizing deviations among the levels of the cells' utilization.

Among the above-mentioned objectives, minimizing intercellular parts movement is of special importance, as it is the key factor in making cells independent. However, any effort to reduce intercellular parts movement by means of machine duplication and part subcontracting increases cost, deteriorates the overall utilization of machinery, and unbalances the levels of utilization among the cells. The other objectives are considered to overcome these side effects.

21.3.3. Problem Formulation

21.3.3.1. Notation

Set of indices:
i: index for machine types, i = 1, ..., m
j: index for part types, j = 1, ..., p
k: index for cells, k = 1, ..., c

Decision variables:
Two binary decision variables are defined to formulate the problem:
xj = 1 if part j is subcontracted, and xj = 0 otherwise;
yi,k = 1 if machine i is duplicated in cell k, and yi,k = 0 otherwise.

Set of parameters:
Dj: annual demand for part j;
Sj: incremental cost of subcontracting a unit of part j;
ti,j: processing time of a unit of part j on machine i;
PMj,i: number of intercellular transfers required by part j as a result of machine type i not being available within the part's manufacturing cell;
Mi: annual cost of acquiring an additional machine i;
CMi: annual machining capacity of each unit of machine i (minutes);
HFk: set of parts assigned to cell k;
MCk: set of machines assigned to cell k;
GFk: set of parts assigned to cells other than k while requiring some of the machines in cell k;
BMk: set of the bottleneck machines required by the parts in cell k;
EPk: set of exceptional parts in cell k;
EMj: set of bottleneck machines required by the exceptional part j;
CSk: number of machines assigned to cell k;
MCS: maximum cell size;
c: number of cells;
UCk: utilization of cell k; and
OU: overall utilization of the CMS.

21.3.3.2. The Objective Functions

We define the solution vector X = (xj's, yi,k's), which consists of the binary decision variables. The objectives considered for dealing with the EEs are described as follows.

Objective 1: minimizing intercellular parts movement. Intercellular movement of parts is one of the major problems associated with the EEs in a CMS, which complicates production and inventory
management functions. Minimization of the intercellular parts movement is sought through the following objective function:

f1(X) = Σ_{k=1}^{c} Σ_{j∈EPk} (1 − xj) × ( Σ_{i∈EMj} PMj,i × (1 − yi,k) )   (C.1)

Objective 2: minimizing total cost of machine duplication and part subcontracting. Any reduction in intercellular parts movement by machine duplication and/or part subcontracting will result in a cost increment. Hence, minimization of the total part subcontracting and machine duplication cost is included in the model as follows:

f2(X) = Σ_{k=1}^{c} Σ_{j∈EPk} [ (Dj × Sj × xj) + Σ_{i∈EMj} (Mi × yi,k) ]   (C.2)

Objective 3: minimizing overall machine under-utilization. Since machine duplication and/or part subcontracting deteriorates machinery utilization, minimization of the overall machine under-utilization, which is equivalent to maximization of overall utilization, is taken into account employing the following objective function:

f3(X) = 1 − OU = 1 − [ Σ_{k=1}^{c} UCk × (CSk + Σ_{i∈BMk} yi,k) ] / [ Σ_{k=1}^{c} (CSk + Σ_{i∈BMk} yi,k) ]   (C.3)

Objective 4: minimizing deviations among the levels of the cells' utilization. To balance the workload across the cells, the deviation of each cell's utilization from the overall utilization of the system is minimized through the following objective function:

f4(X) = Σ_{k=1}^{c} (UCk − OU)²   (C.4)

The model is subject to cell size constraints, ensuring that machine duplication does not enlarge any cell beyond the maximum cell size:

CSk + Σ_{i∈BMk} yi,k ≤ MCS, for k = 1, ..., c

together with the binary restrictions on xj and yi,k.

As a numerical example, consider the CMS of Fig. 21.5(b). The data of its two cells are:
Cell 1: HF1 = {P1, P3, P6}, MC1 = {M2, M1}, GF1 = {P4}, BM1 = {M4}, EP1 = {P1}, EM1 = {M4}, CS1 = 2, PM1,4 = D1 = 10000.
Cell 2: HF2 = {P2, P4, P5}, MC2 = {M4, M3}, GF2 = {P1}, BM2 = {M2}, EP2 = {P4}, EM4 = {M2}, CS2 = 2, PM4,2 = D4 = 1000.
The decision vector for this problem is X = {x1, x4, y4,1, y2,2}. The objective functions and constraints of the problem are as follows:

Minimize {f1(X), f2(X), f3(X), f4(X)}   (C.9)
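Objectives C.1 and C.2 translate directly into code once the sets EPk and EMj and the parameters are stored as dictionaries. A sketch below uses the two-cell numerical example; the dictionary layout is an assumption.

```python
# Sketch of objectives (C.1) and (C.2) for the two-cell numerical example.
EP = {1: {"P1"}, 2: {"P4"}}                      # exceptional parts per cell
EM = {"P1": {"M4"}, "P4": {"M2"}}                # bottleneck machines per part
PM = {("P1", "M4"): 10000, ("P4", "M2"): 1000}   # intercellular transfers
D = {"P1": 10000, "P4": 1000}                    # annual demand
S = {"P1": 3, "P4": 1}                           # unit subcontracting cost
M = {"M4": 7000, "M2": 6000}                     # machine duplication cost

def f1(x, y):
    """Equation (C.1): intercellular parts movement."""
    return sum((1 - x[j]) * sum(PM[j, i] * (1 - y[i, k]) for i in EM[j])
               for k, parts in EP.items() for j in parts)

def f2(x, y):
    """Equation (C.2): subcontracting plus duplication cost."""
    return sum(D[j] * S[j] * x[j] + sum(M[i] * y[i, k] for i in EM[j])
               for k, parts in EP.items() for j in parts)

x0 = {"P1": 0, "P4": 0}                          # subcontract nothing
y0 = {("M4", 1): 0, ("M2", 2): 0}                # duplicate nothing
base_movement = f1(x0, y0)                       # 10000 + 1000 intercellular moves
```

Subcontracting P1 and duplicating M2 in cell 2, for instance, drives f1 to zero while f2 rises to the corresponding subcontracting-plus-duplication cost.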
f1(X) = (10000) × (1 − x1) × (1 − y4,1) + (1000) × (1 − x4) × (1 − y2,2)   (C.10)

f2(X) = (10000) × (3) × (x1) + (7000) × (y4,1) + (1000) × (1) × (x4) + (6000) × (y2,2)   (C.11)

f3(X) = 1 − OU   (C.12)

f4(X) = (UC1 − OU)² + (UC2 − OU)²   (C.13)

where:

OU = [ UC1 × (2 + y4,1) + UC2 × (2 + y2,2) ] / [ (2 + y4,1) + (2 + y2,2) ]   (C.14)

UC1 = [ (10000) × (7 + 8) + (8000) × (3 + 7) + (2000) × (8 + 1) − (10000) × (7 + 8) × (x1) + (1000 × 2) × (1 − x4) ] / [ 210000 + 210000 + 210000 × (y4,1) ]   (C.15)

UC2 = [ (7000) × (10 + 3) + (1000) × (1 + 6) + (4000) × (9 + 5) − (1000) × (1 + 6) × (x4) + (10000 × 9) × (1 − x1) ] / [ 210000 + 210000 + 210000 × (y2,2) ]   (C.16)

subject to:

(2 + y4,1) ≤ 3   (C.17)

(2 + y2,2) ≤ 3   (C.18)

Evaluating the objective functions for all 2⁴ = 16 possible decision vectors and applying the dominance relations of B.2 and B.3 shows, for instance, that X2 ≻ X6 and X4 ≻ (X8, X9, X12, X13, X15, X16). The set of the non-dominated or Pareto-optimal solutions for the example problem includes: X1, X3, X4, X5, X7, X10 and X14.

Table 21.85. Total solutions of the numerical example.

Decision vector X = {x1, x4, y4,1, y2,2}    f1
X1 = {0, 0, 0, 0}                           11000
X2 = {0, 0, 0, 1}                           10000
X3 = {0, 0, 1, 0}                           1000
X4 = {0, 0, 1, 1}                           0
X5 = {0, 1, 0, 0}                           10000
X6 = {0, 1, 0, 1}                           10000
X7 = {0, 1, 1, 0}                           0
X8 = {0, 1, 1, 1}                           0
X9 = {1, 0, 0, 0}                           1000
X10 = {1, 0, 0, 1}                          0
X11 = {1, 0, 1, 0}                          1000
X12 = {1, 0, 1, 1}                          0
X13 = {1, 1, 0, 0}                          0
X14 = {1, 1, 0, 1}                          0
X15 = {1, 1, 1, 0}                          0
X16 = {1, 1, 1, 1}                          0
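The f1 column of Table 21.85 can be reproduced by enumerating the sixteen binary decision vectors against equation C.10. A short sketch:

```python
from itertools import product

# Enumerating f1 (intercellular movements) for the 16 candidate solutions of
# the numerical example; X = (x1, x4, y41, y22) as in equation (C.10).
def f1(x1, x4, y41, y22):
    return 10000 * (1 - x1) * (1 - y41) + 1000 * (1 - x4) * (1 - y22)

table = {X: f1(*X) for X in product((0, 1), repeat=4)}
```

For example, table[(0, 0, 0, 0)] gives 11000 (solution X1), and nine of the sixteen vectors drive f1 to zero; the remaining three objectives then decide which of those nine survive the dominance filter.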
min Σ_{q=1}^{2} λq zq(x)   (biCOλ)
If a convex hull C = conv{z(x) : x ∈ E} is computed in the objective space, z(x) for any supported efficient solution x ∈ SE belongs to the boundary of this convex hull. SE is composed of SE1 and SE2; SE1 is the set of SE solutions x such that z(x) is a vertex of the convex hull C, and SE2 = SE \ SE1. Computing the SE2 set is generally more difficult than computing SE1, because the former requires the enumeration of all optimal solutions that minimize (biCOλ) for a given λ. A solution x in the set NE = E \ SE of non-supported efficient solutions is one for which z(x) is not on the boundary of the convex hull. In the bi-objective case, NE solutions are located in the triangles drawn on two successive supported efficient solutions in the objective space. There is no theoretical characterization leading to the efficient computation of NE solutions. Generally, several distinct efficient solutions x1, x2, x3 can correspond to the same non-dominated point z(x1) = z(x2) = z(x3) in the objective space. The solutions x1, x2, x3 are then said to be equivalent in the objective space. The number of such equivalent solutions is generally quite large, and so the enumeration of all of them may be intractable. In such a situation, it is impossible to design an efficient algorithm that can compute all efficient solutions. All the introduced sets are then redefined restrictively according to the notion of a minimal complete set17 of efficient solutions. A set of efficient solutions is minimal if and only if no two of its efficient solutions are equivalent. The application of this definition to the introduced sets gives rise to the Em, SEm, SE1m, SE2m, and NEm minimal complete sets. Figure 23.1, which summarizes the inclusion relationships among these sets, illustrates, for example, that SE1m ⊆ SEm ⊆ SE. The published papers are sometimes unclear about the ability of the algorithms that they present.
Some authors claim that their algorithm can enumerate "all" efficient solutions in terms of the set E. However, as mentioned before, it is generally difficult to compute this set. Thus, it is important to clearly define the class of efficient solutions handled by the algorithm.
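The distinction between supported and non-supported efficient points can be illustrated by scanning the weight λ of the weighted-sum problem (biCOλ): only points on the boundary of the convex hull are ever returned. A sketch with an assumed point set:

```python
# Supported vs. non-supported efficient points: scanning lambda in the
# weighted-sum problem only ever returns convex-hull boundary points.
# The point set below is an assumption chosen for the illustration.
points = [(1, 9), (2, 5), (4, 4), (5, 2), (9, 1), (3, 8), (6, 6)]

supported = set()
for i in range(101):
    lam = i / 100  # weight on objective z1
    supported.add(min(points, key=lambda z: lam * z[0] + (1 - lam) * z[1]))
# (4, 4) is efficient but non-supported: no weight makes it a minimizer
```

Here (4, 4) is non-dominated yet lies above the segment joining (2, 5) and (5, 2), inside the triangle drawn on these two successive supported points, so a method based only on weighted sums would miss it; this is precisely why NE solutions need dedicated treatment.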
Evolutionary Operators Based on Elite Solutions for biCO
Fig. 23.1. Classification of efficient solutions.
23.3. An Evolutionary Heuristic for Solving biCO Problems

The principle of our heuristic11,12,13 is based on the intensive use of three operators applied to a population composed uniquely of elite solutions. The following sections present the main features of the heuristic. Its algorithmic framework is shown in Algorithm 1.

23.3.1. Overview of the Heuristic

Let us introduce PE, which denotes the set of elite solutions. PE is first initialized with a subset of supported solutions (routine detectPEinit). Three operators are used: a crossover (routine crossoverWithElites), a path-relinking (routine pathRelinkingWithElites), and a local search (routine localSearchOverNewElites). Upper and lower bound sets defined in the objective space (routine buildBoundSets) provide acceptable limits for performing a local search. A genetic map, derived from the elite solutions (routine elaborateGeneticInformation), provides useful information to the crossover operator for fixing certain bits. This genetic information is refreshed periodically. Each new elite solution is noted (routine noteNewSolutions). Three rules, which can be used separately or in combination, define a stopping condition (routine isTheEnd?). Basically, the heuristic can be stopped after a predefined effort (rule 1 with parameter iterationMax) or after an elapsed time (rule 2 with parameter timeMax). Rule 3 concerns the detection of unfruitful iterations. This rule allows the heuristic to be stopped when no new elite solutions are produced after a certain number of iterations (parameters seekChangesFrequency and noChangeMax). Each iteration of the algorithm performs one crossover operation, which generates one solution, and one path-relinking operation, which generates
X. Gandibleux, H. Morita, N. Katoh
Algorithm 1 The entry point
Require: input data which determines the objective functions and constraints; parameter(s) for the stopping condition chosen.
Ensure: PE
  -- Compute the initial elite population set PEinit
  detectPEinit(data ↓, peInit ↑); pe ← peInit
  -- Compute the lower and the upper bound sets
  buildBoundSets(data ↓, pe ↓, lowerB ↑, upperB ↑)
  -- A first local search on the PEinit solution set
  localSearchOverNewElites(pe ↕)
  -- Identify the genetic heritage and elaborate the genetic map
  elaborateGeneticInformation(pe ↓, map ↑)
  -- Initialize the running indicators
  iteration ← 1; elapsedTime
Table 25.101. Error rates obtained with C4.5, MOGA-1 and MOFSS-1

                  Error rate (%)
Data set          C4.5            MOGA-1              MOFSS-1
Arrhythmia        32.93 ± 3.11    26.38 ± 1.47 (+)    N/A
Balance-Scale     36.34 ± 1.08    28.32 ± 0.71 (+)    36.47 ± 1.84
Bupa              37.07 ± 2.99    30.14 ± 1.85 (+)    40.85 ± 1.45
Car               7.49 ± 0.70     16.65 ± 0.4 (-)     18.5 ± 0.70 (-)
Crx               15.95 ± 1.43    12.44 ± 1.84        15.04 ± 1.35
Dermatology       6.0 ± 0.98      2.19 ± 0.36 (+)     11.15 ± 1.60 (-)
Glass             1.86 ± 0.76     1.43 ± 0.73         1.86 ± 0.76
Ionosphere        10.2 ± 1.25     5.13 ± 1.27 (+)     7.98 ± 1.37
Iris              6.0 ± 2.32      2.68 ± 1.1 (+)      6.01 ± 2.09
Mushroom          0.0 ± 0.0       0.0 ± 0.0           0.18 ± 0.07 (-)
Pima              26.07 ± 1.03    23.07 ± 1.16        28.16 ± 1.72
Promoters         16.83 ± 2.55    11.33 ± 1.92 (+)    33.5 ± 6.49 (-)
Sick-euthyroid    2.02 ± 0.12     2.22 ± 0.18         2.32 ± 0.23
Tic tac toe       15.75 ± 1.4     22.65 ± 1.19 (-)    31.19 ± 1.69 (-)
Vehicle           26.03 ± 1.78    23.16 ± 1.29        33.74 ± 1.78 (-)
Votes             3.2 ± 0.91      2.97 ± 0.75         4.57 ± 0.89
Wine              6.69 ± 1.82     0.56 ± 0.56 (+)     6.07 ± 1.69
Wisconsin         5.28 ± 0.95     3.84 ± 0.67         7.16 ± 0.77 (-)
Wins over C4.5    -               8                   0
Losses over C4.5  -               2                   7
used in the experiments (a dual-PC with 1.1 GHz clock rate and 3 Gbytes of memory). The results in Table 25.101 show that MOGA-1 obtained significantly better error rates than the baseline solution (column "C4.5") in 8 data sets. In contrast, the baseline solution obtained significantly better results than MOGA-1 in just two data sets. MOFSS-1 did not find solutions with significantly better error rates than the baseline solution in any data set. On the contrary, it found solutions with significantly worse error rates than the baseline solution in 7 data sets. As can be observed in Table 25.102, the tree sizes obtained with the solutions found by MOGA-1 and MOFSS-1 are significantly better than the ones obtained with the baseline solution in 15 out of 18 data sets. In the other three data sets the difference is not significant. In summary, both MOGA-1 and MOFSS-1 are very successful in finding solutions that lead to a significant reduction in tree size, by comparison with the baseline solution of all attributes. The solutions found by MOGA-1 were also quite successful in reducing error rate, unlike the solutions found by MOFSS-1, which unfortunately led to a significant increase in error rate in a number of data sets. Hence, these results suggest that MOGA-1 has
Multi-Objective Algorithms for Attribute Selection in Data Mining

Table 25.102. Tree sizes obtained with C4.5, MOGA-1 and MOFSS-1

                  Tree size (number of nodes)
Data set          C4.5            MOGA-1              MOFSS-1
Arrhythmia        80.2 ± 2.1      65.4 ± 1.15 (+)     N/A
Balance-Scale     41.0 ± 1.29     16.5 ± 3.45 (+)     7.5 ± 1.5 (+)
Bupa              44.2 ± 3.75     7.4 ± 1.36 (+)      11.4 ± 2.78 (+)
Car               165.3 ± 2.79    29.4 ± 5.2 (+)      17.7 ± 1.07 (+)
Crx               29.0 ± 3.65     11.2 ± 3.86 (+)     24.6 ± 8.27
Dermatology       34.0 ± 1.89     25.2 ± 0.96 (+)     23.2 ± 2.84 (+)
Glass             11.0 ± 0.0      11.0 ± 0.0          11.0 ± 0.0
Ionosphere        26.2 ± 1.74     13.0 ± 1.4 (+)      14.2 ± 2.23 (+)
Iris              8.2 ± 0.44      5.8 ± 0.53 (+)      6.0 ± 0.68 (+)
Mushroom          32.7 ± 0.67     30.0 ± 0.89 (+)     27.2 ± 1.76 (+)
Pima              45.0 ± 2.89     11.0 ± 2.6 (+)      9.2 ± 1.85 (+)
Promoters         23.8 ± 1.04     11.4 ± 2.47 (+)     9.0 ± 1.2 (+)
Sick-euthyroid    24.8 ± 0.69     11.2 ± 1.35 (+)     9.6 ± 0.79 (+)
Tic tac toe       130.3 ± 4.25    21.1 ± 4.54 (+)     10.6 ± 1.4 (+)
Vehicle           134.0 ± 6.17    95 ± 3.13 (+)       72.8 ± 10.98 (+)
Votes             10.6 ± 0.26     5.4 ± 0.88 (+)      5.6 ± 1.07 (+)
Wine              10.2 ± 0.68     9.4 ± 0.26          8.6 ± 0.26 (+)
Wisconsin         28.0 ± 2.13     25 ± 3.71           18 ± 1.53 (+)
Wins over C4.5    -               15                  15
Losses over C4.5  -               0                   0
effectively found a good trade-off between the objectives of minimizing error rate and tree size, whereas MOFSS-1 minimized tree size at the expense of increasing error rate in a number of data sets.

Table 25.103. Number of significant Pareto dominance relations

            C4.5    MOGA-1    MOFSS-1
C4.5        X       0         0
MOGA-1      14      X         7
MOFSS-1     8       0         X
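The counts in Table 25.103 come from applying the significant-dominance test, described in the text, pairwise over the 18 data sets. A sketch of that test follows; the (mean, sd) tuple layout is an assumption, and the sample values are the bupa rows of Tables 25.101 and 25.102.

```python
# Sketch of the significant Pareto dominance test behind Table 25.103.
# Each solution is ((mean, sd) of objective 1, (mean, sd) of objective 2);
# this layout is an assumption for the illustration.
def sig_dominates(s1, s2):
    (a1, sa1), (a2, sa2) = s1
    (b1, sb1), (b2, sb2) = s2
    better_on_1 = a1 + sa1 < b1 - sb1            # significantly better on obj. 1
    not_worse_on_2 = not (b2 + sb2 < a2 - sa2)   # not significantly worse on obj. 2
    return better_on_1 and not_worse_on_2

moga = ((30.14, 1.85), (7.4, 1.36))    # bupa: (error rate, tree size)
mofss = ((40.85, 1.45), (11.4, 2.78))
```

With these values sig_dominates(moga, mofss) holds while the reverse does not; a full pairwise comparison would also try the two objectives in the swapped order, since either one may be the "significantly better" objective.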
Table 25.103 compares the performance of MOGA-1, MOFSS-1 and C4.5 using all attributes considering both the error rate and the tree size at the same time, according to the concept of significant Pareto dominance. This is a modified version of conventional Pareto dominance tailored for the classification task of data mining, where we want to find solutions that are not only better, but significantly better, taking into account the standard deviations (as explained earlier for Tables 25.101 and 25.102). Hence, each cell of Table 25.103 shows the number of data sets in which the solution
G.L. Pappa, A.A. Freitas and C.A.A. Kaestner
found by the method indicated in the table row significantly dominates the solution found by the method indicated in the table column. A solution S1 significantly dominates a solution S2 if and only if:

• obj1(S1) + sd1(S1) < obj1(S2) − sd1(S2), and
• not [obj2(S2) + sd2(S2) < obj2(S1) − sd2(S1)]

where obj1(S1) and sd1(S1) denote the average value of objective 1 and the standard deviation of objective 1 associated with solution S1, and similarly for the other variables. Objective 1 and objective 2 can be instantiated with error rate and tree size, or vice versa. For example, in the bupa dataset we can say that the solution found by MOGA-1 significantly dominates the solution found by MOFSS-1 because: (a) in Table 25.101, MOGA-1's error rate plus standard deviation (30.14 + 1.85) is smaller than MOFSS-1's error rate minus standard deviation (40.85 − 1.45); and (b) concerning the tree size (Table 25.102), the condition "not (11.4 + 2.78 < 7.4 − 1.36)" holds. So both conditions for significant dominance are satisfied. As shown in Table 25.103, the baseline solution (column "C4.5") did not significantly dominate the solutions found by MOGA-1 and MOFSS-1 in any dataset. The best results were obtained by MOGA-1, whose solutions significantly dominated the baseline solution in 14 out of the 18 datasets and significantly dominated MOFSS-1's solutions in 7 data sets. MOFSS-1 obtained a reasonably good result, significantly dominating the baseline solution in 8 datasets, but it did not dominate MOGA-1 in any dataset. A more detailed analysis of these results, at the level of individual data sets, can be observed later in Tables 25.104 and 25.105.

25.5.3. On the Effectiveness of the Criterion to Choose the "Best" Solution

Analyzing the results in Tables 25.99, 25.100, 25.101 and 25.102, we can evaluate whether the criterion used to choose a single solution out of all non-dominated ones (i.e., the criterion used to generate the results of Tables 25.101 and 25.102) is really able to choose the "best" solution for each data set. We can do this by analyzing the dominance relationship (involving the error rate and tree size) between the single returned solution and the baseline solution. That is, we can observe whether or not the single solution returned by MOGA-1 and MOFSS-1 dominates, is dominated by, or is neutral with respect to the baseline solution. Once we have this information, we can compare it with the corresponding relative frequencies associated with the
solutions found by MOGA-all/MOFSS-all (columns F_dominate, F_dominated, F_neutral of Tables 25.99 and 25.100). This comparison is performed in Tables 25.104 and 25.105, which refer to MOGA and MOFSS, respectively. In these two tables the first column contains the data set names, the next three columns are copied from the last three columns of Tables 25.99 and 25.100, respectively, and the last three columns are computed from the results in Tables 25.101 and 25.102, by applying the above-explained concept of significant Pareto dominance between the MOGA-1's/MOFSS-1's solution and the baseline solution.

Table 25.104. Performance of MOGA-all versus MOGA-1

                  MOGA-all's solutions              MOGA-1's solution
                  wrt baseline solution             wrt baseline solution
Data set          F_dom    F_dom_ed    F_neut       Dom    Dom_ed    Neut
Arrhythmia        0.21     0.33        0.46         X
Balance-Scale     0.7      0           0.3          X
Bupa              0.31     0           0.69         X
Car               0.002    0           0.998                         X
Crx               0.56     0.05        0.39         X
Dermatology       0.8      0           0.2          X
Glass             0        0.06        0.94                          X
Ionosphere        0.37     0.12        0.5          X
Iris              0.8      0.02        0.18         X
Mushroom          0.68     0           0.32         X
Pima              0.34     0           0.66         X
Promoters         0.33     0           0.67         X
Sick-euthyroid    0.02     0.02        0.96         X
Tic tac toe       0        0           1                             X
Vehicle           0.25     0.18        0.57         X
Votes             0.6      0           0.4          X
Wine              0.48     0.31        0.21         X
Wisconsin         0.5      0.2         0.3                           X
As can be observed in Table 25.104, there are only 4 data sets in which the solution found by MOGA-1 does not dominate the baseline solution: car, glass, tic-tac-toe and Wisconsin. For these 4 data sets the solutions found by MOGA-1 were neutral (last column of Table 25.104), and the value of F_neutral was respectively 0.998, 0.94, 1 and 0.3. Therefore, in the first three of those data sets it was expected that the single solution chosen by MOGA-1 would be neutral, so the criterion used for choosing a single solution cannot be blamed for returning a neutral solution. Only in the Wisconsin data set did the criterion do badly, because 50% of the found
Table 25.105. Performance of MOFSS-all versus MOFSS-1

                  MOFSS-all's solutions             MOFSS-1's solution
                  wrt baseline solution             wrt baseline solution
Data set          F_dom    F_dom_ed    F_neut       Dom    Dom_ed    Neut
Arrhythmia        0.54     0           0.46         N/A
Balance-Scale     0.5      0           0.5          X
Bupa              0.65     0           0.35         X
Car               0.07     0           0.93                          X
Crx               0.89     0           0.11                          X
Dermatology       0        0           1                             X
Glass             0.99     0           0.01                          X
Ionosphere        0.14     0           0.86         X
Iris              0.86     0           0.14         X
Mushroom          0        0           1                             X
Pima              0.95     0           0.05         X
Promoters         0.27     0           0.73                          X
Sick-euthyroid    0.1      0           0.9          X
Tic tac toe       0.11     0           0.89                          X
Vehicle           0.17     0           0.83                          X
Votes             0.1      0           0.9          X
Wine              0.92     0.01        0.07         X
Wisconsin         0.45     0.37        0.18                          X
solutions dominated the baseline solution but a neutral solution was chosen. The criterion was very successful, managing to choose a solution that dominated the baseline, in all the other 14 data sets, even though in 8 of those data sets less than 50% of the solutions found by MOGA-all dominated the baseline. The effectiveness of the criterion can be observed, for instance, in arrhythmia and sick-euthyroid. Although in arrhythmia the value of F_dominate was quite small (0.21), the solution returned by MOGA-1 dominated the baseline solution. In sick-euthyroid, 96% of the solutions found by MOGA-all were neutral, but a solution that dominates the baseline solution was again returned by MOGA-1. With respect to the effectiveness of the criterion when used by MOFSS-1, unexpected negative results were found in 2 data sets of Table 25.105, namely crx and glass. For both data sets, despite the high values of F_dominate, the solutions chosen by MOFSS-1 were neutral. The opposite happened in ionosphere, sick-euthyroid and votes, where F_neutral had high values, but single solutions better than the baseline solution were chosen by MOFSS-1. The relatively large number of neutral solutions chosen by MOFSS-1 happened because in many data sets the tree size associated with the solution chosen by MOFSS-1 was smaller than the tree size associated with
the baseline solution, whilst the error rates of the former were larger than the error rates of the latter. Overall, the criterion for choosing a single solution was moderately successful when used by MOFSS-1, and much more successful when used by MOGA-1. A possible explanation for this result is that the procedure used for tailoring the criterion to MOFSS, described earlier, is not working very well. An improvement in that procedure can be tried in future research. It is important to note that, remarkably, the criterion for choosing a single solution did not choose a solution dominated by the baseline solution in any data set. This result holds for both MOGA-1 and MOFSS-1.

25.6. Conclusions and Future Work

This chapter has discussed two multi-objective algorithms for attribute selection in data mining, namely a multi-objective genetic algorithm (MOGA) and a multi-objective forward sequential selection (MOFSS) method. The effectiveness of both algorithms was extensively evaluated in 18 real-world data sets. Two major sets of experiments were performed, as follows. The first set of experiments compared each of the non-dominated solutions (attribute subsets) found by MOGA and MOFSS with the baseline solution (consisting of all the original attributes). The comparison aimed at counting how many of the solutions found by MOGA and MOFSS dominated (in the Pareto sense) or were dominated by the baseline solution, in terms of classification error rate and decision tree size. Overall, the results (see Tables 25.99 and 25.100) show that both MOGA and MOFSS are successful in the sense that they return solutions that dominate the baseline solution much more often than vice-versa. The second set of experiments consisted of selecting a single "best" solution out of all the non-dominated solutions found by each multi-objective attribute selection method (MOGA and MOFSS) and then comparing this solution with the baseline solution.
Although this kind of experiment is not often performed in the multi-objective literature, it is important because in practice the user often wants a single solution to be suggested by the system, to relieve him from the cognitive burden and difficult responsibility of choosing one solution out of all non-dominated solutions. In order to perform this set of experiments, this work proposed a simple way to choose a single solution to be returned from the set of non-dominated solutions generated by MOGA and MOFSS. The effectiveness of the proposed criterion was analyzed by comparing the results of the two different
versions of MOGA and MOFSS, one version returning all non-dominated solutions (results of the first set of experiments) and another version returning a single chosen non-dominated solution. Despite its simplicity, the proposed criterion worked well in practice, particularly when used in the MOGA method. It could be improved when used in the MOFSS method, as discussed earlier. In the future we intend to analyze the characteristics of the data sets where each of the proposed methods obtained its best results, in order to find patterns that describe the data sets where each method can be applied with greater success.

References

1. Aha, D.W., Bankert, R.L.: A Comparative Evaluation of Sequential Feature Selection Algorithms. In: Fisher, D., Lenz, H.J. (eds.) Learning from Data: AI and Statistics V. Springer-Verlag, Berlin Heidelberg New York, (1996), 1-7.
2. Guyon, I., Elisseeff, A.: An Introduction to Variable and Feature Selection. In: Kaelbling, L.P. (ed.) Journal of Machine Learning Research 3, (2003), 1157-1182.
3. Freitas, A.A.: Data Mining and Knowledge Discovery with Evolutionary Algorithms. Springer-Verlag (2002).
4. Deb, K.: Multi-Objective Optimization using Evolutionary Algorithms. John Wiley & Sons, England (2001).
5. Liu, H., Motoda, H.: Feature Selection for Knowledge Discovery and Data Mining. Kluwer, (1998).
6. Bala, J., De Jong, K., Huang, J., Vafaie, H., Wechsler, H.: Hybrid learning using genetic algorithms and decision trees for pattern classification. In: Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI-95), (1995), 719-724.
7. Bala, J., De Jong, K., Huang, J., Vafaie, H., Wechsler, H.: Using learning to facilitate the evolution of features for recognizing visual concepts. Evolutionary Computation 4(3), (1996), 297-312.
8. Chen, S., Guerra-Salcedo, C., Smith, S.F.: Non-standard crossover for a standard representation - commonality-based feature subset selection. In: Proc. Genetic and Evolutionary Computation Conf. (GECCO-99), Morgan Kaufmann, (1999), 129-134.
9. Guerra-Salcedo, C., Whitley, D.: Genetic Search for Feature Subset Selection: A Comparison Between CHC and GENESIS. In: Proc. Genetic Programming Conference 1998, (1998), 504-509.
10. Guerra-Salcedo, C., Chen, S., Whitley, D., Smith, S.: Fast and accurate feature selection using hybrid genetic strategies. In: Proc. Congress on Evolutionary Computation (CEC-99), Washington D.C., USA, July (1999), 177-184.
11. Cherkauer, K.J., Shavlik, J.W.: Growing simpler decision trees to facilitate
Multi-Objective Algorithms for Attribute Selection in Data Mining
knowledge discovery. In: Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining (KDD-96), AAAI Press, (1996), 315-318.
12. Terano, T., Ishino, Y.: Interactive genetic algorithm based feature selection and its application to marketing data analysis. In: Liu, H., Motoda, H. (eds.) Feature Extraction, Construction and Selection, Kluwer, (1998), 393-406.
13. Vafaie, H., De Jong, K.: Evolutionary Feature Space Transformation. In: Liu, H., Motoda, H. (eds.) Feature Extraction, Construction and Selection, Kluwer, (1998), 307-323.
14. Yang, J., Honavar, V.: Feature subset selection using a genetic algorithm. In: Genetic Programming 1997: Proc. 2nd Annual Conf. (GP-97), Morgan Kaufmann, (1997), 380-385.
15. Yang, J., Honavar, V.: Feature subset selection using a genetic algorithm. In: Liu, H., Motoda, H. (eds.) Feature Extraction, Construction and Selection, Kluwer, (1998), 117-136.
16. Moser, A., Murty, M.N.: On the scalability of genetic algorithms to very large-scale feature selection. In: Proc. Real-World Applications of Evolutionary Computing (EvoWorkshops 2000). Lecture Notes in Computer Science 1803, Springer-Verlag, (2000), 77-86.
17. Ishibuchi, H., Nakashima, T.: Multi-objective pattern and feature selection by a genetic algorithm. In: Proc. 2000 Genetic and Evolutionary Computation Conf. (GECCO-2000), Morgan Kaufmann, (2000), 1069-1076.
18. Emmanouilidis, C., Hunter, A., MacIntyre, J.: A multiobjective evolutionary setting for feature selection and a commonality-based crossover operator. In: Proc. 2000 Congress on Evolutionary Computation (CEC-2000), IEEE, (2000), 309-316.
19. Rozsypal, A., Kubat, M.: Selecting representative examples and attributes by a genetic algorithm. Intelligent Data Analysis 7, (2003), 290-304.
20. Llora, X., Garrell, J.: Prototype induction and attribute selection via evolutionary algorithms. Intelligent Data Analysis 7, (2003), 193-208.
21. Coello Coello, C.A., Van Veldhuizen, D.A., Lamont, G.B.: Evolutionary Algorithms for Solving Multi-Objective Problems. Kluwer Academic Publishers, New York (2002).
22. Pappa, G.L., Freitas, A.A., Kaestner, C.A.A.: Attribute Selection with a Multiobjective Genetic Algorithm. In: Proc. 16th Brazilian Symposium on Artificial Intelligence, Lecture Notes in Artificial Intelligence 2507, Springer-Verlag, (2002), 280-290.
23. Pappa, G.L., Freitas, A.A., Kaestner, C.A.A.: A Multiobjective Genetic Algorithm for Attribute Selection. In: Proc. 4th International Conference on Recent Advances in Soft Computing (RASC), University of Nottingham, UK, (2002), 116-121.
24. Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs. 3rd edn. Springer-Verlag, Berlin Heidelberg New York, (1996).
25. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, (1993).
26. Bhattacharyya, S.: Evolutionary Algorithms in Data Mining: Multi-Objective Performance Modeling for Direct Marketing. In: Proc. 6th ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000), ACM Press, (2000), 465-471.
27. Zitzler, E., Thiele, L.: Multiobjective Evolutionary Algorithms: A Comparative Study and the Strength Pareto Approach. IEEE Transactions on Evolutionary Computation 3(4), (1999), 257-271.
28. Murphy, P.M., Aha, D.W.: UCI Repository of Machine Learning Databases [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA: University of California, Department of Information and Computer Science, (1994).
29. Coello Coello, C.A.: Handling Preferences in Evolutionary Multiobjective Optimization: A Survey. In: Proc. of Congress on Evolutionary Computation (CEC-2000), IEEE Service Center, New Jersey (2000), 30-37.
CHAPTER 26 FINANCIAL APPLICATIONS OF MULTI-OBJECTIVE EVOLUTIONARY ALGORITHMS: RECENT DEVELOPMENTS AND FUTURE RESEARCH DIRECTIONS
Frank Schlottmann$^{1,2}$ and Detlef Seese$^{2}$

$^{1}$ GILLARDON AG financial software, Research Department, Alte Wilhelmstr. 4, D-75015 Bretten, Germany
E-mail: [email protected]

$^{2}$ Institute AIFB, University Karlsruhe (TH), D-76128 Karlsruhe, Germany
E-mail: [email protected]

The area of finance contains many algorithmic problems of large practical interest whose complexity prevents finding efficient solutions. The range of application in this area covers e. g. portfolio selection and risk management and reaches from questions of real world financial intermediation to sophisticated research problems. Since there is an urgent need to solve these complex problems, heuristic approaches like Evolutionary Algorithms are a potential toolbox. The application of Multi-Objective Evolutionary Algorithm concepts to this area has started more recently compared to the vast majority of other application areas, as e. g. design and engineering. We give a brief survey on promising developments within this field and discuss potential future research directions.

26.1. Introduction

It is one of the goals of computational finance to develop methods and algorithms to support decision making. Unfortunately, many problems of practical or theoretical interest are too complex to be solvable exactly by a deterministic algorithm in reasonable computing time, e. g. using a method that applies a simple closed-form analytical expression. Such problems require approximation procedures which provide sufficiently good solutions while requiring less computational effort compared to an exact algorithm. Heuristic approaches are a class of algorithms which have been developed
to fulfil these requirements in many problem contexts, see e. g. Fogel & Michalewicz1 or Hromkovic2 for the general methodology and Schlottmann & Seese3 for an overview of heuristic algorithm applications to financial problems. Moreover, Chen's book4 contains a selection of mainly single-objective Evolutionary Algorithm applications in the finance area. In contrast to these more general literature surveys, we concentrate solely on Multi-Objective Evolutionary Algorithm (MOEA) applications in finance in the following text. The main advantage of MOEAs is their ability to investigate many objectives/goals at the same time. Hence, they offer many possibilities to support decision making, particularly in finance, where a majority of naturally multi-criteria problems have been considered only in a simplified single-objective manner for a long time. Since we do not address general concepts and details of standard MOEAs, we refer the reader e. g. to Deb5, Coello et al.6 and Osyczka7 for an introduction as well as a thorough coverage of this methodology. The rest of this chapter is structured as follows: In the next section, we point out the complexity of financial problems. Afterwards, we give an introduction to portfolio selection problems in the standard Markowitz setting. We discuss some successful MOEA applications from the literature which solve different problems related to portfolio selection. The chapter ends with a conclusion and potential future research directions.

26.2. A Justification for MOEAs in Financial Applications

Many decision problems in the area of finance can be viewed as some of the hardest problems in economics. This is caused by their strong interrelation with many other difficult decision problems in business life, by the huge number of parameters which are usually involved in these problems and often by the intrinsic complexity of some of these problems itself. Complexity influences financial decision making in many forms.
There is the huge number of parameters often involved in financial decision problems; financial systems are often highly dynamic; and there are recent proofs that even financial decision problems for simple models have a high algorithmic complexity which prevents the existence of efficient algorithmic solutions. Such complexity results often give insight into the structural reasons for the difficulties that prevent computer-supported decision making. For instance, Aspnes et al.8 showed that already for a very simple model of a stock market the complexity of the problem of predicting the market price depends essentially on the number of trading strategies in comparison
to the number of traders. If there is a large number of traders but they employ a relatively small number of strategies, then there is a polynomial-time algorithm for predicting future price movements with high accuracy, and if the number of trading strategies is large, market prediction becomes complex. Of course, such complexity results require a precise definition of complexity. A widely accepted formal definition of complex problems results from comparing the asymptotic computation times of their solution algorithms. Here the computation time, measured in elementary computation steps, is a function defining for each size of the input, which is measured e. g. as the number of input variables of the given problem, the number of steps the algorithm needs to compute the result in a worst case. Such computation time functions can be compared with respect to their asymptotic growth rates, i. e. comparing the growth of the functions neglecting constant factors and possibly a finite number of input sizes. The observation is that most problems which can be solved efficiently in practice can be solved via algorithms whose asymptotic growth rate is at most polynomial in the input size $n$, i. e. at most $n^k$ for a constant $k$, and in most cases $k$ is small. All these problems are gathered in the class P. Unfortunately, for almost all problems of practical or theoretical importance no polynomial-time algorithms are known; instead, only exponential-time solutions are found, e. g. with an exponential number $2^n$ of necessary calculation steps for $n$ given input variables of the considered problem. Another observation is that almost all of these problems can be computed in polynomial time by a nondeterministic algorithm. Such algorithms can perform a certain number of computation steps in parallel and choose the shortest computation path at the end.
An equivalent way to describe such algorithms is to allow guessing a solution, and the only thing the algorithm has to do in polynomial time is to verify that the guessed solution is correct. All problems computable in polynomial time via a nondeterministic algorithm form the class NP, and almost all problems of practical importance are included in this class. It is an outstanding open problem in computer science to decide whether P = NP holds. In an attempt to answer this question the class of NP-complete problems was defined and investigated. A problem is defined to be NP-complete if it is in NP and each problem in NP can be reduced to it via a deterministic polynomial-time algorithm. NP-complete problems are thus the hardest problems in the class NP. If one finds an algorithm which solves one of these problems in polynomial time then all problems in NP can be solved
in polynomial time, hence P = NP. It is widely conjectured that no such algorithm exists, and this conjecture is one of the most famous problems in computer science, open already for more than two decades. The really surprising fact is not the existence of NP-complete problems, but the fact that almost all problems of practical interest belong to this class of problems: until now there are thousands of NP-complete problems in all areas of application, and there is no known algorithm which requires only a polynomial number of computational steps depending on the input size $n$ for an arbitrarily chosen problem that belongs to this class of problems. So it is not surprising that recently it could be proved that also many problems in finance belong to the class of NP-complete problems, since they have a combinatorial structure which is equivalent (with respect to polynomial-time reductions) to well-known NP-complete problems; e. g. constrained portfolio selection and related questions of asset allocation (which are considered later in this chapter) are equivalent to the following problem, which is known to be NP-complete.

KNAPSACK: Given a finite set $U$ together with positive integers $s(u)$ (the size of $u$) and $v(u)$ (the value of $u$) for each element $u \in U$, a positive integer $B$ as size constraint and a positive integer $K$ as value goal, find a subset $U' \subseteq U$ such that $\sum_{u \in U'} s(u) \le B$ and $\sum_{u \in U'} v(u) \ge K$.

More details on knapsack problems and a large collection of further complexity results can be found e. g. in Garey & Johnson9, see also Papadimitriou10 for the formal definitions of computational complexity and Kellerer et al.11 for a contemporary monograph on knapsack problems. An illustrative formulation of knapsack problems in portfolio selection is given in the next section, whereas e. g. Seese & Schlottmann12,13 provide corresponding complexity results.
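To make the reduction target concrete, the KNAPSACK decision problem just stated can be sketched as a short program. The following Python fragment is our own illustration (function and variable names are hypothetical, not from the cited literature); it decides KNAPSACK with the classic dynamic program over size budgets, whose running time is pseudo-polynomial in $B$ and therefore does not contradict NP-completeness, since $B$ is encoded with only about $\log B$ bits.

```python
def knapsack_reachable(sizes, values, B, K):
    # best[c] = maximum total value achievable with total size at most c
    best = [0] * (B + 1)
    for s, v in zip(sizes, values):
        # iterate capacities downwards so each element is used at most once
        for c in range(B, s - 1, -1):
            best[c] = max(best[c], best[c - s] + v)
    # KNAPSACK asks: is there a subset with size <= B and value >= K?
    return best[B] >= K
```

For instance, with sizes (2, 3, 4), values (3, 4, 5) and $B = 6$, the subset consisting of the first and third element attains value 8, so the value goal $K = 8$ is reachable while $K = 9$ is not.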
The main consequence of the above mentioned complexity results is that we require approximation algorithms that yield sufficiently good solutions for complex finance problems and consume only polynomial computational resources measured by the size of the respective problem instance (e. g. number of independent variables). For some complex problem settings and under certain assumptions, particularly linearity or convexity of target functions in optimization problems, there are analytical approximation algorithms which provide a fast method of finding solutions having a guaranteed quality of lying within an $\varepsilon$-region around the globally best solution(s). If the considered problem instance allows the necessary restrictions for the application of such algorithms, these are the preferred choice, see Ausiello et al.14 for such considerations. However, some applications in finance require non-
linear, non-convex functions (e. g. valuation of financial instruments using non-linear functions), and sometimes we know only the data (parameters) but not the functional dependency between them, so there is nevertheless a need for methods that search for good solutions in difficult problem settings while spending only relatively small computational cost. This is the justification for heuristic approaches like MOEAs, which, unlike conventional algorithms, allow imprecision, uncertainty as well as partial truth and can handle multiple objectives in a very natural manner. These requirements match many real-world search and optimization problems. MOEAs offer, especially on the basis of their evolutionary part, adaptability as one of their characteristic features and thus permit the tracking of a problem through a changing environment. Moreover, on the basis of their multi-objective part they allow flexible decisions of the management on the basis of the actually present information. Hence they are an interesting tool in a complex and dynamically changing world. The next section contains some examples of such heuristic approaches to complex financial problems.
26.3. Selected Financial Applications of MOEAs

26.3.1. Portfolio Selection Problems

All MOEA approaches which will be discussed later in this subsection focus on portfolio selection problems or related questions. To give a brief introduction to this application context, we concentrate on standard Markowitz15 portfolio selection problems first. Given is a set of $n \in \mathbb{N}$ financial assets, e. g. exchange traded stocks. At time $t_0 \in \mathbb{R}$, each asset $i$ has certain characteristics describing its future payoff: each asset $i$ has an expected rate of return $\mu_i$ per monetary unit (e. g. dollars) which is paid at time $t_1 \in \mathbb{R}$, $t_1 > t_0$. This means if we take a position of $y \in \mathbb{R}$ units of asset 1 at time $t_0$, our expected payoff in $t_1$ will be $\mu_1 y$ units. Moreover, the covariances between the rates of return of all assets are given by a symmetric matrix $\Sigma := (\sigma_{ij})_{i,j \in \{1,\dots,n\}}$. In this straightforward notation, $\sigma_{ii}$ is the variance of asset $i$'s rate of return and $\sigma_{ij}$ is the covariance between asset $i$'s rate of return and asset $j$'s rate of return. A portfolio is defined by a vector $x := (x_1, \dots, x_n) \in \mathbb{R}^n$ which contains the weight $x_i \in \mathbb{R}$ of asset $i \in \{1, \dots, n\}$ in its $i$-th component. In the standard problem formulation, the weights of a portfolio are
normalized as follows:

$$\sum_{i=1}^{n} x_i = 1 \qquad (C.1)$$
Depending on the specific problem context, there are additional restrictions on the weights, e. g. lower bounds (a common constraint is $x_i \ge 0$), upper bounds and/or integrality constraints. This topic will be addressed in more detail later. At this point, it is sufficient to denote the set of all unconstrained portfolios by $S \subseteq \mathbb{R}^n$ and the set of feasible portfolios which satisfy the required constraints by $F \subseteq S$. If the specific portfolio selection problem is unconstrained, one can simply assume $F = S$. Usually, at least two conflicting target functions are considered: a return function $f_{return}(x)$ which is to be maximized and a risk function $f_{risk}(x)$ which is to be minimized. In the standard Markowitz setting these functions are defined as follows:
$$f_{return}(x) := \sum_{i=1}^{n} x_i \mu_i \qquad (C.2)$$

$$f_{risk}(x) := \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{n} x_i x_j \sigma_{ij}} \qquad (C.3)$$
The above definition of $f_{return}$ reflects the fact that the expected rate of return of a portfolio is the weighted sum of the assets' expected rates of return. And in the above specification of $f_{risk}$ the standard deviation of the portfolio rate of return is chosen as a risk measure which describes the level of uncertainty about the future payoff at time $t_1$. In the context of portfolio management, a feasible portfolio $x \in F$ is dominated by a feasible portfolio $y \in F$ iff at least one of the following two conditions is met:

$$f_{return}(x) < f_{return}(y) \;\wedge\; f_{risk}(x) \ge f_{risk}(y) \qquad (C.4)$$

$$f_{return}(x) \le f_{return}(y) \;\wedge\; f_{risk}(x) > f_{risk}(y) \qquad (C.5)$$
As rational investors prefer non-dominated portfolios over dominated portfolios, one is usually interested in finding an approximation of the so-called efficient frontier, which is identical to the set of all feasible non-dominated portfolios. In the standard finance literature, this is formulated as a constrained single-objective problem: given a rate of return $r^*$, find a feasible portfolio $x^* \in F$ satisfying

$$f_{return}(x^*) = r^* \;\wedge\; f_{risk}(x^*) = \min_{x \in F} \{ f_{risk}(x) \} \qquad (C.6)$$
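The objective functions (C.2) and (C.3) and the dominance conditions (C.4)/(C.5) translate directly into code. The Python sketch below is our own illustration (function names are hypothetical, not from the literature discussed here); it evaluates candidate portfolios and filters the non-dominated ones, taking $f_{risk}$ as the portfolio standard deviation:

```python
import math

def f_return(x, mu):
    # expected portfolio rate of return, equation (C.2)
    return sum(xi * mi for xi, mi in zip(x, mu))

def f_risk(x, cov):
    # portfolio standard deviation, equation (C.3)
    n = len(x)
    variance = sum(x[i] * x[j] * cov[i][j]
                   for i in range(n) for j in range(n))
    return math.sqrt(variance)

def dominates(a, b):
    # a dominates b: no worse in both objectives (return maximized,
    # risk minimized) and strictly better in at least one of them
    (ra, ka), (rb, kb) = a, b
    return ra >= rb and ka <= kb and (ra > rb or ka < kb)

def non_dominated(portfolios, mu, cov):
    # keep every portfolio whose objective vector no other one dominates
    objs = [(f_return(x, mu), f_risk(x, cov)) for x in portfolios]
    return [x for x, o in zip(portfolios, objs)
            if not any(dominates(p, o) for p in objs if p != o)]
```

For two assets with expected returns 0.1 and 0.2 and uncorrelated variances 0.04 and 0.09, the pure first-asset portfolio (1, 0) is dominated by the equal-weight portfolio (0.5, 0.5), which has a higher expected return and, thanks to diversification, a lower standard deviation.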
If there are no integrality constraints or other restrictions which raise the complexity, such problems can be solved using standard Quadratic Programming algorithms (under the assumption that $\Sigma$ is positive definite). From a computational complexity point of view, this is equivalent to solving a knapsack-like problem using real-valued decision variables - in the knapsack problem formulation in section 26.2 we considered binary decision variables, hence the complexity is different (lower) here although the objective function is not linear. By considering two objective functions instead of modelling an objective function constraint, we obtain a quite natural problem formulation for a MOEA approach which allows more flexibility concerning both the objective functions and the constraints on the portfolios. And we point out that the question of finding non-dominated portfolios raised above can easily be extended to a multi-period problem where the payoff at each additional future point of time $t_2, t_3, \dots, t_m$, $m \in \mathbb{N}$, $\forall i \in \{1, \dots, m\} : t_i \in \mathbb{R}$, is considered separately. This results in $2 \cdot m$ objective functions to be optimized. In the following subsections we will summarize several applications of MOEAs in this context. Particularly, we will describe the deviation from the above Markowitz problem setting, the genetic modelling, the chosen genetic variation operators and the parameter sets used in empirical tests of the methodology.
26.3.2. Vederajan et al.

The article by Vederajan et al.16 contains different applications of Genetic Algorithm (GA) methodology to portfolio selection problems in the Markowitz context. At first, the authors consider the standard problem of portfolio selection from the previous section and add the constraint

$$\forall i \in \{1, \dots, n\} : 0 \le x_i \le x_{max} \qquad (C.7)$$

where $x_{max} \in \mathbb{R}_+$ is a constant. Besides a single-objective GA approach using a weighted sum of the $f_{risk}$ and $f_{return}$ objective functions from section 26.3.1, which we do not consider here, Vederajan et al. also propose a MOEA approach searching for non-dominated feasible individuals with respect to the two objectives. They use the Non-dominated Sorting Genetic Algorithm (NSGA) from Srinivas & Deb17 based on the following genetic representation of the $x_i$ variables: each decision variable is represented by a binary string of fixed length $l_{const}$, which represents the weight of the asset in the portfolio. The strings of all decision variables are concatenated such that the resulting genotype of each
individual consists of a binary gene string of length $n \cdot l_{const}$. It has to be emphasized here that this genetic modelling restricts the search space to a discrete subset of $\mathbb{R}^n$:

$$F := \left\{ x \;\middle|\; x_i \in \{0,\, c_i,\, 2 c_i,\, \dots,\, (2^{l_{const}} - 1)\, c_i\},\ \sum_{i=1}^{n} x_i = 1 \right\} \qquad (C.8)$$
Here the constants $c_i > 0$ are chosen together with $l_{const}$ such that $x_i \ge 0$ (trivial) and $x_i \le x_{max}$ is assured. To incorporate the summation constraint from equation (C.1) into the algorithm, Vederajan et al. propose a repairing procedure for infeasible individuals derived from Bean18: the $x_i$ values of an infeasible individual are sorted in descending order to obtain a permutation $\pi(i)$ of the decision variables. Using this permutation one starts with the highest value given by $x_{\pi(k)}$ for $k := 1$ and raises $k$ successively until $\sum_{i=1}^{k} x_{\pi(i)} \ge 1$ for the minimum $k$. Knowing this value $k$ one sets
$$x_{\pi(j)} := \begin{cases} x_{\pi(j)} & \text{if } j < k, \\ 1 - \sum_{i=1}^{k-1} x_{\pi(i)} & \text{if } j = k, \\ 0 & \text{otherwise.} \end{cases} \qquad (C.9)$$

This repairing operation is applied each time an infeasible individual is generated (e. g. after random initialization of the first population). The selection operator used for reproduction of individuals in the NSGA is standard binary tournament, and the genetic variation operators are one-point crossover with crossover probability $p_{cross} := 0.9$ and a standard binary complement mutation operator applied with probability $p_{mut} := 0.01$ to each single bit in the gene string. Diversity preservation in the population is achieved by a niching approach using the sharing function

$$Sh(d_{xy}) := \begin{cases} 1 - \left( \dfrac{d_{xy}}{s_{const}} \right) & \text{if } d_{xy} < s_{const}, \\ 0 & \text{otherwise.} \end{cases} \qquad (C.10)$$

Here, $d_{xy}$ is the Euclidean distance between the fitness function values of a given individual $x$ and a given individual $y$. $s_{const}$ is the maximum accepted value of $d_{xy}$ for two arbitrary individuals which belong to the same niche. Vederajan et al. perform several experiments with stock market data, particularly consisting of the historical asset price means and covariances for Boeing, Disney, Exxon, McDonald's and Microsoft stocks from January 1991 to December 1995. Their NSGA application to the Markowitz problem
described above yielded a well-converged approximation of many Pareto-optimal solutions within 100 population steps. Each population contained 1000 individuals. Concerning their application of a MOEA to a quadratic optimization problem instead of using standard quadratic programming approaches, Vederajan et al. point out an interesting fact which the authors of this chapter also encountered when it came to real-world portfolio selection problems: As it has already been mentioned in section 26.3.1, the covariance matrix $\Sigma$ is required to be positive definite to apply standard quadratic programming algorithms. If there are numerical issues (e. g. numerical imprecision due to rounding and/or floating-point arithmetic), this assumption might be violated. Moreover, a violation is not unlikely for real-world data, particularly when $n$ gets large, since the covariances are estimated from real asset price time series which do not necessarily satisfy a priori given restrictions of the mathematical tool for portfolio analysis. Thus, a MOEA approach is even suitable for such a standard problem setting. In addition to the above results, Vederajan et al. also consider a variant of their Markowitz problem setting where transaction costs due to changes in a portfolio (rebalancing) are an additional ingredient which causes problems for standard quadratic programming algorithms. Thus, the authors apply their NSGA approach again using a third objective function which is to be minimized:
$$f_{cost}(x) := \sum_{i=1}^{n} c_i (x_i - \hat{x}_i)^2 \qquad (C.11)$$
where $\hat{x}_i \in \mathbb{R}_+$ is the given initial weight of asset $i$ in the portfolio that is potentially to be changed due to rebalancing transactions, and the constant $c_i \in \mathbb{R}$ is the transaction cost for asset $i$. The above NSGA approach is again applied to the given five-asset problem instance; just the number of individuals per population is raised to 1500. Vederajan et al. illustrate the three-dimensional boundary of the approximated solutions in the objective function space and give a reasonable interpretation for the shape of the approximated Pareto front. As a summary, the work by Vederajan et al. contains an early application of the MOEA methodology to portfolio selection problems and provides even a practical justification for the application of this methodology to standard Markowitz problem settings where quadratic programming approaches are often considered to be mandatory. Furthermore, an interesting application of a MOEA to portfolio selection problems with transaction cost
is illustrated.
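The repair operator of equation (C.9) and the transaction-cost objective (C.11) discussed in this section are compact enough to sketch in code. The following Python fragment is our own illustration, not code from Vederajan et al.; the names are hypothetical, and the repair step assumes the infeasible weights sum to more than one:

```python
def repair(x):
    """Repair an infeasible portfolio so its weights sum to one,
    following the descending-sort procedure of equation (C.9):
    keep the largest weights until the cumulative sum would reach 1,
    truncate the k-th weight and set all remaining weights to zero."""
    order = sorted(range(len(x)), key=lambda i: x[i], reverse=True)  # permutation pi
    repaired = [0.0] * len(x)
    cum = 0.0
    for i in order:
        if cum + x[i] < 1.0:       # j < k: weight kept unchanged
            repaired[i] = x[i]
            cum += x[i]
        else:                      # j = k: truncate so the total hits exactly 1
            repaired[i] = 1.0 - cum
            break                  # j > k: remaining weights stay 0
    return repaired

def f_cost(x, x_init, c):
    """Third objective, equation (C.11): quadratic transaction cost of
    rebalancing from the initial weights x_init to the candidate x."""
    return sum(ci * (xi - x0) ** 2 for ci, xi, x0 in zip(c, x, x_init))
```

Repairing (0.5, 0.4, 0.3), for instance, keeps the two largest weights, truncates the third to 0.1 and yields a portfolio whose weights sum to one.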
26.3.3. Lin et al.

In their study, Lin et al.19 consider the following variation of the standard Markowitz problem from section 26.3.1: each asset can only be held in non-negative integer units, i. e.

$$S := \left\{ x := (x_1, \dots, x_n) \;\middle|\; \forall i \in \{1, \dots, n\} : x_i \in \mathbb{N} \cup \{0\} \right\} \qquad (C.12)$$
The market price of one unit of asset $i$ which can be bought is $p_i$. There is an upper limit $u_i$ on the maximum monetary value which is invested into each asset $i$, i. e.

$$\forall i \in \{1, \dots, n\} : p_i x_i \le u_i \qquad (C.13)$$
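Under our reading of (C.12) and (C.13), checking feasibility of a candidate portfolio in this integer-unit setting is straightforward. The Python sketch below is an illustration with hypothetical names; it verifies that every holding is a non-negative integer and that the money invested per asset stays within its limit:

```python
def is_feasible(x, prices, limits):
    # (C.12): holdings are non-negative integers;
    # (C.13): money invested per asset, p_i * x_i, respects the limit u_i
    return all(isinstance(xi, int) and xi >= 0 and p * xi <= u
               for xi, p, u in zip(x, prices, limits))
```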