Lecture Notes in Artificial Intelligence Edited by R. Goebel, J. Siekmann, and W. Wahlster
Subseries of Lecture Notes in Computer Science
6704
Kishan G. Mehrotra Chilukuri K. Mohan Jae C. Oh Pramod K. Varshney Moonis Ali (Eds.)
Modern Approaches in Applied Intelligence 24th International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2011 Syracuse, NY, USA, June 28 – July 1, 2011 Proceedings, Part II
Series Editors Randy Goebel, University of Alberta, Edmonton, Canada Jörg Siekmann, University of Saarland, Saarbrücken, Germany Wolfgang Wahlster, DFKI and University of Saarland, Saarbrücken, Germany Volume Editors Kishan G. Mehrotra Chilukuri K. Mohan Jae C. Oh Pramod K. Varshney Syracuse University, Department of Electrical Engineering and Computer Science Syracuse, NY 13244-4100, USA E-mail: {mehrotra, mohan, jcoh, varshney}@syr.edu Moonis Ali Texas State University San Marcos, Department of Computer Science 601 University Drive, San Marcos, TX 78666-4616, USA E-mail:
[email protected] ISSN 0302-9743 e-ISSN 1611-3349 ISBN 978-3-642-21826-2 e-ISBN 978-3-642-21827-9 DOI 10.1007/978-3-642-21827-9 Springer Heidelberg Dordrecht London New York Library of Congress Control Number: 2011929232 CR Subject Classification (1998): I.2, H.3-4, F.1-2, C.2, I.4-5, H.2.8 LNCS Sublibrary: SL 7 – Artificial Intelligence © Springer-Verlag Berlin Heidelberg 2011 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
Preface
There has been a steady increase in demand for efficient and intelligent techniques for solving complex real-world problems. The fields of artificial intelligence and applied intelligence cover computational approaches and their applications that are often inspired by biological systems. Applied intelligence technologies are used to build machines that can solve real-world problems of significant complexity. Technologies used in applied intelligence are thus applicable to many areas including data mining, adaptive control, intelligent manufacturing, autonomous agents, bio-informatics, reasoning, computer vision, decision support systems, fuzzy logic, robotics, intelligent interfaces, Internet technology, machine learning, neural networks, evolutionary algorithms, heuristic search, intelligent design, planning, and scheduling. The International Society of Applied Intelligence (ISAI), through its annual IEA/AIE conferences, provides a forum for international scientific and industrial communities to interact with each other to develop and advance intelligent systems that address such concerns. The 24th International Conference on Industrial, Engineering and Other Applications of Applied Intelligence Systems (IEA/AIE-2011), held in Syracuse, NY (USA), followed the IEA/AIE tradition of providing an international scientific forum for researchers in the diverse field of applied intelligence. Invited speakers and authors addressed problems we face and presented their solutions by applying a broad spectrum of intelligent methodologies. Papers presented at IEA/AIE-2011 covered theoretical approaches as well as applications of intelligent systems in solving complex real-life problems. We received 206 papers and selected the 92 best papers for inclusion in these proceedings. Each paper was reviewed by at least three members of the Program Committee. The papers in the proceedings cover a wide number of topics including feature extraction, discretization, clustering, classification, diagnosis, data refinement, neural networks, genetic algorithms, learning classifier systems, Bayesian and probabilistic methods, image processing, robotics, navigation, optimization, scheduling, routing, game theory and agents, cognition, emotion, and beliefs. Special sessions included topics in the areas of incremental clustering and novelty detection techniques and their applications to intelligent analysis of time varying information, intelligent techniques for document processing, modeling and support of cognitive and affective human processes, cognitive computing facets in intelligent interaction, applications of intelligent systems, nature-inspired optimization – foundations and application, chemoinformatic and bioinformatic methods, algorithms and applications. These proceedings, consisting of 92 chapters authored by participants of IEA/AIE-2011, cover both the theory and applications of applied intelligent
systems. Together, these papers highlight new trends and frontiers of applied intelligence and show how new research could lead to innovative applications of considerable practical significance. We expect that these proceedings will provide a useful reference for future research. The conference also invited three outstanding scholars to give plenary keynote speeches. They were Ajay K. Royyuru from the IBM Thomas J. Watson Research Center, Henry Kautz from the University of Rochester, and Linderman from the Air Force Research Laboratory. We would like to thank Springer for their help in publishing the proceedings. We would also like to thank the Program Committee and other reviewers for their hard work in assuring the high quality of the proceedings. We would like to thank the organizers of special sessions for their efforts to make this conference successful. We especially thank Syracuse University for its generous support of the conference. We thank our main sponsor, ISAI, as well as our cooperating organizations: Association for the Advancement of Artificial Intelligence (AAAI), Association for Computing Machinery (ACM/SIGART, SIGKDD), Austrian Association for Artificial Intelligence (OeGAI), British Computer Society Specialist Group on Artificial Intelligence (BCS SGAI), European Neural Network Society (ENNS), International Neural Network Society (INNS), Japanese Society for Artificial Intelligence (JSAI), Slovenian Artificial Intelligence Society (SLAIS), Spanish Society for Artificial Intelligence (AEPIA), Swiss Group for Artificial Intelligence and Cognitive Science (SGAICO), Taiwanese Association for Artificial Intelligence (TAAI), Taiwanese Association for Consumer Electronics (TACE), and Texas State University-San Marcos. Finally, we cordially thank the organizers, invited speakers, and authors, whose efforts were critical for the success of the conference and the publication of these proceedings. Thanks are also due to the many professionals who contributed to making the conference successful. April 2011
Kishan G. Mehrotra Chilukuri Mohan Jae C. Oh Pramod K. Varshney Moonis Ali
Organization
General Chair
Program Chairs
Moonis Ali, USA Kishan G. Mehrotra, USA Mohan Chilukuri, USA Jae C. Oh, USA Pramod K. Varshney, USA
Invited Session Chair
Sanjay Ranka, USA
Local Arrangements Chair
Thumrongsak Kosiyatrakul, USA
Program Committee Adam Jatowt, Japan Ah-Hwee Tan, Singapore Amruth Kumar, USA Andres Bustillo, Spain Anna Fensel, Austria Antonio Bahamonde, Spain Azizi Ab Aziz, The Netherlands B¨arbel Mertsching, Germany Bin-Yih Liao, Taiwan Bipin Indurkhya, India Bohdan Macukow, Poland Bora Kumova, Turkey C.W. Chan, Hong Kong Catholijn Jonker, The Netherlands Cecile Bothorel, France C´esar Garc´ıa-Osorio, Spain Changshui Zhang, Canada Chien-Chung Chan, USA Chih-Cheng Hung, USA Chilukuri K. Mohan, USA Chiung-Yao Fang, Taiwan Chunsheng Yang, Canada Chun-Yen Chang, Taiwan Colin Fyfe, UK Coral Del Val-Mu˜ noz, Spain
Dan Halperin, Israel Dan Tamir, USA Daniela Godoy, Argentina Dariusz Krol, Poland David Aha, USA Djamel Sadok, Brazil Domingo Ortiz-Boyer, Spain Don Potter, USA Don-Lin Yang, Taiwan Duco Ferro, The Netherlands Emilia Barakova, The Netherlands Enrique Frias-Martinez, Spain Enrique Herrera-Viedma, Spain Erik Blasch, USA Fevzi Belli, Germany Floriana Esposito, Italy Fran Campa G´ omez, Spain Francois Jacquenet, France Fred Freitas, Brazil Gerard Dreyfus, France Geun-Sik Jo, South Korea Gonzalo Aranda-Corral, Spain Gonzalo Cerruela-Garc´ıa, Spain Greg Lee, Taiwan Gregorio Sainz-Palmero, Spain
Guillen Quintana, Spain Guna Seetharaman, USA Gwo-Jen Hwang, Taiwan Hamido Fujita, USA Hans-Werner Guesgen, New Zealand Hasan Selim, Turkey Henri Prade, France Hiroshi Okuno, Japan Hisao Ishibuchi, Japan Huey-Ming Lee, Taiwan Humberto Bustince, Spain Iris Van De Kieft, The Netherlands Ishfaq Ahmad, USA Istenes Zolt´an, Hungary Jae Oh, USA Jamal Bentahar, Canada Jan Treur, The Netherlands Janusz Kacprzyk, Poland Jason J. Jung, South Korea Jean-Charles Lamirel, France Jeffrey Saltz, USA Jeng-Shyang Pan, Taiwan Jennifer Golbeck, USA Jes´ us Aguilar, Spain Jes´ us Maudes Raedo, Spain Jing Peng, USA John Dolan, USA Jorge Romeu, USA Jos´e Francisco Diez-Pastor, Spain Juan Jos´e Rodr´ıguez-D´ıez, Spain Judy Qiu, USA Jun Hakura, Japan Jyh-Horng Chou, Taiwan Kaikhah Khosrow, USA Kaoru Hirota, Japan Katarzyna Musial, UK Kazuhiko Suzuki, Japan Kishan Mehrotra, USA Krzysztof Juszczyszyn, Poland Kurosh Madani, France Kush Varshney, USA Laszlo Monostori, Hungary Lav Varshney, USA Leszek Borzemski, Poland Ling-Jyh Chen, Taiwan
Lin-Yu Tseng, Taiwan Lipo Wang, Singapore Longbing Cao, Australia Maciej Grzenda, Poland Man-Kwan Shan, Taiwan Manton Matthews, USA Marco Valtorta, USA Mario K¨ oppen, Japan Maritza Correa, Spain Mark Hoogendoorn, The Netherlands Masaki Kuremtsu, Japan Matthijs Pontier, The Netherlands Michal Lower, Poland Michele Folgheraiter, Germany Miquel S´ anchez-Marr´e, Spain Monika Lanzenberger, Austria Nancy McCraken, USA Natalie Van Der Wal, The Netherlands Ngoc-Thanh Nguyen, Germany Nicol´ as Garc´ıa-Pedrajas, Spain Niek Wijngaards, The Netherlands Nikolay Mirenkov, Japan Oshadi Alahakoon, Australia Pascal Wiggers, The Netherlands Patrick Br´ezillon, France Paul Chung, UK Philipp Baer, Germany Prabhat Mahanti, Canada Pramod Varshney, USA RadosLaw Katarzyniak, Poland Raja Velu, USA Rajmohan M., India Rianne Van Lambalgen, The Netherlands Riichiro Mizoguchi, Japan Robbert-Jan Beun, The Netherlands Rocio Romero, Spain Rodolfo Haber, Spain Rodrigo Ventura, Portugal Rung-Ching Chen, Taiwan Ruppa Thulasiram, Canada Shaheen Fatima, UK Shie-Jue Lee, Taiwan Shogo Okada, Japan Shusaku Tsumoto, Japan
Shyi-Ming Chen, Taiwan Simone Mainai, Italy Srini Ramaswamy, USA Stefano Ferilli, Italy Sung-Bae Cho, South Korea Takayuki Ito, Japan Tetsuo Kinoshita, Japan Tibor Bosse, The Netherlands Tim Hendtlass, Australia Tim Verwaart, The Netherlands Thumrongsak Kosiyatrakul, Thailand Tiranee Achalakul, Thailand Valery Tereshko, UK Victor Rayward-Smith, UK Victor Shen, Taiwan
Additional Reviewers Chein-I Chang, USA Chun-Nan Hsu, Taiwan John Henry, USA Jozsef Vancza, Hungary Michelle Hienkelwinder, USA
Vikt´ oria Zs´ ok, Hungary Vincent S. Tseng, Taiwan Vincenzo Loia, Italy Walter Potter, USA Wei-Shinn Ku, USA Wen-Juan Hou, Taiwan Wilco Verbeeten, Spain Yasser Mohammad, Egypt Ying Han, Spain Yo-Ping Huang, Taiwan Youngchul Bae, South Korea Yu-Bin Yang, Canada Yukio Ohsawa, Japan Zia Ul-Qayyum, Pakistan Zsolt-Janos Viharos, Hungary
Table of Contents – Part II
Section 1: Cognitive Computing Facets in Intelligent Interaction Environmental Sound Recognition for Robot Audition Using Matching-Pursuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Nobuhide Yamakawa, Toru Takahashi, Tetsuro Kitahara, Tetsuya Ogata, and Hiroshi G. Okuno Cognitive Aspects of Programming in Pictures . . . . . . . . . . . . . . . . . . . . . . . Yutaka Watanobe, Rentaro Yoshioka, and Nikolay Mirenkov
1
11
An Approach for Smoothly Recalling the Interrupted Tasks by Memorizing User Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kohei Sugawara and Hamido Fujita
21
Implementing an Efficient Causal Learning Mechanism in a Cognitive Tutoring Agent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Usef Faghihi, Philippe Fournier-Viger, and Roger Nkambou
27
Model Checking Commitment Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mohamed El-Menshawy, Jamal Bentahar, and Rachida Dssouli Mobile Architecture for Communication and Development of Applications Based on Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Luis M. Soria-Morillo, Juan A. Ortega-Ram´ırez, ´ Luis Gonz´ alez-Abril, and Juan A. Alvarez-Garc´ ıa A Simplified Human Cognitive Approach for Supporting Crowd Modeling in Tunnel Fires Emergency Simulation . . . . . . . . . . . . . . . . . . . . . Enrico Briano, Roberto Mosca, Roberto Revetria, and Alessandro Testa Model Checking Epistemic and Probabilistic Properties of Multi-agent Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wei Wan, Jamal Bentahar, and Abdessamad Ben Hamza Modeling Users of Crisis Training Environments by Integrating Psychological and Physiological Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Gabriella Cortellessa, Rita D’Amico, Marco Pagani, Lorenza Tiberio, Riccardo De Benedictis, Giulio Bernardi, and Amedeo Cesta Personality Estimation Based on Weblog Text Classification . . . . . . . . . . . Atsunori Minamikawa and Hiroyuki Yokoyama
37
48
58
68
79
89
Design of an Optimal Automation System: Finding a Balance between a Human’s Task Engagement and Exhaustion . . . . . . . . . . . . . . . . . . . . . . . Michel Klein and Rianne van Lambalgen
98
A Cognitive Agent Model Using Inverse Mirroring for False Attribution of Own Actions to Other Agents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jan Treur and Muhammad Umair
109
Explaining Negotiation: Obtaining a Shared Mental Model of Preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Iris van de Kieft, Catholijn M. Jonker, and M. Birna van Riemsdijk
120
A Computational Model of Habit Learning to Enable Ambient Support for Lifestyle Change . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Michel C.A. Klein, Nataliya Mogles, Jan Treur, and Arlette van Wissen
130
Section 2: Applications of Intelligent Systems An Intelligent Method to Extract Characters in Color Document with Highlight Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chun-Ming Tsai
143
Development of Technological System Structure for Threaded Connections Assembly under Conditions of Uncertainty . . . . . . . . . . . . . . . Roman Chumakov
153
Automatic Vehicle Identification by Plate Recognition for Intelligent Transportation System Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kaushik Deb, My Ha Le, Byung-Seok Woo, and Kang-Hyun Jo
163
Intelligent Page Recommender Agents: Real-Time Content Delivery for Articles and Pages Related to Similar Topics . . . . . . . . . . . . . . . . . . . . . . . . Robin M.E. Swezey, Shun Shiramatsu, Tadachika Ozono, and Toramatsu Shintani Meta-learning Based Optimization of Metabolic Pathway Data-Mining Inference System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tom´ as V. Arredondo, Wladimir O. Ormaz´ abal , Diego C. Candel, and Werner Creixell
173
183
Section 3: Optimization, Scheduling, and Routing Multiple Pickup and Delivery TSP with LIFO and Distance Constraints: A VNS Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xiang Gao, Andrew Lim, Hu Qin, and Wenbin Zhu
193
Distributed Learning with Biogeography-Based Optimization . . . . . . . . . . Carre Scheidegger, Arpit Shah, and Dan Simon Scheduling a Single Robot in a Job-Shop Environment through Precedence Constraint Posting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Daniel D´ıaz, M. Dolres R-Moreno, Amendo Cesta, Angelo Oddi, and Riccardo Rasconi
203
216
An Intelligent Framework to Online Bin Packing in a Just-In-Time Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sergey Polyakovskiy and Rym M’Hallah
226
A Greedy Heuristic for Airline Crew Rostering: Unique Challenges in a Large Airline in China . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Qiao Chen, Andrew Lim, and Wenbin Zhu
237
Optimal Algorithms for Two-Dimensional Box Placement Problems . . . . Wenbin Zhu, Wee-Chong Oon, Yujian Weng, and Andrew Lim
246
An Algorithm for the Freight Allocation Problem with All-Units Quantity-Based Discount . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xiang Gao, Andrew Lim, Wee-Chong Oon, and Hu Qin
256
A Distributed, Heterogeneous, Target-Optimized Operating System for a Multi-robot Search and Rescue Application . . . . . . . . . . . . . . . . . . . . . . . . Karl Muecke and Brian Powell
266
A Heuristic for the Multiple Container Loading Cost Minimization Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chan Hou Che, Weili Huang, Andrew Lim, and Wenbin Zhu
276
A Skyline-Based Heuristic for the 2D Rectangular Strip Packing Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lijun Wei, Andrew Lim, and Wenbin Zhu
286
Real-Time Resource Allocation Co-processor . . . . . . . . . . . . . . . . . . . . . . . . Stuart W. Card A Hybrid Search Strategy to Enhance Multiple Objective Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Li Ma and Babak Forouraghi
296
302
Section 4: Nature Inspired Optimization – Foundations and Applications Forest Planning Using Particle Swarm Optimization with a Priority Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Philip W. Brooks and Walter D. Potter
312
Fuzzy Robot Controller Tuning with Biogeography-Based Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . George Thomas, Paul Lozovyy, and Dan Simon Development of a Genetic Fuzzy Controller for an Unmanned Aerial Vehicle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yan Qu, Swetha Pandhiti, Kalesha S. Bullard, Walter D. Potter, and Karl F. Fezer
319
328
Toward Evolving Self-organizing Software Systems: A Complex System Point of View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Liguo Yu, David Threm, and Srini Ramaswamy
336
Evolving Efficient Sensor Arrangement and Obstacle Avoidance Control Logic for a Miniature Robot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Muthukumaran Chandrasekaran, Karthik Nadig, and Khaled Rasheed
347
Section 5: Chemoinformatic and Bioinformatic Methods, Algorithms, and Applications Feature Selection for Translation Initiation Site Recognition . . . . . . . . . . . Aida de Haro-Garc´ıa, Javier P´erez-Rodr´ıguez, and Nicol´ as Garc´ıa-Pedrajas
357
DTP: Decision Tree-Based Predictor of Protein Contact Map . . . . . . . . . . Cosme Ernesto Santiesteban-Toca and Jesus Salvardor Aguilar-Ruiz
367
Translation Initiation Site Recognition by Means of Evolutionary Response Surfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Rafael del Castillo-Gomariz and Nicol´ as Garc´ıa-Pedrajas An Evolutionary Algorithm for Gene Structure Prediction . . . . . . . . . . . . . Javier P´erez-Rodr´ıguez and Nicol´ as Garc´ıa-Pedrajas Prediction of Drug Activity Using Molecular Fragments-Based Representation and RFE Support Vector Machine Algorithm . . . . . . . . . . Gonzalo Cerruela Garc´ıa, Irene Luque Ruiz, and ´ Miguel Angel G´ omez-Nieto
376 386
396
Section 6: Neural Network, Classification, and Diagnosis A Hybrid Video Recommendation System Using a Graph-Based Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ¨ urk and Nihan Kesim Cicekli Gizem Ozt¨ A Diagnostic Reasoning Approach to Defect Prediction . . . . . . . . . . . . . . . Rui Abreu, Alberto Gonzalez-Sanchez, and Arjan J.C. van Gemund
406 416
Multiple Source Phoneme Recognition Aided by Articulatory Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mark Kane and Julie Carson-Berndsen Plan Recommendation for Well Engineering . . . . . . . . . . . . . . . . . . . . . . . . . Richard Thomson, Stewart Massie, Susan Craw, Hatem Ahriz, and Ian Mills Lung Cancer Detection Using Labeled Sputum Sample: Multi Spectrum Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kesav Kancherla and Srinivas Mukkamala
426 436
446
Section 7: Neural Network and Control Improvement of Building Automation System. . . . . . . . . . . . . . . . . . . . . . . . Mark Sh. Levin, Aliaksei Andrushevich, and Alexander Klapproth
459
Efficient Load Balancing Using the Bees Algorithm . . . . . . . . . . . . . . . . . . . Anabela Moreira Bernardino, Eug´enia Moreira Bernardino, Juan Manuel S´ anchez-P´erez, Juan Antonio G´ omez-Pulido, and Miguel Angel Vega-Rodr´ıguez
469
Predicting the Distribution of Thermal Comfort Votes . . . . . . . . . . . . . . . . Anika Schumann and Nic Wilson
480
Section 8: Agents, Game Theory, and Bidding Strategic Bidding Methodology for Electricity Markets Using Adaptive Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tiago Pinto, Zita Vale, F´ atima Rodrigues, Hugo Morais, and Isabel Pra¸ca Compromising Strategy Based on Estimated Maximum Utility for Automated Negotiation Agents Competition (ANAC-10) . . . . . . . . . . . . . . Shogo Kawaguchi, Katsuhide Fujita, and Takayuki Ito Negotiating Privacy Preferences in Video Surveillance Systems . . . . . . . . . Mukhtaj Singh Barhm, Nidal Qwasmi, Faisal Z. Qureshi, and Khalil el-Khatib
490
501 511
The Bayesian Pursuit Algorithm: A New Family of Estimator Learning Automata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xuan Zhang, Ole-Christoffer Granmo, and B. John Oommen
522
A Two-Armed Bandit Based Scheme for Accelerated Decentralized Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ole-Christoffer Granmo and Sondre Glimsdal
532
Specification of Interlevel Relations for Agent Models in Multiple Abstraction Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jan Treur
542
Section 9: Cognition, Emotion, Psychology, and Beliefs An Argumentation Framework for Deriving Qualitative Risk Sensitive Preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wietske Visser, Koen V. Hindriks, and Catholijn M. Jonker Agent-Based Analysis of Patterns in Crowd Behaviour Involving Contagion of Mental States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tibor Bosse, Mark Hoogendoorn, Michel C.A. Klein, Jan Treur, and C. Natalie van der Wal Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
556
566
579
Table of Contents – Part I
Section 1: Incremental Clustering and Novelty Detection Techniques and Their Application to Intelligent Analysis of Time Varying Information Classification Model for Data Streams Based on Similarity . . . . . . . . . . . . Dayrelis Mena Torres, Jes´ us Aguilar Ruiz, and Yanet Rodr´ıguez Sarabia Comparison of Artificial Neural Networks and Dynamic Principal Component Analysis for Fault Diagnosis . . . . . . . . . . . . . . . . . . . . . . . . . . . . Juan Carlos Tud´ on-Mart´ınez, Ruben Morales-Menendez, Luis Garza-Casta˜ n´ on, and Ricardo Ramirez-Mendoza Comparative Behaviour of Recent Incremental and Non-incremental Clustering Methods on Text: An Extended Study . . . . . . . . . . . . . . . . . . . . Jean-Charles Lamirel, Raghvendra Mall, and Mumtaz Ahmad
1
10
19
Section 2: Bayesian and Probabilistic Networks Fault Diagnosis in Power Networks with Hybrid Bayesian Networks and Wavelets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Luis Eduarda Garza Casta˜ n´ on, Deneb Robles Guill´en, and Ruben Morales-Menendez
29
Learning Temporal Bayesian Networks for Power Plant Diagnosis . . . . . . Pablo Hernandez-Leal, L. Enrique Sucar, Jesus A. Gonzalez, Eduardo F. Morales, and Pablo H. Ibarguengoytia
39
On the Fusion of Probabilistic Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . Salem Benferhat and Faiza Titouna
49
Section 3: Methodologies Basic Object Oriented Genetic Programming . . . . . . . . . . . . . . . . . . . . . . . . Tony White, Jinfei Fan, and Franz Oppacher
59
Inferring Border Crossing Intentions with Hidden Markov Models . . . . . . Gurmeet Singh, Kishan. G. Mehrotra, Chilukuri K. Mohan, and Thyagaraju Damarla
69
A Framework for Autonomous Search in the Ecli pse Solver . . . . . . . . . . . . Broderick Crawford, Ricardo Soto, Mauricio Montecinos, Carlos Castro, and Eric Monfroy
79
Multimodal Representations, Indexing, Unexpectedness and Proteins . . . Eric Paquet and Herna Lydia Viktor
85
A Generic Approach for Mining Indirect Association Rules in Data Streams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wen-Yang Lin, You-En Wei, and Chun-Hao Chen Status Quo Bias in Configuration Systems . . . . . . . . . . . . . . . . . . . . . . . . . . Monika Mandl, Alexander Felfernig, Juha Tiihonen, and Klaus Isak Improvement and Estimation of Prediction Accuracy of Soft Sensor Models Based on Time Difference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hiromasa Kaneko and Kimito Funatsu Network Defense Strategies for Maximization of Network Survivability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Frank Yeong-Sung Lin, Hong-Hsu Yen, Pei-Yu Chen, and Ya-Fang Wen PryGuard: A Secure Distributed Authentication Protocol for Pervasive Computing Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chowdhury Hasan, Mohammad Adibuzzaman, Ferdaus Kawsar, Munirul Haque, and Sheikh Iqbal Ahamed
95 105
115
125
135
Section 4: Feature Extraction, Discretization, Clustering, Quantization, and Data Refinement A Global Unsupervised Data Discretization Algorithm Based on Collective Correlation Coefficient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . An Zeng, Qi-Gang Gao, and Dan Pan A Heuristic Data-Sanitization Approach Based on TF-IDF . . . . . . . . . . . . Tzung-Pei Hong, Chun-Wei Lin, Kuo-Tung Yang, and Shyue-Liang Wang Discovering Patterns for Prognostics: A Case Study in Prognostics of Train Wheels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chunsheng Yang and Sylvain L´etourneau
146 156
165
Section 5: Applications of Artificial Intelligence Automating the Selection of Stories for AI in the News . . . . . . . . . . . . . . . Liang Dong, Reid G. Smith, and Bruce G. Buchanan
176
Diagnosability Study of Technological Systems . . . . . . . . . . . . . . . . . . . . . . . Michel Batteux, Philippe Dague, Nicolas Rapin, and Philippe Fiani Using Ensembles of Regression Trees to Monitor Lubricating Oil Quality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Andres Bustillo, Alberto Villar, Eneko Gorritxategi, Susana Ferreiro, and Juan J. Rodr´ıguez
186
199
Section 6: Image Processing and Other Applications Image Region Segmentation Based on Color Coherence Quantization . . . Guang-Nan He, Yu-Bin Yang, Yao Zhang, Yang Gao, and Lin Shang
207
Image Retrieval Algorithm Based on Enhanced Relational Graph . . . . . . Guang-Nan He, Yu-Bin Yang, Ning Li, and Yao Zhang
220
Prediction-Oriented Dimensionality Reduction of Industrial Data Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Maciej Grzenda Informative Sentence Retrieval for Domain Specific Terminologies . . . . . . Jia-Ling Koh and Chin-Wei Cho
232
242
Section 7: Intelligent Techniques for Document Processing Factoring Web Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . David W. Embley, Mukkai Krishnamoorthy, George Nagy, and Sharad Seth
253
Document Analysis Research in the Year 2021 . . . . . . . . . . . . . . . . . . . . . . . Daniel Lopresti and Bart Lamiroy
264
Markov Logic Networks for Document Layout Correction . . . . . . . . . . . . . Stefano Ferilli, Teresa M.A. Basile, and Nicola Di Mauro
275
Extracting General Lists from Web Documents: A Hybrid Approach . . . . Fabio Fumarola, Tim Weninger, Rick Barber, Donato Malerba, and Jiawei Han
285
Section 8: Modeling and Support of Cognitive and Affective Human Processes Towards a Computational Model of the Self-attribution of Agency . . . . . Koen Hindriks, Pascal Wiggers, Catholijn Jonker, and Willem Haselager
295
An Agent Model for Computational Analysis of Mirroring Dysfunctioning in Autism Spectrum Disorders . . . . . . . . . . . . . . . . . . . . . . . Yara van der Laan and Jan Treur Multi-modal Biometric Emotion Recognition Using Classifier Ensembles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ludmila I. Kuncheva, Thomas Christy, Iestyn Pierce, and Sa’ad P. Mansoor Towards a Fully Computational Model of Web-Navigation . . . . . . . . . . . . . Saraschandra Karanam, Herre van Oostendorp, and Bipin Indurkhya
306
317
327
Section 9: Robotics and Navigation Stairway Detection Based on Single Camera by Motion Stereo . . . . . . . . . Danilo C´ aceres Hern´ andez, Taeho Kim, and Kang-Hyun Jo Robot with Two Ears Listens to More than Two Simultaneous Utterances by Exploiting Harmonic Structures . . . . . . . . . . . . . . . . . . . . . . . Yasuharu Hirasawa, Toru Takahashi, Tetsuya Ogata, and Hiroshi G. Okuno Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
338
348
359
Environmental Sound Recognition for Robot Audition Using Matching-Pursuit Nobuhide Yamakawa1 , Toru Takahashi1 , Tetsuro Kitahara2, Tetsuya Ogata1 , and Hiroshi G. Okuno1 1
Graduate School of Informatics, Kyoto University, Kyoto, Japan {nyamakaw,tall,ogata,okuno}@kuis.kyoto-u.ac.jp http://winnie.kuis.kyoto-u.ac.jp/members/nyamakaw 2 Department of Computer Science and System Analysis, College of Humanities and Sciences, Nihon University, Tokyo, Japan
[email protected] http://virgo.cssa.chs.nihon-u.ac.jp/~ kitahara/index-e.html
Abstract. Our goal is to achieve a robot audition system that is capable of recognizing multiple environmental sounds and making use of them in human-robot interaction. The main problems in environmental sound recognition in robot audition are: (1) recognition under a large amount of background noise including the noise from the robot itself, and (2) the necessity of robust feature extraction against spectrum distortion due to separation of multiple sound sources. This paper presents the environmental recognition of two sound sources fired simultaneously using matching pursuit (MP) with the Gabor wavelet, which extracts salient audio features from a signal. The two environmental sounds come from different directions, and they are localized by multiple signal classification and, using their geometric information, separated by geometric source separation with the aid of measured head-related transfer functions. The experimental results show the noise-robustness of MP although the performance depends on the properties of the sound sources. Keywords: Environmental sound recognition, Matching pursuit, Robot audition, Computational auditory scene analysis.
1 Introduction
The physical control technology for robots has progressed remarkably in recent years. However, their perceptive ability to understand the real world, especially its auditory scenes, is still in its infancy. For this reason, a number of researchers have been developing the technologies, namely robot audition, that enable robots to understand real auditory scenes [1,2,3]. The most important feature in robot audition is for robots to be able to recognize auditory scenes with their own ears, that is, with the microphones installed on the robots themselves. Although various spoken dialogue robots have been developed, most of them receive sound only from speakers in a wearable headset in order to eliminate the influence of K.G. Mehrotra et al. (Eds.): IEA/AIE 2011, Part II, LNAI 6704, pp. 1–10, 2011. c Springer-Verlag Berlin Heidelberg 2011
background noise as much as possible [4,5,6]. In contrast, robot audition research aims at the use of microphones built into the body of robots so that they can detect the signals from unspecified sound sources. However, this feature makes the problem more difficult. First, a variety of sounds sometimes occur simultaneously. When multiple sounds are observed simultaneously, the characteristics of each sound often interfered with each other, so the recognition of these sounds is quite degraded. Second, there is a long distance between sound sources and the microphone in general. When sound sources are far from the microphone, the observed signals may become reverberant and have a low signal-to-noise ratio. Due to these difficulties, robot audition usually involves three elemental technologies: sound source localization, sound stream separation, and sound source recognition. Although these technologies were developed independently, attempts to integrate them have recently been made. For example, the robot audition software, HARK [7], has been developed for that purpose. Although the sounds presented in the world are obviously not limited to human speech, most robot audition studies have focused only on speech signals. When HARK detects non-speech sounds, for example, it ignores those sounds, and does not pass them to the recognition stage. Although rejecting non-speech sounds may be a possible solution for spoken dialogue robots, we believe it is not sufficient because such sounds may also contain relevant information. In human-human conversation, non-speech signals existing in scenes can indicate its meaningful sign strongly related to the environment. For example, if an alarm rings while people are chatting, one of the speakers might say “what is that disturbing sound?” and the other(s) would then identify the sound as coming from the alarm over there. Finally, one of them might stop the alarm and the chat would resumes. In order to achieve this kind of interaction between humans and robots, robots need to have the ability to recognize such non-speech sounds (environmental sounds). There are various studies on environmental sound recognition indirectly related to robot audition. Ikeda [8] and Jahns [9] developed a system to recognize psychological conditions of animals by means of audio features of their voices, e.g., linear predictive coding (LPC), and pattern recognition approaches, e.g., hidden-Markov model (HMM). To make mobile devices be capable of recognizing background sound and aware of the scene they are in, the research about audio context recognition has been carried out by several researchers [10,11]. Another main interest in environmental sound recognition is discriminating abnormal sounds from other environmental sounds. Here, the word “abnormal” refers to unknown sounds indicating unusual or emergency situations. Some security applications, e.g., home-security robots [12], are based on this classification. Although some of them are aimed at applications for mobile robots, those studies were not conducted on the assumption that in the real environment multiple sounds come from various directions with a large amount of background noise or reverberation. To achieve environmental sound recognition under such conditions, the ability to localize objective sound sources, separate them from background noise and identify them is required.
To evaluate the functionality of the robot audition technologies in the situation where a robot has to recognize various sound events simultaneously under noisy environments, we conducted experiments on the identification of dual sound events with added background noise, employing matching pursuit (MP) and the sound-source localization and separation techniques used in HARK. An eight-channel microphone array embedded on a semi-spherical robot head is used to localize sound sources and separate two environmental sounds occurring simultaneously. Each separated signal is then identified using MP with Gaussian mixture models (GMMs) or HMMs. The identification performance of MP is compared with mel-frequency cepstral coefficients (MFCCs), which are the most widely used features in speech recognition. The contents of this paper are organized as follows: In Section 2, the problems in environmental sound recognition for robot audition systems are summarized, and the properties and theory of MP are explained briefly. In Section 3, the details of our experiments are described, and a conclusion and our planned future work are given in Section 4.
2 Environmental Sound Recognition by Robot Audition
There are two types of environmental sounds. The first type is what can be called sound events, that is, short sounds that indicate certain events or that something is occurring. For example, the sound of a doorknob indicates that someone is entering or leaving the room, and a chime indicates a certain time (e.g., the start of a class). A feature of this kind of sound is that durations are very short, and short durations may result in localization errors. The second type is relatively stationary background noise. Examples are the noise made by appliances such as air conditioners and the raucous atmosphere of a bar. Whereas the noise of air conditioners is nothing more than noise in general, the noise of a raucous atmosphere could be a cue for identifying where the robot is. The background noise also includes the noise produced by the robot itself. Because the two types of sounds usually occur simultaneously, robots have to identify these sounds after separating them. Based on the discussion above, we assume the following conditions in our environmental sound recognition task:
1. Sound events occur under some background noise with various signal-to-noise ratios (SNRs) (including 0 dB).
2. White noise is used as background noise to simulate the fan noise of the robot itself. Ideally we should deal with various kinds of background noise, including that in bars as discussed above, but we leave them as a future task.
3. The sounds are observed through the eight-channel microphone array installed in the body of the robot.
2.1 Issues in Feature Extraction of Environmental Sounds
In our task, observed sounds are mixtures of sound events and background noise. The difficulty of this task lies in the fact that both signals and noise are
environmental sounds. This means that the acoustical characteristics of signals cannot be exploited as cues in separating signals from noise. Whereas the harmonic structure of human speech is widely used as a cue in separating speech signals from noise, the harmonic structure cannot be assumed for environmental sounds. This means that sound source separation is extremely difficult, so we have to design a feature extraction method under the assumption that the signals and noise cannot be sufficiently separated in the source separation phase. From this point of view, mel-frequency cepstral coefficients (MFCCs), which are the most common audio features in conventional sound recognition methods, are considered unsuitable. This is because MFCCs approximate the spectral envelope of a signal with the same weights for all frequencies, and do not discriminate between the signal and the noise. Compared to background noise, the energy of sound events tends to be concentrated locally in the time-frequency plane. In addition, sound events are expected to be presented more saliently than background noise because sound events are meant to convey information to someone who hears them. A mechanism for extracting salient components from the time-frequency representation is therefore required for achieving noise-robust feature extraction.
2.2 Feature Extraction Using Matching Pursuit
We use matching pursuit (MP) as an audio feature extraction algorithm to spot salient components of a signal in the time-frequency domain. The algorithm extracts signal components from a given signal in order of the highest energy. The extracted components are expressed as atoms defined in an arbitrary base dictionary, and thus the signal is approximated as a linear sum of the atoms ordered according to energy. The number of extractions is also arbitrary; therefore, by selecting the number carefully, the algorithm can extract salient signal components in the time-frequency domain, and noise robustness for audio features is expected. We obtain an approximation of a given signal s as a linear sum of m atoms, φγ1, ..., φγm, where m is an arbitrary number; then s can be expressed as:

s = \sum_{i=1}^{m} \alpha_{\gamma_i} \phi_{\gamma_i} + R^{(m)} .   (1)

Note that R^{(m)} represents the residual signal, and the atoms are selected from the dictionary D = {φγ1, ..., φγm'}, which contains m' (≥ m) atoms. The extraction procedure is as follows:
1. Calculate the correlation of s with all the atoms in D.
2. Extract the atom with the maximum correlation, φγ1, from s, together with its correlation coefficient αγ1.
3. Apply the same procedure as in step 1 to R^{(1)} = s − αγ1 φγ1, and αγ2 φγ2 is obtained.
4. Iterate the procedure above until m atoms have been extracted.
Each atom may have parameters such as width, frequency and amplitude, depending on the type of atom. Accordingly, these parameters can be used as
Fig. 1. Signal extraction by matching-pursuit with the Gabor wavelet
feature vectors. The computational complexity of the extraction increases linearly with respect to m. Therefore, if the dictionary is designed effectively and the meaningful features are gained with a small number of m, the complexity does not noticeably increase. More theoretical details are described in [13]. Robustness against background noise. Figure 1 illustrates the MP extraction of high energy components (i.e., salient audio features) from an audio signal immersed in a Gaussian noise signal. The signal is a chime ringing sound which has a clear harmonic structure. Its energy is concentrated at around 3500 Hz. The left figure shows the spectrogram of the original signal, which is dry and contains no background noise (i.e., no reverberation or acoustical coloration). White noise with the SNR of 0 dB is added to the original signal, and its spectrogram is plotted in the figure at the center. MP with the Gabor wavelet extracts 48 Gabor atoms from the entire noisy signal (i.e., not a frame-wise analysis) and finally, the signal is reconstructed from the extracted atoms. The reconstructed signal is presented in the right figure. In the center figure, the spectra are almost masked by the noise at every analysis frame. If we extract audio features based on the Fourier transform from the noisy signal, the values of the feature differ substantially from those in the original signal. Therefore, audio feature extraction methods using short-time Fourier transform (STFT), which is a popular technique in the conventional sound recognition scheme, may perform poorly in the presence of such large background noise. The point here is that the reconstructed signal shown in the right figure contains only high energy components around 3500 Hz and no background noise. Thus, if the sustaining peak of spectrum power at 3500 Hz largely characterizes the signal, it can be said that MP succeeded in extracting audio features while eliminating the influence of background noise in this example. This property contributes to attaining noise-robust audio features.
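As a concrete illustration of this procedure, the following NumPy sketch builds a small dictionary of Gabor atoms and greedily extracts the highest-energy components from a noisy test signal, in the spirit of Eq. (1) and Fig. 1. It is only a rough sketch that loosely echoes the settings of Section 3 (16 kHz sampling, 48 extracted atoms, 16-sample time shifts); the test signal and the dictionary grid of atom widths and carrier frequencies are illustrative choices, not the ones used in the experiments.

```python
import numpy as np

def gabor_atom(n, center, width, freq):
    """Unit-norm Gabor atom: Gaussian envelope times a cosine carrier (freq in cycles/sample)."""
    t = np.arange(n)
    g = np.exp(-0.5 * ((t - center) / width) ** 2) * np.cos(2 * np.pi * freq * t)
    return g / np.linalg.norm(g)

def build_dictionary(n, widths, freqs, step):
    """Enumerate atoms over a grid of time shifts, widths and carrier frequencies."""
    atoms, params = [], []
    for w in widths:
        for f in freqs:
            for c in range(0, n, step):
                atoms.append(gabor_atom(n, c, w, f))
                params.append((c, w, f))
    return np.array(atoms), params

def matching_pursuit(signal, atoms, params, m):
    """Greedily extract m atoms; each feature is (center, width, freq, coefficient)."""
    residual = signal.astype(float)
    features = []
    for _ in range(m):
        corr = atoms @ residual                    # correlation with every atom
        k = int(np.argmax(np.abs(corr)))           # most salient atom
        features.append(params[k] + (corr[k],))
        residual = residual - corr[k] * atoms[k]   # peel it off the residual
    return features, residual

if __name__ == "__main__":
    fs, n = 16000, 1024
    t = np.arange(n) / fs
    clean = np.exp(-30 * t) * np.sin(2 * np.pi * 3500 * t)   # decaying 3500 Hz tone
    noisy = clean + np.random.randn(n) * clean.std()         # white noise, SNR of roughly 0 dB
    atoms, params = build_dictionary(
        n, widths=[16, 64, 256],
        freqs=[f / fs for f in (500, 1000, 2000, 3500, 7000)], step=16)
    feats, residual = matching_pursuit(noisy, atoms, params, m=48)
    print("energy captured by 48 atoms: %.1f%%"
          % (100 * (1 - np.linalg.norm(residual) ** 2 / np.linalg.norm(noisy) ** 2)))
```

Because the atoms are unit-norm, subtracting the most correlated atom at each step removes the largest possible amount of residual energy, which is why the few retained atoms tend to describe the salient event rather than the broadband noise.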
3 Experiments
The experimental results of environmental sound identification are presented in this section. Practically, a robot surrounded by various sound sources and background noises needs to be capable of sound source localization and separation in order to recognize each sound source individually. The objective of our experiments is to investigate the effectiveness of applying MP with the Gabor wavelet to environmental sounds with effects of such signal processing technologies. A
robot head is used to create head-related transfer functions (HRTFs) for localization and separation so that the experiments involve robot audition under a more realistic condition. The procedure of our experiments was as follows:
1. make an eight-channel audio signal containing two directional environmental sounds with white noise,
2. localize and separate the sounds using HARK and obtain two monaural sound files,
3. extract audio features of MFCCs and MP/Gabor,
4. learn multi-condition acoustical models using feature vectors of clean, noised and separated signals,
5. conduct 10-fold cross validation using the separated signals as test data.
MP can be applied to the entire signal (a sound unit) or to each analytic frame of the sound. The difference in identification performance between unit-wise and frame-wise analysis is still unknown. We compare both analysis types in the second experiment.
3.1 Setup
Head-related Transfer Functions of the Robot Head. For sound source localization and separation, we used HRTFs of a real robot with an eight-channel microphone array on its semi-spherical head. The HRTFs were made from the directional impulse responses recorded in an anechoic chamber. The responses were measured at points 150 cm away from the virtual sound sources, at every 5 degrees (i.e., 72 HRTFs were made to cover 360°).

Localization and separation by HARK. We used the robot audition software HARK to conduct sound source localization and separation. The algorithms selected for localization and separation were multiple signal classification (MUSIC) [14] and geometric source separation (GSS) [15], respectively. Both are available as modules, LocalizeMUSIC and GHDSS, in HARK. To optimize the localization performance for environmental sounds, we set the parameter PERIOD, which represents the number of frames used in the calculation of correlation matrices, to 20, as environmental sounds include impulsive sounds and hence require a short period of localization. Additionally, since the bandwidth of environmental sounds is generally broader than that of speech signals, LOWER_BOUND_FREQUENCY and UPPER_BOUND_FREQUENCY were changed from "500 and 2800 Hz" to "125 and 7900 Hz" so that the system could handle a wider frequency-band range. Pilot tests indicated that the localization performance for environmental sounds was improved by this parameter setup for the modules.

Specification of sound classes, audio features, and classifiers. Table 1 summarizes the specification of the audio features, classifiers, and sound classes as well as the training and testing combination. The audio signals used in the experiments were excerpted from the "Sound scene database in real acoustical environments" created by the Real World Computing Partnership (RWCP) [16].
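Before turning to Table 1, the MUSIC step just described can be illustrated in a few lines of NumPy: the spatial correlation matrix of the multichannel spectrum at one frequency bin is eigen-decomposed, and directions whose steering vectors are most nearly orthogonal to the noise subspace produce peaks in the pseudo-spectrum. The sketch below is a generic free-field illustration with an idealized circular eight-microphone array; it is not the LocalizeMUSIC implementation, which uses the measured HRTFs as steering vectors and integrates over frequency bins, and the array radius, frequency, noise level and source directions are all assumptions made for the example.

```python
import numpy as np

def steering_vector(mic_xy, angle_rad, freq, c=343.0):
    """Free-field steering vector of a planar array for a far-field source."""
    direction = np.array([np.cos(angle_rad), np.sin(angle_rad)])
    delays = mic_xy @ direction / c                      # per-microphone delay [s]
    return np.exp(-2j * np.pi * freq * delays)

def music_spectrum(snapshots, mic_xy, freq, n_sources, angles_deg):
    """snapshots: (frames, mics) complex STFT values at one frequency bin."""
    R = snapshots.conj().T @ snapshots / len(snapshots)  # spatial correlation matrix
    _, eigvec = np.linalg.eigh(R)                        # eigenvalues in ascending order
    noise_sub = eigvec[:, : R.shape[0] - n_sources]      # noise subspace
    spec = []
    for a in np.deg2rad(angles_deg):
        v = steering_vector(mic_xy, a, freq)
        spec.append(np.vdot(v, v).real / np.linalg.norm(noise_sub.conj().T @ v) ** 2)
    return np.array(spec)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mics = 8                                              # circular eight-channel array, 10 cm radius
    phi = np.linspace(0, 2 * np.pi, mics, endpoint=False)
    mic_xy = 0.1 * np.stack([np.cos(phi), np.sin(phi)], axis=1)
    freq, frames = 2000.0, 200
    a0 = steering_vector(mic_xy, np.deg2rad(0), freq)     # source in front
    a45 = steering_vector(mic_xy, np.deg2rad(45), freq)   # source 45 degrees off
    s = rng.standard_normal((frames, 2)) + 1j * rng.standard_normal((frames, 2))
    x = s[:, [0]] * a0 + s[:, [1]] * a45
    x += 0.05 * (rng.standard_normal(x.shape) + 1j * rng.standard_normal(x.shape))
    angles = np.arange(0, 360, 5)                         # 5-degree grid, as in the HRTF set
    p = music_spectrum(x, mic_xy, freq, n_sources=2, angles_deg=angles)
    peaks = np.where((p > np.roll(p, 1)) & (p > np.roll(p, -1)))[0]
    top = peaks[np.argsort(p[peaks])[-2:]]
    print("estimated directions (deg):", sorted(angles[top].tolist()))
```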
Table 1. Experimental setup (sound classes, audio features, classifiers)

Audio features:
- MFCC: MFCC(12) + ΔMFCC(12) + Δpower(1); window width = 25 msec; shift rate = 10 msec
- MP/Gabor (frame): window width = same as MFCC; base width = 2-256 samples; time-shift resolution = 16 samples; number of extracted atoms = 48
- MP/Gabor (unit): base width = 2-4096 samples; time-shift resolution = 16 samples; number of extracted atoms = 48

Classifiers:
- MFCC: GMM (4-mix.) or HMMs (6 hidden states)
- MP/Gabor: GMM for both frame and unit; HMMs only for "frame"

Sound classes: one source at 0 degrees (in front) and the other at 45 degrees off; chime, handclap, clock-alarm, coin-clinks, glass-cups, doorknob, metal-plate, phone-beep, bell-ringing, metal-bin

Learning set: original signals (clean) + noised signals (SNR = 0 dB) + noised separated signals = 900 samples

Test set: separated signals = 100 samples
We selected 10 sound classes sampled at 16 kHz with a time length of 0.7 sec on average from the database. All classes were impulsive sounds (i.e., fast-decaying) except clock-alarm and phone-beep. The number of MP extractions and hidden states of HMM were chosen to be 48 and 6 as these numbers were found to be effective for environmental sounds in our previous study [17]. The training condition involved a combination of multi-condition learning consisting of clean signals (dry sources), the same signals with white noise (SNR = 0 dB) and the separated signals of the noised signals. Each condition contained 300 samples; therefore a training set has 900 samples in total. In the third condition, although there was only one sound source and no stream separation was necessary, we applied GSS to the samples and included them in training sets so that acoustic models could learn not only the sound classes' features but also the filter-like property of the separation process.
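A minimal sketch of the GMM branch of this classification scheme, assuming scikit-learn is available: one 4-component GMM is trained per sound class on frame-wise feature vectors pooled over the clean, noised and separated conditions, and a test sample is assigned to the class whose model yields the highest total log-likelihood. The MP/Gabor (or MFCC) front end is abstracted away, and the synthetic arrays in the demo are stand-ins for real feature vectors; the class names are taken from Table 1, but the data are not.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_class_models(training_data, n_mix=4, seed=0):
    """training_data: dict mapping class name -> list of (n_frames, n_dims) feature arrays
    pooled over the clean, noised and separated training conditions."""
    models = {}
    for label, samples in training_data.items():
        gmm = GaussianMixture(n_components=n_mix, covariance_type="diag", random_state=seed)
        models[label] = gmm.fit(np.vstack(samples))
    return models

def classify(models, features):
    """Assign the class whose GMM gives the highest total frame log-likelihood."""
    scores = {label: gmm.score_samples(features).sum() for label, gmm in models.items()}
    return max(scores, key=scores.get)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # stand-in data: 3 classes, 30 training samples each, 70 frames of 13-dim features
    train = {name: [rng.normal(loc=i, size=(70, 13)) for _ in range(30)]
             for i, name in enumerate(["chime", "doorknob", "phone-beep"])}
    models = train_class_models(train)
    test_sample = rng.normal(loc=1.0, size=(70, 13))   # should resemble the "doorknob" class
    print(classify(models, test_sample))
```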
3.2 Experiment 1: Identification Rate of MFCCs and MP
Figure 2 shows the result of the identification test performed with MFCCs and MP/Gabor with two classifier conditions (GMM and HMM of six hidden states). The vertical axis displays identification rates as percentages and the horizontal axis lists the names of the sound classes. This result is for the sounds coming from the 45◦ direction. Since the result does not dramatically differ from that of the 0◦ direction, we concluded that there was no directional dependency in the identification performance in our experiment set. Overall, our method, the frame-wise MP with the Gabor wavelet, outperformed the MFCCs, and generally, ones with HMM showed higher identification
Fig. 2. Identification rate of separated sound sources coming from 45◦ using MFCCs/GMM, MFCCs/HMMs, MP(frame)/GMM and MP(frame)/HMMs
rate except for doorknob and bell-ringing. The common property between these two classes is that they originally had peaks of spectrum power higher than 7000 Hz. Due to the LPF-like property of GSS, the power at higher frequency components is attenuated. If the original signal has large energy in the stop-band of the filter, a significant change can be seen in the feature vectors created by MP. This is because, for separated sound signals, the energy order of signal components can be changed from the original, and more atoms are extracted in the lower frequency region and fewer from the attenuated region. This results in a large difference between separated and non-separated signals in the MP feature space. Because GMM does not factor in the transition patterns of feature vectors, signal components that appear most frequently through all frames are consequently dominant features in the model. In the case of impulsive sounds generated by a collision of two materials, lower frequency components sustain longer while higher frequency ones decay faster. Because of this phenomenon, a GMM can intensively learn the lower frequency features that are commonly strong in non-separated and separated signals, without considering the frame sequence as HMMs do. Whereas some classes such as chime and glass-cups showed a high identification rate (nearly or above 90%), the others exhibited a rate around 50%. The sound groups with higher performance results have clear harmonic structures, and the states of the structures (i.e., pitch) do not vary significantly over the same sound class. The sound groups with poor performance results generally have unclear harmonic structures or variant pitch information. For example, the phone-beep sound samples consist of recorded beep sounds of various cell phones; hence, the pitch of the beeps has variations. In these cases MP features using frequency values are not consistent, and are therefore not effective audio features for these types of environmental sounds.
3.3 Experiment 2: Part or Unit?
Figure 3 presents the result of the identification test carried out by applying MP to each analytic frame or to a whole signal (i.e., a sound unit).
Fig. 3. Comparison in identification rate of frame-wise and unit-wise MP features
In most cases, the unit-wise analysis performed similarly to the frame-wise analysis with HMMs of six hidden states, but failed to identify the phone-beep and the bell-ringing due to the reason discussed in Experiment 1. The unit-wise analysis can only generate one feature vector; therefore, it cannot be classified by HMMs, so it is classified by GMM. Because a GMM models signals independently of the relationships between sequential frames, it can handle various sound classes more generally. It is remarkable that the unit-wise analysis with GMM shows almost the same performance as the frame-wise one in some sound classes. In other words, the unit-wise MP analysis can express the temporal variation of a signal in a single feature vector.
4 Conclusion
This paper presented identification tests of two environmental sounds fired simultaneously by robot audition using matching pursuit with the Gabor wavelet. The sound source localization and separation were performed by the robot audition software, HARK, and its specialized configuration for environmental sound recognition was explored. The experimental results revealed the noise robustness of MP against MFCCs noting that not all properties of environmental sounds can be accurately identified by MP. It is also found that the unit-wise analysis of MP can model temporal variations as well as the harmonic structure of a signal. Our future objectives are planned as follows: first, it can be easily imagined that a speech signal and meaningful environmental sounds are recorded simultaneously in human-robot interaction. Identification experiments that separate a human voice and environmental sounds should be conducted to determine whether the proposed method is still valid for that condition. Second, as most furniture or electrical appliances usually stay in the same position in a home, it may be useful for a robot to refer to a map of their distribution when localizing and recognizing environmental sounds. For example, with the help of a map, the robot can determine whether the input signal comes from a familiar or unknown
sound source, and then judge the abnormality of a sound event or subsequently renew the map.
References 1. Rosenthal, D.F., Okuno, H.G.: Computational auditory scene analysis. L. Erlbaum Associates Inc., Mahwah (1998) 2. Brown, G., Cooke, M.: Computational auditory scene analysis. Computer Speech and Language 8(4), 297–336 (1994) 3. Okuno, H.G., Ogata, T., Komatani, K.: Computational Auditory Scene Analysis and Its Application to Robot Audition: Five Years Experience. In: ICKS 2007, pp. 69–76 (2007) 4. Matsusaka, Y., Tojo, T., Kuota, S., Furukawa, K., Tamiya, D., Hayata, K., Nakano, Y., Kobayashi, T.: Multi-person conversation via multi-modal interface — a robot who communicates with multi-user. In: EUROSPEECH 1999, pp. 1723–1726 (1999) 5. Nishimura, R., Uchida, T., Lee, A., Saruwatari, H., Shikano, K.: Aska: Receptionist robot with speech dialogue system. In: IROS 2002, pp. 1308–1313 (2002) 6. Brooks, R., Breazeal, C., Marjanovie, M., Scassellati, B., Williamson, M.: The cog project: Building a humanoid robot. In: Computation for Metaphors, Analogy, and Agents, pp. 52–87 (1999) 7. Nakadai, K., Takahashi, T., Okuno, H.G., Nakajima, H., Hasegawa, Y., Tsujino, H.: Design and Implementation of Robot Audition System’HARK’Open Source Software for Listening to Three Simultaneous Speakers. Advanced Robotics 24 5(6), 739–761 (2010) 8. Ikeda, Y., Jahns, G., Kowalczyk, W., Walter, K.: Acoustic Analysis to Recognize Individuals and Animal Conditions. In: The XIV Memorial CIGR World Congress, vol. 8206 (2000) 9. Jahns, G.: Call recognition to identify cow conditions–A call-recogniser translating calls to text. Computers and Electronics in Agriculture 62(1), 54–58 (2008) 10. Eronen, A.J., Peltonen, V.T., Tuomi, J.T., Klapuri, A.P., Fagerlund, S., Sorsa, T., Lorho, G., Huopaniemi, J.: Audio-based context recognition. IEEE TASLP 14(1), 321–329 (2005) 11. Chu, S., Narayanan, S., Kuo, C.: Environmental sound recognition with timefrequency audio features. IEEE TASL 17(6), 1142 (2009) 12. Ntalampiras, S., Potamitis, I., Fakotakis, N.: Sound classification based on temporal feature integration. ISCCSP 2010, 1–4 (2010) 13. Mallat, S.G., Zhang, Z.: Matching pursuits with time-frequency dictionaries. IEEE TSP 41(12), 3397–3415 (1993) 14. Schmidt, R.: Multiple emitter location and signal parameter estimation. IEEE TAP 34(3), 276–280 (1986) 15. Parra, L.C., Alvino, C.V.: Geometric source separation: Mergin convolutive source separation with geometric beamforming. IEEE TSALP 10(6), 352–362 (2002) 16. Real World Computing Partnership, Rwcp sound scene database in real acoustical environments, http://tosa.mri.co.jp/sounddb/indexe.htm 17. Yamakawa, N., Kitahara, T., Takahashi, T., Komatani, K., Ogata, T., Okuno, H.G.: Effects of modelling within-and between-frame temporal variations in power spectra on non-verbal sound recognition. In: INTERSPEECH 2010, pp. 2342–2345 (2010)
Cognitive Aspects of Programming in Pictures Yutaka Watanobe, Rentaro Yoshioka, and Nikolay Mirenkov University of Aizu, Aizu-wakamatsu, Fukushima 965-8580, Japan {yutaka,rentaro,nikmir}@u-aizu.ac.jp
Abstract. Programming in pictures is an approach supported by a special programming environment in which pictures and moving pictures are systematically used for the direct representation of features of computational algorithms and data structures. Within this approach, some "data space" is traversed by a "front of computation" and the necessary operations are performed during this traversal process. There are compound pictures to define algorithmic steps (called Algorithmic CyberFrames) and generic pictures to define the contents of compound pictures. Compound pictures are assembled into special series to represent predefined algorithmic features, and a number of such series are assembled into an Algorithmic CyberFilm. The environment follows a cognitive model in which the success of users' perception, comprehension and cognition depends on interaction with at least a few different but mutually supplementing features of a picture or a set of pictures. In this paper, we survey a number of techniques that are used with programming in pictures to enhance the perception and cognition of people working on developing computational application models and corresponding algorithms. Keywords: programming in pictures, Algorithmic CyberFilm.
1 Introduction
Natural intelligence is related, in great part, to people's ability to perform mental simulation, including thinking/reasoning, problem solving, and decision making. However, this mental simulation depends not only on the internal information processing mechanisms of the brain, but also on the levels of abstraction connecting physical world objects and the corresponding abstract models. The syntax-semantics gap in object-model representations is a fundamental source of many problems related to perception and cognition. This gap is predefined by existing systems of symbols, notations and languages, as well as by ambiguous terminologies and trans-disciplinary and cross-cultural contexts. It also depends on the representation of the dynamical processes of modeling problems and on the adaptive mechanisms of working systems. To bridge the syntax-semantics gap and decrease the burden of mental simulation, modern information systems are supported by special data/knowledge acquisition, ontologies and high-level interfaces based on different mental strategies related to different decision functions [1-2]. In general, to enhance people's ability to understand the meaning of objects and processes and to perform mental simulation, there is
a necessity not only to raise the level of symbols, notations and languages, but also to update our current concept of abstraction. Some new programming languages (first of all those related to developing human-computer interfaces) try to react to the above-mentioned situation [3], but the majority of others still keep the old level of cognitive features by being "good for computers, but not so friendly for humans." Among related papers, we would like to mention work on the cognitive aspects of software design [4], where the authors look for solutions by considering answers to the following questions: is it possible to bring together, theoretically, a computer program and a natural language text, and what are the differences between programming languages and natural languages? Another paper of interest is [5], where the cognitive and computational features of diagrams are considered through a further question: why is a diagram (sometimes) worth ten thousand words? In addition, we would like to mention approaches related to analogical representations of programs (including a semiotic analysis of comics) [6], self-assembling tilings (for the creation of desired shapes) [7], and the long-term activity of researchers at Certec (Sweden) [8], where the use of pictures as a language for people with disabilities is promoted, as well as a variety of other publications where the concept of putting human needs first is applied and where the necessity of uniting scientific and artistic sides to balance the usefulness and attractiveness of technologies is also considered [9].
Our motivation is to develop a programming environment where users can feel comfortable developing new models and corresponding algorithms, and representing them in forms recognizable and understandable by other people. In this environment, the confidence of users in making decisions is based on access to previous decisions of others, on quick perception and understanding, and on transformations leading to new models and algorithms. Such opportunities are supported by the use of an open set of very-high-level algorithmic characters (super-characters), special constructs combining these characters to represent algorithmic steps, special assembly lines to define algorithmic scenes, and libraries (in fact, ontologies) to acquire such super-characters, constructs and algorithm representations. Pictures and moving pictures are applied as expressive forms of these super-characters, constructs, and lines. In many cases, pictures and animation are much more attractive for understanding an algorithm (and its features) than conventional programming text. However, there are cases where pictures are still too abstract and animation is too boring. Our approach is therefore based on multiple views and multimodal formats to take into account the different knowledge and abilities of users. Within this approach, we follow a cognitive model where the success of users' perception, comprehension and cognition depends on interaction with at least a few different but mutually supplementing features of a super-character, construct, or algorithm.
In this paper we survey a number of techniques that are used with programming in pictures to enhance the perception and cognition of people working on developing application models and corresponding algorithms. These techniques are based on special visual symbols and their compositions within compound constructs, on the use of background, foreground and auxiliary images, and on
representing scalable forms of model spaces and attribute declarations, features of algorithmic steps, and multiple views of the algorithm as a whole.
2 Programming in Pictures: A Brief Overview
As we have mentioned in the introduction, mental simulation depends not only on the internal information processing mechanisms of the brain, but also on the levels of abstraction connecting physical world objects and the corresponding abstract models. To make such simulation more efficient, it is necessary to reduce the energy and time people spend recognizing the semantics behind unknown terminology, discovering associations between objects through syntax-semantics forms based on rote memorization, and struggling with concepts and notations that tend to be over-abstracted. In other words, disorientation and cognitive overload are serious obstacles for people. They become tired after overcoming such obstacles and do not have enough energy and time left to focus on their own ideas and models and on the use of their real knowledge and cognitive abilities. Programming in pictures is a special approach to help people save their energy and time. It is supported by a special environment where pictures and moving pictures are systematically used for a more direct representation of features of computational algorithms and data structures. Within this approach, some "data space" is traversed by a "front of computation" and the necessary operations are performed during this traversal process. There are compound pictures to define algorithmic steps (called Algorithmic CyberFrames) and generic pictures to define the contents of compound pictures. Compound pictures are assembled into special series to represent predefined algorithmic features, and a number of such series are assembled into an Algorithmic CyberFilm. Fig. 1 depicts a structure scheme of the Algorithmic CyberFilm format. The first (leftmost) series of frames represents algorithmic skeletons that show space data structures and temporal schemes of computational flows on these structures. The second series shows variables and formulas (actions) that are attached to the space-time points of the algorithmic skeletons.
Fig. 1. A structure scheme of the Algorithmic CyberFilm format (four languages: Algorithmic Dynamics, Algorithmic Commands, Algorithmic interFace, Integrated View)
The third series represents input/output operations that define the algorithmic interface with the external world, as well as how a software component based on the algorithm should look from the outside. Finally, the fourth series shows a compact combination of the main features presented in the above-mentioned groups of frames. For each series of frames there is its own visual language. As a result, the programming environment for making Algorithmic CyberFilms is a cluster of four mutually supplementing languages (and supporting systems) that define four different views of each software component. Each language uses its own set of pictures and super-characters. Within this environment, application users usually do not create frames, algorithmic skeletons and related template programs, but take and assemble them from ontology-type libraries. For these users, programming is reduced to defining the sizes of the space structures, declaring variables on these structures and specifying operations on traversal nodes. Filmification of methods has been applied to a great variety of algorithms, including sequential and parallel matrix multiplications, solving algebraic and partial differential equations, cellular automaton-like algorithms, as well as algorithms on trees, pyramids, particles-in-cells, etc. (see, for example, [10-12]). A large set of algorithms on graphs and a corresponding library have been presented in [13]. In this paper we focus on some cognitive aspects of programming in pictures and on how they are supported by the environment.
3 Integrated View of a Program in Pictures
To illustrate some of the above-mentioned features of the Algorithmic CyberFilm technology, let us consider the pictures of a program for the Jacobi relaxation algorithm for Laplace's equation ∂²A/∂x² + ∂²A/∂y² = 0 on a 2D grid of N × M size with some values given on the boundary of the domain [14]. Variable B is used for keeping the new iteration values, and variables e and I are for checking the process of convergence. Variables A, B and I are attached ("affiliated") to the 2D structure specialized in presenting computation distributed in space, and e is attached to a special one-node structure specialized in presenting centralized computation. The picture below (Fig. 2(a)) is an integrated view frame (compound picture) defining the algorithm. The first row in the frame is the declaration of the space structures and the variables (including their types) attached to the structures. The computation includes three scenes for boundary value initialization, internal node value input and convergence parameter initialization, and one scene for the main part of the algorithm and the visualization of results. Each scene has its own super-character (icon) with a "caption" pointing to the operations which should be defined within the scene computation. The scene icons are depicted in the left column of the frame and request (from top to bottom) the specification of 1, 1, 1, and 4 operations, respectively. The operations are defined by the expressions presented to the right of the icons. The first scene scans the boundary nodes of the 2D grid and performs, in the visited node, the formula defined to its right (a highlighted node with a vertical/horizontal line indicates the row/column number where the visited node is located). Though this scene icon is intuitively understandable, the user can request its semantics at any moment, as shown by Fig. 2(c).
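For readers more used to conventional code, the computation encoded by this integrated view frame corresponds roughly to the following sketch of the Jacobi relaxation for Laplace's equation. This is only an illustrative reconstruction from the description above, not code produced by the environment; the names A, B and e are taken from the text, while the convergence threshold and grid sizes are assumed.

```python
import numpy as np

def jacobi_relaxation(A, eps=1e-4, max_iters=10_000):
    """Jacobi relaxation for Laplace's equation on a 2D grid.
    The border of A holds fixed boundary values; interior values are iterated."""
    A = A.astype(float).copy()
    B = A.copy()                                  # B keeps the new iteration values
    for it in range(max_iters):
        # New value of each interior node = average of its four neighbours.
        B[1:-1, 1:-1] = 0.25 * (A[:-2, 1:-1] + A[2:, 1:-1] +
                                A[1:-1, :-2] + A[1:-1, 2:])
        e = np.max(np.abs(B[1:-1, 1:-1] - A[1:-1, 1:-1]))   # convergence measure
        A, B = B, A                               # reuse the grids on the next iteration
        if e < eps:                               # centralized convergence check
            break
    return A, it + 1

# Usage on a small N x M grid with a heated top boundary.
grid = np.zeros((20, 30))
grid[0, :] = 100.0
solution, iterations = jacobi_relaxation(grid)
```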
Fig. 2. A program in the pictures of the Jacobi relaxation algorithm for Laplace's equation (panels (a)-(e))
A similar case for the fourth scene icon is provided by Fig. 2(e). In fact, in all these cases, frames of the Algorithmic Dynamics view are shown. The embedded clarity support is applicable not only to scene icons, but also to other generic and compound pictures, including the icons related to variable declaration. Fig. 2(b) depicts examples of physical meaning and measurement unit clarifications for variables A and B (representing heat in degrees Celsius), as well as unfolded forms of the neighborhood (Fig. 2(d)) where computational activity is performed. In conventional languages, declarations of variables are based on abstract data constructs where elementary types such as "integer, real, double, character, etc." play a fundamental role. Usually, considerable effort is needed to imagine and understand such a construct. For example, to represent a multi-grid (pyramid) construct, a number of 2D arrays of real or double type representing the different layers of the pyramid are declared. This declaration does not display an image of the space structure and does not provide any information about the physical meaning of the application attributes and the units of their measurement. In our programming environment, special images are used as super-characters or as embedded clarity annotations which can be displayed on the user's demand. Fig. 3 depicts a set of super-characters used for the space structure declarations, and Fig. 4 depicts a set of supportive images used for embedded clarity annotations. The supportive images of Fig. 4 represent a physical attribute name, the corresponding unit of measurement and an icon hinting at the attribute's meaning.
Fig. 3. A set of super-characters used for the space structure declarations (structures parameterized by sizes such as N, M, L and H)
Fig. 4. A set of supportive images used for embedded clarity annotations (physical attribute names such as Time, Distance, Temperature, Mass, Frequency, Velocity, Area, Volume, Resistance, Voltage, Angle, Wavelength, Currency, Bandwidth, Pressure and Current, each with its unit of measurement and a hint icon)
The super-characters represent space structures (related to the user's model) whose nodes and/or edges are used for the allocation of application attributes (variables). In fact, the attribute elements obtain a "space affiliation." Through such structures the user can see more directly the relations between his/her model and the variables involved, and visually check these relations. The structures represent scalable spaces whose sizes are defined by the users; this control of scalability is performed by assigning the necessary values to parameters presented by the super-characters. For example, the leftmost super-character in Fig. 3 declares a 3D space structure whose size is defined by the values assigned to M, N, and L. In a similar way, the second super-character from the left declares a top-less pyramid space whose size is defined by the values assigned to N and H. For cases where the structure spaces can be modified during the computation by applying special operations in appropriate nodes (for example, cutting operations for removing some tree branches), special versions of the super-characters are used; these versions explicitly point out such features of the modeling processes. Special cases of very dynamic processes are related to computation on particles: the rightmost super-character represents a 2D model where the number of particles involved is related to the parameter N. For the majority of super-characters of the above-mentioned type, there is a family of images showing "natural" sub-structures of a structure; for example, for a tree structure, the leaf nodes or the root node are considered such sub-structures; for a 2D structure, the first row and the first column, as well as the diagonals, are also considered "natural" sub-structures. These images are used to declare variable allocation not on the whole structure, but on its parts. In addition to these images, there are also structure mask images which help to exclude some sub-structures from computational activity and to introduce various types of irregularities necessary for imitating complex real-world processes.
4 Background, Foreground and Auxiliary Images
Algorithmic CyberFrames, especially those related to dynamical process representations, show space data structures, traversal schemes on these structures and high-level formulas on the structure nodes. The corresponding visual objects directly influence the generation of executable code. Examples of such frames are depicted in the first row of Fig. 5. The first frame shows just a graph data structure, the second presents a tree with a node highlighted for some operations, and the third shows a grid with four highlighted nodes for identical operations to be performed in parallel. Such frames, and the ideas behind them, are much easier to understand than the corresponding representations in conventional language texts. However, their perception and cognition features become much richer if additional background images are involved. Examples of such images are depicted in the second row of Fig. 5. The first frame in this row shows not only a graph data structure, but also a hint that the model considered relates to the territory of Fukushima prefecture. The second frame shows that the activity on the tree leaf node has "a history" presented by a trace sub-tree, and the third frame depicts a hint about parallel computation in the past and possible computation in the future. The background images of these frames do not have any influence on code generation, but are very important for helping people understand the corresponding semantics and perform mental simulation.
Fig. 5. Examples of Algorithmic CyberFrames and Background Images
Fig. 6. Other Examples of Algorithmic CyberFrames and Background Images (Frame X and Frame X+1, with the operation "assign ... to ... where edge[ ] is minimum")
Another example of helping images is depicted in Fig. 6, where different colors are used to show the different sub-graphs involved and the edges between them. In this case, the helping images and their colors are used not only to clarify the meaning, but also to define a half-highlighted operation (that is, they do influence code generation). Frame X indicates that some operation selecting the node activity for the next frame should be performed on the top-right half-highlighted node; to make this selection, data from the contour-highlighted nodes should be taken. Frame X+1 shows the result of the selection, that is, a node which was moved from the dark-colored sub-graph into the light-colored sub-graph and is pointed to by an arrow. This frame also shows that some operation should be defined for the top-right node based on data from all nodes. In addition to Frame X and Frame X+1, Fig. 6 depicts the definition of the selecting operation mentioned with Frame X. This definition means "assign a contour-highlighted node of one sub-graph to another sub-graph where an edge between the sub-graphs has a minimum value." In general, an Algorithmic CyberFrame is a multilayer package of images (see Fig. 7) including background, main ground and foreground slices. For example, the arrow in Frame X+1 of Fig. 6 is represented in this package by a supportive image slice from the foreground layer. Other examples of CyberFrames with foreground and background images can be found in [15]. The number of slices in the layers depends on the semantics to be explained. For many CyberFrames only main ground images are used, but there are also cases of auxiliary frames where only supportive images are involved. The auxiliary frames help in comprehending the meaning of CyberScenes and CyberFilms; they can be considered as cover frames, summaries or documentation-type annotations.
Fig. 7. A scheme of multilayer Algorithmic CyberFrame (background, main ground and foreground layers, each with its own slices)
5 Case Studies: Libraries (Ontologies) of Super-Characters and Algorithms
The Algorithmic CyberFilm programming technology is developed as an environment based on a set of libraries where the acquisition of super-characters, CyberFrames,
CyberScenes and CyberFilms, as well as template programs, is performed. The items of these libraries include a number of views presenting different features through multimedia forms. Between different features inside one item, and between different items, there is a variety of links and associations that support users' decisions in creating their models, in specifying the corresponding algorithms, and in the automatic synthesis of programs from CyberFilm specifications. The libraries can be considered ontologies because of their formal basis for searching items, item associations and necessary features, and for other actions related to the automatic synthesis of programs and the intelligent support of users' activities. Usually, each library item obtains a double ID to encode its features and a URL-like address. The encoding is based on a classification of two types: "what" the item is from external points of view, and "how" it is constructed internally. For example, for a whole algorithm, a set of features of the input/output data and their (allowed) relations represents the "what" code, and a set of space structures, traversal schemes on the structures and representative formulas represents the "how" code. The search for items in the relational database (storing the items) is performed on a combination of features whose selection is organized by corresponding browser/editor subsystems. The interface panels of these subsystems help the users decide which links to follow and provide images of model (information) subspaces and library item associations. This decreases the chances that users become cognitively overloaded and disoriented and empowers their cognition through the experience acquired in the libraries. The environment is a mental and computational system that guides, supports and enhances the thinking processes of the users. In addition, it mediates learning activities that stimulate cognitive processes and engage learners in creating knowledge that represents their understanding and conception of the information. The continuing development of the libraries (by considering new algorithms and forms of their representation) and their use in real-life educational processes are our permanent case studies and usability tests for this new environment. Within these studies and tests, not only the understanding of algorithms and the development of programming skills are analyzed, but also the cognitive adaptation of the users in their work on application models. Such analysis and consideration are our guidelines for the development of the environment. The effectiveness of the approach is evaluated, first of all, by the variety of algorithms represented in picture-based forms (see [10-13, 15] and their references). In practice, this variety has no limitations because the environment is based on an open set of super-characters allowing the extension of the algorithmic alphabet [16], as well as on adding and editing the embedded clarity annotations.
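As an illustration only, the double-ID scheme and feature-based search described above could be modeled roughly as follows; the field names, feature labels and addresses are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class LibraryItem:
    """A library (ontology) item with a double ID and a URL-like address."""
    address: str                               # URL-like address of the item
    what: set = field(default_factory=set)     # external features: I/O data and allowed relations
    how: set = field(default_factory=set)      # internal features: space structures, traversals, formulas

def search(items, required_what=frozenset(), required_how=frozenset()):
    """Return the items whose 'what' and 'how' codes contain all requested features."""
    return [item for item in items
            if required_what <= item.what and required_how <= item.how]

# Hypothetical usage: find items that work on graph data using a tree traversal.
library = [
    LibraryItem("lib://graphs/shortest-path", {"graph", "weights"}, {"tree-traversal"}),
    LibraryItem("lib://grids/jacobi", {"2D-grid", "boundary-values"}, {"grid-scan"}),
]
matches = search(library, required_what={"graph"}, required_how={"tree-traversal"})
```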
6 Conclusion
A brief overview of programming in pictures has been presented and a number of techniques that are used to enhance perception and cognition of people working on developing computational application models and corresponding algorithms
have been provided. These techniques are based on special visual symbols and their compositions within compound constructs, on the use of background, foreground and auxiliary images, and on representing scalable forms of model spaces and attribute declarations, features of algorithmic steps, and multiple views of the algorithm as a whole, as well as on a special acquisition of them in ontology-type libraries. The interface panels of the environment help the users in making decisions and decrease the chances of their becoming cognitively overloaded and disoriented. They guide, support and enhance the thinking processes of the users.
References
1. Fensel, D.: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. Springer, Heidelberg (2001)
2. Rasmussen, J., Pejtersen, A.M., Goodstein, L.P.: Cognitive Systems Engineering. John Wiley & Sons, Inc., Chichester (1994)
3. Carroll, J. (ed.): Human-Computer Interaction in the New Millennium. Addison-Wesley, Reading (2002)
4. Detienne, F., Bott, F.: Software design - cognitive aspects. Springer, Heidelberg (2002)
5. Larkin, J.H., Simon, H.A.: Why a Diagram Is (Sometimes) Worth Ten Thousand Words. In: Diagrammatic Reasoning: Cognitive and Computational Perspectives. AAAI Press, Menlo Park (1995)
6. Kindborg, M., McGee, K.: Visual programming with analogical representations: Inspirations from a semiotic analysis of comics. Journal of Visual Languages and Computing 18(2), 99–125 (2007)
7. Beckers, F.: Pictures worth a thousand tiles, a geometrical programming language for self-assembly. Theoretical Computer Science 410(16), 1495–1515 (2009)
8. Danielsson, H., Jonsson, B.: Pictures as Language. In: Proceedings of the International Conference on Language and Visualisation, Stockholm, Sweden (2001)
9. Shneiderman, B.: Leonardo's laptop: human needs and the new computing technologies. MIT Press, Cambridge (2002)
10. Yoshioka, R., Mirenkov, N.: Visual Computing within Environment of Self-explanatory Components. Soft Computing Journal 7(1), 20–32 (2002)
11. Mirenkov, N., Vazhenin, A., Yoshioka, R., Ebihara, T., Hirotomi, T., Mirenkova, T.: Self-explanatory components: a new programming paradigm. Int. Journal of Software Engineering and Knowledge Engineering 11(1), 5–36 (2001)
12. Ebihara, T., Mirenkov, N., Nomoto, R., Nemoto, M.: Filmification of methods and an example of its applications. International Journal of Software Engineering and Knowledge Engineering 15(1), 87–115 (2005)
13. Watanobe, Y., Mirenkov, N., Yoshioka, R.: Algorithm Library based on Algorithmic CyberFilms. Knowledge-Based Systems 22(3), 195–208 (2009)
14. Relaxation method, http://en.wikipedia.org/wiki/Relaxation_method
15. Watanobe, Y., Yoshioka, R., Tsukumoto, S., Mirenkov, N.: Filmification of methods: Convex Hull algorithms. In: Proceedings of the 2009 IEEE TENCON Conference, Singapore, pp. 1–6 (2009) (CDROM)
16. Yoshioka, R., Watanobe, Y., Mirenkov, N.: Open set of Algorithmic Characters. In: Proceedings of 10th WSEAS Int. Conference on Applied Computer Science, Iwate, Japan, pp. 327–334 (2010)
An Approach for Smoothly Recalling the Interrupted Tasks by Memorizing User Tasks Kohei Sugawara and Hamido Fujita Iwate Prefectural University, Faculty of Software and Information Science, 152-52 Sugo, Takizawa, Iwate-gun, Iwate, 020-0173, Japan Sangikyo Corporation, 4509, Ikebe-cho, Tsuzuki-ku, Yokohama, 224-0053 Japan
[email protected]
Abstract. An approach was previously proposed for assisting users in selecting among multiple objects displayed on their screen. This approach provides a useful GUI through dynamically configurable visualization of the overlapping objects displayed by the user for executing a certain task. The proposed GUI is associated with the metaphor of finding the book related to a task from a pile of books of similar tasks with different precision. To improve the assistance given to users, we considered the issues of restarting interrupted work. This paper describes the proposed ideas for resolving these issues by modifying how the user's behavior information is indicated on this GUI and by applying a color gradation to a twenty-four-hour scale.
1 Introduction
Internet and IT technologies provide users with many resources for solving problems by navigating through different files. A user opens many files and objects on the screen and manipulates them to execute a certain task or to edit a document that draws on knowledge collected from different windows. The user may lose his or her way when some interruption obscures the work. The connectivity among these overlapped objects is then loosened, and the user must make an effort to strengthen this connectivity again. We need to provide a mechanism that keeps the user actively engaged with all objects in the workspace, so that these windows participate in the transitions of the user's work. We have defined a conceptual framework for designing an interaction-based interface that can estimate the user's workflow using several parametric views related to the user's situation or reflecting user profiling aspects. The user profile includes several parametric statuses (mental state, age, gender, rank and so on) because they influence the style in which the user utilizes these objects. This is needed to estimate the suitable or friendly system components that assemble the interaction scenarios fitting the specific situation at hand. The evolution of the situation can be reflected in the derivation of additional components, resembling an articulated type of user interface. Such evolution of component articulation is a function of the user's situation, aimed at the best performance when executing simultaneously overlapped objects on the same task screen.
In office work, a user's continuous computer operation is often interrupted by other tasks or by the user's own breaks. The interrupted operation is restarted by reselecting the target objects, relying on the user's short-term memory. Since the content of the interrupted operation depends on the user's short-term memory in this situation, recalling the previous operation takes a long time if the interruption is long. When the previous operation has been forgotten, the user needs to check all displayed objects. In order to give the user the best performance, a GUI that lists the overlapping of all displayed object windows from a horizontal viewpoint, using the user's behavior, was previously proposed by the authors [1]. This proposed GUI has two features. One is that the user can find the intended object window even when it is hidden behind other object windows. The other is that the GUI can assist in restarting an interrupted task or past work by giving the user information about the important object windows.
2 Consideration of Interrupted Work
The previously proposed GUI lists all displayed object windows, sorted in ascending order from the front-side layer to the back of the screen. The active elapsed time of each displayed object window is memorized by this GUI. The proposed GUI gives the user material for deciding which object window was intended when an interrupted task is restarted, because the information about overlapping objects is shown at a glance [2][3]. In order to give more tangible information to the user, we consider how work actually gets interrupted in an office setting. In a previous paper [4], an Optimization-Ware called Cyber Manual was presented as an IT collaboration system for improving workers' productivity. Cyber Manual is used as the functional core of the Optimization-Ware at Sangikyo Corporation. It is a database that accumulates the knowledge and know-how of a group or an entire company and shares this information among divisions. Cyber Manual consists of assembled documents. Document creation work on Cyber Manual is often interrupted by general work or other related tasks. We classify such interruptions into three cases: a short-term interruption, a middle-term interruption and a long-term interruption. While a document is being edited, the creation task may be interrupted for a short term of a few minutes by a telephone call, a break or e-mail checking. If the document cannot be finished within a work day, the interrupted work is restarted after a middle term, that is, the next day or the next week. When a coworker who takes over the related project from the creator of the document revises it, the document creation has been interrupted for a long term (which includes short and middle terms) since the last editing. A document creator needs to recall which object windows were used in order to resume document creation, because the creator often uses references while making a document. This paper proposes supporting the user's short-term memory when restarting from a short- or middle-term interruption by indicating the overlapping of the object windows, their active elapsed times and the relations among simultaneously used object windows. There is an approach for improving user performance of visualization by using context history [5]. However, the proposed GUI cannot obtain the context history of the object windows used for document creation, because the context history is stored by each application program.
Therefore, visualizing the time of day is proposed to assist restarting a document creation that was interrupted on a past day, by logging the desktop environment of displayed object windows as a history of simultaneously used object windows. In the previously proposed GUI, each object window's active time is stored by counting the elapsed time while the target object window is displayed at the front of the screen. Taking the longest elapsed time of the displayed objects as the base, the listed items of the displayed objects are colored from the left side according to their elapsed times. A user can therefore see at a glance which object window was used, and for how long, when looking back at the document creation work after a short-term interruption. However, if there is a big difference between the longest active elapsed time of one object window and the elapsed times of the other object windows, the bars of the windows with shorter times all look similar. Even if there are clear time differences between the object windows, the user cannot perceive these differences under such a large gap of elapsed times and cannot recognize the related object windows.
3 Our Approach
In order to assist the user's short-term memory when restarting document creation, we propose indicating the relation between displayed object windows with the same color by classifying each object window according to the time slot in which it was operated. In addition, we propose a method to revert to a past desktop environment by using the stored history.
3.1 Proposed Relating Hue to Time Slot
We define the object windows activated in the same time slot as related, because when the user operates plural object windows simultaneously, these target windows are used at nearly the same time. Figure 1 shows the hue circle (color wheel) applied to the proposed GUI. A day of twenty-four hours is mapped onto twelve colors based on the red, green and blue (RGB) components, in units of two hours.
Fig. 1. Defined hue circle (twelve RGB colors, one per two-hour time slot over 24 hours)
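As a minimal sketch, the time-slot-to-color mapping of Fig. 1 can be expressed as a simple lookup table; the ordering of the twelve colors below is read approximately from the figure and is an assumption, not the authors' exact implementation.

```python
# Twelve RGB hex colors, one per two-hour slot starting at 00:00 (assumed ordering).
HUE_CIRCLE = [
    "0080FF", "0000FF", "8000FF", "FF00FF", "FF0080", "FF0000",   # 00:00-12:00
    "FF8000", "FFFF00", "80FF00", "00FF00", "00FF80", "00FFFF",   # 12:00-24:00
]

def color_for_hour(hour: int) -> str:
    """Return the hex color of the two-hour slot that contains the given hour."""
    return HUE_CIRCLE[(hour % 24) // 2]

print(color_for_hour(10))   # color of the 10:00-12:00 slot
```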
The twelve colors were defined such that each RGB gradation level changes in steps of one half: the maximum gradation level is "FF", the half gradation level is "80" and the minimum gradation level is "00". As this twelve-color transition mimics the gradation of the sky over a day, the user can tell the time from each color.
3.2 Improvement of Visual Information
Figure 2 shows the proposed GUI with the new ideas applied. The information of six displayed object windows is sorted in this GUI. The left side of the GUI in Figure 2 shows the icons of the application programs related to each displayed object window, and the text on the right side of the list shows the title of each displayed object window. Under each object title there is the elapsed-time bar of the selected object window. The object window nearest to the front layer of the screen is treated as the activated object window and its elapsed time is counted. In order not to count idle time, the counting of the elapsed time is stopped when the cursor pointer has not moved for ten minutes, which is the smallest user-selectable unit on this color scale; in addition, the elapsed time is then reduced by ten minutes. We set this period to ten minutes for simplicity, because the actual interruption time differs according to the interposing work. The reason for using cursor pointer movement is that the main operation in document creation with Cyber Manual is mouse operation. The maximum length of the elapsed-time bar corresponds to one hour; elapsed time that exceeds one hour is shown numerically, with minutes and seconds displayed on the right side of the bar. A color scale is displayed in the upper part of the GUI to distinguish time slots, and a check mark on the color scale indicates the color of the current time slot.
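The elapsed-time accounting described above could be sketched as follows. The ten-minute idle cutoff, the subtraction of the idle period and the one-hour bar cap follow the text; the update interval and class layout are assumptions for illustration.

```python
IDLE_LIMIT = 10 * 60      # ten minutes of cursor inactivity
BAR_MAX    = 60 * 60      # the elapsed-time bar saturates at one hour

class WindowTimer:
    """Accumulates active time for the frontmost window, discounting idle periods."""

    def __init__(self):
        self.elapsed = 0.0
        self.idle = 0.0

    def tick(self, dt, cursor_moved):
        """Call periodically with the time step dt in seconds."""
        if cursor_moved:
            self.idle = 0.0
            self.elapsed += dt
            return
        previous_idle = self.idle
        self.idle += dt
        if self.idle < IDLE_LIMIT:
            self.elapsed += dt                 # still counted as active
        elif previous_idle < IDLE_LIMIT:
            # Idle limit just crossed: stop counting and subtract the idle ten minutes.
            self.elapsed = max(0.0, self.elapsed - IDLE_LIMIT)

    def bar_fraction(self):
        """Relative length of the colored bar, capped at the one-hour maximum."""
        return min(self.elapsed, BAR_MAX) / BAR_MAX
```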
Fig. 2. Applied new proposed ideas to the previously proposed GUI (object windows such as Cyber Manual, Mailer 'D', Text Editor 'C', Internet Browser 'B' and Picture Viewer 'A' listed with elapsed-time bars under a 0-24 hour color scale)
The issue that the user cannot perceive time differences when there is a big gap between elapsed times is resolved by fixing the maximum length of the elapsed-time bar at one hour. Therefore, even if there is a big gap between the object window with the longest elapsed time and the other object windows, the user can understand the time differences of the activated object windows, since the total elapsed time is also displayed. When an object window is selected by the user, the background color of the related program icon becomes the color of the previously defined time slot. The user can therefore understand at a glance which object windows are related to each other and when each object window was used.
3.3 Switching to Past Desktop Environment Function
In order to look back at a past situation, a function for viewing the titles of the activated object windows and a switching function are proposed. The proposed GUI stores a history of the desktop environment, namely the top position, left position, height, width, date and time of the object windows displayed on the screen, per time slot. When the cursor pointer moves onto a target time slot in the color scale, all the colors of the color scale are changed to the detailed time-slot colors, as shown in Figure 3: the color scale then indicates time by dividing the two hours into twelve parts using twelve color gradations. After the color scale changes, the titles of the object windows used in that time slot are listed under the cursor pointer. The user can therefore see the object windows used in the past by hovering the cursor pointer over the intended time slot. In addition, the user can select a detailed time slot in ten-minute units to look back at a past situation by clicking a time slot. When the user clicks the intended time slot on the color scale, all displayed objects are minimized and only the target objects appear, in sequence. The layout of the displayed object windows is thus switched to the state around the clicked time, and the related windows are made ready simply by choosing the time slot from which the user wants to restart. The black circle symbol in Figure 3 means that the object window has been displayed recently, while the uncolored circle symbol means that the object window has already been closed. With these symbols, the user can tell at a glance which object windows are currently displayed.
Fig. 3. Color scale of 120 minutes view (time slots 10:00-12:00 with the titles of the object windows used in each slot, e.g. Internet Browser 'B', Text Editor 'G' and Picture Viewer 'A')
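A minimal sketch of the per-time-slot desktop history described in Section 3.3 is given below; the record fields follow the text (top, left, height, width, date and time), while the window-manager callbacks are placeholders rather than a real API.

```python
from dataclasses import dataclass
from datetime import datetime
from collections import defaultdict

@dataclass
class WindowRecord:
    title: str
    top: int
    left: int
    height: int
    width: int
    timestamp: datetime
    still_open: bool = True    # black circle: recently displayed; open circle: already closed

# History of the desktop environment, keyed by time slot (e.g. "10:30").
history = defaultdict(list)

def log_slot(slot, windows):
    """Store the layout of the currently displayed windows for this time slot."""
    history[slot].extend(windows)

def restore_slot(slot, minimize_all, show_window):
    """Minimize everything, then re-display only the windows used in the chosen slot."""
    minimize_all()
    for rec in history.get(slot, []):
        show_window(rec.title, rec.top, rec.left, rec.height, rec.width)
```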
4 Conclusion
The functionality of the previously proposed GUI is improved by applying the proposed ideas. The types of interruption, by time span, were identified by considering the situation of document creation on Cyber Manual.
A document editor can therefore learn the environment for restarting from this GUI after a short-term or middle-term interruption, by using the operation history and the color gradation. If the history of an expert worker, representing an efficient workflow, is applied to a novice worker, the novice's document creation time may be shortened [6]. This paper proposed an approach that assists the user's short-term memory in order to improve work efficiency. To further improve operational efficiency, the computer needs to give effective information to the worker by guessing the next appropriate task. However, the appropriate task differs according to the worker's situation, skill level and role. The computer therefore needs to understand the relationship between user profiles and tasks in order to reason about the appropriate task, using workflows, semantic networks and so on. Future research should include the study of the actual relationship between users' profiles and their work, in order to present information to the worker in a form suited to the user's perception.
Acknowledgment
The authors appreciate their team members, Dr. R. Maruta, Dr. S. Hayashida and Ms. N. Taniguchi of Sangikyo Corporation, for their useful comments and advice in the development of this work.
References
1. Sugawara, K., Fujita, H.: An Approach of Visualizing Objects Overlapping for Assisting User Cognition. In: Proceedings of the 10th WSEAS International Conference on Applied Computer Science, ACS 2010, pp. 461–464 (2010)
2. Sugawara, K.: Approach for Handling Multiple Objects Without Breaking the Continuity of User's Thinking. In: Fujita, H. (ed.) New Trends in Software Methodologies, Tools and Techniques – Proceedings of the 9th SoMeT 2010, pp. 218–225 (2010)
3. Sugawara, K., Maruta, R.: A New Windows Desktop Icon for Single Point-of-Control of the Application Window. In: Fujita, H., Marik, V. (eds.) New Trends in Software Methodologies, Tools and Techniques – Proceedings of the Eighth SoMeT 2009, pp. 483–491. IOS Press, Amsterdam (2009)
4. Maruta, R., Hayashida, S., Taniguchi, N., Sugawara, K., Sengoku, M., Shoji, W.: Developing a Unique Collaborative System – "Optimization Ware" Using a Novel Massive Data Retrieval/Management Approach. In: Fujita, H., Zualkernan, I. (eds.) New Trends in Software Methodologies, Tools and Techniques – Proceedings of the Seventh SoMeT 2008, pp. 113–125 (2008)
5. Hong, J., Suh, E.-H., Kim, J., Kim, S.: Context-aware system for proactive personalized service based on context history. In: Expert Systems with Applications, pp. 7448–7457. Elsevier, Amsterdam (2009)
6. Madhusudan, T., Leon Zhao, J., Marshall, B.: A case-based reasoning framework for workflow model management. In: Data & Knowledge Engineering, pp. 87–115. Elsevier, Amsterdam (2004)
Implementing an Efficient Causal Learning Mechanism in a Cognitive Tutoring Agent Usef Faghihi1, Philippe Fournier-Viger2, and Roger Nkambou3 1
Department of Computer Science, University of Memphis, Tennessee, USA 2 Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan 3 Department of Computer Science, University of Quebec in Montreal, Montreal, Canada
[email protected],
[email protected],
[email protected]
Abstract. An important research problem for developing complex cognitive agents is to provide them with human-like learning mechanisms. One important type of learning, among episodic, emotional, and procedural learning, is causal learning. In current cognitive agents, causal learning has up to now been implemented with techniques such as Bayesian networks that are not scalable enough to handle a large volume of data as a human does. In this paper, we address this problem of causal learning using a modified constraint-based data mining algorithm that respects the temporal ordering of events. That is, from a huge amount of data, the algorithm filters the data to extract the most important information. We illustrate the application of this learning mechanism in a cognitive tutoring agent for the complex domain of teaching robotic arm manipulation. Keywords: Cognitive Tutoring Agent, Causal Learning, Data mining.
1 Introduction
Reasoning is crucial to cognition [1-4]. One important type of reasoning is inductive reasoning, where one either tries to generalize rules from a set of examples or, from a set of probable or inadequate premises, decides how likely it is that a conclusion is true. The truth of a conclusion is likely when the premises give some evidence or support for the conclusion. In inductive reasoning, when the evidence is deemed absolute, significant, and generally persuasive, it is cogent and may bring us to a true conclusion. When the evidence is not deemed absolute, significant and persuasive, then the evidence is non-cogent. In inductive reasoning, the relation between premise and conclusion is uncertain. After generalization, one cannot claim that all potential information about a situation has been collected and that there is no additional unseen information that could discredit the hypothesis. For instance, consider the following example: from the statement "The football moves when kicked by a player," we can infer "All footballs move when kicked by a player." Inductive reasoning is ampliative – it extends our knowledge. One usually needs to examine prior knowledge, relations, sets of examples and experiences in order to draw inductive conclusions from premises.
However, all the information that one must examine to come to a conclusion from a set of premises makes it difficult for scientists to propose a universally accepted theory of inductive reasoning [5-7]. The specific memory that registers particular events, such as what, where and when, is known in humans as episodic memory. Episodic memory allows humans to adapt their behavior according to previous experiences when facing a new environment or other agents. Human beings systematically construct their causal knowledge based on episodic memory. Given that episodic memory contains the memory of events and their outcomes, we perform inductive abstraction to construct relations between two events. Thus, in humans, causal memory is influenced by the information retained in episodic memory; inversely, new experiences are influenced by causal memory [8-11]. Different methods have been proposed for finding causal relations between events: scientific experiments, statistical relations, temporal order, prior knowledge, and so forth [12]. Once we find a causal relation between two events, we can explain the cause of the second event if it occurs again. As human beings, we recall recent events better than older events [13, p. 475]. Most cognitive agents do not have any fully automated reasoning. The founders of CLARION have claimed to have implemented causal reasoning in it [14]; however, the causes and effects in this agent are set by hand [15]. In this paper, we first briefly explain the functioning of the Conscious Emotional Learning Tutoring System (CELTS), a cognitive agent that was created to provide assistance to learners in learning activities [15-18]. We then explain how we improved causal learning in CELTS by integrating into the algorithm the capability to handle temporal constraints and event constraints. This can improve CELTS' memory construction and at the same time allow for better interactions with learners. We finally present an experimental evaluation of the agent's behavior with learners and an evaluation of the performance of the algorithms on real-life data mining datasets.
2 CELTS
CELTS is a hybrid artificial intelligence tutor based on Baars' [19] theory of consciousness. It operates through cognitive cycles, which start with perception and usually end with the execution of an action. CELTS uses its Behavior Network (BN) for action selection (Figure 1.B). The BN is implemented based on Maes' Behaviour Net [20]; it is a network of partial plans that analyzes the context to decide what to do and which type of behavior to set off (Figure 1.B). Given that CELTS is a tutor, an expert can define different solutions in the BN to help learners. Thus, the BN's nodes are messages, hints, demonstrations, etc. (Figure 1.D) to assist learners while they manipulate Canadarm2 in the virtual environment (Figure 1.A). The virtual environment is a simulation of Canadarm2, the robotic telemanipulator attached to the International Space Station (ISS), used to help astronauts learn how to manipulate Canadarm2 before going to space. The learners' manipulations in the virtual world simulator, which simulates Canadarm2 (Figure 1.A), constitute the interactions between them and CELTS. In particular, the virtual world simulator sends all manipulation data to CELTS, which, in turn, sends learners advice to improve their performance. Canadarm2 manipulation on the ISS is a very difficult task, because there is a constant risk of collision.
Our team has now added different types of learning to CELTS based on neurobiological and neuropsychological theory [21]. We have added an Emotional Mechanism [22], which simulates the "peripheral-central" theory of emotions. The peripheral-central approach takes into account both the short and the long route of information processing and reaction, as in humans; both routes operate in a parallel and complementary fashion in CELTS' architecture. The Emotional Mechanism (EM) learns and at the same time contributes emotional valences (positive or negative) to the description of the situation. It also contributes to the decisions made and the learning achieved by the system. We have also added an Episodic Mechanism (EPM), which simulates the multiple-trace theory of memory consolidation [21, 22]. The multiple-trace theory postulates that every time an event causes memory reactivation, a new trace for the activated memory is created in the hippocampus. Memory consolidation occurs through the recurring loops of episodic memory traces in the hippocampus and the construction of semantic memory traces in the cortex; thus, the cortical neurons continue to rely on the hippocampus even after encoding. We used sequential pattern mining algorithms to simulate this behavior of memory consolidation in CELTS. To do so, every piece of information broadcast in the system during a training session between CELTS and learners is assigned a specific time. CELTS' EPM then extracts information from the events registered in the system. Given a problem, the EPM is capable of finding the best solution among the different solutions conceived by the expert in its BN [21]. Finally, we have also added a Causal Learning Mechanism (CLM), which is implemented using a combination of sequential pattern mining algorithms and association rules that respect the temporal ordering of events: statistical relations, temporal ordering of events, and prior knowledge (conceived by the expert in the BN). Causal knowledge is generated in CELTS after a) the information is broadcast in the system and b) a decision is made about the ongoing problem, which c) is reinforced by future experiences while CELTS receives information from the virtual world and interacts with learners (Figure 1.A, 1.D). Using these three mechanisms, CELTS can memorize learners' errors and find their causes. CELTS is a tutor that interacts with learners and is faced with a huge amount of data. The algorithms previously implemented in CELTS generate lots of irrelevant rules; for this reason, CELTS does not always find relevant causes [15]. In this paper, we aim at generating only the rules that are relevant to CELTS' current situation. To reach this goal, we have replaced our previously developed rule mining algorithm with an improved version that handles event constraints and a temporal constraint. By performing this modification and adapting CELTS' BN, we are able to generate only relevant rules. As we will see, the event constraints and the temporal constraint not only help CELTS generate fewer and more reliable rules, but also make CELTS produce explanations for the errors made by learners. As in the case of humans, our new algorithm also helps CELTS better remember the causes of more recent events. For the remainder of this paper, we briefly review CELTS' Episodic and Causal Learning Mechanisms and explain how we added event constraints and a temporal constraint to the algorithm.
Fig. 1. CELTS’ interface and Canadarm2
3 Episodic Learning
The Episodic Learning mechanism (EPL) extracts frequently occurring events from CELTS' past experiences [15, 21]. In our context, CELTS learns during astronauts' training sessions for arm manipulation in the Canadarm2 simulator virtual world [23] (Figure 1.A). To construct CELTS' Episodic Memory, a trace of what occurred in the system is recorded in CELTS' different memories during consciousness broadcasts [21]. Each event X = (ti, Ai) in CELTS represents what happened during a cognitive cycle. The timestamp ti of an event indicates the cognitive cycle number, whereas the set of items Ai of an event contains an item that represents the coalition of information (e.g., collision risk with the ISS) that was broadcast during that cognitive cycle. For example, one partial sequence recorded during our experiments was ⟨(1, c2 e{-0.6}), (2, c4 e{-0.8})⟩. This sequence means that during cognitive cycle 1, the coalition c2 (indicating that the learner forgot to adjust a camera in the simulator, Figure 1.A) was broadcast, followed by the broadcast of c4 (indicating that the learner brought about an imminent collision in the simulator, Figure 1.A). The e{-0.6} in the sequence indicates that CELTS' Emotional Mechanism assigned a negative valence of 0.6 to this information (see [22] for more details), whereas e{-0.8} indicates a more negative emotional valence corresponding to the imminent collision, a dangerous situation on the ISS. If this subsequence appears several times during learners' interactions with CELTS, the following rule can be discovered: forgetting to adjust the cameras and not adjusting the camera parameters correctly is followed by a collision risk, i.e., {Forget-camera adjustment, Selected wrong joint} → {Collision risk}. This information later helps CELTS to remember learners' errors and make better decisions to help them. Below are a few examples:
• Manipulating Canadarm2 close to the space station and not being aware of the distance can lead to a collision: {Canadarm2_NearISS, Not_aware_of_distance} → {Collision}
• A learner manipulating Canadarm2 close to the space station, being aware of the distance and reaching the goal is an expert: {Canadarm2_NearISS, Aware_of_distance, Goal_attained} → {Expert}.
• If the learner is inactive, the learner lacks motivation: {learner_inactive} → {learner_lacks_motivation}
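To make the representation above concrete, the recorded sequences and rules could be encoded roughly as follows. This is an illustrative sketch rather than CELTS' actual data structures; only the coalition labels and valences are taken from the example.

```python
# One training session: a list of (cognitive_cycle, {item: emotional_valence}) events.
sequence = [
    (1, {"c2": -0.6}),      # learner forgot to adjust the camera
    (2, {"c4": -0.8}),      # imminent collision risk
]

def rule_occurs(sequence, left, right):
    """True if every item of `left` is broadcast before every item of `right`."""
    first_seen = {}
    for cycle, items in sequence:
        for item in items:
            first_seen.setdefault(item, cycle)
    if not (left | right) <= first_seen.keys():
        return False
    return max(first_seen[i] for i in left) < min(first_seen[i] for i in right)

print(rule_occurs(sequence, {"c2"}, {"c4"}))   # True: the rule c2 -> c4 occurs here
```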
4 Causal Learning
To extract causal rules, we previously designed a custom algorithm for CELTS [15]. However, we noted in previous experiments that this algorithm can generate a large number of irrelevant rules in which the events are separated by too much time (e.g., several minutes). Consider, for example, the rule {user_did_not_adjust_cameras} → {good_manipulation}, which indicates that each time a user forgets to adjust the cameras before moving Canadarm2 in the virtual world, he or she will manipulate Canadarm2 well. Contrary to this rule, our experience shows that each time a user forgets to adjust the cameras, he or she is likely to cause a collision risk in the virtual world (Figure 1.A). To improve CELTS' performance we used TRuleGrowth [24]. The original TRuleGrowth algorithm takes as input a database of event sequences (as previously defined) and three parameters: support, confidence¹ and window_size, the maximum time span within which a rule must occur. In the context of CELTS, this parameter is very important as it allows CELTS to exclude rules between events that are separated by too much time. The window_size constraint is a global parameter that must be selected by an expert in CELTS. Using window_size, support, confidence and the temporal ordering of events, CELTS is capable of inductive reasoning [15]; the confidence can be interpreted as an estimate of the conditional probability P(Y | X) [25-27]. However, one problem that occurs when discovering rules with TRuleGrowth in CELTS is that there can be up to several thousands of rules. This degrades the performance of the data mining algorithm and also of CELTS, which has too many rules to consider for reasoning. To reduce the number of rules and to extract only the most precise and relevant ones, we have adapted the TRuleGrowth algorithm to add constraints on the events that a rule can contain, so that only the rules that are relevant to CELTS are extracted in any situation.
4.1 Adapting TRuleGrowth to Handle Event Constraints
We have modified TRuleGrowth to allow specifying the following four types of event constraints on the events that a rule can contain. The constraints are set when an expert, for a given problem, creates solutions in the BN. When the expert creates questions, he tags some CELTS Behavior Network (BN) nodes so that they respect some specific event constraints (Figure 1.B).
¹ Support and confidence: the support of a rule is defined as the number of sequences that contain the rule; the confidence of a rule is defined as the ratio between the number of sequences where the rule appears and the number of sequences containing its left part.
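To make the footnote concrete, here is a hedged Python sketch of how the support and confidence of a sequential rule left → right could be estimated over a database of event sequences, with window_size limiting how far apart the two parts may occur. This is an illustration under our own assumptions, not the TRuleGrowth implementation; the function names are ours, and sequences are assumed to be lists of (timestamp, item-set) pairs sorted by timestamp.

```python
def rule_occurs(seq, left, right, window_size):
    """True if all items of `left` occur and all items of `right` occur after them,
    with the whole occurrence spanning at most `window_size` cognitive cycles."""
    for start, _ in seq:
        window = [(t, a) for t, a in seq if start <= t <= start + window_size]
        # earliest time by which every left item has been seen inside the window
        seen, t_left = set(), None
        for t, a in window:
            seen |= (a & left)
            if left <= seen:
                t_left = t
                break
        if t_left is None:
            continue
        # every right item must appear strictly after the left part, still in the window
        after = set()
        for t, a in window:
            if t > t_left:
                after |= (a & right)
        if right <= after:
            return True
    return False

def contains_items(seq, items):
    """True if every item appears somewhere in the sequence (used for the left part)."""
    seen = set()
    for _, a in seq:
        seen |= (a & items)
    return items <= seen

def support_and_confidence(db, left, right, window_size):
    """Support = #sequences containing the rule; confidence = support divided by the
    number of sequences containing the rule's left part, as in the footnote above."""
    sup_rule = sum(1 for s in db if rule_occurs(s, left, right, window_size))
    sup_left = sum(1 for s in db if contains_items(s, left))
    return sup_rule, (sup_rule / sup_left if sup_left else 0.0)

seq = [(1, {"Forget-camera adjustment"}), (2, {"Selected wrong joint"}), (3, {"Collision risk"})]
print(support_and_confidence([seq], {"Forget-camera adjustment", "Selected wrong joint"},
                             {"Collision risk"}, window_size=5))   # -> (1, 1.0)
```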
To assign the temporal constraint, CELTS uses the BN nodes that are tagged by the expert to respect event constraints. That is, when CELTS reaches the node tagged as C1, the temporal constraint timer for this solution starts automatically. The timer stops once the user makes a mistake. In what follows, we first present these constraints and then explain how they are useful for the reasoning of CELTS (a small illustrative sketch of how they can be checked follows the list). The constraints are the following:
• C1: the set of events that the left part of a rule can contain,
• C2: the set of events that the right part of a rule can contain,
• C3: the set of events that the left part of a rule has to contain,
• C4: the set of events that the right part of a rule has to contain.
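As referenced above, a minimal Python sketch (ours, with illustrative function names) of how a candidate rule could be checked against these four constraints, and how events outside C1 could be skipped when extending the left part:

```python
def respects_constraints(left, right, c1=None, c2=None, c3=None, c4=None):
    """Check a candidate rule left -> right against the four event constraints.
    c1/c2: sets of events the left/right part may contain (None = unconstrained);
    c3/c4: sets of events the left/right part has to contain."""
    if c1 is not None and not left <= c1:
        return False
    if c2 is not None and not right <= c2:
        return False
    if c3 is not None and not c3 <= left:
        return False
    if c4 is not None and not c4 <= right:
        return False
    return True

def can_extend_left(candidate_event, c1):
    """During rule expansion, events excluded by C1 are simply ignored for the left part."""
    return c1 is None or candidate_event in c1

# Example: only rules predicting a collision risk from camera/joint mistakes are kept.
print(respects_constraints({"Forget-camera adjustment", "Selected wrong joint"},
                           {"Collision risk"},
                           c1={"Forget-camera adjustment", "Selected wrong joint"},
                           c4={"Collision risk"}))   # -> True
```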
We modified TRuleGrowth to ignore events that are excluded according to event constraints C1 and C3, or C2 and C4, when searching for events that can extend the left or right part of a rule. Although this modification is simple, it can dramatically decrease the execution time and reduce the number of rules found, as we show in the experimental results of this paper.

How event constraints are used by CELTS to discover relevant rules. We now explain how the four event constraints that we have defined are used to discover rules that are relevant for CELTS. The constraints can be used to discover rules that answer the following four questions:

Q1: What may happen? Constraint C1 is used to force rules to contain only events that occurred in the left part of the rule. For instance, if the events {Forget-camera adjustment} and {Selected wrong arm joint} occur, CELTS can put these two events in C1 and discover rules such as {Forget-camera adjustment, Selected wrong joint} → {Collision risk}, which indicates that this combination of events could result in a collision risk. CELTS can then use this information to stop the learner or take any other remedial action.

Q2: What was the cause of the event? Constraint C4 is used to force rules to contain only events that occurred in the right part of the rule. This allows us to create explanations of why these events occurred. For instance, in one situation, CELTS recorded that a collision risk (CR) occurred. By searching the right part of the rules containing constraint C4, CELTS was able to find that forgetting camera adjustments and choosing the wrong joint(s) are the most likely causes of the problem.

Q3: Given a specific situation, which events will take place? Using constraint C1 and/or C3, we make sure that the events that occurred are registered in the left part of the rules, and by using constraints C2 and/or C4 we make sure that the prediction is registered in the right part of the rules. Using this strategy, CELTS can obtain rules that predict the occurrence of specific events in the future. For instance, if a learner moves Canadarm2 too close to the ISS, CELTS can put this event in constraint C3 and the collision event in constraint C4, to know whether a collision with the ISS is likely to occur. This information can then be used to take the appropriate action.
Q4: What may happen after a given action of CELTS? Given a situation, constraint C1 is assigned to all registered events that have happened and constraint C3 is assigned to all registered action(s) that CELTS can take in the situation. By using these constraints, CELTS can discover rules that indicate the consequences of the possible actions to be taken. For instance, if, while manipulating Canadarm2, the learner makes mistakes by choosing a bad joint in a low-visibility situation, C1 is assigned to the situation and the possible remedial actions that CELTS can take are assigned to C3 (e.g., "Engage dialogue about joint selection"). Thus, using constraints C1 and C3 allows discovering rules such as: {mistakes_with_joints, low_visibility, engage_dialogue} → {learner_correct_mistake}. Given that the situation described by the rule is similar, CELTS can then use the rule to take appropriate action.

Finally, the previous four questions can be combined to achieve more complex reasoning. For instance, CELTS can detect why some learners do not know which joint must be chosen to achieve a specific goal (the events "don't_know_right_joint" and "goal#21", the latter meaning putting Canadarm2 at the END point in Figure 1.A). To do so, while CELTS interacts with learners, it seeks all the rules whose left part contains the following information: {don't_know_right_joint, exercise_goal#21}. According to the information extracted by the Episodic Learning Mechanism, CELTS knows that {not_perform_distance_evaluation, did_not_perform_camera_adjustment} → {doesn't_know_right_joint, goal#21}. This means that the cause may be that the learner forgot to make a distance evaluation or forgot to adjust the camera. According to constraint C3, CELTS can then search for the following information, "What is the best camera for observing the ISS" or "What is the distance between Canadarm2 and the ISS", given the cause found and {don't_know_right_joint, goal#21}. Asking these questions, CELTS can help learners solve the problem by providing explanations. This helps CELTS to predict the results of its actions and the learner's response, and helps it choose the best action to help the learner.

How CELTS uses the found rules to make decisions. Given that learners make mistakes, after several interactions with learners, CELTS must choose a rule to find the cause of the learners' problems. By applying the data mining algorithm, there can be many candidate rules that match the event constraints determined by the current situation and the temporal constraint. The found rules may suggest different causes for the learners' mistakes. For each rule, CELTS considers the left part of the rule as the cause and the right part as the effect. To choose a rule, the following process is used. First, all the rules that do not match the current execution are discarded. Second, to choose a rule among the remaining rules, CELTS computes the strength of each rule as follows [15]: Strength(rule) = confidence × support. The rule with the highest strength is chosen to help the learner. Third, if by using the window_size parameter CELTS cannot find the cause of the learner's error, the algorithm searches the BN by going backward from the point where the error
occurred until it finds the cause. In other words, CELTS searches outside the window_size time interval if it cannot find the cause within it. If CELTS cannot find any answer, the message "I am not capable of finding the cause of the problem" is shown.
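A minimal sketch of the selection step just described (Strength(rule) = confidence × support, highest strength wins); the rule representation and function names below are our own illustration, not CELTS code.

```python
def choose_rule(candidate_rules, matches_current_execution):
    """candidate_rules: iterable of (left, right, support, confidence) tuples.
    matches_current_execution: predicate that discards rules not matching the
    current situation (step 1 above)."""
    remaining = [r for r in candidate_rules if matches_current_execution(r)]
    if not remaining:
        return None  # CELTS would then search the BN backward, outside window_size
    # Strength(rule) = confidence x support; the strongest rule is used to help the learner.
    return max(remaining, key=lambda r: r[3] * r[2])

rules = [({"Forget-camera adjustment"}, {"Collision risk"}, 12, 0.8),
         ({"Selected wrong joint"}, {"Collision risk"}, 30, 0.4)]
best = choose_rule(rules, lambda r: True)
print(best[0], "->", best[1])   # the second rule wins (strength 12.0 = 30 * 0.4)
```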
5 Experimental Results

To demonstrate how Q1, Q2, Q3 and Q4 can improve the quality of CELTS' assistance to learners, learners were invited to perform arm manipulations using Canadarm2. To do so, we conducted an evaluation of the new version of CELTS with learners who had previously used the older versions of CELTS. The experiment simply consisted in letting each learner try the new version for one hour, observing them and interviewing them. All learners agreed that this new version is more complete and that the enhanced tutoring services were very helpful and helped CELTS to provide a more natural dialogue. Furthermore, to illustrate how our modified algorithm improved CELTS' performance, we applied it to our database of recorded interactions between learners and CELTS, which comprises 20,000 sequences of events. Each sequence is a dialogue conducted by CELTS, where sequences contain around 50 messages or questions depending on what the learner answers and the choices CELTS makes. We evaluated the performance of the modified data mining algorithm. Applying TRuleGrowth with minSup = 0.01, minConf = 0.30 and window_size = 20 took 524 seconds and generated 2,655,064 rules. Afterwards, we reapplied the same algorithm with constraints C1, C2, C3 and C4 activated during the scenario, while recording 100 more sequences. On average, approximately 64 rules were found each time the algorithm was applied, and the execution took on average 0.5 seconds. In other words, a speed improvement of approximately 1,000 times and a rule reduction of 41,485 times were respectively achieved by using event constraints. This shows that these constraints greatly increase the performance of CELTS, without sacrificing any relevant rules. Before this improvement, CELTS simply could not handle such a large database of sequences.
6 Conclusion

In this paper, by integrating temporal and event constraints with data mining algorithms in CELTS, we demonstrated that CELTS is capable of finding more reliable causal rules than its previous version. This helps CELTS provide better assistance to learners. As in the case of humans, the episodic and causal memories in CELTS mutually influence each other during interactions with learners. For instance, if the causes found by CELTS turn out to be false, this influences the support of the causal rules, which in turn influences episodic memory and increases or decreases the events' supports. However, in humans, recent events are remembered better than older events. To simulate this we added temporal constraints to our algorithm. To our knowledge, researchers in artificial intelligence have, up to now, used Bayesian methods to study causal reasoning and causal learning models for cognitive agents. However, the Bayesian approach is not applicable when the agent is faced
with large amounts of data, as is the case in CELTS. Another important issue with Bayesian networks is that they generally require domain experts to specify conditional probabilities by hand, which is often a difficult and time-consuming task. For CELTS, we chose to use data mining algorithms instead of Bayesian networks because we wanted to create a completely automatic approach that could acquire causal knowledge incrementally. For future work, we plan to improve the causal learning algorithms to make them incremental (i.e., to reuse past results to further improve their performance). We are also working on improving the pedagogical strategies used by CELTS and other modules such as dialogue generation.
References 1. Gopnik, A., Schulz, L. (eds.): Causal Learning: Psychology, Philosophy, and Computation. Oxford University Press, USA (2007) 2. Sarma, V.V.S.: Decision making in complex systems. Springer, Netherlands (1993) 3. Leighton, J.P.: Defining and describing reason. In: Leighton, J.P., Sternberg, R.J. (eds.) The Nature of Reasoning. Cambridge University Press, Cambridge (2004) 4. Demetriou, A.: Cognitive development. In: Demetriou, A., Doise, W., van Lieshout, K.F.M. (eds.) Life-Span Developmental Psychology, pp. 179–269. Wiley, London (1998) 5. Russell, S.J., Norvig, P.: Artificial Intelligence: A Modern Approach, 2nd edn. Prentice Hall, Upper Saddle River (2003) 6. Feeney, A., Heit, E.: Inductive Reasoning: Experimental, Developmental, and Computational Approaches. Cambridge University Press, Cambridge (2007) 7. Sternberg, R.J., Mio, J.S.: Cognitive psychology. Wadsworth/Cengage Learning, Australia, Belmont, CA (2009) 8. Martin, C.B., Deutscher, M.: Remembering. Philosophical Review 75, 161–196 (1966) 9. Shoemaker, S.: Persons and their Pasts. American Philosophical Quarterly 7, 269–285 (1970) 10. Perner, J.: Memory and Theory of Mind. In: Tulving, E., Craik, F.I.M. (eds.) The Oxford Handbook of Memory, pp. 297–312. Oxford University Press, Oxford (2000) 11. Bernecker, S.: The Metaphysics of Memory. Springer, Berlin (2008) 12. Lagnado, D.A., Waldmann, M.R., Hagmayer, Y., Sloman, S.A.: Beyond Covariation: Cues to Causal Structure. In: gopnik, A.S., Laura (eds.) Causal learning: Psychology, Philosophy, and Computation, pp. 154–172. Oxford University Press, Inc., Oxford (2007) 13. Ezak, M.D., Howieson, D.B., Loring, D.W., Hannay, H.J., Fischer, J.S.: Neuropsychological Assessment, 4th edn. Oxford University Press, USA (2004) 14. Sun, R.: The CLARION cognitive architecture: Extending cognitive modeling to social simulation Cognition and Multi-Agent interaction. Cambridge University Press, New York (2006) 15. Faghihi, U., Fournier-Viger, P., Nkambou, R., Poirier, P.: A Generic Causal Learning Model for Cognitive Agent. In: The Twenty Third International Conference on Industrial, Engineering & Other Applications of Applied Intelligent Systems, IEA-AIE 2010 (2010) 16. Dubois, D., Gaha, M., Nkambou, R., Poirier, P.: Cognitive Tutoring System with "Consciousness". In: Woolf, B.P., Aïmeur, E., Nkambou, R., Lajoie, S. (eds.) ITS 2008. LNCS, vol. 5091, pp. 803–806. Springer, Heidelberg (2008)
17. Faghihi, U., Fournier-Viger, P., Nkambou, R., Poirier, P.: A generic episodic learning model implemented in a cognitive agent by means of temporal pattern mining. In: Chien, B.-C., Hong, T.-P., Chen, S.-M., Ali, M. (eds.) IEA/AIE 2009. LNCS, vol. 5579, pp. 545– 555. Springer, Heidelberg (2009) 18. Faghihi, U., Poirier, P., Dubois, D., Gaha, M.: A new emotional architecture for cognitive tutoring agents. In: Proc. FLAIRS 2008, pp. 445–446 (2008) 19. Baars, B.J.: In the Theater of Consciousness:The Workspace of the Mind. Oxford University Press, Oxford (1997) 20. Maes, P.: How to do the right thing. Connection Science 1, 291–323 (1989) 21. Faghihi, U., poirier, P., Fournier-Viger, P., Nkambou, R., Human-Like Learning in a Conscious Agent. Journal of Experimental & Theoretical Artificial Intelligence (2010) (in press) 22. Faghihi, U., Fournier-Viger, P., Nkambou, R., Poirier, P., Mayers, A.: How Emotional Mechanism Helps Episodic Learning in a Cognitive Agent. Proceedings of the 2009 IEEE Symposium on Intelligent Agents (2009) 23. Nkambou, R., Belghith, K., Kabanza, F.: An Approach to Intelligent Training on a Robotic Simulator Using an Innovative Path-Planner. In: Ikeda, M., Ashley, K.D., Chan, T.-W. (eds.) ITS 2006. LNCS, vol. 4053, pp. 645–654. Springer, Heidelberg (2006) 24. Fournier-Viger, P., Wu, C.W., Tseng, V.S.: TRuleGrowth: Mining Sequential Rules Common to Multiple Sequences with Window Size Constraint. In: Proceedings of FLAIRS 2011, 6 pages. AAAI press, Menlo Park (2011) (in press) 25. Hipp, J., Güntzer, U., Nakhaeizadeh, G.: Data Mining of Association Rules and the Process of Knowledge Discovery in Databases. In: Industrial Conference on Data Mining, pp. 15–36 (2002) 26. Deogun, J., Jiang, L.: Prediction Mining – An Approach to Mining Association Rules for Prediction, pp. 98–108. Springer, Heidelberg (2005) 27. Li, D., Deogun, J.S.: Discovering Partial Periodic Sequential Association Rules with Time Lag in Multiple Sequences for Prediction, pp. 332–341. Springer, Heidelberg (2005)
Model Checking Commitment Protocols Mohamed El-Menshawy, Jamal Bentahar, and Rachida Dssouli Concordia University, Faculty of Engineering and Computer Science, Canada
[email protected],
[email protected],
[email protected] Abstract. We investigate, by means of model checking, the problem of verifying commitment protocols, which are widely used to regulate interactions among cognitive agents. We present a new logic-based language to specify commitment protocols, derived from extending CTL∗ with modalities for social commitments and associated actions. We report on the implementation of the NetBill protocol, an example motivated and specified in the proposed language, using three model checkers (MCMAS, NuSMV, and CWB-NC) and compare the experimental results obtained.
1 Introduction
A society mainly evolves through communication among cognitive participating entities. In such a society, people interact and exchange information with each other to satisfy their goals. Such communication requires languages and mechanisms to structure interactions among participants within dialogues. Correspondingly, a cognitive agent-based model for an artificial society should provide adequate support for communication languages by providing means to specify interaction protocols that regulate intelligent interactions among agents. It is also beneficial to equip this model with a formal verification method to eliminate errors, in the sense that protocols comply with the given specifications, and to increase confidence in the model in terms of safety and efficiency.

Motivation. This paper introduces a new logic-based language to specify a special kind of interaction protocols, called commitment protocols [5,6,13,12]. This logic extends CTL∗ [4] with modalities for social commitments and associated actions. Social commitments usually represent the contractual obligations that associate the cognitive agents with one another in order to capture multi-agent intelligent interactions in this artificial society. In this context, social commitments model cognitive aspects of agent interactions by enforcing them to keep the fulfillment of their contracts and reducing cognitive dissonance in agent behaviors. Moreover, we adopt these commitment protocols as they are increasingly used in different applications such as modeling business processes [5,6,13] and developing web-based applications [11,2,7]. Conventionally, commitment protocols are defined in terms of the creation and manipulation of social commitments that capture a rich variety of interactions among agents. Several approaches have
been proposed to specify commitment protocols [6,13,12,11]. However, the languages used to specify those protocols are either not suitable for model checking (a formal verification method) or defined in computational logics having weak expressive power.

Approach. Our approach begins with developing a new logic-based language by extending CTL∗ with modalities for social commitments and associated actions. The resulting logic, ACTL∗sc, is used to: 1) express well-formed formulae of commitments and their contents; 2) specify an abstract semantics of commitment actions; and 3) express desirable properties to be checked such as safety.

Fig. 1. A schematic view of our approach
By abstract semantics, we mean a semantics that captures the performance of the commitment actions without defining the meaning of all concrete action instances. Using an abstract action makes our approach more flexible, as it allows for capturing not only standard commitment actions like those introduced in [10], such as Withdraw, but also other possible actions such as Inform. However, the concrete semantics of the commitment actions considered in this paper is completely given in our previous work [8]. Using social commitments and their actions, we define a new specification language of commitment protocols. We present experimental results of verifying the NetBill protocol, taken from the e-business domain and specified in our language, against some properties. Finally, the implementation of this protocol is done using the MCMAS model checker [9], along with NuSMV [3] and CWB-NC [14] as benchmarks, after translating the protocol into these model checkers' input languages ISPL, SMV and CCS, respectively (see Figure 1).

Organization. Section 2 presents the syntax and semantics of the ACTL∗sc logic. In Section 3, we use commitments and their actions to define a new specification language of commitment protocols. The translation and implementation of the NetBill protocol and its verification using the MCMAS, NuSMV and CWB-NC model checkers, with experimental results, are discussed in Section 4. In Section 5, we discuss relevant literature and future work.
2 Commitments and ACTL∗sc Logic
The introduction of a new branching time ACTL∗sc logic is motivated by the fact that the needed modal connectives for social commitments and associated
actions cannot be expressed using CTL∗ [4] alone. Before introducing the syntax and semantics of our logic, we present the notion of social commitments.

Notation 1. Social commitments are denoted by SC(i, j, ϕ), where i is the debtor, j the creditor and ϕ a well-formed formula (wff) in the proposed logic representing the commitment content. SC(i, j, ϕ) means i socially (i.e., publicly) commits to j that ϕ holds [10,8].

Notation 2. Conditional commitments are denoted by τ → SC(i, j, ϕ), where "→" is the logical implication, i, j and ϕ have the above meanings and τ is a wff in the proposed logic representing the commitment condition [10,8]. We use SC^c(i, j, τ, ϕ) as an abbreviation of τ → SC(i, j, ϕ). In this case, we have SC(i, j, ϕ) ≡ SC^c(i, j, ⊤, ϕ), where ⊤ is the constant for truth.

Commitments can be manipulated using a set of actions defined in [10]. We classify these actions into two-party and three-party actions. The former need only two agents to be performed, such as: Create(i, j, SC(i, j, ϕ)) to establish a new commitment, Withdraw(i, j, SC(i, j, ϕ)) to cancel an existing commitment, Fulfill(i, j, SC(i, j, ϕ)) to satisfy the commitment content, Violate(i, j, SC(i, j, ϕ)) to reflect that there is no way to satisfy the commitment content, and Release(j, i, SC(i, j, ϕ)) to free a debtor from carrying out his commitment. The latter need an intermediate agent to be completed, such as: Delegate(i, k, SC(i, j, ϕ)) to delegate an existing commitment to another debtor k, who satisfies it on the original debtor's behalf, and Assign(j, k, SC(i, j, ϕ)) to assign an existing commitment to another creditor k.
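To make the taxonomy of actions concrete, here is a small Python sketch of a commitment record with the two- and three-party operations listed above. This is our illustration, not part of the formal framework; the status names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SocialCommitment:
    """SC(i, j, phi): debtor i commits to creditor j that content phi holds."""
    debtor: str
    creditor: str
    content: str
    status: str = "none"

    # Two-party actions
    def create(self):   self.status = "active"      # establish a new commitment
    def withdraw(self): self.status = "withdrawn"   # debtor cancels the commitment
    def fulfill(self):  self.status = "fulfilled"   # commitment content satisfied
    def violate(self):  self.status = "violated"    # content can no longer be satisfied
    def release(self):  self.status = "released"    # creditor frees the debtor

    # Three-party actions: another debtor/creditor takes over the commitment
    def delegate(self, new_debtor):  self.debtor = new_debtor
    def assign(self, new_creditor):  self.creditor = new_creditor

# Example inspired by the NetBill protocol discussed later.
sc = SocialCommitment("Cus", "Mer", "pay for the goods")
sc.create(); sc.delegate("Bank1"); sc.fulfill()
```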
2.1 Syntax of ACTL∗sc
We use Φp = {p, p1, p2, . . .} for a set of atomic propositions, AGT = {i, j, k, . . .} for a set of agent names and ACT = {θ, θ1, θ2, . . .} for a set of actions.

Definition 1. The syntax of ACTL∗sc is given by the following BNF grammar:

S ::= p | ¬S | S ∨ S | E P | C
C ::= SC(Agt, Agt, P)
P ::= S | ¬P | P ∨ P | XP | P U P | Create(Agt, Agt, C) | Θ(Agt, Agt, C)

where Agt and Θ are non-terminals corresponding to AGT and ACT respectively. Formulae in ACTL∗sc are classified into state formulae S and path formulae P. The intuitive meanings of most constructs of ACTL∗sc follow directly from the CTL∗ operators. The formula Aϕ (resp. Eϕ) means that ϕ holds along all (some) paths starting at the current state. The formula SC(i, j, ϕ) means agent i commits towards agent j to bring about a formula ϕ. The action formula θ(i, j, C) means an action θ is performed on the commitment C by i towards j. Furthermore, there are some useful abbreviations based on temporal operators: Fϕ ≡ true U ϕ (sometimes in the future) and Gϕ ≡ ¬F¬ϕ (globally).
2.2 Semantics of ACTL∗sc
The semantics of the ACTL∗sc logic is interpreted with respect to the formal model M associated to the commitment protocol, given as a Kripke structure M = ⟨S, ACT, AGT, R, V, Rsc, L, AS, I⟩, where: S is a finite set of states; AGT is a set of agents; ACT is a set of actions; R ⊆ S × AGT × ACT × S is a total labeled transition relation; V : Φp → 2^S is a valuation function; Rsc : S × AGT × AGT → 2^Π, where Π is the set of all paths, is the accessibility relation for social commitments; L : S → 2^(AGT×AGT) is an agency function that associates to each state the set of pairs of agents interacting in this state; AS ⊆ S is a set of acceptance states; and I ⊆ S is a set of initial states.

Instead of (sk, Agl, θk, sk+1), the labeled transitions will be written as sk --Agl:θk--> sk+1. The paths that path formulae are interpreted over have the form π = sk --Agl:θk--> sk+1 --Agl+1:θk+1--> sk+2 . . . such that ∀k, l ≥ 0, (sk --Agl:θk--> sk+1) ∈ R. A path in M is then an infinite sequence of states and labeled transitions with agents and their actions. π(m) refers to the mth state in this sequence. When a state sm is a part of a path π, we write sm ∈ π. The set of all paths starting at sk is denoted by Π^sk and ⟨sk, π⟩ refers to the path π starting at sk. A path π ∈ Rsc(sk, i, j) is an accessible path for the interacting agents i and j iff all states along this path are possible states for the two interacting agents. Formally: π ∈ Rsc(sk, i, j) iff sk = π(0) and ∀sm ∈ π we have (i, j) ∈ L(sm).

Excluding social commitments and action formulae, the semantics of ACTL∗sc state formulae is as usual (the semantics of CTL∗). The notation M, sk |= ϕ means that the model M satisfies ϕ at state sk, and M, sk, π |= ϕ means that the model M satisfies ϕ along the path π starting at state sk (sk = π(0)). The state formula SC(i, j, ϕ) is satisfied in the model M at sk iff the content ϕ is true in every accessible path from this state using Rsc(sk, i, j):

M, sk |= SC(i, j, ϕ) iff ∀π ∈ Π^sk s.t. π ∈ Rsc(sk, i, j) we have M, sk, π |= ϕ

A path π emanating from sk satisfies Create(i, j, SC(i, j, ϕ)) in the model M iff i:Create is the label of the first transition on this path and SC(i, j, ϕ) holds in the next state sk+1 on this path:

M, sk, π |= Create(i, j, SC(i, j, ϕ)) iff sk --i:Create--> sk+1 ∈ R and M, sk+1 |= SC(i, j, ϕ)

A path π emanating from sk satisfies θ(Agx, Agy, SC(i, j, ϕ)) in the model M, where Agx ≠ Agy, x ∈ {1, 2} and y ∈ {1, 2, 3}, iff Agx:θ is the label of the first transition on this path and the commitment has been created in the past:

M, sk, π |= θ(Agx, Agy, SC(i, j, ϕ)) iff sk --Agx:θ--> sk+1 ∈ R and ∃l ≤ k s.t. M, sl, π ↓ sl |= Create(i, j, SC(i, j, ϕ)), where π ↓ sl is the prefix of π starting at sl.
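As a toy illustration of the SC clause only (the full ACTL∗sc semantics above is richer, and this is our sketch rather than a model-checking algorithm), the following Python snippet checks SC(i, j, ϕ) at a state by testing the content, given as a predicate over paths, on every accessible path; the accessible paths here are finite prefixes supplied by hand.

```python
def holds_SC(state, i, j, content_holds_on, R_sc):
    """M, s |= SC(i, j, phi) iff phi holds along every path accessible via R_sc(s, i, j).
    `content_holds_on` stands in for the path semantics of phi (a callable on paths);
    `R_sc` maps (state, i, j) to the finite set of accessible path prefixes considered."""
    return all(content_holds_on(path) for path in R_sc(state, i, j))

# Tiny example: two accessible paths from s0; phi = "Deliver eventually appears".
paths = {("s0", "Mer", "Cus"): [["s0", "s4", "s5"], ["s0", "s4", "s8"]]}
labels = {"s5": {"Deliver"}, "s8": {"NotDeliver"}}
eventually_deliver = lambda path: any("Deliver" in labels.get(s, set()) for s in path)

print(holds_SC("s0", "Mer", "Cus", eventually_deliver,
               lambda s, i, j: paths[(s, i, j)]))   # -> False (the s8 path never delivers)
```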
3 Specification of Commitment Protocols
Our specification language of commitment protocols is defined as a set of commitments capturing the business interactions among agents (or roles) at design
time. In addition to which messages can be exchanged and when, a protocol also specifies the meanings of the messages in terms of their effects on the commitments, and each message can be mapped to an action on a commitment. The proposed specification language begins with the commitment COM, which is followed by a message MSG. This message is directly mapped into either a commitment action (captured by θ) or an inform action. Specifically, MSG could either be Withdraw, Fulfill, Violate, Release, Assign, Delegate or Inform. The delegated (resp. assigned) message is followed by a create message that enables the delegatee (resp. assignee) to create a new commitment. The inform message is an action performed by the debtor i to inform the creditor j that a domain proposition holds. It is not a commitment action, but it indirectly affects commitments by causing the transformation from conditional to unconditional commitments. The domain proposition Dom-Pro identifies the set of propositions related to the application domain of the protocol. The protocol terminates when the interacting agents do not have commitments to each other. The formal specification of commitment protocols is defined using a BNF grammar with meta-symbols "::=" and "|" for choice and ";" for action sequence, as follows:

Table 1. The formal specification of commitment protocols

Protocol ::= COM ; MSG
MSG      ::= Withdraw(i, j, COM) | Fulfill(i, j, COM) | Violate(i, j, COM) | Release(j, i, COM) | (Assign(j, k, COM) | Delegate(i, k, COM)) ; Create(i, j, COM) | Inform(i, j, Dom-Pro) ; MSG
COM      ::= SC^c(i, j, Prop, Prop) | SC(i, j, Prop)
Prop     ::= a well-formed formula in ACTL∗sc
Dom-Pro  ::= identified domain propositions
The above specification can either be used to reason about the actions at run time or compiled into finite state machines (FSMs) at design time. At run time, the agents can logically compute their execution paths that respect the given protocol specifications using some reasoning rules as axioms [13]. However, these rules are not enough to verify protocols against some properties when the system is complex. Moreover, the flexibility resulting from these reasoning rules can be expensive and may increase the code of the agents [13]. For these reasons, we adopt the second approach, which consists in compiling the commitment protocol into an enhanced FSM wherein commitments hold on states and actions label transitions, and thereby the legal computations can be inferred. This enables the protocols to be enhanced with additional transitions to handle exceptions.
3.1 A Motivating Example
Let us consider the NetBill protocol used in [13,8,1] and taken from the e-business domain to clarify our specification of commitment protocols. Figure 2 depicts
our modeling of the NetBill protocol using an enhanced FSM, where S is the set of vertices, AGT and ACT the sets of edge labels, I = {s0}, AS = {s6}, and R the set of edges. Our modeling begins with an inform message on a path starting at s0. In this message, the customer (Cus) requests a quote for some goods. This request is followed by the merchant (Mer) reply, which consists in sending the quote as an offer, i.e., creating a commitment. The Cus agent could either reject this offer, which means releasing the offer, in which case the protocol ends at the failure state s7 (as it is not in the set AS), or accept the offer, which means creating a commitment at s3. When the Cus agent accepts the received offer, he has three choices: 1) to withdraw his commitment; 2) to delegate it to a financial company (say Bank1); or 3) to directly pay the Mer agent. As in [12], the Bank1 agent can delegate this commitment to another bank (say Bank2), which delegates the commitment back to the Bank1 agent. The bank agents may delegate the commitment back and forth infinitely often, and this is represented by a transition loop at s11. In a sound protocol, this behavior should be avoided (see Section 4.2). The Mer agent, before delivering the goods to the Cus agent, can withdraw his offer and then move to the recovery state s9 after refunding the payment to the Cus agent. When the Cus agent pays for the requested goods and the Mer agent delivers them, the Mer agent fulfills his commitment at s5 and then moves to the acceptance state s6 after sending the receipt to the Cus agent. Conversely, the Cus agent can pay for the requested goods without the goods being delivered by the Mer agent. In this case, the merchant agent violates his commitment at s8 and then moves to the recovery state s9 after refunding the payment to the customer. Finally, the Mer agent, for some reason, can assign his commitment to another merchant (say Mer1) at s12, where a new commitment between Cus and Mer1 is created as a new offer. As for the delegate scenario, the assign action can be repeated infinitely often among agents and this scenario, represented by a transition loop at s12, is unwanted behavior in our protocol.
Fig. 2. Enhanced FSM of NetBill protocol with commitments and their actions
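The following Python sketch is an approximate, partial reconstruction of the enhanced FSM from the textual description above (not the authors' encoding); the intermediate state numbers not named in the text (s1, s2, s4) are our assumption, and Figure 2 contains further transitions (e.g., the assign scenario).

```python
# (source_state, agent:action) -> target_state; I = {s0}, AS = {s6},
# s7 = failure, s9 = recovery (see Figure 2 for the complete machine).
netbill = {
    ("s0", "Cus:Inform(price_request)"): "s1",
    ("s1", "Mer:Create(offer)"):         "s2",
    ("s2", "Cus:Release(reject)"):       "s7",   # failure state
    ("s2", "Cus:Create(accept)"):        "s3",
    ("s3", "Cus:Fulfill(pay)"):          "s4",
    ("s3", "Cus:Delegate(Bank1)"):       "s11",
    ("s11", "Bank1:Delegate(Bank2)"):    "s11",  # unwanted infinite delegation loop
    ("s4", "Mer:Fulfill(deliver)"):      "s5",
    ("s5", "Mer:Inform(send_receipt)"):  "s6",   # acceptance state
    ("s4", "Mer:Violate(not_deliver)"):  "s8",
    ("s8", "Mer:Inform(refund)"):        "s9",   # recovery state
}

def run(trace, start="s0"):
    """Replay a sequence of labeled actions and return the reached state, or None."""
    state = start
    for action in trace:
        state = netbill.get((state, action))
        if state is None:
            return None
    return state

print(run(["Cus:Inform(price_request)", "Mer:Create(offer)", "Cus:Create(accept)",
           "Cus:Fulfill(pay)", "Mer:Fulfill(deliver)", "Mer:Inform(send_receipt)"]))  # s6
```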
4 Implementation
In a nutshell, given the model M representing the NetBill protocol and a formula ϕ, the problem of model checking is establishing whether or not the model M satisfies ϕ. Clarke et al. showed that the problem of CTL∗ model checking can be reduced to the problems of CTL and LTL model checking [4]. This paper follows a similar approach by effectively reducing the problem of ACTL∗sc model checking to the problem of model checking ALTLsc and ACTLsc. ALTLsc and ACTLsc are LTL and CTL augmented with commitments and action formulae.
4.1 Translating Commitment Protocols
The first step in our implementation is to translate the NetBill protocol into: 1) ISPL (the input language of MCMAS); 2) SMV (the input language of NuSMV); and 3) CCS (the input language of CWB-NC). In ISPL, this process begins by translating the set of interacting agents (Mer, Cus, Mer1 and Bank1) into standard agents in the Agent section and the commitments into local variables in the Vars section. Such variables are of enumeration type, including all possible commitment states plus the acceptance, failure and recovery states. The actions on commitments are directly defined in the Actions section. The translation is completed by declaring the set I of initial states in the InitStates section and the formulae that need to be checked in the Formulae section. In SMV, the interacting agents are translated into isolated modules, which are instantiated in the main module. The latter module includes the definition of initial conditions using the INIT statement and the keyword SPEC to specify the formulae that need to be checked. The commitment states plus the successful, failure and recovery states are defined as SMV variables in the VAR statement. Such states, together with actions, are used as reasoning rules to automatically evolve the state changes. The labeled transition relation between commitment states is described using the TRANS statement. This statement uses the next and case expressions to represent the agents' choices in a sequential manner. In CCS, each agent in our protocol is represented by a set of processes and each process is specified using the proc statement. The states of commitments are captured by the set of actions performed on these commitments. Each transition is represented by the action labeling this transition followed by another process.
4.2 Expressing Protocol Properties
To automatically verify commitment protocols, we need to express some desirable properties. Specifically, we formally express a rich class of protocol properties: fairness, safety, liveness, and reachability, using the proposed logic. These properties include the properties defined in [2,5] and provide the same functionality as the properties defined in [12,1] (see our discussion in Section 5). Fairness constraint property. This property is needed to rule out unwanted behaviors of agents in our protocol. An example of a fairness constraint is given by ϕ1, which
states that along all paths it is globally the case that on all emerging paths the Bank1 agent eventually does not delegate the commitment:

ϕ1 = AG AF ¬Delegate(Bank1)

Safety property. This property means "something bad never happens". For example, in our protocol a bad situation is: the Cus agent sends the payment, but the Mer agent never delivers the requested goods:

ϕ2 = AG ¬(Pay(Cus) ∧ AG ¬Deliver(Mer))

Liveness property. This property means "something good will eventually happen". For example, in our protocol a good situation is: along all paths it is globally the case that if the Cus agent sends the payment, then in all computations in the future the Mer agent will either 1) deliver the requested goods; 2) withdraw his commitment; or 3) violate his commitment:

ϕ3 = AG(Pay(Cus) → AF(Deliver(Mer) ∨ Withdraw(Mer) ∨ NotDeliver(Mer)))

Reachability property. A particular situation can be reached from the initial state via some computation sequences. For example, along all paths it is eventually the case that there is a possibility in the future for the Mer agent to deliver the requested goods to the Cus agent:

ϕ4 = AF EF Deliver(Mer)
4.3 Experimental Results
We implemented the translation tools on top of the three model checkers (MCMAS, NuSMV, and CWB-NC) and provided a thorough assessment of this translation in three experiments. These experiments were meant to compare the verification results of checking the NetBill protocol against some properties specified in ACTLsc (a fragment of our logic) using MCMAS and NuSMV (symbolic-based techniques) and CWB-NC (an automata-based technique). We used two criteria (the model size and the execution time) to perform these comparisons. The model size is defined as |M| = |S| + |R|, where |S| is the state space and |R| is the relation space. The state space is computed by multiplying the state space of each agent and the state space of the protocol. We can approximate |R| by 10·|S|², where 10 is the number of ACTL∗sc operators, so we have |M| ≈ 10·|S|². The model size of our experiments is shown in Table 2.

Table 2. The model size in the three experiments

             Exp.1     Exp.2       Exp.3
Model size   3.5E+7    9.73E+10    6.23E+11

Table 3 displays the full results of these three experiments and the execution times (in seconds) on a laptop running 32-bit Windows Vista with 2 GiB of memory and an Intel Core 2 Duo clocked at 1.66 GHz. For instance, the execution time for MCMAS increases when augmenting the number of agents from 2 to 4, as the number of OBDD variables needed to encode agents increases with the number of agents. This proves the fact that the time complexity of model checking is a function of the size of the input problem. The results also reveal that the number of OBDD variables (which reflects the model size) is greater in NuSMV than in MCMAS, but the execution time in NuSMV is better than in MCMAS. In CWB-NC, when the size of the model is small, the execution time is better than in MCMAS and NuSMV (see Table 3). However, when the state space increases (as in Exp.3), the execution time in MCMAS and NuSMV becomes better than in CWB-NC. Notice that we put "–" in Table 3 because CWB-NC does not use OBDD variables. We can conclude that MCMAS and NuSMV are better than CWB-NC in terms of execution time.

Table 3. Verification results of MCMAS, NuSMV and CWB-NC

                      MCMAS                  NuSMV                  CWB-NC
                  Exp.1  Exp.2  Exp.3    Exp.1  Exp.2  Exp.3    Exp.1   Exp.2   Exp.3
# OBDDs             24     39     49       33     53     67      –       –       –
# Agents             2      3      4        2      3      4      2       3       4
Exec. Time (sec)  0.52      2      6     0.23   1.11   1.98    0.094   0.564   8.814
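A small worked illustration of the size estimate used above (|M| ≈ 10·|S|², with 10 the number of ACTL∗sc operators); the state-space figure in the example is only an assumption chosen to reproduce the order of magnitude of Exp.1.

```python
def model_size(num_states, num_operators=10):
    """|M| = |S| + |R|, with |R| approximated by num_operators * |S|^2."""
    return num_states + num_operators * num_states ** 2

# With roughly 1,870 reachable states, |M| is about 3.5E+7, the Exp.1 figure above.
print(f"{model_size(1_870):.2e}")   # ~3.50e+07
```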
5 Related Literature and Conclusion
Desai et al. [6] used a modular action description language to specify commitment protocols. Their approach composes multiple protocols to simplify the development of business processes. Yolum and Singh [13] used commitment actions to show how to build and reason about commitment protocols using the event calculus. Their approach only indirectly models the fulfillment of a commitment. Our approach belongs to the same line of research, but it complements the above frameworks by defining a more expressive logic-based language to specify commitment protocols, which is suitable for model checking. Venkatraman and Singh [11] presented an approach for testing whether the behavior of an agent complies with a commitment protocol specified in CTL. The proposed approach uses model checking to automatically verify the conformance of commitment protocols with specifications (properties). Cheng [2] and Desai et al. [5] used the automata-based model checker SPIN to verify commitment business protocols and their compositions expressed in LTL. Our language supports not only ALTLsc specifications, but also ACTLsc specifications. We also used the MCMAS and NuSMV tools, which are more efficient than SPIN (see Table 3). Bentahar et al. [1] proposed a framework where the interacting agent-based systems communicate by combining and reasoning about dialogue games. Our verification technique is based on encoding the protocol and formulae using symbolic model checkers instead of translating them into alternating Büchi tableau automata. Yolum [12] presented the main generic properties needed to design commitment protocols. Our properties meet these generic properties in the sense that reachability can be used to satisfy the same objective as the effectiveness property. The consistency property is achieved in our protocol by
satisfying the safety property. Moreover, the robustness property is satisfied by considering the liveness property and fairness paths.

Conclusion. In this paper, we presented a new logic to define a specification language of commitment protocols, which excludes spurious aspects that plague existing approaches. We described a verification technique to automatically, instead of semi-automatically [12], verify the compliance of these protocols with the given properties. In our implementation, we conducted three experiments, which demonstrate the following fact: the three model checkers were able to verify a variety of complex formulae correctly and efficiently, but the symbolic technique is better than the automata-based one. As future work, we plan to develop a new model checking algorithm for our logic to directly verify the proposed specification language.

Acknowledgements. The authors would like to thank the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Fonds Québécois de la recherche sur la société et la culture (FQRSC) for their financial support.
References 1. Bentahar, J., Meyer, J.-J.C., Wan, W.: Model Checking Agent Communication. In: Dastani, M., Hindriks, K.V., Meyer, J.-J.C. (eds.) Specification and Verification of Multi-Agent Systems, 1st edn., pp. 67–102. Springer, Heidelberg (2010) 2. Cheng, Z.: Verifying Commitment based Business Protocols and their Compositions: Model Checking using Promela and Spin. Ph.D. thesis, North Carolina State University (2006) 3. Cimatti, A., Clarke, E.M., Giunchiglia, E., Giunchiglia, F., Pistore, M., Roveri, M., Sebastiani, R., Tacchella, A.: NuSMV: An Open Source Tool for Symbolic Model Checking. In: Brinksma, E., Larsen, K.G. (eds.) CAV 2002. LNCS, vol. 2404, pp. 359–364. Springer, Heidelberg (2002) 4. Clarke, E.M., Grumberg, O., Peled, D.A.: Model Checking. The MIT Press, Cambridge (1999) 5. Desai, N., Cheng, Z., Chopra, A.K., Singh, M.P.: Toward Verification of Commitment Protocols and their Compositions. In: Durfee, E.H., Yokoo, M., Huhns, M.N., Shehory, O. (eds.) AAMAS, pp. 144–146. IFAAMAS (2007) 6. Desai, N., Singh, M.P.: A Modular Action Description Language for Protocol Composition. In: Proc. of the 22th AAAI Conference on AI, pp. 962–967 (2007) 7. El-Menshawy, M., Bentahar, J., Dssouli, R.: Modeling and verifying business interactions via commitments and dialogue actions. In: Jedrzejowicz, P., Nguyen, N.T., Howlet, R.J., Jain, L.C. (eds.) KES-AMSTA 2010. LNCS, vol. 6071, pp. 11–21. Springer, Heidelberg (2010) 8. El-Menshawy, M., Bentahar, J., Dssouli, R.: Verifiable semantic model for agent interactions using social commitments. In: Dastani, M., El Fallah Segrouchni, A., Leite, J., Torroni, P. (eds.) LADS 2009. LNCS, vol. 6039, pp. 128–152. Springer, Heidelberg (2010) 9. Lomuscio, A., Qu, H., Raimondi, F.: MCMAS: A model checker for the verification of multi-agent systems. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 682–688. Springer, Heidelberg (2009)
10. Singh, M.P.: An Ontology for Commitments in Multi-Agent Systems: Toward a Unification of Normative Concepts. AI and Law 7(1), 97–113 (1999) 11. Venkatraman, M., Singh, M.P.: Verifying Compliance with Commitment Protocols: Enabling Open Web-based Multi-Agent Systems. Autonomous Agents and MultiAgent Systems 2(3), 217–236 (1999) 12. Yolum, P.: Design Time Analysis of Multi-Agent Protocols. Data and Knowladge Engineering 63(1), 137–154 (2007) 13. Yolum, P., Singh, M.P.: Flexible Protocol Specification and Execution: Applying Event Calculus Planning using Commitment. In: Proc. of the 1st International Joint Conference on AAMAS, pp. 527–534. ACM, New York (2002) 14. Zhang, D., Cleaveland, W.R., Stark, E.W.: The integrated CWB-nC/PIOATool for functional verification and performance analysis of concurrent systems. In: Garavel, H., Hatcliff, J. (eds.) TACAS 2003. LNCS, vol. 2619, pp. 431–436. Springer, Heidelberg (2003)
Mobile Architecture for Communication and Development of Applications Based on Context L.M. Soria-Morillo, J.A. Ortega-Ramírez, L. González-Abril, and J.A. Álvarez-García Computer Languages and Systems Dept University of Seville 41012, Seville {lsoria,jortega,luisgon,jalvarez}@us.es
Abstract. The arrival of ubiquitous computing and the increasing use of mobile devices make it possible to obtain a lot of information about the user. This information is used by some applications to adapt their functionality to the user and the surrounding environment. In this way, applications need to become more autonomous and require less user interaction. However, the computational cost, battery consumption and the complex process of obtaining useful information from sensory data mean that many applications do not use this information in a massive way. This research proposes a framework and a middleware for the development, implementation and communication of contextual applications. Using this architecture allows communication between applications, so that they can share contextual information without each of them having to generate it. Thanks to the definition of a SOA platform, subscription to services provided by other applications is possible through the middleware.
1 Introduction

In the last few years, context-based computing has undergone a major revolution, mainly for three reasons: the integration of sensors into mobile devices, the increasing presence of new-generation mobile phones in the daily lives of users, and the massive connectivity between devices and the Internet. On the one hand, technological progress has allowed the integration of a large number of sensors in next-generation mobile devices; years ago, a large infrastructure would have been needed for such an assembly. Examples of these sensors are triaxial accelerometers, GPS-based positioning chips, wireless connectivity features like WiFi and Bluetooth, as well as light sensors and more. Furthermore, the integration of mobile devices in the daily lives of users allows the development of useful applications that can be used anywhere thanks to the portability of the devices they run on. We should note that, because we are working with portable devices, energy savings must be present in any development that takes place, as it will be necessary to maximize the usage time of the devices; battery life is therefore a problem. In addition, it is necessary to maximize battery time for the convenience of the users of the devices, since frequent recharging can displease them.
Applications launched by use of context are a step forward in computing. In such applications, the device itself launches applications based on specific conditions of environmental variables and the state of the user who carries it. By obtaining, analyzing and processing data from the sensors, and by minimizing the interaction between the user and the device itself, it is possible to speed up the use of the device.
2 Objectives

The core motivation of this research is to address the high degree of dispersion that arises when developers create their own implementations of context-based applications: no author to date has defined a framework that is fully adapted to the development of ubiquitous systems as a standard, from a common view for all mobile applications. The creation of this framework would be a great advantage for applications to exploit the knowledge of the context surrounding the user in order to increase their functionality. In addition to increased functionality, the most important benefit would be the total adaptation of the content presented to the user to the state of both the user and his/her environment at the time of execution. In addition to creating a standardized framework, the capabilities of the work environment must support the integration of all context-retrieval techniques in the most ubiquitous device known: the mobile phone. This should be done while maximizing the number of contexts recognized by the framework at any time, so that the potential for the development of applications is as high as possible. On top of this development framework and application structure, a communication middleware between applications is going to be developed. Thus, based on our own ontology for contextual information systems, applications can share all the necessary information. This drastically reduces the complexity of building new contextual applications, because the information could be provided by another application that already offers this functionality.
3 Scenario

The following is a usage scenario of the context-based development system. We will analyze the interaction of Daniel, a user of a next-generation mobile device who has acquired an application developed with the proposed framework and using the proposed middleware. It is 6:30 PM on a beautiful summer night. The system detects that Daniel is at home, thanks to the indoor positioning system and the labeling of common places. During the whole night, the mobile phone has been in eco-battery mode because, thanks to the activity recognition module, the system knows the phone has been lying on the table without any use. The system then detects that Daniel has left home, thanks to the detection of exits to outdoors, and automatically activates the GPS device. The exit detection module is not implemented by the application itself; instead, the application sends a request to the middleware to check whether the necessary information is available. After consulting the information, the middleware provides it to the application, in this case whether or not Daniel has left the location where he was.
It is Saturday, the day on which Daniel and his girlfriend Mar often go to the cinema to see the films released the previous day. Therefore, after he gets into the car, the frequent-destinations detection module determines that Daniel's most likely destination is the home of his girlfriend Mar, and it activates the navigation system with the specific purpose of monitoring the traffic to the destination. Once together, Mar and Daniel go to the cinema. At the cinema, Mar meets her sister, who, since she does not routinely use the Bluetooth device, has disabled it. This means that the subsystem responsible for detecting proximity using Bluetooth cannot find any matches. However, Mar starts talking to her sister about what they will do next weekend. Thanks to the voiceprint detection module, Daniel's device recognizes the voice of his sister-in-law Rosa and detects her proximity. Through a search process on the agenda, the device detects that Daniel still has something to talk about with Rosa, that is, he has a to-do with her. Because the system has detected that Daniel is close to her, it shows an alert with this task. Since the film does not start for half an hour, Daniel and Rosa decide to talk about it, so Daniel's agenda is released from the appointment with Rosa.
4 Related Works

The first research related to ubiquitous computing emerged in 1991 from the hand of Mark Weiser [2]. This work set a precedent in how to understand computing and brought the benefits of information systems to the daily lives of users, i.e., what we now understand by ubiquitous computing. During the following years, the trend in ubiquitous systems focused almost exclusively on the calculation of position and its applications. This was largely aided by the launch of GPS (Global Positioning System), which reached full operational capability in 1995, and by the creation of portable receivers for this technology. During the years 2000 and 2001 a series of frameworks for the management of ubiquitous systems were developed [6], [7] and [8]. In particular, the most widely used architecture of the systems of the time was the Context Toolkit. This tool allowed integrating several sensors in one system, so that the addition of a new sensor could be carried out quickly and efficiently without it being necessary to reprogram the entire system. However, the main limitation of the system was that applications could not obtain context data from the Context Toolkit, which conflicted with the philosophy of context-based applications that the scientific community adopted years later. However, the real revolution of ubiquitous computing has taken place in the past three years, due to the increase in sensors integrated into mobile devices. As we mentioned in the Introduction section, the massive use of mobile phones has encouraged the development of context-based applications. A clear example of this type of application is the work of Jae Sik Lee et al. [10]. It implements a music recommendation system based on the date, location, season, day of the week, average, maximum and minimum temperatures, as well as the weather conditions at the site and at the present moment. Thus, the system is able to determine the area where the user is and play the music the user usually plays in a context similar to the current one.
In recent years many frameworks have appeared, mainly related to the Internet of Things and collaborative environments [1]. However, mobile phone middleware is not yet fully defined, and there is no work in the literature that addresses this field, at least in an integrated manner.
5 Context Technologies

Having introduced the work we carry out, this section gives the basic ideas on which the system to be developed is based. In this section we make a clear distinction between the various technologies used to obtain the context in which the user is at any time. These technologies are determined by the device on which the framework is executed, so not all devices support all functionality. However, the technologies described throughout this section are included in the next-generation devices that are currently marketed. Any system that relies on context to run needs to know a set of information to model the environment around the user. On the one hand, this model allows execution decisions to be taken; on the other hand, the decisions taken by the user in a given context can be learned as a basis for possible automatic decisions. Traditionally, as presented above, it is necessary to know who performs the action, where it takes place, when it occurred and what action has been taken. The following technologies enable us to get answers to all these questions. Furthermore, the system adds new information that traditional systems have not taken into account.

5.1 Activity Recognition

The activity carried out by the user also has great value in determining the current execution context. When we speak of activity, we refer to the action carried out by the user, inferred from the analysis of the values obtained from the accelerometer sensors. By learning and classifying a number of parameters obtained from the accelerometer, the system is able to classify the activity that the user is performing. We do this using a classification system based on probability tables and decision trees at a first level and, at a second level, a hidden Markov model (HMM). In this respect we have developed a detection method for classifying activities with a success rate of 95%. In addition, with this system it is possible to add new activity patterns without reprogramming the platform.

5.2 Outdoor Location

At present there are very precise outdoor location systems. The most widespread of these is GPS. This system requires a direct view of the satellites to function properly, so indoors this system is invalid. Despite this drawback, its global coverage and accuracy make it the most widely used system worldwide. However, its main drawback is the high energy cost of the GPS receiver. This is especially important for receivers embedded in mobile devices. Therefore, at this point, the work focuses on reducing the energy consumption of this system in order to extend, as far as possible, the useful battery time between charges.
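As a hedged illustration of this energy-saving goal, which is developed in the next paragraph, a minimal sketch (ours, not the authors' implementation) of a controller that keeps the GPS receiver off while the user is indoors and re-enables it when the user goes outdoors; the GPS interface shown is hypothetical.

```python
class GpsDutyCycleController:
    """Toggle the GPS receiver from indoor/outdoor transitions detected elsewhere
    (e.g., by the exit-detection module mentioned in the scenario)."""
    def __init__(self, gps):
        self.gps = gps          # object exposing start()/stop(); hypothetical interface
        self.outdoors = False

    def on_environment_change(self, outdoors: bool):
        if outdoors and not self.outdoors:
            self.gps.start()    # user just went outside: resume positioning
        elif not outdoors and self.outdoors:
            self.gps.stop()     # user is indoors: GPS cannot get a fix, save battery
        self.outdoors = outdoors

class FakeGps:
    def start(self): print("GPS on")
    def stop(self):  print("GPS off")

ctrl = GpsDutyCycleController(FakeGps())
ctrl.on_environment_change(outdoors=True)    # GPS on
ctrl.on_environment_change(outdoors=False)   # GPS off
```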
A detection system for exits to and entrances from outdoors has been developed. This makes it possible to disconnect the GPS device when the user is indoors and reconnect it when the user goes out. This increases the usage time of mobile device batteries by 230% between charges.

5.3 Indoor Location

The current techniques for indoor positioning are useful in sensorized spaces, since most of them are based on RFID tags [3], bar-codes, QR-codes or proximity sensors to determine the user's position under a roof or the proximity to a particular object. However, this type of positioning technique is inconvenient since it requires an infrastructure for its operation. There are other projects that base their positioning on the triangulation of WiFi networks. The main problem with such solutions is that the WiFi access points must be of high quality in order to eliminate fluctuations in signal strength. In addition, we must bear in mind that in areas where the density of WiFi networks is low, the effectiveness of the system is not assured. The aim of this work in the field of indoor positioning is to obtain a new positioning technique based solely on the accelerometer, digital compass and other sensors included in next-generation mobile devices. Thus, it is possible to obtain an approximate user position, with a margin of error as small as possible, in any roofed space. In contrast to RFID-based techniques, the solution proposed in this work can be used anywhere without the building being adapted for positioning. This is important when we speak of a ubiquitous system. Using this technique it will be possible to position the user in any house, office or shopping center where he/she is.

5.4 People Detection

The detection of people near the user opens a range of possibilities for a context-based system, and therefore the functionality of the framework we are developing will increase. A detection system lets the application know at all times who is around the user. The system will even be capable of storing information about new people who had not previously interacted with the system's user. For the detection of people close to the user we will use two different but complementary technologies: detection based on sound fingerprints [9] and detection based on Bluetooth signals.

5.5 Semantic Context Induction

Mainly due to the poor security of the traditional methods of storing the information collected about the user's position, it is necessary to develop a new model. This model is based on the labeling of stored locations instead of the latitude and longitude of the user's location. The labeling of places introduces much semantic information into the storage system, so that only the user and the individuals within his/her environment can understand the information.
6 Architecture Layers

Having studied the various functional modules supported by the system, it is time to analyze the layers that make up the framework itself. Applications developed on top of the proposed layers will be able to perform all the tasks identified for context-based applications. Furthermore, each layer consists of one or more modules, which contain specific functionality for specific needs of the software being developed. Obviously, each module is interconnected with modules in other layers of the architecture, so the responsibilities of each one should be abundantly clear. Figure 1 shows the architecture of the framework with its seven layers. As we shall see when each layer is examined in depth, the main advantages of this architecture are, on the one hand, independence from the particular specifications of the sensors used and, on the other hand, the ability to share information with other applications and even with other information systems on remote machines or devices via the data exchange modules. We must also note that the architecture proposed in this chapter provides the ability to launch applications based on certain characteristics of the environment; these are denoted context-shot applications. As stated above, we now describe in more detail each of the modules of the architecture for the development of context-based applications.

6.1 Sensor Adaptation Module (SAM)

This part of the structure contains all the logic for adapting the data coming from the sensors of the device or from other applications through the middleware. Generally, the data must be processed before the application can proceed with its analysis, classification and use.

6.2 Information Classification Module (ICM)

The information coming from sensors or from other system applications is usually not enough to provide functionality to the application being developed; these data must be processed to generate new information. It is precisely this layer of the architecture that provides such processing to developers. The ICM layer is one of the main modules of the architecture; it is where information is processed by algorithms and processing methods. This layer is supported by the SAM layer and by the ISM storage layer, through which the application can obtain and persistently store all the information generated.

6.3 Information Storage Module (ISM)

The purpose of this layer is to provide a method for persistent or temporary storage of all the information generated by the lower layers of the application. This layer should provide methods to access, modify and delete the stored information. On the other hand, we should prevent, as much as possible, other layers of the architecture from accessing the ISM, except for the ICM and the SSL, because the stored context information usually has a high degree of privacy. It is therefore essential to prevent other applications or systems from gaining uncontrolled access to such information.
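The responsibilities of SAM, ICM and ISM can be summarized by interfaces of roughly the following shape; this is an illustrative sketch with hypothetical method names, not the framework's actual API.

```python
# Illustrative interfaces for the three lower layers. Method names are
# hypothetical; they only mirror the responsibilities described above.

from abc import ABC, abstractmethod

class SensorAdaptationModule(ABC):            # SAM
    @abstractmethod
    def read_normalized(self, sensor_id):
        """Return sensor data converted to a device-independent format."""

class InformationClassificationModule(ABC):   # ICM
    @abstractmethod
    def classify(self, normalized_samples):
        """Derive higher-level context (e.g. activity) from raw samples."""

class InformationStorageModule(ABC):          # ISM
    @abstractmethod
    def store(self, context_record, caller):
        """Persist a context record; caller is expected to be ICM or SSL."""

    @abstractmethod
    def query(self, criteria, caller):
        """Return stored context; access restricted to ICM and SSL."""
```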
Fig. 1. Framework architecture diagram
6.4 System Security Layer (SSL)

The security layer provides an effective mechanism to prevent access to information by applications without permission. This section does not specify particular security methods for preventing access, although it is recommended that access to the application's main database be done securely. In addition, whenever possible, the storage of information should be delegated to the middleware, which has a specific module for this purpose. In this way it is possible to isolate the application from the stored data, and the information can be accessed or stored using the functions that the middleware offers.

6.5 Share Information Module (SIM)

As mentioned above, access to the persistent or temporary information stored in the database should be limited to applications with sufficient privileges. In addition, the stored information must be marked as available to specific application profiles to provide greater security to the system. Furthermore, external applications residing on other systems may access the information through web services, so as to facilitate the exchange of information with other systems. This task is carried out by the SIM module, which enables the designed application to send and receive information from other systems via the Internet.

6.6 Context Aware Service Module (CASM)

This module stores information about the launching of applications based on a specific context. In particular, the stored information comprises the application to execute, the execution parameters and the context that must be met for execution. The CASM module communicates directly with the middleware module, sending it the registration information for an automatic launch. In addition, this module
provides a point of information exchange in case the current application must be executed because another application has registered the automatic execution process.
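A context-shot registration of the kind just described could look roughly as follows; the class, the condition format and the launcher callback are illustrative assumptions, not the real CASM interface.

```python
# Illustrative sketch of CASM: applications register the app to launch,
# its parameters, and the context condition that must hold. The middleware
# is expected to evaluate conditions against the current context.

class ContextAwareServiceModule:
    def __init__(self):
        self.registrations = []   # list of (condition, app_id, params)

    def register_context_shot(self, condition, app_id, params=None):
        """condition: callable taking the current context dict -> bool."""
        self.registrations.append((condition, app_id, params or {}))

    def on_context_update(self, context, launcher):
        # launcher is a hypothetical middleware callback that starts apps
        for condition, app_id, params in self.registrations:
            if condition(context):
                launcher(app_id, params)

# Example (hypothetical): launch a shopping-list app when the user is
# walking in a place labeled "supermarket".
# casm.register_context_shot(
#     lambda ctx: ctx.get("place") == "supermarket"
#                 and ctx.get("activity") == "walking",
#     app_id="shopping.list")
```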
7 Middleware

In this research a middleware following the SOA architectural style is being developed, which allows applications to provide services to other applications. In addition, an application can easily subscribe to the services offered by another. The proposed middleware consists of five layers. Figure 2 shows a diagram of the different modules of the middleware architecture.
Fig. 2. Middleware architecture preview
The first of these is the sensory data adaptation layer. It packages the information received from the sensors and introduces it into the middleware as a service, so that both the information generated by other applications and the information collected from the sensors are accessible from any application that requires them. Secondly, we have the data storage and security system. The former is responsible for persistently storing all the necessary information, so that it is available at any time to any application that needs it. As has been seen, this module should implement a data deletion policy to prevent malfunction due to excessive memory usage. The security manager is responsible for determining whether or not an application may use a particular service. The third layer comprises the service subscription module and the service discovery module. Although both are described later in detail, we must bear in mind that these modules are responsible for managing all the events generated by the middleware for sending and receiving context information between applications. The service discovery module is designed to provide applications with a way of knowing whether a particular service is available in the system. In contrast, the service subscription
module is used by applications to register with the notification system of a particular service whose availability has already been checked. At the next level of the middleware the context manager appears. This module is the intermediary between the applications and the middleware itself, providing an interface to access the system. In addition, it processes all requests coming from the applications and delegates them to the appropriate service module. The context manager also unifies the communication logic of all modules, standardizing the ontology used for communication. At the same level as this module is the application launching module, which is responsible for receiving requests and performing the automatic launching of applications. Finally there is the information exchange module, which aims to facilitate communication between applications that are not hosted on the device itself, using an external communication and dissemination channel of the middleware. This module is an extension point of the architecture, because it can extend the operation of the system to a collaborative environment in which applications are not always deployed on the device itself but in a distributed environment.
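The discovery and subscription modules can be pictured as a small service registry with notifications, along the following illustrative lines (class and method names are invented).

```python
# Illustrative sketch of the service discovery and subscription modules:
# providers publish named context services, consumers first check
# availability and then subscribe to be notified of new values.

class ServiceRegistry:
    def __init__(self):
        self.services = {}      # service name -> provider id
        self.subscribers = {}   # service name -> list of callbacks

    def publish(self, name, provider_id):
        self.services[name] = provider_id

    def is_available(self, name):            # service discovery module
        return name in self.services

    def subscribe(self, name, callback):     # service subscription module
        if not self.is_available(name):
            raise LookupError(f"service '{name}' not available")
        self.subscribers.setdefault(name, []).append(callback)

    def push_value(self, name, value):       # called by the provider
        for callback in self.subscribers.get(name, []):
            callback(value)
```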
8 Future Work

Although the development of the middleware and the architecture is still at an early stage, some objectives can be defined to ensure the proper adaptation of the architecture to the realm of contextual and ubiquitous applications. The identification of information sources compatible with the architecture should be completed, so as to obtain an ontology compatible with existing information sources and general enough to support new data that have not yet been taken into account. On the other hand, a set of optimal solutions should be provided for each of the requirements of the architecture layers. In this way, applications that use this architecture will rely on compatible technology, which will facilitate the reuse of modules among several applications at development time.
9 Conclusions

The result of this work aims to solve the traditional problems of context-based systems and to provide a common development environment for most devices. This ensures independence from individual sensors and provides a development interface for applications. The framework will allow context-based applications to be developed much more efficiently than has been possible so far. This should lead to an increase in this type of application, with the comfort this implies for the user. Finally, we must note that the structure of the framework allows developers to extend its functionality. In this way, developers can adapt the framework to their own needs, thereby increasing performance and quality and minimizing the time cost of developing context-based applications.
Acknowledgements. This research is partially supported by the projects of the Ministry of Science and Innovation INCARE (TSI2006-13390-C02-02) and ARTEMISA (TIN2009-14378-C02-01).
References
[1] Song, Z., Cárdenas, A.A., Masuoka, R.: Semantic middleware for the Internet of Things. In: Proceedings of Internet of Things (IOT) (2010)
[2] Weiser, M.: The computer for the 21st century. Scientific American 265(3), 94–104 (1991)
[3] Hameed, B., Khan, I., Dürr, F., Rothermel, K.: An RFID based consistency management framework for production monitoring in a smart real-time factory. In: Proceedings of Internet of Things (IOT) (2010)
[4] Schmidt, A., Beigl, M., Gellersen, H.-W.: There is more to Context than Location. Computers & Graphics Journal 23(6) (1999)
[5] Schmidt, A., Forbess, J.: What GPS doesn't tell you: determining one's context with low-level sensors. In: Proceedings of ICECS 1999, 6th IEEE International Conference on Electronics, Circuits and Systems (1999)
[6] Dey, A.K., Abowd, G.D.: The Context Toolkit: Aiding the Development of Context-Aware Applications. In: Workshop on Software Engineering for Wearable and Pervasive Computing, Limerick, Ireland, June 6 (2000)
[7] Dey, A.K., Abowd, G.D.: CybreMinder: A Context-Aware System for Supporting Reminders. In: Proceedings of the 2nd International Symposium on Handheld and Ubiquitous Computing (HUC2K), Bristol, UK, September 25-27, pp. 172–186 (2000)
[8] Dey, A.K.: Understanding and Using Context. Personal and Ubiquitous Computing 5(1), 4–7. Springer, Heidelberg (2001), http://www.personal-ubicomp.com/ ISSN: 0949-2054
[9] Lu, L., Zhang, H.-J. (eds.): Unsupervised speaker segmentation and tracking in real-time audio content analysis, vol. 14. Association for Computing Machinery, New York (2005)
[10] Lee, J.S., Lee, J.C.: Context Awareness by Case-Based Reasoning in a Music Recommendation System. In: Ubiquitous Computing Systems, pp. 45–58 (2008)
[11] Nord, J., Synnes, K., Parnes, P.: An Architecture for Location Aware Applications. In: Proceedings of the 35th Annual Hawaii International Conference on System Sciences (HICSS 2002), vol. 9 (2002)
A Simplified Human Cognitive Approach for Supporting Crowd Modeling in Tunnel Fires Emergency Simulation

Enrico Briano, Roberto Mosca, Roberto Revetria, and Alessandro Testa

DIPTEM – Department of Industrial Production, Technology, and Mathematical Models, Via Opera Pia 15, Genoa, Italy
{enrico.briano,roberto.revetria,alessandro.testa}@unige.it, [email protected]
Abstract. This work focuses on human behavior during the evacuation of a motorway tunnel. First of all, the technical prescriptions have been taken into account, in accordance with the current legislation updated after the Mont Blanc Tunnel tragedy of March 1999. In order to study how people behave in case of emergency, an innovative simplified approach has been applied, based on a mix of Thomas Saaty's Analytic Hierarchy Process (AHP) and Schmidt's PECS (Physical Emotional Cognitive Social) model. This methodology has been implemented in a simulation model featuring both 2D and 3D heat and smoke propagation, in order to investigate crowd reaction during escape in tunnel fire emergencies. All the scenarios have been analyzed and evaluated using Analysis of Variance (ANOVA) and Response Surface Methodology (RSM).

Keywords: Simulation, Human Cognitive Models, AHP, Human Behavior.
1 Introduction

In the last decade, safety in motorway tunnels, especially those longer than 1000 m, has significantly increased in importance within European Union regulation, particularly after the Mont Blanc tunnel disaster of March 24th, 1999. The Alpine countries (Italy, France, Switzerland, Austria, Germany) have the highest number of long tunnels among all European countries, so national law pays particular attention to safety, imposing strict rules on every technical aspect such as traffic lights, SOS stations, by-passes, escape signs and fire-fighting devices. This excess of rules results in a very "overloaded" environment where scared people easily get confused instead of being guided to safe places. The situation is even worse in Italy, the country with the highest number of road tunnels in Europe (50% of the total). In a recent study (2005-2008) led by the German Automobile Club (ADAC) in cooperation with 11 other automobile clubs, of a sample of 22 Italian tunnels examined, 14 did not respect the minimum safety criteria, presenting serious problems (missing emergency exits, inappropriate fire-fighting systems, etc.), and only 8 passed the examination.
Based on this research, the authors focused on the development of simulation models able to provide more effective measures for managing emergency situations, in order to identify critical situations and to support a better application of the EU technical rules, quickly increasing the safety level. After presenting a general methodology, this paper presents a case study focused on a real 4.5 km tunnel, considering some of the most important aspects of human behavior, such as physical, emotional, cognitive and social factors, which can have a serious impact on the management of the emergency, with the final goal of designing better safety in tunnels.
2 A Simulation-Based Methodology: PECS Reference Models

In the search for a methodology suitable for studying the human behavioral aspects of tunnel fire escape, the authors identified the Physical conditions, Emotional state, Cognitive capabilities and Social status (PECS) reference model. PECS is devoted to simulating human behavior embedded in a social environment; according to this definition, human behavior presents a complex structure that can be described by four different classes of factors, or state variables: physical, emotional, cognitive and social. The human being is consequently perceived, according to Schmidt, as a psychosomatic unit with cognitive capacities embedded in a social environment. The four classes of factors of the PECS reference model play a significant role in behavior control: a model of human behavior has to take into account the following state variables in order to be comprehensible and predictable:
1. Physical State Variables: in this case the ability to perceive, decide and eventually run away to safety if not poisoned or injured by smoke or fire.
2. Emotional State Variables: related to the level of fear, anger or courage experienced by the subject.
3. Cognitive State Variables: based on the capacity to correctly interpret the meaning of signs or the evolution of the current situation. This point is very important since the highest casualty level is among the people who remain in the car too long before deciding to escape.
4. Social State Variables: these play an important role in the first minutes, when leaders emerge and literally coach the crowd to safety or to death.
The type of modeling of human behavior depends on the nature of the problem; it is not mandatory to model all four classes of state variables and/or all the interactions among them, but the PECS model considered here includes all four classes, similarly to the one discussed by Schmidt: there are danger points to be avoided, lives and health to be preserved, and the need to use cognitive capacity in order to find the best possible solution, as well as the social component. Other people can significantly affect individual behavior in terms of cognitive capacity and fear; in fact, without an appropriate cultural basis the individual is pushed to behave "as the others do". Like in Schmidt's Adam model, the proposed approach considers different aspects:
• Environment: Like Adam, who lives in a world that is dynamic and constantly changing, the individual has to deal continuously with this environment.
• PECS State Variables: The individual physical state is represented by the energy level, while the emotional aspect is described by fear that can turn into panic. Adam has
also cognition of himself and his environment, like the individual in the tunnel who has to preserve his life and control his fear.
• Changes of the internal variables: The state variables are subject to continuous changes during the simulation time. Changes occur in two ways: autonomously and/or triggered by an input. For instance, the health or energy level is subject to both kinds of change: it decreases slowly with time and, when Adam falls into a trap or a danger point, the decrease is steeper. Vice versa, when Adam reaches a food source, his energy level increases, depending on the amount of food consumed, until it reaches a maximum value. The fear variable behaves in the same way: it decreases when Adam becomes more confident of himself and his world, but slightly increases when Adam falls into a trap.
• Deliberative and Reactive behavior: Reactive behavior follows simple rules, driven for instance by leading motives such as the hunger drive or the need for thinking, while deliberative behavior needs a predetermined planning action to reach a given goal.
• Internal and external actions: Adam's action repertoire contains external actions, which have an impact on the surrounding environment, like gathering a food source, but also internal ones, which only affect Adam himself, like planning or thinking. The same applies to the individual in the tunnel: he can struggle to protect his car from the fire (as too many do) or find a way to run away to safety.
• Learning: Adam has cognition of himself and his world and, thanks to his experience, recognizes some processes of the world he lives in, like the food regeneration process, learning ever better the time it takes for the food to grow again. However, this cognitive process is not sudden but in continuous evolution, because Adam initially does not know the food growth speed and learns it gradually so as not to make mistakes. In the same way, the individual can rapidly learn that some ways are blocked by fire and take a longer route to finally get to safety.
• Forgetting: Adam is capable of learning but is also subject to forgetfulness, because he is human and his memory capacity is finite. Since the total time spent by individuals in the tunnel is usually very short, no forgetting was taken into account in developing this model.
The model described in the following section has several points in common with Adam's model and the PECS reference model because, like Adam, the individual in the tunnel has to preserve his life by avoiding danger points (fire, smoke) and reaching the by-passes (instead of the food sources), using cognition to recognize the right road signs. In addition, however, he has to deal with other people who can affect his choices with their fear and their own behavior.
3 Modeling Human Perception and Judgment Using AHP

A critical situation such as a tunnel evacuation involves several parameters that affect the individual choice; for this reason the situation can be treated as a multi-criteria decision problem.
The individual, acting as a decision maker, detects a set of alternatives to be assessed on the basis of high-level criteria and related sub-criteria determined by empirical studies; each criterion has a percentage of importance and reports a score, which determines the impact of the criterion on the decision. The score of each decision alternative is the weighted average of the criterion scores, each multiplied by the weight assigned to its criterion. In order to describe the methodology in practical terms, the authors present a simple case applied to the evaluation of the attractiveness of signs; in this example the following sign typologies have been considered:
• The IsFireFight type: refers to a target (e.g., a fire extinguisher) that allows the individual to take active actions in order to limit the damage to property and life.
• The IsExit type: denotes a path with the possibility of reaching safety.
• The IsPhone type: denotes a path with the possibility of establishing communication with the rescue service (e.g., 911).
• The IsSOS type: denotes a path/place with the possibility of activating a fire alarm.
Each individual evaluates every sign type based on his internal know-how, which is the sum of his training level and his natural attitudes. For the considered example, only three categories of people have been considered, but many others can be added later:
1. "Informed" individual (or trained). Associated with a higher culture in safety matters, generally a truck driver. This individual tends to choose the alternative able to save his life.
2. "Uninformed" individual (or not trained). Without specific technical knowledge, generally a car driver; he tends to preserve his property first and, subsequently, his life.
3. "Hero" individual. He makes himself useful to the others; generally a medical doctor, a soldier, a retired fire fighter or another person usually involved in rescue activities.
For each of these typologies the importance of each sign type has been calculated with the AHP methodology. As an example, the procedure for Informed Type users is explained in detail. As a first step, the behavior local importance matrix {Xij} is built considering the importance of the i-th row with respect to the j-th column. For instance, considering row 1 (altruism) and column 2 (property), the following question arises for an "informed" individual: how important is altruism with respect to property? If the behavior importance displayed in the row (altruism) is greater than that in the column (property), the Xij element of the matrix is a number greater than 1 (otherwise a number lower than 1 is used).
$$P_j = \frac{\prod_i X_{ij}}{\sum_j \left( \prod_i X_{ij} \right)} \qquad (1)$$
$$Wg_j = \frac{\sum_{h=1}^{n} \left( X_{jh} \cdot P_h \right)}{\sum_{k} \left[ \sum_{h=1}^{n} \left( X_{kh} \cdot P_h \right) \right]} \qquad (2)$$
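For concreteness, equations (1) and (2) can be evaluated as in the following plain-Python sketch; the 3x3 local importance matrix used here is reciprocal but its numeric entries are invented for illustration and are not the values used in the study.

```python
# Illustrative evaluation of equations (1) and (2). X[i][j] answers
# "how important is behaviour i with respect to behaviour j"; the
# numbers below are hypothetical, not taken from the paper.

def priorities(X):
    """Equation (1): normalized column products of the comparison matrix."""
    n = len(X)
    col_prod = []
    for j in range(n):
        p = 1.0
        for i in range(n):
            p *= X[i][j]
        col_prod.append(p)
    total = sum(col_prod)
    return [p / total for p in col_prod]

def global_importance(X, P):
    """Equation (2): normalized weighted sums sum_h X[j][h] * P[h]."""
    scores = [sum(X[j][h] * P[h] for h in range(len(P))) for j in range(len(X))]
    total = sum(scores)
    return [s / total for s in scores]

# Hypothetical reciprocal 3x3 local importance matrix:
X = [[1.0, 3.0, 1/5],
     [1/3, 1.0, 1/7],
     [5.0, 7.0, 1.0]]
P = priorities(X)             # local priorities, eq. (1)
Wg = global_importance(X, P)  # global attractiveness values, eq. (2)
```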
By multiplying the local importance matrix by the local priorities (Eq. 2), the global importance or attractiveness Wgj of the j-th artifact is obtained; it is used by the simulation to attract the individual in one direction rather than another. In the same way the other two behavioral typologies have been analyzed. In particular, for the Uninformed Type user, who represents the most common type of car driver, no sign is more attractive than the others, especially if the signs have the same appearance, and the choice is merely made by chance. For the Hero Type user the fire-fighting related signs are, as expected, the most attractive. After determining the sign/target attractiveness for every type of individual involved in the scenario, it is necessary to specify their behavioral time evolution: this is the core of the simulation. The process can be divided into a sequence of logical steps; schematically we have:
1. The individual is "activated" (simulation start) in the post-event condition, with a "normal" breathing rate and without any carbon monoxide (CO) intoxication, and is associated with a predefined kind of individual according to statistics (i.e., informed, uninformed or hero).
2. The individual starts to become aware of the situation (he is still inside the car/truck/bus).
3. After a stochastic period of time, he exits from the car/truck/bus and starts to look around for a decision. In this process the exit from the vehicle is modeled as a general m/m/1/gd/gd/n queue according to the classical Markovian classification: not all the people inside the vehicle may exit simultaneously; it depends on how many doors the vehicle has and how many individuals are using each door, and generally they exit one after the other.
4. Once out of the car, the individual faces a series of possible choices, which eventually determine his path to safety.
5. According to the physical conditions of the place where he finds himself (i.e., heat, smoke), the individual identifies a target subset; he determines a possible path towards the chosen target by excluding the alternatives that cannot be followed because of heat or obstacles. The choice is based on instinct and on the importance assessment calculated with the above-mentioned AHP multi-criteria analysis.
6. After his choice, the individual has to evaluate the distance to cover and, according to the physical and environmental conditions, begins to run towards his target rather than simply walking. Along the way he may evaluate other possible alternatives, considering the shortest distance among the possible safe paths (obstacles, heat). When he is close to his target he generally slows down and eventually turns his run into walking.
7. Reaching his target, the individual stochastically evaluates the situation, resulting in a possible set of decisions: he may have reached a safe place, he may have reached a sign that pushes him towards another target, or he may get confused, starting
another evaluation of possible targets. In this last case the individual returns to step 4 until he reaches safety or, eventually, dies.
As time passes, the individual's physiological processes (breathing, heartbeat, etc.) involve the smoke and CO present in the tunnel, resulting in possible intoxication with several notable effects (i.e., confusion, panic, reasoning difficulties, slowing, etc.). Extreme heat is present in the scenario and may produce damage, causing hyperthermia, pain and skin burns, forcing the individuals to take extreme decisions that under normal circumstances would never be taken into consideration (i.e., crossing a fire to get out of a blocked situation, running away from smoke and thus away from a sign marking a possible emergency exit, etc.). In this way the proposed model is driven either by rational reactive behavior or by individual instinct, which can be affected by artifacts encountered along the escape path (i.e., SOS, fire extinguisher, phone, first aid kits, etc.). The model is also able to explain real-life observed behavior often caused by instinct, fear and panic that can significantly affect the rationality of the choice. For instance, at the beginning of the accident an individual is strongly torn because he cannot decide between leaving the car to reach safety and staying because he does not want to leave an expensive property, as a car generally is, to burn, and so he tries until the last moment to preserve his property. This indecision can cause anxiety, increasing the fear and panic level and inhibiting any rational choice, especially for Uninformed Type people. All these behaviors were extensively observed during the experimental campaign: hours of real tunnel camera videotape were carefully studied by the authors and later reproduced in two real-life exercises. The first exercise was the reconstruction of an accident that occurred in a mono-directional tunnel (axial ventilating system), and the other was the scenario reconstruction of a car accident in a bi-directional tunnel (cross ventilating system). In these real-life simulations two different tunnels were closed by the police at 2:00 AM, crashed cars were positioned and artificial smoke was created to recreate the drama of the scenario. In this situation the fire fighters made their intervention and the authors were present in order to observe and study the behavior of the people involved. After these real-life simulations, a computer-based model was created to reproduce the behavioral choices of the people involved. This modeling phase is explained in the next section.
4 Implementing a 2D Simulation Model: Methodology and Case Study

The implemented evacuation model presents two complementary and synergic modules with a graphical user interface (GUI):
• Data input and scenario creation module: it is implemented using the bCAD™ software and it is able to map the tunnel and to position the signs/artifacts and the safety devices (fire-fighting systems, SOS columns, etc.) along the carriageways, defining at the same time the artifact characteristics (i.e., height, colors, brightness, dimensions, etc.). This module helps to build the accident scenario by positioning the vehicles (cars, trucks and buses) involved, as well as by specifying the burning
materials (i.e., plastics, paper, wood, hydrocarbons, rubber, etc.). Data were validated against the American Society for Hazard and Risk Engineers handbooks.
• Evacuation detail module: this module is devoted to simulating the behavior of individuals using the methodology presented above.
The bCAD™ interface is similar to most commercial 2-D drawing tools like AutoCad™; within it the authors have developed a specialized scenario creation toolbar. Thanks to this application it is possible to localize the cars/trucks/buses involved in the accident, the artifacts and the safety systems, as well as to locate the fire starting point with its flash power (i.e., 30 MW in the considered scenario). The evacuation module allows the users to observe the fire and smoke evolution at different time steps. In particular the following are traced:
− the evacuation paths of the individuals involved in the accident;
− the smoke concentration and the heat radiance.
The case study reproduces an accident in a 4.5 km long tunnel, triggering a 30 MW fire starting from a truck; the accident involves 38 cars and 2 buses, each bus with 60 people on board. The simulator outputs analyzed as KPIs (Key Performance Indicators) are:
1. the number of people not yet escaped from the tunnel at the end of the 7-minute simulation time;
2. the number of people escaped within the simulation time;
3. the number of people injured or intoxicated by carbon monoxide (CO) at the end of the simulation time;
4. the minimum, maximum and average exit time from the vehicles, which represents a sort of "reaction time" after the fire starts;
5. the minimum, maximum and average escape time through the tunnel bypass, calculated from the start of the fire until the escape through the bypass.
These outputs have been used as objective functions for an Analysis of Variance (ANOVA) devoted to evaluating the effect of a new artifact layout in the tunnel (i.e., a new road sign configuration, a new ventilation procedure, etc.). The results of the analysis on the main objective functions are shown in Table 1.

Table 1. ANOVA results from the experimental campaign
Origin of Variation                        Variance   DoF   Mean Squared (MS)     F      Ftab
# People left in at simulation end time      48.17     1         48.17          13.14    7.71
Error                                        14.67     4          3.67
Total Variation                              62.83     5
# People Exited Safely                       37.50     1         37.50           7.76    7.71
Error                                        19.33     4          4.83
Total Variation                              56.83     5
Max Permanence Time                         607.22     1        607.22          13.92    7.71
Error                                       179.68     4         44.92
Total Variation                             786.90     5
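The F values in Table 1 follow directly from the mean squares (F = MS of the factor divided by MS of the error); the following plain-Python check recomputes the first row from the figures reported above (7.71 is the 5% critical value of F with 1 and 4 degrees of freedom).

```python
# Recomputing the F statistic for "# People left in at simulation end time"
# from the mean squares reported in Table 1.

ms_factor = 48.17   # mean square of the factor (1 degree of freedom)
ms_error = 3.67     # mean square of the error (4 degrees of freedom)

F = ms_factor / ms_error
print(round(F, 2))   # ~13.13, close to the 13.14 in Table 1 (rounded inputs)

F_tab = 7.71         # critical F(1, 4) value at the 5% significance level
print(F > F_tab)     # True: the new artifact layout is significant for this KPI
```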
The results obtained clearly show that the new artifact layout significantly affects some KPIs: the maximum permanence time in the tunnel, the number of people remaining in the tunnel at the end of the simulation and, to a lesser extent, the number of safely escaped people. On the contrary, the new artifact layout does not affect the minimum and average permanence times or the number of injured people. The experimental campaign was then extended with a 2³ factorial experimental design in which the following factors were considered:
1. Smoke speed: expressed in m/s, the speed at which smoke advances in the tunnel under the effect of the ventilation, with values ranging from 0.2 to 0.8 m/s.
2. Exit (green) sign radiance, measured in lux, with values ranging from 100 lux (minimum legal value) to 800 lux (maximum legal value).
3. Safe (red) sign radiance, measured in lux, with values ranging from 100 lux (minimum legal value) to 800 lux (maximum legal value).
The results of this experimental campaign are discussed in the following, along with an example of a response surface. The number of people remaining trapped in the tunnel at the end of the simulation is affected both by the smoke spreading speed and by the sign luminous intensity; the smoke spreading speed is definitely the most significant factor.
[Response surface plot: Number of Individuals at End Simulation Time (7 min) as a function of Smoke Speed [m/s] (0.2-0.8) and Safe (red) Radiance [lux] (100-800).]
Fig. 1. Example of the RSM obtained from the experimental campaign
Fig. 1 shows that as the smoke spreading speed increases, the sign luminous intensity becomes irrelevant; on the other hand, for low smoke spreading speeds, an increase in sign luminosity leads to an increase in the number of people who reach safety. In other words, once the smoke invades the tunnel there is no luminous intensity value, among those allowed by the enforced standards, able to orient people towards the exit. The regression model also points out that the Exit signal must prevail over the other signs: the Safe signs indeed create a background "noise" that worsens tunnel safety (in terms of users managing to reach the exit). The waiting time inside the motor vehicles before deciding to run away is barely affected: there is only a slight dependence of the average waiting time on the smoke speed. The more rapidly the users see the smoke coming towards them, the more quickly they decide to leave the car and run away. As for the waiting time in the car, there are no significant variations in the maximum and minimum times spent to reach safety. Conversely, the average time spent in the tunnel to reach a safety exit is affected both by the smoke speed and by the sign luminous intensity. In particular, as the smoke speed increases, the average stay time in the tunnel of the people who did not manage to reach safety decreases: an individual has to reach safety immediately, because waiting for the smoke makes the decision to escape come too late, exposing the individual to certain death. In this sense the abuse of traffic lights inside tunnels, as suggested by many, turns into confusion for untrained people, resulting in a higher risk of death and injury (intoxication/burns), since the sign luminosity increases the permanence time in the tunnel. The authors also developed a 3-D model devoted to analyzing the flows of the escaping people and, in particular, the smoke evolution dynamics inside the tunnel. The aim of this 3-D model is to provide feedback about the evolution of smoke propagation that the evacuation model is not able to give; in fact smoke propagates mainly along the vertical axis, so a 3-D model is required to study the phenomenon. The 3-D model has been rendered and inserted in a virtual scenario in which a car can be driven by the user in order to inspect the situation. For now there is only one type of vehicle, but in the future it will be possible to model ambulances or fire department trucks in order to provide a credible environment for training purposes.
5 Conclusions

The results of the proposed model, compared with real-life observations, demonstrate the validity of the approach; in particular, during model validation, subject matter experts recognized the emerging behaviors as very similar to those observed in many real-life situations. The EU Directive 2004/54/CE (Dispositions concerning Road Tunnel Safety) mentions the possibility of using computer simulation instead of real-life exercises, but the computer simulators available today are still far from being a credible alternative to practical training; for this reason the Italian D.Lgs. 5/10/2006 n. 264, despite being essentially a translation of 2004/54/CE, provides no support for computer-based simulation, forcing road tunnel owners to adopt very expensive real-life exercises.
This approach can be regarded as a significant starting point for future research in the field of cognitive system simulation, allowing computer simulation to be adopted to ensure a higher level of safety in today's tunnels at a fraction of the cost.
References
Schmidt, B.: The Modelling of Human Behaviour. SCS Europe Press, Ghent (2000) ISBN 1-56555-211-3
Benevolo, P.: Tunnel sotto Esame. Onda Verde (119), 38–39 (2009)
Urban, C.: PECS – A Reference Model for the Simulation of Multi-Agent Systems. In: Tools and Techniques for Social Science Simulation. Physica Verlag, Heidelberg (1999)
Rao, A.S., Georgeff, M.P.: BDI-Agents: From Theory to Practice. In: Proceedings of the 1st International Conference on Multi-Agent Systems (ICMAS), San Francisco, CA, USA (1995)
Goleman, D.: Emotional Intelligence. Bloomsbury Publishing Plc., London (1996)
Schmidt, B., Touissant, A.: Referenzmodelle für Strategien. SiP, Heft 3, 8–15 (1996)
Saaty, T.L.: Decision Making for Leaders: The Analytical Hierarchy Process for Decisions in a Complex World. Wadsworth/RWS (1988) ISBN 0-9620317-0-4
Banerjee, B., Abukmail, A., Kraemer, L.: Advancing the Layered Approach to Agent-Based Crowd Simulation. In: Proceedings of PADS 2008, Rome, Italy, June 3-6 (2008) ISBN 978-0-7695-3159-5
Lee, S., Son, Y.J.: Integrated Human Decision Making Model under the Belief-Desire-Intention Framework for Crowd Simulation. In: Proceedings of WSC 2008 (WinterSim Conference), Miami, FL, USA, December 7-10 (2008)
Kaup, D.J., Clarke, T.L., Oleson, R., Malone, L., Jentsch, F.G.: Introducing Age-Based Parameters into Simulations of Crowd Dynamics. In: Proceedings of WSC 2008 (WinterSim Conference), Miami, FL, USA, December 7-10 (2008)
Briano, E., Revetria, R., Testa, A.: Behavior Models for the Evacuation of a Motorway Tunnel. In: Proceedings of WSEAS ICOSSE 2009, Genoa, Italy, October 17-19 (2009)
Briano, E., Revetria, R.: A Study of Crowd Behavior in Emergency Tunnel Procedures. International Journal of Mathematics and Computers in Simulation 2(1), 349–358 (2008)
Model Checking Epistemic and Probabilistic Properties of Multi-agent Systems

Wei Wan(1), Jamal Bentahar(2), and Abdessamad Ben Hamza(2)

(1) Department of Electrical and Computer Engineering, Concordia University
(2) Concordia Institute for Information Systems Engineering, Concordia University
[email protected], {bentahar,hamza}@ciise.concordia.ca
Abstract. Model checking, a formal automatic verification method, has been widely used in multi-agent systems to verify specifications that contain qualitative properties (e.g., safety and liveness) and quantitative properties. Decision making processes based on inherent knowledge are necessary for agents to act appropriately, particularly in uncertain settings. In order to check epistemic (i.e., knowledge) and measurable properties in multi-agent systems, we propose a new logic, PCTLK, which uses probabilistic, epistemic, and temporal modal operators. We exploit Discrete-Time Markov Chains (DTMC), in which we are able to represent measurable properties with probability, to model uncertainty in multi-agent systems. We extend the formalism of interpreted systems by adding probabilistic features to suit DTMC models and present the model checking algorithm for our logic. At the end of this paper, we simulate our algorithm using an example of online shopping.

Keywords: Probabilistic model checking, Discrete-Time Markov Chains (DTMC), Epistemic logic, Probabilistic logic, Interpreted systems.
1 Introduction
Multi-agent systems are comprised of a set of intelligent agents interacting with each other. In such systems, reasoning under uncertainty is an important feature. Model checking, a well-designed formal technique for verifying whether models satisfy specific properties, is widely used to verify the epistemic properties of these systems [9,10,13]. In the past two years, researchers in [3,4,7] have proposed methods to model and check multi-agent systems using Markov chains, which are stochastic processes behaving as transition systems extended by probabilistic choices among successor states. The reason why probabilistic model checking is gaining more and more interest is that classical model checking techniques focus only on the absolute guarantee of correctness, and assumptions are made that systems run in an ideal world. However, in practice concrete scenarios are characterized by a degree of uncertainty. For example, in multi-agent systems, because of the agents' autonomous reactivity, their actions are based on observing the environment changes, but very often agents cannot observe the complete environment. Therefore, agents
must make estimations about the observable world as a part of their decision-making processes. Considering probabilistic aspects when modeling the system allows such estimations to be provided by assessing the likelihood of different events. Two categories of properties exist [1]: immeasurable qualitative properties and measurable quantitative properties. For qualitative properties, Boolean variables (true or false) can be used to express whether a good event will happen or a bad event will never happen. However, sometimes the likelihood of certain events needs to be considered. For example, we may want to know "after ordering, what percentage of items will be successfully delivered". In this case, qualitative properties cannot provide the precise probability required. Quantitative properties are thus also necessary to accurately express the system requirements, and probabilistic model checking seems to be appropriate. Checking quantitative properties for knowledge-based multi-agent systems is the main motivation of this paper. Because the majority of existing multi-agent logics only consider certain knowledge, modeling and verifying uncertain knowledge still needs more investigation. Uncertain knowledge can be represented using probabilities, and the multi-agent system can be modeled as a probabilistic Kripke-like model. In this paper, we define a new logic called probabilistic temporal logic of knowledge, PCTLK, which extends PCTL, the probabilistic computation tree logic proposed in [6], with the knowledge operator K. We also extend the classical interpreted systems used to model multi-agent systems [5] by adding probabilistic transition functions. This extension allows us to model interpreted systems as Discrete-Time Markov Chains (DTMC), which are transition systems with probability distributions. Two reasons are behind the use of DTMC: first, DTMC is the underlying formal model of PCTL, and second, DTMC are widely used to model systems with uncertain information. In order to verify properties expressed with PCTLK, we put forward a model checking algorithm for this new logic. The structure of this paper is organized as follows. In Section 2, we explain how we extend the standard interpreted systems to be modeled using DTMC. In Section 3, we present the syntax and semantics of the PCTLK logic. The model checking problem of PCTLK is described in Section 4. We provide an example, customer online shopping, to simulate the model checking algorithm in Section 5. Finally, we summarize our work and list future work in Section 6.
2 Interpreted Systems for DTMC
Probability is used in the design and analysis of an agent to measure the likelihood that some specific events will occur. There are several system modeling methods that can express probability attributes. One of the most popular operational probabilistic models is the Markov chain [1], which is a transition system with probability distributions over the transitions. DTMC is one of the three Markov chain models; in a DTMC the system is in a given state at each "step", with the state changing randomly between steps.
In Markov chains, the process can only be in a finite set of states. Over a set of atomic propositions AP, a DTMC model can be expressed as a tuple (S, P, I_init, L), where:
• S is a nonempty and finite set of states.
• P : S × S → [0, 1] is the transition probability function, such that for every state s ∈ S, we have ∑_{s'∈S} P(s, s') = 1.
• I_init : S → [0, 1] is the initial distribution, such that ∑_{s∈S} I_init(s) = 1.
• L : S → 2^{AP} is a state labeling function.
For mathematical treatment purposes, the initial distribution I_init can be viewed as a row vector (I_init(s))_{s∈S}, in which each entry gives the initial probability of the corresponding state of the model. The transition probability function P : S × S → [0, 1] is represented by the matrix (P(s, t))_{s,t∈S}. The probabilities from state s to its successors t are shown on the row of the matrix, while the probabilities of entering state s from a state t are shown on the column of the matrix.
According to Fagin et al. in [5], in order to model knowledge we need a framework based on a number of other possible states of affairs besides the true state of affairs. The formalism of interpreted systems, which provides a well-defined semantics to reason about time and knowledge in multi-agent systems, is frequently used in epistemic model checkers such as MCMAS [11] and MCK [13]. We extend Fagin et al.'s interpreted systems with probability attributes and then introduce our probabilistic model checking algorithm.
Let A = {1, . . . , n} be a set of n agents in the system. Every agent i ∈ A is associated with a set of local states L_i and a set of possible actions Act_i. Besides L_i and Act_i, there are also a set L_e and a set Act_e for the environment, a special agent providing global variables and actions that all agents are able to access. Therefore, for the system, the set of global states G ⊆ L_1 × . . . × L_n × L_e is the set of all possible tuples (l_1, . . . , l_n, l_e), and each tuple represents a computational state of the whole system. For each agent i, we also use a protocol P_i : L_i → 2^{Act_i} assigning a list of enabled actions to each local state. Associated with the environment is a protocol P_e : L_e → 2^{Act_e} that represents the functioning behavior of the environment agent.
The probabilistic transition function T for the system can be defined as T : G × Act × P_Act × G → [0, 1], where Act is the set of joint actions Act ⊆ Act_1 × . . . × Act_n × Act_e performed by all the agents and the environment respectively. Each agent is associated with a local probabilistic transition function t_i ⊆ T. P_Act is a probability function over the set of joint actions: for every global state g ∈ G, with transitions to states g' ∈ G, ∑_{g'∈G} P_Act(g, g') = 1. P_{Act_i} is the local probability function for each agent i. Given a global initial distribution I_init, we have ∑_{s∈G} I_init(s) = 1. Given a set of atomic propositions AP and an interpretation V ⊆ G × AP, an interpreted system over probabilistic transition functions is a tuple:
IS = ⟨(L_i, Act_i, t_i, P_{Act_i}, P_i)_{i∈A}, I_init, V⟩
Now, we need to associate a DTMC, which is a Kripke-like structure extended with probabilities, to our extended interpreted system in order to evaluate
epistemic probabilistic properties. The resulting model is M_IS = (W, Pt, I'_init, ∼1, . . . , ∼n, V'). The conversion rules from the interpreted system IS to our DTMC interpreted system M_IS are described as follows.
• W ⊆ G is the set of reachable states. A state is reachable if and only if its probability is greater than 0 from the initial distribution via T.
• I'_init ⊆ I_init is the initial distribution of the model.
• Pt : W × W → [0, 1] is a probability relation, obtained from the protocols P_i and the probabilistic functions t_i, which are subsets of the transition probability function T.
• ∼i are the epistemic relations, one for each agent. Each is a subset of W × W for i ∈ A, such that for two global states, (l_1, . . . , l_n) ∼i (l'_1, . . . , l'_n) if and only if l_i = l'_i.
• V' ⊆ V is the valuation function V' : W × AP → {true, false}.
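A minimal sketch of how the resulting model M_IS could be represented concretely is given below (plain Python, illustrative field names); the epistemic relations ∼i are kept implicit through the agents' local states.

```python
# Illustrative container for M_IS = (W, Pt, I'_init, ~1, ..., ~n, V').
# Two global states are ~i-related iff agent i's local state is identical
# in both, so the relation is derived rather than stored explicitly.

class DTMCInterpretedSystem:
    def __init__(self, states, transitions, init_dist, local_state, valuation):
        self.W = states                 # list of reachable global states
        self.Pt = transitions           # dict: (s, s') -> probability in [0, 1]
        self.init = init_dist           # dict: s -> initial probability
        self.local_state = local_state  # dict: (agent, s) -> local state of agent in s
        self.V = valuation              # dict: (s, atomic proposition) -> bool

    def successors(self, s):
        return [(t, p) for (u, t), p in self.Pt.items() if u == s and p > 0]

    def epistemic_class(self, agent, s):
        """All states the agent cannot distinguish from s (the ~agent relation)."""
        return [t for t in self.W
                if self.local_state[(agent, t)] == self.local_state[(agent, s)]]
```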
3 PCTLK Logic
The language we will use to specify properties is called Probabilistic Computation Tree Logic of Knowledge, or PCTLK. This branching-time logic extends the Computation Tree Logic (CTL) proposed by Clarke et al. in [2] by adding epistemic and probabilistic operators. Thus this language combines CTL, epistemic logic [5], and probabilistic logic [1,6].
3.1 Syntax of PCTLK
PCTLK supports two kinds of formulae: state formulae φ and path formulae ψ. We use p, p1, p2, . . . to range over the set of atomic propositions Φp. Given a set of agents A = {1, . . . , n}, the syntax of this logic is as follows:
φ ::= true | p | φ ∧ φ | ¬φ | Ki φ | Pr⋈b(ψ)
ψ ::= ◯φ | φ ∪≤n φ | φ ∪ φ
where b ∈ [0, 1] is a rational bound on the probability of the path, ⋈ ∈ {<, ≤, >, ≥} is a relational operator indicating the kind of probability bound, and n ∈ N is an integer indicating the maximum number of steps allowed to reach a specific state. There are no universal (∀) or existential (∃) path quantifiers in PCTLK. Instead, the linear temporal operators ◯ (next) and ∪ (until) are required to immediately follow the probabilistic operator Pr. By removing Ki φ from this logic we obtain PCTL [1,6], and by removing φ ∪≤n φ from PCTLK and introducing the ∃ quantifier we obtain CTLK [5]. Three fragments, a temporal logic, an epistemic logic, and a probabilistic logic, are defined for PCTLK in addition to the standard Boolean connectives. The propositional temporal fragment has the same meaning as in CTL. For example, the formula ◯φ means "in the next step φ holds", and φ1 ∪ φ2 means "φ1 holds until φ2". A new step-bounded variant φ1 ∪≤n φ2 is added, meaning that φ2 will hold within at most n steps while φ1 holds in all states before the φ2
state has been reached. The step-bounded until is necessary in probabilistic logic because the probability of reaching φ2 within at most n steps may differ from that of reaching φ2 within at most n + 1 steps. As in CTL, where the temporal operators ◯ and ∪ must be immediately preceded by a path quantifier, in PCTLK they must immediately follow the operator Pr. Other Boolean connectives and operators are derived in the same way as in CTL; for example, ♦φ = true ∪ φ (eventually φ). The step-bounded eventuality is obtained similarly from the step-bounded until: ♦≤n φ = true ∪≤n φ. The always operator can be derived by the formula Pr≤b(□φ) = Pr≥(1−b)(♦¬φ). The details can be found in [1]. The probabilistic operator Pr⋈b(ψ) expresses that "ψ is true with probability ⋈ b". For illustration, Pr≥0.9(◯ message_receive) asserts that "with probability at least 0.9, the message will be received in the next step". The epistemic operator Ki is an S5 modality [10], which means knowledge is reflexive and Euclidean [12]. Ki φ represents "agent i knows that φ". There are some important properties of the epistemic operator. For example, Ki φ ∧ Ki(φ ⇒ ψ) ⇒ Ki ψ means that when an agent knows φ and knows that φ ⇒ ψ, the agent knows ψ as well. Agents also know what they know and what they do not know. These properties can be expressed as Ki φ ⇒ Ki Ki φ and ¬Ki φ ⇒ Ki ¬Ki φ. More properties and their proofs can be found in [5]. There are also interesting properties combining knowledge and probabilities, for example Ki(Pr≥0.5 φ) ⇒ Pr≥0.5 φ.
3.2 PCTLK Semantics
Let s ∈ S be a state, π a given path, a ∈ AP an atomic proposition, φ a PCTLK state formula, and ψ a PCTLK path formula. A path is an infinite sequence of states related by transitions, i.e., π = s0, s1, s2, . . . The (i + 1)-th state in π is denoted π(i), i.e., π(i) = si. s |= φ denotes "s satisfies φ" or "φ is true in s"; π |= ψ denotes "π satisfies ψ" or "ψ is true in π". The semantics of PCTLK is as follows.
• For a state s:
s |= a iff V(s, a) = true
s |= φ1 ∧ φ2 iff s |= φ1 and s |= φ2
s |= ¬φ iff s ⊭ φ
• For a path π:
π |= ◯φ iff π(1) |= φ
π |= φ1 ∪≤n φ2 iff ∃ 0 ≤ k ≤ n, π(k) |= φ2 and ∀ 0 ≤ i < k, π(i) |= φ1
π |= φ1 ∪ φ2 iff ∃ k ≥ 0, π(k) |= φ2 and ∀ 0 ≤ i < k, π(i) |= φ1
• For the epistemic operator:
s |= Ki φ iff ∀ s' ∈ W, if s ∼i s' then s' |= φ
For the probabilistic operator, s |= Pr⋈b(ψ) means that the probability, from state s, that ψ is true on an outgoing path satisfies ⋈ b.
With the probability bounds "≥ 1" and "> 0", PCTLK turns into CTLK: Pr≥1 is equivalent to ∀ and Pr>0 is equivalent to ∃. PCTLK is also an extension of PCTL [1,6], as it includes the epistemic operator Ki. The properties expressed using PCTL can be checked by the PRISM model checker [8]. Therefore, based on the PCTLK syntax, a formula can be a CTL formula, a CTLK but not PCTL formula, a PCTL but not CTLK formula, a conjunction of CTLK and PCTL formulae such as Ki φ ∧ Pr>0.5(ψ), or a pure epistemic probabilistic formula of the form Ki(Pr⋈b(ψ)). The structure of PCTLK is shown in Fig. 2.
In order to calculate the set of states Sat(φ) satisfying a formula φ in PCTLK, we first need to decompose the formula φ into sub-formulae φ' and compute Sat(φ') recursively. This is done using a bottom-up traversal of the parse tree of φ. For example, the conjunction Ki φ ∧ Pr⋈b(ψ) is decomposed into a sub-formula of CTLK (Ki φ) and a sub-formula of PCTL (Pr⋈b(ψ)). Based on an interpreted system M_IS = (W, Pt, I'_init, ∼1, . . . , ∼n, V'), the algorithm for computing Sat(φ) is shown in Table 1.
Proof. The validity of cases a through c is straightforward. The proofs of cases d and e can be found in [10] and [1]. We only prove case f. The proof of f consists of two parts.
1. Show that if s ∈ Sat(Ki(Pr⋈b(φ1))), then s satisfies Ki(Pr⋈b(φ1)):
s ∈ Sat(Ki(Pr⋈b(φ1)))
⇒ s ∈ W − Y
⇒ ¬(∃ s' ∈ X such that s ∼i s')
⇒ ∀ s' ∈ W, s' ∉ X ∨ ¬(s ∼i s')
⇒ ∀ s' ∈ W with s ∼i s', s' ∉ X
⇒ ∀ s' ∈ W with s ∼i s', s' ∉ Sat(¬Pr⋈b(φ1))
⇒ ∀ s' ∈ W with s ∼i s', s' ∈ Sat(Pr⋈b(φ1))
⇒ ∀ s' ∈ W with s ∼i s', s' |= Pr⋈b(φ1)
⇒ s |= Ki(Pr⋈b(φ1))
2. Show that any state s satisfying Ki(Pr⋈b(φ1)) belongs to the set Sat(Ki(Pr⋈b(φ1))):
s |= Ki(Pr⋈b(φ1))
⇒ ∀ s' ∈ W, (s ∼i s') → s' |= Pr⋈b(φ1)
⇒ ∀ s' ∈ W, ¬(s ∼i s') ∨ s' ∈ Sat(Pr⋈b(φ1))
⇒ ¬(∃ s' ∈ X such that s ∼i s')
⇒ s ∉ Y
⇒ s ∈ Sat(Ki(Pr⋈b(φ1)))
The algorithm for the calculation of the set of states SatCTLK(φ) can be found in [10]. For PCTL, to determine whether s ∈ SatPCTL(φ), where φ = Pr⋈b(ψ), we need to compute the probability Prob(s |= ψ) of the event specified by ψ. The set SatPCTL(φ) can then be calculated as Sat(Pr⋈b(ψ)) = {s ∈ W | Prob(s |= ψ) ⋈ b} (⋈ ∈ {<, ≤, >, ≥}). The details of the algorithm are discussed in [1]. To compute the probability for the "next" operator (Pr⋈b[◯φ]), we first need to compute Sat(φ) and then sum over all the states in Sat(φ): ∑_{s'∈Sat(φ)} Pt(s, s').
Table 1. Algorithm for PCTLK model checking

Input: MIS = (W, Pt, Iinit, ∼1, . . . , ∼n, V), PCTLK formula φ
Output: Sat(φ), the set of states satisfying φ

1. Decompose the parse tree of formula φ
2. For each sub-formula φ′ of φ (bottom-up) {
     Case φ′ {
       a. φ′ is an atomic formula a: return {g ∈ W | V(g, a) = true};
       b. φ′ is ¬φ1: return W − Sat(φ1);
       c. φ′ is φ1 ∧ φ2: return Sat(φ1) ∩ Sat(φ2);
       d. φ′ is in PCTL: return SatPCTL(φ′);
       e. φ′ is in CTLK: return SatCTLK(φ′);
       f. φ′ is Ki(Pr⋈b(φ1)): {
            X = Sat(¬Pr⋈b(φ1));
            Y = {s ∈ W | ∃s′ ∈ X and s ∼i s′};
            return W − Y;
          }
     }
   }
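The case analysis of Table 1 can be sketched in a few lines of code. The fragment below is only an illustration, not the authors' implementation: the tuple-based formula encoding, the accessible relation and the sat_pctl/sat_ctlk callables, which stand in for external PCTL and CTLK back-ends such as PRISM or MCMAS, are assumptions introduced here.

  # Illustrative Python sketch of the Table 1 recursion (all names are assumptions).
  # W: set of global states; V(s, a): truth of atomic proposition a in s;
  # accessible[i][s]: set of states s' with s ~i s'; sat_pctl / sat_ctlk: external back-ends.
  def sat(phi, W, V, accessible, sat_pctl, sat_ctlk):
      rec = lambda f: sat(f, W, V, accessible, sat_pctl, sat_ctlk)
      kind = phi[0]
      if kind == 'atom':                              # case a: {g in W | V(g, a) = true}
          return {g for g in W if V(g, phi[1])}
      if kind == 'not':                               # case b: W - Sat(phi1)
          return W - rec(phi[1])
      if kind == 'and':                               # case c: Sat(phi1) intersected with Sat(phi2)
          return rec(phi[1]) & rec(phi[2])
      if kind == 'pctl':                              # case d: delegate to a PCTL checker
          return sat_pctl(phi[1])
      if kind == 'ctlk':                              # case e: delegate to a CTLK checker
          return sat_ctlk(phi[1])
      if kind == 'K_pr':                              # case f: Ki(Pr_bound(phi1))
          i, bound, phi1 = phi[1], phi[2], phi[3]
          X = W - sat_pctl(('pr', bound, phi1))       # Sat(not Pr_bound(phi1))
          Y = {s for s in W if accessible[i][s] & X}  # some ~i-accessible state lies in X
          return W - Y
      raise ValueError(kind)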
Therefore, we obtain the probability vector by multiplying the probability relation Pt with a bit vector b for Sat(φ), in which b(s) = 1 when s ∈ Sat(φ) and b(s) = 0 otherwise.

In order to compute SatPCTL(Pr⋈b[φ1 ∪≤k φ2]), we first compute Sat(φ1) and Sat(φ2). Then we can identify the states that are certainly in the satisfaction set, S_IN = Sat(φ2), and those that are certainly not in it, S_OUT = W \ (Sat(φ1) ∪ Sat(φ2)). The remaining states, which may or may not be in the satisfaction set, form S_unclear = W \ (S_IN ∪ S_OUT). After dividing the states, we can compute the probabilities with the recursive equation:

Prob(s, φ1 ∪≤k φ2) =
  1,                                              if s ∈ S_IN
  0,                                              if s ∈ S_OUT
  0,                                              if s ∈ S_unclear and k = 0
  Σ_{s′∈W} Pt(s, s′) · Prob(s′, φ1 ∪≤k−1 φ2),     if s ∈ S_unclear and k ≥ 1

For the "until" operator, we can treat the unbounded until as a bounded until with an unlimited step bound. The probability equation is then as follows:

Prob(s, φ1 ∪ φ2) =
  1,                                              if s ∈ S_IN
  0,                                              if s ∈ S_OUT
  Σ_{s′∈W} Pt(s, s′) · Prob(s′, φ1 ∪ φ2),         otherwise

If the formula is a pure PCTLK formula Ki(Pr⋈b(φ1)), then we first compute the set X of states satisfying ¬Pr⋈b(φ1), and then return all the states in W except those that have an ∼i-accessible state in X.
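As a minimal numerical sketch of these two recursions (assuming the DTMC is given as a dense transition matrix and that S_IN and S_OUT have already been computed), the bounded case can be unrolled step by step and the unbounded case obtained as its limit; this is an illustration, not a description of the PRISM machinery referenced above.

  import numpy as np

  def prob_bounded_until(Pt, s_in, s_out, k):
      # Prob(s, phi1 U<=k phi2) for all states; s_in / s_out are boolean vectors for S_IN / S_OUT.
      unclear = ~(s_in | s_out)
      x = s_in.astype(float)                     # k = 0: 1 on S_IN, 0 elsewhere
      for _ in range(k):                         # each iteration applies the recursion once
          x = np.where(s_in, 1.0, np.where(unclear, Pt @ x, 0.0))
      return x

  def prob_until(Pt, s_in, s_out, tol=1e-10, max_iter=100000):
      # Unbounded until as the limit of the bounded recursion (plain value iteration);
      # solving the linear system restricted to S_unclear would be the usual alternative.
      unclear = ~(s_in | s_out)
      x = s_in.astype(float)
      for _ in range(max_iter):
          y = np.where(s_in, 1.0, np.where(unclear, Pt @ x, 0.0))
          if np.max(np.abs(y - x)) < tol:
              return y
          x = y
      return x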
Theorem 1. The complexity of PCTLK model checking in interpreted systems is PSPACE-complete.

Proof. The model checking of PCTLK is composed of the model checking of: 1) CTLK; 2) PCTL; and 3) Ki(Pr⋈b(φ1)). The complexity of 1) in interpreted systems is known to be PSPACE-complete, and the same proof can be used for 2). For 3), computing X needs polynomial space (the complexity of PCTL), and computing Y needs only logarithmic space, as only two states need to be memorized. Thus, the whole complexity is PSPACE-complete.
5 Case Study
We use a simple online shopping example to illustrate our algorithm. The customer requests a delivery; the system successfully delivers the goods in 95% of the cases and fails in 5% of the cases. The DTMC model is shown in Fig. 3.
Fig. 3. DTMC for online shopping
We have two agents in this system: a customer agent and a server agent. The local states of the customer agent are {s0}, while those of the server agent are {s1}. The environment agent states are {s2, s3}. Thus, the global state space is W = {s0, s1, s2, s3}. The initial distribution Iinit is [0, 1, 0, 0]. The probability relation is

Pt =
⎡ 0  1  0     0    ⎤
⎢ 0  0  0.95  0.05 ⎥
⎢ 0  0  1     0    ⎥
⎣ 0  0  0     0    ⎦

We would like to check that after a customer orders, he knows that with probability at least 0.9 the items will be successfully delivered in the next step, i.e., Kc(P≥0.9(○ successful)):

1. Sat(successful) = {s2}
2. Pr(○ successful) = Pt · b_{Sat(successful)}:

⎡ 0  1  0     0    ⎤   ⎡ 0 ⎤   ⎡ 0    ⎤
⎢ 0  0  0.95  0.05 ⎥ · ⎢ 0 ⎥ = ⎢ 0.95 ⎥
⎢ 0  0  1     0    ⎥   ⎢ 1 ⎥   ⎢ 1    ⎥
⎣ 0  0  0     0    ⎦   ⎣ 0 ⎦   ⎣ 0    ⎦

3. Pr(○ successful) = [0, 0.95, 1, 0], so Sat(P≥0.9(○ successful)) = {s1, s2}
4. Convert Kc(P≥0.9(○ successful)) to checking P≥1(○ P≥0.9(○ successful)). This conversion reflects the semantics of Ki by representing the quantifier ∀ using P≥1.
5. P≥1(○ P≥0.9(○ successful)): compute Pt · b_{Sat(P≥0.9(○ successful))}:

⎡ 0  1  0     0    ⎤   ⎡ 0 ⎤   ⎡ 1    ⎤
⎢ 0  0  0.95  0.05 ⎥ · ⎢ 1 ⎥ = ⎢ 0.95 ⎥
⎢ 0  0  1     0    ⎥   ⎢ 1 ⎥   ⎢ 1    ⎥
⎣ 0  0  0     0    ⎦   ⎣ 0 ⎦   ⎣ 0    ⎦

6. The resulting probability vector is [1, 0.95, 1, 0], so Sat(P≥1(○ P≥0.9(○ successful))) = {s0, s2}.

s0 is a customer agent local state. Therefore, the formula is satisfied. This case study has been implemented in the PRISM and MCMAS model checkers.
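The numeric steps above can be replayed with a few lines of NumPy. The snippet only mirrors the computation reported in the case study (the matrix, bit vectors and thresholds are taken from the text, and Kc is handled through the P≥1 conversion of step 4); it is not the PRISM/MCMAS encoding itself.

  import numpy as np

  Pt = np.array([[0, 1, 0,    0   ],
                 [0, 0, 0.95, 0.05],
                 [0, 0, 1,    0   ],
                 [0, 0, 0,    0   ]])

  b_successful = np.array([0, 0, 1, 0])            # Sat(successful) = {s2}

  pr_next = Pt @ b_successful                      # step 2: [0, 0.95, 1, 0]
  b_p09 = (pr_next >= 0.9).astype(float)           # step 3: Sat(P>=0.9(next successful)) = {s1, s2}

  pr_outer = Pt @ b_p09                            # step 5: [1, 0.95, 1, 0]
  sat_formula = np.where(pr_outer >= 1)[0]         # step 6: states s0 and s2

  print(pr_next, pr_outer, sat_formula)            # s0, the customer's local state, is included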
6 Conclusion and Future Work
In the field of multi-agent systems, properties of agents such as their knowledge, beliefs, and autonomous reactions and pro-actions are important aspects for modeling such systems. In this paper, we extended interpreted systems by adding probabilistic functions so that we are able to use them as DTMCs and to model probabilistic multi-agent systems. We also introduced a new language (PCTLK) to cover epistemic and probabilistic features, and designed an efficient model checking algorithm for our DTMC interpreted systems model and properties expressed using PCTLK. In our algorithm, there are three modules: a propositional temporal module, a probabilistic module, and an epistemic module. These features allow us to use different model checking tools. We also proved that the complexity of our model checking is the same as for CTLK and PCTL.

Delgado and Benevides [4] modeled multi-agent systems using DTMCs with synchronization actions and defined the K-PCTL logic to specify properties. They use the PRISM probabilistic model checker to verify their work. They do not use or generate interpreted systems for DTMCs. Instead of using PRISM directly, we embed an epistemic model checking package into it. In [7], Jamroga proposed the Markov temporal logic MTL, an extension of "Discounted CTL" (DCTL), which uses a discount factor to capture the probabilistic aspect. He used Markov chains to model multi-agent systems, but the main focus is on quantitative properties, not epistemic properties.

Our probabilistic knowledge model checking is an ongoing project. In the future, we will focus on the following aspects:
• Currently, our model checking algorithm is based on decomposing the parse tree of the formula. We are planning to use symbolic model checking methods, such as MTBDD-based (Multi-Terminal Binary Decision Diagrams) or OBDD-based algorithms.
• Our PCTLK language only includes the simple epistemic operator Ki. In the future, we expect to extend the logic to include more epistemic properties such as common knowledge, distributed knowledge, and commitments.
References

1. Baier, C., Katoen, J.-P.: Principles of Model Checking. MIT Press, Cambridge (2008)
2. Clarke, E.M., Grumberg, O., Peled, D.: Model Checking. MIT Press, Cambridge (1999)
3. Dekhtyar, M.I., Dikovsky, A.J., Valiev, M.K.: Temporal verification of probabilistic multi-agent systems. In: Pillars of Computer Science, pp. 256–265. Springer, Heidelberg (2008)
4. Delgado, C., Benevides, M.: Verification of epistemic properties in probabilistic multi-agent systems. In: Braubach, L., van der Hoek, W., Petta, P., Pokahr, A. (eds.) MATES 2009. LNCS, vol. 5774, pp. 16–28. Springer, Heidelberg (2009)
5. Fagin, R., Halpern, J.Y., Moses, Y., Vardi, M.Y.: Reasoning about Knowledge. MIT Press, Cambridge (1995)
6. Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal Aspects of Computing 6, 512–535 (1994)
7. Jamroga, W.: A temporal logic for Markov chains. In: Proc. AAMAS 2008, May 12–16, pp. 697–704 (2008)
8. Kwiatkowska, M., Norman, G., Parker, D.: PRISM: Probabilistic symbolic model checker. In: Proc. TOOLS 2002, pp. 200–204. Springer, Berlin (2002)
9. Lomuscio, A., Pecheur, C., Raimondi, F.: Automatic verification of knowledge and time with NuSMV. In: IJCAI 2007, Hyderabad, India, pp. 1384–1389 (2007)
10. Lomuscio, A., Penczek, W.: Symbolic model checking for temporal-epistemic logics. SIGACT News 38(3), 77–99 (2007)
11. Lomuscio, A., Raimondi, F.: MCMAS: A model checker for multi-agent systems. In: Proc. TACAS 2006, pp. 450–454. Springer, Berlin (2006)
12. Penczek, W., Lomuscio, A.: Verifying epistemic properties of multi-agent systems via bounded model checking. Fundamenta Informaticae, 167–185 (May 2003)
13. Gammie, P., van der Meyden, R.: MCK: Model checking the logic of knowledge. In: Alur, R., Peled, D.A. (eds.) CAV 2004. LNCS, vol. 3114, pp. 479–483. Springer, Heidelberg (2004)
14. Wooldridge, M.J.: An Introduction to Multi-Agent Systems. John Wiley & Sons, Chichester (2009)
Modeling Users of Crisis Training Environments by Integrating Psychological and Physiological Data Gabriella Cortellessa, Rita D’Amico, Marco Pagani, Lorenza Tiberio, Riccardo De Benedictis, Giulio Bernardi, and Amedeo Cesta CNR – Consiglio Nazionale delle Ricerche, ISTC Via S.Martino della Battaglia 44, I-00185 Rome, Italy
[email protected] Abstract. This paper describes aspects of a training environment for crisis decision makers who, notoriously, operate in highly stressful and unpredictable situations. Training such decision makers is the goal of PANDORA-BOX, a system which is able to teach a class of trainees representing different authorities that coordinate their interventions in critical situations. This paper dwells on the selection and modeling of the relevant human features that are shown to have an influence on decision making under crisis. The chosen features are used to create a trainee model on the basis of which the system adjusts the training exercises with the ultimate goal of maximizing the effectiveness of training. Trainee models are built by merging physiological and psychological data, and are represented by means of a timeline-based approach, a representation derived from planning technology. The infrastructure built for the trainee modeling constitutes the basis to assess the influence of specific variables (e.g., personality traits, self-efficacy, stress and anxiety) on the performance of crisis managers during the training. Keywords: Cognitive Model, Training, Psychological and Physiological Data.
1 Introduction Public health and safety crises, such as hurricanes, floods and infectious disease pandemics, are characterized by uncertainty, time pressure and high stakes, requiring good decisions made quickly by all those involved in the response effort, from the senior officials to the local personnel involved in the treatment of people affected by the crises [1]. During these crises there is a tremendous need for effective leaders. Research from a number of fields demonstrates that critical thinking and decision-making skills make it possible for people in at-risk situations to recognize pre-event factors that lead to injury events and to act promptly to prevent an injury event or to diminish its severity. Little research exists about the specifics of how leaders effectively respond to a crisis (e.g., [2]) and how their personal characteristics can be evaluated in advance of a crisis occurring [3]. Nonetheless, it is very important to understand these processes in order to facilitate the development of capability-building interventions before crisis events occur and, in this way, try to prevent the negative consequences of the crisis itself. In this context it is evident how training plays an important function in the preparation of a crisis manager. A great amount of investment is being devoted to the development of training procedures to increase the ability of crisis managers to face emergency
situations. Two main modalities are used for training purposes: (a) the table-top exercise (paper-based exercises and led group discussions guided by a simulated disaster); (b) the real-world simulation exercise (field tests replicating emergency situations). Table-top exercises are low cost and can be easily and frequently organized, but they cannot recreate the real atmosphere in terms of stress, confusion and the need to take critical decisions in very little time. On the other hand, crisis managers trained through simulation exercises in the field can be very effective and can gain valuable skills, but such simulations are very expensive and cannot be easily and quickly organized. The authors are working on the PANDORA project, the aim of which is to bridge the gap between these two alternative modalities by providing an intelligent training environment, called the PANDORA-BOX, which will synthesize a realistic simulation able to reproduce the stressful factors and to adapt to different trainees. In case of success, several challenging aspects will make the PANDORA-BOX an innovative tool: an "engaging" and realistic reconstruction of a training environment with evolving crisis scenarios that emulate realistic pressure and stress among trainees, and a personalized environment that takes into account specific aspects of each individual trainee and allows the personalization of training programs. The focus on the "affective state" of the crisis manager is supported by maintaining knowledge of human behavior in all phases of the emergency management class. It is worth underscoring that most current training environments target the operational level of commanders, while PANDORA is completely devoted to the training of "strategic decision makers", thus entailing the need to foster critical and creative decision making. This paper describes work done during the first year of the PANDORA project, aiming at personalizing training according to static and dynamic analysis of trainees. In what follows, Section 2 provides a brief description of the overall PANDORA-BOX architecture to give an idea of the building blocks of the training system; Section 3 then describes the main features that have been chosen to model the trainee's affective states in connection with stress during a crisis. In particular, the section describes the goal of obtaining a user model by integrating psychological and physiological data. Section 4 shows the current implementation of the identified features in the PANDORA-BOX using planning technology. Some conclusions end the paper.
2 PANDORA-BOX: An Adaptable Tool for Crisis Training
The PANDORA-BOX system should act as an interactive training tool for groups of crisis managers. In particular, the PANDORA User Requirements Analysis suggested focusing on classes of up to six people to be trained with sessions of about 4 hours supported by a human trainer. The specific scenario we chose for demonstrating the training is "severe weather conditions", which is particularly rich: any variant of it may imply problems for all the agencies responsible for emergency management. The quality of the training content is ensured by the presence of the Cabinet Office Emergency Planning College (EPC), which exercises a wide range of senior decision makers working in collaboration to resolve emergency situations through leading experts in emergency planning and crisis management. The system will act as a sort of "intelligent
mediator" between trainer and trainees, following a mixed-initiative interaction style. Figure 1 summarizes the main architectural idea pursued within PANDORA. Specifically, a group of trainees, representative of the different agencies involved in the resolution of a crisis (e.g., Civil Protection, Local Authorities, Health, Fire Rescue, Police, Transportation and so on), has access to the training system through their training workstations. If some of the representative authorities are not present in the class but are considered necessary for didactic reasons, they can be simulated in the system through Non-Player Characters (NPCs), in which case their features and decisions are synthesized by the trainer directly.
Fig. 1. The PANDORA loop around the trainees. Three main modules orchestrate their work to provide adaptable and effective training. An additional module ensures trainer supervision.
The various participants in the training are characterized by different facets, in relation both to the components closely linked to their role and responsibility, and to the particular affective states they may exhibit during the training experience in response to the presented stimuli. Each trainee, by interacting with the system, feeds personal data to the PANDORA-BOX, which gathers this information to build a user model (the Behavioral Model in the figure). Based on this model, the system synthesizes, using the Behavioral Planner, a personalized training path that meets the specific needs and status of each trainee. The output of this process is passed to a second module (the Crisis Planner), which, on the basis of the Behavioral Planner decisions as well as the knowledge of the chosen guiding training scenario, synthesizes a sequence of stimuli appropriate for both the group (information shared among all trainees) and the individual trainees (information tailored to induce the "right level of stress" for each trainee). The plan synthesized by the Crisis Planner is then given as input to the module called the Environment and Emotion Synthesizer, which is responsible for an effective rendering of the temporal training plan. In practice such a module adds an additional level of "realism" to the stimuli,
by customizing the appropriate presentation mode (e.g., introducing noise on a phone call report) in order to achieve a high level of realism, stress and pressure. The use of advanced 3D scenario reproduction is also studied in the project. A separate module (the Trainer Support Framework) allows the trainer to keep control of the training session and dynamically adjust the stimuli based on his/her experience as a trainer. Overall, the PANDORA-BOX supports the loop trainer→training environment→trainee, encouraging customization and adaptation based on the users' feedback as well as on goals and suggestions from the trainer.
3 Collecting Relevant Data to Model Crisis Decision Makers The selection of the factors relevant to building the trainee model has been inspired by different sources, among which the analysis of the state of the art, frequent interviews with our expert trainer, and participation in a simulated training exercise as both trainees and observers. Specifically, the analysis of the state of the art has been particularly useful to identify the main "affective factors" that play a crucial role in decision making under crisis, while the last two sources have contributed to grounding the choice on a solid and concrete basis. The results of this multi-dimensional analysis are shown in Table 1. The variables taken into account are personality traits, leadership style, background experience, self-efficacy, stress and anxiety. In particular, these variables are subdivided into (a) static features, which do not change during training episodes, being mainly related to individual personality, and (b) dynamic features, which are on the contrary related to both the context and time, and so may vary during the training. Both dynamic and static variables are used to create the initial trainee profile, while the dynamic ones are also used to update the model. Similarly, some of the variables are used to derive rules to personalize training, and some of them are used for experimental purposes, as we will explain later in the paper. In the next subsections, we will focus on some of the main variables (self-efficacy, stress and anxiety) as guiding examples and will give a more detailed justification for their inclusion in the model. These examples will also be used to explain the modeling and personalization aspects.

3.1 Psychological Data Measurement: Self-Efficacy and Anxiety Individuals' belief in their capabilities to perform a certain task successfully has been defined by Bandura [8] as Self-Efficacy. This belief is likely to differ depending on the activity to which it is related. Accordingly, evidence suggests that self-efficacy is not stable over time, but may vary throughout a training or a learning episode [10]. For example, in the educational context it has been found that self-efficacy may predict which problems and sub-problems a student will select to solve, how long s/he will persist and how much effort s/he will expend. As regards leaders, their level of self-efficacy may be a relevant factor in predicting their performance in crisis situations [11]. Information assessment and decision making are two core behaviors that are critical for effective crisis leadership [2]. Nevertheless, the ambiguity, urgency and high risk of a crisis situation pose some constraints on the abilities of leaders. For example, given the severe time pressure on their behavior,
Table 1. List of variables used to build the trainee model

STATIC variables (they do not change during the training episode):
Personality Traits
According to [4], personality refers to "an individual's characteristic patterns of thought, emotion, and behavior, together with the psychological mechanisms, hidden or not, behind those patterns." To assess this variable we will use the Revised NEO Personality Inventory, or NEO-FFI (Five Factor Inventory), a psychological personality inventory measuring five main personality factors: neuroticism, extraversion, openness to experience, agreeableness and conscientiousness [5]. The NEO-FFI has been widely used to correlate behavior and internal personality traits, and each of the "big five" has been associated with specific behaviors and/or has significantly predicted various job performances, as well as the ability to cope with stress. Within PANDORA we are setting up an experimental apparatus in order to investigate the relationship between the personality traits and the performance of crisis decision makers.
Leadership Style
In [6] leadership is defined as a "process whereby an individual influences a group of individuals to achieve a common goal". In [7] a distinction is made between socio-emotional and task-oriented leadership. The "socio-emotional leader" takes into account the feelings and moods of individuals and pays attention to the emotional aspects of interpersonal relationships. A leader focused on the task has as his constant concern the attainment by the group of its purposes. In order for a group to be "successful", both a socio-emotional leadership style and a "task-centered leadership style" are necessary. The leadership style obviously cannot be changed with short-term training, but the training can be personalized in order to let trainees understand the consequences of their leadership style and to encourage them to learn accordingly.
Background experience

Another relevant variable for a crisis leader is obviously his/her past experience in managing crisis situations. We refer to this as background experience. A very short questionnaire will assess leaders' socio-demographic information, their previous experiences with leading public health and safety crises and their level of success in doing so.

DYNAMIC variables (they might change during the training session):
Self-efficacy
Individuals' belief in their capabilities to perform a certain task successfully has been defined by [8] as Self-Efficacy. It has been shown that this variable influences different aspects such as managing stressful situations, increasing performance, and benefiting from training programs.
Stress and Anxiety
In [9] stress is defined as "a process by which certain work demands evoke an appraisal process in which perceived demands exceed resources and result in undesirable physiological, emotional, cognitive and social changes". This definition is considered particularly relevant, since in an emergency situation a key factor is that demand often exceeds resources, both in the management of an emergency and in the response options. As for anxiety, we distinguish between State Anxiety, which reflects a "transitory emotional state or condition of the human organism that is characterized by subjective, consciously perceived feelings of tension and apprehension, and heightened autonomic nervous system activity", and Trait Anxiety, which denotes "relatively stable individual differences in anxiety proneness and refers to a general tendency to respond with anxiety to perceived threats in the environment". These two variables will be assessed both before and during the training with psychological and physiological measures, and will be at the basis of the experiment to study the reactions of different people under stress.
they have less time to acquire and process information effectively. As a consequence, they are required to assess information and make decisions under tremendous psychological and physical demands [12]. However, if leaders believe that they can successfully exert leadership by setting a direction for the work group, gaining the commitment of the group to change goals and working to overcome obstacles, they will more likely resolve a crisis situation [13]. On the contrary, if leaders have doubts about their ability to respond aggressively to crisis situations, they will be burdened by the stress and anxiety associated with the ambiguity, urgency and high risk of a crisis [11]. Previous studies [8] indicate that individuals who believe they are not capable of handling difficult situations are more likely to experience stress and anxiety than those who believe they are able to do so. Accordingly, we expect that the lower the level of self-efficacy, the higher the stress and anxiety experienced by leaders. In addition,
self-efficacy has the advantage of being open to influence (e.g., training) rather than being a trait-like quality that remains fixed [14]. As a consequence, self-efficacy is especially useful not only for assessing the quality of leadership in crisis, but also for improving it. In addition, if it is true that there is a negative relationship between self-efficacy and anxiety, then by improving leaders' self-efficacy the training is also expected to reduce their level of anxiety. Evidence suggests that under stress people tend to rely on their dominant responses in similar circumstances [15]. On this basis it is expected that leaders who have found themselves leading in one or more crisis situations, including practicing crisis response protocols (e.g., developing plans), and who are efficacious as leaders in general, will show more efficacy in performing leadership tasks in a crisis. All these considerations suggest that a valid goal for the PANDORA-BOX is to personalize training so as to increase and promote trainees' level of self-efficacy.

Scale to measure self-efficacy and anxiety. The Crisis Leader Efficacy in Assessing and Deciding (C-LEAD) Scale [1] is used to measure self-efficacy within the PANDORA-BOX. The scale consists of 9 items; leaders will be classified according to 2 levels of self-efficacy (low: −1 SD; high: +1 SD), based on their responses on a scale ranging from 1 (Strongly Disagree) to 7 (Strongly Agree). Anxiety is assessed using Spielberger's State-Trait Anxiety Inventory (STAI), which consists of 40 items: 20 designed to assess state anxiety (S-anxiety scale) and the other 20 aimed at evaluating trait anxiety (T-anxiety scale). The first assesses how respondents feel "right now, at this moment"; the second evaluates how people "generally feel". Leaders' anxiety will be compared with a reference sample before the start of the training in order to establish a baseline level. Subsequently this measure will serve as their own control for the rest of the training. Anxiety level will be assessed before and after the training in order to verify the effectiveness of the training and any variations in the relationship between self-efficacy level and anxiety.

3.2 Physiological Data Acquisition: Stress and Anxiety In order to enrich the trainee model within PANDORA, as well as to adopt an orthogonal approach for the assessment of trainee variables, we also use neurophysiological measures. Specifically, stress and anxiety will also be assessed through these measurements. A Holter monitor is part of the PANDORA system and will be responsible for recording parasympathetic activity (i.e., Heart Rate, HR, and Heart Rate Variability, HRV). The autonomic nervous system fires neurons through the sympathetic and parasympathetic nervous systems. The sympathetic nervous system (SNS) activates the cardio-accelerating center of the heart, increases the ventilatory rhythm and contracts the arteries of non-striated muscles. The parasympathetic nervous system (PNS) innervates the cardio-moderator center of the heart, slows the ventilatory rhythm and dilates the arteries of non-striated muscles. These systems function in opposition to dynamically modulate vital functions. Heart Rate Variability is a measure of the continuous antagonism between sympathetic and parasympathetic effects on HR, also yielding information about the capability to regulate emotional responding. Sympathetic signals increase HR and parasympathetic activity lowers HR, causing shorter and longer interbeat intervals respectively. SNS activity, mediated by norepinephrine, operates slowly on HR through the stellate ganglion, while PNS activity, mediated by acetylcholine, primarily controls HR in a faster way through the
vagus nerve. HRV analysis results from the monitoring of a set of temporally ordered interbeat intervals, typically obtained by electrocardiography (ECG) recording and commercially available software. The autonomic nervous system allows the body to maintain its balance when experiencing stress. Stressful circumstances are always accompanied by activation of the parasympathetic nervous system, which is critical in the regulation of emotions. Recent studies show abnormal HRV in depression and anxiety disorders compared with non-psychiatric controls, indicating in such patients a decreased vagal tone and elevated sympathetic activity, and reflecting a deficit in the flexibility of emotional physiological mechanisms. Furthermore, psychological states have a large impact on the autonomic control of the heart, with standard psychological stressors causing reciprocal sympathetic activation and parasympathetic withdrawal. However, responses to psychological stress differ widely across the general population, indicating individual response uniqueness. In this frame, assessments of users' psychological characteristics and measurements of physiological parameters during the training sessions and crisis situations might allow us to cross-correlate personality traits and SNS/PNS responses with cognitive and behavioral responses. It is worth highlighting that the physiological measures will be used both during the training, to dynamically update the trainee model, and after the training, to correlate personality traits with physiological responses, as well as to compare these responses with the trainees' self-assessment.
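For concreteness, standard time-domain HRV indices can be derived from the recorded interbeat (RR) intervals as sketched below. This is a generic illustration (SDNN and RMSSD over RR intervals in milliseconds) with invented sample values, not the acquisition and analysis pipeline actually used in PANDORA.

  import numpy as np

  def hrv_indices(rr_ms):
      # rr_ms: consecutive interbeat (RR) intervals in milliseconds, e.g. from a Holter/ECG recording.
      rr = np.asarray(rr_ms, dtype=float)
      mean_hr = 60000.0 / rr.mean()                 # mean heart rate in beats per minute
      sdnn = rr.std(ddof=1)                         # overall variability (ms)
      rmssd = np.sqrt(np.mean(np.diff(rr) ** 2))    # short-term, vagally mediated variability (ms)
      return mean_hr, sdnn, rmssd

  print(hrv_indices([800, 810, 790, 805, 795, 820, 780]))   # slightly irregular ~75 bpm rhythm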
4 A Timeline-Based Representation of the Trainee Model In PANDORA we use a timeline-based approach for both the behavioral and crisis planners [16]. In general terms, a timeline can be seen as a stepwise constant function of time or, more specifically, as an ordered sequence of values holding on subsequent temporal intervals. Different kinds of values can be used with timelines but, for the current purposes, we will consider only predicate values. A predicate is a statement P(x, y, . . .) that depends on one or more arguments x, y, . . ., each having its own domain. Each value is considered valid over a time interval identified by a start and an end timepoint. A set of timelines is used to describe the temporal evolution of the represented world, usually called the domain. To describe consistent temporal behaviors of a domain, the timeline-based approach makes use of the concept of compatibility (called a causal pattern in PANDORA). A compatibility specifies requirements among sets of values, either on the same timeline or on different timelines. In particular, it involves (a) a reference value and one or more target values, which are other values on the same or on other timelines, and (b) some relations, which are constraints limiting the domain of the reference value or target value arguments (e.g., a relation can specify that reference and target values must be separated by a given temporal distance). We call a value v admissible if there exists a compatibility having v as reference value and all temporal requirements are satisfied. A value v can also be made admissible through unification with an already admissible value v′ such that v = v′. Finally, a timeline-based domain is called admissible if all values of the domain are admissible and if, for each timeline, no different values overlap in time.

Modeling Trainees with Timelines. In the current prototype of the PANDORA-BOX, timelines are used to maintain information about trainees' features. In particular, the various variables have been modeled through different value types:
– personality traits are modeled through predicates of the form personality_trait(x), where x is an integer ranging from 0 to 10;
– leadership style through predicates of the form leadership_style(x), with x being a boolean variable assuming value 0 for a socio-emotional and 1 for a task-oriented leadership style;
– background experience through predicates of the form background_experience(x), where x is an integer assuming value 0 for low experience, 1 for medium experience and 2 for high experience;
– self-efficacy through predicates of the form self_efficacy(x), with x being an integer ranging from 0 to 10;
– stress and anxiety through predicates of the form stress(x) and anxiety(x) respectively, where x is an integer ranging from 0 to 100.

Additionally, a timeline called profile, with predicates of the form profile(x) with x being an integer ranging from 0 to 100, models an aggregate classification of the current trainee status and is used both to define different aggregations of trainee variables that may be of interest (i.e., to create different trainee profiles that consider different variables) and, consequently, to trigger particular behavioral plans suitable for that particular profile. In particular, the key connection between the Behavioral and Crisis planners is represented by a timeline called induced stress that is used by the Crisis Planner to modulate lesson stimuli toward trainees – see the sketch in Fig. 2. Once the allowed values are defined, we define causal patterns for each value. Static variables do not change over time, so the compatibilities for their values are of the form

Pred(x) {
  this duration [H, H];
}
where Pred is the corresponding variable predicate (e.g., personality_trait, leadership_style, etc.), this refers to the reference value, duration is a constraint on the duration of the reference value, and H is the planning horizon. For each allowed value of each dynamic variable, a pattern of a different type is added:

Pred(x) {
  pro profile(y);
  this during pro;
}

where the label pro identifies the target value profile(y) inside the compatibility and this during pro is a constraint imposing that the reference value must appear during the pro target value. These causal patterns imply that, whenever a new value is added to the corresponding dynamic variable, a profile value is also added to the profile timeline with a free y argument. Based on specific personalization rules, profile value compatibilities allow both the definition of the current profile and the synthesis of suggestions for training customization.

Fig. 2. Causal reasoning in the PANDORA-BOX
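To make the timeline machinery concrete, the following sketch shows one possible encoding of timeline values and of the "during" relation used by the causal patterns above. The class layout and the half-open interval convention are assumptions of this illustration and do not reflect the actual data structures of the PANDORA planners.

  from dataclasses import dataclass

  @dataclass
  class Value:
      predicate: str      # e.g. 'self_efficacy' or 'profile'
      args: tuple         # predicate arguments, e.g. (7,)
      start: int          # value holds on the interval [start, end)
      end: int

  def during(v, target):
      # 'this during pro': the reference value must lie within the target value's interval.
      return target.start <= v.start and v.end <= target.end

  # Whenever a dynamic-variable value is added, it must be covered by some profile value.
  profile_timeline = [Value('profile', (1,), 0, 50), Value('profile', (2,), 50, 100)]
  se = Value('self_efficacy', (7,), 10, 30)
  assert any(during(se, p) for p in profile_timeline)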
4.1 Toward a Personalization of Training Programs In the current demonstrator we have operationalized the causal connection between user features and the planning of crisis stimuli (Fig. 2). An example of a personalization rule is given by the compatibility:

profile(x) {
  se self_efficacy(1);
  be background_experience(3);
  st stress(3);
  this.x = 1;
  this contains se;
  this contains be;
  this equals st;
}
We consider two variables from the trainee models, namely self-efficacy and background experience. In the example, se and be are labels identifying respectively the target values for self-efficacy and background experience, while st is the label for the level of induced stress to generate. Additionally, this.x = 1 forces the value of the profile to 1, this contains se and this contains be are constraints imposing that the reference value must contain the se and be target values, and this equals st is a constraint imposing that the reference value must appear at the same time as the st target value. Hence, in the figure the two variables are used to create four profiles (P1, P2, P3 and P4) by combining all the possible values of the two considered variables (assuming 0, 1 as possible values). Once the profiles are defined, personalization rules specify strategies to personalize training through causal patterns. Specifically, the compatibility associates the values of the two behavioral variables (self-efficacy and background experience) with a profile, and then a profile with an induced stress value. In this example the personalization is inspired by the idea of fostering trainees' self-efficacy by presenting them with crisis events that can be managed successfully (mastery experience), which are neither too difficult nor too easy. The indication on the level of induced stress produced by the Behavioral Planner is given as input to the Crisis Planner, which adjusts the current plan (crisis events presented to trainees) according to the given induced stress value. It is worth highlighting how this approach makes it possible to create different trainee profiles by combining different behavioral variables, and consequently alternative personalization strategies.
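A minimal sketch of this mapping is given below, assuming the 0/1 encoding of self-efficacy and background experience used in Fig. 2; the profile identifiers P1-P4 follow the figure, while the induced-stress levels are purely illustrative placeholders.

  # Hypothetical profile/induced-stress tables; only the P1-P4 labels come from Fig. 2.
  PROFILE_OF = {(0, 0): 'P1', (0, 1): 'P2', (1, 0): 'P3', (1, 1): 'P4'}
  INDUCED_STRESS = {'P1': 1,   # low self-efficacy, low experience: easily masterable events
                    'P2': 2,
                    'P3': 2,
                    'P4': 3}   # high self-efficacy, high experience: more demanding events

  def induced_stress(self_efficacy, background_experience):
      profile = PROFILE_OF[(self_efficacy, background_experience)]
      return profile, INDUCED_STRESS[profile]

  print(induced_stress(0, 1))   # -> ('P2', 2)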
5 Conclusions We have shown how a number of cognitive features have been included in the PANDORA-BOX design to support the analysis of user involvement as well as the level of training personalization. We have also described the complete cycle we have realized for modulating stimuli while dynamically monitoring selected psychological states of trainees. This part of the work is included in our environment and is currently being made more robust. It is worth observing how cognitive reasoning has been made operational in a relevant application field. In PANDORA, psychological features dynamically extracted from both trainees' self-assessment and physiological analysis are used to enrich the training personalization. The robust version of the system will enable future work devoted to a specific evaluation effort aimed at assessing the validity of the proposed approach with end users. Additionally, we aim to enlarge the use of personalization strategies to prove the generality of the approach.
Acknowledgements. The PANDORA project is supported by EU FP7 (GA.225387) under the joint call ICT/Security and is monitored by REA (Research Executive Agency). The authors are particularly indebted to Keith Strickland from the Emergency Planning College (York, UK) for providing his invaluable expertise to the project, and to all the partners for the nice work environment. The PANDORA web site is: http://www.pandoraproject.eu/.
References 1. Adler, C., Pittinsky, T., Sommer, S., Zhu, W.: Measuring the efficacy of leaders to assess information and make decisions in a crisis: The C-LEAD Scale. Technical report, HKS faculty research Working Paper Series (2009) 2. Wooten, L.P., James, E.H.: Linking crisis management and leadership competencies: The role of human resource development. Advances in Developing Human Resources 10(3), 352–379 (2008) 3. Schoemberg, A.: Do crisis plans matter: A new perspective on leading during a crisis. Public Relations Quarterly 22, 640–657 (2005) 4. Funder, D.C.: The personality puzzle, 4th edn. Norton, NewYork (2007) 5. Costa, P.T., McCrae, R.R.: The NEO Personality Inventory (revised) manual. Odessa, FL: Psychological Assessment Resources (1992) 6. Northouse, G.: Leadership theory and practice, 4th edn. Sage publication, Thousand Oaks (2007) 7. Bales, R.F.: Task roles and social roles in problem solving groups. In: Macoby, E.E., Newcomb, T.M. (eds.) Readings in Social Psychology, Holt, Rinehart and Winston, New York (1958) 8. Bandura, A.: Social foundations of thought and actions. A social cognitive theory. Prentice Hall, Englewood Cliffs (1986) 9. Salas, E., Driskell, J., Hughs, S.: The study of stress and human performance. In: Driskell, J., Salas, E. (eds.) Stress and Human Performance. Lawrence Erlbaum Associates, New Jersey (1996) 10. McQuiggan, S., Mott, B., Lester, J.C.: Modeling Self-Efficacy in Intelligent Tutoring Systems: An Inductive Approach. User Modeling and User-Adapted Interaction 18, 81–123 (2008) 11. Pittinsky, T., Hadley, C., Sommer, S.: Measuring the Efficacy of Leaders to Assess Information and Make Decisions in a Crisis: The C-LEAD Scale. Technical report, Center for Public Leadership (2009) 12. Klann, G.: Crisis Leadership. Center for Creative Leadership, Greensboro, NC (2003) 13. Paglis, L., Green, S.: Leadership Self-Efficacy and Managers’ Motivations for Leading Change. Journal of Organizational Behavior 23, 215–235 (2002) 14. Stajkovic, A., Luthans, F.: Self-efficacy and work related performance: A meta-analysis. Psychological Bulletin 124, 240–261 (1998) 15. Dickerson, S.S., Kemeny, M.E.: Acute stressors and cortisol responses: A theoretical integration and synthesis of laboratory research. Psychological Bulletin 130, 355–391 (2004) 16. Muscettola, N.: HSTS: Integrating Planning and Scheduling. In: Zweben, M., Fox, M.S. (eds.) Intelligent Scheduling. Morgan Kaufmann, San Francisco (1994)
Personality Estimation Based on Weblog Text Classification Atsunori Minamikawa and Hiroyuki Yokoyama KDDI R&D Laboratories, Inc., Saitama, Japan {at-minamikawa,yokoyama}@kddilabs.jp
Abstract. This paper describes an investigation of personality estimation from Japanese weblog text. Among various personality models, we focus on the Egogram, which has been used in Transactional Analysis and is strongly related to the communicative behavior of individuals. Estimation is performed using the Multinomial Naïve Bayes classifier with feature words selected based on information gain. The validity of this approach was evaluated on real weblog texts of 551 subjects. The results showed that our approach achieved a 12-25% improvement over the baseline. The feature words selected for the estimation are strongly correlated with the characteristics of the Egogram. Keywords: Computational Linguistics, Personality Estimation, Egogram.
1 Introduction Recent advancements in Internet technology and mobile devices enable users to record and store their experiences, thoughts, emotions or other personal information electronically. Vast amounts of daily activity records, called Lifelogs, are generated and exchanged in various social network services through the Internet every day. User profiling using these Lifelog data has attracted attention as a key technology for achieving cognitive computing applications, such as intelligent user interfaces, personalized recommender systems and so forth. Among various kinds of Lifelog, the personal weblog is one of the most appropriate data sources for extracting a user's traits since it is widely used. In personal weblogs, many users record their daily events, their thoughts, and their emotions as text, subjectively, and release them through the Internet. Many studies in computational linguistics explore methods for the automatic extraction of various traits from weblogs, such as emotion [5, 7], mood [10], deception [16], and sentiment or opinion [2]. Although these works provide successful contributions to recognizing the context of the user, it is still difficult to estimate the user's situation since these traits provide only temporal characteristics of the user. This work focuses on personality as another trait for the same purposes, since it provides a longer-term aspect of individuals [8]. User personality extracted from text could be useful in many interactive situations, for instance as a cue to understanding the background of users' actions and to search for appropriate services that match their personality. This paper describes our investigation of the validity of personality estimation from Japanese personal weblog text. The personal weblog is also appropriate for the extraction of
long-term characteristics of the author, since it is the accumulated corpus of one's expression over a certain period of time. Many works have investigated the relationship between written texts and personality. Most of these works focus on the personality model called the Five Factor Model (FFM) or "Big Five" [3], which emerged from lexical research. Argamon et al. [1] especially focused on Extraversion and Neuroticism as the prediction targets of personality, employing various feature sets, function words, and systemic functional grammar analysis. Mairesse and Walker [8, 9] used features derived by a top-down approach using the Linguistic Inquiry and Word Count text analysis program [13]. Oberlander and Nowson [11, 12] investigated the personality prediction accuracy of the five factors with a data-driven approach. Gill et al. [6] investigated the relation between blog topics and the FFM. Our work focuses on another personality model called the Egogram, which was originally developed by Dusay [4]. The Egogram has been developed in the field of Transactional Analysis (TA), which deals with how people communicate with each other from a social and psychological perspective. Compared to the FFM, which represents task-related individual behaviors [9], the Egogram model focuses more on the communication aspect of personality. It provides finer-grained characteristics for communicative situations: Critical Parent (CP), Nurturing Parent (NP), Adult (A), Free Child (FC), and Adapted Child (AC). Applying the Egogram model is much more appropriate for interactive user interfaces, which require various kinds of communication, because a lot of research has been done in TA with the Egogram: for example, which Egogram patterns are a good match, which Egogram pattern is the most appropriate for leaders in cooperative work, which ego states cause conflict in certain communication scenarios and which do not, and so forth. These properties give it much advantage over the other personality models. Furthermore, the Egogram has already been widely used, especially in Japan, for the psychological analysis of cooperative work and for self-analysis in school teaching and even in training seminars for new recruits in companies. There is a great store of knowledge and experimental results which link the Egogram and cooperative work. These previous works give us a firm basis for transactional analysis, and our work will provide new inspiration for readers who are interested in interactive communication.
2 Egogram Model The Egogram represents the amount of psychological energy which a person distributes at a given time. Each of the five ego states indicates the role which is played in the interaction. In particular, CP and NP represent how one treats others, FC and AC represent how one presents oneself, and A indicates the objectivity in the communication and represents how one controls it. Table 1 shows the five ego states and their associated behaviors. The personality of a subject is explained by the relative intensities among these five ego states. For instance, a subject whose CP intensity is much higher than that of the other ego states tends to be authoritarian and to restrict the freedom of others, and is labeled as a "chauvinist" in [4].
In clinical use, the Egogram of a subject is extracted through counseling by a specialist or an answer sheet examination which has been developed by various organizations.

Table 1. Traits of the five ego states

Ego State               Features
CP (Critical Parent)    paternal, idealism, responsible, conservative, critical
NP (Nurturing Parent)   maternal, nurturing, pampering, sympathetic
A (Adult)               objective, rational, logical, cold
FC (Free Child)         creative, active, selfish, abandon
AC (Adapted Child)      adaptable, obedient, cooperative, passive
3 Personality Estimation Method In order to estimate the Egogram from weblogs, we use a text classification method based on machine learning techniques. The classification is performed based on the appearance of feature words that are derived from statistical analysis of supervised data. Figure 1 illustrates the outline of our method. Details of our method are described in the following subsections.
9GDNQI
…
'IQITCO
9GDNQI
(GCVWTG9QTFU 5VCVKUVKECN #PCN[UKU
/QTRJQNQIKECN #PCN[UKU
Estimation Process /QTRJQNQIKECN #PCN[UKU
(GCVWTG 'ZVTCEVKQP
%NCUUKHKECVKQP 0CïXG $C[GU
'IQITCO
Fig. 1. Outline of the personality estimation process. In the learning process, feature words are derived from the statistical analysis of supervised data. Classifiers are also modeled in this process.
3.1 Data Sets We gathered real personal weblog texts of 551 Japanese subjects, which were carefully selected from several major Japanese weblog sites to avoid bias on gender and
generation (278 males and 273 females). The subjects also answered a personality examination called TEG2 (Todai-shiki Egogram version 2) [15], which is one of the answer-sheet examinations for the Egogram. TEG2 consists of 50 items, each of which corresponds to one of the ego states, and is the most frequently used instrument in clinical investigations in Japan. From the answers to TEG2, the intensities of the five ego states (CP, NP, A, FC and AC) are finally scored on a 5-point scale.
Fig. 2. Classification of the ego state score.
3.2 Classification In the estimation, we separately modeled each of the five ego states and built a classifier for each state. Each classifier derives the score of its ego state, which is transformed into discrete bins as described in the evaluation scenarios below. Before classification, the weblog text data are separated into a bag-of-words using Sen [14], one of the Japanese morphological analysis tools. In the next step, the feature vector corresponding to each state is derived by counting the appearances of the feature words in the bag-of-words. These feature words were preliminarily selected by statistical analysis for each ego state. After the feature vectors were derived, each classification was performed using a Multinomial Naïve Bayes (MNB) classifier, which has often been used for text classification. These MNB classifiers are implemented using the WEKA library [17]. In the evaluation, 10-fold cross-validation was used: the original sample was randomly partitioned into 10 subsamples; a single subsample was retained as the validation data for testing the model, and the remaining 9 subsamples were used as training data; the validation process was then repeated 10 times, with each of the 10 subsamples used once as the validation data. Classification accuracy was derived from the average of the 10 results.
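As an illustration of this pipeline, the fragment below builds count-based feature vectors restricted to a fixed list of feature words and cross-validates a Multinomial Naïve Bayes classifier. It uses scikit-learn instead of the WEKA library employed in the paper, and the documents, labels and feature words are toy placeholders, not the actual data.

  from sklearn.feature_extraction.text import CountVectorizer
  from sklearn.model_selection import cross_val_score
  from sklearn.naive_bayes import MultinomialNB

  # Space-separated token strings (the output of a morphological analyser such as Sen)
  # and binary labels for one ego state (e.g. NP above/below the average score).
  docs = ['totemo tanoshii orz', 'tabun sugoi', 'orz orz tabun',
          'sugoi tanoshii totemo', 'tabun orz', 'totemo sugoi tanoshii']
  labels = [0, 1, 0, 1, 0, 1]
  feature_words = ['totemo', 'tabun', 'orz', 'sugoi', 'tanoshii']   # pre-selected by information gain

  vec = CountVectorizer(vocabulary=feature_words)    # count only the selected feature words
  X = vec.fit_transform(docs)

  scores = cross_val_score(MultinomialNB(), X, labels, cv=3, scoring='f1')   # 10-fold in the paper
  print(scores.mean())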
3.3 Feature Selection Many potential features can be used for text classification. In this work we use word-based features. The effective feature words are selected based on the information gain, which is calculated by the following equations:

InfoGain(w) = H(C) − H(C | W)                         (1)

H(C) = − Σ_{c∈C} Pr(c) log Pr(c)                      (2)

H(C | W) = Σ_{wd∈W} Pr(wd) H(C | wd)                  (3)
Here w is the word under consideration, W denotes whether w appears in the text or not, C is the class corresponding to the score bins of each ego state, Pr(c) is the probability of bin c in the supervised data, and Pr(wd) is the probability of the appearance event wd of the word w. In other text classification tasks, such as gender, age and residential area, nouns are extracted as effective feature words. However, nouns are not appropriate in this work, because 1) the usage of nouns depends on the topic and is not directly related to one's personality, and 2) it is difficult to prepare an adequate corpus which covers the vast number of nouns. Especially for the second reason, nouns as classification feature words may cause overfitting in personality estimation. Therefore we did not include nouns in the feature words. Instead, we took into account the appearance of adjectives, adverbs, conjunctions, exclamations, and some Internet slang terms and emoticons. In the classifications, we set an upper limit N on the number of feature words, and the words whose information gains are in the top N are selected as the feature words for classification.
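A direct transcription of Equations (1)-(3) for a single candidate word, together with the top-N selection, might look as follows; the token-list and label containers are illustrative assumptions, and the base-2 logarithm is an arbitrary (ranking-preserving) choice.

  import math
  from collections import Counter

  def entropy(labels):
      n = len(labels)
      if n == 0:
          return 0.0
      return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

  def info_gain(docs_tokens, labels, word):
      # InfoGain(w) = H(C) - H(C|W), with W the presence/absence of `word` in a document.
      present = [l for toks, l in zip(docs_tokens, labels) if word in toks]
      absent = [l for toks, l in zip(docs_tokens, labels) if word not in toks]
      n = len(labels)
      h_cond = (len(present) / n) * entropy(present) + (len(absent) / n) * entropy(absent)
      return entropy(labels) - h_cond

  def select_features(docs_tokens, labels, candidates, N):
      # Keep the N candidate words with the largest information gain.
      return sorted(candidates, key=lambda w: info_gain(docs_tokens, labels, w), reverse=True)[:N]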
4 Evaluation In this section, we evaluate our method in several estimation scenarios. Before the evaluations, we investigate the word-class dependency of the Egogram estimation. The F-measure of each estimation is calculated by the following equation:

F = 2PR / (P + R)                                     (4)

where P and R denote the precision and the recall of the estimation, respectively.

4.1 Word Class Dependency In this subsection, we investigate the effect of each word class on the Egogram estimation. The classifier classifies the intensity of each ego state into 2 classes: whether the TEG2 score of the subject is above the average score or not.
Each classification used the feature words selected from only one word class. The number of feature words used in each classification was set at 50. As Table 2 illustrates, adjectives, adverbs and emoticons are effective for the estimation of every ego state. Especially for the NP, A, and AC estimations, the improvements from the baseline are relatively large.

Table 2. Improvements of the F-measures of the estimation using each word class compared to the baseline (%). The F-measures of the baseline were the percentage of the majority class.

               CP     NP     A      FC     AC
Adjective      1.8    9.7    7.3    2.3    4.6
Adverb         2.8   11.1    6.9    4.3    8.4
Conjunction    0.4   -0.6    4.2   -0.2   -1.0
Interjection  -1.8    1.9    2.0    1.1    1.4
Emoticon       5.2   11.1   10.7    1.1    7.2
Symbols        0.0    0.9    3.8    0.0    1.4
Pronoun        0.6    3.3    2.8    0.3    2.6
4.2 Estimation Evaluation In this subsection, we evaluate the classification accuracy according to the following three evaluation scenarios:

• (E1) Classify into 2 classes: whether the TEG2 score of each ego state of the subject is above the average score or not.
• (E2) Classify into 3 classes: whether the TEG2 score is high, middle or low. Subjects whose score is 5 or 4 are classified as high; subjects whose score is 3 are classified as middle; the remaining subjects are classified as low.
• (E3) Classify into 5 classes, based on the original TEG2 score.

Classifiers have been built for each evaluation scenario. Table 3 shows examples of the selected feature words, translated into English, for each ego state. It should be noted that emoticons are effective features for the classification of four of the ego states. In particular, the emoticon "orz" is effective for the CP, FC and AC classifications. This emoticon is familiar Japanese net slang representing a man pounding his head on the floor: the o is the head, the r is the arms and the z is the legs. It is used to symbolize the emotion of frustration.

Table 4 shows the results of each ego state classification for the three evaluation scenarios. The most accurate performance over the numbers of feature words is highlighted in bold. The binary classifications of scenario (E1) achieve roughly 70-85% accuracy when the number of feature words is over 150. Although the FC classification shows the highest accuracy among the five, this is due to its high baseline score. The improvement for NP from the baseline is the largest among the five ego states. For AC, the classification using 50 feature words achieves the highest performance. This indicates the possibility of overfitting when the number of feature words is over 100.
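The three label binnings (E1)-(E3) defined at the beginning of this subsection can be written down compactly as follows; the way the per-ego-state average is supplied for (E1) is an assumption of this sketch.

  def bin_score(teg2_score, scenario, average=None):
      # Map a TEG2 ego-state score (1-5) to the class label used in each evaluation scenario.
      if scenario == 'E1':                 # above the per-ego-state average or not
          return int(teg2_score > average)
      if scenario == 'E2':                 # high (5, 4) / middle (3) / low (2, 1)
          return {5: 'high', 4: 'high', 3: 'middle', 2: 'low', 1: 'low'}[teg2_score]
      if scenario == 'E3':                 # the raw five-point score
          return teg2_score
      raise ValueError(scenario)

  print(bin_score(4, 'E2'))   # -> 'high'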
In the multiple classifications of scenarios (E2) and (E3), the classification accuracy decreases as the classification grain becomes finer. The MNB classifier achieves a 12-20% improvement for each ego state.

Table 3. Examples of feature words selected in the experiment (translated into English)
(For each ego state, CP, NP, A, FC and AC, the table lists the selected feature words in two columns, Positive and Negative; recoverable entries include "immediately", "great", "smart", "handily", "point by point", "probably", "kind of", "joyful", "philosophic", "hey", "Oh", "!", and emoticons such as "orz", "・∀・", "ω・" and "≧▽≦".)
Table 4. F-measures of the estimation in each evaluation scenario. Baseline is the percentage of the majority class.

(E1)                   CP     NP     A      FC     AC
Baseline               63.8   56.9   53.3   75.0   63.8
MNB, 50 features       69.8   72.8   63.8   81.9   75.8
MNB, 100 features      74.4   77.9   66.4   83.5   75.3
MNB, 150 features      77.1   79.5   67.8   85.5   75.0
MNB, 200 features      75.4   81.1   69.6   85.7   74.8
(E2)                   CP     NP     A      FC     AC
Baseline               39.8   43.1   46.7   47.7   42.2
MNB, 50 features       55.3   59.1   61.6   59.4   58.1
MNB, 100 features      59.6   64.4   64.4   62.2   62.0
MNB, 150 features      59.8   67.0   68.4   63.4   64.4
MNB, 200 features      64.0   68.0   69.8   65.4   64.2
(E3)                   CP     NP     A      FC     AC
Baseline               39.8   36.2   41.2   47.7   42.2
MNB, 50 features       47.5   48.9   50.7   55.3   51.1
MNB, 100 features      51.9   54.1   52.5   57.3   56.3
MNB, 150 features      54.3   53.5   52.5   61.4   58.1
MNB, 200 features      55.3   55.9   54.5   62.0   58.9
5 Discussion In the previous section, we evaluated the personality estimation method, which considered the usage of adjectives, adverbs, conjunctions, exclamations and Internet slang in blog text. In the evaluation, subjects with higher CP or A scores tend to use more formal expressions, while subjects with relatively higher AC scores use more informal or colloquial expressions, such as net slang and emoticons. Similarly, subjects who tend to use exclamations have high AC scores, while their CP and A scores are low. These results indicate that the appearance of emotional expressions in blog text is strongly related to the CP, A, and AC characteristics of the author. It was also seen in the experiment that conjunctions are used relatively more by subjects whose CP and A scores are high. On the other hand, high-NP subjects tend to use admirative words towards others. These findings conform to the characteristics of each ego state illustrated in Table 1. In all evaluation scenarios, our approach achieved higher performance than the baseline. This means that Egogram estimation from weblog text using computational linguistic analysis is a convincing method. Here we discuss the effectiveness and limitations of our method from the aspect of its application. The advantage of our method is that it can extract beforehand the personality of the user, which affects communicative behavior. Since personality is a latent characteristic, it is difficult for an instantaneous sensing system to detect it with short-term observation. By detecting the user's personality, our method enables the system to predict the user's reaction in each interactive session and to adapt its interface behavior to the user's preferences. It is also an advantage of our method that the Egogram is highly compatible with interactive user interfaces, since it models various communicative situations between people. On the other hand, the limitation of our method is that it cannot detect the user's cognitive changes during interaction. Therefore, it should be combined with another cognitive system, such as that of Fujita et al. [18].
6 Conclusion This paper describes a personality estimation method based on weblog text classification techniques, aimed at interactive user interfaces in cognitive computing. The Egogram model is appropriate for this purpose since it deals with the communication aspects of the users. The estimation was performed by a Multinomial Naïve Bayes classifier with feature words selected based on their information gain. The method was evaluated using the real weblog texts of 551 subjects. The results show that our approach achieved 12-25% improvements over the baseline. In particular, for the classification of the Nurturing Parent (NP) aspect, our approach achieved the highest results among the five ego states. The feature words selected for the classifications are convincing with respect to the characteristics of each ego state. Therefore, our estimation method is successful at this stage. The Egogram estimation method could be useful for the interactive user interfaces of cognitive applications. For example, when starting an interactive session,
the system can grasp the characteristics of its users in advance and can predict their reactions at each interactive stage. In future work, other features, such as word semantics, will be investigated in order to improve the estimation accuracy. It is also important to apply our method to a larger corpus for validation.
References 1. Argamon, S., Dhawle, S., Koppel, M., Pennebaker, J.W.: Lexical Predictors of Personality Type. In: Proc. the 2005 Joint Annual Meeting of the Interface and the Classification Society of North America (2005) 2. Breck, E., Choi, Y., Cardie, C.: Identifying Expressions of Opinion in Context. In: Proc. IJACAI 2007 (2007) 3. Costa, P.T., McCrae, R.R., Revised, N.E.O.: Personality Inventory (NEO-PI-R) and NEO Five-Factor Inventory (NEO-FFI): Professional Manual. Psychological Assessment Resources, Odessa 4. Dusey, J.M.: Egograms and the “Constancy Hypothesis”. Transactional Analysis Journal 2, 37–42 5. Gill, A.J., Gergle, D., French, R.M., Oberlander, J.: Emotion Rating from Short Blog Texts. In: Proc. CHI 2008, pp. 1121–1124 (2008) 6. Gill, A.J., Nowson, S., Oberlander, J.: What Are They Blogging About? Personality, Topic and Motivation in Blogs. In: Proc. ICWSM 2009 (2009) 7. Hancock, J.T., Landrigan, C., Silver, C.: Expressing Emotion in Text-Based Communication. In: Proc. CHI 2007, pp. 929–932 (2007) 8. Mairesse, F., Walker, M.: Using Linguistic Cues for the Automatic Recognition of Personality in Conversation and Text. Journal of Artificial Intelligence Research 30, 457–500 (2007) 9. Mairesse, F., Walker, M.: Words Mark the Nerds: Computational Models of Personality Recognition through Language. In: Proc. CogSci 2006, pp. 543–548 (2006) 10. Mishne, G.: Experiments with Mood Classification in Blog Posts. In: Style Workshop, SIGIR 2005 (2005) 11. Nowson, S., Oberlander, J.: Identifying More Blogger: Towards Large Scale Personality Classification of Personal Weblogs. In: Proc. ICWSM 2007 (2007) 12. Oberlander, J., Nowson, S.: Whose Thumb is it Anyway? Classifying Author Personality from Weblog Text. In: Proc. ACL 2006, pp. 627–634 (2006) 13. Pennebaker, J., Francis, M.: Linguistic Inquiry and Word Count 2001. Lawrence Erlbaum Associates, Mahwah (2001) 14. Sen (in Japanese), http://ultimania.org/sen/ 15. TEG Research Group in the University Tokyo Medical school, TEG: Tokyo University Egogram (New Ver.), Kaneko Shobou (2000) 16. Toma, C.J., Hancock, J.T.: Reading between the Lines: Linguistic Cues to Deception in Online Dating Profiles. In: Proc. CSCW 2008 (2008) 17. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005) 18. Fujita, H., Hakura, J., Kurematu, M.: Intelligent Human Interface Based on Mental Cloning-Based Software. Journal of Knowledge-Based Systems 22, 216–234 (2009)
Design of an Optimal Automation System: Finding a Balance between a Human’s Task Engagement and Exhaustion Michel Klein and Rianne van Lambalgen Vrije Universiteit Amsterdam, De Boelelaan 1081a, 1081HV Amsterdam {michel.klein,rm.van.lambalgen}@few.vu.nl
Abstract. In demanding tasks, human performance can seriously degrade as a consequence of increased workload and limited resources. In such tasks it is very important to maintain an optimal performance quality, therefore automation assistance is required. On the other hand, automation can also impose additional demands on a person monitoring the system and it will likely reduce the person’s awareness of the situation. Therefore, automation should be adapted to the human’s state in order to keep the human in the loop. In this paper an agent model is proposed that calculates a person’s workload and engagement to an automated task. Based on this agent model, an intelligent support system can provide different types of automation and aims at adapting to the human state such that an optimal balance is found between exhaustion and task engagement. Keywords: Agents, Cognitive Modeling, Adaptive Automation.
1 Introduction Human task performance can degrade over time when demanding tasks are performed. Such degradation can for instance be caused by available resources being exceeded [1]. However, high effectiveness and efficiency levels are of particular importance for critical tasks. In such cases, automated assistance to support humans in effective and efficient task execution is often required. However, while humans take advantage of such systems in that they reduce their workload, these systems can also have negative effects on human functioning. Automation induced complacency causes a person to rely on an automated system, even when that system is not always flawless [2]. In addition, automation reduces a person's picture of the environment, often referred to as situation awareness [3]. A reduction in situation awareness may lead to human error, especially when automation does fail. Parasuraman defines levels and stages of automation that all have a different effect on a person's situation awareness and human functioning [4]. To overcome disadvantages of automation, it is important to adapt automation to the state of the human [5]. Existing systems often adapt their level of automation throughout a specific task either to the person's workload [6] or situation awareness [7]. Also, automation can be adapted by analysis of the human's state through psychophysiology ([8], [9], [10]). For an extensive review on adaptive automation, see [11]. The system
that is proposed in this paper also aims at providing adaptive automation, but as opposed to previous systems it integrates both workload and situation awareness (referred to as task engagement) in the model of a human's state. This way, advantages of automation (in reducing workload) as well as disadvantages of automation (in reducing awareness) are taken into account. The design of the system is based on the agent paradigm: the human and its behavior and states are explicitly described in a formal way, as a separate entity, as is the interaction between the human and its environment. This agent model is then used to decide on the effect of specific actions on the human. Within the agent model theories of automation ([4], [5], [6]) and operator functional state ([12], [13]) are used to calculate the effects of a specific level of automation on both a person's engagement to the task and the human's functional state (i.e. exhaustion and performance). This allows the support system to choose the optimal support level at each point in time and to find the balance between a reduction in exhaustion on the one hand and maintenance of a person's engagement on the other hand. In this paper, first a theoretical background on human automation and human workload is given in Section 2. Section 3 describes the agent model and in Section 4 simulation results are presented for different types of automation. Section 5 explains how support can be given using the agent model, which is illustrated by an example. Finally, a discussion of the work is given in Section 6.
2 Theoretical Background Parasuraman ([4]) defines 4 possible types of automation that are applied to 4 stages of information processing: information acquisition, information analysis, decision and action selection and action implementation. Within those 4 types of automation, different levels can be distinguished, ranging from no automation to very high automation. Depending on the specific type and level of automation, each system has different consequences to a person’s workload, situation awareness and performance. Decision automation can decrease a human’s awareness, which impairs human performance in case the automation fails. Therefore, decision automation should be set at a low level, especially in demanding circumstances such as air traffic control. On the other hand, both automation of information acquisition and information analysis allow the human to stay involved with the task and thus are beneficial to a human performing a task. Other effects of the level of automation are that on a person’s workload. A high level of automation reduces the workload of the person performing the task. However, a high level of automation also induces monitoring demands, which in turn increases workload. Workload is influenced by the demands of the task, which determine the effort a person has to give to the task for an optimal performance. A high workload may lead to exhaustion, caused by a depletion of resources when the person generates effort to the task (see [13]). Such a mechanism is also common in sports and exercise, where each person has a critical point that determines a person’s basic level at which effort can be contributed without getting exhausted [14]. In [12] a model is presented that explores these relationships between demands, effort, exhaustion and critical point (also referred to as a human’s functional state). In addition to the type and level of automation, awareness of the operator can also be influenced by the level of self-agency. The concept of self-agency is found in
literature on consciousness and can be defined as ‘the feeling that one causes one’s own actions and their outcomes’ ([15], [16]). Normally, self-agency is present when a task is consciously performed. However, research points out that self-agency can also be evoked by priming of the outcome of an action just before the outcome itself emerges ([17]). This is especially useful when humans are supported by high level information. In this case, such priming leads to an increased awareness of the human to the task even though it is performed by the system.
3 Agent Model for Human's Task Engagement and Exhaustion To design a system that can intelligently decide on the optimal form of automation, it is necessary to create a dynamic model that describes the interaction between the system, the environment, and the human agent. The overall agent model is depicted in Figure 1 and is composed of three main components: the system, the world and the human (or operator). Within the system component, the possible actions of an intelligent support system in order to provide support are represented. The world contains information on the environment and output of both the support system and the operator. Furthermore, in the operator component concepts and relations on operator states are defined. In this section, first the three components are explained in more detail. Secondly, a computational specification of the model is provided, in which formulas are given for the calculation of the values for different concepts.
Fig. 1. Agent Model
3.1 Model Components System. The actions that the support system performs can be divided into two different sorts of actions: actions for automation and actions for agency. Actions for
automation define the different levels and stages of automation the support system can adapt to provide support, based on [3], [4]. All actions for automation affect one of the 4 stages of information processing: sensing, perception, decision making or action. Furthermore, within each stage, the support system can perform a low automation action or a high automation action. Each automation action affects the strength of self-action, the strength of automation, the active task demands and the monitoring demands in the world. In addition, the agency action can be performed by the support system in order to increase task engagement. This function is based on literature on consciousness which states that support can be given to increase a person's awareness of performing a specific task ([16], [17]). If task engagement is low given the automation actions of the system, agency action can be performed by the support system to overcome this decrease in task engagement. World. The world contains the characteristics of the environment as well as the output of the system and the operator. On the side of the operator, eye movements and task performance can be reflected in the world. On the side of the system, firstly active task demands and monitoring demands are distinguished, where the former represents the demands of performing the task that is left after automation and the latter represents the demands of monitoring the automated task. Both forms of demand influence the total effort generated by the operator at a specific point in time. Secondly, the strength of automation is determined by the automation action of the system. Finally, the system's automation action also influences the strength of self action, which indicates the extent to which operators feel themselves to be responsible for performing the task. Both strength of automation and strength of self action influence the person's task engagement. Operator. In the operator component, knowledge of concepts and relations is represented that is used by the support system to determine task engagement and exhaustion. The world component serves as input for the task engagement, the self agency and the total task effort generated by the operator. A simplified model of operator functional state based on ([12]) is used to determine the level of exhaustion with the total generated effort and a person's critical point (a person's basic cognitive abilities). Furthermore, the total task effort is split up into a monitoring effort level and active task effort level. Task engagement is determined by the strength of automation, strength of self-action and the monitoring effort. The outputs that can be measured based on the states of the operator are eye movements and task performance. 3.2 Formal Computational Model To determine the interaction of the human, the system and the world, a formal model has been developed that specifies the effect of specific types of support on the exhaustion and engagement of a human. In this formalization, all concepts have a value between 0 and 1. In order to distinguish between the different types of support, it is assumed that for each variant of automation it is defined what the effect is on:
• the strength of automation (SoA)
• the strength of self action (SSA)
• the active task demands (AT_D)
• the monitoring demands (M_D).
An example of such a specification is given in Table 1 in the next section, where a simulation based on the model is presented. The total strength of automation (SoA) and the total strength of self action (SSA) are determined by averaging the effect of all automation actions; to determine the total active task demands (AT_D) and the total monitoring task demands (M_D) the demands that result from the individual automation actions are summed up. Total task demands (TT_D) represent the sum of the active task demands and the monitoring demands.

TT_D = AT_D + M_D (1)

The demands of the task and the exhaustion of the operator together determine how much effort the operator generates in total. The parameter ex_speed determines the influence of exhaustion.

TT_E = TT_D * (1 - (Ex(t-1) * ex_speed)) (2)
The total effort is distributed between the active task and the monitoring task according to the priority that is given to either active task effort (AT_E) or monitoring effort (M_E) (note that prior_m should be 1 - prior_at and the other way around).

AT_E = 0                                          if TT_E = 0     (3)
     = AT_D - ((TT_D - TT_E) - M_D)               if (TT_D - TT_E) * (1 - prior_m) > M_D
     = AT_D - ((TT_D - TT_E) * (1 - prior_at))    otherwise

M_E = 0                                           if TT_E = 0     (4)
    = M_D - ((TT_D - TT_E) - AT_D)                if (TT_D - TT_E) * (1 - prior_at) > AT_D
    = M_D - ((TT_D - TT_E) * (1 - prior_m))       otherwise
The engagement to the task (TE) depends on the Strength of Automation, the Self Agency and the effort that is given to the task.

TE = w1*(1-SoA)*SSA + w2*1 + w3*(AT_E/AT_D)             if M_D = 0     (5)
   = w1*(1-SoA)*SSA + w2*(M_E/M_D) + w3*1               if AT_D = 0
   = w1*(1-SoA)*SSA + w2*1 + w3*1                       if AT_D = 0 and M_D = 0
   = w1*(1-SoA)*SSA + w2*(M_E/M_D) + w3*(AT_E/AT_D)     otherwise
The formula for exhaustion (Ex) is temporal as it takes into account the person's exhaustion at the previous point in time. Ex_parameter determines the influence of effort (related to the critical point) on exhaustion.

Ex(t+Δt) = Ex(t) + Ex(t) * Δt * (TT_E - CP)/CP * Ex_parameter              if TT_E < CP     (6)
         = Ex(t) + (1 - Ex(t)) * Δt * (TT_E - CP)/TT_E * Ex_parameter      if TT_E ≥ CP
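A minimal sketch of one update step of the model, directly following equations (1)-(6) as given above, could look as follows (plain Python; the function signature and variable names are choices made for this illustration, not part of the paper):

def step(at_d, m_d, ex, cp, dt, soa, ssa,
         ex_speed=0.2, ex_parameter=0.5,
         prior_at=0.8, prior_m=0.2,
         w1=0.7, w2=0.1, w3=0.2):
    """Return (TE, new_Ex, TT_E) for one time step of duration dt."""
    tt_d = at_d + m_d                                   # (1) total task demands
    tt_e = tt_d * (1 - ex * ex_speed)                   # (2) total generated effort

    diff = tt_d - tt_e                                  # unmet demands
    if tt_e == 0:                                       # (3) active task effort
        at_e = 0.0
    elif diff * (1 - prior_m) > m_d:
        at_e = at_d - (diff - m_d)
    else:
        at_e = at_d - diff * (1 - prior_at)

    if tt_e == 0:                                       # (4) monitoring effort
        m_e = 0.0
    elif diff * (1 - prior_at) > at_d:
        m_e = m_d - (diff - at_d)
    else:
        m_e = m_d - diff * (1 - prior_m)

    # (5) task engagement; each ratio defaults to 1 when the corresponding demand is 0
    at_ratio = at_e / at_d if at_d > 0 else 1.0
    m_ratio = m_e / m_d if m_d > 0 else 1.0
    te = w1 * (1 - soa) * ssa + w2 * m_ratio + w3 * at_ratio

    # (6) exhaustion update relative to the critical point cp
    if tt_e < cp:
        ex_new = ex + ex * dt * (tt_e - cp) / cp * ex_parameter
    else:
        ex_new = ex + (1 - ex) * dt * (tt_e - cp) / tt_e * ex_parameter

    return te, ex_new, tt_e

Repeatedly calling step with the SoA, SSA, AT_D and M_D values of a chosen combination of automation actions produces engagement and exhaustion traces such as those discussed in Section 4; the default parameter values above are those reported there.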
4 Simulations To illustrate the working of the model, a number of simulations have been performed in which the effect of different variants of automation on a person performing the
same fictitious task is compared. In the simulations, the effect of a specific type of automation is investigated on two variables, namely exhaustion and engagement, assuming that the system would give continuous support for a long period of time (in this case 100 time steps, where Δt=0.5). In order to execute the model, it is required to specify the effect of different types of automation on the task demands and the strength of self-action and strength of automation. These values are task specific. In the experiments, four different types of support are considered, i.e. support on sensing, support on perception, support on decision making and finally support on action. The latter two are highly dependent on each other, as action support always requires decision support. Each of the automation actions can be off, medium active, or high active. In Table 1, it is specified what the effect of each support action is on the factors described before. Here, the assumption is that the effect of automation on Active Task Demands and Monitoring Demands is independent of the specific type. On the other hand, the effect of the automation action on strength of automation is strongly dependent on the specific type of automation; sensing and perception automation is relatively more ‘lightweight' as compared to the decision making automation (e.g. [4]).

Table 1. Values that determine effects of automation

Automation Action        Strength of   Strength of   Active Task   Monitoring
                         Automation    Self Action   Demands       Demands
Sensing          Off     0             1             1             0
                 Medium  0.1           0.5           0.5           0.5
                 High    0.4           0             0             1
Perception       Off     0             1             1             0
                 Medium  0.4           0.8           0.5           0.5
                 High    0.7           0             0             1
Decision Making  Off     0             1             1             0
                 Medium  0.8           1             0.5           0.5
                 High    1.0           0             0             1
Action           Off     0             1             1             0
                 Medium  1.0           0             0.5           0.5
                 High    1.0           0             0             1
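The aggregation of the Table 1 entries described in Section 3.2 (strengths averaged over the automation actions, demands summed) can be sketched as follows; the dictionary encoding of Table 1 and the function name are choices made for this illustration:

TABLE1 = {
    ("sensing", "off"): (0.0, 1.0, 1.0, 0.0),
    ("sensing", "medium"): (0.1, 0.5, 0.5, 0.5),
    ("sensing", "high"): (0.4, 0.0, 0.0, 1.0),
    ("perception", "off"): (0.0, 1.0, 1.0, 0.0),
    ("perception", "medium"): (0.4, 0.8, 0.5, 0.5),
    ("perception", "high"): (0.7, 0.0, 0.0, 1.0),
    ("decision", "off"): (0.0, 1.0, 1.0, 0.0),
    ("decision", "medium"): (0.8, 1.0, 0.5, 0.5),
    ("decision", "high"): (1.0, 0.0, 0.0, 1.0),
    ("action", "off"): (0.0, 1.0, 1.0, 0.0),
    ("action", "medium"): (1.0, 0.0, 0.5, 0.5),
    ("action", "high"): (1.0, 0.0, 0.0, 1.0),
}

def aggregate(levels):
    """levels: dict mapping each automation type to 'off', 'medium' or 'high'."""
    rows = [TABLE1[(t, lvl)] for t, lvl in levels.items()]
    soa = sum(r[0] for r in rows) / len(rows)   # total strength of automation (average)
    ssa = sum(r[1] for r in rows) / len(rows)   # total strength of self action (average)
    at_d = sum(r[2] for r in rows)              # total active task demands (sum)
    m_d = sum(r[3] for r in rows)               # total monitoring demands (sum)
    return soa, ssa, at_d, m_d

# Example: medium sensing and medium perception automation, the other types off
print(aggregate({"sensing": "medium", "perception": "medium",
                 "decision": "off", "action": "off"}))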
The parameter values in this simulation were as follows: ex_speed=0.2; ex_parameter=0.5; prior_at=0.8; prior_m=0.2; w1=0.7; w2=0.1; w3=0.2. The operator's critical point was set to 4.4. The patterns of engagement and exhaustion over time are shown in Figure 2 a and b, for the case that there is no automation and the case of medium sense and medium perception automation, respectively. In both figures it can be seen that the exhaustion and engagement reach a stable situation after some time, however with different values for the variables of interest. Since each specific type of automation leads to some stable values for exhaustion and engagement, it is also possible to compare these end-situations directly. This shows
the eventual effect of a specific type of support directly. Therefore, 10 different combinations of support were compared by simulation, in a somewhat increasing level of complexity; the different combinations are listed in Table 2. Figure 3 shows the person's engagement and exhaustion for the different combinations of types and levels of automation listed in Table 2.
Fig. 2a, b. Exhaustion and Engagement over time in a system without automation (left) and a system with high sense and high perception automation (right)

Table 2. Simulated combinations of automation

Combination   Automation
1             No automation
2             Only medium Sensing automation
3             Only high Sensing automation
4             Only medium Perception automation
5             Only high Perception automation
6             Only medium Decision automation
7             High Decision automation and high Action automation
8             Medium Sensing and medium Perception automation
9             Medium Sensing, medium Perception and medium Decision automation
10            High Sensing, high Perception, high Decision and high Action automation
Fig. 3. Engagement and Exhaustion for 10 different combinations of automation
As can be seen in Figure 3, exhaustion is highest in the system without any automation and lowest in the fully automated system. In addition, engagement is also relatively low for combination 7, which has a high decision automation and high action automation.
5 Support Framework This section presents how the model introduced in Section 3 can be used within an intelligent support system to reason about the most appropriate type of support for a human, taking both the engagement and the exhaustion into account. To do so, the support system uses the information presented in Table 1, to obtain the knowledge about the effect of different types of support on the task demands, the strength of automation and the self agency. In addition, the system makes use of a preference value for the optimal balance between exhaustion and engagement. This preference value is task specific, as more critical tasks might require relatively higher engagement than enduring tasks, for which a lower exhaustion is important. Based on this knowledge and the model, the support system can calculate the reduction in exhaustion and engagement for the upcoming period for all different types of support, compared with the scenario in which no support is given, i.e. Exdiff_x(t) = Exno_support(t) – Exsupport_x(t)
(7)
TEdiff_x(t) = TEno_support(t) – TEsupport_x(t) for all different types of support_x, where TEsupport_x is the task engagement when a specific variant of support is given (similar for Exsupport_x). To decide on the optimal balance between exhaustion and engagement, both values can be combined into a single value using a formula with a weighting mechanism, e.g. Fx(t) = (w1 * TEdiff_x(t)) / (w2 * Exdiff_x(t))
(8)
In this function, a high task engagement and a low exhaustion both contribute positively to the combined value (w1 and w2 are weighting factors). Alternative formalizations are also possible, e.g. based on a weighted sum of the TE and (1 – Ex). The system will choose the most optimal type of support (i.e. with the highest value for the combination function) until the time point that it is predicted that a different type of support will result in a better balance. At that time point, a new simulation is started, in which it is predicted what the additional effect of all variants of support is, compared with the type of support given until then, i.e. Exdiff_y(t) = Exsupport_x(t) – Exsupport_y(t)
(9)
TEdiff_y(t) = TEsupport_x(t) – TEsupport_y(t) for all types of support_y, and support_x the current type of support. Example scenario To illustrate the approach, simulations are performed using the agent-model described in Section 3 and the reasoning in the support framework as described above. For the
example scenario, a task is considered for which the effects are described in Table 1. In addition, the engagement and exhaustion are assumed to be equally important: formula 8 is used with weights w1 = w2 = 1. All other variables in the simulations have the same value as in Section 4. First, the system compares the effect of a specific variant of support with giving no support. Figure 4 shows the value of the combination function for all different variants of support listed in Table 2. It can be seen that up to time point 22, the line with small dots (support variant 3, high sensing automation) is the most effective type of support. Figure 5 shows the same simulation up to time point 22, but from time point 23 on the support system compares all variants of support with support variant 3. It can be seen that until the end of the simulation the line with the thick dots is the highest. This represents variant 4, low perception automation. Consequently, the intelligent support system would generate high sensing support up to time point 23, and then would switch to low perception support. Note that in the example no “agency actions” (i.e. actions that would increase the self-agency) are considered. Using those actions, it might be possible to influence the engagement directly, without implementing a lower level of automation (which would increase the task demands).
Fig. 4. Simulated balance between reduction of exhaustion and engagement compared with no support (line with thin dots is support variant 3)
Fig. 5. Simulated balance between reduction of exhaustion and engagement compared with support variant 3 from timepoint 23
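The selection procedure of this section (formulas (7)-(9), combined via (8)) can be sketched as follows; simulate_trace is an assumed helper that runs the Section 3 model forward for a given support variant and returns the resulting engagement and exhaustion, and the variant encoding is hypothetical:

def best_variant(variants, current, horizon, simulate_trace, w1=1.0, w2=1.0):
    # reference trace for the currently given support (or "no support" at the start)
    te_cur, ex_cur = simulate_trace(current, horizon)
    best, best_f = current, float("-inf")
    for v in variants:
        te_v, ex_v = simulate_trace(v, horizon)
        te_diff = te_cur - te_v          # (7)/(9): difference in engagement w.r.t. current support
        ex_diff = ex_cur - ex_v          #          difference in exhaustion w.r.t. current support
        if ex_diff == 0:                 # guard against division by zero in (8)
            continue
        f = (w1 * te_diff) / (w2 * ex_diff)   # (8) combination of the two criteria
        if f > best_f:                   # the text selects the highest value of the combination
            best, best_f = v, f
    return best

As in the example scenario, the returned variant would be applied until a re-evaluation (now comparing against that variant rather than against no support) predicts that another variant gives a better balance.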
6 Discussion In this paper, an agent model and a framework for an intelligent support system is presented to decide on the optimal type of automation, taking both the task engagement and the exhaustion into account. The model is generic and can be used for different tasks. This requires a specification of the effect of the support actions on both the task demands and the strength of automation and the self agency. The specification of these effects is not trivial because it requires quantification for the effect of all actions on different aspects, but part of this specification can be based on actual measurements of demands. The approach has been illustrated with an example scenario. In this scenario, no agency actions are assumed, and a constant balance between the importance of task engagement and exhaustion is assumed. However, it could be that the importance changes over time. The framework can cope with such situations as well. This will result in a more varied behavior of the support system. For example, if the task is very critical at a specific point in time and the operator is too exhausted to perform, automation should be kept constant. When the critical task is performed, the task should be given back to the operator such that the operator’s engagement can be kept optimal.
References 1. Posner, M.I., Boies, S.J.: Components of attention. Psychological Bulletin 78, 391–408 (1971) 2. Rovira, E., McGarry, K., Parasuraman, R.: Effects of imperfect automation on decision making in a simulated command and control task. Human Factors 49(1), 76–87 (2007) 3. Endsley, M.R.: Level of automation effects on performance, situation awareness and workload in a dynamic control task. Ergonomics 42(3), 462–492 (1999) 4. Parasuraman, R.: Designing automation for human use: empirical studies and quantative models. Ergonomics 43(7), 931–951 (2000) 5. Parasuraman, R.: Humans and automation: Use, misuse, disuse, abuse. Human Factors 39, 230–253 (1997) 6. Kaber, D.B., Endsley, M.R.: The effects of level of automation and adaptive automation on human performance, situation awareness and workload in a dynamic control task. Theoretical Issues in Ergonomics Science 5(2), 113–153 (2004) 7. Parasuraman, R., Barnes, M., Cosenzo, K.: Adaptive automation for human-robot teaming in future command and control systems. The International C2 Journal 1(2), 43–68 (2007) 8. Wilson, G.F., Russell, C.A.: Performance enhancement in an uninhabited air vehicle task using psychophysiologically determined adaptive aiding. Human Factors 49(6), 1005– 1018 (2007) 9. Byrne, E.A., Parasuraman, R.: Psychophysiology and adaptive automation. Biological Psychology 42(5), 249–268 (1997) 10. Prinzel, L.J., Freeman, F.G., Scerbo, M.W., Mikulka, P.J., Pope, A.J.: Effects of a psychophysiological system for adaptive automation on performance, workload, and the eventrelated potential P300 component (2003) 11. Parasuraman, R., Wickens, C.D.: Humans: still vital after all these years of automation. Human Factors 50(3), 511–520 (2008)
12. Bosse, T., Both, F., Lambalgen, R.: van, and Treur, J., An Agent Model for a Human’s Functional State and Performance. In: Jain, L., Gini, M., Faltings, B.B., Terano, T., Zhang, C., Cercone, N., Cao, L. (eds.) Proceedings of the 8th IEEE/WIC/ACM International 13. Robert, G., Hockey, J.: Compensatory control in the regulation of human performance under stress and high workload: a cognitive-energetical framework. Biological Psychology 45, 73–93 (1997) 14. Hill, D.W.: The critical power concept. Sports Medicine 16, 237–254 (1993) 15. Wegner, D.M.: Précis of The illusion of conscious will. Behavioral and Brain Sciences 27, 1–46 (2004) 16. Aarts, H., Custers, R., Marien, H.: Priming and authorship ascription: when nonconscious goals turn into conscious experiences of self-agency. Journal of Personality and Social Psychology 96(5), 967–979 (2009) 17. Custers, R., Aarts, H.: Positive affect as implicit motivator: on the nonconscious operation of behavioral goals. Journal of Personality and Social Psychology 89(2), 129–142 (2005)
A Cognitive Agent Model Using Inverse Mirroring for False Attribution of Own Actions to Other Agents Jan Treur1 and Muhammad Umair1,2 1
Vrije Universiteit Amsterdam, Department of Artificial Intelligence De Boelelaan 1081, 1081 HV Amsterdam, The Netherlands 2 COMSATS Institute of Information Technology, Lahore, Pakistan {j.treur,m.umair}@vu.nl,
[email protected] http://www.few.vu.nl/~{treur,mumair}
Abstract. This paper presents a cognitive agent model capable of showing situations where self-generated actions are attributed to other agents, as, for example, for patients suffering from schizophrenia. The mechanism underlying the model involves inverse mirroring: mapping preparation states onto sensory representations of observed actions. It is shown how this mechanism can develop based on Hebbian learning. The model provides a basis for applications to human-like virtual agents in the context of for example, training of therapists or agent-based generation of virtual stories. Keywords: action attribution, cognitive agent model, schizophrenia.
1 Introduction
For the development of agent models as a basis for virtual agents in serious or nonserious gaming, an often used criterion is that they show realistic human-like behaviour. One of the ways to obtain such human-like agent models is to exploit the fast growing amount of neurological literature, so that models are developed that have biological plausibility. In addition, to obtain realistic virtual agents, not only ideally functioning persons should be considered but also persons with deviant behaviour, in order to cover larger parts of the variety in types of behaviour as occurring naturally in the overall human population. This paper addresses an agent model for such naturally occurring deviant behaviour in attribution of self-generated actions to other agents. False attribution of self-generated (e.g., manual or verbal) actions to other agents is a common symptom occurring in patients with schizophrenia. One explanation put forward for the phenomenon that self-generated actions are not attributed to oneself is that self-attribution depends on prediction and monitoring of effects of these actions, and this does not function well for persons with schizophrenia; see, for example [6], [7], [9], [10]. However, in other work it is debated whether this is an appropriate explanation. For example, in [8] experimental work is reported that indicates that differences in these respects between patients with schizophrenia and a control group are not very convincing. In [16] it is argued that a more important role is played by what is called 'the sense of agency' (which is at a more conscious, personal level) than action effect prediction and monitoring (which is at an unconscious, subpersonal level).
Note that the issue of not attributing a self-generated action to oneself, as addressed in the literature mentioned above, is not the same as attributing such an action to another agent, since creating a mental image of somebody else performing the action requires a shift from a first-person representation of the action to a representation from a third-person perspective (mental rotation). Patients with schizophrenia do not only fail to attribute self-generated actions to themselves, they also attribute them to other agents (which can be real or imaginary). Using neurological literature on mirroring [4], [14], [15], [20] and self-other differentiation [17], in this paper this form of false attribution to other agents is addressed. In the remainder of this paper, Section 2 briefly discusses the ideas from the neurological literature and introduces the cognitive agent model. Section 3 presents a number of simulation results, and in Section 4 a mathematical analysis is made. Finally, Section 5 is a discussion.
2 The Cognitive Agent Model for False Attribution of Actions In this section the cognitive agent model and its detailed specifications are presented. First, some background knowledge for the modelling is briefly discussed; next, the format used is briefly introduced and the example scenario used is described; finally, the agent model is addressed in detail. Background knowledge. One of the recent neurological findings concerns the mirroring function of certain neurons; e.g., [4], [14], [15], [20]. Mirror neurons are active not only when a person prepares for a specific action or body change, but also when the person observes somebody else intending or performing this action or body change. This includes expressing emotions in body states, such as facial expressions. The idea is that these neurons and the neural circuits in which they are embedded play an important role in social functioning and in (empathic) understanding of others; e.g., [4], [14], [15], [20]. Their discovery is often considered a crucial step for the further development of the discipline of social cognition; cf. [15]. When states of other persons are mirrored by some of the person's own states that at the same time are connected via neural circuits to states that are crucial for the person's own feelings and actions (shared circuits), then this provides an effective basic mechanism for how in a social context persons fundamentally affect each other's actions and feelings; e.g. [14]. Mirroring involves a change of perspective from another agent (third person) to oneself (first person). This requires a nontrivial mental rotation transformation of the available representations (cf. [17]): sensory representations of observed actions of other agents are mapped onto representational structures for self-generated actions. Attributing a self-generated action to another agent is in fact a kind of reverse process. It requires a change of perspective from preparation for a self-generated action (first person) to another agent (third person) perspective, based on a reverse mental rotation transformation of the available representations. In fact this is inverse mirroring: the representational structures for self-generated actions are mapped onto sensory representations of observed actions of other agents, thus forming a mental image of somebody else performing the action. When it is assumed that such an inverse
mapping is made, a self-generated action is perceived as observed from a third person perspective, and thus it provides a mechanism for the self-generated action to be attributed to another agent. A further question is how such a reverse mental rotation mapping can exist or develop in a neurological context. One possibility is that the mechanism is there initially, due to improper genetics. However, another possibility is that it is developed during lifetime. This is also analysed below, assuming a Hebbian learning principle. This is the principle that connected neurons that are frequently activated simultaneously strengthen their connecting synapse. The principle goes back to Hebb [12], but has recently gained enhanced interest by more extensive empirical support (e.g., [2]), and more advanced mathematical formulations (e.g., [11]). In the models a variant of this principle has been adopted to realise an inverse mirroring connection from preparation of an action to sensory representation of a similar observed action. Modelling Format Used. To formalise the agent model in an executable manner, the hybrid dynamic modelling language LEADSTO has been used; cf. [3]. Within LEADSTO the dynamic property or temporal relation a → →D b denotes that when a state property a occurs, then after a certain time delay (which for each relation instance can be specified as any positive real number D), state property b will occur. Below, this D will be taken as the time step Δt, and usually not be mentioned explicitly. Both logical and quantitative calculations can be specified, and a software environment is available to support specification and simulation. In most cases in the model below some form of combination function f is used. A typical example of an update rule in LEADSTO is SS(w1, V1) & SS(w1, V1) & SR(w3, V3)
→ → SR(w3, V3 + γ [ f(ω1V1, ω2V2) – V3 ] Δt)
which expresses that when the sensor states for w1 and w2 have values V1 and V2 respectively and the sensory representation of w3 has value V3, then after time duration Δt this sensory representation will have value V3 + γ [ f(ω1V1, ω2V2) – V3 ] Δt. Here ω1 and ω2 are the connection strengths from the sensor states to the sensory representation, respectively, and γ is an update speed factor. Moreover, f is a combination function, for which different choices can be made, for example, f(W1, W2) = β(1- (1-W1)(1-W2)) + (1 - β )W1W2
(0 ≤ β ≤1)
In simulations with the agent model, a combination function f(W1, …, Wn) based on a continuous logistic threshold function th(σ, τ, W) has been used of the form f(W1, …, Wn) = th(σ, τ, W1 + … + Wn) (where σ is a steepness and τ a threshold value) with:

th(σ, τ, W) = ( 1/(1 + e^(-σ(W - τ))) - 1/(1 + e^(στ)) ) (1 + e^(-στ))

For higher values of στ this threshold function is approximated by the expression:

th(σ, τ, W) = 1/(1 + e^(-σ(W - τ)))
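For reference, the threshold and combination functions above can be written in plain Python as follows (a sketch, not the LEADSTO implementation used by the authors):

import math

def th(sigma, tau, w):
    # continuous logistic threshold function with steepness sigma and threshold tau
    return (1 / (1 + math.exp(-sigma * (w - tau)))
            - 1 / (1 + math.exp(sigma * tau))) * (1 + math.exp(-sigma * tau))

def f(sigma, tau, *weighted_inputs):
    # combination function f(W1, ..., Wn) = th(sigma, tau, W1 + ... + Wn)
    return th(sigma, tau, sum(weighted_inputs))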
Example Scenario. The designed agent model will be illustrated for the following scenario. A sensed stimulus s1 leads to a sensory representation of this stimulus, which in turn triggers preparation and execution of an action b as a response of the agent; see the causal chain in the lower part of Fig. 1. Moreover, when another agent
performs action b, this is represented as a stimulus s2 that can be sensed; see upper part of Fig. 1. The sensory representation srs(s2) indicates the mental image of another person performing the action. The agent’s mirroring capability is based on the assumption that an activated sensory representation of s2 will also activate the agent’s own preparation for b. This assumption has been confirmed in neurological literature such as [4], [14], [15], [20]. When this latter chain of events happens (i.e., whenever mirroring takes place) , for the model introduced here it is assumed that by Hebbian learning this will strengthen the connection from preparation of b to sensory representation of s2 (observed action), thus developing inverse mirroring capabilities. When such a learning process has achieved substantial connection strength, the agent’s response on stimulus s1 has changed. When s1 is sensed (in the absence of s2), not only will the agent trigger preparation and execution of action b as before, but in addition it will generate a mental image of another agent performing action b (the sensory representation srs(s2)), thus creating a third person perspective on the action.
[Fig. 1 shows the states world_state(s1), world_state(s2), sensor_state(s1), sensor_state(s2), srs(s1), srs(s2), preparation_state(b) and effector_state(b), connected by the weights ω0-ω5 and the adaptive connection ω listed in Table 1 and Table 2, with labels LP1-LP5 marking the corresponding update rules.]

Fig. 1. Overview of the cognitive agent model

Table 1. State properties used

notation   description
WS(s)      world state for s
SS(s)      sensor state for s
SRS(s)     sensory representation state of s
PS(b)      preparation state for b
ES(b)      execution state for b
cs(ω)      strength of connection ω (from preparation of b to sensory representation of s2)
Detailed Specification of the Agent Model. In the detailed specification, states have been formalised as shown in Table 1. Moreover, the dynamical relationships between these states are shown in Table 2. Note that each connection has a fixed strength, except the connection from preparation of b to sensory representation of s2, which is adapted over time by the Hebbian learning. More detailed specifications of the dynamical relationships are presented below. First it is shown how in a generic manner sensor states are generated from world states.
Table 2. Overview of the connections and their weights

from states         to state   weights   process                                                  LP
WS(s)               SS(s)      ω0        sensing world state                                      LP1
SS(s1)              SRS(s1)    ω2        representing world state                                 LP2
SRS(s1), SRS(s2)    PS(b)      ω3, ω4    preparing action / mirroring action                      LP3
PS(b)               ES(b)      ω5        action execution                                         LP4
SS(s2), PS(b)       SRS(s2)    ω1, ω     representing observed action / inverse mirroring action  LP5
SRS(s2), PS(b)      cs(ω)      η, ζ      Hebbian learning for inverse mirroring                   LP6
LP1 Generating a sensor state If world state s has level V1 and the sensor state for s has level V2 then after duration Δt the sensor state for s will have level V2 + γ [ f(ω0V1) – V2] Δt WS(s, V1) & SS(s, V2) → → SS(s, V2 + γ [ f(ω0V1) – V2] Δt)
Note that this applies to both s1 and s2. Activation for sensory representation for s1 has been modelled in a straightforward manner as shown in LP2. LP2 Sensory representation for a sensor state If property s1 is sensed with level V1 and the sensory representation of s1 has level V2. then after Δt the sensory representation of s will have level V2 + γ [ f(ω2V1) - V2 ] Δt. SS(s1, V1) & SRS(s1, V2) → → SRS(s1, V2 + γ [ f(ω2V1) – V2 ] Δt Next it is shown how an action preparation for b is generated. This is based on the sensory representation of stimulus s1, or based on the sensory representation of s2 representing an observed action of another agent; in the latter case mirroring takes place. Note that the two options are not exclusive. LP3 Preparing and mirroring an action If sensory representation of s1 has level V1, and sensory representation of s2 has level V2, and the preparation for b has level V3 then after Δt the preparation state for b will have level V3 + γ (f(ω3V1, ω4V2) - V3) Δt. SRS(s1,V1) & SRS(s2,V2) & PS(b, V3) → → PS(b, V3 + γ (f(ω3V1, ω4V2) - V3) Δt) Action execution has been modelled in a straightforward manner as shown in LP4. LP4 Action execution If preparation for b has level V1 and the action execution state for b has level V2 then after Δt the action execution state for b will have level V2 + γ (f( ω5V1) – V2) Δt. PS(b, V1) & ES(b, V2)
→ → ES(b, V2 + γ (f(ω5V1) – V2) Δt)
Next it is shown in LP5 how the sensory representation for s2 (a mental image of an observed action) is generated based on a sensor state for s2 or an action preparation for b (or a combination of both). In the latter case inverse mirroring take place. LP5 Representing a world state and inverse mirroring of an action If sensor state of s2 has level V1, and preparation of b has level V2, and the sensory representation of s2 has level V3
then after Δt the sensory representation of s2 will have level V3 + γ (f(ω1V1, ωV2) V3) Δt. SS(s2,V1) & PS(b,V2) & SRS(s2, V3) → → SRS(s2, V3 + γ (f(ω1V1, ωV2) - V3) Δt) Finally, it is shown in LP6 how the Hebbian learning process of the connection from preparation state for b to sensory representation s2 of an observed action was modelled. This takes place using the following Hebbian learning rule, with maximal connection strength 1, a learning rate η, and extinction rate ζ (usually taken small): Δω = γ [ ηV1V2 (1 - ω) - ζω ] Δt
Here V1 and V2 are (time-dependent) activation levels of the connected nodes, and
γ is an adaptation speed factor. In differential equation format it can be written as
dω/dt = γ [ ηV1V2 (1 - ω) - ζω ] = γ [ ηV1V2 - (ηV1V2 + ζ) ω ]
A similar Hebbian learning rule can be found in [11], p. 406. By the factor (1 - ω) the learning rule keeps the level of ω bounded by 1. When the extinction rate is relatively low, the upward changes during learning are proportional to both V1 and V2 and maximal learning takes place when both are 1. Whenever one of them is close to 0, extinction takes over, and ω slowly decreases. This is specified as follows: LP6 Learning for inverse mirroring If the sensory representation of stimulus s2 has level V1,
and the preparation for b has level V2, and the connection weight from preparation for b to sensory representation of s2 has level W, then after duration Δt the connection weight from preparation for b to sensory representation of s2 will have level W + γ [ ηV1V2 (1 - W) - ζW ] Δt. SRS(s2, V1) & PS(b, V2) & cs(ω, W) → → cs(ω, W + γ [ ηV1V2 (1 - W) - ζW ] Δt)
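Putting LP1-LP6 together, one simulation step of the model can be sketched as follows (plain Python rather than LEADSTO; state names follow Table 1, all fixed connection weights are 1 as in Section 3, and the default parameter values follow those used in the simulations there):

import math

def th(sigma, tau, w):
    return (1 / (1 + math.exp(-sigma * (w - tau)))
            - 1 / (1 + math.exp(sigma * tau))) * (1 + math.exp(-sigma * tau))

def step(s, ws_s1, ws_s2, dt=0.1, gamma=0.5, eta=0.3, zeta=0.2, sigma=5, tau=0.5):
    """s: dict with keys ss1, ss2, srs1, srs2, ps_b, es_b, omega. Returns the next state."""
    f = lambda *xs: th(sigma, tau, sum(xs))
    n = dict(s)
    n["ss1"]  = s["ss1"]  + gamma * (f(ws_s1) - s["ss1"]) * dt                    # LP1
    n["ss2"]  = s["ss2"]  + gamma * (f(ws_s2) - s["ss2"]) * dt                    # LP1
    n["srs1"] = s["srs1"] + gamma * (f(s["ss1"]) - s["srs1"]) * dt                # LP2
    n["ps_b"] = s["ps_b"] + gamma * (f(s["srs1"], s["srs2"]) - s["ps_b"]) * dt    # LP3
    n["es_b"] = s["es_b"] + gamma * (f(s["ps_b"]) - s["es_b"]) * dt               # LP4
    n["srs2"] = s["srs2"] + gamma * (f(s["ss2"], s["omega"] * s["ps_b"])
                                     - s["srs2"]) * dt                            # LP5
    n["omega"] = s["omega"] + gamma * (eta * s["srs2"] * s["ps_b"] * (1 - s["omega"])
                                       - zeta * s["omega"]) * dt                  # LP6 (Hebbian learning)
    return n

Iterating this step with the stimulus schedule described in Section 3 (s2 pulsed during the first 45 time units, s1 switched on later) should qualitatively reproduce the normal behaviour and, with ζ = 0.01 instead of 0.2, the deviant behaviour discussed there.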
3 Simulation Results
A number of simulations have been performed with the focus of simulating normal functioning and deviant functioning of the model. Normal functioning of the agent occurs by parameter settings in which stimulus s1 does not lead to high activation of sensory representation of s2 (i.e., no mental image of somebody else performing the action is created) in absence of the stimulus s2, although stimulus s2 has occurred time and time again in the past. In contrast, deviant functioning occurs by parameter setting in which stimulus s1 does lead to high activation of sensory representation of s2 under similar circumstances, so in this case a mental image of somebody else performing the action is created. In the simulations shown, time is on the horizontal axis and the activation level of the state properties is on the vertical axis. The connection strengths between different states were initialized with 1 (i.e., ω0 = ω1 = ω2 = ω3 = ω4 = ω5 = 1) and kept fixed throughout the simulation, except the connection strength ω which was initialized with 0 and adapted over time by the Hebbian learning rule given in LP6 in Section 2. Other parameters are set as: ∆t = 0.1, learning rate η = 0.3, extinction rate ζ = 0.2,
Fig. 1. World State for s2

Fig. 2. World State for s1
speed factor λ = 0.5, steepness σ = 5 and threshold τ = 0.5. For the initial duration of 45 time units the stimulus s2 occurs three times for 5 time units alternately, i.e., for the first 5 time units the world state for s2 has value 1 and for the next 15 time units value 0, and so on (see Fig. 1). During these 45 time units the world state for s1 was kept 0 (see Fig. 2). This represents the situation in which a person observes somebody else performing some action (or bodily change) and the mirroring function of the preparation neurons makes the person prepare for this action. The fluctuation in the activation level of the sensor state repeats the same pattern between 0.1 and 0.9, as it only depends (via LP1) upon the world state for s2, which is also repetitive.

Fig. 3. Preparation State for b (Normal Functioning)

Fig. 4. Connection Strength ω (Normal Functioning)
Fig. 3 shows the activation of preparation state PS(b) resulting from the sensory representation pattern. The slight change in the strength of the connection (inverse mirroring) from preparation state PS(b) to sensory representation SRS(s2) (via LP6) shows a similar but slightly delayed fluctuating pattern; see Fig. 4. After that for about 30 time units both world states for s1 and s2 were kept 0, so that the effect of any stimulus on preparation state and execution state becomes zero, as reflected, for example, in Fig. 3. For the rest of the simulation, the world state for s2 is kept 0 while the world state for s1 is becoming 1 (see Figs 1 and 2). As soon as world state for s1 becomes 1, the values of sensor state and its sensory representation increase smoothly and become stable at a high value (of about 0.9). One of the interesting facts (in
comparison to the second scenario discussed below) is that, even though a link from preparation state to SRS(s2) develops, it is not strong enough to propagate the effect of SRS(s1) via PS(b) to SRS(s2), see Fig. 4. Hence SRS(s1) only has a positive effect on the activation levels of the preparation state and execution state of b (shown in Fig. 3), and not on SRS(s2). No activation is developed of a mental image SRS(s2) of another person performing action b; this shows normal functioning. To obtain the deviant behavior of the model, again all parameters were initialized with the same values as used for the normal behavior and mentioned earlier in the current section, except the extinction rate which was taken lower this time: ζ = 0.01. In this case by Hebbian learning the connection from preparation state of b to sensory representation of s2 achieves a substantially higher connection strength (see Fig. 6) which also induces an upward trend in the fluctuating value of the preparation state for b (see Fig. 5).

Fig. 5. Preparation State for b (Deviant Functioning)

Fig. 6. Connection Strength ω (Deviant Functioning)
This achieved connection strength is sufficient to change the impact of stimulus s1 on SRS(s2) (see Fig. 8) compared to the impact shown in Fig. 7. Even in the absence of the world state for s2, from time 75 onwards, if the world state for s1 occurs, it leads to high activation of SRS(s2) (see Fig. 8), which shows that the agent develops a mental image of somebody else performing action b. This contrasts with the case of normal functioning, in which after time point 75 the level of SRS(s2) stays (close to) 0 (see Fig. 7).

Fig. 7. Sensory Representation for s2 (Normal Functioning)

Fig. 8. Sensory Representation for s2 (Deviant Functioning)
4 Mathematical Analysis of the Model The equilibrium of the connection strength from preparation state of b to the sensory representation of s2 may be found by using the equation for Δω presented in Section 2 to determine the change in the connection strength after ∆t as follows. An equilibrium will occur if Δ ω = 0, so the equation can be rewritten as
γ [ ηV1V2 (1 - ω) - ζω ] Δt = 0
or
ηV1V2 (1 - ω) - ζω = 0
or

ω = ηV1V2 / (ηV1V2 + ζ)

As V1V2 ≤ 1, the following is an upper bound for the equilibrium value: ω ≤ η / (η + ζ). This expression gives the relation describing the maximum connection strength that may be achieved for given values of the learning and extinction rates. It shows that for a smaller value of the extinction rate compared to the learning rate the connection strength will be closer to 1. For the simulation results discussed in Section 3, for the second case (deviant functioning) the learning rate was η = 0.3 and the extinction rate ζ = 0.01, which means the maximum connection strength which may be achieved is 0.97. Indeed, it was observed that the connection strength stabilizes just below 0.97. Similarly, for the case of normal functioning, the learning rate was η = 0.3 and the extinction rate ζ = 0.2. Indeed, the equilibrium was below the 0.6 indicated by the above analysis. Note that these are upper bounds resulting from maximal values 1 continuously for V1 and V2. In practice this will not continuously happen, so the strength will stay lower.
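A quick numerical check of this bound (plain Python) confirms the two values quoted above:

for eta, zeta in [(0.3, 0.01), (0.3, 0.2)]:
    print(eta, zeta, round(eta / (eta + zeta), 2))   # prints 0.97 and 0.6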
5 Discussion One of the recent developments in Neuroscience concerns the notion of a mirror system and its functions; e.g., [4], [14], [15], [20]. Cognitive agent models have been designed using this notion as a point of departure, and showing its role in various high-level cognitive and social capabilities such as prediction, imagination, emotion reading, empathic understanding, imitation, and attribution of observed actions; see, for example [5], [13], [18]. Mirroring is a process from an observed action or body state of another person to the person’s own preparation states. As discussed in [17] this involves a mental rotation mapping sensory representations of observed actions of other agents onto the representational structures for self-generated actions. This realises a change of perspective from another agent (third-person) to perspective from oneself (first-person). Attribution a self-generated action to another agent proceeds in the opposite direction, realising a change of perspective from oneself (first-person) to another agent (third-person) perspective. This requires inverse mirroring: the representational structures for self-generated actions are mapped onto sensory representations of an observed action of another agent, thus creating a mental image of another agent performing the action. When such a mapping involving reverse mental rotation is made, a self-generated action is perceived as observed from a third person
perspective. The agent model presented in this paper addresses the issue of false attribution of self-generated actions using such a mechanism. In addition, it is shown how the mechanism can develop based on Hebbian learning [12], [2], [11]. Note that due to the opposite direction, an inverse mirroring process is not covered by cognitive agent models based on mirroring, such as those described in [5], [13], [18]. The modelling format used to formally specify the agent model is based on the executable hybrid dynamical modelling language LEADSTO [3]. This hybrid language combines executable temporal logical elements [1] and numerical dynamical system elements [19]. Although the model can also be specified well in a purely numerical format, an advantage of LEADSTO is the possibility to use a logical format to describe state properties. As a next step a validation study can be conducted to compare the agent model's behaviour to real behaviours. The agent model obtained can be used as a basis for applications involving realistic, human-like virtual agents in the context of serious or non-serious gaming. In the area of virtual stories it can be used to create virtual characters that react in a less standard but realistic manner. Another possible application is to obtain virtual patients in the area of simulation-based training for psychotherapists.
References
1. Barringer, H., Fisher, M., Gabbay, D., Owens, R., Reynolds, M.: The Imperative Future: Principles of Executable Temporal Logic. John Wiley & Sons, Chichester (1996)
2. Bi, G.Q., Poo, M.M.: Synaptic Modifications by Correlated Activity: Hebb's Postulate Revisited. Ann. Rev. Neurosci. 24, 139–166 (2001)
3. Bosse, T., Jonker, C.M., van der Meij, L., Treur, J.: A Language and Environment for Analysis of Dynamics by Simulation. Intern. J. of AI Tools 16, 435–464 (2007)
4. Bosse, T., Memon, Z.A., Treur, J.: A Cognitive and Neural Model for Adaptive Emotion Reading by Mirroring Preparation States and Hebbian Learning. Cognitive Systems Research Journal (in press, 2011)
5. Damasio, A., Meyer, K.: Behind the looking-glass. Nature 454, 167–168 (2008)
6. Farrer, C., Franck, N.: Self-monitoring in schizophrenia. Curr. Psychiatry Rev. 3, 243–251 (2007)
7. Feinberg, I.: Efference copy and corollary discharge: Implications for thinking and its disorders. Schizophrenia Bulletin 4, 636–640 (1978)
8. Fourneret, P., de Vignemont, F., Franck, N., Slachevsky, A., Dubois, B., Jeannerod, M.: Perception of self-generated action in schizophrenia. Cogn. Neuropsych. 7, 139–156 (2002)
9. Frith, C.D., Blakemore, S., Wolpert, D.: Explaining the symptoms of schizophrenia: Abnormalities in the awareness of action. Brain Research Rev. 31, 357–363 (2000)
10. Frith, C.D.: The cognitive neuropsychology of schizophrenia. Lawrence Erlbaum Associates Ltd., Hove (1992)
11. Gerstner, W., Kistler, W.M.: Mathematical formulations of Hebbian learning. Biol. Cybern. 87, 404–415 (2002)
12. Hebb, D.O.: The Organization of Behaviour. John Wiley & Sons, New York (1949)
13. Hendriks, M., Treur, J.: Modeling Super Mirroring Functionality in Action Execution, Imagination, Mirroring, and Imitation. In: Pan, J.-S., Chen, S.-M., Nguyen, N.T. (eds.) ICCCI 2010. LNCS, vol. 6421, pp. 330–342. Springer, Heidelberg (2010)
14. Hurley, S.: The shared circuits model: How control, mirroring, and simulation can enable imitation, deliberation, and mind reading. Behavioral and Brain Sciences 31, 1–22 (2008)
15. Iacoboni, M.: Mirroring People: the New Science of How We Connect with Others. Farrar, Straus & Giroux, New York (2008)
16. Jeannerod, M.: The sense of agency and its disturbances in schizophrenia: a reappraisal. Exp. Brain Res. 192, 527–532 (2009)
17. Jeannerod, M., Anquetil, T.: Putting oneself in the perspective of the other: A framework for self-other differentiation. Social Neuroscience 3, 356–367 (2008)
18. Memon, Z.A., Treur, J.: Designing Social Agents with Empathic Understanding. In: Nguyen, N.T., Kowalczyk, R., Chen, S.-M. (eds.) ICCCI 2009. LNCS (LNAI), vol. 5796, pp. 279–293. Springer, Heidelberg (2009)
19. Port, R.F., van Gelder, T. (eds.): Mind as Motion: Explorations in the Dynamics of Cognition. MIT Press, Cambridge (1995)
20. Rizzolatti, G., Sinigaglia, C.: Mirrors in the Brain: How Our Minds Share Actions and Emotions. Oxford Univ. Press, Oxford (2008)
Explaining Negotiation: Obtaining a Shared Mental Model of Preferences Iris van de Kieft, Catholijn M. Jonker, and M. Birna van Riemsdijk Man Machine Interaction Group, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands
Abstract. Negotiation support systems (NSSs) aim to assist people during the complex process of negotiation. We argue that having a shared mental model of the negotiation task enables and enhances the collaboration between the human negotiator and the NSS. This paper presents an analysis of negotiation that results in a set of concepts that a shared mental model of the user and the NSS should contain. Discrepancies between the individual mental models can arise for various reasons, such as the constructive nature of preferences. Explanation can increase user understanding of the NSS’s reasoning, allowing the user to detect and resolve discrepancies. We therefore propose using explanation to achieve and maintain sharedness. We present a framework that provides a means to generate content for such explanations, where we focus on the mental models of user and opponent preferences.
1 Introduction
Negotiation is an interactive decision-making process between two or more parties. It is a complex process that involves emotions as well as computational complexity. As a result, even experienced human negotiators can fail to achieve efficient outcomes [11]. This has motivated the development of negotiation support systems (NSSs). These software systems assist a human negotiator (user) in negotiation by, for example, aiding communication, enhancing negotiation skills, and reducing cognitive task load. The Pocket Negotiator project, see [4], strives for synergy between NSS and the human negotiator. The NSS and the user should work together as a team in which their complementary skills are needed to achieve good outcomes. It is well-known from the social psychology literature that performance of human teams is positively influenced by the team members having a shared understanding or shared mental model of the task and the team work involved ([5,7]). The concept of shared mental model is defined in [3] as: knowledge structures held by members of a team that enable them to form accurate explanations and expectations for the task, and, in turn, coordinate their actions and adapt their behavior to demands of the task and other team members.
We maintain that having a shared mental model is not only important in human teams, but also in human-agent teams. The representation that an automated agent has of a task can be viewed as its mental model. Discrepancies between the mental models of the NSS and the user may at best result in innocent misunderstandings, but at worst may result in dysfunctional cooperation. This paper contributes to the technology to achieve sharedness between the mental models of NSS and user. In Sect. 2, we first analyze negotiation and the interaction between user and NSS, to determine the essential components of such a shared mental model. Furthermore, the analysis reveals the possible causes of discrepancies between mental models, for example, the constructive nature of preferences. Based on our analysis, in Sect. 3 we identify explanation (see, for example, [12,13]) as a suitable technology for improving sharedness between mental models. Explanations can increase the transparency of the system, allowing the user to detect and resolve discrepancies. We provide a structured approach to generating such explanations. Due to space restrictions, we focus on one part of the shared mental model: user and opponent preferences. We describe different levels of content selection for explanations of bids and their utilities. We leave the form (i.e., presentation) of the explanation for future work.
2 Bilateral Multi-issue Negotiation
Bilateral multi-issue negotiation is the process in which two parties try to reach an agreement about multiple issues. The following four major stages can be discerned in integrative negotiation: private preparation, joint exploration, bidding, and closing. Private preparation is about information gathering and reflection before meeting the other party. In joint exploration the negotiating parties talk to each other, but do not place bids on the table. During bidding, both negotiators exchange bids according to the agreed protocol, typically a turn-taking protocol. During the closing stage the outcome of the bidding stage is formalized and confirmed by both parties. We first present a basic negotiation framework, followed by an analysis of the negotiation process. This analysis discusses human weaknesses in negotiation and subsequently focuses on the interaction between user and NSS.

2.1 A Basic Negotiation Framework
Let p ∈ {s, o} represent the negotiating parties “Self” (s) and “Opponent” (o). Let I, with typical element i, denote the finite set of issues under negotiation. For example, in a job negotiation I might consist of the issues salary, car, vacation days. For convenience we assume issues to be numbered from 1 to n, where n = |I|, and henceforth, we will refer to issues by their respective numbers. For example, issue 1 is salary, issue 2 is car, issue 3 is vacation days. For each i ∈ I let Vi be the set of possible values that this issue may have. For example, the issue car may have {yes, no} as possible values. Let V = V1 × V2 × . . . × Vn , be the set of possible outcomes. A possible outcome v ∈ V is
thus an n-tuple ⟨v_1, v_2, …, v_n⟩. A possible outcome for the job domain can, for example, be ⟨2000, no, 20⟩. A negotiation domain is denoted by D = ⟨I, V⟩. A bid in domain D is denoted by b_p, with b ∈ V and p the party that proposed the bid. For simplicity, a bid may be denoted as b, if p is understood or not relevant. Each party p has a utility function u_p : V → [0, 1] which assigns a utility between 0 and 1 to possible outcomes and bids. A commonly used type of utility function is the linear additive normalized function, defined by u_p(b) = Σ_{i=1..n} u_p^i(b), where u_p^i(b) is the utility of issue value b_i for party p. That function is defined in terms of a weight w_p^i and evaluation function e_p^i for that issue and party: u_p^i(b) = w_p^i · e_p^i(b_i). The weight is the relative importance that party p assigns to issue i, in such a way that Σ_{i=1..n} w_p^i = 1, for all p ∈ {s, o}. The evaluation function e_p^i : V_i → [0, 1] assigns a score between 0 and 1 to the possible values of issue i. We use the notation ū_p(b) to represent the n-tuple ⟨u_p^1(b), u_p^2(b), …, u_p^n(b)⟩. Similarly, ē_p(b) = ⟨e_p^1(b_1), …, e_p^n(b_n)⟩ and w̄_p = ⟨w_p^1, …, w_p^n⟩.
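To make the linear additive utility concrete, the following small Python sketch (ours, not from the paper) computes u_p(b) for the job-negotiation example; the specific weights and evaluation scores are invented purely for illustration.

```python
# Linear additive normalized utility: u_p(b) = sum_i w_p^i * e_p^i(b_i),
# for the job domain with issues salary, car and vacation days.
issues = ["salary", "car", "vacation_days"]
weights = {"salary": 0.5, "car": 0.2, "vacation_days": 0.3}  # illustrative, sum to 1

evaluations = {
    "salary": lambda v: min(v / 3000.0, 1.0),          # higher salary is better
    "car": lambda v: 1.0 if v == "yes" else 0.0,        # lease car preferred
    "vacation_days": lambda v: min(v / 30.0, 1.0),      # more vacation is better
}

def utility(bid):
    """Overall utility in [0, 1] of a bid given as a dict issue -> value."""
    return sum(weights[i] * evaluations[i](bid[i]) for i in issues)

bid = {"salary": 2000, "car": "no", "vacation_days": 20}   # the outcome <2000, no, 20>
print(round(utility(bid), 3))
```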
2.2 The Weaknesses of the Human Negotiator
In this first part of our analysis, we discuss the problems humans have with negotiation, assuming there is no NSS support. There are two ways to categorize the problems humans have with negotiation: related to the outcome, or related to the negotiation process. The outcome-related pitfalls in negotiation are: leaving money on the table, settling for too little, rejecting a better offer than any other available option, and settling for terms worse than alternative options [1,11]. The outcome-related pitfalls are caused by the problems people have during the negotiation process, which are related to the following (see [1,11] for more information):
– Lack of training: Without training, humans often have difficulty in structuring negotiation problems and thinking creatively about such problems.
– Lack of preparation: Preparation is insufficient when it leaves the negotiator unaware of an important part of the issues, underlying interests, the preferences and/or circumstances of the parties involved.
– Structural barriers to agreement: This refers to such problems as die-hard bargainers, a bad atmosphere, power imbalance [6], cultural and gender differences, disruptive or incommunicative people, and a lack of information.
– Mental errors: Parties commit mental errors such as the escalation error, biased perception, irrational expectations, overconfidence, and unchecked emotions.
– Satisficing: Due to uncertainty of the future, the costs of acquiring information, and the limitations of their computational capacities, people have only bounded rationality, forcing them to make decisions by satisficing, not by maximization.
These difficulties indicate why it may be difficult for a human to have an accurate mental model of a negotiation.
2.3 The Interaction between Human Negotiator and NSS: What to Share?
In this section we analyze the interaction between user and NSS, to provide insight into their task division. This analysis helps determine the contents of the shared mental model that needs to be cultivated between user and NSS. Negotiation is a prime example of a task for which the human mind is only partially equipped, and for which artificial intelligence can only provide partial assistance. For example, the user has a wealth of knowledge about the world and about interacting with other humans, but need not be a specialist in negotiation. The NSS specializes in negotiation. It makes generic negotiation knowledge available to the human. Also, the user has limited working memory and limited computational power, i.e., bounded rationality. The NSS has better memory and can search much more quickly through much larger outcome spaces. This implies that tasks should be divided between user and NSS in a way that respects their complementary capabilities. This task division suggests they need not share all their knowledge. However, some shared information is necessary for cooperation, hence the need for a shared mental model. The information and knowledge exchange between these two team members is as follows: during the preparation and exploration stages the user needs to inform the NSS about the current negotiation, e.g., the Opponent, the set of issues I, the outcome space V, and the utility functions of Self and Opponent. These utility functions are a model of the actual preferences of Self and Opponent. The actual preferences may not be fully known, and may be subject to change. The NSS needs this user input in order to provide assistance during the bidding stage, when strategic bidding decisions have to be made. For this information exchange to be successful, the user must fully understand the process of negotiation, what is expected of him/her by the NSS, and what can be expected in return. This implies that during the negotiation stages, the NSS needs to provide the user (upon request) with generic negotiation information, but also current negotiation information regarding the Opponent, I, V, and the utility functions, in as far as such information is available to the NSS. Thus, a shared mental model of a human negotiator and an NSS should at least contain submodels on:
– domain knowledge D
  • I: the set of issues
  • ∀i ∈ I: V_i, the value range of issue i
– knowledge about the negotiating parties Self and Opponent; for each p ∈ {s, o}:
  • u_p: the utility function of p, in so far as known
  • the emotional status and coping style of p
  • the negotiation model of p
– knowledge about the capabilities of the team members: Self and NSS
– bidding knowledge
  • bidding history: the sequence of bids that have been exchanged so far
  • the current bidding strategy for Self
  • the bidding protocol, including information about available time
2.4 Discrepancies between Mental Models
Based on the previous subsections, we identify what may cause a lack of sharedness with respect to the elements of the shared mental model, or in other words, what may cause discrepancies between the mental models of the user and the NSS. We consider a discrepancy between mental models to exist when one model contains information regarding an element, and the other model contains either conflicting information regarding this element, or no information regarding this element. Once a discrepancy is detected, it can be resolved by adapting (one of) the mental models. One particular aspect that may lead to discrepancies is the constructiveness of domain and preference information. Even with proper preparation, information on the domain and preferences of Self and Opponent is often difficult to determine fully at the start of the negotiation. Humans have been found to discover this information along the way. Due to this constructiveness, the user may discover new knowledge during the negotiation that the NSS does not yet have, thus causing a discrepancy. Table 1 lists, for each team member, some possible causes for their mental model to lack (correct) information.

Table 1. Causes for lack of (correct) information in mental models
  User mental model: lack of training; lack of preparation; constructive domain; bounded rationality; constructive preferences of Self and Other
  NSS mental model: lack of user input; constructive domain; constructive preferences of Self and Other

3 Explanation Framework: Resolving Discrepancies
Explanation can serve various purposes, such as improving effectiveness (helping users make good decisions), increasing the user's trust in the system and improving transparency of the system [12]. In this paper we are in particular interested in the latter, as this facilitates detecting and resolving discrepancies between the mental models of NSS and user. Transparency means explaining how the system works, thus giving the user a better understanding of the NSS's reasoning process. This allows the user to detect any discrepancies between the mental models, and subsequently to resolve these discrepancies by updating the mental models where necessary. Due to space restrictions, for the remainder of this paper, we focus on the mental models of preferences, as represented by the utility functions u_s and u_o. The different causes of discrepancies between the mental models (of preferences)
suggest that different types of preference information need to be made transparent in different situations. First, lack of training can result in difficulty in switching between one's own perspective and that of the Opponent. The user may neglect the preferences of the Opponent, whereas the NSS often uses this information for bid calculations. This suggests that both user preferences and opponent preferences should be presented to the user, so that the user remains aware of both perspectives. This will make it easier for the user to understand the NSS's bid recommendations. Second, due to bounded rationality, users may have difficulty calculating bid utilities. To make transparent why bids are assigned a certain utility value, more insight into the utility function needs to be available. If the NSS assigns a utility that the user believes is too high, it is not always immediately clear to the user why it is too high. In order to resolve the discrepancy, the user needs to be able to determine where the problem lies exactly (i.e., in which part of which mental model). As we assume the utility function is a linear additive function, we distinguish three levels of insight into this function. The first level is the overall utility, u_p(b). The second level concerns the building blocks of the first level, the issue utilities u_p^i(b). The third level concerns the building blocks of the second level: the weights and evaluation functions of the issues, w_p^i and e_p^i(b_i). Third, also due to bounded rationality, users may have difficulty calculating utility differences between bids. This suggests that these differences should also be made transparent. The NSS may determine that bid b is better than bid b′; however, the user may also want to know how much better. Explanation about utility differences reduces the chance of mental errors such as irrational expectations. In this section, we present a framework that can be used as the basis for generating explanations to increase transparency with respect to the NSS's mental model of preferences. Research has distinguished between the content and form of explanations (e.g., [9]). The focus of the proposed framework is content selection; how it should be presented is an additional step that we only touch upon lightly in this paper. Further work will address this in more detail. The explanation content is selected from the so-called originator [10], which in our case is the original NSS without explanation capabilities. The content is selected by the so-called explainer [10], a component that is to be added to the NSS. The explainer is in charge of generating explanations. Given a bid b, we assume that the utility functions u_p(b) and their sub-parts u_p^i(b), w_p^i and e_p^i(b_i) are provided by the originator. When comparing two bids, additional content regarding differences in utility is necessary. This may not be readily available in the originator, so we leave it to the explainer to perform these calculations. Any additional knowledge that should be present in the explainer will be described at the relevant points in the framework. Content selection is presented for each of the three levels of detail of the utility function described above. For each level, the framework provides the content to be selected when evaluating a single bid, as well as the additional content needed when comparing two bids.
3.1 Overall Utility Level
This level concerns the overall utility function u_p(b). When evaluating a single bid b at this level, the content selection, referred to as CS1, consists of the bid itself and the overall utility u_p(b) for each p involved:

CS1(b) = ⟨b, u_s(b), u_o(b)⟩

When comparing two bids b and b′, we first select the relevant content for a single bid (CS1). Additionally, the explainer calculates the difference in overall bid utility for each party. The difference in overall utility for party p between two bids b and b′ is defined as follows: Δu_p(b, b′) = u_p(b′) − u_p(b). Then, the content selection, referred to as CS2, is:

CS2(b, b′) = ⟨CS1(b), CS1(b′), Δu_s(b, b′), Δu_o(b, b′)⟩

This content could, for example, be presented to the user as follows: "Bid b′ has Δu_s(b, b′) more utility for you than bid b. For your opponent bid b′ has Δu_o(b, b′) more utility than bid b." A graphical display could supplement this text, listing the utilities per bid per party, such as in Table 2.

Table 2. Content for overall utility level
  bid   utility self    utility other
  b     u_s(b)          u_o(b)
  b′    u_s(b′)         u_o(b′)
        Δu_s(b, b′)     Δu_o(b, b′)
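As an illustration of this content selection (a sketch of our own, not the paper's implementation; function names are assumptions), CS1 and CS2 at the overall utility level could be assembled as follows:

```python
# Content selection at the overall utility level.
# `utilities` maps "self" and "opponent" to their utility functions over bids.

def cs1(bid, utilities):
    """CS1(b) = <b, u_s(b), u_o(b)>."""
    return {"bid": bid,
            "utility_self": utilities["self"](bid),
            "utility_other": utilities["opponent"](bid)}

def cs2(bid, new_bid, utilities):
    """CS2(b, b') adds the differences Δu_p(b, b') = u_p(b') - u_p(b)."""
    first, second = cs1(bid, utilities), cs1(new_bid, utilities)
    return {"cs1_b": first,
            "cs1_b_prime": second,
            "delta_self": second["utility_self"] - first["utility_self"],
            "delta_other": second["utility_other"] - first["utility_other"]}
```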
3.2 Issue Utility Level
This level concerns the utility functions per issue, u_p^i(b). Here, more detail is provided as to how the utility of the bid was calculated, by showing how the utilities per issue together determine the overall utility. For a single bid b, we select the bid and the n-tuple of the utility values per issue, for each party:

CS1(b) = ⟨b, ū_s(b), ū_o(b)⟩

Furthermore, additional explanation knowledge is needed to explain to the user how the utilities per issue are combined to get the overall utility. Because here we use an additive utility function, the additional content consists of the fact that the combination of the issue utilities is additive. When comparing two bids b and b′ at this level, we select the relevant content for each bid separately (CS1). Additionally, we provide content to show which issues have different values, and what the associated difference in utility is. We first define the difference in utility of issue i for party p, between bids b and b′:
Δu_p^i(b, b′) = u_p^i(b′) − u_p^i(b). The corresponding n-tuple of utility differences for each issue is Δū_p(b, b′). The content selection is then defined as follows:

CS2(b, b′) = ⟨CS1(b), CS1(b′), Δū_s(b, b′), Δū_o(b, b′)⟩

As this content consists of several n-tuples, a textual presentation does not seem suitable; a graphical presentation such as Fig. 1 seems more appropriate. In this figure, the utility differences are not presented as such; however, this information is used to highlight certain utility changes. In general, focus can be created by highlighting just the issues that have different values (DiffIssues), increased utility value (PosDiff), decreased utility value (NegDiff), or the same utility value (NoDiff). This could be made into a dynamic feature for the user to interact with. We therefore define the following sets of issues:

DiffIssues(b, b′) = {i | b_i ≠ b′_i}
PosDiff_p(b, b′) = {i | Δu_p^i(b, b′) > 0}
NegDiff_p(b, b′) = {i | Δu_p^i(b, b′) < 0}
NoDiff_p(b, b′) = {i | Δu_p^i(b, b′) = 0}

In addition to highlighting purposes, these sets can be used to select a subset of the content. For example, if only one issue differs between bids b and b′ (i.e., |DiffIssues(b, b′)| = 1), we might choose to only present the content related to that issue i ∈ DiffIssues(b, b′).
Fig. 1. Example presentation of explanation content for two bids, two parties, at the issue utility level. The utilities highlighted in white indicate a decrease in utility, the utility highlighted in orange indicates an increase in utility, when comparing b and b′
3.3 Level of Issue Weights and Evaluation Functions
This third level provides more detail as to why an issue has a certain utility value. This concerns the weights and evaluation functions of the issues, w_p^i and e_p^i, which together determine the utility assigned to an issue value. The content when evaluating a single bid b consists of the issue weights and the evaluation function score of each issue value:

CS1(b) = ⟨b, w̄_p, ē_p(b)⟩
Additionally, the explanation should contain the information that the weight and evaluation function score are multiplied to give the issue utility. Several kinds of comparisons can be made at this level. For example, comparisons can take place for one issue across two bids, for two issues within one bid, or for two issues across two bids. Due to the many possible comparisons, this information should only be presented to the user when requested. We define Δw_p(i, j) = w_p^j − w_p^i and Δe_p(b_i, b_j) = e_p^j(b_j) − e_p^i(b_i). The explainer can use these formulas together with CS1 to determine the content necessary for a requested comparison. An explanation may clarify why an issue i has Δu_s^i(b, b′) more utility for the user in b′ than in b. Alternatively, an explanation may clarify why, for bid b, issue i is assigned more utility than issue j. For example, the explanation may be: "Although issue j has Δw_p(i, j) more weight than issue i, issue value b_i has a higher evaluation score (Δe_p(b_i, b_j)) than issue value b_j. When multiplied, this results in a higher issue utility for value b_i of issue i."
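A sketch (with names of our own choosing) of how an explainer might assemble the content for a two-issue comparison within one bid, using these deltas:

```python
# Content for comparing two issues i and j within one bid at the
# weights/evaluation level: Δw_p(i, j) = w_p^j - w_p^i and
# Δe_p(b_i, b_j) = e_p^j(b_j) - e_p^i(b_i).

def issue_comparison_content(i, j, weights, eval_scores):
    """weights: {issue: w_p^i}; eval_scores: {issue: e_p^i(b_i)} for the bid."""
    return {
        "delta_weight": weights[j] - weights[i],
        "delta_evaluation": eval_scores[j] - eval_scores[i],
        "issue_utility_i": weights[i] * eval_scores[i],
        "issue_utility_j": weights[j] * eval_scores[j],
    }
```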
4 Conclusion
We presented an approach in which we analyzed the cooperation between user and NSS from the perspective of shared mental models. This analysis allowed us to determine the elements that should be part of the shared mental model between user and NSS. This analysis also served to determine possible causes of discrepancies between mental models. Following this analysis, we focused on resolving discrepancies for one part of the shared mental model: preferences. We proposed using explanation, as it is a means to increase user understanding of the reasoning of the NSS, which helps detect and resolve discrepancies. We then presented a framework that provides a formal description of content selection for explaining preferences. We focused on bilateral multi-issue negotiation; nevertheless, as the general idea of explaining preferences also applies to other types of negotiation, such as multilateral negotiation, our work should be extendable to such other types. Future work includes investigating how to present the explanation content (i.e., visualization techniques), a procedure to determine when to present which explanation to the user, implementation of the framework and user tests to validate it, and extending this work to other aspects of the shared mental model. Formalizing our approach of analyzing the problem domain in the light of shared mental models would allow it to be applied to other decision support domains. Also, as integration between explanation and argumentation research has been proposed in [8], it would be interesting to investigate how we might integrate our explanation framework, which focuses on the user-NSS interaction, with argumentation-based negotiation (e.g., [2]), which is used for the interaction between user and opponent. Acknowledgements. This research is supported by the Dutch Technology Foundation STW, applied science division of NWO and the Technology
Program of the Ministry of Economic Affairs. It is part of the Pocket Negotiator project with grant number VICI-project 08075.
References
1. Harvard Business Essentials: Negotiation. Harvard Business School Publishing Corporation, Boston (2003)
2. Amgoud, L., Dimopoulos, Y., Moraitis, P.: A unified and general framework for argumentation-based negotiation. In: Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 1–8. ACM Press, New York (2007)
3. Cannon-Bowers, J., Salas, E., Converse, S.: Shared mental models in expert decision making teams. In: Current Issues in Individual and Group Decision Making, pp. 221–246 (1993)
4. Hindriks, K., Jonker, C.: Creating human-machine synergy in negotiation support systems: towards the pocket negotiator. In: Proceedings of the 1st International Working Conference on Human Factors and Computational Models in Negotiation, pp. 47–54. ACM, New York (2008)
5. Lim, B., Klein, K.: Team mental models and team performance: A field study of the effects of team mental model similarity and accuracy. Journal of Organizational Behavior 27(4), 403–418 (2006)
6. Mastenbroek, W.: Negotiating as emotion management. Theory, Culture & Society 16(4) (1999)
7. Mathieu, E., Heffner, T., Goodwin, G., Salas, E., Cannon-Bowers, J.: The influence of shared mental models on team process and performance. The Journal of Applied Psychology 85(2), 273–283 (2000)
8. Moulin, B., Irandoust, H., Bélanger, M., Desbordes, G.: Explanation and argumentation capabilities: Towards the creation of more persuasive agents. Artificial Intelligence Review 17(3), 169–222 (2002)
9. Pu, P., Chen, L.: Trust-inspiring explanation interfaces for recommender systems. Knowledge-Based Systems 20(6), 542–556 (2007)
10. Roth-Berghofer, T., Richter, M.: On explanation. Künstliche Intelligenz 22(2), 5–7 (2008)
11. Thompson, L.: The mind and heart of the negotiator. Pearson Prentice Hall, NJ (2005)
12. Tintarev, N., Masthoff, J.: A survey of explanations in recommender systems. In: ICDE 2007 Workshop on Recommender Systems and Intelligent User Interfaces, pp. 801–810 (2007)
13. Ye, L., Johnson, P.: The impact of explanation facilities on user acceptance of expert systems advice. MIS Quarterly, 157–172 (1995)
A Computational Model of Habit Learning to Enable Ambient Support for Lifestyle Change Michel C.A. Klein, Nataliya Mogles, Jan Treur, and Arlette van Wissen VU University Amsterdam, Agent Systems Research Group De Boelelaan 1081, 1081HV Amsterdam, The Netherlands {michel.klein,nm.mogles,treur,wissen}@few.vu.nl
Abstract. Agent-based applications have the potential to assist humans in their lifestyle change, for instance eliminating addictive behaviours or adopting new healthy behaviours. In order to provide adequate support, agents should take into consideration the main mechanisms underlying behaviour formation and change. Within this process habits play a crucial role: automatic behaviours that are developed unconsciously and may persist without the presence of any goals. Inspired by elements from neurological literature, a computational model of habit formation and change was developed as a basis for support agents able to assist humans in lifestyle and behaviour change. Simulations are presented showing that the model exhibits realistic human-like behaviour. Keywords: habit learning, computational agent model, lifestyle change support.
1 Ambient Support for Lifestyle Change

In Western societies health policy is directed at the reduction of medical costs by switching more and more from the treatment of diseases resulting from an unhealthy lifestyle to the promotion of healthy lifestyle habits [18]. Lifestyle change may comprise eliminating bad habits, for example addictive behaviours (e.g., smoking, alcohol or drug use), and may simultaneously be directed at adopting new healthy habits, such as dieting and increasing physical activity (e.g., [13], [20], [24], [29], [30]). Considering the fact that lifestyle change requires intensive support, monitoring and supervision (e.g., [25]), the potential of smart ambient applications that assist humans in their daily life is substantial, as they allow for constant monitoring and instant feedback. In order to provide adequate support for humans, these support applications should be able to reason about the main determinants of human behaviour and the mechanisms underlying behaviour change. Apart from conscious goals and decisions, human behaviour is often based on habits – automatic behaviours that can be developed and maintained unconsciously. Habits may persist without the presence of any clear and definite goals and are very difficult to overcome. The model for habit learning and change presented in this paper can be used as a basis for ambient intelligence applications to support lifestyle change. Using the model, an ambient application can predict – given a certain context or cue – the behaviour of a person with already formed habits. In addition, the system can
exploit the model to reason about required changes in the context or goals that need more attention in order to form new habits or get rid of old ones. The proposed computational model of habit learning was inspired by elements from the neurological literature on habit learning (e.g., [1], [8], [10], [31]), and neural plasticity, such as Hebbian learning (e.g., [3], [11]), and adopts such adaptive mechanisms. The model has been formally specified in an executable manner, in order to conduct experiments and to allow the model to be embedded in an intelligent software agent that can support humans in their lifestyle and behaviour change. This paper is organized as follows. Section 2 addresses some background information on habit learning and change and the neural mechanisms underlying these processes. In Section 3 the description of the model is presented, and Section 4 demonstrates some simulation results. Automated verification of the model is presented in Section 5, and finally, Section 6 contains a discussion of the topic.
2 Background on Habit Learning Habits are learned dispositions to repeat past responses (cf. [31], p. 843), which by themselves were goal-driven. Once habits have been acquired, they are triggered by the cues in a certain context that co-occurred frequently with past performance, and which activate habitual responses directly, without the mediation of goals. These cues can be locations, objects, sequence of actions or presence of particular persons during or preceding the action performance. Habits formation corresponds thus to a contextresponse learning that is acquired slowly with experience ([31], p. 844). Behaviourists described habits as behaviour as creation of connections between stimulus and a particular response (e.g., [26], [28]). The cognitivist perspective on human behaviour suggests the existence of a central executive controller of behaviour (e.g., [21]). Nowadays, neurological literature describes the mechanisms underlying habit formation, which explain the behaviourists’ stimulus-response-based learning phenomenon, and introduces the concept of neural plasticity. Learning occurs due to the change of the connections strengths, for example, based on a principle known as Hebbian learning (e.g., [3], [11], [16]). It states that if two or more neurons are coactivated, the connections between these neurons strengthen. For example, repeated action in a certain context results to the gradual strengthening of the connection between the context representation and this particular response. These associations are difficult to override, though it is possible to influence habits (indirectly) via the activation of new goals. Strong goals that aim to direct one’s behaviour are associated with activation in the prefrontal cortex. This activation can inhibit the activation of subcortical structures (e.g., basal ganglia and cerebellum), associated with habitual behavior (e.g., [1], [8], [19], [27], [28]). Thus, when habits and goals are both present to guide action, they interact such that under some circumstances humans respond habitually and under other they exert regulatory control to inhibit the cued response. Although a habit is no longer goal-mediated, it can be regulated by post-hoc goal inference or cue control, for example, by 1) inhibiting the performance of responses, 2) drawing one’s attention to the undesired behavior, 3) associating the learned context with multiple responses or 4) altering exposure to the cues in the context [31].
Summarising, from neurological literature such as [1], [8], [10], [15], [19], [23], [27], [32], [36] the following characteristics of habit learning have been identified:
1. Under repeated occurrence of cues and under influence of goal-directed behaviour leading to satisfaction, habits are developed.
2. When a habit has developed, the behaviour will also occur without the presence of a goal, when the cue is present.
3. A developed habit will persist when the relevant goal is present, also in absence of the cue.
4. When a habit was developed based on a goal, and this goal is changed to another (competitive) goal, then the habit can change to a new habit.
These patterns have served as requirements for the design of the adaptive computational model described in Section 3. The patterns themselves will be formalized in Section 4 and checked for simulation traces of the computational model.
3 The Computational Model for Habit Learning

The structure of the computational model presented in this section is based on the literature described in the previous section. The model is at a cognitive level, which still reflects the underlying neurological concepts, but without taking into account too many neurological details. It uses temporal relationships to describe the mechanisms at work. An overview of the model is depicted in Fig. 1. It enables two alternative ways (paths) in which behaviour can be generated. The first is by the activation of a long term goal (e.g., lose weight), a short term goal corresponding to this long term goal (reduce consumption of high calorie food), generation of an intention (able to achieve the goals), and finally execution of this intended action. The second path goes directly via cue activation in a certain context to the activation of a particular intention that leads to the action execution. This path corresponds to the habit, which is learned over time: the connection between cue and intention changes dynamically after their simultaneous activation according to the Hebbian learning principle. The model also incorporates the influence of feeling on the chosen action: frequent execution of a particular action provides reinforcement through the feeling of satisfaction after the performed action, and this feeling leads in turn to higher activation of the intention related to this action. For example, a positive feeling of satisfaction resulting from the consumption of delicious cookies will lead to higher activation of the intention of eating these cookies. The model allows for multiple goals and intentions that result in behaviour. In principle each long term goal has connections with different strengths to short term goals, and the same holds for cues. The dynamical relationships below describe the model in semi-formal form and in a formal temporal relation notation in LEADSTO (cf. [6]). Within LEADSTO a dynamic property or temporal relation a →→ b denotes that when a state property a (or a conjunction thereof) occurs, then after a certain time delay, state property b will occur. Below, this delay will be taken as a uniform time step Δt. The first dynamic
relationship addresses the Hebbian learning principle applied to the connections between cues and intentions, as also described in ([11], p. 406).

LP1 Cue-intention connection adaptation
If    relevant cue C with level V1 occurs and intention I has value V2
and   the learning rate from cue C to intention I is η and the extinction rate from cue C to intention I is ζ
and   the connection strength between cue C and intention I is w1
then  after ∆t the connection from cue C to intention I will have strength w1 + (η*V1*V2*(1 − w1) − ζ*w1)*∆t
cue(C, V1) & intention(I, V2) & learning_rate(C, I, η) & extinction_rate(C, I, ζ) & connection_strength(C, I, w1) →→ connection_strength(C, I, w1 + (η*V1*V2*(1 − w1) − ζ*w1)*∆t)
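A minimal Python sketch of this Hebbian update rule (an illustration of LP1, not the authors' Matlab implementation; the example parameter values are ours):

```python
def hebbian_update(w, v_cue, v_intention, eta, zeta, dt):
    """One LP1 step: dw = (eta * V1 * V2 * (1 - w) - zeta * w) * dt."""
    return w + (eta * v_cue * v_intention * (1.0 - w) - zeta * w) * dt

# With the cue and intention fully active, the strength creeps towards
# the equilibrium eta / (eta + zeta) (here 0.5, with illustrative rates).
w = 0.1
for _ in range(500):
    w = hebbian_update(w, v_cue=1.0, v_intention=1.0, eta=0.01, zeta=0.01, dt=1.0)
print(round(w, 3))
```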
Fig. 1. Computational model for habit learning: overview
The following relationship specifies how activations of short term goals are determined based on long term goals and cues.

LP2 Short term goal from cue and long term goals
If    relevant cue C with level V0 occurs
and   long term goal LG1 has value V1 … and long term goal LGn has value Vn
and   the connection strength between cue C and short term goal SG is w0
and   the connection strength between long term goal LG1 and short term goal SG is w1, the connection strength between long term goal LG2 and short term goal SG is w2, … and the connection strength between long term goal LGn and short term goal SG is wn
and   short term goal SG1 has value V3
then  short term goal SG1 after ∆t will have level V3 + α(g(σ1, τ1, V0, V1, V2, …, Vn, w0, w1, w2, …, wn) – V3) ∆t

cue(C, V0) & ltgoal(LG1, V1) & … & ltgoal(LGn, Vn) & connection_strength(C, SG, w0) & connection_strength(LG1, SG, w1) & … & connection_strength(LGn, SG, wn) & stgoal(SG1, V3) →→ stgoal(SG1, V3 + α(g(σ1, τ1, V0, V1, V2, …, Vn, w0, w1, w2, …, wn) – V3) ∆t)
Here α is a speed parameter that defines the impact of long term goals and context cues upon the new activation value of the short term goal. Moreover, g is a combination function for which various choices are possible; a logistic threshold function has been chosen:

g(σ, τ, V0, …, Vn, w0, …, wn) = th(σ, τ, w0V0 + … + wnVn)   with   th(σ, τ, V) = 1/(1 + e^(−4σ(V − τ)))

Parameters σ and τ define the steepness and threshold values of the function. The threshold function ensures that the value of the goal is most often either close to zero or close to one. Only when the input to the threshold function is close to the threshold value itself are the values of the goal somewhere between 0 and 1. In all subsequent formulae the combination function g is always based on a threshold function of this form. The third relationship of the model describes how intentions are determined. Intentions depend on short term goals and cues, and on the feelings of satisfaction for both short and long term goals. Moreover, different intentions also affect each other by a form of mutual inhibition. Note that for the sake of simplicity in LP3 only two long term and short term goals are considered.
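A short Python sketch of this combination function (reconstructed from the formula above; the factor 4 in the exponent follows the text as printed):

```python
import math

def th(sigma, tau, v):
    """Logistic threshold: th(sigma, tau, V) = 1 / (1 + exp(-4*sigma*(V - tau)))."""
    return 1.0 / (1.0 + math.exp(-4.0 * sigma * (v - tau)))

def g(sigma, tau, values, weights):
    """Combination function g: threshold applied to the weighted sum of the inputs."""
    return th(sigma, tau, sum(w * v for w, v in zip(weights, values)))
```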
LP3 Intention dynamics
If    short term goal satisfaction SGSAT1 has value V1
and   long term goal satisfaction LGSAT1 has value V2
and   short term goal satisfaction SGSAT2 has value V3
and   long term goal satisfaction LGSAT2 has value V4
and   relevant short term goal SG1 has value V5
and   relevant cue C has value V6
and   intention I1 that corresponds to these goals has value V7
and   intention In has value V8
and   the connection strength between intention I1 and intention In is w1
and   the connection strength between short term goal SG1 and intention I1 is w3
and   the connection strength between cue C and intention I1 is w4
and   the connection strength between short term goal satisfaction SGSAT1 and intention I1 is w5
and   the connection strength between long term goal satisfaction LGSAT1 and intention I1 is w6
and   the connection strength between short term goal satisfaction SGSAT2 and intention I1 is w7
and   the connection strength between long term goal satisfaction LGSAT2 and intention I1 is w8
then  intention I1 that corresponds to these goals after ∆t will have value V7 + β (g(σ2, τ2, V8, V9, V1, V2, V3, V4, V5, V6, w3, w4, w5, w6, w7, w8, w1) – V7) ∆t

stg_satisfaction(SGSAT1, V1) & ltg_satisfaction(LGSAT1, V2) & stg_satisfaction(SGSAT2, V3) & ltg_satisfaction(LGSAT2, V4) & stgoal(SG1, V5) & cue(C, V6) & intention(I1, V7) & intention(In, V8) & connection_strength(I1, In, w1) & connection_strength(SG1, I1, w3) & connection_strength(C, I1, w4) & connection_strength(SGSAT1, I1, w5) & connection_strength(LGSAT1, I1, w6) & connection_strength(SGSAT2, I1, w7) & connection_strength(LGSAT2, I1, w8) →→ intention(I1, V7 + β (g(σ2, τ2, V8, V9, V1, V2, V3, V4, V5, V6, w3, w4, w5, w6, w7, w8, w1) – V7) ∆t)
Here β is a parameter that defines the impact of inhibition by other intentions and of the feeling of satisfaction from the performed actions upon the intention to perform new actions. Weight w1 is negative here, as it defines inhibition from the alternative competing intention(s). It is assumed that different intentions are conflicting; in other words, one cannot perform two behaviours simultaneously to satisfy different goals. For this reason the weights between the intentions are always negative, or inhibitory. The step from intention to behaviour has been kept simple:

LP4 From intention to behaviour
If    intention I with level V occurs and V > threshold
then  behaviour with level V will occur

intention(I, V) & V > threshold →→ behaviour(B, V)
The feeling of satisfaction for a long term goal was modelled as follows:

LP5 Long term goal satisfaction
If    intention I1 has value V2
and   behaviour B1 with level V1 occurs and long term goal LG corresponding to this behaviour has value V3
and   long term goal satisfaction LTSAT has value V4
and   the connection strength from behaviour B1 to the long term goal satisfaction LTSAT is w1
and   the connection strength from intention I1 to long term goal satisfaction LTSAT is w2
then  long term goal satisfaction LTSAT after ∆t will be V4 + θ (f(σ3, τ3, V3, V1, V2, w1, w2) – V4) ∆t

behaviour(B1, V1) & intention(I1, V2) & ltgoal(LG, V3) & ltg_satisfaction(LTSAT, V4) & connection_strength(B1, LTSAT, w1) & connection_strength(I1, LTSAT, w2) →→ ltg_satisfaction(LTSAT, V4 + θ (f(σ3, τ3, V3, V1, V2, w1, w2) – V4) ∆t)
Here parameter θ defines the impact of a long term goal, behaviours and intentions upon the long term goal satisfaction. The feeling of satisfaction for a short term goal was modelled in a similar manner:

LP6 Short term goal satisfaction
If    intention I1 has value V2
and   behaviour B1 with level V1 occurs and short term goal SG corresponding to this behaviour has value V3
and   short term goal satisfaction STSAT has value V4
and   the connection strength from behaviour B1 to the short term goal satisfaction STSAT is w1
and   the connection strength from intention I1 to short term goal satisfaction STSAT is w2
then  short term goal satisfaction STSAT after ∆t will be V4 + θ (f(σ4, τ4, V3, V1, V2, w1, w2) – V4) ∆t

behaviour(B1, V1) & intention(I1, V2) & stgoal(SG, V3) & stg_satisfaction(STSAT, V4) & connection_strength(B1, STSAT, w1) & connection_strength(I1, STSAT, w2) →→ stg_satisfaction(STSAT, V4 + θ (f(σ4, τ4, V3, V1, V2, w1, w2) – V4) ∆t)
4 Simulation and Verification

The cognitive computational model described in the previous section was implemented in the Matlab environment. A number of simulations of 50 and 200 time steps have been performed. For the sake of simplicity only two initial long term goals and the corresponding behaviours were assumed. In this section four example simulation runs of 50 time steps are presented. These simulations illustrate the ability of the computational model to exhibit important patterns of habit learning and change. Table 1 shows the values used for the learning and extinction rates, steepness and threshold values, speed factors, and connection weights (note that weight values for the interaction between two options are symmetric). In order to investigate whether the computational model indeed learns and behaves according to what is expected, some logical properties (requirements) have been identified, formalized, and verified against the simulation traces of the model (see also the characteristics informally described at the end of Section 2). In this section, first the language used to express such properties is briefly introduced, followed by the specification of the actual properties, the presentation of an example trace illustrating the pattern, and the result of their verification.

Table 1. Parameter and connection weight values used

α    η     ζ    σ1   σ2   τ1   β    σ3   τ2   thr  θ    τ3   σ4   τ4
0.5  0.01  0.8  15   0.9  0.8  20   0.5  0.5  0.8  15   0.2  15   0.6

connection (weight): cue-intention1 initially (0.1), cue-intention2 initially (0.1), cue-stgoal1 (0.1), cue-stgoal2 (0.1), ltgoal1-stgoal1 (0.9), ltgoal1-stgoal2 (0.2), ltgoal2-stgoal1 (0.2), ltgoal2-stgoal2 (0.9), ltsat1-intention1 (0.9), ltsat1-intention2 (0.2), stsat1-intention1 (0.9), stsat1-intention2 (0.2), intention1-behaviour1 (1), intention2-behaviour2 (1), intention1-intention2 (-0.9), intention2-intention1 (-0.9), intention1-ltsat1 (0.9), behaviour1-ltsat1 (0.9), intention2-ltsat1 (0.1), intention2-stsat1 (0.1), intention1-ltsat2 (0.1), intention1-stsat2 (0.1), intention1-stsat1 (0.9), behaviour1-stsat1 (0.9), intention2-ltsat1 (0.2), behaviour2-ltsat1 (0.1), intention2-stsat1 (0.2), behaviour2-stsat1 (0.1), behaviour1-ltsat2 (0.1), behaviour1-stsat2 (0.1), behaviour2-ltsat2 (0.9), behaviour2-stsat2 (0.9)
Formal specification of desired properties of the computational model enables automatic verification of them against simulation traces. This was performed using the hybrid language TTL and its software environment [5]. TTL is built on atoms referring to states of the world, time points and traces, i.e. trajectories of states over time. Dynamic properties are temporal statements formulated with respect to traces in the following manner. Given a trace γ, the state in γ at time point t is denoted by state(γ, t). These states are related to state properties via the infix predicate |=, where state(γ, t) |= p denotes that state property p holds in trace γ at time t. Based on these
statements, dynamic properties are formulated in a sorted first-order predicate logic, using quantifiers over time and traces and the usual first-order logical connectives such as ¬, ∧, ∨, ⇒, ∀, ∃. For more details on TTL, see [5]. Each of the three subsections addresses one scenario. In the figures that demonstrate the simulation results, time is depicted on the horizontal axis and the activation values of the variables of interest are depicted on the vertical axis.¹

4.1 Habit Formation

In this simulation a specific behaviour is generated by a strong long term goal related to this behaviour in the presence of a strong cue. As a result, even after a decrease of the value of the goal corresponding to this behaviour after time point 24, the behaviour persists up to the end of the simulation; see Fig. 2. The value of the second long term goal is kept low during the whole simulation; therefore the second type of behaviour that corresponds to this goal does not come to expression. To verify this pattern formally, it first has to be checked whether a specific behaviour results from the presence of a high-level goal and a cue:

P0: Long-term goal and cue leads to behaviour
If a cue and a high-level goal are present for a certain time duration MIN_DURATION, then at some later time the corresponding behaviour will be present.
∀γ:TRACE, t:TIME [ habit_learning_phase(γ:TRACE, t:TIME, MIN_DURATION, ACT_VALUE2) ⇒ ∃t2:TIME > t, R3:REAL state(γ, t2) |= has_value(beh1, R3) & R3 > ACT_VALUE2 ]
Here (and in the other properties below) the following abbreviation is used:
habit_learning_phase(γ:TRACE, t:TIME, MIN_DURATION:INTEGER, ACT_VALUE:REAL) ≡ ∀t2:TIME > t & t2 < t + MIN_DURATION [ ∃R1:REAL state(γ, t2) |= has_value(ltg1, R1) & R1 > ACT_VALUE2 & ∃R2:REAL state(γ, t2) |= has_value(cue1, R2) & R2 > ACT_VALUE2 ]
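To illustrate how such properties can be checked against a simulation trace outside the TTL environment, a small Python sketch (ours; the trace representation is an assumption) is given below. A trace is taken to be a list of states, each state a dict from variable names to activation values.

```python
# Illustrative check of habit_learning_phase and property P0 on a trace,
# where trace[t] is a dict such as {"ltg1": 0.9, "cue1": 1.0, "beh1": 0.1, ...}.

def habit_learning_phase(trace, t, min_duration, act_value):
    """Both ltg1 and cue1 stay above act_value during [t, t + min_duration)."""
    window = trace[t:t + min_duration]
    return len(window) == min_duration and all(
        s["ltg1"] > act_value and s["cue1"] > act_value for s in window)

def p0_holds(trace, min_duration=2, act_value=0.7):
    """P0: every learning phase is eventually followed by high activation of beh1."""
    for t in range(len(trace)):
        if habit_learning_phase(trace, t, min_duration, act_value):
            if not any(s["beh1"] > act_value for s in trace[t + 1:]):
                return False
    return True
```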
Fig. 2. Habit formation and persistence
50
Fig. 3. Habit formation and cue removal
Note that gradual values for intentions, cues and goals are assumed, which represent the strength of the presence of these variables as perceived by the subject.
Property P0 corresponds to characteristic pattern 1 of habits as defined at the end of Section 2, and has been automatically checked and proven to be true for the following values of the constants: MIN_DUR = 2, MAX_LEVEL_P1 = 0.5, ACT_VALUE1 = 0.5, ACT_VALUE2 = 0.7. All properties described in the remainder of this section have been automatically verified and found satisfied for these values. The pattern of habit formation itself (characteristic pattern 2) was specified as follows:

P1: Habit persistence
If a cue and a high-level goal have been present for some time period MIN_DURATION, the behaviour will exist in the presence of a cue even if the goal is no longer present.
∀γ:TRACE, t:TIME [ habit_learning_phase(γ:TRACE, t:TIME, MIN_DURATION, ACT_VALUE2) & ∀t2:TIME > t + MIN_DURATION, R3: REAL [ state(γ, t2) |= has_value(ltg1, R3) ⇒ R3 < ACT_VALUE1 ]] ⇒ [∀t3:TIME > t2, R4: REAL
state(γ, t3) |= has_value(cue1, R4) & R4 > ACT_VALUE2 ⇒
∃t4:TIME>t3, R5:REAL state(γ, t4) |= has_value(beh1, R5) & R5 > ACT_VALUE2 ]]
When in the scenario in Fig. 3, after time point 26, the value of the cue is substantially decreased, habitual behaviour is not performed anymore from time point 31. As expected, and shown in Fig. 3, the second behaviour (‘behaviour 2’) does not occur. Formally, the illustrated characteristic is specified as follows.

P2: Habit and cue removal
If a habit is formed and the cue and the goal are no longer present, the behaviour will after some time cease to exist.
∀γ:TRACE, t:TIME [ [habit_learning_phase(γ, t, MIN_DURATION, ACT_VALUE2) & ∀t2:TIME > t + MIN_DURATION, R1, R2: REAL [ state(γ, t2) |= has_value(cue1, R1) ⇒ R1 < ACT_VALUE1 & state(γ, t2) |= has_value(ltg1, R2) ⇒ R2 < ACT_VALUE1 ] ⇒ ∃t3:TIME > t2, R3: REAL state(γ, t3) |= has_value(beh1, R3) & R3 < ACT_VALUE1 ] ]
4.2 Influence of Long Term Goal on Behaviour

This scenario demonstrates how behaviour is influenced by goals in the absence of the learned cue; see Fig. 4. In the beginning habitual behaviour is formed: a strong cue is present and a behaviour pattern that coincides with the first long term goal. The value of the goal remains the same during the whole run, but the cue almost disappears after time point 24. The low value of the cue does not prevent the behaviour from occurring, due to the strong influence of the long term goal. This corresponds to characteristic habit pattern 3 from Section 2. This was specified as follows.

P3: Habit and cue removal in presence of strong goal
If a habit is formed, the behaviour will still exist if the cue is not present any more and the high-level goal is present.
∀γ:TRACE, t:TIME
[ [ habit_learning_phase(γ, t, MIN_DURATION, ACT_VALUE2) &
    ∀t2:TIME > t + MIN_DURATION, R3:REAL
    [ state(γ, t2) |= has_value(cue1, R3) ⇒ R3 < ACT_VALUE1 ] ]
  ⇒ [ ∀t3:TIME > t2, R4:REAL state(γ, t3) |= has_value(ltg1, R4) & R4 > ACT_VALUE2
      ⇒ ∃t4:TIME > t3, R5:REAL state(γ, t4) |= has_value(beh1, R5) & R5 > ACT_VALUE2 ] ]
(Figs. 4 and 5 plot behavior 1, cue, behavior 2, long term goal 1 and, in Fig. 5, long term goal 2 over time.)
Fig. 4. Influence of the goal on behaviour in the absence of the original cue
Fig. 5. Behaviours resulting from the goal change
4.3 The Effect of Goal Change
In this simulation the result of the switch from one goal to another is demonstrated in the presence of a strong cue. As shown in Fig. 5, the habitual behaviour (‘behaviour 1’ in the figure) does not disappear immediately after adopting a new goal that conflicts with the previous one. It takes a little time before the new behaviour pattern is performed after the new goal has been adopted. This simulation demonstrates how the old, undesired habitual behaviour can be substituted with the new positive behaviour. The formal specification of this pattern is:

P4: New goal results in new habit
If a habit is formed for long term goal ltg1, which then disappears, a new behaviour will be developed if another long term goal ltg2 is present.
∀γ:TRACE, t:TIME
[ [ habit_learning_phase(γ, t, MIN_DURATION, ACT_VALUE2) &
    ∀t2 > t + MIN_DURATION, R1, R2:REAL
    [ state(γ, t2) |= has_value(ltg1, R1) ⇒ R1 < ACT_VALUE1 ] &
    [ state(γ, t2) |= has_value(ltg2, R2) ⇒ R2 > ACT_VALUE2 ] ]
  ⇒ ∃t3:TIME > t2, R3, R4:REAL
    [ state(γ, t3) |= has_value(beh1, R3) & R3 < ACT_VALUE1 &
      state(γ, t3) |= has_value(beh2, R4) & R4 > ACT_VALUE2 ] ]
Fig. 5 also shows the effects of P1, which demonstrate that the new behaviour results in a habit after some amount of time: the behaviour persists even after the corresponding long term goal is no longer present. Combined, P1 and P4 account for characteristic habit pattern 4.
5 Discussion and Conclusions
The cognitive computational model presented above can form the basis of an intelligent ambient support application. To this end, an agent-based approach for creating ambient intelligence applications can be used [4]. Within such a framework, the ambient system consists of components, i.e., agents, that have context awareness of human behaviours and states and (re)act on these accordingly. For this purpose, the behaviour of the subject of the system (the person taken care of) that is relevant to the support
provided should be explicitly described, e.g., via a computational model. If this is the case, an ambient agent can (re)act by undertaking, in a knowledgeable manner, actions that improve the human's wellbeing and performance. Reasoning using an explicit model of the behaviour of a process is called model-based reasoning [22]. Basically, there are two ways in which model-based reasoning on habits can be used within an intelligent support application. First, predictions can be made of what will happen given certain cues/contexts, long term goals and short term goals. For example, if the system has identified a specific behaviour – such as the eating of cookies at work – several times in the past, and it has knowledge about the short-term and long-term goals, it can predict whether a person in the work context will again eat a cookie. These prediction capabilities allow a support application to take action before an undesired habit is actually enacted. Second, the model can be used to analyse the causes of the undesired behaviour and the effect of interventions on the behaviour of a person [9]. Causes of behaviour can be determined by backward, abductive reasoning. For example, if an undesired behaviour is taking place, the presented computational model can be used to find hypothetical causes for this behaviour, for example a short term goal that leads to the intentions for the undesired behaviour. Symmetrically, forward deductive model-based reasoning derives the effect of interventions on the behaviour, for example the effect of a different or more important long term goal after some time. This can be used by the ambient intelligence application to explicitly change the situation, e.g., by removing cues, generating additional intentions for long term goals leading to different behaviour, or suggesting actions to create new (more desired) habits. Existing models of habit learning either take the perspective of behaviourism, which does not address the internal mechanisms underlying habit development (e.g., [7], [17]), or describe habit learning in a very detailed manner at the lowest, neurological level (e.g., [2], [7], [14]). The proposed computational model is at a cognitive level, between the neurological and the behavioural level. The proposed way of modeling offers a manner to exploit principles from the neurological literature within the computational modeling area, by lifting neurological knowledge to a mental (cognitive/affective) level. In order to successfully model more complex and human-like behaviour, for example incorporating mutual cognitive/affective interactions and adaptive behaviour, the modeler has to consider such numerical modeling techniques; see also [23]. In future work, the model will be deployed on actual data and used to improve habit performance. Also, the model could be improved by taking into account the environment in which a person is embedded, which is currently limited to perceiving cues, but preferably also incorporates socio-environmental factors shown to play a role in habit formation (e.g., [12]).
References 1. Ashby, F.G., Turner, B.O., Horvitz, J.C.: Cortical and basal ganglia contributions to habit learning and automaticity. Trends Cogn. Science 14, 208–215 (2010) 2. Baldassarre, G.: A modular neural-network model of the basal ganglia’s role in learning and selecting motor behaviours. Cogn. Systems Research 3, 5–13 (2002)
3. Bi, G., Poo, M.: Synaptic modification by correlated activity: Hebb’s postulate revisited. Annu. Rev. Neurosci. 24, 139–166 (2001) 4. Bosse, T., Hoogendoorn, M., Klein, M.C.A., Treur, J.: An Agent-Based Generic Model for Human-Like Ambience. In: Mühlhäuser, M., Ferscha, A., Aitenbichler, E. (eds.) Constructing Ambient Intelligence: AmI 2007 Workshops Proceedings. Communications in Computer and Information Science (CCIS), vol. 11, pp. 93–103. Springer, Heidelberg (2008) 5. Bosse, T., Jonker, C.M., Meij, L., van der Sharpanskykh, A., Treur, J.: Specification and Verification of Dynamics in Agent Models. International Journal of Cooperative Information Systems 18, 167–193 (2009) 6. Bosse, T., Jonker, C.M., Meij, L., van der Treur, J.: A Language and Environment for Analysis of Dynamics by Simulation. International Journal of Artificial Intelligence Tools 16, 435–464 (2007) 7. Cohen, M.X., Frank, M.J.: Neurocomputational models of basal ganglia function in learning, memory and choice. Behavioural Brain Research 199, 141–156 (2009) 8. De Wit, S., et al.: Differential engagement of the ventromedial prefrontal cortex by goaldirected and habitual behaviour toward food pictures in humans. J. Neurosci. 29, 11330– 11338 (2009) 9. Duell, R., Hoogendoorn, M., Klein, M.C.A., Treur, J.: An Ambient Intelligent Agent Model using Controlled Model-Based Reasoning to Determine Causes and Remedies for Monitored Problems. In: Li, Y., Pasi, G., Zhang, C., Cercone, N., Cao, L. (eds.) Proceedings of the Second International Workshop on Human Aspects in Ambient Intelligence, HAI 2008, pp. 489–494. IEEE Computer Society Press, Los Alamitos (2008) 10. Everitt, B.J., Belin, D., Economidou, D., Pelloux, Y., Dalley, J.W., Robbins, T.W.: Neural mechanisms underlying the vulnerability to develop compulsive drug-seeking habits and addiction. Philos. Trans. R Soc. Lond. B Biol. Sci. 363, 3125–3135 (2008) 11. Gerstner, W., Kistler, W.M.: Mathematical formulations of Hebbian learning. Biol. Cybern. 87, 404–415 (2002) 12. Giles-Corti, B., Donovan, R.J.: The relative influence of individual, social and physical environment determinants of physical activity. Social Science & Medicine 54, 1793–1812 (2002) 13. Gollwitzer, P.M., Sheeran, P.: Implementation intentions and goal achievement: A metaanalysis of effects and processes. Advances in Exp. Social Psychology 38, 69–120 (2006) 14. Gurney, K., Prescott, T.J., Wickens, J.R., Redgrave, P.: Computational models of the basal ganglia: from robots to membranes. Trends Neurosci. 27, 453–459 (2004) 15. Haigh, K.Z., Kiff, L.M., Myers, J., Guralnik, V., Geib, W.C., Phelps, J., Wagner, T.: The Independent LifeStyle Assistant (I.L.S.A.): AI Lessons Learned. In: Proc. of the 16th Conference on Innovative Applications of AI, San Jose, California, pp. 852–857 (2004) 16. Lymberis, A., De Rossi, D.E.: Wearable eHealth Systems for Personalised Health Management: State of the Art and Future Challenges. IOS Press, Amsterdam (2004) 17. Machado, A.: Learning the temporal Dynamics of Behaviour. Psychological Review 104, 241–265 (1997) 18. Miller, E.K., Cohen, J.D.: An Integrative Theory of prefrontal Cortex Function. Annual Review of Neuroscience 24, 167–202 (2001) 19. Mowrer, O.H.: Learning Theory and the Symbolic Processes. Wiley, New York (1960) 20. Neal, D.T., Wood, W.: Automaticity in situ: Direct context cuing of habits in daily life. In: Morsella, E., Bargh, J.A., Gollwiter, P. (eds.) Oxford Handbook of Human Action, pp. 442–457. Oxford University Press, New York (2009)
21. Oulette, E.A., Wood, W.: Habit and Intention in everyday life: the Multiple Processes by Which Past Predicts Future Behaviour. Psychological Bulletin 124, 54–74 (1998) 22. Pearl, J., Verma, T.S.: A theory of inferred causation. In: Principles of Knowledge Representation and Reasoning: Proc. Second Int. Conference (KR 1991), pp. 441–452 (1991) 23. Port, R., van Gelder, T.J.: Mind as Motion: Explorations in the Dynamics of Cognition. MIT Press, Cambridge (1995) 24. Quinn, J.A., Pascoe, A., Wood, W., Neal, D.T.: Can’t control yourself? Monitor those bad habits. Pers. Soc. Psychol. Bull. 36, 499–511 (2010) 25. Skinner, B.F.: The Behaviour of Organisms: An Experimental Analysis. AppletonCentury, New York (1938) 26. Tang, C., Pawlak, A.P., Prokopenko, V., West, M.O.: Changes in activity of the striatum during formation of a motor habit. Eur. J. Neurosci. 25, 1212–1227 (2007) 27. Watson, J.B.: Psychology as Behaviourist Views It. Psy. Review 20, 158–177 (1913) 28. Webb, T.L., Sheeran, P., Luszczynska, A.: Planning to break unwanted habits: Habit strength moderates implementation intention effects on behaviour change. British Journal of Social Psychology 48, 507–523 (2009) 29. Webb, T.L., Sheeran, P.: Mechanisms of implementation intention effects: The role of goal intentions, self-efficacy, and accessibility of plan components. British Journal of Social Psychology 47, 373–395 (2008) 30. Wood, W., Neal, D.T.: A New Look at Habits and the Habit-Goal Interface. Psychological Review 114, 843–863 (2007) 31. Yin, H.H., Knowlton, B.J.: The role of Basal ganglia in habit formation. Nature Reviews Neuroscience 7, 464–476 (2006)
An Intelligent Method to Extract Characters in Color Document with Highlight Regions* Chun-Ming Tsai Department of Computer Science, Taipei Municipal University of Education, No. 1, Ai-Kuo W. Road, Taipei 100, Taiwan
[email protected],
[email protected] Abstract. Most conventional character extraction methods include binarization (background determination), region segmentation, and region identification. Incorrect binarization results adversely influence the segmentation and identification results. This is a problem when color documents are printed with regions of different background colors, as the binarization will not produce effective thresholding results and the subsequent segmentation and identification steps will not work properly. Conventional region segmentation methods are time-consuming for large document images. Conventional region identification methods are applied to the preceding segmentation results, using a bottom-up method. This study presents an intelligent method to solve these problems, which integrates background determination, region segmentation, and region identification to extract characters in color documents with highlight regions. The results demonstrate that the proposed method is more effective and efficient than other methods in terms of binarization results, extraction results, and computational performance. Keywords: Intelligent methods, character extraction, highlight regions.
1 Introduction Digital libraries (DL) are emerging as a supplement to traditional libraries [1]. The two main goals of DLs are (1) providing digital document images for users who wish to look up the original documents quickly, and (2) providing the digital contents of the documents for searching, reading, and printing easily. Color document image analysis (CDIA) [2], [3], [4] has been studied to convert paper contents to digital contents. The CDIA method can help traditional libraries to become digital libraries. The conventional methods for analyzing color document images have three basic steps: binarization (background determination), region segmentation, and region classification. Binarization includes global, region-based, local, and hybrid methods. Global thresholding methods find a threshold from the entire image information. The Otsu [5] method is a representative work. These global methods neglect spatial relationships among pixels [6] and scan the entire image. Global methods cannot threshold an image with low, middle, gradually decaying, or highlight background distribution. Region-based thresholding divides the image into k × k regions and *
This paper is supported by the National Science Council, R.O.C., under Grants NSC 99-2221E-133-002-.
decides how to threshold each region. Chou et al. [7] proposed learning rules to threshold document images. They divided a document image into 3 × 3 regions, and each region is thresholded by decision rules. How to determine the k value intelligently is critical to this method. Local thresholding methods such as Niblack [8] determine an individual threshold for each pixel according to the intensities of the pixel itself and its neighbors. Local methods are very time-consuming when processing large color documents. Hybrid thresholding methods, as used by Tseng and Lee [9], combine both global and local methods. They proposed a document image binarization method based on two-layer block extraction: first-layer dark foregrounds are extracted by locating connected components, and second-layer background intensities are then determined and removed. Hybrid methods give better experimental results, but are slower than global methods.

Region segmentation includes both top-down and bottom-up methods. Top-down methods segment the whole image into small blocks. The method of recursive X-Y cutting (RXYC) [10] is a representative work. This method recursively makes cuts on the valleys of the projection profiles, obtained by scanning the image in horizontal and vertical directions alternately. An image requires at least two scans to determine threshold cut points. Bottom-up approaches start from document image pixels and iteratively group them into larger regions. Fletcher and Kasturi [11] used connected component (CC) labeling to extract graphic and text objects from document images. The advantage of this approach is that it can extract italic characters. When document images include complex objects, such as non-rectangular images and various character font sizes, the approach segments images correctly. However, the time complexity of CC-based approaches is higher than that of top-down approaches. If documents contain large graphics, many small CCs will be produced after background determination and CC labeling, and time is needed to merge them into a large graphic.

Most traditional region classification methods are bottom-up methods. Lee and Ryu [12] first used CC labeling to extract bounding boxes from input binary images. Subsequently, horizontal and vertical projection methods are used to estimate the periodicity of each region. If a region is not a single periodical region, it is split into two regions, repeating until the region becomes a single periodical region. After confirmation of the segmentation results, heuristic rules are used to classify the segmented maximal homogeneous regions as texts, images, tables, or ruling lines. Lee et al. [13] extracted the bounding boxes of the CCs and then used the projection profile to determine thirty-six region classification rules, based on geometric characteristics (and forty-four threshold values) of layout objects, to facilitate region identification.

Most traditional methods first use a binarization method to threshold the document images. Then, top-down or bottom-up segmentation methods are used to extract CCs, and the projection profile is used to group the CCs. Finally, features are extracted, and knowledge or heuristic rules with set parameters are used to identify the grouped objects as non-text or text objects. However, these methods are dependent on the effectiveness of the binarization results, so the segmentation and identification results can be adversely influenced. Further, the binarization, region segmentation, and region identification methods operate independently; none helps another. In particular, when color documents are printed with different regions having different background colors, these methods cannot produce effective results, and the subsequent segmentation and identification steps will not work properly. The conventional region segmentation
methods are time-consuming: an A4 color document image scanned at 300 dpi in true color is approximately 25 MBytes, so such processing will not be time-efficient. In addition, the conventional region identification methods are applied to the preceding segmentation results; these methods are of a bottom-up type, and use the document model created from the top-down information. Thus, this study presents an intelligent method designed to solve the above-mentioned problems, which integrates background determination, region segmentation, and region identification to extract characters in color documents having different background colors.
2 Integrated Method for Character Extraction

The proposed method includes the following steps. First, the RGB color space is transformed to gray level. Second, border background determination is applied to obtain the background color at the border of the document image. Third, the restricted projection method is used to extract the regions in the color document image. Fourth, the preceding segmentation information and boundary analysis are applied to identify the characters and highlight regions in the extracted regions. Fifth, border background determination is applied again to the highlight regions; the highlight regions are then converted into binary images, and the characters in the highlight regions are extracted.

2.1 Border Background Determination

The layout of an article document is arranged by the background space surrounding the printed regions. That is, the foreground objects, including text and graphics, are placed at the center of the document image. If the document border can be effectively determined, the document image background distribution can be found. Then, the foreground and the background in the document image can be separated using the background distribution as a reference. Herein, it is assumed that the background distribution of the color document is a unimodal Gaussian distribution or changes gradually. The algorithm of the border background determination is as follows:
1. The color document image is transformed into the luminance space.
2. The scan line color differentiation algorithm [6] is applied to obtain the left, right, top, and bottom boundaries of the color document.
3. The histograms of the top, bottom, left, and right boundaries are combined to form the total histogram of the border background.
4. The total border histogram is smoothed with a Gaussian smoothing filter [2] to remove unreliable peaks.
5. Major peaks are determined using the average differences approach [2].
6. Small peaks are removed and the significant peaks are retained [2].
7. If the peak number is one, the background distribution is bounded by the low and high valleys (VLow, VHigh), obtained using the modified triangle method [2]. The low bound (THRLow) and the high bound (THRHigh) of the background distribution are determined by the following equations:
THRLow = VLow − σ × k    (1)
THRHigh = VHigh + σ × k    (2)
Note: σ is the standard deviation computed from the border pixels, and k is a background constant used to adjust the low bound and the high bound of the background distribution. Herein, k is set to 2.0 for the purpose of our experiments.
8. If the peak number is two, the peak with the maximum population is the background distribution. The low bound (THRLow) and the high bound (THRHigh) of the background distribution are determined using the same equations.
9. If the peak number is greater than two, gradually decaying detection [6] is applied to detect whether the background is gradually changing.
10. If the background is not gradually changing, the peak with the maximum population is the background distribution. The low bound (THRLow) and high bound (THRHigh) of the background distribution are determined using the same equations.
11. If the background is gradually changing, the thresholding algorithm [6] for gradually changing backgrounds is applied.

2.2 Region Segmentation

The object of region segmentation is to segment the region of interest (ROI) automatically. Usually, the extracted regions are displayed as either a rectangular bounding box or a polygon [3]. A simple method to segment the ROI is to locate its left-top and right-bottom coordinates. Herein, a restricted background projection (RBP) algorithm is used to extract the ROI. This RBP algorithm includes background projection in the horizontal and vertical directions. In each direction, the background pixels in a scanning line are checked until the first foreground pixel is reached; the checked background pixels are counted to obtain the projection profile. A straight-scan is used to extract the disjoint regions, where a straight-scan is defined as a scanning line whose pixels all belong to the background distribution. The RBP algorithm is described in more detail as follows:
1. The background distribution of the color document is determined.
2. For the horizontal restricted background projection (HRBP), the leftward restricted background projection is used to obtain the left background projection profile (LBPP). The LBPP is determined by checking the pixels in the scanning line from left to right until the first foreground pixel is found; the location of the preceding background pixel is recorded in the LBPP.
3. The rightward restricted background projection is used to obtain the right background projection profile (RBPP). It is determined by checking the pixels in the scanning line from the right towards the corresponding LBPP location until the first foreground pixel is found; the location of the preceding background pixel is the RBPP.
4. After the background projections are completed, the vertical disjoint regions can be extracted by locating the top, left, and bottom coordinates from the LBPP and the right coordinates from the RBPP. The consecutive non-straight-scan lines from top to bottom between two straight-scan lines are grouped as a strip region. The straight-scan lines before and after the consecutive non-straight-scan lines give the top and bottom coordinates. The minimum x coordinate among the
consecutive non-straight-scan lines of the LBPP is the left coordinate, and the maximum x coordinate from top to bottom of the RBPP is the right coordinate.
5. For the vertical restricted background projection (VRBP), downward and upward restricted background projections are applied to each strip region to obtain the top background projection profile (TBPP) and the bottom background projection profile (BBPP), similar to the leftward and rightward projections. After the VRBP is applied, many regions can be extracted. The left, right, and top coordinates of each region are obtained from the TBPP, and the bottom coordinates from the BBPP.

2.3 Region Identification

Background and foreground regions are extracted by this approach. Usually, the foreground regions include primitive and composite regions [3]. Primitive regions include noise, vertical lines, horizontal lines, text characters, photographs, and highlight regions. A photograph is defined as a primitive region with large width and large height that can be bounded by a rectangle. A highlight is defined as a primitive region with a homogeneous or gradually changing background color. Composite regions include text lines, text blocks, and columns. A text line is defined as characters aggregated in the vertical or horizontal direction. A text block is defined as text lines or characters aggregated in the vertical or horizontal direction. A column is defined as a union of regions that do not have vertical straight-scans. The following features are used to identify the above-mentioned regions. Horizontal region number (HRN) is the number of extracted horizontal regions after applying the HRBP; if HRN is greater than one, the region can be divided into HRN horizontal regions, otherwise the region cannot be split. Horizontal height (HH) is the height of the extracted horizontal regions after applying the HRBP. Vertical region number (VRN) is the number of extracted vertical regions after applying the VRBP; if VRN is greater than one, the region can be divided into VRN vertical regions, otherwise it cannot be split. Vertical width (VW) is the width of the extracted vertical regions after applying the VRBP. Left variance (LV) is the left boundary variance of the extracted region obtained from the LBPP. Right variance (RV) is the right boundary variance obtained from the RBPP. Top variance (TV) is the top boundary variance of the region obtained from the TBPP. Bottom variance (BV) is the bottom boundary variance of the extracted region obtained from the BBPP. A simple top-down document model is applied to present the proposed identification method, which is based on the decision tree shown in Fig. 1.
1. The HRBP is applied to the color document image. If the HRN is equal to one, the color document image cannot be split in the horizontal direction; the VRBP is applied to obtain the VRN. (The complex document model is not considered.)
2. If the HRN is greater than one, the color document image can be decomposed into horizontal regions in the horizontal direction.
3. If the height of a horizontal region is greater than threshold value TH1, this horizontal region belongs to a composite region; proceed to step 10.
4. If the height of the horizontal region is greater than threshold value TH2, this region belongs to a horizontal text line or a highlight text line; proceed to step 7.
5. If the width of the horizontal region is greater than threshold value TW1, this region is a horizontal line; otherwise, the horizontal region is noise.
Fig. 1. Decision tree-based regional identification method
6. The VRBP is applied to horizontal regions whose height is smaller than or equal to threshold value TH1 and greater than threshold value TH2.
7. If the VRN is equal to one, this horizontal region cannot be split in the vertical direction; this region is a horizontal highlight text line.
8. If the VRN is greater than one, this horizontal region can be decomposed into characters in the horizontal direction; this region is a horizontal text line.
9. The VRBP is applied to horizontal regions whose height is greater than threshold value TH1.
10. If the VRN is greater than one, the horizontal region can be decomposed into vertical regions in the vertical direction; this horizontal region is a composite region. (To simplify the problem, the complex document model is not considered.)
11. If the VRN is equal to one, the horizontal region cannot be split in the vertical direction, as it is a horizontal highlight text block or a photo. Highlight detection, based on LV, RV, TV, and BV, is applied to identify whether this horizontal region is a horizontal highlight text block. If LV, RV, TV, and BV of this horizontal region are all smaller than a pre-learned threshold value, this horizontal region is a horizontal highlight text block; otherwise, the horizontal region is a photo.
The parameters are derived from a learning phase that takes training document images as input. In pattern classification research, a support vector machine (SVM) approach is used to find a hyper-plane that splits the training samples into two classes with the widest margin between them. The threshold value for the horizontal height (TH1) is obtained as follows:
1. The HRBP is applied to the training color document images.
2. Each horizontal region is labeled as a text line or a non-text line.
3. The heights of the horizontal text lines and non-text lines are calculated as training features.
4. LIBSVM [14] is invoked to construct the decision function.
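A minimal sketch of this threshold-learning step is given below. The paper invokes LIBSVM [14]; here scikit-learn's SVC is used as a stand-in for a linear SVM, and the region heights and labels are invented purely for illustration.

```python
# Sketch of learning the TH1 threshold from labelled horizontal regions.
# scikit-learn's SVC replaces LIBSVM here; the data values are made up.
import numpy as np
from sklearn.svm import SVC

# heights of horizontal regions from training images, with labels
# 0 = text line, 1 = non-text line (composite region / photo)
heights = np.array([[18], [22], [25], [30], [60], [90], [120], [150]])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = SVC(kernel="linear").fit(heights, labels)

# For a 1-D linear SVM the decision boundary w*h + b = 0 gives the height
# threshold TH1 with the widest margin between the two classes.
w, b = clf.coef_[0][0], clf.intercept_[0]
TH1 = -b / w
print(f"learned TH1 is approximately {TH1:.1f} pixels")
```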
The proposed identification method establishes an interesting application of the SVM approach in the area of document image analysis.

2.4 Character Extraction from Highlight Regions

There are many characters or text lines in a highlight text line or highlight text block, respectively. To extract characters in a highlight text line, the border background determination (Sect. 2.1) is applied to determine the background distribution of the highlight text line, and the VRBP (Sect. 2.2) is used to extract the characters in it. For a highlight text block, the border background determination (Sect. 2.1) is applied again to determine the background distribution of the highlight text block; the HRBP (Sect. 2.2) is first applied to split the text block into horizontal text lines within the highlight, and for each horizontal text line the VRBP (Sect. 2.2) is then used to extract the characters.
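To make Sections 2.1 and 2.2 concrete, the following Python sketch shows the single-peak case of the border background bounds (Eqs. 1-2) and the leftward restricted background projection. The border width, the use of percentiles in place of the smoothed-histogram peak and valley analysis, and the function names are simplifying assumptions made here, not the authors' implementation.

```python
# Hedged sketch of border background determination (single-peak branch)
# and the leftward restricted background projection (LBPP).
import numpy as np

def background_bounds(gray, border=5, k=2.0):
    """Return (THR_low, THR_high) estimated from the border pixels of a gray image."""
    border_px = np.concatenate([gray[:border].ravel(), gray[-border:].ravel(),
                                gray[:, :border].ravel(), gray[:, -border:].ravel()])
    sigma = border_px.std()
    # crude stand-in for the smoothed-histogram valley detection of steps 4-7
    v_low, v_high = np.percentile(border_px, [5, 95])
    return v_low - sigma * k, v_high + sigma * k

def left_background_projection(gray, thr_low, thr_high):
    """LBPP: for each row, the column of the background pixel just before the
    first foreground pixel; a fully-background row is a straight-scan row."""
    is_bg = (gray >= thr_low) & (gray <= thr_high)
    lbpp = np.empty(gray.shape[0], dtype=int)
    for r, row in enumerate(is_bg):
        fg = np.flatnonzero(~row)
        if fg.size == 0:
            lbpp[r] = gray.shape[1] - 1   # straight-scan row: all background
        else:
            lbpp[r] = fg[0] - 1           # -1 means foreground starts at column 0
    return lbpp
```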
3 Experimental Results and Discussion The intelligent method was implemented as a Microsoft Visual C# 2008 Windowsbased application on an Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz Notebook. All test documents were scanned at 300 dpi. Results were saved as true color images. To illustrate thresholding results, the foreground objects are shown using “black” color and the background is shown by “white” color. Different color documents have been used, which include name cards, magazines, textbooks, and color receipts. Figure 2 shows the example for the proposed intelligent method. The original color document image is shown in Fig. 2(a), printed with a red background color. Two highlight regions are designed. One is printed using a black background color, the other with a yellow background color. After applying the border background determination (Sect. 2.1), the HRBP (Sect. 2.2) is used to segment the color document image. The LBPP and RBPP are obtained by using the leftward and the rightward restricted background projection, as shown in Fig. 1(b). The LBPP is white, the RBPP is black. The coordinates of the first horizontal highlight text line are shown in Fig. 2(b). After the HRBP and the decision tree-based identification, six regions were extracted, shown in Fig. 2(c). Region 1, a horizontal line, is drawn with a yellow rectangle. Regions 2, 3, 4, and 6 are horizontal text lines or highlight text lines, drawn with a green rectangle. Region 5 is a composite region, drawn with a brown rectangle. To discriminate between horizontal text line and horizontal highlight text line, the VRBP (Sect. 2.2) is applied. The split results are shown in Fig. 2(d). Region 2 is a horizontal highlight text line, drawn with a green rectangle. Regions 3, 4, and 6 are horizontal text lines, split into characters and delimited by a blue rectangle to differentiate Region 5, the composite region. The VRBP and identification rule in the decision tree are applied. Region 5 is a horizontal highlight text block, drawn with a black
rectangle and is shown in Fig. 2(d). To extract characters from Region 2, the border determination and the VRBP are applied again. These extraction characters are drawn with a blue rectangle and are shown in Fig. 2(e). Similarly, the characters in Region 5 are extracted by using border background determination, HRBP, and VRBP method, drawn with a blue rectangle and shown in Fig. 2(f).
Fig. 2. Example for the proposed intelligent method
Our binarization result is compared with three other methods: Otsu's global thresholding method [5], Niblack's local thresholding method [8] and our previous border thresholding method [6]. Figure 3 shows the four thresholding results for Fig. 2(a): Figs. 3(a)-3(d) show the results obtained using our method, Otsu's method, Niblack's method (3 x 3), and our previous work, respectively. Otsu's method binarized Region 5 (Fig. 2(c)) correctly and thresholded the gray regions well, but the other regions are inverted and not thresholded properly. As seen in Fig. 3(c), Niblack's method binarized only Region 5 (Fig. 2(c)) correctly. Our previous work binarized only Regions 3, 4, and 6 correctly. The proposed method is a significant improvement. The character extraction results obtained using our method, our previous work [3], Tseng and Lee [9], and Lee and Ryu [12] are shown in Fig. 4. Fig. 4(b) shows that our previous work can extract the characters in Regions 3, 4, and 6, drawn in blue; Region 1, a horizontal line, is drawn in yellow. The character extraction results of [9] are shown in Fig. 4(c); only the characters in Region 5 (Fig. 2(c)) were extracted. Lee and Ryu's approach cannot extract the characters from Fig. 2(a), because it uses CC labeling to obtain the bounding boxes and uses projection profiles of the bounding boxes to identify the segmentation results; when processing a color document image such as Fig. 2(a), the Lee and Ryu projection profile method extracted only one object. The proposed method can extract the characters correctly.
Fig. 3. The thresholding results for Fig. 2(a). This image is binarized by: (a) our method, (b) Otsu's method [5], (c) Niblack's method [8], and (d) our previous work [6].
Fig. 4. The character extraction results for Fig. 2(a) by using (a) our method, (b) our previous work [3], (c) Tseng and Lee [9], and (d) Lee and Ryu [12]. Table 1. The computational performance of our proposed method, our previous work [3], Tseng and Lee [9], and Lee and Ryu [12] for Fig. 2(a)
Methods: Ours; Our previous work [3]; Tseng and Lee [9]; Lee and Ryu [12]
Average execution times: 18956 (µs); 20110 (µs); 53732 (µs); 81873 (µs)
The computational performance of the methods is shown in Table 1. The average execution time of the Lee and Ryu method is four times that of our proposed and previous methods, and that of the Tseng and Lee method is three times that of our proposed method. The proposed method is thus considerably faster than the compared methods.
4 Conclusions We have presented an intelligent method to extract characters in color document images, which uses border background determination to determine the background distribution of the color image. The restricted background projection and decision tree-based identification are combined to segment regions and identify them. When the highlight region is identified, the border background determination and the restricted background projection are applied again to extract characters. Comparison to other methods demonstrates that the proposed method is more effective and time-efficient.
References 1. Smith, E.B., Monn, D., Veeramachaneni, H., Kise, K., Malizia, A., Todoran, L., El-Nasan, A., Ingold, R.: Reports of the DAS02 working groups. IJDAR 6, 211–217 (2004) 2. Tsai, C.M., Lee, H.J.: Binarization of Color Document Images via Luminance and Saturation Color Features. IEEE Transactions on IP 11(4), 434–451 (2002) 3. Tsai, C.M., Lee, H.J.: Efficiently Extracting and Classifying Objects for Analyzing Color Documents. online version in Machine Vision and Applications (2009) 4. Chen, Y.L., Wu, B.F.: A Multi-Plane Approach for Text Segmentation of Complex Document Images. PR 42(7), 1419–1444 (2009) 5. Otsu, N.: A thresholding selection method from gray-scale histogram. IEEE Trans. Systems, Men, and Cybernetics 9, 62–66 (1979) 6. Tsai, C.M.: An Efficient and Effective Background Determination for Color Document Images. In: ICMLC 2009, Baoding, Hebei, China, vol. 5, pp. 2857–2862 (2009) 7. Chou, C.H., Lin, W.H., Chang, F.: A binarization method with learning-built rules for document images produced by cameras. PR 43(4), 1518–1530 (2010) 8. Niblack, W.: An Introduction to Digital Image Processing, pp. 115–116. Prentice Hall, Englewood Cliffs (1986) 9. Tseng, Y.H., Lee, H.J.: Document Image Binarization by Two-Stage Block Extraction and Background Intensity Determination. PAA 11, 33–44 (2008) 10. Nagy, G., Seth, S., Viswanathan, M.: A Prototype Document Image Analysis System for Technical Journals. IEEE Computer 25(7), 10–22 (1992) 11. Fletcher, L.A., Kasturi, R.: A Robust Algorithm for Text String Separation from Mixed Text/Graphics Images. IEEE Transactions on PAMI 10, 910–918 (1988) 12. Lee, S.W., Ryu, D.S.: Parameter-Free Geometric Document Layout Analysis. IEEE Transactions on PAMI 23(11), 1240–1256 (2001) 13. Lee, K.H., Choy, Y.C., Cho, S.B.: Geometric Structure Analysis of Document Images: A Knowledge-Based Approach. IEEE Transactions PAMI 2(11), 1224–1240 (2000) 14. Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines (2001), Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
Development of Technological System Structure for Threaded Connections Assembly under Conditions of Uncertainty Roman Chumakov Production Processes Management Department Chita State University 30 Alekzavodskaya, Chita, 672039, Russia Tel./Fax: +7 3022 352738
[email protected] Abstract. This article presents an intelligent method to synthesize a technological system structure for threaded connections assembly under conditions of uncertainty. To accomplish the task, a fuzzy neural network was used. In the process of the neural system training, rules for optimal selection of assembly equipment were obtained. The optimal structure of the assembly system provides for high quality characteristics of the assembly process, as well as of the resulting threaded connections. Keywords: assembly, fasteners, neural network, structure.
1 Introduction In the assembly of threaded connections, wide use is made of modern high-production equipment, automation and intensification aids, and control systems to determine operating parameters for the assembly process. In spite of considerable progress in the machine building industry, however, mechanical assembly operations are still highly laborintensive. Threaded connections make up nearly 70% of all mechanical connections in industries worldwide [4], and involve 25-35% of all assembly labor. A firm may, therefore, suffer great financial loss if it fails to choose an adequate assembly strategy [6]. One of the major problems that present day assembly structure designers face is the development of a technological system structure. The development process brings about serious challenges for the specialist due to the complexities involved. The wide range of mounting hardware, existent technologies and equipment structured in a multidimensional technological space should yield an optimal realization variant. This variant should optimally accompany threaded connection design requirements as well as feasibility indices. The initial information in this case is uncertain. Traditional structural system modeling involves a mathematical apparatus of set and graph theory [2]. Usually the relationship between set elements is described as a binary relationship. This approach has a certain limitation which amounts to the impossibility of describing the fuzzy properties of the assembly system elements. K.G. Mehrotra et al. (Eds.): IEA/AIE 2011, Part II, LNAI 6704, pp. 153–162, 2011. © Springer-Verlag Berlin Heidelberg 2011
Generalized systems and training methods based on examples are widely used for designing information support systems. One promising trend in the field of intelligent systems involves the integration of fuzzy logic with artificial neural networks (Neural Fuzzy Systems) [7]. Neural systems are capable of learning from a given number of examples. Their drawback is the impossibility of analyzing the logical inference process, i.e., they are a “black box” for the user [1, 3, 8]. Unlike neural systems, fuzzy logic methods are clear to the user from the point of view of the results obtained, but no generalized system can be built on their basis [9]. The purpose of the given work is to develop an intelligent method to form an optimal technological system structure for threaded connections assembly.
2 Structural Analysis of Technological Process of Threaded Connections Assembly

Fig. 1 presents the structural chart of the technological system. However, a structural presentation alone cannot make the classification of technical objects complete; it is necessary to also consider the following factors: structural, functional, geometrical, physical and mechanical.
Fig. 1. Structure of the assembly automation technological system
The technological structure of an assembly system depends on the number of initial conditions and requirements relating to the assembly process (Table 1), many of which are hard to formalize. Nowadays fuzzy sets are used to describe such systems. Table 1. Sets of technological conditions Set name
Symbol
Elements or boundary values
Thread size
S1
Type of threaded connection
S2
Basic part material Availability of elements for preliminary fastener insertion into the hole Basic part dimension and weight Basic part surface configuration Capacity Power input (energy consumption) Reliability Cost Matching precision of assembled parts axes Technological system rigidity System controllability Assembly system complexity Assembly unit dimension and mass
S3 S4
Small М3-М5; Medium М6-М12; Large М14-М20 With a gap (gap b/w screw threading and basic part threading); With tightness (tight); Thread-forming Fragile; Plastic Available; Not available
S5 S6 S7 S8 S9 S10 S11
Small; Medium; Large Simple; Complex Low; High Low; High Low; High Low; High Low; High
S12 S13 S14 S15
Low; High Low; High Low; High Low; High
Fuzzy sets are described by a membership function. One of the standard ways is the description of fuzzy sets using a Gauss function:

μSi(x) = exp( −((x − c)/σ)² )    (1)
where Si is a technological condition; x is the value of the condition; c is the value at which the membership function attains its peak value; and σ is the function range factor. Thus, a mathematical description of such vague requirements as “Matching precision of assembled parts axes should be high” becomes possible. Fig. 2 presents a membership function for the “Basic part dimension and weight” fuzzy set.
Fig. 2. Membership function for “Basic part dimension and weight” fuzzy set
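A small Python sketch of the Gaussian membership function of Eq. (1) is given below; the centre c and range factor σ chosen for the “Medium” term of S5 are illustrative values only, not taken from the paper.

```python
# Sketch of the Gaussian membership function of Eq. (1).
import numpy as np

def gauss_mf(x, c, sigma):
    """mu_Si(x) = exp(-((x - c) / sigma) ** 2)"""
    return np.exp(-((x - c) / sigma) ** 2)

x = np.linspace(0.0, 1.0, 101)                 # normalised value of condition S5
mu_medium = gauss_mf(x, c=0.5, sigma=0.15)     # assumed parameters for "Medium"
print(gauss_mf(0.5, 0.5, 0.15), gauss_mf(0.8, 0.5, 0.15))  # 1.0 at the peak, lower away from it
```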
The technological system of an assembly machine consists of two subsystems: a fastener subsystem (the stud connection variant will be considered later) and a basic part subsystem. There are several ways to supply assembly parts to the assembly position (Table 2). Stud connection has a distinguishing feature including the possibility to supply a fastener to the assembly position in two methods: either supply of a stud from a tray or supply of a chuck inserted with a stud. Table 3 presents the supporting assembly equipment. For the stud subsystem two kinds of accumulators were selected, primary and secondary ones. This is due to the fact that a fastener tends to go a complex course from the primary position to the assembly position. Table 2. Methods of supplying studs and parts to the assembly position STUD SUBSYSTEM Supply of a stud
1. Into the chuck (through the chuck); 2. Into the chuck (on the outside of the chuck); 3. From the chuck to the stud; 4. Into the hole of the part (if a special adjusting element is available on the stud or bush).
Supply of a chuck inserted with a stud: 1. Into the spindle; 2. Spindle supplied to the chuck with a stud.
PART SUBSYSTEM
Supply of a part: 1. A part is supplied to the machine table / assembly industrial robot; 2. A part is supplied with a fixture pallet to the machine table / assembly industrial robot.
Table 3. Support assembly equipment
STUD SUBSYSTEM
Primary accumulators: 1. Magazine accumulator; 2. Conveyor; 3. Storage place.
Secondary accumulators: 1. Bunker with a chute; 2. Magazine with studs; 3. Pallet with studs; 4. Magazine with tools; 5. Pallet with chucks.
Transfer and supply equipment: 1. Feeder; 2. Machine manipulator (tool head); 3. Separate manipulator; 4. Industrial robot; 5. Robocar (trolley); 6. Pallet-loading table; 7. Machine spindle.
PART SUBSYSTEM
Accumulators: 1. Loader; 2. Conveyor with parts; 3. Conveyor with fixture pallets.
Transfer and supply equipment: 1. Separate manipulator; 2. Industrial robot; 3. Pallet-loading table.
Fig. 3. Assembly process graph: (a) Stud subsystem; (b) Part subsystem
The assembly process graph is presented in Fig.3. This figure shows the interconnections between the ways to supply parts and assembly equipment. This kind of interconnection description may be viewed as certain for depicting a relational data model [9].
Equipment: a1 – magazine accumulator; a2 – conveyor; a3 – storage place; a4 – bunker with a chute; a5 – magazine with studs; a6 – pallet with studs; a7 – magazine with tools; a8 – pallet with chucks; a9 – part; a10 – spindle. elements a4, a5 and a6 are secondary stud accumulators (indicated by a dotted line). elements a7 and a8 are secondary accumulators of chucks with studs (indicated by a chain line). b1 – part loader; b2 – conveyor with parts; b3 – conveyor with fixture pallets; b4 – machine/robot table. v1 – feeder; v2 – machine manipulator (tool head); v3 – separate manipulator; v4 – industrial robot; v5 – robocar (trolley); v6 – pallet-loading table; v7 – machine spindle. Methods of supplying studs: u1 – into the chuck (through the chuck); u2 – into the chuck (on the outside of the chuck); u3 – from the chuck to the stud; u4 – into the hole of the part. Supply of a chuck inserted with a stud: y1 – into the spindle; y2 – spindle supplied to the chuck with a stud. Methods of supplying parts: z1 – a part is supplied to the machine table; z2 – a part is supplied with a fixture pallet to the machine table
Taking into consideration the fact that parts to be assembled may be delivered to the assembly position via different ways with the aid of different equipment, the number of possible variants of assembly process implementation amounts to several thousands. The high number of variations is the reason classical methods have failed to assist in choosing the optimal variant.
3 Fuzzy Model and Algorithm to Choose Optimal Assembly Process Structure Proceeding with the objective of this study, we determine the ways to supply parts and equipment to use in forming an assembly process structure. The task is accomplished in the following order. 1. A table is constructed indicating which technological conditions are attributed to which ways of supplying a part and assembly equipment (for stud and part subsystems, respectively). This table enables identification of significant parameters, thus diminishing the caliber of the task. Table 4. Membership table P2 P3 B1 B2 B3 B4 P1 S1 a11 … … … … S14 … aij S1…S14 – set of technological conditions (Table 1). Ways of supplying studs and parts to the assembly position (Table 2): Stud subsystem: P1 – way of supplying a stud, P2 – way of supplying a chuck inserted with a stud; Part subsystem: P3 – way of supplying a part. Supporting assembly equipment (Table 3): Stud subsystem: B1 – primary accumulators, B2 – secondary accumulators, B3 – transfer and supply equipment; Part subsystem: B4 – accumulators, B5 – transfer and supply equipment. aij – a numerical indicator of membership degree, range [0…1]
2. On the basis of the significant parameters determined in Table 4, a learning set of rules is formed, e.g., for the supporting assembly equipment:
Rulei: IF xi is Si1 AND … AND xj is Sij AND … AND xm is Sim, THEN y is Bi
3. An adaptive network-based fuzzy inference system (ANFIS) [5] is created. The number of input signals is determined in accordance with the number of significant parameters. The neural network structure is depicted in Fig. 4. The ANFIS fuzzy neural network is a hybrid neuro-fuzzy network; its structure is isomorphic to a fuzzy knowledge base. The ANFIS comprises 5 layers:
Layer 1 is intended for the fuzzification of the variables x1…xN. The membership coefficient μ(Sk)(xi) is determined in accordance with the fuzzification function (1) for each k-th rule. The parameters σ and c are adapted in the training process.
Layer 2. M is the number of neurons in the second layer. Each neuron in this layer corresponds to one fuzzy rule and is connected to those neurons in the first layer that form the antecedents of the corresponding rule. Thus, each neuron in the second layer can accept from 1 up to N input signals. In this layer the firing weight wk of rule k for the input array x is computed from the membership values of its antecedents.
Fig. 4. Structure of ANFIS fuzzy neural network
Layer 3. M is the number of neurons in the third layer. Each neuron in this layer estimates the relative fuzzy rule weight wk* = wk / (w1 + … + wM).
Layer 4. M is also the number of neurons in the fourth layer. Each neuron in the fourth layer is connected to one neuron in the third layer and also to all network inputs (not illustrated in Fig. 4). The rule's consequent function yk(x) is evaluated in this layer, and its value is multiplied by the relative degree of rule activation from the previous layer.
Layer 5. The only neuron of this layer sums all contributions of the rules, y(x) = Σ wk* yk(x).
4. The neural network is trained on the basis of the learning set created beforehand.
5. To check the neural network, a test set is established. It contains examples which are not part of the learning set. In this manner, the neural network is checked for its ability to draw a logical conclusion from data it had not been exposed to beforehand.
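The following Python sketch condenses one forward pass through the five ANFIS layers described above. It assumes a product t-norm in layer 2 and zero-order (constant) rule consequents in layer 4, neither of which is spelled out in the paper; the example inputs and rule parameters are illustrative only.

```python
# Hedged sketch of a single ANFIS forward pass (layers 1-5).
import numpy as np

def gauss_mf(x, c, sigma):
    return np.exp(-((x - c) / sigma) ** 2)

def anfis_forward(x, rules):
    """x: input vector; rules: list of (antecedents, consequent) where
    antecedents maps an input index to its (c, sigma) membership parameters."""
    # layers 1-2: fuzzify inputs and take rule firing strengths w_k (product t-norm)
    w = np.array([np.prod([gauss_mf(x[i], c, s) for i, (c, s) in ant.items()])
                  for ant, _ in rules])
    w_rel = w / w.sum()                                # layer 3: relative rule weights
    y_k = np.array([conseq for _, conseq in rules])    # layer 4: constant consequents
    return float(np.dot(w_rel, y_k))                   # layer 5: weighted sum

# two toy rules over two assumed inputs (e.g. normalised S5 and S6):
rules = [({0: (1.0, 0.2), 1: (1.0, 0.2)}, 1.0),   # large part & complex surface -> pallets (1)
         ({0: (0.0, 0.2)},                0.0)]   # small part                   -> loader (0)
print(anfis_forward(np.array([0.9, 0.8]), rules))  # close to 1.0
```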
4 Neural Network for Choosing Optimal Assembly System Structure Let us regard choosing support assembly equipment, namely, accumulators for a part subsystem, as a case in point. We make a membership table (Table 5) for the technological conditions and the part accumulators. The following values are used to indicate membership degrees: 1 indicates a major condition which determines the choice of the given equipment; 0.8 implies that this is always considered, but does not determine the choice in the decision making; 0.4 means that this condition is indirectly related to the equipment choice; and the value of 0 is disregarded in the decision making. Table 5. Membership of technological conditions to support assembly equipment (part accumulators) Technological conditions (Table 1) Part accumulators (Table 3)
S1 S2 S3 S4 S5 S6 S7 S8 S9 S10 S11 S12 S13 S14
0 0 0 1 1 1 0.8 0.4 0.4 0.4 0 0.8 0.8 0.4
Natural language rules are formed for parameters with membership degree ≥ 0.8, for example: IF the dimensions and weight of the basic part are large AND the basic part surface configuration is complex AND the power input is above medium AND the system controllability is medium AND the assembly system complexity is medium THEN the part accumulator is a conveyor with fixture pallets. Each condition is set from 0 to 1, while the type of equipment is code marked. Table 6 presents some training set values which determine the choice of the part accumulator.

Table 6. Some training set values
Input vector: x4, x5, x6, x7, x12, x13; Output signal
0 0 1 0 1 0 1 1 0 1 0 0.5 0.5 0.8 0.2 0.7 0.3 0.7 0.5 0 0.5 0.5 0.4 0.6
Output signals: 0 1 1 0.5
Output signal marking (code): 0 – loader; 0.5 – conveyor with parts; 1 – conveyor with fixture pallets
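The selection of significant parameters that determines which inputs (x4, x5, x6, x7, x12, x13) enter the network can be sketched as follows; the membership values are those of the part-accumulator row of Table 5, and the 0.8 threshold is the one used above.

```python
# Sketch of picking the significant conditions for one equipment class
# from its membership-table row (threshold 0.8, values from Table 5).
membership = {"S1": 0, "S2": 0, "S3": 0, "S4": 1, "S5": 1, "S6": 1, "S7": 0.8,
              "S8": 0.4, "S9": 0.4, "S10": 0.4, "S11": 0, "S12": 0.8, "S13": 0.8, "S14": 0.4}

significant = [s for s, a in membership.items() if a >= 0.8]
print(significant)   # ['S4', 'S5', 'S6', 'S7', 'S12', 'S13'] -> inputs x4, x5, x6, x7, x12, x13
```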
Fuzzy neural network modeling is performed with the MATLAB® NN toolbox. The resulting neural network structure and test results are given in Fig.5. Ninety-six rules for choosing part accumulators were obtained. The neural network test results show that the network identifies with confidence an input vector which was not part of the learning set. The rest of the assembly equipment, as well as the way of supplying a part, are determined in a similar manner. To complete the technological system structure, seven further neural networks were constructed to match the parameter numbers in Table 4 (P1 – P4 and B1 – B4).
Fig. 5. Neural network structure (a) and test result (b)
5 Conclusion Traditional methods of decision making have been successfully applied only for a limited number of tasks; usually these are rather simple tasks with a minimal number of estimated alternatives. They include estimation of every single alternative in
accordance with a certain criterion, a successive binary selection method, based on comparing pairs of separately taken alternatives. These methods can be formalized only if simple and investigated tasks and processes are fully unambiguous. Designing an assembly process for threaded connections constitutes a system with complex relations which admit no formal solution methods. Nevertheless, well-trained and competent specialists often make optimal selection upon decision making. Thus, the methods which combine a formal approach with intuitive experts’ knowledge (expert systems, fuzzy logic and neural networks) have been widely used. Unlike classical systems, the given methods provide for assembly system synthesis under conditions of uncertainty. To this end, a fuzzy neural network is used. It combines the advantages of fuzzy logic and a neural network. The ability of a neural network to generalize permits reduction of the volume of a problem, as typical examples for equipment selection and ways to supply parts are used in the process of neural network training. Fuzzy neural network training resulted in generation of certain rules for technological system structure synthesis depending on initial conditions and assembly technology requirements.
References 1. De Jesús, O., Hagan, M.T.: Backpropagation Algorithms for a Broad Class of Dynamic Networks. IEEE Transactions on Neural Networks 18(1), 14–27 (2007) 2. Gross, J., Yellen, J.: Graph theory and its applications. Chapman & Hall/CRC (2006) 3. Haykin, S.: Neural Networks: A Comprehensive Foundation. Prentice Hall Press, NY (1998) 4. Hellmann, K.-H.: Entwicklung der Schraubtechnik. VDI Zeitschrift 131(5), s.89–s.93 (1989) 5. Jang, J.-S.R.: ANFIS: Adaptive Network-based Fuzzy Inference System. IEEE Trans. on Syst., Man and Cybernetics 23(3), 665–685 (1993) 6. John, R., Jones, S.: The economic Relevance of materials joining technology. Welding World 39(3), 145–153 (1997) 7. Lin, C.-T., Lee, C.-C.: Neural Fuzzy Systems. A Neuro-Fuzzy Synergism to Intelligent Systems. Prentice-Hall, NY (1996) 8. Warwick, K., Irwin, G.W., Hunt, K.J.: Neural Networks for Control and Systems. Stevenage, U.K (1992) 9. Zha, X.F., Lim, S.Y.E., Fok, S.C.: Development of Expert System for Concurrent Product Design and Planning for Assembly. International Journal of Advanced Manufacturing Technology 15, 153–162 (1999)
Automatic Vehicle Identification by Plate Recognition for Intelligent Transportation System Applications
Kaushik Deb¹, My Ha Le², Byung-Seok Woo², and Kang-Hyun Jo²
¹ Dept. of CSE, Chittagong University of Engineering & Technology, Chittagong-4349, Bangladesh
² Dept. of EE and Information Systems, University of Ulsan, Daehak road 100, Nam-gu, 680-749 Ulsan, South Korea
[email protected], {lemyha, woo}@islab.ulsan.ac.kr, [email protected]
Abstract. Automatic vehicle identification is a crucial and indispensable task in intelligent traffic systems. In this paper, a Hue-Saturation-Intensity (HSI) color model is first adopted to automatically select statistical threshold values for detecting candidate regions. The proposed method focuses on detecting candidate regions, based on color characteristics, even when the vehicle body and the license plate (LP) have similar colors. For tilt correction in the horizontal direction, least square fitting with perpendicular offsets (LSFPO) is proposed and implemented to estimate the rotation angle of the LP region; the whole image is then rotated by this angle. For tilt correction in the vertical direction, reorientation of the tilted LP candidate through an inverse affine transformation is proposed and implemented to remove shear from the LP candidates. Finally, a statistics-based template matching technique is used to recognize Korean plate characters. LP images captured under a variety of conditions are used to test the proposed method, and results are presented to demonstrate its effectiveness. Keywords: HSI color model, tilt correction, least square fitting with perpendicular offsets (LSFPO), affine transformation, and template matching.
1 Introduction
With the rapid development of highways and the widespread use of vehicles, people have paid more and more attention to advanced, efficient and accurate intelligent transportation systems (ITSs). The task of recognizing a specific object in an image is one of the most difficult topics in the field of computer vision and digital image processing. Vehicle license plate recognition (VLPR) from vehicle images is quite challenging due to viewpoint changes, cases where vehicle bodies and LPs have similar colors, multi-style plate formats, and the nonuniform
outdoor illumination conditions during image acquisition. In addition, a vehicle license plate detection (VLPD) system should operate fast enough to be real-time, to satisfy the needs of ITS, and should not miss a single object of interest in the vehicle image. VLPR is widely used for detecting speeding cars, security control in restricted areas, unattended parking zones, traffic law enforcement, and electronic toll collection. Recently, the need for vehicle license plate recognition has increased significantly. One of the major problems in LP detection, tilt correction and recognition is designing an LP system that guarantees robust detection and recognition under various weather and lighting conditions, independent of the orientation and scale of the plate. As far as detection, tilt correction and recognition of the license plate region are concerned, researchers have proposed various methods for correcting tilt and for locating and recognizing license plates. For example, a Karhunen-Loeve (K-L) transformation method has been introduced for correcting VLP tilt in [1]; however, no explanation of how the LP region is extracted is given in that paper. A region-based LP detection method is presented in [3], which first applies a mean shift procedure in a spatial-range domain to segment a color vehicle image in order to obtain LP regions. Fuzzy logic has been applied to detecting license plates in [4]. A template matching application is described in [6], where the character recognizer extracts feature points and uses template matching operators to obtain a solution under multiple acquisition conditions. The emphasis of this paper is on vertical tilt correction: the tilted LP candidate is reoriented through an inverse affine transformation to remove shear from the LP candidates. In addition, a method is consolidated to detect candidate regions when vehicle bodies and LPs have similar colors. A common drawback of color-based VLPD is the failure to detect the boundaries or borders of LPs, which occurs when the vehicle and the LP have similar colors. To overcome this drawback, we propose and implement a method named the modified recursive labeling algorithm. Furthermore, this paper also focuses on an experiment using a template matching technique with normalized cross-correlation to recognize Korean plate characters.
2 Proposed Framework
In the authors' previous work [2], vehicle license plate tilt correction based on projection and least square fitting with perpendicular offsets was presented. In this section, we propose an enhanced version of the framework for VLP tilt correction and recognition, as shown in Figure 1. To improve on traditional LP detection methods, since license plates can appear at many different angles to the camera's optical axis, each rectangular candidate region is rotated until they are all aligned in the same way before candidate decomposition. To correct VLP tilt in the vertical direction, reorientation of the tilted LP candidate through an inverse affine transformation is proposed and implemented for removing
Fig. 1. The proposed vehicle license plate recognition framework
shear from the LP candidates. A common drawback of color-based VLPD is the failure to detect the boundaries or borders of LPs, which occurs when the vehicle and the LP have similar colors. It is important to mention here that some previous works [3] and [4] do not solve this problem and explicitly leave it for future study. To overcome this common drawback, we propose and implement a method named the modified recursive labeling algorithm. Finally, an experiment is conducted using a template matching technique with normalized cross-correlation to recognize Korean plate characters.
3 Vehicle License Plate Recognition Module
In this section, the five primary stages of the proposed VLP recognition framework, i.e., color segmentation, labeling and filtering, correcting tilt in the horizontal and vertical directions, decomposing candidates, and recognizing LP alphanumeric characters, are discussed in detail. The color arrangement of Korean LPs is well classified. A more detailed explanation of the color arrangement and outline of Korean VLPs can be found in [2].
3.1 Color Segmentation
Color is a distinctive feature because the law determines the usage class of a vehicle according to its plate color. Representing plate color in an invariant manner is one of the main objectives of our color-based LP detection method. In the proposed framework, input vehicle images are converted into Hue-Saturation-Intensity (HSI) color images. The candidate regions are then found in HSI color space on the basis of hue, saturation and/or intensity. Many applications use the HSI color model; machine vision, for instance, uses it to identify the color of different objects. Plate color information is used to detect candidate regions in our experiments, and the shape properties of LPs allow reducing the number of LP-like candidates. A more detailed explanation of detecting green, yellow, and white license plate pixels can be found in [2]. The color segmentation parameters are very sensitive, in order to detect as many candidates as possible.
Fig. 2. Successful license plate identification sequence for a stationary vehicle: (a) LP images where the vehicle and LP have similar colors and uneven illumination, (b) color segmentation results, (c) detected candidate after the morphological closing operation and filtering, and (d) extracted candidate after tilt correction in the horizontal direction
All false candidates will be filtered out in the next stages. Examples of the proposed color segmentation method are depicted in Figure 2.
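A minimal sketch of this HSI-based thresholding step is given below; the RGB-to-HSI conversion follows the standard formulas, while the threshold ranges are placeholders rather than the statistically selected values described in [2].

```python
# Sketch of HSI color segmentation for candidate plate pixels.
# Threshold ranges below are illustrative placeholders.
import numpy as np

def rgb_to_hsi(img):
    """img: float array (H, W, 3) with RGB in [0, 1]. Returns H (degrees), S, I."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    i = (r + g + b) / 3.0
    s = 1.0 - np.min(img, axis=-1) / np.maximum(i, 1e-8)
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-8
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    h = np.where(b > g, 360.0 - theta, theta)
    return h, s, i

def plate_pixel_mask(img, h_range=(90, 180), s_min=0.2, i_range=(0.1, 0.9)):
    """Binary mask of pixels whose hue/saturation/intensity fall in plate-like ranges."""
    h, s, i = rgb_to_hsi(img)
    return ((h >= h_range[0]) & (h <= h_range[1]) &
            (s >= s_min) &
            (i >= i_range[0]) & (i <= i_range[1]))

# Example usage: mask = plate_pixel_mask(image.astype(float) / 255.0)
```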
3.2 Labeling and Filtering
In the proposed method, a recursive algorithm is implemented for the connected component labeling operation. A common drawback of color-based VLPD is the failure to detect the boundaries or borders of LPs, which occurs when the vehicle and the LP have similar colors. To overcome this drawback, we propose and implement a new method named the modified recursive labeling algorithm. If we investigate carefully, when the vehicle body and the LP have a similar color we can still find small color differences between the LP and the vehicle; based on this idea, we overcome the problem by exploiting those color difference parameters. To label connected pixels, an 8-neighbor recursive algorithm is used. Furthermore, connected pixels are grouped only if their distance in color space is less than a predefined threshold Dmin; that is, two connected pixels are grouped if Dist(Ii,j, Im,n) < Dmin. The color distance between two connected pixels Ii,j and Im,n is calculated using equation (1):

Dist(Ii,j, Im,n) = Σ_{k∈{R,G,B}} |I^k_{i,j} − I^k_{m,n}|    (1)

where I^k_{i,j} and I^k_{m,n} are the red, green, and blue components of the input image I at the neighboring coordinates (i, j) and (m, n), respectively. In this step, we extract candidate regions that may include LP regions from the binary mask obtained in the previous step. During this step, the main geometrical properties of each LP candidate, such as area, bounding box, and aspect ratio, are computed. A more detailed explanation can be found in [2]. These parameters are used in the filtering operation to eliminate LP-like objects from the candidate list. Figure 2 portrays the steps of LP segmentation.
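A compact sketch of this color-distance-constrained labeling follows; it grows 8-connected components only while the distance of equation (1) stays below Dmin, and is written iteratively to avoid recursion limits. The Dmin value and image handling details are assumptions.

```python
# Sketch of the modified labeling: 8-connected components are grown only while
# the RGB distance of eq. (1) between neighboring pixels stays below Dmin.
import numpy as np

def color_distance(p, q):
    # Sum of absolute R, G, B differences, as in eq. (1).
    return (abs(int(p[0]) - int(q[0])) +
            abs(int(p[1]) - int(q[1])) +
            abs(int(p[2]) - int(q[2])))

def label_components(rgb, mask, d_min=30):
    """rgb: (H, W, 3) uint8 image; mask: boolean plate-pixel mask; returns a label map."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=np.int32)
    current = 0
    neighbors = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    for sy in range(h):
        for sx in range(w):
            if not mask[sy, sx] or labels[sy, sx]:
                continue
            current += 1
            labels[sy, sx] = current
            stack = [(sy, sx)]
            while stack:
                y, x = stack.pop()
                for dy, dx in neighbors:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w and mask[ny, nx]
                            and not labels[ny, nx]
                            and color_distance(rgb[y, x], rgb[ny, nx]) < d_min):
                        labels[ny, nx] = current
                        stack.append((ny, nx))
    return labels
```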
3.3 Correcting Tilt
In order to facilitate character segmentation and recognition in a VLPR system, the tilted LPs in the located image should be corrected in both the horizontal and vertical directions. Following the successful filtering operation on the vehicle image, the VLP candidate region is fitted to a straight line according to the LSFPO. Once the line slope is obtained, the tilt or rotation angle is estimated. Figure 3(a, b) depicts the rotation angle α between the principal axis of the tilted VLP region and the horizontal axis X. The whole image is then rotated by this angle for tilt correction in the horizontal direction.

Correcting horizontal tilt by straight line fitting based on LSFPO: The least square method is a commonly used fitting method. Fitting requires a parametric model that relates the response data to the predictor data with one or more coefficients; to obtain the coefficient estimates, the least square method minimizes the summed square of residuals. A more detailed explanation can be found in [2].

Correcting horizontal tilt by straight line fitting based on LSFVO: Given a set of data points, it is desired to find the best fitting line; in principle, the deviation between the data and the fitting line should be minimized. A more detailed explanation can be found in [2]. After obtaining the fitting slope a, let tan α = a to get the tilt angle α. The entire image is rotated by α around the centroid of the rectangular LP candidate region to perform the tilt correction. Figures 5(c1-d1) and 5(c2-d2) portray successful horizontal tilt correction by LSFPO and LSFVO, respectively.

Determining the rotation angle using the principal axis: Following the successful filtering operation on the image, measurements such as the center of area and the axis of least second moment are employed to solve the rotation adjustment problem. A more detailed explanation can be found in [2]. Figures 5(c3) and 5(d3) portray successful horizontal tilt correction by the principal axis (PA) method.

Correcting vertical tilt. The purpose of correcting tilt in the vertical direction is to correct VLP shear left and right along the horizontal axis X, as shown in Figure 3(c, d).
Fig. 3. VLP tilt image in horizontal direction: (a) tilt angle α > 0 and (b) tilt angle α < 0. VLP tilt image in vertical direction: (c) tilt angle θ > 0 and (d) tilt angle θ < 0.
Fig. 4. Vehicle tilt correction in the vertical direction: (a) reorientation of 4 points by using an inverse general affine transformation, (b) a tilted LP image, and (c) shear correction by using inverse general affine transformations
Vertical tilt correction is also essential to facilitate character segmentation and to recognize VLP characters accurately. Tilt correction in the vertical direction by reorientation of the tilted LP candidate through an inverse affine transformation is proposed and implemented for removing shear from the LP candidates.

Shear correction through a general affine transformation: A general affine transformation (AT) from 2D to 2D, as in (2), requires six parameters and can be computed from only 3 matching pairs of points ([xj, yj], [uj, vj]), j = 1, . . . , 3:

[u]   [a11 a12 a13] [x]
[v] = [a21 a22 a23] [y]    (2)
[1]   [ 0   0   1 ] [1]

We use the Hough transformation to obtain 4 vertices from the candidate LP region: Hough lines are estimated from the boundary points (i.e., contour) of the candidate LP, and the vertices are detected as intersections of the Hough lines. After determining the 4 points (vertices) of the LP candidate, the inverse general affine transformation (A−1) is applied to rectify the candidate LP region, as shown in Figure 4. VLP tilt correction in the vertical direction by reorientation of the tilted LP candidate through the inverse affine transformation is depicted in Figure 5(d).
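The sketch below illustrates the rectification implied by equation (2): the six affine parameters are estimated from three vertex correspondences, and the plate region is resampled through the backward mapping (equivalent to applying the inverse of the forward affine transform). The vertex pairing, output size and nearest-neighbour warp are assumptions.

```python
# Sketch of shear removal via an inverse affine transformation (eq. (2)).
import numpy as np

def fit_affine(src_pts, dst_pts):
    """Solve [u, v, 1]^T = A [x, y, 1]^T from 3 matching pairs (src -> dst)."""
    A = np.zeros((6, 6))
    b = np.zeros(6)
    for k, ((x, y), (u, v)) in enumerate(zip(src_pts, dst_pts)):
        A[2 * k] = [x, y, 1, 0, 0, 0]
        A[2 * k + 1] = [0, 0, 0, x, y, 1]
        b[2 * k], b[2 * k + 1] = u, v
    a = np.linalg.solve(A, b)
    return np.array([[a[0], a[1], a[2]],
                     [a[3], a[4], a[5]],
                     [0.0,  0.0,  1.0]])

def rectify(image, plate_vertices, out_w=200, out_h=40):
    """Warp the tilted plate quadrilateral into an upright out_w x out_h patch."""
    # Three source vertices of the detected plate and their upright targets.
    src = plate_vertices[:3]                   # assumed order: top-left, top-right, bottom-left
    dst = [(0, 0), (out_w - 1, 0), (0, out_h - 1)]
    # Fitting the map from output coordinates back into the image is equivalent
    # to applying the inverse of the forward affine transform.
    A = fit_affine(dst, src)
    out = np.zeros((out_h, out_w, 3), dtype=image.dtype)
    for yo in range(out_h):
        for xo in range(out_w):
            xi, yi, _ = A @ np.array([xo, yo, 1.0])
            xi, yi = int(round(xi)), int(round(yi))
            if 0 <= yi < image.shape[0] and 0 <= xi < image.shape[1]:
                out[yo, xo] = image[yi, xi]    # nearest-neighbour sampling
    return out
```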
3.4 Decomposing Candidate
Information is extracted from the image by intensity histograms, which play a basic role in image processing areas such as enhancement, segmentation and description. In this section, verification and detection of the VLP region as well as character segmentation are considered and discussed. Once the candidate area is binarized, the next step is to extract the information. First, regions without interest, such as borders or small noisy regions, are eliminated; this check is made by comparing heights with those of the other plate characters. Figure 6 shows the results of verifying the predetermined alphanumeric characters.
3.5 Recognizing LP Alphanumeric Characters
The objective is to recognize the alphanumeric characters, both numerals and Korean letters, written on the license plate.
Fig. 5. Illustration of license plate segmentation: (a) an LP image, (b) detected candidate after filtering, (c) finding best fitting line through LSFPO(c1), LSFVO(c2), and principal axis (c3), (d) extracted candidate after tilt correction
Fig. 6. Steps for verifying predetermined alphanumeric characters: (a) extracting the candidate region, (b) vertical position histogram with LP border, (c) view of normalization candidate region after removing border and noisy area, (d) vertical position histogram (seven peaks for predetermined seven alphanumeric characters in LP region), and (e) equal-sized candidate alphanumeric characters after extraction
In this paper, a template matching method with normalized cross-correlation is used to recognize the different types of alphanumeric characters found on Korean vehicle LPs. Before the recognition step, the characters are normalized: normalization refines each character into a block containing no extra white space (i.e., pixels) on any of its four sides. Each character is then fitted to an equal size, as shown in Figure 6(e). This fitting is necessary for template matching: to match candidate characters against the templates, the input candidate images must be the same size as the template characters. Here the characters are fitted to 40×25 pixels, so the characters extracted from the license plate region and the characters in the database are of equal size. Korean license plates contain Korean letters and numbers.
Fig. 7. Template used for pattern matching: (a) 10 prototypes for the Korean plate numbers and (b) 30 prototypes for the Korean plate characters
Fig. 8. Illustration of LP segmentation and alphanumeric characters recognition: (a) an LP image, (b) extracted candidate region, (c) equal-sized candidate alphanumeric characters and (d) candidate alphanumeric characters measuring the similarity by NCC
For recognition of plate numbers and letters, we use a database containing 10 prototypes for the Korean plate numbers and 30 prototypes for the Korean plate characters, each of size 40×25. The database is shown in Figure 7. To measure similarity and find the best match, the statistical method of correlation is used. Correlation is an effective technique for image recognition: it measures the correlation coefficient between a number of known images and an unknown image (or part of an image) of the same size, the highest correlation coefficient producing the best match. In our case, to detect the presence of a candidate character, a cross-correlation function is computed between each candidate character and a sub-area of the normalized template. The candidate plate character recognition decision is based on the value of the correlation coefficient: if it exceeds a threshold set by the user, the similarity measure is large enough and the input character is assumed to be present. Finally, a box is plotted on the target character, as shown in Figure 8(d).
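A bare-bones version of this matching step is sketched below; the 40×25 character size follows the description above, while the acceptance threshold of 0.5 is a placeholder for the user-set value.

```python
# Sketch of character recognition by normalized cross-correlation against
# equal-sized (40 x 25) templates; the threshold value is a placeholder.
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation coefficient between two equal-sized images."""
    a = a.astype(float) - a.mean()
    b = b.astype(float) - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def recognize(character, templates, threshold=0.5):
    """templates: dict mapping a label to a 40 x 25 prototype image.
    Returns (best_label, best_score), or (None, best_score) below the threshold."""
    scores = {label: ncc(character, tpl) for label, tpl in templates.items()}
    best_label = max(scores, key=scores.get)
    best_score = scores[best_label]
    return (best_label, best_score) if best_score >= threshold else (None, best_score)
```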
4 Experimental Results and Conclusions
All experiments were carried out on a Pentium-IV 2.4 GHz machine with 1 GB RAM under the MATLAB environment. Images of size 640×480 and 320×240 pixels were used. The image database consists of 200 digital images from two different groups.
Table 1. Comparison of detection rates
Reference number     Detection rate
[5]                  80.4%
Proposed framework   100%
In order to evaluate the proposed framework, two groups of experiments were conducted. The first group was used to compare the proposed framework with the mathematical morphology (MM) method of [5]. A comparison between the proposed method and a well-reported method in the literature is given in Table 1, from which it can be seen that the proposed method outperforms the method reported in [5] in terms of detection rate. The second group contains 175 images. All images in this group represent South Korean license plates from natural scenes obtained under nonuniform outdoor illumination conditions, with multi-style and multi-color license plates and various viewing angles; they were taken at distances of 3 to 15 m. Under these conditions, the success rate of LP detection reached more than 96%. In the tilt correction experiments, we compare the tilt performance of LSFPO with that of LSFVO and the principal axis method; a more detailed explanation can be found in [2]. The recognition results for Korean alphanumeric characters (KAC) and the average computational time for recognition of a single character are shown in Tables 2 and 3, respectively.

Table 2. Recognition results
Stage                Occurrences of KAC   Number of successes   Success rate
Recognition of KAC   646                  634                   98.15%

Table 3. Average computational time for a single character
Type                    Average computational time (s)
Number                  1
Korean word/character   3
In conclusion, a method has been presented in this paper for automatic vehicle identification by recognition of the vehicle license plate. In the proposed method, a Hue-Saturation-Intensity (HSI) color model is first adopted to automatically select statistical threshold values for detecting candidate regions. These candidate regions may include LP regions; geometrical properties of the LP are then used for classification. In addition, the emphasis of this paper is on the implementation of a method to detect candidate regions when the vehicle body and the LP have a similar
color. In the vehicle horizontal tilt correction process, three correction methods were implemented to compare tilt performance. Analysis and simulation results suggest that LSFPO and principal axis tilt correction are more precise than LSFVO; moreover, LSFPO outperforms the principal axis method owing to its faster computation, easier implementation, and more precise tilt correction. In this paper, tilt correction in the vertical direction by an inverse affine transformation has been proposed for removing shear from the LP candidates. Finally, this paper has also focused on an experiment using a template matching technique with normalized cross-correlation to recognize Korean plate characters. The color arrangement and the predetermined alphanumeric characters of the Korean license plate are important features for verification and detection of license plate regions. While conducting the experiments, different viewpoints, illumination conditions, and varied distances between vehicle and camera often occurred; in such cases, the results confirm that the proposed method is very effective. However, the proposed method is sensitive to motion blur in the input image. We leave these issues for consideration in future studies.

Acknowledgments. This research was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the Human Resources Development Program for Convergence Robot Specialists support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2010-C7000-1001-0007).
References
1. Pan, M.-S., Xiong, Q., Yan, J.-B.: A new method for correcting vehicle license plate tilt. Int. J. of Automation and Computing 6(2), 210–216 (2009)
2. Deb, K., Vavilin, A., Kim, J.-W., Kim, T., Jo, K.-H.: Projection and Least Square Fitting with Perpendicular Offsets based Vehicle License Plate Tilt Correction. In: Proceedings of the Society of Instrument and Control Engineers, pp. 3291–3298 (2010)
3. Jia, W., Zhang, H., He, X.: Region-based License Plate Detection. J. Network and Comput. Applications 30(4), 1324–1333 (2007)
4. Chang, S.-L., Chen, L.-S., Chung, Y.-C., Chen, S.-W.: Automatic license plate recognition. IEEE Trans. Intell. Transp. Syst. 5(1), 42–53 (2004)
5. Martin, F., Garcia, M., Alba, J.L.: New Methods for Automatic Reading of VLP's (Vehicle License Plates). In: Proceedings of the IASTED Int. Conf. on SPPRA (2002)
6. Comelli, P., Ferragina, P., Granieri, M.N., Stabile, F.: Optical recognition of motor vehicle license plates. IEEE Trans. Veh. Technol. 44(4), 790–799 (1995)
Intelligent Page Recommender Agents: Real-Time Content Delivery for Articles and Pages Related to Similar Topics
Robin M.E. Swezey, Shun Shiramatsu, Tadachika Ozono, and Toramatsu Shintani
Graduate School of Engineering, Nagoya Institute of Technology, Gokiso-cho, Showa-ku, Nagoya, Aichi, 466-8555 Japan
{robin,siramatu,ozono,tora}@toralab.ics.nitech.ac.jp
Abstract. In this paper, we present an architecture and sample implementation of a system which allows us to push latest up-to-date related contents to any Web news article or page in real-time. The architecture makes use of page Agents which recommend the contents and are persistent as well as synchronized over all page instances in browsers. The Agents are easy to incorporate on any Web page and make use of state-of-the-art Web technology. In our sample implementation, we show how our Agents, coupled with a Complementary Naive Bayes classifier, can recommend latest contents related to 47 Japanese prefectures and over 1700 Japanese cities. We show performance results and conclude on further research to improve the affiliate and user experience on the Web.
1 Introduction
In a previous paper [1], we proposed an architecture for a Multi-Agent System to assist bloggers and webmasters of topic-oriented content sites in finding suitable and fair affiliation proposals for their affiliate page slots. We aimed at reducing the load of the affiliate search process and supporting a mutually winning strategy when exchanging links between websites. In another paper [2], we presented, as work in progress, the system on which we rely for heavy data harvesting and on-the-fly web page modification with push-type technology. That system allows us to push content in real-time to precise locations on a web page, using a server-side scheduler. In this paper, we present a system which lets Web pages stay up-to-date by using Real-Time Affiliate Content Agents. The implications of this technology are broader, but for the case at hand we name them Recommender Agents. The system makes use of Page Agents, i.e., Agents nested inside Web pages, which assist in improving the contents of the related-articles section of a given article. More than helping find useful affiliates for shared sections of the site such as the blogroll [1], it actually selects the latest up-to-date related articles in real-time for the pages the agents run on. The set from which the related affiliate content is drawn can be open (all existing web content in the system) or closed (affiliate content only).
With a sample implementation, developed as part of a Web platform whose objective is to involve citizens in regional debate in Japan, we show that our architecture is valid for real-world implementation, provided realistic performance requirements are met by the classifier. In the sample implementation, the latest up-to-date news and articles are fetched from the open Web, classified for each city of Japan, and then pushed to articles and news related to the same city. We first state the context of the research. Second, we detail our architecture and how the current system works. Third, we show the sample implementation of our system, which fetches the latest affiliate news contents related to 47 Japanese prefectures and over 1700 Japanese cities; we detail technical challenges and show achievements and performance. We then conclude and discuss further research on the subject.
2 Context
2.1 Affiliation
We call affiliation the process of linking between websites when this linking holds more meaning than a simple reference. Besides friendship or common interest, affiliation seeks to share visitors as well as to raise site awareness mutually. Affiliation is also most commonly understood as commercial affiliation, i.e., a generally unidirectional link, textual or graphic, from an editor (blog, website) to an announcer (commercial or promotion site). The first important dimension in settling community or commercial affiliation consists of the statistics of the editor; in community affiliation it generally concerns both sites, since both are editors. Those statistics are weighed more or less heavily in community affiliation, but they are vital in commercial affiliation, especially when there is no third-party broker between the editor and the announcer. They are also important for the webmaster to study the audience. This dimension was assessed using the Agents described in Section 2.3. The second important dimension is context. Affiliate content does not solely consist of community affiliate links or advertisement blocks; it can also be imported textual content such as news feed content. At any rate, we define affiliate content as content which is related more or less to each page's topic, or to the reader's profile (targeted advertisement). Therefore, affiliate content that can be pushed dynamically depending on the context of the page is of high value to advertisers, or to investigators of news/topic background and continuity, such as those concerned by our sample implementation.
2.2 Limitations of Recommended Contents Sections
For commercial affiliate contents: Most context-related affiliate content today is provided through a system known as sponsored search auctions [3]. This system is flawed in the sense that any bidder with enough bidding power can win the auction and show totally unrelated content for any keyword, as long as its bid is high enough.
For related topic contents: Often, related contents sections on Web articles will present flaws such as the following: – It is frozen in time: cached at the time of page generation, or used as a source at the time of writing. It does not recommend latest up-to-date content on the topic, if the article is too old but the topic itself still has ongoing events. – It is limited to the domain/site at hand and does not recommend content from other sources. On recommender systems and synchronous Javascript: Though little research seems to have been done on these issues, we can notice a general trend in real-time recommendation that is aimed at live information streams such as Twitter [4,5]. Our system has the same objective, only that it recommends related contents in real-time directly when users are browsing article pages relevant to a given topic. 2.3
2.3 Technology Developed Until Now
Using our previous research on channelizing technology and server-side scheduling [2] (see Fig. 1), website or blog editors can measure the utility of each page slot on the web page, for the various affiliate content blocks in which contents can be recommended. This utility can be measured by gathering statistics from various data, such as page scrolling, mouse movement, and so forth. The event data is then analyzed and interpreted to model reader behavior. These functionalities are implemented using machine learning technology [6]. Editors can push contents in real-time to the location they want on their Web pages, at the time they wish. This Agent technology can be used independently on CMS-generated Web pages, such as typical blog pages or news sites, as well as on plain HTML pages.
Fig. 1. Push-type Content Delivery Mechanism on Wisdom Ad Balloon
One of the system's strong points is that it only requires modifying the HTML page, or the layout template in the case of a CMS. In the latter case, modifying the layout template is generally much easier and does not require any advanced programming skill at all. We have further improved the system in the research at hand, with Recommender Agents.
3 General Architecture
In this section, we describe our new framework and its architecture for real-time recommendation of contents. Hence, in the rest of this paper, we use the terms Page Agent and Recommender Agent interchangeably.
3.1 Modules
Our system relies on the following modules:

Table 1. Modules
Feed Engine – Aggregates RSS feeds from various sources and processes new articles
Agent Server – Server interface for the Recommender Agents
Classifier Engine – Classifies the new articles: can output a single class or a vector of classes for the articles (in case we want to recommend by cosine similarity)
Database – Known articles and their respective classes
Recommender Agents – Page Agents running on instances (Web pages opened by users) of articles and recommending content accordingly
3.2 Agent Interface
Using our previously implemented system and considering other research (see the previous section), we have taken our Page Agent framework a step further by allowing it to use synchronized Javascript objects over numerous instances of opened pages in many browsers. The Agent consists of two parts:
– The Agent itself, which interacts with the page, replaces page slots with content by channelizing, and can store its own history and local data.
– The shared memory, a synchronized Javascript object shared by the Agents over all article pages opened in browsers which relate to the same class (topic). We call this object a Wisdom Shared Object (WSO).
As all operations on the synchronized socket objects are performed on the client side, it is also important to take into account the atomicity of some of these operations. Data corruption can occur if two or more browser agents (i.e., execution
threads in the general case) try to write into a shared property of the synchronized object, or if one agent reads a shared property and begins an operation on it before another has finished writing into it. A trivial example is that of a counter increment, with the counter stored in a shared property: in the operation counter = counter + 1, run by agents A and B using the shared property counter, a save (write) can occur between the load (read) and the save (write) of this operation. To address this synchronization issue, several traditional methods exist which can be adapted [7,8], but we choose a timestamp/revision number technique [9], which is more up-to-date and best suited for client-server handling of conflicts.
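As a rough illustration of such revision-number handling (not the WSO protocol of [9] or the authors' server API), the sketch below rejects a write whose base revision is stale, so the client agent must re-read and retry, which makes the counter example safe.

```python
# Sketch of optimistic, revision-number based conflict handling for a shared
# object; names and the retry policy are illustrative, not the authors' API.
import threading

class SharedObjectStore:
    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}                         # object_id -> (revision, value)

    def read(self, object_id):
        with self._lock:
            return self._data.get(object_id, (0, None))

    def write(self, object_id, base_revision, new_value):
        """Accept the write only if the client saw the latest revision."""
        with self._lock:
            revision, _ = self._data.get(object_id, (0, None))
            if base_revision != revision:
                return False, revision          # stale: client must re-read and retry
            self._data[object_id] = (revision + 1, new_value)
            return True, revision + 1

def increment_counter(store, object_id):
    # The counter example from the text: load, modify, and conditionally save.
    while True:
        revision, value = store.read(object_id)
        ok, _ = store.write(object_id, revision, (value or 0) + 1)
        if ok:
            return
```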
3.3 Workflow
Client-Server workflow
1. A user opens the article a in a Web browser.
2. The Agent located on a sends a's URI and contents to the Agent Server when the article is opened on the Web for the first time, for example by its author. The article can also be input from other sources, such as a Feed Engine.
3. The Agent Server queries the Database to find a's topic class ca. If no result is found, i.e., a is new to the system, the Agent Server retrieves a class ca from the Classifier Engine and saves (a, ca) to the Database (a minimal sketch of this lookup step is given after the workflows below). In the case of a vector implementation, instead of just ca we have a vector of pairs (ci, wi)a, with wi,a the weight of the geographic class ci in relation to a.
4. In response to the Recommender's request, the Agent Server pushes ca, or the vector of pairs (ci, wi)a in the case of a vector implementation.
5. The Recommender fetches a Javascript shared object, building its identifier from ca or the vector (ci, wi)a. This object contains recommendations related to a's geography, and can be fetched from the same Agent Server or another one.
Recommender Agent workflow
1. The Agent automatically builds an Article Object out of the Web page it runs on, by extracting HTML tags marked with specific IDs. The final object has the same properties as an RSS feed item or HTML5 article: URI, title, article text, etc.
2. As seen in the previous workflow, the Agent sends the Article Object to the Agent Server in order to get either its most probable Topic Class as a scalar, or a Topic Class vector.
3. After having been given a Topic Class ID by the Agent Server, the Agent loads a Shared Object (see Sect. 3.2) associated with the ID of the Topic Class, which is a synchronized shared Javascript object between Agents running on similar topic pages. In the case of a vector implementation, the shared Javascript object is associated with the closest centroid vector of clustered articles dimensioned by the
geographic classes, which we can find by cosine similarity for the article at hand.
4. The Agent, as well as the other Agents using the same shared object, reorganizes the recommended contents in it. Agents can be extended to use their own heuristics: the article's date of publication, a visitor counter for each article, and other heuristics depending on the Agent.
5. The Recommender Agent shows the recommended contents, and modifies them in real-time if necessary, in a related contents section of the Web page.
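The lookup step referenced in the client-server workflow above can be sketched as below; the class names, storage and the stand-in keyword classifier are placeholders, not the authors' implementation.

```python
# Sketch of the Agent Server side of step 3: look up an article's topic class,
# classifying and caching it on first sight (names and storage are placeholders).
class AgentServer:
    def __init__(self, classifier):
        self.classifier = classifier   # object exposing classify(text) -> class id
        self.database = {}             # article URI -> topic class

    def topic_class(self, uri, text):
        if uri not in self.database:
            self.database[uri] = self.classifier.classify(text)
        return self.database[uri]

    def shared_object_id(self, uri, text):
        # The Recommender Agent builds the WSO identifier from the topic class.
        return "wso:{}".format(self.topic_class(uri, text))

class KeywordClassifier:
    # Stand-in classifier: picks a city name found in the text (placeholder logic).
    def __init__(self, cities):
        self.cities = cities

    def classify(self, text):
        return next((c for c in self.cities if c.lower() in text.lower()), "unknown")

server = AgentServer(KeywordClassifier(["Nagoya", "Sapporo"]))
print(server.shared_object_id("http://example.jp/article1", "Festival news from Nagoya today"))
```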
4 Sample Implementation and Results
4.1 Aim and Original Project
We implemented a sample system based on this architecture, as part of the development of a Web platform whose objective is to involve citizens in regional debate, based on [10]. This platform is itself a project funded by the Japanese Ministry of Internal Affairs and Communications, as part of the SCOPE¹ competitive nationwide research development programme in Japan. In this sample implementation, we aim to fetch the latest up-to-date news and articles from the open Web, classify them for each city of Japan, and push them to articles and news related to the same city, to keep local citizens informed and aware.
4.2 Classifier Module
For the Classifier Module, we chose to use a Complementary Naive Bayes classifier (CNB) [11], because of the lack of data for certain classes, e.g., cities and prefectures of Japan which are not popular and do not provide an amount of training/testing data comparable to, for example, that of the wards of the Tokyo metropolis. The data set for training and testing was built from the Japanese Wikipedia article base, by adding each article as a document for the prefecture/city/district class it relates to. For all 2024 classes², a total of 105018 documents was gathered, which amounts to 299889 documents when building the hierarchical set: documents from each city are duplicated for the prefecture they belong to, and documents from each ward are likewise duplicated for the city they belong to³. We conducted a closed test. To train and test the classifier, we used Apache Mahout⁴ over an Apache Hadoop⁵ cluster of 3 machines.
¹ SCOPE: Strategic Information and Communications Research & Development Promotion Programme.
² 47 prefectures, 1811 cities including the Tokyo metropolis wards, and 166 wards.
³ During the hierarchical classification, the metropolises of Tokyo, Osaka, Kyoto count as prefectures, and their districts (wards) as independent cities.
⁴ http://mahout.apache.org/
⁵ http://hadoop.apache.org/
Mahout implements CNB as a Transformed Weight-normalized Complement Naive Bayes (TWCNB), as described in [11] (steps 1-8). We modified the CNB algorithm of Mahout and added a few other features so that the classifier could be tested in two fashions: flat and hierarchical. Because our geographic topical classes can be broken down into a three-level class tree (prefectures/cities/districts), and because of the large number of classes, we felt that implementing hierarchical testing of documents [12,13], as opposed to naive flat testing, would let us achieve better results in:
– Accuracy: If the classes can be broken into a class tree, especially when the number of classes is large, we have a better chance of achieving good accuracy if we first classify under a common/parent category (the prefecture) before looking for the right child category (the city), instead of directly looking for the city.
– Performance: The first classification check is against 47 prefectures, and the second against 1810/47 ≈ 38 cities, which amounts to roughly 47 + 38 = 85 score calculations for each document, if classes were equiprobable. If we did a flat classification directly over the 1810 cities, it would amount to 1810 score calculations, resulting in a 2029% time cost increase.

Table 2. Performance of Flat Classifier
Classes                              1810 cities
Classes for flat training            1810 (cities only)
Correctly Classified Instances       69944 (42.086%)
Incorrectly Classified Instances     35074 (57.914%)
Total Classified Instances           105018
Avg. Time to Classify one Instance   522 ms

Table 3. Performance of Hierarchical Classifier
Classes                              2024: 47 prefectures, 1810 cities, 167 districts
Correctly Classified Instances       69944 (66.602%)
Incorrectly Classified Instances     35074 (33.398%)
Total Classified Instances           105018
Avg. Time to Classify one Instance   38.6 ms
From these results, the hierarchical classifier outperforms the flat classifier for real-world application, notably in processing-time performance. The accuracy of the hierarchical classifier can still be improved, however, and will be the object of further research. We also intend to perform open tests as well as N-fold tests of the classifier.
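A minimal sketch of such two-stage classification is shown below, using scikit-learn's Complement Naive Bayes in place of the modified Mahout TWCNB; the tiny corpora and city names are invented placeholders.

```python
# Sketch of hierarchical classification with Complement Naive Bayes: a
# prefecture-level classifier first, then a city-level classifier restricted to
# the predicted prefecture. The tiny corpora below are invented placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import ComplementNB
from sklearn.pipeline import make_pipeline

def train(docs, labels):
    model = make_pipeline(CountVectorizer(), ComplementNB())
    model.fit(docs, labels)
    return model

# Placeholder Wikipedia-like snippets keyed by (prefecture, city).
corpus = [
    ("Aichi", "Nagoya", "Nagoya castle port industry automotive"),
    ("Aichi", "Toyota", "Toyota city automotive manufacturing plants"),
    ("Hokkaido", "Sapporo", "Sapporo snow festival beer ramen"),
    ("Hokkaido", "Hakodate", "Hakodate harbor night view seafood"),
]

prefecture_model = train([d for _, _, d in corpus], [p for p, _, _ in corpus])
city_models = {
    pref: train([d for p, _, d in corpus if p == pref],
                [c for p, c, _ in corpus if p == pref])
    for pref in {p for p, _, _ in corpus}
}

def classify(article_text):
    pref = prefecture_model.predict([article_text])[0]   # stage 1: prefecture
    city = city_models[pref].predict([article_text])[0]  # stage 2: city within it
    return pref, city

print(classify("snow festival and ramen in the city"))
```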
4.3 Page Agent
The Page Agent Module was implemented using the WebSocket protocol of HTML5 (draft 76⁶). After following the protocol described in Section 3.3 and fetching the shared synchronized Javascript object related to the Topic Class (here, a city) of the article currently displayed, the Agent implements the following delegates of the object:

Table 4. Javascript Wisdom Shared Object Delegates
open() – Actions to perform when the shared object is opened on the current Web page (e.g., populate the related contents list next to the article)
close() – Actions to perform when the shared object is closed on the current Web page
update() – Actions to perform when the shared object is modified by this or another Agent using the object, and its new state is pushed to the Page
error() – Actions to perform when an error occurs
One of the interesting actions that the Recommender Agent can perform, besides the real-time modification of the related contents section of the Web page, is the modification of the page template itself, according to what is indicated for the Topic Class of the shared object it fetches.

Extension to custom affiliate template modification: This type of affiliate content is usually more characteristic of commercial affiliation.
Fig. 2. Differences between Regular Affiliate Content Blocks and Custom Affiliate Templates
⁶ http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-76
It basically consists of modifying the appearance (background, colors, images), namely the template, of the main page or of a specific subset of pages, depending on the contract (see Fig. 2). The resulting appearance helps the affiliate by raising its brand awareness, e.g., by showing the brand name or advertising a product of the brand. A good example of this custom affiliate template strategy can often be seen on video game news sites promoting an editor's new game. One example application in the case at hand is, when reading an article about a particular city of Japan, recommending the forthcoming matsuri⁷ of this city by changing the appearance of the news site or blog being browsed. Another example is a more visible alert in case of localized disasters such as earthquakes (we have developed such a system, which we call Wisdom Alert).
5 Conclusion
In this paper, we have introduced a novel method for recommending the latest up-to-date contents in a set of related events or topics, dynamically and in real-time, on any article Web page related to those events or topics. We made use of persistent and synchronized recommender agents which can easily be extended to implement many functionalities. We considered accuracy, processing time and ease of implementation in our approach, and we have addressed each of these challenges in an example implementation of our architecture. In this implementation, we classify articles from 47 Japanese prefectures and over 1700 Japanese cities. By making use of hierarchical classification, we are able to categorize articles faster and more accurately, and we demonstrate that such a system is scalable and usable in real-world applications. As for the recommender in the client Web pages, it is based on Javascript agents with synchronized shared memory as well as local memory, which require no server code, only a request handler or proxy on the same domain. The algorithm can also be used with traditional AJAX and long-polling to suppress this need. The challenges for future research are the following: improving placement in the recommender slots through more subtle heuristics, which can incorporate, but are not limited to, CTR, the impact factor of the article, page ranking, etc. We also need to improve the efficiency of our Complementary Naive Bayes classifier in the sample implementation; for this, we plan on testing with bi-grams and tri-grams, and on dealing with the classes holding the largest amounts of training documents (Tokyo, Osaka, Hokkaido). As for the testing of the classifier, we are planning to do 10-fold cross-validation tests. Finally, in further papers, we will present new modules in our system for supporting large-scale participative debate.
⁷ Town Festival. Most Japanese cities have their own.
References
1. Swezey, R., Nakamura, M., Shiramatsu, S., Ozono, T., Shintani, T.: Architecture for Automated Search and Negotiation in Affiliation among Community Websites and Blogs. In: Proceedings of the Twenty Third International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA-AIE (2010)
2. Nakamura, M., Asami, S., Ozono, T., Shintani, T.: A Dynamic Rearrangement Mechanism of Web Page Layouts Using Web Agents. In: Proceedings of the 22nd International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems: Next-Generation Applied Intelligence, p. 643. Springer, Heidelberg (2009)
3. Lahaie, S., Pennock, D.M., Saberi, A., Vohra, R.V.: Sponsored search auctions. Algorithmic Game Theory, 699–716 (2007)
4. Chen, J., Nairn, R., Nelson, L., Bernstein, M., Chi, E.: Short and tweet: experiments on recommending content from information streams. In: CHI 2010: Proceedings of the 28th International Conference on Human Factors in Computing Systems, pp. 1185–1194. ACM, New York (2010)
5. Phelan, O., McCarthy, K., Smyth, B.: Using twitter to recommend real-time topical news. In: RecSys 2009: Proceedings of the Third ACM Conference on Recommender Systems, pp. 385–388. ACM, New York (2009)
6. Velayathan, G., Yamada, S.: Behavior-based web page evaluation. In: Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, pp. 409–412. IEEE Computer Society, Los Alamitos (2006)
7. Lamport, L.: A new solution of Dijkstra's concurrent programming problem. Commun. ACM 17, 453–455 (1974)
8. Peterson, G.L.: Myths about the mutual exclusion problem. Information Processing Letters 12(3), 115–116 (1981)
9. Cannon, B., Wohlstadter, E.: Automated object persistence for Javascript. In: Proceedings of the 19th International Conference on World Wide Web, WWW 2010, pp. 191–200. ACM, New York (2010)
10. Macintosh, A., Gordon, T.F., Renton, A.: Providing Argument Support for E-Participation. Journal of Information Technology and Politics 6(1), 43–59 (2009)
11. Rennie, J., Shih, L., Teevan, J., Karger, D.: Tackling the poor assumptions of naive Bayes text classifiers. In: Machine Learning - International Workshop then Conference, vol. 20, p. 616 (2003)
12. Demichelis, F., Magni, P., Piergiorgi, P., Rubin, M., Bellazzi, R.: A hierarchical naive Bayes model for handling sample heterogeneity in classification problems: an application to tissue microarrays. BMC Bioinformatics 7(1), 514 (2006)
13. Langseth, H., Nielsen, T.D.: Classification using hierarchical naive Bayes models. Mach. Learn. 63, 135–159 (2006)
Meta-learning Based Optimization of Metabolic Pathway Data-Mining Inference System
Tomás V. Arredondo, Wladimir O. Ormazábal, Diego C. Candel, and Werner Creixell
¹ Departamento de Electrónica, ² Departamento de Informática, Universidad Técnica Federico Santa María, Av. España 1680, Valparaíso, Chile
³ CSIS, The University of Tokyo, Tokyo, Japan
[email protected]
Abstract. This paper describes a novel meta-learning (MTL) based methodology used to optimize a neural network based inference system. The inference system being optimized is part of a bioinformatic application built to implement a systematic search scheme for the identification of genes which encode enzymes of metabolic pathways. Different MTL implementations are contrasted with manually optimized inference systems. The MTL based approach was found to be flexible and able to produce better results than manual optimization.
1 Introduction
A common problem in bioinformatic research consists of determining the genes that encode the enzymes of a particular metabolic process of interest. Obtaining this information is essential to produce a functional map of the metabolic capabilities of an organism and can help determine the advantages of using one specific microorganism in a particular application. In our research, we are interested in determining the genes which encode enzymes involved in the bacterial degradation of aromatic compounds that may be harmful to human health. Given that not all bacteria are capable of performing such degradation, it is important to determine the genes that are responsible for these functions, because they provide a fingerprint which can be used to assess and understand such a capability in various bacteria. It is toward this goal that we search for the genes that encode the enzymes (i.e., proteins) which participate in the process of aromatic compound degradation. With these genes, a search for other such organisms can take place using well-known tools and databases such as BLAST [1] and Vector NTI [2]. These tools perform DNA sequence alignment and search in order to find a sequence segment (known as an Open Reading Frame or ORF) which codifies a protein (known as a coding sequence or CDS). The data sources used for such comparisons store DNA and protein sequences and metabolic pathway maps (MetaCyc [3], UM-BBD [4], KEGG [5], NCBI [6], amongst others).
The integration of bioinformatic data sources is a continuing problem, given the complexity and skills required to utilize them in an efficient manner. With this in mind, we have produced two such applications: GeXpert ([7],[8]) and Phylogist [10]. Both of these systems attempt to improve and optimize the process of integrating data and applications from different public sources. Part of the workflow of these applications involves the selection of candidate genes which encode an enzyme of interest. In both applications, an inference system is used to classify the candidates based on genes previously analyzed for the organism under study. Previous approaches toward metabolic network reconstruction have used various algorithmic methods, such as name matching and using EC codes to link metabolic information to genes (e.g., IdentiCS [9], metaSHARK [14]). The AUTOGRAPH (AUtomatic Transfer by Orthology of Gene Reaction Associations for Pathway Heuristics) method [15] uses manually curated metabolic networks, orthologs and their associated reactions to compare predicted gene-reaction associations. Our current approach to optimizing the inference system is applicable to any such data-mining application. In Section 2, we describe the bioinformatics application and the inference system used. In Section 3, we expose design details and the proposed implementation solution. We describe our experiments and the results obtained in Section 4. Finally, in Section 5, we present conclusions and outline ideas for future work.
2 Bioinformatic Inference Systems
2.1 Integration Framework Process
Toward determining whether certain genes participate in a metabolic pathway of interest (e.g., biodegradation of an aromatic compound), many current bioinformatic data-mining systems ([7]-[12]) utilize an iterative workflow. As in [12], an Artificial Neural Network (ANN) is used to estimate the quality of an alignment match of a candidate DNA sequence, in terms of its corresponding to the gene encoding a specific protein of the metabolic pathway of interest. This inference system uses a selection of BLAST results as inputs and seeks to improve the selection of genes based on meta-optimization. In addition, the information provided by experts on previously researched genes must be scored in the organism under study, and the inference system (i.e., the Learner) must learn from this information in order to generate future scorings for other genes.
2.2 Inference System Workflow
The idea behind our proposed workflow as used in [13] is that once an alignment is performed, the results returned by the BLAST algorithm are passed to an ANN classifier which provides the user with an indicator of its goodness and also permits a constant updating of the training data with the objective of continually improving the neural network.
The input parameters correspond to the BLAST sequence alignment scores, which are defined as follows:
– Identity: Coincidence percentage.
– Positives: Symbol similarity percentage.
– E-value: Random match result probability.
– Gaps: Missing sequence segment percentage.
– BitScore: Statistical significance measure.
The inference system output corresponds to an indicator of the goodness of the candidate gene alignment, which takes continuous values in the interval [0, 1]. Values higher than 0.5 indicate a good candidate; otherwise it is a bad candidate.
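As a rough sketch of this inference step (the actual network structure and training procedure are described in the following sections), the code below maps the five BLAST scores to a goodness value in [0, 1] with a small feed-forward network; the sample alignments and expert labels are invented.

```python
# Sketch: a small feed-forward network scoring BLAST alignments.
# Inputs: [identity, positives, e_value, gaps, bit_score]; output around [0, 1].
# The sample alignments and expert labels below are invented placeholders.
import numpy as np
from sklearn.neural_network import MLPRegressor

X = np.array([
    [0.95, 0.97, 1e-80, 0.00, 350.0],   # strong match
    [0.40, 0.55, 1e-03, 0.10,  60.0],   # weak match
    [0.88, 0.92, 1e-50, 0.02, 280.0],
    [0.30, 0.45, 5e-01, 0.20,  35.0],
])
y = np.array([1.0, 0.0, 1.0, 0.0])      # expert scoring of previously analyzed genes

model = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
model.fit(X, y)

score = float(model.predict([[0.90, 0.94, 1e-60, 0.01, 300.0]])[0])
print("good candidate" if score > 0.5 else "bad candidate", round(score, 3))
```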
3 Meta-learning Based Solution
It is possible to find diverse approaches that have been categorized as MTL [16,17]; all of these approaches eventually aim to obtain the best result from the learning process. Some have centered on the study of the nature of the data, such as the number of characteristics to analyze, the classes being classified or the amount of training data, amongst others [16,17] (e.g., Landmarking [18]); others have applied combinations of base learners (e.g., Stacked Generalization [19]). The majority of previous works [20,21,22] consider only certain aspects of interest (e.g., learning parameters or structural elements of the learning model). To the best of our knowledge, only [23] approaches both parametric aspects and model configuration, but in a manner specific to its context. Our current proposal is a general conceptual framework that generalizes the concept of MTL independently of the inference system (i.e., learner or weak learner) and the algorithm being used to optimize it (e.g., backpropagation, genetic algorithms, etc.).
3.1 Training and Validation Data Definition
There are diverse possible data sets, given a determined context or problem of interest. Consider one of these possible sets, D, as a set of N elements:

D = {d1, d2, . . . , dN}, |D| = N    (1)

where:
– di = (Ii, Oi), with 1 ≤ i ≤ N.
– Ii: Input values.
– Oi: Output values.
– |X|: Cardinality of the set X.

Following [24], it is possible to divide the set D into r disjoint subsets T:

D = T1 ∪ T2 ∪ . . . ∪ Tr, with Ti ∩ Tj = ∅, ∀ i, j ∈ {1, 2, . . . , r}, i ≠ j    (2)
Starting from these r subsets, we can generate a training set (Si) and a test set (Ti). Ti corresponds to one of the r subsets, and Si is the union of the r − 1 remaining subsets. We thus define a training set Si as:

Si = D − Ti, with D = {T1, T2, . . . , Ti, . . . , Tr}    (3)

Considering expression (2), we establish that:

D = Ti ∪ Si, ∀ i ∈ {1, 2, . . . , r}    (4)
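A direct rendering of expressions (2)-(4) is sketched below: D is split into r disjoint subsets and each fold Ti is paired with the training set Si formed by the remaining folds; the shuffling and example data are incidental choices.

```python
# Sketch of the r-fold partition of eqs. (2)-(4): D = Ti ∪ Si for every fold i.
import random

def r_fold_partition(D, r, seed=0):
    """Return a list of (S_i, T_i) pairs with the T_i disjoint and covering D."""
    items = list(D)
    random.Random(seed).shuffle(items)
    folds = [items[i::r] for i in range(r)]          # r disjoint subsets T_1..T_r
    return [(sum(folds[:i] + folds[i + 1:], []), folds[i]) for i in range(r)]

# Example with d_i = (I_i, O_i) pairs (placeholder data).
D = [((0.9, 0.95, 1e-40, 0.0, 200.0), 1.0), ((0.3, 0.4, 0.5, 0.2, 30.0), 0.0)] * 5
for S_i, T_i in r_fold_partition(D, r=5):
    assert len(S_i) + len(T_i) == len(D)             # eq. (4): D = T_i ∪ S_i
```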
3.2 Training Methods and Inference System Model
Extending the idea of learner (i.e., inference system) training used in [24], we define Lk as a learner formed by the pair (learning model, learning method):

Lk = (modi, trj), with modi ∈ MOD, trj ∈ TR    (5)

where MOD is the set of possible learning models (e.g., neural network, fuzzy inference system, etc.) and TR is the set of possible learning methods that can be used to train those models (e.g., backpropagation, genetic algorithm, etc.). Consider L as the set of possible different learners, that is, the valid combinations (modi, trj) of learning models and training methods:

L = {L1, L2, . . . , Lk, . . .}, k ∈ N    (6)

Considering expressions (5) and (6), as well as the aforementioned restriction, it is possible to assume that:

|MOD × TR| ≥ |L|    (7)

For every Lk, a vector p can be defined that contains all parameters of the training method being used by Lk. We let Pk be the set of all γ possible p vectors:

∀ Lk ∃ a Pk = {pk,1, pk,2, . . . , pk,γ}    (8)

In a similar vein, we define the set Mk of all θ possible inference system model configurations, each one represented by a vector m:

∀ Lk ∃ a Mk = {mk,1, mk,2, . . . , mk,θ}    (9)
3.3 Hypothesis
We understand a hypothesis h as an instance of an inference system type (e.g., a neural network learner) after it has undergone the learning process described in [24]. There is an infinite set of hypotheses h associated with each one of the
learners in L. For example, HA is the set of possible hypotheses generated from learner LA:

LA → HA    (10)

Finally, considering all factors previously mentioned, it is possible to establish that a hypothesis h depends on the learner LA being used, on the training data (Si), on the model being used (MA), and on the parameter values of the training method being used (PA). We define H^{l,j}_A(pA,l, mA,j, D) as the set of all hypotheses generated using vectors pA,l and mA,j while trained with Si (1 ≤ i ≤ r) obtained from D:

H^{l,j}_A(pA,l, mA,j, D) = { h^1_A(pA,l, mA,j, S1), . . . , h^r_A(pA,l, mA,j, Sr) }    (11)
We define hA,l,j,i as a hypothesis generated using vectors pA,l and mA,j, and the training set Si. Also, the output of hypothesis hA,l,j,i given an input Ix is denoted hA,l,j,i(Ix). As seen in Fig. 1, the MTL seeks to obtain the best possible values for the parameter vector (pA,l); this process is denoted as tuning. It is also tasked with finding the best possible values for the model configuration (mA,j); this process is known as fitting. For a particular template, the generated hypotheses allow us to obtain a measure of fitness for the template that instantiated them. The measure of fitness used to evaluate the diverse hypotheses generated is the mean error of the r test sets Ti applied to the corresponding hypotheses (i.e., EMTL). This fitness measure allows the MTL to evaluate and select the best combination of pA,l and mA,j found for a specific problem.

$$E_{MTL} = \frac{1}{r} \sum_{i=1}^{r} \left( 1 - \frac{\text{Correctly classified}}{\text{Total cases}} \right) \qquad (12)$$

Fig. 1. Inference System Meta-optimizer
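A minimal sketch of the fitness computation in expression (12), assuming each hypothesis is a callable that maps an input Ix to a predicted class and each test set Ti is a list of (Ix, Ox) pairs; the helper name and the toy example are illustrative.

```python
def e_mtl(hypotheses, test_sets):
    """Mean test-set classification error over the r folds, as in expression (12).

    hypotheses[i] is the hypothesis trained on S_i; test_sets[i] is T_i,
    a list of (I_x, O_x) pairs. Both lists have length r."""
    r = len(test_sets)
    total = 0.0
    for h, T_i in zip(hypotheses, test_sets):
        correct = sum(1 for I_x, O_x in T_i if h(I_x) == O_x)
        total += 1.0 - correct / len(T_i)
    return total / r

# Toy example with r = 2 folds and a trivial "is the input even?" hypothesis
h_even = lambda I_x: I_x[0] % 2 == 0
folds = [[((0,), True), ((1,), False)], [((2,), True), ((3,), True)]]
print(e_mtl([h_even, h_even], folds))   # 0.25
```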
4 Experiments and Results

4.1 Neural Networks
We use a multilayer feed-forward artificial neural network with one hidden layer and backpropagation for weight updating:

Δwij(n + 1) = η · δPj · oPj + α · Δwij(n)        (13)
We applied two implementations of backpropagation, the first one (NN1) with a fixed learning rate (η) and the second one (NN2) with a variable learning rate. According to [25], during each learning cycle the initial value of η decreases exponentially with the d factor until reaching the value ηlow. Upon reaching this value ηlow, η takes the value of ηhigh and the process repeats:

$$\eta(t) = \eta(t-1) \cdot \exp\!\left(\frac{\log\left(\eta_{low}/\eta_{high}\right)}{d}\right) \qquad (14)$$

In Table 1, we show the values used for each variable. The chosen parameters are those that were considered to have the greatest possible impact upon hypothesis generation.

Table 1. Variable Values and Ranges

Meta-learner   Ni   Nh       No   η         ηlow   ηhigh        α        d
NN1            4    1 - 30   1    0 - 0.1   N.A.   N.A.         0 - 0.9  N.A.
NN2            4    1 - 30   1    0.3       0.01   0.01 - 0.3   0 - 0.9  20 - 84
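The cyclic, exponentially decaying schedule of expression (14) can be sketched as follows; the reset-to-ηhigh rule is taken from the description above, while the function name and the example values (chosen from the Table 1 ranges) are assumptions.

```python
import math

def eta_schedule(eta_init, eta_low, eta_high, d, n_steps):
    """Cyclic, exponentially decaying learning rate of expression (14):
    eta is multiplied by exp(log(eta_low / eta_high) / d) at each step and
    reset to eta_high once it has dropped to eta_low."""
    decay = math.exp(math.log(eta_low / eta_high) / d)
    eta, schedule = eta_init, []
    for _ in range(n_steps):
        schedule.append(eta)
        eta *= decay
        if eta <= eta_low:
            eta = eta_high
    return schedule

# eta starts at 0.3 and cycles between eta_high = 0.1 and eta_low = 0.01 with d = 20
print(eta_schedule(0.3, 0.01, 0.1, 20, 60)[:5])
```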
To avoid overtraining, we utilized Ti to monitor the change in error according to (15). After each training round, the process is stopped if the error increases. We restricted the maximum number of training iterations to 20000 and the minimum to 1000. For the selection of training and test sets, we applied cross-validation random sub-sampling [26], in a 70% - 30% proportion, respectively.

$$Error_i = \frac{1}{|T_i|} \sum_{x=0}^{|T_i|-1} \frac{1}{2}\left(O_x - h_{A,l,j,i}(I_x)\right)^2 \qquad (15)$$
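A sketch of the early-stopping rule built on expression (15), assuming scalar outputs and a training callback that performs one backpropagation round and returns the updated hypothesis; all names are illustrative.

```python
def validation_error(h, T_i):
    """Expression (15): mean of half the squared output error over the test set T_i,
    where T_i is a list of (I_x, O_x) pairs with scalar outputs and h is callable."""
    return sum(0.5 * (O_x - h(I_x)) ** 2 for I_x, O_x in T_i) / len(T_i)

def train_with_early_stopping(train_round, h, T_i, max_iter=20000, min_iter=1000):
    """Stop training once the error on T_i increases, within the stated iteration bounds."""
    prev = float("inf")
    for it in range(1, max_iter + 1):
        h = train_round(h)                    # one backpropagation round (assumed callback)
        err = validation_error(h, T_i)
        if it >= min_iter and err > prev:     # error increased: stop to avoid overtraining
            break
        prev = err
    return h
```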
4.2 Meta-learning Algorithms
A genetic algorithm (GA) and Stochastic Hill Climbing (SHC) were used to implement the MTL process. Both modified the number of hidden-layer neurons (Nh) as well as the backpropagation parameters corresponding to each test case (fixed or dynamic learning rate).
Genetic Algorithm. The performance of the GA is associated with a series of parameters and considerations which determine its behavior. Previously, researchers have determined reasonable ranges for GA parameters to be used in a variety of optimization problems [27,28]. From these, we have chosen the following GA settings and parameters: tournament selection, one elite individual, 100 iterations, a 20-individual population, two-point crossover, a crossover probability of 0.95, and a mutation probability of 0.01. In GA-NN1, a granularity of 5 bits was used for Nh, 14 bits for η, and 14 bits for the momentum (α). For GA-NN2 we used 5 bits for Nh, 6 bits for ηhigh, 12 bits for α and 6 bits for d.

Stochastic Hill Climbing. Another algorithm utilized for MTL was SHC [29]. This algorithm modified the same NN parameters as the GA. We used an equivalent number of objective function evaluations in order to have a comparable effort with respect to the GA. The parameters used were 200 iterations and 10 restarts. In SHC-NN1, we used a granularity of 5 bits for Nh, 14 bits for η, and 14 bits for α. For SHC-NN2 we used 5 bits for Nh, 6 bits for ηhigh, 12 bits for α and 6 bits for d.
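The bit granularities above imply a simple fixed-point decoding of each GA/SHC chromosome into (Nh, η, α); the sketch below assumes the NN1 layout of 5 + 14 + 14 bits with the ranges of Table 1, while the bit ordering and helper names are assumptions.

```python
import random

def bits_to_scalar(bits, lo, hi):
    """Map a bit string (list of 0/1) to a value in [lo, hi]."""
    value = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)

def decode_nn1(chromosome):
    """Decode a GA-NN1 / SHC-NN1 chromosome: 5 bits for Nh, 14 bits for eta and
    14 bits for alpha, using the ranges of Table 1 (the bit layout is an assumption)."""
    nh_bits, eta_bits, alpha_bits = chromosome[:5], chromosome[5:19], chromosome[19:33]
    n_h = 1 + round(bits_to_scalar(nh_bits, 0, 29))   # hidden neurons in 1..30
    eta = bits_to_scalar(eta_bits, 0.0, 0.1)          # fixed learning rate
    alpha = bits_to_scalar(alpha_bits, 0.0, 0.9)      # momentum
    return n_h, eta, alpha

chromosome = [random.randint(0, 1) for _ in range(33)]
print(decode_nn1(chromosome))
```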
4.3 Results
In Table 2, the total average correct % classification results reached by all classifiers are compared. GA-NNX corresponds to the GA-based MTL training NNX, and SHC-NNX to the SHC-based MTL training NNX. Naive-NNX and Expert-NNX correspond to a superficial (3-hour) and a more detailed (6-hour) manual parameter search, respectively.

Table 2. Final Total Average Correct % Classification Results

              KT                 JMP                LB
Naive-NN1     0.9472 ± 0.0266    0.9324 ± 0.0168    0.8878 ± 0.0299
Expert-NN1    0.9608 ± 0.0148    0.9513 ± 0.0300    0.9175 ± 0.0224
Naive-NN2     0.9540 ± 0.0145    0.9391 ± 0.0264    0.9013 ± 0.0235
Expert-NN2    0.9581 ± 0.0134    0.9513 ± 0.0264    0.9202 ± 0.0148
SHC-NN1       0.9635 ± 0.0143    0.9581 ± 0.0195    0.9256 ± 0.0182
SHC-NN2       0.9581 ± 0.0085    0.9567 ± 0.0182    0.9189 ± 0.0222
GA-NN1        0.9729 ± 0.0124    0.9702 ± 0.0166    0.9229 ± 0.0231
GA-NN2        0.9729 ± 0.0099    0.9608 ± 0.0227    0.9229 ± 0.0231
Table 3 presents the best parameter and model values obtained by the different methods for the KT, JMP and LB data, respectively. In Table 4, we show cross-validation results obtained when testing the learners produced by GA-NN1 and GA-NN2 on all the data sets. The rows indicate the training data and the columns indicate the test data used.
Table 3. Best Parameter and Model Values Obtained

MTL (KT)   Ni  Nh  No  η        ηlow  ηhigh   α       d
GA-NN1     4   25  1   0.06256  N.A.  N.A.    0.3357  N.A.
GA-NN2     4   27  1   0.3      0.01  0.0471  0.4446  51
SHC-NN1    4   5   1   0.0318   N.A.  N.A.    0.8692  N.A.
SHC-NN2    4   19  1   0.3      0.01  0.0685  0.4775  53

MTL (JMP)  Ni  Nh  No  η        ηlow  ηhigh   α       d
GA-NN1     4   27  1   0.07543  N.A.  N.A.    0.6649  N.A.
GA-NN2     4   30  1   0.3      0.01  0.0614  0.0718  25
SHC-NN1    4   25  1   0.0683   N.A.  N.A.    0.5249  N.A.
SHC-NN2    4   20  1   0.3      0.01  0.0342  0.7712  74

MTL (LB)   Ni  Nh  No  η        ηlow  ηhigh   α       d
GA-NN1     4   15  1   0.09980  N.A.  N.A.    0.8825  N.A.
GA-NN2     4   27  1   0.3      0.01  0.0971  0.4428  81
SHC-NN1    4   5   1   0.0318   N.A.  N.A.    0.8692  N.A.
SHC-NN2    4   13  1   0.3      0.01  0.04    0.6793  68
Table 4. Cross Validation Results

GA-NN1      KT                JMP               LB
KT          0.9729 ± 0.0124   0.9621 ± 0.0188   0.9270 ± 0.0222
JMP         0.9594 ± 0.0142   0.9702 ± 0.0166   0.9283 ± 0.0211
LB          0.9364 ± 0.0247   0.9094 ± 0.0278   0.9229 ± 0.0231

GA-NN2      KT                JMP               LB
KT          0.9729 ± 0.0099   0.9472 ± 0.0224   0.9202 ± 0.0224
JMP         0.9635 ± 0.0143   0.9608 ± 0.0227   0.9283 ± 0.0191
LB          0.9418 ± 0.0191   0.9243 ± 0.0271   0.9229 ± 0.0231

5 Conclusions and Future Work
As seen from the test results, a clear benefit of using MTL can be observed. Better results are obtained than with manual optimization, and the complete automation of the search using well-known search algorithms (GA and SHC) relieves the user of a tedious and non-trivial task. Also, the proposed conceptual model leaves open a wide variety of potential enhancements and experiments without requiring model modifications, and considers aspects previously left out by other learner models. The MTL software developed for this optimization is open source and freely available [30]. The range of obtained performance results demonstrates that varying the selected parameter and model values has an impact in both the fixed and dynamic learning rate cases. The utilization of the backpropagation method with a dynamic
learning rate does not improve the obtained results; on the contrary, it seems to worsen them [25]. When comparing classification performance, we can observe that using a GA rather than SHC as the MTL produces better results. Even so, both GA and SHC obtain better average results than the manually optimized classifiers. Future work includes the utilization of other algorithms in the MTL, the verification of their performance on a wider variety of problems, as well as testing different parameter and model values of different granularity.
Acknowledgment. This research was partially financed by Universidad Técnica Federico Santa María DGIP Grants 231138 and 231021.
References
1. NCBI: Basic Local Alignment Search Tool, http://blast.ncbi.nlm.nih.gov/Blast.cgi
2. Invitrogen Corporation: Vector NTI Software, http://www.invitrogen.com/site/us/en/home/Applications/Cloning/Vector-Design-Software.html
3. National Institute of General Medical Sciences: MetaCyc Encyclopedia of Metabolic Pathways, http://www.metacyc.org/
4. University of Minnesota: University of Minnesota Biocatalysis/Biodegradation Database, http://umbbd.msi.umn.edu/
5. Kanehisa Laboratories: KEGG Pathways Database, http://www.genome.ad.jp/kegg/pathway.html
6. NCBI: National Center for Biotechnology Information, http://www.ncbi.nlm.nih.gov/
7. Arredondo, T., Seeger, M., Dombrovskaia, L., Avarias, J., Calderón, F., Candel, D., Muñoz, F., Latorre, V., Agulló, L., Cordova, M., Gomez, L.: Bioinformatics integration framework for metabolic pathway data-mining. In: Ali, M., Dapoigny, R. (eds.) IEA/AIE 2006. LNCS (LNAI), vol. 4031, pp. 917–926. Springer, Heidelberg (2006)
8. GeXpert Open Source Project, available on SourceForge, http://sourceforge.net/projects/gexpert
9. Sun, J., Zeng, A.P.: IdentiCS – Identification of coding sequence and in silico reconstruction of the metabolic network directly from unannotated low-coverage bacterial genome sequence. BMC Bioinformatics 5, 112 (2004)
10. Phylogist UTFSM Bioinformatics Project Website, http://www.feriadesoftware.cl/2007/phylogist/producto.html
11. Arredondo, T., Vásquez, F., Candel, D., Dombrovskaia, L., Agulló, L., Córdova, M., Latorre-Reyes, V., Calderón, F., Seeger, M.: Dynamic Penalty Based GA for Inducing Fuzzy Inference Systems. In: Rueda, L., Mery, D., Kittler, J. (eds.) CIARP 2007. LNCS, vol. 4756, pp. 957–966. Springer, Heidelberg (2007)
12. Leiva, M., Arredondo, T., Candel, D., Dombrovskaia, L., Agulló, L., Seeger, M., Vásquez, F.: Feed-Forward Artificial Neural Network Based Inference System Applied in Bioinformatics Data-Mining. In: Proceedings of IJCNN 2009, Atlanta, USA, June 14-19, pp. 1744–1749 (2009)
13. Burkholderia xenovorans LB400 at the Joint Genome Institute, http://genome.jgi-psf.org/finished_microbes/burfu/burfu.home.html
14. Pinney, J., Shirley, M.W., McConkey, G.A., Westhead, D.R.: metaSHARK: software for automated metabolic network prediction from DNA sequence and its application to the genomes of Plasmodium falciparum and Eimeria tenella. Nucleic Acids Research 33(4), 1399–1409 (2005)
15. Notebaart, R., van Enckevort, F., Francke, C., Siezen, R., Teusink, B.: Accelerating the reconstruction of genome-scale metabolic networks. BMC Bioinformatics 7, 296 (2006)
16. Anderson, M.L., Oates, T.: A review of recent research in metareasoning and metalearning. AI Magazine (2007)
17. Vilalta, R., Drissi, Y.: A Perspective View and Survey of Meta-learning. Journal of Artificial Intelligence Review 18(2), 77–95 (2002)
18. Bensusan, H., Giraud-Carrier, C.: Casa Batlo in Passeig or landmarking the expertise space. In: Eleventh European Conference on Machine Learning, Workshop on Meta-learning: Building Automatic Advice Strategies for Model Selection and Method Combination, Barcelona, Spain (2000)
19. Wolpert, D.: Stacked Generalization. Neural Networks 5, 241–259 (1992)
20. Aran, O., Yildiz, T., Alpaydin, E.: An incremental framework based on cross-validation for estimating the architecture of a multilayer perceptron. International Journal of Pattern Recognition and Artificial Intelligence 23(2), 159–190 (2009)
21. Hong, T.P., Lin, W.T., Chen, C.H., Ouyang, C.S.: Learning Membership Functions in Takagi-Sugeno Fuzzy Systems by Genetic Algorithms. In: First Asian Conference on Intelligent Information and Database Systems, pp. 301–306 (2009)
22. Pedersen, M.E.H., Chipperfield, A.J.: Simplifying Particle Swarm Optimization. Applied Soft Computing (2009)
23. Abraham, A.: Meta-learning Evolutionary Artificial Neural Networks. Neurocomputing 56 (January 2004)
24. Mitchell, T.: Machine Learning. McGraw-Hill, New York (1997)
25. Watkins, D.: Clementine's Neural Networks Technical Overview. Technical Report (1997)
26. Bouckaert, R.R., Frank, E.: Evaluating the replicability of significance tests for comparing learning algorithms. In: Dai, H., Srikant, R., Zhang, C. (eds.) PAKDD 2004. LNCS (LNAI), vol. 3056, pp. 3–12. Springer, Heidelberg (2004)
27. De Jong, K.A.: An Analysis of the Behavior of a Class of Genetic Adaptive Systems. Ph.D. Dissertation, University of Michigan, Ann Arbor, MI, USA. AAI7609381 (1975)
28. Grefenstette, J.: Optimization of Control Parameters for Genetic Algorithms. IEEE Transactions on Systems, Man, and Cybernetics 16(1) (1986)
29. Juels, A., Wattenberg, M.: Stochastic Hillclimbing as a Baseline Method for Evaluating Genetic Algorithms. Technical Report CSD-94-834, Department of Computer Science, University of California at Berkeley (1994)
30. SimMetaLib: Simple Meta-Learning Library, http://profesores.elo.utfsm.cl/~tarredondo/simmetalib.html
Multiple Pickup and Delivery TSP with LIFO and Distance Constraints: A VNS Approach

Xiang Gao 1, Andrew Lim 1, Hu Qin 2,1 (corresponding author), and Wenbin Zhu 3

1 Department of Management Sciences, City University of Hong Kong, Tat Chee Ave, Kowloon Tong, Hong Kong
2 School of Management, Huazhong University of Science and Technology, No. 1037, Luoyu Road, Wuhan, China
3 Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
[email protected]

Abstract. The pickup and delivery traveling salesman problem with last-in-first-out loading (TSPPDL) requires that a set of pickup and delivery requests are fulfilled such that the items to be transported are loaded in LIFO order. We generalize the TSPPDL to consider multiple vehicles as well as a maximum distance constraint for each vehicle; a solution to this problem is a set of routes for each vehicle, and each route can be represented by a tree data structure. To solve this problem practically, we propose a variable neighborhood search (VNS) heuristic using several new search operators. Our VNS approach also includes a dynamic programming component for optimally solving TSPPDL instances.

Keywords: traveling salesman problem; pickup and delivery; last-in-first-out loading; distance constraint; variable neighborhood search.
1 Introduction
The traveling salesman problem with pickup and delivery (TSPPD) is a well-studied problem. We are given a set R = {1, . . . , n} of n requests, where each request requires a load to be transported from pickup vertex i+ to delivery vertex i− (1 ≤ i ≤ n). There is a single vehicle with unlimited capacity that starts from a depot vertex 0+. Its task is to fulfill all requests by visiting each pickup vertex before its corresponding delivery vertex (known as the precedence constraint) and finally returning to a depot vertex 0−, while minimizing the total distance traveled. The TSPPD is defined on a complete and undirected graph G = (V, E, d), where V = {0+, 1+, . . . , n+, 0−, 1−, . . . , n−} is the vertex set, E = {(x, y) : x, y ∈ V, x ≠ y} is the edge set, and the edge weight d(x, y) is the non-negative distance between vertices x and y, where the matrix [d(x, y)] satisfies the triangle inequality condition.
A variant of this problem that has received significant attention requires that loading and unloading are done in a last-in-first-out (LIFO) manner, known as the TSPPD with LIFO loading (TSPPDL). This constraint naturally arises when the vehicle storage unit has only a single door located at the rear and works like a stack ; it is especially applicable when the cost of rearranging loaded items is much higher than that of the extra traveling distance caused by the LIFO constraint, i.e., the rearrangement cost dominates the traveling cost. Examples include the transportation of bulky, fragile or hazardous items. The TSPPDL was first mentioned by Ladany and Mehrez [7], who solved a real-life delivery problem in Israel by an enumerative approach. Two classes of solution approaches have been applied to the TSPPDL, namely exact algorithms [4] and heuristics [8,3,12]. The difficulty of the TSPPDL is evidenced by the fact that the largest instances that can be solved optimally contain only 25 requests [4]. Hence, large instances with hundreds of requests arising in practical applications are commonly handled by heuristics. Currently, the best and latest heuristic is the variable neighborhood search (VNS) heuristic proposed by Tu et al. [12] that uses trees to represent the feasible solutions of the TSPPDL. This paper introduces a useful and practical generalization of the TSPPDL by considering two additional factors. The first is to consider m > 1 vehicles, which reflects the fact that practical problems of this nature usually involve a fleet of vehicles rather than a single vehicle. The second is to require that the route length for each vehicle cannot exceed a predetermined limit L, which stems from regulations on drivers’ working time or driving time; this type of distance constraint has been widely applied to many vehicle routing problem (VRP) variants, e.g., [9,5]. We call the resultant problem the multiple pickup and delivery traveling salesman problem with LIFO and distance constraints (MTSPPD-LD). To the best of our knowledge, there is no existing research on the MTSPPDLD. Consequently, our test instances and results can be used as benchmarks by subsequent researchers. We use the tree representation for feasible TSPPDL solutions first proposed by Tu et al. [12] to devise a VNS heuristic for the MTSPPDLD. Furthermore, we devise a dynamic programming (DP) algorithm that is able to solve small instances of the TSPPDL very quickly; this DP algorithm is integrated into our proposed VNS heuristic as a subroutine.
2 Tree Representation of Feasible Solutions
A feasible solution of the MTSPPD-LD is a set of at most m routes and can be represented by a set of vertex sequences; Figure 1(a) shows a feasible solution of an MTSPPD-LD instance comprising two routes. Most heuristics that have been applied to vehicle routing and scheduling problems are based on edge-exchange operations [1], which involve the moving or exchanging of edges in routes. However, such heuristics cannot be directly applied to the MTSPPD-LD because the resultant routes usually violate the LIFO constraint. To resolve this issue, we represent the feasible solutions of the MTSPPD-LD by a set of ordered trees rooted at node 0, where each tree node corresponds to a request.
Fig. 1. (a) The sequence representation of a feasible solution. (b) The tree representation of a feasible solution.
For example, the tree representation of the solution corresponding to the sequence in Figure 1(a) is given in Figure 1(b). By following the dashed arrows (which is similar to a preorder traversal), we can derive the corresponding vertex sequence from the ordered tree. In addition, all sequences derived from ordered trees automatically respect the precedence and LIFO constraints, and there is a one-to-one correspondence between the set of all ordered trees of this type and the set of all feasible TSPPDL solutions [12]. For the MTSPPD-LD, we can use a separate tree for each vehicle, with the added restriction that the sum of all edge weights in each tree is not greater than the maximum route length L.
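The derivation of a route from an ordered tree can be sketched with a simple preorder-style traversal that emits i+ when a node is entered and i− when it is left, which automatically enforces the precedence and LIFO constraints; the Python classes and the small example tree below are illustrative, not the instance of Fig. 1.

```python
class Node:
    """Ordered-tree node for one request; the root (request 0) represents the depot."""
    def __init__(self, request, children=None):
        self.request = request
        self.children = children or []

def tree_to_route(node):
    """Emit i+ when entering node i and i- when leaving it; the resulting vertex
    sequence automatically respects the precedence and LIFO constraints."""
    route = [f"{node.request}+"]
    for child in node.children:
        route.extend(tree_to_route(child))
    route.append(f"{node.request}-")
    return route

# A small illustrative tree: requests 1 and 4 nested, then requests 2 and 3
tree = Node(0, [Node(1, [Node(4)]), Node(2), Node(3)])
print(tree_to_route(tree))   # ['0+', '1+', '4+', '4-', '1-', '2+', '2-', '3+', '3-', '0-']
```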
3 The VNS Solution Scheme
The Variable Neighborhood Search (VNS) approach [10] systematically performs local search using several neighborhood operators to arrive at a local optimum, and then uses a perturbation operator to move the search to a new area of the search space. It has previously been successfully applied to vehicle routing problems with precedence and loading constraints (e.g., [11,6]). Our VNS heuristic is summarized in Algorithm 1. We first generate an initial solution and regard it as the best solution Sbest. In each iteration, we first invoke a function local_search, which performs the five operations node-relocate, node-exchange, subtree-relocate, subtree-exchange and tree-merge in order; whenever local_search encounters an improving solution, it restarts from the first operation. Next, we attempt to improve the route represented by each tree in the current solution individually. If the tree involves 14 or more requests, we use the VNS heuristic for TSPPDL devised by Tu et al. [12]; otherwise, we find the optimal tree using a dynamic programming (DP) approach. Then, we update Sbest if the current solution is superior. Finally, the perturbation operator is used to diversify the search process. We repeat the above operations until max_nonimproving consecutive non-improving iterations occur.
Algorithm 1. The VNS heuristic for the MTSPPD-LD
1: INPUT: Integer max_nonimproving;
2: Scurrent ← generate initial solution; Sbest ← Scurrent; i ← 0;
3: while i ≤ max_nonimproving do
4:   local_search(Scurrent);
5:   for each tree in Scurrent do
6:     if tree contains 14 or more nodes then
7:       Optimize tree using the TSPPDL VNS heuristic [12]
8:     else
9:       Optimize tree using DP
10:    end if
11:  end for
12:  if Scurrent is better than Sbest then
13:    Sbest ← Scurrent and i ← 0;
14:  else
15:    i ← i + 1;
16:  end if
17:  Perturbation(Scurrent)
18: end while
4 Search Operators
In this section, we describe the five search operators and the perturbation operator used by our VNS heuristic; the DP approach to optimally solve the TSPPDL for small trees is presented in the next section. For ease of discourse, we specify that node refers to a tree node corresponding to a request in R, and vertex refers to a pickup or delivery vertex in V . Each solution of the MTSPPD-LD consists of m trees, and trees with only one node 0 (called empty trees) represent vehicles that do not carry out any request. All of the operators only consider moves that satisfy the distance constraint, i.e., the sum of the edge weights for the resultant trees are at most L. Other than tree-merge and perturbation, all of these operators perform the move that results in the greatest savings in total distance out of all possibilities. The node-relocate operator removes a node x ∈ R from tree Tp and inserts it as a child of node y in another tree Tq (Tq can be an empty tree); this moves the request from one vehicle to another. Consider the example in Figures 2(a, b) showing two trees T1 and T2 that form a feasible solution to an instance with 12 requests. First, node x is removed from T1 and its children are linked to its parent node such that a new tree T1 (Figure 2(c)) is created. Next, node x is inserted into T2 as a child of node y (Figure 2(d)); if y has cy children, then there are cy + 1 possible positions for this insertion. If node x now has α right siblings, we can create other neighboring solutions by relocating the 1, . . . , α right siblings of node x as the children of x (Figures 2(e, f)). If the maximum number of children for any node in all trees is w, then the complexity of this operator is O(n2 · w). The node-exchange operator selects two non-root nodes from different trees and exchanges their positions, thereby swapping two requests. This operator runs in O(n2 ) time. Let TSx denote the subtree of T rooted at node x ∈ R; the subtree-relocate operator selects a subtree TSx from one tree and relocates it as the child of a
Fig. 2. (a) Tree T1 in the original solution. (b) Tree T2 in the original solution. (c) Tree T1 in the resulting solution. (d–f) Possible trees T2 in the resulting solution.
node y in another tree; this moves the route segment represented by TSx from one vehicle to another. If node y has cy children, then there will be cy + 1 possibilities to place node x as a child of node y. This operator also requires O(n2) time.

The subtree-exchange operator is analogous to the node-exchange operator except that the exchange is applied to subtrees (rooted at nodes in R) rather than nodes. This operator similarly runs in O(n2).

The tree-merge operator is applied to a feasible solution and successively attempts to merge a pair of trees into one tree, thereby reducing the number of vehicles by one. The merging of two trees combines their root nodes into one node; there are two ways to do so (Figure 3). With the triangle inequality condition, this merging process always reduces the total distance. Our tree-merge operator is given by Algorithm 2. If the number of non-empty trees in the feasible solution is h ≤ m, the time complexity of this operator is O(h3).

Algorithm 2. The tree-merge operator
1: Set i = 1; sort the non-empty trees by increasing traveling distance to generate a tree sequence T1, T2, . . ., Th;
2: Check the resultant solutions of merging Ti with each of the trees Tj (i < j ≤ h);
3: if there exists a feasible resultant solution then
4:   Perform the merge that results in the largest distance saving and return to Step 1;
5: else
6:   if i + 1 < h then
7:     i = i + 1 and return to Step 2;
8:   else
9:     Terminate the process.
10:  end if
11: end if
Finally, the perturbation operator helps the search process escape from local optima. Given a feasible solution consisting of h non-empty trees, we randomly remove a subtree from each tree to obtain a set V of nodes (Figure 4(a)). The nodes in V are then randomly chosen in turn and reinserted into the remaining trees. The insertion process follows two rules: (1) choose the non-empty tree
Fig. 3. (a) The initial solution. (b–c) The new solutions created by tree-merge.
Fig. 4. (a) The remaining trees and V after subtrees TSx and TSy are removed. (b) New solution after node insertion.
with the smallest total traveling distance; (2) attempt to insert the node in a way similar to the node-relocate operator. If a node cannot be inserted into any non-empty tree, then we insert it into an empty tree.
5 Dynamic Program for TSPPDL
A feasible solution to the MTSPPD-LD consists of a number of trees, each of which is a TSPPDL solution. After performing the local search procedure, we attempt to improve each tree in the current solution. When the tree consists of more than 13 requests, we employ the VNS heuristic proposed by Tu et al. [12], while a DP algorithm is used for trees with 13 requests or fewer. We now describe our DP algorithm for optimally solving the TSPPDL. Let S be the set of requests fulfilled by a vehicle. We define the function f(a, b, S) to be the length of the shortest route starting from the pickup vertex of request a ∈ S and
Fig. 5. Route with the first pickup vertex and the last delivery vertex belonging to (a) the same request and (b) different requests
ending at the delivery vertex of request b ∈ S that fulfills all requests in S. The routes covering set S and complying with the LIFO and precedence constraints can be divided into two cases depending on whether the first pickup vertex and the last delivery vertex come from (a) the same or (b) different requests, and each case can be further divided into two groups. These groups are shown in Figure 5, where a, b and c are different requests (a dashed arrow between two vertices signifies a subroute between the vertices). Our DP algorithm is based on these groups. Let dis[x][y] denote the distance between vertices x and y; and let a, b, c, d denote different requests. The values of f (a, a, S) and f (a, b, S) for |S| > 1 can be calculated by:
$$f(a,a,S) = \min\left( \min_{\forall b,c \in S-\{a\}} \left\{ dis[a^+][b^+] + f(b,c,S-\{a\}) + dis[c^-][a^-] \right\},\ \min_{\forall b \in S-\{a\}} \left\{ dis[a^+][b^+] + f(b,b,S-\{a\}) + dis[b^-][a^-] \right\} \right) \qquad (1)$$

$$f(a,b,S) = \min_{\forall A: a\in A,\ \forall B: b\in B} \left( \min_{\forall c \in B-\{b\}} \left\{ f(a,a,A) + dis[a^-][c^+] + f(c,b,B) \right\},\ f(a,a,A) + dis[a^-][b^+] + f(b,b,B) \right) \qquad (2)$$
For the base case of S = {a}, we set f(a, a, S) = dis[a+][a−]. When the set S contains more than one request, it can be partitioned into two non-empty sets A and B. Given two disjoint non-empty request sets A and B, the subroutine calc(A, B) given in Algorithm 3 updates the values of f(a, b, S), where S = A ∪ B and a, b ∈ S. Our DP algorithm considers all possible partitions of S into two non-empty sets A and B, and invokes calc(A, B) as appropriate (Algorithm 4).

Algorithm 3. The calc(A, B) subroutine
1: INPUTS: Sets A and B, where A ∪ B = S and A ∩ B = ∅;
2: for all i ∈ A do
3:   for all j ∈ A do
4:     for all k ∈ B do
5:       if f(k, k, B) + dis[k−][i+] + f(i, j, A) < f(k, j, S) then
6:         f(k, j, S) ← f(k, k, B) + dis[k−][i+] + f(i, j, A)
7:       end if
8:       if f(i, j, A) + dis[j−][k+] + f(k, k, B) < f(i, k, S) then
9:         f(i, k, S) ← f(i, j, A) + dis[j−][k+] + f(k, k, B)
10:      end if
11:      if |B| = 1 then
12:        if dis[k+][i+] + f(i, j, A) + dis[j−][k−] < f(k, k, S) then
13:          f(k, k, S) ← dis[k+][i+] + f(i, j, A) + dis[j−][k−]
14:        end if
15:      end if
16:    end for
17:  end for
18: end for
Algorithm 4. The DP Algorithm for TSPPDL
1: INPUT: Set S;
2: Initialize f(a, b, S′) ← +∞ for all S′ ⊆ S and a, b ∈ S′;
3: Initialize best ← +∞;
4: Let S1, S2, . . ., Sg be the subsets of S sorted in non-decreasing order of cardinality;
5: for i = 1 to g do
6:   if |Si| = 1 then
7:     f(k, k, Si) ← dis[k+][k−], where k ∈ Si
8:   else
9:     for each B ⊂ Si do
10:      if |Si − B| ≥ |B| then
11:        Invoke calc(Si − B, B)
12:      end if
13:    end for
14:  end if
15: end for
16: for all i ∈ S do
17:  for all j ∈ S do
18:    if dis[0+][i+] + f(i, j, S) + dis[j−][0−] < best then
19:      best ← dis[0+][i+] + f(i, j, S) + dis[j−][0−]
20:    end if
21:  end for
22: end for
23: Return best;
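For clarity, the following Python sketch evaluates recurrences (1) and (2) top-down with memoization instead of the bottom-up table fill of Algorithm 4; the dis dictionary keyed by vertex strings such as "3+" (including the depot keys "0+" and "0-") and the function names are assumptions.

```python
from functools import lru_cache
from itertools import combinations

def solve_tsppdl(requests, dis):
    """Memoized, top-down sketch of recurrences (1)-(2) for a small request set.
    dis[x][y] gives the distance between vertices such as '3+' and '0-'."""

    @lru_cache(maxsize=None)
    def f(a, b, S):
        if len(S) == 1:                      # base case: S = {a}, route a+ -> a-
            return dis[f"{a}+"][f"{a}-"]
        if a == b:                           # recurrence (1): nest S - {a} inside request a
            rest = S - {a}
            return min(dis[f"{a}+"][f"{c}+"] + f(c, e, rest) + dis[f"{e}-"][f"{a}-"]
                       for c in rest for e in rest)
        # recurrence (2): split S into A (containing a) and B (containing b)
        others = list(S - {a, b})
        best = float("inf")
        for r in range(len(others) + 1):
            for extra in combinations(others, r):
                A = frozenset((a,)) | frozenset(extra)
                B = S - A
                best = min(best, min(f(a, a, A) + dis[f"{a}-"][f"{c}+"] + f(c, b, B)
                                     for c in B))
        return best

    S = frozenset(requests)
    return min(dis["0+"][f"{i}+"] + f(i, j, S) + dis[f"{j}-"]["0-"]
               for i in S for j in S)
```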
6 Computational Experiments
To evaluate our VNS heuristic, we conducted experiments using the data set introduced by Carrabs et al. [2]. This data set was derived from six TSP instances fnl4461, brd14051, d15112, d18512, nrw1379, and pr1002 taken from TSPLIB. For each of these TSP instances, subsets of vertices were selected with 25, 51, 75, 101, 251, 501 and 751 vertices. One arbitrary vertex was designated as the depot and the rest were paired to form requests using a random matching process, resulting in 48 test instances. Due to space limitations, we only report the results for instances with 251 or more vertices. Let dmax = maxi∈R {d(0+ , i+) + d(i+ , i− ) + d(i− , 0− )} be the greatest distance for any route involving a single request; we solved each instance with the traveling distance limit L set to 1.5×dmax, 2.0×dmax and 2.5×dmax . The number of vehicles m is set to 20, which is sufficient to ensure that all instances have feasible solutions. The only parameter max nonimproving was set to 10. We executed our VNS heuristic ten times for each traveling distance limit and for each instance with different random seeds. The algorithm was coded in C++ and all experiments were run on a Linux server with 3 GB memory and Intel Xeon(R) 2.66 GHz processor. Computation times reported here are in CPU seconds on this machine. The initial solution was generated by a cheapest insertion heuristic. Beginning with an empty tree, we choose an unselected request whose insertion minimizes the length of the tour. If no request can be inserted into the current tree, we continue with another empty tree. The above process is repeated until all requests are fulfilled. This is a natural heuristic that approximates how a human scheduler would solve the problem. Since the MTSPPD-LD has not been previously investigated, we evaluate our VNS heuristic by comparing it with this natural heuristic.
The results are displayed in Table 1, where Cost is the total distance traveled by all vehicles in the final solution averaged over 10 runs, and Gap(%) is the percentage improvement over the cheapest insertion heuristic achieved by our VNS heuristic. The values show that our heuristic is able to improve on the solutions that a human scheduler can reasonably find by between 14.81% and 29.7% on average; the results are similar for instances with 101 vertices or fewer. Furthermore, our heuristic takes less than 15 minutes on average for all instances of up to 751 vertices.

Table 1. Comparison of VNS with the cheapest insertion heuristic

                      L = 1.5 × dmax              L = 2.0 × dmax              L = 2.5 × dmax
Instance   Size   Cost        Time    Gap(%)   Cost        Time    Gap(%)   Cost        Time    Gap(%)
fnl4461    251    31430.6     48.03   21.93    30417.3     59.21   24.9     30594.4     135.99  18.5
           501    85867.1     230.87  20.92    81491.2     206.24  16.3     78692.5     424.81  18.4
           751    141688.5    279.00  22.97    134488.7    337.23  16.2     128827.3    644.37  16.0
brd14051   251    28409.1     43.77   26.97    27042.0     68.60   23.7     25379.1     121.10  23.5
           501    61394.7     252.48  29.77    57754.9     296.30  27.2     55291.1     384.01  20.8
           751    100602.1    350.55  22.62    96035.0     652.23  21.2     93645.8     759.53  21.4
d15112     251    760769.2    7.06    16.48    681074.8    22.95   20.5     655343.1    32.01   22.9
           501    1206575.8   69.69   23.69    1118088.9   97.88   19.4     1097025.4   102.34  16.2
           751    1713090.8   180.04  18.46    1573359.9   207.23  19.5     1516195.1   319.93  16.8
d18512     251    29683.9     32.51   24.83    27471.8     55.32   24.8     26044.7     82.38   21.9
           501    59973.3     223.28  28.94    56622.5     311.54  25.0     54884.1     364.42  23.8
           751    98081.9     461.22  21.29    90748.8     727.01  23.0     89432.3     858.64  21.6
nrw1379    251    33117.3     20.53   25.18    30996.2     26.39   21.9     30034.3     44.26   16.6
           501    71696.5     97.60   23.51    67910.8     150.19  17.5     66061.3     216.92  17.8
           751    130130.3    133.39  15.02    121934.0    229.23  15.4     118357.5    326.78  16.5
pr1002     251    279505.2    25.28   19.20    238978.4    30.69   25.2     225086.1    35.13   20.4
           501    630616.9    100.19  14.81    578011.9    136.05  17.0     554952.5    124.42  15.8
           751    1068144.3   222.58  20.31    987741.3    265.20  15.5     935861.1    359.73  15.8

7 Conclusions
This article describes a new and practical traveling salesman problem involving pickup and delivery, LIFO loading, multiple vehicles and distance constraints. We make use of a tree representation of feasible solutions to develop a VNS heuristic using several new operators, as well as a dynamic programming subroutine that solves small instances of the single-vehicle subproblem optimally. Our approach can be used as a benchmark for future researchers in this area. Our preliminary work assumes that the number of available vehicles m is sufficient to cater to all requests and a feasible solution can be easily generated. If the number of vehicles is small, a two-phase approach may be required such that a separate first phase finds a feasible solution before our VNS heuristic is employed in the second phase. The class of traveling salesman problems with LIFO loading can be further generalized to handle more practical constraints, e.g., limited capacity vehicles and time windows. We believe that a VNS approach using the tree representation may also be effective for these variants.
References
1. Bräysy, O., Gendreau, M.: Vehicle routing problem with time windows, part I: Route construction and local search algorithms. Transportation Science 39(1), 104–118 (2005)
2. Carrabs, F., Cordeau, J.-F., Laporte, G.: Variable neighborhood search for the pickup and delivery traveling salesman problem with LIFO loading. INFORMS Journal on Computing 19(4), 618–632 (2007)
3. Cassani, L., Righini, G.: Heuristic algorithms for the TSP with rear-loading. In: 35th Annual Conference of the Italian Operational Research Society (AIRO XXXV), Lecce, Italy (September 2004)
4. Cordeau, J.-F., Iori, M., Laporte, G., González, J.J.S.: A branch-and-cut algorithm for the pickup and delivery traveling salesman problem with LIFO loading. Networks 55(1), 46–59 (2010)
5. Erera, A.L., Morales, J.C., Savelsbergh, M.: The vehicle routing problem with stochastic demand and duration constraints. Transportation Science (2010)
6. Felipe, A., Ortuño, M.T., Tirado, G.: The double traveling salesman problem with multiple stacks: A variable neighborhood search approach. Computers & Operations Research 36(11), 2983–2993 (2009)
7. Ladany, S.P., Mehrez, A.: Optimal routing of a single vehicle with loading and unloading constraints. Transportation Planning and Technology 8(4), 301–306 (1984)
8. Levitin, G., Abezgaouz, R.: Optimal routing of multiple-load AGV subject to LIFO loading constraints. Computers & Operations Research 30(3), 397–410 (2003)
9. Li, C.-L., Simchi-Levi, D., Desrochers, M.: On the distance constrained vehicle routing problem. Operations Research 40(4), 790–799 (1992)
10. Mladenović, N., Hansen, P.: Variable neighborhood search. Computers & Operations Research 24(11), 1097–1100 (1997)
11. Tricoire, F., Doerner, K.F., Hartl, R.F., Iori, M.: Heuristic and exact algorithms for the multi-pile vehicle routing problem. OR Spectrum (2009)
12. Tu, D., Guo, S., Qin, H., Oon, W.-C., Lim, A.: The tree representation of feasible solutions for the TSP with pickup and delivery and LIFO loading. In: Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2010), Atlanta, Georgia, USA, pp. 191–196 (July 2010)
Distributed Learning with Biogeography-Based Optimization

Carre Scheidegger, Arpit Shah, and Dan Simon

Cleveland State University, Electrical and Computer Engineering
Abstract. We present hardware testing of an evolutionary algorithm known as biogeography-based optimization (BBO) and extend it to distributed learning. BBO is an evolutionary algorithm based on the theory of biogeography, which describes how nature geographically distributes organisms. We introduce a new BBO algorithm that does not use a centralized computer, and which we call distributed BBO. BBO and distributed BBO have been developed by mimicking nature to obtain an algorithm that optimizes solutions for different situations and problems. We use fourteen common benchmark functions to obtain results from BBO and distributed BBO, and we also use both algorithms to optimize robot control algorithms. We present not only simulation results, but also experimental results using BBO to optimize the control algorithms of mobile robots. The results show that centralized BBO generally gives better optimization results and would generally be a better choice than any of the newly proposed forms of distributed BBO. However, distributed BBO allows the user to find a less optimal solution to a problem while avoiding the need for centralized, coordinated control.
1 Introduction

Biogeography is the study of the geographical distribution of plant and animal life. Alfred Wallace and Charles Darwin were some of the first to observe the patterns of biogeography and introduce the subject to the scientific world [1]. Biogeography did not evolve into the quantitative science that it is today until Robert MacArthur and Edward Wilson created models from their studies of island biogeography in the early 1960s [2]. Other scientists also contributed to the emergence of the theory of island biogeography, most notably Eugene Monroe in 1948 [12]. Biogeography has continued to develop after MacArthur and Wilson's research and it has recently been used as the motivating framework for the development of an evolutionary algorithm (EA) called biogeography-based optimization (BBO) [3]. In this paper, we apply BBO to experimental mobile robotics control optimization.

This paper gives an overview of the evolutionary algorithm called BBO, which is based on mathematical models of biogeography, and which has been developed to solve general optimization problems [3]. BBO has been applied to several real-world problems. In addition to experimental robot control tuning, as discussed in this paper and in [4], BBO has been applied to aircraft engine sensor selection [3], power system optimization [13, 14], groundwater detection [15], mechanical gear train design [16], satellite image classification [17], and neuro-fuzzy system training for biomedical
applications [8]. Recent research in the area of BBO has focused on putting it on a firm theoretical and mathematical foundation, including the derivation of Markov models [19, 20] and dynamic system models [21] that describe its behavior. In this paper, we use BBO in computer-generated simulations and on experimental mobile robots to study its performance. This paper also develops BBO's distributed counterpart, which is based on distributed learning and intelligence, and presents the simulation results gathered from the distributed algorithm. The distributed algorithm's communication and control is distributed through various BBO individuals rather than coordinated by a central computer. The development of distributed BBO (DBBO) has been motivated by the confluence of centralized BBO and concepts from distributed learning.

Distributed learning is a theory developed to explain how the human mind understands and learns [7] [9]. Humans' mental capabilities are often assumed to be centralized inside the brain, but research has shown that outside social interactions greatly affect how the brain learns [7]. Distributed learning is an example of this type of environmental influence, and is often seen in humans working in teams to solve a common problem or complete a task [5] [6]. Distributed learning has been used in recent years to study how automated technology can be taught to perform human-like tasks using teams of robots that function together with nearly the same effectiveness as a team of humans [6] [7] [10] [11]. This distributed learning or intelligence in robotics is the ability of numerous entities to solve problems, perform tasks, learn, and understand by communicating with other entities in a group, rather than operating under the control of a centralized coordinator [6] [9] [10] [11]. Each organism is considered an individual with governing logic that allows it to perform a particular task, and the way in which these distributed learning groups carry out a task depends on the mode of communication used by each entity [5] [6] [8] [10]. The study of distributed interaction is helpful in discovering and researching different ways to make artificial intelligences communicate to perform like a group of human beings, and has increasingly been applied to different types of systems. The advantages of distributed interaction in robotics open doors to new types of systems that do not need a central processor to control the team of robots, and allow flexibility for change and improvement. This flexibility in robotic applications of distributed intelligence is increasingly being studied because it has advantages that centralized systems do not. These advantages include autonomy, fault tolerance, and robustness.

Section 2 in this paper gives an outline of BBO. It also proposes a distributed extension of BBO, which is one of the primary contributions of this paper. Section 3 presents fourteen benchmark function simulation results. Section 4 discusses the mobile robot system used for an experimental application of BBO and DBBO, including its design, hardware, task description, simulation results, and experimental results. Section 5 concludes with some discussion and suggestions for future work.
2 Biogeography-Based Optimization (BBO)

BBO is an evolutionary algorithm that is modeled on the theory of biogeography. These models of biogeography mathematically describe how species travel to and
from environments based on different factors of the environment [2]. These environmental factors can be represented quantitatively and are called suitability index variables (SIVs), and they determine the suitability of the area for habitation. Examples of natural SIVs seen often in habitats are the amount of rainfall, the diversity of vegetation, and the temperature range. An area that is highly habitable is considered to have a high habitat suitability index (HSI) [3]. Habitats can be observed by scientists to develop mathematical models of migration, speciation, and extinction. A high-HSI habitat is likely to have a large number of species. It has a high rate of emigration and a low rate of immigration due to its dense population. The opposite occurs in low-HSI habitats because of the habitat's sparse population. Emigration and immigration rates in a given habitat are proportional to the number of species that reside in that habitat.

This habitat suitability concept that biogeography quantifies is what makes biogeography applicable to optimization problems in engineering and other fields of study. An individual in an evolutionary algorithm that has a high HSI (performance, or fitness) represents a good solution, and that individual will emigrate successful features to other individuals. Individuals that receive features from successful candidate solutions tend to increase their own fitness. Quantifiably applying the emigration and immigration of specific features from one individual to another depending on the individuals' HSI values generally creates better solutions. Biogeography-based optimization (BBO), which was introduced by Simon [3], is the implementation of this extension of biogeography to optimization.

2.1 Centralized BBO

Centralized BBO is the original BBO algorithm created to optimize solutions based on the theory of biogeography, and it uses the migration of traits to create better generations of candidate solutions to an optimization problem. As explained previously, a habitat's migration rates are dependent on the habitat's HSI. A habitat is analogous to a problem solution, and its HSI is analogous to the fitness of the solution. A solution's fitness determines its rates of immigration, λ, and emigration, μ, and is determined in a way that is similar to natural biogeography. BBO bases the migration rate of each candidate solution on the HSI of the solution, with high HSI giving a high emigration rate, and low HSI giving a high immigration rate.

BBO operates in a way that allows each generation of candidate solutions to improve from one generation to the next. Migration, mutation, and elitism are three characteristics of BBO that allow the population of candidate solutions to share information and keep the best solutions of each generation for the following generation. Migration is among the most influential and unique parts of the BBO algorithm, and it allows the individuals in the system to immigrate or emigrate data from other individuals in a single generation of the program. Mutation operates as in other evolutionary algorithms, and encourages diversity in the population and allows the replacement of a specific solution feature with another randomly generated solution feature. Mutation allows problem solutions to improve, but it also introduces the risk of degrading the solution. Elitism also operates as in other evolutionary algorithms, and it counteracts the risks of mutation.
Elitism saves the best problem solutions at each generation, and replaces the worst solutions in the next generation with these elite solutions [3]. The listing of Algorithm 1 depicts a single generation of centralized BBO.
Algorithm 1. Basic description of the BBO algorithm for one generation
For each candidate problem solution Pi
  Calculate immigration probability λi and emigration probability μi:
    μi ∈ [0, 1] is proportional to the fitness of Pi, and λi = 1 − μi
Next candidate solution: i ← i + 1
For each candidate problem solution Pi
  For each solution variable v in Pi
    Use immigration probability λi to decide whether to immigrate to Pi
    If immigrating to Pi
      Select Pk for emigration according to the probability μk
      Pk emigrates data to Pi: Pi(v) ← Pk(v)
    End immigration
  Next solution variable
  Mutate Pi probabilistically based on mutation probability
Next candidate solution: i ← i + 1
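A minimal Python sketch of one centralized BBO generation for real-valued solutions, following Algorithm 1: emigration rates μ are proportional to fitness, immigration rates are λ = 1 − μ, and emigrating solutions are chosen by roulette-wheel selection. Elitism is omitted, the [0, 1] variable domain is assumed, and all names are illustrative.

```python
import random

def bbo_generation(population, fitness, mutation_rate=0.01, rng=random):
    """One generation of centralized BBO for a list of real-valued candidate solutions."""
    fits = [fitness(p) for p in population]
    lo, hi = min(fits), max(fits)
    span = (hi - lo) or 1.0
    mu = [(f - lo) / span for f in fits]          # emigration rate, proportional to fitness
    lam = [1.0 - m for m in mu]                   # immigration rate
    new_pop = []
    for i, solution in enumerate(population):
        child = list(solution)
        for v in range(len(child)):
            if rng.random() < lam[i]:             # immigrate this variable?
                # roulette-wheel selection of the emigrating solution, weighted by mu
                k = rng.choices(range(len(population)), weights=[m + 1e-9 for m in mu])[0]
                child[v] = population[k][v]
            if rng.random() < mutation_rate:      # mutate within the assumed domain [0, 1]
                child[v] = rng.random()
        new_pop.append(child)
    return new_pop

# Example: maximize fitness = -sum(x^2), i.e. minimize the sphere function
pop = [[random.random() for _ in range(5)] for _ in range(20)]
for _ in range(50):
    pop = bbo_generation(pop, fitness=lambda x: -sum(v * v for v in x))
```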
2.2 Distributed BBO

One of the main contributions of this paper is the development of distributed biogeography-based optimization (DBBO). Distributed BBO is based on the centralized BBO algorithm, but has been motivated by the theory of distributed systems. DBBO has the same goal as centralized BBO, which is to optimize a task or problem solution. However, it does not use a centralized computer for control; rather, each individual in DBBO is capable of performing the evolutionary algorithm on its own. Although distributed BBO is very closely related to the original centralized implementation,
Fig. 1. Peers randomly choose one another for communication and feature-sharing in distributed BBO
it allows evolution in cases in which the use of a central computer is not ideal or possible. Figure 1 shows an example of a team of robots communicating peer-to-peer and not through a centralized coordinator. The DBBO algorithm runs on each individual separately. We demonstrate this concept using mobile robots in this paper. The individuals were programmed to share information from peer to peer, rather than through a central computer. A peer contains a single problem solution (for example, a single robot with its control algorithm), and a few peers are randomly chosen to communicate and share information with each other. DBBO has the same basic characteristics as BBO and it also uses mutation to diversify the solution sets. Centralized BBO uses elitism; however, elitism is not used in distributed BBO because only a few individuals (as few as two) communicate with each other at each generation. Algorithm 2 shows a simple description of a single generation of the DBBO algorithm.

Algorithm 2. Basic description of the DBBO algorithm for one generation
Select m peers {Pi} for communication with each other
Revise each peer's best and worst cost estimates:
  For each i, MinEsti = min_{k∈I} {MinEstk} and MaxEsti = max_{k∈I} {MaxEstk},
  where I is the set of all peers of robot i
Calculate each peer's likelihood to immigrate, λi, and emigrate, μi:
  μi ∈ [0, 1] is proportional to the fitness of Pi relative to its peers, and λi = 1 − μi
For each peer Pi
  For each solution variable v
    Use immigration probability λi to decide whether to immigrate to Pi
    If immigrating to Pi
      Select Pk for emigration according to the probability μk
      Pk emigrates data to Pi: Pi(v) ← Pk(v)
    End immigration
  Next solution variable
  Mutate Pi according to mutation probability
Next peer
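A corresponding sketch of one DBBO communication round among m selected peers. For brevity, the running MinEst/MaxEst estimates of Algorithm 2 are approximated by the current peer group's best and worst costs; the mutation model, field names, and cost convention (lower is better) are assumptions.

```python
import random

def dbbo_peer_step(peers, mutation_rate=0.15, rng=random):
    """One DBBO round; each peer is a dict with a 'solution' (list) and a 'cost'."""
    # Every peer adopts the best/worst cost estimates seen within this peer group
    min_est = min(p["cost"] for p in peers)
    max_est = max(p["cost"] for p in peers)
    span = (max_est - min_est) or 1.0
    # Fitness relative to the peer group: low cost -> high emigration rate mu
    mu = [(max_est - p["cost"]) / span for p in peers]
    lam = [1.0 - m for m in mu]
    for i, peer in enumerate(peers):
        for v in range(len(peer["solution"])):
            if rng.random() < lam[i]:
                k = rng.choices(range(len(peers)), weights=[m + 1e-9 for m in mu])[0]
                peer["solution"][v] = peers[k]["solution"][v]
            if rng.random() < mutation_rate:
                peer["solution"][v] += rng.gauss(0.0, 0.1)   # assumed mutation model
    return peers
```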
3 Benchmark Simulation Results
As in previous research with BBO [3], we used 14 common benchmark functions to simulate the BBO and DBBO algorithms. Distributed BBO was used with three different numbers of peers (2, 4, and 6). We were able to compare the results of BBO against DBBO/2, DBBO/4, and DBBO/6 for the benchmark functions from 100 Monte Carlo simulations. The results from the Monte Carlo simulations were analyzed based on the best cost functions returned by BBO and DBBO. The same parameters were used in all four runs of the BBO and DBBO algorithms. After analysis
of the best combination of variables, we chose a population size of 50, we used 20 independent variables (i.e., each benchmark has a dimension of 20), and we used a 1% probability of mutation for each independent variable.

Figures 2 and 3 are normalized plots of the cost function values calculated from the simulation runs of the benchmark functions. The value used for normalization was the minimum value achieved for each benchmark function by BBO and the three DBBO versions. However, if a minimum value was 0, we instead normalized to the second smallest value for that benchmark. We analyzed the minimum cost and average cost over the 100 Monte Carlo simulations. The minimum cost figure shows the minimum cost achieved by each algorithm for each benchmark after 100 simulations. BBO minimized the cost functions of each benchmark function most often and generally out-performed DBBO. The average cost in Figure 3 shows which BBO version optimized best on average. In general, the average of each benchmark function and each algorithm was fairly good. Centralized BBO, however, usually obtained the best costs on average.

However, on occasion BBO had a higher minimum cost than DBBO. This means that DBBO is sometimes more likely than BBO to get closer to the minimum. We intuitively expect BBO to outperform DBBO. But just because BBO has more candidate solutions to choose from when performing immigration does not guarantee that it will outperform DBBO. The fact that BBO has more candidate solutions to choose from can just as easily result in a detrimental immigration as a beneficial immigration. Although we usually see BBO outperform DBBO, there is no guarantee of this advantage, and further research is needed to determine the conditions under which BBO or DBBO will give better optimization results.
Fig. 2. Plot of the minimum cost function values (best performance) for BBO, DBBO/2, DBBO/4, and DBBO/6 over 100 Monte Carlo simulations. BBO usually performs better than DBBO, but not always.
Fig. 3. Plot of the average cost function values for BBO, DBBO/2, DBBO/4, and DBBO/6 over 100 Monte Carlo simulations. BBO performs better than DBBO for every benchmark.
4 Robot Optimization Using BBO and DBBO

This section discusses the use of BBO and DBBO for robot controller optimization. Section 4.1 discusses the robot hardware that we used. Section 4.2 discusses the robot control task. Section 4.3 presents simulation results, and Section 4.4 presents experimental hardware results.

4.1 Robot Hardware

The physical application of BBO in this research is performed on mobile robots that have been used in previous research in which BBO has optimized the control parameters [4]. The robots are equipped with two DC motors and eight AA batteries. The batteries power the two motors and the circuit board. The main control base of the robots is the microcontroller, which is a Microchip PIC18F4520. The microcontroller is equipped to control the motors and communicate with a PC. In BBO the microcontroller communicates with a central PC using a wireless radio, the MaxStream 9Xtend radio. In BBO the radio sends all the necessary commands, parameters, and data to each individual robot. In DBBO the radio signals are sent between individual robots. Two voltage regulators are used on the robots to distribute a constant 5 volts to the microcontroller and to the motors. The voltage regulators ensure that the microcontroller receives enough current to function correctly. In order to power the motors, the signal from the microcontroller is used to switch the motor power supply, so an H-bridge SN754410NE is also used. The final major hardware components are the infrared sensors.
The infrared sensors are the main components that measure the distance of the robot from a wall during its tracking task, using a light-emitting sensor and a light-detecting sensor. Figure 4 shows a photograph of the robots.
Fig. 4. Photograph of the mobile robots used for BBO and DBBO testing
4.2 PID Control

The robot controller is a proportional-integral-derivative (PID) controller, a very commonly used algorithm for control [22]. PID uses different factors to determine what a control system needs to do. The proportional gain Kp is generally a large number for performance purposes, and is the primary determinant of the amount of control signal. The higher the value of Kp, the faster the controller responds to tracking errors. The integral gain Ki speeds up the movement of the output toward the desired value and helps to reduce the steady-state error resulting from the proportional gain. The final parameter of PID control is Kd, which is multiplied by the rate of change of the output of the system. It helps keep the controller stable and decreases the overshoot created by Ki and Kp. The robots used in our BBO and DBBO research use only the Kp and Kd terms.

The mobile robots used for testing BBO and DBBO are applied to a particular tracking task. The robots are programmed to follow a wall, using the PID controller to maintain a specified distance from the wall. The robot uses infrared sensors to measure its distance and angle from the wall. The robot controller uses a calculated error from the measured values to correct the motor output. The controller output controls the voltage input to the two motors (one for the left wheel and one for the right wheel). Using this control output, the robot is able to adjust its wheel speed to compensate for its error. See [4] for more details about the robots and their control algorithm.
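The control update can be sketched as a PD correction applied as a differential command to the two wheel speeds; the differential-drive mapping, units, and function names below are illustrative assumptions, with the derivative term acting on the measured output as described above.

```python
def pd_wall_follower(distance, prev_distance, reference, dt, Kp, Kd, base_speed=0.5):
    """One control update for the wall-following task: a PD correction (the robots use
    only Kp and Kd) is turned into a differential command for the two wheels."""
    error = reference - distance                     # tracking error from the IR sensors
    error_rate = (distance - prev_distance) / dt     # Kd acts on the rate of change of the output
    u = Kp * error - Kd * error_rate
    return base_speed + u, base_speed - u            # (left wheel, right wheel) commands

# Example update using robot #2's evolved gains Kp = 0.07 and Kd = 3.41 (arbitrary units)
print(pd_wall_follower(distance=0.28, prev_distance=0.30, reference=0.25,
                       dt=0.05, Kp=0.07, Kd=3.41))
```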
maximum of 50 function evaluations, a population size of 5, and a mutation rate of 15% per independent variable. The mutation rate was chosen to be relatively high because of the small population size. In the benchmark simulations the population size was much larger, but to have a realistic physical implementation a small population size of only 5 was chosen for the robot simulations. To give each problem solution a fair chance of mutation, the mutation rate had to be increased to a relatively high value. The robot simulation function also required the following parameters to be set: the Kp and Kd minimums and maximums. The domain of Kp was set to [0, 2], and the domain of Kd was set to [0, 10]. Testing was done with 100 Monte Carlo simulations. The robot simulation results using BBO, DBBO/2, DBBO/4, and DBBO/6 are shown in Table 1. We analyzed the minimum cost, maximum cost, average cost, and standard deviation of the four algorithms for 100 optimization trials. The cost function analyzed is the sum of the rise time r of the controller and the integral of the absolute tracking error [4]:

Cost = k1 ∫ |e(t)| dt + k2 r    (1)
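To make cost function (1) concrete, the sketch below evaluates a candidate (Kp, Kd) pair on a toy simulated wall-following run and returns k1·IAE + k2·(rise time). The unicycle-style robot model, the weights k1 = k2 = 1, the 5% rise-time band, and all other numerical settings are illustrative assumptions for this sketch; they are not the simulator used by the authors.

```python
import math

def tracking_cost(kp, kd, k1=1.0, k2=1.0, d_ref=0.5, dt=0.01, t_end=10.0):
    """Cost (1) = k1 * integral |e(t)| dt + k2 * rise time, for a PD
    wall-following controller on a toy unicycle model (illustrative only)."""
    d, theta, v = 1.0, 0.0, 0.3        # initial wall distance, heading, forward speed
    prev_e, iae, rise_time, t = None, 0.0, t_end, 0.0
    while t < t_end:
        e = d - d_ref                  # tracking error from the (simulated) IR sensors
        de = 0.0 if prev_e is None else (e - prev_e) / dt
        omega = max(-1.0, min(1.0, -(kp * e + kd * de)))   # PD heading-rate command
        d += v * math.sin(theta) * dt  # lateral drift toward/away from the wall
        theta += omega * dt
        iae += abs(e) * dt             # integral of the absolute tracking error
        if rise_time == t_end and abs(e) <= 0.05 * abs(1.0 - d_ref):
            rise_time = t              # first time the error enters a 5% band
        prev_e, t = e, t + dt
    return k1 * iae + k2 * rise_time

print(tracking_cost(kp=0.8, kd=4.0))   # cost of one candidate controller
```

An optimizer such as BBO or DBBO would simply call a function like tracking_cost for every candidate solution in the population.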
Table 1. Cost values from 100 Monte Carlo simulations of the BBO and DBBO algorithms

                     BBO     DBBO/2   DBBO/4   DBBO/6
Minimum Cost         7.48    7.23     7.30     7.16
Maximum Cost         7.99    8.12     8.07     8.10
Average Cost         7.68    7.78     7.77     7.76
Standard Deviation   0.119   0.169    0.147    0.193
As predicted from the benchmark results of the previous section, and as seen in Table 1, BBO performed better, on average, than the distributed versions of the algorithm. The centralized algorithm had the lowest worst-case cost, performed best on average, and had the smallest standard deviation. However, it is interesting to note that the DBBO versions had better best-case performance than centralized BBO. The DBBO algorithms all performed very similarly, with none significantly outperforming the others; if one had to be chosen, 6-peer communication might be the best choice because of its low minimum cost and relatively low average cost. On average, BBO returned better results than any distributed version, but DBBO is still capable of performing well enough to be used in situations where a centralized processor is not available.

4.4 Experimental Results

We used four robots in our experiments. The four experimental robots' initial Kp and Kd values varied between robots and were set randomly. The initial Kp values for
the robots were 0.93, 0.07, 0.18, and 0.12. The initial Kd values for the robots were 4.26, 6.36, 2.45, and 2.21. The Kp and Kd values change from one generation to the next as different robots communicate using the DBBO algorithm. The final values of Kp after 8 generations were 0.82, 0.07, 0.67, and 0.02, and the final values of Kd were 9.03, 3.41, 4.32, and 2.03. From their initial to their final values, the robots' control parameters changed as follows.

Robot 1: Kp = 0.93 → 0.82, Kd = 4.26 → 9.03
Robot 2: Kp = 0.07 → 0.07, Kd = 6.36 → 3.41
Robot 3: Kp = 0.18 → 0.67, Kd = 2.45 → 4.32
Robot 4: Kp = 0.12 → 0.02, Kd = 2.21 → 2.03
Figure 5 shows the experimental results of the distributed BBO algorithm DBBO/2 on four mobile robots; two robots communicated per generation in this particular experiment. The DBBO program shows successful optimization over 8 generations: both the minimum and the average cost of the four robots decrease as the generation count increases. The decreasing cost values show that the robots are learning to have smaller fluctuations in their path as they track a specified distance from a wall. The best Kp and Kd values were 0.07 and 3.41 (robot 2); these values returned the smallest minimum cost. Figure 6 shows a simplified flowchart of the distributed BBO algorithm DBBO/2 as applied to robot controller optimization.
Fig. 5. Minimum and average cost value of the distributed BBO algorithm DBBO/2 on four mobile robots
Fig. 6. Basic flow chart of the distributed BBO algorithm for robot control optimization. Pressing a switch on a random robot, which is called “Robot A,” begins the DBBO process. Robot A randomly selects another robot, which is called “Robot B,” with which to perform migration of solution features.
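For illustration, the following sketch captures the per-generation DBBO/2 exchange suggested by the flowchart and by the experiment of Section 4.4: Robot A picks a random peer, both evaluate their controllers, and the worse robot imports the better robot's gains before possibly mutating them. The message handling, the evaluate() placeholder, and the import-the-better-gains migration rule are assumptions made for this sketch, not the authors' firmware.

```python
import random

def evaluate(kp, kd):
    # stand-in for running the wall-following task and measuring cost (1)
    return (kp - 0.8) ** 2 + 0.1 * (kd - 4.0) ** 2

def dbbo2_generation(robots):
    """One DBBO/2 generation: 'Robot A' exchanges gains and costs with one random
    peer 'Robot B'; the worse of the two imports the better robot's (Kp, Kd)
    (migration of solution features) and may then mutate each gain."""
    a = random.choice(robots)                                  # switch pressed on Robot A
    b = random.choice([r for r in robots if r is not a])       # randomly selected Robot B
    for r in (a, b):
        r["cost"] = evaluate(r["kp"], r["kd"])                 # run the tracking task
    worse, better = (a, b) if a["cost"] > b["cost"] else (b, a)
    worse["kp"], worse["kd"] = better["kp"], better["kd"]      # migration
    if random.random() < 0.15:                                 # 15% mutation rate (Sec. 4.3)
        worse["kp"] = random.uniform(0.0, 2.0)                 # Kp domain [0, 2]
    if random.random() < 0.15:
        worse["kd"] = random.uniform(0.0, 10.0)                # Kd domain [0, 10]

robots = [{"kp": random.uniform(0, 2), "kd": random.uniform(0, 10)} for _ in range(4)]
for generation in range(8):                                    # 8 generations, as in Sec. 4.4
    dbbo2_generation(robots)
print(min(evaluate(r["kp"], r["kd"]) for r in robots))
```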
5 Conclusion

This work successfully extended BBO to its distributed counterpart and presented simulations and experiments to compare the two approaches in different situations. The results show that BBO and DBBO are able to optimize benchmark functions and the real-world problem of robot controller tuning. Although distributed BBO offers more flexibility, centralized BBO returns the best results on average. This topic can be researched further by using simulations and experiments to explore the effect of different BBO and DBBO parameter settings, including population sizes, mutation rates, and the number of communicating peers in DBBO. Other future work includes taking the theoretical Markov modeling [19, 20] and dynamic system modeling [21] that have been performed for BBO and extending them to DBBO.
References
1. Quammen, D.: The Song of the Dodo: Island Biogeography in an Age of Extinction. Simon & Schuster, New York (1997)
2. MacArthur, R.H., Wilson, E.O.: The Theory of Island Biogeography. Princeton University Press, Princeton (1967)
3. Simon, D.: Biogeography-Based Optimization. IEEE Transactions on Evolutionary Computation 12(6), 702–713 (2008)
4. Lozovyy, P., Thomas, G., Simon, D.: Biogeography-Based Optimization for Robot Controller Tuning. In: Igelnik, B. (ed.) Computational Modeling and Simulation of Intellect: Current State and Future Perspectives. IGI Global (in print, 2011)
5. Parker, L.E., Touzet, C.: Multi-Robot Learning in a Cooperative Observation Task, pp. 391–401 (2000)
6. Parker, L.E.: Distributed Intelligence: Overview of the Field and its Application in Multi-robot Systems. Journal of Physical Agents 2, 5–14 (2008)
7. Fischer, G.: Distributed Intelligence: Extending the Power of the Unaided, Individual Human Mind, pp. 7–14 (2006), http://l3d.cs.colorado.edu/
8. Van Dam, K.H., Verwater-Lukszo, Z., Ottjes, J.A., Lodewijks, G.: Distributed intelligence in autonomous multi-vehicle systems. International Journal of Critical Infrastructures 2, 261–272 (2006)
9. Valavanis, K.P., Saridis, G.N.: Intelligent Robotic Systems: Theory, Design and Application. Kluwer Academic, Boston (1992)
10. Weiss, G.: Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. The MIT Press, Cambridge (1999)
11. O'Hare, G.M.P., Jennings, N.: Foundations of Distributed Artificial Intelligence. John Wiley and Sons, New York (1996)
12. Lomolino, M.V., Brown, J.H.: The Reticulating Phylogeny of Island Biogeography Theory. Quarterly Review of Biology, 357–390 (2009)
13. Rarick, R., Simon, D., Villaseca, F., Vyakaranam, B.: Biogeography-based optimization and the solution of the power flow problem. In: IEEE Conference on Systems, Man, and Cybernetics, pp. 1029–1034 (2009)
14. Roy, P., Ghoshal, S., Thakur, S.: Biogeography-based optimization for economic load dispatch problems. Electric Power Components and Systems 38, 166–181 (2010)
15. Kundra, H., Kaur, A., Panchal, V.: An integrated approach to biogeography based optimization with case based reasoning for retrieving groundwater possibility. In: 8th Annual Asian Conference and Exhibition on Geospatial Information, Technology and Applications (2009)
16. Savsani, V., Rao, R., Vakharia, D.: Discrete optimisation of a gear train using biogeography based optimisation technique. International Journal of Design Engineering 2, 205–223 (2009)
17. Panchal, V., Singh, P., Kaur, N., Kundra, H.: Biogeography based satellite image classification. International Journal of Computer Science and Information Security 6, 269–274 (2009)
18. Ovreiu, M., Simon, D.: Biogeography-based optimization of neuro-fuzzy system parameters for diagnosis of cardiac disease. In: Genetic and Evolutionary Computation Conference, pp. 1235–1242 (2010)
19. Simon, D., Ergezer, M., Du, D., Rarick, R.: Markov models for biogeography-based optimization. IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics 41, 299–306 (2011)
20. Simon, D., Ergezer, M., Du, D.: Population distributions in biogeography-based optimization algorithms with elitism. In: IEEE Conference on Systems, Man, and Cybernetics, pp. 1017–1022 (2009)
21. Simon, D.: A Dynamic System Model of Biogeography-Based Optimization (2010) (submitted for publication)
22. Astrom, K., Hagglund, T.: PID Controllers: Theory, Design, and Tuning. International Society for Measurement and Control, Research Triangle Park, North Carolina (1995)
Scheduling a Single Robot in a Job-Shop Environment through Precedence Constraint Posting

D. Díaz¹, M.D. R-Moreno¹, A. Cesta², A. Oddi², and R. Rasconi²

¹ Universidad de Alcalá, Alcalá de Henares, Madrid, Spain
{daniel.diaz,mdolores}@aut.uah.es
² ISTC-CNR, Italian National Research Council, Rome, Italy
{amedeo.cesta,angelo.oddi,riccardo.rasconi}@istc.cnr.it
Abstract. The paper presents recent work on using robust state-of-the-art AI Planning and Scheduling (P&S) techniques to provide autonomous capabilities in a space robotics domain. We have defined a simple robotic scenario and reduced it to a known scheduling problem, which is addressed here with a constraint-based, resource-driven reasoner. We present an initial experimental evaluation that compares different meta-heuristic algorithms.
1 Introduction

This paper presents the first results obtained within a Ph.D. program on the topic Autonomy for Interplanetary Missions, funded and supported by the European Space Agency (ESA). Our current work is mainly aimed at applying constraint-based resource reasoning techniques to space robotics, especially for applications in planetary exploration. Promoting autonomy for future space missions brings enormous benefits, such as reduced operational costs, opportunistic science, and increased mission quality in terms of safety, science return, reliability and flexibility. Very often, the implementation of autonomous high-level competences on board (especially for deep space missions) becomes essential, since real-time command dispatching is not possible due to the astronomical distances involved [7] or to the scarce time windows for establishing communication. The following futuristic mission scenario is a hypothetical application example that aims at illustrating how autonomy could be applied in deep space missions.

Mission Scenario. The complete scenario describes a future ESA deep space mission where groups of autonomous rovers are in charge of collaborating to transport supplies between different Martian sites. Typical rover activities include loading, transporting and unloading the supplies, as well as performing basic self-maintenance tasks. Rovers are able to autonomously synthesize efficient action plans by optimizing energy management and plan completion time. Furthermore, advanced on-board re-planning capabilities are necessary in order to hedge against environmental uncertainty (i.e., rugged terrain, harsh weather conditions, etc.).
We start our analysis from a significantly reduced version of the previous scenario. In particular, problem complexity is reduced by considering (1) a unique robot and (2) a completely deterministic environment. We refer to the problem as scheduling a single robot in a job-shop environment, because its formulation is equivalent to a scheduling problem: succinctly, it is the problem of synchronizing the use of a set of machines to perform a set of jobs over time. Each job consists of a chain of activities that are to be sequentially processed on different machines. Each activity entails the processing of one item on a machine whose utilization is exclusive for the whole duration of the activity. Each machine can only process one activity at a time, and all activity durations are fixed. Additionally, the rover is in charge of transporting the items among the machines in order to allow their processing along all the jobs; at the beginning of the process all the items are kept in an initial storage area, while at the end all the items have to be unloaded in a final storage area. Transportation times between different machines depend upon the traveling distance, and the rover can simultaneously carry at most a fixed number of items. The goal is to synthesize a schedule where both machine and rover activities are synchronized towards the successful execution of all jobs while minimizing total completion time (makespan). The solution that we propose in this paper is based on the most recently reported results in constraint-based P&S techniques, with particular attention to the "Precedence Constraint Posting" (PCP) approach as described in [4,9,3]. The remainder of the paper is structured as follows: Section 2 describes in detail the problem of scheduling a single robot in a job-shop environment; Section 3 introduces our constraint-based solution method as well as a meta-heuristic strategy for solution optimization; Section 4 reports performance results on a set of reference benchmark problems; finally, a conclusions section closes the paper.
2 Problem Formulation

In this section we define the tackled problem as well as the constraint-based representation we use as a reference for problem solving.

Problem description. The problem of scheduling a single robot in a job-shop environment involves synchronizing the use of a set of machines M = {μ1, ..., μn} and a robot R to perform a set of jobs J = {j1, ..., jn} over time. Each job consists of a set of operations (or activities) Aj = {l0j, u1j, a1j, l1j, ..., uij, aij, lij, ..., unj, anj, lnj, u(n+1)j} to be sequentially processed, where aij is the activity belonging to job j performed by machine μij ∈ M, while uij and lij are, respectively, the Unload and Load activities that the robot must perform at machine μij to release and pick up items. l0j is the Load activity performed by the robot at the initial storage location, while u(n+1)j is the Unload activity performed by the robot at the final storage location. The execution of all the activities is subject to the following constraints:

– Resource availability: each activity aij requires the exclusive use of μij during its entire execution, i.e., no preemption is allowed. All the activities belonging to the same job demand distinct machines. The activities uij and lij require the exclusive use of the robot R.
– Processing time constraints: all activities aij, uij and lij have a fixed processing time. Both the machines and the robot can perform one operation at a time.

– Transportation time constraints: for each pair <μix, μjy> of machines, the robot R is in charge of moving all items processed by machine μix to machine μjy. This entails performing a Load activity lix at μix, transporting the item to μjy, and finally performing an Unload activity ujy at μjy. The time necessary to travel from μix to μjy is directly proportional to the traveling distance, and is modeled in the problem in terms of sequence-dependent setup times stij which must be enforced between each <lix, ujy> activity pair. All setup constraints satisfy the triangle inequality property, i.e., given three machines μi, μj and μk, the condition stij ≤ stik + stkj always holds.

– Robot maximum capacity constraint: the robot is able to transport a maximum of C items at a time. This entails that the robot may chain Load or Unload activities at most C times.

A feasible solution is a schedule where the start times of all activities are fixed and all temporal and resource constraints are satisfied. We are interested in solutions that minimize the total completion time, or makespan.

Constraint-based problem representation. Our solution considers the scheduling problem as a special type of Constraint Satisfaction Problem (CSP) [6]. A general description of a scheduling problem as a CSP involves a set of variables, each with a limited domain, and a set of constraints that limits the possible combinations of values. Hence, a feasible CSP solution is defined as an assignment of domain values to all variables that is consistent with all imposed constraints. A general CSP solution scheme can be seen as an iterative procedure that interleaves two main steps in each cycle:

– A decision-making step, where a variable is chosen to be assigned a specific domain value. The next assignment can be decided either by systematically following an exhaustive search technique (such as simple depth-first search), or by using more efficient approaches that use variable and value ordering heuristics to guide the search process. Typical general-purpose heuristics generate variable and value orderings by selecting the "most constrained variable" (MCV) and the "least constraining value" (LCV), respectively.

– A propagation step, where a set of "inference rules" prunes infeasible solutions in advance by removing elements from variable domains when a decision step is performed. Path consistency algorithms such as "all pairs shortest paths" are typically used (a minimal example is sketched below).

New problem modelling assumptions are considered in order to adapt the initial problem formulation to our constraint-based solution scheme. Hence, the problem is now described as follows (see Figure 1): each machine μij is considered as a binary resource that processes all job operations aij. The Unload and Load activities are devised to be performed by a single robotic arm able to manage one item at a time (also modelled as a binary resource). We introduce an additional kind of activity cij that requires the use of a new cumulative resource (with a maximum capacity of C), with the aim of modelling the robot's capability of simultaneously carrying multiple items. The execution of cij starts when the corresponding Load activity lij starts, and finishes at the termination of the Unload activity ui+1j.
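The propagation step mentioned above can be illustrated with the standard all-pairs-shortest-paths computation on a Simple Temporal Problem; the three time points and the numeric bounds below are invented for the example and are not part of the benchmark model.

```python
def stp_propagate(n, constraints):
    """Propagation on a Simple Temporal Problem via all-pairs shortest paths
    (Floyd-Warshall). constraints maps (i, j) -> w, meaning t_j - t_i <= w.
    Returns the minimal network, or None if the network is inconsistent."""
    INF = float("inf")
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for (i, j), w in constraints.items():
        d[i][j] = min(d[i][j], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    if any(d[i][i] < 0 for i in range(n)):
        return None                       # negative cycle: temporal inconsistency
    return d

# time points: 0 = origin, 1 = start(a), 2 = end(a); duration(a) = 5,
# start(a) >= 2 and end(a) <= 20 (all values invented for the example)
net = stp_propagate(3, {(1, 2): 5, (2, 1): -5, (0, 1): 20, (1, 0): -2, (0, 2): 20})
print("earliest start of a:", -net[1][0])   # lower bound of time point 1 -> 2
```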
Fig. 1. A cumulative resource is introduced to model the multi-capacity robot usage
Related work. Research in CSP-based scheduling has mainly focused on the development of effective heuristic-biased models for efficiently controlling the scheduling search process (without incurring the overhead of a backtracking search). Two main directions can be distinguished, addressing problems that involve unary resources and cumulative (multi-capacity) resources, respectively. In the first case, some initial heuristic-biased procedures implement deterministic, one-pass solutions such as the precedence constraint posting algorithm proposed in [4]. In the second case, we can mention the resource profile-driven algorithm ESTA [3] as an efficient constraint-based scheduling technique that additionally addresses cumulative resources.
3 The Profile-Based Solution (Extended-ESTA)

Since cumulative resources are needed to model the multi-capacity robot usage, we have chosen the ESTA algorithm [3] as the reference constraint-based solving procedure to implement our solution. Furthermore, we have studied and adapted the basic principles of a recent extension of the SP-PCP (Shortest Path-based Precedence Constraint Posting) algorithm proposed in [8]. More concretely, we have adopted the new set of dominance conditions introduced in [8] as a set of four basic rules that decide the conflict resolution strategy when considering sequence-dependent setup times.

3.1 Adapting PCP to Our Robot Scheduling Problem

ESTA was initially developed for solving the scheduling problem known in the OR literature as RCPSP/max [2]. It follows an advanced precedence constraint posting scheme
that uses two different abstraction levels of the CSP formulation (i.e., the robot scheduling problem represented as a temporal constraint network) to deal with the temporal and resource aspects of the problem, respectively. Thus, the ESTA algorithm basically consists of iteratively interleaving the two following steps until a conflict-free schedule is found:

– Temporal analysis. The first step of the algorithm consists of creating a basic temporal network (ground-CSP) to represent and reason about the temporal constraints of the problem. The temporal constraint network described here corresponds to a Simple Temporal Problem (STP) formulation where time points represent start and end times of activities, and temporal constraints between time points represent both the duration of an activity and the precedence relations between pairs of activities. Temporal propagation (for computing current bounds for all time points after posting a new temporal precedence constraint) and solution extraction¹ are operations directly performed over this STP formulation.

– Resource analysis. Roughly speaking, the second step of ESTA consists of the following sequence: first, a meta-CSP representation is created by identifying a set of capacity violations inferred from the previous ground-CSP, where variables correspond to the remaining resource conflicts and values to the set of feasible activity orderings that solve them; second, a resource conflict is selected by applying a variable ordering heuristic; and finally, the selected conflict is solved by using a value ordering heuristic that imposes a new precedence constraint (over the ground-CSP) between some pair of competing activities that contributes to the conflict.

In other words, the ESTA algorithm implements a (one-pass) greedy resource-driven scheduler that uses an earliest start-time resource profile projection (ground-CSP) to perform a resource analysis and iteratively select and level "resource contention peaks" (i.e., overcommitments). More concretely, the resource analysis consists of synthesizing the meta-CSP by computing (sampling) the Minimal Critical Sets (MCSs), i.e., sets of activities that overlap in time and demand the same resource causing overcommitments, such that no proper subset causes a resource conflict (see Figure 2).

Fig. 2. Meta-CSP generation example (MCS computation)

The search strategy for selecting and solving resource contention peaks is biased by the following variable and value ordering heuristics:

– Once the decision variables are computed (candidate MCSs), a most constrained variable ordering heuristic chooses the MCS with the smallest temporal flexibility (free temporal space).
¹ Solution extraction provides a conflict-free schedule in the form of an Earliest Start Schedule (ESS): a consistent temporal assignment of all time points with their lower bound values that is also resource consistent.
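The resource analysis described above can be approximated, at its simplest, by sweeping the earliest-start profile and reporting the intervals in which demand exceeds capacity; the sketch below does exactly that and is only a rough stand-in for the MCS sampling procedure of ESTA.

```python
def contention_peaks(activities, capacity):
    """Sweep an earliest-start profile and report the maximal time intervals in
    which demand exceeds capacity (the regions from which MCSs would be sampled).
    activities: list of (est, eet, demand) triples taken from the propagated STP."""
    events = []
    for est, eet, demand in activities:
        events += [(est, demand), (eet, -demand)]
    events.sort()                          # ends sort before starts at equal times
    level, start, peaks = 0, None, []
    for t, delta in events:
        level += delta
        if level > capacity and start is None:
            start = t                      # a contention peak begins
        elif level <= capacity and start is not None:
            peaks.append((start, t))       # peak ends; the activities overlapping
            start = None                   # here are the candidates for ordering
    return peaks

# three unit-demand activities on a capacity-2 resource (illustrative numbers)
print(contention_peaks([(0, 4, 1), (1, 5, 1), (2, 6, 1)], capacity=2))   # [(2, 4)]
```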
Algorithm 1. Conflict selection and resolution process
  Conflict ← SelectConflict(MetaCSP)
  Precedence ← SelectPrecedence(Conflict)
  GroundCSP ← PostConstraint(GroundCSP, Precedence)
– Conflict resolution is performed by a least constraining value ordering heuristic that levels the contention peak by posting a simple precedence constraint between two activities that belong to the related MCS, according to the following criterion: the greater the flexibility retained after posting a precedence ordering constraint, the more desirable it is to post that constraint.

Search heuristics of this kind, which use the temporal flexibility retained between each pair of activities to bias the variable and value selection decisions, are typically known as slack-based heuristics [11]. Algorithm 1 depicts the basic steps of the conflict resolution process explained above: the first step computes a collection of candidate MCSs; the second step selects the most constrained MCS and the ordering choice to solve it; the last step imposes the new leveling constraint within the ground-CSP. A minimal sketch of this selection logic appears at the end of this subsection.

Extending ESTA. Some enhancements have been introduced to the basic ESTA algorithm in order to extend it to the specificities of the robot scheduling problem. Recall that two types of resources are managed in this case: on one hand, binary resources are used to model the machines and the robot usage; on the other hand, a cumulative resource is used to model the robot's multi-capacity aspect. We essentially introduce the following modifications: the SelectPrecedence() and PostConstraint() functions that implement the variable and value ordering decisions are able to detect different types of resource contention peaks and to impose the corresponding (simple or setup-time-bounded) precedence constraint. More concretely, the profile projection analysis now synthesizes two different sets of MCSs separately:

– The first MCS set corresponds to the resource profiles of the machines and of the robot multi-capacity usage, and it is synthesized in the same way as in the original ESTA.

– The second MCS set is the result of the contention peak analysis performed over the robot usage resource. In this case, setup times associated with Load/Unload activities are also taken into account by introducing the underlying rationale of the extended dominance conditions: the distances between each pair of activities are analysed, and a conflict is found if the separation between them is less than the corresponding setup time.

Figure 3 illustrates the different kinds of activities related to both binary and cumulative resources, as well as some precedence constraints between them. Once both sets of candidate MCSs are computed, they are merged to select the most constrained MCS. Similarly, the resolution of the conflict consists in imposing a simple precedence constraint if the selected pair of activities belongs to the first MCS set, or a precedence constraint with a setup time otherwise, with the following exception: if a crossed conflict situation is solved (see Figure 4), a specific profile projection analysis is performed with the aim of avoiding possible dead-ends. If the cumulative-related activity attached to the target activity constrained by the new ordering is involved in a peak, the opposite precedence ordering constraint is imposed.
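The following sketch illustrates the slack-based selection logic described above (most constrained MCS, least constraining ordering); the Act tuple, the dist dictionary, and the specific flexibility measure are simplifying assumptions for this example rather than the exact SP-PCP dominance conditions.

```python
from collections import namedtuple

Act = namedtuple("Act", "name start end")        # start/end are time-point indices

def select_and_post(mcss, dist):
    """Slack-based heuristics (in the spirit of [11]): choose the most constrained
    MCS (smallest best-case flexibility), then the least constraining ordering
    (largest retained slack). dist[(p, q)] bounds t_q - t_p after propagation."""
    def slack(a, b):                             # flexibility left if a precedes b
        return dist[(a.end, b.start)]
    worst = min(mcss, key=lambda m: max(slack(a, b) for a in m for b in m if a is not b))
    a, b = max(((a, b) for a in worst for b in worst if a is not b),
               key=lambda ab: slack(*ab))
    return a, b                                  # post end(a) (+ setup time) <= start(b)

# two activities competing for the robot; time points 0..3, bounds are illustrative
a1, a2 = Act("load@m1", 0, 1), Act("unload@m2", 2, 3)
print(select_and_post([[a1, a2]], {(1, 2): 8, (3, 0): 3}))   # orders a1 before a2
```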
Fig. 3. Some examples of simple precedence constraints (SPC) and precedence constraints with setup times (ST)
Fig. 4. The possible crossed orderings that may lead to a dead-end
3.2 Providing Better Solutions

Since the previous one-pass, greedy solution does not guarantee optimality, we used an efficient optimization framework (in contrast to costly backtracking-based schemes) with the aim of providing better results. We adopted the IFLAT [10] iterative sampling optimization scheme, which allows us to find solutions with lower makespan and to overcome situations where unsolved conflicts are encountered. The underlying idea is to iteratively run the extended-ESTA algorithm so that different paths are randomly explored within the search space, using a "meta-heuristic strategy" to bias the process [1]. Algorithm 2 illustrates the IFLAT process. The procedure takes two parameters as input: an initial solution S and the amount of backtracking (MaxFail) that limits the number of failed attempts at improving the solution. The exploration of different solutions is broadened by a meta-heuristic strategy that interleaves the following two steps in each cycle: first, a retraction step removes an arbitrary number of solving constraints (with a specific retraction probability) from the "critical path" of the last solution; second, a flattening step attempts to repair the "partially destroyed" solution by running extended-ESTA on it. If a solution with a better makespan is found, the best solution (Sbest) is updated and the counter is reset to 0. Otherwise, if no improvement has been found within MaxFail iterations, the algorithm returns the best solution encountered.
4 Preliminary Results Analysis

In this section we provide a first evaluation of the efficiency of both the deterministic extended-ESTA algorithm and the iterative sampling framework for solution optimization. The test input data were obtained from the previous work of [5], which aimed at providing robust and efficient solutions for a large variety of scheduling problems. In
Algorithm 2. IFLAT optimization algorithm
  Input: S, MaxFail
  Output: Sbest
  Sbest ← S
  counter ← 0
  while counter ≤ MaxFail do
    RELAX(S)                      // retracting step
    FLATTEN(S)                    // repairing step
    if Mk(S) < Mk(Sbest) then
      Sbest ← S
      counter ← 0
    else
      counter ← counter + 1
particular, we are interested in the "trolley problem" instances [12], an OR problem that essentially contains the same elements as our robot scheduling problem². The considered benchmarks consist of problems of size n × m, where n is the number of jobs and m corresponds to the number of machines. The number of operations or activities per job is equal to the number of machines. We used two different sets of solvable instances of size 10 × 5 where the capacity of the robot is 2 and 3, respectively. The equivalent benchmark size in our model is given by the following formula, which accounts for the Load and Unload activities as well as the activities associated with the cumulative resource: n × (m + (m + 1) · 2 + (m + 1) + 1). Hence, the resulting number of activities per job is now 24. We implemented the solving procedures in Java and tested them on a PC with 3 GB of RAM and an AMD Athlon XP 2400+ CPU. Table 1 shows the performance results of the tests corresponding to the deterministic extended-ESTA and to the solution optimization framework (IFLAT). The metrics provided are the makespan (MK), the CPU time (CPUs), and the number of attempts that IFLAT performs during an estimated time of 1800 seconds. Additionally, the table shows the comparative performance value ΔLWU%³ (the standard baseline performance metric used in the OR literature), as well as the average CPU time and the average number of IFLAT attempts. The general settings used for the IFLAT tests are the following: the maximum number of relaxations and the removal probability are set to 4 and 0.2, respectively, and the randomization factor is set to 0.2. From these results we can mainly extract the following conclusions: the makespan of the solutions is, on average, significantly improved with the IFLAT optimization framework on both instance sets. Furthermore, we can see that the number of solutions
² Unfortunately, the results described in [5] are not available any more, hence we offer a comparison completely internal to our own work.
³ Due to the lack of detailed results, the comparative analysis is performed against an estimated lower bound that corresponds to the length of the longest job.
D. D´ıaz et al.
Table 1. Preliminary experimental results Extended-ESTA C=2
IFLAT
C=3
C=2
C=3
Benchmark
MK
CP Us
MK
CP Us
MK
CP Us
#Iter
MK
CP Us
#Iter
Robot test (1)
2696
171.25
2680
199.92
2644
2330.65
31
2560
2035.86
22
Robot test (2)
2817
163.60
2339
185.77
2528
1992.36
31
2566
2123.15
27
Robot test (3)
2925
162.55
2660
173.86
2671
2131.64
24
2609
2172.49
26
Robot test (4)
2962
164.55
2656
215.49
2737
2502.52
25
2591
2334.52
27
Robot test (5)
–
–
2633
191.38
2780
3277.77
23
2384
2035.11
29
Robot test (6)
–
–
2428
186.77
2512
2366.13
36
2213
2240.06
25
Robot test (7)
–
–
2492
263.63
2584
2175.12
25
2721
2011.04
24
Robot test (8)
2652
174.92
2916
251.74
2877
2062.49
26
2474
2054.08
27
Robot test (9)
2509
173.63
2307
257.82
2291
2089.43
29
2093
2176.26
27
Robot test (10)
2750
172.36
2463
235.48
2437
2208.64
28
2346
2097.10
28
Robot test (11)
–
–
3272
253.21
3247
2816.75
21
2874
2052.16
26
Robot test (12)
3279
125.99
3380
234.37
3343
2636.52
29
3212
2243.99
27
Robot test (13)
3580
170.30
3179
224.38
3034
1991.38
27
2587
2333.39
28
Robot test (14)
3064
175.98
3134
223.52
2613
3045.66
25
2558
2211.21
24
Robot test (15)
2623
164.68
2991
220.45
2609
2457.38
28
2635
2008.09
27
Robot test (16)
2900
179.57
2890
232.56
2705
2603.94
30
2493
2028.71
29
Robot test (17)
–
–
3202
241.86
3071
2206.43
24
2858
2133.43
26
Robot test (18)
3382
216.13
3130
243.87
2739
2200.67
29
2792
2402.53
28
Robot test (19)
3461
171.28
3071
229.53
2956
2696.34
32
2893
2140.92
28
Robot test (20)
–
–
2889
224.75
2573
2195.64
32
2521
2036.93
29
av ΔLW U % CP U av ΔLW U % CP Us s
307.21
170.49
288.60
224.52
av #Iter av ΔLW U % CP U av #Iter av ΔLW U % CP Us s
276.53
2399.36
27.75
256.17
2128.55
26.70
found is larger with IFLAT (all instances are solved). On the other hand, the computation time increased; this is the price to pay for obtaining better quality results. In line with our expectations, the solution makespan is also shortened when the robot capacity is increased, as the extended-ESTA experiments with capacity 2 and 3 show. We are currently studying different ways to enhance the one-pass extended-ESTA solution, for example by improving the performance of the profile analysis performed when a crossed constraint is found. This would not only yield better completion times, but would also increase the possibility of improving solution quality with IFLAT, since the number of iterations performed within the same time would also increase.
5 Conclusions

In this paper we have studied the use of existing constraint-based, heuristic-biased techniques to solve the problem of "scheduling a single robot in a job-shop environment". In particular, we have chosen two different PCP-based algorithms as the basis for building our extended-ESTA solution: an enhanced SP-PCP for addressing problems that involve sequence-dependent setup times, and the advanced profile-based ESTA for dealing
with (multi-capacity) cumulative resources. Extended-ESTA basically creates a meta-representation of the temporal network in which resource aspects are explicitly addressed by performing a resource contention peak analysis. Furthermore, setup times are considered when contention peaks are solved, since transportation constraints have to be satisfied. A deterministic, greedy implementation has been presented and then embedded within a larger iterative-sampling search framework. This allowed us to promote solution optimization by broadening search-space coverage with a meta-heuristic strategy. Finally, we have performed a preliminary empirical experiment over a set of reference benchmark problems to obtain an initial evaluation of solution quality.

Acknowledgments. Daniel Diaz is supported by the European Space Agency (ESA) under the Networking and Partnering Initiative (NPI) Autonomy for Interplanetary Missions (ESTEC-No. 2169/08/NI/PA). CNR authors are partially supported by MIUR under the PRIN project 20089M932N (funds 2008). The second UAH author is supported by Castilla-La Mancha project PEII09-0266-6640. Special thanks to Michele Van Winnendael for his support throughout the whole course of the study.
References
1. Blum, C., Roli, A.: Metaheuristics in combinatorial optimization: Overview and conceptual comparison. ACM Comput. Surv. 35(3), 268–308 (2003)
2. Brucker, P., Drexl, A., Mohring, R., Neumann, K., Pesch, E.: Resource-Constrained Project Scheduling: Notation, Classification, Models, and Methods. European Journal of Operations Research 112(1), 3–41 (1999)
3. Cesta, A., Oddi, A., Smith, S.F.: A constraint-based method for project scheduling with time windows. J. Heuristics 8(1), 109–136 (2002)
4. Cheng, C., Smith, S.F.: Generating Feasible Schedules under Complex Metric Constraints. In: Proceedings 12th National Conference on AI, AAAI 1994 (1994)
5. Laborie, P., Godard, D.: Self-Adapting Large Neighborhood Search: Application to Single-Mode Scheduling Problems. In: Proc. of the Multidisciplinary International Scheduling Conference: Theory & Applications, MISTA 2007 (2007)
6. Montanari, U.: Networks of Constraints: Fundamental Properties and Applications to Picture Processing. Information Sciences 7, 95–132 (1974)
7. Muscettola, N., Nayak, P., Pell, B., Williams, B.C.: Remote Agents: To Boldly Go Where No AI Systems Has Gone Before. Artificial Intelligence 103(1-2), 5–48 (1998)
8. Oddi, A., Rasconi, R., Cesta, A., Smith, S.F.: Iterative-Sampling Search for Job Shop Scheduling with Setup Times. In: COPLAS 2009, Proc. of the Workshop on Constraint Satisfaction Techniques for Planning and Scheduling Problems at ICAPS (2009)
9. Oddi, A., Smith, S.F.: Stochastic Procedures for Generating Feasible Schedules. In: Proceedings 14th National Conference on AI (AAAI 1997), pp. 308–314 (1997)
10. Oddi, A., Cesta, A., Policella, N., Smith, S.F.: Combining Variants of Iterative Flattening Search. Journal of Engineering Applications of Artificial Intelligence 21, 683–690 (2008)
11. Smith, S.F., Cheng, C.: Slack-Based Heuristics for Constraint Satisfaction Scheduling. In: Proceedings 11th National Conference on AI, AAAI 1993 (1993)
12. Van Hentenryck, P.: The OPL Optimization Programming Language. MIT Press, Cambridge (1999)
An Intelligent Framework to Online Bin Packing in a Just-In-Time Environment

Sergey Polyakovskiy¹ and Rym M'Hallah²

¹ Department of Computer Science and Robotics, Ufa State Aviation Technical University, Ufa, Russia
[email protected]
² Department of Statistics and Operations Research, Kuwait University, Kuwait
[email protected]

Abstract. This paper addresses a complex real-life manufacturing problem that occurs in the paper and wood industries: the online bin packing and cutting of small items using parallel machines in a just-in-time environment. The objective is to minimize both the unused areas of the bins and the sum of weighted earliness and tardiness penalties. This NP-hard problem is herein solved using an agent-based approach A-B. The active agents of A-B interact dynamically in real time to jointly fill the bins with items, assign cutting patterns to machines, cut them, and have the items delivered on time. The application of A-B to real-life instances from a manufacturing company highlights the sizeable savings induced by A-B, and demonstrates its suitability for complex online real-life situations.

Keywords: Multi-agent, bin packing, parallel machine scheduling, JIT.
1
Introduction
Globalization forced different production sectors to undergo fundamental strategic changes. It switched their focus from maximizing production quantities and minimizing unit cost to just-in-time (JIT) production [1] where responding to customer needs in terms of quantities and due dates are the prevailing factor. Today’s trend is make-to-order: low-volume high-mix orders with short lead times. It happens at a time of ever increasing raw material cost, expensive local labor, and ultimate need for growth and profitability. It occurs in the wood, furniture, interior components, and plastic goods industries, where clients tailor their orders to their specific needs and order them from all over the world while manufacturers compete to respond to this demand on-time at low cost. JIT-BP is a real-life problem encountered in make to order industries with a high mix of products. It involves cutting raw material in a JIT environment. Orders reach the firm online. They are not known a priori. Their characteristics
This research is supported by grant HIII 65497.2010.9 of the president of the Russian Federation for Leading Scientific Schools and grant 09-07-00408-a of the Russian Foundation for Basic Research.
–i.e., demand, items’ dimensions, and delivery dates– become known when the order is made. Manufacturing these orders involves a stage that groups the items into cutting patterns and cuts them using one of q parallel cutting machines. The JIT set up requires that each item be cut on the date it is due to reach its next manufacturing stage (additional machining or assembly). Cutting a piece too early requires its temporary transfer and storage whereas cutting it too late induces delays on the next production stages. The firm faces two challenges: delivering the orders within their specified time frames, and optimizing its use of raw materials by identifying cutting patterns with reduced material waste. Addressing these two challenges is further complicated by unforeseen events such as online demand and machine breakdown. JIT-BP is closely related to the online version of two NP-hard combinatorial optimization problems: variable-sized bin packing (VSBP), and parallel machine weighted earliness and tardiness (PWET). Searching for optimal cutting patterns in real time is equivalent to VSBP, where a set of small sized items are packed into a finite number of large sized raw material stocks (called hereafter bins) such that the unused surface of the bins (or total waste) is minimized, and no pair of items packed within a bin overlap. Once packed, the items are to be cut using one of the available cutting machines while respecting the due dates imposed by the subsequent production stages. This is equivalent to the online PWET which consists of scheduling a set of n tasks on q parallel machines with the objective of minimizing the total weighted earliness and tardiness of all jobs. To the best of our knowledge, no research work addresses VSBP and PWET simultaneously. Their combination results in a complex multi-criteria problem. JIT-BP consists of scheduling a dynamic set of cutting patterns s ∈ R = {1, ..., m } on a set Q = {1, ..., q} of parallel machines. A pattern s has a distinct processing time psk on machine k, where psk is a function ϕk (s) of s. A pattern consists of a set of small items i ∈ I = {1, ..., n} . An item i is characterized by its size (eg., length li , width wi and height hi or any subset of them), a known distinct deterministic due date di , and a per unit cost of earliness αi and tardiness βi . Each pattern is associated with a bin j ∈ B. A bin j is characterized by its size (eg., length Lj , width Wj and height Hj or any subset of them). Sets I and B are dynamic over time. At any instant in time, B can be divided into two non¯ ∗ = B\B ∗ overlapping subsets: a subset B ∗ of standard new bins, and a subset B of non-standard bins that emanated from previous cutting operations. JIT-BP has two objectives. The first is to pack the items of I into a subset of the bins of B as to minimize the total waste of the standard bins of B ∗ . The second is to find a feasible schedule of the resulting patterns on the parallel machines of Q with minimal sum of weighted earliness and tardiness (WET) over all items; i.e., m to minimize (αi Ei + βi Ti ), where Gs is the set of items within pattern s, s=1 i∈Gs
Ei = max {0, di − csk } and Ti = max {0, csk − di } are the earliness and tardiness of i, and csk is the completion time of s on k.
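As an illustration of the WET objective defined above, the sketch below sums the weighted earliness and tardiness of all items given the completion time of the pattern that contains each of them; the dictionary-based data layout and the example numbers are assumptions made for this sketch.

```python
def weighted_earliness_tardiness(patterns, completion, due, alpha, beta):
    """Sum of alpha_i*E_i + beta_i*T_i over all items, where E_i and T_i are the
    earliness and tardiness of item i w.r.t. the completion time of its pattern."""
    total = 0.0
    for s, items in patterns.items():            # pattern s -> list of item ids
        c = completion[s]                        # completion time of pattern s
        for i in items:
            total += alpha[i] * max(0.0, due[i] - c)   # earliness penalty
            total += beta[i] * max(0.0, c - due[i])    # tardiness penalty
    return total

# one pattern holding items 0 and 1, finishing at t = 12 (illustrative numbers)
print(weighted_earliness_tardiness({0: [0, 1]}, {0: 12.0},
                                   due={0: 10.0, 1: 15.0},
                                   alpha={0: 1.0, 1: 2.0}, beta={0: 3.0, 1: 1.0}))
```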
Section 2 details the multi-agent framework. Section 3 discusses its implementation. Section 4 compares its results to those obtained by a real life manufacturer. Finally, Section 5 is a summary.
2
Intelligence Framework
JIT-BP is a multi-objective, dynamic optimization problem. It can’t be addressed by traditional optimization approaches [2] which require large amounts of information for their decision process, are not robust, and have large computational times. Herein, we propose an agent-based system (ABS), which applies the concept of distributed artificial intelligence. Agent-based modeling is more suitable for complex dynamic problems with a decentralized structure [3]. It is a promising venue for complex object modeling and optimization. It is flexible, highly reliable, widely applicable, and easily adaptable in manufacturing. ABS is comprised of independent intelligent agents related to physical or functional entities. (An agent is a hardware or software based computer system with key properties of autonomy, social ability, reactivity and pro-activeness [4].) It is a self-organizing system of agents with situational behavior. The agents, which are driven by their own parameters, decision rules, and fitness assessment, interact with each other in a system that mimics the real life one. In fact, the agents’ interaction defines all system’s dynamic, evolution of its characteristics, and final solution structure. Each agent applies a “greedy” strategy: at every time epoch, it assesses the potential of available actions and chooses one that optimizes its reward. By striving to improve its fitness, each agent contributes to the overall improvement of the system’s objective function and influences the system’s future. Agents keep interacting until the system reaches a stable state. 2.1
Model and General Paradigm of ABS
ABS maps the elements of JIT-BP to three types of intelligent hybrid-type agents: B, I, and M which represent, respectively, the bins, items, and cutting machines. These agents interact using direct informational exchange. Each B-agent corresponds to a bin and its corresponding cutting pattern. It creates a group of items and competes to attract the best I-agents to it. Its goal is to maximize its group’s fitness measured via its packing coefficient. That is, its goal is to fill its bin as efficiently as possible by inviting I-agents and packing them into the bin while respecting standard packing constraints. An I-agent corresponds to an item. It is free if it does not belong to any group, and busy otherwise. A free I-agent competes with other I-agents to join the “best” B-agent’s group. Its goal is to minimize its incremental weighted earliness and tardiness on the B-agent: WET(Gs ∪ {i})- WET(Gs ); i.e., its goal is to reduce its negative influence on the system’s WET. An M-agent corresponds to a machine. It does not have any optimization role, and carries a technical one. Each M-agent stores its assigned cutting patterns and their starting and completion times. It is characterized by its ϕk function.
ABS acts like a global agent: it supports the agents’ communication, runs special algorithms, and guarantees convergence to a feasible solution. It constructs a feasible solution in two stages. The first stage updates ABS’ knowledge base every time an event occurs such as a new request for items or the arrival of a bin. The second launches a five-phase repetitive decision process that mimics the sequence of real life operations. They are based on the cutting management policy which supposes that constructing and scheduling cutting patterns simultaneously increases the efficiency and flexibility of production. In the first phase, the decision maker (DM) identifies “critical” zones of the production plan. These time zones correspond to clusters of items having close due dates; i.e., competing for the same time slots on the machines. The DM’s goal is to schedule these items on the cutting machines using preset decision rules. In the second phase, the DM chooses the most appropriate raw material to pack the conflicting items. His/her goal is to identify the bins which will induce the least waste and least useful residuals when forming cutting patterns. The DM always uses residuals from previous bins prior to using new ones. S/he accounts not only for the items’ sizes and due dates, but also for the availability of the machines. In the third phase, the DM initiates the construction of cutting patterns using the set of competing items and selected bins. S/he packs items with close due dates on the same bin while reducing waste and respecting packing constraints. In the fourth phase, the DM estimates the processing time of each cutting pattern, and determines the idle times of each machine. Using some pre-set decision rules, s/he schedules the patterns on the q machines; i.e., s/he specifies the starting and completion time of each item of the cutting patterns. Finally, in the fifth phase, the DM checks the feasibility of the schedule to JIT-BP (i.e., as a function of all received orders), and makes any necessary amendments. 2.2
Dynamic Operation Model
ABS operates in a dynamic environment; thus, has to account for some modelspecific events such as the arrival of new items, removing cut patterns, machine break down, and modifying the cutting plan. Subsequently, its design should guarantee not only the quality of the final solution and its robustness, but also the speed of generating new solutions when unforeseen events occur. To account for new events, ABS divides the time horizon into three: an items accumulation zone H3 , an ABS working zone H2 , and an immediate response zone H1 . These zones cover the due dates of the free items (marked by points in Fig. 1), in addition to the starting and completion times of all cutting patterns. H3 identifies items whose due dates fall into its time interval, and makes them free I-agents. However, it does not actively serve in the solution construction. H2 constructs patterns and schedules their cutting using the iterative decision process of sec. 2.1. It may support the re-optimization of the current solution by applying special algorithms such as local search, evolutionary programming, and the approximate approach of [5]. Finally, H1 responds immediately to any event that occurs within its time interval (eg., new ordered items, machine breakdown).
Thus, H1 uses fast special-purpose algorithms to generate new least cost feasible cutting plans. In addition, it removes cut patterns from ABS. The model proceeds as follows. On the first stage, all I-agents’ due dates are distributed along the time line which is divided into three zones as in Fig. 1.a. The length of zone H2 , l (H2 ) , is a multiple of l (H1 ) , the length 2) of H1 . It is subdivided into η = l(H sub-zones: H2,1 , . . . , H2,η . On the second l(H1 ) stage, ABS starts its iterative process. At every time unit, ABS shifts the zero point of the time line to the right by one; thus, decreases l (H1 ) . All patterns ready at time zero are removed from ABS and transferred to the production line (cf. Fig. 1.b). Furthermore, all useful residuals are injected back to ABS through ¯ ∗ . When l (H1 ) = 0, ABS sets H1 = H2,1 , and decreases l (H2 ) the update of B as illustrated by Fig. 1.c. ABS iterates through these operations until l (H2 ) = 0. When l (H2 ) = 0, ABS switches a part of H3 of length l(H2 ) as shown in Fig. 1.d. To obtain a high quality solution for this dynamic problem, ABS investigates a large number of packing alternatives; thus, uses a large H3 . Subsequently, it receives more items and has more alternatives when H2 is switched to H3 , while it makes the decrease of H2 smoother since it gives search algorithms more time to find a better solution. Only a small interval of the time line should be used as H1 . Any special-purpose algorithm is to produce solutions instantly within H1 . H1
Fig. 1. ABS’ operational model scheme
3
Mathematical Support
To obtain an approximate feasible solution for JIT-BP, ABS declares the q machines and the n items M and free I-agents, respectively. For each M-agent k ∈ Q, it defines the patterns’ cutting times using ϕk . Then, it iteratively solves five consecutive subproblems. Subproblem 1. Let a critical zone denote a time interval having a high density of the distribution of due dates, and let the corresponding items be critical ones. A judicious grouping of the critical items into priority groups or patterns and their subsequent scheduling into the q machines may significantly decrease WET. To detect a critical zone along the upcoming production plan, we apply a peak clustering algorithm based on the Gauss function [6]: μi =
An Intelligent Framework for Online BP in a JIT Environment
231
2b di − di exp − , where r is the size of the neighborhood, and b is a r/2 i ∈I generalization factor. ABS computes this function over the set of free I-agents, and selects the free I-agent i∗ with maximal mean; i.e., μ∗ = max (μi ). (A high
i∈I
function value indicates that a large number of other I-agents with due dates close to di∗ are located in a range of neighborhood r and compete for the same time slots.) Finally, ABS defines the critical zone as the interval (di∗ − t , di∗ + t ), 2b d ∗ −t where υ (t ) = exp − i r/2 = λ and λ ∈ (0, 1) is a preset threshold parameter. In practice, b, r and λ depend on the problem’s characteristics, density and distribution of due dates, and time horizon. Subproblem 2. To determine a set of bins B ⊆ B to place items of the critical zone I ⊆ I, we propose a two-stage algorithm. The first stage divides I into subsets using the nearest neighbor rule whereas the second stage defines the bins that can cover each subset of items with the least waste. For the first stage, ABS chooses b I-agents (b ≤ q) with largest sizes and makes them create clusters Ca ⊆ I (a ∈ {1, . . . , b }, C1 ∩ . . . ∩ Cb = ∅) with centers ν (Ca ) = da . Then, it makes I-agent i∗ , which has formed cluster C1 , to play the “asking agent” role; i.e., to begin a clustering process. Subsequently, i∗ searches for another unmarked I-agent e ∈ I with due date de nearest to ν (Ci∗ ). It asks e to join the cluster Ci∗ . To confirm or reject an invitation, e analyzes other possibilities. If de is nearer to ν (Ci∗ ) than to any other cluster’s center, it accepts the invitation; otherwise, it rejects it. When it accepts the invitation, e assigns itself to cluster Ci∗ = Ci∗ ∪{e} and does not accept anyfurther invitation. Additionally, i∗ computes the new cluster’s center ν (Ci∗ ) =
i∈Ci∗
|Ci∗ |
di
. When
e rejects the invitation, i∗ directs control to I-agent i∗∗ which leads a cluster Ci∗∗ with center ν (Ci∗∗ ) nearest to de . Subsequently, i∗∗ starts the role of “asking agent” instead of i∗ . Finally, any cluster leader stops the process and directs control to ABS when no unmarked I-agent i ∈ I is available. The second stage determines, for each cluster, a bin to pack its items. It uses the lower bound algorithm introduced in [7] to decide if the items of the cluster can be packed into a given bin. ABS enumerates all potential bins for each ¯ ∗ of cluster and selects the bin yielding the tightest lower bound from the set B ∗ residuals or from the set B of standard bins (if the items of the cluster can not fit into any residual) or selects a standard bin with a larger size when the first two alternatives fail. Evidently, the first alternative is best since it decreases the level of useful residuals. This algorithm does not guarantee the packing of all the items of the critical zone since it uses a lower bound. Subproblem 3. Having determined the most suitable bins, ABS starts filing them with items from the critical zone. The agent-based interaction used to construct cutting patterns is inspired from [5,8]. Initially, ABS creates a set
232
S. Polyakovskiy and R. M’Hallah
A ⊆ B (|A| = b) of B-agents. In turn, each B-agent creates a group of I-agents Gs = ∅, and determines a list of vacancies Vs (free space available in the bin). Having been initialized, s ∈ A undertakes group-formation actions directed toward inviting a subset I′ ⊆ I of free I-agents to join Gs. It considers the first element of Vs and identifies the list of available I-agents that can fit into it. Then, it sorts the list in decreasing reward order and successively invites I-agents to join Gs by sending them attachment offers. Each offer contains the value ΔWETs,i, the increment in the WET of Gs when I-agent i joins it. The attachment of any I-agent to Gs increases the packing coefficient of Gs and improves its fitness.

When receiving an offer from B-agent s, I-agent i processes it and decides whether to accept or reject it. To reach its decision, i analyzes the offer and considers the potential offers that may emanate from other B-agents in A\{s}. To determine these alternatives, i undertakes group-joining actions: i tries to join the group s′ for which it minimizes ΔWETs′,i. It sends each potential B-agent s′ ∈ A\{s} a request asking about the possibility of receiving i. Upon receipt of the request of i, B-agent s′ checks if any vacancy in Vs′ can fit i. If such a vacancy is available, s′ sends i the value ΔWETs′,i; otherwise, it rejects the request. Agent i analyzes this information, computes its profit for every s′ ∈ A\{s}, and reaches a decision. If its profit when joining Gs is highest (i.e., the value ΔWETs,i is minimal among all alternatives), i accepts the invitation of s; else, it rejects it. Upon reaching a decision, i relays it to s.

The action of B-agent s depends on the response of i. If i has accepted the invitation, s attaches i to Gs = Gs ∪ {i}, positions i in the bin (using a packing algorithm), and updates its list of vacancies Vs by removing the used vacancy and adding any newly created ones. Consequently, i changes its state to “busy”, and s directs control in a cyclic manner to the next B-agent in A. If i rejects the invitation, s examines the next I-agent in its list. It proceeds until one of the agents accepts its invitation or the list of potential agents is exhausted. The above process is reiterated until no B-agent can extend an invitation or receive a positive response from I-agents. At this point, the cutting patterns are filled with items. This algorithm does not guarantee the packing of all items since the chosen bins may not fit them all. This is not problematic since any unpacked item will eventually be packed in a subsequent iteration of ABS.

Subproblem 4. ABS assigns the b prepared patterns to the q available cutting machines for processing (b ≤ q). First, it computes a cost matrix V (q × q), where each element represents the WET cost of assigning pattern s ∈ A to machine k ∈ Q. Specifically, ABS asks all available M-agents to compute the patterns' processing times. It uses ϕk, k ∈ Q, and the availabilities of the machines to determine ΔWETk,s, the increment in the WET of all items of the cutting patterns already assigned to k if s is assigned to k. Then, ABS examines all unit time intervals in [min{di : i ∈ Gs}, max{di : i ∈ Gs}] to select cs, the machine-independent completion time of pattern s that minimizes WET(Gs). Next, it sends cs along with the characteristics of s to each k ∈ Q. When receiving this data, M-agent k computes the processing time of s and virtually schedules s, which
may result in shifting the processing intervals of all or some of the patterns that precede s and/or removing overlaps. If backward shifting moves the starting time of the machine beyond the zero point, k sets its starting time to zero and successively shifts all processing intervals forward until no time windows of any patterns overlap. Since future demand is unknown, this algorithm favors using any idle time on the machine before increasing the machine's makespan. Using V, ABS finds an optimal assignment via the Hungarian algorithm [9].

Subproblem 5. ABS resolves any two patterns' conflicting processing time intervals by shifting the time intervals of the M-agent as in Subproblem 4.

Processing of unexpected events. ABS should be fast and robust in response to unexpected events such as machine breakdowns and the arrival of rush orders. Dealing with a machine breakdown is straightforward; thus, the following discussion focuses on rush orders. When arriving during H1, urgent items require immediate attention. ABS applies a fast algorithm to include them in the schedule. It inserts them into a set I′ of I-agents, pre-sorts I′ in non-increasing order of the items' sizes, and directs control to the first I-agent i. Subsequently, i undertakes group-addition actions: it tries to join the B-agent's group Gs that minimizes ΔWETs,i. Specifically, i considers the set of B-agents Ak already assigned to machine k ∈ Q, and retains the subset A′ ⊆ A1 ∪ . . . ∪ Aq corresponding to those B-agents whose processing time intervals cover its due date di. It successively inquires of each of them regarding a possible vacancy that can fit i and the resulting ΔWETs,i, s ∈ A′. Finally, it chooses the B-agent s with minimal ΔWETs,i. In turn, s packs i in the bin, removes the related vacancy, and adds new ones. Consequently, i changes its state to “busy” and directs control to the next I-agent in I′. If no B-agent can fit i into its bin, i asks ABS to create a new B-agent s with minimal size and packs itself into it. Subsequently, ABS assigns s to the machine k inducing the minimal ΔWETk,s. The packing process stops when all urgent items reach the “busy” state.
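As a concrete illustration of the assignment step in Subproblem 4, the sketch below builds a small ΔWET cost matrix and solves the pattern-to-machine assignment with SciPy's linear_sum_assignment, which computes the same optimal assignment as the Hungarian algorithm; the matrix values and names are illustrative only, not taken from the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Illustrative only: V[s][k] = increase in WET if pattern s is assigned to machine k.
# (With b < q patterns the matrix can be padded with zero-cost dummy rows to make it
# square, or passed rectangular as below.)
V = np.array([[12.0, 30.0],
              [25.0,  8.0]])

rows, cols = linear_sum_assignment(V)   # optimal assignment, Hungarian-equivalent
for s, k in zip(rows, cols):
    print(f"pattern {s} -> machine {k}, dWET = {V[s, k]}")
print("total dWET:", V[rows, cols].sum())
```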
4 Computational Results
ABS, whose B-agents apply the packing heuristics of [5,10], is applied to the guillotine two-dimensional VSBP problem that occurs in a real furniture manufacturer during the preliminary saw cutting of rectangular pieces from rectangular sheets of plywood and chip boards using CNC beam saws. The real data included the items' sizes, the dates they were ordered, their due dates according to the operational production plan, and their unit weights. It comprised ten instances, where each instance involved around 7250 items, a 12-day planning horizon (i.e., a 6480-minute time horizon), and two cutting centers with different speeds and properties. The approximate processing time functions are ϕk(tk-inst, ls, vk, ns-rot, tk-rot, tk-carry) = tk-inst + (ls/vk) + ns-rot × (tk-rot + tk-carry), where tk-inst is the setup time of a sheet on machine k, ls the cutting length of pattern s, vk the cutting speed of k, ns-rot the number of sheet rotations needed to cut s, tk-rot the rotation time on k, and tk-carry the
time to transfer a sheet to the cutting line. The base time unit is one minute. The smallest size of a useful residual bin was set according to the manufacturer's specifications. The company had 5 standard types of plywood sheets. The considered instances had several unexpected events: breakdowns, orders with short lead times, and defective items needing urgent re-cutting. Their occurrence mimics that observed in real life.

Table 1 displays the average, over all instances, of the two main (material utilization and WET) and four secondary (residual level, number of items produced in time, total number of cutting patterns, and total cutting time) performance criteria. It shows ABS' results as well as those of two heuristics: “Cutting FCFS”, which assigns the patterns to the machine with the earliest idle time, and “Cutting on demand”, which assigns them to the earliest free machine only when an item is due for the next manufacturing stage. Both heuristics use the same packing heuristic as ABS. They bound ABS' solution values since they correspond to the cases of cutting the items too early or too late. Most importantly, they are the heuristics applied by the company.

Table 1. Performance of ABS

                                        ABS (2t, b, r)                        Cutting
Criteria                     (80, 0.5, 6)  (30, 1, 14)  (3, 0.15, 1)    FCFS    On-demand
Packing coefficient (%)              82           78            66        69           69
WET (minutes)                    109008        32220          3486    156856        38842
Useful residuals (pieces)           118          125           166       149          149
On-time production (pieces)         496          624          1432        46           35
Number of cutting patterns         1056         1247          1802      1406         1406
Cutting time (minutes)             3744         4067          4863      4227         4227
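The processing-time formula above maps directly to a small helper; a minimal sketch follows (parameter names are ours, and the sample values are illustrative rather than taken from the company's data).

```python
def processing_time(t_inst, l_s, v_k, n_rot, t_rot, t_carry):
    """phi_k: setup time + cutting time (length over speed) + rotation and handling
    time, all in minutes, following the formula stated above."""
    return t_inst + (l_s / v_k) + n_rot * (t_rot + t_carry)

# Illustrative values only.
print(processing_time(t_inst=2.0, l_s=30.0, v_k=60.0, n_rot=3, t_rot=0.5, t_carry=0.4))  # 5.2
```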
ABS improves material utilization by 14% and WET by 21%, while it decreases the number of useful residuals by 19%, the number of cutting patterns by 13%, and the cutting time by 13%, in addition to producing a much larger number of items on time. ABS' good performance can be amplified with a judicious choice of the control parameters. For the tested instances, the best material utilization is obtained for a critical zone size 2t = 80 minutes with λ = 0.99, while the best WET is obtained for 2t = 3 minutes. To further study the effect of the critical time zone on the two main performance criteria, we consider the data for one day and one cutting center (t = 540 minutes and q = 1), with αi and βi randomly generated from the discrete Uniform[1,5]. Figure 2.a shows that the average packing coefficient is not sensitive to n, but is best for reasonably sized critical zones. Increasing the size of the critical zone makes ABS accumulate items and gives more alternatives for a better utilization of the bins. On the other hand, having a very large critical zone makes ABS use only large-sized bins and ignore other bins and useful residuals. This decreases the packing coefficient for very large critical zones. WET is sensitive to both 2t and n, as illustrated in Figure 2.b. For n = 500, setting 2t = 75 minutes yields the best results. In fact, when n is large and
[Figure 2 contains two panels plotting, against the critical zone's size (minutes), (a) the packing coefficient (%) and (b) WET, each for n = 300 and n = 750.]
Fig. 2. Effect of the critical time zone on the packing coefficient and WET
2t is small, WET is large because the machine is running continuously without interruption, while the largest share of the cutting time is spent on installing the sheets on the machine. Therefore, a small critical zone yields a large number of patterns which, in turn, directly increases the total cutting time and WET.
5 Conclusion
This paper presents an artificial intelligence paradigm based on multiple agents to solve complex, dynamic, multi-objective, real-life manufacturing problems. This paradigm is applied to the JIT online bin packing and cutting problem using parallel machines that are subject to breakdown. Its application to real-life instances highlights the sizeable savings it induces, and illustrates its flexibility.
References

1. BenBassat, M.: Production Chain Optimization: Basic Principles, http://www.plataine.com/mbb0806
2. Mes, M., Heijden, M., Harten, A.: Comparison of agent-based scheduling to look-ahead heuristics for real-time transportation problems. European Journal of Operational Research 181(12), 59–75 (2007)
3. Parunak, H.V.D.: Agents in Overalls: Experiences and Issues in the Development and Deployment of Industrial Agent-Based Systems. International Journal of Cooperative Information Systems 9(3), 209–228 (2000)
4. Wooldridge, M., Jennings, N.R.: Agent theories, architectures, and languages: A survey. In: Proceedings of the ECAI 1994 Workshop on Agent Theories, Architectures and Languages, pp. 1–39. Springer, Heidelberg (1995)
5. Polyakovsky, S., M'Hallah, R.: An agent-based approach to the two-dimensional guillotine bin packing problem. European Journal of Operational Research 192, 767–781 (2009)
6. Yager, R., Filev, D.: Essentials of Fuzzy Modeling and Control. John Wiley, New York (1984)
7. Fekete, S., Schepers, J.: A general framework for bounds for higher-dimensional orthogonal packing problems. Mathematical Methods of Operations Research 60, 311–329 (2004)
8. Polyakovsky, S.J., M'Hallah, R.: An agent-based approach to knapsack optimization problems. In: Okuno, H.G., Ali, M. (eds.) New Trends in Applied Artificial Intelligence, IEA/AIE 2007. LNCS (LNAI), vol. 4570, pp. 1098–1107. Springer, Heidelberg (2007)
9. Burkard, R., Dell'Amico, M., Martello, S.: Assignment Problems. SIAM, Philadelphia (2009)
10. Polyakovsky, S.J., Valeeva, A.F., M'Hallah, R.: An Improved Heuristic for the Variable Sized Guillotine Bin Packing Problem. In: Proc. of the 9th Workshop on Computer Science and Information Technologies, Ufa, Russia, pp. 34–37 (2007)
A Greedy Heuristic for Airline Crew Rostering: Unique Challenges in a Large Airline in China

Qiao Chen1, Andrew Lim1,2, and Wenbin Zhu3,⋆

1 School of Computer Science and Engineering, South China University of Technology, Guangzhou, Guangdong, P.R. China
[email protected]
2 Department of Management Sciences, College of Business, City University of Hong Kong, Tat Chee Ave, Kowloon Tong, Hong Kong S.A.R.
[email protected]
3 Department of Computer Science and Engineering, School of Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong S.A.R.
[email protected]

⋆ Corresponding author.

Abstract. We study a unique challenge in the crew rostering problem faced by a large airline in China, where fairness for crewmembers is the main concern. The fairness aspect of rosters produced by a commercial optimizer is not satisfactory. We discuss the challenges imposed by the fairness constraint and devise a greedy heuristic to directly address these challenges. The resulting algorithm is not only much faster but also produces a more balanced plan based on actual data from the airline in question.

Keywords: crew rostering, fairness constraint, greedy heuristic.
1 Introduction
Manpower costs account for a large portion of expenses for many airlines. The existing literature often considers cost efficiency as the main objective, i.e., either maximize the number of fulfilled activities given a fixed crew or minimize the crew required to cover a set of activities [1–7]. Other factors such as fairness are often secondary considerations. Fairness in terms of workload, however, affects the morale of the crew, which in turn affects work efficiency. In particular, when firing excess crew is not an option (as in most airlines in China) or is too costly, minimizing required manpower is less important. Instead, fairness becomes a primary objective. Existing commercial rostering systems are designed for airlines in developed countries, where the availability of manpower is often the bottleneck. Consequently, they tend to optimize in terms of minimizing manpower and often sacrifice absolute fairness. The commercial rostering system used by a large airline
in China, for example, often takes days to create a rostering plan, and the fairness of the resulting workload allocation is frequently unsatisfactory.

The task of crew rostering involves assigning activities (such as flight legs) to crewmembers so that the manpower requirements of all activities are fulfilled and various constraints are satisfied. This problem is extremely complex due to the large number of safety constraints imposed by government regulators, rules imposed by the airline itself, and various contract terms. Due to this complexity, the problem is often solved in two stages: pairing and rostering. In the pairing stage, flight legs are formed into anonymous pairings (sequences of work assignments, including embedded breaks, that start from a given crew base and end at the same base). In the rostering stage, monthly rosters are created for each crewmember [8, 9].

The standard approaches to the rostering problem in most of the literature and in commercial rostering systems are based on the "generate-and-optimize" principle [8]. First, a large set of feasible rosters for individual crewmembers is generated, typically a few million. The rostering problem is then formulated as a set cover problem, i.e., select one roster for each crewmember while satisfying the manpower demand constraints of all anonymous pairings [10]. Such models are flexible enough to be extended to take fairness into account. One strategy is to compute the expected average workload for each crewmember and penalize any deviation from the average in the objective function. Another strategy is to set upper and lower limits on the workload of each crewmember. However, neither strategy is very effective.

In this study, we propose a much simpler and more direct approach, which is to construct rosters greedily, with fairness treated as a primary objective that is explicitly considered during construction. It turns out that such an approach is not only much faster (creating feasible plans in minutes), but also produces rosters that are much more balanced in terms of workload. We first present the simplified problem in Section 2, so that we can focus our discussion on the main challenges. In Section 3, we outline our heuristic approach. In Section 4, we present the computational results obtained by our approach on real data from a large airline in China. We conclude this paper by pointing out future research directions in Section 5.
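For concreteness, the generate-and-optimize model sketched above can be written as a small set-cover-style program. The snippet below is a generic illustration using the PuLP modelling library (an assumption of ours, not the commercial system's actual model), with a deviation-from-average fairness penalty of the first kind mentioned above; all data values are made up.

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

rosters = {                      # candidate rosters: (fly_hours, pairings covered)
    "c1": [(40, {"p1"}), (55, {"p1", "p2"})],
    "c2": [(45, {"p1"}), (50, {"p2"})],
}
demand = {"p1": 1, "p2": 1}      # crewmembers required per anonymous pairing
avg = 47.5                       # expected average workload, computed beforehand

prob = LpProblem("rostering", LpMinimize)
x = {(c, i): LpVariable(f"x_{c}_{i}", cat=LpBinary)
     for c, cand in rosters.items() for i in range(len(cand))}
dev = {c: LpVariable(f"dev_{c}", lowBound=0) for c in rosters}

prob += lpSum(dev.values())                                   # fairness objective
for c, cand in rosters.items():
    prob += lpSum(x[c, i] for i in range(len(cand))) == 1     # one roster per crewmember
    fly = lpSum(cand[i][0] * x[c, i] for i in range(len(cand)))
    prob += dev[c] >= fly - avg                               # |fly - avg| linearized
    prob += dev[c] >= avg - fly
for p, need in demand.items():                                # pairing demand coverage
    prob += lpSum(x[c, i] for c, cand in rosters.items()
                  for i in range(len(cand)) if p in cand[i][1]) >= need

prob.solve()
print({k: v.value() for k, v in x.items()})
```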
2 Problem Description
Rostering for cockpit crew (such as the captain and the first officer) and cabin crew (such as cabin attendants and security guards) are separate problems that can be solved independently since cockpit crew and cabin crew are not interchangeable. The solution processes for cockpit crew rostering and cabin crew rostering are similar. Hence, we focus our discussion primarily on the cabin crew rostering problem. Given a set of activities each with a manpower requirement, our task is to assign activities to crewmembers so as to create a roster for each crewmember. The problem studied in this paper is closer to the personalized rostering
A Greedy Heuristic for Airline Crew Rostering
239
practice in European airlines than the “bidlines” practice in American airlines, i.e., individual rosters are constructed based on fairness of workload allocation while taking crew members' preferences into account. All rosters constructed must satisfy various constraints imposed by government regulators, the airline company, and contract terms between the airline and its employees. Two of the main objectives are: 1) ensure the equitability of rosters for crewmembers of the same rank; 2) minimize the penalty cost of unassigned activities.

The unique situation faced by the airline in China is quite different from that of most airlines in developed countries. A few years ago, rosters were created manually, which were often far from optimal in terms of minimizing crew size. Since most airlines in China are state-owned, firing excess crew is usually the last option. As a result, manpower availability is often not the bottleneck when creating a feasible roster. Instead, fairness among crewmembers of the same rank is the main concern. There are two measures of fairness. Firstly, the number of overnight duties for crewmembers of the same rank over a given period (normally a quarter) should be as close as possible. The vast majority of overnight duties are associated with international flights, which are greatly preferred by crewmembers since they receive significant compensation and benefits. Secondly, the total fly-hours for crewmembers of the same rank over a given period should be as close as possible.

The input of the rostering problem includes the attributes of various activities, information about available crewmembers, rules that must be obeyed, and the objective functions. Although rostering problems for every airline are different, their setups are largely similar. Hence, we only describe the input that is necessary to illustrate the unique challenge we are facing. Interested readers can refer to [8] for other considerations related to crew rostering. Typical activities include pairings, reserves, ground duties and pre-assigned Day-Offs. We only focus on pairings; other activities can be similarly handled. A pairing consists of a sequence of flight legs starting and ending at the same base. Every flight leg requires a certain number of crewmembers of different ranks, e.g., 3 first class attendants and 10 attendants. Every leg may have qualification requirements, e.g., an international flight leg may require the crewmember to own a passport with a valid visa to the destination country. For each pairing, we are also given its start time, end time, number of flight legs, number of overnight duties, total fly-hours, etc. For each crewmember, the personal record includes qualifications held, free time slots, pre-assigned pairings, vacations, etc. The qualifications include information such as language proficiency or a list of eligible destinations. The available time slots determine which time slots in the roster can be filled by the crewmember. The pre-assigned pairings, vacations, and other information are needed to compute the cumulative fairness measure that needs to be equalized.

Constraints can be broadly classified into two categories:
– Horizontal constraints: constraints that are imposed on the roster of a crewmember and whose validity can be verified using only information pertaining to that roster and crewmember. Typical examples include:
240
Q. Chen, A. Lim, and W. Zhu
  • (Person, task, time)-compatibility constraints. For example, a flight leg in a pairing may require a crewmember holding a valid passport and visa to Japan; two activities with overlapping times cannot be assigned to the same crewmember.
  • Rest Constraints: there must be 48 consecutive hours of rest within any 7 consecutive calendar days.
– Vertical constraints: constraints that are imposed on a set of related rosters. For example:
  • Down-Rank Constraints: certain high-ranking crewmembers can perform tasks that only require a lower-ranked crewmember. However, there may be an upper limit on the total number of down-ranked crewmembers associated with an activity.
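A minimal sketch of how the inputs and counters just described might be represented; all class and field names are ours, and only the attributes mentioned in the text are modelled.

```python
from dataclasses import dataclass, field

@dataclass
class Pairing:
    id: str
    start: int                      # minutes from the start of the scheduling period
    end: int
    legs: int
    overnights: int
    fly_hours: float
    demand: dict                    # rank -> number of crewmembers required
    required_qualifications: set = field(default_factory=set)

@dataclass
class Crewmember:
    id: str
    rank: str
    qualifications: set             # e.g. language proficiencies, eligible destinations
    free_slots: list                # (start, end) availability windows
    preassigned: list               # pairings, vacations, ground duties
    overnights: int = 0             # cumulative fairness counters
    fly_hours: float = 0.0
```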
3 A Greedy Algorithm

3.1 Basic Idea
To simplify the presentation, we only discuss how pairings are assigned. Other activities such as ground duties and reserves can be handled similarly. If down-ranking is not allowed, the rostering problem can be divided into separate problems, one for each rank. The rostering problem for one rank can be solved independently of other ranks. According to historical data, down-ranking is relatively infrequent. Hence, we decided to solve the rostering problem for each rank independently and handle down-ranking during post-processing. The vast majority of overnight duties (or overnights for short) are from pairings containing international flight legs. The number of such pairings is much smaller than the number of pairings consisting of domestic flight legs only. The data also shows that the number of international flight job opportunities within one scheduling period is usually smaller than the number of crewmembers. Hence, fairness of overnight duties becomes the bottleneck, which is also confirmed by the experience of the human scheduler of the airline in question. Therefore, we try to allocate such pairings first. Since the equitable allocation of overnight duties is measured over a period longer than the scheduling period, we also need to take the historical allocation into account. Pairings are sorted by the number of overnights in descending order and are allocated one by one to the crew with the smallest accumulated overnights. To ensure the equitable distribution of fly-hours among crewmembers, a similar greedy strategy can be used, i.e., crewmembers with less fly-time have a higher priority when assigning pairings with more fly-hours. Activities with long durations are more difficult to assign since they require a large contiguous period of free time, whereas activities with short durations can be easily inserted into existing rosters. Thus, we sort pairings in descending order of duration. For each pairing in this order, we try to find enough crewmembers to fulfill its manpower requirements. Crewmembers are selected in ascending order of their accumulated fly-time. A crewmember can be selected only if it
has all the qualifications required by the flight legs in the pairing, and the time slot on its roster is free for the entire duration of the pairing. All other relevant constraints must also be satisfied. Though fairness in terms of the total fly-time of crewmembers is the primary objective, we cannot totally ignore the efficiency (in terms of maximizing the number of fulfilled activities) of the rostering. This is because fairness and efficiency are often competing objectives, i.e., good fairness usually results in poor efficiency and vice versa. Different crewmembers often hold different qualifications, which affects the flexibility of task assignments. Crewmembers that are proficient in more languages, for example, are more flexible. Crewmembers whose rosters have large contiguous periods of free time slots are also more flexible. Intuitively, when multiple crewmembers (with similar accumulated fly-time) are available for a particular activity, we would prefer to assign the activity to the more inflexible crewmembers first. Such a heuristic rule can be easily implemented as a ranking function. The ranking function uses the weighted sum of accumulated fly-hours and the number of activities that can be assigned to the crewmember.
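A compressed sketch of the greedy strategy of this subsection: pairings are ordered so that overnight-heavy (and then longer) pairings come first, and each is handed to the crewmembers with the smallest accumulated counters. The two passes described above are collapsed into one ordering for brevity, and the feasibility test is reduced to a stub; all names are ours.

```python
import heapq

def allocate(pairings, crew):
    # Overnight-heavy pairings first, then pairings with more fly-hours.
    pairings = sorted(pairings, key=lambda p: (p["overnights"], p["fly_hours"]), reverse=True)
    # Min-heap keyed by (accumulated overnights, accumulated fly-hours).
    heap = [(c["overnights"], c["fly_hours"], c["id"]) for c in crew]
    heapq.heapify(heap)
    by_id = {c["id"]: c for c in crew}
    roster = {c["id"]: [] for c in crew}
    for p in pairings:
        chosen, skipped = [], []
        while heap and len(chosen) < p["demand"]:
            on, fh, cid = heapq.heappop(heap)
            (chosen if is_compatible(by_id[cid], p) else skipped.append((on, fh, cid)) or chosen)[:0] = []
            if is_compatible(by_id[cid], p):
                chosen.append(cid)
        for cid in chosen:                        # commit and update fairness counters
            c = by_id[cid]
            c["overnights"] += p["overnights"]
            c["fly_hours"] += p["fly_hours"]
            roster[cid].append(p["id"])
            heapq.heappush(heap, (c["overnights"], c["fly_hours"], cid))
        for item in skipped:                      # return incompatible crew to the pool
            heapq.heappush(heap, item)
    return roster

def is_compatible(crewmember, pairing):           # stub for the checks of Section 3.2
    return True

crew = [{"id": "c1", "overnights": 1, "fly_hours": 300.0},
        {"id": "c2", "overnights": 0, "fly_hours": 310.0}]
pairings = [{"id": "p1", "overnights": 1, "fly_hours": 20.0, "demand": 1},
            {"id": "p2", "overnights": 0, "fly_hours": 15.0, "demand": 1}]
print(allocate(pairings, crew))   # p1 -> c2 (fewer overnights), p2 -> c1 (tie, fewer fly-hours)
```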
3.2 Implementation Considerations
The main challenge is to handle the large number of constraints correctly and efficiently. We highlight a few strategies we used to achieve efficiency in this section. In our algorithm, constraints are evaluated incrementally. Each time an activity is assigned to some crewmember, we only need to check the validity of the constraints associated with that activity and crewmember. Recall that constraints are broadly classified into horizontal constraints (HC), whose validity can be checked using only information associated with a roster; and vertical constraints (VC), whose validity can only be verified with information across multiple rosters. Typically there are many more horizontal constraints than vertical constraints.

Horizontal constraints can be further classified into four types: H1) compatibility between persons and activities, which can be pre-computed and remains static throughout the algorithm; H2) constraints that require information about the activity that immediately precedes or succeeds the current activity; H3) constraints that require cumulative statistics of the current crewmember; and H4) constraints that require information pertaining to multiple activities that precede or succeed the current activity.

Consider an example of H1: a pairing consisting of an international flight leg that requires proficiency in the Thai language. Only crewmembers with this qualification are compatible with the pairing. Another example of H1 would be a crewmember who has a pre-assigned activity (medical checkup, annual leave) that clashes with a pairing, whereupon he/she is not compatible with the pairing. Such constraints only need to be evaluated once at the beginning of the algorithm. The result is stored in a compatibility table between crewmembers and
activities. Whenever a pairing is to be assigned to a crewmember, the compatibility table is consulted and we can determine the violation of such constraints in constant time.

An example of H2 is the constraint that all crewmembers must have at least 2 hours of rest time between two consecutive fly duties. Such constraints can be handled efficiently by building a compatibility table between pairs of activities. For example, pairings A and B are compatible if we can assign B after A to the same crewmember without violating any of the H2 constraints. With this compatibility table, before we assign the current activity to a crewmember, we can quickly decide whether the resulting roster is feasible by checking the compatibility of the current activity and the activities immediately preceding and succeeding it. If all activities assigned to a crewmember are stored in ascending order of starting time in a balanced binary search tree structure (such as an AVL or red-black tree), the activities immediately preceding and succeeding the current activity can be retrieved in O(log n) time, where n is the number of pairings assigned. When the scheduling period is short (such as 30 days), the number of activities assigned to each crewmember is relatively small. A sorted array will achieve a similar time efficiency.

The constraint “the maximum rest time of a crewmember must not exceed a specified upper bound over the scheduling period” is an example of H3. We only need to introduce a counter for each of the accumulated measures for every crewmember. The validity of such constraints can then be verified in constant time, and the accumulated measures can be updated in constant time when a new activity is assigned.

Finally, an example of H4 would be the constraint requiring all crewmembers to have 48 hours of consecutive rest for any consecutive 7 calendar days. Before assigning the current activity to a crewmember, we need to scan all activities that end fewer than seven days before the start of the current activity, and also activities that start fewer than seven days after the end of the current activity. If no activity scanned causes a violation of this constraint, we can assign the current activity. Such constraints require, in the worst case, linear time to scan all activities already assigned.

Vertical constraints can be broadly classified into two categories: V1) constraints enforced at the pairing or flight-leg level; and V2) constraints enforced at the global level. As an example of a V1 constraint, the total number of down-ranked crewmembers should not exceed two for any pairing. This constraint can be checked efficiently by maintaining a counter for each pairing to keep track of the number of down-ranked crewmembers. In general, every such constraint requires a data structure associated with each involved pairing. An example of V2 is the constraint that crewmembers A and B cannot fly together due to personality conflicts. We handle such constraints by introducing a global data structure for each such constraint.

Constraints are checked in order of increasing complexity, i.e., H1, H2, H3, H4, V1, V2. This ordering allows a quick rejection of infeasible partial rosters and avoids unnecessary time spent to check complex constraints.
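As an illustration of the H2 handling above, the sketch below keeps a roster as a sorted list of (start, end) activities and uses binary search to fetch the immediate neighbours of a candidate activity; the 120-minute value mirrors the 2-hour rest example, and all names are ours.

```python
import bisect

MIN_REST = 120  # minutes, mirroring the 2-hour rest example above

def can_insert(roster, start, end):
    """True if (start, end) keeps at least MIN_REST before and after its
    immediate neighbours (which also rules out overlaps)."""
    i = bisect.bisect_left(roster, (start, end))
    if i > 0:
        _, prev_end = roster[i - 1]                 # preceding activity
        if start - prev_end < MIN_REST:
            return False
    if i < len(roster):
        next_start, _ = roster[i]                   # succeeding activity
        if next_start - end < MIN_REST:
            return False
    return True

def insert(roster, start, end):
    bisect.insort(roster, (start, end))

roster = []
insert(roster, 480, 720)                            # 08:00-12:00 duty
print(can_insert(roster, 900, 1000))                # True: 3h rest after the previous duty
print(can_insert(roster, 780, 900))                 # False: only 1h rest
```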
3.3 Handling Complications in Practice
Many rules are subject to change. Fortunately, most of the changes concern the parameters of the rules rather than the rules themselves. Consider the constraint "the maximum fly-hours for first class attendants over one week is 48 hours." The parameters such as "48 hours" or "one week" may be changed to "96 hours" and "two weeks." Such changes do not affect the logic handling the constraint since all parameters are loaded from configuration files or a database.

New rules may be introduced. The general strategy to minimize the impact of such changes on the overall system is to collate constraint validation into separate modules, i.e., one module each for H1, H2, H3, H4, V1 and V2 constraints. In practice, most of the newly introduced rules are of types H1, H2 or H3, which can be handled with very little modification of the related module. A new H4, V1 or V2 rule will, however, mandate additional data structures to be introduced for crewmembers, pairings or globally.

Constraints that impose lower bounds cannot be checked incrementally, e.g., a constraint that requires a minimum number of fly-hours over a period. Since rosters are constructed incrementally, it is necessary to allow partial rosters that are infeasible with respect to such constraints and defer their validation until the roster is complete. Usually, such constraints are easy to satisfy. However, in cases where such constraints are tight, the strategy is to use the difference between the lower bound and the actual allocation in the partial roster as a component of the ranking function when selecting crewmembers, so that the ranking function will prioritize crewmembers that are far from fulfilling the constraint.

If the manpower requirement of certain activities is hard to satisfy, the scheduler can prioritize such activities. In general, instead of sorting activities in descending order of duration, we sort activities in descending order of weight. The default weight of an activity is its duration, but the weight can be changed by the scheduler manually.
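A sketch of a ranking function of the kind described above, combining accumulated fly-hours, a flexibility count, and the shortfall against a lower-bound constraint; the weights and field names are illustrative assumptions, not the airline's values.

```python
def rank_score(crewmember, w_fly=1.0, w_flex=0.2, w_short=0.5):
    """Lower score = selected earlier.  Prefers crew with fewer accumulated
    fly-hours, fewer remaining assignment options, and a larger shortfall
    against their minimum fly-hour requirement."""
    fly = crewmember["fly_hours"]
    flexibility = crewmember["assignable_activities"]
    shortfall = max(0, crewmember["min_fly_hours"] - fly)
    return w_fly * fly + w_flex * flexibility - w_short * shortfall

candidates = [
    {"id": "A", "fly_hours": 40, "assignable_activities": 25, "min_fly_hours": 50},
    {"id": "B", "fly_hours": 38, "assignable_activities": 60, "min_fly_hours": 50},
]
print(sorted(candidates, key=rank_score)[0]["id"])  # 'A': less flexible, so it is used first
```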
4 Numerical Results
We tested our greedy heuristic on actual operational data obtained from a large airline in China. This particular data set contains about one thousand nine hundred crewmembers and one year of activities. The airline generates rosters every calendar month and evaluates fairness over a period of two months. Hence, we need to load the pre-assigned activities from the previous month before producing the roster for the current month. A total of 11 tests can be conducted. It takes less than 20 seconds for our algorithm to generate rosters for one calendar month. A single execution of the greedy algorithm is able to assign 99.33% of the activities. Furthermore, the human scheduler has testified that the fairness of the produced roster is much more satisfactory than that of the rosters produced by the commercial system employed by the airline. Figure 1 illustrates the fairness of workload allocation among crewmembers in one scheduling period. The vertical axis is the fly-time (the axis unit is 1000
Fig. 1. Fly-Time of Crewmembers by Rank
minutes). The horizontal axis represents each individual crewmember, and every dot represents the fly-time of a crewmember. Crewmembers with the same rank form an approximately horizontal line, which indicates that the workload is evenly distributed. Take the attendants (AT) for example. The dots form a horizontal line with an average fly-time of about 4000 minutes in the scheduling period. Similar results are achieved for head attendants (HA), security guards (SG), and first class attendants (FAT). However, the balance for business class attendants (BAT) and deputy attendants (DA) is not as good. For BAT, the imbalance is mainly due to the small number of crewmembers; and for DA, it is mainly due to pre-assigned duties. Despite the imbalance, the result is still much better than that of the existing rosters according to the scheduler. Table 1 gives the statistics of fly-time by rank. Column Count gives the number of crewmembers of that rank; column min is the minimum fly-time (in minutes); column max is the maximum fly-time; column avg is the average fly-time; and column StdDev is the standard deviation of the fly-time for that rank.

Table 1. Statistics of Flying Time by Rank
       Count    min      max     avg   StdDev
DA       165    752     2059    1135   311.06
HA       356   3200     3409    3245    45.68
AT       673   3759     4506    3995    54.20
SG       386   3400     3607    3510    30.53
FAT      305   2219     2448    2367    37.56
BAT       72   6777    10140    8596   591.98
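The per-rank statistics of Table 1 can be reproduced from the generated rosters with a few lines; the sketch below uses the population standard deviation, since the paper does not state which convention it follows, and the sample data are made up.

```python
from collections import defaultdict
from statistics import mean, pstdev

def stats_by_rank(fly_time):
    """fly_time maps crewmember id -> (rank, fly-time in minutes)."""
    groups = defaultdict(list)
    for rank, minutes in fly_time.values():
        groups[rank].append(minutes)
    return {rank: {"count": len(v), "min": min(v), "max": max(v),
                   "avg": round(mean(v)), "stddev": round(pstdev(v), 2)}
            for rank, v in groups.items()}

print(stats_by_rank({"c1": ("AT", 3990), "c2": ("AT", 4010), "c3": ("SG", 3500)}))
```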
5 Conclusion
In this study, we presented a heuristic algorithm that addresses a unique challenge faced by a large airline in China. The resulting heuristic produces a feasible roster very quickly (in seconds) that is more balanced in terms of workload allocation than those produced by existing commercial systems. It is possible to introduce a metaheuristic around the greedy heuristic, where different strategies can be used to adjust the weights given to the ranking functions. Such an extension may further improve the quality of the solutions generated.
References

1. Etschmaier, M.M., Mathaisel, D.F.X.: Airline Scheduling: An Overview. Transportation Science 19(2), 127–138 (1985)
2. Campbell, K.W., Durfee, R.B., Hines, G.S.: FedEx Generates Bid Lines Using Simulated Annealing. Interfaces 27(2), 1–16 (1997)
3. Jarrah, A.I.Z., Diamond, J.T.: The Problem of Generating Crew Bidlines. Interfaces 27(4), 49–64 (1997)
4. Anbil, R., Gelman, E., Patty, B., Tanga, R.: Recent Advances in Crew-Pairing Optimization at American Airlines. Interfaces 21(1), 62–74 (1991)
5. Barnhart, C., Johnson, E.L., Anbil, R., Hatay, L.: A column-generation technique for the long-haul crew-assignment problem, pp. 7–24. John Wiley & Sons, Inc., New York (1994)
6. Gershkoff, I.: Optimizing Flight Crew Schedules. Interfaces 19(4), 29–43 (1989)
7. Hoffman, K.L., Padberg, M.: Solving Airline Crew Scheduling Problems by Branch-and-Cut. Management Science 39(6), 657–682 (1993)
8. Kohl, N., Karisch, S.E.: Airline Crew Rostering: Problem Types, Modeling, and Optimization. Annals of Operations Research 127(1), 223–257 (2004)
9. Cappanera, P., Gallo, G.: A Multicommodity Flow Approach to the Crew Rostering Problem. Operations Research 52(4), 583–596 (2004)
10. Fahle, T., Junker, U., Karisch, S.E., Kohl, N., Sellmann, M., Vaaben, B.: Constraint Programming Based Column Generation for Crew Assignment. Journal of Heuristics 8(1), 59–81 (2002)
Optimal Algorithms for Two-Dimensional Box Placement Problems

Wenbin Zhu1, Wee-Chong Oon2,⋆, Yujian Weng3, and Andrew Lim2

1 Department of Computer Science, Hong Kong Univ. of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
[email protected]
2 Department of Management Sciences, City University of Hong Kong, Tat Chee Ave, Kowloon Tong, Hong Kong
{weecoon,lim.andrew}@cityu.edu.hk
3 Global R&D Center, Beijing Yahoo!, 22/F Building C, SP Tower, Tsinghua Science Park, Haidian District, Beijing, P.R. China 100084
[email protected]

⋆ Corresponding author.

Abstract. The two-dimensional box placement problem involves finding a position to place a rectangular box into a container given n rectangular boxes that have already been placed. It commonly arises as a subproblem in many algorithms for cutting stock and packing problems. We develop an asymptotically optimal approach for finding the bottom-leftmost feasible position, and modify it to find all normal feasible positions (which is also asymptotically optimal). Our approach relies on augmented versions of the segment tree data structure, and is simpler and more practicable than the best existing approach. Furthermore, it does not require that the placed boxes are interior-disjoint.

Keywords: rectangle placement; box placement; cutting and packing; VLSI layout; combinatorial optimization.
1 Introduction
Cutting and packing problems have many industrial applications, e.g., glass cutting and component layouts in VLSI circuit design. One of the fundamental tasks of many variants of cutting and packing problems is to place a set of rectangular pieces into a larger rectangular container such that the pieces do not overlap. The two-dimensional box placement problem (2DBPP), which involves finding a location to place a given piece into a container when n pieces have already been placed, arises as a subproblem to many algorithms that handle cutting and packing problems. Examples include constructive algorithms, i.e., pieces are placed one by one into the container based on heuristic rules; and local search based algorithms that iteratively improve solutions by swapping previously placed pieces with unplaced pieces. The efficiency of such algorithms is directly affected by the efficiency of the solution for the box placement subproblem.
In this paper, we first describe an approach to find the bottom-leftmost feasible position for the 2DBPP. We transform the problem into a problem of finding the depth of an arrangement of rectangles, and then employ a line sweeping algorithm with an augmented segment tree data structure that can perform certain critical operations efficiently. Our approach is asymptotically optimal, and also easier to implement than the best existing approach for the problem. Furthermore, unlike the best existing approach, it does not require the assumption that the placed boxes are interior-disjoint. We then show how it can be modified to locate all solutions corresponding to normal positions; this algorithm is also asymptotically optimal. Our approach can be generalized to solve box placement problems in d dimensions for d ≥ 3.
2 Problem Description
In the two-dimensional box placement problem (also called the rectangle placement problem), we are given a rectangular container that contains n orthogonally placed rectangular pieces. The task is to find a location to orthogonally place a given new piece into the container such that it does not overlap with any other pieces. The box placement problem can be generalized to d dimensions. In particular, a 1D box is a segment, while a 3D box is a cuboid. We assume the container is placed in the first quadrant of a Cartesian coordinate system such that it occupies the region R = [0, L] × [0, H], R ⊂ ℝ². We use integers 1, .., n to denote the n placed pieces, and integer 0 to denote the new piece. The placed pieces Ri, i = 1, .., n, are given as input. The interior of the i-th placed piece is represented by the set Ri = (xi, x̄i) × (yi, ȳi). The new piece to be placed is given by its dimensions on the x- and y-axes, denoted by l and h, respectively. The objective is to find a point (x0, y0) to place the new piece with its corner closest to the origin at that point such that C1) it does not overlap with any placed piece, i.e., R0 ∩ Ri = ∅, i = 1, .., n; and C2) it is inside the container, i.e., R0 ⊆ R. Note that the placed pieces are not required to be interior-disjoint. We call a point p ∈ ℝ² a feasible position if placing the new piece at point p = (x0, y0) satisfies both constraints C1 and C2.

For every layout of n pieces in a container, there is an equivalent layout where every piece is pushed towards the origin as much as possible, called a normal layout. For many cutting and packing applications, any layout is considered equivalent to its corresponding normal layout, and therefore it is sufficient to consider only normal positions when placing the new piece. Let S denote the set of all feasible positions. The box placement problem has four variants depending on the desired output:

P1. Decision Problem: Does there exist a feasible position? (i.e., is S ≠ ∅?)
P2. First Position: Find the bottom-leftmost feasible position.
P3. All Normal Positions: Find all normal feasible positions, SN ⊆ S.
P4. All Positions: Find all feasible positions S.
Healy et al. [3] proved that any algorithm for P2 where the n placed pieces are not necessarily in normal positions requires Ω(n log n) time by reducing the closeness problem to the 1D version of the box placement problem. The authors described a complex algorithm to achieve this bound that combines the segment tree data structure [2] and a height-balanced binary search tree; this algorithm requires that the placed pieces are interior-disjoint, and it is unclear how it can be extended to handle the P3 variant. In this paper, we develop practicable algorithms to solve the P2 and P3 variants, where the n placed pieces are not assumed to be disjoint or in a normal layout.
3 Transformation to Minimum Depth Problem
Observe that when the new piece is placed at p = (x0, y0), it will overlap with the i-th placed piece if and only if p lies within the region R′i = (xi − l, x̄i) × (yi − h, ȳi). Graphically, R′i can be viewed as the i-th placed piece enlarged towards the origin by l and h along the x- and y-axes, respectively. Furthermore, the new piece is inside the container if and only if p lies in the region R′ = [0, L − l] × [0, H − h]. The region R′ is the container reduced by l and h along the x- and y-axes, respectively. Hence, the set of all feasible positions to place the new piece is:

S = R′ − (R′1 ∪ R′2 ∪ · · · ∪ R′n)    (1)
By extending the placed pieces and reducing the container, the 2DBPP is transformed into a version of the minimum depth problem: given an arrangement of n axis-aligned rectangular boxes R′i within a rectangular region R′, the depth of a point is defined as the number of boxes that cover that point. Any point in the arrangement with a depth of 0 is a feasible solution to the 2DBPP. Our solution approach makes use of this transformation, which is linear in n.
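The transformation is mechanical; a minimal sketch follows, with pieces given as (x1, y1, x2, y2) corner coordinates, a representation we choose for the example.

```python
def transform(L, H, placed, l, h):
    """Extend each placed piece towards the origin by (l, h) and reduce the
    container to [0, L - l] x [0, H - h], as in Section 3."""
    reduced = (0.0, 0.0, L - l, H - h)                     # R'
    extended = [(x1 - l, y1 - h, x2, y2)                   # R'_i (open regions)
                for (x1, y1, x2, y2) in placed]
    return reduced, extended

reduced, extended = transform(L=10, H=6, placed=[(2, 1, 5, 4)], l=2, h=2)
print(reduced)    # (0.0, 0.0, 8, 4)
print(extended)   # [(0, -1, 5, 4)]
```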
4 The First Position Variant (P2)
We first describe an algorithm for the P2 variant of the 2DBPP based on the segment tree data structure; it is both conceptually simpler and more readily implementable in practice than the best existing approach by Healy et al. [3], while still retaining the optimal asymptotic running time of O(n log n). We then show how our algorithm can be modified to solve the P3 variant. Our solution to the P2 variant is as follows. First, we reduce the container R to obtain R′ and extend all placed pieces Ri to R′i. We then use the following line sweeping algorithm to solve the equivalent minimum depth problem. Imagine a vertical line sweeping from left to right along the x-axis. We use an interval I to represent all points on the line with y-coordinate y ∈ I. As the line sweeps, we record the depth for all points on the line in the query interval Iq = [0, H − h] in a data structure. When the line encounters the right edge of
some extended piece R′i, we decrease the depth of all points in [yi − h, ȳi] by 1; when the line encounters a left edge, we increase the depth of all points in [yi − h, ȳi] by 1. If there is some point y ∈ Iq on the line with depth 0 when the line is at some position x ∈ [0, L − l], then (x, y) is a feasible position to place the new item. Hence, we require a data structure that supports three operations: O1) for all points in a closed interval, increase/decrease the depth by 1; O2) query the minimum depth of all points in a given closed interval; and O3) find the first point in a given closed interval that attains the minimum depth.

The set of intervals involved is the left/right edges of the extended pieces, I = {(yi − h, ȳi), i = 1, .., n}, and the query interval Iq = [0, H − h]. In order to conveniently handle the boundary case at x = 0, we also instantiate a dummy piece with interior (−l − 1, 0) × (0, H − h) representing the left side of the reduced container. As a result, all points in Iq before x = 0 will have a depth value of at least 1. Let P = {p1, p2, ..., pm} be the set of all distinct end points of the intervals in I and Iq in ascending order. Any feasible normal position p must have a y-coordinate in P; therefore, it is sufficient to monitor the depths of only the points in P. Furthermore, we can replace all intervals in I with closed intervals with end points in P without affecting the minimum depth values for the points in P. We shall refer to these converted closed intervals as event intervals. Let E be the set of event intervals. The set of end points P divides the query interval Iq into m − 1 elementary intervals [p1, p2), [p2, p3), ..., [pm−1, pm].

We can augment the segment tree data structure [1, Section 10.3] to support the three operations. Since there are at most n intervals in I, there are at most 2n distinct endpoints in P. Therefore, the construction of the segment tree skeleton takes O(n) time and the height of the tree is O(log n). For each node η in the tree, we maintain an attribute η.interval, which contains the endpoints of the interval represented by the node; and an attribute η.c that maintains a count of the extended pieces that cover all points in η.interval. We denote the left and right subtrees of η by η.left and η.right, respectively. When an event interval I ∈ E is encountered by the sweeping line, we update the attributes of the relevant nodes as given in Algorithm 1, where δ is either 1 or −1 depending on whether I is a left or right edge of the extended box. There are three possibilities: 1) if I does not intersect η.interval, we do nothing; 2) if I covers the entire η.interval, we increase/decrease η.c by 1; 3) otherwise, we recursively update the attributes of the subtrees corresponding to I. Subsequently, we can find the depth of any point in an elementary interval (a leaf node), which is the sum of η.c for all the nodes on the path from the root to the leaf node. Let f(η1 → η2) be the sum of the k.c values for all nodes k on the path from node η1 to node η2, excluding η1 but including η2. Finding the minimum depth of the arrangement is accomplished by finding the leaf node η that minimizes the value of r.c + f(r → η), where r is the root node of the tree. To efficiently perform this query, we maintain an additional attribute η.mf that stores the
minimum value of f(η → η′) over all leaf nodes η′ in the subtree rooted at node η; the minimum depth for the arrangement is therefore r.c + r.mf. For any leaf node η, η.mf = 0; and for any internal node η, η.mf = min{η.left.c + η.left.mf, η.right.c + η.right.mf}. This attribute is updated after an interval is inserted into its children subtrees (lines 9, 11 in Algorithm 1). For P2, we must also identify the leftmost point that attains the minimum depth. This is done by introducing another attribute η.mpos with initial value equal to the left endpoint of η.interval. This attribute is also updated after the insertion of an interval into its children subtrees (lines 10, 12 in Algorithm 1).

Algorithm 1. Segment tree update procedure for P2

update(η, I, δ)
 1   if η.interval ∩ I == ∅
 2       return
 3   if η.interval ⊆ I
 4       η.c = η.c + δ
 5       return
 6   update(η.left, I, δ)
 7   update(η.right, I, δ)
 8   if η.left.c + η.left.mf ≤ η.right.c + η.right.mf
 9       η.mf = η.left.c + η.left.mf
10       η.mpos = η.left.mpos
     else
11       η.mf = η.right.c + η.right.mf
12       η.mpos = η.right.mpos
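A direct Python transcription of Algorithm 1 is given below. Following the observation that only the depths of the points in P matter, the leaves here are individual points of P (by index), which simplifies the interval bookkeeping; class and function names are ours.

```python
class Node:
    __slots__ = ("lo", "hi", "left", "right", "c", "mf", "mpos")
    def __init__(self, lo, hi, left=None, right=None):
        self.lo, self.hi = lo, hi            # index range [lo, hi] into P
        self.left, self.right = left, right
        self.c = 0                           # event intervals covering the whole range
        self.mf = 0                          # min depth in subtree, excluding own c
        self.mpos = lo                       # leftmost index attaining that minimum

def build(lo, hi):
    if lo == hi:
        return Node(lo, hi)
    mid = (lo + hi) // 2
    return Node(lo, hi, build(lo, mid), build(mid + 1, hi))

def update(node, a, b, delta):
    """Add delta to the depth of every monitored point with index in [a, b]."""
    if b < node.lo or node.hi < a:           # disjoint
        return
    if a <= node.lo and node.hi <= b:        # fully covered: record on this node only
        node.c += delta
        return
    update(node.left, a, b, delta)
    update(node.right, a, b, delta)
    if node.left.c + node.left.mf <= node.right.c + node.right.mf:
        node.mf, node.mpos = node.left.c + node.left.mf, node.left.mpos
    else:
        node.mf, node.mpos = node.right.c + node.right.mf, node.right.mpos

# Usage: with P = sorted endpoints, root.c + root.mf is the minimum depth over P
# and P[root.mpos] is the leftmost point attaining it.
P = [0.0, 1.0, 2.5, 4.0]
root = build(0, len(P) - 1)
update(root, 1, 2, +1)                       # an event interval covering P[1..2]
print(root.c + root.mf, P[root.mpos])        # 0 0.0  (P[0] still has depth 0)
```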
Every interval will affect at most two nodes at the same depth of the segment tree [1, pp. 234, first paragraph]. The update procedure will affect at most O(log n) nodes, and all updates can be performed in O(1) time, so operation O1 takes O(log n) time. The query operation O2 is answered by returning the value of r.c + r.mf for the root node r, and the query operation O3 is answered by returning the value r.mpos for the root node r, so both operations require O(1) time. Since there are exactly 2n intervals in E, the line sweeping algorithm to find the bottom-leftmost feasible position requires at most O(n) O1, O2, and O3 operations. Therefore, the total time complexity for the algorithm is O(n log n).
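For testing an implementation of the sweep, a brute-force reference is useful: the leftmost feasible x is either 0 or the right edge of some placed piece, and the lowest feasible y at that x is either 0 or the top edge of some piece, so scanning that candidate grid in lexicographic order returns the same bottom-leftmost position. The O(n³) sketch below (with pieces as (x1, y1, x2, y2) corners) is a validation aid of ours, not the O(n log n) algorithm.

```python
def first_position(L, H, placed, l, h):
    """Bottom-leftmost feasible position for an l-by-h piece in an L-by-H
    container, or None if no feasible position exists."""
    if l > L or h > H:
        return None
    xs = sorted({0} | {x2 for (_, _, x2, _) in placed if x2 <= L - l})
    ys = sorted({0} | {y2 for (_, _, _, y2) in placed if y2 <= H - h})
    for x in xs:                   # smallest x first, then smallest y
        for y in ys:
            if all(x + l <= x1 or x2 <= x or y + h <= y1 or y2 <= y
                   for (x1, y1, x2, y2) in placed):
                return (x, y)
    return None

print(first_position(L=10, H=6, placed=[(0, 0, 4, 6), (4, 0, 10, 3)], l=3, h=3))  # (4, 3)
```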
5 The All Normal Positions Variant (P3)
There are two main difficulties to overcome when adapting our algorithm to handle the P3 variant. Firstly, we must report all feasible normal positions. Secondly, we must also ensure that no non-normal positions are reported. To ensure that all feasible normal positions are considered, we use the following observation. Consider a feasible normal position (x∗, y∗). The point (x∗ − ε, y∗) for a small value ε must be covered by some extended piece (otherwise, the
new item can be translated to the left by ε, so (x∗, y∗) is not a normal position). Consequently, there must be an event interval I corresponding to the right edge of some extended box whose coincidence with the sweeping line decremented the depth of y∗ from 1 to 0 in our line sweeping algorithm, and there is exactly one such event interval for each feasible normal position. Given an event interval I that corresponds to a right edge, let S be the set of all points on I whose depths changed from 1 to 0 due to the coincidence of I with the sweeping line. All points in S can be expressed as the set containing the smallest number of intervals encompassing these points (which implies that all intervals are disjoint and non-consecutive), sorted by ascending y-coordinate. We call S the zero-set of I. Note that for any normal position (x∗, y∗), y∗ ∈ P and y∗ must be the lower endpoint of some interval in a zero-set. Therefore, to report all feasible normal positions, it is sufficient to report the lower endpoints of all intervals in the zero-sets. In fact, the lower endpoints of all but the first interval in S are the y-coordinates of feasible normal positions. The exception is the first interval [pi, pj) in S, whose lower endpoint is not a normal position if pi is the lower endpoint of the event interval I and the depth of the preceding elementary interval [pi−1, pi) is zero. Our implementation handles this special case explicitly by checking the depth of the interval [pi−1, pi) prior to reporting a first interval [pi, pj). Our data structure must support an additional operation to solve P3:

O4 Given a query interval I corresponding to the right edge of some extended piece, report (the left endpoints of) all intervals in S (the zero-set of I).

To perform operation O4 efficiently, we maintain a set η.np for each node η in the segment tree, which is an ordered set of intervals containing all points in η.interval attaining the minimum depth η.mf. We will describe our data structure for implementing this set at the end of this section.

Algorithm 2. Finding all intervals in I with depth 0

collect(η, I, S)
 1   if η.c > 0 or η.mf > 0 or η.interval ∩ I == ∅
 2       return
 3   if η.interval ⊆ I
 4       append(S, η.np)
     else
 5       collect(η.left, I, S)
 6       collect(η.right, I, S)
With η.np in place, Algorithm 2 performs operation O4 given the query interval I by concatenating all intervals with depth 0 into the list S. The append procedure appends each interval in η.np to the end of S. If the last interval in S and the first interval in η.np are consecutive, then the append procedure merges the two intervals in O(1) time before appending the remaining intervals. Hence, S is appended with the zero-set of I.
Assuming the traversal of the set η.np takes O(|η.np|) time, the total number of operations performed by all append calls in the collect procedure is the number of intervals in the sets η′.np over all visited nodes η′. Let k be the number of intervals in S returned by the collect procedure. Observe that the collect procedure is similar to a range query on the segment tree, so it visits O(log n) nodes. For each node visited, the append procedure performs at most one merge. Since the number of intervals in the sets η′.np is the sum of k and the number of merges, the running time for collect is O(k + log n).

The update procedure for P2 (Algorithm 1) can be adapted to maintain the set η.np. For a leaf node, the set of intervals that attains the minimum depth must be the elementary interval itself. For any internal node η, there are three possibilities: 1) the minimum depth is contributed by the left subtree, i.e., η.np = η.left.np; 2) the minimum depth is contributed by the right subtree, i.e., η.np = η.right.np; 3) both subtrees have the same minimum depth, whereupon we set η.np to be the concatenation of η.left.np and η.right.np, merging the last interval of the left subtree with the first interval of the right subtree if they are consecutive. If this union operation can be done in O(1) time, then the running time of update remains O(n log n).

The complete line sweeping algorithm for P3 is given in Algorithm 3. We denote each event interval by a triplet (x, f, I), where x is the x-coordinate of the event; f ∈ {left, right} indicates whether the event interval is the left or right edge of some extended piece; and I is the interval. In order to report all K feasible normal positions, the line sweeping algorithm considers exactly n + 1 right edges and n + 1 left edges (including the dummy piece), so both the update and collect procedures are called O(n) times. Furthermore, the total number of intervals collected in all sets S is O(K + n), because each right edge will introduce at most one interval that corresponds to a non-normal position. Therefore, the total time complexity to report all K feasible normal positions is O(n log n + K), which is asymptotically optimal. Note that K = O(n) if all n placed pieces are interior-disjoint [3]; otherwise, K = O(n²).

When the placed pieces are not interior-disjoint, the special case when an interval I1 fully encompasses an interval I2 might cause Algorithm 3 to report non-normal positions. For example, when I1 is processed in Figure 1, the four left endpoints in S1 are reported as feasible normal positions. However, the empty point (second from the top) is not a normal position and should not be reported. To resolve this issue, we transform the set of vertical intervals at an x-coordinate into another set of vertical intervals prior to processing (after line 21) in the following manner: sort all lower endpoints in ascending order and all upper endpoints in ascending order, keeping duplicates; the new set of intervals is formed by matching lower endpoints to upper endpoints in ascending order. This eliminates the special case, and the depths of all points remain intact.

Finally, we show how the ordered set η.np of all intervals that attain the minimum depth for each node η can be efficiently implemented. In particular, it must support the following two operations:
S1 Compute the union of two sets in O(1) time; if the last element in one set and the first element in the other are consecutive, then they should be merged into a single interval;
S2 Traverse all elements in the set in O(k) time, where k is the number of elements in the set.

Algorithm 3. Line Sweeping algorithm for P3

Box-Placement-P3(L, H, Rs, l, h)
// Inputs: L, H: length, height of container; l, h: length, height of piece to be placed
// Rs: list of n placed pieces Ri = (xi, yi) × (x̄i, ȳi)
// Output: list of feasible normal positions SN
 1   for each Ri = (xi, yi) × (x̄i, ȳi) ∈ Rs                  // extend placed pieces
 2       Ri = (xi − l, yi − h) × (x̄i, ȳi)
 3   L = L − l; H = H − h                                     // reduce container
 4   P = {0, H}
 5   for each Ri = (xi, yi) × (x̄i, ȳi), i = 1, .., n          // find all distinct end points in Iq
 6       if yi ≥ 0    P = P ∪ {yi}
 7       if ȳi ≤ H    P = P ∪ {ȳi}
 8   sort all points in P in ascending order
 9   E = {(−l − 1, left, [0, H]), (0, right, [0, H])}          // insert dummy piece at left side
10   for each Ri = (xi, yi) × (x̄i, ȳi), i = 1, .., n          // convert to closed intervals
11       find the smallest p ∈ P, s.t. p > yi
12       find the largest q ∈ P, s.t. q < ȳi
13       E = E ∪ {(xi, left, [p, q])}
14       if x̄i ≤ L    E = E ∪ {(x̄i, right, [p, q])}
15   sort event intervals in E by ascending x-coordinate
16       if tied, left edge has priority
17       if tied, prefer smaller bottom endpoints
18       if tied, prefer smaller top endpoints
19   construct segment tree T using Iq = [0, H] and P
20   SN = ∅
21   for each distinct x-coordinate x∗ in E                    // line sweeping
22       for each (x∗, right, I) ∈ E
23           update(T.root, I, −1)
24           S = ∅
25           collect(T.root, I, S)
26           remove the first interval from S if its lower endpoint is not a normal position
27           for each [y, y′) ∈ S
28               insert (x∗, y) into SN
29       update(T.root, I, 1), ∀(x∗, left, I) ∈ E
30   return SN
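The endpoint re-pairing described above (applied to the vertical intervals that share an x-coordinate, after line 21 of Algorithm 3) amounts to sorting lower and upper endpoints separately and re-pairing them in order; a minimal sketch, with a function name of our choosing:

```python
def repair_intervals(intervals):
    """Re-pair sorted lower endpoints with sorted upper endpoints so that no
    interval fully encompasses another, while preserving the depth of every point."""
    lows = sorted(lo for lo, hi in intervals)
    highs = sorted(hi for lo, hi in intervals)
    return list(zip(lows, highs))

# I1 fully encompasses I2, as in Fig. 1; after re-pairing no interval nests in another.
print(repair_intervals([(0, 10), (3, 5)]))   # [(0, 5), (3, 10)]
```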
[Figure 1 depicts a vertical sweeping line crossing two event intervals I1 and I2, with I1 fully encompassing I2; the associated interval sets are labeled S1 and S2.]
Fig. 1. I1 fully encompasses I2
Such a set S can be implemented by separating its elements into two parts. The first part is a sublist of a doubly linked list demarcated by S.head and S.tail; the second part is a single element S.last that is possibly empty. If S.last is empty, then the sublist contains the entire set S; otherwise, the set S is the sublist plus the element S.last as the last element. The traversal of the set takes O(k + 1) time, where k is the number of elements in the set, satisfying operation S2.

We now show how the union operation S1 can be performed in O(1) time. For every leaf node η representing an elementary interval, the set η.np contains the elementary interval, which is represented by a doubly linked list with a single node: η.np.head and η.np.tail both point to this node and η.np.last = ∅. The procedure that merges two sets S1 and S2 is required to return another set S representing the union, and both S1 and S2 must still represent their respective sets as they were before the union operation. The retrieval of the last element of a set S1 can be done in O(1) time since it is either S1.last or the element pointed to by S1.tail. Therefore, determining if the last element in S1 and the first element in S2 are consecutive can be done in O(1) time. To produce the new merged set S, we must first concatenate the linked lists of S1 and S2. There are 4 cases, depending on whether the last element of S1 and the first element of S2 are consecutive, and whether S1.last is empty:

1. Not consecutive, empty: we concatenate the linked lists directly by setting S1.tail.next = S2.head and S2.head.prev = S1.tail;
2. Not consecutive, non-empty: we first create a new node n and copy the element S1.last into it. We then concatenate the lists by setting S1.tail.next = n; n.prev = S1.tail; n.next = S2.head; and S2.head.prev = n;
3. Consecutive, empty: we set S1.last to be S1.tail; set S1.tail = S1.tail.prev; and then concatenate the lists using the method described in case 4 below;
4. Consecutive, non-empty: we create a new node n that contains the merged interval of S1.last and the first node in S2; then set S1.tail.next = n and n.next = S2.head.next (if it exists).

After the concatenation is done, we set the head and tail pointers of the union S to be S1.head and S2.tail, respectively. Finally, we copy S2.last into S.last. In
all cases, we can verify that both S1 and S2 retain their original elements after the merge, S contains the union of S1 and S2 , and all operations take O(1) time.
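The following Python sketch is our own illustration of this set representation (the class and field names are invented, not taken from the paper). It stores elements as half-open intervals [lo, hi), so two elements are consecutive exactly when the first ends where the second begins, and traversal is forward-only between head and tail, which is why pointer changes outside a set's demarcation do not invalidate it.

class _Node:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi      # half-open interval [lo, hi)
        self.prev = None
        self.next = None

class IntervalSet:
    def __init__(self, head=None, tail=None, last=None):
        self.head = head               # first node of the sublist
        self.tail = tail               # last node of the sublist
        self.last = last               # optional pending element kept outside the list

    def traverse(self):                # operation S2: O(k) forward walk bounded by head and tail
        out, node = [], self.head
        while node is not None:
            out.append((node.lo, node.hi))
            if node is self.tail:
                break
            node = node.next
        if self.last is not None:
            out.append(self.last)
        return out

def last_element(s):                   # O(1): either s.last or the element at s.tail
    return s.last if s.last is not None else (s.tail.lo, s.tail.hi)

def union(s1, s2):                     # operation S1: O(1) union; s1 and s2 keep their own demarcations
    lo1, hi1 = last_element(s1)
    first = (s2.head.lo, s2.head.hi)
    consecutive = (hi1 == first[0])
    head, tail1, last1 = s1.head, s1.tail, s1.last
    if consecutive and last1 is None:              # case 3: pop the tail node into the pending element
        last1 = (tail1.lo, tail1.hi)
        if tail1 is head:                          # the sublist of s1 becomes empty
            head = tail1 = None
        else:
            tail1 = tail1.prev
    if not consecutive:
        if last1 is None:                          # case 1: splice the lists directly
            tail1.next, s2.head.prev = s2.head, tail1
        else:                                      # case 2: materialise the pending element first
            n = _Node(*last1)
            tail1.next, n.prev = n, tail1
            n.next, s2.head.prev = s2.head, n
    else:                                          # case 4 (also reached via case 3)
        n = _Node(last1[0], first[1])              # merge the pending element with s2's first interval
        if tail1 is not None:
            tail1.next, n.prev = n, tail1
        else:
            head = n
        n.next = s2.head.next                      # skip s2.head, whose interval was absorbed into n
        if n.next is not None:
            n.next.prev = n
        if s2.tail is s2.head:                     # s2 consisted of that single node
            return IntervalSet(head, n, s2.last)
    return IntervalSet(head, s2.tail, s2.last)

The point of the construction is that union never rewrites anything strictly between an input set's head and tail, so both inputs remain valid views of their own elements after the O(1) merge.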
6
Conclusion
The 2DBPP is a fundamental subproblem for many algorithms in the cutting and packing domain. We describe line sweeping algorithms for the P2 and P3 variants of this problem based on the segment tree with asymptotically optimal running times. Our algorithms are conceptually simpler, easier to implement and more general (allowing overlapping placed pieces) than existing approaches. The technique employed for the P2 variant in 2D can be extended to 3D by using a quadtree rather than a segment tree. Our algorithm for the P3 variant can also potentially be extended to 3D. However, in order to maintain an asymptotically optimal running time, we require an implementation of a set that supports fast union and merge operations; currently, we have not been able to discover an implementation that can achieve these operations in O(1) time. We have developed an asymptotically optimal algorithm for the P4 variant of the 2DBPP using the transformation described in Section 3, our P2/P3 algorithm, and the algorithm by Wood [4]. The time complexity is O(n log n + k) time, where k is the number of edges in the contour describing the feasible region. However, we omit the details due to space constraints.
References 1. Berg, M., Cheong, O., Kreveld, M., Overmars, M.: Computational Geometry: Algorithms and Applications, 3rd edn. Springer, Heidelberg (2008), http://www.worldcat.org/isbn/3540779736 2. de Berg, M., Cheong, O., van Kreveld, M., Overmars, M.: Segment Trees, 3rd edn., ch. 10.3. Springer, Heidelberg (2008) 3. Healy, P., Creavin, M., Kuusik, A.: An optimal algorithm for rectangle placement. Operations Research Letters 24(1-2) (February 1999), http://www.sciencedirect.com/science/article/B6V8M-43GHSPF-B/ 2/42b6ea739a362592dccfb090350a5a34 4. Wood, D.: The contour problem for rectilinear polygons. Information Processing Letters 19(5), 229–236 (1984), http://dx.doi.org/10.1016/0020-0190(84)90058-9
An Algorithm for the Freight Allocation Problem with All-Units Quantity-Based Discount Xiang Gao1 , Andrew Lim1 , Wee-Chong Oon1 , and Hu Qin2,1, 1 2
Department of Management Sciences, City University of Hong Kong, Tat Chee Ave, Kowloon Tong, Hong Kong School of Management, Huazhong University of Science and Technology, No. 1037, Luoyu Road, Wuhan, China
[email protected] Abstract. This study is motivated by a project awarded by one of the largest international retail distributors in the world. An important task for the distributor is to plan the distribution of goods from Asia to various destinations across Europe. The goods are transported by shipping companies along shipping lanes, and these companies offer different discount rates depending on the freight quantity. The task involves purchasing freight capacity from shipping companies for each shipping lane based on projected demand, and subject to minimum quantity requirements, such that the total transportation cost is minimized. Existing work on this and related problems employs commercial linear programming software to solve the associated models. The problem is NP-hard, and is therefore unlikely to be solvable optimally in reasonable time for large instances. Hence, we propose a heuristic-based algorithm that combines a filter and fan search scheme with a tabu search mechanism for the problem. Experiments on randomly generated test instances show that our algorithm produces superior solutions in less computation time and requires less computer memory compared to ILOG CPLEX 11.0. Keywords: freight allocation; quantity discount; filter and fan; tabu search.
1
Introduction
This study is motivated by a project awarded by one of the largest international retail distributors (henceforth referred to as the shipper). The shipper annually procures diverse products, from textiles and foodstuffs to major electrical appliances, from over one thousand suppliers across Asia to satisfy the demands of sales divisions distributed across Europe. Long-distance ocean shipping is the main transportation mode for the shipper for the delivery of the procured goods, accounting for around 95% of its total
Corresponding author.
annual turnover on average. Hence, the shipper maintains close relationships with most of the leading international shipping companies (henceforth referred to as carriers). The transportation network under the purview of the shipper comprises more than 1000 shipping lanes connecting 71 loading ports in Asia and 27 discharging ports in Europe. This paper examines the problem of allocating the freight of all lanes to carriers such that the total transportation cost is minimized.
The shipper performs this freight allocation at the strategic level. At the beginning of every fiscal year, the shipper forecasts the total quantity of freight (demand) for the coming year on each shipping lane, taking into account possible market fluctuations. Price quotes are collected from each carrier along with its discount rates, which are generally applied based on the total freight quantity awarded to that carrier across all lanes. Minimum quantity commitments (MQC) must also be taken into account; stipulated by the U.S. Federal Maritime Commission, this requirement enforces a minimum freight quantity for each contracted carrier [6]. As a safeguard against the inability of a carrier to fulfill its contractual obligations due to unforeseen circumstances, the shipper also specifies a minimum number of carriers for each lane.
We call this problem the freight allocation problem with all-units quantity-based discount (FAPAQD). While some research has been done on problems of this nature, such studies generally employ exact linear and integer programming solvers in order to produce optimal solutions on small instances. However, the problem is NP-hard, and therefore such approaches are unlikely to be successful for the large and practical scenarios faced by the shipper. Consequently, we developed a technique that makes use of the polynomial-time solvable minimum cost network flow problem to generate solutions for the FAPAQD, and then uses a filter and fan search heuristic with tabu search in order to locate good solutions.
There is a substantial amount of literature related to procurement problems involving discounts. Two commonly used discount policies are the all-units discount and the incremental discount. Under the all-units discount policy, the discounted price applies to all units purchased, while under the incremental discount policy the discounted price applies only to quantities within the associated discount level. The discount can be based on business volume, which is the total dollar amount of business across all products purchased from the suppliers [10]; or based on total quantity, in which the discount is given according to the total number of units of all products purchased from the supplier [5]. If these problems are distinguished by the number of suppliers and products, then procurement problems with discounts can be classified into the following four categories: (1) single supplier and single product [8,7]; (2) single supplier and multiple products [3,9]; (3) multiple suppliers and single product [2,1]; and (4) multiple suppliers and multiple products [5,11]. The FAPAQD is in essence a multiple-supplier, multiple-product procurement problem. The test data employed in existing literature are all much smaller than the hundreds of shipping lanes and dozens of carriers the shipper must consider; for example, the largest instances considered by [5] consist of only 50
suppliers and 100 products. While problem instances of this scale are appropriate for the procurement problems examined in the previous publications, they are insufficient for our purposes.
2
Problem Formulation
We modeled the problem faced by the shipper in the following manner. There is a set of carriers I = {1, 2, . . . , n} and a set of shipping lanes J = {1, 2, . . . , m}. Not all carriers can operate on all lanes; the set N contains (i, j) pairs, i ∈ I, j ∈ J, indicating that carrier i operates on lane j. The projected demand for lane j in the upcoming fiscal year is given by dj , and the task is to assign a freight quantity to each carrier-lane pair (that is in N ) such that the demand for each lane is fulfilled at minimum cost, subject to various constraints. Each carrier i has a minimum quantity commitment (MQC) constraint, denoted by bi , that defines the minimum quantity that must be assigned to that carrier if it is selected. The regular price quoted by carrier i to transport one unit of product on lane j is denoted by pij . Each carrier i also defines a set of discount intervals Ki = {1, 2, . . . , ki } that describe the percentage discount on all units assigned to that carrier for each quantity range. We denote the discount lower bound of the k-th interval for carrier i by βik , and the discount value of the k-th interval for carrier i by αik . It is assumed that bi = βi1 < βi2 < . . . < βiki for all i, and αik is a non-decreasing function of βik . In order to increase the reliability of service, the shipper requires that each lane must use a minimum number of carriers based on the characteristics of the lane such as traveling distance, annual quantity of freight and safety level of the shipping lane. By imposing a minimum number of carriers on each lane, the shipper protects against the unavailability of carriers due to insufficient cargo hold space on the day or other unforeseen circumstances. However, assigning a minimum number of carriers for each lane is insufficient. For example, if we impose that the number of carriers for a certain lane must be at least 4, then the resulting allocation pattern may be 97%, 1%, 1%, 1%, which is almost the same as allocating all freight quantity to one carrier. After conferring with the executives of the shipper, we decided to model this requirement as a maximum percentage allocation qj of the freight quantity for each selected carrier on lane j, i.e., the amount of freight quantity assigned to each carrier servicing lane j can be at most qj · dj . Similar applications of such quantity limit measures can be found in [10,5]. For a lane j, the number of selected carriers that can operate on it can be expressed as cj = |{i : (i, j) ∈ N and carrier i is selected}|. For a given set of selected carriers, the number of carriers that can operate on a lane may be insufficient to cater for the demand, i.e., it is possible that there exists some lane j such that qj · cj < 1. When this occurs in practice, the shipper would likely have to purchase freight space from carriers in the open or spot market, who usually charge a much higher rate than regular carriers under long-term contract (e.g., double the price or more). We model this aspect by the value rj ,
which is the unit penalty cost for unfulfilled demand for lane j from purchasing freight space from the spot market.
Our MIP formulation for the FAPAQD is as follows. The decision variable xijk is the amount of freight space purchased from carrier i on lane j subject to discount interval k. The linear program is formulated such that the value of xijk controls two other values: the amount of unfulfilled demand uj on lane j and the binary variable zik that equals 1 if discount interval k for carrier i is activated and equals 0 otherwise. The value M is a sufficiently large positive number.

MIP:  Minimize  ω = Σ_{i∈I} Σ_{j∈J} Σ_{k∈Ki} (1 − αik) pij xijk + Σ_{j∈J} rj uj        (1)

s.t.  Σ_{i∈I} Σ_{k∈Ki} xijk + uj ≥ dj,      ∀ j ∈ J                                    (2)
      Σ_{j∈J} xijk + (1 − zik) M ≥ βik,     ∀ i ∈ I, k ∈ Ki                            (3)
      Σ_{j∈J} xijk ≤ M zik,                 ∀ i ∈ I, k ∈ Ki                            (4)
      Σ_{k∈Ki} zik ≤ 1,                     ∀ i ∈ I                                    (5)
      xijk = 0,                             ∀ (i, j) ∉ N, k ∈ Ki                       (6)
      Σ_{k∈Ki} xijk ≤ qj dj,                ∀ j ∈ J, i ∈ I                             (7)
      uj ≥ 0,                               ∀ j ∈ J
      zik ∈ {0, 1},                         ∀ i ∈ I, k ∈ Ki
      xijk ≥ 0,                             ∀ i ∈ I, j ∈ J, k ∈ Ki

The objective function (1) minimizes the total procurement cost, comprising the total freight space cost and the total penalty cost across all lanes. Constraints (2) state that on each lane, the sum of the freight capacity purchased from all carriers and the unfulfilled quantity must satisfy demand. Constraints (3) guarantee that if carrier i offers discount interval k to the shipper, then the total freight capacity purchased from carrier i across all lanes must be greater than or equal to the discount lower bound βik. Conversely, if carrier i does not offer discount interval k to the shipper, then all corresponding decision variables xijk must equal zero, as required by constraints (4). Constraints (5) ensure that at most one discount interval per carrier can be selected. When a carrier i cannot provide service on lane j (i.e., (i, j) ∉ N), then the corresponding decision variables xijk = 0 for all k, as given by constraints (6). Finally, the quantity limit for each carrier on each lane is controlled by constraints (7). In our model, the variables xijk and uj are non-negative real numbers. The formulation of model MIP implicitly includes the MQC constraints. Given a feasible solution (x, u, z), if Σ_{k∈Ki} zik = 1, then carrier i must be selected and there must exist exactly one zik̃ = 1, 1 ≤ k̃ ≤ ki. According to constraints (3),
we have Σ_{j∈J} xijk̃ ≥ βik̃ ≥ βi1 = bi, which guarantees that the total quantity purchased from carrier i satisfies its MQC. Otherwise, if Σ_{k∈Ki} zik = 0, then all the decision variables xijk relating to carrier i must be zero, which implies that carrier i is not selected for any lane. The Transportation Problem with MQC introduced in [6], which was shown to be NP-hard, is a special case of the FAPAQD; therefore the FAPAQD is also NP-hard.
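As a concrete illustration only (not the implementation used in the paper), the model above can be written down almost verbatim with an off-the-shelf modeling library; the sketch below uses PuLP, and the data containers I, J, K, N, p, d, q, r, alpha and beta, as well as the big-M value, are hypothetical placeholders.

import pulp

def build_fapaqd_mip(I, J, K, N, p, d, q, r, alpha, beta, M=1e6):
    """I: carriers, J: lanes, K[i]: discount intervals of carrier i, N: set of feasible (i, j) pairs."""
    prob = pulp.LpProblem("FAPAQD", pulp.LpMinimize)
    x = {(i, j, k): pulp.LpVariable(f"x_{i}_{j}_{k}", lowBound=0)
         for i in I for j in J for k in K[i]}
    u = {j: pulp.LpVariable(f"u_{j}", lowBound=0) for j in J}
    z = {(i, k): pulp.LpVariable(f"z_{i}_{k}", cat="Binary") for i in I for k in K[i]}

    # (1) discounted freight cost plus spot-market penalties
    prob += (pulp.lpSum((1 - alpha[i][k]) * p[i][j] * x[i, j, k]
                        for i in I for j in J for k in K[i])
             + pulp.lpSum(r[j] * u[j] for j in J))
    for j in J:                                             # (2) demand coverage per lane
        prob += pulp.lpSum(x[i, j, k] for i in I for k in K[i]) + u[j] >= d[j]
    for i in I:
        for k in K[i]:
            total_ik = pulp.lpSum(x[i, j, k] for j in J)
            prob += total_ik + (1 - z[i, k]) * M >= beta[i][k]   # (3) discount lower bound
            prob += total_ik <= M * z[i, k]                      # (4) inactive interval forces zero
        prob += pulp.lpSum(z[i, k] for k in K[i]) <= 1           # (5) at most one interval per carrier
    for i in I:
        for j in J:
            if (i, j) not in N:                                  # (6) carrier does not serve this lane
                for k in K[i]:
                    prob += x[i, j, k] == 0
            else:                                                # (7) per-carrier quantity limit on the lane
                prob += pulp.lpSum(x[i, j, k] for k in K[i]) <= q[j] * d[j]
    return prob, x, u, z

Handing such a model directly to a solver is exactly the approach that becomes impractical for instances with hundreds of lanes, which motivates the heuristic described in the next section.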
3
Solution Procedure
In this section, we present an algorithm combining a filter-and-fan (F&F) search scheme with a tabu search mechanism to solve the FAPAQD. It comprises two main components, namely a local search to identify a locally optimal solution, and an F&F tree search to explore larger neighbourhoods. If all binary variables zik are determined in the model MIP, then the resultant model becomes a linear programming problem with only continuous variables. We define a vector set Z as:

Z = {(z11, . . . , z1k1, . . . , zn1, . . . , znkn) : Σ_{k∈Ki} zik ≤ 1 and zik ∈ {0, 1}, ∀ i ∈ I}
Hence, the set Z contains all vectors that describe a possible selection of discount intervals from the n carriers. For a particular z ∈ Z, let I1 be the set of selected carriers, i.e., those with Σ_{k∈Ki} zik = 1, and let k(i) be the discount interval offered by carrier i, where k(i) = k if zik = 1 and k(i) = 0 if i ∉ I1. Given an element z ∈ Z, we can derive a model as follows:

MIP(z):  Minimize  ω(z) = Σ_{i∈I1} Σ_{j∈J} (1 − αik(i)) pij xijk(i) + Σ_{j∈J} rj uj      (8)

s.t.  Σ_{i∈I1} xijk(i) + uj ≥ dj,      ∀ j ∈ J                                           (9)
      Σ_{j∈J} xijk(i) ≥ βik(i),        ∀ i ∈ I1                                          (10)
      xijk(i) = 0,                     ∀ (i, j) ∉ N, i ∈ I1, j ∈ J                       (11)
      xijk(i) ≤ qj dj,                 ∀ j ∈ J, i ∈ I1                                   (12)
      uj ≥ 0,                          ∀ j ∈ J
      xijk(i) ≥ 0,                     ∀ i ∈ I1, j ∈ J
MIP(z) is a minimum cost network flow problem, which can be solved in polynomial time using the network simplex algorithm. For a given z ∈ Z, the solution of MIP(z) represents an optimal allocation of freight capacity to the discount intervals corresponding to z, which is a solution to the original FAPAQD. However, note that for the given z, it is possible that Σ_{j∈J:(i,j)∈N} qj dj < βik(i) for some i ∈ I1, which implies that the discount interval combination z leads to an infeasible solution of the original FAPAQD.
Given the current solution z ∈ Z, we define the neighbourhood N(z) of z as:

N(z) = {z′ : z′ ∈ Z and there exists exactly one carrier i such that k(i) ≠ k′(i)}

where k(i) and k′(i) are the discount intervals selected for carrier i in solutions z and z′, respectively. Therefore, the neighbourhood of z consists of solutions where a different discount interval is selected for a single carrier (including interval 0, which denotes that the carrier is not selected). Changing from a current solution z to any member of N(z) is defined as a move. If z results in an infeasible FAPAQD solution, we set ω(z) = +∞. For any two vectors z1, z2 ∈ Z, if ω(z1) ≤ ω(z2) and z1 is feasible, then z1 is considered better than z2.
The F&F procedure is a meta-heuristic that alternates between a local search to achieve a locally optimal solution and an F&F search to escape the local optimum so as to explore other neighbourhoods. Our F&F procedure begins with an initial feasible solution z^s to the original FAPAQD. The simplest such solution can be obtained by setting all zik to 0, which represents the situation that the shipper uses only spot market carriers. We then use a standard 1-step lookahead algorithm (i.e., choose the best neighbour from the neighbourhood of the current solution) to determine the local optimum z*. Once z* is identified, the F&F procedure generates a series of neighbourhood trees where each branch represents a move and each node corresponds to a solution in Z. There are three user-defined parameters for this procedure: L, η1 and η2. The maximum number of levels in each neighbourhood tree is L, and at each level there are at most η1 selected nodes. In each neighbourhood tree, the node corresponding to the locally optimal solution is regarded as level 0, and its η1 best feasible neighbours comprise level 1. For each node in level l ≥ 1, the best η2 feasible neighbours are found, thereby generating η1 · η2 candidate nodes for level l + 1. Then, the best η1 nodes are selected from the η1 · η2 candidates and constitute level l + 1. The generation of η1 · η2 candidate nodes is called the fan candidate list strategy, and the selection of the best η1 nodes is called the filter candidate list strategy. This level propagation procedure is repeated until either an improved solution z^i that is better than the previously best-known solution is found, or the depth of the neighbourhood tree reaches L. In the first situation, the greedy algorithm is again triggered to seek another local optimum based on z^i (upon reaching a new local optimum, the tree depth is reset to 0), and the tree generation process repeats. In the second situation, the procedure terminates. Figure 1 gives an example of the F&F procedure.
To escape local optimality during the F&F search process, we incorporate a tabu search mechanism [4] on the selection of the η2 neighbours associated with each node. Recall that a move changes the discount interval of exactly one carrier. We therefore define the tabu condition as: for the current move, if the discount interval of carrier i is changed from k ∈ Ki ∪ {0} to any other value k′ ∈ Ki ∪ {0}, then the discount interval k cannot be selected for carrier i for the next ξ moves, where ξ is the tabu tenure. Each node has several tabu conditions inherited from its parent node and one tabu condition upon creation. However, we also use the aspiration criterion, i.e., the tabu conditions are overridden if the
Fig. 1. Example of Filter-and-Fan algorithm (η1 = 3, η2 = 2, L = 4)
move yields an improved solution; this method of aspiration is commonly used in previous tabu search implementations. Figure 2 provides an example of the propagation of tabu conditions, where the node chain corresponds to a path in Figure 1. For example, node 4 inherits two tabu conditions “cannot select discount interval 0 of carrier 2 for 1 move” and “cannot select discount interval 3 of carrier 5 for 2 moves” from node 3. When node 4 is created, a new tabu condition “cannot select discount interval 3 of carrier 0 for 3 moves” is generated. Each node can lead to neighbours that either satisfy all tabu conditions or that correspond to an improved solution (or both). Note that each path in Figure 1 can be viewed as a separate tabu search path; hence, the F&F tree search can be viewed as a modified tabu search algorithm with multiple search paths.
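To make the move and tabu bookkeeping concrete, here is a small Python sketch of our own (the function names, the dictionary encoding of z, and the cost arguments are invented for illustration); z maps each carrier to its selected discount interval, with 0 meaning the carrier is not selected, and evaluating a candidate stands in for solving MIP(z′).

def neighbours(z, K):
    """All solutions that change the discount interval of exactly one carrier.
    z maps carrier i to its selected interval k(i); 0 means 'not selected'."""
    for i, k_old in z.items():
        for k_new in [0] + list(K[i]):
            if k_new != k_old:
                z_new = dict(z)
                z_new[i] = k_new
                yield (i, k_old, k_new), z_new

def is_allowed(move, tabu, move_count, cand_cost, best_cost):
    """Tabu check with the aspiration criterion: a forbidden interval may still be
    chosen if the resulting solution improves on the best-known one."""
    i, k_old, k_new = move
    return tabu.get((i, k_new), 0) <= move_count or cand_cost < best_cost

def child_tabu(move, tabu, move_count, tenure):
    """A child node inherits its parent's tabu conditions and adds one of its own:
    after changing carrier i away from k_old, k_old is forbidden for `tenure` moves."""
    i, k_old, _ = move
    new_tabu = dict(tabu)
    new_tabu[(i, k_old)] = move_count + tenure
    return new_tabu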
4
Computational Experiments
We tested our approach on some randomly generated instances. As a comparison, we also applied the branch-and-cut search scheme provided by ILOG CPLEX 11.0 to optimally solve the MIP model. However, it was necessary to set both a time and a memory limit for each run of the CPLEX solver due to practical considerations; the time limit varied according to the time used by the F&F based algorithm, while the memory limit was fixed at 2GB RAM. We implemented both algorithms in C++ and ran all experiments on an Intel Xeon(R) 2.66 GHz server
6 carriers, 4 discount intervals, tabu tenure = 3
For example, (1) 320013 represents that carriers 0, 1, 2, 3, 4, 5 offer discount intervals 3, 2, 0, 0, 1, 3 to the shipper; (2) (2: 0 --> 1) represents that carrier 2 changes its discount interval from 0 to 1
Fig. 2. Tabu conditions on one tree path
with 3GB RAM. Computational times reported here are in CPU seconds on this server. 4.1
Test Instance Generation
Our test instances were generated according to the following scheme. The number of carriers is fixed to 80 and the number of shipping lanes is taken from {300, 500, 1000}. The percentage of lanes to which carrier i does not provide service is chosen from U[0.2, 0.5], where U[a, b] denotes the uniform distribution in the interval [a, b]. Given this percentage value, the lanes to which the carrier does not provide service are uniformly randomly selected. For each lane j, we first randomly choose a mean rate p̄j from U[20, 100], and a deviation factor τ from U[0.05, 0.30]. Then, the standard price pij quoted by carrier i for lane j is selected from U[p̄j(1 − τ), p̄j(1 + τ)]. The demand of each lane, denoted by dj, is a uniformly distributed random real number between 10 and 100. The value of qj is randomly selected from {0.3, 0.4, 0.6} with equal probabilities. These approximately correspond to having a minimum of 4, 3 and 2 carriers for that lane, respectively. The number of discount intervals is fixed at 5 for all carriers. Given the total demand D = Σ_{j=1}^{m} dj, the discount interval lower bounds βi1, . . . , βi5 were generated from U[D/13.5, D/12.5], U[D/11.5, D/10.5], U[D/9.5, D/8.5], U[D/7.5, D/6.5] and U[D/5.5, D/4.5], respectively. The discount increment for
each interval k ≥ 2 has a 40%, 40% and 20% probability to be 1.5%, 2% or 2.5%, respectively. For example, if the discount increments for carrier i were selected to be 0.0%, 1.5%, 2%, 2% and 1.5%, then the discount coefficients associated with the five intervals would be 0.0%, 1.5%, 3.5%, 5.5% and 7.0%. The unit cost for unfulfilled demand rj = 2 maxi∈I pij , which is twice the quoted standard price for the most expensive carrier on that lane. For each instance size, we generated 10 random instances, for a total of 30 instances. 4.2
Experimental Results
We compared the performance of our F&F based algorithm (denoted by F&F) with CPLEX in terms of the gap between the total cost of their solutions, which is calculated by Gap = (CPLEX − F&F)/CPLEX. The results are summarized in Table 1.

Table 1. Performance Comparison between F&F and CPLEX

Size        Instance  F&F Time  Gap1(%)  Gap2(%)  Gap4(%)  Gap6(%)  Gap8(%)  Gap10(%)
(80,300)    0         749       0.77     0.77     0.77     0.77     N/A      N/A
            1         781       1.36     1.36     1.36     N/A      N/A      N/A
            2         779       1.49     1.49     1.49     1.49     N/A      N/A
            3         912       1.45     1.45     1.45     N/A      N/A      N/A
            4         757       2.10     2.10     2.10     2.10     N/A      N/A
            5         671       1.24     1.24     1.24     1.24     N/A      N/A
            6         643       0.61     0.61     0.61     0.61     N/A      N/A
            7         820       2.15     2.15     2.15     N/A      N/A      N/A
            8         486       1.05     1.05     1.05     1.05     1.05     N/A
            9         978       1.33     1.33     1.33     N/A      N/A      N/A
            Average   757       1.36     1.36     1.36     1.21     1.05     N/A
(80,500)    0         1872      2.75     2.75     N/A      N/A      N/A      N/A
            1         1574      0.99     0.99     N/A      N/A      N/A      N/A
            2         1352      1.00     1.00     1.00     N/A      N/A      N/A
            3         767       1.73     1.11     1.11     1.11     N/A      N/A
            4         2066      1.53     1.53     N/A      N/A      N/A      N/A
            5         974       2.45     2.45     2.45     N/A      N/A      N/A
            6         1450      1.14     1.14     N/A      N/A      N/A      N/A
            7         1751      1.49     1.49     N/A      N/A      N/A      N/A
            8         1996      1.54     1.54     N/A      N/A      N/A      N/A
            9         1070      2.10     2.10     N/A      N/A      N/A      N/A
            Average   1487      1.67     1.61     1.52     1.11     N/A      N/A
(80,1000)   0         2003      2.48     2.48     1.91     1.91     1.39     1.39
            1         2909      63.78    1.54     1.54     N/A      N/A      N/A
            2         3110      1.93     1.93     1.93     1.93     1.93     1.93
            3         5567      1.76     1.76     1.76     N/A      N/A      N/A
            4         4353      2.17     2.17     2.17     N/A      N/A      N/A
            5         2768      63.70    1.94     1.94     1.94     1.94     1.94
            6         2862      1.14     1.14     1.14     N/A      N/A      N/A
            7         2459      64.02    2.36     2.36     2.36     2.36     2.36
            8         4565      1.88     1.88     1.88     N/A      N/A      N/A
            9         3808      0.97     0.97     N/A      N/A      N/A      N/A
            Average   3440      20.38    1.82     1.85     2.04     1.91     1.91
We first ran F&F on these instances, and then set the time limit for CPLEX to be up to ten times more than "F&F Time". The columns "Gapx" (x ∈ {1, 2, 4, 6, 8, 10}) give the gap between the best solutions obtained by CPLEX at x times "F&F Time" and the solutions obtained by F&F. The "N/A" in this table indicates that CPLEX terminated due to an out-of-memory exception. Table 1 shows that F&F clearly outperformed CPLEX. Firstly, CPLEX required much more memory compared to F&F, since CPLEX encountered an out-of-memory exception when given 10 times "F&F Time" for 26 out of the 30 instances. Secondly, the solution quality of F&F was significantly better than that of CPLEX; most gaps exceeded 1%, and the average gap of the largest instances was about 2%.
5
Conclusion
In this paper, we described an algorithm that uses a filter and fan technique with tabu search to solve the freight allocation problem with all-units quantity-based discount. Experiments show that our proposed algorithm outperforms ILOG CPLEX 11.0 in terms of running time, solution quality and computer memory requirements when solving 30 randomly generated instances.
References 1. Burke, G.J., Carrillo, J., Vakharia, A.J.: Heuristics for sourcing from multiple suppliers with alternative quantity discounts. European Journal of Operational Research 186(1), 317–329 (2008) 2. Chauhan, S.S., Eremeev, A.V., Romanova, A.A., Servakh, V.V.: Approximation of the supply scheduling problem. Operations Research Letters 33(3), 249–254 (2005) 3. Chung, C.S., Hum, S.H., Kirca, O.: An optimal procedure for the coordinated replenishment dynamic lot-sizing problem with quantity discounts. Naval Research Logistics 47(8), 686–695 (2000) 4. Glover, F., Laguna, M.: Tabu search. Kluwer Academic Publishers, Dordrecht (1997) 5. Goossens, D.R., Maas, A.J.T., Spieksma, F.C.R., van de Klundert, J.J.: Exact algorithms for procurement problems under a total quantity discount structure. European Jounal of Operational Research 178(2), 603–626 (2007) 6. Lim, A., Wang, F., Xu, Z.: A transportation problem with minimum quantity commitments. Transportation Science 40(1), 117–129 (2006) 7. Mirmohammadi, S.H., Shadrokh, S., Kianfar, F.: An efficient optimal algorithm for the quantity discount problem in material requirement planning. Computers and Operations Research 36(6), 1780–1788 (2009) 8. Munson, G.L., Rosenblatt, M.J.: Theories and realities of quantity discounts: An explanatory study. Production and Operations Management 7(4), 352–369 (1998) 9. Rubin, P.A., Benton, W.C.: Evaluating jointly constrained order quantity complexities for incremental discounts. European Journal of Operational Research 149(3), 557–570 (2003) 10. Sadrian, A.A., Yoon, Y.S.: A procurement decision support system in business volume discount environments. Operations Research 42(1), 14–23 (1994) 11. Stadtler, H.: A general quantity discount and supplier selection mixed integer programming model. OR Spectrum 29(4), 723–744 (2007)
A Distributed, Heterogeneous, Target-Optimized Operating System for a Multi-robot Search and Rescue Application Karl Muecke and Brian Powell National Instruments, 11500 N. Mopac Expwy., Austin, TX 78759-3504, USA {karl.muecke,brian}@ni.com http://www.ni.com
Abstract. Most solutions to complex robotic challenges require many levels of computational processes to work in harmony across heterogeneous targets. These processes can be as basic as generating a pulse-width modulated (PWM) signal, or as sophisticated as task planning. The solution’s implementation may also not be fully understood at the beginning of development, which calls for a robot operating system that allows for flexible and interactive development at all levels of the solution. We used LabVIEW1 as a robot operating system and programming language to address these challenges in solving a multi-robot search and rescue application. Keywords: Distributed OS, heterogeneous targets, target-optimized, parallel, multi-robot, search and rescue, LabVIEW.
1
Introduction
Developing a robotics system for complex problems often requires sophisticated and advanced architecture. This work describes some of our research in advanced robotics control architectures and the concepts essential for designing effective solutions. Most robotics applications use cascading control loops that are usually best deployed across heterogeneous targets (i.e. closing motor control loops on a field-programmable gate array (FPGA) and mission-planning on a high-end computer). Robotics systems that consist of multiple robots require even more sophisticated and distributed architectures. Processes used in the applications must often run in parallel in order to be efficient and be deterministic in order to implement effective and reliable control. Simulation can reduce the development time of robotics applications, but requires a level of hardware abstraction that allows the same code to run both in the real world and in simulation. Combining cascading control loops, retargetable heterogeneous code, parallel and distributed control processes, and hardware abstraction creates an environment 1
Copyright © 2010 National Instruments Corporation.
Fig. 1. The LabVIEW Robotics Starter Kit. The kit includes a mobile platform, which interfaces with NI Single-Board RIO (9631)–a real-time controller and FPGA with integrated analog and digital I/O.
and operating system well suited for developing complex robotics applications quickly and reliably. In order to test our ideas for advanced robotics control architectures, we created an advanced robotics problem to solve. We formulated a hypothetical ‘search and rescue’ scenario where multiple robots work together in order to map out an unknown area. In addition to researching advanced architectures, we also wanted to test the feasibility of using our LabVIEW Robotics Starter Kit (Fig. 1) as a robotics platform to solve complex robotics problems. In brief, the architecture we developed has a number of key characteristics, including:
– Independent, parallel, and deterministic processes
– A flexible server-centric communication architecture
– Retargetable and optimized code for heterogeneous targets
– A hardware abstraction layer that allows seamless integration with simulation
1.1
Background
Advanced architectures are not new for robotics [1,2]. In fact, one can consider a collection of architectures and a consistent approach in implementing them as a robot operating system or middleware. Middleware typically refers to hardware abstraction, but it can also contain other programming tools that make it more like an operating system. Middleware. This section summarizes a sample of robot middleware. Robot Operating System (ROS) is an open-source, meta-operating system for robots [3]. ROS has three main components: Filesystem, Computation Graph, and Community. The filesystem allows developers to organize the code into packages and better handle dependencies. The computation graph describes the peer-to-peer network processes and relationships at run-time. The communication can be configured as a publish/subscribe model or a request-based protocol. ROS also has tools for monitoring and logging the computation graph.
The community in ROS allows developers to make code available in repositories, which integrate into the ROS framework. MOOS uses the backseat driver paradigm and focuses on behavior-based autonomy [4]. The backseat driver design philosophy separates vehicle navigation and control from autonomy. ‘The vehicle manufacturer provides a navigation and control system capable of streaming vehicle position and trajectory information to the main vehicle computer, and accepting a stream of autonomy decisions such as heading, speed, and depth in return. Exactly how the vehicle navigates and implements control is largely unspecified to the autonomy system running in the payload.’ In MOOS, every application has a single connection to a single MOOS Database. All communication is initiated by the client and must go through the server. What makes MOOS more of an OS is the format of messages, which are sent as human readable strings. This allows developers to use MOOS as an interactive operating system as opposed to just a communication architecture. Miro is robotics middleware based on Common Object Request Broker Architecture (CORBA). It aims to be an object-oriented open architecture that provides hardware and operating system abstraction across heterogeneous platforms using a client-server systems design for communications [5]. Miro is divided into three architectural layers: Device Layer, a platform-dependent I/O interface/abstraction; Service Layer, a network-based platform-independent abstraction of the device layer; and the Class Framework, a collection of function modules. Player is a robot device interface that defines a set of standard interfaces for interacting with robotic devices [6]. Though other transport mechanisms are under development, TCP socket-based transport is the primary protocol. The robot is configured in the server, where all of the different components are defined. Once the server starts, all of the components run on their own driver thread on the server. Then, each driver interacts with a client program the user creates. From the user’s perspective, Player provides a simple network-based interface to the robot’s hardware. Microsoft Robotics Developer Studio (MSRDS) is a free platform for developing robotics applications [7]. MSRDS features a visual programming language (VPL) ‘that makes it easy to create robotics applications. VPL also provides the ability to take a collection of connected blocks and reuse them as [a] single block elsewhere in your program.’ MSRDS is based on web services and uses a ‘Concurrency and Coordination Runtime’ with state-oriented ‘Decentralized Software Services’ to enable handling asynchronous I/O and to create a framework based on a simple protocol between program modules. Additionally, MSRDS includes a physics-based simulator for testing applications. Examples of Advanced Robot Architectures. This section discusses some examples of advanced robot architectures. In [8], the authors address designing a robot system architecture not only from a high level, but also down to the I/O. While computing power has come a long way since its publication (1989), [8] points out the necessity of leveraging different targets for different tasks. A more powerful computer can be used for
developing code, which can then be compiled and optimized to run on a smaller embedded real-time target. Finally, to handle I/O, a general-purpose I/O board handles A/D conversion and data buffering. The very nature of robotics is a fusion of hardware, software, and electronics. Software modularity is a common practice in all programming. However, when addressing robotics, it makes sense to think about hardware modularity as well. Many robotics researchers focus just on software; however in [9], the authors show how addressing hardware as a modular resource can be beneficial for the robot’s design. Software modularity in robotics is often achieved by defining a strict communication implementation (described in the previous section on middleware). However, another approach is to create a more generic architecture that leaves room for various implementations [10,11]. This approach also makes it possible to apply the architecture to a variety of heterogeneous platforms. Multi-robot applications often have common characteristics, such as wireless communications. Specific architectures to address wireless communications include considerations for multi-hop, mesh networking, unreliable service, and bandwidth limitations [12,13,14,15]. Multi-robot applications also present interesting challenges such as distributed processing and goal cooperation. These issues need to be addressed in the robot’s architecture since they fundamentally shape the solution.
The Search and Rescue Challenge
The search and rescue challenge served as a test case for studying advanced architectures. Search and rescue and multi-agent systems are not new applications for robotics [16], but are useful as examples. The robots are autonomous with all processes running on the robot. Data transmitted to the robot was limited to: simulated GPS positions, status commands (stop/resume), and data another robot wanted to send. Imposing these restrictions helps reflect a real-world search and rescue application with robots. Since we used the LabVIEW Robotics Starter Kit (Fig. 1), we scaled down the physical size of the challenge. Scale model buildings and cinder blocks represented buildings and debris that might be found in an unstructured environment (Fig. 2). The technical details of the application, its implementation, and code can be found in [17].
2
Approach
We designed an advanced robotics control architecture that is extensible to many different applications. We also tested implementing the architecture using the combination of National Instruments hardware and the LabVIEW programming language [18], and a search and rescue scenario as a test case. The following sections detail some aspects of these approaches that work well for sophisticated
Fig. 2. Image of a scaled unstructured environment used for simulating a search and rescue challenge
robotics problems. The presence of all of these different aspects forms a very strong and powerful robot operating system that simplifies many aspects of developing complex robotics systems.
Independent, Parallel, and Deterministic Processes
Due to the complex nature of robotics problems, independence is a desirable attribute for processes. In this instance, independence means that a process can run regardless of the run states of any other process. In essence, changes to the code of one process can occur without needing to stop or recompile others. Additionally, the independence and separation of processes helps maintain stability for the entire application. The ability to isolate a single process, stop it, debug it, and restart it while running the rest of the application is very valuable for reducing development time. This independence also helps reduce development time when multiple developers work on the application. The developers can work on different portions of code without breaking or stalling others’ progress. Figure 3 shows some of the processes in the example multi-robot application as well as their relationships. For the application, the processes’ independence meant that while the entire application was running, ‘Localization,’ for example, could be stopped, changed, recompiled, and restarted without affecting any other process. Figure 3 also shows some of the parallelism required for this application. As in most sophisticated robotics applications, a cascading control loop design is required in order to control the system. One loop plans a path for the robot to follow, another loop generates steering commands based on obstacle data and the generated path, another loop translates steering commands to motor velocity commands, another loop runs PID control using the velocity command to calculate an applied torque for the motors, and finally, a loop generates a PWM signal for the motor drivers based on the torque command. All of these processes must run in parallel in order for the entire system to work (in addition to a host of other processes also running in parallel, such as communications, obstacle detection, mapping, etc).
Fig. 3. Diagram showing some of the processes and relationships for the search and rescue application
Not only do processes like PWM control or PID velocity control need to run in parallel, they also need to be deterministic and run in real-time. Determinism is essential when acquiring data or controlling a system. Creating parallel and deterministic processes can be challenging. However, the intuitive visual programming language of LabVIEW makes it very easy to create parallel processes (Fig. 4). Additionally, the LabVIEW Real-Time and LabVIEW FPGA modules act as components of the robot’s operating system by providing a way to create truly parallel and deterministic loops. We were able to read motor encoder values, perform PID control, generate PWM motor control signals, and detect obstacles on the FPGA in a truly parallel and deterministic fashion. All other processes on the robot ran in the real-time OS on the NI Single-Board RIO.
Fig. 4. LabVIEW block diagram showing an example of parallel loops. Each node represents a function call. The ‘wires’ represent data passed between functions. The frames around the diagrams are loops that control iteration.
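LabVIEW’s graphical loops cannot be reproduced in running text, but the cascading structure they implement can be sketched with ordinary Python threads. The sketch below is purely illustrative (the rates, gain and shared-variable mechanism are invented and bear no relation to the FPGA/real-time implementation, which is what actually provides the determinism discussed above).

import threading
import time

class Cascade:
    """Toy two-level cascade: a slow steering loop writes a velocity setpoint and a
    fast inner loop tracks it, emitting a duty cycle for a (pretend) PWM stage."""
    def __init__(self):
        self.velocity_setpoint = 0.0
        self.measured_velocity = 0.0       # would come from the encoders in a real system
        self.duty_cycle = 0.0
        self.running = True

    def steering_loop(self, period=0.1):          # outer loop, roughly 10 Hz
        while self.running:
            self.velocity_setpoint = 0.5          # placeholder steering decision
            time.sleep(period)

    def velocity_loop(self, period=0.001, kp=0.8):    # inner loop, roughly 1 kHz
        while self.running:
            error = self.velocity_setpoint - self.measured_velocity
            self.duty_cycle = max(0.0, min(1.0, kp * error))
            time.sleep(period)

cascade = Cascade()
for loop in (cascade.steering_loop, cascade.velocity_loop):
    threading.Thread(target=loop, daemon=True).start()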
2.2
A Server-Centric Communication Architecture
When designing applications involving multiple robots–especially when robots cannot be physically wired together–a communication architecture is necessary for the robots to pass information. Some architectures use a swarm approach for multi-robot applications. However, since this particular application was relatively small we decided to take a server-centric approach like many others ([4,6,3]) in order to save time. All communication messages were sent to a server, which then relayed messages to the appropriate destination. Figure 5 shows how the server creates a connection handler for each client or process that wants to communicate with another process or client. The connection handler passes messages to an arbiter that routes outgoing messages to the appropriate handler to send out to its client. The arbiter keeps track of the publish-subscribe relationship between clients.
Fig. 5. Diagram of the server communication relationship with clients
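The server-side bookkeeping implied by Fig. 5 can be sketched in a few lines of Python; this is our own illustration, and the class, method and message names are invented rather than taken from the implementation.

from collections import defaultdict

class Arbiter:
    """Routes messages between connection handlers and tracks publish-subscribe relationships."""
    def __init__(self):
        self.handlers = {}                      # client id -> connection handler
        self.subscribers = defaultdict(set)     # topic -> set of subscribed client ids

    def register(self, client_id, handler):
        self.handlers[client_id] = handler

    def subscribe(self, client_id, topic):
        self.subscribers[topic].add(client_id)

    def publish(self, sender_id, topic, payload):
        # Relay an outgoing message to every handler whose client subscribed to the topic
        for client_id in self.subscribers[topic]:
            if client_id != sender_id:
                self.handlers[client_id].send(topic, payload)

Each connection handler would run on its own thread and call publish() for incoming client messages, which is the routing role the arbiter plays in the figure.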
This communication architecture worked well for this particular use case because the number of robots was relatively small and we could assume that all of the clients could communicate with the server. For larger applications that involve hundreds or thousands of robots, a server-centric approach is less tractable. Additionally, if the robots do not have a reliable connection to the server (i.e. wireless communication range limitations), an ad-hoc network or swarm network would be more appropriate (i.e. 802.11s [14] or [12,11]). 2.3
Retargetable and Optimized Code for Heterogeneous Targets
Since processes run with different rates, priorities, dependencies, and computational requirements, it follows that the processes would be targeted to different
devices. However, what may be less obvious at the beginning of development, and especially under development time constraints, is determining on which target a process could or should run. Ideally, the code should be portable to any target without modification. This flexibility allows the developer to better distribute the computing load and make use of the advantages offered by the different targets. Consider a steering algorithm as an example. The process could run on an FPGA in parallel with PID motor velocity control. However, if no room is left on the FPGA after coding, it would be convenient if the steering could just as easily execute on a realtime operating system that communicates with the FPGA, thereby freeing up space on the FPGA. Furthermore, if there is not enough processing power on the real-time target, the steering should be able to run on a desktop machine. Where exactly the steering code is deployed is less important than the fact that it runs. However, we would always like to maximize performance, in which case moving the steering process as close to the I/O pins as possible is best (i.e. on the FPGA). Writing a VHDL, real-time, and desktop version for every process is not very practical. Instead, by using something like a visual programming language, the visual code can be used as the basis for compiling the appropriate code for the target you ultimately choose. Using a compiler for each target also allows for more optimized code when deployed. Since the compiler interprets the graphical representation as a compile step, it makes sense that it can optimize the deployed code as well. Frequently, developers may try to use the ‘backseat driver’ paradigm when considering their robotic system. In this case, the ‘intelligent’ code is completely separated from the code that touches the hardware. This may make design more simple, but it fails to take advantage of the processing power and distributed control of the embedded targets, which are often what touch the hardware. Because of the reliable nature of the hardware that interfaces with I/O, it makes more sense to try and run other code at the same level to take advantage of the reliable operation and determinism. Fortunately, LabVIEW offers a simple solution by allowing the developer to use the same graphical code for FPGA, real-time, or desktop deployment. When deploying the code, LabVIEW optimizes the compilation for the target. The LabVIEW environment is designed for this kind of use case, allowing users to drag their code from a desktop target to a real-time or FPGA target, and hit run. For the search and rescue application, we were able to run obstacle detection, PWM modulation, and PID velocity control on the FPGA. We originally targeted mapping to the FPGA as well, but found its computational requirements compromised other processes on the FPGA. Fortunately, to run the code in real-time, all we had to do was change the target–leaving the code unchanged. 2.4
Hardware Abstraction and Simulation
One of the most common needs for robot development is simulation. Simulation allows developers to test code much faster than if using actual hardware.
However, the development time gained from testing in simulation can easily be lost if not thought out properly. Ideally, the exact same code can run either on the actual robot or in a simulated state; the only change being the specified target. This would give all the benefits of rapid testing in simulation, without sacrificing time getting the code to work with hardware. We developed a hardware abstraction layer to address this issue. From a coding perspective, every I/O software interface had a general implementation, which could take the form of real hardware or simulation. Therefore we could simulate any sensor input and any motor output. This allowed us to test algorithms like mapping, steering, and path planning in simulation, and then use the exact same code on the hardware (Fig. 6). With the combination of a hardware abstraction layer, simulated I/O, and retargetable code, the transition from running the entire application on a desktop to deployment on the robot was seamless.
Fig. 6. Screen shot of the physics-based simulation created in LabVIEW used to test the multi-robot search and rescue application
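As a rough sketch of what such a hardware abstraction layer looks like in code (our own illustration in Python; the interface, class names and backend calls are hypothetical, not the LabVIEW implementation), the same mapping or steering code can be handed either backend:

from abc import ABC, abstractmethod

class RangeSensor(ABC):
    """General I/O interface: algorithms depend only on this, never on a backend."""
    @abstractmethod
    def read_distances(self):
        ...

class HardwareRangeSensor(RangeSensor):
    def __init__(self, device):
        self.device = device                     # handle to the real sensor driver
    def read_distances(self):
        return self.device.scan()                # hypothetical driver call

class SimulatedRangeSensor(RangeSensor):
    def __init__(self, world, pose):
        self.world, self.pose = world, pose
    def read_distances(self):
        return self.world.ray_cast(self.pose)    # hypothetical physics-engine query

def build_sensor(simulate, **kwargs):
    # Mapping or steering code receives a RangeSensor and cannot tell which backend it got
    return SimulatedRangeSensor(**kwargs) if simulate else HardwareRangeSensor(**kwargs)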
3
Results and Conclusions
We were able to create independent, parallel, and deterministic processes that were retargetable and optimized for heterogeneous targets as well as a simulated environment, using LabVIEW as the robot’s operating system and programming language, and the LabVIEW Robotics Starter Kit as a hardware platform. In very little time we developed, tested, and deployed a multi-robot search and rescue application successfully using the aforementioned architectures and strategies. Videos and descriptions of the application can be found in [17].
References 1. Brooks, R.A.: New Approaches to Robotics. Science 253, 1227–1232 (1991) 2. Coste-Maniere, E., Simmons, R.: Architecture, the backbone of robotic systems. In: Proceedings of the IEEE International Conference on Robotics and Automation, ICRA 2000, vol. 1, pp. 67–72 (2000) 3. Cousins, S.: Welcome to ROS Topics [ROS Topics]. IEEE Robotics Automation Magazine 17, 13–14 (2010) 4. Smidt, H., Arjuna alasuriya, T.S.: MOOS-IvP Autonomous Command and Control for Undersea Network Nodes (2008) 5. Utz, H., Sablatnog, S., Enderle, S., Kraetzschmar, G.: Miro - middleware for mobile robot applications. IEEE Transactions on Robotics and Automation 18, 493–497 (2002) 6. Gerkey, B.P., Vaughan, R.T., Howard, A.: The Player/Stage Project: Tools for Multi-Robot and Distributed Sensor Systems. In: Proceedings of the 11th International Conference on Advanced Robotics, pp. 317–323 (2003) 7. Microsoft, Product Information (2010), http://www.microsoft.com/robotics/ 8. Ish-Shalom, J., Kazanzides, P.: SPARTA: multiple signal processors for highperformance robot control. IEEE Transactions on Robotics and Automation 5, 628–640 (1989) 9. Roh, S.g., Yang, K.W., Park, J.H., Moon, H., Kim, H.S., Lee, H., Choi, H.R.: A modularized personal robot DRP I: design and implementation. Trans. Rob. 25, 414–425 (2009) 10. Stewart, D., Volpe, R., Khosla, P.: Design of dynamically reconfigurable real-time software using port-based objects. IEEE Transactions on Software Engineering 23, 759–776 (1997) 11. Bicchi, A., Danesi, A., Dini, G., La Porta, S., Pallottino, L., Savino, I., Schiavi, R.: Heterogeneous Wireless Multirobot System. IEEE Robotics Automation Magazine 15, 62–70 (2008) 12. Elston, J., Frew, E.: Hierarchical distributed control for search and tracking by heterogeneous aerial robot networks. In: IEEE International Conference on Robotics and Automation, ICRA 2008, pp. 170–175 (2008) 13. Rybski, P., Stoeter, S., Gini, M., Hougen, D., Papanikolopoulos, N.: Performance of a distributed robotic system using shared communications channels. IEEE Transactions on Robotics and Automation 18, 713–727 (2002) 14. IEEE: IEEE Draft STANDARD for Information Technology–Telecommunications and information exchange between systems–Local and metropolitan area networks– Specific requirements Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications Amendment 10: Mesh Networking. IEEE Unapproved Draft Std P802.11s/D4.0 (December 2009) 15. Friedmann, M., Kiener, J., Petters, S., Thomas, D., von Stryk, O.: Modular software architecture for teams of cooperating, heterogeneous robots. In: IEEE International Conference on Robotics and Biomimetics, ROBIO 2006, pp. 613–618 (2006) 16. Balch, T., Boone, G., Collins, T., Forbes, H., MacKenzie, D., Santamaria, J.C.: Io, Ganymede and Callisto – a Multiagent Robot Trash-collecting Team. AI Magazine 16(2), 39–51 (1995) 17. Muecke, K., Steven Moser, C.M.: NIWeek 2010 Robotic Swarm Demo (2010), http://decibel.ni.com/content/docs/DOC-13031 18. National Instruments, Product Information: What is NI LabVIEW? (2010), http://www.ni.com/labview/whatis/
A Heuristic for the Multiple Container Loading Cost Minimization Problem Chan Hou Che1 , Weili Huang1 , Andrew Lim1 , and Wenbin Zhu2, 1
2
Department of Management Sciences, City University of Hong Kong, Tat Chee Ave, Kowloon Tong, Hong Kong
[email protected],
[email protected],
[email protected] Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
[email protected]
Abstract. We examine the multiple container loading cost minimization problem (MCLCMP), where the objective is to load products of various types into containers of various sizes so as to minimize the total container cost. We transform the MCLCMP into an extended set cover problem that is formulated using linear integer programming and solve it with a heuristic to generate columns. Experiments on standard bin-packing instances show our approach is superior to prior approaches.
Keywords: Packing; Heuristics; Container Loading; Integer Programming.
1
Introduction
Our team was contracted by a buying agent for a large multi-national retailer to investigate better ways to formulate packing plans for the loading of goods into multiple containers of various standard sizes, where the containers have different shipping costs. The task is to load all items while minimizing the total shipping cost. We refer to the problem as the multiple container loading cost minimization problem (MCLCMP). The MCLCMP is defined as follows. We are given an unlimited number of rectangular containers of M types, represented by C1 , C2 , ..., CM , that differ in dimensions and cost. The cost for each type is given by c1 , c2 , ..., cM . We are also given a number of rectangular boxes of N types with different dimensions, represented by B1 , B2 , ..., BN ; there are nj , 1 ≤ j ≤ N available boxes of type j. The objective is to produce a set of packing plans such that all boxes are orthogonally packed into the containers and the total cost of containers used is minimized. In this paper, we do not consider supporting area constraints although our approach can be modified to do so. Earlier literature often recommended ways to adapt procedures for the well-analyzed single container loading problem (SCLP) for multiple containers. Possible strategies include the sequential strategy, where single containers are filled
Corresponding author.
in turn using SCLP approaches [4,6]; the pre-assignment strategy, which first assigns boxes to containers before loading [11]; and the simultaneous strategy, which considers multiple containers during the loading of boxes [3]. However, there is little existing literature that examines the MCLCMP and its variants directly; the most recent study was done by Eley [4] who proposed a set cover formulation for the MCLCMP. The integer linear programming model is solved using a bottleneck assignment approach. Our approach is an extension of the IP model proposed by Eley. We add a new parameter to this formulation that controls the percentage of boxes to be packed, and perform a binary search on this variable. In addition, we devise three new strategies to pre-generate packing patterns for single containers along with a subroutine that augments the set of packing patterns while searching for solutions for the MCLCMP.
2
Set Cover Formulation for the MCLCMP
We assume that there is a set P of single container packing patterns indexed by i, and the box types are indexed by j. Each packing pattern pi fills a container with associated cost ci with bij boxes of type j. Let nj be the number of boxes to be packed for box type j, and let xi be the integer decision variable for the i-th column that represents the number of packing patterns pi used. We introduce a loading factor a and parameterize the model as follows:

SC(a):  z = min Σ_{pi∈P} ci xi                                  (1)
        s.t.  Σ_{pi∈P} bij xi ≥ a · nj,  ∀ j                    (2)
              xi ≥ 0 and integer,  i = 1, . . . , |P|           (3)
where the objective function (1) computes the total cost of all selected packing patterns, and the inequality (2) ensures that the selected packing patterns have sufficient space to pack the boxes of each type if a = 1.0. The loading factor a is the percentage of boxes that are guaranteed to be loaded. For example, the solution found by SC(0.97) will guarantee that at least 97% of each type of box can be loaded into the containers selected. It may be possible to load the unguaranteed 3% into the containers selected by SC(0.97), whereupon we would obtain a feasible solution of MCLCMP. We make use of this parameterized formulation as follows. First, we generate a set of P packing patterns, which are solutions to the SCLP, using three strategies that employ fast heuristics for the SCLP. We then perform binary search on the range 0.01 ≤ a ≤ 1.0 for max iter iterations, employing a commercial IP solver for each iteration. If there are boxes that are not loaded in a given iteration, we invoke a routine that attempts to load these boxes without increasing the number of containers.
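For concreteness, the parameterized model SC(a) can be written down in a few lines with a generic MIP modeler. The sketch below uses the open-source PuLP package rather than the commercial IP solver used in the paper, and the pattern data in the example is purely illustrative.

```python
# A minimal sketch of SC(a), assuming the open-source PuLP modeler (the paper
# uses a commercial IP solver).  Each packing pattern is (container_cost,
# {box_type: count}); the data below is purely illustrative.
import pulp

def solve_sc(patterns, demand, a):
    """Solve SC(a); return (total cost, copies of each pattern) or None."""
    prob = pulp.LpProblem("SC", pulp.LpMinimize)
    x = [pulp.LpVariable(f"x_{i}", lowBound=0, cat="Integer")
         for i in range(len(patterns))]
    # Objective (1): total cost of the selected packing patterns.
    prob += pulp.lpSum(cost * x[i] for i, (cost, _) in enumerate(patterns))
    # Constraint (2): cover at least a fraction a of each box type.
    for j, n_j in demand.items():
        prob += pulp.lpSum(b.get(j, 0) * x[i]
                           for i, (_, b) in enumerate(patterns)) >= a * n_j
    if prob.solve(pulp.PULP_CBC_CMD(msg=False)) != pulp.LpStatusOptimal:
        return None
    return pulp.value(prob.objective), [int(v.value()) for v in x]

patterns = [(10, {0: 4, 1: 2}), (10, {1: 5}), (6, {0: 3})]  # hypothetical columns
print(solve_sc(patterns, {0: 7, 1: 9}, a=0.97))
```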
3 Strategies for Generating Packing Patterns
Intuitively, if we can load more boxes into the individual containers, we will obtain a better overall solution. Furthermore, having a diverse set of single container loading patterns increases the probability of obtaining a better solution to the MCLCMP, since diverse patterns are more likely to complement each other and cover each other's weaknesses. Hence, we designed three strategies that aim to quickly produce diverse and utilization-efficient packing patterns.

3.1 The G4-Heuristic Bin-Packing Strategy
In the G4-heuristic Bin-Packing (G4BP) strategy, we try to create packing patterns that load boxes with the same dimensions into a container. Given boxes of dimensions l × w × h, we can place as many such boxes as possible on the floor of the container with their l-sides in the vertical direction. This results in a horizontal layer of thickness l. Since we can also place boxes with the w- or h-side in the vertical direction, there are three types of horizontal layers, with dimensions L × W × l, L × W × w and L × W × h. We can then fill a container from the bottom up using these three types of horizontal layers. For a horizontal layer with dimensions L × W × l, the number of boxes we can place in it can be determined by solving a rectangle packing problem, which is to place as many rectangles of dimensions w × h as possible into a rectangle of dimensions L × W. This can be solved by the G4-heuristic [8]. Let Nl, Nw and Nh be the number of boxes we can place in the L × W × l, L × W × w and L × W × h horizontal layers, respectively. If we treat each layer as an item, then the SCLP becomes a one-dimensional bin-packing problem with 3 types of items of sizes Nl, Nw, and Nh. Let xl, xw, and xh be integer variables that denote the number of horizontal layers with thickness l, w and h, respectively. We can model this problem using linear integer programming:

G4BP:
\[ z = \max \; N_l x_l + N_w x_w + N_h x_h \tag{4} \]
\[ \text{s.t.} \; l \cdot x_l + w \cdot x_w + h \cdot x_h \le H \tag{5} \]
\[ x_l, x_w, x_h \ge 0 \text{ and integer} \tag{6} \]
The objective (4) is to maximize the total number of boxes in all layers in a container, subject to the constraint (5) that the sum of the heights of all layers does not exceed the height of the container. Note that the layers can also be built on the (W, H) face along the length direction, or on the (L, H) face along the width direction. For a given box type and container type, the G4BP strategy solves the above integer program three times, once for each direction, and returns the best result.
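Because the model has only three integer variables, it can also be solved by direct enumeration. The following sketch does exactly that; the per-layer box counts produced by the G4-heuristic are assumed to be given, and the numbers in the example are made up.

```python
# A minimal sketch of the G4BP layer-selection model (4)-(6), solved by
# brute-force enumeration instead of an IP solver.  Each entry of `layers` is
# (thickness, boxes_per_layer) for one of the three layer orientations, with
# boxes_per_layer supplied by the G4 rectangle-packing heuristic.
def g4bp_layers(H, layers):
    """Return (max total boxes, (x_l, x_w, x_h)) for container height H."""
    (tl, nl), (tw, nw), (th, nh) = layers
    best, best_counts = 0, (0, 0, 0)
    for xl in range(H // tl + 1):
        for xw in range((H - xl * tl) // tw + 1):
            xh = (H - xl * tl - xw * tw) // th      # use up constraint (5) greedily
            boxes = nl * xl + nw * xw + nh * xh     # objective (4)
            if boxes > best:
                best, best_counts = boxes, (xl, xw, xh)
    return best, best_counts

# Container height 22; layers of thickness 5, 4 and 3 hold 12, 9 and 6 boxes each.
print(g4bp_layers(22, [(5, 12), (4, 9), (3, 6)]))
```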
3.2 Sequential Application of GRASP Strategy: S-GRASP
Assuming only one type of container can be used, we can solve the MCLCMP by loading one container at a time until all boxes are loaded. This process produces a set of packing patterns for this type of container. We then repeat the process for each of the M types of container. We adapted the GRASP approach [7] to load boxes into a single container for this strategy, which we call the Sequential Application of GRASP strategy (S-GRASP). Due to space limitations, we only describe the three differences between our implementation of GRASP and the original implementation. In GRASP, boxes of the same dimensions l × w × h are first repeated n1, n2 and n3 times along each dimension to form a block of dimensions (n1 × l) × (n2 × w) × (n3 × h) (subject to the availability of the boxes and the dimensions of the free space). Loading is done block by block. When a block is placed in a free space, the remaining space is partitioned into three rectangular free space regions (c1, c2, c3 in the left diagram in Figure 1). The first difference is that we also try the additional partition shown in the right diagram in Figure 1, and we select the partition with the higher fitness. Fitness is defined as the volume of the largest free space region minus the total waste (the total volume of free space regions that are too small to hold any box). Free spaces are filled in a fixed order. For a given free space, the d blocks with the greatest volume are selected in the original GRASP, whereas we select the d blocks with the greatest fitness; this fitness is defined as the volume of the block minus the smaller of the total wasted volumes in the two possible partitions. One block is randomly picked among the d blocks. In each step, a free space is filled and at most three free spaces are created (wasted free spaces are discarded). The loading process eventually stops after K steps. In the original GRASP, the last K/2 steps are undone and redone to obtain a different solution, whereas we undo and redo the last i steps for all possible i, i.e., i = 1, 2, ..., K − 1.
3.3 Application of GRASP on Combinations Strategy
In the Application of GRASP on Combinations (C-GRASP) strategy, we wish to generate packing patterns that load exactly r, 2 ≤ r ≤ N types of boxes for each type of container using the above GRASP algorithm, which serves to diversify
Fig. 1. Two ways of generating free spaces
our set of packing patterns to include different numbers of box types. Since it is not feasible to generate packing plans for all \(\binom{N}{r}\) combinations for large N, we only generate up to R = 5 packing patterns using a probabilistic sampling strategy for each container Ci and each value of r, 2 ≤ r ≤ N. We illustrate our procedure using the following MCLCMP instance. There are two container types C1 and C2 with corresponding volumes V1 = 100 and V2 = 50. There are three types of boxes B1, B2, and B3, with n1 = 15, n2 = 20, and n3 = 25 boxes of each type available. The volume of each box type is v1 = 10, v2 = 15, and v3 = 20. For container C1 and r = 2:

1. Enumerate all possible combinations of r box types and place them into B_Set. In our example, B_Set = {(B1, B2), (B2, B3), (B1, B3)}.
2. For each combination in B_Set, assign a probability value in proportion to the total volume of all combinations (see Table 1).
3. Choose a combination in B_Set according to the probabilities, and remove it from B_Set. Assume in our example that (B1, B3) is chosen.
4. Create an instance of the SCLP with only the boxes in the chosen combination, where the number of boxes of each type is proportional to the total volume of the selected box types, rounded down. In our example, the number of boxes n'1 of type B1 is (V1/v1) · (v1 · n1)/(v1 · n1 + v3 · n3) = (100/10) × 150/(150 + 500) = 1. Similarly, the number of boxes n'3 of type B3 is (100/20) × 500/(150 + 500) = 3. If the rounded-down value for a box type becomes 0, we set it to 1.
5. Generate a packing pattern for the SCLP instance using the GRASP algorithm and add this pattern to P. Assume the packing pattern found loads 1 box of type B1 and 4 boxes of type B3 into the container.
6. Subtract the boxes used by the packing pattern from the original problem. In our example, we set n1 = 14 and n3 = 21.
7. Go back to Step 2, until R = 5 packing patterns are generated or all combinations are exhausted.

Table 1. Probability of combination selection

Combination | Total Box Volume        | Probability
(B1, B2)    | 10 × 15 + 15 × 20 = 450 | 0.281
(B2, B3)    | 15 × 20 + 20 × 25 = 500 | 0.313
(B1, B3)    | 10 × 15 + 20 × 25 = 650 | 0.406

For the special case of r = N (i.e., all box types are considered), we always generate R packing patterns. For each container type, we generate up to R = 5 packing patterns for each value of r = 2, 3, ..., N. Since there are M types of containers, the above procedure generates up to M × (N − 1) × R patterns.
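A small sketch of this sampling procedure is given below. The GRASP call is left as a stub, the helper names are hypothetical, and the data reproduces the running example; the box counts for the SCLP instance follow the volume-proportional formula of Step 4.

```python
# A minimal sketch of the C-GRASP combination sampling (Steps 1-4); the GRASP
# call of Step 5 is stubbed and Step 6 (updating the remaining boxes) is
# omitted.  Volumes and counts reproduce the running example.
import math
import random
from itertools import combinations

def sample_combinations(V, vols, counts, r, R=5, grasp=None):
    b_set = list(combinations(sorted(vols), r))                    # Step 1
    patterns = []
    while b_set and len(patterns) < R:
        totals = [sum(vols[t] * counts[t] for t in comb) for comb in b_set]
        weights = [tot / sum(totals) for tot in totals]            # Step 2
        comb = random.choices(b_set, weights=weights, k=1)[0]      # Step 3
        b_set.remove(comb)
        total = sum(vols[t] * counts[t] for t in comb)
        sclp_counts = {t: max(1, math.floor((V / vols[t]) * vols[t] * counts[t] / total))
                       for t in comb}                              # Step 4
        patterns.append(grasp(sclp_counts) if grasp else sclp_counts)  # Step 5 (stub)
    return patterns

vols = {"B1": 10, "B2": 15, "B3": 20}
counts = {"B1": 15, "B2": 20, "B3": 25}
print(sample_combinations(V=100, vols=vols, counts=counts, r=2))
```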
Algorithm 1. SCBS
1: Generate the set of packing patterns P
2: a ← 1.0, al ← 0.01, ah ← 1.0, iter ← 0
3: while iter < max iter do
4:   B ← set of boxes to be loaded
5:   Solve SC(a); let S be the list of selected packing patterns sorted by space utilization in descending order
6:   for all packing patterns s ∈ S do
7:     Load a container with boxes based on s
8:     Remove loaded boxes from B
9:   end for
10:  Cm ← containers loaded with multiple box types, sorted by space utilization in descending order
11:  Cs ← containers loaded with only one box type, sorted by space utilization in descending order
12:  If B is not empty, then invoke Insert(Cm, P, B)
13:  If B is not empty, then invoke Insert(Cs, P, B)
14:  If B is not empty, then al ← a; else ah ← a and record the solution
15:  a ← (ah + al)/2, iter ← iter + 1
16: end while
4 Binary Search on Loading Factor
The idea behind our technique is to solve SC(a) to provide an estimate of the number of containers needed to guarantee successful loading of a percentage of the input boxes. We then attempt to insert the remaining boxes into the unused capacity of the selected containers using the subroutine Insert. If we are able to successfully load all boxes, then we have found a solution to the original MCLCMP instance and we reduce the loading factor a; otherwise, we increase a. We employ a binary search procedure that terminates after max iter iterations (for our implementation, we set max iter = 8). See Algorithm 1. A set of packing patterns P is first generated using the three strategies described in Section 3. We then perform binary search on the range [al, ah] for the loading factor a, with initial values al = 0.01 and ah = 1.0. We solve the model using a standard IP solver; the solution has a corresponding list of selected packing patterns S, which we sort by space utilization in descending order. We then load containers with boxes according to the patterns in S. At the end of this process, there may be boxes left over. If so, we invoke the Insert subroutine, first on containers with multiple box types, then on containers containing only one type of box. Finally, we update the binary search range. The aim of the Insert subroutine is to attempt to load the remaining boxes into the selected containers. Given the set of leftover boxes, the subroutine unloads each container in turn and uses the GRASP algorithm to reload the container inclusive of the leftover boxes. Over the course of this process, new packing patterns might be discovered, which we add to our set of packing patterns P.
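The overall control flow of SCBS can be summarized in a few lines. The sketch below follows Algorithm 1 with the IP solve, the pattern-based loading step, and the Insert subroutine abstracted behind hypothetical callables.

```python
# A minimal sketch of the SCBS binary search on the loading factor a
# (Algorithm 1).  solve_sc, load_containers and insert_leftovers are
# placeholders for the IP solve, the loading step and the Insert subroutine.
def scbs(patterns, boxes, solve_sc, load_containers, insert_leftovers, max_iter=8):
    a_low, a_high, a = 0.01, 1.0, 1.0
    best = None
    for _ in range(max_iter):
        selected = solve_sc(patterns, boxes, a)              # columns chosen by SC(a)
        containers, leftover = load_containers(selected, boxes)
        if leftover:                                         # try to squeeze the rest in
            leftover = insert_leftovers(containers, patterns, leftover)
        if leftover:                                         # infeasible: raise a
            a_low = a
        else:                                                # feasible: record, lower a
            a_high = a
            best = containers
        a = (a_high + a_low) / 2.0
    return best
```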
5 Results and Analysis
Our algorithm, which we label SCBS, was implemented as a sequential algorithm in Java, and no multi-threading was explicitly utilized. It was executed on an HP server equipped with one Intel Xeon E5430 CPU clocked at 2.66 GHz and 8 GB RAM, running CentOS 5.4. Sun Microsystems' JDK 1.6.0 update 22 and ILOG CPLEX 11.0 were installed. Both Java and CPLEX are 32-bit editions.

5.1 Results on Bin-Packing Instances
Ivancic et al. [5] proposed 47 instances of the bin-packing problem, which have since become a standard benchmark test set for multiple container loading problems. Only one container type is used, and the objective is to minimize the number of containers used to pack all the boxes. This can be considered a special case of the MCLCMP with the cost for the single container type c1 = 1. We compared our SCBS approach with the following algorithms:

– IMM: an integer programming based approach by Ivancic et al. [5]; the result is extracted from the article by Bischoff and Ratcliff [1].
– BO: the approach by Bortfeldt [2].
– seq: the sequential version of the approach by Eley [3].
– sim: the simultaneous version of the approach by Eley [3].
– bot: the set cover based bottleneck approach of Eley [4].
– IC: the Iterated Construction approach of Lim and Zhang [6].
– SA: the meta-heuristic (local search, SA) approach by Takahara [9].
– MS: the multi-start local search approach by Takahara [10].

Due to space limitations, the detailed experimental results are not reported here. They can be found at http://www.zhuwb.com/mclcmp. The important findings are as follows: for 45 out of the 47 instances, our result matches the best result in the literature. For instance No. 23, SCBS improves the best known result by 1, while it is worse than the best known result by 1 for instance No. 28. Over all test instances, the total number of bins required by SCBS (692) is the least of all the approaches tested; the second best approach, IC, required 694 bins. In particular, SCBS uses 7 fewer bins than the original set cover based bottleneck approach from which our approach is adapted. This is an indication of the effectiveness of our modifications to the IP formulation and packing pattern generation strategies. Although the SCBS algorithm was developed for the MCLCMP rather than this specific bin-packing variant, it is able to slightly improve on existing techniques for the bin-packing problem. Importantly, it is unclear how these existing algorithms can be adapted to the MCLCMP.

5.2 Results on Generated MCLCMP Instances
At the time of writing, there are no MCLCMP benchmark test cases available. Hence, we randomly generated a set of 350 MCLCMP instances with known
optimal solutions. Due to space limitations, we omit the generation procedure and the proof of solution optimality. The test cases, optimal solutions and our detailed experimental results can be found at http://www.zhuwb.com/mclcmp. Table 2 provides the results for the SCBS approach on the 350 newly generated MCLCMP instances. The new instances are divided into seven sets mtc3 - mtc9 of 50 instances each (mtc stands for “multiple-container test case” and the suffix gives the number of box types in each instance in that set). The #box column gives the average number of boxes in the problem set. The optimal column shows the average optimal cost of the 50 instances. The opt.util column provides the average volume utilization of the containers in the optimal solutions. The gap column gives the average percentage difference between the SCBS solution and the optimal solution, computed as (cost − optimal cost)/optimal cost. The vol.util column provides the average volume utilization of the containers in the solutions found by SCBS. The column t(s) gives the average time taken by SCBS in CPU seconds. Finally, the last row of the table provides the average over the 350 instances.

Table 2. Results of SCBS on the newly generated MCLCMP instances

set     | #box | optimal | opt.util | gap   | vol.util | t(s)
mtc3    | 1828 | 11704   | 99.30%   | 5.45% | 95.24%   | 727
mtc4    | 1871 | 12234   | 99.26%   | 5.43% | 94.50%   | 1282
mtc5    | 1890 | 12548   | 99.29%   | 6.78% | 93.65%   | 2040
mtc6    | 1930 | 12282   | 99.28%   | 7.66% | 92.78%   | 3139
mtc7    | 2136 | 13496   | 99.33%   | 6.94% | 93.17%   | 4303
mtc8    | 1734 | 11540   | 99.31%   | 8.81% | 91.81%   | 5126
mtc9    | 1886 | 12572   | 99.34%   | 9.00% | 91.59%   | 6782
average |      |         | 99.30%   | 7.15% | 93.25%   | 3343
The results show that the average gap between the SCBS approach and the known optimal solution over all 7 sets of instances is 7.15%, and this gap increases as the number of box types increases. This is not a small difference, which shows that although SCBS outperforms existing algorithms for the standard binpacking test problems, it is only moderately successful when solving MCLCMP instances. Also note that the average volume utilization of the SCBS solutions is about 93.25% for test instances with 8 or 9 box types even though the optimal solution has a volume utilization of over 99.30%. It is reasonable to say that the MCLCMP is a difficult problem that deserves further research. Table 3 examines the effect of parameterizing the set cover formulation. If SCBS stops with a < 1, we can conclude that the introduction of the loading factor has helped to find a better solution. The number of instances improved by binary search (out of 50) is given in the column #improved. We can see that at least 60% of the instances are improved by binary search (column %improved ).
Table 3. Effectiveness of loading factor

set  | #improved | %improved | impr  | impr2
mtc3 | 30        | 60.00%    | 1.19% | 2.21%
mtc4 | 30        | 60.00%    | 2.15% | 2.82%
mtc5 | 36        | 72.00%    | 2.11% | 2.45%
mtc6 | 41        | 82.00%    | 2.10% | 2.63%
mtc7 | 42        | 84.00%    | 2.36% | 2.94%
mtc8 | 44        | 88.00%    | 2.40% | 3.33%
mtc9 | 48        | 96.00%    | 2.08% | 2.74%
avg  | 38.7      | 77.43%    | 2.06% | 2.73%
For each instance, the improvement due to binary search is calculated as the gap to optimal before binary search less the gap to optimal after binary search. For a set of instances, the average improvement is given in column impr. If we only consider instances that are actually improved by binary search, the average improvement to the quality of the solutions is given in column impr2. For six of the seven test sets (except mtc3), the introduction of binary search reduced the average gap to optimal by about 2% (column impr). This reduction is a significant contribution to the performance of SCBS since without the binary search phase the average gap to optimal would increase from 7% to 9%.
6 Conclusion
This paper studies the multiple container loading cost minimization problem, which has the practical application of modeling the process whereby the buying agent for a multi-national retailer redistributes products after inspection. We added a loading factor parameter to the set cover formulation by [4] to exploit the excess capacity of the chosen containers inherent in the formulation. Combined with three new strategies to find efficient packing patterns, a binary search on the loading factor produces a solution that is superior to the original model. Our SCBS approach is designed to handle multiple containers, and makes use of SCLP algorithms to do so. As new and better SCLP approaches are developed, they can be incorporated into the SCBS approach in order to find good packing patterns. The type of constraints that can be added to the basic MCLCMP is dependent on the constraints that the SCLP sub-components can handle. For example, although the S-GRASP and C-GRASP strategy can handle the supporting area constraint, the G4BP strategy is unable to do so. If we replace the G4BP strategy with a technique that can handle the supporting area constraint (or remove it entirely), then SCBS will be able to handle this constraint for the MCLCMP.
References

1. Bischoff, E.E., Ratcliff, M.S.W.: Issues in the development of approaches to container loading. OMEGA the International Journal of Management Science 23(4), 377–390 (1995), http://dx.doi.org/10.1016/0305-0483(95)00015-G
2. Bortfeldt, A.: A heuristic for multiple container loading problems. OR Spectrum 22(2), 239–261 (2000), http://dx.doi.org/10.1007/s002910050104
3. Eley, M.: Solving container loading problems by block arrangement. European Journal of Operational Research 141(2), 393–409 (2002), http://dx.doi.org/10.1016/S0377-2217(02)00133-9
4. Eley, M.: A bottleneck assignment approach to the multiple container loading problem. OR Spectrum 25(1), 45–60 (2003), http://dx.doi.org/10.1007/s002910200113
5. Ivancic, N., Mathur, K., Mohanty, B.B.: An integer programming based heuristic approach to the three-dimensional packing problem. Journal of Manufacturing and Operations Management 2, 268–298 (1989)
6. Lim, A., Zhang, X.: The container loading problem. In: SAC 2005: Proceedings of the 2005 ACM Symposium on Applied Computing, pp. 913–917. ACM, New York (2005), http://dx.doi.org/10.1145/1066677.1066888
7. Moura, A., Oliveira, J.F.: A GRASP approach to the container-loading problem. IEEE Intelligent Systems 20(4), 50–57 (2005), http://dx.doi.org/10.1109/MIS.2005.57
8. Scheithauer, G., Terno, J.: The G4-heuristic for the pallet loading problem. Journal of the Operational Research Society 47(4), 511–522 (1996), http://dx.doi.org/10.2307/3010727
9. Takahara, S.: A simple meta-heuristic approach for the multiple container loading problem. In: SMC 2006: IEEE International Conference on Systems, Man and Cybernetics, vol. 3, pp. 2328–2333 (October 2006), http://dx.doi.org/10.1109/ICSMC.2006.385210
10. Takahara, S.: A multi-start local search approach to the multiple container loading problem. In: Bednorz, W. (ed.) Advances in Greedy Algorithms, pp. 55–68. I-TECH Education and Publishing, Vienna (2008)
11. Terno, J., Scheithauer, G., Sommerweiß, U., Riehme, J.: An efficient approach for the multi-pallet loading problem. European Journal of Operational Research 123(2), 372–381 (2000), http://dx.doi.org/10.1016/S0377-2217(99)00263-5
A Skyline-Based Heuristic for the 2D Rectangular Strip Packing Problem

Lijun Wei1, Andrew Lim1, and Wenbin Zhu2

1 Department of Management Sciences, City University of Hong Kong, Tat Chee Ave, Kowloon Tong, Hong Kong
[email protected], [email protected]
2 Department of Computer Science, Hong Kong Univ. of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
[email protected]

Abstract. In this paper, we propose a greedy heuristic for the 2D rectangular packing problem that represents packings using a skyline. To solve the 2D rectangular strip packing problem, we make use of this heuristic as a subroutine in an iterative deepening binary search on the maximum height of the packing that incorporates a tabu search mechanic. Our approach outperforms all existing approaches on several sets of standard benchmark test cases for the 2D strip packing problem.

Keywords: Packing; Heuristics; Tabu Search.
1 Introduction
The 2D rectangular strip packing problem (2DSP) is a fundamental problem in cutting and packing literature. We are given a set of n rectangular boxes with dimensions wi × hi , i = 1, . . . , n, and all dimensions are integral. The task is to orthogonally pack all boxes without overlap into a rectangular container of width W such that the height H of the packing is minimized. There are two variants of the problem: the rotatable variant allows the boxes to be rotated by 90◦ , while the fixed orientation variant does not allow rotation. The 2DSP is NP-Hard, and it is usually solved in practice using heuristics. In this paper, we present a solution approach that solves the 2DSP using a greedy heuristic for the 2D rectangular packing problem (2DRP). The 2DRP is similar to the 2DSP, except that the containing rectangle is of dimensions W × H and the task is to maximize the total volume of boxes packed. Our solution performs an iterative deepening binary search on the height of the container, invoking the 2DRP heuristic several times in each iteration according to a tabu search; this approach can be considered a new and generic technique for solving the 2DSP.
2 Heuristic for the 2DRP
In this section, we present a new greedy heuristic for the 2D rectangular packing problem for use in our 2DSP approach. This deterministic heuristic takes as
input a sequence of rectangular boxes, the dimensions W × H of the containing rectangle, and a controlling parameter max spread. If all the boxes can be packed, the heuristic returns the packing pattern, otherwise it returns a failure message.

2.1 Skyline Representation of a Packing Pattern
We represent a current packing pattern by a rectilinear skyline, which is the contour of a set of consecutive vertical bars. It can be expressed as a sequence of k horizontal line segments (s1, s2, . . . , sk) satisfying the following properties: (1) the y-coordinate of sj is different from the y-coordinate of sj+1, j = 1, . . . , k − 1; and (2) the x-coordinate of the right endpoint of sj is the same as the x-coordinate of the left endpoint of sj+1, j = 1, . . . , k − 1. The initial empty packing pattern is represented by a single line segment corresponding to the bottom of the container. Our heuristic places boxes one by one. Each box is placed with either its bottom-left corner touching a left endpoint or its bottom-right corner touching a right endpoint of a line segment sj in the skyline. The left (resp. right) endpoint of a segment sj is a candidate position if and only if sj−1 (resp. sj+1) is higher than sj. Note that the left endpoint of s1 and the right endpoint of sk are always candidates. The candidate positions in Figure 1(a) are depicted by dots. When a box b is placed on a segment sj, the skyline is updated. This is done in two steps. The first step instantiates a new line segment corresponding to the top edge of b and updates the existing segments affected. If the width of b is smaller than the length of sj, then the x-coordinate of the left endpoint of sj is updated to match that of the right edge of b. If the width of b is greater than the length of sj, then sj is deleted, and other segments will be affected. In the second step, we check each line segment that is lower than both its adjacent segments, which we call a locally lowest segment. If there are no unplaced boxes that can be placed on such a segment, we raise it to the height of its lower adjacent segment and merge them. This is repeated until all line segments are considered. Figure 1 shows the skyline after the first updating step, after the box b is placed on the left endpoint of s2 (each line segment sj is labeled as j at its left endpoint).
Fig. 1. Updating skyline step 2: (a) neighboring waste; (b) non-neighboring waste
If we find that no unplaced boxes can be placed on the locally lowest segment s3, then it is raised and merged with s4 to form segment s3′ (Figure 1(a)). Similarly, if the box b was the only remaining box that could have been placed on s5, then s5 will also be raised, as shown in Figure 1(b).
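To make the representation concrete, the skyline can be stored as a left-to-right list of horizontal segments, from which the candidate positions follow directly. The sketch below is a minimal illustration under that assumed encoding; it is not the authors' implementation.

```python
# A minimal sketch of the skyline representation: each segment is (x, y, width),
# stored left to right.  A left (right) endpoint is a candidate position when
# the left (right) neighbour is higher, and the two container walls count as
# higher neighbours for s_1 and s_k.
from typing import List, Tuple

Segment = Tuple[int, int, int]            # (x, y, width)

def initial_skyline(W: int) -> List[Segment]:
    return [(0, 0, W)]                    # the container floor

def candidate_positions(skyline: List[Segment]):
    """Yield (x, y, side) corners where a box may be placed."""
    k = len(skyline)
    for j, (x, y, w) in enumerate(skyline):
        if j == 0 or skyline[j - 1][1] > y:          # left neighbour (or wall) is higher
            yield (x, y, "left")
        if j == k - 1 or skyline[j + 1][1] > y:      # right neighbour (or wall) is higher
            yield (x + w, y, "right")

print(list(candidate_positions([(0, 3, 2), (2, 1, 3), (5, 4, 1)])))
```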
2.2 Placement Evaluation
Our heuristic considers all possible placements (i.e., position-box pairs), picks the best one and places the box at that position. This is repeated until either a box has no feasible position (a failure) or all boxes have been placed. Consider a feasible placement (p, b), where p is an endpoint of line segment sj in the skyline. When evaluating a placement, the algorithm examines the resultant skyline after the first step of our skyline updating procedure according to the following set of priority rules:

1. (maximum spread) The spread of a skyline is the difference between the y-coordinates of its highest and lowest line segments. If the spread of the resultant skyline after placing the box at that position is greater than the input parameter max spread, then the position is immediately rejected.
2. (only fit) If the box b is the only box remaining that can fit onto line segment sj, then (p, b) has the highest priority.
3. (minimum local waste) If there are no such cases or more than one such case, we select the placement that minimizes the amount of wasted local space. This is calculated as the total volume of wasted space of the four types given in Figure 2(a-d). Let w_min and h_min be the minimum width and height, respectively, of the remaining boxes after placing b at p. The shaded area in case (a) is always considered wasted space. The shaded areas in cases (b) and (c) are wasted space if the length of the gap is less than w_min. Finally, the shaded area in case (d) is wasted if the gap is less than h_min.
4. (maximum fitness number) If there is a tie, prefer the placement with the highest fitness number. The fitness number of a placement is the number of sides of the box that exactly match the segments they touch in the skyline. The bottom side of a box is an exact match if its width is equal to the length of sj. The left (resp. right) side of a box is an exact match if its height is equal to the y-coordinate of sj−1 (resp. sj+1) less the y-coordinate of sj. Any side of a box that touches the container is not considered an exact match unless it fills all the remaining space on that side of the container. The exception to this rule is when the top edge of a placed box touches the top of the container, which counts as an exact match.
5. (earliest in sequence) If there is still a tie, prefer the placement involving the earliest box in the input sequence. If the box can be placed at multiple locations, prefer the smallest y-coordinate, then the smallest x-coordinate.

Fig. 2. Wasted local space: (a) wasted space below; (b) wasted space to the left; (c) wasted space to the right; (d) wasted space above

Without the maximum spread restriction, our greedy heuristic would tend to produce a tall tower of boxes along the left side until the container height H is reached. This possibly undesirable phenomenon can be controlled with the maximum spread restriction. Furthermore, different values of max spread cause our heuristic to behave differently: a high max spread value allows tall towers, while a low max spread value causes the heuristic to place boxes layer by layer. By trying a range of max spread values, we can generate more diverse packings, thereby exploring the search space more thoroughly. The only fit criterion is based on the observation that if the only box b that can be placed on a line segment sj is placed elsewhere, then the area just above sj is wasted, which might be large. Hence, it is likely that placing b on segment sj will reduce the overall amount of wasted space. The key component of our evaluation function is the minimum local waste priority rule. It is motivated by the natural assumption that if the amount of wasted space is minimized at every stage of the process, then the remaining unplaced boxes are more likely to be placeable. Note that our local waste measure is only an approximation of the actual amount of wasted space. The maximum fitness number rule favors placements that result in a "smoother" skyline containing fewer line segments, which is more likely to allow the placement of larger boxes without producing wasted space. Given multiple placements with the least waste and highest fitness number, the earliest in sequence criterion introduces a tie-breaking rule. This allows us to consider different ways of prioritizing the placement of boxes using different input sequences.
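Since the exact geometry of the wasted-space cases depends on Figure 2, the sketch below illustrates only the fitness-number tie-breaker (rule 4), under the (x, y, width) skyline encoding assumed above and with two simplifications noted in the comments; it is a sketch, not the authors' implementation.

```python
# A minimal sketch of the fitness number (rule 4) for placing a box with its
# bottom-left corner on the left endpoint of segment j.  Simplifications: the
# container walls are modelled as infinitely high neighbours (so they never
# give an exact match), and the top-of-container exception is ignored.
def fitness_number(skyline, j, box_w, box_h):
    INF = float("inf")
    x, y, w = skyline[j]
    left_h = (skyline[j - 1][1] if j > 0 else INF) - y               # left neighbour height
    right_h = (skyline[j + 1][1] if j + 1 < len(skyline) else INF) - y
    score = 0
    if box_w == w:                          # bottom side exactly fills the segment
        score += 1
    if box_h == left_h:                     # top of box flush with the left neighbour
        score += 1
    if box_w == w and box_h == right_h:     # right side touches s_{j+1} only if box spans sj
        score += 1
    return score

print(fitness_number([(0, 4, 3), (3, 0, 5), (8, 2, 2)], j=1, box_w=5, box_h=4))  # -> 2
```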
3 Solution Approach for 2DSP
The 2DRP heuristic described in the previous section is used as a subroutine in our 2DSP approach. It is contained in a wrapper function 2DRPSolver(H, iter) that invokes the heuristic several times with different input sequences that are generated using a tabu search mechanism. We make use of six sorting rules to generate the initial input sequence for our 2DRP heuristic. We sort the boxes in decreasing order of (1) area, (2) width, (3) height, (4) perimeter, (5) maximum of width and height, and (6) length of diagonal + width + height (i.e., perimeter of the triangular half). Let mh be the height of the tallest box. For each sequence of boxes, we try four maximum spread values given by the set {mh, mh + (H − mh) × 1/3, mh + (H − mh) × 2/3, H}. For each of these combinations of sequences and max spread values, we attempt to optimize the solution in terms of area utilization using a tabu search mechanism for a number of iterations determined by the input parameter iter. The neighborhood operator simply swaps two boxes in the current sequence. We generate 10 such sequences that are not forbidden by the tabu list, select the one that produces the solution with the highest area utilization, and insert the swap into the tabu list. Our tabu list size (and tabu tenure) is set to 3n, where n is the number of boxes in the problem instance. At any point in the process, if a packing pattern is found that places all n boxes, then the procedure halts with success. If this does not occur, then the 2DRPSolver reports a failure. Our solution for the 2DSP uses the 2DRPSolver procedure as a subroutine in an iterative deepening binary search, as given in Algorithm 1. Let LB1 be the naive lower bound computed as the total area of all boxes divided by the width of the container, i.e., LB1 = total box area / W. Observe that if there are boxes with width greater than W/2, then these boxes cannot be placed side by side in the container. Hence, the sum of the heights of all such boxes is also a valid lower bound, which we denote by LB2. Also note that if there are boxes of width exactly W/2, then no packing can have a height less than half the total height of these boxes; we set LB3 to be this value. Furthermore, such boxes cannot be placed side by side with boxes of width greater than W/2 in the container. Hence, LB2 + LB3 is also a valid lower bound. We set our initial lower bound LB to be LB = max{LB1, LB2 + LB3}. We set the upper bound UB to be LB × 1.1. We then perform a binary search on the height of the container H. If a feasible solution is found, we record the solution and update the upper bound UB to be the height of the current solution. If a feasible solution cannot be found in the given number of iterations, then we proceed with the binary search by setting the lower bound to be H + 1. Note that the initial upper bound of UB = LB × 1.1 may be too optimistic, and no solution can be found within the prescribed number of iterations with height no larger than UB. If this is the case, then we increase the upper bound once again by 10 percent and repeat the process (line 19). After each binary search attempt, we double the number of iterations of tabu search allowed by the 2DRPSolver subroutine. The rationale behind this iterative
Algorithm 1. Iterative Deepening Binary Search for 2DSP
1: LB ← max{LB1, LB2 + LB3}
2: UB ← LB × 1.1
3: iter ← 1; UBfound ← false
4: while time limit not exceeded do
5:   tempLB ← LB
6:   if tempLB == UB then
7:     break
8:   end if
9:   while tempLB < UB do
10:    H ← (tempLB + UB)/2
11:    if 2DRPSolver(H, iter) is successful then
12:      Record solution in Sol
13:      UBfound ← true
14:      UB ← H
15:    else
16:      tempLB ← H + 1
17:    end if
18:  end while
19:  if UBfound == false then
20:    UB ← UB × 1.1
21:  end if
22:  iter ← iter ∗ 2
23: end while
24: return Sol
deepening approach is that early in the process, when the upper bound is likely to be overly optimistic, less computational effort should be spent on finding a feasible solution that is likely to be non-existent. Later in the process, when the upper and lower bound values are closer to the value for the optimal solution, then it may be worthwhile to spend more effort on finding a good solution.
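For readers who prefer a runnable form, the listing below is a transliteration of Algorithm 1 in Python, including the initial lower bound LB = max{LB1, LB2 + LB3}. The 2DRPSolver wrapper is a stand-in, and rounding the bounds to integers is an assumption based on the integral box dimensions.

```python
# A sketch of the iterative deepening binary search (Algorithm 1) with the
# lower-bound computation.  twodrp_solver is a stand-in for the tabu-search
# wrapper around the skyline heuristic; boxes are (width, height) pairs.
import math
import time

def lower_bound(boxes, W):
    lb1 = math.ceil(sum(w * h for w, h in boxes) / W)            # area bound LB1
    lb2 = sum(h for w, h in boxes if w > W / 2)                  # LB2: boxes wider than W/2
    lb3 = math.ceil(sum(h for w, h in boxes if w == W / 2) / 2)  # LB3: boxes of width W/2
    return max(lb1, lb2 + lb3)

def idbs(boxes, W, twodrp_solver, time_limit=100):
    LB = lower_bound(boxes, W)
    UB = math.ceil(LB * 1.1)
    it, sol, ub_found = 1, None, False
    start = time.time()
    while time.time() - start < time_limit:
        temp_lb = LB
        if temp_lb == UB:
            break
        while temp_lb < UB:
            H = (temp_lb + UB) // 2
            packing = twodrp_solver(boxes, W, H, it)
            if packing is not None:
                sol, UB, ub_found = packing, H, True
            else:
                temp_lb = H + 1
        if not ub_found:
            UB = math.ceil(UB * 1.1)      # initial UB was too optimistic (line 19)
        it *= 2                           # iterative deepening: double the tabu iterations
    return sol
```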
4 Computational Experiments
Our algorithms were implemented as a sequential program in C++ and compiled with GCC 4.1.2, and no multi-threading was explicitly utilized. They were executed on an Intel Xeon E5430 clocked at 2.66 GHz (quad core) with 8 GB RAM running the CentOS 5 Linux operating system. The characteristics of the benchmark test data are summarized in Table 1, where the entries in the first 8 rows (above the horizontal line) have known perfect optimal solutions with 100% area utilization, and the remaining do not. The name of each test set is given under the column Test Set. For each test set, we state the number of instances (column Inst. #), the range for the number of input boxes (column n) and the width of the container (column W). Finally, the range of height values for the known perfect optimal solutions is given in column H*.
Table 1. Data Set Characteristics

Data Source                    | Test Set | Inst. # | n        | W        | H*
Hopper et al. [12]             | C        | 21      | 16-197   | 20-160   | 15-240
Hopper [13]                    | N        | 35      | 17-199   | 200      | 200
Hopper [13]                    | T        | 35      | 17-199   | 200      | 200
Babu and Babu [3]              | Babu     | 1       | 50       | 1000     | 375
Burke et al. [9]               | Burke    | 13      | 10-3152  | 30-640   | 40-960
Pinto and Oliveira [16]        | CX       | 7       | 50-15000 | 400      | 600
Wang [17]                      | Nice     | 6       | 25-1000  | 100      | 100
Wang [17]                      | Path     | 6       | 25-1000  | 100      | 100
Beasley [5]                    | ngcut    | 12      | 7-22     | 10-30    | -
Beasley [4]                    | gcut     | 13      | 10-50    | 250-3000 | -
Christofides and Whitlock [11] | cgcut    | 3       | 16-60    | 10-70    | -
Bengtsson [6]                  | beng     | 10      | 20-200   | 25-40    | -
Berkey and Wang [7]            | bwmv     | 300     | 20-100   | 10-100   | -
Martello and Vigo [15]         | bwmv     | 200     | 20-100   | 10-100   | -
Bortfeldt [8]                  | AH       | 360     | 1000     | 1000     | -
We first analyze the results for the rotatable variant, which allows the boxes to be rotated by 90◦. We compare our iterative deepening binary search (IDBS) approach with the best approaches for the 2DSP in the existing literature:

– BF: a deterministic placement heuristic [9] based on the best-fit strategy.
– BF+SA: a simulated annealing enhancement of the BF heuristic [10].
– BBF: a bi-directional best-fit heuristic [1].
– FH: an iterative greedy heuristic [14].
In the existing work on the rotatable variant of the 2DSP, the test sets used all had known perfect optimal solutions. Consequently, we also evaluate our IDBS approach on these sets of instances; the results are given in Table 2. For each test instance, the relative gap of the height of the produced solution sol from the optimal solution opt is computed as (sol − opt)/opt. The original authors of the BF, FH and BBF heuristics only executed their approaches once for each instance; we report the average relative gaps for each set of instances in the columns rg. The original authors executed the BF+SA method 10 times for each instance, and each execution was given a time limit of 60 CPU seconds. The values in the column rgavg are computed by first finding the average relative gap over the 10 executions for each instance, and then taking the average over the test set. Note that the amount of time required to execute the BF heuristic is negligible. For FH and BBF, the columns t(s) report the average time taken over all instances in the test set. For each instance, we ran our IDBS algorithm 10 times with a time limit of 100 CPU seconds. The column rgbest gives the relative gap from optimal of the best solution found out of the 10 executions on each instance, averaged over the number of instances. Similarly, rgworst reports the value for the worst solutions
out of the 10 executions on each instance. Finally, the column t∗ (s) gives the average time required for IDBS to first find its final solution for the test set. The results show that IDBS outperforms all existing heuristics for the rotatable variant of the 2DSP. For the C, Babu and Burke test sets, IDBS is able to find the optimal solution in every execution. For the remaining test sets, the average height of the solutions found by IDBS is at most 1.01% away from optimal, which is the best result in literature. Furthermore, even when the worst solutions from 10 executions for each instance are taken, the average gap to optimal is at most 1.2%. Table 2. Results for rotatable variant
Test Set C Babu Burke N T Nice Path CX
BF BF+SA rg rgavg 6.10 6.67 4.05 5.85 6.88 -
FH rg t(s)
BBF IDBS rg t(s) rgavg rgbest rgworst t∗ (s)
2.33 1.71 2.09 2.77 0.3 0 0 6.67 - 6.67 0.4 0 0 0.00 1.41 0.78 2.24 13 0 0 - 0.93 0.80 - 3.90 211 1.01 0.80 4.19 - 5.75 9.8 1.06 0.92 3.13 - 5.95 8.8 1.31 1.17 - 1.02 0.87 2.52 922 0.22 0.19
0 0 0 1.03 1.20 1.27 1.43 0.26
0.31 0.02 0.41 19.11 19.48 44.93 48.87 13.51
For the fixed orientation variant, the GRASP approach by Alvarez-Valdes et al. [2] is the best existing algorithm, directly outperforming all former algorithms in most cases. Hence, we only compare our IDBS algorithm with this GRASP approach. Both groups of test instances (with and without known perfect optimal solutions) were employed. The results are shown in Table 3. The GRASP approach was given a time limit of 60 CPU seconds per execution for each instance. We set the time limit for our IDBS approach at 100 CPU seconds for the test sets with known perfect optimal solutions, and 200 CPU seconds for the other sets. All entries in the table are averaged over all instances in the test sets. The best results are shown in bold. For the instances with known perfect optimal solutions, IDBS is superior to GRASP. For many test sets, the average relative gap from optimal achieved by GRASP is about twice that of IDBS. In fact, for every test set, the average gap to optimal of the worst solutions found from 10 executions of each instance for IDBS is equal to or better than the average gap to optimal of the best solutions found by GRASP. For the remaining instances, since the optimal solutions are unknown, the relative gap values are computed in terms of the naive lower bound LB1. There are some instances where IDBS produces a better solution than GRASP and others where GRASP is superior. On average, GRASP outperforms IDBS by a very small amount (less than 0.3%). However, IDBS outperforms GRASP overall for the entire set of test instances considered.
Table 3. Results for fixed orientation variant

Test Set | GRASP rgavg | GRASP rgbest | IDBS rgavg | IDBS rgbest | IDBS rgworst | IDBS t*(s)
C        | 0.98        | 0.85         | 0.14       | 0.04        | 0.20         | 10.46
Babu     | 0.00        | 0.00         | 0.00       | 0.00        | 0.00         | 0.04
Burke    | 0.91        | 0.91         | 0.00       | 0.00        | 0.00         | 4.73
N        | 2.41        | 2.31         | 1.29       | 0.87        | 1.94         | 28.38
T        | 2.27        | 2.16         | 1.33       | 1.04        | 1.79         | 21.55
Nice     | 3.55        | 3.33         | 1.96       | 1.80        | 2.12         | 42.54
Path     | 2.75        | 2.61         | 2.03       | 1.91        | 2.15         | 48.72
CX       | 1.02        | 0.98         | 0.43       | 0.40        | 0.45         | 17.96
beng     | 0.00        | 0.00         | 0.00       | 0.00        | 0.00         | 0.04
cgcut    | 2.37        | 2.37         | 2.42       | 2.42        | 2.42         | 37.18
gcut     | 5.43        | 5.42         | 5.63       | 5.55        | 5.71         | 56.76
ngcut    | 1.34        | 1.34         | 1.60       | 1.60        | 1.60         | 0.19
bwmv     | 1.80        | 1.73         | 1.86       | 1.71        | 1.98         | 39.11
AH       | -           | -            | 0.87       | 0.77        | 0.95         | 104.06
5 Conclusions
In this paper, we present an approach for solving the 2DSP. It makes use of a new heuristic for the 2DRP involving several priority rules that are motivated by observations on the nature of 2D packing problems. The 2DRP heuristic is called several times based on a tabu search procedure, which is used as a subroutine in an iterative deepening binary search on the height of the container. The resultant approach outperforms all existing approaches for the 2DSP on both the fixed and rotatable variants on a large set of benchmark test cases.
References

1. Aşık, Ö., Özcan, E.: Bidirectional best-fit heuristic for orthogonal rectangular strip packing. Annals of Operations Research 172, 405–427 (2009), http://dx.doi.org/10.1007/s10479-009-0642-0
2. Alvarez-Valdes, R., Parreño, F., Tamarit, J.M.: Reactive GRASP for the strip-packing problem. Computers & Operations Research 35(4), 1065–1083 (2008), http://dx.doi.org/10.1016/j.cor.2006.07.004
3. Babu, A.R., Babu, N.R.: Effective nesting of rectangular parts in multiple rectangular sheets using genetic and heuristic algorithms. International Journal of Production Research 37(7), 1625–1643 (1999), http://dx.doi.org/10.1080/002075499191166
4. Beasley, J.E.: Algorithms for Unconstrained Two-Dimensional Guillotine Cutting. The Journal of the Operational Research Society 36(4), 297–306 (1985), http://dx.doi.org/10.2307/2582416
5. Beasley, J.E.: An Exact Two-Dimensional Non-Guillotine Cutting Tree Search Procedure. Operations Research 33(1), 49–64 (1985), http://dx.doi.org/10.2307/170866
6. Bengtsson, B.E.: Packing rectangular pieces – a heuristic approach. The Computer Journal 25, 253–257 (1982)
7. Berkey, J.O., Wang, P.Y.: Two-Dimensional Finite Bin-Packing Algorithms. The Journal of the Operational Research Society 38(5), 423–429 (1987), http://dx.doi.org/10.2307/2582731
8. Bortfeldt, A.: A genetic algorithm for the two-dimensional strip packing problem with rectangular pieces. European Journal of Operational Research 172(3), 814–837 (2006), http://dx.doi.org/10.1016/j.ejor.2004.11.016
9. Burke, E.K., Kendall, G., Whitwell, G.: A New Placement Heuristic for the Orthogonal Stock-Cutting Problem. Operations Research 52(4), 655–671 (2004), http://dx.doi.org/10.1287/opre.1040.0109
10. Burke, E.K., Kendall, G., Whitwell, G.: A Simulated Annealing Enhancement of the Best-Fit Heuristic for the Orthogonal Stock-Cutting Problem. INFORMS Journal on Computing 21(3), 505–516 (2009), http://dx.doi.org/10.1287/ijoc.1080.0306
11. Christofides, N., Whitlock, C.: An Algorithm for Two-Dimensional Cutting Problems. Operations Research 25(1), 30–44 (1977), http://dx.doi.org/10.2307/169545
12. Hopper, E., Turton, B.C.H.: An empirical investigation of meta-heuristic and heuristic algorithms for a 2D packing problem. European Journal of Operational Research 128(1), 34–57 (2001), http://dx.doi.org/10.1016/S0377-2217(99)00357-4
13. Hopper, E.: Two-dimensional packing utilising evolutionary algorithms and other meta-heuristic methods. Ph.D. thesis, University of Wales, Cardiff School of Engineering (May 2000)
14. Leung, S.C.H., Zhang, D.: A New Heuristic Approach for the Stock-Cutting Problems. In: Engineering and Technology K: Business and Economic Sciences 2:2, pp. 121–126. World Academy of Science (2010)
15. Martello, S., Vigo, D.: Exact Solution of the Two-Dimensional Finite Bin Packing Problem. Management Science 44(3), 388–399 (1998), http://dx.doi.org/10.1287/mnsc.44.3.388
16. Pinto, E., Oliveira, J.F.: Algorithm based on graphs for the non-guillotinable two-dimensional packing problem. In: Second ESICUP Meeting, Southampton, United Kingdom (May 2005)
17. Wang, P.: Data set generation for rectangular placement problems. European Journal of Operational Research 134(2), 378–391 (2001), http://dx.doi.org/10.1016/S0377-2217(00)00263-0
Real-Time Resource Allocation Co-processor

Stuart W. Card*

Critical Technologies Inc., 4th Floor, 1001 Broad Street, Utica NY 13501, USA
[email protected] www.critical.com
Abstract. The increasing capabilities of handheld devices, the increasing number of such devices in the battlespace, the increasing need for and supply of sensor data and Situational Awareness (SA) information, and the increasing recognition of the benefits of cloud-based storage and processing, versus the inherent limitations and intermittency of the wireless connectivity between mobile platforms and the cloud infrastructure, motivate adaptive networks in which resources are dynamically allocated in response to conditions and demands. In the proposed cognitive network architecture, cross-layer multi-objective optimization algorithms allocate computation, storage, wireless spectrum, battery energy, etc. on the local node, within the local wireless MANET and across the wide area network cloud infrastructure, supporting hands-free, wearable, non-traditional intelligence, surveillance and reconnaissance (NTISR) systems.

Keywords: Resource Management, Adaptive Control, Planning, Scheduling, Multi-Agent Systems, Reasoning under Uncertainty.
1 Problem

Changes in the nature and pace of warfare and other operations have dramatically shortened the time available for Observe-Orient-Decide-Act (OODA) loops. This trend is certain to continue, affecting many of the nested (some would say tangled) loops. In the larger context of military planning and execution, Intelligence, Surveillance and Reconnaissance (ISR) is associated primarily with the Observe and Orient steps; but ISR has its own loops, where the Act is to Observe, analysis enables us to Orient and then we must Decide what collection missions to do next. The ever-shrinking time within which this loop must be closed has motivated exploiting the many opportunities for Non-Traditional ISR implicit in our warfighters having not only guns but also eyes and brains. Air Force NTISR entails not only bombers, fighters and transports acting as surveillance and reconnaissance platforms, but also airmen acting as Multi-INT collectors and front-line analysts. A decade ago General John P. Jumper declared “In this culture, you have to get back to some basic institutional values: every airman is a warrior, every airman is a sensor.”[1]
* This proceedings “short paper” is an extended excerpt from a recently accepted proposal. Phase I work in progress now will be complete and results presented at IEA/AIE 2011.
Subsequent directives have clarified this as a basic principle of Force Protection. An airman is not, of course, a sensor, but his roles include multi-sensor platform, operator and analyst, plus many others. The collector’s role as a front-line analyst has always been implicit, especially in HUMINT. So we are concerned not just with Multi-INT fusion, but also integration of C4ISR, kinetic and other action, etc. The Army calls this “Every Soldier is a Sensor” (ES2).1 As the footnoted video shows, this concept is already proving its value; but as it also shows, current instantiations of the concept, which require juggling and peering at handheld devices, also occupy the soldiers’ hands and eyes and distract his attention. These same distractions have been noted in civilian health care: too often, the “examining physician” is looking not at the patient, but at a computer screen; wearable systems for both the doctor and the patient can mitigate, and soon with improving Augmented Reality technology may eliminate, this problem, while providing other benefits as well. In battles to win hearts and minds, counter-insurgency and relief operations blur: a medical corpsman caring for villagers may notice malingerers at a temporary bush clinic; some unobtrusive NTISR collection and real-time front-line analysis may thwart a bombing.
2 Opportunity

The focus and intensity of a soldier, Marine, sailor or airman’s attention can be inferred if he is sufficiently well instrumented: cameras that see what he sees2; microphones that hear what he hears; biometric sensors; etc. Wearable cognitive systems can infer the wearer’s attention from these physiological and other sensors. Hands free cognitive systems can also perform dynamic Multi-INT correlation, both to cue and to be cued by wearer attention to support the warfighter’s primary tasking and his collateral NTISR opportunities. This processing is computationally intense, while the warrior’s gear must be rugged, wearable, small and light, run on batteries and not cost too much. Fortunately the cloud offers limitless storage and computational power. One example of using cloud computational horsepower to support a wireless connected mobile user is Google Goggles3: with an Android smartphone, take a picture; it is immediately uploaded to the cloud, where image / object recognition is performed; in seconds, hyperlinks to web pages likely to be relevant are downloaded back to the phone. Images of text are subjected to machine translation4, as are voice inputs, interfaced to applications such as Google Maps5. The Raytheon Android Tactical System (RATS)6 is also based upon Android, but as the Forbes article notes “Some RATS services stress a fast connection to the Department of Defense's Internet-like data network.” To use the cloud, the mobile user must be able to reach it. Tactical users’ wireless links often suffer from long latencies, low data rates, high error rates, asymmetry and
1 http://www.army.mil/professionalvideo/movies/sensor_movie.html
2 http://eyetap.org/
3 http://www.google.com/mobile/goggles/
4 http://googlemobile.blogspot.com/2010/02/integrating-translation-into-google.html
5 http://googlemobile.blogspot.com/2009/06/search-by-voice-and-transit-directions.html
6 http://www.forbes.com/2009/10/19/android-google-military-technologywireless-raytheon.html
intermittency. Delay/Disruption Tolerant Networking (DTN)7 can mitigate some of these problems. However, closing the OODA loop may be urgent, so waiting for store-and-forward delivery may not always be optimal… or even survivable. Software Defined Radio (SDR8) can adapt dynamically to maintain the wearer’s connectivity – it can also be a NTISR sensor: our medical corpsman at his bush clinic sees no overhead doors; if his SDR senses a garage door opener transmission, it is likely being used for another purpose, which the SDR could then jam9 – of course this jamming compounds his difficulties (the need to sense, communicate and fight through the “jamming bubble” in which our troops must operate). A major benefit of the proposed system is cognitive cross-layer optimization and delay/disruption tolerance of not just the networking functions but also sensing and jamming: we define Quality of Experience as user-perceived, application-oriented QoS.
3 Innovation

A wearable cognitive NTISR system should dynamically adapt its usage of resources based upon their availability (cloud asset reachability over intermittent wireless links), the costs of using them (battery energy consumption), the risks of using them (revealing position to Radio Direction Finding [RDF]) and of not using them (potentially undetected threats), as well as the expected benefits of using them (object identification / threat assessment or other information discovery) per the intent of the wearer as inferred from the focus and intensity of his attention.

3.1 Cognitive Net-Centric NTISR System Concept

A future cognitive system user may dispatch avatars into the cloud to proxy for him, reducing bandwidth requirements when connected and maintaining his virtual presence and ability to act within the cloud when disconnected from it. The significance of avatars as distinct from generic proxies is that these intelligent agents learn the behaviors and preferences (utility functions) of the user on whose behalf they act. Among these avatars are the user’s symbiotes – interface agents that dispatch mobile code and data, manage his network connectivity and control his wearable systems. This picture is science fiction today, but steps toward it can be taken now. The framework will exploit arbitrary combinations of artificial and computational intelligence techniques in a cognitive network [2] enabling essential agent capabilities:

• phylogenetic species evolution via recombination and/or mutation of genetic code
• morphogenetic agent development via gene expression interacting with the environment
• ontogenetic individual learning via processes enabled by the developed structures
• input, output, perception, action, movement, secure communication and cognition
7 http://www.dtnrg.org/wiki
8 e.g. http://www.ettus.com/products for inexpensive R&D platforms
9 As in the JIEDDO CREW system, but with a wearable package, smaller than that of the Guardian, which serves multiple C4ISR functions rather than being dedicated to IED Defeat.
3.2 Situated Agents with Reinforcement Learning in a Marketplace

The framework will be similar, in the breadth of its scope, to artificial life simulations, but different in two key respects: agents are situated in cloud-connected MANETs; and rewarded for effectiveness in accomplishing user objectives. Boltzmann10, Lotka [3], Odum11 and others have observed that the fundamental object of contention in the evolutionary struggle is available energy (exergy: free energy after discounting entropy). The situation is essentially the same in computer networks, especially those using power-hungry wireless links to interconnect battery powered mobile nodes. Free energy is demonstrably equivalent to information [4]; informally, consider the energy equivalent of information regarding the location of a food source or gas station (or lack of information on the approach of a predator). Connections have long been recognized between thermodynamics (Boltzmann’s statistical mechanics) and the mathematical theory of communications (Shannon’s information theory). Evolution and learning are essentially information flows between the environment and the entities differentially proliferating therein. Algorithmic information (Kolmogorov complexity) is a closely related quantity.12 It enables identification of Minimum Description Length (MDL) models that satisfy Occam's Razor and are more likely to generalize well than less parsimonious candidates. Normalized Compression Distance (NCD) and other computable estimates of non-computable complexity not only are theoretically justifiable but also have been shown to perform well in various real world applications. [5] Finally the equivalence of information and money is well known to modern economists. Thus several seemingly distinct key quantities from different theories – thermodynamic free energy, communicated information, computational complexity and economic money – prove to be not only inter-convertible, but actually equivalent if not identical. The proposed framework will exploit this equivalence. Agents will be rewarded (paid) for the information they yield and penalized (billed) for the information and energy they consume. The framework will be a service oriented architecture wherein services come at a price.

3.3 Situated Agents with Reinforcement Learning in a Hostile Environment

Here as in the real world, deviations from accepted market behavior (e.g., breaches of contract) will be possible, so underlying the marketplace must be a less constrained virtual arena in which agents can defend themselves from the depredations of misbehaving peers. Game theory suggests we should design the framework so that interactions lead to Pareto optimal Nash equilibria and no player can gain systematically by cheating. Assured command and control requires that legitimate authority maintain a monopoly on some key resource needed by the agents. Ultimately these resources are battery energy, processor cycles, memory space, etc.; penultimately they are virtual coins that agents expend to gain access to those items. Identification Friend or Foe (IFF) is critical: identities and roles inform interacting friendly agents (human and software) who commands whom etc. In nature, different
http://en.wikipedia.org/wiki/Boltzmann
11 http://en.wikipedia.org/wiki/Howard_T._Odum 12
GECCO-2010 Workshop on Entropy, Information and Complexity.
300
S.W. Card
species have evolved different visual, auditory, olfactory and other cues to claim and to recognize individual identity, family or other group membership, etc. upon which trust relationships are based. In the framework, agents must evolve and learn means of recognition. Agents that verify identities of other agents can not only avoid imposters but also use reputations to minimize exposure to mutated or compromised agents.
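The marketplace in Sect. 3.2 prices agent contributions by their information content. As a purely illustrative sketch of one computable estimate it could use, the fragment below computes a Normalized Compression Distance with Python's standard zlib compressor; the pricing rule and all names are hypothetical examples, not elements of the proposed framework.

```python
import zlib

def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a rough stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance: near 0 for near-identical data, near 1 for unrelated data."""
    cx, cy, cxy = compressed_size(x), compressed_size(y), compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

def reward(report: bytes, already_known: bytes, energy_cost: float,
           price_per_bit: float = 0.01) -> float:
    """Hypothetical payoff: pay for novel (incompressible w.r.t. prior knowledge) bits, bill for energy."""
    novelty = ncd(report, already_known)                     # fraction of the report that looks new
    information_value = novelty * compressed_size(report) * 8 * price_per_bit
    return information_value - energy_cost

if __name__ == "__main__":
    prior = b"vehicle sighted at grid 41S PR 1234 5678 heading north"
    dup   = b"vehicle sighted at grid 41S PR 1234 5678 heading north"
    new   = b"dismounted patrol observed at grid 41S PR 9876 5432"
    print("reward for duplicate report:", round(reward(dup, prior, energy_cost=0.5), 3))
    print("reward for novel report:    ", round(reward(new, prior, energy_cost=0.5), 3))
```

With this kind of estimate, a duplicate report earns little or nothing after its energy bill, while a genuinely novel one earns a positive payoff, which is the incentive structure the marketplace is intended to create.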
4 Product

To enable ubiquitous NTISR while supporting rather than interfering with the warfighter's primary job, we propose a hands-free wearable C4ISR system that:
• monitors the user's physiology and infers his status and the focus of his attention
• identifies and tracks objects by directing sensors at the user's focus of attention
• scans by directing sensors elsewhere, finds new objects of interest and alerts him
• automatically reports user position, status, objects of interest, etc. via DTN
• exploits local & global cloud resources to perform heavy-weight computations
• accesses cloud resources via DTN, per policy rules, adaptively
• uses cognition to optimize the network & the network to amplify cognition
The product will be software enabling the capabilities described above. It will be prototyped on an open source Linux operating system, Google Android, which runs on large and increasing numbers of inexpensive battery-powered mobile wireless devices. The resource constraints of such devices enable commercial smartphones to serve as realistic surrogates for tactical network computing platforms. Agents will coordinate activity by exchanging information via publish-subscribe. Through reinforcement learning, agent species will evolve and individual agents will develop and learn to identify each other, determine trust, infer user attention and intention, trigger network computational tasks, and support achievement of user goals. Multi-objective "Evolutionary Learning with Information Theoretic Evaluations of Ensembles (ELITE2)" will explicitly measure the information content of observations and their independence (for Multi-INT corroborating evidence), redundancy or synergy.

The product will use open source software implementations of open-standards-based protocols, including NACK Oriented Reliable Multicast (NORM) for robust, efficient message dissemination and the Host Identity Protocol (HIP) for secure mobility. The design will exploit agent mobility as well as techniques developed in our successful prior work in tactical networks to minimize the bandwidth costs of sharing information among agents and to maximize the robustness of services provided despite the inherent unreliability of wireless links among mobile nodes. The product will be the first open source framework for evolving agents that develop, learn, compete and cooperate in support of wearable NT-C4ISR in an intermittently connected, resource constrained, tactical wireless environment.
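The paper states that agents coordinate by exchanging information via publish-subscribe but does not fix an implementation. The following is a minimal in-process sketch of the topic-based pattern, with all class and topic names invented for illustration; a fielded system would disseminate messages over NORM/DTN rather than in memory.

```python
from collections import defaultdict
from typing import Callable, Dict, List

class Broker:
    """Toy topic-based publish-subscribe broker used only to illustrate the coordination pattern."""
    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        for handler in self._subscribers[topic]:
            handler(message)

if __name__ == "__main__":
    broker = Broker()

    # A hypothetical tracking agent reacts to detections published by a sensor agent.
    def tracker(msg: dict) -> None:
        print(f"tracker: cueing sensors toward {msg['bearing']} deg")

    broker.subscribe("detections", tracker)
    broker.publish("detections", {"bearing": 245, "confidence": 0.8})
```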
5 Technical Objectives of Ongoing Work

This program is developing co-processing technology to optimize allocation of communication, computation and storage resources of and for mobile devices, as:
• global infrastructure clouds are not always reachable from the mobile;
• local MANET clouds are less powerful but more reliably reachable;
• multi-cores/GPUs offer parallelism even within a single mobile;
• resource allocation optimization algorithms can prioritize & move data & code;
• redundancy can be detected to both filter false alarms & minimize resource usage.
The overall objective is to integrate a cross-layer information sharing plane, transparent Delay/Disruption Tolerant Network (DTN) gateways and dynamic resource allocation algorithms onto a mobile co-processor, ensuring the most important and urgent sensor data will be processed in an accessible location and the resulting indications and warnings will be delivered to the analysts, commanders and other users with a timeliness, precision and accuracy meeting mission requirements.
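Purely as an illustration of the kind of allocation decision described above (none of these rules or numbers come from the program itself), the sketch below picks the cheapest reachable computing tier that can still meet a task's deadline.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Tier:
    name: str
    reachable: bool        # e.g., the global cloud may be unreachable over a partitioned DTN
    latency_s: float       # expected time to ship data, compute, and return results
    energy_cost: float     # battery cost to the mobile of using this tier

def place_task(tiers: List[Tier], deadline_s: float) -> Optional[Tier]:
    """Choose the lowest-energy tier that is reachable and can meet the deadline."""
    candidates = [t for t in tiers if t.reachable and t.latency_s <= deadline_s]
    return min(candidates, key=lambda t: t.energy_cost) if candidates else None

if __name__ == "__main__":
    tiers = [
        Tier("on-board GPU", reachable=True, latency_s=2.0, energy_cost=5.0),
        Tier("MANET cloud",  reachable=True, latency_s=8.0, energy_cost=2.0),
        Tier("global cloud", reachable=False, latency_s=30.0, energy_cost=1.0),
    ]
    chosen = place_task(tiers, deadline_s=10.0)
    print("run task on:", chosen.name if chosen else "defer (no tier available)")
```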
References

1. Jumper, J.: Expeditionary Air Force: A New Culture for a New Century. Air Force Association (1998), http://www.afa.org/AEF/pub/ol19.asp
2. Thomas, R.: Cognitive Networks. PhD dissertation (2007), http://scholar.lib.vt.edu/theses/available/etd-07172007-150149/unrestricted/Thomas_CognitiveNetworksDiss4.pdf
3. Lotka, A.: Contribution to the Energetics of Evolution. Proceedings of the National Academy of Sciences (1922), http://www.pnas.org/content/8/6/147.full.pdf
4. Fraundorf, P.: Heat Capacity in Bits (2002), http://newton.umsl.edu/philf//ifzx/hcapbit8x.pdf
5. Kraskov, A.: Hierarchical Clustering Based on Mutual Information (2008), http://arxiv.org/abs/q-bio/0311039
A Hybrid Search Strategy to Enhance Multiple Objective Optimization

Li Ma and Babak Forouraghi

Computer Science Department, Saint Joseph's University, Philadelphia, PA, U.S.A.
{Li.Ma,Babak.Forouraghi}@sju.edu
Abstract. This paper presents a new adaptive strategy for combining global (exploration) and local (exploitation) search capabilities of a multi-objective particle swarm optimizer (MOPSO). The goal of hybridization of search strategies is to enhance an optimizer's overall performance. In contrast to previous attempts at hybridization, the proposed methodology efficiently balances exploration and exploitation of the search space using the two novel methods of intersection test and objective function normalization. Experimental results obtained from several well-known test cases demonstrate the efficiency of the proposed MOPSO algorithm. The results are compared with those obtained from NSGA-II, which is a well-established evolutionary algorithm.

Keywords: Multi-objective Optimization; Particle Swarm Optimization; Pareto Front; Exploration and Exploitation.
1 Introduction

Particle Swarm Optimization (PSO) is a population-based global optimization methodology inspired by the swarming behavior of animals such as birds or fish [1]. Over the past few years, due to its derivative-independent nature, simplicity and quick convergence rate, PSO has proven to be a successful evolutionary algorithm used in a wide variety of applications [2-4], especially in the area of multi-objective optimization [5-9].

In most multi-objective optimization problems, several objectives often conflict with each other, which limits the solver's ability to obtain ideal solutions that simultaneously optimize all of the objectives. Therefore, since no single solution may be considered optimal, a set of trade-off or Pareto-optimal solutions must be provided for designers. The notion of Pareto dominance is utilized to generate the Pareto frontier, which contains the Pareto-optimal solutions. The goal of a multi-objective optimization algorithm, therefore, is to identify as many of the Pareto-optimal solutions as possible in the entire search space. Further, the discovered solutions must be well-distributed along the Pareto frontier [10].

Multi-objective Particle Swarm Optimization (MOPSO), which is the extension of PSO for solving multi-objective optimization problems, has been an active area of research in multi-objective evolutionary algorithms (MOEA). A comprehensive review has been conducted to highlight various characteristics of these algorithms [10].
However, the capability of exploiting local information has not been fully investigated in this area. Recently, many researchers have focused on augmenting existing local search approaches to enhance the overall performance of MOPSOs. For instance, to minimize the mean value of the objectives and the standard deviation, a combination of MOPSO and the quasi-Newton method was introduced to find robust solutions against small perturbations of design variables [9]; another MOPSO utilizes the well-known Nelder-Mead simplex algorithm to balance its exploration (global search) and exploitation (local search) of the solution space [11]; and finally, to obtain a uniform distribution along the Pareto frontier with a low number of fitness evaluations, a hybrid algorithm has been proposed which enhances a particle swarm optimizer's accuracy using concepts from rough sets theory [12].

In this paper, two new hybrid search strategies are proposed which rely on geometric transformation and normalization techniques to balance the exploration and exploitation of a MOPSO. The aim of the proposed strategies is to divide a typical PSO swarm into two sub-swarms: a global swarm and a local swarm. The global swarm is responsible for exploration of the global search space while the local swarm exploits local information. The two strategies have been shown to provide an efficient method for performing exploration and exploitation using the ZDT test suite.

The remainder of this paper is organized as follows. Section 2 presents the proposed MOPSO algorithms, which use a hybrid search strategy. Section 3 reports on the overall performance of the proposed methods and compares them to well-known algorithms often cited in the field of multi-objective evolutionary computation. Finally, Section 4 presents the summary and conclusions.
2 The Proposed MOPSO Algorithms

This section presents the proposed MOPSO algorithms and their crowding factor method, along with the two new balancing search strategies of intersection test and objective function normalization.

2.1 The Generalized MOPSO Algorithm

In order to solve multi-objective problems, a combination of particle swarm optimization [10,19,20] and the Pareto-dominance strategy [6-8,13-15,17-18] can be used to find a set of Pareto-optimal solutions. An external repository is used to store the set of discovered Pareto-optimal solutions. The Pareto-based strategy is used to determine the best local position (leader):
(1) When a particle's current position dominates its current best local position, replace its best local position with the current position.
(2) When a particle's current best local position dominates its current position, keep its best local position.
(3) Otherwise, randomly select a position between its current position and current best local position as the best local position.

During the first generation, all the swarm members are considered to be global particles and the global learning parameters are applied. Later, the intersection test or the normalized objectives metric is utilized to identify the global and local sub-swarms. As a result, particles belonging to different sub-swarms will use different learning parameters. It must be mentioned that in this work the inertia weight is perturbed in a small range for each particle to avoid premature convergence. When a particle violates the geometric side constraints, the violated dimension is reset to its boundary value. Also, to promote diversity, a standard mutation operator is applied to a small proportion of the current swarm. Empirically, a mutation parameter of 0.05 is a common value to use.

The maintenance of the global repository is a crucial issue. The size of the repository is defined as a system parameter. Particles in densely-populated areas have priority to be removed when the repository's size exceeds a predefined value. The density of the search space is defined by a novel crowding factor (YACF) according to which social leaders are easily determined by randomly selecting candidate solutions in the sparsely-populated areas. Table 1 summarizes the main steps of the proposed MOPSOs.

Table 1. The main steps of the proposed MOPSO algorithms

Step 1: Initialize the particles and the global repository
Step 2: For each particle in the current generation:
        - update the particle's own memory
        - choose the current social leader
        - update the velocity using global/local learning parameters
        - check the feasibility
        - evaluate the particle
Step 3: Update the global repository
Step 4: Apply the intersection test or the normalized objectives metric to classify the current swarm
Step 5: Go to Step 2 until the stopping criterion is met (usually the maximum number of generations is reached)
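The following is a minimal sketch of the Pareto-based best-local-position update of Sect. 2.1, assuming minimization of all objectives; rule (3) is ambiguous in the text and is read here as picking one of the two positions at random.

```python
import random
from typing import List

def dominates(a: List[float], b: List[float]) -> bool:
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def update_best_local(position, best_local, f_position, f_best_local):
    """Rules (1)-(3) for choosing a particle's best local position."""
    if dominates(f_position, f_best_local):
        return position                               # rule (1): current position dominates
    if dominates(f_best_local, f_position):
        return best_local                             # rule (2): memory dominates
    return random.choice([position, best_local])      # rule (3): neither dominates (one reading)

if __name__ == "__main__":
    p, pbest = [0.2, 0.8], [0.5, 0.5]
    fp, fpbest = [1.0, 2.0], [1.5, 1.5]   # neither objective vector dominates the other
    print(update_best_local(p, pbest, fp, fpbest))
```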
2.2 Yet Another Crowding Factor (YACF)

A new social leader selection method, called Yet Another Crowding Factor (YACF), was implemented in this work. This parameter-less measurement not only provides an estimation of the density in the neighborhood, but it also provides a fitness sharing mechanism that degrades the fitness value of an individual solution with respect to a set of solutions in a similar circumstance.

Before calculating the YACF value, a new sharing area in the objective space is calculated for the current generation. Each swarm particle's sharing area is viewed as a hyper-sphere, the center of which is the particle's objective vector, while the radius is the vector R:

    R = ( (max f1 − min f1)/N, (max f2 − min f2)/N, …, (max fm − min fm)/N )        (1)

where fi is the ith objective function and N is the size of the global repository containing all the discovered Pareto-optimal solutions.

The YACF value for each swarm particle is defined as the number of particles in that particle's sharing area. Thus, the minimum crowding value of a particle is 1
because it only appears in its own sharing area. The Pareto-optimal solutions in the global repository can be divided into groups whose members share the same crowding value. During the initial stage of social leader selection, 10% of the repository corresponding to the less-populated groups is identified and a social leader is then randomly selected from that group.

2.3 Intersection Test

Inspired by geometric algorithms [21], the intersection test is conducted to classify the current swarm. The main steps of the test are:
(1) Generate a ray R that crosses the two extreme solutions on the current Pareto frontier.
(2) Randomly select a candidate Pareto-optimal solution c and generate a segment S from c to the current swarm member p.
(3) If there is an intersection between R and S, mark p as a global particle; otherwise, p is a local particle.

For a 3-dimensional case, we conduct a plane-segment intersection test in which R becomes a plane formed by 3 extreme solutions. Generally, for the n-dimensional case, we conduct a multi-dimensional intersection test in which R becomes a hyperplane formed by n extreme solutions. This strategy has O(k) complexity, where k is the number of dimensions. A more detailed introduction can be found elsewhere [21].

2.4 Normalized Objectives Metric

This strategy first sorts the current swarm according to the sum of the normalized objectives NO (Eq. 2) of each particle:

    NO = Σ_i (fi − fi_min) / (fi_max − fi_min)        (2)

where fi indicates the ith objective value of the particle, and fi_min and fi_max indicate the minimum and maximum values of the ith objective in the current swarm, respectively. Next, a self-adaptive proportion N (Eq. 3) of the particles with larger NO values is selected and grouped as the global sub-swarm; the remaining particles are considered to be in the local sub-swarm:

    N = max(1.0 − Gen_current / Gen_max, 0.1)        (3)

Here, Gen_current indicates the current generation number and Gen_max is the maximum number of generations. This strategy has O(k) complexity, where k is the number of dimensions, and it results in a progressively reduced number of global particles. Clearly, this reflects the requirement of the evolutionary process: explore the search space at the initial stage and exploit local information at later stages.
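A sketch of the normalized-objectives split as we read Eqs. (2) and (3): particles with the largest NO values form the global sub-swarm, and the global proportion shrinks from 1.0 toward a floor of 0.1 over the generations. The equations were reconstructed from the surrounding text, so treat the exact forms as an interpretation.

```python
from typing import List, Tuple

def normalized_objective_sum(f: List[float], f_min: List[float], f_max: List[float]) -> float:
    """Eq. (2): sum of per-objective normalized values for one particle."""
    return sum((fi - lo) / (hi - lo) if hi > lo else 0.0
               for fi, lo, hi in zip(f, f_min, f_max))

def split_swarm(objs: List[List[float]], gen: int, gen_max: int) -> Tuple[List[int], List[int]]:
    """Return (global, local) particle indices using Eqs. (2) and (3)."""
    f_min = [min(col) for col in zip(*objs)]
    f_max = [max(col) for col in zip(*objs)]
    no = [normalized_objective_sum(f, f_min, f_max) for f in objs]
    proportion = max(1.0 - gen / gen_max, 0.1)                      # Eq. (3)
    order = sorted(range(len(objs)), key=lambda i: no[i], reverse=True)
    n_global = max(1, int(round(proportion * len(objs))))
    return order[:n_global], order[n_global:]

if __name__ == "__main__":
    objs = [[0.1, 0.9], [0.5, 0.5], [0.9, 0.2], [0.3, 0.8]]   # toy two-objective swarm
    print(split_swarm(objs, gen=40, gen_max=80))
```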
3 Experiments and Results

This section compares the proposed algorithms, MOPSO-IT (the proposed MOPSO algorithm of Section 2 with the intersection test strategy) and MOPSO-NO (the proposed MOPSO algorithm of Section 2 with the normalized objectives metric), to several MOEAs in the field of evolutionary computation [8, 12]. All algorithms were tested on a well-known test suite, ZDT, which is described elsewhere [22]. Briefly, the suite contains six problems numbered ZDT1 to ZDT6, where ZDT4 is the most difficult optimization problem (too many local minima) while ZDT1 is the most tractable. In the conducted experiments, ZDT5 was eliminated because it is a binary-string optimization problem which cannot be handled by the proposed MOPSO. ZDT1-3 use 30 decision variables while ZDT4 and ZDT6 use 10 decision variables. For the sake of fairness, in each test problem the standard PSO parameters for the local sub-swarm and the global sub-swarm, the number of particles and the number of generations were set to the same values.

Since no single performance metric can measure both convergence to the true Pareto frontier and the spread of the discovered Pareto-optimal solutions, four different performance metrics were used to evaluate the overall performance of the proposed algorithms:
(1) Two Set Coverage (SC) measures the relative coverage comparison of two sets: the discovered Pareto-optimal set and the true Pareto-optimal set [8, 23].
(2) Spread (S) measures how well the solutions are distributed along the discovered Pareto frontier and the extent of spread with respect to the true Pareto frontier [8].
(3) Generational Distance (GD) measures the proximity of the discovered Pareto-optimal solutions to the true Pareto-optimal solutions [8].
(4) Inverted Generational Distance (IGD), a variation of GD, measures how far the solutions in the true Pareto frontier are from those in the discovered Pareto frontier, and reduces the bias that can arise when an algorithm does not fully cover the true Pareto frontier [23].

Because the true Pareto frontiers of the ZDT problems are provided in [13, 22], a set of uniformly-distributed solutions was generated along the true Pareto frontier as a reference. To test the performance of the algorithms, two sets of experiments were conducted:
(1) The first set of experiments concentrated on the overall performance of the proposed algorithms using a larger number of fitness evaluations in comparison to NSGA-II, which represents the state-of-the-art in MOEA, and EM-MOPSO [8]. The total number of fitness evaluations of the proposed algorithms was 30,000, while the other algorithms required a minimum of 50,000 fitness evaluations.
(2) The second set of experiments concentrated on the efficiency issue using a very small number of fitness evaluations in comparison to NSGA-II and PSOMORSA [12]. The total number of fitness evaluations was 4,000 for all the algorithms.

Initially, it is desired to identify obtained solutions for which all the performance measures are gradually minimized. For the sake of brevity, Fig. 1 graphically demonstrates only the spread and the set coverage metrics measured over 3 statistically independent runs of MOPSO-NO and MOPSO-IT, each evolving over 80 generations and using a total of 4,000 fitness evaluations. It can be seen in Fig. 1 that both MOPSO-NO and MOPSO-IT are able to gradually approach the true Pareto frontier, and most of the obtained Pareto-optimal solutions cannot be dominated by the solutions along the true Pareto frontier in most of the ZDT tests. Only in the case of ZDT4 are the obtained solutions all dominated by the solutions along the true Pareto frontier in 2 of the 3 runs. In addition, MOPSO-NO converges very quickly and identifies optimal solutions starting around the 20th generation, which is better than MOPSO-IT. Also as seen in Fig. 1, the proposed algorithms are able to obtain a diverse set of Pareto-optimal solutions, as the value of spread (S) is minimized.

To further assess the overall performance of the proposed algorithms compared with other approaches in the two experiments, a total of 30 statistically independent runs were performed for each test case, and the obtained results are reported in Table 2. The parameters of the proposed algorithms are fixed during the experiments. For the global swarm in the intersection test, the inertia weight is 0.9 and the learning rates c1 and c2 are both set to 2.0, while the inertia weight is 0.4 and the learning rates are both 0.5 for the local swarm. The parameters for the normalized objectives metric are the same as those of the global swarm in the intersection test. In addition, parameters for the other algorithms are set to the values proposed in the cited literature.

In Experiment 1, the proposed algorithms MOPSO-IT and MOPSO-NO were able to obtain a good approximation and coverage of the true Pareto frontier, and they also provided a good distribution of the discovered Pareto-optimal solutions for most of the problems. It can be observed that MOPSO-IT and MOPSO-NO outperformed the others in every aspect on ZDT1-3. In the case of ZDT4 only, EM-MOPSO seemed to outperform all other methods, although MOPSO-NO had a better spread and diversity. In the case of ZDT6, however, both MOPSO-IT and MOPSO-NO outperformed all others in all aspects except for EM-MOPSO, which only produced a better set coverage (SC). In addition, it is evident that the proposed strategies are more efficient in that they only needed 30,000 fitness evaluations while the other algorithms required 50,000 fitness evaluations.

In Experiment 2 the main focus was on efficiency, and the obtained results demonstrate that MOPSO-IT and MOPSO-NO outperformed NSGA-II on most problems except for the coverage of ZDT4. Although PSOMORSA is very competitive considering the coverage performance metric for all the problems, MOPSO-IT and MOPSO-NO
exhibited a better approximation of the true Pareto frontier and also provided a good distribution of solutions along the discovered Pareto-optimal frontiers. As a whole, it is demonstrated that the two proposed algorithms are efficient, achieve the goal of multi-objective optimization, and compete well with other MOEAs, including NSGA-II, which is representative of the state-of-the-art in the area of MOEA.
Fig. 1. Spread (S) and set coverage (SC) performance metrics
Table 2. Experimental results for the various algorithms
*GD was used in Experiment 1 while IGD was used in Experiment 2.
4 Conclusions

This paper proposed two novel MOPSO-based algorithms, MOPSO-IT and MOPSO-NO, which utilize the hybrid search strategies of the intersection test and the normalized objectives metric. Performing hybrid, balanced searches involves exploration and exploitation of the solution space during evolutionary computation, and it was shown to greatly enhance the optimization ability of an optimizer while obtaining better approximations, coverage and distribution along the true Pareto frontier. The two strategies are only of O(k) complexity, where k is the number of dimensions, and they do not affect the efficiency of the algorithms (not limited to MOPSOs) into which they are incorporated.

To highlight the performance of the proposed MOPSO-based algorithms, a twofold experiment was designed and conducted. The first experiment measured the overall performance while the second focused on assessing overall optimization efficiency. During the course of the two experiments, the well-known ZDT test suite was utilized and the experimental results were statistically analyzed. It was demonstrated that the proposed MOPSO algorithms outperformed other MOEAs that are often reported in the literature, especially NSGA-II, which is representative of the state-of-the-art in the area of MOEA.
References

1. Eberhart, R.C., Shi, Y., Kennedy, J.: Swarm Intelligence. Morgan Kaufmann, San Mateo (2001)
2. Shi, Y., Eberhart, R.C.: Fuzzy adaptive particle swarm optimization. In: IEEE Congress on Evolutionary Computation, pp. 101–106 (2001)
3. He, S., Prempain, E., Wu, Q.H.: An improved particle swarm optimizer for mechanical design optimization problems. Engineering Optimization 36, 585–605 (2004)
4. Maeda, Y., Matsushita, N., Miyoshi, S., Hikawa, H.: On simultaneous perturbation particle swarm optimization. In: IEEE Congress on Evolutionary Computation, pp. 3271–3276 (2009)
5. Li, X.: Better spread and convergence: Particle swarm multiobjective optimization using the maximin fitness function. In: Deb, K., et al. (eds.) GECCO 2004. LNCS, vol. 3102, pp. 117–128. Springer, Heidelberg (2004)
6. Ho, S.L., Yang, S., Ni, G., Lo, E.W., Wong, H.C.: A particle swarm optimization-based method for multiobjective design optimizations. IEEE Transactions on Magnetics 41, 1756–1759 (2005)
7. Ochlak, E., Forouraghi, B.: A particle swarm algorithm for multiobjective design optimization. In: Proceedings of the 18th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2006), pp. 765–772 (2006)
8. Reddy, M.J., Kumar, D.N.: An efficient multi-objective optimization algorithm based on swarm intelligence for engineering design. Engineering Optimization 39, 49–68 (2007)
9. Ono, S., Nakayama, S.: Multi-objective particle swarm optimization for robust optimization and its hybridization with gradient search. In: IEEE Congress on Evolutionary Computation, pp. 1629–1636 (2009)
10. Reyes-Sierra, M., Coello Coello, C.A.: A survey of the state-of-the-art multi-objective particle swarm optimizers. International Journal of Computational Intelligence Research 2, 287–308 (2006)
11. Koduru, P., Das, S., Welch, S.M.: A particle swarm optimization-Nelder Mead hybrid algorithm for balanced exploration and exploitation in multidimensional search space. In: Proceedings of the International Conference on Artificial Intelligence, Las Vegas, Nevada, pp. 457–464 (2006)
12. Santana-Quintero, L.V., Ramírez-Santiago, N., Coello Coello, C.A., Luque, J.M., Hernández-Díaz, A.G.: A new proposal for multiobjective optimization using particle swarm optimization and rough sets theory. In: Runarsson, T.P., Beyer, H.-G., Burke, E.K., Merelo Guervós, J.J., Whitley, L.D., Yao, X. (eds.) PPSN 2006. LNCS, vol. 4193, pp. 483–492. Springer, Heidelberg (2006)
13. Coello Coello, C.A., Lamont, G.B., Van Veldhuizen, D.A.: Evolutionary Algorithms for Solving Multi-objective Problems. Springer, New York (2007)
14. Deb, K.: Multi-Objective Optimization using Evolutionary Algorithms. Wiley, Chichester (2001)
15. Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II. KanGAL Report 200001, Indian Institute of Technology, Kanpur, India (2000)
16. Deb, K., Pratap, A., Moitra, S.: Mechanical component design for multiple objectives using elitist non-dominated sorting GA. In: Proceedings of the Parallel Problem Solving from Nature VI Conference, pp. 859–868 (2000)
17. Liu, D., Tan, K., Goh, C., Ho, W.: A multiobjective memetic algorithm based on particle swarm optimization. IEEE Transactions on Systems, Man and Cybernetics - Part B: Cybernetics 37, 585–605 (2007)
18. Ray, T., Liew, K.M.: A swarm metaphor for multiobjective design optimization. Engineering Optimization 34, 141–153 (2002)
19. Clerc, M.: Particle Swarm Optimization. ISTE Ltd., California (2006)
20. Engelbrecht, A.P.: Fundamentals of Computational Swarm Intelligence. Wiley, Chichester (2005)
21. O'Rourke, J.: Computational Geometry in C, 2nd edn. Cambridge University Press, Cambridge (2001)
22. Zitzler, E., Deb, K., Thiele, L.: Comparison of multiobjective evolutionary algorithms: empirical results. Evolutionary Computation 8, 173–195 (2000)
23. Villalobos-Arias, M.A., Pulido, G.T., Coello Coello, C.A.: A proposal to use stripes to maintain diversity in a multi-objective particle swarm optimizer. In: Proceedings of the Swarm Intelligence Symposium, pp. 22–29 (2005)
Forest Planning Using Particle Swarm Optimization with a Priority Representation

P.W. Brooks and W.D. Potter

Institute for Artificial Intelligence, University of Georgia
Boyd GSRC, Room 111, Athens, GA 30602-7415
Abstract. We examine the poor performance of Discrete Particle Swarm Optimization when applied to forest planning, a combinatorial optimization problem in which the goal is to maintain an even flow of timber from a forested area of multiple plots over several time periods while cutting each plot no more than once and no two adjacent plots within the same period. We suggest an alternative priority representation using Particle Swarm Optimization with real numbers and justify it with experimental results. Keywords: particle swarm optimization, forest planning, priority representation.
1 Introduction

In [1], the authors evaluated the performance and suitability of nature-inspired optimization techniques when applied to problems chosen to exemplify the concerns faced in the areas of diagnosis, configuration, planning, and pathfinding. They found a surprising result in the planning area: Discrete Particle Swarm Optimization (DPSO), a population-based optimization method that models the behavior of animals in schools or flocks, performed very poorly compared to Genetic Algorithms (GA), a population-based optimization method that models biological evolution, and two single-element techniques, Raindrop Optimization (RO) and Extremal Optimization (EO). Their final results are reproduced in Table 1. For forest planning, a lower value is better. It is clear that their DPSO ranked last among the techniques in planning, but compared well in other areas.

The authors experimented extensively with DPSO parameters such as swarm size, inertia, and the learning constants, but none of these was sufficient to bring its performance in line with the GA or single-element methods. This suggests that if DPSO is to be improved in this area, it must be through more radical means such as hybridization with other search methods, enhancement with domain knowledge, or a reformulation of the representation. In this paper, we describe an attempt at the last of these. Specifically, we propose that particles encode relative priorities for each possible element of a plan rather than directly encoding a plan. This results in a significantly larger search space, but one which Particle Swarm Optimization (PSO), of which DPSO is a variation, can more easily navigate. We begin by describing the particular planning problem used by Potter et al. in more detail, followed by PSO, then our alternative representation, and finally our results.

Table 1. Results from [1]

                 GA         DPSO    RO         EO
  Diagnosis      87%        98%     12%        100%
  Configuration  99.6%      99.85%  72%        100%
  Planning       6,506,676  35M     5,500,391  10M
  Pathfinding    95         86      65         74
2 Forest Planning Problem

The forest planning problem deals with managing a forested area in such a way as to achieve an even flow of harvested lumber without violating constraints intended to control erosion and maintain aesthetics. The forested area is divided into plots or stands, each of which may border one or more other plots. Time is divided into discrete periods, for each of which we have estimates of how many MBF (thousand board feet, a measure of lumber volume) can be harvested from each plot. A plan comprises decisions on when and if to cut each plot to harvest its lumber. Each plot may be cut in at most one period. There is a further unit restriction adjacency constraint that no two adjacent plots may be harvested during the same time period. The closer each period's total harvest is to a target harvest constant across periods, the better the plan. Quantitatively, the goodness of a plan is given by the formula:

    Σ_{i=0}^{n} (H_i − T)²

where i is the harvest period, H_i the total timber harvested in period i, and T the target harvest constant across periods. The result represents the accumulated error of the plan. The objective of the problem is to minimize this formula, which we call a fitness function following GA terminology.

Following the lead of [1], we use an instance of the forest planning problem given in [2], the 73-plot Daniel Pickett Forest (see Fig. 1) over 3 time periods covering 10 years each. Typical harvest volumes for a western US forest are assigned to each plot. The target harvest for each period is 34,467 MBF, established by a linear programming solution without the adjacency constraints.
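To make the fitness computation concrete, here is a sketch that accumulates the squared deviation of each period's harvest from the target and checks the adjacency constraint; the toy volumes and adjacency map are placeholders, not the Daniel Pickett data.

```python
from typing import Dict, List, Optional, Set

def plan_fitness(plan: List[Optional[int]],
                 volumes: List[List[float]],      # volumes[plot][period], in MBF
                 target: float,
                 n_periods: int) -> float:
    """Accumulated squared deviation of each period's harvest from the target (lower is better)."""
    harvest = [0.0] * n_periods
    for plot, period in enumerate(plan):
        if period is not None:                    # None means the plot is not cut
            harvest[period] += volumes[plot][period]
    return sum((h - target) ** 2 for h in harvest)

def violates_adjacency(plan: List[Optional[int]], adjacency: Dict[int, Set[int]]) -> bool:
    """True if two adjacent plots are scheduled for the same period."""
    return any(plan[p] is not None and plan[q] == plan[p]
               for p, neighbors in adjacency.items() for q in neighbors)

if __name__ == "__main__":
    volumes = [[10.0, 12.0, 9.0], [8.0, 7.0, 11.0], [5.0, 6.0, 4.0]]   # toy 3-plot instance
    adjacency = {0: {1}, 1: {0, 2}, 2: {1}}
    plan = [0, 1, 2]
    print(plan_fitness(plan, volumes, target=10.0, n_periods=3),
          violates_adjacency(plan, adjacency))
```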
Fig. 1. The Daniel Pickett Forest divided into 73 plots
3 Particle Swarm Optimization

PSO is a nature-inspired optimization technique that models the movement of animals in groups such as schools of fish and flocks of birds [3,4]. PSO maintains a swarm of particles, each of which represents a potential solution to the problem. With each iteration of the PSO, the particles move through the search space influenced both by a memory of their previous best position and the best position attained by any particle in the population.

Programmatically, each PSO particle i is represented as a vector of values Xi(t) and a vector of velocities Vi(t) at discrete time step t. At each time step, new velocities and values are calculated based on the previous ones as follows:

    Vi(t) = α Vi(t − 1) + φ1 (Pi − Xi(t − 1)) + φ2 (Pg − Xi(t − 1))
    Xi(t) = Xi(t − 1) + Vi(t)

where α is an inertia constant, φ1 a random number drawn from the uniform distribution between 0 and C1, φ2 a random number drawn from the uniform distribution between 0 and C2, Pi the best location particle i has yet occupied (as measured by the fitness function), and Pg the global best location any particle has yet occupied. C1 is the cognitive constant that weights the effect of a particle's memory on its movement, and C2 is the social constant that weights the effect of other particles on a particle's movement. Velocities are limited by the constants Vmin and Vmax, which are typically defined as

    Vmax = C1 + C2
    Vmin = −(C1 + C2)
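A minimal sketch of the velocity and position update above for a single particle, with velocities clamped to [Vmin, Vmax]; the parameter values shown are only examples.

```python
import random
from typing import List

def pso_step(x: List[float], v: List[float], p_best: List[float], g_best: List[float],
             alpha: float = 1.0, c1: float = 2.0, c2: float = 2.0) -> None:
    """In-place update of one particle's velocity and position."""
    v_max, v_min = c1 + c2, -(c1 + c2)
    for d in range(len(x)):
        phi1, phi2 = random.uniform(0, c1), random.uniform(0, c2)
        v[d] = alpha * v[d] + phi1 * (p_best[d] - x[d]) + phi2 * (g_best[d] - x[d])
        v[d] = max(v_min, min(v_max, v[d]))     # clamp velocity to [Vmin, Vmax]
        x[d] += v[d]

if __name__ == "__main__":
    x, v = [0.0, 0.0], [0.1, -0.1]
    pso_step(x, v, p_best=[1.0, 1.0], g_best=[2.0, -1.0])
    print(x, v)
```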
PSO as described above is intended for problems with real-valued solutions, but it is possible to apply it to problems with integer- or bit-valued solutions by discretizing the values before applying the fitness function. This is called Discrete PSO or DPSO. The PSOs used in [1] are of this variety.
4 Priority Representation

The most direct approach to applying PSO to the forest planning problem is to encode a plan in a discretized vector X. For example, if discretized to integers, each element xn of X = x0, x1, x2, ... could indicate the period in which to harvest stand n. This can result in plans that violate the adjacency constraint. Penalties may be applied to the fitnesses of particles that violate the constraint to discourage it without completely eliminating its influence. This was the initial approach taken in [1], but it resulted in a terrible 150M fitness. Later experiments with other parameters and a bitstring representation reduced that to 35M.

Our approach to the problem, borrowing an idea from operations research [5,6], is to treat the particle not as a plan but as a set of priorities for assembling a plan. We use a real-valued PSO where the length of X is the product of the number of plots and the number of periods. If o represents the number of periods, we say that xn is the priority assigned to harvesting plot ⌊n/o⌋ in period (n mod o). To convert a particle into a plan, we first sort the values by priority, thereby forming a permutation of all possible assignments of periods to plots (ignoring for now the possibility of not cutting a plot). We then iterate through the permutation from highest priority to lowest priority, assigning the specified plot to be harvested in the specified period. Should an assignment be impossible, either because the plot has already been assigned to be harvested during another period or because it would violate the adjacency constraint, we ignore that element and move on to the next. When we are finished, any plots that have not been assigned a period (because the harvests of adjacent plots rule them all out) are not harvested.

This representation could be extended to include a priority for not harvesting each plot, but in practice doing so resulted in very poor performance. The target harvest T is such that nearly every plot must be harvested to achieve it, because the goal of even-flow management is an optimal constant supply of lumber [2], so the additional dimension needlessly expands the search space.

An advantage of this priority representation is that impossible plans cannot be generated, so there is no need for a penalty function or repair operation to account for constraint violations. This takes a considerable burden from the shoulders of the algorithm designers; when present, optimization performance hinges on the appropriateness of the penalty function or repair operation, which may need to be finely tuned to individual problems [7]. In the case of the priority representation, the fitness function essentially repairs plans based on the particle's priorities, which avoids the systemic bias that can be introduced by traditional repair operations.
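A sketch of decoding a priority particle into a plan as described above: all (plot, period) assignments are sorted by priority and accepted greedily unless the plot is already scheduled or the assignment would violate the adjacency constraint. The data structures are ours; the authors' Erlang implementation is not reproduced here.

```python
from typing import Dict, List, Optional, Set

def decode_priorities(priorities: List[float], n_plots: int, n_periods: int,
                      adjacency: Dict[int, Set[int]]) -> List[Optional[int]]:
    """priorities[plot * n_periods + period] is the priority of cutting that plot in that period."""
    plan: List[Optional[int]] = [None] * n_plots
    assignments = [(priorities[p * n_periods + t], p, t)
                   for p in range(n_plots) for t in range(n_periods)]
    for _, plot, period in sorted(assignments, reverse=True):       # highest priority first
        if plan[plot] is not None:
            continue                                                # plot already scheduled
        if any(plan[n] == period for n in adjacency.get(plot, set())):
            continue                                                # adjacent plot cut in same period
        plan[plot] = period
    return plan                                                     # unassigned plots remain uncut

if __name__ == "__main__":
    adjacency = {0: {1}, 1: {0, 2}, 2: {1}}
    priorities = [0.9, 0.1, 0.8, 0.95, 0.2, 0.3]   # 3 plots x 2 periods
    print(decode_priorities(priorities, n_plots=3, n_periods=2, adjacency=adjacency))
```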
There are also some disadvantages to this representation. First, the search space is larger. While the integer representation only has a number of dimensions equal to the number of plots, the priority representation has a number of dimensions equal to the number of plots times the number of periods. Each dimension also spans the range of the real numbers instead of a few discrete possibilities, but in practice the values rarely stray far from the interval (Vmin , Vmax ) until the particles have converged. Second, the fitness function takes longer to compute because of the introduction of the requirement that the priorities be sorted before the plan is constructed. In our experience with the 73-plot Daniel Pickett Forest scenario, time spent in the fitness function approximately doubled. This was a price well worth paying as we shall see from the results.
5 Results

To test the priority representation for the forest planning problem, we developed a real-valued PSO in Erlang, which we chose for the ease with which programs written in it can be parallelized. In order to compare the priority representation with the integer representation, we developed two fitness functions to evaluate particles.

The first, which we will call DPSO, directly encodes the plan in the particle. It expects one value per plot, which it discretizes by applying the function

    period(x_n) = | x_n (o − 1)/Vmax mod o |

where o is the number of periods.(1) The result of this function is the period during which plot n is to be harvested. No penalty function is used; to ensure adherence to the adjacency constraint, a simple repair operator blocks the harvesting of plots that conflict with it, a strategy used as the basis of the RO technique [2].

The second fitness function, which we will call PPSO, uses the priority representation discussed in the previous section. It expects o values per plot, each of which represents the priority of harvesting the plot in that period. We did not encode a priority for not harvesting plots in any period because of its poor performance in preliminary trials.

In the first set of trials, we used population sizes of 100, 500, and 1000, and the following parameters: C1 = 2, C2 = 2, Vmax = 4, Vmin = −4, α = 1.0.

(1) The inclusion of the (o − 1)/Vmax term makes it unlikely that a particle moving at maximum velocity in a dimension will end up in the same place it left. It may still happen in some circumstances because of the nature of floating point arithmetic.
Table 2. Results

  α    Pop. Size  PPSO Best  PPSO Avg   DPSO Best  DPSO Avg
  1.0  100        7,346,998  9,593,846  118M       135M
  1.0  500        6,481,785  9,475,042  133M       139M
  1.0  1000       5,821,866  10M        69M        110M
  0.8  100        8,536,160  13M        47M        70M
  0.8  500        5,500,330  8,831,332  61M        72M
  0.8  1000       6,999,509  10M        46M        59M
We ran a set of 5 trials of 2,500 iterations each for every population size. Different initial random seeds were used for each trial in a set. We used accepted values for the learning constants and velocity limits because we did not have an a priori reason to weight the social factor more heavily than the cognitive factor or vice versa. In the case of PPSO, we noticed that individual values of the particles were straying far from the origin. It was not unusual for a particle to have multiple priorities with absolute values exceeding 4Vmax. In an attempt to control this, we ran another set of trials with the same parameters except α = 0.8.

Results from all parameter settings are tabulated in Table 2. The columns labeled Best contain the fitness of the best particle from each set of 5 trials. Those labeled Avg contain the arithmetic mean of the fitnesses of the single best particles from each of the 5 trials. As expected from the work of [1], the DPSO configured in this way performed poorly. The PPSO performed very well: the best fitness in its peak case (α = 0.8 and population of 500) exceeded the performance of RO in Table 1, and even its average performance was close to that of EO.

The reason for the disparity in performance between the two appears to involve the adjacency constraints. Table 3 compares the plan of the best particle from all DPSO trials to the best of all the PPSO trials. Numerals indicate the period in which a plot is to be cut, while Xs indicate a plot is not to be cut. Because of the nature of our representations, Xs should only occur when constraints forbid the planned harvest. The DPSO particle has 10 uncut plots, while the PPSO particle has only 3. Because nearly every plot must be cut in order to meet the target harvest each period, our DPSO's way of handling adjacency constraints puts it at a disadvantage. It may be possible to improve the DPSO's performance by finding a more flexible repair operator or penalty function that allows it to cope better with the constraints.

Curiously, PPSO tended to perform best with a swarm size of 500, whereas Genetic Algorithms tend to improve in performance with larger population sizes. One possible explanation is our use of the global best particle for computing the social factor that influences the movement of each particle. A large swarm might prematurely uncover a local optimum and influence all other particles to converge on it when they should still be exploring. Some PSO implementations use the best in the neighborhood of each particle, where the neighborhood is a set of particles smaller than the whole swarm [3], which could improve the performance of large populations.
Table 3. Best particles

DPSO
  Plot  0  1  2  3  4  5  6  7  8  9
  00    1  0  2  X  2  X  0  1  X  2
  10    0  0  X  1  0  2  2  2  2  1
  20    2  0  0  1  X  1  1  0  2  2
  30    2  2  1  0  1  2  X  1  0  2
  40    X  X  0  0  0  0  2  2  1  1
  50    1  2  2  X  1  2  1  0  2  1
  60    2  X  1  0  1  1  1  0  2  2
  70    0  2  0

PPSO
  Plot  0  1  2  3  4  5  6  7  8  9
  00    0  1  0  1  1  2  1  1  2  2
  10    0  0  1  1  0  2  X  2  X  1
  20    1  2  0  2  1  1  2  1  0  2
  30    2  2  1  2  2  1  0  0  0  1
  40    X  2  1  0  0  0  1  1  1  1
  50    2  2  1  2  1  2  0  1  2  1
  60    1  0  2  2  2  2  2  1  2  2
  70    0  0  1
In summary, PSO with the priority representation copes well with constraints of the sort found in the forest planning problem, while a straightforward DPSO using the natural integer representation of a plan does not. There is a price to pay in terms of time performance, since the priority representation requires the extra work of sorting the priorities before a plan can be formed, but at least in the case of the forest planning problem the greatly improved results are worth the sacrifice. There remains a future avenue of research in finding an improved repair operator, penalty function, or representation for DPSO that can better handle forest planning adjacency constraints. Furthermore, [6] describes a hybrid of PSO with the permutation representation and crossover operators of Genetic Algorithms that may also work well with the forest planning problem.
References

1. Potter, W.D., Drucker, E., et al.: Diagnosis, configuration, planning, and pathfinding: Experiments in nature-inspired optimization. In: Natural Intelligence for Scheduling, Planning, and Packing Problems, pp. 267–294. Springer, Berlin (2009)
2. Bettinger, P., Zhu, J.: A new heuristic for solving spatially constrained forest planning problems based on mitigation of infeasibilities radiating outward from a forced choice. Silva Fennica 40, 315–333 (2006)
3. Kennedy, J., Eberhart, R.: Swarm Intelligence. Morgan Kaufmann, San Francisco (2001)
4. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proc. IEEE Intern. Conf. on Neural Netw., pp. 1942–1948. IEEE Service Center, Piscataway (1995)
5. Bean, J.C.: Genetic Algorithms and Random Keys for Sequencing and Optimization. ORSA Journal on Computing 6, 154–160 (1994)
6. Zhang, H., Li, X., et al.: Particle swarm optimization-based schemes for resource-constrained project scheduling. Automation in Construction 14, 393–404 (2005)
7. Michalewicz, Z., Schoenauer, M.: Evolutionary algorithms for constrained parameter optimization problems. Evolutionary Computation 4, 1–32 (1996)
Fuzzy Robot Controller Tuning with Biogeography-Based Optimization

George Thomas*, Paul Lozovyy, and Dan Simon

Cleveland State University, Department of Electrical and Computer Engineering
2121 Euclid Avenue, Stillwell Hall Room 332, Cleveland, Ohio 44115
{g.l.thomas71,p.lozovyy89,d.j.simon}@csuohio.edu

* Corresponding author.
Abstract. Biogeography-based optimization (BBO) is an evolutionary algorithm (EA) based upon the models of biogeography, which describe the relationship between habitat suitability and the migration of species across habitats. In this work, we apply BBO to the problem of tuning the fuzzy tracking controller of mobile robots. This is an extension of previous work, in which we used BBO to tune a proportional-derivative (PD) controller for these robots. We show that BBO can successfully tune the shape of membership functions for a fuzzy controller with both simulation and real world experimental results. Keywords: biogeography-based optimization, fuzzy logic control.
1 Introduction

Optimization is one of the main objectives of almost any engineering project. Iterating through the entire search space of a problem is usually not feasible because of the CPU time this can require. Engineers have therefore adapted natural processes such as evolution to optimization, with great results [1]. Biogeography-based optimization (BBO) [2] belongs to the class of heuristic optimization methods known as evolutionary algorithms (EAs). BBO is based upon the models of biogeography, which describe the migration of species across island habitats [3].

In previous work, we demonstrated BBO's efficacy at PD controller tuning [4]. The next step, which we take in this paper, is to test BBO's performance at optimizing a more complex controller. We have chosen fuzzy logic control because of its ease of use and increasing ubiquity, combined with the difficulty of tuning it optimally [5].

In the remainder of this section we give an overview of fuzzy logic and BBO. Section 2 describes our robot hardware. Section 3 presents our fuzzy robot control tuning approach, simulation results, and experimental results. Section 3 provides the primary contribution of this research, and reports the first successful use of BBO for fuzzy logic system tuning. Section 4 concludes with some discussion and suggestions for future work.
1.1 History of Fuzzy Set Theory to Fuzzy Logic Control

Fuzzy sets were first introduced in 1965 by Lotfi A. Zadeh [6]. Fuzzy set theory is a generalization of classic (or crisp) set theory, in which elements of a set have a continuous value that represents their membership in a class, generally in the range [0, 1]. In classic set theory, membership is discrete: an element can only be a member or a nonmember of a set. Fuzzy logic is an extension of fuzzy set theory to the field of logic. Classical logic uses Boolean operators such as AND and OR in reasoning, but fuzzy logic makes these operators ambiguous. The fuzzy generalizations of the operations AND and OR are the t-norm and s-norm, respectively. Several functions satisfy the necessary and sufficient conditions for being a t-norm or an s-norm; the most common t-norms are the min and product functions, and the most common s-norms are max and sum [7].

Zadeh later proposed fuzzy logic as a method of decision making [8]. In this scheme, the domain of each input and output variable is mapped by fuzzy adjectives that describe the value of the variable, such as "hot" or "high." Each adjective is also associated with a membership function (MF). The adjectives make it possible to form control rules by relating variables together in plain sentences. For instance, "if boiler pressure is high, then the escape valve should be slightly open." These control rules can then be written in terms of fuzzy set theory with t-norms and s-norms relating variables together. In order to go from decision making to control, fuzzy outputs must be made crisp, so that they can be used in practice. This process is called defuzzification. The first fuzzy controller was implemented by Mamdani and Assilian for steam engine control in 1975; in this defuzzification scheme, max-min product composition was used to produce crisp outputs from the inputs [9].

The universal approximation theorem states that an arbitrary function composed of a set of sub-functions can approximate any nonlinear function with an arbitrary level of accuracy [10]. A fuzzy logic controller is composed of its rule base and membership functions. These parameters determine the accuracy of the output with respect to the function we wish to approximate. Because our robot is a nonlinear system, a fuzzy logic controller is more suitable for this problem than PD control, which is linear.

1.2 BBO

The mathematical models that describe biogeography were first published in the 1960s [11]. These models describe the immigration and emigration of species between habitats, based on the fitness of the habitats. Because BBO is based on biogeography, many of its terms are borrowed from biogeography [2]. A solution, or individual, is referred to as a habitat or island. The fitness, or cost, of a habitat is often referred to as the habitat suitability index (HSI). Finally, the independent variables, or features, of a solution to a problem are referred to as suitability index variables (SIVs). Sometimes the population of candidate solutions is referred to as an archipelago. In BBO, different operators are applied to the population of solutions to accomplish the optimization; migration is one such operator. The probabilities of an SIV immigrating to a habitat, or emigrating from a habitat, are λ and μ, which are determined by the HSI. These probabilities are typically complements of each other (i.e., λ = 1 − μ).
In addition to the migration operator, there is also the mutation operator, as with most other EAs. Mutation randomly selects a solution from the population with a specified probability and sets one of its independent variables to a random value. Mutation helps to introduce new features into the population, since in BBO the population tends to become filled with copies of the same high-fitness solution as the number of generations increases. A pseudocode outline of BBO is provided in Algorithm 1.
For each candidate solution Hi
  For each solution feature s
    Select candidate solution Hi for immigration with probability λi
    If candidate solution Hi has been selected, then
      Select Hj for emigration with probability μj
      If Hj has been selected, then
        Hi(s) ← Hj(s)
      end
    end if
  next solution feature
  Probabilistically mutate candidate solution Hi
next candidate solution

Algorithm 1. A pseudocode representation of one BBO generation
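A compact sketch of one BBO generation following Algorithm 1, with rank-based migration rates satisfying λ = 1 − μ; it mirrors the pseudocode rather than any particular published implementation, and elitism is omitted for brevity.

```python
import random
from typing import Callable, List

def bbo_generation(population: List[List[float]], cost: Callable[[List[float]], float],
                   mutation_prob: float = 0.1, bounds=(-1.0, 1.0)) -> List[List[float]]:
    """Apply migration and mutation once; lower cost means a better (higher HSI) habitat."""
    n = len(population)
    ranked = sorted(range(n), key=lambda i: cost(population[i]))            # best habitat first
    mu = {idx: (n - rank) / (n + 1) for rank, idx in enumerate(ranked)}     # emigration rate
    lam = {i: 1.0 - mu[i] for i in range(n)}                                # immigration rate
    new_pop = [list(h) for h in population]
    for i in range(n):
        for s in range(len(population[i])):
            if random.random() < lam[i]:            # habitat i accepts immigration on SIV s
                donors = [j for j in range(n) if j != i]
                weights = [mu[j] for j in donors]
                j = random.choices(donors, weights=weights, k=1)[0]   # donor chosen by emigration rate
                new_pop[i][s] = population[j][s]
        if random.random() < mutation_prob:         # mutate one randomly chosen SIV
            s = random.randrange(len(new_pop[i]))
            new_pop[i][s] = random.uniform(*bounds)
    return new_pop

if __name__ == "__main__":
    sphere = lambda h: sum(x * x for x in h)        # toy cost function
    pop = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(6)]
    for _ in range(20):
        pop = bbo_generation(pop, sphere)
    print(min(sphere(h) for h in pop))
```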
BBO has been applied to several real-world problems. In addition to experimental robot control tuning, as discussed in this paper and in [4], BBO has been applied to aircraft engine sensor selection [2], power system optimization [12, 13], groundwater detection [14], mechanical gear train design [15], satellite image classification [16], and neuro-fuzzy system training for biomedical applications [17]. Recent research in the area of BBO has focused on putting it on a firm theoretical foundation, including the derivation of Markov models [18, 19] and dynamic system models [20] that describe its behavior.
2 Hardware

In this research, we use a standard two-wheeled robot design [21]. Our robots were originally assembled in 2007 [22], but we have adapted them to our work [4]. Our robots are designed to be flexible: we can replace any component at any time with a minimum of effort. Solid state components such as a microcontroller, voltage regulators, and H-bridges are mounted and hand-soldered on two-layer printed circuit boards (PCBs). The PCBs, sensors, and other electronics are mounted on thin plastic boards separated by aluminum standoffs. Two geared DC motors are attached with brass brackets. Two AA battery packs, each holding eight rechargeable batteries, supply power to the motors and PCBs separately. Figure 1 depicts one of these robots.

Each robot is equipped with a MaxStream 9Xtend radio, which is used to communicate with the base station. The base station is a PC running a Matlab® graphical
user interface (GUI), and it can transmit commands to each robot, update robot control parameters, and receive robot tracking data. The RF device can output power from 1 mW to 1 W and has a maximum outside range of 40 km. It uses a serial interface that can operate at several baud rates. The radio, costing about $200, is the most expensive device on each robot.

Fig. 1. One of our robots

One of the goals of this work is to improve upon our previous research [4]. We have taken steps towards improving the reliability of our robots by using infrared (IR) range-finding sensors to find distances. The reason for this change from the ultrasonic rangefinders, which we used in our previous work, is to reduce the effect of motor noise that made the robot's controller unstable. The sampling rate of the IR sensors is slower than that of the ultrasonic sensors, so our control code needed modification as well.

Our second upgrade in this research is a new printed circuit board. We have added opto-isolation to the robot PCBs to completely isolate the motor circuit from the logic circuit. Another benefit of opto-isolation is the elimination of the effect of back EMF produced by the DC motors. Also, the boards have been redesigned for the robot's new IR range-finders, and the redesign eliminates capacitance that could be induced between small traces on the PCB. More flexibility is also added to the new boards, such as access to every pin of the microcontroller if more components are required.
3 Description of Experiments

In this work, we use the tracking error and its derivative as the fuzzy logic controller inputs, and a motor voltage correction value as the output. Each of the inputs and outputs is mapped with five sum-normal triangular MFs. Any n sum-normal MFs that describe some variable can be described by a set of n break points. In the case of our triangular MFs, a break point can also be described as the only point in a MF where the MF is both non-differentiable and equal to 1. Therefore, the fuzzy MFs for each variable are completely specified by five parameters. Figure 2 illustrates this concept.
Fig. 2. Five fuzzy membership functions which are triangular and sum-normal. The numbered dashed lines represent break points.
BBO modifies the shape of the MFs by modifying these break points. In addition, we use the Mamdani inference system [7]. Min and max are used for the t-norm and s-norm in our work. We define a rule for each of the intersections between a pair of input MFs. Table 1 represents our rule table, where LP means a large positive, SP means small positive, Z means a value near zero, SN means small negative, and LN means large negative. Table 2 shows the search domain that was used for this work. Note that the zero MF is constrained to have a break point equal to 0. This means that the fuzzy logic system consists of 12 independent variables: four break points for error, four break points for delta-error, and four break points for delta-voltage. Table 1. Fuzzy rule table, where output corresponds to a change in motor voltage
                            Error
∆Error       LN      SN      Z       SP      LP
  LN         LN      LN      LN      SN      Z
  SN         LN      LN      SN      Z       SP
  Z          LN      SN      Z       SP      LP
  SP         SN      Z       SP      LP      LP
  LP         Z       SP      LP      LP      LP
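The sketch below (an illustration under our own naming, not the authors' code) fires the rules of Table 1 with the min t-norm and max s-norm described above; the centroid defuzzification mentioned in the comment is a common choice rather than one stated in the paper.

```python
LABELS = ["LN", "SN", "Z", "SP", "LP"]
# Rule table from Table 1: rows indexed by the ∆Error MF, columns by the Error MF.
RULE_TABLE = [
    ["LN", "LN", "LN", "SN", "Z"],    # ∆Error = LN
    ["LN", "LN", "SN", "Z",  "SP"],   # ∆Error = SN
    ["LN", "SN", "Z",  "SP", "LP"],   # ∆Error = Z
    ["SN", "Z",  "SP", "LP", "LP"],   # ∆Error = SP
    ["Z",  "SP", "LP", "LP", "LP"],   # ∆Error = LP
]

def mamdani_firing(mu_err, mu_derr):
    """Min t-norm for each rule, max s-norm to aggregate rules sharing an output label.
    mu_err, mu_derr: membership degrees of error and delta-error in the five MFs."""
    agg = {lab: 0.0 for lab in LABELS}
    for j, row in enumerate(RULE_TABLE):       # j indexes the ∆Error MF
        for i, out_lab in enumerate(row):      # i indexes the Error MF
            strength = min(mu_err[i], mu_derr[j])
            agg[out_lab] = max(agg[out_lab], strength)
    return agg  # clipped output-MF heights; defuzzify (e.g., centroid) to get ∆voltage
```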
Table 2. The search space used for each fuzzy membership function break point in the fuzzy controller
                               Break points
Variable                       LN                 SN            Z         SP           LP
Error (mm)                     [-10000, -2500]    [-2500, 0]    [0, 0]    [0, 2500]    [2500, 10000]
∆Error (mm/s)                  [-750, -100]       [-100, 0]     [0, 0]    [0, 100]     [100, 750]
∆Motor Voltage (normalized)    [-1000, -250]      [-250, 0]     [0, 0]    [0, 250]     [250, 1000]
The maximum allowable magnitudes of the large negative and large positive values were chosen to be more than an order of magnitude greater than the values that we typically used in our controller, which gave BBO additional flexibility in its optimization search. The cost function that we used is a weighted sum of the rise time of the robot trajectory to a reference point, and the integral of absolute value of the tracking error:
Cost = k1 ∫ |e(t)| dt + k2 r                                                    (1)
where k1 and k2 are weighting constants, e(t) is the tracking error (mm), and r is the rise time, which represents the length of time that the robot takes to reach 95% of the reference tracking distance. For our experiments, k1 and k2 were set to 1 and 5 respectively to give approximately equal contribution of each term to the cost function. The time duration of each robot tracking experiment was 20 seconds. 3.1 Simulation Results We ran some preliminary simulations before we performed experimentation with the robots. We ran 100 Monte Carlo simulations of BBO with a population size of 20, for 50 generations. Also, the mutation probability was set to 10% and a single elite solution was preserved between generations. We used an unusually high mutation rate because our population size was relatively small. Figure 3 shows the cost of Equation (1) decreasing as a function of generation number in BBO.
Fig. 3. BBO optimizing the fuzzy controller cost. The traces show the average and minimum cost of the entire population, averaged over 100 Monte Carlo simulations.
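As an illustration of how the cost of Equation (1) could be evaluated from a logged run, the sketch below uses a simple rectangle-rule integral and assumes that the tracking error is the remaining distance to the reference; both choices are ours, not the authors'.

```python
def tracking_cost(t, e, ref_dist, k1=1.0, k2=5.0):
    """Cost of Eq. (1): k1 * integral(|e(t)|) dt + k2 * rise_time.

    t        -- sample times in seconds (increasing)
    e        -- tracking error at each sample, in mm (assumed e = reference - distance travelled)
    ref_dist -- reference tracking distance in mm
    """
    # rectangle-rule approximation of the integral of |e(t)|
    iae = sum(abs(e[i]) * (t[i + 1] - t[i]) for i in range(len(t) - 1))
    # rise time: first instant at which 95% of the reference distance has been covered;
    # if never reached, fall back to the full experiment duration
    rise = next((ti for ti, ei in zip(t, e) if abs(ei) <= 0.05 * ref_dist), t[-1])
    return k1 * iae + k2 * rise
```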
3.2 Experimental Results In our experiment on the real robots, we ran 10 generations of BBO with a population size of 6 and a mutation probability of 20%. One elite solution was kept between generations. Figure 4 shows the cost as BBO optimizes the population each generation. Figure 5 shows the robot paths representing the best solutions of the first and final generations. The integral of absolute error changed from 1476 mm·s for the first generation to 717 mm·s in the final generation, which is a decrease of 51%. Figure 6 shows the MFs of the best solution produced by BBO.
Fig. 4. BBO optimizing the fuzzy controller cost on the real robot
Fig. 5. Tracking error of best solution at the first and last generation
Fig. 6. Fuzzy MFs produced by BBO after 10 generations
4 Conclusion The difficulty associated with tuning a fuzzy controller makes the application of EAs to this problem very appropriate. We have shown that BBO can perform this task well both in simulations and in a real robotic system. Future work may include performing these experiments with different parameter settings. Changing parameter settings such as population size and mutation probability may not provide much information for qualitative tests like we have performed here, which were designed to test whether or not BBO can optimize the MFs of a fuzzy logic system. However, such tests would be very useful for obtaining more quantitative results. For example, if we wanted to examine how fast BBO could produce a set of membership functions with a cost value below some threshold, it would be important to find optimal BBO parameter settings. Other future work may include using BBO to tune non-triangular MFs, such as MFs with trapezoidal or Gaussian shapes. BBO could also be used to tune the rule base of a fuzzy controller. Also, we can apply BBO to the optimization of other control algorithms, such as artificial neural networks. It would be very interesting to compare BBO’s performance at tuning our fuzzy controller versus BBO’s performance at tuning our PD controller [4], but it is difficult to compare these two tasks with the data from our experiments. Although we use the same cost function to evaluate the performance of both controllers, their input
parameters are very different. The PD controller’s inputs are proportional gain and derivative gain, and the fuzzy controller’s inputs are the shapes of its constituent membership functions. These quantities are not comparable in any meaningful way. Therefore, the domains of input values for each problem are not comparable. Future work in this regard would be to redesign our experiments in such a way that we can make meaningful comparisons between BBO’s performance at these two problems. Perhaps we can then draw conclusions about tradeoffs between controller robustness and tuning complexity. Acknowledgment. This material is based upon work supported by the CSU Provost’s Office, and by the National Science Foundation under Grant No. 0826124.
References 1. Fogel, D.B.: Evolutionary Computation. John Wiley and Sons, Hoboken (2006) 2. Simon, D.: Biogeography-Based Optimization. IEEE Transactions on Evolutionary Computation 12(6), 702–713 (2008) 3. Lomolino, M.V., Riddle, B.R., Brown, J.H.: Biogeography. Sinauer Associates, Sunderland (2009) 4. Lozovyy, P., Thomas, G., Simon, D.: Biogeography-based optimization for robot controller tuning. In: Igelnik, B. (ed.) Computational Modeling and Simulation of Intellect: Current State and Future Perspectives. IGI Global (in print, 2011) 5. Jantzen, J.: Tuning of fuzzy PID controllers. Denmark Tech. Report no 98- H 871(fpid) 30, 1–22 (1998) 6. Zadeh, L.A.: Fuzzy Sets. Information and Control 8(3), 338–353 (1965) 7. Jang, J.-S.R., Sun, C.-T.: Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence. Prentice Hall, Upper Saddle River (1997) 8. Zadeh, L.A.: The Concept of a Linguistic Variable and its Application to Approximate Reasoning. Inf. Sci. 8, 199–249 (1975) 9. Mamdani, E.H., Assilian, S.: An Experiment in Linguistic Synthesis with a Fuzzy Logic Controller. Int. J. Man-Machine Studies 7(1), 1–13 (1975) 10. Tikk, D., Kóczy, L.T., Gedeon, T.G.: A Survey on Universal Approximation and its Limits in Soft Computing Techniques. Int. J. Approx. Reasoning 33(2), 185–202 (2003) 11. MacArthur, R., Wilson, E.: The Theory of Biogeography. Princeton Univ. Press, Princeton (1967) 12. Rarick, R., Simon, D., Villaseca, F., Vyakaranam, B.: Biogeography-based optimization and the solution of the power flow problem. In: IEEE Conference on Systems, Man, and Cybernetics, pp. 1029–1034 (2009) 13. Roy, P., Ghoshal, S., Thakur, S.: Biogeography-based optimization for economic load dispatch problems. Electric Power Components and Systems (38), 166–181 (2010) 14. Kundra, H., Kaur, A., Panchal, V.: An integrated approach to biogeography based optimization with case based reasoning for retrieving groundwater possibility. In: 8th Annual Asian Conf. and Exhibition on Geospatial Information, Tech. and Applications (2009) 15. Savsani, V., Rao, R., Vakharia, D.: Discrete optimisation of a gear train using biogeography based optimisation technique. Int. J. Design Eng. (2), 205–223 (2009) 16. Panchal, V., Singh, P., Kaur, N., Kundra, H.: Biogeography based satellite image classification. Int. J. of Comp. Sci. and Info. Security (6), 269–274 (2009)
17. Ovreiu, M., Simon, D.: Biogeography-based optimization of neuro-fuzzy system parameters for diagnosis of cardiac disease. In: Genetic and Evolutionary Computation Conference, pp. 1235–1242 (2010) 18. Simon, D., Ergezer, M., Du, D., Rarick, R.: Markov models for biogeography-based optimization. IEEE Transactions on Systems, Man, and Cybernetics (Part B: Cybernetics), 299–306 (2011) 19. Simon, D., Ergezer, M., Du, D.: Population distributions in biogeography-based optimization algorithms with elitism. IEEE Conference on Systems, Man, and Cybernetics, 1017– 1022 (2009) 20. Simon, D.: A Dynamic System Model of Biogeography-Based Optimization (2010) (submitted for publication) 21. Jiang, X., Motai, Y., Zhu, X.: Predictive Fuzzy Logic Controller for Trajectory Tracking of a Mobile Robot. In: Proc. IEEE Workshop on Soft Comp. in Ind. Appl., pp. 29–32 (2005) 22. Churavy, C., Baker, M., Mehta, S., Pradhan, I., Scheidegger, N., Shanfelt, S., Rarick, R., Simon, D.: Effective Implementation of a Mapping Swarm of Robots. In: IEEE Potentials, pp. 28–33 (2008)
Development of a Genetic Fuzzy Controller for an Unmanned Aerial Vehicle Y. Qu, S. Pandhiti, K.S. Bullard, W.D. Potter, and K.F. Fezer Institute for Artificial Intelligence University of Georgia Boyd GSRC, Room 111 Athens, GA 30602-7415
Abstract. Autonomous Unmanned Aerial Vehicles (UAVs) have been increasingly employed by researchers, commercial organizations, and the military to perform a variety of missions. This paper discusses the design of an autopilot for an autonomous UAV using a messy genetic algorithm for evolving fuzzy rules and fuzzy membership functions. The messy genetic algorithm scheme has been adopted because it satisfies the need for flexibility in terms of the consequents applied within the conditional statement framework used in the fuzzy rules. The fuzzy rules are stored in a Learning Fuzzy Classifier System (LFCS) which executes the fuzzy inference process and assigns credit to the population during flight simulation. This framework is useful in evolving a sophisticated set of rules for the controller of a UAV, which deals with uncertainty in both its internal state and external environment. Keywords: Unmanned Aerial Vehicle, Genetic Algorithm, Fuzzy Control.
1 Introduction At present, unmanned aerial vehicles (UAVs) are widely employed by researchers, commercial organizations, and the military to perform a variety of missions. Although many of these UAVs are controlled from a remote location, relying on human support for control, stability, and navigation, autonomously controlled UAVs are becoming increasingly ubiquitous. The classical method for autonomously controlling an aircraft is to use a control loop feedback mechanism called a proportional-integral-derivative (PID) controller. The PID controller depends on an input representing the error between the desired state and the current measured state of the aircraft. Although PID controllers are the most commonly used feedback controllers and are widely used in industrial control systems, this paper proposes a novel methodology for developing an unmanned aircraft system that attempts to outperform the classical PID methodology, in terms of navigation and stability, using a rule-based Learning Fuzzy Classifier System (LFCS). A sophisticated autonomous UAV controller is capable of dealing with various flying scenarios and, therefore, must take into account a considerable amount of knowledge to effectively manage flight. Hence, a methodology using an LFCS that generates useful fuzzy control rules for the on-board autopilot
of a UAV is proposed in this paper. In the LFCS, a messy genetic algorithm evolves the fuzzy rules and fuzzy membership functions simultaneously. The rule development is done using a simulation of the UAV’s aerodynamics and flight path, in MATLAB, in order to continually assess the control rules being developed by the LFCS. Additionally, a visualization of the UAV’s flight path traversal is created in FlightGear, an open-source flight simulator. Ultimately, the simulation and visualization process provides the opportunity to assess the accuracy of the control rules developed as well as the stability of the flight operations performed in a virtual environment, and the rules can then later be implemented and evaluated on an actual UAV.
2 Background The unmanned aircraft system being developed at the Institute for Artificial Intelligence at the University of Georgia (UGA) is a fixed-wing (Zagi) mini-UAV. The actuators of this aircraft are elevons, which are a combination of the elevator and aileron control mechanisms of an airplane. We use a Blackboard system to communicate between the different layers in the rule-based architecture. Initially, the Parser posts the major flight tasks on the Blackboard to be completed in order to achieve the overarching flight mission. The Flight Manager divides these flight tasks into lower-level tasks and posts them to be executed by the Controller. Although the flight goals are executed sequentially, many of the tasks and sub-tasks can be executed simultaneously, and when completed, this information is also updated on the Blackboard, until the overall mission is accomplished. Figure 1 is a flow diagram of the overall system:
Fig. 1. The UAV system architecture
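A minimal sketch of the blackboard interaction described above is given below; the class and method names are ours, and the paper does not prescribe any particular implementation.

```python
class Blackboard:
    """Toy blackboard for the Parser -> Flight Manager -> Controller flow of Fig. 1."""
    def __init__(self):
        self.tasks = []       # flight tasks posted by the Parser
        self.subtasks = []    # lower-level tasks posted by the Flight Manager
        self.status = {}      # completion updates posted by the Controller

    def post(self, store, item):
        getattr(self, store).append(item)

    def mark_done(self, item):
        self.status[item] = "done"

bb = Blackboard()
bb.post("tasks", "fly to waypoint A")        # Parser posts a mission-level task
bb.post("subtasks", ("climb", 500))          # Flight Manager decomposes it
bb.post("subtasks", ("hold heading", 90))
bb.mark_done(("climb", 500))                 # Controller reports completion
```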
In order to achieve the goals set forth, we have decided on an LFCS as the approach to be used inside the controller, replacing the widely used PID control methodology. In understanding the methodology presented in this paper, it is essential to understand how fuzzy logic, genetic algorithms, and learning classifier systems are used in applications. Fuzzy control provides a flexible tool to model the relationship between input information and control output. Building on Zadeh’s theory of fuzzy sets [1], a fuzzy control strategy yields the powerful ability to represent and process imprecise, uncertain knowledge. It has the advantage of describing its knowledge by means of linguistic concepts without requiring the complexity and precision of mathematical models. Thus, the solution to a problem can be cast in terms that human operators can understand, so that their experience can be used in the design of the controller. This makes it easier to mechanize tasks that are already successfully performed by humans. Multiple applications have emerged over decades, such as multi-dimensional motion control [2], temperature control [3], traffic control [4], and the control of mobile robots [5]. A genetic algorithm (GA) serves as the mechanism for generating adaptive rules based on the existing population. We have chosen to use a messy genetic algorithm because it has proven to be an efficient way to evolve rules for an autonomous vehicle capable of avoiding obstacles while travelling [6]. The “messy” part of the genetic algorithm refers to the variable length of its chromosomes, in which each gene represents a single tuple. This provides flexibility for the control rules, allowing each individual to encode rules of different lengths and thus permitting greater variation. During the evolution process, the GA evolves both the fuzzy relation (rule) and the membership function. By doing this, it allows a minimal amount of expertise and prior knowledge to be used in interpreting the input variables of the UAV controller, as well as in performing the flight operations. Based on a revised Pittsburgh model of learning classifier systems (LCS), we evolve an autonomous controller whose genetic operators (selection, crossover, and mutation) are carried out locally on individual rules, as in a traditional Michigan approach. However, some other operators (replacement and credit assignment) are executed globally, on the whole population, as in a Pittsburgh approach. The motivation behind using this revised version of an LCS is to compensate for the complexity of the fuzzy rule system and control conditions as stated in Section 3.
3 UAV Controller Development 3.1 Fuzzy Rules The fuzzy rules are created to respond to any type of scenario which may be encountered during normal flight. Rules are categorized into rule families [7], in which each rule family defines one type of the four fundamental flying maneuvers: straight-and-level flight, turns, climbs, and descents. A flying maneuver first involves the evaluation of the current flying state. A sensor value is translated to a membership degree based on its corresponding membership function. Thus a real value is mapped onto a fuzzy value describing one aspect of the airplane’s state. Some sample states of the airplane are shown below:
Table 1. Sample fuzzy variables and corresponding states

Fuzzy Variable    Airplane State
AltHIGH           Current altitude is higher than desired altitude
AltLOW            Current altitude is lower than desired altitude
SpdHIGH           Current speed is higher than desired speed
SpdLOW            Current speed is lower than desired speed
HdgRIGHT          Current heading is to the left of desired heading
These states can be joined in the conditional portion of the IF-THEN statement using AND or OR to represent the airplane’s state. The result of each successful conditional rule is an assignment of a desired amount of change in the actuators of the aircraft, the throttle and the right and left elevons, from the current state in order to achieve the desired state of the aircraft. The degree of change in the actuator values is defined by fuzzy logic descriptors such as Sharply, Gently, and Slightly. These are then defuzzified into real-values and transmitted to the actuators to perform the desired actions. 3.2 Messy Genetic Algorithm 3.2.1 Evolving Fuzzy Rules Different from the traditional genetic algorithm which takes a fixed-length bit string as an individual, the messy genetic algorithm uses chromosomes of differing lengths, where each gene is represented by a specific tuple. In our framework, each tuple comprises three parts: the input variable, the fuzzy variable, and the clause connector. A tuple representing a high altitude is constructed as: (Alt, High, AND/OR).
(1)
A rule uses several tuples to represent the conditions and the consequents of the rule. Within each chromosome, each input variable will occur at most once as part of a tuple. Figure 2 is a typical fuzzy rule representation.
Fig. 2. A sample chromosome
The crossover and mutation operators follow the traditional pattern to evolve rules. Only rules in the same family can generate offspring through crossover, and the mutation operator can only change a gene to another gene pre-defined in that rule family. The crossover operator first cuts the chromosome at random points in the parents and then concatenates the fragments produced by the cut operation to form the new chromosome of the child. If the crossover point is part of the consequent segment in one parent but part of the conditional segment in the other parent, then the new chromosome is re-ordered so that all conditional genes fall before the consequent genes in each child. When there is more than one tuple containing the same input variable, the operator randomly selects one to remove. Figure 3 shows an example of the crossover operation.
Fig. 3. A crossover example
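The sketch below illustrates the crossover just described on a hypothetical gene encoding (segment tag, input variable, fuzzy value, connector); the encoding and the way duplicate-variable genes are dropped are our assumptions, not the authors' exact implementation.

```python
import random

# A gene is (segment, variable, fuzzy_value, connector); segment is "cond" or "cons".
# A chromosome is simply a list of such genes (illustrative representation only).

def crossover(parent_a, parent_b):
    """Single-point messy-GA crossover between two rules of the same family."""
    cut_a = random.randint(1, len(parent_a) - 1)
    cut_b = random.randint(1, len(parent_b) - 1)
    child = parent_a[:cut_a] + parent_b[cut_b:]

    # re-order so that conditional genes fall before consequent genes
    child = [g for g in child if g[0] == "cond"] + [g for g in child if g[0] == "cons"]

    # if the same input variable appears in more than one conditional gene, randomly keep one
    by_var = {}
    for g in child:
        if g[0] == "cond":
            by_var.setdefault(g[1], []).append(g)
    for genes in by_var.values():
        for extra in random.sample(genes, len(genes) - 1):
            child.remove(extra)
    return child

# example (hypothetical genes):
# parent_a = [("cond", "Alt", "High", "AND"), ("cond", "Spd", "Low", None),
#             ("cons", "Elevon", "Gently", None)]
```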
In mutation, the operator selects any gene in the rule (either conditional or consequent), and then either changes the input variable or the fuzzy variable in the tuple, as long as the new input variable/fuzzy variable combination matches those defined for that rule family. 3.2.2 Evolving Fuzzy Membership Functions The fuzzy membership functions for the input variables are specially coded to distinguish them from the fuzzy rules, and only mutation is allowed on the fuzzy membership functions. Sensor inputs are automatically associated with real values [10]. For each input and output variable, there is one fuzzy membership chromosome to represent its fuzzy shape. Initially, we set all these fuzzy membership functions to even distributions across the input sensor range. Therefore the gene for a fuzzy variable is defined by its center point and its lower and upper bounds, as below. (Ci, Li, Ui).
(2)
One fuzzy membership function has several genes utilizing its fuzzy variables. The mutation of the fuzzy membership function is a non-uniform mutation where a random value in a relatively small range is drawn from a Gaussian distribution with a mean of zero and then added to or subtracted from the current gene. The mutation is executed from the leftmost gene to the rightmost gene. Within the mutation function is a check to make sure that the mutation is within the input range and that every value in the sensor input range is covered by at least one fuzzy variable. 3.3 Fuzzy Learning Classifier System 3.3.1 Matching The matching activity includes computing the matching degree, and then associating it with the classifiers in the match set. This data represents the current state of the UAV. Along with the desired state from the flight planner, we can specify the relation between the current state and the desired state of each control variable, like
“the altitude is too high” or “the speed is too low”. This corresponds to the error input used in a PID controller. 3.3.2 Rule Activation Rule activation has to select the rules that will generate the output operations. In an LFCS, all the rules matching with a degree greater than a given threshold are triggered, and the action is computed by an aggregation of the outputs of the activated rules. This is then defuzzified to interpret the desired operation into precise surface control values, such as a 20% increase on the throttle. In the rule base, the rules with fewer conditions represent more general situations and help to reduce the size of the rule base. However, when there is a more specific rule, we allow it to overrule a more general rule. In addition, if there is a situation in which none of the rules match the current input, a new rule is automatically generated with the current situation as its premise and a random number and configuration of actions as its consequence.
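Before continuing, the rule-activation step of Section 3.3.2 can be sketched as follows; the threshold value and the way specific rules overrule general ones are illustrative simplifications of ours, not the authors' exact policy.

```python
def activate(rules, degree_of, threshold=0.1):
    """Fire every rule whose matching degree exceeds a threshold, let more specific
    rules (more conditions) overrule more general ones, and fall back to 'covering'
    when nothing matches."""
    fired = [(r, degree_of(r)) for r in rules if degree_of(r) > threshold]
    if not fired:
        return "cover"            # caller creates a new rule with random consequents
    most_specific = max(len(r.conditions) for r, _ in fired)
    winners = [(r, d) for r, d in fired if len(r.conditions) == most_specific]
    return winners                # aggregate their outputs, then defuzzify to actuator values
```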
Fig. 4. A lazy eight flying pattern
A lazy eight flying pattern, as shown in Figure 4, is a maneuver designed to develop perfect coordination of controls through a wide range of airspeeds and altitudes, where a series of flying control rules will be triggered to respond to altitude, airspeed, and/or heading changes. We evaluate the flying operations based on triggered rules, cumulative yaw error, tracking deviation, etc. A score is given at the end of the lazy eight to represent the fitness of this rule generation. After the evolution operations, if a generation outperforms the best-so-far generation, then it replaces the old generation and becomes the new best population. Otherwise, the best-so-far remains in place and another series of evolutions is performed until a better generation arises.
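The generational accept-if-better scheme described above can be sketched as follows; 'evolve' and 'evaluate' are placeholders for the messy-GA operators and the simulated lazy-eight scoring, and the assumption that a higher score means better fitness is ours.

```python
def evolve_controller(initial_population, evolve, evaluate, generations=100):
    """Keep the whole rule population as the unit of selection (Pittsburgh-style):
    a new generation replaces the incumbent only if its lazy-eight fitness is better."""
    best = initial_population
    best_score = evaluate(best)          # score from a simulated lazy-eight run
    for _ in range(generations):
        candidate = evolve(best)         # crossover/mutation applied to individual rules
        score = evaluate(candidate)
        if score > best_score:           # assumes higher score = better fitness
            best, best_score = candidate, score
    return best
```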
3.4 Test Methodology A simulation environment has been developed for the UAV to provide a platform for testing and tuning algorithms and to evaluate the behavior of the UAV prototype in different scenarios. The simulation in this project is developed using MathWorks’ Simulink software to model the physics and the dynamics of the airframe, actuators, sensors, environment, and controller. The simulation runs using the current rule population, and once complete, an evaluator measures the controller’s performance. This is fed back to the GA in the LFCS as fitness, and a new generation is created to produce a new controller, restarting the process. The evolution of the GA and the development of the simulation are still in progress. FlightGear will be used in conjunction with Simulink to produce useful graphical outputs to analyze and monitor the motion of the aircraft. We expect a rule base for the controller with more than 100 rules that can successfully perform a smooth lazy eight flying maneuver. From there, the final stage will be the assessment and analysis of the rule-based system and implementation on the actual hardware.
4 Future Work The broader scope of the LFCS methodology proposed in this paper is to develop an intelligent controller that is able to outperform the traditional PID control methodology and reliably, accurately, and efficiently minimize the error between the current and desired state of an actual UAV, so as to consistently achieve the desired state of the aircraft, aviate smoothly, avoid obstacles, and perform smooth transitions, all in the midst of environmental factors that may affect the aircraft’s performance. Our current efforts are aimed at developing the controller using a simulation process that mimics a real flying environment. Once this overarching goal has been achieved, our goal is to successfully and autonomously pilot the Zagi mini-UAV, given a destination, intermediate waypoints, and a flightpath as inputs to the intelligent controller.
References 1. Zadeh, L.A.: Fuzzy Sets. Information and Control 8(3), 338–353 (1965) 2. Huang, L.J., Tomizuka, M.: A self-paced fuzzy tracking controller for two-dimensional motion control. IEEE Trans. Systems Man Cybernet 205, 1115–1124 (1990) 3. Ollero, A., Garcia-Cerezo, A.J.: Direct digital control auto-tuning and supervision using fuzzy logic. Fuzzy Sets and Systems 30, 135–153 (1988) 4. Saski, T., Akiyama, T.: Traffic control process of expressway by fuzzy logic. Fuzzy Sets and System 26, 165–178 (1988) 5. Hoffmann, F., Pfister, G.: Genetic Evolutionary Learning of a Fuzzy control Rule Base for an Autonomous Vehicle. In: Proceedings of the Fifth International Conference on Information Processing and Management of Uncertainty in Knowledge-Based System (IPMU 1996), pp. 659–664 (1996) 6. Tan, K., Lee, T., Lee, L.: A Messy Genetic Algorithm for the Vehicle Routing Problem with Time Window Constraints. In: Proceedings of the 2001 Congress on Evolutionary Computation (CEC 2001), pp. 679–686 (2001)
7. Shu, L., Schaffer, J.: HCS: “Adding hierarchies to classifier systems”. In: Proceedings of the Fourth International Conference on Genetic Algorithms, Los Altos, CA, pp. 339–345 (1991) 8. Van Veldhuizen, D.A.: Genetic Multiobjective Optimization with Messy Genetic Algorithms. In: Proceedings of the Fifteenth ACM Symposium on Applied Computing (SAC 2000), pp. 470-476 (2000) 9. Riley, J., Ciesielski, V.: Evolving Fuzzy Rules for Reactive Agents in Dynamic Environments. In: Proceedings of the Fourth Asia-Pacific Conference on Simulated Evolution and Learning (SEAL 2002), Singapore, pp. 124–130 (November 2002) 10. Liska, J., Melsheimer, S.S.: Complete design of fuzzy logic systems using genetic algorithms. In: Proceedings of Third IEEE International Conference on Fuzzy Systems, pp. 1377–1382. IEEE, Piscataway (1994) 11. Bonarini, A.: An Introduction to Learning Fuzzy Classifier Systems. In: Lanzi, P.L., Stolzmann, W., Wilson, S.W. (eds.) IWLCS 1999. LNCS (LNAI), vol. 1813, pp. 83–104. Springer, Heidelberg (2000) 12. Beard, R.W., McLain, T.W.: An Introduction to Autonomous Miniature Air Vehicles. Brigham Young University (2007) 13. Christiansen, R.S.: Design of an Autopilot for Small Unmanned Aerial Vehicles. Master Thesis in Electrical and Computer Engineering at Brigham Young University (August 2004) 14. Sorton, E., Hammaker, S.: Simulated Flight Testing of an Autonomous Unmanned Aerial Vehicle Using Flightgear. Institute for Scientific Research, Fairmont, West Virginia (September 2005) 15. Stevens, B.L., Lewis, F.L.: Aircraft Control and Simulation, 2nd edn. John Wiley & Sons, Inc., Chichester (2003)
Toward Evolving Self-organizing Software Systems: A Complex System Point of View Liguo Yu1, David Threm2, and S. Ramaswamy3 1
Computer Science and Informatics, Indiana University South Bend, South Bend, IN, USA 2 Defense Enterprise Applications, ITT Corporation, Fort Wayne, IN, USA 3 Industrial Software Systems, ABB Corporate Research Center, Bangalore, India
[email protected],
[email protected],
[email protected] Abstract. Complex systems feature a large number of interacting components and as a whole exhibit one or more properties which are not entirely obvious from the properties of individual components. One important property of complex systems is self-organization, which is a process of reorganizing its components or functions to adapt to its environment without being actively guided or managed by an outside source. Complex systems are widely noticeable in many areas, such as physics, chemistry, biology, mechanics, telecommunication, and sociology. In the software engineering field, considerable work has been performed to produce self-adaptive, self-healing, self-optimizing, and self-managing systems. In this paper, we attempt to summarize and discuss key properties of complex self-organizing systems (both natural and social) and compare them with similar software systems. Our objective is to provide a unifying perspective for software engineers, enabling them to study nature-inspired engineering as a viable alternative for producing complex, evolving, and self-organizing software systems. Keywords: Complex system, self-organization, architecture evolution, nature-inspired engineering.
1 Introduction “At the beginning, the whole universe was not larger than a small dot, the dot of this ‘i’, or the dot at the end of this sentence. How did this dot-world become as complex as we know it today? Why did the big bang not form a simple gas of particles, or condense into one big crystal? How can the universe start with a big explosion and end up with life, history, and culture?” These open questions were raised by Fromm in his work to study the emergence and evolution of system complexity [1]. Complex systems are widely seen in our living world, which includes physical systems, biological systems, social systems, and manmade systems, such as mechanical systems and software systems. One important property of complex systems is self-organization, which is a process of reorganizing
components or functions without being actively guided or managed by an outside source. Although self-organization is a common research subject, it has different meanings for different systems, in different contexts. Self-organization, in natural and social systems, refers to the process in which a system can adjust its components and evolve to a high level of complexity. Figure 1 shows the evolution (from bottom to top) of several complex systems. Consider Figure 1(b), the self-organization of a biological system: atoms form molecules, molecules form cells, and cells form tissues, etc. In other words, the self-organization process could be viewed as the evolution of a complex system in order to adapt. In mechanical and software systems, self-organization is the process whereby a system adds functionality while its components and complexity remain unchanged. In other words, self-organization in such systems exhibits little to no evolutionary properties. Considerable work has been done in the software engineering field to produce self-adaptive, self-healing, self-optimizing, and self-managing systems that dynamically adapt to resource variability, changing user needs, and system faults. Current research is focused more on adaptive, small-scale reorganizations. Such research is highly domain-oriented, which means the self-organizing software products are designed and produced for a specific application. As a result, self-organization in today’s systems is costly to build, difficult to maintain, and usually provides only localized adjustments to changing environments, exceptions, or faults. Given this background, in this paper we examine nature-inspired engineering: we summarize the key properties of complex natural and social self-organizing systems, and compare these to software systems evolution and adaptation. Our objective is to examine similarities and surface new design methods and architectural styles in order to produce complex, self-organizing and evolving software systems. The remainder of this paper is organized as follows. In Section 2, we discuss key properties of evolving, self-organizing systems. In Section 3, we compare software systems with other self-evolving complex systems along the hierarchy dimension. In Section 4, we review related work and outline future research directions. In Section 5, we present an enterprise systems perspective. Our conclusions are in Section 6.
2 Evolving, Self-organizing Systems In some of our earlier work, we have outlined three key properties of complex systems: modularity, hierarchy, and interaction locality [2] [3] [4]. Modularity inherently views a complex system as a composition of modules. Thus, the interaction among these modules determines the system’s macro-behavior, i.e., its high-level behavior. Commonly seen examples of such modular organization include cells in an organic system, human beings in a society, and cars in traffic. In Figure 1, string, atom, and organism can be considered the basic modules of these three systems. Besides basic modules, a complex system could also have several layers of intermediate modules. These intermediate modules are compositions of basic or lower-level intermediate modules. In Figure 1, star, cell, and community are to be considered intermediate modules.
As pointed out by Simon [5], for complex systems to emerge, i.e., to evolve from a lower to a higher level of complexity, hierarchy of structure is a fundamental necessity. Hierarchical structure has been observed in many, if not all, complex systems in the world. Hierarchical structure is seen in physical systems, social systems, and manmade systems; examples include atoms and stellar galaxies, organizations and governments, and transportation and computer systems. A large system consists of subsystems, which consist of other subsystems, and so on. Figure 1 illustrates how these subsystems are hierarchically structured through many layers. Thus systems can be assumed to be formed from modules. Two types of systems are described in [6]: simple systems and complex systems. Simple systems do not exhibit hierarchical structure, while complex systems exhibit varying degrees of hierarchical structuring. Considering a natural system that contains elementary particles, say atoms, it could either form a piece of rock or a molecule. A piece of rock has a flat structure; it is the simple composition of atoms and, accordingly, can be considered a simple system. The molecule has a hierarchical structure; it is the result of both the composition and interaction of atoms, and accordingly, it can be considered a complex system [6]. Hence, a hierarchical structure is essential for the evolution of existing complex systems as well as the emergence of new complex systems. Hierarchical structure with intermediate layers can increase evolutionary speed. For example, the time required to build a complex system will be much shorter if the system contains one or more stable intermediate components (layers) [5]. Consider a spacecraft, a complex system; without intermediate evolution, it would be impossible to form this system from elementary particles. Simon claimed that most complex systems in the world have a hierarchical structure [7] [8]. Figure 1 shows the hierarchical structure and self-organization, or emergence, of complex systems: from bottom to top, the system evolves from a low degree of complexity to increasingly higher degrees of complexity. Furthermore, the internal mechanism that governs the fast evolution of hierarchical systems lies in the interactions between individual components, or modules, of the system. As commonly observed in hierarchical systems, the degree of interaction between components tends to decrease as we go up the hierarchy. Consider the example given in [6], TNT and the atomic bomb, which are compositions of molecules and atoms, respectively. Molecules are the upper level of components built from chemical bonds; atoms are lower-level components built from atomic bonds. Since atoms are at a lower level than molecules, they interact more intensively and have more energy associated with these interactions (bonds). When these bonds are broken, the energy related to the interactions is released. This is why atomic bombs release more explosive energy than TNT, although both can have the same quantity of matter [6]. Interaction locality of hierarchical systems has two major impacts on system complexity evolution: separation of vertical interaction and separation of horizontal interaction [5].
• The separation of vertical interaction allows the structure and behavior of high-level components (modules) to be independent of the dynamics of lower-level components (modules).
• The separation of horizontal interaction allows components
(or modules) at the same level to operate independently of the details of other components; only the input a component requires and the output it produces are relevant for the macro aspects of system behavior [5]. Figure 2 illustrates the two impacts of interaction locality on a biological system, an organism. First, the upper-level module, the organ, is clearly separated from the lower-level module, the cell. Therefore, when we study the behavior of an organism, we only need to consider the organ as the basic unit, ignoring the detailed structure of cells and the corresponding molecules and atoms. Second, components at the same level are loosely coupled. For example, an organism’s circulatory system is loosely coupled to its nervous system, one organism is loosely coupled to another organism, and it is extremely hard for a cell of one organism to interact with that of a different organism.
Fig. 1. The evolution of (a) a natural system; (b) a biological system; and (c) a biosphere system
Fig. 2. The separation of vertical interaction and horizontal interaction in an organism
Evolving, self-organizing complex systems and their key properties, including modularity, hierarchy, and interaction locality, have been studied for decades in many different disciplines, for example particle physics, chemistry, biology, psychology, and sociology [10] [11]. Most of these systems are natural, biological, or social systems. In this paper, we attempt to discuss the same concepts with respect to complex software systems. It is to be noted that these three key properties of complex systems are distinct, yet closely related: because of interaction locality, intermediate-level modules can be formed, and a hierarchical structure can be established. Therefore, in the next section, we discuss hierarchy structures in more detail.
3 Hierarchy Structures in Complex Systems “The Chinese boxes called ‘hierarchies’ are a variant of that pattern. Opening any given box discloses not just one new box within, but a whole small set of boxes; and opening any of those component boxes discloses a new set in turn. While the ordinary set of Chinese boxes is a sequence, or complete ordering, of the component boxes, a hierarchy is a partial ordering--specifically, a tree.”
The distinguished Nobel Laureate Herbert Simon used the above metaphor to explain the meaning of hierarchy [5]. Simon even speculated that all the complex systems in the world that can be observed by human beings are hierarchically structured; even if there were a complex system that is not hierarchically structured, we might not be able to observe it [12]. Since Simon, hierarchy theory has emerged as part of a movement toward a general science of complexity [13] [14] [15] [16]. Several other scientists were also awarded Nobel Prizes for work related to hierarchy theory. For example, chemist Ilya Prigogine identified what happens when new hierarchical levels emerge, not just for chemical reactions, but as a general process [17]; psychologist Jean Piaget proposed theories of genetic epistemology, which are uniquely hierarchical in focus [18]. Lane [19] further distinguished four types of hierarchies that are commonly observed in nature and society: order, control, level, and inclusion hierarchy. Order hierarchy describes a system based on the size of its components: larger components with lower frequency lie at a higher level of the hierarchy, and smaller components with higher frequency lie at a lower level of the hierarchy. Figure 3(a) illustrates an order hierarchy of an urban system: small hamlets are located at the bottom of the hierarchy, medium-size cities at the intermediate level, and large metropolitan areas at the top of the hierarchy [20]. Similarly, in software systems there are more small-sized components and fewer large-sized components, which exhibits an implicit order hierarchy. This is depicted in Figure 3(b), a distributed computing system. Now consider control hierarchy, where every entity has an assigned rank of power, with the greatest power concentrated in the entity with the highest rank. Therefore, the high-level entities, in principle, control the behaviors of low-level entities. Figure 4(a) shows a control hierarchy in a team environment: the general commander controls the team leader, who in turn controls the group leader, who in turn controls individuals. Figure 4(b) shows a control hierarchy in a software system: the user interface module controls which task module should be called, the task module determines which feature should be selected, and each feature determines the function that should be invoked. Control hierarchy is widely seen in traditional structure-oriented software development. Based on a data-flow diagram, any software system can be decomposed and structured as a control hierarchy with the top module as the main control flow and individual functions as the lowest-level modules.

Fig. 3. The order hierarchy of (a) an urban system; and (b) a distributed computing system
Level hierarchy represents the entities in a system with a level structure; each entity could exist at different levels. Each level is characterized by a particular spatial-temporal scale through which the entities at that level interact with one another [19]. Figure 5 illustrates level hierarchies observed in a linguistic system and a software system. In Figure 5(a), a language can be represented as letters, words, phrases, and so on; in Figure 5(b), software can be represented as source code, assembly code, binary code, and so forth. Level hierarchies can be converted to inclusion hierarchies. For example, the linguistic system in Figure 5(a) can also be considered an inclusion hierarchy: text contains paragraphs, paragraphs contain sentences, sentences contain phrases, and so on. The systems shown in Figures 1 and 2 are also inclusion-hierarchically structured. Order hierarchy, control hierarchy, and level hierarchy are commonly seen in many systems. However, they are not unique properties of complex systems. The unique property of a complex system is inclusion hierarchy, which is described as Chinese boxes by Simon: a large system consists of subsystems, which consist of other subsystems, and so on. These subsystems are hierarchically structured through many layers. Figure 6 illustrates the inclusion hierarchy of a software system [21] [22] [23]: a system consists of components C1, C2, and C3, which in turn contain components C11 through C32. Interaction locality can be observed in a well-designed software system: high cohesion (interaction) within a module (component) and low coupling between modules [24] [25].

Fig. 4. Control hierarchies in (a) a team working environment; and (b) a software system
Fig. 5. Examples of level hierarchies: (a) linguistic system; (b) a software system.
Fig. 6. A software system depicted with inclusion hierarchy.
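As a toy illustration of interaction locality in such an inclusion hierarchy, the sketch below compares within-module and cross-module interaction weights; the component names, interaction counts, and the ratio metric are all our own and are not defined in the paper.

```python
# Illustrative only: a toy inclusion hierarchy (as in Fig. 6) and a crude locality check.
TREE = {"System": ["C1", "C2", "C3"],
        "C1": ["C11", "C12"], "C2": ["C21", "C22"], "C3": ["C31", "C32"]}
# undirected interaction counts between leaf components (hypothetical values)
INTERACTIONS = {("C11", "C12"): 9, ("C21", "C22"): 8, ("C31", "C32"): 7,
                ("C12", "C21"): 1, ("C22", "C31"): 1}

def locality_ratio(tree, interactions):
    """Ratio of within-parent (cohesive) interactions to cross-parent (coupling) ones;
    a high ratio suggests the interaction locality expected of a well-designed system."""
    parent = {c: p for p, kids in tree.items() for c in kids}
    inside = sum(w for (a, b), w in interactions.items() if parent.get(a) == parent.get(b))
    across = sum(w for (a, b), w in interactions.items() if parent.get(a) != parent.get(b))
    return inside / max(across, 1)

print(locality_ratio(TREE, INTERACTIONS))  # 24 / 2 = 12.0 for the toy values above
```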
However, when dealing with failures (failures are a consequence of faults [26]) in any of these systems, we often notice a ‘transient’ restructuring: actions not regularly defined within the norms of the established hierarchical relationships. In these relationships a ‘low-level’ component can instigate a ‘collapse’, triggering ‘evasive’ action to either ‘support’ or ‘subjugate’ the deviation from a regular behavioral pattern. Evolution supports the provision for such resilience in self-organizing systems by devising ways to overcome such sudden effects brought forth by instantaneous changes to what is considered normal behavioral patterns for the evolved system. While some such actions lead to success and thereby confer an evolutionary advantage on the resulting system, others may lead to obsolescence or rapid decay. A large body of software research proceeds from the hypothesis that such resilience is highly domain specific. However, considerable research has been carried out to develop domain-independent self-adaptive systems. These studies are intended to produce software systems that can adjust not only their functionalities, but also their structures, or even architectures, in order to adapt to a changing environment. Garlan et al. [27] utilized external control mechanisms for self-adaptation, which can localize the concerns of problem detection and resolution. They proposed a framework that adopts an architecture-based approach to provide a reusable infrastructure and mechanisms for specific systems. Mishra and Misra [28] utilized Component-Based Software Engineering (CBSE) to design self-adaptive systems that can automate the component integration process at runtime by accessing the component repository; different systems are produced according to the different environments encountered at run time. Based on abstract adaptation models and corresponding model-to-code transformations, Geihs et al. [29] designed a middleware for the development and operation of context-aware, self-adaptive applications. Vogel et al. [30] proposed a model-driven approach to developing self-adaptive systems with self-monitoring architectures. Their approach realizes an incremental synchronization between the run-time system and models for different self-management activities. Bencomo [31] studied external software systems, which can survive variations in their execution environment with or without human intervention. The research indicates that run-time self-representations are the key to producing such external software systems.
4 Nature-Inspired Engineering Nature-inspired engineering is a relatively young research area [32]. Applying nature-inspired solutions to self-adaptive software is beginning to receive more attention. For example, one line of research is to design software and hardware systems [33] [34] [35] inspired by natural and biological systems. Another line of research is to build self-adapting software systems following the properties and behavior of natural and biological systems [36] [37] [38]. Although software systems have structures and properties similar to those of other evolving, self-adapting complex systems, such as biological and social systems, a fully evolving, self-adapting software system has not been produced so far. As control theory has been successfully applied in robotics, self-adaptation, self-management, and additional areas, we believe now is the time to consider combining control theory with
complex systems theory to produce real-time intelligence in software systems: software systems which can adjust their functionality, components, and structure to adapt to changing environments. Future work in this area is discussed below.
• Understand how global features and structures emerge from simple local interactions, and how new levels of components are formed within complex natural and social system evolution processes.
• Determine the mechanisms that govern the self-organization of such complex systems and apply them to software systems.
• Define appropriate metrics and measures to adjust the architecture for such self-organization in complex software systems so as to satisfy the three key properties: modularity, hierarchy, and interaction locality. While interaction locality has been exploited within some sub-areas of computing, such as compiler optimization, it has neither been systematically studied nor applied to large-scale software systems architectural design.
• Study how a software agent or system could adjust its structure and response behavior to better fit the natural environment, by exhibiting adaptation, cooperation, self-organization, evolution, and emergence.
5 An Enterprise Systems Perspective Industry is always seeking novel alternatives to current adaptation practices and effective ways to make software systems architectures more resilient, sustainable, and stable over time. Many legacy enterprise-wide software systems evolve into massive agglomerations of interfaces, integration points, and multiple user interfaces that are poorly documented. Most enterprise-wide legacy software systems exhibit the attributes of complex systems: extensive modularity, hierarchy, and interaction locality. While from an enterprise infrastructure (or hardware adaptability) standpoint computing systems have made huge strides in self-adaptation and resiliency (such as automated failover, recovery, etc.), enterprise-wide application systems are still in their infancy. To understand issues surrounding the self-adaptability and resiliency of evolving enterprise-wide application systems, it is important to understand not only the metrics and processes currently used to manage these systems, but also the timing relationships between the pre-adaptation and post-adaptation stabilization phases. There are many metrics surrounding enterprise-wide application systems: average application downtime, average failure severity, number of application restarts, number of application rollbacks, number of application code change requests, etc. Some additional metrics may be more difficult to quantify, such as average code change complexity, average code change necessity, number of unknown system interfaces, and the difference between user wants and needs. Figure 7 shows a current process for enterprise-wide systems adaptations. The adaptations process allows for metrics collection, user requests that are inclusive, adaptation priority and roll-in order, adaptation work-through, and rollbacks and stabilization. Notations in Figure 7 that are enclosed by braces represent the time from beginning to end of the adaptations roll-in phase (ARP) and the adaptations stabilization phase (ASP), prior to restart of the process and metrics collection.
Fig. 7. A Process for enterprise-wide application systems adaptations
Establishing an appropriate context around current standards, or frameworks, of adaptability and resilience would assist research, development, and eventual industry acceptance. Self-adaptation and evolution of enterprise-wide application systems face many caveats. One caveat is that allowing a system to self-adapt to user data entry could lead to data quality and integrity issues. Furthermore, how do we identify the priority and necessity of self-adaptation without human intervention? In enterprise-wide application systems the need for stability still remains at the top of the hierarchy. For this reason, significant attention must be paid to providing a comprehensive, metrics-driven approach to addressing the inversion of timing relationships between the ARP and ASP phases amid ever-increasing pressure for rapid adaptations. Such inversions cause a ‘tipping point’ in enterprise software systems integration and roll-up, which causes severe enterprise IT infrastructure stability issues.
6 Conclusions In this paper, we identified and discussed key properties of evolvable, complex self-organizing systems. We briefly compared software systems with natural, biological, and social systems. We presented an industry perspective on evolvable, self-adapting enterprise-wide application systems. The primary objective of the paper is to provide a different perspective for software engineers to build complex, self-organizing, evolutionary software systems which can adjust their structures, components, and functionality to adapt to exceptions, faults, and environment changes. The secondary objective of the paper is to provide the initial foundation for future work to formulate an appropriate mathematical model around the process of enterprise-wide application systems adaptations and resiliency.
References 1. Fromm, J.: The Emergence of Complexity. Kassel University Press (2004) 2. Yu, L., Ramaswamy, S.: An Empirical Approach to Evaluating Dependency Locality in Hierarchically Structured Software Systems. J. of Sys. Soft. 82(3), 463–472 (2009)
3. Yu, L., Chen, K., Ramaswamy, S.: Multiple-Parameter Coupling Metrics for Layered Component-Based Software. Software Quality Journal 17(1), 5–24 (2009) 4. Yu, L., Ramaswamy, S.: Verifying Design Modularity, Hierarchy, and Interaction Locality Using Data Clustering Techniques. In: 45th ACMSE Conf., Winston-Salem, North Carolina, pp. 419–424 (March 2007) 5. Simon, H.: The Organization of Complex Systems. In: Hierarchy Theory: The Challenge of Complex Systems. George Braziller, NY (1973) 6. Pattee, H.: The Physical Basis and Origin of Hierarchical Control. In: Hierarchy Theory: The Challenge of Complex Systems. George Braziller, NY (1973) 7. Simon, H.: The Architecture of Complexity. The Sciences of the Artificial. MIT Press, Cambridge (1969) 8. Simon, H.: The Architecture of Complexity: Hierarchic Systems. In: Proceedings of the American Philosophical Society, pp. 467–482 (1962) 9. Ramsey, D.: Molecular Coding Problems. New York Academy of Sc., New York (1967) 10. Schuster, P.: Evolution in Simple Systems and the Emergence of Complexity. In: Proc.of the 2005 IEEE/WIC/ACM Internl Conf.on Web Intelligence, pp. 33–37 (September 2005) 11. Mikulecky, D.: The Emergence of Complexity: Science Coming of Age or Science Growing Old? Computers and Chemistry 25(4), 341–348 (2001) 12. Ahl, V., Allen, T.F.H.: Hierarchy Theory, a Vision, Vocabulary and Epistemology. Columbia University Press (1996) 13. Whyte, L.L., Wilson, A.G., Wilson, D.: Hierarchical structures. American Elsevier, New York (1969) 14. Allen, T.F.H., Starr, T.B.: Hierarchy: Perspectives for Ecological Complexity. University Chicago Press, Chicago (1982) 15. Salthe, S.: Evolving Hierarchical Systems: their structure and representation. Columbia University Press, New York (1985) 16. O’Neill, R.V., DeAngelis, D., Waide, J., Allen, T.F.H.: A Hierarchical Concept of Ecosystems. Princeton University Press, Princeton (1986) 17. Prigogine, I.: End of Certainty. The Free Press, New York (1997) 18. Piaget, J.: Genetic Epistemology. W. W. Norton, New York (1971) 19. Lane, D.: Hierarchy, Complexity, Society. In: Hierarchy in Natural and Social Sciences, pp. 81–119. Springer, Netherlands (2006) 20. Gabaix, X.: Zipf’s Law for Cities: An Explanation. Qtrly J. of Econ. 114(3), 739–767 (1999) 21. Mei, H., Zhang, L., Yang, F.: A Software Configuration Management Model for Supporting Component-Based Software Development. ACM SIGSOFT Software Engineering Notes 26(2), 53–58 (2001) 22. Bruegge, B., Dutoit, A.H.: Object-Oriented Software Engineering Using UML, Patterns, and Java. Pearson Prentice Hall, Upper Saddle River (2004) 23. de Jonge, M.: Build-Level Components. IEEE Trans. on S. Engg 31(7), 588–600 (2005) 24. Offutt, J., Harrold, M.J., Kolte, P.: A Software Metric System for Module Coupling. Journal of System and Software 20(3), 295–308 (1993) 25. Schach, S.R.: Object-Oriented and Classical Software Engineering, 7th edn. McGraw-Hill, New York (2007) 26. Ramaswamy, S., Valavanis, K.P.: Modeling, Analysis and Simulation of Failures in a Materials Handling System using Extended Petri Nets. IEEE Transaction on Systems, Man and Cybernetics 24(9), 1358–1373 (1994)
346
L. Yu, D. Threm, and S. Ramaswamy
27. Garlan, D., Cheng, S.W., Huang, A.C., Schmerl, B.R., Steenkiste, P.: Rainbow: Architecture-Based Self-Adaptation with Reusable Infrastructure. IEEE Computer 37(10), 46–54 (2004) 28. Mishra, A., Misra, A.K.: Component Assessment and Proactive Model for Support of Dynamic Integration in Self Adaptive System. ACM SIGSOFT Soft. Engg. Notes 34(4) (2009) 29. Geihs, K., Barone, P., Eliassen, F., Floch, J., Fricke, R., Gjorven, E., Hallsteinsen, S., Horn, G., Khan, M.U., Mamelli, A., Papadopoulos, G.A., Paspallis, N., Reichle, R., Stav, E.: A Comprehensive Solution for Application-Level Adaptation. Software Practice and Experience 39(4), 385–422 (2009) 30. Vogel, T., Neumann, S., Hildebrandt, S., Giese, H., Becker, B.: Model-Driven Architectural Monitoring and Adaptation for Autonomic Systems. In: Proceedings of the 6th International Conference on Autonomic Computing, Barcelona, Spain, pp. 15–19 (June 2009) 31. Bencomo, N.: On the Use of Software Models during Software Execution. In: Proc.of the 2009 ICSE Workshop on Modeling in Software Engg., Canada, pp. 62–67 (2009) 32. Krka, I., Edwards, G., Brun, Y., Medvidovic, N.: From System Specifications to Component Behavioral Models. In: Proc. of the New Ideas and Emerging Results Track at the 31st Internl. Conf. on Software Engg., pp. 315–318 (May 2009) 33. Abelson, H., Allen, D., Coore, D., Hanson, C., Homsy, G., Thomas, F., Knight, J., Nagpal, R., Rauch, E., Sussman, G.J., Weiss, R.: Amorphous Computing. Comm. of the ACM 43(5), 74–82 (2000) 34. Brun, Y., Medvidovic, N.: An Architectural Style for Solving Computationally Intensive Problems on Large Networks. In: Proceedings of 2007 Software Engineering for Adaptive and Self-Managing Systems Workshop at 29th IEEE/ACM International Conference on Software Engineering, Minneapolis, MN, USA (May 2007) 35. Di Marzo-Serugendo, G., Fitzgerald, J., Romanovsky, A., Guelfi, N.: A Generic Framework for the Engineering of Self-Adaptive and Self-Organising Systems. Technical Report, School of Computer Science, University of Newcastle, Newcastle, UK (2007) 36. Clement, L., Nagpal, R.: Self-Assembly and Self-Repairing Topologies. In: Proc. of the Workshop on Adaptability in Multi-Agent Systems, Sydney, Australia (2003) 37. Shen, W.M., Krivokon, M., Chiu, H., Everist, J., Rubenstein, M., Venkatesh, J.: Multimode Locomotion via Superbot Reconfigurable Robots. Autonomous Robots 20(2), 165– 177 (2006) 38. Yu, L., Ramaswamy, S., Bush, J.: Symbiosis and Software Evolvability. IT Professional 10(4), 56–62 (2008)
Evolving Efficient Sensor Arrangement and Obstacle Avoidance Control Logic for a Miniature Robot

Muthukumaran Chandrasekaran1, Karthik Nadig1, and Khaled Rasheed2

1 Institute for Artificial Intelligence, University of Georgia, Athens, GA, USA
2 Computer Science Department, University of Georgia, Athens, GA, USA
[email protected], [email protected], [email protected]

Abstract. Evolutionary computation techniques are frequently used in the field of robotics to develop controllers for autonomous robots. In this paper, we evaluate the use of Genetic Programming (GP) to evolve a controller that implements an Obstacle Avoidance (OA) behavior in a miniature robot. The GP system generates the OA logic equation offline, in a simulated dynamic 2-D environment, transforming the sensory inputs of a simulated robot into a controller decision. The goodness of the generated logic equation is computed with a fitness function that maximizes the exploration of the environment and minimizes the number of collisions for a fixed number of decisions allowed before the simulation is stopped. The set of motor control decisions for all possible sensor trigger sequences is then applied to a real robot, which is tested in a real environment. Needless to say, the efficiency of this OA robot depends on the information it can receive from its surroundings, and this information depends on the sensor module design. Thus, we also present a Genetic Algorithm (GA) that evolves a sensor arrangement taking into consideration economical issues as well as the usefulness of the information that can be retrieved. The evolved algorithm shows robust performance even when the robot is placed in completely different, dynamically changing environments. The performance of our algorithm is compared with that of a hybrid neural network and also with an online (real-time) evolution method.
1 Introduction
Evolutionary robotics is a widely researched methodology that uses evolutionary algorithms to design controllers for autonomous robots. Different examples of applications of evolutionary robotics and approaches to develop controllers for mobile robots were introduced by Nolfi et al. [4]. Robotic controllers have been previously evolved using dynamic recurrent neural nets [1,2]. Several experiments have also been performed that directly use genetic programming to evolve controllers that implement an obstacle avoidance behavior [3,5,6]. The efficiency of such controllers and the impact on their performance, however, depend on the
sensor configuration present in the robot. Optimal sensor placement is essential to ensure maximal useful information retrieval from the surroundings in which the robot is placed. However, increasing the number of sensors alone does not guarantee better performance; the usefulness of the information brought in by an additional sensor must also be taken into account. For instance, if sensors are placed very close to each other, there is redundancy in the information retrieved by the sensor module. Not only does this redundancy render the extra information useless, it also raises economic issues. Therefore, when designing a sensor system, one must attempt to find those combinations of sensors that maximize the information they bring while minimizing the cost. Previous work on optimizing sensor placement using genetic algorithms [7] proposed a method to design a cost-optimal sensor system that ensures a certain degree of diagnosability in an instrumentation system domain. In this paper, we use a GA to design an efficient sensor module that is later applied to a real robot whose objective is to avoid obstacles while favoring movement in the forward direction and maximizing the area covered. The robot placed in the simulated environment also incorporates the resulting sensor configuration. Previous experiments with genetic programming and robotic control have been performed with a simulated robot placed in a simulated environment. There are several problems with experiments that use only simulated environments. Real environments are dynamic and uncertainty exists everywhere, and there is an infinite number of real-life training situations that simulated environments fail to capture. Hence, testing on simulated settings alone could result in specialization to certain types of environments and failure to evolve a behavior that generalizes to unseen types. Previous experiments have also been done on real robots trained in real time [5]. However, since the OA algorithm was evolved in real time, the robot would function under a good control strategy only hours after it was placed in the environment, and it would be subjected to several collisions as the training progresses, even in a controlled environment. Such a robot would either require a large RAM and ROM containing a small operating system with multi-tasking capabilities, in order to implement the entire algorithm in real time in a stand-alone fashion, or require the controlling algorithm to be run on a workstation with data and commands communicated through a serial line. In our experiments, we use a real robot (with actual sensors) that was trained offline with a GP in a realistically simulated, dynamically changing environment, in which a simulated robot whose sensor and constructional designs accurately resemble those of our real robot was placed. The final sensor arrangement used in both our simulated and real-time experiments is the outcome of the GA used to design the configuration.
2 The MKalpha Robot
We constructed a real robot containing real sensors for the purpose of testing our offline OA algorithm in the real world. The MKalpha robot is built on an
Arduino platform based on an ATmega328 IC. It is equipped with four infrared distance sensors with a detection range from 2 to 10 cm. The dispersion angle of the sensors is negligibly small. The mobile robot has a circular shape, a diameter of about 10 cm and a height of 10 cm. It consists of two 5V DC motors, an on-board power supply for the Arduino board and the sensor module. The four infrared sensors are initially uniformly distributed around the circumference of the robot. A modular design is used for the motor driver and sensor units; this design enables repositioning the sensors as per the configuration evolved by the GA. Some simple real-time experiments showed that the sensor arrangement output by the GA was better than the initial sensor arrangement in terms of the number of collisions for the same controlling algorithm. The controlling algorithm code is written and compiled in the open-source Arduino environment and then uploaded onto the I/O board via a serial port.
Fig. 1. The MKalpha Robot
The ATmega328 microcontroller can operate on an external power supply of 6 to 20 volts. It has 32 KB of flash memory for storing code (of which 2 KB is used by the boot loader), 2 KB of SRAM and 1 KB of EEPROM. It has 14 pins that can be used as inputs or outputs operating at 5V. Figure 1 shows the fully functional MKalpha robot including the sensor arrangement.
3 Genetic Algorithm for Sensor Placement
The goal of the GA is to find an optimal sensor arrangement around the robot given its actual physical configuration and application-specific biases. Our OA robot finds pleasure in moving forward and fast. It also favors a sensor arrangement that maximizes the area covered by the sensors and minimizes the cost (by using only as many sensors as necessary, so that no redundancy occurs in the information retrieved). The GA takes as input the actual dimensions of the robot and the infrared sensors, and the maximum number of sensors available.
It computes the maximum number of possible sensor slots based on the dimensions of the robot and the sensors. The GA also takes as input the desirability of moving in a particular direction. Thus, for our specific application, since moving forward is favored, larger weights are arbitrarily assigned to forward-facing regions of the robot. Figure 2 shows how the robot was divided into 8 regions of interest and their corresponding weights. These weights indicate the degree of desirability of moving in the corresponding direction, which should in turn proportionately increase or decrease the likelihood of finding a sensor in some slot in that region of interest. The blue contour in the figure shows the desirability curve for the application-specific sensor arrangement. It is assumed that each region is uniformly distributed and spans an angular distance of 45 degrees. For example, the forward region of interest is given a large weight of 0.7. This means that the robot strongly desires movement in the forward direction and, in order to ensure safe passage, ideally a sensor must be placed in some slot in this region. The genotypes, or the individuals constituting the initial population, are represented as bit strings containing one gene for each available slot around the robot where a sensor unit could be assembled. A "1" in the genotype denotes the presence of a sensor in the corresponding slot. The number of 1's in the genotype will be less than or equal to the maximum number of sensors available. For instance, assuming that the robot has a diameter d = 10 cm and the size of the sensor unit is l = 2 cm, the genotype will contain (π × d) / l = 15 slots (genes). An example genotype would look like [1,0,1,0,0,0,0,1,0,0,0,0,0,1,0]. This denotes the presence of 4 sensors in the 1st, 3rd, 8th and 14th slots. Each slot occupies an angular distance of (360 / no. of slots) = 24 degrees. Assuming the midpoint of the first slot is at zero degrees, the 4 sensors are positioned at angles of 0, 48, 168 and 312 degrees with respect to the midpoint of the first slot. We then generate a set of n (=100 for our experiment) 8-bit strings, where each bit denotes one of the 8 regions of interest. If the bit string has a "1" in some position, it indicates the desirability of the robot to move in that direction, and the likelihood of getting a "1" depends on the weight corresponding to that region. The fitness of the individual is evaluated using the following rubric: traverse through each gene in the individual,
1. If the gene is a "1" and the corresponding region has a "1", award 2 points
2. If the gene is a "1" and the corresponding region has a "0", award -0.5 points
3. If the gene is a "0" and the corresponding region has a "1", award -0.5 points
4. If the gene is a "0" and the corresponding region has a "0", award 1 point

Fig. 2. Regions of interest and their corresponding weights indicating a desirability to move in that direction. The blue contour around the robot shows the desirability curve whose radial dimensions measure the degree of desirability.
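For concreteness, the following is a minimal sketch (not the authors' code) of how this rubric could be scored for a single genotype. Mapping a slot to the region of interest that contains its midpoint angle, and the helper names, are our assumptions; the normalization by the number of active sensors follows Eq. (1) below.

```python
# Sketch of the sensor-placement fitness rubric (our reconstruction, not the authors' code).
# Assumption: a slot is matched to the region of interest that contains its midpoint angle.

def slot_region(slot, n_slots, n_regions=8):
    """Map a sensor slot to the region of interest containing its midpoint."""
    angle = slot * (360.0 / n_slots)          # midpoint of slot 0 assumed at 0 degrees
    return int(angle // (360.0 / n_regions)) % n_regions

def rubric_points(gene, region_bit):
    """Points awarded to one gene/region pair, as in the rubric above."""
    if gene == 1 and region_bit == 1:
        return 2.0
    if gene == 0 and region_bit == 0:
        return 1.0
    return -0.5                                # either mismatch

def fitness(genotype, region_strings):
    """Average rubric score over the n sampled 8-bit region strings,
    divided by the number of active sensors (Eq. 1 below)."""
    n_slots = len(genotype)
    n_sensors = sum(genotype)
    if n_sensors == 0:
        return float("-inf")                   # no sensors: invalid arrangement
    total = 0.0
    for regions in region_strings:             # regions: sequence of 8 bits
        total += sum(rubric_points(g, regions[slot_region(s, n_slots)])
                     for s, g in enumerate(genotype))
    return (total / len(region_strings)) / n_sensors

# Example usage with the arrangement reported in the text and forward-biased
# sampling weights (the non-forward weights below are illustrative only):
# import random
# genotype = [1,0,1,0,0,0,0,1,0,0,0,0,0,1,0]
# weights  = [0.7, 0.4, 0.2, 0.1, 0.1, 0.1, 0.2, 0.4]
# samples  = [[int(random.random() < w) for w in weights] for _ in range(100)]
# print(fitness(genotype, samples))
```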
The fitness is the average, over the n region-of-interest strings, of the total points received by the genes in the individual, divided by the total number of 1's in the individual (i.e., the number of sensors present in the current configuration). Mathematically, it is represented as follows:

$$F(\text{genotype}) = \frac{\frac{1}{n}\sum_{r=1}^{n} \text{TotalPoints}_r}{\text{TotalNumberOfOnesInGenotype}} \qquad (1)$$

We implemented a steady-state GA with a ternary tournament selection methodology, 1-point crossover, and scramble, swap, insert or inversion mutation chosen randomly. We used a replace-worst strategy with the termination set at 200,000 fitness evaluations. A summary of the GA parameters used is shown in Table 1. Incidentally, the best sensor module design output by the GA happened to be the one used in the example above.

Table 1. Summary of GA Parameters

Representation    Binary (bit string)
Selection         Tournament (Size = 3)
Crossover         1-Point (PXover = 0.8-1.0)
Mutation          Insert, Inversion, Scramble or Swap (PMut = 0.9)
Replacement       Replace Worst
Termination       200,000 fitness evaluations
Population Size   200

4 Genetic Program for Motor Control
The inspiration for this experiment comes from [5] where control strategies were evolved in real time on a real robot using a GP. The design of the algorithm allowed for flexibility in the percept and control design. In this section, we present a GP that trains a realistic robot in a simulated dynamic environment to evolve a controller to enable the robot to navigate around obstacles minimizing collisions while exploring the environment it was placed in. The simulator design mimicked real-time scenarios in the best possible way, thereby allowing the controller, which was evolved offline, to be implemented on a real robot. The simulated environment generates digital percepts and expects a command response to process the next move. This response is generated by evaluating the percepts using an equation which transforms the percept data to motor control commands, such as Move Forward, Turn Right, Turn Left and Move Backward, represented as numeric values ranging from 0 to 3 respectively. Genetic Programming techniques were used to evolve the equations that calculate the motor control commands. This equation takes the sensor values as the input vectors and returns an integer representing a motor control command.
$f(s_1, s_2, s_3, s_4) = m$, where $m \in [0, 3]$.

Each individual in the population represents the entire solution, described as an equation tree. The individuals were initialized by generating two types of random equation trees: either complete trees, where the equations use all of the available input variables, or partial trees, where the equations use only a subset of the input variables. The input variables represent the values received by the robot as percept inputs. Twenty percent of the population was populated with partial trees and the remaining eighty percent with complete trees. An example of such trees is shown in Table 2. Numbers within square brackets represent the indices of the input variables used. In the example in Table 2, the complete tree uses all four of the inputs available to it, whereas the partial tree takes three out of the four inputs. The expressions may use any of the operators available from a predefined operator set. The operator sets were classified into binary and unary operators. The unary operator set contains the unary minus (−) and the bitwise-not (~) operators. The binary operator set was further classified into bitwise, arithmetic and relational operators. The initial population of individuals consisted of only bitwise and/or arithmetic operators; the relational operators were later added to test the performance. To prevent the equations from becoming excessively large, they were limited to a fixed number of operators. The equation trees consist of two types of nodes: branch nodes (which take values from the operator sets) and leaf nodes (which contain indices of the input variables used or constant integer values). The indices of the input variables correspond to the percept inputs and are available as a zero-based array. The results of the Boolean relational operations are evaluated as numerical values by substituting TRUE with the integer 1 and FALSE with the integer 0. This allows relational operators to be evaluated as numeric values and hence to generate the commands for the motor control. Both crossover and mutation operators are used in the evolution of equation trees. Sub-tree crossover was used with a crossover probability proportional to the depth of the tree. Multiple mutation operators are used with a low mutation probability that is inversely related to the fitness of the individual. The mutation operator was applied to the binary and unary nodes. Index and constant nodes are mutated by selecting random input variable indices or random integer values, respectively, to replace the corresponding gene in the original genotype. Sub-tree mutation with extremely low mutation probability was also included; its probability was also made proportional to the depth of the tree, and a randomly generated short sub-tree was used to replace an existing sub-tree at the selected branch node. The GP was configured to run with a steady-state scheme. Tournament selection was used to select parents.

Table 2. Equation Trees

Complete Tree : ((([1] + [3]) ^ ([2] | [0])) + (([0] ^ [1]) ^ ([0] + [1])))
Partial Tree  : ((([1] + [2]) * [0]) & [1])
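As an illustration of how such an equation tree could be represented and evaluated, the following sketch (our own, not the authors' implementation) encodes trees as nested tuples and folds the raw integer result into the command range 0-3 with a modulo; the paper only states that the equation returns an integer in [0, 3].

```python
# Sketch of evaluating one evolved equation tree on a percept vector (our reconstruction).
# Assumption: the raw integer result is folded into the command range with a modulo.
import operator

BINARY_OPS = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "&": operator.and_, "|": operator.or_, "^": operator.xor,
    "<<": lambda a, b: a << (b & 7), ">>": lambda a, b: a >> (b & 7),
    "<": lambda a, b: int(a < b), ">": lambda a, b: int(a > b),
    "==": lambda a, b: int(a == b),            # relational results become 0/1
}
UNARY_OPS = {"neg": operator.neg, "~": operator.inv}

def evaluate(node, sensors):
    """node is ('op', left, right), ('unary', op, child), ('in', index) or ('const', value)."""
    kind = node[0]
    if kind == "in":
        return sensors[node[1]]
    if kind == "const":
        return node[1]
    if kind == "unary":
        return UNARY_OPS[node[1]](evaluate(node[2], sensors))
    return BINARY_OPS[node[0]](evaluate(node[1], sensors), evaluate(node[2], sensors))

def motor_command(tree, sensors):
    # 0: Move Forward, 1: Turn Right, 2: Turn Left, 3: Move Backward
    return evaluate(tree, sensors) % 4

# The "complete tree" from Table 2: ((([1]+[3]) ^ ([2]|[0])) + (([0]^[1]) ^ ([0]+[1])))
tree = ("+",
        ("^", ("+", ("in", 1), ("in", 3)), ("|", ("in", 2), ("in", 0))),
        ("^", ("^", ("in", 0), ("in", 1)), ("+", ("in", 0), ("in", 1))))
print(motor_command(tree, [1, 0, 1, 0]))       # -> 1 (Turn Right) for this percept
```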
The generated child was added to the population and then the individual with the worst fitness was removed from the population. A fixed population size was used, and random seeding was done if no improvement in fitness was seen for a long time. The fitness of the individual was calculated by averaging the simulation results from 10 runs of the robot within a large pre-defined maze-like environment containing strategically placed obstacles to mimic real-world settings. The MKalpha robot's objective was to maximize the area covered by it while minimizing the number of collisions, for a fixed number of steps. We designed the simulator to allow control of the position and orientation of the robot, its step size, angular step size, sensor error probability, actuator error probability, faulty triggers, number of sensors and sensor positions. Simulation runs were commenced by placing the robot at a randomly selected point within the movable area in the maze. The orientation of the robot was also randomly selected. The robot would then be assigned a fixed quota of steps it can use to move around. Each percept evaluation is counted as a step, irrespective of any change in the state of the robot. The environment within the simulator is probabilistic, and hence introduces perception errors in the sensors. At each step, the simulator generates the required values, applies them to the equation, and evaluates the next move. Data such as the number of collisions, number of critical decisions, area covered, encountered cases, etc., are recorded at each step. The results are averaged over 10 runs, and the area covered and the number of collisions are used as part of a multi-objective evaluation of the individual.
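The fitness computation described above could look roughly as follows; the relative weighting of area covered against collisions is our assumption, since the exact multi-objective combination is not given here, and `simulate_run` stands for the simulator described in this section.

```python
# Sketch of the offline fitness evaluation (our reconstruction, not the authors' code).
# Assumption: a weighted combination of area covered and collisions with weights chosen by us.

def evaluate_controller(tree, simulate_run, runs=10, steps=600,
                        w_area=1.0, w_coll=2.0):
    """Average a weighted area/collision score over several simulation runs.

    simulate_run(tree, steps) is assumed to place the robot at a random pose in the
    maze and return (area_covered, collisions) after `steps` percept evaluations."""
    total = 0.0
    for _ in range(runs):
        area, collisions = simulate_run(tree, steps)
        total += w_area * area - w_coll * collisions
    return total / runs
```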
5 Hybrid Neural Controller
An evolved neural controller was developed and its performance was compared to that of the GP. The neural controller is represented as an array of real-valued weights and was designed to accept sensor inputs and generate motor control commands. Arithmetic and/or uniform crossover and swap and/or random mutation were used in a steady-state GA scheme to train the neural network. The controllers were evolved using parameter settings identical to those used for evolving the equation trees of the GP, thereby allowing a fair comparison.
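A minimal sketch of such a controller is given below, assuming a single hidden layer with the 20 hidden neurons listed in Table 3; the flat weight array is what the GA would evolve. The topology beyond the number of hidden neurons is our assumption, not the authors' specification.

```python
# Sketch of the hybrid neural controller: a small feed-forward net whose weights come
# from a GA individual stored as a flat array of floats (our reconstruction).
import math

N_IN, N_HID, N_OUT = 4, 20, 4          # 4 sensors, 20 hidden neurons, 4 motor commands

def decode(weights):
    """Split a flat weight array into the two layers (each row: bias followed by weights)."""
    k = (N_IN + 1) * N_HID
    w1 = [weights[i * (N_IN + 1):(i + 1) * (N_IN + 1)] for i in range(N_HID)]
    w2 = [weights[k + j * (N_HID + 1):k + (j + 1) * (N_HID + 1)] for j in range(N_OUT)]
    return w1, w2

def command(weights, sensors):
    w1, w2 = decode(weights)
    hidden = [math.tanh(row[0] + sum(w * s for w, s in zip(row[1:], sensors))) for row in w1]
    output = [row[0] + sum(w * h for w, h in zip(row[1:], hidden)) for row in w2]
    return max(range(N_OUT), key=output.__getitem__)   # index of the strongest output

# A GA individual is then a list of (N_IN+1)*N_HID + (N_HID+1)*N_OUT floats, evolved with
# arithmetic/uniform crossover and swap/random mutation as described above.
```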
6 Experiments
In the first phase of experimentation, a realistically simulated MKalpha robot, controlled using the OA behavior evolved by the GP, was tested in the simulated environment (described earlier). First, we tested the performance of the robot based on the average fitness attained after 600 decision steps for the sensor configuration evolved using our GA, against a manually designed configuration with uniformly distributed sensors. We show the difference in fitness while the GP was run for 1000 generation equivalents. Next, we compared the performance of the robot, when its OA control strategy was evolved using a GP and using
a hybrid neural network (a neural network whose weights are evolved using a GA). We illustrate the difference in fitness while the controllers trained for 1000 generation equivalents. We also show the difference in the number of generations taken to achieve a particular fitness for each controller. In the second phase of experiments, the objective is to compare our offline GP controller with the online counterpart developed by Peter Nordin and Wolfgang Banzhaf [5]. They tested their GP on the Khepera robot, which is half the size of the MKalpha with twice the number of sensors. The Khepera robot was trained in two environments: a rectangular box (30cm × 40cm) and a more complex, larger environment (70cm × 90cm) with irregular borders and loose objects placed inside it. In order to conduct a fair comparison, we designed and tested the MKalpha robot on similar environments but twice as big to account for its larger size. The results for the Khepera robot are taken directly from [5] for comparison.

Table 3. Summary of GP and Hybrid NN Parameters

Parameter        Genetic Programming            Hybrid NN
Objective        OA Behavior                    OA Behavior
Terminal Set     Integers (0 to 3)              Integers (0 to 3)
Function Set     AND, OR, XOR, ADD, SUB,        AND, OR, XOR, ADD, SUB,
                 MUL, SHL, SHR, NOT             MUL, SHL, SHR, NOT
Relational Set   LT, GT, EQ, LTE, GTE, NEQ      LT, GT, EQ, LTE, GTE, NEQ
Representation   Equation Trees                 Weights (Float array)
Population Size  50                             50
Crossover Prob   0.95                           0.95
Mutation Prob    0.05                           0.05
Selection        Tournament (5%-10%)            Tournament (5%-10%)
Max Generations  None                           None
Limits           200 Function Nodes             20 Hidden neurons
Termination      None                           None
Fitness          Weighted Multi-Objective       Weighted Multi-Objective
7 Discussion of Results
The first phase of experiments shows an exploratory behavior for the simulated robot, as the fitness function of the GP favors a strategy that maximizes exploration. Since the robot carried out in-place turns, only movement in the forward or backward directions brought about a change in the area covered. Thus, the desirability of the robot to favor movement towards forward-facing regions was used as a bias to evolve a sensor arrangement so as to better enable this exploratory behavior. It was seen that the GA-evolved sensor arrangement performed better than the manually designed uniform sensor arrangement. As the control logic allowed for more area coverage, the robot was exposed to greater chances of collisions. During the first few generations, while the robot explores
Fig. 3. Comparison between GP and Hybrid Artificial Neural Network (ANN). (a) Shows exploration and collisions. (b) Shows average fitness. Note that each data point is the average of 10 simulation runs.
the map, it keeps colliding with different obstacles in its path, but as time goes on the collisions become more and more infrequent even as a greater area of the map is explored, as seen in Fig. 3 (a). The GP-based individual shows a maximum area covered of 254 given a quota of 600 decision steps. The reason for this small ratio could be one or more of the following: (1) in an attempt to over-train for realistic situations, the robot was trained in a congested and dynamically changing environment; (2) the robot makes in-place turns, which are counted as steps while the area covered remains unchanged; (3) in order for the robot to maneuver around a wall, it takes an average of 3 to 4 decision steps, because the robot was programmed to make 10 to 20 degree turns, so the environment had to be sampled at least 3 times for the robot to make a 60 degree in-place turn. From Fig. 3 (b), we observe that the GP was able to generate a better control strategy than the evolved neural controller for an equal number of generations. Due to the large selection pressure during parent selection, the algorithms show a fast convergence rate. Since a minor change in the equation tree can drastically alter the fitness level of the individual, with careful design, evolving an effective equation could improve the performance of the robot. The GP-evolved offline controller was tested on the MKalpha robot placed in the complex environment, and the number of collisions that occurred every minute was recorded over a period of 60 minutes. These results were superimposed on the results obtained when an online GP-evolved controller was tested on the Khepera robot (Figure 7 in [5]). The online algorithm evolved a comparably good control strategy only 40 to 60 minutes after the training commenced, while the offline GP converged much faster. An even slower convergence may be expected if the robot is trained in real time in a more complex environment. We also observed only a small advantage for the online controller even when the converged control strategy was used. Also, considering that the Khepera robot is half the size of the MKalpha and contains twice as many sensors, one wonders whether such economically and computationally expensive online controllers are really worth the small advantage in performance.
8 Conclusion and Future Work
We have demonstrated that a GP system can be used to evolve a good control strategy for a real robot, offline. The goodness of the control strategy depended largely on the percept inputs to the GP which in turn depends on the sensor arrangement around the robot. We used a GA to evolve an efficient sensor arrangement. The evolved controller showed robustness even if the robot was placed in a completely different environment or if the obstacles were moved around. We also showed the advantages of evolving the controller offline as opposed to evolving one in real time in terms of the robot design configuration, hardware capabilities of the on-board microcontroller and time taken to achieve a good control strategy. Future work includes additionally evolving the speed of the motor. For instance, the speed of the motor could be automatically increased if no obstacle was detected for several steps. Robot designs that have multiple motors have to be manually aligned accurately as they are used for in-place rotational and linear movements, which is a painfully tedious process. This could also be done automatically by evolving the right motor speeds to enable precision movement. The control strategy could be extended to incorporate analog, or a combination of analog and digital sensors and allow a GA to choose an optimal combination of the various types of sensors and their arrangement on the robot.
References

1. Cliff, D.: Computational neuroethology: a provisional manifesto. In: Proceedings of the First International Conference on Simulation of Adaptive Behavior (From Animals to Animats), pp. 29–39. MIT Press, Cambridge (1990)
2. Harvey, I., Husbands, P., Cliff, D.: Issues in evolutionary robotics. In: Proceedings of the Second International Conference on Simulation of Adaptive Behavior (From Animals to Animats 2), pp. 364–373. MIT Press, Cambridge (1993)
3. Koza, J.R.: Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge (1992)
4. Nolfi, S., Floreano, D., Miglino, O., Mondada, F.: How to evolve autonomous robots: different approaches in evolutionary robotics. In: Artificial Life IV, pp. 190–197. MIT Press/Bradford Books (1994)
5. Nordin, P., Banzhaf, W.: A genetic programming system learning obstacle avoiding behavior and controlling a miniature robot in real time. Tech. rep., The University of Dortmund (1995)
6. Reynolds, C.W.: Evolution of obstacle avoidance behavior: using noise to promote robust solutions, pp. 221–241. MIT Press, Cambridge (1994)
7. Spanache, S., Escobet, T., Travé-Massuyès, L.: Sensor placement optimisation using genetic algorithms. In: Proceedings of the 15th International Workshop on Principles of Diagnosis, DX 2004, pp. 179–183 (2004)
Feature Selection for Translation Initiation Site Recognition

Aida de Haro-García, Javier Pérez-Rodríguez, and Nicolás García-Pedrajas

Department of Computing and Numerical Analysis, University of Córdoba, Spain
[email protected],
[email protected],
[email protected] http://www.cibrg.org/
Abstract. Translation initiation site (TIS) recognition is one of the first steps in gene structure prediction, and one of the common components in any gene recognition system. Many methods have been described in the literature to identify TIS in transcripts such as mRNA, EST and cDNA sequences. However, the recognition of TIS in DNA sequences is a far more challenging task, and the methods described so far for transcripts achieve poor results in DNA sequences. From the point of view of Machine Learning, this problem has two distinguishing characteristics: it is class imbalanced and has many features. In this work, we deal with the latter of these two characteristics. We present a study of the relevance of the different features, the nucleotides that form the sequences, used for recognizing TIS by means of feature selection techniques. We found that the importance of each base position depends on the type of organism. The feature selection process is used to obtain a subset of features for the sequence which is able to improve the classification accuracy of the recognizer. Our results using sequences from human genome, Arabidopsis thaliana and Ustilago maydis show the usefulness of the proposed approach.
1 Introduction
Translation initiation site (TIS) recognition consists of identifying the start codon, ATG, which marks the beginning of translation in most genes. Most previous approaches have focused on recognizing TIS in transcripts. However, recognizing TIS in genomic sequences is a different, more difficult task. Full-length or partial transcripts usually contain one or zero TIS, and no introns. On the other hand, in a generic genomic sequence we can find the ATG codon, and thus a putative TIS, in any place. In this work we consider the most difficult case of analyzing genomic sequences that contain junk DNA, exons, introns and untranslated regions (UTRs). The latter are also contained in transcripts. The different characteristics of recognizing TIS in transcripts and genomic sequences are reflected in the different performance of the predictors in each
This work has been financed in part by the Excellence in Research Project P07-TIC2682 of the Junta de Andalucía.
problem. TisMiner [6] is one of the best-performing programs for TIS recognition in transcripts, able to achieve a specificity of 98% at a sensitivity level of 80%. However, when tested on genomic sequences, its performance at the same level of sensitivity drops to a specificity of 55%. TIS prediction in genomic sequences has two important characteristics: firstly, negative instances outnumber positive instances many times over, and secondly, there are usually many features to describe each sequence. In machine learning theory, the former is called the class-imbalance problem [1]. Most learning algorithms expect a somewhat balanced distribution of instances among the different classes, and it has been shown that learning algorithms suffer from the skewed distribution associated with class imbalance. In a previous work [2] we studied the class-imbalance aspect of the problem. In this work, we deal with the second aspect of TIS recognition and study the effect of the number of features on the classification accuracy. We show results of the application of feature selection techniques and their effect on the performance of the model. Many classifiers have problems with the TIS recognition task due to the large number of features involved. Most previous works used an arbitrary window of nucleotides around the putative TIS. In this work, we present a more principled way of selecting the relevant features using feature selection techniques. We proceed in two steps: first, we select the most suitable window for each organism, and then we select a subset of the most relevant features within that window. This paper is organized as follows: Section 2 summarizes the most important aspects of feature selection; Section 3 shows the experimental setup and the results obtained; and finally Section 4 states the conclusions of our work and future research lines.
2 Feature Selection
Feature selection has been a fertile field of research and development since 1970’s in statistical pattern recognition, machine learning, and data mining, and widely applied to many fields such as text categorization, image retrieval, customer relationship management, intrusion detection, and genomic analysis. Feature selection can be defined as the selection of a subset of M features from a set of N features, M < N , such that the value of a criterion function is optimized over all subsets of size M [7]. The objectives of feature selection are manifold, the most important ones being [8]: – To avoid over-fitting and improve model performance, e.g. prediction performance in the case of supervised classification and better cluster detection in the case of clustering. – To provide faster and more cost-effective models. – To gain a deeper insight into the underlying processes that generated the data.
However, the advantages of feature selection techniques come at a certain price, as the search for a subset of relevant features introduces an additional layer of complexity in the modeling task. In the context of classification, feature selection techniques can be organized into different categories. The output type divides feature selection algorithms into two groups: ranked list and minimum subset. The real difference between the two is about the order among the selected features. There is no order among the features in a selected subset. One cannot easily remove any more features from the subset, but one can do so for a ranked list by removing the least important one. If we focus on how they combine the feature selection search with the construction of the classification model, we can identify three categories [8]: filter methods, wrapper methods, and hybrid/embedded methods:

1. Filter techniques rely on the intrinsic properties of the data to evaluate and select feature subsets without involving any mining algorithm. Advantages of filter techniques are that they easily scale to very high-dimensional datasets and that they are computationally simple, fast, and independent of the classification algorithm. As a result, feature selection needs to be performed only once, and then different classifiers can be evaluated.

2. Wrapper methods embed the model hypothesis search within the feature subset search. In this setup, a search procedure in the space of possible feature subsets is defined, and various subsets of features are generated and evaluated. To search the space of all feature subsets, a search algorithm is then wrapped around the classification model. However, as the space of feature subsets grows exponentially with the number of features, heuristic search methods are used to guide the search for an optimal subset. The evaluation of a specific subset of features is obtained by training and testing a specific classification model. Their advantages include the interaction between feature subset search and model selection, and the ability to take into account feature dependencies. A common drawback is that they have a high risk of over-fitting and are very computationally intensive.

3. Hybrid/embedded techniques attempt to take advantage of the two models by building the search for an optimal subset of features into the classifier construction. Just like wrappers, they are specific to a given learning algorithm. Embedded methods have the advantage that they include the interaction with the classification model, while at the same time being far less computationally intensive than wrapper methods.

In order to choose an appropriate algorithm for our problem we have to bear in mind the special characteristics of TIS recognition. First, its class-imbalance nature. To avoid the problems derived from this fact, we have performed an undersampling step before applying any learning algorithm. We applied undersampling as it was the overall best-performing method in our previous study [2]. Then, we must consider the large number of features in the datasets. This size prevents the use of wrapper approaches due to their computational cost. Thus, we must choose a filter approach. Furthermore, the method used must be able to
cope with many features. With all these constraints, we have selected SVM-RFE as the best choice. The support vector machine recursive feature elimination (SVM-RFE) method is well studied for use in gene expression problems [4]. This algorithm conducts feature selection in a sequential backward elimination manner, which starts with all the features and discards one feature at a time (it is a greedy algorithm). Just like SVM, SVM-RFE was initially proposed for binary problems. The squared coefficients $w_j^2$ ($j = 1, \ldots, p$) of the weight vector $w$ are employed as feature ranking criteria. Intuitively, the features with the largest weights are the most informative. Thus, in each iteration of SVM-RFE one trains the SVM classifier, computes the ranking criterion $w_j^2$ for all remaining features, and discards the feature with the smallest ranking criterion. One can consider that the removed variable is the one which has the least influence on the weight vector norm. The procedure provides as output a ranking of the features.
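A compact sketch of this procedure, using scikit-learn's LinearSVC as the linear SVM (our choice of tooling; the authors do not specify an implementation), is shown below.

```python
# Sketch of SVM-RFE as described above (our code, not the authors' implementation).
import numpy as np
from sklearn.svm import LinearSVC

def svm_rfe(X, y, C=1.0):
    """Return feature indices ordered from least to most informative.

    At each step a linear SVM is trained on the remaining features and the feature
    with the smallest squared weight w_j^2 is discarded (greedy backward elimination)."""
    remaining = list(range(X.shape[1]))
    ranking = []                                   # filled with discarded features, worst first
    while remaining:
        clf = LinearSVC(C=C, dual=False).fit(X[:, remaining], y)
        w2 = np.square(clf.coef_).sum(axis=0)      # squared weights of surviving features
        worst = int(np.argmin(w2))
        ranking.append(remaining.pop(worst))
    return ranking                                 # ranking[-1] is the most informative feature
```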
2.1 Evaluation Measures
TIS recognition is a class-imbalance problem and accuracy is not a useful measure for imbalanced data, especially when the minority class is highly underrepresented. If we have a ratio of 1:100, a classifier that assigns all instances to the majority class will have a 99% accuracy. Several measures [9] have been developed to take into account the imbalanced nature of the problems. Given the numbers of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN), we can define two basic measures: sensitivity, $Sn = \frac{TP}{TP+FN}$, and specificity, $Sp = \frac{TN}{TN+FP}$. If we are concerned about the performance on both the negative and positive classes, the G-mean measure [5] considers both, where $G\text{-}mean = \sqrt{Sn \cdot Sp}$. G-mean measures the balance of the performance of the learning algorithm between the two classes. Many classifiers are subject to some kind of threshold that can be varied to achieve different values of the above measures. For that kind of classifier, receiver operating characteristic (ROC) curves can be constructed. A ROC curve is a graphical plot of the TP rate (sensitivity) against the FP rate (1 - specificity) for a binary classifier system as its discrimination threshold is varied. The perfect model would achieve a true positive rate of 1 and a false positive rate of 0. A random guess is represented by a line connecting the points (0, 0) and (1, 1). ROC curves are a good measure of the performance of the classifiers. Furthermore, from this curve a new measure, the area under the curve (AUC), can be obtained, which is a very good overall measure for comparing algorithms. In our experiments we use ROC curves and AUC as the main tools for comparing the algorithms. For estimating the error measures as well as the ROC curves and AUC we used 10-fold cross-validation. The source code, in C and licensed under the GNU General Public License, used for all methods, as well as the partitions of the datasets, is freely available upon request from the authors.
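For reference, these measures can be computed directly from the confusion counts; the following sketch uses the standard definitions given above.

```python
# Sketch of the evaluation measures defined above (our code, standard definitions).
import math

def sensitivity(tp, fn):
    return tp / (tp + fn)

def specificity(tn, fp):
    return tn / (tn + fp)

def g_mean(tp, fp, tn, fn):
    return math.sqrt(sensitivity(tp, fn) * specificity(tn, fp))

# Example: a classifier that labels everything negative on a 1:100 data set has
# high accuracy but a G-mean of 0, which is why accuracy is avoided here.
print(g_mean(tp=0, fp=0, tn=10000, fn=100))   # -> 0.0
```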
3 Experimental Setup and Results
As the learning algorithm we have used a support vector machine (SVM). In a previous work we found SVM to achieve the best overall results [3] when compared with other widely used classification methods. SVM is very sensitive to its learning parameters, especially C, and γ in the case of a Gaussian kernel. Thus, we have carried out a cross-validation procedure to obtain values for these two parameters. We tried a linear kernel with C ∈ {0.1, 1, 10}, and a Gaussian kernel with C ∈ {0.1, 1, 10} and γ ∈ {0.0001, 0.001, 0.01, 0.1, 1, 10}, testing all 21 possible combinations. As the optimal values of these parameters depend on the training set, the cross-validation process is repeated each time an SVM is trained.
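The 21-combination search could be written, for example, with scikit-learn's GridSearchCV (our choice of tooling, and the number of inner folds is an assumption, since it is not stated here):

```python
# Sketch of the 21-combination parameter search described above (not the authors' code).
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = [
    {"kernel": ["linear"], "C": [0.1, 1, 10]},                        # 3 combinations
    {"kernel": ["rbf"], "C": [0.1, 1, 10],
     "gamma": [0.0001, 0.001, 0.01, 0.1, 1, 10]},                     # 18 combinations
]

def tune_svm(X_train, y_train, folds=5):
    """Pick C, gamma and kernel by cross-validation on the training split only,
    so the search is repeated every time an SVM is trained."""
    search = GridSearchCV(SVC(), param_grid, cv=folds, scoring="roc_auc")
    return search.fit(X_train, y_train).best_estimator_
```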
3.1 Datasets
We have used three datasets for testing the performance of the described methods. The CCDS dataset was compiled by Saeys et al. [8] from the consensus CDS database. The CCDS project is a collaborative effort of compiling and identifying a core of human genes that are accurately annotated. The annotation is a mix of manual curation and automatic annotation. The CCDS dataset contains 350,578 negative samples and 13,917 positive samples with a positive/negative ratio of 1:25. The Ustilago dataset is a set of coding and non-coding regions of genomic sequences from the sequencing of the fungus Ustilago maydis. The sequences were first obtained from the Broad Institute1 and then completed with the information of the Munich Information Center for Protein Sequences (MIPS)2. The Ustilago dataset contains 607,696 negative samples and 6,515 positive samples with a ratio of 1:93. The Arabidopsis dataset comprises coding and non-coding regions of the plant Arabidopsis thaliana obtained from "The Arabidopsis Information Resource" (TAIR)3. This dataset contains 27,342 positive instances and 3,369,875 negative instances with an imbalance ratio of 1:123. Our aim was to test the proposed methodology on very different datasets to study whether the problem is more difficult depending on the organism. Thus, we used these three datasets because they correspond to very different species. The CCDS dataset contains human DNA, thus it has long genes, with many exons and potentially long introns. On the other hand, the Ustilago dataset contains shorter genes, with very few exons, usually only one or two, and few and shorter introns. Arabidopsis complexity is between these two organisms. With these datasets, we can study the behavior of the proposed methodology in different environments. For all datasets we consider a sequence of 500 bps upstream and downstream of every ATG codon. The SVM classifier needs numerical attributes, thus we used a 1-out-of-4 codification of each element of the sequence for a total of 4012 inputs.
1 http://www.broadinstitute.org/annotation/genome/ustilago_maydis/
2 http://www.helmholtz-muenchen.de/en/mips/home/index.html
3 http://www.arabidopsis.org/
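A sketch of the 1-out-of-4 codification described above (our code): a window of 1003 nucleotides (500 bps upstream, the ATG codon and 500 bps downstream) yields the 4012 numerical inputs for the SVM.

```python
# Sketch of the 1-out-of-4 encoding of a sequence window (our code).
import numpy as np

BASES = {"A": 0, "C": 1, "G": 2, "T": 3}

def encode_window(seq):
    """Turn a nucleotide string into a flat 0/1 vector with 4 positions per base.
    Unknown symbols (e.g. N) are left as all zeros."""
    x = np.zeros(4 * len(seq), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.get(base)
        if j is not None:
            x[4 * i + j] = 1.0
    return x

# encode_window of a 1003-character window gives the 4012 SVM inputs used above.
```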
Our first task was to obtain a working window for classifying the sequences. We chose an initial window large enough to ensure that all the needed information is contained in it. We selected sequences with 500 bps upstream and downstream of the actual or putative TIS. With these sequences we applied the SVM-RFE method to rank the inputs. However, we must take into account that SVM-RFE works with the codified inputs. So we have to find a method that processes the 4012 original inputs in a way that allows us to measure the relevance of the 1003 corresponding nucleotides. To obtain that measure, we assigned to each nucleotide in the sequence a ranking equal to the average of the rankings of the 4 variables that codify the nucleotide. To make the information obtained manageable, we grouped the relevance of the variables into 40 groups. These results are shown in Figure 1. The first thing the results show is that the relevance of the variables is highly dependent on the organism. It is very clear that the profile of the plots is very different for the human, Arabidopsis thaliana and Ustilago maydis genomes. The profile of Ustilago maydis shows that most of the downstream part is relevant. The Ustilago dataset has genes with few exons and introns. Because there are few introns, the downstream sequence is usually an exon with no introns, or at most one intron. Thus, it is a coding region and significant for the classification of the sequence. For the human genome, the CCDS dataset, the situation is different. The human genome has more complex genes, and the first exon is usually disrupted by an intron after a short sequence. In this way, the downstream part is not all coding, and after the first 50 codons the significance of the features decreases. For Arabidopsis thaliana, we found a behavior between these two. With these first results we determined a sequence window to proceed with our second experiment. This window must consider a smaller sequence and at the same time include most of the significant features. We chose a sequence of 250 bps for the three genomes. For CCDS and Ustilago maydis we selected a window of 100 bases downstream and 150 upstream, and for Arabidopsis thaliana of 50 bases upstream and 200 downstream. With these windows we performed a second round of feature selection to study the relevance of each base in the sequence. The results are shown in Figure 2 for each base in the sequence window. For the CCDS dataset we found interesting results. First, most of the useful information is concentrated in the coding part, as should be expected. Furthermore, the relevance of the features follows a very marked sequence of period three, always assigning less relevance to the middle base in the codon than to the first base, with the last base always being the most relevant. This is very interesting, and further research is ongoing for a complete explanation of this fact. The Ustilago dataset has a more even distribution of the relevance of the features, but it also exhibits the periodicity of CCDS in the coding part. However, for Ustilago maydis the first base of the codon is always the least informative, and the middle one the most relevant. For Arabidopsis thaliana, the downstream sequence is almost irrelevant. The periodicity in the coding region is also observed, with the middle base as the most important. A comparison of the performance using this window and the whole sequence is shown in Figure 3.
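The per-nucleotide relevance described above can be derived from the per-variable ranking as in the following sketch (our code); it assumes the four variables that codify position i occupy consecutive indices, as in the codification sketch above.

```python
# Sketch of turning the ranking of the 4012 binary inputs into a relevance value
# per nucleotide position, and grouping positions for plotting (our code).
import numpy as np

def nucleotide_relevance(variable_rank, n_positions=1003, n_groups=40):
    """variable_rank[v] is the rank of input variable v (lower = more relevant).
    Each nucleotide gets the average rank of the 4 variables that codify it; the
    positions are then averaged in `n_groups` consecutive groups."""
    ranks = np.asarray(variable_rank, dtype=float).reshape(n_positions, 4)
    per_position = ranks.mean(axis=1)
    groups = np.array_split(per_position, n_groups)
    return per_position, np.array([g.mean() for g in groups])
```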
Fig. 1. Codon relevance for a sequence of 1000 bps for, from top to bottom, Arabidopsis, CCDS and Ustilago
364
A. de Haro-Garc´ıa, J. P´erez-Rodr´ıguez, and N. Garc´ıa-Pedrajas 12
10
Relevance rank
8
6
4
2
ATG ATG
0 -30
0
30
60
90
120
150
180
Codons 12
10
Relevance rank
8
6
4
2
ATG ATG
0 -90
-60
-30
0
30 Codons
60
90
120
150
60
90
120
150
12
10
Relevance rank
8
6
4
2
ATG ATG
0 -90
-60
-30
0
30 Codons
Fig. 2. Base relevance for a sequence of 250 bps around the TIS for, from top to bottom, Arabidopsis, CCDS and Ustilago
Fig. 3. ROC curves and AUC values for, from top to bottom, Arabidopsis, CCDS and Ustilago
It is clear that using the reduced window improves the classification. Our last step was to obtain the subset of the most relevant features within the window and test its performance. However, SVM-RFE outputs a ranking of the variables, not a selection. As we want to compare the performance of feature
selection with no feature selection, we performed a cross-validation approach for selecting the most appropriate number of retained features according to the ranking order provided by SVM-RFE in the classification process. A comparison of the three different groups of features is shown in Figure 3 for the three datasets. The figure shows the ROC curves and corresponding AUC. The results show the improvement in performance of feature selection against using all the variables. AUC is improved for the three problems with fewer features.
4 Conclusions
In this work we have shown that feature selection methods are a useful tool for improving the performance of TIS recognition. In a first step we improved the performance of the classifier by reducing the whole sequence to a smaller sequence. Then, we carried out a second feature selection process to select the most interesting features within that smaller sequence. The performed selection achieved better results in terms of AUC. In this way, SVM-RFE has shown its ability first to select the correct window for learning the classifiers, and then to obtain a subset of features able to improve the results of all of them. Our current research is centered on explaining the periodicity exhibited by the coding part of the sequences in the relevance of the features.
References

1. Barandela, R., Sánchez, J.L., García, V., Rangel, E.: Strategies for learning in class imbalance problems. Pattern Recognition 36, 849–851 (2003)
2. García-Pedrajas, N., Ortiz-Boyer, D., García-Pedrajas, M.D., Fyfe, C.: Class imbalance methods for translation initiation site recognition. In: García-Pedrajas, N., Herrera, F., Fyfe, C., Benítez, J.M., Ali, M. (eds.) IEA/AIE 2010. LNCS (LNAI), vol. 6096, pp. 327–336. Springer, Heidelberg (2010)
3. García-Pedrajas, N., Pérez-Rodríguez, J., García-Pedrajas, M., Ortiz-Boyer, D., Fyfe, C.: Class imbalance methods for translation initiation site recognition in DNA sequences. Knowledge Based Systems (2010) (submitted)
4. Guyon, I., Weston, J., Barnhill, S., Vapnik, V.: Gene selection for cancer classification using support vector machines. Machine Learning 46, 389–422 (2002)
5. Kubat, M., Holte, R., Matwin, S.: Machine learning for the detection of oil spills in satellite radar images. Machine Learning 30, 195–215 (1998)
6. Liu, H., Han, H., Li, J., Wong, L.: Using amino acids patterns to accurately predict translation initiation sites. In Silico Biology 4, 255–269 (2004)
7. Narendra, P., Fukunaga, K.: A branch and bound algorithm for feature subset selection. IEEE Transactions on Computers C-26(9), 917–922 (1977)
8. Saeys, Y., Abeel, T., Degroeve, S., de Peer, Y.V.: Translation initiation site prediction on a genomic scale: beauty in simplicity. Bioinformatics 23, 418–423 (2007)
9. Sun, Y., Kamel, M.S., Wong, A.K.C., Wang, Y.: Cost-sensitive boosting for classification of imbalanced data. Pattern Recognition 40, 3358–3378 (2007)
DTP: Decision Tree-Based Predictor of Protein Contact Map

Cosme E. Santiesteban-Toca1 and Jesus S. Aguilar-Ruiz2

1 Centro de Bioplantas, University of Ciego de Ávila, Cuba
[email protected]
2 University of Pablo de Olavide, Sevilla, Spain
[email protected]

Abstract. In this paper, we focus on protein contact map prediction, one of the most important intermediate steps of the protein folding problem. We describe a method where the contact maps of proteins are predicted with decision trees, using as input codings the information obtained from all possible pairs of amino acids formed in the training data set. As a result, the algorithm creates a model that consists of 400 decision trees (one for each possible amino acid pair), which takes into account the amino acid frequencies in the subsequence between the pair of amino acids analyzed. In order to evaluate the generalization capabilities of the method, we carry out an experiment using 173 non-homologous proteins of known structure, selected from the Protein Data Bank (PDB). Our results indicate that the method can assign protein contacts with an average accuracy of 0.34, superior to the 0.25 obtained by the FNETCSS method. This shows that our algorithm improves the accuracy with respect to the compared methods, especially as the protein length increases.

Keywords: protein structure prediction, protein contact map prediction, decision trees.
1 Introduction
One of the greatest challenges of bioinformatics is protein structure prediction, where the objective is to determine the three-dimensional (3D) structure of a protein from its amino acid sequence [1,2]. Inter-residue contact maps are an important two-dimensional representation of a protein's spatial structure and constitute a critical step in the protein structure prediction problem. The prediction of inter-residue contacts in proteins may be useful in protein-folding recognition. The secondary structure, fold topology and other patterns can easily be highlighted from a contact map. Similarly, the contact map information is used to predict unknown structures and protein functions [3,4]. The ability to make successful predictions involves understanding the relationship between a sequence and its protein structure [5,6].
In the last 20 years, multiple methods for predicting contact maps have been developed. Some of the methods for predicting inter-residue contacts focus on: processing the distribution of distances between residue pairs in proteins of known 3D structure with the aim of tackling the folding and the inverse folding problem; classifying the inter-residue contacts related to structural patterns; combining correlated mutations with other properties, such as sequence conservation, sequence separation along the chain, alignment stability and residue-specific contact occupancy; and employing machine learning and neural network-based methods to predict whether the distances between couples of residues are below a given threshold [7,8,9,10,11,12]. In this paper we propose a solution based on decision trees, given their high degree of flexibility and ease of understanding. Our algorithm uses the Quinlan C4.5 method [13,14] to address the problem of protein contact map prediction. In this method we combine the information obtained from various sources, including the distance and sequence separation between amino acids and the frequency of the amino acids present in the subsequence between them. Therefore, the objective of this research is to know how a system based on decision trees can learn the correlation between the residue covalent structure of a protein and its contact map, as calculated from its known 3D structure. This article is structured as follows: a methodology section, which explains the protein data set selection criteria, the definition of contact maps, the proposed model architecture and the measures employed to assess the effectiveness of the algorithm; a results section, showing tabular and graphical experimentation results; and, finally, the conclusions of this work.
2
Materials and Methods
2.1
Data Bases
With the goal of training and validating the proposed prediction method, we chose a protein database selected from the Protein Data Bank (PDB) [15]. Normally, the selection of a training data set depends on the problem to be solved; in general terms, the training data set should combine maximum coverage with minimum redundancy. For this reason, in this experiment we employed a set of proteins of known 3D structure. This selection was made following the criteria of Fariselli and Casadio [8]. Therefore, chains whose backbone was broken were excluded. Chains with the lowest possible homology (less than 25% identity) and whose structure does not contain redundant sequences were chosen. Only those proteins without ligands in the PDB file were kept in the training set, to eliminate false contacts due to the presence of hetero-atoms. The proposed procedure does not include contacts between residues whose sequence separation is less than four residues, to avoid small ranges of false contacts [16]. This data set allows us to evaluate the generalization capability of the predictor, because it contains proteins that do not belong to the same family or share a common origin.
Table 1. Protein data set (I-25). The proteins identity value is less than 25% and the sequence length (L) is equivalent to the number of structure covalent residues. L < 100 1a1i-A 1a1t-A 1a68 1a7i 1acp 1ah9 1aho 1aie 1ail 1ajj 1aoo 1ap0 1ark 1awd 1awj 1awo 1bbo 1bc8-C 1c5a
1cfh 1sco 1ctj 1spy 1cyo 1sro 1fna 1tbn 1hev 1tiv 1hrz-A 1tle 1kbs 1tsg 1mbh 1ubi 1mbj 1uxd 1msi 2acy 1mzm 2adx 1nxb 2bop-A 1ocp 2ech 1opd 2fdn 1pce 2fn2 1plc 2fow 1pou 2hfh 1ppt 2hoa 1brf 2hqi 1rof 2lef-A
1sco 2sn3 2sxl 3gat-A 3mef-A 4mt2 5pti L 100-169 1a62 1a6g 1acz 1asx 1aud-A 1ax3 1b10 1bc4 1bd8 1bea 1bfe-A 1bfg
1bkf 1npk 3lzt 1bkr-A 1pdnC 3nul 1br0 1pkp 5p21 1bsn 1poa 7rsa 1bv1 1put 1ad2 1bxa 1ra9 L 170-299 1c25 1rcf 1akz 1cew-I 1rie 1amm 1cfe 1skz 1aol 1cyx 1tam 1ap8 1dun 1vsd 1bf8 1eca 1whi 1bjk 1erv 2fsp 1byq-A 1exg 2gdm 1c3d 1hfc 2ilk 1cdi 1ifc 2lfb 1cne 1jvr 2pil 1cnv 1kpf 2tgi 1csn 1kte 2ucz 1ezm 1mak 3chy 1fts
1juk 1kid 1mml 1mrj 1nls 1ppn 1rgs 1rhs 1thv 1vin 1xnb 1yub 1zin 2baa 2fha L ≥300 16pk 1a8e 1ads 1arv
1axn 1b0m 1bg2 1bgp 1bxo 1dlc 1irk 1iso 1kvu 1moq 1svb 1uro-A 1ysc 2cae 2dpg 2pgd 3grs
Using this approach, the data set includes a total of 173 proteins (Table 1). Table 1 lists the proteins sorted by length and labeled by their names as they appear in the PDB.
2.2
Contact Maps Definition
An alternative view of the protein uses a distance matrix (Figure 1), a symmetric square N×N matrix whose elements are the distances among the atoms in the protein. This representation is obviously redundant: it requires N(N − 1)/2 degrees of freedom instead of N² or 3N (representing the 3D coordinates). The distances between the residues are computed as Euclidean distances.
2.3
Model Architecture
Most existing prediction methods use binary contact maps as a simplification of the problem and treat contact prediction as a classification problem. The main limitation of binary contact maps is the information loss that occurs in the discretization. The proposed method therefore introduces the use of the distance matrix as the basis for training the predictor. As in those methods, in our method the prediction is treated as a classification problem, which distinguishes contacts, quasi-contacts and non-contacts between residues.
Fig. 1. Contact map of the 4sgb-I protein. In the lower left, the binary contact map constructed with a threshold of 8 Å. In the upper right, the distance matrix, which is independent of the selected threshold.
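To make the representation in Figure 1 concrete, the sketch below derives a distance matrix and a binary contact map from residue coordinates, using the 8 Å threshold and the minimum sequence separation of four residues mentioned in this paper. The coordinate handling and function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates.

    coords: array of shape (N, 3), one 3D point per residue
    (e.g. the C-alpha atoms); returns a symmetric N x N matrix.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def binary_contact_map(dist, threshold=8.0, min_separation=4):
    """1 if two residues are closer than `threshold` angstroms and separated
    by at least `min_separation` positions in the sequence; local contacts
    near the diagonal are excluded, as in the paper."""
    n = dist.shape[0]
    contacts = (dist <= threshold).astype(int)
    for i in range(n):
        for j in range(n):
            if abs(i - j) < min_separation:
                contacts[i, j] = 0
    return contacts

# Toy example with random coordinates standing in for a real PDB structure.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords = rng.normal(scale=10.0, size=(50, 3))
    cmap = binary_contact_map(distance_matrix(coords))
    print(cmap.sum(), "contacts in a 50-residue toy example")
```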
Fig. 2. Scheme of the input coding for the decision trees. It is formed by the subsequence information (the 20 amino acid frequencies) and the information of the amino acid couple A1, A2 (the subsequence length Ls, the distance D and the class).
Moreover, decision trees are classifiers that make it possible to obtain understandable rules, which can be used to find further explanations of the data being classified. The proposed algorithm is based on the C4.5 decision tree introduced by Quinlan [14], using the default settings. We build decision trees for all possible pairs of amino acids, which gives a total of 400 trees (20 × 20 amino acids). As input coding, the algorithm uses vectors of length 23, which include information on the substring formed between non-adjacent amino acids, and the distance and sequence separation between the couple of analyzed amino acids (Figure 2). For a couple of amino acids A1, A2, the first 20 elements of the vector correspond to the existing amino acids and contain their frequencies in the substring formed between the pair of amino acids analyzed. Ls represents the length of the substring, or the separation between the pair of amino acids; D is the Euclidean distance between the amino acid couple; and Class is the discretization of the distance D into contact, quasi-contact and non-contact, depending on the established thresholds in Angstroms (Å). For this model we classified the contacts using the following criterion: contact ≤ 8 Å < quasi-contact ≤ 12 Å < non-contact. The decision tree-based predictor of protein contact maps (DTP) is shown in Figure 3. Given the distance matrices of a set of proteins with known structure (P1, P2, ..., Pn), DTP builds a model consisting of a two-dimensional array of size N×N, where N is the number of amino acids (20). Each matrix cell contains a function f(A1, A2, S) formed by a decision tree, whose input vector is composed of the amino acid couple (A1, A2) and the information extracted from the substring (S) contained between them.
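A minimal sketch of the input coding just described: for a pair of amino acids it builds the 23-value vector (20 amino acid counts of the intermediate subsequence, the length Ls, the distance D and the discretized class). The amino acid ordering and helper names are assumptions made for illustration; the authors' own encoding may differ in detail.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 residues

def discretize(distance):
    """Class thresholds used in the paper: contact <= 8 A < quasi-contact <= 12 A < non-contact."""
    if distance <= 8.0:
        return "contact"
    if distance <= 12.0:
        return "quasi-contact"
    return "non-contact"

def encode_pair(sequence, i, j, distance):
    """Input vector for the decision tree of the pair (sequence[i], sequence[j]).

    Returns the 20 amino acid counts of the subsequence between the two
    residues, the subsequence length Ls, the distance D and the class.
    """
    subseq = sequence[i + 1:j]
    counts = [subseq.count(aa) for aa in AMINO_ACIDS]
    ls = len(subseq)
    return counts + [ls, distance, discretize(distance)]

# Example: the pair (A, L) at positions 2 and 10 of a toy sequence.
vector = encode_pair("MKAILVLGLVLLSVTQA", 2, 10, 6.3)
```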
Fig. 3. Scheme of the decision tree-based predictor of protein contact maps, where P1 to Pn are the training proteins, A is the algorithm that creates the knowledge model and S? is the unknown sequence. The knowledge model is formed by a matrix of functions f(A1, A2, S), to which the subsequence S for the amino acids A1 and A2 is given as input. As a result, the algorithm returns the predicted distance matrix for the sequence S?.
For an unknown sequence (S?), each couple of amino acids is evaluated in the built model. The prediction result is the predicted occurrence of contact, quasi-contact or non-contact.
2.4
Evaluation of the Efficiency
To evaluate the effectiveness of the predictor, the problem was simplified to two classes: contacts and non-contacts, where true positives (TP) and true negatives (TN) correspond to correct classifications, and false positives (FP) and false negatives (FN) to incorrect classifications. Taking into account the unbalanced nature of the classes present in this problem, where the ratio between the number of contacts (NC) and the number of non-contacts (NNC) is about NC/NNC = 1/60 [8], the effectiveness of prediction (Ap) is calculated as the ratio of true positives, also known as precision (1). This is because this measure penalizes false positives and prioritizes contacts. Ap = TP/(TP + FP)
(1)
In order to compare the effectiveness of the predictor, an extra measure is used: “the improvement over a random predictor” (2). This measure computes the ratio between Ap (1) and the accuracy of a random predictor (Nc/Np): R = Ap/(Nc/Np)
(2)
where Nc is the number of real contacts in the protein of length Lp, and Np is the number of all possible contacts. In this paper, in order to limit the prediction of local
contacts (clustered along the main diagonal of the contact map), we set the minimum sequence separation between residues in contact to 4. Since the contact map S is symmetric and residues whose sequence gap is less than 4 are not included, Np is computed to be equal to (Lp − 4)(Lp − 3)/2 [7]. With the aim of comparing our results with those obtained by Fariselli and Casadio, a new index was added, which measures the difference in the distribution of the inter-residue distances in the 3D structure for the predicted pairs compared with all pair distances in the structure [17,8]. This index is defined by equation (3):

X_d = \sum_{i=1}^{n} \frac{P_{ic} - P_{ia}}{n \, d_i}    (3)

where n is the number of bins of the distance distribution (15 equally distributed bins from 4 to 60 cluster all the possible distances of residue pairs observed in the protein structure); di is the upper limit (normalized to 60) for each bin; and Pic and Pia are the percentage of predicted contact pairs (with distance between di and di−1) and that of all possible pairs, respectively. By definition, a value of Xd = 0 indicates no separation between the two distance populations, meaning that the predicted contacts are randomly distributed; values of Xd > 0 indicate positive cases, in which the population of distances between predicted contact pairs is shifted to smaller values with respect to the population of distances of all residue pairs in the protein. For contact distances with an upper limit of 8 Å, the larger and more positive Xd is, the more efficient the prediction of contacts is. As with the other two indexes, Xd is also averaged over the protein sets [8].
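The three indices above can be computed directly from the predicted and observed distances. The sketch below follows equations (1)-(3) as reconstructed here; the exact binning convention for Xd and the function names are assumptions.

```python
def accuracy_ap(tp, fp):
    """Ap = TP / (TP + FP), the precision of the predicted contacts (eq. 1)."""
    return tp / (tp + fp) if tp + fp else 0.0

def improvement_over_random(ap, nc, np_):
    """R = Ap / (Nc / Np), eq. (2): Nc real contacts, Np possible pairs."""
    return ap / (nc / np_)

def possible_pairs(lp, min_sep=4):
    """Np = (Lp - 4)(Lp - 3)/2 for a minimum sequence separation of 4."""
    return (lp - min_sep) * (lp - min_sep + 1) // 2

def xd(predicted_distances, all_distances, n_bins=15, lower=4.0, upper=60.0):
    """Xd of eq. (3): difference between the distance distribution of the
    predicted pairs (Pic) and of all pairs (Pia), weighted by the bin upper
    limit di normalized to 60."""
    width = (upper - lower) / n_bins
    total = 0.0
    for i in range(1, n_bins + 1):
        lo, hi = lower + (i - 1) * width, lower + i * width
        pic = 100.0 * sum(lo <= d < hi for d in predicted_distances) / max(len(predicted_distances), 1)
        pia = 100.0 * sum(lo <= d < hi for d in all_distances) / max(len(all_distances), 1)
        di = hi / upper
        total += (pic - pia) / (n_bins * di)
    return total
```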
3
Results
In order to determine the efficiency of the implemented predictor, cross-validation with 10 partitions was performed. The results were compared with those obtained by Fariselli and Casadio (FNETCSS) [8] and with a simple predictor based on the average of the distances and the sequence separation length (MDS) [18,19,20] (Table 2). With the aim of highlighting the dependence of the results on protein size, the effectiveness values were calculated after grouping the proteins according to their sequence length. The results show that, in general, for all proteins, the proposed algorithm (DTP) behaves well. DTP not only improves on the minimum efficiency threshold set by the MDS algorithm but, except for proteins of fewer than 100 amino acids, where its performance is lower than that of FNETCSS, it is visibly superior.

Table 2. Prediction accuracy (Ap), improvement over a random predictor (R) and Xd for DTP, FNETCSS and MDS, grouped by protein sequence length Ls (number of proteins in parentheses).

Algorithm   All (173)            Ls < 100 (65)        100 ≤ Ls < 170 (57)    170 ≤ Ls < 300 (30)    Ls ≥ 300 (21)
            Ap    R      Xd      Ap    R     Xd       Ap    R     Xd         Ap    R      Xd        Ap    R      Xd
DTP         0.34  10.96  64.07   0.27  4.71  12.30    0.31  8.99  40.55      0.31  13.72  92.22     0.36  24.86  174.32
FNETCSS     0.25  8.05   11.87   0.33  6.28  12.91    0.25  7.33  12.14      0.19  9.47   10.78     0.15  12.71  9.77
MDS         0.23  5.17   -       0.17  2.61  -        0.19  5.28  -          0.15  6.82   -         0.15  9.54   -
Fig. 4. Bar graph showing the efficiency of contact prediction as a function of protein sequence length. The horizontal axis shows the sequence length groups, and all proteins together; the vertical axis represents the effectiveness achieved by the predictors.
Fig. 5. The decision trees built, after pruning. The leaves are labeled C for contact, C-C for quasi-contact and N-C for non-contact. (a) Corresponds to the pair of amino acids A-L. (b) Corresponds to the pair of amino acids F-W.
Figure 4 shows the effectiveness of the predictions as a function of protein length for the different methods (DTP, FNETCSS and MDS). This graph shows that the effectiveness of the algorithm depends on the length of the protein. However, unlike FNETCSS, which is more efficient at predicting contacts in short sequences, the proposed DTP method is more efficient in long sequences, even when the density of contacts is much lower. One of the fundamental advantages of this model is the ability to extract the rule patterns that are automatically generated for each decision tree (Figure 5).
Figure 5a shows the tree created for the A-L pair of amino acids, two of the most common in proteins. These trees (Figures 5a and 5b) represent intuitive patterns which could help explain the protein folding problem.
4
Conclusions
As shown previously, decision trees are efficient tools for solving many problems, including protein contact map prediction. This work demonstrates that a predictor based on decision trees performs with greater efficiency when the information of the subsequence between the amino acids to be predicted is included. The proposed method combines the use of decision trees with a new input codification for all possible pairs of amino acids formed in the training data set. The performance of the method was very satisfactory (0.34), especially as protein length increases, greatly enhancing the accuracy with respect to the 0.25 obtained by the FNETCSS method.
Acknowledgements. This research is part of the doctoral program in Soft Computing, developed by the University of Las Villas in Cuba and the Andalusian universities, under the sponsorship of the AUIP, which has promoted and provided the financial support for the entire program and the research visits.
References 1. Ouzounis, C.A., Valencia, A.: Early bioinformatics: the birth of a discipline a personal view. Bioinformatics 19(17), 2176–2190 (2003) 2. Cohen, J.: Bioinformatics An Introduction for Computer Scientists. Computing 36(2), 122–158 (2004) 3. Deng, M.: Mapping gene ontology to proteins based on protein-protein interaction data.. Bioinformatics 20, 895–902 (2004) 4. Bork, P.: Protein interaction network from yeast to human. Curr. Opin. Struct. Biol 14, 292–299 (2004) 5. Robson, B.B.: Analysis of the Code Relating Sequence to Conformation in Globular Proteins. Biochem. J. 141(3), 853–867 (1974) 6. Ramanathan, A.: Using Tensor Analysis to characterize Contact-map Dynamics of Proteins. PhD thesis, Carnegie Mellon University Pittsburgh, PA (2008) 7. Fariselli, P., Casadio, R.: A neural network based predictor of residue contacts in proteins. Protein Engineering 12(1), 15–21 (1999) 8. Fariselli, P., Olmea, O., Valencia, A., Casadio, R.: Prediction of contact maps with neural networks and correlated mutations. Protein Engineering 14(11), 835–843 (2001) 9. Bohr, H.: Protein secondary structure and homology by neural networks. FEBS Letters 241(1), 223–228 (1988) 10. Bohr, H., Bohr, J., Brunak, S., Cotterill, R.M.J., Fredholm, H., Lautrupt, B., Petersen, S.B.: A novel approach to prediction of the 3-dimensional structures of protein backbones by neural networks. FEBS Letters 261(1), 43–46 (1990)
11. Pollastri, G., Baldi, P.: Prediction of contact maps by GIOHMMs and recurrent neural networks using lateral propagation from all four cardinal corners. Bioinformatics 18, 1–9 (2002) 12. Kim, H.: Computational analysis of hydrogen bonds in protein-RNA complexes for interaction patterns.. FEBS Letters 552, 231–239 (2003) 13. Quinlan, J.R.: Induction of Decision Trees. Machine Learning 1, 81–106 (1986) 14. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993) 15. Nakamura, H., Berman, H.M., Henrick, K.: Announcing the worldwide Protein Data Bank. Nature Structural Biology 10(12), 98 (2003) 16. Thomas, C., Casari, D.J., Sander, G.: The prediction of protein contacts from multiple sequence alignments. Protein Engng. 9, 941–948 (1996) 17. Pazos, F., Helmer-citterich, M., Ausiello, G., Valencia, A.: Correlated Mutations Contain Information About Protein - protein Interaction. Dipartimento Biologia, U Roma, and Tor Vergata, pp. 511–523 (1997) 18. Aguilar-Ruiz, J.S.Y., Anguiano-Rey, E., M´ arquez-Chamorro, A.E.: Marco de Referencia en la Calidad de la Predicci´ on de Mapas de Contacto de Prote´ınas. In: XIII Conferencia de la Asociaci´ on Espa˜ nola para la Inteligencia Artificial (2009) 19. Santiesteban Toca, C.: Predicci´ on de mapas de contacto basado en distancias. Trabajo de diploma de estudios avanzados. Doctorado Iberoamricano de Soft Computing (2010) 20. Aguilar-Ruiz, J.S.Y., Anguiano-Rey, E., M´ arquez-Chamorro, A.E.: Definici´ on de Umbral M´ınimo para la Predicci´ on de Estructura Secundaria de Prote´ınas. In: ESTYLF. XV Congreso Espa˜ nol Sobre Tecnolog´ıas y L´ ogica Fuzzy (2010)
Translation Initiation Site Recognition by Means of Evolutionary Response Surfaces Rafael del Castillo-Gomariz and Nicolás García-Pedrajas Department of Computing and Numerical Analysis, University of Córdoba, Spain
[email protected],
[email protected] http://www.cibrg.org/
Abstract. Translation initiation site (TIS) recognition is one of the first steps in gene structure prediction, and one of the common components in any gene recognition system. Many methods have been described in the literature to identify TIS in transcribed sequences such as mRNA, EST and cDNA sequences. However, the recognition of TIS in DNA sequences is a far more challenging task, and the methods described so far for transcripts achieve poor results in DNA sequences. In this work we present the application of response surfaces to the problem of TIS recognition. Response surfaces are a powerful tool for both classification and regression as they are able to model many different phenomena and construct complex boundaries between classes. Furthermore, the interpretability of the results is very interesting from the point of view of the expert. In this paper we show the use of real-coded genetic algorithms for evolving a response surface that learns to classify TIS. The results obtained in three different organisms are comparable with a well-known classification algorithm with a more interpretable polynomial function.
1
Introduction
Translation initiation site (TIS) recognition consists of identifying the start codon, ATG, which marks the beginning of the translation in most genes. Most previous approaches have focused on recognizing TIS in transcripts. However, recognizing TIS in genomic sequences is a different, more difficult task. Full length or partial transcripts usually contain one or zero TIS, and no introns. On the other hand, in a generic genetic sequence, we can find multiple ATG codons, and thus a putative TIS, in any place. In this work we consider the most difficult case of analyzing genomic sequences that contain junk DNA, exons, introns and untranslated terminal regions (UTRs). The latter are also contained in transcripts. The different characteristics of recognizing TIS in transcripts and genomic sequences are illustrated in the different performance of the predictors in each problem. TisMiner [6] is one of the best performing programs for TIS recognition in transcripts, able to achieve a specificity of 98% at a sensitivity
This work has been financed in part by the Excellence in Research Project P07-TIC2682 of the Junta de Andalucía.
level of 80%. However, when tested in genomic sequences, its performance at the same level of sensitivity drops to a specificity of 55%. One of the most important features of TIS prediction in genome sequences is the fact that negative instances outnumber positive instances by many times. In machine learning theory, this is called the class imbalance problem [1]. Most learning algorithms expect a somewhat balanced distribution of instances among the different classes. It has been shown that learning algorithms suffer from the skewed distribution that is associated with class imbalance. In a previous work [3] we studied the class imbalance aspect of the problem. In this work, we study the applicability of response surfaces to TIS recognition. Models based on response surfaces (RSs) are able to explain a wide variety of phenomena. The expression that defines a RS is a polynomial of degree G on each one of the input variables [11] [9]. So, the functions are of the form:

f(x_1, x_2, \ldots, x_n) = c_0 + \sum_{i=1}^{n} c_i x_i + \ldots + \sum_{\substack{i_1, i_2, \ldots, i_G = 1 \\ i_k \le i_{k+1}}}^{n} c_{i_1 i_2 \ldots i_G} \, x_{i_1} x_{i_2} \cdots x_{i_G}    (1)

where G is the degree of the model, x_i the input variables, n the number of inputs, and c_i the coefficients. For instance, for a quadratic RS (degree 2) the expression becomes:

f(x_1, x_2, \ldots, x_n) = c_0 + \sum_{i=1}^{n} c_i x_i + \sum_{i,j=1}^{n} c_{ij} x_i x_j    (2)
The main problem we face when using RSs in real-world problems is the number of terms. Even for a RS of degree 2 or 3, and a small number of inputs, the number of terms is unmanageably large for most optimization algorithms. In this way, the search space is too large and classical search algorithms, such as Levenberg-Marquardt [7], are frequently trapped in local minima, need long training times, and achieve poor results. In such complex, noisy, non-differentiable, multi-modal and deceptive search spaces, evolutionary computation is a very good alternative [8] to classical search methods. Thus, we developed a real-coded genetic algorithm to evolve RSs that overcomes many of these problems. If we want to use a RS and real-coded genetic algorithms to model any phenomenon, we need a chromosome with as many genes as coefficients in the model. The number of coefficients depends on the number of variables and the degree of the polynomial. For example, in the model of eq. 2 the individuals have 3n + 1 genes. Figure 1 shows an individual that represents a RS of degree 2 and 3 variables.
Fig. 1. Response surface of degree 2 and 3 input variables
2
Evolutionary Response Surfaces
Interpretability is a highly desirable property of any model. Simplicity is also a very important property. A simpler model is more interpretable, needs fewer patterns to adjust its coefficients, and its generalization ability is usually better. In our algorithm we enforce the selection of simpler (with fewer terms) yet accurate models. Each individual is codified with a gene for each coefficient of the model. This gene is formed by two different parts. On the one hand, there is one bit that represents the presence/absence of the term in the model. On the other hand, there is a real value that represents the value of the coefficient for the corresponding term in case the term is present. Figure 2 shows an individual that represents a RS of degree 2 with 3 variables codified as explained.
Fig. 2. Modified individual to allow the selection of the simplest model
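A sketch of how an individual of this kind could be decoded and used as a classifier: each gene carries a selector and a coefficient, and only terms whose selector is at least 0.5 (the rule stated later in Section 2.1) enter the degree-2 polynomial. The chromosome layout and names are assumptions for illustration, not the authors' encoding.

```python
from itertools import combinations_with_replacement

def degree2_terms(n_inputs):
    """All terms of a degree-2 response surface on n inputs:
    the constant, the linear terms and the products x_i * x_j with i <= j."""
    terms = [()]                                    # constant term
    terms += [(i,) for i in range(n_inputs)]        # linear terms
    terms += list(combinations_with_replacement(range(n_inputs), 2))
    return terms

def evaluate_individual(genes, x):
    """genes: list of (selector, coefficient) pairs, one per term.
    Terms with selector < 0.5 are switched off. Returns +1 if the
    polynomial value is positive, -1 otherwise."""
    value = 0.0
    for (selector, coeff), term in zip(genes, degree2_terms(len(x))):
        if selector >= 0.5:
            prod = coeff
            for i in term:
                prod *= x[i]
            value += prod
    return 1 if value > 0 else -1

# Usage: an individual for 3 inputs needs one gene per term (10 genes here).
genes = [(1.0, 0.5)] * len(degree2_terms(3))
label = evaluate_individual(genes, [0.2, -1.0, 0.7])
```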
This representation scheme is not enough, by itself, to enforce expressions with a minimum number of terms. In order to obtain the desired effect of preferring smaller polynomials, we must include a term in the fitness function that rewards simpler models. In this way, our problem becomes an optimization problem with two objectives: the accuracy of the model and the number of terms. Since we have only two objectives, we have opted for a single fitness function that is a linear combination of two terms weighted by a coefficient given by the researcher. The fitness function, F, is given by:
F = (1 − β) F_{error} + β F_{compl}    (3)
where F_{error} is the error term, F_{compl} is the complexity term, and β is an adjustable coefficient, 0 ≤ β ≤ 1. The exact form of the two terms can be defined in different ways. To classify a pattern we evaluate the function induced by the polynomial. If the value is greater than 0, the pattern is assigned to the class labeled +1, otherwise to the class labeled −1. The result of applying the genetic algorithm is a binary classifier with a minimum number of terms within the family of polynomial functions, able to distinguish between two classes.
2.1
Genetic Algorithm
The evolution of the population is carried out using a standard genetic algorithm. With a population size of N individuals, each generation P_d N individuals of the population are copied to the new population, P_c N are reproduced by crossover, and P_m N are copied and undergo mutation. The values of these four parameters must be fixed by the user. Each individual is codified by a gene formed by two parts. The first one codifies the real value of the coefficient of the term, and the second one is a selector that indicates the presence/absence of the term in the model. Conceptually, we are working with a hybrid model that includes binary and real values. In order to simplify the implementation, we have considered all the values to be real-valued. The selectors take values in the interval [0, 1], and a term is not considered in the model if its corresponding selector is below 0.5. As explained above, the fitness function consists of two terms: the first one is the classification accuracy of the model, and the second one is the complexity term, which becomes larger as the number of terms, n_T, becomes smaller. The expression for the fitness is given by:

F = (1 − β) \frac{\text{patterns correctly classified}}{\text{number of patterns}} + β \left( 1 − \frac{n_T − n_{T_m}}{n_{T_M} − n_{T_m}} \right)    (4)

where n_{T_M} and n_{T_m} represent, respectively, the maximum and minimum number of coefficients of the model. This is a monotonic fitness function that only achieves the maximum value, 1, if the classification accuracy is 100% and the number of terms is n_{T_m}. The number of genes of each individual depends on the degree of the RS chosen to perform the classification. In the experiments reported we have used RSs of degree 2.
2.2
Crossover Operator
The proposed crossover operator is an adaptation of BLX-α [2]. We have used non-uniform mutation. These two operators are designed for real-coded genetic algorithms and have been adapted to our dual scheme. The adapted BLX-α uses two parents, β^1 = {(s_1^1, c_1^1), ..., (s_p^1, c_p^1)} and β^2 = {(s_1^2, c_1^2), ..., (s_p^2, c_p^2)}, with p
genes each, representing two RS models with p coefficients. As we have explained, each gene represents a term, with a selector s_i^j that tells whether the term is present and a value c_i^j for the coefficient. The two parents generate two descendants β^{d1} = {(s_1^{d1}, c_1^{d1}), ..., (s_p^{d1}, c_p^{d1})} and β^{d2} = {(s_1^{d2}, c_1^{d2}), ..., (s_p^{d2}, c_p^{d2})}. The basic idea of the modification is that the genetic material of the best parent will have a higher probability of being inherited by the offspring. In this way, each gene of the two descendants, (s_i^{d1}, c_i^{d1}) and (s_i^{d2}, c_i^{d2}), is obtained by the algorithm depicted in Figure 3.

Data: two genes (s_i^1, c_i^1) and (s_i^2, c_i^2), from two parents β^1 and β^2 with fitness F_1 and F_2 respectively.
Result: two genes (s_i^{d1}, c_i^{d1}) and (s_i^{d2}, c_i^{d2}).
if round(s_i^1) = round(s_i^2) then
    s_i^{d1} = s_i^{d2} = round(s_i^1)
    (c_i^{d1}, c_i^{d2}) = apply BLX-α over (c_i^1, c_i^2)
else
    obtain n_1 and n_2 randomly in {1, 2}, with probability F_1/(F_1 + F_2) of having value 1 and probability F_2/(F_1 + F_2) of having value 2
    (s_i^{d1}, c_i^{d1}) = (s_i^{n_1}, c_i^{n_1})
    (s_i^{d2}, c_i^{d2}) = (s_i^{n_2}, c_i^{n_2})
end if

Fig. 3. Genes of the two descendants of the adapted BLX-α crossover
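The gene-level rule of Figure 3 can be sketched as follows; blx_alpha is a standard BLX-α recombination of two real values, and the random choice follows the fitness-proportional probabilities of the figure. This is an illustrative sketch, not the authors' code.

```python
import random

def blx_alpha(c1, c2, alpha=0.5):
    """Standard BLX-alpha: sample two offspring values from the interval
    [min - alpha*I, max + alpha*I], where I = |c1 - c2|."""
    lo, hi = min(c1, c2), max(c1, c2)
    spread = alpha * (hi - lo)
    return (random.uniform(lo - spread, hi + spread),
            random.uniform(lo - spread, hi + spread))

def crossover_gene(gene1, gene2, f1, f2, alpha=0.5):
    """Adapted BLX-alpha for one gene (selector, coefficient) of each parent,
    with parent fitness values f1 and f2, as in Figure 3."""
    (s1, c1), (s2, c2) = gene1, gene2
    if round(s1) == round(s2):
        # Term present (or absent) in both parents: recombine the coefficients.
        sel = float(round(s1))
        d1, d2 = blx_alpha(c1, c2, alpha)
        return (sel, d1), (sel, d2)
    # Term present in only one parent: each descendant inherits the whole gene
    # from a parent chosen with probability proportional to its fitness.
    parents = [gene1, gene2]
    p1 = f1 / (f1 + f2)
    n1 = 0 if random.random() < p1 else 1
    n2 = 0 if random.random() < p1 else 1
    return parents[n1], parents[n2]
```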
In this way, the descendants inherit the terms common to both parents, and the coefficients of these common terms are obtained by means of a BLX-α operator. The terms that are present only in one of the parents are inherited with a higher probability if the parent where they are present has a higher fitness value. Standard non-uniform mutation is applied to the individual considering all the values as real numbers. 2.3
Evaluation Measures
As we have stated, TIS recognition is a class-imbalance problem. Accuracy is not a useful measure for imbalanced data, especially when the minority class is very rare. If we have a ratio of 1:100, a classifier that assigns all instances to the majority class will have a 99% accuracy. Several measures [13] have been developed to take into account the imbalanced nature of these problems. Given the numbers of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN), we can define the following set of basic measures:
– True positive rate TP_rate, recall R or sensitivity: TP_rate = R = TP/(TP + FN). This measure is relevant if we are interested only in the performance on the positive class.
– True negative rate TN_rate or specificity: TN_rate = TN/(TN + FP).
– False positive rate: FP_rate = FP/(TN + FP).
– False negative rate: FN_rate = FN/(TP + FN).
– Positive predictive value PP_value or precision P: PP_value = TP/(TP + FP).
– Negative predictive value: NP_value = TN/(TN + FN).
From these basic measures others have been proposed. The F-measure was proposed [5] to join recall and precision in a measure that is the harmonic mean of both: F = 2RP/(R + P) = 2/(1/R + 1/P). The harmonic mean of two measures tends to be closer to the smaller one than the arithmetic mean. Thus, F measures whether recall and precision both have high values. If we are concerned about the performance on both negative and positive classes, the G-mean measure [4] considers both: G-mean = √(TP_rate · TN_rate). G-mean measures the balanced performance of the learning algorithm between the two classes. In this work we will use both the FP and FN rates and, as an average measure, the G-mean. From the above, we can conclude that the proposed fitness function, see eq. 4, is not appropriate for TIS recognition, as classification accuracy is used to evaluate the individuals. Thus, for TIS recognition we use a new fitness function which considers the G-mean measure instead:

F = (1 − β)(G-mean) + β \left( 1 − \frac{n_T − n_{T_m}}{n_{T_M} − n_{T_m}} \right)    (5)

Many classifiers are subject to some kind of threshold that can be varied to achieve different values of the above measures. For that kind of classifiers, receiver operating characteristic (ROC) curves can be constructed. A ROC curve is a graphical plot of the TP_rate (sensitivity) against the FP_rate (1 − specificity) for a binary classifier system as its discrimination threshold is varied. The perfect model would achieve a true positive rate of 1 and a false positive rate of 0. A random guess will be represented by a line connecting the points (0, 0) and (1, 1). ROC curves are a good measure of the performance of the classifiers. Furthermore, from this curve a new measure, the area under the curve (AUC), can be obtained, which is a very good overall measure for comparing algorithms. In our experiments we will use ROC curves and AUC as the main tools for comparing the algorithms. The source code, in C and licensed under the GNU General Public License, used for all methods, as well as the partitions of the datasets, are freely available upon request from the authors.
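The measures above reduce to a few lines of code; the sketch below computes recall, specificity, precision, F and G-mean from confusion-matrix counts (the example numbers are made up for illustration).

```python
import math

def imbalance_metrics(tp, fp, tn, fn):
    """Basic measures for imbalanced classification as defined above."""
    recall = tp / (tp + fn) if tp + fn else 0.0        # TP rate / sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0   # TN rate
    precision = tp / (tp + fp) if tp + fp else 0.0     # positive predictive value
    f_measure = (2 * recall * precision / (recall + precision)
                 if recall + precision else 0.0)
    g_mean = math.sqrt(recall * specificity)
    return {"recall": recall, "specificity": specificity,
            "precision": precision, "F": f_measure, "G-mean": g_mean}

# Example: a classifier that finds 80 of 100 TIS with 400 false positives
# among 10,000 negative windows.
print(imbalance_metrics(tp=80, fp=400, tn=9600, fn=20))
```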
3
Experimental Setup and Results
We have used three datasets for testing the performance of the described methods. The CCDS dataset was compiled by Saeys et al. [12] from the consensus
CDS database. The CCDS project is a collaborative effort to compile and identify a core set of human genes that are accurately annotated. The annotation is a mix of manual curation and automatic annotation. CCDS contains 350,578 negative samples and 13,917 positive samples, with a positive/negative ratio of 1:25. The Ustilago dataset is a set of coding and non-coding regions of genomic sequences from the fungus Ustilago maydis. The sequences were first obtained from the Broad Institute1 and then completed with the information of the Munich Information Center for Protein Sequences (MIPS)2. The Ustilago dataset contains 607,696 negative samples and 6,515 positive samples, with a ratio of 1:93. The Arabidopsis dataset comprises coding and non-coding regions of the plant Arabidopsis thaliana obtained from “The Arabidopsis Information Resource” (TAIR)3. This dataset contains 27,342 positive instances and 3,369,875 negative instances, with an imbalance ratio of 1:123. The parameters used for evolving the response surfaces for TIS recognition are shown in Table 1.

Table 1. Parameters used for evolving the response surfaces for TIS recognition
Population: 500 individuals
Fitness: β = 0.5
Duplication: P_d = 0.2, tournament selection
Crossover: P_c = 0.6, tournament selection, adapted BLX-α (α = 0.5)
Mutation: P_m = 0.2, random selection, non-uniform mutation (b = 5)
Stop: 500 generations
We have compared our model with a well-known classification method: a support vector machine (SVM) [14] with a Gaussian kernel. Figure 4 shows the results in terms of ROC curves and AUC for the three datasets. To avoid the problems derived from the class-imbalance nature of the problem, we performed an undersampling step before applying any learning algorithm. We applied undersampling as it was the overall best performing method in a previous study [3]. We can see that the RS performs comparatively well. It is able to match the results of one of the best methods in the literature and cope with the large number of features in an efficient way. The AUC values achieved by both methods are similar. For the CCDS dataset, RSs are slightly better, and for the remaining two problems, slightly worse. A corrected resampled t-test [10] showed no significant differences between both methods for any of the problems. We must also notice that the evolutionary RSs are not using any local optimization procedure. The addition of such a local search method is likely to improve the overall accuracy.
1 http://www.broadinstitute.org/annotation/genome/ustilago_maydis/
2 http://www.helmholtz-muenchen.de/en/mips/home/index.html
3 http://www.arabidopsis.org/
Fig. 4. ROC curves and AUC values for Arabidopsis, CCDS and Ustilago. Each panel (Arabidopsis thaliana, CCDS (human genome), Ustilago maydis) plots the true positive rate (sensitivity) against the false positive rate (1 − specificity) for the SVM and the response surface, with an inset showing the AUC of each method.
4
Conclusions
In this paper we have shown how real-coded genetic algorithms offer a very interesting approach for using RSs for classification. RSs are a powerful tool for regression and classification, but their applicability is undermined by the lack of efficient algorithms for optimizing their parameters. We have shown that the structure of the polynomial and its parameters can be adjusted using a genetic algorithm. The evolution of both parameters and structure is achieved by the implementation of a dual codification scheme. Specific genetic operators have been developed for this codification. We have shown how the proposed model can be used to solve a very interesting and complex problem: the recognition of translation initiation sites in DNA sequences. The results show that RSs achieve a performance close to SVMs with a more interpretable classifier function. As future research, we are working on the interpretation of the obtained classification functions from the biological point of view.
References 1. Barandela, R., S´ anchez, J.L., Garc´ıa, V., Rangel, E.: Strategies for learning in class imbalance problems. Pattern Recognition 36, 849–851 (2003) 2. Eshelman, L.J., Schaffer, J.D.: Real-coded genetic algorithms and intervalschemata. In: Whitley, L.D. (ed.) Foundations of Genetic Algorithms 2, pp. 187– 202. Morgan Kaufmann, San Mateo (1993) 3. Garc´ıa-Pedrajas, N., Ortiz-Boyer, D., Garc´ıa-Pedrajas, M.D., Fyfe, C.: Class imbalance methods for translation initiation site recognition. In: Garc´ıa-Pedrajas, N., Herrera, F., Fyfe, C., Ben´ıtez, J.M., Ali, M. (eds.) IEA/AIE 2010. LNCS (LNAI), vol. 6096, pp. 327–336. Springer, Heidelberg (2010) 4. Kubat, M., Holte, R., Matwin, S.: Machine learning for the detection of oil spills in satellite radar images. Machine Learning 30, 195–215 (1998) 5. Lewis, D., Gale, W.: Training text classifiers by uncertainty sampling. In: Proceedings of the Seventeenth Annual International ACM SIGIR Conference on Research and Development in Information, New York, USA, pp. 73–79 (1998) 6. Liu, H., Han, H., Li, J., Wong, L.: Using amino acids patterns to accurately predict translation initiation sites. Silico Biology 4, 255–269 (2004) 7. Marquardt, D.W.: An algorithm for least-squares estimation of non-linear parameters. Journal of the Society of Industrial and Applied Mathematics 11(2), 431–441 (1963) 8. Miller, G.F., Todd, P.M., Hedge, S.U.: Designing neural networks. Neural Networks 4, 53–60 (1991) 9. Myers, R.H., Montgomery, D.C.: Response Surface Methodology: Process and Product Optimization using Designed Experiments, 2nd edn. John Wiley & Sons, Chichester (2002) 10. Nadeau, C., Bengio, Y.: Inference for the generalization error. Machine Learning 52, 239–281 (2003)
11. Rawlings, J.O., Pantula, S.G., Dickey, D.: Applied Regression Analysis: A Research Tool. Springer, New York (1998) 12. Saeys, Y., Abeel, T., Degroeve, S., de Peer, Y.V.: Translation initiation site prediction on a genomic scale: beauty in simplicity. Bioinformatics 23, 418–423 (2007) 13. Sun, Y., Kamel, M.S., Wong, A.K.C., Wang, Y.: Cost-sensitive boosting for classification of imbalanced data. Pattern Recognition 40, 3358–3378 (2007) 14. Vapnik, V.: The nature of Statistical Learning Theory. Springer, New York (1999)
An Evolutionary Algorithm for Gene Structure Prediction Javier Pérez-Rodríguez and Nicolás García-Pedrajas Department of Computing and Numerical Analysis, University of Córdoba, Spain
[email protected],
[email protected] http://www.cibrg.org/
Abstract. Gene recognition, gene structure prediction or gene finding, as all these three and other terms are used, consists of determining which parts of a genomic sequence are coding, and constructing the whole gene from its start site to its stop codon. Gene recognition is one of the most important open problems in Bioinformatics. The process of discovering the putative genes in a genome is called annotation. There are two basic approaches to gene structure prediction: extrinsic and intrinsic methods. Intrinsic methods are now preferred due to their ability to identify more unknown genes. Gene recognition is a search problem, where many evidence sources are combined in a scoring function that must be maximized to obtain the structure of a probable gene. In this paper, we propose the first purely evolutionary algorithm in the literature for gene structure prediction. The application of genetic algorithms to gene recognition will open a new field of research where the flexibility of evolutionary computation can be used to account for the complexities of the problem, which are growing as our knowledge of the molecular processes of transcription and translation deepens.
1
Introduction
A genomic sequence is a string composed of four different nucleotides, A, T, G and C, which codifies in groups of three, called codons, the amino acids that form the proteins that are necessary for all organisms to live. The gene is the structure that codifies the proteins1. In prokaryotes it is a sequence of codons between a start codon (ATG) and a stop codon (TAA, TAG or TGA). In eukaryotes, the structure is more complex. The coding sequence is usually broken by non-coding sequences, called introns, that are excised during transcription in a process called splicing. The coding pieces are called exons. In this way, the eukaryotic gene begins with a first exon, followed by any number of intron/exon pairs, and ends with a last exon which finishes with a stop codon. This is called an open reading
This work has been financed in part by the Excellence in Research Project P07-TIC2682 of the Junta de Andalucía. The definition of the gene is far more complex. We will use this definition because we are only concerned with protein-coding genes and do not want to introduce unnecessarily complicated terminology.
frame (ORF). There are also eukaryotic genes composed by a single exon. The boundary between an exon and an intron is called a splice donor site and the boundary between an intron and an exon is called a splice acceptor site. The 64 different codons codify the 20 amino acids that form the proteins and the stop signal that ends the gene. The actual gene has also sequences of nucleotides before the start codon and after the stop codon, the untranslated terminal regions (UTRs). However, it is usual in gene recognition to use the term “gene” when referring only to the coding part of the gene, since that part determines the protein structure. Gene recognition, gene structure prediction or gene finding, as all these three and other terms are used, consists of determining which parts of a sequence are coding and constructing the whole gene from its start site to its stop codon. The remaining of this work is concerned only with eukaryotic gene recognition, as that is more important and difficult. The process of discovering the putative genes in a genome is called annotation. There are two basic approaches to gene structure prediction [11]. Homology based approaches search for similar sequences in databases of known genes. These methods are usually called extrinsic methods [11]. The growing number of sequenced genomes and known genes is increasing the potential of homology based methods. However, it is clear that only genes that are somewhat similar to known genes can be identified in this way. Furthermore, when using homology based techniques it is very difficult to establish the complete structure of the gene, as the exact bounds of the exons are not easy to determine with certainty. The second set of methods are usually known as intrinsic methods, and include two basic approaches: ab initio and de novo methods2 . Both are based on obtaining the features that characterize a coding region and/or the functional sites, and using them to find the correct structure of the unknown genes. Ab initio methods use only the information of the genome to be annotated (the target genome), whereas de novo methods add information of one or more related genomes (the informant genomes). The methods for obtaining and using that information are many, such as neural networks [14], hidden Markov models [7], conditional random fields [5], etc. These methods try to recognize genomic sequence patterns that are characteristic of splice donor and acceptor sites, translation initiation site (TIS) and termination site, and/or features specific of coding regions. Some of them also try to find other important parts, such as promoters, untranslated terminal regions, polyadenilation sites, etc., although the recognition of these sites is a very difficult problem on its own [3] [12]. One of the first significant improvements in ab initio prediction was GENSCAN [1], which achieved both accuracy and robustness using a single genome as input. No new significantly better methods were obtained until the development of dual-genome predictors. Dual-genome predictors use two genomes, the genome to be annotated and the genome of a related organism. The prediction 2
It is also common to refer to all intrinsic approaches as ab initio programs, without differentiating between using one genome or many of them.
is improved when two or more genomes are used. However, as more genomes are added, the increase of performance is less significant. Until 2007, the most accurate de novo gene predictors were based on generalized hidden Markov models. In 2007, three programs based on conditional random fields (CRFs) [9] were presented with a significant improvement. The third of these methods, CONTRAST [5], achieved an impressive prediction of the perfect open reading frame (ORF) structure for the 56% of all known human protein-coding genes, using 11 informant genomes. This result also shows that there are still a long way to go before we are able to predict genes with high accuracy, as the best performing program is still not predicting correctly almost half of the known genes. CONTRAST consists of two main components. The first is a set of classifiers designed to recognize the boundaries of coding regions based on local information contained in a small window around a potential boundary. The second is a global model of gene structure that integrates outputs from the classifiers with additional features of a multiple alignment to predict complete genes. It adopts this two-stage approach because it greatly simplifies the task of learning parameters from training data. Training the boundary classifiers requires only short alignment windows corresponding to positive or negative examples of a specific type of coding region boundary. The global model can then be trained on the full set of training data, treating the classifiers as black boxes. This avoids the need for the global model to incorporate the large number of features required for accurate recognition of coding region boundaries. It uses discriminative techniques, such as support vector machines (SVMs) [2] and CRFs, rather than generative models used by previous de novo predictors. There are major reasons to develop better gene recognizers. Although many programs claim they are able to offer high accuracy [16], in general, the claimed high accuracy is only guaranteed for specific domains of very well known genes. Furthermore, even when a good accuracy is achieved at nucleotide level, the results at exon level are much worse. And these results are even poorer at coding protein level. In fact, it is likely that the capacity of current gene recognizers had been overestimated as new research [13] has proved that de novo gene prediction followed by reverse transcription polymerase chain reaction (RT-PCR) and direct sequencing can find many novel exons and introns even in a genome as thoroughly studied as the human. The fact that significant discrepancies are found in automatic annotations [6] of genomes shows that the results of gene recognition programs must be taken with caution. Recent discoveries are making the problem even more complex as new studies of the mammalian transcriptome have revealed new complexities such as pervasive transcription, thousands of noncoding RNAs, extensive antisense transcription, tandem chimerisms and widespread alternative splicing and alternative promoters. Other studies have shown extensive and elaborate regulation and modification at post-transcriptional stages. All these discoveries show an unanticipated level of complexity in the way genome encodes functional molecules and even question the definition of the gene [4].
This increased complexity needs to be addressed using new and more powerful methods. Although currently best programs, such as N-SCAN and CONTRAST, achieve good results in terms of predicting genes that mostly follow canonical patterns, they are too rigid to deal with the complexities of the genes or the less common structures. In this situation, evolutionary computation might become a very appropriate tool for developing more flexible solutions where non canonical structures can be taken into account. However, the introduction of evolutionary computation in gene recognition is not an easy task. Many difficulties are found to deal with the huge search space and even with the definition of a fitness function. In this paper, we present a first approach to the application of evolutionary computation to gene structure prediction. The initial results are encouraging. With a relatively simple genetic algorithm we obtain good results that show that the application of genetic algorithms to gene finding is a promising research field. This paper is organized as follows: Section 2 explains the evolutionary approach to gene recognition; Section 3 reports the experiments carried out; and Section 4 summarizes our conclusions and open research lines.
2
Evolutionary Gene Recognition
As explained, intrinsic approaches are more powerful than extrinsic ones, due to their ability to identify new structures. Thus, our first choice was to opt for an intrinsic approach. Secondly, we had to decide upon the type of evolutionary method to use. The search is carried out considering many sources of evidence, as the signals that identify coding regions and functional sites are subtle and must be combined to accumulate enough information to assure with a high probability that we have found an exon or an intron. However, the problem is not multi-objective in the usual way. The combination of evidence is used, but each single source is not useful by itself. Thus, our second decision was to use a standard genetic algorithm whose fitness function is a linear combination of different measures. However, this approach also forces us to develop a method for obtaining the weights of the different measures. The application of genetic algorithms to gene structure prediction is based on a two-step procedure. The first step consists of bounding the search space. In the second step, we develop the evolutionary algorithm. The evolution of the structures is coupled with the evolution of the weights assigned to each measure. The first step is devoted to limiting the search space. At first sight, a gene is a structure delimited by two sites, the start and stop codons. At a second level, between these two boundaries we have two different substructures, exons and introns. Exons and introns are delimited by splice donor and acceptor sites, with the exception of the first exon, which begins with the start site, and the last exon, which finishes with the stop signal. If we considered no restrictions in the search, the search space would be huge, and any method would very likely fail. The common approach for reducing the search space is to limit the putative start, splice and stop sites to the most probable ones. In our system, we use the
weight matrix method (WMM), which assigns a score to every base at a specific site. An overall score of the possibility that the considered position is a certain site can be calculated and a threshold established. The values for each position are obtained from the whole human genome as available in GenBank. With this method we obtain a set of possible start, stop and splice sites, which will be the only ones considered. A second reduction of the search space is achieved by taking into account the following constraints on the gene structure:
– The exons do not overlap.
– The gene starts and finishes with an exon.
– An intron must be flanked by two exons.
– A gene can be made up of only one exon.
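The site filtering described above can be sketched as follows: a positional weight matrix assigns a score to each window around a putative site, and only positions scoring above a threshold are kept as candidate boundaries for the genetic algorithm. The matrix values, window length and threshold below are placeholders, not the values the authors estimate from GenBank.

```python
def score_site(window, pwm):
    """Sum of per-position scores for a candidate site.

    window: nucleotide string, e.g. the context around a putative ATG.
    pwm: list (one entry per position) of dicts {nucleotide: score}.
    """
    return sum(pwm[k].get(base, float("-inf")) for k, base in enumerate(window))

def candidate_sites(sequence, motif_len, pwm, threshold):
    """Keep only the positions whose window scores above `threshold`."""
    sites = []
    for pos in range(len(sequence) - motif_len + 1):
        if score_site(sequence[pos:pos + motif_len], pwm) >= threshold:
            sites.append(pos)
    return sites

# Toy example: a 3-column matrix favouring the start codon ATG.
toy_pwm = [{"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
           {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
           {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0}]
starts = candidate_sites("CCATGGTATGAA", 3, toy_pwm, threshold=2.5)
```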
The initial population of the genetic algorithm is randomly obtained from the possible exons that satisfy the above constraints. The codification of each individual is a string of integers that represents the sites of the gene. The initial population is divided into a number of subpopulations in which the individuals are placed depending on their number of exons. An individual may migrate to another subpopulation when a mutation modifies its length. Each generation, several actions are carried out:
Selection. Selection is performed using a binary tournament, taking care to maintain the balance among the numbers of exons of the individuals. The subpopulations with different numbers of exons are kept with the same number of individuals. Elitism is applied to avoid losing the best solutions found so far.
Crossover. Crossover is carried out by randomly recombining the exons of two parents to obtain two offspring. The offspring substitute their parents.
Mutation. This operator consists of randomly modifying, removing, adding or exchanging an exon. After mutation, the individual is checked to assure its viability.
The algorithm is a standard generational genetic algorithm with the particularity that we have forced an even distribution of the lengths of the genes, in terms of number of exons, to avoid a premature convergence to a suboptimal solution.
2.1
Fitness Function
In this first approach we have used a fitness function that is as simple as possible. It must be borne in mind that our main objective is to develop a system that proves the validity of evolutionary computation as a tool for gene recognition. We are not creating a system competitive with current gene recognizers, which are very complex programs. The fitness of a possible solution is calculated by evaluating each segment according to its type. In this way, first exons, last exons, internal exons, unique exons
and introns are evaluated differently. For all these types there is a common measure, the in-frame hexamer frequency. It has long been known that synonymous codons are not used with equal frequencies and that different organisms differ in their patterns of codon usage. The in-frame hexamer score for the interval starting at nucleotide i and ending at j, IF6(i, j), is calculated as follows [14]:

IF_6(i, j) = \max \begin{cases} \sum_{k=0,3,6,\ldots,j-6} \ln(f_k / F_k) \\ \sum_{k=1,4,7,\ldots,j-6} \ln(f_k / F_k) \\ \sum_{k=2,5,8,\ldots,j-6} \ln(f_k / F_k) \end{cases}    (1)

where f_k is the frequency, in the table of in-frame hexamers in human coding sequences, of the hexamer starting at position k in the interval, and F_k is the frequency of the same hexamer in a random population based on the base composition of the sequence. Hexamers whose occurrences equal those expected by composition have IF_6 = 0, those preferred have a positive score and those avoided have a negative score. This is the only content statistic used. Additionally, for each site we calculate a score based on a positional weight matrix. Each segment (i, j) of the sequence has a score which depends on its type k: first exon, f; internal exon, e; intron, i; last exon, l; and unique exon, u. The score of a segment of type k, from j to l, S_k(j, l), represents the probability of the segment being of type k. The score is the weighted sum of the site and content statistics. The fitness of an individual is the sum of the scores of all its segments:

F = S_f(h, j) + S_i(j+1, n) + S_e(n+1, l) + S_i(l+1, m) + \ldots + S_l(p+1, n)    (2)

One of the problems with this fitness function is the weighting of the different factors that compose the score S_k. To obtain those weights we coevolve a population of weights that searches for the optimal values. This population evolves using a standard real-coded genetic algorithm with binary tournament selection, BLX-α crossover, and non-uniform mutation. Each individual is evaluated using the difference between the score assigned to the actual solution and the partial solutions obtained by the evolutionary approach. The outline of the whole architecture of the system is shown in Figure 1, where we see the coevolution of both populations. When we face the recognition of a gene in a previously unknown sequence, we can use the weights evolved in the training stage, and we can also use the new sequence to continue the evolution of the weights towards improved values.
2.2
Evaluation Measures
Accuracy is not a useful measure for imbalanced data, especially when the minority class is very rare. If we have a ratio of 1:100, a classifier that assigns all instances to the majority class will have a 99% accuracy. In gene structure prediction, the ratio of coding to non-coding regions is far more imbalanced, so other measures must be used [12].
Fig. 1. General architecture of the evolutionary algorithm
Several measures [15] have been developed to take into account the imbalanced nature of these problems. Given the numbers of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN), we can define the following set of basic measures:
– True positive rate TP_rate, recall R or sensitivity: TP_rate = R = TP/(TP + FN). This measure is relevant if we are interested only in the performance on the positive class.
– True negative rate TN_rate or specificity: TN_rate = TN/(TN + FP).
– False positive rate: FP_rate = FP/(TN + FP).
– False negative rate: FN_rate = FN/(TP + FN).
– Positive predictive value PP_value or precision P: PP_value = TP/(TP + FP).
– Negative predictive value: NP_value = TN/(TN + FN).
From these basic measures others have been proposed. The F-measure was proposed [10] to combine recall and precision in a single measure that is the harmonic mean of both: F = 2RP / (R + P) = 2 / (1/R + 1/P). The harmonic mean of two measures tends to be closer to the smaller one than the arithmetic mean; thus, F is high only when both recall and precision have high values. If we are concerned about the performance on both the negative and positive classes, the G-mean measure [8] considers both: G-mean = √(TP_rate · TN_rate). G-mean measures the balanced performance of the learning algorithm between the two classes. Many classifiers are subject to some kind of threshold that can be varied to achieve different values of the above measures. For that kind of classifier, receiver operating characteristic (ROC) curves can be constructed. A ROC curve is a graphical plot of the TP_rate (sensitivity) against the FP_rate (1 − specificity) for a binary classifier system as its discrimination threshold is varied. The perfect model would achieve a true positive rate of 1 and a false positive rate of 0.
A random guess is represented by a line connecting the points (0, 0) and (1, 1). ROC curves are a good measure of the performance of classifiers. Furthermore, from this curve a new measure, the area under the curve (AUC), can be obtained, which is a very good overall measure for comparing algorithms. In our experiments we will use ROC curves and AUC as the main tools for comparing the algorithms. These are common measures in any class-imbalance problem. There are also measures specific to the gene recognition task. One of the most commonly used is the correlation coefficient, CC:

CC = [(TP)(TN) − (FP)(FN)] / √[(PP)(PN)(AP)(AN)]
(3)
where PP are the predicted positives, AP the actual positives, PN the predicted negatives and AN the actual negatives. This will be our main measure of the performance of the method. The source code used for all methods, written in C and licensed under the GNU General Public License, as well as the partitions of the datasets, is freely available upon request from the authors.
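For illustration, all of the measures above, including the correlation coefficient of Eq. (3), can be computed directly from the four confusion-matrix counts; the sketch below is ours and is not part of the authors' released C code.

```python
import math

def evaluation_measures(tp, fp, tn, fn):
    """Imbalance-aware measures and the correlation coefficient CC (Eq. 3).

    Assumes none of the denominators is zero.
    """
    recall      = tp / (tp + fn)          # TP rate / sensitivity
    specificity = tn / (tn + fp)          # TN rate
    precision   = tp / (tp + fp)          # positive predictive value
    f_measure   = 2 * recall * precision / (recall + precision)
    g_mean      = math.sqrt(recall * specificity)
    pp, pn = tp + fp, tn + fn             # predicted positives / negatives
    ap, an = tp + fn, tn + fp             # actual positives / negatives
    cc = (tp * tn - fp * fn) / math.sqrt(pp * pn * ap * an)
    return {"recall": recall, "specificity": specificity,
            "precision": precision, "F": f_measure,
            "G-mean": g_mean, "CC": cc}
```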
3
Experimental Setup and Results
For evaluating the performance of the program we used k-fold cross-validation. In this method the available data are divided into k approximately equal subsets. The method is then learned k times, using, in turn, each one of the k subsets as the testing set and the remaining k − 1 subsets as the training set. The estimated error is the average testing error over the k subsets. We have used a fairly standard value of k = 10. The proposed model was tested on chromosome 19 of the human genome. This chromosome has 1602 genes and a length of more than 55 million nucleotides. To make the problem feasible we randomly selected 10% of the whole chromosome, ensuring that all the genes were present. With this setup we performed the described evolutionary process. The evolution of each sequence was run for 1000 generations for the evolution of the structure and another 1000 generations for the evolution of the weights of the fitness function. Both populations had 200 individuals. The results obtained for exons and introns are shown in Table 1. The results are divided into three groups depending on the G+C content of the sequences. The table shows the correlation coefficient, sensitivity and specificity at the nucleotide level. The table also shows the percentage of exons correctly predicted, and the percentage of actual exons at least partially overlapped by predicted exons. The table shows the good behavior of the proposal. The values are comparable with others obtained with far more complex programs [11], using an approach that is simpler and more flexible. However, we must bear in mind that the objective of this work is not achieving a performance comparable with current gene
Table 1. Results of the algorithm for chromosome 19 at nucleotide level

G+C content     CC      Sn      Sp      Exons correct   Exons overlapped
Low G+C         0.491   0.494   0.542   0.227           0.487
Medium G+C      0.521   0.357   0.689   0.365           0.527
High G+C        0.655   0.467   0.778   0.256           0.678
recognition programs, but to show the potential of genetic algorithms as competitors to the other methods used for this task. Developing a whole recognition system based on evolutionary computation is now the next step, after showing these promising results.
4
Conclusions
In this paper, we have presented the first attempt at using evolutionary computation as the main tool for gene structure prediction. A simple system is presented in which no other search paradigm is used. The system achieves moderately good results at the nucleotide level. The flexibility of the method will provide a tool that can deal with alternative splicing, non-canonical functional sites, ignored stop codons, pseudo-genes, and any other issue that needs to be addressed in the search process. The proposed methodology opens a new field of application of genetic algorithms to gene structure prediction. Many new sources of evidence can be added to the system, as well as more sophisticated evolutionary methods. Additionally, we are also studying a multi-objective approach to the gene finding problem.
References 1. Burge, C., Karlin, S.: Prediction of Complete Gene Structures in Human Genomic DNA. Journal of Molecular Biology 268, 78–94 (1997) 2. Cort´es, C., Vapnik, V.: Support-vector networks. Machine Learning 20(3), 273–297 (1995) 3. Fickett, J.W., Hatzigeorgio, A.G.: Eukaryotic promoter recognition. Genome Research 7, 861–878 (1997) 4. Gerstein, M.B., Bruce, C., Rozowsky, J.S., Zheng, D., Du, J., Korbel, J.O., Emanuelsson, O., Zhang, Z.D., Weissman, S., Snyder, M.: What is a gene, post encode? History and updated definition. Genome Research 17, 669–681 (2007) 5. Gross, S.S., Do, C.B., Sirota, M., Batzoglou, S.: Contrast: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction. Genome Biology 16, R269.1–R269.16 (2007) 6. Poole II, F.L.P., Gerwe, B.A., Hopkins, R.C., Schut, G.J., Weinberg, M.V., Jenney, F.E., Admas, M.W.W.: Defining genes in the genome of the hyperthermophilic archaeon pyrococcus furiosus. Journal of Bacteriology 187, 7325–7332 (2005) 7. Knapp, K., Chen, Y.P.P.: An evaluation of contemporary hidden Markov model gene-finders with predicted exon taxonomy. Nucleic Acids Research 35(1), 317–324 (2007)
8. Kubat, M., Holte, R., Matwin, S.: Machine learning for the detection of oil spills in satellite radar images. Machine Learning 30, 195–215 (1998) 9. Lafferty, J., McCallum, A., Pereira, F.: Conditional random fields: probabilistic models for segmenting and labelling sequence data. In: Proceedings of the Eighteenth International Conference on Machine Learning, pp. 282–289 (2001) 10. Lewis, D., Gale, W.: Training text classifiers by uncertainty sampling. In: Proceedings of the Seventeenth Annual International ACM SIGIR Conference on Research and Development in Information, New York, USA, pp. 73–79 (1998) 11. Math´e, C., Sagot, M.F., Schiex, T., Rouz´e, P.: Current methods of gene prediction, their strengths and weaknesses. Nucleic Acids Research 30(19), 4103–4117 (2002) 12. Saeys, Y., Abeel, T., Degroeve, S., de Peer, Y.V.: Translation initiation site prediction on a genomic scale: beauty in simplicity. Bioinformatics 23, 418–423 (2007) 13. Siepel, A., Haussler, D.: Computational identification of evolutionarily conserved exons. In: Proceedings of the Eighth International Conference on Research in Computational Molecular Biology, pp. 177–186. ACM Press, New York (2007) 14. Snyder, E.E., Stormo, G.D.: Identification of protein coding regions in genomic DNA. Journal of Molecular Biology 248, 1–18 (1995) 15. Sun, Y., Kamel, M.S., Wong, A.K.C., Wang, Y.: Cost-sensitive boosting for classification of imbalanced data. Pattern Recognition 40, 3358–3378 (2007) 16. Zhou, Y., Liang, Y., Hu, C., Wang, L., Shi, X.: An artificial neural network method for combining gene prediction based on equitable weights. Neurocomputing 71, 538–543 (2008)
Prediction of Drug Activity Using Molecular Fragments-Based Representation and RFE Support Vector Machine Algorithm Gonzalo Cerruela García, Irene Luque Ruiz, and Miguel Ángel Gómez-Nieto University of Córdoba, Department of Computing and Numerical Analysis Campus de Rabanales. Albert Einstein Building, E-14071 Córdoba, Spain {gcerruela,iluque,mangel}@uco.es
Abstract. This paper describes the use of a support vector machine algorithm for the classification of molecule databases in order to predict the activity of drugs. The molecules in the database are fragmented, and each molecule is represented by the set of fragments it contains. Weighted molecular descriptors are tested for the representation of the molecular fragments, so that the dataset is represented as an M×F array in which each element takes the value of the weighted molecular descriptor calculated for the fragment. As weighted descriptors take into account the distances and heteroatoms present in the fragments, this representation space allows the discrimination of structurally similar fragments. A support vector machine algorithm is used for the classification process on a training set. The activity of the test set is then predicted from the results of the training stage and the application of a proposed heuristic. The results obtained show that the use of weighted molecular descriptors improves the prediction of drug activity for heterogeneous datasets. Keywords: Classification method, Support vector machine algorithm, molecular fragments, QSAR.
1 Introduction
Over the last decade, computer-assisted drug design has represented one of the latest applications of computers as tools in drug and pharmaceutical research. Drug design is a process in which scientists discover the relationships between the structure and the activity of compounds using data mining methods over molecular fragments. This analysis gives information about the relation between chemical properties and the biological activity of the compounds [1-4]. Commonly the chemical structure is represented using a great number of molecular descriptors and characteristics; however, the accuracy of the model prediction does not depend linearly on the number of descriptors that represent the chemical structures, but rather on the structural characteristics of the molecular fragments. This work describes the use of a support vector machine for the classification of molecules to predict the activity of drugs. The paper is organized as follows: Section 2 summarizes the materials and methods, Section 3 describes the experimental results, Section 4 presents the prediction process, and finally there is a conclusion section.
2 Material and Methods
In the experiments we have used a very heterogeneous dataset composed of 65 chemical compounds extracted from the Roche corporate compound library with measured hERG activities. This dataset was compiled by Sun [5], who proposes a clustering procedure using a Naive-Bayes classifier to determine the active and non-active compounds based on the use of atom-type frequencies and molecular descriptors. Many investigations have focused on a voltage-dependent potassium ion channel encoded by hERG (the human ether-à-go-go-related gene) [6,7], since almost all cases of drug-induced QT prolongation can be traced to the blockade of hERG. The QT interval of the electrocardiogram (ECG) is a widely used measure of the ventricular repolarization process. The prolongation of the QT interval is associated with an increased risk in surgeries. Drug-induced long QT syndrome has been identified as a critical side effect of non-cardiovascular drugs and has caused the withdrawal of many drugs from the market. Thus, removal of potential hERG blockers from the drug-discovery pipeline is an important issue for projects throughout all therapeutic areas.
Feature selection methods have been introduced to improve the classification performance of statistical learning methods and to select features that are meaningful in discriminating two data sets [8-10]. One approach, the recursive feature elimination (RFE) method, has gained reputation due to its effectiveness in discovering informative features or attributes in cancer classification and drug activity analysis [11]. In the present work the RFE method is used to rank the molecular fragments according to their importance for the activity classification. The theory of support vector machines (SVM) has been extensively described in the literature [12,13]. In this work the SVM is used as the statistical learning method for the prediction of molecular activity and as an appropriate platform to evaluate the effectiveness of feature selection methods.
Four topological molecular descriptors were used to weigh the molecular fragments. The A2 index is the sum of the linear combination of the molecular graph vertices weighted by the corresponding decimal adjacency vector; the Wiener index is defined as the sum of the topological distances of the molecular graph; the Harary index is a molecular topological index derived from the inverse distance matrix; and the Detour index is similar to the Wiener index but is based on the length of the longest distance, i.e., the maximum number of edges, between vertices of the molecular graph. A more detailed description of these molecular descriptors can be found in [14].
The classification of molecular fragments is based on a four-step process: 1) Molecules are fragmented using the JChem fragmenter library [15], selecting two different fragmentation grades. 2) For each molecular fragment a selected descriptor is calculated; four molecular descriptors (A2, Wiener, Harary, Detour) have been used. 3) A molecule-fragment matrix is created in which the element (i, j) represents the presence/absence (or a molecular descriptor) of the fragment in the molecule. 4) The RFE feature selection method with a linear kernel is applied over the molecule-fragment matrix to order the fragments according to their importance for the molecular activity classification. A sketch of this last step is given below.
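A minimal sketch of step 4 using scikit-learn's RFE wrapper around a linear-kernel SVM. The molecule-fragment matrix X and the activity labels y are assumed to come from steps 1-3; the file names are hypothetical placeholders, and this is not the code used by the authors.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_selection import RFE

# Hypothetical inputs: X is the M x F molecule-fragment matrix (presence/absence
# or descriptor values) and y holds the activity label of each molecule.
X = np.load("molecule_fragment_matrix.npy")
y = np.load("hERG_activity_labels.npy")

# Recursive feature elimination over a linear-kernel SVM ranks the fragments:
# at each step the fragment with the smallest SVM weight is removed.
rfe = RFE(estimator=SVC(kernel="linear"), n_features_to_select=1, step=1)
rfe.fit(X, y)

# ranking_ is 1 for the most important fragment; sorting gives the full ordering.
fragment_order = np.argsort(rfe.ranking_)
print("Most significant fragments:", fragment_order[:5])
```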
3 Experimental Results
Table 1 shows a summary of the experimental results considering values of 2 and 4 for the minimum size of fragments (MSF) parameter. Different representation spaces of the molecule dataset have been tested for the classification process, all of them based on M × F matrices, where M is the cardinality of the dataset and F is the number of fragments generated. In the content-type representation space, element (i, j) takes the value 1 if fragment j is present in molecule i. In the representation spaces based on molecular descriptors, element (i, j) takes the value of the descriptor.
In the fragmentation process the same fragment can be extracted from a molecule more than once. The analyses performed have shown that the consideration of redundant fragments in the representation spaces does not produce important changes in the results obtained. Thus, in Table 1 we show the total number of generated fragments without repetition for each value of MSF considered (171 for MSF = 2, and 163 for MSF = 4). Moreover, in Table 1 we also show the results obtained when the occurrence of fragments is considered (column B) or not considered (column A).
We observe, as expected, that the number of fragments generated is greater for an MSF of 2 than of 4; however, the increase in the number of fragments is not very significant. The main differences in the classification process for the different values of MSF are observed in the accuracy of the process. With the content-based matrix the accuracy obtained with one fragment is greater; however, more fragments are necessary to obtain 95% and 99% accuracy than with the descriptor-based matrices.

Table 1. Summary of the classification process for the different parameter values of the algorithm. (A) Not considering frequency of fragments, (B) considering frequency of fragments, (1) minimum accuracy, (2) number of fragments for 95% accuracy, (3) number of fragments for 99% accuracy.
        Content         A2              Wiener          Harary          Detour
        A       B       A       B       A       B       A       B       A       B
MSF = 2, total number of fragments = 171
(1)     0.8308  0.8308  0.8154  0.8154  0.8154  0.8154  0.8154  0.8154  0.8154  0.8154
(2)     24      24      20      21      21      20      22      22      21      20
(3)     35      35      103     103     119     119     124     125     119     119
MSF = 4, total number of fragments = 163
(1)     0.8308  0.8308  0.8154  0.8154  0.8154  0.8154  0.8154  0.8154  0.8154  0.8154
(2)     23      23      17      17      21      21      21      21      17      17
(3)     30      30      115     115     109     109     109     109     107     107
We observe that the number of fragments needed to achieve an accuracy of 95% is smaller for an MSF of 4 than of 2 (except for the Wiener index), and the number of fragments needed to achieve an accuracy greater than 99% is also smaller. Comparing columns A and B of Table 1, no differences are observed. Thus, we conclude that the consideration of fragment occurrences does not improve the process. The reason can be attributed to the characteristics of the dataset: the dataset is very
heterogeneous and repeated substructures are rarely contained in a single molecule, so only a few molecules contain redundant fragments. Besides this, no significant differences are observed in the classification process for the different molecular descriptors studied. For all descriptors the accuracy is very similar, as is the number of fragments needed to reach the 99% value. That is because all the molecular descriptors considered are entropy-based descriptors; they take into account the paths between the atoms to obtain the different matrices, considering distance metrics. As the Detour descriptor is calculated using a finer measurement of the distance between the atoms of the molecular structure, this descriptor shows a better behavior.

Table 2. The five most significant fragments and their contribution to the accuracy of the classification process for the different representation spaces of the dataset studied.
        Content          A2               Wiener           Harary           Detour
        ID  Accuracy     ID  Accuracy     ID  Accuracy     ID  Accuracy     ID  Accuracy
Not considering frequency of fragments. MSF = 2
        99  0.8308       61  0.8154       61  0.8154       61  0.8154       61  0.8154
        47  0.8462       17  0.8462       47  0.8462       47  0.8462       47  0.8462
        17  0.8462       47  0.8615       99  0.8615       109 0.8462       99  0.8615
        69  0.8462       99  0.8615       17  0.8615       126 0.8462       17  0.8615
        41  0.8462       52  0.8615       69  0.8615       17  0.8615       69  0.8615
Considering frequency of fragments. MSF = 2
        99  0.8308       61  0.8154       61  0.8154       61  0.8154       61  0.8154
        47  0.8462       17  0.8462       47  0.8462       47  0.8462       47  0.8462
        17  0.8462       47  0.8615       99  0.8615       17  0.8615       99  0.8615
        69  0.8462       99  0.8615       17  0.8615       99  0.8615       17  0.8615
        97  0.8462       69  0.8615       69  0.8615       69  0.8615       69  0.8615
Not considering frequency of fragments. MSF = 4
        77  0.8308       58  0.8154       58  0.8154       58  0.8154       58  0.8154
        48  0.8462       93  0.8154       77  0.8462       77  0.8462       48  0.8462
        93  0.8462       48  0.8462       127 0.8462       127 0.8462       104 0.8462
        127 0.8462       127 0.8462       48  0.8615       48  0.8615       116 0.8462
        128 0.8462       77  0.8615       93  0.8615       93  0.8615       77  0.8615
Considering frequency of fragments. MSF = 4
        77  0.8308       58  0.8154       58  0.8154       58  0.8154       58  0.8154
        48  0.8462       93  0.8154       77  0.8462       77  0.8462       48  0.8462
        93  0.8462       48  0.8462       127 0.8462       127 0.8462       104 0.8462
        127 0.8462       127 0.8462       48  0.8615       48  0.8615       116 0.8462
        128 0.8462       77  0.8615       93  0.8615       93  0.8615       77  0.8615
In Table 2 we show the five most significant fragments obtained in the classification process for the different parameters considered. We again observe similar behavior for the different molecular descriptors. In all cases, when MSF
is equal to 2, the most significant fragment is ID = 99 for the content matrix and ID = 61 for the descriptor matrices. Other fragments, such as 17, 47 and 69, are among the five most significant for almost all descriptors, although their order of significance differs. Some fragments are only selected for some of the descriptors, i.e., 52 for the A2 index, 109 and 126 for Harary, and 69 for the Detour index. The results obtained when the content matrix is used are different. When occurrences are considered for MSF = 2, the list of the most significant fragments is very similar for all data matrices used. The differences between the content matrix and the descriptor matrices arise because, when the content matrix is used in the classification process, only the presence or absence of a fragment in each molecule of the dataset is considered, but no information about size or structural characteristics such as cycles, heteroatoms, etc. Similar behavior is observed when MSF = 4; however, the significant fragments are completely different.

Table 3. Structures of the five most significant fragments for the Detour index, without considering the occurrences of fragments, for the dataset studied.
ID (MSF = 2)    ID (MSF = 4)
61              58
47              48
99              104
17              116
69              77
(The corresponding fragment structure drawings are shown in the original table.)
Table 3 shows the five most significant fragments considered in Table 2. We observe that the significant fragments are those presenting heteroatoms and complex cyclic structures. Large structures determine the kind of chemical compound and therefore its characteristics related to the activity. Furthermore, small functional groups, heteroatoms and cycles with heteroatoms determine the activity of the kernel structures. This knowledge is the foundation of drug design, in which the substitution of a specific functional group improves or diminishes the activity of a kernel structure taken as the nucleus of a family of drugs.
Fig. 1. Results of the classification process for the Detour index, MSF = 2 in the upper figure and MSF = 4 in the lower figure; in both cases the occurrence of fragments has not been considered
Fig. 1 shows the results of the classification. The Y axis represents the molecules of the dataset; the positive side represents the molecules that are active with respect to the property considered, and the negative side the non-active ones. The X axis represents the generated fragments, ordered by their identification codes so that the results can be compared. As commented above, no important differences are found among the different representation spaces considered for a given value of MSF, except for the fragment that determines the activity characteristics of the dataset. However, when we compare different values of MSF we observe more points for MSF = 2 than for 4. That is because more fragments are taken into account to classify the molecules when the MSF is 2 rather than 4. When MSF is 4, the number of fragments is lower, and therefore the number of molecules containing those fragments is also lower; the fragments are bigger and carry enough information to distinguish, or not, the activity of the molecule. Analyzing Fig. 1 we can observe that different situations can arise in the classification process:
• A fragment is only assigned to one molecule.
• A molecule is only related to one fragment.
• A fragment makes all the molecules in which it is present active.
• A fragment makes one set of molecules active but not another.
In Fig. 1 we observe that fragment 1 is only present in molecule 8 of the dataset (COOH-CH2OH, the same for MSF = 2 and 4). This molecule is non-active, although this characteristic can be provided by fragment 1 as well as by other fragments present in the molecule (32, 65, 87, 98, and 112). Thus, the information provided by fragment 1 is not valuable for the prediction process performed later. When a molecule is only represented by one fragment (e.g., molecule 14 and fragment 2 for MSF = 4), we can never use this information for the development of predictions; in this case the molecules are complex cyclic structures which are not fragmented by the algorithm. In most situations a fragment makes all the molecules containing that fragment active. That is the expected and desired condition: such molecules and fragments are ideal for the development of predictions. However, some fragments are present in molecules making some active and others non-active. Examples are fragment 32 (for MSF = 2) or fragment 26 (for MSF = 4), among others. In this case those fragments cannot be used for the development of predictions, since the presence of other substituents is responsible for the activity, or not, of the molecule. Thus, we observe for MSF = 4 that fragment 26 exists in molecules 8, 11, 65, and 66. Molecule 8 is non-active and also contains fragments 1, 82, 108, 135, 136 and 167, which only exist in this molecule, making it non-active; therefore fragment 26 does not influence the activity of molecule 8. Fragment 26 is also contained in molecule 11, which does not contain any other fragment, so fragment 26 makes this molecule non-active. Hence we can argue that molecule 11 and fragment 26 cannot be used for the activity prediction. Fragment 26 is also contained in molecule 64, which contains fragments 123 and 140; these fragments are only contained in molecule 64 and they are responsible for the activity of this molecule (a similar argument can be made for molecule 65).
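The case analysis above can be reproduced mechanically from the molecule-fragment data. The sketch below uses hypothetical inputs (a mapping from each molecule to its fragment IDs and a mapping from each molecule to its activity) rather than the authors' data structures.

```python
from collections import defaultdict

def classify_fragments(fragments_of, is_active):
    """Label each fragment 'consistent' if every molecule containing it shares
    the same activity, or 'ambiguous' if it appears in both active and
    non-active molecules (and therefore cannot be used for prediction)."""
    molecules_of = defaultdict(set)
    for molecule, fragments in fragments_of.items():
        for fragment in fragments:
            molecules_of[fragment].add(molecule)

    verdict = {}
    for fragment, molecules in molecules_of.items():
        activities = {is_active[m] for m in molecules}
        verdict[fragment] = "consistent" if len(activities) == 1 else "ambiguous"
    return verdict
```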
4 Results of the Prediction Process
Validation of our proposal has been carried out by selecting training and validation subsets of molecules from the dataset studied. The dataset is composed of 65 molecules; 85% of them (55 molecules) were randomly selected for training and the remaining 10 molecules formed the validation set. In order to keep the same proportion of active and non-active molecules in training and test, 44 of the 52 active molecules and 11 of the 13 non-active ones were randomly selected for the training set. The remaining molecules (8 active and 2 non-active) were left for the test.
In the validation stage, the training set of 55 molecules was fragmented and a classification process was carried out following the algorithm described in Section 2. As a result of this process, a matrix V of size F×T was built, F being the number of different fragments extracted and T the number of molecules of the training set (55). Element V(i, j) of this matrix represents whether or not fragment i makes molecule j active. Moreover, an array A of size F is retained, where each element A(i) stores the contribution of fragment i to the total accuracy of the activity prediction. The information in matrix V is used to determine the precision in the test analysis.
Having generated the matrix V and the array A in the training stage, the test sets were fragmented and again a fragment/molecule matrix was built. In the prediction stage, the following situations must be considered:
− None of the fragments of a test molecule exist in the matrix V. In this case the activity of the molecule cannot be predicted: the structure of this molecule has little similarity to the structures of the training set, and therefore we do not have enough information to carry out the prediction.
− All the fragments of a test molecule exist in the matrix V, and all of them make all the training molecules in which they are present active (or non-active). In this case the activity of the molecule can be predicted: we predict that this molecule is active (or non-active) based on the behavior of all the fragments present in it for the training set.
− Some fragments of a test molecule do not exist in the matrix V but others do. In this case, the fragments that do not belong to the matrix V are not considered.
− Finally, the fragments of a test molecule that are present in the matrix V are considered, and some of them make a molecule of the training set active while others make it non-active. In this case the activity, or non-activity, of the test molecule is predicted as follows:

p = [ Σ_{i=1..m} Σ_{j=1..n} V(i, j) · A(i) ] / [ Σ_{i=1..m} A(i) ]    (1)
where m is the number of fragments of the test molecule that are considered, n is the number of molecules of the training set containing fragment i, V(i, j) takes the value 1 or −1 depending on whether fragment i makes molecule j of the training set active or non-active, and A(i) is the accuracy contribution of fragment i in the training stage.
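The rule can be sketched as follows; V and A are the matrix and array built in the training stage (represented here as dictionaries), and the normalization mirrors the reconstruction of Eq. (1) above, so it should be read as an assumption rather than the authors' exact code.

```python
def predict_activity(test_fragments, V, A):
    """Score a test molecule from its fragments.

    `V[i][j]` is +1 or -1 when training fragment i makes training molecule j
    active or non-active, and `A[i]` is the accuracy contribution of fragment i
    from the training stage.  Fragments never seen in training are skipped.
    """
    known = [i for i in test_fragments if i in A]
    if not known:
        return "?", 0.0                      # no usable training evidence
    numerator = sum(V[i][j] * A[i] for i in known for j in V[i])
    denominator = sum(A[i] for i in known)
    p = numerator / denominator
    if p > 0:
        return "active", p
    if p < 0:
        return "non-active", p
    return "?", p                            # p = 0: no decision possible
```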
Table 4. Results of the validation stage.

Molecule ID   Fragments                               p value   Prediction   Activity
3             3, 101, 122                              2.51     Active       Active
32            72, 110                                  1        Active       Active
37            25, 65, 153                              0.12     Active       Active
43            18, 61, 70, 94, 100, 145                -1.95     Non active   Non active
49            62, 107                                 -1        Non active   Non active
55            77, 164                                  0        ?            Active
57            11, 29, 64, 65, 67, 81, 98, 107, 154     0.15     Active       Active
60            9, 86, 94, 131, 140                      0.08     Active       Active
61            82, 91                                   1        Active       Active
64            32, 68, 94, 134, 150, 167                0.8      Active       Active
Table 4 shows the results for one round of the validation process. The dataset contains 13 non-active molecules, and two of them correspond to the 15% extracted for the test set. The deviations observed were due to the presence of fragments in the molecules of the test set that are not contained in the molecules of the training set; for example, molecule 55 has only two fragments (77 and 164) and, as Fig. 1 shows, in the whole dataset these fragments belong only to molecule 55, making it impossible to carry out the prediction.
5 Conclusion
The classification of molecular databases using the molecular-fragment representation space, weighing each element with a specific molecular descriptor, constitutes a useful tool for discovering the relationship between the structure and the activity of compounds. Besides the information relative to the presence or absence of certain fragments in the molecules of the dataset, it is possible to extract information that relates this presence to the biological activity of the fragments. The use of the RFE feature selection method, together with classification using a support vector machine, allows us to predict the biological activity of molecules considering the influence of their fragments on other molecules.
References 1. Ghuloum, A.M., Sage, C.R., Jain, A.N.: Molecular Hashkeys: A Novel Method for Molecular Characterization and Its Application for Predicting Important Pharmaceutical Properties of Molecules. Journal of Medicinal Chemistry 42(10), 1739–1748 (1999)
2. Cross, S., Baroni, M., Carosati, E., Benedetti, P., Clementi, S.: FLAP: GRID Molecular Interaction Fields in Virtual Screening. Validation using the DUD Data Set. J. Chem. Inf. Model. 50(8), 1442–1450 (2010) 3. Culp, M., Johnson, K., Michailidis, G.: The Ensemble Bridge Algorithm: A New Modeling Tool for Drug Discovery Problems. J. Chem. Inf. Model. 50(2), 309–316 (2010) 4. Luque Ruiz, I., Cerruela García, G., Gómez-Nieto, M.A.: Representation of the Molecular Topology of Cyclical Structures by Means of Cycle Graphs. 3. Hierarchical Model of Screening of Chemical Databases. J. Chem. Inf. Comput. Sci. 44, 1903–1911 (2004) 5. Sun, H.: An Accurate and Interpretable Bayesian Classification Model for Prediction of hERG Liability. Chem. Med. Chem. 1(3), 315–322 (2006) 6. Zhou, J., Augelli-Szafran, C.E., Bradley, J.A., Chen, X., Koci, B.J.: Novel Potent hERG Potassium Channel Enhancers And Their In Vitro Antiarrhythmic Activity. Molecular Pharmacology 68(3), 876–884 (2005) 7. Chen, W.-h., Wang, W.-y., Zhang, J., Yang, D., Wang, Y.-p.: State-dependent blockade of human ether-a-go-go-related gene (hERG) K + channels by changrolin in stably transfected HEK293 cells. Acta Pharmacologica Sinica 31, 915–922 (2010) 8. Bayada, D.M., Hamersma, H., van Geerestein, V.J.: Molecular diversity and representativity in chemical databases. J. Chem. Inf. Comput. Sci. 39, 1–10 (1999) 9. Furlanello, C., Serafini, M., Merler, S., Jurman, G.: An accelerated procedure for recursive feature ranking on microarray data. Neural Networks 16, 641–648 (2003) 10. García-Pedrajas, N., Ortiz-Boyer, D.: A cooperative constructive method for neural networks for pattern recognition. Pattern Recognition 40(1), 80–99 (2007) 11. Guyon, I., Weston, J., Barnhill, S., Vapnik, V.: Gene selection for cancer classification using support vector machines. Mach. Learn. 46, 389–422 (2002) 12. Vapnik, V.N.: The nature of statistical learning theory. Springer, New York (1995) 13. Burges, C.J.C.: A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Disc. 2, 127–167 (1998) 14. Todeschini, R., Consonni, V.: Handbook of Molecular Descriptors. Wiley-VCH (2000) 15. JChem 5.4.0.0, ChemAxon (2010), http://www.chemaxon.com
A Hybrid Video Recommendation System Using a Graph-Based Algorithm Gizem Öztürk and Nihan Kesim Cicekli Department of Computer Engineering Middle East Technical University, Ankara, Turkey
[email protected],
[email protected]
Abstract. This paper proposes the design, development and evaluation of a hybrid video recommendation system. The proposed hybrid video recommendation system is based on a graph algorithm called Adsorption. Adsorption is a collaborative filtering algorithm in which the relations between users are used to make recommendations. In this paper, the Adsorption algorithm is enriched with content-based filtering to provide better suggestions; thus, the collaborative recommendations are strengthened by considering item similarities. The resulting hybrid system therefore combines both collaborative and content-based approaches to produce more effective suggestions. Keywords: Recommendation systems, collaborative filtering, content based filtering, information extraction.
1 Introduction
Recommendation systems aim to overcome the difficulty of finding relevant information; available systems try to help users find the most relevant data they want. There are recommendation systems in different domains. For instance, Amazon.com recommends books in the book domain; Last.fm helps users find the songs they want to listen to; MovieLens tries to guide users to the movies they might like; and Netflix, which also aims to suggest relevant matches to its customers, provides a large number of movies and TV shows. Former research work was based on the idea of rating prediction only; in other words, the problem was seen as guessing the ratings that users would give to unrated items. Recent research deals with more complex prediction approaches. Especially with the improvement of information technologies, recommender systems make use of techniques such as information retrieval, user modeling and machine learning. Recommender systems can be broadly divided into three categories according to the approach they use to make recommendations: content-based recommendation, collaborative recommendation and hybrid recommendation [2]. In content-based recommendation, items are suggested according to their similarity to the items the user selected before. In collaborative recommendation, items are suggested according to the similarity between users with similar habits. Hybrid systems combine these methods to obtain better performance.
Adsorption [3] is a collaborative filtering algorithm which is already applied to YouTube successfully. In YouTube, there are millions of videos available and users can state whether they like the video or not. Adsorption uses this rating information and tries to reach unrated videos using a graph-based algorithm. The newly reached videos are suggested to users as new recommendations. Adsorption algorithm [3] is among the new generation graph-based collaborative filtering methods. However, this method is not used together with content-based recommendation before. In this paper, the results of Adsorption algorithm are improved by adding content-based filtering to obtain more accurate suggestions. The main contribution of this work [1] is improving the results of Adsorption algorithm by injecting content-based similarities between videos for the purpose of enhancing recommendations. In addition to videos in YouTube, Adsorption algorithm is also applied to movie domain in MovieLens. The rest of the paper is organized as follows. Section 2 explains the main approach that is used for the development of the hybrid recommendation system. Section 3 describes experiments and evaluation approaches which are used to evaluate the system. Finally, Section 4 concludes the paper and discusses possible future work.
2 A Hybrid Video Recommendation System
The hybrid recommendation system developed in this paper is an application which aims to select appropriate videos or movies for users. The developed recommendation system can be used for both YouTube and MovieLens [1]. Recommendations are made according to both collaborative and content-based features: first, ratings are predicted according to collaborative relations, and then content-based (CB) features are injected to provide a hybrid system. The general system architecture is presented in Fig. 1
Fig. 1. General System Architecture
The distribution list consists of users and the list of videos for each user. The items in the video list are found to be related to that user and can be recommended accordingly. The item similarities table shows the similarity value of each pair of items.
2.1 Item and User Modeling In the proposed hybrid recommendation system, first collaborative filtering (CF) is applied; then the content based approach is injected to the results. The input to the CF should be a graph. In this graph, users and items are represented as the nodes. At the beginning items and item ratings are structured together as item-rating pairs. Then, these objects are used in order to model users. Obtained user objects contain user names and a list of item-rating pairs. For each user a graph node is constructed. While examining the list of item-rating objects, a graph node is inserted for each distinct item. Weighted edges are added between nodes considering the ratings that are given to the items by the users. It should be noted that all user names and video IDs are unique in the system. 2.2 YouTube Information Extractor YouTube does not provide a database that can be used as an evaluation data set. Instead an API [4] is provided in order to help developers to implement client applications. We used this Java API to retrieve the necessary data to construct our data set. The extracted data includes user information, such as user name, list of watched and rated videos, and given ratings. Periodically, the system checks for updates in user information and inserts new data accordingly. This enables the data to stay up-to-date. The information extractor consists of three main modules which are video, user and rating fetcher. Since the list of YouTube users is not readily available via YouTube API, various videos are visited as a first step to collect user data. There are standard feeds such as top_rated, most_viewed, top_favorites, most_popular, which are provided by YouTube. The returned feeds are in xml format. Gathered feeds are examined and newly obtained video IDs are stored. The obtained xml files contain video information including a feed link for comments of the corresponding video. Comment feed is retrieved because it contains the list of users who share their opinions about the video. The final step is getting ratings of users. In YouTube, each user has their events feed. If the users agree to share their activities, these feeds can be retrieved from the YouTube API. Activity feeds contain information such as rated videos, favourite videos and commented videos. 2.3 Recommender The proposed recommender system uses both collaborative filtering and content based approaches in order to provide suggestions. Collaborative filtering forms the predictions for the movies and content based approach aims to improve the obtained results. Pure Collaborative Filtering Approach. In this paper, a graph-based collaborative filtering algorithm, Adsorption [3], is used. It is a general framework in which there are both labeled and unlabeled items and it can be used for classification and learning. The basis of the algorithm is giving labels to the unlabeled items using labeled items in the graph structure. The versions of Adsorption algorithm are ‘Adsorption via Averaging’, ‘Adsorption via Random Walks’ and ‘Adsorption via Linear Systems’.
According to the theorem given in [3], all three versions of the Adsorption algorithm are equivalent. In this work 'Adsorption via Averaging' is used, due to memory and time issues. The main idea in 'Adsorption via Averaging' is to forward labels from the labelled items to the neighbouring items, and to have the neighbours save the received labels. The important part of the algorithm is to make sure that the important information is kept while guaranteeing convergence with a reasonable number of label assignments. More formally, it can be explained as follows [3]. A graph G = (V, E, w) is given, where V is the set of vertices, E denotes the set of edges, and w: E → R denotes a non-negative weight function on the edges. L denotes a set of labels. Assume each node v in a subset V_L ⊂ V carries a probability distribution L_v on the label set L; V_L represents the set of labelled nodes. At this point some pre-processing is necessary: for each vertex v ∈ V_L, a shadow vertex ṽ is created with exactly one outgoing neighbour v, i.e., ṽ and v are connected by an edge with weight 1. The time complexity of the algorithm is O(n²), and its pseudo-code is as follows:

Input: G = (V, E, w), L, V_L
repeat
    for each v ∈ V ∪ Ṽ do:
        let L_v = Σ_u w(u, v) L_u
    end-for
    Normalize L_v to have unit L1 norm
until convergence
Output: distributions {L_v | v ∈ V}
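A compact sketch of 'Adsorption via Averaging' over such a graph. The graph is an adjacency map with edge weights, label distributions are dictionaries from labels to probabilities, and shadow nodes keep the injected distributions fixed. The depth limit and the pruning of the label distribution list described below are omitted, so this is an illustration rather than the system's implementation.

```python
def adsorption_via_averaging(graph, seed_labels, iterations=20):
    """graph: {node: {neighbour: weight}}; seed_labels: {node: {label: prob}}."""
    dist = {}
    # Pre-processing: one shadow node per labelled node, connected with weight 1.
    for v, labels in seed_labels.items():
        shadow = ("shadow", v)               # assumes ordinary node ids are not tuples
        graph.setdefault(v, {})[shadow] = 1.0
        graph[shadow] = {}                   # shadows only inject, never update
        dist[shadow] = dict(labels)

    for _ in range(iterations):              # or iterate until convergence
        new_dist = {}
        for v, neighbours in graph.items():
            if isinstance(v, tuple) and v[0] == "shadow":
                new_dist[v] = dist[v]        # shadow distributions never change
                continue
            acc = {}
            for u, weight in neighbours.items():
                for label, prob in dist.get(u, {}).items():
                    acc[label] = acc.get(label, 0.0) + weight * prob
            total = sum(acc.values())
            new_dist[v] = {l: p / total for l, p in acc.items()} if total else {}
        dist = new_dist

    # Return only the distributions of the real (non-shadow) nodes.
    return {v: d for v, d in dist.items()
            if not (isinstance(v, tuple) and v[0] == "shadow")}
```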
In order to apply the algorithm, the first step is to create the user-view graph. Considering effective usage of memory and processor, videos which have a rating lower than the decided threshold are pruned and not added to the graph. As it is aimed to find the preferences, this threshold value is set to 4. That is ratings greater or equal than 4 mean a certain choice of the user. After the pruning step, a shadow node is created for each user and video, which is the end of the graph construction part. User-view graph helps reaching related videos using the connections between users and videos. Starting from a user, traverse is done to watched videos firstly and then to other people who watched those videos and so on. Each node of the graph is traversed one by one and its label distribution list is updated according to its neighbours. First, the label distribution list of the current node is cleared. Then, this list is reconstructed by traversing its neighbours and copying their label distribution lists. The edge weight between the current node and its neighbour is also taken into account in this process. This copying process is continued with the neighbour of the neighbour of the current node and so on. While going deeper, the effect of labels reduces dramatically and time and memory constraints become crucial. For this reason, the system uses only the first
3 levels of the neighbour label distributions; that is, after level 3 the reached nodes are not very related to the original node, so they do not provide a concrete benefit. The size of the label distribution list limits the labels that will be carried to the next iteration. It is accepted that the items beyond the upper bound of the label distribution list have less importance, so they are called poor labels. Therefore, after the label distribution list is formed, it is sorted and the poor labels are deleted from the list. This process continues until the label distribution lists of all nodes converge; to be more precise, whenever the label distribution lists of all nodes remain the same over an iteration, the algorithm terminates.
Injection of Content-Based Methods into Collaborative Filtering. To increase the strength of the recommendations, content-based filtering is added to the results obtained by collaborative filtering. The content-based method used in this paper recommends videos/movies to the users that are similar to the ones obtained as a result of the Adsorption algorithm. The aim is to suggest different but still relevant items to the users. Content-based filtering is added by using item similarities: the collaborative results are sorted by relevance and the less relevant results are replaced with content-based similarity results.
Item similarities for videos in YouTube. In the YouTube API there is a feed which retrieves the videos related to a specific one. When this feed is retrieved, the list of related videos is gathered. If a related video is already in the recommendation list, another related item is added to the recommendation list.
Item similarities for movies in MovieLens. The similarities between movies can be found according to their features, such as year, actors, genre, etc. However, in the MovieLens database only basic information such as movie name, year, genre and IMDb URL is provided, so it is necessary to gather more detailed movie information from IMDb, which stores extra information about movies such as movie kind, writer list, cast list, country, language, company and keywords. In the movie domain it is not reasonable to give the same importance to all attributes; the writer, genre or country of a movie cannot have the same significance for a movie to be preferred. Therefore the importance values of the features have to be decided. This problem is studied in [6], where feature weights for movies are determined experimentally. In addition, [6] defines similarity with the equation:
S(O_i, O_j) = w_1 f(A_1i | A_1j) + w_2 f(A_2i | A_2j) + ... + w_n f(A_ni | A_nj)

According to the equation, S describes the similarity between objects O_i and O_j, where w_n is the weight applied to the similarity between object attributes A_n, and the difference is calculated by the function f(A_ni | A_nj). Table 1 shows the feature weight values as determined in [6]. The related videos of a movie are found by using these values and the IMDb database. As a result of the Adsorption algorithm, a distribution list is obtained, which is intended to be used as the recommendation list itself. The worse half of the results is deleted from the
distribution list of the user. As a result of calculating item similarities, new items are added to the recommendation list of the active user. Therefore, the recommendation list contains items from both collaborative filtering and content-based filtering, providing a hybrid recommendation to the user.

Table 1. Feature Weight Values

Feature     Mean
Type        0.18
Writer      0.36
Genre       0.04
Keyword     0.03
Cast        0.01
Country     0.07
Language    0.09
Company     0.21
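A sketch of the weighted similarity above using the mean weights of Table 1; the per-attribute comparison f is approximated by the Jaccard overlap of the attribute value sets, which is our simplification and not the exact formulation of [6].

```python
FEATURE_WEIGHTS = {   # mean weights taken from Table 1
    "type": 0.18, "writer": 0.36, "genre": 0.04, "keyword": 0.03,
    "cast": 0.01, "country": 0.07, "language": 0.09, "company": 0.21,
}

def attribute_overlap(values_i, values_j):
    """Jaccard overlap of two attribute value sets (an assumed choice for f)."""
    a, b = set(values_i), set(values_j)
    return len(a & b) / len(a | b) if (a | b) else 0.0

def movie_similarity(movie_i, movie_j):
    """Weighted sum S(Oi, Oj) over the IMDb attributes of two movies.

    Each movie is a dict mapping a feature name to the list of its values,
    e.g. {"genre": ["Drama"], "writer": ["..."], ...}.
    """
    return sum(weight * attribute_overlap(movie_i.get(feat, []),
                                           movie_j.get(feat, []))
               for feat, weight in FEATURE_WEIGHTS.items())
```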
3 Experiments and Evaluation This section presents the experiments that were carried out in order to evaluate the performance of the system. 3.1 Datasets Two different datasets are used in order to evaluate the proposed system. These are YouTube data set and MovieLens data set. YouTube dataset is formed by the help of the information extractor. The task of collecting data for our database continued nearly four months. Resulting dataset includes 177733 ratings, 117604 videos and 15090 users. As the values indicate, the YouTube dataset is very sparse. Selected MovieLens dataset has 100,000 ratings for 1682 movies by 943 users [7]. IMDb data is used and movie features are also taken into consideration while finding item similarities for MovieLens. 3.2 Evaluation Metrics This paper focuses on evaluating the effectiveness of results. In order to evaluate effectiveness, precision and recall are among the most preferred metrics. Precision is the ratio of the number of relevant items which are retrieved to the total number of retrieved items [9]. Recall is the ratio of the number of relevant items which are retrieved to the total number of relevant items [9]. F-measure is also a metric for evaluation which combines precision and recall. Actually F-measure is the harmonic mean of precision and recall. In this paper precision, recall and F-Measure values are calculated in order to evaluate the system performance. Evaluation is done for a subset of existing data and using remaining part as training. There are various parameters that may be changed in order to examine results in different perspectives. These are U ,Y , β , γ , δ and their explanations are given in the following.
Parameters U and Y. Different user groups are formed according to the number of ratings they gave to items. U denotes the user groups for MovieLens users and Y denotes the user group for YouTube users. Because of the high data sparsity of YouTube, only one group of users is formed; this group contains 20 users and the average number of ratings in the group is 70. On the other hand, in the MovieLens dataset three user groups, U1, U2 and U3, are formed according to their average number of ratings. The details of both the YouTube and MovieLens user-sets are shown in Table 2.

Table 2. Test User Groups

User Group              Y    U1    U2    U3
Average # of ratings    70   250   150   60
Parameter β . The parameter β denotes the depth value. It represents how deep the Adsorption algorithm goes in the user-view graph. 3 is selected for this parameter because of time constraints. Parameter γ . It is the size of the label distribution list. Since increasing this parameter also increases the memory usage dramatically, an upper bound value of 40 is selected for its maximum value. On the other hand, there must be a sufficient number of recommendations in order to evaluate the recommendation system properly. So, lower boundary of this parameter is set to 20. Intermediate values are also considered to see the effect of this parameter on overall evaluation. Therefore, calculations are done for five different γ values. These are 20, 25, 30, 35 and 40. Parameter δ . This is the threshold value of ratings. While traversing the videos that are rated by a user, related video is added as a video node only if its rating is equal to or higher than the value of δ . It is assumed that, users give ratings above 3 (in a 1 to 5 rating system) to videos they like. Because of this, 4 is selected for this parameter. 3.3 Experiments Results of both pure CF and the hybrid system are presented in this section. As Adsorption is affected very much from sparsity, content based approach gives a chance to increase the quality of suggestions. Effects of this can be seen in results. YouTube Experiments. In first experiment, the effectiveness of pure CF system is evaluated using YouTube data. For user-set Y calculations are done and the results are presented in Table 3. According to Table 3, while the values for precision are increasing, recall values are decreasing. As it can be observed, especially recall values are very low. This happens because of data sparsity. It can also be deduced that recall is directly proportional to γ values whereas precision is inversely proportional to γ.
Table 3. YouTube Test Results with pure CF System
                          User Group Y
γ-value        20         25         30         35         40
precision      0.255556   0.220311   0.173333   0.171984   0.171429
recall         0.046589   0.051023   0.057757   0.070678   0.077599
F-Measure      0.07881    0.082857   0.086644   0.100184   0.106837
The second experiment is done in order to see the effect of content-based filtering over the existing CF system. The results are obtained using the YouTube dataset. Table 4 shows the results.

Table 4. YouTube Test Results with Hybrid System
                          User Group Y
γ-value        20         25         30         35         40
precision      0.20744    0.184444   0.161333   0.122381   0.101429
recall         0.076589   0.081209   0.089757   0.118986   0.159599
F-Measure      0.111873   0.112767   0.115344   0.12066    0.124032
Fig. 2. F-measure vs. γ (Y) Comparison
Fig. 2 demonstrates the relationship between γ and the F-measure. It can be concluded that the F-measure tends to increase with increasing γ values. As seen from the results, the hybrid system curve has a form similar to that of the pure CF curve; it can also be seen that the hybrid system has higher values, which means that the hybrid system performs better than the pure collaborative system when YouTube data is used.
MovieLens Experiments. This part of the experiments evaluates the effectiveness of the system using the MovieLens dataset. For each user group two tests are done (one for pure CF and one for the hybrid system); therefore MovieLens involves six different experiments. The results of the first experiment, for the pure CF system using MovieLens user-set U1, are shown in Table 5. The results for the hybrid system using MovieLens user-set U1 are given in Table 6.
Table 5. MovieLens Group U1 Test Results with pure CF System
                          User Group U1
γ-value        20          25          30         35         40
precision      0.9373062   0.9327172   0.925128   0.918103   0.908077
recall         0.1147386   0.1198197   0.126901   0.150324   0.167748
F-Measure      0.2044499   0.2123591   0.223187   0.258348   0.283184
Table 6. MovieLens Group U1 Test Results with Hybrid System
                          User Group U1
γ-value        20          25          30         35         40
precision      0.8006507   0.7562651   0.73188    0.730212   0.724491
recall         0.1999656   0.2285986   0.273631   0.309294   0.379793
F-Measure      0.320008    0.3516076   0.398335   0.434534   0.498344
It can be inferred from Fig. 3 that the F-measure increases as the size of the distribution list increases.
Fig. 3. F-measure vs. γ (U1) Comparison
For all user groups U1, U2 and U3, the precision, recall and F-measure values do not change very much, and for each user group the precision, recall and F-measure graphs follow similar patterns. Therefore there is no clear relation that depends only on the user groups, which shows that Adsorption is insensitive to them; the CF system gives coherent results for all user groups. In all figures, the hybrid curves have higher values than the pure CF curves. This means that more accurate results are obtained by inserting the CB approach into the CF approach. So, it can be said that considering item similarities and applying the CB filtering approach improves the results of the recommendation system.
4 Conclusion and Future Work
In this paper, a hybrid recommendation system is presented. The system uses both collaborative filtering and content-based recommendation techniques. Its base is the collaborative part, which uses a graph-based algorithm called Adsorption. Content information is retrieved from both IMDb and YouTube and is used in order to propose a better system.
The enhancement is done on the distribution list retrieved from the collaborative filtering. To make use of content-based approaches, item-item similarities are found; according to the similarity results, new movies which are not included in the result of the collaborative recommendation are inserted into the list and recommended to the user. The results of the experiments show that the hybrid system performs better on recommendations than the pure collaborative algorithm. It is also found that the system gives more successful results when the MovieLens dataset is used, which means that good results are obtained when the data is not sparse. As future work the system can be extended so that it can give appropriate suggestions to users even with sparse data. Besides this, the recommendation system proposed in this paper works offline and makes offline predictions; considering the video domain, the next step can be to integrate this system into an online organization where users watch videos online.
Acknowledgments
This work is partially supported by The Scientific and Technical Council of Turkey Grant "TUBITAK EEEAG-107E234".
References 1. Ozturk, G.: A Hybrid Video Recommendation System Based on a Graph Based Algorithm, Master Paper, Middle East Technical University (2010) 2. Adomavicius, G., Tuzhilin, A.: Towards the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. IEEE Transactions on Knowledge and Data Engineering 17(6) (June 2005) 3. Balabanovic, M., Yoav Shoham, F.: Content-based, collaborative recommendation(Special Section: Recommender Systems). Communications of the ACM 40(66) (1997) 4. Baluja, S., Seth, R., Sivakumar, D., Jing, Y., Yagnik, J., Kumar, S., Ravichandran, D., Aly, M.: Video Suggestion and Discovery for YouTube: Taking Random Walks Through the View Graph. In: The Proceedings of WWW (2008) 5. YouTube API, http://code.google.com/apis/youtube/ 6. Debnath, S., Ganguly, N., Mitra, P.: Feature weighting in content based recommendation system using social network analysis. In: WWW (2008) 7. MovieLens, http://www.movielens.org/ 8. van Rijsbergen, C.J.: Information Retrieval, 2nd edn. Butterworth, London (1979) 9. Harman, D., Candela, G.: Retrieving Records from a Gigabyte of Text on a Minicomputer Using Statistical Ranking. Journal of the American Society for Information Science (December 1990)
A Diagnostic Reasoning Approach to Defect Prediction
Rui Abreu 1,2, Alberto Gonzalez-Sanchez 3, and Arjan J.C. van Gemund 3
1 Dept. of Informatics Engineering, University of Porto, Portugal
2 School of Computer Science, Carnegie Mellon University, USA
[email protected]
3 Software Technology Dept., Delft University of Tech., The Netherlands
{a.gonzalezsanchez,a.j.c.vangemund}@tudelft.nl
Abstract. During software testing, defect prediction approaches measure the current reliability status, forecast future program failures, and provide information on how many defects need to be removed before shipping. Existing approaches often require faults to be detected and identified as new ones before a model-based trend can be fitted. While failures may frequently occur during regression testing, it is not evident which are related to new faults. Consequently, reliability growth trending can only be performed in sync with fault identification and repair, which is often performed in between regression test cycles. In this paper we present a dynamic, reasoning approach to estimate the number of defects in the system early in the process of regression testing. Our approach, coined Dracon, is based on Bayesian fault diagnosis over abstractions of program traces (also known as program spectra). Experimental results show that Dracon systematically estimates the exact number of (injected) defects, provided sufficient test cases are available. Furthermore, we also propose a simple, analytic performance model to assess the influence of failed test cases on the estimation. We observe that our empirical findings are in agreement with the model.

Keywords: Defect prediction, Bayes, diagnosis, program spectra.
1 Introduction
Software testing is an important activity to verify the presence or absence of faults (defects). When no defects are detected during testing, an important question is when to terminate testing such that sufficient confidence is reached about system health at acceptable testing cost. On the other hand, when failures do occur, an equally important question is how many defects can be assumed to be present. This determines reliability should the product be shipped, but it also provides information on how many developers need to be involved in debugging in parallel should all defects need to be removed. In defect (density) prediction and reliability growth modeling, the number of faults is predicted using models. In the former case defect density is predicted in terms of various code complexity (e.g., [28]) or developer (e.g., [19]) metrics,
while in the latter case defect detection rate models are fitted to the detection rate obtained from test data (e.g., [26]). As defect prediction models vary considerably over different systems and releases, their accuracy is typically limited to defect density prediction, rather than accurate defect prediction, which is typically based on actual test data. On the other hand, given the inherent data-fitting process, reliability growth models need extensive test data to deliver sufficient accuracy. Moreover, the use of contemporary test prioritization algorithms [8,13,22,17,23,29], aimed at detecting faults as early as possible, puts a strain on the typical assumption that the defect detection rate is more-or-less constant, leading to prediction unreliability.

In this paper we propose an alternative, probabilistic approach where the test results are analyzed from a fault diagnosis point of view. In fault diagnosis, failures are mapped to their root causes, localizing the faults in the system and enabling more adequate recovery, faster repair, and/or better redesign. Although fault diagnosis focuses on localizing the faults, an important side-result is an estimate of the number of faults in the system (if any), along with its probability. This allows the tester to terminate the process once sufficient probability has been obtained about the number of faults in the system. As the faults are already diagnosed (localized), they can also be easily repaired, or a better estimate can be made of their future impact on system reliability. Furthermore, knowledge of the number of defects allows for efficient parallel debugging, the success of which has been limited up until now when using statistical fault localization methods (e.g., [3,14]).

The contributions of this paper are twofold: (i) we present our reasoning approach, coined Dracon (an acronym for Diagnostic ReAsoning-based defeCt predictiON), to defect prediction based on spectrum-based, Bayesian fault diagnosis; (ii) we study the effect of problem parameters such as the number of faults, the number of tests, the number of system components, and the false negative failure detection rate on defect prediction performance, using synthetic systems, which allows us to control the above parameters. To the best of our knowledge, this fault diagnostic reasoning approach to defect prediction has not been presented before.

The paper is organized as follows. In the next section we discuss spectrum-based, Bayesian fault diagnosis. Section 3 presents our reasoning approach to defect prediction. Experimental results are reported in Section 4. We compare our approach with related work in Section 5. In Section 6, we conclude and discuss directions for future work.
2 Fault Diagnosis
Two major approaches to automatic software fault localization can be distinguished: (1) statistics-based, and (2) reasoning/probabilistic-based fault localization (in this paper we use the words localization, debugging and diagnosis interchangeably). Statistics-based techniques use abstraction of program traces (also known as program spectra [12]) to correlate software component activity with program
failures [2,15,18]. While statistics-based approaches scale very well, it has been shown that their diagnostic accuracy is insufficient to enable defect prediction [3]. Reasoning-based techniques, such as model-based diagnosis, deduce component failure through logic reasoning [7,9,20,27] using propositional models of component behavior. While their superior diagnostic accuracy allows them to be used as a basis for defect prediction, their limited scalability prohibits their use for realistic program sizes. In [4] a Bayesian reasoning approach has been proposed that uses generic component behavior models in a spectrum-based context. Pairing the diagnostic accuracy from reasoning approaches with the scalability of statistical approaches, this approach (subsequently coined "Bayesian Fault Diagnosis") will be used as the basis for defect prediction.

2.1 Bayesian Fault Diagnosis
The following inputs are involved in Bayesian fault diagnosis:
– A finite set C = {c1, c2, ..., cj, ..., cM} of M components of which MF = 0, ..., M can be faulty.
– A finite set T = {t1, t2, ..., ti, ..., tN} of N tests with binary outcomes oi, where oi = 1 if test ti failed, and oi = 0 otherwise.
– An N × M coverage matrix, A = [aij], where aij = 1 if test ti involves ("covers") component cj, and 0 otherwise. NF ≤ N denotes the number of failing tests.
– An a priori fault probability ("prior") pj for each component cj, which represents the knowledge available before any test is executed.
– A set of false negative rate values 0 ≤ hj ≤ 1 (h for health) for each component cj. The value of hj expresses the probability cj will not cause a failure when covered in a test iff cj is defective. This is also known as coincidental correctness or failure exposing potential.
Bayesian fault diagnosis is aimed at obtaining a set of fault candidates D = d1, ..., dk. Each candidate dk is a subset of the components which, when at fault, explain the observed failures. For instance, d = {c1, c3, c4} indicates that c1 and c3 and c4 are faulty, and no other component. Many algorithms exist to compute D [6,20]. The multiple-fault Bayesian fault diagnosis as described in this section is an intractable problem given that it has to consider up to |D| = 2^M possible candidates, one for each possible combination of components. However, effective strategies for reducing the size of D do exist, and will be discussed in Section 3 as part of our defect prediction algorithm. The candidates returned in D are ordered according to their probability of being the true diagnosis Pr(dk). As there can only be one correct candidate, all the individual probabilities add up to 1. Initially, the probability of each candidate is

Pr(d_k) = \prod_{c_j \in d_k} p_j \cdot \prod_{c_j \notin d_k} (1 - p_j)    (1)
After test case ti is executed, the probability of each candidate dk ∈ D is updated depending on the outcome oi of the test, following Bayes' rule:

Pr(d_k \mid o_i) = \frac{Pr(o_i \mid d_k) \cdot Pr(d_k)}{Pr(o_i)}    (2)
which is repeated recursively for each subsequent test. In this equation, Pr(dk) represents the prior probability of candidate dk before the test is executed. Pr(oi) is a normalization value that represents the residual probability of the observed outcome, independent of which candidate is the actual diagnosis. Pr(oi|dk) represents the probability of the observed outcome oi produced by a test ti, if that candidate dk was the actual diagnosis. This depends on the hj of the components in dk involved in the test (given by A) according to

Pr(o_i = 0 \mid d_k) = 1 - Pr(o_i = 1 \mid d_k) = \prod_{c_j \in d_k \wedge a_{ij} = 1} h_j    (3)
Thus if any of the involved (faulty) components fails, the program will fail (or-model). This assumes that components fail independently, a standard assumption in fault diagnosis [4,6]. In [4] it is shown how D (including the hj) is computed from N tests. The accuracy of the diagnosis increases with N, and is measured in terms of diagnostic cost Cd, which is defined as the number of times a component in D has been tested by the diagnostician to be healthy (false positive) while he/she goes through D from top (dk with highest posterior) to bottom (lowest probability). More details can be found in [4]. The above approach to computing D is the basis for the defect prediction approach we present in Section 3.
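To make the update rule concrete, the sketch below applies Eqs. (1)-(3) to a small candidate set. It is an illustrative Python outline under simplifying assumptions (candidates are enumerated up to a small cardinality, the priors p and healths h are given rather than estimated, and the normalisation is taken over the enumerated candidates only); it is not the authors' implementation.

    from itertools import combinations

    def prior(candidate, p):
        """Eq. (1): prior of a candidate set of faulty components."""
        pr = 1.0
        for j in range(len(p)):
            pr *= p[j] if j in candidate else (1.0 - p[j])
        return pr

    def likelihood(outcome, candidate, coverage_row, h):
        """Eq. (3) and its complement: Pr(o_i | d_k) under the or-model."""
        pass_prob = 1.0
        for j in candidate:
            if coverage_row[j]:          # faulty component covered by this test
                pass_prob *= h[j]
        return pass_prob if outcome == 0 else 1.0 - pass_prob

    def diagnose(A, e, p, h, max_card=2):
        """Posterior over candidates after all tests (Eq. (2), applied per test)."""
        M = len(p)
        cands = [frozenset(c) for r in range(1, max_card + 1)
                 for c in combinations(range(M), r)]
        post = {d: prior(d, p) for d in cands}
        for coverage_row, outcome in zip(A, e):
            for d in post:
                post[d] *= likelihood(outcome, d, coverage_row, h)
            norm = sum(post.values()) or 1.0     # plays the role of Pr(o_i)
            for d in post:
                post[d] /= norm
        return sorted(post.items(), key=lambda kv: -kv[1])

With hj = 0, only candidates whose combined coverage matches the failing tests retain non-zero probability, which is the behaviour illustrated in the example of Section 2.2.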
2.2 Example
To illustrate how Bayesian fault diagnosis works, consider the character counter program in Table 1, which contains 2 defects. For the sake of source code readability the coverage matrix A is shown transposed. For simplicity, all cj are assumed to have equal prior fault probability, pj = 0.1, and we assume that faults, when covered, will always cause a failure, i.e., hj = 0. Initially D = {c1}, {c2}, ..., {c1, c2}, ..., {c6, c12}, ... contains all candidates, with the ones with lower cardinality on top (i.e., fewer faults are more probable than more faults), and Cd(0) = 66% (when multiple candidates have the same probability we take an averaged effort, Cd = 66% in this case). The initial probability for the true fault explanation d6,12 = {c6, c12} is Pr(dk) = 0.1^2 · 0.9^11 = 0.003. After all tests are executed, D = {c6, c12}, {c6, c8, c12} with probabilities 0.9 and 0.1 respectively. Even though there is still some uncertainty in the diagnosis, Cd = 0 as the correct diagnosis ranks on top. Note that Cd decreases as more (failed) runs are observed (see Table 1's last row); e.g., after test t3, Cd is still 9.1%.
Table 1. Example diagnosis (hj = 0.0, pj = 0.1)

Components c1 ... c12, Program: Character Counter   t1 t2 t3 t4 t5 t6 t7 t8
main() {                                             1  1  1  1  1  1  1  1
int let=0, dig=0, other=0, c;                        1  1  1  1  1  1  1  1
while(c = getchar()) {                               1  1  1  1  1  1  1  1
if ('A'=c)                                           1  0  1  1  1  1  1  1
let += 2; // FAULT                                   0  0  1  1  1  1  0  1
elsif ('a'=c)                                        1  0  1  0  1  1  1  1
let += 1;                                            0  0  1  0  0  1  0  1
elsif ('0'=c)                                        1  0  1  0  1  1  1  0
dig += 1;                                            1  0  1  0  1  0  0  0
elsif (isprint(c))                                   1  0  0  0  0  1  1  0
other += 2; // FAULT                                 0  0  0  0  0  1  1  0
printf("%d %d %d\n", let, dig, others);}             1  1  1  1  1  1  1  1
Test case outcomes                                   0  0  1  1  1  1  1  1
Cd (%)                                              6.1 6.1 9.1 4.5 4.5 4.5 0.0 0.0
3 Defect Prediction
In this section we present our defect prediction algorithm and discuss its accuracy.

3.1 Algorithm
Given a diagnosis D the probability Pr(cj) that component cj is defective is given by the total probability mass of the dk that include cj, i.e.,

Pr(c_j) = \sum_{d_k \in D,\, c_j \in d_k} Pr(d_k)    (4)
It follows that the number of defects is given by

E[M_F] = \sum_{j=1}^{M} Pr(c_j)    (5)
From Eqs. (4) and (5) it follows that the number of defects is estimated by

E[M_F] = \sum_{d_k \in D} |d_k| \cdot Pr(d_k)    (6)
Despite its superior accuracy compared to statistics-based approaches, the major problem of Bayesian fault diagnosis is that it requires maintaining O(2^M) candidates dk, which is impracticable for realistic problem sizes. To reduce the number of candidates we only generate the minimal ones, instead of all possible candidates (e.g., only generate {c6} instead of {c6} and {c6, c8}, as the former candidate already fully explains all test outcomes and is much more probable). To this end we employ a minimal hitting set (MHS) algorithm called Staccato [1] that generates minimal diagnosis candidates in (approximate) order of posterior probability at polynomial cost. Due to this best-first order, the first
100 diagnosis candidates already cover practically all posterior probability mass, greatly improving scalability [1]. The above MHS approach works very well as long as failures are detected (which is the typical application context of fault diagnosis). When applying Bayesian diagnosis for defect prediction, however, we also need to cover the case when there are no test failures found. If (or as long as) no failed runs are observed, we compute a first order approximation subset of the full O(2^M) candidate set (again, to reduce complexity). That subset contains the single and double candidates plus the one which states that all components are healthy. This modified hitting set computation is denoted HS1. Algorithm 1 summarizes the steps involved in the estimation of the number of defects in the program under analysis. Our approach takes as its only input (A, e). First, it computes the set of diagnosis candidates as described above (lines 2 to 6). Second, the candidates in D are ranked using Barinel (line 7; Barinel also estimates hj automatically, for more information refer to [4]). Finally, the number of defects is estimated from the ranked D as follows (lines 8 to 10 in the algorithm).

Algorithm 1. Defect Prediction Algorithm: Dracon
Inputs: Matrix (A, e)
Output: E[MF]
1:  E[MF] ← 0
2:  if |{n | en = 1}| = 0 then
3:      D ← HS1()
4:  else
5:      D ← Staccato(A, e, 100)
6:  end if
7:  (D, Pr) ← Barinel(A, e, D)
8:  for all dk ∈ D do
9:      E[MF] ← E[MF] + |dk| · Pr[dk]
10: end for
11: return E[MF]
To illustrate our approach to defect prediction, consider again the example of Section 2.2. After the first test, Dracon estimates that there are 0.012 defects, and it drops to 0.008 after t2. Once the first failed run is observed, t3, E[MF] increases to 1. After all tests are executed, our algorithm estimates that the program has 2 defects (in fact, E[MF] = 2 is already estimated after t7 is executed).
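The estimation step itself is small; the following sketch mirrors Algorithm 1 in Python, with hs1, staccato and barinel passed in as stand-in callables (their real implementations are in the cited work and are not reproduced here).

    def dracon(A, e, hs1, staccato, barinel):
        """Sketch of Algorithm 1: estimate the expected number of defects."""
        if not any(e):                    # no failing test observed (lines 2-3)
            D = hs1()
        else:                             # lines 4-5
            D = staccato(A, e, 100)
        D, Pr = barinel(A, e, D)          # line 7: rank candidates, estimate h_j
        return sum(len(d) * Pr[d] for d in D)   # lines 8-11, i.e. Eq. (6)

For instance, a ranking consisting of ({c1}, 0.7) and ({c1, c3}, 0.3) would yield E[MF] = 1 · 0.7 + 2 · 0.3 = 1.3.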
3.2 Estimating E[MF ]
From the above it can be seen that the quality of D (measured as Cd ) increases with N . This implies that for small N the estimation quality of E[MF ] may not be sufficient. In this section we study the accuracy of E[MF ]. Figure 1 shows
E[MF] as more test data becomes available. The data are based on a random test matrix corresponding to a synthetic system with M = 20 components of which MF = 5 are defective. As for defective systems (i.e., MF > 0) diagnostic accuracy is primarily determined by failing tests, E[MF] is plotted against NF instead of N. From the figure it can be seen that E[MF] converges to the correct value MF = 5 according to an approximate, first-order response, modeled by

E[M_F] = M_F \cdot (1 - e^{-N_F/\tau})    (7)
where τ denotes a constant that is determined by the system and the test matrix A. The model is based on the typical intuition in reliability growth modeling that the probability that a new test uncovers a new defect is inversely proportional to the number of uncovered defects (e.g., the Goel-Okumoto model [11,21]). The above model is fitted in the plot, showing a reasonable fit (R^2 = 0.90). The above asymptotic behavior suggests that Dracon properly converges to the exact number of defects. This is confirmed by our experiments in the next section.
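Given measured E[MF] values per NF, τ can be fitted by non-linear least squares; the snippet below is one possible way to do this (the data arrays are illustrative placeholders, not the values behind Figure 1).

    import numpy as np
    from scipy.optimize import curve_fit

    MF = 5.0                                      # number of injected faults

    def first_order(nf, tau):
        return MF * (1.0 - np.exp(-nf / tau))     # Eq. (7)

    nf_values = np.array([50.0, 100.0, 200.0, 400.0, 800.0])   # placeholder data
    emf_values = np.array([1.9, 3.1, 4.2, 4.8, 5.0])           # placeholder data
    (tau_hat,), _ = curve_fit(first_order, nf_values, emf_values, p0=[200.0])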
Fig. 1. Impact of NF on E[MF ] (measured E[MF ] and the fitted model of Eq. (7))
4 Experimental Results
In this section, we evaluate the estimation performance of our algorithm.

4.1 Experimental Setup
In order to assess the defect prediction performance of our approach we generate synthetic program observations based on random (A, e) generated for various values of N , M , and the number of injected faults MF . The reason for the synthetic experiments is that we can vary parameters of interest in a controlled setting, whereas real programs typically represent only one particular parameter setting. Component activity aij is sampled from a Bernoulli distribution with parameter r, i.e., the probability a component is involved in a row of A equals
r. For each faulty component cj (without loss of generality we select the first MF components, i.e., c1, ..., cMF are faulty) we set the component healths (false negative rates) hj. Thus the probability of a component j being involved and generating a failure equals r · (1 − hj). A row i in A generates an error (ei = 1) if at least one of the MF components generates a failure (or-model). Measurements for a specific (N, M, MF, ρ, h) scenario are averaged over 1,000 sample matrices, yielding a negligible coefficient of variation.
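A generator for such synthetic observations can be written in a few lines; the sketch below follows the description above (Bernoulli(r) coverage, or-model errors, the first MF components faulty), with names chosen for readability rather than taken from the authors' code.

    import numpy as np

    def generate_observation(N, M, MF, r, h, seed=0):
        """Random (A, e): a_ij ~ Bernoulli(r); test i fails if at least one
        faulty component is involved and exposes the fault (prob. 1 - h)."""
        rng = np.random.default_rng(seed)
        A = (rng.random((N, M)) < r).astype(int)
        exposes = rng.random((N, MF)) < (1.0 - h)
        e = ((A[:, :MF].astype(bool) & exposes).any(axis=1)).astype(int)
        return A, e

    A, e = generate_observation(N=100, M=100, MF=5, r=0.6, h=0.1)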
4.2 Results
We report experiments using M = 100 and ρ = 0.6 for a varying number of runs N and number of injected faults MF. We have considered two scenarios: (i) only observing passed runs and (ii) observing both failed and passed runs. Due to lack of space, we do not report results for other scenarios, but the conclusions are similar. Figure 2(a) plots the evolution of E[MF] for MF = 0 faults. As discussed before, the more passed runs are observed, the stronger the probability that the system is defect-free. Therefore, E[MF] decreases with N. Figure 2(b) plots the evolution of E[MF] as the system is exercised with both pass/fail runs (note that the x-axis is logarithmic). From the plots we conclude that the E[MF] estimation quality increases with N, consistently converging to the exact number of (injected) defects. This result is dependent on hj: the higher hj is, the more tests it takes to estimate the true number of defects, since the failure rate (which determines E[MF]) decreases with h. We also experimented with using the first order approximation (HS1) for MF > 0 as well, but we conclude that the plain MHS yields better results (results are not plotted for the sake of simplicity).
Fig. 2. Impact of N on E[MF ] (M = 100 and ρ = 0.6): (a) NF = 0; (b) NF > 0. Curves are shown for h = 0.1 and h = 0.9 with MF = 1, 2, 5.
5 Related Work
Two major approaches to (software) reliability modeling can be distinguished [10]: those that predict reliability from design parameters, and those that predict reliability from test data. The former approach is known as defect
prediction, where defect density is modeled in terms of source code complexity [28], possibly taking into account test coverage [24] and static analysis to predict how well testing will uncover the presence of defects [25,5]. The latter approach is known as reliability growth modeling, where a (decreasing) defect detection rate model is fitted to the test data to obtain the model parameters, yielding the estimated number of residual defects as a function of test effort [26]. Combinations have been proposed, leading to multivariate approaches, either through analytic models (e.g., [16]) or using Bayesian Belief Networks [10]. Our approach also builds on test data, but yields a prediction based on test failures, i.e., when the defects are not even isolated or repaired. An approach more related to ours is described in [14]. However, due to their choice of a statistical diagnosis technique they cannot accurately predict the number of defects.
6 Conclusions and Future Work
In this paper we present a dynamic, reasoning approach, Dracon, to estimate the number of defects in the system under test, based on spectrum-based, Bayesian fault diagnosis. Our approach leverages testing/diagnosis information to properly estimate the number of defects. We carried out a number of experiments to assess the performance of our approach, and we conclude that Dracon is able to estimate the exact number of defects, provided sufficient (pass/fail) test cases are available. We also present an analytic model that predicts the evolution of the estimation as more tests are executed. We observed that our empirical findings are in agreement with the model. Future work includes experimenting with real software systems to validate our findings (experiments are currently being performed and preliminary results support the above conclusions). We also plan to investigate in detail the impact of using the truncated minimal hitting set, in particular for when only a few failed runs are available. Finally, we plan to study the feasibility of model fitting for small N to decrease the number of tests required for proper defect prediction.
References 1. Abreu, R., van Gemund, A.J.C.: A low-cost approximate minimal hitting set algorithm and its application to model-based diagnosis. In: Proc. of SARA 2009, pp. 2–8 (2009) 2. Abreu, R., Zoeteweij, P., Golsteijn, R., van Gemund, A.: A practical evaluation of spectrum-based fault localization. Journal of Systems and Software 82(11), 1780– 1792 (2009) 3. Abreu, R., Zoeteweij, P., van Gemund, A.J.C.: Localizing software faults simultaneously. In: Proc. of QSIC 2009, pp. 367–376 (2009) 4. Abreu, R., Zoeteweij, P., van Gemund, A.J.C.: A new bayesian approach to multiple intermittent fault diagnosis. In: Proc. of IJCAI 2009, pp. 653–658 (2009) 5. Bertolino, A., Strigini, L.: On the use of testability measures for dependability assessment. IEEE TSE 22(2), 97–108 (1996) 6. de Kleer, J.: Diagnosing multiple persistent and intermittent faults. In: Proc. IJCAI 2009, pp. 733–738 (2009)
7. de Kleer, J., Williams, B.C.: Diagnosing multiple faults. Artif. Intell. 32(1), 97–130 (1987) 8. Elbaum, S., Malishevsky, A., Rothermel, G.: Incorporating varying test costs and fault severities into test case prioritization. In: Proc. of ICSE 2001, pp. 329–338 (2001) 9. Feldman, A., van Gemund, A.J.C.: A two-step hierarchical algorithm for modelbased diagnosis. In: Proc. of AAAI 2006, pp. 827–833 (2006) 10. Fenton, N.E., Neil, M.: A critique of software defect prediction models. IEEE TSE 25(5), 675–689 (1999) 11. Goel, A.L., Okumoto, K.: Software Reliability. McGraw-Hill, New York (1987) 12. Harrold, M.J., Rothermel, G., Wu, R., Yi, L.: An empirical investigation of program spectra. In: Proc. of PASTE 1998, pp. 83–90 (1998) 13. Jiang, B., Zhang, Z., Tse, T.H., Chen, T.Y.: How well do test case prioritization techniques support statistical fault localization. In: Proc. COMPSAC 2009, pp. 99–106 (2009) 14. Jones, J.A., Harrold, M.J., Bowring, J.F.: Debugging in parallel. In: Proc. of ISSTA 2007, pp. 16–26 (2007) 15. Jones, J.A., Harrold, M.J., Stasko, J.: Visualization of test information to assist fault localization. In: Proc. of ICSE 2002, pp. 467–477 (2002) 16. Khoshgoftaar, T.M., Munson, J.C.: Predicting software development errors using software complexity metrics. IEEE Journal on Selected Areas in Communications 8(2), 253–261 (1990) 17. Li, Z., Harman, M., Hierons, R.M.: Search algorithms for regression test case prioritization. IEEE TSE 33(4), 225–237 (2007) 18. Liu, C., Yan, X., Fei, L., Han, J., Midkiff, S.P.: Sober: Statistical model-based bug localization. In: Proc. of ESEC/FSE-13, pp. 286–295 19. Matsumoto, S., Kamei, Y., Monden, A., ichi Matsumoto, K., Nakamura, M.: An analysis of developer metrics for fault prediction. In: Proc. of PROMISE 2010, pp. 18:1–18:9 (2010) 20. Mayer, W., Stumptner, M.: Evaluating models for model-based debugging. In: Proc. of ASE 2008, pp. 128–137 (2008) 21. Musa, J., Iannino, A., Okumoto, K.: A time dependent error detection model for software reliability and other performance measures. IEEE Trans. Reliability 28(3), 206–211 (1979) 22. Shakeri, M., Raghavan, V., Pattipati, K.R., Patterson-Hine, A.: Sequential testing algorithms for multiple fault diagnosis. IEEE TSMC 30(1), 1–14 (2000) 23. Smith, A.M., Kapfhammer, G.M.: An empirical study of incorporating cost into test suite reduction and prioritization. In: Proc. SAC 2009, pp. 461–467 (2009) 24. Veevers, A., Marshall, A.C.: A relationship between software coverage metrics and reliability. STVR 4(1), 3–8 (1994) 25. Voas, J.M., Miller, K.W.: Software testability: The new verification. IEEE Software 12(3), 17–28 (1995) 26. Wood, A.: Predicting software reliability. Computer 29(11), 69–77 (1996) 27. Wotawa, F., Stumptner, M., Mayer, W.: Model-based debugging or how to diagnose programs automatically. In: Hendtlass, T., Ali, M. (eds.) IEA/AIE 2002. LNCS (LNAI), vol. 2358, pp. 746–757. Springer, Heidelberg (2002) 28. Zhang, H., Zhang, X., Gu, M.: Predicting defective software components from code complexity measures. In: Proc. of PRDC 2007, pp. 93–96 (2007) 29. Zhang, L., Hou, S.-S., Guo, C., Xie, T., Mei, H.: Time-aware test-case prioritization using integer linear programming. In: Proc. of ISSTA 2009, pp. 213–224 (2009)
Multiple Source Phoneme Recognition Aided by Articulatory Features
Mark Kane and Julie Carson-Berndsen
CNGL, School of Computer Science and Informatics, University College Dublin, Ireland
[email protected], [email protected]

Abstract. This paper presents an experiment in speech recognition whereby multiple phoneme recognisers are applied to the same utterance. When these recognisers agree on an hypothesis for the same time interval, that hypothesis is assumed to be correct. When they are in disagreement, fine-grained phonetic features, called articulatory features, recognised from the same speech utterance are used to create an articulatory feature-based phoneme. If the output of either of the phoneme recognisers for that interval matches the articulatory feature-based phoneme, then that phoneme is selected as an hypothesis for the interval. Underspecification of the articulatory feature-based phoneme is applied if an hypothesis is not found, and the matching process is repeated. The results of the experiment show that the final output accuracy is greater than that of both of the initial phoneme recognisers.

Keywords: phoneme recognition, phonetic similarity, articulatory features, underspecification.
1 Introduction
In Automatic Continuous Speech Recognition (ACSR), utterances are typically decoded by first recognising distinct sounds where the combination of these sounds creates the word. American English, for example, is typically represented by approximately 39 of these distinct sounds called phonemes. A common approach to ACSR incorporates machine learning techniques such as Hidden Markov Models (HMMs). In this approach an acoustic HMM is constructed for each phoneme derived from parameterised speech such as Mel Frequency Cepstrum Coefficients (MFCCs) coupled with a language model to incorporate how often a word or phoneme follows another. Typical language models are based on n-grams. Both language models and HMMs must be trained from several hours of
This research is supported by the Science Foundation Ireland (Grant 07/CE/I1142) as part of the Centre for Next Generation Localisation (www.cngl.ie) at University College Dublin. The opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of Science Foundation Ireland.
speech called a corpus, which consists of a training set, a test set for evaluation and annotations. Conventionally phonemes can be modelled as monophones (where the acoustic based HMM models only the individual phoneme) or triphones (where the acoustic based HMM models left and right context, hence a triple). The approach outlined above is often referred to as the 'beads on a string' approach as each phoneme can be regarded as a bead following another bead [1], thus a linear approach. Over the years, research has been shifting from this approach to a multi-tiered paradigm building on theories of autosegmental and articulatory phonology and underpinned by phonological feature theory [2], [3], whereby the speech utterance is represented by several features on different tiers, each representing a different aspect of the utterance at the same time, thus a multilinear model. Multi-tiered approaches to speech recognition as in [4],[5],[6] have drawn much attention to the use of finer grained information than the phoneme, namely articulatory features (AFs) which, it is argued, more readily represent acoustic events than the basic concept of the phoneme [7]. In this approach, a phoneme corresponds to a combination of AFs, on different tiers, which overlap in time; typically a sequence of phonemes then constitutes a syllable and a sequence of syllables constitutes the word. Once an utterance is recognised at any of these levels, a standard evaluation procedure is carried out in order to measure the performance of the system. An alternative approach to measuring the performance of a speech recognition system is to apply a confidence measure post-recognition and pre-evaluation where the inaccuracies of the recognition output can be highlighted. ROVER [8] is a system which applies multiple ASR systems to the same utterance and employs a voting process to confidently decide the output sequence. Other research on how a recognition system performs is based on understanding where there is confusion between hypothesis and reference (what the recogniser should have recognised). Scanlon et al. [9] demonstrate that approximately 75% of misclassified frames in a speech recognition process were assigned labels within the same Broad Phonetic Group (BPG) as the reference, thus BPG experts were designed to discriminate phonemes within the same BPG. Other work in this area suggests that removing a phoneme model for a particular sound from the training data causes the decoding process to select a different model instead. As the correct model can never be selected, it is confused with other models 100% of the time. Analysing these confusions gives an increased insight into how phonetically similar the models are in comparison to a standard confusion matrix where all phoneme models are included [10]. This paper presents a novel approach which combines the methods mentioned above by incorporating a multi-tiered paradigm while maintaining the strengths of the conventional linear models. Firstly, the outputs of multiple phoneme recognition systems are combined when they are in agreement, i.e. produce the same hypothesis for a particular interval, and are thus assumed to have a high degree of confidence for that interval. Where they are not in agreement for a particular interval, whichever phoneme recogniser output is phonetically similar to the recognised AFs is then used as the hypothesis. Underspecification is incorporated
to resolve a disagreement between phoneme recognisers when a fully specified AF-generated phoneme cannot hypothesize a first-pass solution to the disagreement. The remainder of the paper is structured as follows. Section 2 introduces articulatory features and underspecification. Section 3 details the experiments that were carried out, while Section 4 outlines the recognition system and corpus used. The results of the experiment are discussed in Section 5 and directions for future work are highlighted, then finally conclusions are drawn in Section 6.
2 Articulatory Features and Underspecification
A phoneme is considered fully specified if it simultaneously has a value for all features which are associated with it (this paper primarily uses those features specified within the International Phonetic Alphabet (IPA) [11]). It is this simultaneous nature of the multi-tiered (or multilinear) approach that is different to the linear approach of the beads on a string analogy. Instead of one tier containing just phoneme acoustic models, several tiers are used where each contains a unique set of feature acoustic models. If at the same time on each tier specific features co-occur (overlap in time), it is then said that a specific phoneme is found. An example of the interpretation of multi-tiered feature representations is described in [12]. The complete set of multi-tiered features incorporated in this paper is listed in Table 1. These tiers are not hierarchically structured (i.e. they are independent of each other). Tier 1 represents consonants, vowels and approximants, where non indicates none of the aforementioned. Tiers 2 and 3 represent the manner and place of articulation for consonants while tier 4 represents voicing information. Tier 5 represents rhoticity to capture the phonemes /r/, /er/ and /axr/ inherent in the corpus used, which is described in Section 4. Tiers 6 and 7 represent tongue movements and jaw movements associated with vowels, respectively. Tier 8 represents the vowel space in terms of a broad phonetic group as in Figure 1, along with the phoneme assignments of each quadrant. Models are trained for Q1, Q2, Q3, Q4, Qd (default quadrant) and non-Qs (consonants) based on these assignments.

Table 1. Feature tiers used in experiment
tier  feature possibilities for each tier
1     consonant, approximant, vowel, non
2     plosive, fricative, affricative, approximant, nasal, non
3     bilabial, labio-dental, dental, alveolar, post alveolar, velar, glottal, non
4     voiced, unvoiced
5     rhotic, non-rhotic
6     front, centre, back, non
7     close, close-mid, mid, open-mid, open, non
8     Q1, Q2, Q3, Q4, Qd, non
phoneme  canonical features
/p/      consonant plosive bilabial* unvoiced*
/f/      consonant fricative labio-dental* unvoiced*
/ch/     consonant affricative* unvoiced*
/r/      consonant approximant alveolar* rhotic*
/ix/     vowel centre* close Q1*
/ux/     vowel centre* close Q2*

Fig. 1. The table in this figure gives an example of the phoneme to feature map, where the asterisk indicates the minimum required features as set in this experiment. The diagram to the right of the figure shows the vowel quadrilateral broad phonetic group.
These broad phonetic groups help disambiguate between such phonemes as /ix/, /ih/, /ux/, /uh/, and /ah/. Underspecification occurs when some of the information required to recognise a phoneme is not available. In practice, features are unlikely to start and end at the same temporal points as the phoneme. Features can start before the phoneme, they can start within the previous phoneme and they can be omitted due to pronunciation variation, speaking rates, accents etc. Linguistic knowledge which defines how the phonemes are fully specified allows sets of phonemes to be inferred from underspecified data. An example of this is shown in [12]. Bates et al. [13] also use a phonetic-feature-based prediction model where the features representing the phones can be considered as on, off, unspecified or unused. The table in Figure 1 presents several phonemes from the data set used in the experiment outlined in the next section. Here a maximum underspecification (the minimum features required to infer a phoneme) has been defined for this purpose and highlighted with an asterisk. The experiment presented in the next section describes how AFs and underspecification are used to aid the recognition system when intervals of different phoneme recognisers do not agree.
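The idea of ranked underspecification can be illustrated with a small sketch. The feature sets below are written in the style of the table in Figure 1 but are illustrative assumptions, and the ranking used here (one rank per missing canonical feature, never dropping below the minimum required set) is a simplification of the scheme shown later in Table 2.

    # Canonical features and the minimum ("asterisked") subset per phoneme.
    CANONICAL = {
        "p": {"consonant", "plosive", "bilabial", "unvoiced"},
        "m": {"consonant", "nasal", "bilabial", "voiced"},
        "n": {"consonant", "nasal", "alveolar", "voiced"},
    }
    MINIMUM = {
        "p": {"bilabial", "unvoiced"},
        "m": {"consonant", "bilabial", "voiced"},
        "n": {"consonant", "alveolar", "voiced"},
    }

    def af_based_phonemes(observed_features, max_rank=4):
        """Return (rank, phoneme) pairs inferable from the observed AFs;
        rank 1 means fully specified, higher ranks are more underspecified."""
        hypotheses = []
        for phoneme, canonical in CANONICAL.items():
            if not MINIMUM[phoneme] <= observed_features:
                continue                  # below the maximum underspecification
            rank = 1 + len(canonical - observed_features)
            if rank <= max_rank:
                hypotheses.append((rank, phoneme))
        return sorted(hypotheses)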
3 Experiment
In this experiment, a process is described where the outputs of multiple recognition systems are combined if the systems produce the same hypothesis for an interval. If they disagree, AFs are used to decide which hypothesis should be chosen. Figure 2 gives an overview of the experiment presented in this section. The first stage of the experiment is to combine the hypotheses for a speech utterance from the output of two phoneme recognition systems. All matching phonemes between the two systems are considered to be accurate with high confidence if they are recognised at the same time interval. In the second stage, all outputs where the phonemes do not match are then compared against a phoneme hypothesis that is generated from AFs from the same speech utterance. Since the AF recognisers are applied to the utterance independently from phoneme recognition, a strictly hierarchical approach, which may lead to a cascade of hard decisions, is avoided. If either output of the phoneme recognisers is the same as the fully specified AF phoneme then this hypothesis is selected. If no match is found, another, underspecified, AF phoneme is generated. If a result is still not found, the AF phoneme is underspecified further and the resultant AF phoneme is compared with the output of the phoneme
Fig. 2. Overview of experiment
recognition systems. The process is repeated until the AF phoneme generation is too underspecified to create a phoneme, or a match is found. These two stages are described in detail in the subsections below.

3.1 Stage 1
Stage 1 of this experiment applies two different phoneme recognisers to a speech utterance. Further detail on the recognition systems and the speech corpus used in this experiment is given in Section 4. An example of the combination process for the output of the two recognisers is illustrated in Figure 3 and is outlined below, where start and end represent the time interval in which the phoneme occurred as per each recogniser.
– Phoneme hypotheses from both recognisers are compared with respect to their start and end time information. If they agree, they are combined as they are assumed to be correct. If they are not in agreement, then they are passed to Stage 2 for further classification.
– However, before the hypotheses are passed to Stage 2, the interval is checked to see whether it is a common transition, e.g. interval 0.2 0.5 /ih/ is followed by an interval containing both /n/ with /ih/, which is then followed by an interval containing only /n/. Intervals in which a common transition (e.g. 0.5 0.6 n ih) is found are ignored.
Fig. 3. Stage 1: Combined output of two phoneme recognisers
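A minimal sketch of the Stage 1 combination is given below. It assumes, for brevity, that both recognisers produce hypotheses over the same interval boundaries and it omits the common-transition check described above; interval tuples follow the (start, end, phoneme) convention of Figure 3.

    def combine_stage1(recogniser_a, recogniser_b):
        """Split intervals into agreed hypotheses and disputed ones for Stage 2."""
        agreed, disputed = [], []
        for (start, end, hyp_a), (_, _, hyp_b) in zip(recogniser_a, recogniser_b):
            if hyp_a == hyp_b:
                agreed.append((start, end, hyp_a))        # assumed correct
            else:
                disputed.append((start, end, hyp_a, hyp_b))
        return agreed, disputed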
Table 2. The left table shows the features found at a specific time interval. The right table shows the AF based phonemes. * indicates that the AF based phoneme produced is underspecified and the greater the rank, the greater the underspecification.

tier  features
1     consonant
2     plosive
3     alveolar
4     voiced
5     non-rhotic
6     centre
7     non
8     Q1

rank  phoneme  features found
1     d        consonant, plosive, alveolar, voiced
2     d*       plosive, alveolar, voiced
3     d*       consonant, alveolar, voiced
3     z*       consonant, alveolar, voiced
3     n*       consonant, alveolar, voiced
4     ix*      centre, Q1
...   ...      ...

3.2 Stage 2
As mentioned above, in the case that the output hypotheses of the two phoneme recognisers do not match, the hypotheses are passed to Stage 2 for verification using AFs. In the example, 0.7 0.8 m n is passed to this stage.
1. The AFs within the interval are examined to see if they infer either /m/ or /n/, starting from the closest canonical AF-based phoneme to the maximum underspecification allowed.
For example, if the AFs for the interval are as stated in the left table of Table 2, then from this information the AF-based phonemes are as shown in the right table. Only one AF-based phoneme, d, is fully specified; the rest are underspecified.
2. Starting with the most specified, each AF-based phoneme is compared with the two phoneme hypotheses from Stage 1. Where they are the same, this hypothesis is output for this interval; otherwise the next underspecified AF-based phoneme is compared, and so on. If no matching hypotheses are found, the interval is left empty.
In this example /n/ is found to match and the process moves to the next interval, 0.8 1.0 m b.
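The Stage 2 resolution can then be sketched as follows, taking the ranked AF-based phonemes (as in the right-hand table of Table 2) and the two disputed hypotheses; the data structures are assumptions made for illustration.

    def resolve_stage2(disputed_interval, ranked_af_phonemes):
        """Pick the first AF-based phoneme, in rank order, that matches either
        recogniser hypothesis; otherwise leave the interval empty."""
        start, end, hyp_a, hyp_b = disputed_interval
        for _, af_phoneme in sorted(ranked_af_phonemes):   # most specified first
            if af_phoneme in (hyp_a, hyp_b):
                return (start, end, af_phoneme)
        return None

    # e.g. resolve_stage2((0.7, 0.8, "m", "n"), [(1, "d"), (3, "z"), (3, "n"), (4, "ix")])
    # returns (0.7, 0.8, "n"), as in the worked example above.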
4 Speech Corpus and Recognition System
The implementation of this experiment was carried out using the hidden Markov model toolkit (HTK) [14] for two phoneme recognisers (monophone and triphone) and articulatory feature recognition. The TIMIT corpus, consisting of read speech, was the corpus of choice for this experiment [15].

4.1 Corpus
The language of the TIMIT corpus is American English and is spoken by 630 speakers which is divided into two sets, training (3696 utterances) and core-test
data (192 utterances). There is no overlap between the training and test set, and the dialectal variants, known as SA data, are not used in this paper. The 61 labels of this corpus are mapped to 39 for evaluation, similar to [16]. The TIMIT transcription to IPA format is as described by [17]. This information is used to find the corresponding features for a phoneme as defined by the IPA. The start and end silences are not included in evaluation as they are always detected accurately.

4.2 Recognition System
Each of the utterances is parameterised using MFCCs, with their associated log energy and first and second order regression coefficients, hence every frame is represented by 39 coefficients. Each speech waveform is first passed through a pre-emphasis filter. The waveform is then framed at a rate of 10 ms with a frame size of 25 ms. Each frame is also windowed using a Hamming window function. The MFCCs representing the utterances are then used in the calculation of the HMM models. The monophone HMMs are context-independent models whereas the triphones are context-dependent models that are initially calculated by cloning and re-estimating the monophone models. The triphone states were tied using basic phonetic feature decision trees for clustering. The AFs are trained using the context-dependent method (hence tri-AFs) and were initially calculated by cloning and re-estimating their context-independent counterpart (hence mono-AFs). All model types are comprised of 5 states where only the centre 3 states are emitting. The decoding process is implemented with a bigram language model for monophones and a trigram model for triphones and tri-AFs. This process is executed without any pruning. Finally, the number of components in each mixture is 8 for triphones, 16 for monophones and 20 for AFs.
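The framing step described above can be sketched in a few lines of Python; the pre-emphasis coefficient of 0.97 is an assumed, commonly used value (the paper does not state it), and the subsequent filterbank/MFCC and regression-coefficient computations are omitted.

    import numpy as np

    def frame_signal(signal, sample_rate, frame_ms=25, step_ms=10, preemph=0.97):
        """Pre-emphasis, 25 ms frames at a 10 ms rate, Hamming-windowed.
        signal: 1-D NumPy array of samples."""
        x = np.append(signal[0], signal[1:] - preemph * signal[:-1])
        frame_len = int(sample_rate * frame_ms / 1000)
        step = int(sample_rate * step_ms / 1000)
        window = np.hamming(frame_len)
        frames = [x[s:s + frame_len] * window
                  for s in range(0, max(len(x) - frame_len + 1, 0), step)]
        return np.array(frames)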
5 Results and Discussion
Table 3 depicts the accuracies of the monophone and triphone recognisers after Stage 1, namely 68.02% and 68.33%. The final accuracy after Stage 2 has increased to 69.34%, where this final result is 1.01% greater than the highest scoring monophone or triphone recogniser. It is important to note, however, that after Stage 1 the two recognisers agree in their hypotheses 61.4% of the time. Stage 2 thus contributes a real increase in overall accuracy of 7.94%, of which 5.12% were a result of fully specified AFs and 2.82% were a result of underspecified AFs. The experiments demonstrate the benefits of using AFs and phonemes in tandem during recognition and not in isolation, combining the linear and multilinear approaches. While the final result produced an overall increase in accuracy of 1.01%, future work envisages adding an extra stage using the AF process to output a best choice hypothesis from Stage 2, when no agreement can be found based on the initial two phoneme recognisers. Currently the underspecified AF process can independently map directly to phones with 61.44% accuracy by inferring the best feature-to-phone mapping where only one phone hypothesis
Table 3. Table of results
                          Accuracy (%)
Stage 1 Recogniser #1     68.02
Stage 1 Recogniser #2     68.33
Stage 2                   69.34
is returned. However, there is an additional 17% that cannot be mapped to a phone as the best feature-to-phone mapping allows multiple phone hypotheses for the same interval to be inferred. Utilising these hypotheses within the aforementioned extra stage is also part of future work.
6 Conclusion
This paper offers a novel approach to the task of phoneme recognition which combines multiple source phoneme recognition aided by multi-tiered articulatory features using the notion of underspecification. The experiment shows that the recognition accuracy can be increased when there are multiple sources of recognition on the same speech utterance and articulatory features are used to resolve disagreements. The result of the experiment was found to be better than the top scoring phoneme recogniser by 1.01%. The underspecification of phonemes as described in Sections 2 and 3 aids this experiment in deciding between two hypotheses where it could not do so previously. Future work involves inferring hypotheses from articulatory features for deletions and insertions (phonemes that were not recognised and phonemes that were falsely recognised).
References 1. Ostendorf, M.: Moving Beyond the ‘Beads-On-A-String’ Model of Speech. In: Proc. IEEE ASRU Workshop (1999) 2. Jakobson, R., Fant, G.M.C., Halle, M.: Preliminaries to Speech Analysis: The Distinctive Features and their Correlates. MIT Press, Cambridge (1952) 3. Chomsky, N., Halle, M.: The Sound Pattern of English. MIT Press, Cambridge (1968) 4. Deng, L., Sun, D.X.: A statistical approach to automatic speech recognition using the atomic speech units constructed from overlapping articulatory features. J. Acoust. Soc. Am. 95(5), 2702–2719 (1994) 5. Carson-Berndsen, J.: Time Map Phonology: Finite State Models and Event Logics in Speech Recognition. Kluwer Academic Publishers, Dordrecht (1998) 6. Chang, S., Wester, M., Greenberg, S.: An elitist approach to automatic articulatory-acoustic feature classification for phonetic characterization of spoken language. Speech Communication 47(3), 290–311 (2005) 7. King, S., Taylor, P.: Detection of phonological features in continuous speech using neural networks. Computer Speech & Language 14(4), 333–353 (2000) 8. Fiscus, J.G.: A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER). In: Proceedings IEEE Workshop on Automatic Speech Recognition and Understanding (1997)
9. Scanlon, P., Ellis, D., Reilly, R.: Using broad phonetic group experts for improved speech recognition. IEEE Transactions on Audio, Speech and Language Processing 15(3), 803–812 (2007) 10. Kane, M., Mauclair, J., Carson-Berndsen, J.: Automatic identification of phonetic similarity based on underspecifcation. In: Proceedings of Springer-Verlag Lecture Notes in Artificial Intelligence, LNAI (2011) 11. The-International-Phonetic-Alphabet (2005), http://www.langsci.ucl.ac.uk/ipa/ 12. Aioanei, D., Neugebauer, M., Carson-Berndsen, J.: Efficient Phonetic Interpretation of Multilinear Feature Representations for Speech Recognition Language Technology Conference (2005) 13. Bates, R.A., Ostendorf, M., Wright, R.A.: Symbolic phonetic features for modeling of pronunciation variation. Speech Communication 49, 83–97 (2007) 14. Young, S., Evermann, G., Gales, M., Hain, T., Kershaw, D., Liu, X., Moore, G., Odell, J., Ollason, D., Povey, D., Valtchev, V., Woodland, P.: Hidden markov model toolkit (htk), Version 3.4.1 (2009), http://htk.eng.cam.ac.uk/ 15. Garofolo, J., Lamel, L., Fisher, W., Fiscus, J., Pallett, D., Dahlgren, N.: The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CDROM (1993) 16. Lee, K.F., Hon, H.W.: Speaker-Independent Phone Recognition Using Hidden Markov Models. IEEE Transactions on Acoustics, Speech, and Signal Processing 37(11) (1989) 17. Robinson, T., Fallside, F.: A recurrent error propagation network speech recognition system. Computer Speech & Language 5(3), 259–274 (1991)
Plan Recommendation for Well Engineering
Richard Thomson 2, Stewart Massie 1, Susan Craw 1, Hatem Ahriz 1, and Ian Mills 2
1 IDEAS Research Institute, Robert Gordon University, Aberdeen, UK
{s.massie,s.craw,h.ahriz}@rgu.ac.uk
2 Exceed Ltd, 12 Bon Accord Square, Aberdeen, UK
Abstract. Good project planning provides the basis for successful offshore well drilling projects. In this domain, planning occurs in two phases: an onshore phase develops a project plan; and an offshore phase implements the plan and tracks progress. The Performance Tracker applies a case-based reasoning approach to support the reuse of project plans. Cases comprise problem parts that store project initiation data, and solution parts that record the tasks and subtasks of actual plans. An initial evaluation shows that nearest neighbour retrieval identifies projects in which the retrieved tasks and subtasks are relevant for the new project. The Performance Tracker can be viewed as a recommender system in which recommendations are plans. Thus the data that is routinely captured as part of the performance tracking during offshore implementation is utilised as experiences. Keywords: Case-Based Reasoning, Recommender Systems.
1 Introduction
The oil and gas industry is one of the world's largest industries and is estimated to be worth $10400 trillion, based on current discovered oil reserves and the average price of oil. In the oil and gas industry, wells are holes produced by boring for the purpose of finding and producing hydrocarbons. Wells have various categorisations and are constructed either to obtain geological data prior to drilling (exploration well, test well and appraisal well), or to research possible oil fields (wildcat well), or to extract the raw materials from the ground (oil well, gas well, production well, aquifer producers and gas injectors). This paper considers the reuse of project plans for subsea drilling. Offshore wells are constructed using rigs with various types of equipment used for drilling, casing the hole and extraction. The process of drilling a well can be split into 5 segments: planning where the tasks and subtasks required to construct the well are identified; boring the hole to reach the reservoir; preparing the hole for the extraction of the hydrocarbons by casing the hole with cement; extracting and refining the hydrocarbons; and lastly plugging the well when the reservoir is empty or the reservoir has stopped producing enough hydrocarbons to be seen as a viable use of resources.
Due to the contractual nature of employment within the oil industry, knowledge retention can be challenging and hiring individuals with the necessary experience is expensive; as a result, the cost of retaining corporate memory is high. The capture and reuse of knowledge using a centralised system can help reduce this cost. The rest of the paper is organised as follows. Section 2 explains the process of planning and monitoring a subsea well construction project. Section 3 describes how the Performance Tracker will use the data produced during the planning process to identify similarities between projects. The architecture of the Performance Tracker is described in Section 4. In Section 5 the effectiveness and performance of the approach is investigated. Related work is discussed in Section 6. Lastly, Section 7 reflects on the work done, highlighting our conclusions and planned future developments.
2 Problem Domain
Well construction projects often share some characteristics with related previous projects and, as a result, follow the same drilling process. It is therefore possible to reuse project plans of related projects in order to save time in the planning process. The planning process is split into two stages.

Onshore Planning Stage: The project plan is developed by the onshore team. At this stage a new well has been identified and approval for drilling has been received. A new project plan is developed, after which the plan is scrutinised during a "Drill Well on Paper" exercise where the type of rig is chosen, the project budget is calculated and potential causes of non-productive time are identified. The final project plan is then created, comprising the list of tasks and their associated subtasks, containing target times. This plan is then ready to be used during the second phase of the process.

Offshore Monitoring Stage: The plan is implemented by the offshore team where data is monitored and recorded. The project plan changes its primary function from a planning tool to a monitoring tool. The offshore project manager will input the operational drilling data. Where there is a discrepancy between the planned task times and the actual task times, it is classified as either invisible lost time, which indicates inefficiency within the well drilling operation, or non-productive time, which is time spent rectifying unforeseen problems during the operation, e.g. tool failures. The project plan is then refined taking into account any lost time encountered. Once the project has been completed, the project plan is used to evaluate the project before being retained by the company.
3 Performance Tracker
During the onshore process it is common practice for the onshore team to employ the time consuming process of manually retrieving old project plans and tailoring
these to suit the new project. Currently it is up to the onshore team members to recall previous project plans, based on their own past experiences. This can be problematic as individuals may forget about potentially suitable projects or may be unable to obtain plans located on a local machine. The project team may also have an incomplete understanding of the project, resulting in a poor plan selection. Furthermore, an inexperienced team may lack the past knowledge to effectively reuse project plans. The Performance Tracker addresses these issues by supporting the user during the onshore process with a CBR recommendation function for the retrieval and reuse of past project plans from a central project plan repository. This approach uses CBR both to reduce the time required for selecting suitable project plans and to provide a more informed set of potentially relevant plans.

3.1 Case Representation
A project case c is made up of a problem part p and a solution part s. Problem p contains the feature values of the project initiation data that will be used to identify the similarities between cases. The project initiation data stores the core project information required to begin the planning stage, consisting of a project description, the geographical location of the well and the planned drilling depth. Problem p is made up of five base features: a textual description of the project; the selected rig; the well to be drilled; the estimated drilling depth in feet; and the estimated duration in days. The rig and well are represented by a set of sub-features combining to give the overall representation in Figure 1.

Fig. 1. Project problem structure: p = {projectDescription, rig (type, derricksType, maxStorageCapacity, maxDrillingDepth), well (type, utmZone, longitude, latitude), depth, duration}
Solution s contains the tasks and subtasks required to complete the project (Figure 2). Each task is decomposed into a number of smaller subtasks that are used to plan and monitor the task in greater detail. The subtasks also allow any unproductive lost time to be captured and classified, which provides useful knowledge for the refinement of future plans. A sample case is shown in Figure 3, where the left-hand box contains the project problem features and the right-hand box contains a set of tasks making up the project solution.
Fig. 2. Project solution structure: s = tasks (taskNo, taskName, depth, subtasks (subtaskNo, subtaskName, duration, risk, resource))
Fig. 3. Example of Project Initiation p and Solution s.
Problem p: description: New BP Project Nigeria; rig-type: Jack-up; rig-maxStorageCapacity: 34000; rig-maxDrillingDepth: 5400ft; well-type: Oil Producer; well-longitude: 1; well-latitude: 79; well-utmZone: 34; depth: 3000ft; duration: 64 days.
Solution s (excerpt): taskNo: 1; taskName: Rig Prep.; depth: 2087.0ft; subtaskNo: 1.1; subtaskName: Transit to rig; duration: 7.0 hours; risk: Bad Weather; resource: Mud mat, dat tool; subtaskNo: 1.2 ...
3.2
Similarity
Cases are selected by using k-Nearest Neighbour in which the case similarity score is determined by using a weighted feature average and the closest k cases are then recommended to the user.
Local Similarity. The problem part of the case representation consists of numeric (depth, duration, longitude, latitude), symbolic (rigType, wellType) and textual features (projectDescription). The process for calculating the similarity for each feature type will now be outlined.
– Numeric Similarity: Similarity between numeric features is derived using Normalised Manhattan distance:

sim(qv, cv) = 1 − |cv − qv| / r
where qv is the numerical feature value of the query, cv is the numerical feature value of the case and r is the predefined feature range; e.g. the latitude feature will have values ranging from -90 to 90, hence range r will be 180.
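A minimal sketch of this local similarity in Python (the function name and the example values are illustrative, not taken from the Performance Tracker code):

```python
def numeric_similarity(qv: float, cv: float, r: float) -> float:
    """Normalised Manhattan similarity between a query value qv and a case
    value cv, where r is the predefined range of the feature."""
    return 1.0 - abs(cv - qv) / r

# e.g. two latitudes 10 degrees apart, feature range r = 180
print(numeric_similarity(qv=12.0, cv=2.0, r=180.0))  # ~0.944
```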
– Symbolic Similarity: An oil rig may be suitable for various project types, so a Boolean similarity of 1 for a match and 0 for no match is not suitable. A similarity matrix is instead used to determine the similarity between different rigType and wellType values. The rigType matrix in Table 1 was developed in conjunction with a domain expert and shows the similarities between different rig types. This matrix serves as a look-up table to provide the similarity between rigType feature values. A similar process is applied to identify the similarity between wellType values.

Table 1. rigType similarity matrix

            Jackup   Semi Sub   Platform   Drillship
Jackup      1        0.8        0.7        0.6
Semi Sub    0.8      1          0.9        0.8
Platform    0.7      0.9        1          0.5
Drillship   0.6      0.8        0.5        1
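A sketch of how such a matrix can serve as a look-up table (the dictionary below simply transcribes Table 1; the function name is illustrative):

```python
RIG_TYPE_SIM = {  # symmetric similarity values from Table 1
    ("Jackup", "Jackup"): 1.0, ("Jackup", "Semi Sub"): 0.8,
    ("Jackup", "Platform"): 0.7, ("Jackup", "Drillship"): 0.6,
    ("Semi Sub", "Semi Sub"): 1.0, ("Semi Sub", "Platform"): 0.9,
    ("Semi Sub", "Drillship"): 0.8,
    ("Platform", "Platform"): 1.0, ("Platform", "Drillship"): 0.5,
    ("Drillship", "Drillship"): 1.0,
}

def rig_type_similarity(qv: str, cv: str) -> float:
    """Expert-defined similarity between two rigType values (0.0 if unknown)."""
    return RIG_TYPE_SIM.get((qv, cv), RIG_TYPE_SIM.get((cv, qv), 0.0))

print(rig_type_similarity("Platform", "Semi Sub"))  # 0.9
```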
– Text Similarity: The projectDescription gives an overview of which company the project is for and any special conditions, such as "test deep water project" or "HSE recovery". The completed projectDescription is treated as a bag of words, as the presence of a word is of more importance than word position. The Jaccard coefficient, which assesses the overlap of words within the two word sets, is used. projectDescription similarity is defined as:

sim(qv, cv) = |cv ∩ qv| / |cv ∪ qv|
Due to the industry-specific nature of the terms used within documents and the more personalised approach to shorthand (for example CC, circ and ccution are all used in place of circulation), a generic lexicon such as WordNet would prove unsuitable. For this reason a domain-specific lexicon has been created to analyse the similarity and meaning behind these industry-specific terms.
Global Similarity. The global similarity GSim of a case is assessed by calculating the weighted average of the local feature similarities. The feature weights are set to give "more important" features greater influence. The global similarity function is shown below, where wi is the weight of the ith feature and simi(q, c) is the local similarity of the ith feature of the query q and the case c:

GSim(q, c) = ( Σ_{i=0..n} wi · simi(q, c) ) / ( Σ_{i=0..n} wi )

The individual feature weights were set after consultation with domain experts and are shown in Figure 4. The features of the well and the rig are given higher importance. The depth will help determine the suitability of a rig for the project, as a rig will have a maximum depth that it can drill.
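A compact sketch of the text similarity and the weighted global similarity (the feature names and weights follow Fig. 4; the helper functions are illustrative — the actual system delegates similarity computation to the JColibri2 modules):

```python
def jaccard_similarity(q_text: str, c_text: str) -> float:
    """Bag-of-words Jaccard coefficient between two project descriptions."""
    q_words, c_words = set(q_text.lower().split()), set(c_text.lower().split())
    if not q_words and not c_words:
        return 1.0
    return len(q_words & c_words) / len(q_words | c_words)

def global_similarity(local_sims: dict, weights: dict) -> float:
    """GSim: weighted average of the local feature similarities."""
    return sum(weights[f] * local_sims[f] for f in weights) / sum(weights.values())

weights = {"well": 1.0, "rig": 0.9, "depth": 0.85, "duration": 0.6, "description": 0.5}
local_sims = {"well": 0.9, "rig": 0.8, "depth": 0.95, "duration": 0.7,
              "description": jaccard_similarity("test deep water project",
                                                "deep water oil project")}
print(round(global_similarity(local_sims, weights), 3))
```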
Feature        Weight   Sub-feature weights
well           1        longitude 0.3, latitude 0.3, utmZone 0.2, type 0.2
rig            0.9      derricksType 0.25, maxStorageCapacity 0.25, maxDrillingDepth 0.25, type 0.25
depth          0.85
duration       0.6
description    0.5

Fig. 4. Feature weights applied
4
Architecture
The Performance Tracker architecture shown in Figure 5 consists of two main components, the onshore project plan recommendation tool and the offshore monitoring tool, both of which reside within a central server. The onshore project plan recommendation tool uses the JColibri2 framework in order to apply the modelling rules to the cases [1]. JColibri2 proved to be a suitable framework because it contains modules both for pre-processing text and for calculating the similarity of the three feature types used by the Performance Tracker. New project initiation data is sent from the web portal to the Case Modeller, where modelling rules are applied prior to the connection with the case base. The Similarity Assessor applies the similarity metrics to identify the k most similar cases in the case base. The Solution Extraction Module then takes the solution part of the k cases and displays these as recommendations in the Web Portal. Once the user has adapted the project plan of the selected case, it is stored as a new project plan within the database. This ensures that the project plans in the database can be refined during the offshore monitoring process, where task times are constantly being adapted to correspond with the live project data. In the initial design the case base was built from the database for each query; however, retrieval times were unacceptably slow. In order to alleviate this problem the case base is now stored in server memory, providing quick access for the
Fig. 5. Performance tracker architecture
onshore plan recommendation tool when a query is made. In order to ensure that both new projects created by the onshore process and refined projects generated by the offshore monitoring are consistent, a weekly synchronisation process was developed, adding any newly created projects as new cases whilst updating current cases. The project case base stored in memory is used as the data source for case similarity matching and case retrieval.
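The in-memory case base and its periodic synchronisation can be pictured roughly as follows (all class and method names are hypothetical; the real tool is built on the JColibri2 framework and a central project-plan database):

```python
class InMemoryCaseBase:
    """Keeps project cases in server memory so retrieval avoids rebuilding the
    case base from the database on every query (hypothetical sketch)."""

    def __init__(self, load_all_from_db):
        self._load_all_from_db = load_all_from_db   # callable returning {case_id: case}
        self.cases = load_all_from_db()             # built once at start-up

    def synchronise(self):
        """Weekly job: add newly created projects and refresh refined ones."""
        self.cases.update(self._load_all_from_db())

    def retrieve(self, query, similarity, k=5):
        """Return the k cases most similar to the query."""
        return sorted(self.cases.values(),
                      key=lambda case: similarity(query, case), reverse=True)[:k]
```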
5
Evaluation
Testing is split into two parts. The first set of experiments investigates the quality of recommendations and describes the results of three recommendation tasks made using the Performance Tracker. The second set of experiments measures efficiency of the Performance Tracker in terms of retrieval time. The case base used for testing comprises 200 cases for 200 wells, drilled by 87 rigs in 9 locations. The drilling depths range from 958 feet to 23,060 feet. 5.1
Recommendation Quality
Three cases are extracted from the case base in turn, providing the project initiation data for three queries and leaving a case base of 199 cases. After each query a comparison of the problem and solution similarities is made between the query and the top 5 recommended cases. The quality of the recommended solutions is measured by the overlap between the tasks of each retrieved case and those of the query.
Query 1: The first query was based on an oil well construction project off the Nigerian coast. Table 2 shows that the 5 most similar cases retrieved were all based on the same well type. All of the project plans returned were very similar to the query plan, as shown by the high overlap values, but would require adaptations to the depth and task times in order to be suitable. The top-ranked case contained an almost identical project plan to the extracted case, with only one difference: the retrieved project had been refined to include a new task addressing a lost-time issue.

Table 2. Results for query 1

Rank   Well Type   Initiation Data Sim (%)   Plan Overlap (%)
1      Oil Well    83                        97
2      Oil Well    81                        84
3      Oil Well    81                        80
4      Oil Well    78                        81
5      Oil Well    74                        80
Query 2: The second query contained incomplete project initiation data, which is common in this domain, for an oil well project. The projectDescription and well-type attributes had missing data and, as a result, the similarity of the retrieved cases was low. The overlap of the retrieved plans fluctuated greatly,
however 3 of the recommendations have an overlap of over 75%. It is evident from Table 3 that the type of well being constructed greatly impacts the similarity value. Although the 2nd most similar retrieved case had a lower project initiation similarity value, the project problem was very similar in respect of the depth, well and duration. The low ranking was due to the dissimilarity between the rig and projectDescription attributes. When a projectDescription was added to the problem, the second-ranked case was promoted to the most similar case.

Table 3. Results for query 2

Rank   Well Type   Initiation Data Sim (%)   Plan Overlap (%)
1      Oil Well    71                        79
2      Oil Well    63                        81
3      Wildcat     63                        44
4      Gas Well    62                        77
5      Test Well   53                        39
Query 3: The third query used initiation data for the JADA oil field test well, containing a sizeable projectDescription with the phrases "test well", "JADA field" and "oil and gas producing well", which provides a larger vocabulary for evaluating similarity within the text matching module. Table 4 again illustrates that the type of well being constructed has a large impact on the project plan overlap. The top-ranked case used the same rig to drill an oil well that was closely located to the query well, and this resulted in a high similarity of project initiation data. However, tasks only overlapped during the rig set-up and abandonment stages. The case ranked 2nd contained the project plan for the test well of an oil field adjacent to the JADA field; it is therefore not surprising that this case had a higher plan overlap value than the top-ranked case.

Table 4. Results for query 3

Rank   Well Type   Initiation Data Sim (%)   Plan Overlap (%)
1      Oil Well    87                        34
2      Test Well   85                        94
3      Oil Well    81                        34
4      Oil Well    80                        37
5      Test Well   74                        86

5.2
Retrieval Speed
In this experiment we compare the Performance Tracker using weekly synchronisation, as described in Section 4, with a Baseline system that performs a retrieval based on the standard JColibri2 cycle with the case base being built for each query from the database. Twenty identical queries were run on the synchronised and baseline systems to see the effect that the synchronisation process has on the retrieval times for a query.
The baseline system had an average retrieval time of 1 minute 34 seconds with times ranging between 1 minute 27 seconds and 1 minute 45 seconds. The synchronised system had an average retrieval time of 5.7 seconds. The fastest retrieval time was 2.15 seconds and the slowest retrieval time was 1 minute 34 seconds. The slowest retrieval time was the first retrieval as the case base had to initially be built from the database. Subsequent retrievals did not require the case base to be rebuilt and the retrieval times were much shorter. This experiment confirms that the design architecture adopted greatly improves retrieval times while the synchronisation process ensures the availability of revisions made during implementation.
6
Related Work
Case-based planning is not a new concept, with applications ranging from holiday planners [2] to planning for logistics as used in CaPER [3]. However, the Performance Tracker has more in common with CBR recommender systems and applies the single-shot, proposal type of recommendation described by Smyth [5]. This approach uses a specific user problem and, based on the user criteria, a set of cases is returned. Other similar systems, such as Cobot [6], use a natural language conversational approach to extend a query by asking questions until conditions are met to provide a recommendation. The process of obtaining the user criteria is different but the end result of recommendation is the same. Research has shown that a CBR methodology can be used effectively for reusing past experiences within a drilling environment, particularly within the context of lost time reduction. Skalle et al. [7] identified the usefulness of CBR when reducing lost time by analysing one problem area, stuck drill strings. Although the research focused on one area, and primarily on research, it is very important as $250 million per annum is wasted on this form of downtime alone. DrillEdge [4] uses CBR to identify possible reasons for a lost time problem during the offshore monitoring stage of a project, and then advises users on how the project could be refined to solve the problem. DrillEdge builds on the studies carried out for the Creek and TrollCreek knowledge-intensive CBR frameworks [8]. TrollCreek was developed to identify lost time based on data from the "Drilling Club" [9,10].
7
Conclusions
We have presented an approach to apply similarity matching to offshore well drilling project data to effectively recommend past project plans. We show that, by choosing appropriate project initiation features and feature weightings, it is possible to retrieve a set of suitable project plans that can be manually adapted for a new well construction project. Furthermore, the process of project plan refinement during the offshore monitoring stage has demonstrated that the project plan is constantly being updated and becoming a more robust solution, throughout the life of a project.
We also show that by storing the case base within server memory, rather than constantly extracting data from the database, retrieval times can be reduced. The weekly synchronisation with the database will enable the case base to remain up to date as more refinements and projects are added to the database. Drawbacks arose when a user was required to manually adapt a small number of task depths of retrieved project plans. To limit the amount of adaptation required, a sea bed depth feature and a true vertical depth feature have been added. It was also shown that the type of well being drilled had a large impact on the suitability of a project plan, and this may be emphasised in retrieval.
Acknowledgments. This partnership received financial support from the Knowledge Transfer Partnerships programme, which is funded by the Technology Strategy Board along with other UK government funding organisations.
References
1. jColibri CBR Framework, http://gaia.fdi.ucm.es/projects/jcolibri/
2. Stewart, S., Vogt, C.: A Case-Based Approach to Understanding Vacation Planning. Leisure Sciences 21(2), 79–95 (1999)
3. High Performance Case-Based Planning, www.cs.umd.edu/projects/plus/Caper/
4. DrillEdge, www.verdandetechnology.com/products-a-services.html
5. Smyth, B.: Case-based recommendation. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.) Adaptive Web 2007. LNCS, vol. 4321, pp. 342–376. Springer, Heidelberg (2007)
6. Sahay, S., Ram, A.: Conversational Framework for Web Search and Recommendations. In: ICCBR Workshop Proceedings, pp. 161–170. Springer, Heidelberg (2010)
7. Skalle, P., Aamodt, A., Sveen, J.: Case-Based Reasoning, a method for gaining experience and giving advice on how to avoid and how to free stuck drill strings. In: IADC Middle East Drilling Conference, Dubai (1998)
8. Aamodt, A.: Knowledge-Intensive Case-Based Reasoning in CREEK. In: 6th Int. Conference on Case-Based Reasoning, Workshop Proceedings, pp. 62–71 (2005)
9. Islam, A.M., Skalle, P.: Review of wellbore instability cases in drilling through case based reasoning (CBR) method. In: Proceedings of the International Conference on Mechanical Engineering (2007)
10. Skalle, P., Aamodt, A.: Knowledge-Based decision support in oil well drilling. In: Intelligent Information Processing II, IFIP International Conference on Intelligent Information Processing, vol. 163, pp. 443–455 (2005)
Lung Cancer Detection Using Labeled Sputum Sample: Multi Spectrum Approach
K. Kancherla and S. Mukkamala
Institute for Complex Additive Systems and Analysis (ICASA), Computational Analysis and Network Enterprise Solutions (CAaNES), New Mexico Institute of Mining and Technology, Socorro, New Mexico 87801, U.S.A.
{kancherla,srinivas}@cs.nmt.edu
Abstract. In this paper we demonstrate the use of multi-spectrum imaging and machine learning techniques for automated detection of lung cancer. The sputum samples from patients are first stained using Tetrakis Carboxy Phenyl Porphine (TCPP). TCPP is a porphyrin molecular marker which binds to cancerous and pre-cancerous cells, causing cancer cells to glow red under a fluorescence microscope. After the sputum samples are stained, images are taken at multiple wavelengths, 650nm and 660nm. We extracted four different sets of features (shape based, intensity based, wavelet based and Gabor filter based features). Both the wavelet based and the Gabor based features capture the texture properties of a cell. Using these features we built different machine learning models. We obtained an accuracy of 96% using the initial set of 35 features (shape based, intensity based and wavelet based features). After adding the Gabor based features to this initial set, we obtained an accuracy of about 98%. Our experiments show the potential of using the TCPP stain, machine learning techniques and multi-spectrum imaging for early detection of lung cancer.
Keywords: Lung Cancer, Multi spectrum, Machine Learning, Image processing, Wavelet Transform, Gabor transform.
1 Introduction
Lung cancer is the leading cancer killer among both men and women. Based on statistics from the American Cancer Society, it is believed that there are 220,000 new cases per year, about 160,000 deaths per year, and a 5-year survival rate for all stages of 15% [1]. Various factors influence the 5-year survival rate: the stage of the cancer, the type of cancer, and other factors such as symptoms and general health. Early detection of lung cancer is the leading factor in improving the survival rate. However, the symptoms of lung cancer often do not appear until the cancer spreads to other areas, leading to only a 24% chance of detecting lung cancer in its early stages [3]. Accurate early detection of lung cancer is therefore needed to increase the survival rate. Various methods such as Computed Tomography (CT) scans, chest radiography, sputum analysis and microarray data analysis are used for lung cancer detection [5]. Mass screening by Computed Tomography (CT) scan of the chest is a promising method for
lung cancer detection. However, this method is not recommended because of its cost, and its long-term safety has not been established due to the risk of exposure to radiation [7]. The use of microarray data for cancer detection is investigated in [9]; however, the use of microarray data is a costly approach. In this paper we investigate the use of Tetrakis Carboxy Phenyl Porphine (TCPP) as an alternative approach for early detection of lung cancer. The use of machine learning for cancer detection is investigated in [8]. Machine learning techniques such as Artificial Neural Networks (ANN) and Decision Trees (DT) have been used for cancer detection for nearly 20 years [10, 11, 12]. The potential of using machine learning methods for detecting cancer cells or tumors via X-rays and Computed Tomography (CT) is shown in [13, 14]. Machine learning methods used for tumor classification or cancer detection using microarray or gene expression data include Fisher Linear Discriminant analysis [15], K-Nearest Neighbor (KNN) [16], Support Vector Machines (SVM) [17], boosting, Self-Organizing Maps (SOM) [18], hierarchical clustering [19], and graph theoretic approaches [20]. In this paper we investigate the use of various machine learning methods for cancer detection. We obtain multi-spectrum images of TCPP-stained sputum samples and extract individual cell segments from these images. For our initial set of features we extracted intensity based features, shape based features and wavelet based features. Later we added Gabor based features to our initial set. The Gabor filter is widely used in image segmentation and computer vision, and Gabor based features capture the texture properties of cells, which is important for cancer detection. The rest of the paper is organized as follows: section 2 describes sample collection, culture cell line extraction and the TCPP staining procedure; section 3 covers the image processing steps involved, such as image acquisition, image segmentation and feature extraction; in section 4 we discuss the dataset and the results obtained using various machine learning techniques; in section 5 we give our conclusions and explain our future work.
2 Sample Collection The central hypothesis of our experiments is that TCPP labeled sputum specimens can assist in detection of lung cancer. To test this hypothesis, sputum specimens from various subject cohorts were examined with the Biomoda CyPath® Early Lung Cancer Detection Assay. The long-term goal is to establish the Biomoda CyPath® Early Lung Cancer Detection Assay along with machine learning techniques as an effective program for screening and early detection of lung cancer, with a resultant decrease in lung cancer mortality, and long term monitoring of patients undergoing therapy. A diagnostic and screening tool for lung cancer is important considering that early detection increases survivability. This tool can also be used to determine the efficacy of treatment protocols by monitoring the development of disease. Deep lung samples are collected following “triple morning cough procedure”. After creating a monolayer of cells on slide, the slide is immersed in TCPP labeling solution which binds to the plasma membrane and coats the cancer cell surface. This coating provides a distinction between cancer and non cancer cells.
The samples are acquired from two different sources; the first set of samples consists of cancer cell lines from the American Type Culture Collection (ATCC) and the second set consists of normal patient samples from Cureline Inc. The cancer cell lines used in our experiments have been cultured and maintained by ATCC, and the codes for the cultured samples that we worked with are HTB-171, HTB-182, CRL-5889, CRL-5810 and CCL-171. At the time of collection all samples were from at least 50-year-old male Caucasians, with some of them having a cigarette smoking history of 50 pack years. Each sample is either a different form of lung cancer or a different stage of the disease: HTB-182 is a squamous cell carcinoma, CRL-5889 a stage 1 adenocarcinoma (non-small cell lung cancer), CRL-5810 a stage 2 adenocarcinoma and HTB-171 a small cell lung carcinoma. All the samples were collected, cultured, and preserved following ATCC's standard protocol. Cureline Inc. provided the normal samples for our study. Participants were provided with a container filled with 30ml of Cytolyt fixative and instructed to collect sputum by Triple Morning Collection (sputum samples collected for 3 consecutive days). At the Biomoda laboratory, the samples were processed onto a microscope slide, which contained a monolayer of the sputum cells. After preparing the labeling reagents containing TCPP (Biomoda CyPath® Early Detection Lung Cancer Assay), the slide was immersed in the labeling solution, rinsed, air-dried and cover-slipped. The completed slide was viewed under an ultraviolet microscope utilizing a FITC filter.
3 Multi-spectrum Image Analysis
A multi-spectrum image [32] can be represented by a function that maps a two-dimensional point P(x, y) in the image plane to an m-dimensional vector (f1(x, y), f2(x, y), ..., fm(x, y)), where m is the number of spectral bands. A gray-scale or intensity image can be considered as a two-dimensional signal with intensity as the amplitude measured over the spatial domain, i.e. for a gray-scale image m is equal to 1. For color images such as RGB or YCrCb, m is equal to 3, with each function representing one band (Red, Green and Blue for RGB). In this work we used multi-spectrum images for detecting cancer cells. As the TCPP spectrum is centered at 655nm, most of the important information lies between 650nm and 660nm, so we use only the 650nm and 660nm wavelengths. Multi-spectrum and hyper-spectrum are related terms, except that in multi-spectrum imaging the image cube is taken at selected wavelengths using various sensors, whereas in hyper-spectrum imaging the image cube consists of continuous bands, usually using a single sensor. Hyper-spectrum and multi-spectrum imaging are widely used in applications such as agriculture, mineralogy and surveillance. In [27] the authors proposed the use of hyper-spectrum imaging to detect different grape varieties and for early detection of disease outbreaks. In [28] hyper-spectrum imaging is used for military surveillance, because data from multiple spectral bands results in a unique signature for each object. Image analysis consists of the following steps: 1. image acquisition, 2. image segmentation, 3. feature extraction.
a) Image Acquisition
In our experiments we acquired images using a fluorescence microscope. Images are taken from 550nm to 700nm, with a step size of 10nm. However, we performed analysis only on 650nm and 660nm, as the TCPP spectrum is centered at 655nm. Each TCPP labeled cell was photographed using the 20x objective under Ultra Violet (UV) light, with a FITC (Fluorescein isothiocyanate) filter. We used the Nuance FX system for acquiring multi-spectrum cubes. "The Nuance FX [30] multispectral imaging system is a fluorescent and brightfield microscopic imaging system, with an optimized high-throughput tunable filter that has been matched to the bandwidths of common chromogens and fluorophores. In addition, this system has a variable bandpass tunable filter which enables this system to be used for fluorescence-based imaging while optimizing for sensitivity or throughput. The spectral range of this system is 420nm-720nm."
b) Image Segmentation
The task of image segmentation is to partition an image into multiple non-overlapping segments. We first convert each image into a binary image, with '0' representing background and '1' representing foreground. To do this we used a simple threshold method, where a threshold T is chosen and any pixel with an intensity value greater than this threshold is considered 'object' (foreground) and any pixel with an intensity value less than or equal to this threshold is considered 'background':
g(x, y) = 1 if f(x, y) > T (object point);  g(x, y) = 0 if f(x, y) ≤ T (background point)    (1)

where T is the global threshold, g is the binary image and f is the input image.
We performed various experiments to find an optimal threshold and chose a threshold equal to 25% of the average intensity of the image. We remove the components that are at the boundary of the image, as they are partial cells. Due to noise in the image, there are holes or gaps left in the binary image. A hole is defined as a background region surrounded by a connected border of foreground pixels. We remove these holes to obtain full information about each cell. After this we use the concept of connectivity to find individual cells.
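A minimal sketch of this segmentation step with NumPy and SciPy (the 25% threshold and the cleanup follow the description above; the function name and the synthetic test image are illustrative):

```python
import numpy as np
from scipy import ndimage

def segment_cells(image: np.ndarray):
    """Threshold at 25% of the mean intensity, fill holes, drop components that
    touch the image border, and return one boolean mask per remaining cell."""
    binary = image > 0.25 * image.mean()
    binary = ndimage.binary_fill_holes(binary)   # remove holes left by noise
    labels, n = ndimage.label(binary)            # connected components
    cells = []
    for lab in range(1, n + 1):
        mask = labels == lab
        touches_border = (mask[0, :].any() or mask[-1, :].any()
                          or mask[:, 0].any() or mask[:, -1].any())
        if not touches_border:                   # partial cells are discarded
            cells.append(mask)
    return cells

print(len(segment_cells(np.random.rand(128, 128))), "candidate cells")
```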
c) Feature Extraction
After obtaining individual cells we extract 105 different features from each cell. The features can be divided into four types based on the properties they capture: the first set captures intensity based properties; the second set captures shape based properties; the third set captures the texture properties of the cell by applying the wavelet transform; and the fourth set also captures texture properties of the cell using the Gabor filter. In the experiments below the shape based, intensity based and wavelet based features are considered the initial set, and the initial set along with the Gabor features is the modified set.
• Gray-scale (intensity) based features: These are extracted directly from the intensity image. We extract the average intensity, minimum intensity, maximum intensity, variance, mode, skewness, kurtosis, and the number of pixels with maximum and minimum intensity.
• Shape based features: To capture the shape based properties we use three shape factors which are widely used in image analysis and microscopy.
─ Area of a segment is the number of pixels present in the cell.
─ Aspect ratio is the ratio of the smallest diameter to the largest diameter:

Aspect Ratio = dmin / dmax    (2)
where dmax is the largest diameter and dmin is the smallest diameter orthogonal to the largest diameter.
─ Circularity is defined by equation (3):

Circularity = 4πA / P²    (3)
where A is the area of the segment and P is the perimeter of the segment. A value of 1 means the object is circular.
• Wavelet based features: The wavelet transform is a powerful signal processing tool for analyzing signals. It overcomes the problems of the Short Time Fourier Transform (STFT) relating to time and frequency resolution: the DWT provides high time resolution and low frequency resolution for high frequencies, and high frequency resolution and low time resolution for low frequencies. The wavelet transform has excellent energy compaction and decorrelation properties, which can be used to effectively generate compact representations that exploit the structure of the data. Wavelets can capture both texture and shape information efficiently. The wavelet representation consists of a coarse overall approximation together with detail coefficients that influence the function at various scales. The continuous wavelet transform is given by formula (4):
X(τ, a) = (1/√a) ∫ x(t) ψ*((t − τ)/a) dt    (4)
where ψ is the wavelet function, given by formula (5):

ψτ,a(t) = (1/√a) ψ((t − τ)/a)    (5)
For two-dimensional signals the wavelet transform is performed using multiple filters. The discrete wavelet transform can be obtained by decomposing the signal simultaneously with a high-pass filter h and a low-pass filter g. The outputs are the detail coefficients
from the high-pass filter and the approximation coefficients (A) from the low-pass filter. The detail coefficients consist of horizontal (H), vertical (V) and diagonal (D) coefficients. This decomposition can be repeated to further increase the frequency resolution: depending on the level of decomposition, the approximation coefficients are decomposed again. Figure 1 shows a level 3 wavelet decomposition of the Lena image. In our experiments we applied a level 3 wavelet decomposition using the Daubechies wavelet 'db4'. After applying the wavelet transform we obtain one set of approximation coefficients and three sets of detail coefficients, and we extract the mean, variance, maximum and minimum values from each of these coefficient sets.
Fig. 1. Level 3 decomposition using Daubechies wavelet
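A sketch of the wavelet feature extraction with PyWavelets, assuming the statistics are taken from the deepest-level approximation and detail sub-bands (the exact set of sub-bands used in the study is not fully specified):

```python
import numpy as np
import pywt

def wavelet_features(cell: np.ndarray, wavelet: str = "db4", level: int = 3):
    """Level-3 2-D DWT of a grey-level cell image; mean, variance, max and min of
    the approximation and of the deepest-level H, V and D detail coefficients."""
    coeffs = pywt.wavedec2(cell, wavelet, level=level)
    approx, (horiz, vert, diag) = coeffs[0], coeffs[1]   # deepest decomposition level
    feats = []
    for band in (approx, horiz, vert, diag):
        feats.extend([band.mean(), band.var(), band.max(), band.min()])
    return np.array(feats)                                # 16 values

print(wavelet_features(np.random.rand(64, 64)).shape)
```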
• Gabor based features: The Gabor filter [31] is an excellent band-pass filter that can be defined as the convolution of an image with a Gabor function. If I(m, n) is an input image, then the Gabor transformed image is given by equation (6):

r(x, y) = ∬Ω I(m, n) g(x − m, y − n) dm dn    (6)
where Ω contains all pixels in the image and g(x, y) is the Gabor filter given by equation (7):

gλ,θ,∂(x, y) = exp( −(x′² + γ² y′²) / (2σ²) ) · cos( 2π x′/λ + ∂ )    (7)

where x′ = x cos θ + y sin θ, y′ = −x sin θ + y cos θ, σ = 0.56 λ and γ = 0.5.
The Gabor filter [31] acts as a local band-pass filter which captures localization properties in both the spatial domain and the frequency domain. Figure 2 shows an example of a Gabor filter and the power spectrum of the filter. Gabor filters are used in applications such as iris recognition, fingerprint recognition and gait recognition for segmentation, analysis and classification. Even though Gabor based features are useful in a wide variety of applications, they are often not used due to their computational complexity. However, for our application accuracy is of primary importance, so we used Gabor based features in our experiments. We used 5 scales and 7 orientations. The Gabor filter produces an image of size MxN for every orientation and scale, which gives 5*7 = 35 images. From each of these 35 images we extract statistical features, namely the mean and standard deviation.
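A sketch of the Gabor feature extraction built directly from equation (7) with σ = 0.56λ and γ = 0.5; the wavelength values, kernel size and the use of FFT-based convolution are illustrative choices, not the authors' exact settings:

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(lam, theta, psi=0.0, gamma=0.5, size=31):
    """Real Gabor kernel of equation (7), with sigma = 0.56 * lambda."""
    sigma = 0.56 * lam
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    return (np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2 * sigma ** 2))
            * np.cos(2 * np.pi * x_r / lam + psi))

def gabor_features(cell, lambdas=(4, 6, 8, 12, 16), n_orientations=7):
    """Mean and standard deviation of the filter response for each of the
    5 scales x 7 orientations (70 features per cell)."""
    feats = []
    for lam in lambdas:
        for k in range(n_orientations):
            response = fftconvolve(cell, gabor_kernel(lam, k * np.pi / n_orientations),
                                   mode="same")
            feats.extend([response.mean(), response.std()])
    return np.array(feats)

print(gabor_features(np.random.rand(64, 64)).shape)  # (70,)
```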
Fig. 2(a). 2-D Gabor filter
Fig. 2(b). power spectrum of Gabor filter
4 Experiments and Results
Once the sputum samples are collected and the slides are prepared, images are acquired and we perform the image analysis explained in the previous section. For each cell we extract the 105 features described in the feature extraction section. We perform the analysis on both 650nm and 660nm simultaneously. The final dataset consists of 1008 cancer cells and 191 non-cancer cells. From these, 700 cancer cells and 130 non-cancer cells are used for training and the remaining dataset is used for testing. We performed two sets of experiments: in the first experiment we used
35 features (intensity, shape and wavelet based features); in the second experiment we added the Gabor based features to the initial set. The 35-feature set is called the initial feature set and the 105-feature set is called the modified feature set. The results obtained using various machine learning techniques are given in Table 1.

Table 1. Accuracy obtained using initial and modified features

ML Technique                      650nm, modified   650nm, initial   660nm, modified   660nm, initial
SVM                               96.56             93               95.45             94.5
RBF Network                       96.06             94.85            96.36             94.5
Naïve Bayesian                    96.06             95.09            95.3              95.34
Multinomial Logistic Model        90.3              91.42            90.7              91.7
Sequential Minimal Optimization   95.81             96.1             93.85             94.5
Linear Logistic Model             90.9              93.62            93.2              94.6
K Nearest Neighbor (KNN)          96.3              95.34            96.3              94.6
Multilayer Perceptron             95.57             92.4             93.6              90.6
Random Forest                     97.29             96.32            97.78             95.3
We obtained an accuracy of 97.29% for 650nm and 97.78% for 660nm using the random forest technique with the modified feature set, compared with an accuracy of about 96.3% using the initial feature set. These results show an improvement in accuracy when the Gabor based features are used. We obtained similar performance for both 650nm and 660nm, and for both wavelengths the best accuracy was obtained using Random Forest. Sensitivity and specificity are useful in measuring the performance of a classifier. Sensitivity, also called the recall rate, is the true positive rate, i.e. the percentage of actual positives that are correctly classified. Specificity is the proportion of negatives that are correctly identified. The Receiver Operating Characteristic (ROC) curve is a plot of the sensitivity versus (1 − specificity) for a binary classifier as its discrimination threshold is varied; in other words, it shows the trade-off between false positives and true positives. A ROC curve always starts from (0, 0) and ends at (1, 1). For an ideal classifier, the plot starts from (0, 0), moves immediately to (0, 1) and then to (1, 1). If the plot is a line at 45 degrees it is called neutral, which is similar to flipping a coin, and any ROC curve below this line indicates a very bad classifier. We plotted the ROC curve for the Support Vector Machine (SVM); the X-axis is the false positive rate and the Y-axis is the true positive rate. Figure 3 and Figure 4 show the ROC curves for 650nm and 660nm respectively.
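A minimal sketch of the evaluation pipeline with scikit-learn; the feature matrices below are random placeholders standing in for the real 105-feature cell vectors and the train/test split described above:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)
X_train, y_train = rng.random((830, 105)), rng.integers(0, 2, 830)   # 700 + 130 cells
X_test, y_test = rng.random((369, 105)), rng.integers(0, 2, 369)     # remaining cells

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]          # probability of the cancer class
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
print("AUC:", roc_auc_score(y_test, scores))
```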
Fig. 3(a). ROC curve for 650nm using initial feature set
Fig. 3(b). ROC curve for 660nm using initial feature set
Fig. 4(a). ROC curve for 650nm using modified feature set
Fig. 4(b). ROC curve for 660nm using modified feature set
One drawback of the ROC curve is that it is a two-dimensional plot, so it is difficult to compare two different models directly. For this we used the Area Under the Curve (AUC), a scalar quantity which is the integral of the ROC curve, i.e. the area under it. We obtained an AUC of 0.9864 and 0.977 for 650nm and 660nm respectively using the modified feature set, and 0.97 and 0.9677 for 650nm and 660nm respectively using the initial features. This shows that there is an improvement in performance using the modified features (i.e. with the addition of the Gabor based features).
5 Conclusion and Future Work
In this paper we used multi-spectrum imaging and applied various machine learning techniques for the early detection of lung cancer. First, sputum samples are obtained from patients and stained using Biomoda CyPath®. After obtaining the slides, we acquire images at 650nm and 660nm using the Nuance system; this multi-spectrum approach has the advantage of removing auto-fluorescence. After obtaining the images we perform image processing and extract features from each cell, and apply machine learning to the resulting feature vectors. We obtained an accuracy of about 98% for both 650nm and 660nm using intensity based, shape based and texture based features (Gabor and wavelet based features), which shows the potential of our method in detecting lung cancer. Besides its accuracy, our approach has the advantage of ease of use. The PAP staining procedure, on the other hand, consists of approximately 27 steps and five reagents, some of which are categorized as hazardous materials; the level of expertise required to read those slides is very significant and limits PAP's use as a screening tool. As discussed earlier in this paper, the limitations of other methods for lung cancer detection make them unsatisfactory as screening tools for early lung cancer detection. Preliminary results have shown the potential of our method for early detection of lung cancer. In future work we want to perform our experiments on sputum samples from real cancer patients. We also want to perform experiments with additional parameters such as smokers vs. non-smokers and the treatment process. We would also like to address the problem of overlapping cells by using watershed algorithms, and to build an ensemble of classifiers using 650nm and 660nm to improve the accuracy.
References 1. American Cancer Society (ACS), Report on Lung Cancer (2008), http://www.cancer.org/downloads/PRO/LungCancer.pdf 2. Lung Cancer Survival Rate, http://lung-cancer.emedtv.com/lung-cancer/ lung-cancer-survival-rate.html 3. Jemal, A., Siegel, R., Ward, E., Murray, T., Xu, J., Thun, M.J.: Cancer statistics. CA Cancer J. Clin. 57(1), 43–66 (2007) 4. American Lung Association, Trends in lung cancer morbidity and mortality report (2006)
5. Oken, M.M., Marcus, P.M., Hu, P., Beck, T.M., Hocking, W., Kvale, P.A., Cordes, J., Riley, T.L., Winslow, S.D., Peace, S., Levin, D.L., Prorok, P.C., Gohagan, J.K.: Baseline chest radiograph for lung cancer detection in the randomized prostate, lung, colorectal and ovarian cancer screening trial. Journal of the National Cancer Institute 97(24), 1832–1839 (2005) 6. Zhou, Z., Jiang, Y., Yang, Y., Chen, S.: Lung Cancer Cell Identification Based on Artificial Neural Network Ensembles. Artificial Intelligence in Medicine 24(1), 25–36 (2002) 7. Berrington de Gonzalez, A., Darby, S.: Risk of cancer from diagnostic X-rays: estimates for the UK and 14 other countries. Lancet 363, 345–351 (2004) 8. Cruz, J.A., Wishart, D.S.: Applications of machine learning in cancer prediction and prognosis. Cancer informatics 2, 59–77 (2007) 9. Shah, S., Kusiak, A.: Cancer gene search with data-mining and genetic algorithms. Computers in Biology and Medicine archive 37(2), 251–261 (2007) 10. Simes, R.J.: Treatment selection for cancer patients: application of statistical decision theory to the treatment of advanced ovarian cancer. J. Chronic Dis. 38, 171–186 (1985) 11. Maclin, P.S., Dempsey, J., Brooks, J.: Using neural networks to diagnose cancer. J. Med. Syst. 15, 11–19 (1991) 12. Cicchetti, D.V.: Neural networks and diagnosis in the clinical laboratory: state of the art. Clin. Chem. 38, 9–10 (1992) 13. Petricoin, E.F., Liotta, L.A.: SELDI-TOF-based serum proteomic pattern diagnostics for early detection of cancer. Curr. Opin. Biotechnol. 15, 24–30 (2004) 14. Bocchi, L., Coppini, G., Nori, J., Valli, G.: Detection of single and clustered microcalcifications in mammograms using fractals models and neural networks. Med. Eng. Phys. 26, 303–312 (2004) 15. Dudoit, S., Fridlyand, J., Speed, T.P.: Comparison of discrimination methods for the classification of tumors using gene expression data. Journal of the American Statistical Association 97(457), 77–87 (2002) 16. Li, L., Weinberg, C.R., Darden, T.A., Pedersen, L.G.: Gene selection for sample classification based on gene expression data: study of sensitivity to choice of parameters of the GA/KNN method. Bioinformatics 17(12), 1131–1142 (2001) 17. Furey, T.S., Cristianini, N., Duffy, N., Bednarski, D.W., Schummer, M., Haussler, D.: Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics 16(10), 906–914 (2000) 18. Golub, T.R., Slonim, D.K., Tamayo, P., Huard, C., GaasenBeek, M., Mesirov, J.P., Coller, H., Loh, M.L., Downing, J.R., Caligiuri, M.A., Blomfield, C.D., Lander, E.S.: Molecular classification of cancer: class discovery and class prediction by gene-expression monitoring. Science 286, 531–537 (1999) 19. Eisen, M.B., Spellman, P.T., Brown, P.O., Bostein, D.: Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Science USA 14, 863–868 (1998) 20. Hartuv, E., Schmitt, A., Lange, J., Meier-Ewert, S., Lehrach, H., Shamir, R.: An algorithm for clustering cDNA fingerprints. Genomics 66, 249–256 (2000) 21. Figge, F.H.J., Weiland, G.S., Manganiello, L.O.J.: Cancer Detection and Therapy. Affinity of Neoplastic, Embryonic and Traumatized Tissues for Porphyrins and Metalloporphyrins. Proc. Soc. Exp. Biol. Med. 68, 640–641 (1948) 22. Taxdal, D.S.R., Ward, G.E., Figge, F.H.J.: Fluorescence of Human Lymphatic and Cancer Tissues Following High Doses of Intravenous Hematoporphyrin. Surg. Forum 5, 619–624 (1955)
23. Lipson, R.L., Baldes, E.J., Olsen, A.M.: Hematoporphyrin Derivative: A New Aid for Endoscopic Detection of Malignant Disease. J. Thorac. Cardiovasc. Surg. 42, 623–629 (1961) 24. Galeotti, T., Borrello, S., Palombini, G., Masotti, L., Ferrari, M.B., Cavatorta, P., Arcioni, A., Stremmenos, C., Zannoni, C.: Lipid Peroxidation and Fluidity of Plasma Membranes from Rat Liver and Morris Hepatoma 3924A. FEBS Lett. 169, 169–713 (1984) 25. Galeotti, T., Borrello, S., Minotti, G., Masotti, L.: Membrane Alterations in Cancer Cells: The Role of Oxy Radicals. Ann. NY Acad. Sci. 488, 468–480 (1986) 26. Campanella, R.: Membrane Lipids Modifications in Human Gliomas of Different Degree of Malignancy. J. Neurosurg. Sci. 36, 11–25 (1992) 27. Lacar, F.M., Lewis, M.M., Grierson, I.T.: Use of hyperspectral imagery for mapping grape varieties in the Barossa Valley. In: Geoscience and Remote Sensing Symposium, vol. 6, pp. 2875–2877 (2001) 28. Schurmer, J.H.: Hyperspectral imaging from space. Air Force Research Laboratories Technology Horizons (2003) 29. Ellis, J.: Searching for oil seeps and oil-impacted soil with hyperspectral imagery. Earth Observation Magazine (2001) 30. Nuance multi spectrum system, http://www.cri-inc.com/products/nuance-fx.asp 31. Grigorescu, S.E., Petkov, N., Kruizinga, P.: Comparison of Texture Features Based on Gabor Filters. IEEE Transactions on Image Processing 11(10) (2002) 32. Cumani, A.: Edge Detection in Multi Spectrum images. CVGIP: Graphical Models and Image Processing 53(1), 40–51 (1991)
Improvement of Building Automation System
Mark Sh. Levin 1, Aliaksei Andrushevich 2, and Alexander Klapproth 2
1 Inst. for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russia
[email protected]
2 CEESAR-iHomeLab, Lucerne University of Applied Sciences, Technikumstrasse 21, CH-6048 Horw, Switzerland
{aliaksei.andrushevich,alexander.klapproth}@hslu.ch
Abstract. The paper addresses the redesign/improvement of a building automation system (BAS). For the sake of simplicity, field bus technology is examined on the example of KNX, and WSN technology on the basis of IEEE 802.15.4/ZigBee. The basic system example consists of four parts: (1) IP/KNX Gateway, (2) IP/WSN 6LoWPAN Gateway, (3) ZigBee Wireless Sensor Network, and (4) KNX Field Bus Infrastructure. A tree-like system model (an and/or morphological tree) is used. The following system improvement design schemes are examined: (i) upgrade of system components (strategy 1), (ii) extension by adding an additional part (strategy 2), and (iii) a combined scheme (strategy 3). Three underlying problems are used: (a) multicriteria ranking, (b) the multicriteria multiple choice problem, and (c) combinatorial synthesis. Numerical examples illustrate the redesign processes.
Keywords: System design, combinatorial optimization, heuristics, multicriteria decision making, building automation, smart home.
1
Introduction
In the last two decades, the significance of reengineering (i.e., issues of system re-design, rebuilding, improvement, upgrade, transformation, extension) has increased (e.g., [1], [12], [13], [19]). Building automation systems (BASs) are increasing in popularity (e.g., [5], [7], [9], [16], [21], [22], [24]). A BAS manages building equipment (e.g., monitoring, detection of faults, tracking, minimization of the energy usage associated with facility assets, etc.) while taking into account reliability and maintenance cost. The building automation domain is still experiencing intensive growth. Generally, it is quite cumbersome to estimate the advantages and disadvantages of the whole variety of not yet mature technologies and services. Fig. 1 depicts an architecture of a building automation system ("SAC" corresponds to "sensor, actuator, controller") (e.g., [21]). For the sake of simplicity, field bus technology is examined here on the KNX example and WSN technology on the IEEE 802.15.4/ZigBee basis. The considered system example consists of four parts (subsystems/components): (1) IP/KNX Gateway, (2) IP/WSN 6LoWPAN Gateway, (3) ZigBee Wireless Sensor Network,
Fig. 1. Architecture of building automation system (management devices, interconnection devices, WAN/backbone, and SAC field networks: BACnet, KNX, LonWorks, ZigBee)
and (4) KNX Field Bus Infrastructure. In our paper a tree-like system model (an and/or morphological tree) is used for the system modeling (e.g., [12], [13]). In the paper three improvement design schemes are examined: (i) a bottlenecks-based strategy to improve (upgrade) some system components, (ii) extension of an existing system, and (iii) a combined scheme. The following underlying combinatorial problems are used: (a) multicriteria ranking of alternatives (e.g., system bottlenecks), (b) the multicriteria multiple choice problem, and (c) combinatorial synthesis based on the Hierarchical Morphological Multicriteria Design (HMMD) approach (morphological clique problem). Realistic numerical examples illustrate the redesign processes for a building automation system.

Fig. 2. Bottlenecks-based redesign scheme (initial system, bottleneck candidates, selected bottlenecks for improvement/upgrade, improvement actions, system upgrade plan)
Fig. 3. System extension (basic system: structure, components, component interconnection; plus an additional system part)

2
Systems Modification Issues
In general, system improvement/modification processes are based on the following three action kinds (e.g., [12], [13]): (i) improvement (modification, upgrade, addition) of a system component, (ii) improvement of system components compatibility, (iii) change of a system structure. Evidently, revelation of system bottlenecks may be considered as a special preliminary phase. Thus, the following
Fig. 4. Modification by components and extension (S = A0 B0 C0 ⇒ S′ = A1 B0 C1 D0)
three system improvement design schemes can be considered: (1) a bottlenecks-based strategy to improve (i.e., upgrade) a subset of system components and/or the compatibility between system components (strategy 1, Fig. 2); (2) extension by adding an additional part (i.e., extension, strategy 2, Fig. 3); and (3) a new design (strategy 3). Fig. 4 depicts an illustrative example of the combined scheme (strategy 1 and strategy 2): S ⇒ S′.
3
Underlying Problems
Let V = {1, ..., i, ..., t} be a set of items which are evaluated upon criteria K = {1, ..., j, ..., d}, and let zi,j be an estimate (quantitative, ordinal) of item i on criterion j. The matrix {zi,j} is the basis for building a partial order on V, for example through the following generalized scheme: (a) pairwise comparison of elements to get a preference (and/or incomparability, equivalence) binary relation, (b) building a partial order on V. Here the following partial order (partition) into linearly ordered subsets of V is searched for: V = ∪_{k=1..m} V(k), with V(k1) ∩ V(k2) = ∅ if k1 ≠ k2, and i2 ⪯ i1 for all i1 ∈ V(k1), i2 ∈ V(k2), k1 ≤ k2. Set V(k) is called layer k, and each item i ∈ V gets a priority ri equal to the number of the corresponding layer. The basic techniques for multicriteria selection are (e.g., [2]): (1) multi-attribute utility analysis; (2) multi-criterion decision making; (3) the Analytic Hierarchy Process (AHP); and (4) outranking techniques. In this paper a version of an outranking technique is used [20].
The basic knapsack problem is ([4], [10]):

max Σ_{i=1..m} ci xi   s.t. Σ_{i=1..m} ai xi ≤ b,  xi ∈ {0, 1}, i = 1, ..., m,

where xi = 1 if item (element) i is selected, ci is a value ("utility") of item i, and ai is a weight (or required resource). Often non-negative coefficients are assumed. The problem is NP-hard [4]. In the multiple choice knapsack problem, the items are divided into groups and it is necessary to select one element (item) from each group while taking into account a total resource constraint (or constraints):

max Σ_{i=1..m} Σ_{j=1..qi} cij xij   s.t. Σ_{i=1..m} Σ_{j=1..qi} aij xij ≤ b,  Σ_{j=1..qi} xij = 1, i = 1, ..., m;  xij ∈ {0, 1}.

Further, a multicriteria description {c^p_ij} ∀(i, j) may be used (i.e., the multi-objective multiple choice knapsack problem). Thus, the vector objective function f = (f1, ..., fr) is ([14], [15]):

( max Σ_{i=1..m} Σ_{j=1..qi} c^1_ij xij , ..., max Σ_{i=1..m} Σ_{j=1..qi} c^r_ij xij ).

Evidently, here it is necessary to search for the Pareto-efficient solutions (by the vector objective function above).
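A small sketch of a greedy treatment of the multiple choice knapsack problem (exactly one item per group under a budget), after the criteria vector of each item has been collapsed into a single aggregated value; this is a simplified stand-in for the two-stage heuristic described in the next paragraph, and the data are illustrative:

```python
def solve_multiple_choice_greedy(groups, budget):
    """Pick exactly one (value, cost) item per group under a total budget:
    start from the cheapest item of every group, then repeatedly apply the
    upgrade with the best value gain per unit of extra cost that still fits."""
    choice = [min(range(len(g)), key=lambda j: g[j][1]) for g in groups]
    spent = sum(g[c][1] for g, c in zip(groups, choice))
    while True:
        best = None
        for i, g in enumerate(groups):
            cur_v, cur_c = g[choice[i]]
            for j, (v, c) in enumerate(g):
                dv, dc = v - cur_v, c - cur_c
                if dv > 0 and spent + dc <= budget:
                    ratio = dv / dc if dc > 0 else float("inf")
                    if best is None or ratio > best[0]:
                        best = (ratio, i, j, dc)
        if best is None:
            return choice, spent
        _, i, j, dc = best
        choice[i], spent = j, spent + dc

# three groups of upgrade alternatives, each item = (aggregated value, cost)
groups = [[(1, 0), (4, 3), (6, 7)], [(2, 0), (5, 4)], [(1, 0), (3, 2)]]
print(solve_multiple_choice_greedy(groups, budget=8))    # e.g. ([1, 0, 1], 5)
```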
The following solving schemes can be considered: 1. dynamic programming (e.g., [11]); 2. enumeration schemes to obtain exact solutions (e.g., branch-and-bound algorithms) (e.g., [3]); 3. greedy algorithms (e.g., [25]); 4. genetic algorithms or multi-objective evolutionary optimization (e.g., [8]); and 5. various meta-heuristics (e.g., [6]). In this paper a two-stage heuristic based on reducing the initial multicriteria problem to a one-objective one is used: (i) multicriteria ranking of each element (i, j) (i.e., by the vector (c^1_ij, ..., c^p_ij, ..., c^r_ij)) to get an ordinal priority for each element; (ii) examination of the obtained multiple choice knapsack problem in which the priorities are used instead of cij; the solution of this problem may be based on well-known methods (e.g., a greedy heuristic, dynamic programming). A new heuristic based on a discrete space of result quality was suggested for the multicriteria multiple choice knapsack problem in [15].
The next underlying combinatorial problem is Hierarchical Morphological Multicriteria Design (HMMD), which is based on the morphological clique problem (e.g., [12], [13]). Here a composite (modular, decomposable) system is under examination, consisting of components and their interconnections or compatibilities. The basic assumptions of HMMD are the following: (a) a tree-like structure of the system; (b) a composite estimate of system quality that integrates the qualities of components (subsystems, parts) and the qualities of interconnections (hereinafter referred to as 'IC') across subsystems; (c) monotonic criteria for the system and its components; and (d) the quality of system components and IC is evaluated on the basis of coordinated ordinal scales. The designations are: (i) design alternatives (DAs) for the leaf nodes of the model; (ii) priorities of DAs (r = 1, ..., k; 1 corresponds to the best one); (iii) ordinal compatibility (IC) for each pair of DAs (w = 0, ..., l; l corresponds to the best one). The basic phases of HMMD are: (1) design of the tree-like system model; (2) generation of DAs for the model's leaf nodes; (3) hierarchical selection and composition of DAs into composite DAs for the corresponding higher level of the system
Fig. 5. Lattice of system quality (from the ideal point <3, 0, 0> down to the worst point <0, 0, 3>, via <2, 1, 0>, <2, 0, 1>, <1, 2, 0>, <1, 1, 1>, <0, 3, 0>, <1, 0, 2>, <0, 2, 1>, <0, 1, 2>)

Fig. 6. Composition problem (S = X Y Z; resultant composite DA S* = X3 Y2 Z4)
hierarchy (morphological clique problem); and (4) analysis and improvement of the resultant composite DAs (decisions). Let S be a system consisting of m parts (components): P(1), ..., P(i), ..., P(m). A set of DAs is generated for each system part above. Thus the problem is: find a composite design alternative S = S(1) ... S(i) ... S(m) of DAs (one representative design alternative S(i) for each system component/part P(i), i = 1, ..., m) with non-zero IC between the design alternatives. A discrete space of system excellence based on the following vector is used: N(S) = (w(S); n(S)), where w(S) is the minimum pairwise compatibility between DAs which correspond to different system components (i.e., ∀ Pj1 and Pj2, 1 ≤ j1 ≠ j2 ≤ m) in S, and n(S) = (n1, ..., nr, ..., nk), where nr is the number of DAs of the rth quality in S. As a result, we search for composite decisions which are non-dominated by N(S). Fig. 5 illustrates a simplified lattice of system quality (by n(S)). Fig. 6 illustrates the composition problem; here the composite solution is (Fig. 5 and Fig. 6): S* = X3 Y2 Z4.
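A brute-force sketch of this composition step for a tiny morphology (priorities and compatibilities are passed in as plain dictionaries; the cumulative-count rule used here to compare n(S) vectors is one simple reading of the quality lattice, not necessarily the exact HMMD ordering, and the example data are made up):

```python
from itertools import accumulate, combinations, product

def n_vector(combo, priority, k):
    """n(S): number of chosen DAs of quality r, for r = 1..k (1 = best)."""
    return tuple(sum(1 for da in combo if priority[da] == r) for r in range(1, k + 1))

def dominates(a, b):
    """True if score a = (w, n) is at least as good as b everywhere and better somewhere."""
    (wa, na), (wb, nb) = a, b
    cum_a, cum_b = list(accumulate(na)), list(accumulate(nb))
    ge = wa >= wb and all(x >= y for x, y in zip(cum_a, cum_b))
    gt = wa > wb or any(x > y for x, y in zip(cum_a, cum_b))
    return ge and gt

def compose(parts, priority, compat, k=3):
    """One DA per part, non-zero pairwise compatibility, non-dominated N(S) = (w(S); n(S))."""
    cands = []
    for combo in product(*parts.values()):
        w = min(compat.get(frozenset(p), 0) for p in combinations(combo, 2))
        if w > 0:
            cands.append((combo, (w, n_vector(combo, priority, k))))
    return [c for c, s in cands
            if not any(dominates(other, s) for _, other in cands if other != s)]

parts = {"X": ["X1", "X2"], "Y": ["Y1", "Y2"]}          # tiny illustrative morphology
priority = {"X1": 1, "X2": 2, "Y1": 2, "Y2": 1}
compat = {frozenset(p): 3 for p in [("X1", "Y1"), ("X1", "Y2"), ("X2", "Y1")]}
print(compose(parts, priority, compat))                  # [('X1', 'Y2')]
```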
4
System Morphology and Design
The following hierarchical structure of the building automation example is examined (Fig. 7): System. Building automation example S = G Q T I. Part 1. IP/KNX Gateway G: Gira HS3 G1, EibPort G2, and EibPC G3. Part 2. IP/WSN 6LoWPAN Gateway Q: Sensinode Nanorouter Q1, Jennic Border-Router Q2, Arch Rock PhyNet Router Q3, and Crossbow Ethernet Gateway (MIB600) Q4. Part 3. ZigBee Wireless Sensor Network T: TI T1, Ember T2, Jennic T3, Freescale T4, and Microchip T5. Part 4. KNX Field Bus Infrastructure I: ABB I1, Siemens I2, SCHNEIDER Electric I3, and BOSCH I4.

Fig. 7. System structure (morphology), with DA priorities in parentheses

The following criteria are used for assessment of DAs ('+' corresponds to positive orientation of an ordinal scale and '−' corresponds to the negative
orientation of the scale): (1) reliability C1 (+); (2) performance C2 (+); (3) scalability C3 (+); (4) maintenance C4 (+); and (5) cost C5 (−). The following scales are used: (i) an ordinal scale [1, 10] for criteria 1, 2, 3, and 4; (ii) a scale [200, 7500] for criterion 5. The criteria weights are: 5, 5, 3, 3, and 2. Table 1 contains the estimates of the DAs upon the above-mentioned criteria (expert judgment) and the priorities of the DAs (the result of a multicriteria ranking procedure based on an Electre-like method). The priorities of the DAs are also shown in Fig. 7 (in parentheses). Compatibility estimates between DAs are presented in Table 2 (expert judgment). Now let us consider composite DAs. The following Pareto-efficient composite DAs are obtained: S1 = G1 Q3 T2 I3, N(S1) = (2; 4, 0, 0); S2 = G1 Q3 T2 I1, N(S2) = (3; 3, 1, 0); and S3 = G1 Q3 T2 I4, N(S3) = (3; 3, 1, 0). Fig. 8 illustrates the space of system quality and the resultant DAs above.
5
System Extension by Additional Part
Here, system extension is considered by addition of a 5th subsystem, HomeServer SW H: Domovea SW H1, Eisbaer SW H2, Misterhouse SW H3, and Eibd-linKNX-KNXweb H4. The extended system morphology is presented in Fig. 9. Table 3 contains estimates of the DAs for subsystem H and results of multicriteria ranking (priorities). Table 4 contains compatibility estimates. The resultant Pareto-efficient composite DAs are: S1′ = G1 Q3 T2 I3 H3, N(S1′) = (1; 5, 0, 0); S2′ = G1 Q3 T2 I1 H3, N(S2′) = (2; 4, 1, 0); S3′ = G1 Q3 T2 I2 H2, N(S3′) = (3; 3, 1, 1). Table 1. Estimates
DAs   C1  C2  C3  C4  C5     Priority (r)
G1    7   9   9   8   3000   1
G2    8   7   7   6   2500   3
G3    8   6   7   4   2000   3
Q1    7   7   9   6   250    2
Q2    7   9   8   8   200    2
Q3    9   8   9   8   290    1
Q4    7   9   7   9   300    2
T1    8   8   8   8   900    2
T2    9   9   8   9   1000   1
T3    7   9   7   8   800    3
T4    7   7   9   6   850    3
T5    8   8   9   8   950    2
I1    9   7   7   9   5500   2
I2    8   6   8   8   5000   3
I3    7   9   9   9   6500   1
I4    9   9   8   9   7500   2
Table 2. Compatibility
      Q1  Q2  Q3  Q4  T1  T2  T3  T4  T5  I1  I2  I3  I4
G1    3   3   3   3   2   3   3   3   2   3   3   2   3
G2    2   2   3   3   3   3   3   2   2   3   2   2   3
G3    2   2   3   3   3   3   2   2   3   3   2   2   3
Q1                    2   3   3   2   2   3   2   2   3
Q2                    2   3   3   2   2   3   2   2   3
Q3                    2   3   3   2   2   3   3   2   3
Q4                    2   3   3   2   2   3   2   2   3
T1                                        3   2   2   3
T2                                        3   3   2   3
T3                                        3   2   2   3
T4                                        3   2   2   3
T5                                        3   2   2   3
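The quality vectors N(S) reported in the text can be recomputed directly from Tables 1 and 2. The following Python sketch is an independent check, not the authors' code; only the rows of Table 2 actually touched by the three composite DAs are encoded.

```python
from itertools import combinations

priority = {"G1": 1, "G2": 3, "G3": 3, "Q1": 2, "Q2": 2, "Q3": 1, "Q4": 2,
            "T1": 2, "T2": 1, "T3": 3, "T4": 3, "T5": 2,
            "I1": 2, "I2": 3, "I3": 1, "I4": 2}        # priority column of Table 1

compat = {}
def add_row(row, others, values):                      # symmetric IC values from Table 2
    for other, v in zip(others, values):
        compat[(row, other)] = compat[(other, row)] = v

add_row("G1", ["Q1", "Q2", "Q3", "Q4", "T1", "T2", "T3", "T4", "T5",
               "I1", "I2", "I3", "I4"], [3, 3, 3, 3, 2, 3, 3, 3, 2, 3, 3, 2, 3])
add_row("Q3", ["T1", "T2", "T3", "T4", "T5", "I1", "I2", "I3", "I4"],
        [2, 3, 3, 2, 2, 3, 3, 2, 3])
add_row("T2", ["I1", "I2", "I3", "I4"], [3, 3, 2, 3])

def quality_vector(composite, k=3):
    w = min(compat[(a, b)] for a, b in combinations(composite, 2))
    n = [0] * k
    for da in composite:
        n[priority[da] - 1] += 1
    return w, tuple(n)

for S in (("G1", "Q3", "T2", "I3"), ("G1", "Q3", "T2", "I1"),
          ("G1", "Q3", "T2", "I4")):
    print(S, quality_vector(S))   # (2,(4,0,0)), (3,(3,1,0)), (3,(3,1,0)) as in the text
```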
Fig. 8. Quality space: positions of N(S1), N(S2), N(S3) relative to the ideal point, for w = 3, 2, 1.
Fig. 9. Extended system structure (morphology) S = G ⋆ Q ⋆ T ⋆ I ⋆ H, with priorities of DAs in parentheses and the resultant solutions S1′ = G1 Q3 T2 I3 H3, S2′ = G1 Q3 T2 I1 H3, and S3′ = G1 Q3 T2 I2 H2 indicated.

Table 3. Estimates
DAs   C1  C2  C3  C4  C5     Priority (r)
H1    8   9   7   8   3000   2
H2    7   9   7   9   3500   2
H3    9   8   9   9   2300   1
H4    8   7   8   7   2200   3

Table 4. Compatibility
      H1  H2  H3  H4
G1    3   3   3   1
G2    1   1   1   1
G3    1   1   1   1
Q1    1   1   1   1
Q2    1   1   1   1
Q3    1   3   3   1
Q4    1   1   1   1
T1    1   1   1   1
T2    1   3   3   1
T3    1   1   1   1
T4    1   1   1   1
T5    1   1   1   1
I1    1   3   2   1
I2    1   3   1   1
I3    1   1   1   1
I4    1   2   1   1

Fig. 10. Quality space for the extended solutions: positions of N(S1′), N(S2′), N(S3′) relative to the ideal point, for w = 3, 2, 1.

Table 5. Bottlenecks & improvement actions
Composite DAs        Bottleneck (DA / IC)   Action (w/r)
S1 = G1 Q3 T2 I3     IC (Q3, I3)            2 ⇒ 3
S2 = G1 Q3 T2 I1     DA I1                  2 ⇒ 1
S3 = G1 Q3 T2 I4     DA I4                  2 ⇒ 1
Fig. 10 illustrates the obtained composite DAs. Note that solutions S1′ and S2′ are extensions of the composite DAs S1 and S2 (strategy 1). On the other hand, solution S3′ is a result of a new design over the extended system structure (morphology) (strategy 2).
6
System Improvement
6.1
Improvement of Bottleneck
Generally, improvement of composite DAs can be based on two kinds of actions (e.g., [12], [13]): (i) improvement of an element, and (ii) improvement of compatibility between elements. Here a bottleneck corresponds to an element or an element interconnection. The four-component composite DAs from the system design (Fig. 7) are examined: S1, S2, S3. Let us consider the set of prospective bottlenecks, i.e., elements and element interconnections in the obtained Pareto-efficient composite DAs which can be improved: I1, I4, and (Q3, I3). Further, Table 5 illustrates the improvement process: bottlenecks (by elements, by compatibility) and improvement actions for the composite DAs. It is shown that the obtained Pareto-efficient DAs can be improved (by an element or by an element interconnection) to reach the ideal solution. 6.2
Improvement via Multiple Choice Problem
System improvement can be considered as an upgrade of some system components while taking into account a total budget constraint. This approach is based on the multiple choice problem. Here a five-component extended solution is examined: S = G2 Q1 T4 I2 H1 (Fig. 11). The corresponding multiple choice problem is:
$\max \sum_{i=1}^{5}\sum_{j=1}^{q_i} c^{1}_{ij} x_{ij}, \;\ldots,\; \max \sum_{i=1}^{5}\sum_{j=1}^{q_i} c^{4}_{ij} x_{ij}$
$\text{s.t. } \sum_{i=1}^{5}\sum_{j=1}^{q_i} a_{ij} x_{ij} \le b, \quad \sum_{j=1}^{q_i} x_{ij} = 1,\ i = 1,\ldots,5; \quad x_{ij} \in \{0, 1\}.$
Fig. 11. Improvement via the multiple choice problem: initial composite solution S = G2 Q1 T4 I2 H1, with binary upgrade variables x11, x12; x21, x22; x31, x32, x33; x41, x42, x43; x51, x52 for the five components.
Here the binary variables xij correspond to the following improvement actions: (a) none (xi1), (b) improvement of an element to quality level 1 (xi2), and (c) improvement of an element to quality level 2 (xi3). Table 6 contains the improvement actions and their estimates (the criteria from the section on design are used): c^1_ij, c^2_ij, c^3_ij, c^4_ij, and aij.
Table 6. Estimates
Improvement action     C1  C2  C3  C4  Cost (aij)
x11 (none)             8   7   7   6   0
x12 (G2 → G1)          7   9   9   8   3000
x21 (none)             7   7   9   6   0
x22 (Q1 → Q3)          9   8   9   8   290
x31 (none)             7   7   9   6   0
x32 (T4 → T5)          8   8   9   8   950
x33 (T4 → T2)          9   9   8   9   1000
x41 (none)             8   6   8   8   0
x42 (I2 → I3)          7   9   9   9   6500
x43 (I2 → I1)          9   7   7   9   5500
x51 (none)             8   9   7   8   0
x52 (H1 → H3)          9   8   9   9   2300
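A brute-force sketch of this improvement model is given below. It enumerates the upgrade options of Table 6 under a budget b; the weighted-sum aggregation of the four criteria (with the weights 5, 5, 3, 3 from the design section) and the example budget are assumptions made only for illustration, since the paper keeps the objective multicriteria.

```python
from itertools import product

# component -> [(action, (c1, c2, c3, c4), cost a_ij)], data taken from Table 6
options = {
    "G": [("none", (8, 7, 7, 6), 0), ("G2->G1", (7, 9, 9, 8), 3000)],
    "Q": [("none", (7, 7, 9, 6), 0), ("Q1->Q3", (9, 8, 9, 8), 290)],
    "T": [("none", (7, 7, 9, 6), 0), ("T4->T5", (8, 8, 9, 8), 950),
          ("T4->T2", (9, 9, 8, 9), 1000)],
    "I": [("none", (8, 6, 8, 8), 0), ("I2->I3", (7, 9, 9, 9), 6500),
          ("I2->I1", (9, 7, 7, 9), 5500)],
    "H": [("none", (8, 9, 7, 8), 0), ("H1->H3", (9, 8, 9, 9), 2300)],
}
weights = (5, 5, 3, 3)   # criteria weights from the design section (assumed scalarisation)

def best_upgrade(budget):
    best = None
    for combo in product(*options.values()):
        cost = sum(opt[2] for opt in combo)
        if cost > budget:
            continue                       # budget constraint of the multiple choice model
        score = sum(w * c for opt in combo for w, c in zip(weights, opt[1]))
        if best is None or score > best[0]:
            best = (score, cost, [opt[0] for opt in combo])
    return best

print(best_upgrade(1500))   # hypothetical budget: upgrades of Q and T are affordable
```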
The resultant solutions are (the resource constraint b was varied from 40 to 1100): (1) b = 40: S ⇒ S1 = G2 Q3 T4 I2 H1 (Q1 → Q3); (2) b = 130: S ⇒ S2 = G2 Q3 T2 I2 H1 (Q1 → Q3, T4 → T2); (3) b = 430: S ⇒ S3 = G1 Q3 T2 I2 H1 (G2 → G1, Q1 → Q3, T4 → T2); (4) b = 980: S ⇒ S4 = G1 Q3 T2 I1 H1 (G2 → G1, Q1 → Q3, T4 → T2, and I2 → I1); (5) b = 1080: S ⇒ S5 = G1 Q3 T2 I3 H1 (G2 → G1, Q1 → Q3, T4 → T2, and I2 → I3).
7
Conclusion
In this paper, our approaches to the improvement/extension of a building automation system have been described. The material is preliminary and is based on illustrative numerical examples. Clearly, it is reasonable to consider other redesign problems, e.g., system reconfiguration problems (e.g., [14], [23]). In the future, it may be promising to consider the following research directions: 1. examination of designing a system improvement trajectory (i.e., multistage system redesign); 2. analysis of on-line redesign/adaptation problems for building automation systems; 3. usage of design models while taking into account uncertainty (e.g., stochastic models, fuzzy sets); 4. usage of AI techniques in design and redesign procedures; 5. examination of other realistic applied examples; and 6. usage of the suggested approaches in engineering/CS education.
References
1. Balasubramanian, K., Cellatoglu, A.: Improvements in home automation strategies for designing apparatus for efficient smart home. IEEE Trans. Consumer Electronics 54(4), 1681–1687 (2008)
2. Buede, D.M.: Software review. Overview of MCDA software market. J. of MultiCriteria Decision Analysis 1(1), 59–61 (1992) 3. Dyer, M.E., Kayal, N., Walker, J.: A branch-and-bound algorithm for solving the multiple choice knapsack problem. J. of Comput. and Appl. Math. 11(2), 231–249 (1984) 4. Garey, M.R., Johnson, D.S.: Computers and Intractability. The Guide to the Theory of NP-Completeness. W.H. Freeman and Company, San Francisco (1979) 5. Helal, A., Mann, W., El-Zabadani, H., King, J., Kaddoura, Y., Jansen, E.: The gator tech smart house: A programmable pervasive space. IEEE Comput. 38(3), 50–60 (2005) 6. Hifi, M., Michrafy, M., Sbihi, A.: Heuristic algorithms for the multiple-choice multidimensional knapsack problem. J. of the Oper. Res. Soc. 55(12), 1323–1332 (2004) 7. Intille, S.S.: Designing a home of the future. IEEE Pervasive Computng 1(2), 80–86 (2002) 8. Jaszkiewicz, A.: On the computational efficiency of multiple objective metaheuristics. The knapsack problem case study. Eur. J. of Oper. Res. 158(2), 418–433 (2004) 9. Kastner, W., Neugschwandtner, G., Soucek, S., Newman, H.M.: Communication systems for building automation and control. Proc. of the IEEE 93(6), 1178–1203 (2005) 10. Kellerer, H., Pferschy, U., Pisinger, D.: Knapsack Problems. Springer, Berlin (2004) 11. Klamroth, K., Wiecek, M.: Dynamic programming approaches to the multiple criteria knapsack problem. Naval Research Logistics 47, 57–76 (2000) 12. Levin, M.S.: Combinatorial Engineering of Decomposable Systems. Kluwer Academic Publishers, Dordrecht (1998) 13. Levin, M.S.: Composite Systems Decisions. Springer, New York (2006) 14. Levin, M.S.: Combinatorial optimization in system configuration design. Automation and Remote Control 70(3), 519–561 (2009) 15. Levin, M.S., Safonov, A.V.: Heuristic for multicriteria multiple choice knapsack problem. Artif. Intell. and Decision Making 4, 53–64 (2009) (in Russian) 16. Levin, M.S., Andrushevich, A., Klapproth, A.: Composition of management system for smart homes. Information Processes 10(1), 78–86 (2010) 17. Peine, A.: Understanding the dynamics of technological configurations: A conceptual framework and the case of Smart Home. Technological Forecasting and Social Change 76(3), 396–409 (2009) 18. Rashidi, P., Cook, D.J.: Keeping the resident in the loop: Adapting the smart home to the user. IEEE Trans. SMC, Part A 39(5), 949–959 (2009) 19. Rouse, W.B.: A theory of enterprise transformation. Systems Engineering 8(4), 279–295 (2005) 20. Roy, B.: Multicriteria Methodology for Decision Aiding. Kluwer Academic Publishers, Dordrecht (1996) 21. Schein, J.: An information model for building automation systems. Automation in Construction 16(2), 125–139 (2007) 22. Snoonian, D.: Smart buildings. IEEE Spectrum 40(8), 18–23 (1997) 23. Vrba, P., Marik, V.: Capabilities of dynamic reconfiguration of multiagent-based industrial control systems. IEEE Trans. SMC, Part A 40(2), 213–223 (2010) 24. Yamazaki, T.: The ubiquitous home. Int. J. of Smart Home 1(1), 17–22 (2007) 25. Zemel, E.: An O(n) algorithm for the linear multiple choice knapsack problem and related problems. Inform. Proc. Letters 18(3), 123–128 (1984)
Efficient Load Balancing Using the Bees Algorithm Anabela Moreira Bernardino1, Eugénia Moreira Bernardino1, Juan Manuel Sánchez-Pérez2, Juan Antonio Gómez-Pulido2, and Miguel Angel Vega-Rodríguez2 1
Computer Science and Communication Research Centre, School of Technology and Management, Polytechnic Institute of Leiria, 2411 Leiria, Portugal {anabela.bernardino,eugenia.bernardino}@ipleiria.pt 2 Department of Technologies of Computers and Communications, Polytechnic School, University of Extremadura, 10071 Cáceres, Spain {sanperez,jangomez,mavega}@unex.es
Abstract. The massive growth of the Internet traffic in the last decades has motivated the design of high-speed optical networks. Resilient Packet Ring (RPR), also known as IEEE 802.17, is a standard designed for optimising the transport of data traffic over optical fiber ring networks. In this paper, we consider a Weighted Ring Arc-Loading Problem (WRALP), which arises in engineering and planning of the RPR systems. The WRALP is a combinatorial optimisation NP-complete problem. Given a set of point-to-point unidirectional traffic demands of a specified bandwidth, the demands should be assigned to the clockwise or to the counter-clockwise ring in order to yield the best performance. This paper suggests an efficient load balancing algorithm – the Bees Algorithm. We compare our results with nine meta-heuristics used in literature to solve the same problem. The simulation results verify the effectiveness and robustness of the Bees Algorithm. Keywords: Bees Algorithm, Optimisation Algorithms, Weighted Ring ArcLoading Problem.
1 Introduction The standard IEEE 802.17 for the Resilient Packet Ring (RPR) aims to combine the appealing functionalities from Synchronous Optical Network/Synchronous Digital Hierarchy (SONET/SDH) networks with the advantages of Ethernet networks. It is a ring-based architecture that consists of two counter-directional optical fiber rings [1, 2, 3]. To effectively use the RPR's potential, namely the spatial reuse, the statistical multiplexing and the bi-directionality, it is necessary to route the demands efficiently. A fundamental problem is to design a transmission route (direct path) for each demand, to avoid high load on the ring arcs, where an arc is an edge endowed with a direction (clockwise or counter-clockwise). The load of an arc is defined as the total weight of those demands that are routed through the arc in its direction. This problem can be classified into two formulations: with demand splitting (WRALP) or without demand splitting (non-split WRALP). Split loading allows the splitting of a demand into two portions to be carried out in both directions, while in a non-split loading each
demand must be entirely carried out in either clockwise or counter-clockwise direction. In either split or non-split cases, WRALP ask for a routing scheme such that maximum load on arcs will be minimum. In this paper we study the WRALP without demand splitting. Cosares and Saniee [4] and Dell’Amico et al. [5] studied the non-split loading problem on SONET rings. Cosares and Saniee [4] proved that the formulation without demand splitting is NP-complete. For the split problem, various approaches are summarised by Schrijver et al. [6] and their algorithms compared in Myung and Kim [7] and Wang [8]. Recently Kim et al. [9] presented an Ant Colony Optimisation (ACO) algorithm using different strategies to solve the SONET ring loading. The non-split WRALP considered in the present paper is identical to the one described by Kubat and Smith [10] (non-split WRALP), Cho et al. [11] (non-split WRALP and WRALP) and Yuan and Zhou [12] (WRALP). The main purpose of previous works was to produce near optimal solutions for WRALP in a reduced amount of time. Our purpose is different, we want to compare the performance of our algorithm with others in the achievement of the best-known solution. Using the same principle Bernardino et al. [13] proposed several bio-inspired algorithms to solve the non-split SONET loading problem and the non-split WRALP. The authors made a comparison between several heuristics to prove the efficience of their algorithms. Bio-inspired randomised search heuristics such as EAs and SI algorithms are very popular among practitioners. Genetic Algorithms (GAs) and Differential Evolution (DE) are EAs that simulate the natural evolution of species by iteratively applying evolutionary operators such as mutation, recombination, and selection to a set of solutions for a given problem [14]. SI is the discipline that deals with natural and artificial systems composed of many individuals that coordinate using decentralized control and self-organization [15, 16]. ACO, Particle Swarm Optimisation (PSO), Bees Algorithm (BA) and Artificial Bee Colony (ABC) algorithms are some of the most known approaches. These algorithms can be used in real-world optimisation problems. A recent trend is to hybridise bio-inspired heuristics with LS to refine newly constructed solutions [17]. In this paper we propose a BA to solve the WRALP. It mimics the food foraging behaviour of swarms of honey bees [18, 19]. We compare the performance of BA with the algorithms: GA with multiple operators (GAM), Hybrid DE with multiple strategy (HDEM), Hybrid Discrete PSO (HDPSO), Tabu Search (TS), Local Search– Probability Binary PSO (LS-PBPSO), Hybrid DE (HDE), Hybrid ACO (HACO), ABC and Discrete DE (DDE) used in literature to solve the same problem. The paper is structured as follows. In Sect. 2 we describe the WRALP; in Sect. 3 we describe the proposed BA; in Sect. 4 we present the results obtained and, in Sect. 5 we report about the conclusions.
2 Weighted Ring Arc Loading Problem Let Rn be a n-node bidirectional ring with nodes {n1,n2,…,nn} labelled clockwise. Each edge {ek, ek+1} of Rn, 1≤ k ≤ n is taken as two arcs with opposite directions, in which the data streams can be transmitted in either direction: a + = (e , e ), a − = (e , e ) . k
k
k +1
k
k +1
k
A communication request on Rn is an ordered pair (s,t) of distinct nodes, where s is the source and t is the destination. We assume that data can be transmitted clockwise or counter-clockwise on the ring, without splitting. We use Path+(s,t) to denote the directed (s,t) path clockwise around Rn, and Path−(s,t) the directed (s,t) path counter-clockwise around Rn.
Table 1. Solution representation (example with n = numberNodes and m = numberPairs = 6: each pair/demand is assigned a direction in the representation vector V, here C, CC, CC, C, CC, C; C – clockwise, CC – counter-clockwise)
Often a request (s,t) is associated with an integer weight w ≥ 0; we denote this weighted request by (s,t;w). Let D = {(s1,t1;w1), (s2,t2;w2), ..., (sm,tm;wm)} be a set of integrally weighted requests on Rn. For each request/pair (si,ti) we need to design a directed path Pathi of Rn from si to ti. A set Path = {Pathi: i = 1,2,...,m} of such directed paths is called a routing for D. In this work, the solutions are represented using binary vectors (Table 1). We assume that weights cannot be split, that is, for Vi = 1, 1 ≤ i ≤ m, the total amount of data is transmitted along Path+(si,ti); for Vi = 0, the total amount of data is transmitted along Path−(si,ti). The vector V = (V1, V2, …, Vm) determines a routing scheme for D.
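The arc loads and the WRALP objective (the maximum load over all arcs) induced by a direction vector V can be computed as in the following Python sketch (not from the paper; the demand data in the usage lines is hypothetical).

```python
def max_arc_load(n, demands, V):
    """WRALP objective: demands is a list of (s, t, w) with nodes 1..n;
    V[i] = 1 routes demand i clockwise (Path+), V[i] = 0 counter-clockwise (Path-)."""
    cw = [0] * n     # cw[k]: load of the clockwise arc between nodes k+1 and k+2 (mod n)
    ccw = [0] * n    # ccw[k]: load of the opposite (counter-clockwise) arc
    for (s, t, w), v in zip(demands, V):
        k = s - 1
        while k != t - 1:
            if v == 1:
                cw[k] += w           # traverse the next arc clockwise
                k = (k + 1) % n
            else:
                k = (k - 1) % n      # step back to the previous node
                ccw[k] += w          # traverse that arc counter-clockwise
    return max(cw + ccw)

# Hypothetical 4-node instance: three weighted demands and a direction vector V.
demands = [(1, 3, 20), (2, 4, 10), (4, 1, 5)]
print(max_arc_load(4, demands, [1, 1, 0]))   # maximum load over all arcs
```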
3 Bees Algorithm Honey bees live in hives around the world in well organised colonies. These colonies are characterised by the division of labour where specific bees perform specific tasks. Recently, researchers started to study the behaviour of social insects in an attempt to use the SI concepts to develop algorithms with the ability to search the solution search space of the problem in a way similar to the foraging search by colony of social insects [15, 16]. Algorithms inspired by the behaviour of natural systems are sometimes called meta-heuristic algorithms because they provide a high-level framework which can be adapted to solve optimisation, search, and related problems, as opposed to providing a stringent set of guidelines for solving a particular problem. A review of the literature on algorithms inspired by the behaviour of bees [16] suggests that the topic is evolving and that there is no consensus on a single descriptive title for metaheuristics based on bees behaviour. In literature it is possible to find several bee inspired algorithms that use different algorithm models: Bee System, BeeHive, Virtual Bee Algorithm, Bee Swarm Optimisation, Bee Colony Optimisation, ABC, and BA.
The BA proposed by Pham et al. [18] is a SI algorithm which mimics the foraging behaviour of honey bees. In its basic version, the algorithm performs a kind of neighbourhood search combined with random search. For neighbourhood selection, the smallest fitnesses are chosen as selected bees. For recruitment, bees are assigned based on the fitnesses associated with the sites they are visiting. Only the best bees are selected in order to form the next generation of the colony unlike the process in nature of real bees. This paper presents the first application of BA to solve the WRALP. The main steps of the BA are given below: Initialise Parameters Create initial Population of bees, P0 Evaluate Population P0 t=0 WHILE stop criterion isn't reached Apply Local Search Procedure to all individuals in Pt t=iteration Select nss best bees in Pt, PBt Compute probabilities for nbs best sites Compute probabilities for(nss-nbs) selected sites FOR i=1 TO nss DO IF (i=0 and each positive demand value is routed in either clockwise (C) or counter-clockwise (CC) direction (1); for an arc, the load is the sum of wk for clockwise or counter-clockwise direction between nodes ek and ek+1 (2). The purpose is to minimise the maximum load on the arcs of a ring (3). 3.4 Local Search The LS algorithm applies a partial neighbourhood examination. Some pairs of the solution are selected and their directions are exchanged (partial search). This method can be summarised in the following pseudo-code steps [13]: FOR t=0 TO numberNodesRing/4 Pair1 = random (number of pairs) Pair2 = random (number of pairs) N = neighbourhoods of SOLUTION (one neighbourhood results of interchange the direction of Pair1 and/or Pair2) TEST_SOLUTION = FindBest (N) IF (fitness(SOLUTION)>fitness(TEST_SOLUTION) SOLUTION=TEST_SOLUTION
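A Python transcription of the partial-neighbourhood local search sketched in the pseudo-code above is given below. It assumes the fitness to be minimised is the maximum arc load of the previous sketch; the candidate neighbours are obtained by flipping the direction of one or both of two randomly chosen pairs.

```python
import random

def local_search(ring_size, demands, V, fitness):
    """Partial neighbourhood examination: flip the direction of one or two
    randomly chosen pairs and keep the best neighbour if it improves."""
    best = list(V)
    for _ in range(ring_size // 4):
        i = random.randrange(len(best))
        j = random.randrange(len(best))
        neighbours = []
        for flip_i, flip_j in ((1, 0), (0, 1), (1, 1)):
            cand = list(best)
            if flip_i:
                cand[i] ^= 1
            if flip_j:
                cand[j] ^= 1
            neighbours.append(cand)
        test = min(neighbours, key=lambda c: fitness(ring_size, demands, c))
        if fitness(ring_size, demands, test) < fitness(ring_size, demands, best):
            best = test
    return best

# Usage with the earlier sketch: local_search(4, demands, [1, 1, 0], max_arc_load)
```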
3.5 Select Best Bees The bees that have the smallest fitnesses are chosen as “selected bees” (PBt) and the sites visited by them are selected for neighbourhood search. 3.6 Compute Probabilities for nbs Best Sites A bee is recruited for a best site i, depending on the probability value associated with that site. The probabilities are calculated using expression (4).
$totalFitness = \sum_{x=1}^{nbs} fitness(PB_x^t), \qquad prob_i = \frac{totalFitness - fitness(PB_i^t)}{totalFitness}$  (4)
3.7 Compute Probabilities for (nss-nbs) Selected Sites A bee is recruited for a selected site i, depending on the probability value associated with that site. The probabilities are calculated using expression (5).
$totalFitness = \sum_{x=nbs+1}^{nss} fitness(PB_x^t), \qquad prob_i = \frac{totalFitness - fitness(PB_i^t)}{totalFitness}$  (5)
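A small sketch of how the recruitment probabilities of expressions (4) and (5) can be computed (minimisation, so a lower fitness yields a higher probability); the fitness values used are hypothetical.

```python
def recruitment_probabilities(fitnesses):
    """Expressions (4)/(5): probability for site i, given the fitnesses of the group."""
    total = sum(fitnesses)
    return [(total - f) / total for f in fitnesses]

best_sites = [12.0, 15.0, 23.0]        # hypothetical fitnesses of the nbs best sites
print(recruitment_probabilities(best_sites))
```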
3.8 Compute Number of Bees In our implementation the algorithm computes the number of bees (nb) which will be sent to a site, according to previously determined probabilities using expression (6). IF
i numberNodesRing /2 testSolution[c] = C ELSE testSolution[c] = CC ELSE testSolution[c]= bestSolution[c] IF (fitness(testSolution) X0 |φt ) — the probability that a randomly drawn value x1 ∈ X1 is greater than the other randomly drawn value x0 ∈ X0 at time step t. Since the associated stochastic variables X0 and X1 are normally distributed, with parameters (μ0 [t], σ0 [t]) and (μ1 [t], σ1 [t]), respectively, we have that: P (α[t] = 1) = P (X1 ≥ X0 |φ ) =
$\int_{-\infty}^{0} f(x;\, \mu_0[t]-\mu_1[t],\, \sigma_0^2[t]+\sigma_1^2[t])\, dx$  (1)
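Since the integral in (1) is the probability that a normal variable with mean μ0[t] − μ1[t] and variance σ0²[t] + σ1²[t] is non-positive, it can be evaluated with the standard normal CDF, as in the following sketch (an illustration, not the authors' implementation).

```python
from math import erf, sqrt

def p_yes(mu0, sigma0, mu1, sigma1):
    """P(X1 >= X0) = Phi((mu1 - mu0) / sqrt(sigma0^2 + sigma1^2)), cf. Eq. (1)."""
    z = (mu1 - mu0) / sqrt(sigma0 ** 2 + sigma1 ** 2)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))     # standard normal CDF

print(p_yes(mu0=0.4, sigma0=0.2, mu1=0.6, sigma1=0.2))   # > 0.5: arm 1 currently looks better
```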
In the following, we will let p[t] denote this latter probability. 3.2
BS-TANB Based Decentralized Decision Making
The overall decentralized decision making scheme is illustrated in Fig. 1. On each round t, the n decision makers Vq ∈ {V1, . . . , Vn} choose one of two arms αq[t] = i ∈ {0, 1} simultaneously and independently (they do not see each other), with αq[t] = 0 referring to a “No”-vote and αq[t] = 1 referring to a “Yes”-vote. Let pq[t] = P(αq[t] = 1) be the probability that decision maker Vq casts a “Yes” vote on round t. Then 1 − pq[t] is the probability that Vq casts a “No” vote, and each voting αq[t] can be seen as a Bernoulli trial in which a “Yes” vote is a success and a “No” vote is a failure. Note that the concrete instantiation of
By this we mean that P is not a fixed function. Rather, it denotes the probability function for a random variable, given as an argument to P .
the arm selection probability pq[t] is governed by the learning scheme applied, which in our case is BS-TANB. Definition 1 (Arm Selection Variance). In a two-armed bandit problem where the current arm selection probability is p, we define the Arm Selection Variance, σ², to be the variance, p(1 − p), of the outcome of the corresponding Bernoulli trial. As seen in Fig. 1, in addition to casting a vote αq[t], each decision maker Vq also submits its present Arm Selection Variance, σq²[t], in order to signal its level of exploration. Thus, as in the traditional Goore Game setup, a Referee calculates the fraction, λ[t], of “Yes” votes. In addition, it now also calculates the variance σA²[t] of the total number of “Yes” votes, which simply is the sum of the variances of the independently cast votes (cf. the Bienaymé formula): $\sigma_A^2[t] = \sum_{q=1}^{n} \sigma_q^2[t]$. Note that in practice, such as in QoS control in sensor networks [1], this operation is conducted by the so-called base station of the network. The Referee has a uni-modal normally distributed performance criterion G(λ[t]; μG, σG), where μG is the mean and σG² is the variance, which is thus optimized when the fraction of “Yes” votes is exactly μG, λ[t] = μG. The current voting round ends with the Referee awarding a reward r̃i to each voter, with the reward being of magnitude G(λ[t]; μG, σG). Additionally, white noise N(0, σW) is independently added to the reward received by each voter. On the basis of their individual gains, the voters then decide, again independently, how to cast their votes on the next round. 3.3
Measuring Fluctuating Observation Noise in Goore Games
In order to develop a decentralized BS-TANB based scheme for solving the above problem, whose accuracy does not rely merely on conservative learning, it is crucial that we are able to determine the observation noise ($\sigma_{ob}^2$), needed by BS-TANB for its Bayesian computations. From the perspective of voter Vq, let $Y_q = \sum_{r \neq q} \alpha_r[t]$ be the total number of “Yes” votes found among the n − 1 votes cast by the other voters (r ≠ q). According to our Bayesian bandit scheme, each voter Vq, at any given iteration t of the game, casts its vote according to a Bernoulli distribution with success probability pq[t] = P(αq[t] = 1) = P(X1 > X0 | φtq) — the probability of voting “Yes”. Furthermore, initially, all voters vote “Yes” with probability pq[1] = 0.5, and based on Bayesian computations, gradually shift their probability of voting “Yes” towards either 0 or 1, as learning proceeds. This leads us to design a solution for the case where Yq is a sum of independent random variables of similar magnitude, in other words, where Yq is approximately normally distributed for large n, $Y_q \sim N(\mu_F^q, \sigma_F^q)$. Since each term in the summation is Bernoulli distributed, the mean of the sum becomes $\mu_F^q = \sum_{r \neq q} p_r[t]$ while the variance becomes $(\sigma_F^q)^2 = \sum_{r \neq q} p_r[t](1 - p_r[t])$. The above means that, essentially, we may assume that each voter Vq decides whether to add an additional “Yes” vote or not to a random sum of yes votes, $Y_q \sim N(\mu_F^q, \sigma_F^q)$. That is, the reward that
voter Vq receives when he votes either “Yes” (αq[t] = 1) or “No” (αq[t] = 0), becomes a function $G((Y_q + \alpha_q[t])/n)$ governed by the random variable $Y_q \sim N(\mu_F^q, \sigma_F^q)$ as well as the decision αq of voter Vq. Thus $E[G((Y_q + \alpha_q[t])/n)]$ is the expected reward received by voter Vq when pulling arm αq[t] and $Var[G((Y_q + \alpha_q[t])/n)]$ is the variance of the reward, which we will refer to as the observation noise, σob. Lemma 1. Let X be a normally distributed random variable, X ∼ N(μF, σF). The expected value E[G(X)] of a deterministic function G(X) ∼ N(μG, σG) of X then becomes:
$E[G(X)] = \frac{1}{\sqrt{2\pi(\sigma_G^2 + \sigma_F^2)}}\, e^{-\frac{(\mu_G - \mu_F)^2}{2(\sigma_G^2 + \sigma_F^2)}}$  (2)
Proof. The proof is found in [12] and is omitted here in the interest of brevity. Lemma 2. A deterministic function G(X) ∼ N (μG , σG ) of a normally distributed random variable, X ∼ N (μF , σF ), has the variance:
$Var[G(X)] = \frac{e^{-\frac{(\mu_G - \mu_F)^2}{\sigma_G^2 + 2\sigma_F^2}}}{2\pi\,\sigma_G\sqrt{\sigma_G^2 + 2\sigma_F^2}} - \frac{e^{-\frac{(\mu_G - \mu_F)^2}{\sigma_G^2 + \sigma_F^2}}}{2\pi(\sigma_G^2 + \sigma_F^2)}$  (3)
Proof. The proof is found in [12] and is omitted here in the interest of brevity. Since both the mean of G, μG, and the mean of F, μF, are unknown, the latter equation cannot be used directly to guide the bandit based learning. Instead, we consider the maximum of Var[G(X)] in terms of μF ∈ (0, 1). By considering the maximum, learning accuracy is prioritized, at the potential cost of reduced learning speed. It turns out that both μF and μG are eliminated from the equation when only considering the maximum of Var[G(X)] with respect to μF, as follows. Theorem 1. The maximum of the variance Var[G(X)] with respect to μF ∈ (0, 1) of the function G(X) ∼ N(μG, σG), with X ∼ N(μF, σF), is given by a closed-form expression (4) that depends on σG and σF only.
Proof. The proof is found in [12] and is omitted here in the interest of brevity. In other words, since σF in the above equation can be approximated based on the feedback σA from the Referee (see Fig. 1), we can find the worst case observation noise based on Theorem 1. Thus, essentially, we have found a closed form formula that approximates the worst case observation noise σob that each voter can apply adaptively in its Bayesian computations.
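The worst-case observation noise described here can also be obtained numerically, without the closed form: Var[G(X)] is computed by quadrature from its definition and maximised over μF ∈ (0, 1). The sketch below does this; the parameter values in the final line are illustrative (in practice σF would be derived from the Referee feedback σA).

```python
from math import exp, pi, sqrt

def gauss(x, mu, sigma):
    return exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def var_G(mu_F, sigma_F, mu_G, sigma_G, steps=4000):
    """Var[G(X)] for the Gaussian function G ~ N(mu_G, sigma_G) and X ~ N(mu_F, sigma_F),
    computed by midpoint quadrature over the effective support of X."""
    lo, hi = mu_F - 8 * sigma_F, mu_F + 8 * sigma_F
    dx = (hi - lo) / steps
    m1 = m2 = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        g, f = gauss(x, mu_G, sigma_G), gauss(x, mu_F, sigma_F)
        m1 += g * f * dx           # E[G(X)]
        m2 += g * g * f * dx       # E[G(X)^2]
    return m2 - m1 ** 2

def worst_case_noise(sigma_F, mu_G, sigma_G, grid=101):
    return max(var_G(i / (grid - 1.0), sigma_F, mu_G, sigma_G) for i in range(grid))

print(worst_case_noise(sigma_F=0.1, mu_G=0.375, sigma_G=0.2))   # illustrative sigma_F
```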
4
Empirical Results
In this section we evaluate the ADL-TAB scheme by comparing it with the currently best performing algorithm — the family of Bayesian techniques reported in [2]. Based on our comparison with these “reference” algorithms, it should be quite straightforward to also relate the ADL-TAB performance results to the performance of other similar algorithms. We have conducted numerous experiments using various reward distributions, including a wide range of G(λ)-functions and a wide range of voters, under varying degrees of observation noise. The full range of empirical results are reported in [12], and they all show the same trend. Thus, in this paper, we report performance on a representative subset of the experiment configurations, involving the 3, 5, and 10 player Goore Game. Performance is measured in terms of Regret — the difference between the sum of rewards expected after N successive rounds of the GG, and what would have been obtained by always casting the optimal number of “Yes” votes. For these experiment configurations, an ensemble of 1000 independent replications with different random number streams was performed to minimize the variance of the reported results. In order to investigate the performance of the schemes under a broad spectrum of environments, we test the schemes using three different representative G(λ) functions — one sloped, with optimum close to λ = 0.5, G ∼ N (0.35, 0.2), another one also sloped, but with optimum farther from λ = 0.5, G ∼ N (0.125, 0.2), and finally, one peaked reward function, also with optimum far from λ = 0.5 (thus, being the most challenging one). In Table 1, Regret is reported after 10, 100, 1000, and 10 000 iterations for both the new accelerating scheme and the traditional static scheme. As seen from the table, for all reported configurations, our ADL-TAB scheme not only learns faster initially, but also attains the best regret in the long run. Note that for the two bottom configurations, we use an augmented σF , σ F = c · σF , with c = 1.5, when the final observation noise σob is calculated. Indeed, the constant c can be used to handle the non-stationarity arising as the number of voters grows, as demonstrated in Table 2. Since ADL-TAB applies the standard deviation σG of the reward function G(λ) to find overall observation variance, it is interesting to see how robust the scheme is to distortion of σG . As summarized in Table 3, setting σG too low is better than setting it too high in the present setting. Indeed, performance improves slightly with a lower σG . Note that the above reported performance gap is reduced with the level of white noise added to G, as shown in Table 4. As the variance of the white noise raises to extreme values, the white noise dominates the overall observation noise, rendering the variance introduced by the voters insignificant. However, for realistic degrees of white noise, as also seen from the table, ADL-TAB clearly outperforms the static BS-TANB scheme. Thus, based on our empirical results, we conclude that ADL-TAB is the superior choice for the Goore Game, both when σG is known or slightly distorted, providing significantly better performance in all experiment configurations.
Table 1. Regret after 10, 100, 1000, and 10 000 iterations (3, 5, and 10 players)
Scheme        #Players  Function           10     100    1000    10 000
Accelerating  3         G ∼ N(0.125, 0.1)  11.56  26.72  30.96   33.17
Static        3         G ∼ N(0.125, 0.1)  11.63  27.27  34.88   47.20
Accelerating  3         G ∼ N(0.125, 0.2)  5.26   8.47   10.35   11.09
Static        3         G ∼ N(0.125, 0.2)  5.28   9.53   15.15   25.15
Accelerating  3         G ∼ N(0.375, 0.2)  6.62   10.86  11.99   12.63
Static        3         G ∼ N(0.375, 0.2)  6.73   12.15  14.36   17.72
Accelerating  5         G ∼ N(0.125, 0.1)  18.37  41.60  51.78   61.65
Static        5         G ∼ N(0.125, 0.1)  18.28  44.94  58.52   99.49
Accelerating  5         G ∼ N(0.125, 0.2)  6.92   12.94  22.99   60.80
Static        5         G ∼ N(0.125, 0.2)  7.01   15.39  32.94   69.86
Accelerating  5         G ∼ N(0.375, 0.2)  6.12   20.47  22.70   25.75
Static        5         G ∼ N(0.375, 0.2)  6.16   24.24  30.11   35.74
Accelerating  10        G ∼ N(0.125, 0.1)  32.81  93.82  133.65  443.7
Static        10        G ∼ N(0.125, 0.1)  32.84  99.27  143.67  549.8
Accelerating  10        G ∼ N(0.125, 0.2)  10.19  19.57  39.58   110.53
Static        10        G ∼ N(0.125, 0.2)  10.21  22.57  56.91   167.09
Accelerating  10        G ∼ N(0.375, 0.2)  4.40   31.20  113.42  116.65
Static        10        G ∼ N(0.375, 0.2)  4.41   32.03  163.40  197.31
Table 2. Performance with σF augmented, σ̂F = c · σF (10 players, G ∼ N(0.1, 0.1), σW = 0.1)
Scheme / c      1.0     1.25   1.5    1.75
Accelerating    1030.0  684.7  444.7  408.7
Static          965.6   624.3  550.4  414.2

Table 3. Performance with distorted σ̂G given to ADL-TAB (10 players, G ∼ N(0.125, 0.2), σW = 0.1)
σ̂G      0.85·σG  0.90·σG  0.95·σG  1.0·σG  1.05·σG  1.10·σG  1.15·σG
Regret  74.4     75.7     90.9     123.5   162.9    194.4    237.5

Table 4. Performance with varying degrees of white noise N(0, σW) (10 players, G ∼ N(0.375, 0.2))
Scheme / σW     0.01   0.05   0.1    0.5    1.0    5.0
Accelerating    56.6   54.8   61.1   123.4  315.8  2012.0
Static          120.5  121.4  121.6  184.4  371.7  2013.3
5
Conclusion and Further Work
In this paper we proposed a novel scheme, ADL-TAB, for decentralized decision making based on the Goore Game. Theoretical results concerning the variance of the observations made by each individual decision maker, enabled us to accelerate learning as exploration turns into exploitation. Indeed, our empirical results demonstrated that the accelerated learning improves both learning accuracy and speed, outperforming state-of-the-art Goore Game solution schemes. As further work, we intend to study how the Kalman filter can be incorporated into ADL-TAB, so that non-stationary behavior can be modeled and addressed in a principled manner. We are also currently investigating how the present result can be extended to other classes of decentralized decision making problems.
References 1. Iyer, R., Kleinrock, L.: QoS Control For Sensor Networks. In: IEEE International Conference on Communications, vol. 1, pp. 517–521 (2003) 2. Granmo, O.C.: Solving Two-Armed Bernoulli Bandit Problems Using a Bayesian Learning Automaton. International Journal of Intelligent Computing and Cybernetics 3(2), 207–234 (2010) 3. Thompson, W.R.: On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25, 285–294 (1933) 4. Norheim, T., Br˚ adland, T., Granmo, O.C., Oommen, B.J.: A Generic Solution to Multi-Armed Bernoulli Bandit Problems Based on Random Sampling from Sibling Conjugate Priors. In: Proceedings of the Second International Conference on Agents and Artificial Intelligence (ICAART 2010), INSTICC, pp. 36–44 (2010) 5. Granmo, O.C., Berg, S.: Solving Non-Stationary Bandit Problems by Random Sampling from Sibling Kalman Filters. In: Proceedings of the Twenty Third International Conference on Industrial, Engineering, and Other Applications of Applied Intelligent Systems (IEA-AIE 2010), pp. 199–208. Springer, Heidelberg (2010) 6. Scott, S.L.: A modern Bayesian look at the multi-armed bandit. Applied Stochastic Models in Business and Industry (26), 639–658 (2010) 7. May, B.C., Korda, N., Lee, A., Leslie, D.S.: Optimistic Bayesian sampling in contextual-bandit problems. Submitted to the Annals of Applied Probability (2011) 8. Tsetlin, M.L.: Automaton Theory and Modeling of Biological Systems. Academic Press, London (1973) 9. Narendra, K.S., Thathachar, M.A.L.: Learning Automata: An Introduction. Prentice-Hall, Englewood Cliffs (1989) 10. Chen, D., Varshney, P.K.: QoS Support in Wireless Sensor Networks: A Survey. In: The 2004 International Conference on Wireless Networks, ICWN 2004 (2004) 11. Cao, Y.U., Fukunaga, A.S., Kahng, A.: Cooperative Mobile Robotics: Antecedents and Directions. Autonomous Robots 4(1), 7–27 (1997) 12. Granmo, O.C., Glimsdal, S.: Accelerated Bayesian Learning for Decentralized TwoArmed Bandit Solutions to the Goore Game (2010), Unabridged version of this paper
Specification of Interlevel Relations for Agent Models in Multiple Abstraction Dimensions Jan Treur VU University Amsterdam, Department of Artificial Intelligence De Boelelaan 1081, 1081 HV Amsterdam, The Netherlands
[email protected] http://www.few.vu.nl/~treur
Abstract. Multiagent systems for a certain application area can be modelled at multiple levels of abstraction. Interlevel relations are a means to relate models from different abstraction levels. Three dimensions of abstraction often occurring are the process abstraction, temporal abstraction, and agent cluster abstraction dimension. In this paper a unifying formalisation is presented that can be used as a framework to specify interlevel relations for any of such dimensions. The approach is illustrated by showing how a variety of different types of abstraction relations between multi-agent system models can be formally specified in a unified manner. Keywords: interlevel relation, abstraction dimension.
1
Introduction
Different models describing the same process in reality, usually are based on different conceptualisations of this process. As they are models of the same process in which different concepts used may refer to the same aspects or elements in reality, a natural question is how in general two of such descriptions can be related to each other. In the literature on reduction relations it is described, for example, how by an interlevel relation a functional or cognitive model (or theory) can be related to a biochemical or neurological model, (e.g., [5], [20]). In [29] this notion was further developed by formalising (context-dependent) interlevel relations between cognitive agent models and neurological agent models. More in general, abstraction levels between multiagent system models may concern different dimensions of abstraction. For example, process abstraction can be considered from neurological to cognitive models (as addressed by the literature on reduction relations), and from cognitive to behavioural models (as is done in information hiding or from a behaviourist perspective; e.g., [4]). As another example, in temporal abstraction the detailed steps of a process are left out of consideration and instead the patterns are considered emerging from such steps. Yet another example is agent cluster abstraction: abstracting from individual agents to consider populations, groups or clusters of agents instead, as often happens in organisational modelling, ecological modelling, and system dynamics. In [8] a three-dimensional abstraction framework was introduced, addressing how a multi-agent model can be positioned in the three-dimensional space defined by the K.G. Mehrotra et al. (Eds.): IEA/AIE 2011, Part II, LNAI 6704, pp. 542–555, 2011. © Springer-Verlag Berlin Heidelberg 2011
dimensions mentioned. Relations between models at different abstraction levels were not addressed in [8]. The focus of the current paper is to provide a unified specification format for such interlevel relations. The unified formalisation for interlevel relations introduced here will cover (at least) these abstraction dimensions. The unifying formalisation was inspired by the use of interpretation mappings from logic (e.g., [28]) to describe reduction relations between cognitive and neurological agent models in [22] and [29], and to describe a mapping from a single agent model to a multiagent model in [11]. The formalisation of interlevel relations introduced here subsumes a number of notions known from the literature. As a first example, for the process abstraction dimension it subsumes not only reduction relations between cognitive and neurological models (e.g., [5], [20], [22], [29]), but also relations between behavioural agent models and cognitive agent models (e.g., [4], [27], [28], [29]). Furthermore, for the temporal abstraction dimension it subsumes the relation between emerging properties of a multi-agent system and the basic mechanisms for the agents (e.g., [1], [2], [3], [6]). Moreover, for the agent cluster dimension it subsumes the relation between agentbased models and population-based models (e.g., [7], [15], [30], [31]) or organisation models (e.g., [11], [14], [18], [23], [26], [32]). The introduced unifying formalisation can play a useful role as a specification format to handle different abstraction levels in analysis and design of multi-agent systems. By making explicit the relationships as part of the specifications, model descriptions (or views) at different levels of abstraction can be freely used while in the meantime keeping track on how they relate. This supports working at an appropriate level of abstraction for different subprocesses (e.g., with stakeholders in a more abstract fashion). The paper is organised as follows. In Section 2 the unified notion of interlevel relation is introduced in a general form. Section 3 addresses how the notion applies to specification of process abstraction interlevel relations, Section 4 of temporal abstraction interlevel relations, and Section 5 of agent cluster abstraction interlevel relations. Finally, Section 7 is a discussion.
2
Specification of Interlevel Relations: Format
Multi-agent system models are usually specified by temporal relationships (dynamic properties) between states; e.g., [3], [6], [10], but also [1], [15], [19], [25]. Interlevel relations between two models relate the states and dynamic properties specified as part of one model, to states and properties specified as part of the model at the other abstraction level. The introduced general format for specification of interlevel relations between two models involves three key elements: (1) An ontology mapping to relate basic state properties of the higher level model to state properties of the lower level. (2) A dynamic property mapping extending the basic ontology mapping to dynamic properties in a (reified) temporal predicate logical language (cf. [16]). This mapping can be applied to dynamic properties that are part of the higher level model, or to dynamic properties that describe patterns in the behaviour of the higher level model.
(3) Logical entailment relations for dynamic properties. Such relationships formally expressed as valid logical implications between temporal predicate logical expressions indicate how mapped higher level properties can be related to properties of the lower level model. These three key elements are explained in more detail below; see also the overall view depicted in Fig. 1. The format was inspired by the use of interpretation mappings from logic to describe reduction relations between cognitive and neurological models in [22]. Within logic (e.g., [28]) an interpretation mapping φ* from one theory T2 to another one T1 usually is defined as a (1) a mapping φ* from formulae F in the language of T2 to formulae in the language of T1, fulfilling (2) compositionality, and (3) T2 |─ F ⇒ T1 |─ φ*(F) for any formula F. The three key elements follow these criteria. Basic Ontology Mapping. An ontology mapping can be used to map a basic concept (atom) in the ontology used for the higher level to a concept in the ontology of the lower level (criterion (1) of an interpretation mapping). For example, suppose at some level basic state property b2 is given. Moreover, suppose at a lower level state property b1 is given. Then by a basic ontology mapping φ property b2 can be mapped onto b1, i.e., φ(b2) = b1 . When also b1 is a basic concept in the ontology of its level, then this φ is just a mapping between basic concepts. Note that sometimes also general relations, for example between numbers or time points, are involved that do not belong to any particular level; they are assumed to be mapped onto themselves: φ(T1 v(✓?) > v(✗?) > v(✗!). The strategies we present in this section all apply the lexicographic ordering in the sense that a preference between two objects is determined at the highest importance level of attributes where a preference can be derived. They differ in the way a preference is determined within one importance level. In a quantitative approach, one object is preferred over another on a certain importance level if the utility of the first object on that level is higher than the utility of the second. A utility function U takes the number of certainly true (n), defeasibly true (m), and defeasibly false (r) attributes that an object has at a certain importance level and returns that object’s utility at that level. A linearly additive utility function is defined as U(n, m, r) = n · v(✓!) + m · v(✓?) + r · v(✗?). This can be incorporated into our generic argumentation framework by means of inference rule 10 in Table 1b.
4.1 Lexicographic Strategy In this section we model a strategy in which no number of defeasibly true attributes can be valuated higher than a single certainly true attribute. We do this by applying a lexicographic ordering within one importance level. That is, within one level, we first count all certainly true attributes. If one object has more of those than another, the first object is preferred over the second (within this level). If both objects have the same number, we go on to count the defeasibly true attributes, and so on. When both objects have the same number of certainly true, defeasibly true and defeasibly false attributes, we go on to consider the next importance level. Instead of the inference schemes 8 and 9 from Table 1a, we use the inference schemes in Table 1c. Inference scheme 11 says that an object a is preferred over an object b if the number of attributes of object a at a certain importance level and with a certain certainty level is higher than the number of attributes of object b on the same importance and certainty levels. It is also required that a and b have the same number of attributes on any higher certainty level within the same importance level, and on any certainty level at importance levels higher than that of P. We model this by defining inference schemes 12 and 13 that undercut scheme 11 if either is not the case. Similar inference schemes can be used for equal preference.
Example 1. Consider the following situation, with two importance levels P ≈ Q and R ≈ S ≈ T:
     P    Q    R    S    T
a    ✗?   ✓!   ✓!   ✓!   ✗?
b    ✓!   ✗?   ✓?   ✓?   ✓?
Argument A claims that b is preferred over a because it has more defeasibly true attributes at the second importance level (has(b,[R],✓?,3), has(a,[R],✓?,0), 3 > 0, hence pref(b, a)). This argument is undercut by argument B, stating that since a and b have a different number of certainly true attributes at this importance level (has(a,[R],✓!,2), has(b,[R],✓!,0), ✓! > ✓?, 2 ≠ 0), preference cannot be derived on the basis of the number of defeasibly true attributes. The only justified preference argument is C, which states that a is preferred over b because it has more certainly true attributes at the second importance level (has(a,[R],✓!,2), has(b,[R],✓!,0), 2 > 0, hence pref(a, b)).
With the quantitative approach described above, we can model the same preferences if we choose the right valuations for the certainty levels. In this case, the valuations for ✓? and ✗? depend on the maximal number of attributes at an importance level, n. Since no number of defeasibly true attributes is enough to beat one certainly true attribute, we have v(✓?) < v(✓!)/n. Similarly, we have v(✗?) < v(✓?)/n. So, for example, v(✓?) = 1/(n+1) and v(✗?) = 1/(n+1)² would satisfy these constraints and produce the same preferences as the lexicographic strategy. Since the maximum number of attributes at one importance level is typically more than one, both v(✓?) and v(✗?) would be low, certainly below p(✓?) and most probably below p(✗?), which means that this strategy is risk averse.
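A compact sketch of the lexicographic comparison described in this subsection, applied to the data of Example 1 (the string encoding of the certainty levels is an assumption of the sketch, not notation from the paper).

```python
def level_counts(obj, level):
    """(#certainly true, #defeasibly true, #defeasibly false) at one importance level."""
    vals = list(obj[level].values())
    return (vals.count("yes!"), vals.count("yes?"), vals.count("no?"))

def lexicographic_pref(a, b, levels):
    for level in levels:                        # most important level first
        ca, cb = level_counts(a, level), level_counts(b, level)
        if ca != cb:
            return "a" if ca > cb else "b"      # Python tuple order = lexicographic
    return "equal"

# Example 1, with yes! / yes? / no? standing for the certainty levels of the paper.
a = {"PQ": {"P": "no?", "Q": "yes!"},
     "RST": {"R": "yes!", "S": "yes!", "T": "no?"}}
b = {"PQ": {"P": "yes!", "Q": "no?"},
     "RST": {"R": "yes?", "S": "yes?", "T": "yes?"}}
print(lexicographic_pref(a, b, ["PQ", "RST"]))  # -> "a", as argued by argument C
```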
4.2 Adaptive Strategy Up till now we have been specific in assigning valuations to certainty levels. But the probability of truth associated with a certainty level is not so specific. So maybe a user is hesitant to specify exact values for certainty levels, especially if he is risk neutral or only moderately risk averse or risk prone. In that case the valuation of a certainty level would have to be the same as or close to the probability of truth, which is unknown. We do know some lower and upper bounds: 0.5 < p(✓?) < 1 and 0 < p(✗?) < 0.5. We can use these to construct worst and best scenarios. In the worst case, v(✓?) = 0.5 and v(✗?) = 0, and in the best case, v(✓?) = 1 and v(✗?) = 0.5. Now we can use the following intuition. If the worst possible case for object a is still preferred over the best possible case for object b, then a has to be preferred over b (on some importance level). This is formalised by inference rule 14. Two objects are equally preferred if they are equally preferred in every possible scenario, which can only be the case if the probabilities of truth are known exactly, or neither object has any defeasibly true or false attributes. There are also cases in which no preference can be derived. To facilitate the use of the undercutting scheme 9, we use inference scheme 15, which states that unless two triples of numbers can be derived to be equal according to B, they are unequal. It may happen that the user is able to give more information regarding the probabilities of truth of the defeasible certainty levels. If, for example, a user knows that p(✓?) lies between 0.8 and 0.85, the upper and lower values for ✓? could be adapted accordingly. This makes this strategy adaptive to the background knowledge of the user. The strategy is less decisive than the others we presented, since it is not always able to derive a preference between two objects. But it can also be considered safer because it takes into account the uncertainty of the probability of truth of defeasible information.
Example 2. Consider the same situation again. If nothing specific is known about the probabilities of truth of ✓? and ✗?, no preference can be derived, because the best scenario for a is better than the worst scenario for b and the best scenario for b is better than the worst scenario for a. But if we would know that 0.8 < p(✓?) < 0.9 and 0.1 < p(✗?) < 0.2, we can construct the following argument: from has(a,[R],2,0,1), has(b,[R],0,3,0) and 2 + 0 + 0.1 < 0 + 2.4 + 0, conclude pref(b, a).
In the lexicographic strategy, triples ⟨n, m, r⟩ are compared with the ordering ≥B: ⟨n, m, r⟩ ≥B ⟨n′, m′, r′⟩ iff n > n′, or n = n′ and m > m′, or n = n′ and m = m′ and r > r′, or n = n′ and m = m′ and r = r′. Since this relation is complete (for any n, m, r, n′, m′, r′ we have either ⟨n, m, r⟩ ≥B ⟨n′, m′, r′⟩ or ⟨n′, m′, r′⟩ ≥B ⟨n, m, r⟩), the resulting preference order is also complete. In the adaptive strategy, triples are mapped to a range between the worst case and the best case. Since ranges may overlap, this ordering is not complete. One case where no preference can be derived was shown in Example 2.
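The worst-case/best-case comparison of the adaptive strategy can be sketched as follows; the triples are the (certainly true, defeasibly true, defeasibly false) counts at one importance level, and the bounds default to the generic intervals 0.5 < p(✓?) < 1 and 0 < p(✗?) < 0.5.

```python
def value_range(n, m, r, p_yes=(0.5, 1.0), p_no=(0.0, 0.5)):
    """Worst/best valuation of (n certainly true, m defeasibly true, r defeasibly false)."""
    return n + m * p_yes[0] + r * p_no[0], n + m * p_yes[1] + r * p_no[1]

def adaptive_pref(a, b, p_yes=(0.5, 1.0), p_no=(0.0, 0.5)):
    worst_a, best_a = value_range(*a, p_yes, p_no)
    worst_b, best_b = value_range(*b, p_yes, p_no)
    if worst_a > best_b:
        return "a"
    if worst_b > best_a:
        return "b"
    return "undecided"

# Example 2 at the second importance level: a = (2, 0, 1), b = (0, 3, 0).
print(adaptive_pref((2, 0, 1), (0, 3, 0)))                            # undecided
print(adaptive_pref((2, 0, 1), (0, 3, 0), (0.8, 0.9), (0.1, 0.2)))    # -> "b"
```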
5 Conclusion In this paper we have made the following contributions. We discussed the risk involved in basing preferences on defeasible information and identified the relation between this type of risk and risk as defined in utility theory. We then proposed a generic argumentation framework to reason about multi-attribute preferences in the context of defeasible information. Finally, we presented several qualitative strategies to deal with risk sensitive preferences, which were incorporated into the generic framework. In our future work we would like to evaluate the strategies in more detail. It would be interesting to understand the properties of the adaptive strategy better. We would also like to distinguish more explicitly between mental attitudes such as beliefs, goals, desires and preferences. This will also allow us to reason about these attitudes, for example that a certain preference we have is based on some specific beliefs. We hope to gain insight from modal preference languages with belief operators. Other interesting areas for future work include the representation of dependent preferences, different degrees of satisfaction of attributes, and preferences based on underlying interests. Acknowledgements. This research is supported by the Dutch Technology Foundation STW, applied science division of NWO and the Technology Program of the Ministry of Economic Affairs. It is part of the Pocket Negotiator project with grant number VICIproject 08075.
References 1. Visser, W., Hindriks, K.V., Jonker, C.M.: Argumentation-based preference modelling with incomplete information. In: Dix, J., Fisher, M., Nov´ak, P. (eds.) CLIMA X. LNCS, vol. 6214, pp. 141–157. Springer, Heidelberg (2010) 2. Keeney, R.L., Raiffa, H.: Decisions with multiple objectives: preferences and value tradeoffs. Cambridge University Press, Cambridge (1993) 3. Dubois, D., Fargier, H., Perny, P.: Qualitative decision theory with preference relations and comparative uncertainty: An axiomatic approach. Artif. Intell. 148, 219–260 (2003) 4. Dubois, D., Fargier, H., Bonnefon, J.F.: On the qualitative comparison of decisions having positive and negative features. J. Artif. Intell. Res. 32, 385–417 (2008) 5. Bonet, B., Geffner, H.: Arguing for decisions: A qualitative model of decision making. In: Proc. UAI, pp. 98–105 (1996) 6. Amgoud, L., Prade, H.: Using arguments for making and explaining decisions. Artif. Intell. 173(3-4), 413–436 (2009) 7. Brewka, G.: A rank based description language for qualitative preferences. In: Proc. ECAI, pp. 303–307 (2004) 8. Bonnefon, J.F., Fargier, H.: Comparing sets of positive and negative arguments: Empirical assessment of seven qualitative rules. In: Proc. ECAI, pp. 16–20 (2006) 9. Dung, P.M.: On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artif. Intell. 77, 321–357 (1995) 10. Vreeswijk, G.A.W.: Abstract argumentation systems. Artif. Intell. 90(1-2), 225–279 (1997)
Agent-Based Analysis of Patterns in Crowd Behaviour Involving Contagion of Mental States Tibor Bosse, Mark Hoogendoorn, Michel C.A. Klein, Jan Treur, and C. Natalie van der Wal VU University Amsterdam, Department of Artificial Intelligence De Boelelaan 1081, 1081 HV Amsterdam, The Netherlands {tbosse,mhoogen,mcaklein,treur,cn.van.der.wal}@few.vu.nl http://www.few.vu.nl/~{tbosse,mhoogen,mcaklein,treur, cn.van.der.wal}
Abstract. In this paper an agent-based analysis is made of patterns in crowd behaviour, in particular to simulate a real-life incident that took place on May 4, 2010 in Amsterdam. As a basis, an existing agent-based model is used for contagion of emotions, beliefs and intentions. From available video material and witness reports, useful empirical data were extracted. Similar patterns were achieved in simulations, whereby some of the parameters of the model were tuned to the case addressed, and most parameters were assigned default values. The results show the inclusion of contagion of belief, emotion, and intention states of agents results in better reproduction of the incident than non-inclusion. Keywords: crowd behaviour, contagion, emotion, belief, intention.
1 Introduction Behavioural patterns emerging in large crowds are often not easy to regulate. Various examples have shown how things can easily get out of control when many people come together during big events. Especially when in a crowd, emotion spirals (e.g., for aggression or fear) develop to high levels, the consequences can be devastating. In this paper, it is analysed what happened on Dam square in Amsterdam at the 4th of May in 2010, when large numbers gathered for the national remembrance of the dead (‘dodenherdenking’). In the middle of a two-minute period of silence, one person started shouting, causing panic to occur among the people present. What happened there, as a result of a panic spiral, was a relatively mild case in which ‘only’ a number of persons ended up in hospitals with fractures and bruises. In such situations, for each person involved, both cognitive and affective states and their intra-person interaction play a role. In this paper, beliefs and intentions are considered from the cognitive perspective, as they usually are the basis for actual behaviour: e.g., running away from a place that is believed to be dangerous. From the affective perspective, emotions are considered, such as fear, but also positive emotions for certain actions that are possible: for example, to go to a place believed to be safe. On the one hand such internally interacting cognitive and affective states are K.G. Mehrotra et al. (Eds.): IEA/AIE 2011, Part II, LNAI 6704, pp. 566–577, 2011. © Springer-Verlag Berlin Heidelberg 2011
individual, private states, but on the other hand they are easily affected, via verbal and/or nonverbal inter-person interaction, by similar states of other persons. Exploiting insights from Social Neuroscience about the dynamics of such states and their intra- and interpersonal interaction, an agent-based model was presented in [5], which we refer to as the ASCRIBE model (Agent-based Social Contagion Regarding Intentions, Beliefs and Emotions). For each person the ASCRIBE model takes into account a number of parameters representing personal characteristics, for example, expressivity and openness for emotions and other mental states. The current model, ASCRIBE_May4, uses ASCRIBE in an adapted form to simulate the empirical data gathered for the May 4 incident. As a first step, useful empirical data were extracted from available video material and witness reports. In order to specialise the existing agent-based model to this case, values for most of the parameters of the model were set by hand at certain default values, whereas values of other parameters were automatically tuned by use of a parameter tuning method developed earlier; cf. [2]. By comparing different default settings for the hand-set parameters relating to contagion of emotions, beliefs and intentions, it was possible to analyse the contribution of contagion in the model: parameter settings indicating low or no contagion show higher deviations from the empirical data. In this paper, Section 2 presents a brief overview of the ASCRIBE model. Section 3 describes the May 4 incident and how empirical data were extracted from the available material. Section 4 discusses how the model was extended and specialised for the case study addressed. Section 5 describes the parameter estimation method by which parameters of the model were tuned to cover the patterns shown in the empirical data. Section 6 discusses the results and Section 7 concludes the paper.
2 Overview of the Agent-Based Model Used
The agent-based model ASCRIBE that was used (cf. [5]) has been inspired by some concepts and principles from Neuroscience. One of them is the concept of a mirror neuron (e.g., [6], [9], [10]). Such a neuron is not only active in preparation for certain actions or bodily changes but also when the person observes somebody else intending or performing the action or body change. When states of other persons are mirrored by some of the person's own states, which at the same time play a role in generating their own behaviour, this provides an effective basic mechanism for how, in a social context, persons fundamentally affect each other's mental states and behaviour. Moreover, the model exploits the concept of a somatic marker (cf. [1]), which describes how emotions felt play a central role in decision making. Each considered decision option induces (via an emotional response) a feeling which is used to mark the option. Such somatic markers are used as a basis to choose an option. Within the ASCRIBE model it is assumed that at the individual intra-agent level, the strength of an intention for a certain decision option depends on the agent's beliefs and emotions in relation to that option (intra-agent interaction from beliefs and emotions to intentions). Moreover, it is assumed that beliefs generate certain emotions (e.g., fear) that in turn may affect the strength of beliefs (mutual intra-agent interaction between beliefs and emotions). To describe inter-agent interaction, a mirroring mechanism is used for the three different mental states considered: emotions (fear, and emotions felt about a certain
decision option), beliefs (e.g., about safe places), and intentions (for certain decision options). Below, only a brief overview is given of the central idea of the model; for a complete overview, see [5]. The model is based upon the notion of contagion strength γ_SBA, which is the strength with which an agent B influences agent A with respect to a certain mental state S (which, for example, can be an emotion, a belief, or an intention). It depends on the expressiveness ε_SB of the sender B, the strength of the channel α_SBA from sender B to receiver A, and the openness δ_SA of the receiver:
γ_SBA = ε_SB · α_SBA · δ_SA
The level q_SA for mental state S of agent A is updated using the overall contagion strength of all agents B not equal to agent A:
γ_SA = Σ_{B≠A} γ_SBA
Then the weighted external impact q_SA* of the mental states S of all agents B upon agent A is determined by:
q_SA* = Σ_{B≠A} γ_SBA · q_SB / γ_SA
Given these, state S of agent A is updated by:
q_SA(t+Δt) = q_SA(t) + ψ_SA · γ_SA · [ f(q_SA*(t), q_SA(t)) − q_SA(t) ] · Δt
Here ψ_SA is an update speed factor for S, and f(V1, V2) a combination function. This expresses that the value for q_SA is defined by taking the old value and adding the change term, which basically is based on the difference between f(q_SA*(t), q_SA(t)) and q_SA(t). The change also depends on two factors: the overall contagion strength γ_SA (i.e., the higher this γ_SA, the more rapid the change) and the speed factor ψ_SA. Within the definition of the combination function f(V1, V2) a number of further personality characteristics determine the precise influence of the contagion. First, a factor η_SA is distinguished which expresses the tendency of an agent to absorb or amplify the level of a state S, whereas another personality characteristic β_SA represents the bias towards reducing or increasing the value of the state S. Thus, the combination function f(V1, V2) is defined as follows:
f(V1, V2) = η_SA · [ β_SA · (1 − (1 − V1)(1 − V2)) + (1 − β_SA) · V1 · V2 ] + (1 − η_SA) · V1
This general model for any state S is applied to four types of states:
fear of agent A: q_fearA(t)
emotion for option O of agent A: q_emotion(O)A(t)
intention indication for option O of agent A: q_intention(O)A(t)
belief in X of agent A: q_belief(X)A(t)
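To make the update rule concrete, the following Python sketch computes one Euler step of the contagion update for a single state S of one agent, using the combination function f defined above. It is an illustration only, not the authors' Matlab implementation; all function and variable names are ours.

```python
def combine(v1, v2, eta, beta):
    """Combination function f(V1, V2): a mix of an amplifying part
    1 - (1 - V1)(1 - V2) and an absorbing part V1*V2, blended with V1."""
    return eta * (beta * (1 - (1 - v1) * (1 - v2)) + (1 - beta) * v1 * v2) + (1 - eta) * v1

def update_state(q, A, eps, alpha, delta, psi, eta, beta, dt):
    """One Euler step of the contagion update for one mental state S of agent A.

    q     : dict agent -> current level q_SA of the state (values in [0, 1])
    eps   : dict agent -> expressiveness eps_SB
    alpha : dict (B, A) -> channel strength alpha_SBA
    delta : dict agent -> openness delta_SA
    psi, eta, beta : update speed, amplification tendency and bias of agent A
    """
    others = [B for B in q if B != A]
    # contagion strengths gamma_SBA = eps_SB * alpha_SBA * delta_SA
    gammas = {B: eps[B] * alpha[(B, A)] * delta[A] for B in others}
    gamma_A = sum(gammas.values())              # overall contagion strength gamma_SA
    if gamma_A == 0:
        return q[A]                             # no incoming influence: state unchanged
    q_star = sum(gammas[B] * q[B] for B in others) / gamma_A   # weighted external impact q_SA*
    return q[A] + psi * gamma_A * (combine(q_star, q[A], eta, beta) - q[A]) * dt

# toy usage: three agents, one state (e.g. fear), full channels
q = {'a1': 0.1, 'a2': 0.9, 'a3': 0.2}
eps = {'a1': 0.5, 'a2': 0.9, 'a3': 0.5}
delta = {'a1': 0.8, 'a2': 0.3, 'a3': 0.8}
alpha = {(b, a): 1.0 for a in q for b in q if a != b}
print(update_state(q, 'a1', eps, alpha, delta, psi=1.0, eta=0.5, beta=0.5, dt=0.1))
```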
The total number of such states in the model depends on the number o of decision options and the number b of beliefs: there are 1 + 2o + b states per agent. When a is the number of agents, then within the whole multi-agent system (1 + 2o + b)·a individual states play a role. In principle all parameters ε_SA, δ_SA, η_SA, β_SA, ψ_SA, α_SBA for different states S and agents A, B may have different values. Therefore the overall number of these parameter values is quadratic in the number a of agents and linear in the number o of options and the number b of beliefs; it can be calculated as 5a(1 + 2o + b) + a²(1 + 2o + b) = a(a + 5)(1 + 2o + b) parameter values. Furthermore, interactions between different states are considered within the agent-based model. First, the emotions have an effect on the beliefs. This influence has been modelled for the emotion of fear. The personality characteristics ε_belief(X)A, δ_belief(X)A, η_belief(X)A, β_belief(X)A and interaction characteristic α_belief(X)BA are assumed to be dynamic,
depending on the fear level. In addition, the opposite direction is modelled: levels of emotions being influenced by levels of beliefs. Finally, the impact of levels of beliefs and emotions related to certain options on levels of intentions for these options is modelled. For more detailed specifications of these interactions, see [5]. To model these interactions, additional person-specific parameters are involved:
µδ_beliefA, µη_beliefA, µβ_beliefA: adaptation speed for δ, η, β for beliefs
σ_A, τ_A: steepness and threshold values for adaptation
ζ_A: optimistic/pessimistic bias upon fear
ν_A: weight of fear against beliefs
ω_X,fear,A: weight of information X for fear
ω_OEA1: weight of the group impact on the emotion of A for O
ω_OBA1: weight for the own belief impact on the emotion of A for O
ω_OIA1: weight for the group impact on the intention of A for O
ω_OEA2: weight for the own emotion impact on the intention of A for O
ω_OBA2: weight for the own belief impact on the intention of A for O
These parameters add to the overall number of parameter values needed, providing 3a + 2a + a + a + ba + 5oa = a(b + 5o + 7) parameter values. The current model ASCRIBE_May4 is specialised to the May 4 case and therefore adds a few parameters; see Section 4.
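For illustration: the case study below uses a = 35 agents and o = 9 decision options; assuming, purely hypothetically, b = 10 beliefs, each agent has 1 + 2·9 + 10 = 29 states (1,015 states in the whole multi-agent system), the basic model involves 35·40·29 = 40,600 parameter values, and the interaction parameters above add 35·(10 + 45 + 7) = 2,170 more. This illustrates why, in Section 6, most parameters are fixed at default values before tuning.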
3 Case Study: The May 4 Incident
The computational model described above was applied to the May 4 incident in Amsterdam (Netherlands). The incident was as follows. In the evening of the 4th of May, around 20,000 people gathered on Dam Square in Amsterdam for the National Remembrance of the dead. At 20:00, everybody in the Netherlands, including the crowd on Dam Square, observed 2 minutes of silence to remember the dead. The 20,000 people on Dam Square were separated into compartments by fences and officials. At 20:01 a man in the crowd disturbed the silence by screaming loudly. People standing around him could see that this man looked a bit 'crazy' or 'lost', and they did not move. Other people, not within a few meters of the screaming source, started to panic and ran away from the screaming man. The panic spread through the people that were running away, who infected each other with their emotions and with intentions to flee in a certain direction; the spread was reinforced by a loud 'BANG' that was heard about 3 seconds after the start of the scream. Queen Beatrix and the other royal family members present were escorted to a safe location nearby. In total, 64 persons got injured, suffering broken bones and scrapes after being pushed in a certain direction or run over by the crowd. The police removed the screaming man and had the situation under control within 2 minutes. After 2½ minutes, the master of ceremonies announced to the crowd that a person had become ill and had received care, and asked everybody to take their initial place again so that the ceremony could continue. After this, the ceremony continued. For a short movie with images from the live broadcast on Dutch National Television, see http://www.youtube.com/watch?v=0cEQp8OQj2Y; it shows how within two minutes the crowd starts to panic and move. The actual live broadcast of the National Remembrance on Dutch National Television was acquired in HD quality. In this video, one can see the crowd on Dam
Square flee from the perspective shown in Fig. 1. The video includes the cuts and editing that were done during the live broadcast, because the uncut/unedited video material of all cameras that were filming that day was not saved.
Fig. 1. Still image of the people on Dam Square starting to flee. The circle on the bottom right indicates the location of the yelling person
From the total broadcast, a shorter 3-minute .mpeg movie was made, starting from the moment where the crowd was in silence and a person started to scream loudly. In this 3-minute movie there are two time slots that were processed further, namely the parts from 11-17 seconds and 20-27 seconds. In these seconds, the camera angle from Fig. 1 was visible and the direction and speed of the movements of the people could be analysed. These specific parts of 15 seconds in total length were analysed as follows. The 3-minute .mpeg movie was cut into still images, to detect the location of people by hand. This was done with a computer program called FFmpeg¹. Ten still images per second were chosen for the cutting, to be able to detect the movements of running people frame by frame. The location/movement detection of the crowd was done as follows: the still images were viewed in a program called IrfanView², where the coordinates of a mouse click on the picture are shown in the upper left corner. The total of 130 frames was analysed by hand. In an Excel file all coordinates of selected persons in the frames were collected. Not all people in the crowd could be analysed by hand, because of the quantity, but also because it was not possible to trace every 'dot' (person) over multiple still images. In total 35 persons were traced. Persons in different positions in the crowd whose movements coincided with those of the people around them were chosen, so that these target subjects can represent multiple people around them. The density of the crowd around a target subject was also acquired, which could be used to build a representative large-scale simulation of tens of thousands of agents. Since the exact number of surrounding persons of a target could not be
¹ FFmpeg is a cross-platform program to record, convert and stream audio and video; http://www.ffmpeg.org/.
² IrfanView is a graphic viewer; see http://www.irfanview.com/.
distinguished in the video, a distinction between 3 density levels was made: high, medium and low. The size of the circle around the target subject in which density was measured is shown on the right in the picture. The next step was to correct for the angle the camera makes with the floor and to recalculate the coordinates into coordinates that fit a bird's eye view on Dam Square, perpendicular to the floor. For the transformation of the pixel coordinates in the image to the location on the map as seen from above, both the horizontal and vertical distances in pixels between corners of buildings near the center of the image were calculated. These were compared with the real distances in meters to calculate the average number of pixels per meter in the image for the x and y axis near the center of the image. This method results in a distortion for points further away from the center of the image; however, given the distance of the camera from the area of interest and the fact that most of the movements were in the middle horizontal band of the image, the distortion is limited. Eventually, the positions in meters from corners of the buildings were translated to positions in pixels on a 600x800 map of the area, using offsets and scaling. Specifically, the following formulae are used to translate movements in pixels to movements in meters:
x_meter = x_pixel / 22 and y_meter = y_pixel / 8
These were then transformed to the map using the following formulae:
x_map = (x_meter * 5.15) + 136
y_map = (y_meter * 5.15) - 167
The bird's eye view perspective used in the computational model can be seen in Fig. 2 (a 600 x 800 pixel image of the Dam Square area). The resulting 600 by 800 pixel figure was represented in the simulation in Matlab as a grid of 300 by 400. Locations of certain obstacles, like buildings and fences, were also transformed with the above formulae from the camera angle into the bird's eye view.
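As an illustration, the transformation chain described above can be composed into a single function. This is a sketch: the constants 22, 8, 5.15, 136 and -167 come from the text, but how the two steps compose into one position transform is our reading, and the original processing was done by hand and in Matlab, not with this code.

```python
def camera_to_map(x_pixel, y_pixel):
    """Illustrative composition of the two transformation steps described above.

    Step 1: camera-image pixels -> meters, correcting for the camera angle.
    Step 2: meters -> pixels on the 600x800 bird's-eye-view map (5.15 px per meter).
    """
    x_meter = x_pixel / 22.0
    y_meter = y_pixel / 8.0
    x_map = (x_meter * 5.15) + 136
    y_map = (y_meter * 5.15) - 167
    return x_map, y_map

# example: a point at pixel (44, 16) in the camera image
print(camera_to_map(44, 16))   # -> (146.3, -156.7)
```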
4 Extending and Specialising the Model for the May 4 Case
To tailor the ASCRIBE model towards this domain, a number of steps were taken.
Case-specific states. First of all, the relevant states for the agents have been distinguished. In this case, the emotion, belief and intention states relate to the options for each agent. A total of 9 options is available, including 'remain standing' and moving
in any of the eight compass directions (N, NE, E, SE, S, SW, W, NW). Besides these, there is an additional belief about the current situation, which expresses how positively a person judges the current situation (0 a negative judgment, 1 a positive judgment). Finally, the emotions for each of the options and the emotion fear are represented.
Channel strength. In the scenario described above, the channel strengths between the various agents depend on the physical location of the agents. If other agents are close, the channel strength is high, whereas it is low or 0 in case agents are far apart. Therefore, a threshold function was used expressing within which reach agents still influence each other in a significant manner:
α_SBA(t) = 1 − 1 / (1 + e^(−σ·(distance_BA(t) − τ_distance)))
Here σ and τ_distance are global parameters and distance_BA is the Euclidean distance between the positions (x_A(t), y_A(t)) and (x_B(t), y_B(t)) of A and B at time t.
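A minimal Python sketch of this distance-dependent channel strength; the parameter values in the example call are illustrative, not the tuned values.

```python
import math

def channel_strength(pos_A, pos_B, sigma, tau_distance):
    """Distance-dependent channel strength alpha_SBA(t) as described above:
    close to 1 when agents are near each other and dropping towards 0 around
    the threshold distance tau_distance."""
    distance = math.hypot(pos_A[0] - pos_B[0], pos_A[1] - pos_B[1])   # Euclidean distance
    return 1.0 - 1.0 / (1.0 + math.exp(-sigma * (distance - tau_distance)))

# illustrative parameter values
print(channel_strength((0.0, 0.0), (3.0, 4.0), sigma=1.0, tau_distance=10.0))
```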
Movement. The movement of the agents directly depends upon their intentions. The highest feasible intention is selected (in cases where certain movements are obstructed, the next highest intention is selected). For each selected option O, the movement x_movement(O) on the x-axis and y_movement(O) on the y-axis is specified; e.g., the option for going south means -1 step on the y-axis and none on the x-axis: x_movement(O) = 0 and y_movement(O) = -1. The actual point to which the agent will move is then calculated by taking the previous point and adding the movement of the agent during a certain period to it. How far the agent moves depends upon the strength of the intention for the selected option and the maximum speed with which the agent can move. If the intention is maximal (i.e., 1), the agent moves with the maximum speed; if the intention is minimal (i.e., 0), the agent does not move. The model that establishes this behaviour is as follows:
x_A(t+Δt) = x_A(t) + max_speed_A · q_intention(O)A(t) · x_movement(O) · Δt
y_A(t+Δt) = y_A(t) + max_speed_A · q_intention(O)A(t) · y_movement(O) · Δt
Here the maximum speeds max_speed_A are agent-specific parameters.
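The following Python sketch illustrates this movement rule. The direction table, the is_feasible placeholder for obstacle handling, and all names are illustrative assumptions, not the authors' implementation.

```python
# option name -> (x_movement, y_movement); 'stand' is the 'remain standing' option
DIRECTIONS = {
    'stand': (0, 0), 'N': (0, 1), 'NE': (1, 1), 'E': (1, 0), 'SE': (1, -1),
    'S': (0, -1), 'SW': (-1, -1), 'W': (-1, 0), 'NW': (-1, 1),
}

def move_agent(x, y, intentions, max_speed, dt, is_feasible=lambda option: True):
    """One movement step for an agent: pick the strongest feasible intention and
    move in its direction with a speed proportional to that intention's strength."""
    # consider options from strongest to weakest intention
    for option in sorted(intentions, key=intentions.get, reverse=True):
        if is_feasible(option):
            q = intentions[option]                      # q_intention(O)A(t)
            dx, dy = DIRECTIONS[option]
            return x + max_speed * q * dx * dt, y + max_speed * q * dy * dt
    return x, y                                          # nothing feasible: stay put

# toy usage: an agent that mostly wants to flee south-west
intentions = {'stand': 0.2, 'SW': 0.9, 'W': 0.4}
print(move_agent(100.0, 200.0, intentions, max_speed=2.0, dt=0.1))
```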
5 The Parameter Tuning Method Used
As explained above, the computational model contains a large number of parameters; these parameters address various aspects of the agents involved, including their personality characteristics (e.g., expressiveness, openness, and tendency to absorb or amplify mental states), physical properties (e.g., minimum and maximum speed, and limit of their sight), and characteristics of their mutual interactions (e.g., channel strength between sender and receiver). The accuracy of the model (i.e., its ability to reproduce the real world data as closely as possible) heavily depends on the settings of these parameters. Therefore, parameter estimation techniques [12] have been applied to learn the optimal values for the parameters involved. In order to determine what is 'optimal', first an error measure needs to be defined. The main goal is to reproduce the movements of the people involved in the scenario; thus it was decided to take the average (Euclidean) distance (over all agents and time points) between the actual and simulated location:
ε = Σ_a Σ_t √( (x(a,t,sim) − x(a,t,data))² + (y(a,t,sim) − y(a,t,data))² ) / (#agents · #timepoints)
Here, x(a, t, sim) is the x-coordinate of agent a at time point t in the simulation, and x(a, t, data) the same in the real data (similarly for the y-coordinates); both are in meters. Next, the relevant parameters were tuned to reduce this error. To this end, the approach described in detail in Sections 3 and 4 of [2] was used. This approach makes use of the notion of sensitivity of variables to certain parameter changes. Roughly speaking, for a given set of parameter settings, the idea is to make a small change in one of the parameters involved, and to observe how such a change influences the change of the variable of interest (in this case the error). Here, 'observing' means running the simulation twice, i.e., once with the original parameter settings, and once with the same settings where one parameter has been slightly changed. Formally, the sensitivity S_X,P of a change ΔX in a variable X to a change ΔP in a parameter P is defined as follows (note that this sensitivity is in fact the partial derivative ∂X/∂P): S_X,P = ΔX / ΔP. Based on this notion of sensitivity, the adaptation process as a whole is an iterative process which roughly consists of: 1) calculating sensitivities for all parameters under consideration, and 2) using these sensitivities to calculate new values for all parameters. This second step is done by changing each parameter by a certain amount ΔP, which is determined as follows: ΔP = -λ · ΔX / S_X,P. Here, ΔX is the deviation found between the actual and simulated value of variable X, and λ is a speed factor. Note that, since in the current case X represents the error, the 'actual value' of X is of course 0, so ΔX simply equals ε in the simulation.
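The following Python sketch implements the error measure and a simple version of this sensitivity-based update. It is written in the spirit of the method described above; the perturbation size dp, the fixed number of iterations, and the toy simulate function are assumptions for illustration and are not taken from [2].

```python
import math

def error(simulate, params, data):
    """Average Euclidean distance (in meters) between simulated and observed
    positions over all agents and time points, as in the error measure above.
    `data` and the result of `simulate(params)` are dicts (agent, t) -> (x, y)."""
    sim = simulate(params)
    dists = [math.hypot(sim[k][0] - data[k][0], sim[k][1] - data[k][1]) for k in data]
    return sum(dists) / len(dists)

def tune(simulate, params, data, steps=20, lam=0.1, dp=1e-3):
    """Iterative sensitivity-based tuning sketch (illustrative only)."""
    params = dict(params)
    for _ in range(steps):
        base = error(simulate, params, data)                 # current deviation (= epsilon)
        # step 1: sensitivities S_X,P = dX / dP for every parameter P
        sens = {}
        for name in params:
            perturbed = dict(params, **{name: params[name] + dp})
            sens[name] = (error(simulate, perturbed, data) - base) / dp
        # step 2: update every parameter by dP = -lambda * dX / S_X,P
        for name, s in sens.items():
            if s != 0:
                params[name] += -lam * base / s
    return params

# toy usage: fit a single drift speed to one agent moving at 1.5 m per time step
data = {('a1', t): (1.5 * t, 0.0) for t in range(5)}
simulate = lambda p: {('a1', t): (p['speed'] * t, 0.0) for t in range(5)}
print(tune(simulate, {'speed': 1.0}, data))                  # approaches {'speed': 1.5}
```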
6 Results
This section presents the results of specialising and tuning the agent-based model with 35 agents to the real-world data of the May 4 incident. The results are presented for the first part of the data (i.e., seconds 11-17 of the 3-minute movie). The number of parameters to tune was large; therefore, before starting the tuning process, the settings for a large majority of the parameters were fixed at default values (see Table 1). For example, parameters with a relatively small sensitivity were left out of consideration for the tuning process (cf. [2]). For these parameters, reasonable default settings were chosen by hand (based on experimentation). The values of the remaining parameters (among others, the maximum speed for each individual agent, the minimum distance within which agents influence each other, and the initial value of one of the beliefs; see Table 1) were initialised by hand, but were then adapted using the parameter tuning approach described in the previous section. The speed factor λ of this tuning process was set to 0.1. The initial locations of the agents involved were taken equal to the locations in the real-world data. An overview of all optimal settings found for the global parameters and the initial variables involved in the model (cf. [5]) is shown in Table 1. Here, the settings shown in the first two columns were set by hand, and the settings shown in the last two columns were found after tuning. Note that all settings (except those for maximum speed) were used globally for all agents.
Table 1. Optimal parameter settings found
Global parameters (not tuned): #agents 35; max_x 600; max_y 800; Δt 0.5; µδ_belief 0.5; µη_belief 0.5; µβ_belief 0.5; ζ_belief 0.5; σ 100; ω_OI1 0.3; ω_OE2 0.3; ω_OB2 0.3; ω_OE1 0.5; ω_OB1 0.5; all q_belief(X) 0; impact of event on q_belief(X) 1; min_speed 0.01
Initial variable settings (not tuned): ε_intention 0.5; δ_intention 0.5; η_intention 0.5; β_intention 0.5; ε_belief 0.5; δ_belief 0.5; η_belief 0.5; β_belief 0.5; ε_emotion 0.5; δ_emotion 0.5; η_emotion 0.5; β_emotion 0.5
Global parameters (tuned): τ_distance 190; sight_reach 200; max_speed differs per agent
Initial variables (tuned): q_belief(nomove) 0.005
These optimal settings were compared to two other variants of the model: one baseline variant in which the agents do not move at all, and one variant in which all agents also make individual decisions, but do not influence each other (i.e., no contagion takes place). For the latter variant, in order to enable a fair comparison, parameter tuning was applied to find optimal settings as well. Fig. 3 shows for each of the three variants how the average error (over all agents) increases during the simulation. Note that the error is expressed in meters. At the first time point, the error is 0 (all agents start at their actual position), but over time the error increases very quickly in the baseline case, so that the averaged error over all steps becomes quite large (0.87). The overall error found for the tuned model without contagion is much lower (0.66, i.e., an improvement of 24%), and is even lower for the tuned model with contagion (0.54, i.e., an improvement of 38%).
Fig. 3. Development of error over the simulation for three variants of the model (curves: baseline, without contagion, with contagion; vertical axis: error in meters, 0 to 2.5; horizontal axis: 0 to 45)
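For reference, the improvement percentages are relative to the baseline: 0.66 is roughly 24% below the baseline average of 0.87, and 0.54 is roughly 38% below it; 0.54 is in turn roughly 18% below 0.66, which is the figure quoted in Section 7.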
This finding provides strong evidence for the conclusion that incorporating contagion makes the model more accurate, even when contagion is based on default settings for the parameters. Note that in the current scenario the agents' movements involve relatively small steps compared to the size of the grid; had the steps been larger, the difference in performance between the three models would be expected to be bigger as well. After the tuning process was finished, the optimal settings found for all parameters were used as input for the simulation model with contagion, to generate a simulation trace which closely resembles the real-world scenario. Using visualisation software (written in Matlab), the simulation trace has been visualised in the form of a 2D animation (see http://www.few.vu.nl/~tbosse/may4/). A screenshot of the animation is shown in Fig. 4.
Fig. 4. Screenshot of the simulation. Units displayed on the axes are in pixels, where 5.15 pixels equal 1 meter.
Here, the lines represent fences that were used to control the crowd, the large circle represents the monument on the square (see Fig. 1 for the actual situation), and the big dots represent corners of other buildings. The plus sign on the right indicates the location of the screaming man. The small dots represent the actual locations of the 35 people in the crowd that were tracked, and the stars represent the locations of the corresponding agents in the simulation. Even at the end of the simulation (see Fig. 4), the distances between the real and simulated positions are fairly small.
7 Discussion
This paper has two main contributions. First, it presents how empirical data have been extracted from available video material and witness reports of the May 4 incident in Amsterdam. Qualitative data about escape panics are rare [4]. Based on these data, it is possible to compare models for crowd behaviour with qualitative data of a real panicking event. Second, an existing agent-based model for describing group behaviour involving contagion of emotion, belief and intention, ASCRIBE [5], has been adapted to construct a model for behaviour in a crowd when a panic spiral occurs. Experiments have been performed with two variants of the model. In one variant, parameters related to contagion were set in such a way that there was no contagion at all; in this case the movement of individuals is determined only by their individual state. In the other variant, mutual influencing took place because emotions, beliefs and intentions spread to persons nearby. When comparing the simulations of both variants of the model with the optimal settings for the other parameters, the
variant with contagion had an 18% lower average error (0.54 instead of 0.66). Thus, it is shown that contagion of emotions is an essential element for modelling the behaviour of crowds in panic situations. Several models for crowd behaviour have been presented by other researchers. An influential paper is [4], in which a mathematical model for crowd behaviour in a panic situation is presented, based on physics theories and socio-psychological literature. This model is based on the principle of particle systems, in which forces and collision prevention between particles are important. This approach is often used for simulating crowd behaviour in virtual environments [11, 14]. In [3] the model of [4] is extended by adding individual characteristics to agents, such as the need for help and family membership. In both models, no individual emotion, belief and intention states play a role. In contrast, in [7] an agent has an 'emotional_status', which determines whether agents walk together (i.e., it influences group formation). The emotional status of an agent can change when two agents meet. An even further elaborated role of emotional and psychological aspects in a crowd behaviour model can be found in [8]. In this model, several psychological aspects influence the decision making of individual agents, for example motivation, stress, coping, personality and culture. In none of the models presented above is there contagion of emotional or other mental states between people. Also, no evaluation with real qualitative data has been performed. One of the most developed tools for crowd simulation, which also incorporates mental states, is ESCAPES [13]. This system, which specifically targets evacuation scenarios, has several similarities with the approach shown here. Future work will explore the possibilities of incorporating the detailed mechanisms for contagion of mental states presented here into ESCAPES. Moreover, further parameter tuning experiments are planned to study the effect of the parameters that were fixed at default values in the current experiments. The aim is to explore whether even more realistic simulations can be achieved by exploiting the details of the model for contagion of emotions, beliefs and intentions in a more differentiated form.
Acknowledgement This research has partly been conducted as part of the FP7 ICT Future Enabling Technologies program of the European Commission under grant agreement No. 231288 (SOCIONICAL).
References 1. Bechara, A., Damasio, A.: The Somatic Marker Hypothesis: A Neural Theory of Economic Decision. Games and Economic Behavior 52, 336–372 (2004) 2. Bosse, T., Memon, Z.A., Treur, J., Umair, M.: An Adaptive Human-Aware Software Agent Supporting Attention-Demanding Tasks. In: Yang, J.-J., Yokoo, M., Ito, T., Jin, Z., Scerri, P. (eds.) PRIMA 2009. LNCS, vol. 5925, pp. 292–307. Springer, Heidelberg (2009)
3. Braun, A., Musse, S.R., de Oliveira, L.P.L., Bodmann, B.E.J.: Modeling Individual Behaviors in Crowd Simulation. In: The 16th International Conference on Computer Animation and Social Agents, CASA 2003, pp. 143–147. IEEE Press, New Jersey (2003) 4. Helbing, D., Farkas, I., Vicsek, T.: Simulating Dynamical Features of Escape Panic. Nature 407(6803), 487–490 (2000) 5. Hoogendoorn, M., Treur, J., van der Wal, C.N., van Wissen, A.: Modelling the Interplay of Emotions, Beliefs and Intentions within Collective Decision Making Based on Insights from Social Neuroscience. In: Wong, K.W., Mendis, B.S.U., Bouzerdoum, A. (eds.) ICONIP 2010, Part I. LNCS (LNAI), vol. 6443, pp. 196–206. Springer, Heidelberg (2010) 6. Iacoboni, M.: Mirroring People: The New Science of How We Connect with Others. Farrar, Straus & Giroux, New York (2008) 7. Musse, S.R., Thalmann, D.: A Model of Human Crowd Behavior: Group Interrelationship and Collision Detection Analysis. Computer Animation and Simulation 97, 39–51 (1997) 8. Pelechano, N., O'Brien, K., Silverman, B., Badler, N.: Crowd Simulation Incorporating Agent Psychological Models, Roles and Communication. In: First International Workshop on Crowd Simulation, V-CROWDS 2005, Lausanne, Switzerland, pp. 21–30 (2005) 9. Pineda, J.A. (ed.): Mirror Neuron Systems: The Role of Mirroring Processes in Social Cognition. Humana Press Inc., New Jersey (2009) 10. Rizzolatti, G., Sinigaglia, C.: Mirrors in the Brain: How Our Minds Share Actions and Emotions. Oxford University Press, Oxford (2008) 11. Sakuma, T., Mukai, T., Kuriyama, S.: Psychological Model for Animating Crowded Pedestrians. Computer Animation and Virtual Worlds 16, 343–351 (2005) 12. Sorenson, H.W.: Parameter Estimation: Principles and Problems. Marcel Dekker, Inc., New York (1980) 13. Tsai, J., Fridman, N., Bowring, E., Brown, M., Epstein, S., Kaminka, G., Marsella, S., Ogden, A., Rika, I., Sheel, A., Taylor, M.E., Wang, X., Zilka, A., Tambe, M.: ESCAPES: Evacuation Simulation with Children, Authorities, Parents, Emotions, and Social Comparison. In: Tumer, K., Yolum, P., Sonenberg, L., Stone, P. (eds.) Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011), Innovative Applications Track (in press, 2011) 14. Ulicny, B., Thalmann, D.: Crowd Simulation for Interactive Virtual Environments and VR Training Systems. In: Proceedings of the Eurographics Workshop on Animation and Simulation 2001, pp. 163–170. Springer, Heidelberg (2001)