Junzo Watada, Gloria Phillips-Wren, Lakhmi C. Jain, and Robert J. Howlett (Eds.) Intelligent Decision Technologies
Smart Innovation, Systems and Technologies 10 Editors-in-Chief Prof. Robert J. Howlett KES International PO Box 2115 Shoreham-by-sea BN43 9AF UK E-mail:
[email protected] Prof. Lakhmi C. Jain School of Electrical and Information Engineering University of South Australia Adelaide, Mawson Lakes Campus South Australia SA 5095 Australia E-mail:
[email protected] Further volumes of this series can be found on our homepage: springer.com Vol. 1. Toyoaki Nishida, Lakhmi C. Jain, and Colette Faucher (Eds.) Modeling Machine Emotions for Realizing Intelligence, 2010 ISBN 978-3-642-12603-1 Vol. 2. George A. Tsihrintzis, Maria Virvou, and Lakhmi C. Jain (Eds.) Multimedia Services in Intelligent Environments – Software Development Challenges and Solutions, 2010 ISBN 978-3-642-13354-1 Vol. 3. George A. Tsihrintzis and Lakhmi C. Jain (Eds.) Multimedia Services in Intelligent Environments – Integrated Systems, 2010 ISBN 978-3-642-13395-4 Vol. 4. Gloria Phillips-Wren, Lakhmi C. Jain, Kazumi Nakamatsu, and Robert J. Howlett (Eds.) Advances in Intelligent Decision Technologies – Proceedings of the Second KES International Symposium IDT 2010, 2010 ISBN 978-3-642-14615-2 Vol. 5. Robert J. Howlett (Ed.) Innovation through Knowledge Transfer, 2010 ISBN 978-3-642-14593-3 Vol. 6. George A. Tsihrintzis, Ernesto Damiani, Maria Virvou, Robert J. Howlett, and Lakhmi C. Jain (Eds.) Intelligent Interactive Multimedia Systems and Services, 2010 ISBN 978-3-642-14618-3 Vol. 7. Robert J. Howlett, Lakhmi C. Jain, and Shaun H. Lee (Eds.) Sustainability in Energy and Buildings, 2010 ISBN 978-3-642-17386-8 Vol. 8. Ioannis Hatzilygeroudis and Jim Prentzas (Eds.) Combinations of Intelligent Methods and Applications, 2010 ISBN 978-3-642-19617-1 Vol. 9. Robert J. Howlett (Ed.) Innovation through Knowledge Transfer 2010, 2011 ISBN 978-3-642-20507-1 Vol. 10. Junzo Watada, Gloria Phillips-Wren, Lakhmi C. Jain, and Robert J. Howlett (Eds.) Intelligent Decision Technologies, 2011 ISBN 978-3-642-22193-4
Junzo Watada, Gloria Phillips-Wren, Lakhmi C. Jain, and Robert J. Howlett (Eds.)
Intelligent Decision Technologies Proceedings of the 3rd International Conference on Intelligent Decision Technologies (IDT´ 2011)
123
Prof. Junzo Watada
Prof. Lakhmi C. Jain
Waseda University Graduate School of Information, Production and Systems (IPS) 2-7 Hibikino Wakamatsuku, Fukuoka Kitakyushu 808-0135 Japan E-mail:
[email protected] University of South Australia School of Electrical and Information Engineering Mawson Lakes Campus Adelaide South Australia SA 5095 Australia E-mail:
[email protected] Prof. Gloria Phillips-Wren
Prof. Robert J. Howlett
Loyola University Maryland Sellinger School of Business and Management 4501 N Charles Street Baltimore, MD 21210 USA E-mail:
[email protected] KES International PO Box 2115 Shoreham-by-sea West Sussex BN43 9AF United Kingdom E-mail:
[email protected] ISBN 978-3-642-22193-4
e-ISBN 978-3-642-22194-1
DOI 10.1007/978-3-642-22194-1 Smart Innovation, Systems and Technologies
ISSN 2190-3018
Library of Congress Control Number: 2011930859 c 2011 Springer-Verlag Berlin Heidelberg This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Scientific Publishing Services Pvt. Ltd., Chennai, India. Printed on acid-free paper 987654321 springer.com
Preface
Intelligent Decision Technologies (IDT) seeks an interchange of research on intelligent systems and intelligent technologies which enhance or improve decision making in industry, government and academia. The focus is interdisciplinary in nature, and includes research on all aspects of intelligent decision technologies, from fundamental development to the applied system. The field of intelligent systems is expanding rapidly due, in part, to advances in Artificial Intelligence and environments that can deliver the technology when and where it is needed. One of the most successful areas for advances has been intelligent decision making and related applications. Intelligent decision systems are based upon research in intelligent agents, fuzzy logic, multi-agent systems, artificial neural networks, and genetic algorithms, among others. Applications of intelligence-assisted decision making can be found in management, international business, finance, accounting, marketing, healthcare, medical and diagnostic systems, military decisions, production and operation, networks, traffic management, crisis response, human-machine interfaces, financial and stock market monitoring and prediction, and robotics. Some areas such as virtual decision environments, social networking, 3D human-machine interfaces, cognitive interfaces, collaborative systems, intelligent web mining, e-commerce, e-learning, e-business, bioinformatics, evolvable systems, virtual humans, and designer drugs are just beginning to emerge. In this volume we publish the research of scholars from the Third KES International Symposium on Intelligent Decision Technologies (KES IDT’11), hosted and organized by the University of Piraeus, Greece, in conjunction with KES International. The book contains chapters based on papers selected from a large number of submissions for consideration for the symposium from the international community. Each paper was double-blind, peer-reviewed by at least two independent referees. The best papers were accepted based on recommendations of the reviewers and after required revisions had been undertaken by the authors. The final publication represents the current leading thought in intelligent decision technologies. We wish to express our sincere gratitude to the plenary speakers, invited session chairs, delegates from all over the world, the authors of various chapters and reviewers for their outstanding contributions. We express our sincere thanks to the University of Piraeus for their sponsorship and support of the symposium. We thank the International Programme Committee for their support and assistance. We would like to thank Peter Cushion of KES International for his help with
organizational issues. We thank the editorial team of Springer-Verlag for their support in production of this volume. We sincerely thank the Local Organizing Committee, Professors Maria Virvou and George Tsihrintzis, and students at the University of Piraeus for their invaluable assistance. We hope and believe that this volume will contribute to ideas for novel research and advancement in intelligent decision technologies for researchers, practitioners, professors and research students who are interested in knowledge-based and intelligent engineering systems. Piraeus, Greece 20–22 July 2011
Junzo Watada Gloria Phillips-Wren Lakhmi C. Jain Robert J. Howlett
Contents
Part I: Modeling and Method of Decision Making

1 A Combinational Disruption Recovery Model for Vehicle Routing Problem with Time Windows . . . 3
Xuping Wang, Junhu Ruan, Hongyan Shang, Chao Ma

2 A Decision Method for Disruption Management Problems in Intermodal Freight Transport . . . 13
Minfang Huang, Xiangpei Hu, Lihua Zhang

3 A Dominance-Based Rough Set Approach of Mathematical Programming for Inducing National Competitiveness . . . 23
Yu-Chien Ko, Gwo-Hshiung Tzeng

4 A GPU-Based Parallel Algorithm for Large Scale Linear Programming Problem . . . 37
Jianming Li, Renping Lv, Xiangpei Hu, Zhongqiang Jiang

5 A Hybrid MCDM Model on Technology Assessment to Business Strategy . . . 47
Mei-Chen Lo, Min-Hsien Yang, Chien-Tzu Tsai, Aleksey V. Pugovkin, Gwo-Hshiung Tzeng

6 A Quantitative Model for Budget Allocation for Investment in Safety Measures . . . 57
Yuji Sato

7 Adapted Queueing Algorithms for Process Chains . . . 65
Ágnes Bogárdi-Mészöly, András Rövid, Péter Földesi

8 An Improved EMD Online Learning-Based Model for Gold Market Forecasting . . . 75
Shifei Zhou, Kin Keung Lai

9 Applying Kansei Engineering to Decision Making in Fragrance Form Design . . . 85
Chun-Chun Wei, Min-Yuan Ma, Yang-Cheng Lin

10 Biomass Estimation for an Anaerobic Bioprocess Using Interval Observer . . . 95
Elena M. Bunciu
11 Building Multi-Attribute Decision Model Based on Kansei Information in Environment with Hybrid Uncertainty . . . 103
Junzo Watada, Nureize Arbaiy

12 Building on the Synergy of Machine and Human Reasoning to Tackle Data-Intensive Collaboration and Decision Making . . . 113
Nikos Karacapilidis, Stefan Rüping, Manolis Tzagarakis, Axel Poigné, Spyros Christodoulou

13 Derivations of Information Technology Strategies for Enabling the Cloud Based Banking Service by a Hybrid MADM Framework . . . 123
Chi-Yo Huang, Wei-Chang Tzeng, Gwo-Hshiung Tzeng, Ming-Cheng Yuan

14 Difficulty Estimator for Converting Natural Language into First Order Logic . . . 135
Isidoros Perikos, Foteini Grivokostopoulou, Ioannis Hatzilygeroudis, Konstantinos Kovas

15 Emergency Distribution Scheduling with Maximizing Marginal Loss-Saving Function . . . 145
Yiping Jiang, Lindu Zhao

16 Fuzzy Control of a Wastewater Treatment Process . . . 155
Alina Chiroşcă, George Dumitraşcu, Marian Barbu, Sergiu Caraman

17 Interpretation of Loss Aversion in Kano's Quality Model . . . 165
Péter Földesi, János Botzheim

18 MCDM Applications on Effective Project Management for New Wafer Fab Construction . . . 175
Mei-Chen Lo, Gwo-Hshiung Tzeng

19 Machine Failure Diagnosis Model Applied with a Fuzzy Inference Approach . . . 185
Lily Lin, Huey-Ming Lee

20 Neural Network Model Predictive Control of a Wastewater Treatment Bioprocess . . . 191
Dorin Şendrescu, Emil Petre, Dan Popescu, Monica Roman

21 Neural Networks Based Adaptive Control of a Fermentation Bioprocess for Lactic Acid Production . . . 201
Emil Petre, Dan Selişteanu, Dorin Şendrescu

22 New Evaluation Method for Imperfect Alternative Matrix . . . 213
Toshimasa Ozaki, Kanna Miwa, Akihiro Itoh, Mei-Chen Lo, Eizo Kinoshita, Gwo-Hshiung Tzeng

23 Piecewise Surface Regression Modeling in Intelligent Decision Guidance System . . . 223
Juan Luo, Alexander Brodsky

24 Premises of an Agent-Based Model Integrating Emotional Response to Risk in Decision-Making . . . 237
Ioana Florina Popovici

25 Proposal of Super Pairwise Comparison Matrix . . . 247
Takao Ohya, Eizo Kinoshita

26 Reduction of Dimension of the Upper Level Problem in a Bilevel Programming Model Part 1 . . . 255
Vyacheslav V. Kalashnikov, Stephan Dempe, Gerardo A. Pérez-Valdés, Nataliya I. Kalashnykova

27 Reduction of Dimension of the Upper Level Problem in a Bilevel Programming Model Part 2 . . . 265
Vyacheslav V. Kalashnikov, Stephan Dempe, Gerardo A. Pérez-Valdés, Nataliya I. Kalashnykova

28 Representation of Loss Aversion and Impatience Concerning Time Utility in Supply Chains . . . 273
Péter Földesi, János Botzheim, Edit Süle

29 Robotics Application within Bioengineering: Neuroprosthesis Test Bench and Model Based Neural Control for a Robotic Leg . . . 283
Dorin Popescu, Dan Selişteanu, Marian S. Poboroniuc, Danut C. Irimia

30 The Improvement Strategy of Online Shopping Service Based on SIA-NRM Approach . . . 295
Chia-Li Lin

31 The Optimization Decisions of the Decentralized Supply Chain under the Additive Demand . . . 307
Peng Ma, Haiyan Wang

32 The Relationship between Dominant AHP/CCM and ANP . . . 319
Eizo Kinoshita, Shin Sugiura

33 The Role of Kansei/Affective Engineering and Its Expected in Aging Society . . . 329
Hisao Shiizuka, Ayako Hashizume

Part II: Decision Making in Finance and Management

34 A Comprehensive Macroeconomic Model for Global Investment . . . 343
Ming-Yuan Hsieh, You-Shyang Chen, Chien-Jung Lai, Ya-Ling Wu

35 A DEMATEL Based Network Process for Deriving Factors Influencing the Acceptance of Tablet Personal Computers . . . 355
Chi-Yo Huang, Yi-Fan Lin, Gwo-Hshiung Tzeng

36 A Map Information Sharing System among Refugees in Disaster Areas, on the Basis of Ad-Hoc Networks . . . 367
Koichi Asakura, Takuya Chiba, Toyohide Watanabe

37 A Study on a Multi-period Inventory Model with Quantity Discounts Based on the Previous Order . . . 377
Sungmook Lim

38 A Study on the ECOAccountancy through Analytical Network Process Measurement . . . 389
Chaang-Yung Kung, Chien-Jung Lai, Wen-Ming Wu, You-Shyang Chen, Yu-Kuang Cheng

39 Attribute Coding for the Rough Set Theory Based Rule Simplications by Using the Particle Swarm Optimization Algorithm . . . 399
Jieh-Ren Chang, Yow-Hao Jheng, Chi-Hsiang Lo, Betty Chang

40 Building Agents by Assembly Software Components under Organizational Constraints of Multi-Agent System . . . 409
Siam Abderrahim, Maamri Ramdane

41 Determining an Efficient Parts Layout for Assembly Cell Production by Using GA and Virtual Factory System . . . 419
Hidehiko Yamamoto, Takayoshi Yamada

42 Development of a Multi-issue Negotiation System for E-Commerce . . . 429
Bala M. Balachandran, R. Gobbin, Dharmendra Sharma

43 Effect of Background Music Tempo and Playing Method on Shopping Website Browsing . . . 439
Chien-Jung Lai, Ya-Ling Wu, Ming-Yuan Hsieh, Chang-Yung Kung, Yu-Hua Lin

44 Forecasting Quarterly Profit Growth Rate Using an Integrated Classifier . . . 449
You-Shyang Chen, Ming-Yuan Hsieh, Ya-Ling Wu, Wen-Ming Wu

45 Fuzzy Preference Based Organizational Performance Measurement . . . 459
Roberta O. Parreiras, Petr Ya Ekel

46 Generating Reference Business Process Model Using Heuristic Approach Based on Activity Proximity . . . 469
Bernardo N. Yahya, Hyerim Bae

47 How to Curtail the Cost in the Supply Chain? . . . 479
Wen-Ming Wu, Chaang-Yung Kung, You-Shyang Chen, Chien-Jung Lai

48 Intelligent Decision for Dynamic Fuzzy Control Security System in Wireless Networks . . . 489
Xu Huang, Pritam Gajkumar Shah, Dharmendra Sharma

49 Investigating the Continuance Commitment of Volitional Systems from the Perspective of Psychological Attachment . . . 501
Huan-Ming Chuang, Chyuan-Yuh Lin, Chien-Ku Lin

50 Market Structure as a Network with Positively and Negatively Weighted Links . . . 511
Takeo Yoshikawa, Takashi Iino, Hiroshi Iyetomi

51 Method of Benchmarking Route Choice Based on the Input Similarity Using DEA . . . 519
Jaehun Park, Hyerim Bae, Sungmook Lim

52 Modelling Egocentric Communication and Learning for Human-Intelligent Agents Interaction . . . 529
R. Gobbin, Masoud Mohammadian, Bala M. Balachandran

53 Multiscale Community Analysis of a Production Network of Firms in Japan . . . 537
Takashi Iino, Hiroshi Iyetomi

54 Notation-Support Method in Music Composition Based on Interval-Pitch Conversion . . . 547
Masanori Kanamaru, Koichi Hanaue, Toyohide Watanabe

55 Numerical Study of Random Correlation Matrices: Finite-Size Effects . . . 557
Yuta Arai, Kouichi Okunishi, Hiroshi Iyetomi

56 Predicting of the Short Term Wind Speed by Using a Real Valued Genetic Algorithm Based Least Squared Support Vector Machine . . . 567
Chi-Yo Huang, Bo-Yu Chiang, Shih-Yu Chang, Gwo-Hshiung Tzeng, Chun-Chieh Tseng

57 Selecting English Multiple-Choice Cloze Questions Based on Difficulty-Based Features . . . 577
Tomoko Kojiri, Yuki Watanabe, Toyohide Watanabe

58 Testing Randomness by Means of RMT Formula . . . 589
Xin Yang, Ryota Itoi, Mieko Tanaka-Yamawaki

59 The Effect of Web-Based Instruction on English Writing for College Students . . . 597
Ya-Ling Wu, Wen-Ming Wu, Chaang-Yung Kung, Ming-Yuan Hsieh

60 The Moderating Role of Elaboration Likelihood on Information System Continuance . . . 605
Huan-Ming Chuang, Chien-Ku Lin, Chyuan-Yuh Lin

61 The Type of Preferences in Ranking Lists . . . 617
Piech Henryk, Grzegorz Gawinowski

62 Transaction Management for Inter-organizational Business Process . . . 629
Joonsoo Bae, Nita Solehati, Young Ki Kang

63 Trend-Extraction of Stock Prices in the American Market by Means of RMT-PCA . . . 637
Mieko Tanaka-Yamawaki, Takemasa Kido, Ryota Itoi

64 Using the Rough Set Theory to Investigate the Building Facilities for the Performing Arts from the Performer's Perspectives . . . 647
Betty Chang, Hung-Mei Pei, Jieh-Ren Chang

Part III: Data Analysis and Data Navigation

65 A Novel Collaborative Filtering Model for Personalized Recommendation . . . 661
Wang Qian

66 A RSSI-Based Localization Algorithm in Smart Space . . . 671
Liu Jian-hui, Han Chang-jun

67 An Improved Bee Algorithm-Genetic Algorithm . . . 683
Huang Ming, Ji Baohui, Liang Xu

68 Application of Bayesian Network in Failure Diagnosis of Hydro-electrical Simulation System . . . 691
Zhou Yan, Li Peng

69 Application of Evidence Fusion Theory in Water Turbine Model . . . 699
Li Hai-cheng, Qi Zhi

70 Calculating Query Likelihoods Based on Web Data Analysis . . . 707
Koya Tamura, Kenji Hatano, Hiroshi Yadohisa

71 Calculating Similarities between Tree Data Based on Structural Analysis . . . 719
Kento Ikeda, Takashi Kobayashi, Kenji Hatano, Daiji Fukagawa

72 Continuous Auditing for Health Care Decision Support Systems . . . 731
Robert D. Kent, Atif Hasan Zahid, Anne W. Snowdon

73 Design and Implementation of a Primary Health Care Services Navigational System Architecture . . . 743
Robert D. Kent, Paul D. Preney, Anne W. Snowdon, Farhan Sajjad, Gokul Bhandari, Jason McCarrell, Tom McDonald, Ziad Kobti

74 Emotion Enabled Model for Hospital Medication Administration . . . 753
Dreama Jain, Ziad Kobti, Anne W. Snowdon

75 Health Information Technology in Canada's Health Care System: Innovation and Adoption . . . 763
Anne W. Snowdon, Jeremy Shell, Kellie Leitch, O. Ont, Jennifer J. Park

76 Hierarchical Clustering for Interval-Valued Functional Data . . . 769
Nobuo Shimizu

77 Multidimensional Scaling with Hyperbox Model for Percentile Dissimilarities . . . 779
Yoshikazu Terada, Hiroshi Yadohisa

78 Predictive Data Mining Driven Architecture to Guide Car Seat Model Parameter Initialization . . . 789
Sabbir Ahmed, Ziad Kobti, Robert D. Kent

79 Symbolic Hierarchical Clustering for Visual Analogue Scale Data . . . 799
Kotoe Katayama, Rui Yamaguchi, Seiya Imoto, Hideaki Tokunaga, Yoshihiro Imazu, Keiko Matsuura, Kenji Watanabe, Satoru Miyano

Part IV: Miscellanea

80 Acquisition of User's Learning Styles Using Log Mining Analysis through Web Usage Mining Process . . . 809
Sucheta V. Kolekar, Sriram G. Sanjeevi, D.S. Bormane

81 An Agent Based Middleware for Privacy Aware Recommender Systems in IPTV Networks . . . 821
Ahmed M. Elmisery, Dmitri Botvich

82 An Intelligent Decision Support Model for Product Design . . . 833
Yang-Cheng Lin, Chun-Chun Wei

83 Compromise in Scheduling Objects Procedures Basing on Ranking Lists . . . 843
Piech Henryk, Grzegorz Gawinowski

84 Decision on the Best Retrofit Scenario to Maximize Energy Efficiency in a Building . . . 853
Ana Campos, Rui Neves-Silva

85 Developing Intelligent Agents with Distributed Computing Middleware . . . 863
Christos Sioutis, Derek Dominish

86 Diagnosis Support on Cardio-Vascular Signal Monitoring by Using Cluster Computing . . . 873
Ahmed M. Elmisery, Martín Serrano, Dmitri Botvich

87 Multiple-Instance Learning via Decision-Based Neural Networks . . . 885
Yeong-Yuh Xu, Chi-Huang Shih

88 Software Testing – Factor Contribution Analysis in a Decision Support Framework . . . 897
Deane Larkman, Ric Jentzsch, Masoud Mohammadian

89 Sustainability of the Built Environment – Development of an Intelligent Decision System to Support Management of Energy-Related Obsolescence . . . 907
T.E. Butt, K.G. Jones

Author Index . . . 921

Index . . . 925
Part I Modeling and Method of Decision Making
A Combinational Disruption Recovery Model for Vehicle Routing Problem with Time Windows Xuping Wang, Junhu Ruan, Hongyan Shang, and Chao Ma
Abstract. A method of transforming various delivery disruptions into a new customer-adding disruption is developed. The optimal starting time of delivery vehicles is analyzed and determined, which provides a new rescue strategy (the starting later policy) for the disrupted VRPTW. The paper then synthetically considers customer service time, driving paths and total delivery costs to put forward a combinational disruption recovery model for VRPTW. Finally, in computational experiments, the Nested Partitions Method is applied to verify the effectiveness of the proposed model, as well as of the strategy and the algorithm. Keywords: VRPTW, combinational disruption, disruption management, rescue strategies, nested partition method.
Xuping Wang · Junhu Ruan · Chao Ma
Institute of Systems Engineering, Dalian University of Technology, Dalian, 116024, China
e-mail: [email protected]

Hongyan Shang
Commercial College, Assumption University of Thailand, Bangkok, 10240, Thailand

1 Introduction

There are various unexpected disruption events encountered in the delivery process, such as vehicle breakdowns, cargo damage, and changes of customers' service time, delivery addresses and demands. These disruption events, which often make actual delivery operations deviate from intended plans, may bring negative effects on the delivery system. It is necessary to develop satisfactory recovery plans quickly for minimizing the negative effects of disruption events. The vehicle routing problem (VRP), initially proposed by Dantzig and Ramser (1959), is an abstraction of the vehicle scheduling problem in real-world delivery systems. A variety of solutions for VRP have been put forward (Burak et al 2009), and a few researchers took delivery disruptions into account. Li et al (2009a, 2009b) proposed a vehicle rescheduling problem (VRSP), trying to solve the vehicle breakdown disruption problem. The thought of disruption management, which aims at minimizing the deviation of actual operations from intended plans with minimum costs, provides an effective idea to deal with unpredictable events (Jeans et al 2001). Several researchers have introduced this thought into the logistics delivery field. Wang et al (2007) developed a disruption recovery model for the vehicle breakdown problem of
VRPTW and proposed two rescue strategies: the adding vehicles policy and the neighboring rescue policy. Wang et al (2009a) built a multi-objective VRPTW disruption recovery model, studied the VRP disruption recovery model with fuzzy time windows (2009b), and carried out a further study on the vehicle breakdown disruption (2010). Mu et al (2010) developed Tabu Search algorithms to solve the disrupted VRP with vehicle breakdown disruption. Ding et al (2010) considered human behaviors to construct a disruption management model for the delivery delay problem. The existing literature has produced effective solutions for the disrupted VRP with a certain disruption event, but the existing models and algorithms can handle only a certain type of uncertainty. They are difficult to apply to actual vehicle routing problems, in which various disruption events often occur successively or even simultaneously. The purpose of the paper is to develop a common disruption recovery model for VRPTW, which may handle a variety and a combination of disruption events. Meanwhile, existing models for the disrupted VRP did not consider the optimal starting time of delivery vehicles. Vehicles may arrive at delivery addresses early, but they cannot serve until customers' earliest service time. If the starting time is optimized for each vehicle, waiting costs may be reduced and a new rescue strategy may be provided for some disruption events. The paper is organized as follows. A transformation method of delivery disruption events is developed in Section 2. Section 3 determines vehicles' optimal starting time from the depot. Section 4 builds a combinational disruption recovery model for VRPTW. In Section 5, computational experiments are given to verify the effectiveness of the proposed model. Conclusions are drawn in Section 6.
2 A Transformation Method of Delivery Disruption Events

2.1 Preview of VRPTW

The original VRPTW studied in the paper is as follows. One depot has K delivery vehicles with the same limited capacity. A set of customers N should all be visited once in requested time intervals. Each vehicle should leave from and return to the central depot. The target is to determine the delivery plan with the shortest total delivery distance. Notations used in the following are defined in Table 1.

Table 1 Notations for original VRPTW

N: A set of customers, N = {1,2,…,n}
N0: A set of customers and the depot, N0 = {0} ∪ N
K: The total number of vehicles
Q: The limited capacity of each vehicle
cij: The distance between node i and j, i, j ∈ N0
tij: The travel time between node i and j, i, j ∈ N0
di: The demand of node i, d0 = 0
seri: The service time at node i
qijk: The available capability of vehicle k between node i and node j
xijk: A binary variable; xijk = 1 means vehicle k travels from node i to node j; otherwise xijk = 0
uik: A binary variable; uik = 1 means that customer i is served by vehicle k; otherwise uik = 0
Rstai: The starting service time for customer i
[stai, endi]: The time window of customer i; stai, earliest service time; endi, latest service time
M: A large positive number
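To make the notation concrete, the short sketch below encodes the entities of Table 1 as plain Python data structures. It is an illustrative reading of the notation, not code from the paper; the class and field names are our own.

```python
from dataclasses import dataclass, field
from math import hypot
from typing import List

@dataclass
class Customer:
    """One node of the delivery network (node 0 is the depot)."""
    node_id: int
    x: float
    y: float
    demand: float = 0.0            # d_i (d_0 = 0 for the depot)
    service_time: float = 0.0      # ser_i
    tw_start: float = 0.0          # sta_i, earliest service time
    tw_end: float = float("inf")   # end_i, latest service time

@dataclass
class Vehicle:
    vehicle_id: int
    capacity: float                # Q, identical for all vehicles here
    route: List[int] = field(default_factory=list)  # customer ids, depot excluded

def distance(a: Customer, b: Customer) -> float:
    """c_ij; with unit speed (as in the paper's example) it also equals t_ij."""
    return hypot(a.x - b.x, a.y - b.y)
```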
The mathematical model of the above VRPTW is:

min ∑(k=1..K) ∑(i∈N0) ∑(j∈N0, j≠i) cij xijk    (1)

s.t.
xijk ∈ {0,1},  i, j ∈ N0, k ∈ {1,…,K}    (2)
uik ∈ {0,1},  i ∈ N, k ∈ {1,…,K}    (3)
∑(k=1..K) uik = 1,  i ∈ N    (4)
∑(k=1..K) u0k = ∑(k=1..K) uk0 ≤ K    (5)
∑(l∈N0, l≠i) xlik = ∑(j∈N0, j≠i) xijk = uik,  i ∈ N0, k ∈ {1,…,K}    (6)
∑(i=1..N0) di uik − ∑(j=1..N0) q0jk = 0,  k ∈ {1,…,K}    (7)
qijk ≤ Q xijk,  i, j ∈ N0, k ∈ {1,…,K}    (8)
Rstaj ≥ Rstai + seri + tij − (1 − xijk) M,  i, j ∈ N, k ∈ {1,…,K}    (9)
stai ≤ Rstai ≤ endi,  i ∈ N    (10)
where the objective function (1) is to minimize the total delivery distance; constraint (4) ensures that each customer is served exactly once by one of the vehicles; (5) ensures that each vehicle leaves from and returns to the depot; (6) ensures that the vehicle which arrives at customer i also leaves from customer i; (7) and (8) ensure that no vehicle loads more than its limited capacity.
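The following sketch evaluates a candidate set of routes against this model: it computes objective (1) and checks the capacity and time-window conditions corresponding to constraints (4)–(10). It is an illustrative check written for this chapter, not the authors' code, and it reuses the hypothetical Customer, Vehicle and distance definitions introduced after Table 1.

```python
from typing import Dict, List, Tuple

def evaluate_plan(depot: Customer, customers: Dict[int, Customer],
                  vehicles: List[Vehicle]) -> Tuple[float, bool]:
    """Return (total distance, feasible?) for a set of routes.

    Each route is a list of customer ids; the depot (node 0) is implicit
    at both ends, as required by constraints (5) and (6).
    """
    total_distance = 0.0
    feasible = True
    served: List[int] = []
    for veh in vehicles:
        load = sum(customers[i].demand for i in veh.route)
        if load > veh.capacity:                    # capacity, cf. (7)-(8)
            feasible = False
        clock = 0.0                                # departure from the depot at time 0
        prev = depot
        for node_id in veh.route:
            cust = customers[node_id]
            total_distance += distance(prev, cust)
            clock += distance(prev, cust)          # unit speed: travel time = distance
            start = max(clock, cust.tw_start)      # wait if arriving early, cf. (10)
            if start > cust.tw_end:                # too late: time window violated
                feasible = False
            clock = start + cust.service_time      # cf. (9)
            prev = cust
            served.append(node_id)
        total_distance += distance(prev, depot)    # return to the depot
    if sorted(served) != sorted(customers.keys()): # each customer served once, cf. (4)
        feasible = False
    return total_distance, feasible
```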
2.2 Delivery Disruption Events Transformation

In order to develop a combinational disruption recovery model for VRPTW, the paper tries to transform different disruption events (involving the changes of customers' requests) into a new customer-adding disruption event.

(1) Change of time windows. Assuming that the service time of customer i is requested earlier, its original time window [stai, endi] will become [stai−△t, endi−△t], where △t is a positive number. If △t is so small that the vehicle k planned to serve customer i can squeeze out some extra time longer than △t by speeding up, the request will be ignored. If △t is large and vehicle k cannot squeeze out enough time, the request will be regarded as a disruption and customer i will be transformed into a new customer i' with time window [stai−△t, endi−△t]. Assuming that the service time of customer i is requested later, its time window [stai, endi] will be [stai+△t, endi+△t]. If △t is relatively small and vehicle k can
wait for the extra time at customer i, provided that this brings no effects on the remaining delivery tasks, the request will be ignored and no disruption is brought to the original plan. If △t is large and vehicle k cannot wait for customer i, the request will be regarded as a disruption and customer i will be transformed into a new customer i' with time window [stai+△t, endi+△t].

(2) Change of delivery addresses. If delivery addresses change, the original plan cannot deal with the changes, which will be regarded as disruptions. For example, if the delivery coordinate (Xi,Yi) of customer i is changed into (Xi',Yi'), customer i will be transformed into a new customer i' with delivery coordinate (Xi',Yi').

(3) Change of demands. Changes of customers' demands include demand reduction and demand increase. The demand reduction of a customer won't bring disruptions to the original delivery plan: vehicles can deliver according to their planned routing, so demand reduction isn't considered as a disruption in the paper. Whether the demand increase of customer i will be regarded as a disruption depends on the occurrence time t and the increase amount △d. If vehicle k which will serve customer i has left from the depot at t and no extra cargo of at least △d is loaded, the increase will be regarded as a disruption; if vehicle k has left from the depot at t and loads extra cargo of more than △d, the increase will not be regarded as a disruption. If vehicle k hasn't left from the depot at t and can load more cargo than △d, there will be no disruption to the original plan; if vehicle k hasn't left from the depot at t but cannot load more cargo than △d, the demand increase is also regarded as a disruption. After the demand increase is identified as a disruption, a new customer i' whose demand is △d will be added.

(4) Removal of requests. Customers may sometimes cancel their requests for a certain reason, but the planned delivery routing needs no changes. When passing the removed customers, delivery vehicles just go on with no service.

(5) Combinational disruption. Combinational disruption refers to the case where some of the above disruption events occur simultaneously for one customer or for several customers. For one customer i with coordinate (Xi,Yi) and demand di, if its delivery address is changed into (Xi',Yi') and extra demand △d is requested, a new customer i' can be added with coordinate (Xi',Yi') and demand △d. For several customers, suppose the time window of customer i is requested earlier, from [stai, endi] to [stai−△t, endi−△t]; the delivery address of customer j is changed, (Xj,Yj)→(Xj',Yj'); and extra demand △d is requested by customer m. The transformation of these disruptions is shown in Table 2.
Table 2 Transformation of a combinational disruption from multiple customers

Original customer | Disruption event | New customer
i | [stai, endi] → [stai−△t, endi−△t] | i': time window [stai−△t, endi−△t]
j | (Xj, Yj) → (Xj', Yj') | j': coordinate (Xj', Yj')
m | dm → dm + △d | m': demand △d

Note: After being transformed into new customers, the original customers are not considered in the new delivery plan, except for customers with a demand increase disruption.
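A sketch of how these transformation rules might be applied in code is given below. It follows the rules of Section 2.2 under simplifying assumptions (the "can the vehicle absorb the change?" tests are reduced to a slack check and a spare-capacity check); the function names are illustrative, not from the paper, and the Customer class is the hypothetical one defined in Section 2.1.

```python
from typing import Optional

def transform_time_window_change(cust: Customer, new_start: float, new_end: float,
                                 vehicle_slack: float) -> Optional[Customer]:
    """Return a new customer node if the time-window change is a real disruption.

    vehicle_slack is the extra time the assigned vehicle can absorb (by speeding
    up or waiting) without affecting its remaining tasks.
    """
    shift = abs(new_start - cust.tw_start)
    if shift <= vehicle_slack:
        return None                       # change absorbed, no disruption
    return Customer(cust.node_id, cust.x, cust.y, cust.demand,
                    cust.service_time, new_start, new_end)

def transform_demand_increase(cust: Customer, extra_demand: float,
                              vehicle_spare_capacity: float) -> Optional[Customer]:
    """Demand increase becomes a new customer carrying only the extra amount."""
    if vehicle_spare_capacity >= extra_demand:
        return None                       # the assigned vehicle can still take it
    return Customer(cust.node_id, cust.x, cust.y, extra_demand,
                    cust.service_time, cust.tw_start, cust.tw_end)
```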
3 Determination of the Optimal Starting Time for Vehicles

Most existing research on the disrupted VRP assumed that all assigned vehicles leave from the depot at time 0. Although delivery vehicles may arrive early at customers, they have to wait until the earliest service time, which results in waiting costs. In fact, the optimal starting time of each vehicle can be determined according to its delivery tasks, which may decrease total delivery costs and provide a new rescue strategy for some disruption events. Some new notations used in the following are supplemented: Nk, the set of customers served by vehicle k, k ∈ {1,…,K}; wi, the waiting time at node i, i ∈ Nk; BSTk, the optimal starting time of vehicle k. [stai, endi] is the time window of customer i; ti−1,i is the travel time between node i−1 and i. Rstai, the starting service time for customer i, equals the larger of the actual arrival time arri and the earliest service time stai, that is,

Rstai = max{arri, stai},  (i ≥ 1, i ∈ Nk)    (11)

where arri depends on the starting service time for node i−1, the service time at node i−1 and the travel time ti−1,i between node i−1 and i. Thus, the actual arrival time arri at node i is

arri = Rstai−1 + seri−1 + ti−1,i,  (i ≥ 1, i ∈ Nk)    (12)

wi, the waiting time at node i, equals Rstai − arri. The waiting time which can be saved by vehicle k is min{Rstai − arri}, which equals 0 when arri is bigger than stai for all the customers in Nk. When the actual finishing time Rstai + seri is later than the latest service time endi, the extra time at node i is 0. When Rstai + seri ≤ endi, that is, the latest service time endi is not yet due when vehicle k finishes the service for customer i, an extra time interval [Rstai + seri, endi] exists. The total extra time of vehicle k in the delivery process, TFTLk, equals

TFTLk = min{σ [endi − (Rstai + seri)]},  with σ = 0 if Rstai + seri > endi and σ = 1 if Rstai + seri ≤ endi;  i ≥ 1, i ∈ Nk    (13)

To sum up, the optimal starting time of vehicle k can be calculated by the following conditions and equations:

(1) If the earliest service time of the first customer served by vehicle k is earlier than or equal to the travel time from the depot to that customer, that is, sta1 ≤ t0,1, the optimal starting time of vehicle k equals 0, that is, BSTk = 0.

(2) If sta1 > t0,1 and min{Rstai − arri} = 0, then

BSTk = (sta1 − t0,1) + TFTLk,  k ∈ {1,…,K}    (14)

(3) If sta1 > t0,1 and min{Rstai − arri} > 0, then

BSTk = (sta1 − t0,1) + min{min{Rstai − arri}, TFTLk},  i ∈ Nk, k ∈ {1,…,K}    (15)
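A small sketch of this computation is shown below. It evaluates equations (11)–(15) for one vehicle's planned route, assuming travel times are Euclidean distances at unit speed as in the paper's example; it reuses the hypothetical Customer class and distance helper from Section 2.1 and is not the authors' implementation.

```python
from typing import List

def optimal_start_time(depot: Customer, route: List[Customer]) -> float:
    """Best starting time BST_k for one vehicle, following Eqs. (11)-(15)."""
    if not route:
        return 0.0
    # Simulate the route with a departure from the depot at time 0.
    clock, prev = 0.0, depot
    min_wait = float("inf")        # min{Rsta_i - arr_i}
    min_extra = float("inf")       # TFTL_k, Eq. (13)
    for cust in route:
        arr = clock + distance(prev, cust)            # Eq. (12)
        rsta = max(arr, cust.tw_start)                # Eq. (11)
        min_wait = min(min_wait, rsta - arr)
        extra = cust.tw_end - (rsta + cust.service_time)
        min_extra = min(min_extra, max(extra, 0.0))   # late service leaves no extra time
        clock, prev = rsta + cust.service_time, cust
    first = route[0]
    t01 = distance(depot, first)
    if first.tw_start <= t01:                         # case (1)
        return 0.0
    if min_wait == 0.0:                               # case (2), Eq. (14)
        return (first.tw_start - t01) + min_extra
    return (first.tw_start - t01) + min(min_wait, min_extra)   # case (3), Eq. (15)
```

In other words, the vehicle delays its departure by as much of the initial slack as it can recover without pushing any customer past its latest service time.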
4 Combinational Disruption Recovery Model for VRPTW

Disruption management aims at minimizing the negative effects caused by unexpected events to original plans, so the effects should be measured quantitatively before being taken as the minimization objective, which is called disruption measurement. The effects of disruption events on VRPTW mainly involve three aspects: customer service time, driving paths and delivery costs (Wang et al 2009a). In Section 2, the paper has transformed different disruption events into a new customer-adding disruption event, so the disruption measurement for the disrupted VRP, which focuses on measuring the new customer-adding disruption, is relatively simple. In the disruption recovery plan, the number of customers, delivery addresses, time windows and other parameters may change, so some notations of the original VRPTW are relabeled correspondingly: N0→N0', xijk→xijk', stai→stai', Rstai→Rstai', endi→endi', and so on. However, some notations remain unchanged, such as the number of vehicles K and the limited capacity of each vehicle Q.

(1) Disruption measurement on customer service time. The disruption on customers' service time refers to the case where the actual arrival time is earlier than the earliest service time or later than the latest service time. The service time deviation of customer i is

λ1 (stai' − arri') + λ2 (arri' − endi'),  λ1, λ2 ∈ {0,1}    (16)

where, if arri' < stai', then λ1 = 1 and λ2 = 0; if arri' > endi', then λ2 = 1 and λ1 = 0; if stai' ≤ arri' ≤ endi', then λ1 = λ2 = 0. The total service time disruption is

θ ∑(i=1..N') (λ1 (stai' − arri') + λ2 (arri' − endi')),  λ1, λ2 ∈ {0,1}    (17)

where θ is the penalty cost coefficient per unit time deviation.

(2) Disruption measurement on driving paths. The total driving paths disruption is

σ ∑(i∈N0') ∑(j∈N0') cij (xij' − xij) + μ ∑(i∈N0') ∑(j∈N0') (xij' − xij),  i, j ∈ N0', xij, xij' ∈ {0,1}    (18)

where σ is the delivery cost coefficient per unit distance; μ is the penalty cost coefficient for increasing or reducing a delivery path; xij, xij' ∈ {0,1}: if there is a delivery path between node i and node j, xij = 1, xij' = 1, otherwise xij = 0, xij' = 0.

(3) Disruption measurement on delivery costs. Delivery costs depend on the total travel distance and the number of assigned vehicles, so the total delivery costs disruption is

σ (∑(k=1..K) ∑(i∈N0') ∑(j∈N0', j≠i) cij xijk' − ∑(k=1..K) ∑(i∈N0) ∑(j∈N0, j≠i) cij xijk) + ∑(k=1..K) CK (vk' − vk)    (19)

where ∑(k=1..K) ∑(i∈N0') ∑(j∈N0', j≠i) cij xijk' represents the total delivery distance of the recovery plan; ∑(k=1..K) ∑(i∈N0) ∑(j∈N0, j≠i) cij xijk represents the total delivery distance of the original plan; CK is the fixed cost of a vehicle, and ∑(k=1..K) CK (vk' − vk) represents the change of vehicle fixed costs, where vk, vk' ∈ {0,1}: if vehicle k is assigned in the original plan or in the recovery plan, vk or vk' = 1, otherwise vk or vk' = 0.

To sum up, a combinational disruption recovery model is developed:

min θ ∑(i=1..N') (λ1 (stai' − arri') + λ2 (arri' − endi'))    (20)

min σ ∑(i∈N0') ∑(j∈N0') cij (xij' − xij) + μ ∑(i∈N0') ∑(j∈N0') (xij' − xij)    (21)

min σ (∑(k=1..K) ∑(i∈N0') ∑(j∈N0', j≠i) cij xijk' − ∑(k=1..K) ∑(i∈N0) ∑(j∈N0, j≠i) cij xijk) + ∑(k=1..K) CK (vk' − vk)    (22)

s.t.
xijk' ∈ {0,1},  i, j ∈ N0', k ∈ {1,…,K}    (23)
ujk' ∈ {0,1},  j ∈ N', k ∈ {1,…,K}    (24)
∑(k=1..K) uik' = 1 if di' = di, and ∑(k=1..K) uik' ≤ 2 if di' > di,  i ∈ N'    (25)
∑(k=1..K) u0k' = ∑(k=1..K) uk0' ≤ K    (26)
∑(l∈N0', l≠i) xlik' = ∑(j∈N0', j≠i) xijk' = uik',  i ∈ N0', k ∈ {1,…,K}    (27)
∑(i=1..N0') di' uik' − ∑(j=1..N0') q0jk' = 0,  k ∈ {1,…,K}    (28)
qijk' ≤ Q xijk',  i, j ∈ N0', k ∈ {1,…,K}    (29)
Rstaj' ≥ Kstak + Rstai' + seri' + tij' − (1 − xijk') M,  i, j ∈ N', k ∈ {1,…,K}    (30)
Kstak = BSTk,  k ∈ {1,…,K}    (31)
λ1, λ2 ∈ {0,1},  xij, xij' ∈ {0,1},  vk', vk ∈ {0,1}    (32)
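To make the three disruption measures concrete, the sketch below computes deviations (16)–(19) for a given pair of plans. It is an illustration under the same simplifying assumptions as the earlier sketches (a per-customer arrival-time dictionary, arc sets describing the two plans) and is not the authors' code; θ, σ, μ and the vehicle fixed cost are passed in as parameters, and the path measure counts the arcs that differ between the two plans, one common reading of (18).

```python
from typing import Dict, Set, Tuple

def service_time_deviation(arrival: Dict[int, float],
                           customers: Dict[int, Customer], theta: float) -> float:
    """Total service-time disruption, Eqs. (16)-(17)."""
    total = 0.0
    for i, arr in arrival.items():
        c = customers[i]
        if arr < c.tw_start:
            total += c.tw_start - arr        # lambda_1 branch: too early
        elif arr > c.tw_end:
            total += arr - c.tw_end          # lambda_2 branch: too late
    return theta * total

def path_deviation(old_arcs: Set[Tuple[int, int]], new_arcs: Set[Tuple[int, int]],
                   arc_length: Dict[Tuple[int, int], float],
                   sigma: float, mu: float) -> float:
    """Driving-path disruption, Eq. (18): changed arcs weighted by length and count."""
    changed = old_arcs ^ new_arcs            # arcs added to or dropped from the plan
    return sigma * sum(arc_length[a] for a in changed) + mu * len(changed)

def delivery_cost_deviation(old_dist: float, new_dist: float,
                            old_vehicles: int, new_vehicles: int,
                            sigma: float, fixed_cost: float) -> float:
    """Delivery-cost disruption, Eq. (19)."""
    return sigma * (new_dist - old_dist) + fixed_cost * (new_vehicles - old_vehicles)
```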
Objective (20), (21) and (22) is to minimize the disruption on customers’ service time, driving paths and delivery costs respectively. Constraint (30) and (31) ensure vehicles leave from the central depot at their optimal starting time, where Kstak is the actual starting time of vehicle k and BSTk is the optimal starting time determined in Section 3.
5 Computational Experiments

The paper applied the Nested Partitions Method (NPM) to solve the proposed model. NPM, proposed by Shi (2000), is a novel global optimization heuristic algorithm. The designed NPM algorithm for the combinational recovery model integrates three rescue strategies: (1) Adding vehicles policy: new vehicles which have no delivery tasks in the original plan are added to meet the requests of new customers. (2) Starting later policy: since vehicles do not leave from the depot until their optimal starting time, there may be some vehicles still staying at the depot when disruption events occur. (3) Neighboring rescue policy: in-transit vehicles that are close to the new customers are used to deal with the disruptions.
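The sketch below illustrates how such a rescue-strategy choice could be wired together. The ordering of the three policies and the helper names are our own assumptions, not part of the NPM algorithm described in the paper; it reuses the hypothetical Customer/Vehicle classes and distance helper from Section 2.1.

```python
from typing import Dict, List, Tuple

def choose_rescue_vehicle(new_customer: Customer,
                          customers: Dict[int, Customer],
                          idle_at_depot: List[Vehicle],
                          not_yet_started: List[Vehicle],
                          in_transit: List[Vehicle]) -> Tuple[str, Vehicle]:
    """Pick a vehicle for a new customer using the three rescue policies."""
    if not_yet_started:                           # starting later policy: these vehicles
        return "starting_later", not_yet_started[0]   # are still waiting at the depot
    if in_transit:                                # neighboring rescue policy: nearest
        def proximity(v: Vehicle) -> float:       # in-transit vehicle, measured from the
            last = customers[v.route[-1]]         # last customer it is scheduled to visit
            return distance(last, new_customer)
        return "neighboring_rescue", min(in_transit, key=proximity)
    if not idle_at_depot:
        raise ValueError("no vehicle available for rescue")
    return "adding_vehicles", idle_at_depot[0]    # fall back to a spare vehicle
```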
5.1 Original VRPTW and Combinational Disruption Data

The original VRPTW studied in the paper is from [7]: one depot owns 8 vehicles with a limited capacity of 5 t; the distance between two nodes is calculated from their coordinates; the speed of each vehicle is 1 km/h; the coefficients θ, σ and μ are set to 1, 1 and 10 respectively; the detailed original data can be seen in [7]. By using an improved genetic algorithm, [7] attained the optimal initial routing: vehicle 1: 0-8-2-11-1-4-0; vehicle 2: 0-10-5-13-0; vehicle 3: 0-9-7-6-0; vehicle 4: 0-3-14-12-15-0. The total driving distance is 585.186. After the initial scheduling, there are still four spare vehicles in the depot. At time 32.65, change requests are received from customers 4, 8, 11 and 14, and a new customer 16 occurs. The detailed change data are shown in Table 3.

Table 3 Data of changes

Customer | Original coordinates | Original time window | Original demand | New coordinates | New time window | New demand
4 | (53,19) | [96,166] | 0.6 | (53,29) | [10,54] | 1.6
8 | (56,4) | [9,79] | 0.2 | Unchanged | Unchanged | 1.4
11 | (41,10) | [74,174] | 0.9 | Unchanged | Unchanged | 1.9
14 | (73,29) | [56,156] | 1.8 | Unchanged | [20,70] | Unchanged
16 | - | - | - | (55,60) | [30,75] | 2.0
5.2 Results and Findings

According to Section 2, the disrupted nodes above can be transformed into new customer nodes: from 4, 14, 8, 11 to 16, 17, 18, 19. A new customer 20 is added. The three in-transit vehicles are transformed into virtual customers 21, 22, 23 (one vehicle had not yet left the depot when the disruption occurred). The data after transformation are shown in Table 4. The NPM algorithm with the neighboring rescue policy produced the new routing: 0-17-18-8-2-0, 0-10-5-13-0, 0-9-7-6-0, 0-3-1-19-11-16-12-0, 0-20-15-0. The NPM algorithm with the starting later policy produced the new routing: 0-8-2-11-1-0, 0-10-5-13-0, 0-9-7-6-0, 0-16-3-18-12-15-0, 0-20-17-19-0.
Table 4 Data after transformation

Customer | X (km) | Y (km) | di (t) | stai (h) | endi (h)
0 | 50 | 50 | 0 | 0 | +∞
1 | 19 | 0 | 1.0 | 74 | 144
2 | 33 | 3 | 1.8 | 58 | 128
3 | 35 | 21 | 1.1 | 15 | 85
5 | 70 | 94 | 1.9 | 47 | 177
6 | 27 | 44 | 1.4 | 85 | 155
7 | 10 | 69 | 1.2 | 21 | 91
8 | 56 | 4 | 0.2 | 9 | 79
9 | 16 | 81 | 1.7 | 37 | 107
10 | 68 | 76 | 0.8 | 21 | 121
11 | 41 | 10 | 0.9 | 74 | 174
12 | 83 | 43 | 0.8 | 58 | 158
13 | 25 | 91 | 1.9 | 15 | 125
15 | 70 | 18 | 0.9 | 87 | 187
16 | 53 | 29 | 1.6 | 10 | 54
17 | 73 | 29 | 1.8 | 20 | 70
18 | 56 | 4 | 1.2 | 9 | 79
19 | 41 | 10 | 1.0 | 74 | 174
20 | 55 | 60 | 2.0 | 30 | 75
21 | 54 | 18 | 0 | 0 | 400
22 | 57 | 61 | 0 | 0 | 400
23 | 43 | 57 | 0 | 0 | 400
Table 5 Comparison of results

Method | Total distance | Paths deviation | Total time deviation | Number of vehicles
Rescheduling | 840.76 | 16 | 210.75 | 7
Disruption Management by GA | 841.69 | 9 | 37.03 | 6
Disruption Management by NPM with neighboring rescue policy | 679.79 | 19 | 66.76 | 5
Disruption Management by NPM with starting later policy | 737.11 | 7 | 33.30 | 5
The comparison of the results with [7] is shown in Table 5. From the comparison, the paper finds:

(1) Disruption Management by GA is superior to Rescheduling in paths deviation, total time deviation and number of vehicles; Disruption Management by NPM with the neighboring rescue policy produced better results than Rescheduling in total distance, total time deviation and number of vehicles; Disruption Management by NPM with the starting later policy outdoes Rescheduling in all aspects. This verifies the advantage of disruption management in dealing with disruption events for VRPTW.

(2) In total distance and number of vehicles, Disruption Management by NPM with the neighboring rescue policy is superior to Rescheduling and to Disruption Management by GA; Disruption Management by NPM with the starting later policy also produced better results than Rescheduling and Disruption Management by GA. This gives some evidence that the transformation of disruption events and the designed NPM algorithm are effective for the combinational disruption recovery model.

(3) Disruption Management by NPM with the neighboring rescue policy is better than Disruption Management by NPM with the starting later policy in total distance but worse in paths deviation and total time deviation, which shows that considering the optimal starting time of vehicles can provide a new rescue strategy for the disrupted VRP.
6 Conclusions

Disruption management provides a good idea to minimize the negative effects of disruption events on the whole delivery system. Facing the reality that a variety of delivery disruptions often occur successively or simultaneously, the paper proposed a method of transforming various disruption events into a new customer-adding disruption, which facilitates developing a combinational VRPTW disruption recovery model. The paper also considered vehicles' optimal starting time from the central depot, which not only reduces the waiting costs of vehicles in transit but also provides a new rescue strategy for the disrupted VRP. The paper focused on customer disruption events, but did not give enough consideration to vehicle disruption events and cargo disruption events, which need further efforts.

Acknowledgments. This work is supported by the National Natural Science Foundation of China (No. 90924006, 70890080, 70890083) and the National Natural Science Funds for Distinguished Young Scholar (No. 70725004).
References 1. Burak, E., Arif, V.V., Arnold, R.: The vehicle routing problem: A taxonomic review. Computers & Industrial Engineering 57(4), 1472–1483 (2009) 2. Dantzig, G.B., Ramser, J.H.: The truck dispatching problem. Management Science 6(1), 80–91 (1959) 3. Ding, Q.-l., Hu, X.-p., Wang, Y.-z.: A model of disruption management for solving delivery delay. In: Advances in Intelligent Decision Technologies: Proceedings of the Second KES International Symposium IDT, Baltimore, USA, July 28-30. SIST, vol. 4, pp. 227–237 (2010) 4. Jeans, C., Jesper, L., Allan, L., Jesper, H.: Disruption management operation research between planning and execution. OR/MS 28(5), 40–51 (2001) 5. Li, J.-q., Pitu, B.M., Denis, B.: A Lagrangian heuristic for the real-time vehicle rescheduling problem. Transportation Research Part E: Logistics and Transportation Review 45(3), 419–433 (2009) 6. Li, J.-q., Pitu, B.M., Denis, B.: Real-time vehicle rerouting problems with time windows. European Journal of Operational Research 194(3), 711–727 (2009) 7. Mu, Q., Fu, Z., Lysgaard, J., Eglese, R.: Disruption management of the vehicle routing problem with vehicle breakdown. Journal of the Operational Research Society, 1–8 (2010) 8. Shi, L.-y.: Nested Partition Method for Global Optimization. Operation Research 48(3), 390–407 (2000) 9. Wang, X.-p., Niu, J., Hu, X.-p., Xu, C.-l.: Rescue Strategies of the VRPTW Disruption. Systems Engineering-Theory & Practice 27(12), 104–111 (2007) 10. Wang, X.-p., Xu, C.-l., Yang, D.-l.: Disruption management for vehicle routing problem with the request changes of customers. International Journal of Innovative Computing, Information and Control 5(8), 2427–2438 (2009a) 11. Wang, X.-p., Zhang, K., Yang, D.-l.: Disruption management of urgency vehicle routing problem with fuzzy time window. ICIC Express Letters 3(4), 883–890 (2009b) 12. Wang, X.-p., Wu, X., Hu, X.-p.: A Study of Urgency Vehicle Routing Disruption Management Problem. Journal of networks 5(12), 1426–1433 (2010)
A Decision Method for Disruption Management Problems in Intermodal Freight Transport Minfang Huang, Xiangpei Hu, and Lihua Zhang
Abstract. In this paper, we propose a new decision method for dealing with disruption events in intermodal freight transport. First of all, the forecasting decision for the duration of disruption events is presented, which decides whether a rearrangement is needed. Secondly, a network-based optimization model for intermodal freight transport disruption management is built. Then an improved depth-first search strategy is developed, which is beneficial to automatically generating the routes and achieving the recovery strategies quickly. Finally, a numerical example is applied to verify the decision method. The new decision method supports the real-time decision making for disruption management problems. Keywords: Decision method, Disruption management, Intermodal freight transport.
Minfang Huang
School of Economics and Management, North China Electric Power University, No. 2 Beinong Rd., Huilongguan, Beijing, 102206, China
e-mail: [email protected]

Xiangpei Hu · Lihua Zhang
Dalian University of Technology, No. 2 Linggong Rd., Ganjingzi Dist., Dalian, Liaoning, 116023, China
e-mail: [email protected], [email protected]

1 Introduction

Many power facilities are delivered by multiple modes of transport. Uncertainties and randomness always exist in freight transportation systems, especially in intermodal freight transportation. Intermodal freight transportation is the term used to describe the movement of goods in one and the same loading unit or vehicle which uses successive, various modes of transport (road, rail, air and water) without any handling of the goods themselves during transfers between modes (European Conference of Ministers of Transport, 1993) [14]. It is a multimodal chain of transportation services. This chain usually links the initial shipper to the final consignee of the container and takes place over long distances. The whole
transportation is often provided and finished by several carriers. Almost all types of freight carriers and terminal operator may, thus, be involved in intermodal transportation, either by providing service for part of the transportation chain or by operating an intermodal transportation system (network) [5]. Therefore, the satisfied flow continuity and transit nodes compatibility of the multimodal chain of transportation services is significant while making modal choice decision once multiple transport modes, multiple decision makers and multiple types of load units are included. Unexpected events (e.g. Hurricane, the snow disaster, traffic accidents) happening in one link of the multimodal chain could result in the disturbance of pre-decided transportation activities. A new strategy used to handle disruptions is disruption management. Its objective is the smallest disturbances the entire transportation system encounters with the new adjustment scheme, rather than the lowest cost. How to real-timely deal with the disruption events and achieve the coping strategies quickly and automatically is an important problem. It is necessary to present a new solution approach to improve the rationality and efficiency of disruption management in intermodal freight transportation. The remainder of this paper is organized as follows: Section 2 briefly reviews the related solution approaches and applications. In Section 3, the forecasting decision method for the duration of disruption events is presented, and an optimization algorithm for intermodal freight transport disruption management is constructed. A numerical example is given in Section 4. Finally, concluding remarks and future research directions are summarized in Section 5.
2 A Brief Review of Related Literature The research on the planning issues in intermodal freight transport has begun since the 1990s. Macharis and Bontekoning [15] conducted a comprehensive review on OR problems and applications of drayage operator, terminal operators, network operators and intermodal operators. Related decision problems for intermodal freight concern some combinations of rail, road, air and water transport. Following this approach, Caris et al. [3] provide an update on the review in Macharis and Bontekoning, with a stronger orientation towards the planning decisions in intermodal freight transport and solution methods. In the current results on intermodal transportation system, we find most of them are related with planning transportation activities. We divide them into 4 categories from the perspectives of intermodal carrier selection, transportation mode selection, transportation routes, and terminal location. For the aspect of intermodal carrier selection, Liu et al. [12] establish an improved intermodal network and formulate a multiobjective model with the consideration of 5 important characteristics, multiple objective, in-time transportation, combined cost, transportation risks and collaboration efficiency. Ma [16] proposes a method for the optimization of the carrier selection in network environment by inviting and submitting a tender based on multi-agent. With respect to transportation mode selection, Liu and Yu [11] use the
graph-theory technique to select the best combination of transportation modes for shipment with the consideration of 4 characteristics, multiple objective, in-time transportation, combined cost and transportation risks. Shinghal and Fowkes [19] presents empirical results of determinants of mode choice for freight services in India which shows that frequency of service is an important attribute determining mode choice. For transportation route selection, Huang and Wang [7] analyze the evaluation indicators (transportation cost, transportation time, transportation risk, service quality, facility level) of intermodal transportation routes, establish the set of alternatives, incorporate the perspectives of quantitative and qualitative analysis to compare and select the route alternatives. Chang [4] formulate an international intermodal routing problem as a multi-objective multimodal multi-commodity flow problem with time windows and concave costs. Yang et al. [23] present a goal programming model for intermodal network optimization to examine the competitiveness of 36 alternative routings for freight moving from China to and beyond Indian Ocean. For the aspect of terminal location, in the area of hub location problems, Campbell et al. [2] review various researches on new formulations and better solution methods to solve larger problems. Arnold et al. [1] investigate the problem of optimally locating rail/road terminals for freight transport. A linear 0-1 program is formulated and solved by a heuristic approach. Sirikijpanichkul et al. [20] develop an integral model for the evaluation of road-rail intermodal freight hub location decisions, which comprises four dominant agents, hub owners or operators, transport network infrastructure providers, hub users, and communities. Meng and Wang [17] develop a mathematical program with equilibrium constraints model for the intermodal hub-and-spoke network design problem with multiple stakeholders and multitype containers. The existing results on intermodal transportation system are related with planning transportation activities. They just put emphasis on planning in advance. However, they lack the research on disruptions possibly occurred in each transport mode and can not achieve an operational scheme with overall smallest disturbances quickly. It is more important to ensure flow continuity and transit nodes compatibility. We have seen a number of results in disruption management in urban distribution system where only one mode of transport is used. Most of them are focused on the study of the disruption caused by customer requests or by dynamic travel time. The work of Potvin et al. [18] describes a dynamic vehicle routing and scheduling problem with time windows where both real-time customer requests and dynamic travel times are considered. In terms of the disruption caused by dynamic travel time, Huisman et al. [8] present a solution approach consisting of solving a sequence of optimization problems. Taniguchi et al. [21] use dynamic traffic simulation to solve a vehicle routing and scheduling model that incorporates real time information and variable travel times. Du et al. [6] design a solution process composed of initial-routes formation, inter-routes improvement, and intra-route improvement. Besides the above, there are some other results. The study of Zeimpekis et al. [24] present the architecture of fleet management system. Li et al. [10]
develop a prototype decision support system (DSS). Wang et al. [22] propose a transformation method for the recovery of vehicle routing. The studies above investigate distribution systems with only one transport mode; in urban areas, for example, only road transport is used to accomplish the delivery. The disruption management of urban distribution systems therefore puts the emphasis on satisfying the customers, whereas in intermodal freight transport it focuses on the choice of transportation modes and carriers. It is therefore necessary to provide a solution approach, with both qualitative and quantitative processing abilities, for disruption management problems in intermodal freight transport systems.
3 A Decision Method for Intermodal Freight Disruption Problems and a Solution Algorithm
3.1 The Forecasting Decision for the Duration of Disruption Events
The disruption management problem in an intermodal freight transport system can be described as follows: after the cargo leaves its origin according to the plan, unexpected events (e.g., hurricanes, snow disasters, traffic accidents) happen on one link of the multimodal chain, which may interrupt one transport mode. The pre-decided transportation activities may then need to be rearranged. If necessary, a rescue scheme with the smallest deviation, measured by the cost and time of the routes, should be obtained in an efficient way. The duration of a disruption event, which is also the delay of the current transport activity, is used to decide whether a new arrangement should be made. If the delay is within the next carrier's tolerance, no rearrangement is needed; otherwise, rerouting with the smallest deviation should be carried out. It is assumed that the effect of the delay on the carrier after the next one has already been accounted for in the next carrier's tolerance. The duration of disruption events is estimated from historical statistical data on typical disruption events in the specific transport mode. We take accidents on freeways as an example to illustrate the study of disruption event duration. According to their causes, accidents on freeways can be divided into seven types [13]: vehicle crash, bumping into objects, vehicle breakdown, injury-causing accident, vehicle burning accident, death-causing accident, and dangerous-goods accident. The distribution of each kind of disruption event's duration can be obtained by statistical analysis of the historical data. The forecasting decision tree for the duration of a disruption event (vehicle crash) is shown in Figure 1. We use three parameters to characterize an event's duration: the average duration (A_i) and the lower and upper quartiles (Q_min_i, Q_max_i), which serve as the lower and upper bounds of a confidence interval. They can be calculated from the historical data.
Fig. 1 The forecasting decision tree for the duration of a disruption event (root: vehicle crash, average duration A_1 (Q_min1, Q_max1); branch "injuring or death causing?" - yes: A_2 (Q_min2, Q_max2); no: A_3 (Q_min3, Q_max3), followed by "any truck involved?" - yes: A_4 (Q_min4, Q_max4); no: A_5 (Q_min5, Q_max5))
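To make the use of Fig. 1 concrete, the following minimal Python sketch (not the authors' implementation) looks up a forecast duration for a vehicle-crash event and checks it against the next carrier's tolerance; the numeric durations are placeholders standing in for the statistics A_i and (Q_min_i, Q_max_i).

# Illustrative duration forecast based on the decision tree of Fig. 1.
# The values below are placeholders; in practice they come from historical
# freeway-incident statistics.
DURATION_TREE = {
    # (injuring_or_death, truck_involved) -> (average, (q_min, q_max)) in hours
    (True,  None):  (3.5, (2.0, 5.0)),   # A2: injury/death-causing crash
    (False, True):  (2.0, (1.2, 3.0)),   # A4: no injury, truck involved
    (False, False): (1.0, (0.5, 1.5)),   # A5: no injury, no truck
}

def forecast_duration(injuring_or_death, truck_involved=None):
    """Return (average duration, quartile interval) for a vehicle-crash event."""
    if injuring_or_death:
        return DURATION_TREE[(True, None)]
    return DURATION_TREE[(False, bool(truck_involved))]

def needs_rerouting(forecast_delay, next_carrier_tolerance):
    """Rearrange only if the forecast delay exceeds the next carrier's tolerance."""
    return forecast_delay > next_carrier_tolerance

avg, (q_min, q_max) = forecast_duration(injuring_or_death=False, truck_involved=True)
print(avg, (q_min, q_max), needs_rerouting(avg, next_carrier_tolerance=1.5))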
3.2 An Optimization Algorithm for Intermodal Freight Transport Disruption Management
Based on the duration (delay) forecast in Sect. 3.1, if the delay is outside the range of the next carrier's tolerance, rerouting with the smallest deviation should be carried out. The deviation of the s-th route, denoted by D_s, is calculated as follows.
D_s = \alpha_1 \frac{c_s - c_0}{c_0} + \alpha_2 \frac{t_s - t_0}{t_0}    (1)
The variables and parameters in Eq. (1) are explained below. The coefficients \alpha_1 and \alpha_2 express the decision maker's preferences; they are nonnegative and satisfy \alpha_1 + \alpha_2 = 1. c_s denotes the total cost of the s-th route, t_s its total transport time, c_0 the total cost of the initial plan, and t_0 the total transport time of the initial plan. Therefore, Eq. (1) is equivalent (up to an additive constant) to Eq. (2).
D_s' = \alpha_1 \frac{c_s}{c_0} + \alpha_2 \frac{t_s}{t_0}    (2)
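A short Python sketch of Eqs. (1)-(2) may help; the candidate route data below are hypothetical, while c_0 and t_0 are taken from the numerical example of Sect. 4.

# Deviation of a candidate route s from the initial plan (Eqs. (1)-(2)).
def deviation(c_s, t_s, c0, t0, alpha1=0.5, alpha2=0.5):
    """Eq. (1): relative deviation of route s."""
    return alpha1 * (c_s - c0) / c0 + alpha2 * (t_s - t0) / t0

def deviation_equiv(c_s, t_s, c0, t0, alpha1=0.5, alpha2=0.5):
    """Eq. (2): equivalent criterion D'_s; it ranks routes identically to Eq. (1)."""
    return alpha1 * c_s / c0 + alpha2 * t_s / t0

# Example with hypothetical candidate routes (cost, time):
candidates = {"route_1": (170, 68.0), "route_2": (165, 72.5)}
c0, t0 = 163, 64.5                       # initial plan (Sect. 4)
best = min(candidates, key=lambda s: deviation_equiv(*candidates[s], c0, t0))
print(best)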
Fig. 2 shows a network of intermodal freight transport. The nodes represent the cities (A, B, …, H) through which the cargo needs to pass. In each city there are several carriers providing different modes of transport. For example, from the origin node there are k modes of transport (A1, A2, …, Ak) available for the cargo to reach City A.
Fig. 2 The virtual network of disruption event
The decision for the routing problem of intermodal freight transport can be turned into path searching through the graph shown in Fig. 2. Considering the characteristics of search strategies, we adopt an improved depth-first search strategy to generate the routing schemes.
3.3 An Improved Depth-First Search Strategy
From Fig. 2, we observe that the network of intermodal freight transport corresponds to a state space for decision-making, in which each transshipment location is a search node. An initial route is a path through the state space from the initial node (origin) to the goal node (destination), and an operational disruption recovery scheme with its deviation is a path through the state space from the disrupted location to the goal node. Considering the characteristics of search strategies and of the problem, we apply the principle of depth-first search and improve it to generate the routing schemes. The improved depth-first search algorithm involves three elements: state sets, operators, and the goal state. The details of the state-space search based on the improved depth-first search algorithm for disruption management problems in intermodal freight systems are given below. • State sets: described by three elements, namely P_i (the cargo's current location, where the disruption happens), i = A, B, C, …; c_sij (the accumulative cost of the s-th route when the cargo arrives at the i-th city by the j-th transport mode); and t_sij (the accumulative transport time of the s-th route when the cargo arrives at the i-th city by the j-th transport mode). • Operators: the cargo moves from the current location (a search node) to the next location (a search node). • Goal state: as the nodes are searched, the goal state (destination) is eventually reached. At this state, c_sij is the total cost of the s-th route (c_s) and t_sij is its total transport time (t_s).
After the search process is finished, the feasible recovery schemes have been generated, and D_s' can be calculated for each of them by Eq. (2); an illustrative sketch of such an enumeration is given below.
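A minimal sketch of the depth-first enumeration, assuming the network is given as an adjacency dictionary and that transshipment penalties follow Table 2, is the following (illustrative only, not the authors' code):

# Depth-first enumeration of routes with accumulated cost/time, i.e. the search
# state (P_i, c_sij, t_sij) described above.  `network` maps a node to outgoing
# arcs (next_node, mode, cost, time); `transship` maps ordered mode pairs to
# (cost, time) transshipment penalties.
def dfs_routes(network, transship, node, goal, mode=None, cost=0.0, time=0.0,
               path=(), routes=None):
    """Enumerate all feasible routes from `node` to `goal`."""
    if routes is None:
        routes = []
    path = path + (node,)
    if node == goal:
        routes.append((path, cost, time))
        return routes
    for nxt, nxt_mode, c, t in network.get(node, []):
        tc, tt = transship.get((mode, nxt_mode), (0.0, 0.0)) if mode else (0.0, 0.0)
        dfs_routes(network, transship, nxt, goal, nxt_mode,
                   cost + c + tc, time + t + tt, path, routes)
    return routes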
4 A Numerical Example
A numerical example is constructed to illustrate and verify the above decision method. Suppose 10 tons of cargo are to be transported from City A to City D, passing through City B and City C. The example intermodal transportation network is described in Fig. 3, which shows the available transport modes between each pair of adjacent cities; its corresponding operational intermodal transportation network is presented in Fig. 4. Transport cost per unit (1000 RMB/ton) and required transport time (h) are listed in Table 1. Transshipment cost (1000 RMB/ton) and time (h) between different transport modes are listed in Table 2.
Fig. 3 An example intermodal transportation network
Fig. 4 The operational intermodal transportation network

Table 1 Transport cost per unit / required transport time
        A-B      B-C      C-D
Rail    9/20     --       6/18
Road    6/15     10/18    7/16
Sea     --       3/28     4/30
Table 2 Transshipment cost/time between different transport modes
        Rail      Road      Sea
Rail    0/0       0.1/0.5   0.2/0.5
Road    0.1/0.5   0/0       0.1/1
Sea     0.2/0.5   0.1/1     0/0
We assume the decision maker weights cost and transport time equally, that is, \alpha_1 = 0.5 and \alpha_2 = 0.5. Using the improved depth-first search strategy of Sect. 3.3, the initial feasible routes are generated; the optimal one is A-B3-C1-D2, with a total cost of 163 and a total transport time of 64.5. Suppose a disruption event occurs on link A-B3 and causes a delay of 8 hours; the sea mode on link B3-C1 in Fig. 4 is then no longer available for the cargo. The recovery schemes are regenerated by the improved depth-first search strategy, and the scheme with the smallest deviation is chosen to deliver the cargo. Here the optimal recovery scheme is B3-C2-D2, with a deviation of 0.98.
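For illustration, the route enumeration can be reproduced approximately from Tables 1 and 2 as below. Because the arc labels of Fig. 4 (B1-B3, C1, C2, D1, D2) are only partially recoverable from the text, this sketch enumerates mode sequences per leg instead of node labels, so its totals are not guaranteed to match the figures 163, 64.5 and 0.98 quoted above.

# Enumerate A->B->C->D mode sequences using the per-ton values of Tables 1-2,
# assuming every listed mode is available on each leg ("--" entries omitted).
from itertools import product

LEG_COST_TIME = {                      # Table 1: cost (1000 RMB/ton) / time (h)
    "A-B": {"rail": (9, 20), "road": (6, 15)},
    "B-C": {"road": (10, 18), "sea": (3, 28)},
    "C-D": {"rail": (6, 18), "road": (7, 16), "sea": (4, 30)},
}
TRANSSHIP = {                          # Table 2 (assumed symmetric): cost / time
    ("rail", "road"): (0.1, 0.5), ("rail", "sea"): (0.2, 0.5),
    ("road", "sea"): (0.1, 1.0),
}

def transship(m1, m2):
    if m1 == m2:
        return (0.0, 0.0)
    return TRANSSHIP.get((m1, m2)) or TRANSSHIP[(m2, m1)]

def route_cost_time(modes, tons=10):
    legs = ["A-B", "B-C", "C-D"]
    cost = time = 0.0
    for i, (leg, m) in enumerate(zip(legs, modes)):
        c, t = LEG_COST_TIME[leg][m]
        cost, time = cost + c * tons, time + t
        if i > 0:                       # transshipment at the intermediate city
            tc, tt = transship(modes[i - 1], m)
            cost, time = cost + tc * tons, time + tt
    return cost, time

for modes in product(*(LEG_COST_TIME[l] for l in ["A-B", "B-C", "C-D"])):
    print(modes, route_cost_time(modes))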
5 Concluding Remarks
We present a new decision method for disruption problems in intermodal freight transport, which comprises the forecasting decision for the duration of disruption events, a network-based optimization model, and an improved depth-first search strategy. The forecast of the duration of a disruption event helps to decide whether a rearrangement is needed. The improved depth-first search strategy automatically generates the initial routes and the recovery routes. The method can cope with the rapid decision-making required in disruption management problems, and it provides a new solution idea for other disruption management problems. Some specific work remains to be done; for example, a large-scale case-study network should be used to further verify the decision method.
Acknowledgments. This work is partially supported by the Specialized Research Fund for the Doctoral Program of Higher Education of China (No. 20100036120010), the Fundamental Research Funds for the Central Universities in China (No. 09QR56), and by a grant from the National Natural Science Funds for Distinguished Young Scholars (No. 70725004). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.
References [1] Arnold, P., Peeters, D., Thomas, I.: Modelling a rail/road intermodal transportation system. Transportation Research Part E 40(3), 255–270 (2004) [2] Campbell, J., Ernst, A., Krishnamoorthy, M.: Hub location problems. In: Drezner, Z., Hamacher, H. (eds.) Facility location: Applications and theory. Springer, Heidelberg (2002) [3] Caris, A., Macharis, C., Janssens, G.K.: Planning problems in intermodal freight transport: accomplishments and prospects. Transportation Planning and Technology 31(3), 277–302 (2008) [4] Chang, T.S.: Best routes selection in international intermodal networks. Computers & operations research 35(9), 2877–2891 (2008)
[5] Crainic, T.G., Kim, K.H.: Intermodal transportation. In: Laporte, G., Barnhart, C. (eds.) Handbooks in Operations Research & Management Science: Transportation. Elsevier, Amsterdam (2007) [6] Du, T.C., Li, E.Y., Chou, D.: Dynamic vehicle routing for online B2C delivery. Omega 33(1), 33–45 (2005) [7] Huang, L.F., Wang, L.L.: An analysis of selecting the intermodal transportation routes. Logistics Engineering and Management 32(187), 4–6 (2010) [8] Huisman, D., Freling, R., Wagelmans, A.P.M.: A robust solution approach to the dynamic vehicle scheduling problem. Transportation Science 38(4), 447–458 (2004) [9] Li, J.Q., Borenstein, D., Mirchandani, P.B.: A decision support system for the singledepot vehicle rescheduling problem. Computers & Operations Research 34(4), 1008– 1032 (2007) [10] Li, J.Q., Mirchandani, P.B., Borenstein, D.: A Lagrangian heuristic for the real-time vehicle rescheduling problem. Transportation Research Part E 45(3), 419–433 (2009) [11] Liu, J., Yu, J.N.: Optimization model and algorithm on transportation mode selection in intermodal networks. Journal of Lanzhou Jiaotong University 29(1), 56–61 (2010) [12] Liu, J., Yu, J.N., Dong, P.: Optimization model and algorithm for intermodal carrier selection in various sections. Operations Research and Management Science 19(5), 160–166 (2010) [13] Liu, W.M., Guan, L.P., Yin, X.Y.: Prediction of freeway incident duration based on decision tree. China Journal of Highway and Transport 18(1), 99–103 (2005) [14] Macharis, C., Bontekoning, Y.M.: Opportunities for OR in intermodal freight transport research: A review. European Journal of Operational Research 153(2), 400–416 (2004) [15] Macharis, C., Bontekoning, Y.M.: Opportunities for OR in intermodal freight transport research: a review. European Journal of Operational Research 153(2), 400–416 (2004) [16] Ma, C.W.: Carrier selection in various sections of multi-modal transport based on multi-Agent. Journal of Harbin Institute of Technology 39(12), 1989–1992 (2007) [17] Meng, Q., Wang, X.C.: Intermodal hub-and-spoke network design: Incorporating multiple stakeholders and multi-type containers. Transportation Research Part B (2010), doi:10.1016/j.trb.2010.11.002 [18] Potivn, J.Y., Ying, X., Benyahia, I.: Vehicle routing and scheduling with dynamic travel times. Computers & Operations Research 33(4), 1129–1137 (2006) [19] Shinghal, N., Fowkes, T.: Freight model choice and adaptive stated preferences. Transportation Research Part E 38(5), 367–378 (2002) [20] Sirikijpanichkul, A., Van Dam, H., Ferreira, L., Lukszo, Z.: Optimizing the Location of Intermodal Freight Hubs: An overview of the agent based modelling approach. Journal of Transportation Systems Engineering and Information Technology 7(4), 71–81 (2007) [21] Taniguchi, E., Shimamoto, H.: Intelligent transportation system based dynamic vehicle routing and scheduling with variable travel times. Transportation Research Part C 12(3-4), 235–250 (2004) [22] Wang, X.P., Xu, C.L., Yang, D.L.: Disruption management for vehicle routing problem with the request changes of customers. International Journal of Innovative, Information and Control 5(8), 2427–2438 (2009) [23] Yang, X.J., Low, J.M.W., Tang, L.C.: Analysis of intermodal freight from China to Indian Ocean: A goal programming approach. Journal of Transport Geography (2010), doi:10.1016/j.jtrangeo.2010.05.007 [24] Zeimpekis, V., Giaglis, G.M., Minis, I.: A dynamic real-time fleet management system for incident handling in city logistics. 
In: 2005 IEEE 61st Vehicular Technology Conference, pp. 2900–2904 (2005)
A Dominance-Based Rough Set Approach of Mathematical Programming for Inducing National Competitiveness Yu-Chien Ko and Gwo-Hshiung Tzeng
Abstract. The dominance-based rough set approach (DRSA) is a powerful technology for approximating ranking classes. Analysis of large real-life data sets shows, however, that decision rules induced from lower approximations are weak, i.e., supported by only a few entities. To enhance DRSA, mathematical programming is applied so that the lower approximations are supported by as many entities as possible. The mathematical coding of the unions of decision classes, dominance sets, rough approximations, and quality of approximation is implemented in Lingo 12. The approach is applied to the 2010 World Competitiveness Yearbook of the International Institute for Management Development (WCY-IMD). The results show that business finance and attitudes & values matter for achieving a top-10 position in world competitiveness. Keywords: dominance-based rough set approach (DRSA), mathematical programming (MP), national competitiveness.
Yu-Chien Ko, Department of Information Management, Chung Hua University, 707, Sec. 2 Wufu Road, Hsinchu City 30012, Taiwan; e-mail: [email protected]
Gwo-Hshiung Tzeng, Graduate Institute of Project Management, Kainan University, No. 1 Kainan Road, Luchu, Taoyuan County 338, Taiwan; e-mail: [email protected]

1 Introduction
DRSA is a powerful technology for processing relational structure, and it has been successfully applied in many fields [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]. However, it has rarely been used in the analysis of national competitiveness, owing to skewed dimensions and unique characteristics among nations. To date, the annual reports, the World Competitiveness Yearbook (WCY) and the Global Competitiveness Report (GCR), publish competitiveness ranks with statistical descriptions rather than a relational structure; thus the competitiveness structure still cannot be inferred and elaborated for policy makers and national leaders. This research adopts mathematical programming to design
and develop unions of decision classes, dominance sets, rough approximations, and the quality of approximation [11, 12, 13]. Finally, induction rules based on WCY 2010 are generated to help stakeholders with policy making and verification. The WCY collects figure data and expert opinions in 4 consolidated factors, i.e., Economic Performance, Government Efficiency, Business Efficiency, and Infrastructure, whose details are divided into 20 sub-factors. In total, more than 300 criteria are collected within the 20 sub-factors. The report provides the weaknesses, strengths, and trends of nations from the viewpoint of an individual nation rather than across nations [14, 15]. This research partitions nations into two parts, i.e., those ranked at least 10th and those ranked at most 11th, as shown in Figure 1, and then induces rules. It discovers how the top nations outperform the others.
Fig. 1 Competitiveness model (the WCY dataset is split into nations ranked at least 10th and nations ranked at most 11th, and decision rules are induced for each part)
The remainder of this paper is organized as follows: Section 2 reviews the rough set and DRSA; Section 3 states the propositions of mathematical programming for DRSA; Section 4 applies the proposed DRSA to national competitiveness; Section 5 discusses the competitiveness criteria and future work; and the last section concludes the paper with remarks.
2 Review on DRSA Measures or evaluations of competitiveness crossing over nations have not been deeply explored in the fields of dominance-based rough set. This section starts from the concept of rough set then extends to the dominance-based rough set. A. Rough Set The rough set can discover important facts hidden in data, and has the capacity to express decision rules in the form of logic statements “if conditions, then decision”. The conditions and decision in the statement specify their equivalence relations based on respective criteria. The degree of satisfying the rule is measured with entities contained in the relations such as coverage rate, certainty, and strength [16, 17, 18, 19]. Generally, it performs well at classification in many
application fields but cannot handle preference inconsistency between conditions and decision for choice, ranking, and sorting. Therefore, the rough set is extended by applying the dominance principle, yielding the preference-ordered rough set [3, 4]. The extension substitutes indiscernibility relations by dominance-based relations. The approximation based on dominance relations involves the preference function, dominating and dominated entities, rough approximations, and unions of decision classes [1-10], as described below.

B. Information System of DRSA
A data table with preference information IS = (U, Q, V, f) is presented with U = {x_1, x_2, ..., x_n}, Q = C ∪ D, f : X × Q → V, and V = {V_1, V_2, ..., V_q}, where n is the number of entities, C is the set of condition criteria, D is the set of decision criteria, X represents a subset of U on whose criteria the decision makers are willing to state their preferences, and f represents a total function such that f(x, q) ∈ V_q for all q ∈ Q. The information function f(x, q) can be regarded as a preference function when its domain is a scale of preference for criterion q (Greco et al., 2000). Thus the pairwise comparison of x, y ∈ U, f(x, q) ≥ f(y, q) ⇔ x ≽_q y ⇔ f_q(x) ≥ f_q(y), means 'x is at least as good as y in preference strength with respect to criterion q'. The outranking relations make sense not only in the data structure but also as mathematical functions. The rough approximation related to this mathematical structure is described below.

C. Rough Approximation by Dominance Sets
The dominance-based rough approximations serve to induce the entities assigned to Cl_t^{≥} (an upward union of classes) or Cl_t^{≤} (a downward union of classes), where Cl is a cluster set containing the ordered classes Cl_t, t ∈ T, T = {1, 2, ..., n}. For all s, t ∈ T with s ≥ t, each element of Cl_s is preferred at least as much as each element of Cl_t; the unions are constructed as

Cl_t^{≥} = \bigcup_{s ≥ t} Cl_s,  \quad  Cl_t^{≤} = \bigcup_{s ≤ t} Cl_s.

P-dominating and P-dominated sets are the rough approximations obtained by taking an object x as a reference with respect to P, P ⊆ C:

P-dominating set: D_P^{+}(x) = {y ∈ X : y D_P x},
P-dominated set: D_P^{-}(x) = {y ∈ X : x D_P y},

where y ≽_q x for D_P^{+}(x) and x ≽_q y for D_P^{-}(x), for all q ∈ P. Explaining the approximation of decision classes by P-dominance sets is the key idea of rough sets for inferring one kind of knowledge from another, which has been implemented in DRSA as

\underline{P}(Cl_t^{≥}) = {x' ∈ Cl_t^{≥} : D_P^{+}(x') ⊆ Cl_t^{≥}},
\overline{P}(Cl_t^{≥}) = U − \underline{P}(Cl_{t-1}^{≤}),
Bn_P(Cl_t^{≥}) = \overline{P}(Cl_t^{≥}) − \underline{P}(Cl_t^{≥})
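The constructs above can be sketched in a few lines of Python (an illustration of the definitions only; the actual computations in this paper are coded as mathematical programs in Lingo 12, and the entity names and scores below are hypothetical):

# P-dominating sets and the lower approximation of an upward union Cl_t^>=.
# `data` maps each entity to its criterion values (all criteria assumed
# gain-type); `cls` maps each entity to its ordered decision class.
def p_dominating(data, P, x):
    """D_P^+(x): entities at least as good as x on every q in P."""
    return {y for y in data if all(data[y][q] >= data[x][q] for q in P)}

def p_dominated(data, P, x):
    """D_P^-(x): entities that x is at least as good as on every q in P."""
    return {y for y in data if all(data[x][q] >= data[y][q] for q in P)}

def upward_union(cls, t):
    """Cl_t^>= : entities whose class is at least t."""
    return {x for x, c in cls.items() if c >= t}

def lower_approx_upward(data, cls, P, t):
    """P(Cl_t^>=) = {x in Cl_t^>= : D_P^+(x) is a subset of Cl_t^>=}."""
    cl_up = upward_union(cls, t)
    return {x for x in cl_up if p_dominating(data, P, x) <= cl_up}

# Toy example (hypothetical WCY-style scores; class 2 = "at least 10th").
data = {"JP": {"finance": 7.1, "attitudes": 6.8},
        "US": {"finance": 8.0, "attitudes": 7.5},
        "XX": {"finance": 5.2, "attitudes": 6.9}}
cls = {"JP": 2, "US": 2, "XX": 1}
low = lower_approx_upward(data, cls, ["finance", "attitudes"], t=2)
print(low, len(low) / len(data))   # crude quality-of-approximation indicator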
0 parallel processing units called nodes. The output of the i-th node is defined by a Gaussian function \gamma_i(x) = \exp(-|x - c_i|^2 / \sigma_i^2), where x ∈ R^n is the input to the NN, c_i is the centre of the i-th node, and \sigma_i is its size of influence. The output of a RBNN, y_{NN} = F(x, W), may be calculated as [13]

F(x, W) = \sum_{i=1}^{p} w_i\,\gamma_i(x) = W^T(t)\,\Gamma(x),    (13)
where W(t) = [w_1(t)\; w_2(t)\; \dots\; w_p(t)]^T is the vector of network weights and \Gamma(x) is a set of radial basis functions defined by \Gamma(x) = [\gamma_1(x)\; \gamma_2(x)\; \dots\; \gamma_p(x)]^T. Given a RBNN, it is possible to approximate a wide variety of functions f(x) by making different choices for W. In particular, if there is a sufficient number of nodes within the NN, then there is some W^* such that \sup_{x \in S_x} |F(x, W^*) - f(x)| < \varepsilon, where S_x is a compact set and \varepsilon > 0 is a finite constant, provided that f(x) is continuous [13]. The RBNN is used to estimate the reaction rates \Phi_1 and \Phi_2 (that are considered unknown) using some state measurements.
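A minimal NumPy sketch of the RBNN output (13) is given below; the centres, widths and weights are placeholder values, whereas in the paper the centres lie on a mesh over the state space and W is adapted on line.

# Evaluate y_NN = W^T Gamma(x) = sum_i w_i * exp(-||x - c_i||^2 / sigma_i^2).
import numpy as np

def rbf_output(x, centres, sigmas, weights):
    """Radial basis network output for a single input vector x."""
    gamma = np.exp(-np.sum((centres - x) ** 2, axis=1) / sigmas ** 2)
    return weights @ gamma

centres = np.array([[1.0, 0.4], [6.0, 0.5], [12.0, 0.65]])   # c_i (assumed)
sigmas  = np.full(3, 0.05)                                    # sigma_i = 0.05
weights = np.array([0.2, 0.5, 0.1])                           # w_i (assumed)
print(rbf_output(np.array([6.0, 0.5]), centres, sigmas, weights))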
3.3 The Control Algorithm
The model predictive control is a strategy based on the explicit use of some kind of system model to predict the controlled variables over a certain time horizon, called the prediction horizon. The control strategy can be described as follows [14]: 1) At each sampling time, the value of the controlled variable y(t+k) is predicted over the prediction horizon k = 1, ..., N. This prediction depends on the future values of the control variable u(t+k) within a control horizon k = 1, ..., Nu. 2) A reference trajectory y_ref(t+k), k = 1, ..., N, is defined, which describes the desired system trajectory over the prediction horizon. 3) The vector of future controls u(t+k) is computed such that an objective function (a function of the errors between the reference trajectory and the predicted output of the model) is minimised. 4) Once the minimisation is achieved, the first optimised control action is applied to the plant and the plant outputs are measured. This measurement of the plant states is used as the initial state of the model for the next iteration. Steps 1 to 4 are repeated at each sampling instant; this is called a receding horizon strategy. The strategy of the MPC-based control is characterized by the scheme represented in Fig. 1.
Fig. 1 NMPC control scheme (the reference and the predicted output of the nonlinear model are fed to the NMPC block, which computes the optimised input applied to the nonlinear bioprocess; the controlled output is measured and fed back)
When a solution of the nonlinear least squares (NLS) minimization problem cannot be obtained analytically, the NLS estimates must be computed using numerical methods. To optimize a nonlinear function, an iterative algorithm starts from some initial value of the argument of that function and then repeatedly calculates the next candidate value according to a particular rule until an optimum is approximately reached. Among the many different methods of numerical optimization, the Levenberg-Marquardt (LM) algorithm was chosen to solve the optimisation problem. The LM algorithm is an iterative technique that locates the minimum of a multivariate function expressed as a sum of squares of non-linear real-valued functions [15], [16]. It has become a standard technique for non-linear least-squares problems [17], widely adopted in a broad spectrum of disciplines. LM can be thought of as a combination of steepest descent and the Gauss-Newton method. When the current solution is far from the correct one, the algorithm behaves like a steepest descent method; when the current solution is close to the correct one, it becomes a Gauss-Newton method.
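A hedged sketch of the resulting receding-horizon loop, using SciPy's Levenberg-Marquardt solver for the least-squares tracking objective, is shown below; model_step is a placeholder for one sampling period of the process model with the RBNN estimates of the reaction rates plugged in, and the input bounds are assumptions.

# Receding-horizon NMPC step solved with Levenberg-Marquardt least squares.
import numpy as np
from scipy.optimize import least_squares

def predict(u_seq, x0, N, model_step):
    """Simulate the model over the prediction horizon for a candidate input sequence."""
    x, y_pred = x0, []
    for k in range(N):
        u = u_seq[min(k, len(u_seq) - 1)]   # hold last input beyond the control horizon
        x = model_step(x, u)
        y_pred.append(x[-1])                # controlled output (pollution level)
    return np.array(y_pred)

def nmpc_step(x0, y_ref, u_prev, N, Nu, model_step, u_bounds=(0.0, 0.2)):
    """Return the first element of the optimised input sequence (requires N >= Nu)."""
    def residuals(u_seq):
        return predict(u_seq, x0, N, model_step) - y_ref
    u0 = np.full(Nu, u_prev)
    sol = least_squares(residuals, u0, method="lm")  # LM cannot handle bounds,
    return float(np.clip(sol.x[0], *u_bounds))       # so the input is clipped here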
4 Simulation Results In this Section we will apply the designed nonlinear model predictive control in the case of the anaerobic digestion bioprocess presented in Section 2. In order to control the output pollution level y, as input control we chose the dilution rate, u = D . The main control objective is to maintain the output y at a specified low level pollution y d ∈ ℜ . We will analyze the realistic case where the structure of the system of differential equation (2) is known and specific reaction rates Φ1 and Φ 2 (Eqs. (6) and (7)) are completely unknown and must be estimated. Using a RBNN from subsection 3.2, one constructs an on-line estimate of Φ1 respectively of Φ 2 . The performance of the nonlinear predictive controller presented in subsection 3.3 has been tested through extensive simulations by using the process model (2). The values of yield and kinetic coefficients are [18]: k1 = 3.2, k2 = 16.7, k3 = 1.035, k4 = 1.1935, k5 = 1.5, k6 = 3, k7 = 0.113, μ1∗ = 0.2 h-1, K M 1 = 0.5 g/l, μ∗2 = 0.35 h-1, K M 2 = 4 g/l, K I 2 = 21 g/l, and the values α1 = 1.2, α 2 = 0.75. It must be
noted that for the reaction rates estimation a full RBNN with deviation \sigma_i = 0.05 was used. The centres c_i of the radial basis functions are placed at the nodes of a mesh obtained by discretization of the states X_1 ∈ [1, 12] g/l, X_2 ∈ [0.4, 0.65] g/l, S_1 ∈ [0.1, 1.4] g/l and S_2 ∈ [0.3, 1.6] g/l with dX_i = dS_i = 0.2 g/l, i = 1, 2. The simulation results, obtained with a sample period T_s = 6 min, are presented in Figs. 2-5. In Fig. 2 the controlled output trajectory is presented, and in Fig. 3 the nonlinear model predictive control action (dilution rate D evolution) is depicted. The functions \Phi_1 and \Phi_2 provided by the RBNN are depicted versus the "real" functions in Fig. 4 and Fig. 5. From these figures it can be seen that the behaviour of the control system with the NMPC controller is very good, although the process dynamics are incompletely known. The control action has an oscillatory behaviour, but these oscillations are relatively slow and of small magnitude.

Fig. 2 The controlled output evolution (reference (1) and controlled output (2)), in g/l versus time (h)

Fig. 3 The nonlinear model predictive control action (dilution rate D, h^{-1}) versus time (h)

Fig. 4 The real reaction rate \Phi_1 (1) versus the function provided by the RBNN (2), in g/(l·h)

Fig. 5 The real reaction rate \Phi_2 (1) versus the function provided by the RBNN (2), in g/(l·h)
5 Conclusion In this paper, a nonlinear model predictive control strategy was developed for a wastewater treatment bioprocess. The nonlinear model used by the control algorithm was obtained using the analytical description of the biochemical reactions. The unknown reaction rates are estimated using radial basis neural networks. The nonlinear model states are used to calculate the optimal control signal applied to the system. The optimization problem was solved using the iterative Levenberg-Marquardt algorithm. The main goal of feedback control was to maintain a low pollution level in the case of an anaerobic bioprocess with strongly nonlinear and not exactly known dynamical kinetics. The obtained results are quite encouraging from a simulation viewpoint and show good tracking precision. The numerical simulations show that the use of the nonlinear model predictive control strategy leads to a good control performance. Acknowledgments. This work was supported by CNCSIS-UEFISCDI Romania, project number PN II-RU TE 106/2010.
References 1. Bastin, G., Dochain, D.: On-line Estimation and Adaptive Control of Bioreactors. Elsevier, Amsterdam (1990) 2. Bastin, G.: Nonlinear and adaptive control in biotechnology: a tutorial. In: Proc. ECC 1991 Conf., Grenoble, pp. 2001–2012 (1991) 3. Selişteanu, D., Petre, E.: Vibrational control of a class of bioprocesses. Contr. Eng. and App. Inf. 3(1), 39–50 (2001) 4. Camacho, E.F., Bordons, C.: Model Predictive Control, 2nd edn. Springer, Heidelberg (2004) 5. Hayakawa, T., Haddad, W.M., Hovakimyan, N.: Neural network adaptive control for a class of nonlinear uncertain dynamical systems with asymptotic stability guarantees. IEEE Trans. on Neural Networks 19, 80–89 (2008) 6. Petre, E., Selişteanu, D., Şendrescu, D.: Neural Networks Based Adaptive Control for a Class of Time Varying Nonlinear Processes. In: Int. Conf. on Control, Automation and Systems ICCAS 2008, COEX, Seoul, Korea, October 14-17, pp. 1355–1360 (2008) 7. Funahashi, K.: On the approximate realization of continuous mappings by neural networks. Neural Networks 2, 183–192 (1989) 8. Yu, W., Li, X.: Some new results on system identification with dynamic neural networks. IEEE Trans. Neural Networks 12(2), 412–417 (2001) 9. Isidori, A.: Nonlinear Control Systems, 3rd edn. Springer, Berlin (1995) 10. Petre, E.: Nonlinear Control Systems – Applications in Biotechnology, 2nd edn. Universitaria, Craiova (2008) (in Romanian) 11. Dochain, D., Vanrolleghem, P.: Dynamical Modelling and Estimation in Wastewater Treatment Processes. IWA Publishing (2001) 12. Mayne, D.Q., Rawlings, J.B., Rao, C.V., Scokaert, P.O.: Constrained model predictive control: stability and optimality. Automatica 36, 789–814 (2000)
13. Spooner, J.T., Passino, K.M.: Decentralized adaptive control of nonlinear systems using radial basis neural networks. IEEE Trans. on Autom. Control 44(11), 2050–2057 (1999) 14. Eaton, J.W., Rawlings, J.R.: Feedback control of nonlinear processes using online optimization techniques. Computers and Chemical Engineering 14, 469–479 (1990) 15. Wang, Y., Boyd, S.: Fast model predictive control using online optimization. In: Proc. of the 17th World Congress of International Federation of Automatic Control, WCIFAC 2008 (2008) 16. Kouvaritakis, B., Cannon, M.: Nonlinear Predictive Control: Theory and Practice. IEE (2001) 17. Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, Heidelberg (1999) 18. Petre, E., Selişteanu, D., Şendrescu, D.: Adaptive control strategies for a class of anaerobic depollution bioprocesses. In: Proc. of Int. Conf. on Automation, Quality and Testing, Robotics, Cluj-Napoca, Romania, Tome II, May 22-25, pp. 159–164 (2008)
Neural Networks Based Adaptive Control of a Fermentation Bioprocess for Lactic Acid Production Emil Petre, Dan Selişteanu, and Dorin Şendrescu
Abstract. This work deals with the design and analysis of nonlinear and neural adaptive control strategies for a lactic acid production process carried out in continuous stirred tank bioreactors. An indirect adaptive controller based on a dynamical neural network, used as an on-line approximator to learn the time-varying characteristics of the process parameters, is developed and then compared with a classical linearizing controller. The controller design is achieved by using an input-output feedback linearization technique. The effectiveness and performance of both control algorithms are illustrated by numerical simulations for a lactic fermentation bioprocess whose kinetic dynamics are strongly nonlinear, time varying and completely unknown. Keywords: Neural networks, Adaptive control, Lactic acid production.
Emil Petre, Dan Selişteanu, and Dorin Şendrescu, Department of Automatic Control, University of Craiova, A.I. Cuza 13, Craiova, Romania; e-mail: {epetre,dansel,dorins}@automation.ucv.ro

1 Introduction
In the last decades, the control of bioprocesses has been a significant problem attracting wide attention, the main engineering motivation being the improvement of operational stability and production efficiency. It is well known that control design involves complicated mathematical analysis and has difficulties in controlling highly nonlinear and time varying plants. A powerful tool for nonlinear controller design is feedback linearization [1, 2], but its use requires complete knowledge of the process. In practice there are many processes described by highly nonlinear dynamics, for which an accurate model is difficult to develop. Therefore, in recent years, great progress has been made in the development of adaptive and robust adaptive controllers, due to their ability to compensate for both parametric uncertainties and process parameter variations. Recently, there has also been considerable interest in the use of neural networks (NNs) for the identification and control of complex dynamical systems [3, 4, 5, 6, 7, 8]. The main advantage of
using NNs in control applications is based both on their ability to uniformly approximate arbitrary input-output mappings and on their learning capabilities that enable the resulting controller to adapt itself to possible variations in the controlled plant dynamics [5]. More precisely, the variations of plant parameters are transposed in the modification of the NN parameters (i.e. the adaptation of the NN weights). Using the feedback linearization and NNs, several NNs-based adaptive controllers were developed for some classes of uncertain, time varying and nonlinear systems [4, 5, 7, 9]. In this paper, the design and analysis of some nonlinear and NNs control strategies for a lactic fermentation bioprocess are presented. In fact, by using the feedback linearization, the design of a linearizing controller and of an indirect adaptive controller based on a dynamical NN is achieved for a class of square nonlinear systems. The control signals are generated by using a recurrent NN approximation of the functions representing the uncertain or unknown and time varying plant dynamics. Adaptation in this controller requires on-line adjustment of NN weights. The adaptation law is derived in a manner similar to the classical Lyapunov based model reference adaptive control design, where the stability of the closed loop system in the presence of the adaptation law is ensured. The derived control methods are applied to a fermentation bioprocess for lactic acid production, which is characterized by strongly nonlinear, time varying and completely unknown dynamical kinetics.
2 Process Modelling and Control Problem Lactic acid has traditionally been used in the food industry as an acidulating and/or preserving agent, and in the biochemical industry for cosmetic and textile applications [10, 12]. Recently, lactic acid fermentation has received much more attention because of the increasing demand for new biomaterials such as biodegradable and biocompatible polylactic products. Two major factors limit its biosynthesis, affecting growth and productivity: the nutrient limiting conditions and the inhibitory effect caused by lactic acid accumulation in the culture broth [10]. A reliable model that explicitly integrates nutritional factor effects on both growth and lactic acid production in a batch fermentation process implementing Lb. casei was developed by Ben Youssef et al. [10] and it is described by the following differential equations:
\dot X = \mu X - k_d X, \quad \dot P = \nu_p X, \quad \dot S = -q_s X,    (1)
where X, S and P are, respectively, the concentrations of cells, substrate (glucose) and lactic acid. μ, νp and qs correspond, respectively, to specific growth rate of cells, specific rate of lactic acid production and to specific rate of glucose consumption. kd is the rate of cell death. In (1) the mechanism of cell growth was modelled as follows:
\mu = \mu_{\max}\left(\frac{K_P^{gc}}{K_P^{gc}+P}\right)\left(\frac{S}{K_S^{gc}+S}\right)\left(1-\frac{P}{P_C^{gc}}\right),    (2)
with \mu_{\max} the maximum specific growth rate, K_P^{gc} the lactic acid inhibition constant, K_S^{gc} the affinity constant of the growing cells for glucose, and P_C^{gc} the critical lactic acid concentration. The superscript gc denotes parameters related to growing cells and rc those related to resting cells. The specific lactic acid production and specific consumption rates are given by
\nu_p = \delta\mu + \gamma\,\frac{S}{K_S^{rc}+S}, \quad q_s = \nu_p / Y_{PS},    (3)
where \delta and \gamma are positive constants, K_S^{rc} is the affinity constant of the resting cells for glucose, and Y_{PS} represents the substrate-to-product conversion yield. The kinetic parameters of this model can be readjusted depending on the medium enrichment factor \alpha; the nutritional limitation is described by the following hyperbolic-type expressions [10]:
\mu_{\max} = \frac{\bar\mu_{\max}\,(\alpha-\alpha_0)}{K_{\alpha\mu}+(\alpha-\alpha_0)}, \quad K_P^{gc} = \frac{K_{P\max}^{gc}\,(\alpha-\alpha_0)}{K_{\alpha P}+(\alpha-\alpha_0)}, \quad K_S^{rc} = \frac{K_{S\max}^{rc}\,(\alpha-\alpha_0)}{K_{\alpha S}+(\alpha-\alpha_0)},    (4)
with \alpha_0 the minimal nutritional factor necessary for growth and K_{\alpha\mu}, K_{\alpha P}, K_{\alpha S} saturation constants; \bar\mu_{\max}, K_{P\max}^{gc} and K_{S\max}^{rc} correspond to the limit value of each parameter. Since in the process of lactic acid production the main cost of raw material comes from the substrate and nutrient requirements, some possible continuous-flow control strategies that satisfy the economic aspects were investigated in [10]. The advantage of a continuous-flow process is that the main product, which is also an inhibitor, is continuously withdrawn from the system. Moreover, according to microbial engineering theory, for a product-inhibited reaction such as lactic acid [10] or alcoholic fermentation [11], a multistage system composed of several interconnected continuous stirred tank reactors, in which some variables of the microbial culture (substrate, metabolites) can be kept close to optimal values in the different reactors, may be a good idea. Therefore the model (1)-(4) can be extended to a continuous-flow process carried out in two sequentially connected continuous stirred tank reactors, as in Fig. 1. For this bioreactor, the mathematical model is given by the following set of differential equations, each stage having the same constant volume V:
First stage:
\dot X_1 = (\mu_1 - k_d)X_1 - D_1 X_1, \quad
\dot P_1 = \nu_{p1} X_1 - D_1 P_1, \quad
\dot S_1 = -q_{s1} X_1 + D_{11} S_1^{in} - D_1 S_1, \quad
\dot\alpha_1 = D_{12}\,\alpha_1^{in} - D_1\,\alpha_1

Second stage:
\dot X_2 = (\mu_2 - k_d)X_2 + D_1 X_1 - (D_1+D_2) X_2, \quad
\dot P_2 = \nu_{p2} X_2 + D_1 P_1 - (D_1+D_2) P_2, \quad
\dot S_2 = -q_{s2} X_2 + D_1 S_1 + D_2 S_2^{in} - (D_1+D_2) S_2, \quad
\dot\alpha_2 = D_1\,\alpha_1 - (D_1+D_2)\,\alpha_2    (5)
with D1 = D11 + D12 and where X i , S i and Pi , (i = 1, 2) are, respectively, the concentrations of biomass (cells), substrate and lactic acid in each bioreactor. μ i ,
ν pi and qsi (i = 1, 2) correspond, respectively, to specific growth rate of the cells,
specific rate of lactic acid production and to specific rate of glucose consumption in each bioreactor. D11 is the first-stage dilution rate of a feeding solution at an influent glucose concentration S1in . D12 is the first-stage dilution rate of a feeding solution at an influent enrichment factor α1in . D2 is the influent dilution rate added at the second stage and S 2in is the corresponding feeding glucose concentration. It can be seen that in the second stage no growth factor feeding tank is included since this was already feeding in the first reactor. In the model (5) the mechanism of cell growth, the specific lactic acid production rate and the specific consumption rate are given by [10]:
\mu_i = \mu_{\max i}\left(\frac{K_{Pi}^{gc}}{K_{Pi}^{gc}+P_i}\right)\left(\frac{S_i}{K_{Si}^{gc}+S_i}\right)\left(1-\frac{P_i}{P_C^{gc}}\right), \quad \nu_{pi} = \delta\mu_i + \beta\left(\frac{S_i}{K_{Si}^{rc}+S_i}\right), \quad q_{si} = \frac{\nu_{pi}}{Y_{PS}}.    (6)

Fig. 1 A cascade of two reactors for the lactic acid production (feed streams D_{11} with glucose S_1^{in} and D_{12} with enrichment factor \alpha_1^{in} into the first reactor (X_1, S_1, P_1); feed stream D_2 with glucose S_2^{in} into the second reactor (X_2, S_2, P_2))
The kinetic parameters of this model may be readjusted depending on the medium enrichment factor \alpha_i as follows [10]:
\mu_{\max i} = \frac{\bar\mu_{\max}\,(\alpha_i-\alpha_0)}{K_{\alpha\mu}+(\alpha_i-\alpha_0)}, \quad K_{Pi}^{gc} = \frac{K_{P\max}^{gc}\,(\alpha_i-\alpha_0)}{K_{\alpha P}+(\alpha_i-\alpha_0)}, \quad K_{Si}^{rc} = \frac{K_{S\max}^{rc}\,(\alpha_i-\alpha_0)}{K_{\alpha S}+(\alpha_i-\alpha_0)}.    (7)
Now, the operating point of the continuous lactic acid fermentation process could be adjusted by acting on at least two control inputs, i.e. the primary and the secondary glucose feeding flow rates D11 and D2 . The number of input variables can be increased by including as a control input the rate of enrichment feeding D12 . As it was already formulated, the control objective consists in adjusting the plant’s load in order to convert the glucose into lactic acid via fermentation, which is directly correlated to the economic aspects of lactic acid production. More exactly, considering that the process model (5)-(7) is incompletely known and its parameters are time varying, the control goal is to maintain the process at some operating points, which correspond to a maximal lactic production rate and a minimal residual glucose concentration. By a process steady-state analysis, it was demonstrated [10] that these desiderata can be satisfied if the operating point is kept around the points S1* = 3 g/l and S 2* = 5 g/l. As control variables we chose the
dilution rates both in the first and in the second reactors D1 and D2 , respectively. In this way we obtain a multivariable control problem with two inputs: D1 and D2 , and two outputs: S1 and S 2 .
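As an illustration of the model structure (5)-(7), the following Python sketch integrates the two-stage model under constant dilution rates, using the parameter values reported later in Sect. 5; the chosen inputs D_11, D_12, D_2, the initial state, and the identification of the \gamma of Sect. 5 with the \beta of Eq. (6) are assumptions, so this is not the authors' simulator.

# Two-stage continuous lactic fermentation model (5)-(7), constant inputs.
import numpy as np
from scipy.integrate import solve_ivp

p = dict(mu_max0=0.45, KSgc=0.5, KSrc_max=12.0, KPgc_max=15.0, delta=3.5,
         beta=0.9, alpha0=0.02, Kalpha_mu=0.2, Kalpha_P=1.1, Kalpha_S=4.0,
         PCgc=95.0, YPS=0.98, kd=0.02, S1in=50.0, S2in=200.0, alpha1in=6.0)

def kinetics(S, P, alpha, p):
    """Eqs. (6)-(7): specific growth, production and consumption rates."""
    da = max(alpha - p["alpha0"], 1e-9)
    mu_max = p["mu_max0"] * da / (p["Kalpha_mu"] + da)
    KPgc = p["KPgc_max"] * da / (p["Kalpha_P"] + da)
    KSrc = p["KSrc_max"] * da / (p["Kalpha_S"] + da)
    mu = mu_max * (KPgc / (KPgc + P)) * (S / (p["KSgc"] + S)) * (1.0 - P / p["PCgc"])
    nu_p = p["delta"] * mu + p["beta"] * S / (KSrc + S)
    return mu, nu_p, nu_p / p["YPS"]

def two_stage(t, z, D11, D12, D2, p):
    """Right-hand side of model (5)."""
    X1, P1, S1, a1, X2, P2, S2, a2 = z
    D1 = D11 + D12
    mu1, nu1, qs1 = kinetics(S1, P1, a1, p)
    mu2, nu2, qs2 = kinetics(S2, P2, a2, p)
    return [(mu1 - p["kd"]) * X1 - D1 * X1,
            nu1 * X1 - D1 * P1,
            -qs1 * X1 + D11 * p["S1in"] - D1 * S1,
            D12 * p["alpha1in"] - D1 * a1,
            (mu2 - p["kd"]) * X2 + D1 * X1 - (D1 + D2) * X2,
            nu2 * X2 + D1 * P1 - (D1 + D2) * P2,
            -qs2 * X2 + D1 * S1 + D2 * p["S2in"] - (D1 + D2) * S2,
            D1 * a1 - (D1 + D2) * a2]

z0 = [1.0, 5.0, 3.0, 0.5, 1.0, 10.0, 5.0, 0.5]    # assumed initial state
sol = solve_ivp(two_stage, (0.0, 200.0), z0, args=(0.03, 0.002, 0.02, p), max_step=0.5)
print(sol.y[[2, 6], -1])                           # final S1, S2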
3 Design of Control Strategies
Consider the class of multi-input/multi-output square nonlinear dynamical systems (that is, systems with as many inputs as outputs) of the form [7, 8]:

\dot x = f(x) + \sum_{i=1}^{n} g_i(x)\,u_i = f(x) + G(x)u; \quad y = Cx    (8)
with the state x ∈ R^n, the input u ∈ R^n and the output y ∈ R^n; f : R^n → R^n is an unknown smooth function and G a matrix whose columns are the unknown smooth functions g_i. Note that f and g_i contain parametric uncertainties which are not necessarily linearly parameterizable, and C is an n × n constant matrix. For the processes (8) the control objective is to make the output y track a specified trajectory y_ref. The problem is very difficult, or even impossible, to solve if the functions f and g_i are assumed to be unknown. Therefore, in order to model the nonlinear system (8), dynamical NNs are used. Dynamical neural networks are recurrent, fully interconnected nets containing dynamical elements in their neurons. They can be described by the following system of coupled first-order differential equations [7, 8]:
\dot{\hat x}_i = a_i \hat x_i + b_i \sum_{j=1}^{n} w_{ij}\,\varphi(\hat x_j) + b_i\,w_{i,n+1}\,\psi(\hat x_i)\,u_i, \quad i = 1,\dots,n    (9)

or compactly

\dot{\hat x} = A\hat x + BW\Phi(\hat x) + BW_{n+1}\Psi(\hat x)u; \quad y_N = C\hat x    (10)
with the state \hat x ∈ R^n, the input u ∈ R^n, the output y_N ∈ R^n, W an n × n matrix of adjustable synaptic weights, A an n × n diagonal matrix with negative eigenvalues a_i, B an n × n diagonal matrix of scalar elements b_i, and W_{n+1} an n × n diagonal matrix of adjustable synaptic weights, W_{n+1} = diag{w_{1,n+1}, …, w_{n,n+1}}. \Phi(\hat x) is an n-dimensional vector and \Psi(\hat x) an n × n diagonal matrix whose elements are the activation functions \varphi(\hat x_i) and \psi(\hat x_i), respectively, usually represented by sigmoids of the form:
\varphi(\hat x_i) = \frac{m_1}{1+e^{-\delta_1 \hat x_i}}, \quad \psi(\hat x_i) = \frac{m_2}{1+e^{-\delta_2 \hat x_i}} + \beta_i, \quad i = 1, \dots, n,
where m k and δ k , k = 1, 2 are constants, and β i > 0 are constants that shift the sigmoids, such that ψ ( xˆi ) > 0 for all i = 1, ..., n . Next, by using the feedback linearization technique, two nonlinear controllers for the system (8) are presented: a linearizing feedback controller, and a nonlinear
adaptive controller using dynamical neural networks. Firstly, the linearizing feedback controller is considered, which is an ideal case in which maximum prior knowledge concerning the process is available. We suppose that the functions f and G in (8) are completely known, that the relative degree of the differential equations in (8) is equal to 1, and that all states are measurable on-line. Assume that we wish to have the following first order linear stable closed loop (process + controller) behaviour:

(\dot y_{ref} - \dot y) + \Lambda\,(y_{ref} - y) = 0    (11)
with \Lambda = diag\{\lambda_i\}, \lambda_i > 0, i = 1,\dots,n. Then, by combining (8) and (11) one obtains the following multivariable decoupling linearizing feedback control law:

u = (CG(x))^{-1}\,(-Cf(x) + \nu)    (12)
with CG(x) assumed invertible, which applied to the process (8) results in \dot y = \nu, where \nu is the new input vector designed as \nu = \dot y_{ref} + \Lambda(y_{ref} - y). The control law (12) leads to the linear error model \dot e_t = -\Lambda e_t, where e_t = y_{ref} - y is the tracking error. For \lambda_i > 0, the error model has an exponentially stable point at e_t = 0. Because this level of prior knowledge concerning the process is not realistic, a more realistic case is analyzed next, in which the model (8) is practically unknown, that is, the functions f and G are completely unknown and time varying. To solve the control problem, a NN-based adaptive controller will be used. The dynamical NN (10) is used as a model of the process for the control design. Assume that the unknown process (8) can be completely described by a dynamical NN plus a modelling error term \omega(x, u). In other words, there exist weight values W^* and W^*_{n+1} such that (8) can be written as:

\dot x = Ax + BW^*\Phi(x) + BW^*_{n+1}\Psi(x)u + \omega(x, u); \quad y = Cx.    (13)
It is clear that the tracking problem can now be analyzed for the system (13) instead of (8). Since W^* and W^*_{n+1} are unknown, the solution consists in designing a control law u(W, W_{n+1}, x) and appropriate update laws for W and W_{n+1} such that the network model output y tracks a reference trajectory y_ref. The dynamics of the NN model output (13), where the term \omega(x, u) is assumed to be 0, can be expressed as:

\dot y = C\dot x = CAx + CBW^*\Phi(x) + CBW^*_{n+1}\Psi(x)u    (14)
Assume that CBW^*_{n+1}\Psi(x) is invertible, which implies relative degree equal to one for the input-output relation (14). Then, the control law (12) is particularized as follows:

u = \left(CBW^*_{n+1}\Psi(x)\right)^{-1}\left(-CAx - CBW^*\Phi(x) + \nu\right)    (15)
where the new input vector \nu is defined as \nu = \dot y_{ref} + \Lambda(y_{ref} - y), which applied to the model (14) results in a linear stable system with respect to this input, as \dot y = \nu. Defining the tracking error between the reference trajectory and the network output (14) as e_t = y_{ref} - y, the control law (15) leads to a linear error model \dot e_t = -\Lambda e_t. For \lambda_i > 0, i = 1,\dots,n, the error e_t converges to the origin exponentially. Note that the control input (15) is applied both to the plant and to the neural model. Now, we can define the error between the identifier (NN) output and the real system (ideal identifier) output as e_m = y_N - y = C(\hat x - x). Assuming that the identifier states are close to the process states [7, 8], from (10) and (13) we obtain the error equation:

\dot e_m = CAC^{-1} e_m + CB\tilde W\Phi(x) + CB\tilde W_{n+1}\Psi(x)u    (16)

with \tilde W = W - W^*, \tilde W_{n+1} = W_{n+1} - W^*_{n+1}. Since the control law (15) contains the unknown weight matrices W^* and W^*_{n+1}, it becomes an adaptive control law if these weight matrices are substituted by their on-line estimates calculated by appropriate updating laws. Since we are interested in obtaining stable adaptive control laws, a Lyapunov synthesis method is used. Consider the following Lyapunov function candidate:

V = \frac{1}{2}\left(e_m^T P e_m + e_t^T \Lambda^{-1} e_t + tr\{\tilde W^T \tilde W\} + tr\{\tilde W_{n+1}^T \tilde W_{n+1}\}\right)    (17)

where P > 0 is chosen to satisfy the Lyapunov equation PA + A^T P = -I. Differentiating (17) along the solution of (16), where C is considered equal to the identity matrix, one finally obtains:

\dot V = -\frac{1}{2} e_m^T e_m - e_t^T e_t + \Phi^T(x)\tilde W^T B P e_m + u^T \Psi^T(x)\tilde W_{n+1} B P e_m + tr\{\tilde W^T \dot{\tilde W}\} + tr\{\tilde W_{n+1}^T \dot{\tilde W}_{n+1}\}    (18)

For tr\{\tilde W^T \dot{\tilde W}\} = -\Phi^T(x)\tilde W^T B P e_m and tr\{\tilde W_{n+1}^T \dot{\tilde W}_{n+1}\} = -u^T \Psi^T(x)\tilde W_{n+1} B P e_m, (18) becomes:

\dot V = -\frac{1}{2} e_m^T e_m - e_t^T e_t = -\frac{1}{2}\|e_m\|^2 - \|e_t\|^2 \le 0    (19)
and consequently, for the network weights the following updating laws are obtained: w ij = −bi piφ ( x j )e mi , i , j = 1, ..., n ; w i , n+1 = −bi piψ ( xi )u i emi , i = 1, ..., n (20) Theorem 1. Consider the control law (15), and the tracking and model errors defined above. The updating laws (20) guarantee the following properties [7, 8]: ~ lim t →∞ et (t ) = 0 ; ii) lim t →∞ W (t ) = 0 , i) lim t →∞ em (t ) = 0 , ~ lim t →∞ Wn +1 (t ) = 0 .
E. Petre, D. Selişteanu, and D. Şendrescu
208
Proof. Since V in (12) is negative semidefinite, then we have V ∈ L∞ , which ~ ~ implies et , em , W , Wn +1 ∈ L∞ . Furthermore, xˆ = x + C −1em is also bounded. Since V is a non-increasing function of time and bounded from below, then there exists lim t →∞ V (t ) = V (∞) . By integrating V from 0 to ∞ we obtain ∞
⎛ 2 2 ⎞ ∫ ⎜ 2 || e m || + || et || ⎟ dt = V (0) − V ( ∞ ) < ∞ 0
1
⎝
⎠
which implies et , em ∈ L2 . By definition, φ ( xi ) and ψ ( xi ) are bounded for all x and by assumption all inputs to the NN, the reference y ref and its time derivative are also bounded. Hence, from (8) we have that u is bounded and from et = −Λet and (9) we conclude that et , em ∈ L∞ . Since et , em ∈ L2 ∩ L∞ and et , em ∈ L∞ , using Barbalat’s Lemma [13], one obtains that lim t →∞ et (t ) = 0 and lim t →∞ em (t ) = 0 . Using now the boundedness of u, Φ ( x), Ψ ( x ) and the convergence of e (t ) to 0, we have that W and W also converge to 0. But we n +1
m
cannot conclude anything about the convergence of weights to their optimal values. In order to guarantee this convergence, u , Φ ( x ), Ψ ( x ) need to satisfy a persistency of excitation condition [8].
4 Control Strategies for Lactic Acid Production Firstly, we consider the ideal case where maximum prior knowledge concerning the process (kinetics, yield coefficients and state variables) is available, and the relative degree of differential equations in process model is equal to 1. Assume that for the two interconnected reactors we wish to have the following first order linear stable closed loop (process + controller) behaviour: d ⎡ S1* − S1 ⎤ ⎡λ1 0 ⎤ ⎡ S1* − S1 ⎤ ⎢ ⎥+ ⎢ ⎥ = 0 , λ1 , λ2 > 0 , dt ⎣ S 2* − S 2 ⎦ ⎢⎣ 0 λ 2 ⎥⎦ ⎣ S 2* − S 2 ⎦
(21)
where S1* and S 2* are the desired values of S1 and S 2 . Since the dynamics of S1 and S 2 in the process model (5) have the relative degree equal to 1, these can be considered as an input-output model and can be rewritten in the following form: d ⎡ S1 ⎤ ⎡ − q s1 X 1 ⎤ ⎡− D12 S1in ⎤ ⎡ S1in − S1 0 ⎤ ⎡ D1 ⎤ (22) +⎢ . =⎢ +⎢ in ⎥ ⎢ ⎥ ⎥ − S q X dt ⎣ 2 ⎦ ⎣ s 2 2 ⎦ ⎣ 0 ⎦ ⎣ S1 − S 2 S 2 − S 2 ⎥⎦ ⎢⎣ D2 ⎥⎦ Then from (21) and (22) one obtains the following exactly multivariable decoupling linearizing feedback control law [12]: ⎡ D1 ⎤ ⎡ S1in − S1 0 ⎤ ⎢ D ⎥ = ⎢ S − S S in − S ⎥ ⎣ 2⎦ ⎣ 1 2 2 2⎦
−1
⎧⎪⎡S1* ⎤ ⎡λ1 0 ⎤ ⎡ S1* − S1 ⎤ ⎡ qs1 X 1 ⎤ ⎡ D S in ⎤ ⎫ + 12 1 ⎬ (23) ⎨⎢ * ⎥ + ⎢ ⎥+ ⎥⎢ * ⎪⎩⎣S 2 ⎦ ⎣ 0 λ2 ⎦ ⎣S 2 − S 2 ⎦ ⎢⎣qs 2 X 2 ⎥⎦ ⎢⎣ 0 ⎥⎦ ⎭
Neural Networks Based Adaptive Control of a Fermentation Bioprocess
209
⎡ S in − S1 0 ⎤ where the decoupling matrix ⎢ 1 in ⎥ remains invertible as long as S S S − 2 2 − S2 ⎦ ⎣ 1 S1 < S1in and S 2 < S 2in (conditions satisfied in a normal operation of the two reactors). The control law (23) applied to process (22) leads to the following two linear error models e1 = −λ1e1 , e2 = −λ2 e2 , where e1 = S1* − S1 and e2 = S 2* − S 2 represent the tracking errors, which for λ1 , λ 2 > 0 have an exponential stable point at origin. Since the prior knowledge concerning the process previously assumed is not realistic, now we will analyze a more realistic case, where the process dynamics are incompletely known and time varying. We will assume that the reaction rates qs1 X 1 and qs 2 X 2 are completely unknown and can by expressed as:
qs1 X 1 = ρ1 , qs 2 X 2 = ρ 2 ,
(24)
where ρ1 and ρ 2 are considered two unknown and time-varying parameters. In this case, the control law (23) becomes: ⎡ D1 ⎤ ⎡S1in − S1 0 ⎤ ⎢ D ⎥ = ⎢ S − S S in − S ⎥ ⎣ 2⎦ ⎣ 1 2 2 2⎦
−1
⎧⎪⎡ S1* ⎤ ⎡λ1 0 ⎤ ⎡ S1* − S1 ⎤ ⎡ ρ1 ⎤ ⎡ D12 S1in ⎤ ⎫⎪ ⎨⎢ * ⎥ + ⎢ ⎥+⎢ ⎥+⎢ ⎥⎬ . ⎥⎢ * ⎪⎩⎣ S 2 ⎦ ⎣ 0 λ2 ⎦ ⎣ S 2 − S 2 ⎦ ⎣ ρ 2 ⎦ ⎣ 0 ⎦ ⎪⎭
(25)
Since the control law (25) contains the unknown parameters ρ1 and ρ 2 , these will be substituted by their on-line estimates ρˆ1 and ρˆ 2 calculated by using a dynamical neural network (10), whose structure for this case is particularized as follows:
ρˆ i (t ) = ai ρˆ i + bi ∑ wijφ ( ρˆ j ) + bi wi , n +1ψ ( ρˆ i ) Di ; i, j = 1, 2 . n
j =1
(26)
So, D1 and D2 in (25) are modified so that the estimations ρˆ1 (t ) and ρˆ 2 (t ) are used in place of ρ1 and ρ2. The parameters wij and wi , n +1 are adjusted using the adaptation laws (20).
5 Simulation Results and Comments The performance of designed neural adaptive controller (25), (26), by comparison to the exactly linearizing controller (23) (which yields the best response and can be used as benchmark), has been tested by performing extensive simulation experiments, carried out by using the process model (5)-(7) under identical and realistic conditions. 0 The values of the kinetic parameters used in simulations are [11]: μ max = 0.45 h-1,
K Sgc = 0.5 g/l, K Srcmax = 12 g/l, K Pgcmax = 15 g/l, δ = 3.5, γ = 0.9 h-1, α0 = 0.02 g/l, K αμ = 0.2 g/l, K αP = 1.1 g/l, K αS = 4 g/l, PCgc = 95 g/l, YPS = 0.98 g/g, k d = 0.02 h-1, in in D12=0.002 h-1, S10in = 50 g/l, S 20 = 200 g/l, α10 = 6 g/l.
E. Petre, D. Selişteanu, and D. Şendrescu
210
The system’s behaviour was analyzed assuming that the influent glucose concentrations in the two feeding substrates act as perturbations of the form: in S1in (t ) = S10in ⋅ (1 + 0.25 sin(π t / 25)) and S 2in (t ) = S 20 ⋅ (1 − 0.25 cos(π t / 50)) , and the 0 (1 + 0.25 sin(π t / 40)) . kinetic coefficient μ max is time-varying as μ max (t ) = μ max The behaviour of closed-loop system using NN adaptive controller, by comparison to the exactly linearizing law is presented in Fig. 2. The graphics from the left figure show the time evolution of the two controlled variables S1 and S2 respectively, and the graphics from the right figure correspond to control inputs D1 and D2, respectively. In order to test the behaviour of the indirect adaptive controlled system in more realistic circumstances, we considered that the measurements of both controlled variables (S1 and S2) are corrupted with an additive white noise with zero average (5% from their nominal values). The simulation results in this case, conducted in the same conditions are presented in Fig. 3. The behaviour of controlled variables and of control inputs is comparable with the results obtained in the free noise simulation. The time evolution of the estimates of unknown functions (24) provided by the recurrent NN estimator is presented in Fig. 4, in both simulation cases.
0.05
4
S 2* 2
Control inputs D1, D2 (h-1)
Controlled outputs S1, S2 (g/l)
4.5
1
3.5 3 2 2.5 1
S1*
2 1.5
0
100
200
300
400
500
600
700
2 0.04
D1
0.02 1
0.01
0
800
1
0.03
D2 2 0
100
200
300
Time (h)
400
500
600
700
800
Time (h)
Fig. 2 Simulation results – neural adaptive control (2) versus exactly linearizing control (1) 0.05
4
S 2*
1
2
Control inputs D1, D2 (h-1)
Controlled outputs S1, S2 (g/l)
4.5
3.5 3 2 2.5 1
S1*
2 1.5
0
100
200
300
400
Time (h)
500
600
700
800
2 0.04
D1
1
0.03
0.02 1 0.01
0
D2 2 0
100
200
300
400
500
600
700
800
Time (h)
Fig. 3 Simulation results – neural adaptive control (noisy data) (2) versus linearizing control (1)
Fig. 4 Estimates of the unknown functions \rho_1 and \rho_2 (g/(l·h)) versus time (h): (1) without, and (2) with noisy measurements
It can be noticed from Fig. 4 that the time evolution of the estimates obtained with noisy measurements of S1 and S2 is similar to the time profiles in the noise-free case. From the graphics in Fig. 2 and Fig. 3 it can be seen that the behaviour of the overall system with the indirect adaptive controller, even though this controller uses much less a priori information, is good, being very close to the behaviour of the closed-loop system in the ideal case when the process model is completely known. Note also the regulation properties and the ability of the controller to maintain the controlled outputs S1 and S2 very close to their desired values, despite the high load variations of S_1^{in} and S_2^{in}, the time variation of the process parameters and the influence of noisy measurements. The gains of the control laws (23) and (25), respectively, are \lambda_1 = 0.55, \lambda_2 = 0.85. For the NN adaptive controller the initial values of the weights are set to 0.5 and the design parameters were chosen as m_1 = 180, m_2 = 180, \delta_1 = \delta_2 = 0.1, \beta_1 = \beta_2 = 0.2, a_1 = a_2 = -12, b_1 = b_2 = 0.01, p_1 = p_2 = 2.5. It must be noted that a preliminary tuning of the NN controller is not necessary. It can be concluded that when the process nonlinearities are not completely known and the bioprocess dynamics are time varying, NN adaptive controllers are viable alternatives. A remaining problem for future work is the controller design when the real plant is of higher order than assumed, or in the presence of other unmodelled dynamics.
6 Conclusion An indirect NN adaptive control strategy for a nonlinear system for which the dynamics is incompletely known and time varying was presented. The controller design is based on the input-output linearizing technique. The unknown controller functions are approximated using a dynamical neural network. The form of the controller and the neural controller adaptation laws were derived from a Lyapunov analysis of stability. It was demonstrated that under certain conditions, all the controller parameters remain bounded and the plant output tracks the output of a linear reference model. The simulation results showed that the performances of the proposed adaptive controller are very good.
212
E. Petre, D. Selişteanu, and D. Şendrescu
Acknowledgments. This work was supported by CNCSIS–UEFISCSU, Romania, project number PNII–IDEI 548/2008.
References 1. Isidori, A.: Nonlinear Control Systems, 3rd edn. Springer, Berlin (1995) 2. Sastry, S., Isidori, A.: Adaptive control of linearizable systems. IEEE Trans. Autom. Control 34(11), 1123–1131 (1989) 3. Diao, Y., Passino, K.M.: Stable adaptive control of feedback linearizable time-varying nonlinear systems with application to fault-tolerant engine control. Int. J. Control 77(17), 1463–1480 (2004) 4. Fidan, B., Zhang, Y., Ioannou, P.A.: Adaptive control of a class of slowly time varying systems with modelling uncertainties. IEEE Trans. Autom. Control 50, 915–920 (2005) 5. Hayakawa, T., Haddad, W.M., Hovakimyan, N.: Neural network adaptive control for a class of nonlinear uncertain dynamical systems with asymptotic stability guarantees. IEEE Trans. Neural Netw. 19, 80–89 (2008) 6. McLain, R.B., Henson, M.A., Pottmann, M.: Direct adaptive control of partially known non-linear systems. IEEE Trans. Neural Netw. 10(3), 714–721 (1999) 7. Petre, E., Selişteanu, D., Şendrescu, D., Ionete, C.: Neural networks-based adaptive control for a class of nonlinear bioprocesses. Neural Comput. & Applic. 19(2), 169– 178 (2010) 8. Rovithakis, G.A., Christodoulou, M.A.: Direct adaptive regulation of unknown nonlinear dynamical systems via dynamic neural networks. IEEE Trans. Syst. Man, Cybern. 25, 1578–1594 (1995) 9. Petre, E., Selişteanu, D., Şendrescu, D.: Neural networks based adaptive control for a class of time varying nonlinear processes. In: Proc. Int. Conf. Control, Autom. and Systems ICCAS 2008, Seoul, Korea, pp. 1355–1360 (2008) 10. Ben Youssef, C., Guillou, V., Olmos-Dichara, A.: Modelling and adaptive control strategy in a lactic fermentation process. Control Eng. Practice 8, 1297–1307 (2000) 11. Dahhou, B., Roux, G., Cheruy, A.: Linear and non-linear adaptive control of alcoholic fermentation process: experimental results. Int. J. Adapt. Control and Signal Process. 7, 213–223 (1973) 12. Petre, E., Selişteanu, D., Şendrescu, D.: An indirect adaptive control strategy for a lactic fermentation bioprocess. In: Proc. IEEE Int. Conf. Autom., Quality and Testing, Robotics, Cluj-Napoca, Romania, May 28-30, pp. 175–180 (2010) 13. Sastry, S., Bodson, M.: Adaptive control: Stability, Convergence and Robustness. Prentice-Hall International Inc., Englewood Cliffs (1989)
New Evaluation Method for Imperfect Alternative Matrix Toshimasa Ozaki*, Kanna Miwa, Akihiro Itoh, Mei-Chen Lo, Eizo Kinoshita, and Gwo-Hshiung Tzeng *
Abstract. In the presumption of the missing values for the imperfect alternative matrix, there are two methods; the Harker method and the Nishizawa method. However, it is often difficult to determine which method is appropriate. This paper focuses on the decision-making process of the Analytical Network Process (ANP) by examining the evaluation matrix as an imperfect matrix. The proposed method is composed of the alternative matrix and the criterion matrix, which is based on the matrix inverse of the alternative matrix, and presumes the missing values in the four by four matrix from the eigenvector. The same estimate obtained by the Harker method is stably obtained by artificially providing information in the imperfect matrix. Furthermore, the essence of the decision-making is considered through these examination processes. Keywords: Decision analysis, AHP, ANP, Harker method, Imperfect matrix.
1 Introduction In the Analytical Hierarchy Process (AHP), it is the assumption that the evaluation matrix does not have the missing value. However, the imperfect matrix is inevitably caused when pair comparison is difficult. Harker (1987) and Nishizawa (2005) Toshimasa Ozaki · Kanna Miwa · Akihiro Itoh Faculty of Commerce, Nagoyagakuin University, Nagoya, Japan *
Mei-Chen Lo Department of Business Management, National United University, Miaoli, Taiwan Eizo Kinoshita Faculty of Urban Science, Meijo University, Japan Mei-Chen Lo · Gwo-Hshiung Tzeng Institute of Project Management, Kainan University, Taoyuan, Taiwan Mei-Chen Lo · Gwo-Hshiung Tzeng Institute of Management of Technology, Chiao Tung University, Hsinchu, Taiwan * Corresponding author. J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 213–222. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
214
T. Ozaki et al.
et al. proposed the methods of presuming the missing values of such as the matrix. Though we can presume the values even by the Harker method and the Nishizawa method, we hesitate which method is appropriate when the value of C.I. (Consistency Index) defined by Saaty differs. Authors (Ozaki et al., 2009a, 2009b, 2010a, 2010b) found out the method of solving the simple dilemma in the imperfect matrix that had the missing values by using the ANP, and they have proposed applying this method to the evaluation matrix of the AHP. However, although the missing values can be presumed by other methods, they cannot be presumed by this proposed method because it has instability of solution. This paper proposes the way to address the faults of this method by resolving the decision-making problem of the ANP and the imperfect alternative matrix.
2 Proposed Method and the Problem 2.1 Priority of the Alternative Matrix with the Missing Values Sugiura et al. (2005) set the example of the alternative matrix with the missing values shown in Table 1, and prioritized the evaluation procession. Table 1 Example that cannot be compared Evaluation Evaluation Evaluation Kodama
a1
Hikari
a2
Nozomi
b1
b2
c1 c2
On the other hand, authors (Ozaki et al., 2009a, 2010a, 2010b) defined the criterion matrix W by taking the inverse of the alternative matrix U in which the positions of the missing values were replaced with zeros, and prioritize the three alternatives of "Kodama", "Hikari", and "Nozomi" using the ANP. ⎡ a1 b1 0 ⎤ U = ⎢a2 0 c1 ⎥ ⎥ ⎢ ⎢⎣ 0 b2 c2 ⎥⎦
0 ⎤ ⎡ b2c1 b1c2 1 ⎢ W= 0 a1c1 ⎥ a2c2 ⎥ a2b1c2 + a1b2c1 ⎢ ⎢⎣ 0 a1b2 a2b1 ⎥⎦
(1)
(2)
It is shown that this eigenvector of the multiple product UW (the agreement matrix) is the same as the eigenvector obtained by Sugiura et al.. It is suggested this evaluation method by the ANP that authors presume the missing values by considering as a decision-making for the imperfect alternative matrix.
New Evaluation Method for Imperfect Alternative Matrix
215
2.2 Proposal of the ABIA Method and Its Problem Now, we name the above method the ABIA method (ANP Based on Inverse Alternative), and consider the imperfect matrix U with the missing elements. Some examples are shown in Table 2 where the missing values are arbitrarily generated. The table shows the differences from that obtained by our method and those obtained by the Harker and Nishizawa. We simply utilize our proposed method to have the comparison with those mentioned three methods. Following this, the missing values are possible to presume by the Harker method, instead of those might be nearly impossible to presume by the ABIA method and the Nishizawa method. Table 2 Examples of comparison with the other method 1
Example U
Missing values Our's Harker's Nishizawa's Example U
Missing values Our's Harker's Nishizawa's
1 1/2 1/5 1/4 a
2 1 1/ a 1/3
2 5 a 1 1/4
4 3 4 1
1 1/2 1/5 1/4 a
2 1 1/2 1/ a
3 5 2 1 1/4
4 a 4 1
1 1/2 1/5 1/4 a
1.369
4
1.190
3.468
1.088
1.369
4
1.095
a 1 1/2 1/3 b
Solution none 3.135
12.992
Solution none
5 5 2 1 1/4
b 3 4 1
1 1/2 1/5 1/ b a
5 2 1 1/ a
4 3 a 1
a c 1 1/4 c
b 3 4 1
1.095
4 1 1/ a 1/5 1/ b a
2 1 1/2 1/3
2 1 1/ a 1/3 b
6 5 b 1 1/4
a 3 4 1
1 1/2 1/ a 1/ b a
2 1 1/ b 1/3 b
10.929
1.369
1.500
3.709
1.214
10.961
1.370
1.501
6.005
0.750
10.954
1.369
Solution none
Nevertheless, the ABIA method does not give the stable estimates and same solution for the Harker for an imperfect evaluation matrix. Therefore, we focus on an imperfect alternative matrix of four by four, and resolve the problems with the ABIA method.
3 Proposal of the P-ABIA Method In the ABIA method, the agreement point of these two matrices is shown by the eigenvector of the ANP, which is composed of the imperfect alternative matrix and the criterion matrix based on the inverse alternative matrix. However, an unanticipated problem seems to occur by the ABIA method because the matrix with the missing values has not been treated in the ANP. Then we propose the revised statutes named P-ABIA method (Power Method of ANP Based on Inverse Alternative) in this section.
216
T. Ozaki et al.
For the solution described above, the more we have the missing values, the more increased we have zeros to the element of the agreement matrix. The principal eigenvector is difficult to calculate mathematically, even with the Harker method. In addition the P-ABIA method is judged from C.I.. Therefore, the presumed values by the P-ABIA method are compared with those obtained by the Harker method according to the example of the numerical values.
3.1 Numerical Example 1 The missing values to be presumed are shown as Eq. (3), and those values are presumed to be a=3.315 and b=12.99 respectively by the Harker method (See Example 4 in Table 2). Furthermore, the maximum eigenvalue of the alternative matrix for which the values are substituted is 4.083, and the value of C.I. is 0.028.
a 5 ⎡ 1 ⎢1 / a 1 2 U =⎢ 1 ⎢1 / 5 1 / 2 ⎢ ⎣1 / b 1 / 3 1 / 4
b⎤ 3⎥ ⎥ 4⎥ ⎥ 1⎦
(3)
Because the eigenvector of the agreement matrix UW cannot be determined by the ABIA method, this case is an example in which the missing values cannot be presumed. 4 ⎡ 1 ⎢ 0 1 UW = ⎢ ⎢1 / 5 4 / 5 ⎢ 0 ⎣ 0
0 − 12 ⎤ 0 0 ⎥ ⎥ 1 − 12 / 5⎥ ⎥ 0 1 ⎦
(4)
We can derive det(UW − λI ) = (1 − λ )4 as the characteristic polynomial of Eq.
(4) using the eigenvalue λ so that the eigenvalue is the multiplicity of one. However, the eigenvalue exists from first row and third row of the agreement matrix of Eq. (4) in the closed discs of 8 and 7/5 in the radius in the complex plane that centers on one according to the theorem of Gershgorin, this existence of the eigenvalue is denied because of the multiplicity of one. Furthermore, the eigenvalue cannot exist from both second row and forth row of the agreement matrix of Eq. (4) because of the radius of zero in the complex plane. This is a reason the eigenvector of Eq. (4) cannot be calculated. The missing values can be presumed by the agreement between the imperfect matrix U and the criterion matrix W in the ABIA method, so that the agreement of both becomes difficult to obtain when zero increases to the element of the agreement matrix UW like this. That is, it is possible to agree by agreement UW if there is some information in a or b in the alternative matrix Eq. (3). Then, we propose the P-ABIA method to which some information x is artificially provided to the missing value of the ABIA method, and assume that the information x is uncertain. There are two cases with a=x, and b=x in this example.
New Evaluation Method for Imperfect Alternative Matrix
217
(a) a=x Because the missing value is only by artificially providing information x in this case, the agreement matrix UW is as follows: ⎡ ⎢ 1 ⎢ 3 ⎢ 20 UW = ⎢ 3 x + ⎢ 4 ⎢ 3 x + 20 ⎢ 1 ⎢ ⎣ 3 x + 20
60 x ⎤ 3 x + 20 ⎥ ⎥ 60 ⎥ 3 x + 20 ⎥ 60 x ⎥ 5(3 x + 20 ) ⎥ ⎥ 1 ⎥ ⎦
0 0 1 0 0 1 0 0
(5)
.
Because the components of the eigenvector z of this agreement matrix UW can be shown to be z1=1, z2= ( 15 x + 10 ) / 10 x , and z4= 1 / 2 15 x as an analytical solution when the eigenvalue is 1 + 2 15 x / (3 x + 20) , the missing value b is obtained to be 2 15 x . Therefore, the missing value becomes b that can be rewritten as 2 15 x when assuming a=x. We substitute these values for Eq. (3), define the eigenvector to be x, and search for the value of x that minimizes the maximum eigenvalue of the evaluation matrix. The maximum eigenvalue λ is shown as the following equation when the first row of Ux = λx is expanded:
λ = 1 + 2 /( 4 240 ) + 4 6 / 4 10 15 ⋅ 8 x 3 + 5 * 4 2 / 5 / 4 10 15 ⋅ 8 x −3 .
(6)
Because dλ dx = 0 , we obtain x=3.466807: the presumed values become a=3.4668, and b=14.4225: and the validity of these values can be judged from C.I.=0.027. (b) b=x Because artificially providing the missing value b information x in this case, the eigenvector z of the agreement matrix UW can be shown to be z1=1, z2= 6 / 5 x . ⎡ ⎢ 1 ⎢ 6 ⎢ UW = ⎢ 2 x + 15 ⎢ 3 ⎢ 2 x + 15 ⎢ 2 ⎢ ⎣ 2 x + 15
5x 2 x + 15 1 x 2 x + 15 5 2 x + 15
⎤ 0 0⎥ ⎥ 0 0⎥ . ⎥ 1 0⎥ ⎥ ⎥ 0 1⎥ ⎦
(7)
Therefore, the missing value becomes a that can be rewritten as z1/z2= 5 x / 6 when assuming b=x. We substitute a and b for Eq. (3), and search for x
218
T. Ozaki et al.
that minimizes the maximum eigenvalue of Eq. (3). The maximum eigenvalue λ is shown as the following equation when the first row of Ux = λx is expanded:
λ = 2 + 5 5−1 ⋅ 4 2 6 / 5 ⋅ 8 x −3 + −4 60 5 / 6 ⋅ 8 x 3 .
(8)
Because dλ dx = 0 becomes x=14.4225, the presumed values become a=3.4668, and b=14.4225. Though the estimates of a and b is higher than that obtained by the Harker method in the P-ABIA method, the value of C.I. is just a little lower than that obtained by the Harker method.
3.2 Numerical Example 2 In this case, missing values are three pieces (a, b, and c). We show the imperfect alternative matrix U on the left, the criterion matrix W in the center, and the agreement matrix on the right. 0 2 a b⎤ ⎡1 ⎡ 1 ⎢0 1 ⎢1 / 2 1 c 3⎥ ⎥ W =⎢ U =⎢ ⎢0 1 4⎥ ⎢1 / a 1 / c 0 ⎢ ⎢ ⎥ ⎣1 / b 1 / 3 1 / 4 1 ⎦ ⎣0 1 / 3
0 0 6⎤ ⎡ 1 0 0⎤ ⎢1 / 2 1 ⎥ 0 3⎥ 0 3⎥ ⎥ UW = ⎢ ⎢ 0 4 / 3 1 0⎥ 1 0⎥ ⎢ ⎥ ⎥ 0 1⎦ ⎣ 0 1 / 3 1 / 4 1⎦
(9)
The missing values of a, b, and c become 1.5, 6, and 0.75 obtained by the Harker method, and the maximum eigenvalue is 4. However, their values become 1.5, 3.7308, and 1.214 obtained by the ABIA method, and the maximum eigenvalue is 4.09. The estimates obtained by the ABIA method are different from values obtained by the Harker method, and the value of C.I. is inferior. Though there is many zero in the element of the agreement matrix UW as well as the Numerical Example 1 in this case, the eigenvalue and eigenvector exist from the closed disc in the radius in the complex plane that centers on one according to the theorem of Gershgorin. Then, the next three cases are examined in the P-ABIA method as follows: Case 1 2 0 ⎡ 1 ⎢1 / 2 1 y Ua = ⎢ ⎢ 0 1/ y 1 ⎢ ⎣1 / x 1 / 3 1 / 4
Case 2 x⎤ x 2 ⎡ 1 ⎢1 / 2 1 3⎥⎥ y , Ub = ⎢ 4⎥ ⎢1 / x 1 / y 1 ⎥ ⎢ 1⎦ ⎣ 0 1/ 3 1/ 4
Case 3 2 x y ⎤ (10) ⎡ 1 0⎤ ⎢1 / 2 1 ⎥ 0 3 ⎥⎥ 3⎥ , Uc = ⎢ ⎢1 / x 0 1 4⎥ 4⎥ ⎥ ⎢ ⎥ 1 / y 1 / 3 1 / 4 1⎦ . 1⎦ ⎣
(a) Case 1 The value a is obtained to be z1/z3= xy / 2 from the eigenvector z of the agreement matrix UaWa when assuming b=x and c=y. The missing values of a, b, and c are substituted for Case1 in Eq. (10), x, and y that minimize the maximum eigenvalue of Case 1 are searched. The eigenvector of Case 1 is assumed to be x,
New Evaluation Method for Imperfect Alternative Matrix
219
and then the first and third rows Ux = λx are considered. Therefore, 2532y=x is obtained from the first row as ∂λ ∂x = 0 , and 32x=27y is obtained from the third row as ∂λ ∂y = 0 . The values x=6 and y=3/4 are obtained from these equations, and another missing value, a=3/2, is obtained from x and y. (b) Case 2 The value b is obtained to be z1/z4= 2 6x from the eigenvector z of the agreement matrix UbWb when assuming a=x and c=y. The missing values of a, b, and c are substituted for Case 2 in Eq. (10), x, and y that minimize the maximum eigenvalue of Case 2 are determined. The eigenvector of Case 2 is assumed to be x, and the first and fourth rows of Ux = λx are considered. Therefore, x3=6y2 is obtained from the first row as ∂λ ∂x = 0 , and 25xy2=33 is obtained from the fourth row as ∂λ ∂y = 0 . The values x=6 and y=3/4 are obtained from these equations, and another missing value, a=3/2, is obtained from x and y. (c) Case3 The value c is obtained to be z2/z4= 6 x / 4 from the eigenvector z of the agreement matrix UcWc when assuming a=x and b=y. The eigenvector of Case 3 is assumed to be x, and the second and third rows of Ux = λx are considered. Then, xy2=2133 and 3y2=25x3 are obtained from the second and thirds rows as ∂λ ∂x = 0 and ∂λ ∂y = 0 . Therefore, the values x=1.5, and y=6 are obtained from these equations, and another missing value, c=3/4, is obtained from x and y. The maximum eigenvalue of the alternative matrix for which a, b, and c are substituted is 4, and the C.I. is minimized. Estimated value of a, b, and c obtained by the ABIA method are corrected by the P-ABIA method.
4 Discussion of the Proposed Method (1) Imperfect alternative matrix with one loss in the rectangle We examine the P-ABIA method for the imperfect alternative matrix U with only one missing value ui,j(i<j) in the rectangle. Here, the missing element is assumed to be zero because it cannot be decided. Now, the missing value ui,j is replaced with a.
ª 1 «1 / a 12 U =« «1 / a13 « ¬1 / a14
a12 1 0 1 / a 24
"
a14 º 0 a 24 » » 1 a 34 » » " 1 ¼
(11)
220
T. Ozaki et al.
In the ABIA method, a=zi/zj= aik ail a jk −1a jl −1 (i<j,k 0⎪ ⎪⎝ k ⎪ ⎠ R ( x) = ⎨ ⎬, α ⎪⎛ 1 ⎪ ⎞ ⎪⎜ • x ⎟ , x ≥ 0, α > 0 ⎪ ⎠ ⎩⎝ k ⎭
(3)
where: - 1 / k is a parameter specific to the agents’ emotional mood and it depends on the objective pursued by the type of agent. In the model, k is the amount borrowed (So). - α, β, are the exponents describing a fractal [8] non-linear evolution of humansubjective phenomena. - x refers to a unit of the result, which can be a loss (f (x) 0) or break-even point. Risk function R(x) is defined as having fractal characteristics [18] due to human-subjective perception of risk by each individual agent due to information asymmetry and psychological factors of human perception [7]. The function F (x) representing the decision making process describes the emotional response to risk of an agent from a global perspective. The decision function F(x) depends on context (loss or gain) and refers to the value (positive or negative) of the emotional reward (monetary of social) perceived by the agent [7]. It is managed according to elements in the formula: ⎧1 ⇔ x ≥ 0 ∧ R( x) ≥ 1 ⎫ ⎪ ⎪ ⎪− 1 ⇔ x ≥ 0 ∧ R( x) < 1⎪ F ( x) = ⎨ ⎬ ⎪1 ⇔ x < 0 ∧ R( x) ≥ 1 ⎪ ⎪⎩− 1 ⇔ x < 0 ∧ R( x) < 1⎪⎭
(4)
4 Results of the Empirical Analysis of the Model The model of decision-making in an agent-based economy is described by an example of a decision-making process for financing projects implemented by agents that act in the name of the companies they represent. Agents’ decision is based on the criteria of risk and reward that accompany the transaction. The choice of the best financing source for a project is similar to buying a certain good or service on the market. Risk is defined in terms of financial sustainability of a project using a certain type of financing source. On the other hand, reward is defined according to
242
I.F. Popovici
the agents’ characteristics or objective followed. In this sense the model is built upon three types of agents. Each type follows a specific objective, has its own emotional response to risk and reward. Table 2. presents a comparative analysis [16] of the four study cases carried out by testing an investment project through various forms of financing and results reveal a ranking of the four combinations of project financing based on the criteria of risk. Table 2 Sources of financing a project and associated risk of financial imbalance
No. 1 2 3 4
Study case PRIVATE INVESTOR GRANT P.P.P. CREDIT
Type of risk Medium-high risk Medium risk Low risk Low-medium risk
The decision-making process of financing a project in the model takes place according to the type of agent and its emotional response to risk and reward from implementing it. Therefore, the agent "homo ludens" will select to finance his investment project through a private investor financial contribution because it offers him the highest degree of emotional reward according to his ‘mood’, due to the highest risk score compared to the other options chosen (see table 3). This agent shows an increased appetite for risk during transactions with other agents. The second type of agent „homo oeconomicus” will choose the financing option of P.P.P. (public-private partnership) because it corresponds to his behavior described in the model referring to his aversion for risk. His selected option is characterized by the lowest level of emotional reward and risk. The third type of agent "homo social-responsible" can select any of the four combinations of financing. Such an agent is not designed to get emotionally involved in pursuing the maximization of monetary gain and he is indifferent to risk. His decision-making process is guided by objectives such as social benefit or environmental protection and the financial gain is subordinated to those stated beforehand. Table 3 Emotional decision-making of an agent
Type Of Agent
Risk
Reward
Emotional Decision
”HOMO OECONOMICUS”
Aversion
Gain
P.P.P
”HOMO LUDENS”
Appetite
Gain/ loss
PRIVATE INVESTOR
”HOMO SOCIALRESPONSIBLE”
Indifferent
Social gain
CREDIT/ GRANT
Premises of an Agent-Based Model Integrating Emotional Response
243
5 Conclusions and Future Works The conclusions that can be drawn at this stage refer to the fact that it is possible to integrate emotional mood of an agent into his actions during transactions with other agents. This can be done by using a function of value by the agent when deciding which strategy to follow by choosing an option from multiple alternatives found in the local environment. The model describes how emotional mood of the agent regarding risk and reward from an activity is used to influence agents’ decision regarding the choice of a financing source for a certain project. Present research shows that the agent’s decision implying the financing of a project is not solely based on notions like utility or maximizing profit but also on the emotional attitude or’ mood’ towards risk which is connected to the reward obtained from the project. Another important aspect of the agent’s choice refers to the nature of reward which is not just monetary but it can also have social or environmental content. One innovative aspect of this research refers to the integration of emotions into decision-making quantified into the local behavioral strategy of the agent. This is integrated into the model by the use of fractal theory principles into the agent’s decision function regarding risk and reward. The present work lays a brick to the field of agent based modeling and fractals used for the study of emotional decision-making process in the financing of a project. Risk is the result of the cumulative action of several factors. Actual management risk intends to diversify risk to all the participants in the market. This splits risk towards several players in order to minimize risk per individual. The logic is similar to that of economies of scale. It refers to the fact that the total cost is divided between a large number of participants so that the cost per unit of production is minimized. Only that the process of risk diversification created a network of risks in the financial market. That connected all the participants into a big entity through the network effect created. Any little change into this large and interconnected mechanism of risk turned into a big movement at a global view through network effect. Secondly, the risk sharing mechanism unifies the players into a complicated network. Every “share” of risk is perceived differently by individuals, at various levels. This creates different perspectives towards the dimension of the phenomenon of risk. Let`s take into consideration that every level has its own network characterized by different scales inside the level. All in all, there are different dimensions of the phenomenon of risk on a horizontal, vertical or diagonal level. Each of these dimensions has different scales inside the level. Therefore, the inside structure of the phenomenon of risk sharing in the financial market is characterized by a fractal structure. The scaling relation between the various components of risk is similar to that of a fractal. This throws a different view over risk probability because of the fractal dimension of the phenomenon. This view assumes the fact that risk phenomenon hasn`t got a uniformly distributed structure across overall dimension. This implies that it is made up of pieces of “risk” shared at individual level at different scales which show the same characteristics as the whole. Further research will extend the application of the model to estimating risk of financial crisis in the global economy due to the increased access to finance from
244
I.F. Popovici
financial market of companies. Players on the market are represented by agents, managers of projects or owners of companies who use various financing sources assuming high financial risks in the context of uncertainty and instability of the global economy.
References 1. Bloomquist, K.: A comparison of agent-based models of income tax evasion. Internal Revenue Service, 8–15 (2004) 2. Boloş, M., Popovici, I.: Ultramodernity in risk theory. In: Annals of the International Conference on Financial Trends in the Global Economy – Universitate Babeş Bolyai, Cluj Napoca – FSEGA, November 13-14, pp. 3–10 (2009) 3. Bătrâncea, L.: Teoria jocurilor, comportament economic. In: Risoprint (ed.) Experimente, Cluj Napoca, pp. 163–194 (2009) 4. Camerer, C., Ho, T.-H., Chong, J.K.: Behavioral game theory: Thinking, building and Teaching. Research paper NSF grant, 24 (2001) 5. Gellert, W.K., Hellwich, K.: Mică Enciclopedie matematică. In: Silvia, C. (ed.) Traducere de Postelnicu V, Tehnică, Bucureşti, pp. 234–250 (1980) 6. Kahneman, D., Tversky, A.: The framing of decisions and the psychology of choice science. New Series 211(4481), 453–458 (1981) 7. Daniel, K., Tversky, A.: Advances in prospect theory: Cumulative representation of uncertainty stanford university, Department of Psychology, Stanford. Journal of Risk and Uncertainty 5, 297–323 (1992) 8. Lapidus, M., van Frankenhuijsen, M.: Fractal geometry. In: Complex Dimensions and Zeta Functions Geometry and Spectra of Fractal Strings, pp. 41–45. Springer Science, Business Media, LLC (2006) 9. Liebovitch, L.: Fractals and chaos simplified for life sciences. In: Center for Complex Systems, Florida Atlantic University, Oxford University Press, New York,Oxford (1998) 10. Mandelbrot, B.: The fractal geometry of nature, updated and augmented. In: International Business Machines, Thomas J. Watson Research Center Freeman and Company, New York (1983) 11. Daniel, M., Morris Cox, E.: The new science of pleasure consumer behavior and the measurement of well-being. In: Frisch Lecture, Econometric Society World Congress, London, August 20, pp. 3–7. University of California, Berkeley (2005) 12. Rasmusen, E.: Games and information. In: An Introduction to Game Theory, 3rd edn., pp. 10–28. Basil Blackwell (2000) 13. Scarlat, E.: Agenţi şi modelarea bazată pe agenţi în economie. ASE Bucureşti, 15–20 (2005) 14. Scarlat, E., Maries, I.: Simulating collective intelligence of the communities of practice using agent-based methods. In: Jędrzejowicz, P., Nguyen, N.T., Howlet, R.J., Jain, L.C. (eds.) KES-AMSTA 2010. LNCS, vol. 6070, pp. 305–314. Springer, Heidelberg (2010) 15. Scarlat, E., Maracine, V.: Agent-based modeling of forgetting and depreciating knowledge in a healthcare knowledge ecosystem. Economic Computation And Economic Cybernetics Studies and Research 41(3-4) (2008)
Premises of an Agent-Based Model Integrating Emotional Response
245
16. Scarlat, E., Boloş, M., Popovici, I.: Agent-based modeling in decision-making for project financing. Journal Economic Computation and Economic Cybernetics Studies and Research, 5–10 (2011) 17. Taleb, N.,, P.: Epistemiology and risk management. Risk and Regulation Magazine, 3– 10 (August 25, 2007) 18. Taleb, N., Mandelbrot, B.: Fat tails, asymmetric knowledge, and decision making, Nassim Nicholas Taleb”s Essay in honor of Benoit Mandelbrot”s 80th birthday. Wilmott Magazine, 2,16 (2005) 19. Tulai, C., Popovici, I.: Modeling risk using elements of game theory and fractals. In: Annals of the International Conference - Universitatea din Craiova, Competitivitate şi Stabilitate în Economia bazată pe Cunoaştere, May 14-15, vol. 10, pp. 2–7 (2010)
Proposal of Super Pairwise Comparison Matrix Takao Ohya* and Eizo Kinoshita
*
Abstract. This paper proposes a Super Pairwise Comparison Matrix (SPCM) to express all pairwise comparisons in the evaluation process of the dominant analytic hierarchy process (AHP) or the multiple dominant AHP (MDAHP) as a single pairwise comparison matrix. In addition, this paper shows, by means of a numerical counterexample, that an evaluation value resulting from the application of the Harker method to a SPCM does not necessarily coincide with that of the evaluation value resulting from the application of the dominant AHP(DAHP) to the evaluation value obtained from each pairwise comparison matrix by using the eigenvalue method. Keywords: pairwise comparison matrix, dominant AHP, logarithmic least square method, Harker's method.
1 Introduction The analytic hierarchy process (AHP) proposed by Saaty[1] enables objective decision making by top-down evaluation based on an overall aim. In actual decision making, a decision maker often has a specific alternative (regulating alternative) in mind and makes an evaluation on the basis of the alternative. This was modeled in dominant AHP (DAHP), proposed by Kinoshita and Nakanishi[2]. If there are more than one regulating alternatives and the importance of each criterion is inconsistent, the overall evaluation value may differ for each regulating alternative. As a method of integrating the importances in such cases, the concurrent convergence method (CCM) was proposed. Kinoshita and Sekitani[3] showed the convergence of CCM. Takao Ohya School of Science and Engineering, Kokushikan University, Tokyo, Japan *
Eizo Kinoshita Faculty of Urban Science, Meijo University, Gifu, Japan * Corresponding author. J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 247–254. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
248
T. Ohya and E. Kinoshita
Meanwhile, Ohya and Kinoshita proposed the geometric mean multiple dominant AHP (GMMDAHP), which integrates weights by using a geometric mean based on an error model to obtain an overall evaluation value. Herein, such methods of evaluation with multiple regulating alternatives will be generically referred to as the multiple dominant AHP (MDAHP). Section 2 briefly explains DAHP and MDAHP and then proposes a super pairwise comparison matrix (SPCM) to express the pairwise comparisons appearing in the evaluation processes of the dominant AHP and MDAHP as a single pairwise comparison matrix. Section 3 gives a specific numerical example in which the Harker method are applied to a SPCM. With the numerical example, it is shown that the evaluation value resulting from the application of DAHP to the evaluation value obtained from each pairwise comparison matrix by using the eigenvalue method does not necessarily coincide with the evaluation value resulting from the application of the Harker method to the SPCM.
2 SPCM In this section, we propose a SPCM to express the pairwise comparisons appearing in the evaluation processes of DAHP and MDAHP as a single pairwise comparison matrix. Section 2.1 outlines DAHP procedure and explicitly states pairwise comparisons. Section 2.2 proposes the SPCM that expresses these pairwise comparisons as a single pairwise comparison matrix.
2.1 Evaluation in DAHP The true absolute importance of alternative a (a = 1,..., A) at criterion c(c = 1,..., C ) is vca . The final purpose of the AHP is to obtain the relative valC
ue (between alternatives) of the overall evaluation value va = ∑ vca of alternac =1
tive a . The procedure of DAHP for obtaining an overall evaluation value is as follows: DAHP Step 1: The relative importance uca = α c vca (where α c is a constant) of alternative a at criterion c is obtained by some kind of methods. In this paper, uca is obtained by applying the pairwise comparison method to alternatives at criterion c .
Proposal of Super Pairwise Comparison Matrix
249
Step 2: Alternative d is the regulating alternative. The importance uca of alternative a at criterion c is normalized by the importance ucd of the regud lating alternative d , and u ca ( = u ca / u cd ) is calculated. Step 3: With the regulating alternative d as a representative alternative, the
importance ucd of criterion c is obtained by applying the pairwise comparison method to criteria, where, ucd is normalized by
C
∑ ucd = 1 .
c =1
d Step 4: From u ca , u cd obtained at Steps 2 and 3, the overall evaluation value
C
d u a = ∑ ucd uca of alternative a is obtained. By normalization at Steps 2 and c =1
3, u d = 1 . Therefore, the overall evaluation value of regulating alternative d is normalized to 1.
2.2 Proposal of SPCM The relative comparison values rcca ′a ′ of importance vca of alternative a at criteria c as compared with the importance vc ′a ′ of alternative a′ in criterion c′ , are arranged in a (CA × CA) or (AC × AC) matrix. In a (CA × CA) matrix, index of alternative changes first. In a (AC × AC) matrix, index of criteria changes first. This is proposed as the SPCM R = ( rcca′a ′ ) or ( raac′c′ ) . In a SPCM, symmetric components have a reciprocal relationship as in pairwise comparison matrices. Diagonal elements are 1 and the following relationships are true: c ′a ′ r ca = 1
ca r ca =1
rcca′a ′
.
,
(1)
(2)
Pairwise comparison at Step 1 of DAHP consists of the relative comparison value rccaa′ of importance vca of alternative a , compared with the importance vca ′
of alternative a′ at criterion c . Pairwise comparison at Step 3 of DAHP consists of the relative comparison value rccd′d of importance vcd of alternative d at criterion c , compared with the importance vc ′d of alternative d at criterion c′ , where the regulating alternative is d . Figures 1 and 2 show SPCMs using DAHP when there are three alternatives (13) and four criteria (I-IV) and the regulating alternative is Alternative 1. In these
250
T. Ohya and E. Kinoshita
figures, * represents pairwise comparison between alternatives at Step 1 and # represents pairwise comparison between criteria at Step 3.
I1 I2 I3 II 1 II 2 II 3 III 1 III 2 III 3 IV 1 IV 2 IV 3
I 1 1 * * #
I 2 * 1 *
I 3 * * 1
II 1 #
II 2
II 3
III
III
III
1 #
2
3
* 1 *
* * 1
#
#
1 * * #
#
#
IV 1 #
IV 2
IV 3
* 1 *
* * 1
#
1 * * #
* 1 *
* * 1
#
1 * *
Fig. 1 SPCM by Dominant AHP (CA × CA)
I 1 II 1 III 1 IV 1 I2 II 2 III 2 IV 2 I3 II 3 III 3 IV 3
I 1 1 # # # *
II 1 # 1 # #
III
1 # # 1 #
IV 1 # # # 1
I 2 *
II 2
III
2
IV 2
I 3 *
*
II 3
3
* *
*
* *
1 *
* 1
*
* 1
* *
1 *
*
* 1
* *
IV 3
* *
1
*
III
1 *
1
Fig. 2 SPCM by Dominant AHP (AC × AC)
Figures 3 and 4 show SPCMs using the MDAHP when there are three alternatives (1-3) and four criteria (I-IV) and all alternatives are regulating ones. In these figures, * represents pairwise comparison between alternatives in a same criterion and # represents pairwise comparison between criteria of a same alternative.
Proposal of Super Pairwise Comparison Matrix
I1 I2 I3 II 1 II 2 II 3 III 1 III 2 III 3 IV 1 IV 2 IV 3
I 1 1 * * #
I 3 * * 1
I 2 * 1 *
II 1 #
II 2
II 3
251
III
III
III
1 #
2
3
# 1 * * #
# # # #
# #
#
# #
1 * * #
# #
#
# # * * 1
* 1 *
#
# # # 1 * *
#
#
IV 3
#
#
#
IV 2
# # * * 1
* 1 *
#
IV 1 #
#
#
# * * 1
* 1 *
Fig. 3 SPCM by MDAHP (CA × CA)
I 1 II 1 III 1 IV 1 I2 II 2 III 2 IV 2 I3 II 3 III 3 IV 3
I 1 1 # # # *
II 1 # 1 # #
III
1 # # 1 #
IV 1 # # # 1
* * * * *
I 2 *
II 2
III
2
IV 2
*
1 # # # *
# 1 # #
III
3
IV 3
*
# # 1 #
* * # # # 1
* *
II 3
*
* *
I 3 *
*
* * * * 1 # # #
# 1 # #
# # 1 #
* # # # 1
Fig. 4 SPCM by MDAHP (AC × AC)
SPCM of DAHP or MDAHP is an incomplete pairwise comparison matrix. Therefore, the LLSM based on an error model or an eigenvalue method such as the Harker or two-stage method is applicable to the calculation of evaluation values from an SPCM.
252
T. Ohya and E. Kinoshita
3 Numerical Example Three alternatives from 1 to 3 and four criteria from I to IV are assumed, where Alternative 1 is the regulating alternative. As the result of pairwise comparison between alternatives at criteria c (c = I,...,IV), the following pairwise comparison matrices RcA , c = Ι,..., IV are obtained:
⎛ 1 1/ 3 5⎞ ⎜ ⎟ R =⎜ 3 1 3 ⎟, ⎜1 / 5 1 / 3 1 ⎟ ⎝ ⎠ ⎛1 1/ 3 1/ 3⎞ ⎟ ⎜ A R III = ⎜ 3 1 1 / 3 ⎟, ⎜3 3 1 ⎟⎠ ⎝ A I
⎛ 1 ⎜ R = ⎜ 1/7 ⎜1 / 3 ⎝
7 1
⎛ 1 ⎜ = ⎜ 1/3 ⎜1 / 5 ⎝
3 1
A II
R
A IV
3
1
3 ⎞ ⎟ 1 / 3 ⎟, 1 ⎟⎠ 5⎞ ⎟ 1⎟ 1 ⎟⎠
.
With regulating alternative 1 as the representative alternative, importance between criteria was evaluated by pairwise comparison. As a result, the following pairwise comparison matrix
RIC is obtained:
⎛ 1 ⎜ ⎜ 3 C RI = ⎜ 1/3 ⎜ ⎜ 3 ⎝
1/ 3 1 1/3 1
3 1/3 ⎞ ⎟ 3 1 ⎟. 1 1/3 ⎟ ⎟ 3 1 ⎟⎠
(1) SPCM + Harker method In the Harker method, the value of a diagonal element is set to the number of missing entries in the row plus 1 and then evaluation values are obtained by the usual eigenvalue method. Figure 5 shows the SPCM by the Harker method. Table 1 shows the evaluation values obtained from the SPCM in Fig. 5. Table 1 Evaluation values obtained by the Harker method
Criterion I Alternative 1 Alternative 2 Alternative 3
0.196 0.370 0.074
Criterion II 0.352 0.039 0.107
Overall Criterion III Criterion IV evaluation value 0.095 0.356 1 0.190 0.087 0.687 0.391 0.072 0.645
Proposal of Super Pairwise Comparison Matrix
253
3 ⎛ 7 1/ 3 5 1/ 3 ⎜ 3 10 3 ⎜ ⎜ 1 / 5 1 / 3 10 ⎜ ⎜ 3 7 7 3 3 ⎜ 1 / 7 10 1 / 3 ⎜ ⎜ 1 / 3 3 10 ⎜ 1 / 3 1/ 3 7 1/ 3 1/ 3 ⎜ ⎜ 3 10 1 / 3 ⎜ 3 3 10 ⎜ ⎜ 3 1 3 ⎜ ⎜ ⎜ ⎝
⎞ ⎟ ⎟ ⎟ ⎟ 1 ⎟ ⎟ ⎟ ⎟ ⎟ 1/ 3 ⎟ ⎟ ⎟ ⎟ 7 3 5⎟ ⎟ 1 / 3 10 1 ⎟ ⎟ 1 / 5 1 10 ⎠ 1/ 3
Fig. 5 SPCM by the Harker method
+
(2) DAHP the eigenvalue method By applying the eigenvalue method to the individual pairwise comparison matrices RcA (c = I,..., IV) , R1C , the evaluation values at Steps 2 and 3 of DAHP are obtained as follows: ⎛ 1 ⎜ ⎜ 3 ⎜1 / 5 ⎝ ⎛ 1 ⎜ ⎜ 1/7 ⎜1 / 3 ⎝
1/ 3 1 1/ 3 7 1 3
5 ⎞ ⎡1.000 ⎤ ⎟ 3 ⎟ ⎢⎢1.754 ⎥⎥ = 3.295 1 ⎟⎠ ⎢⎣ 0.342 ⎥⎦
⎡1.000 ⎤ ⎢1.754 ⎥ ⎢ ⎥ ⎢⎣ 0.342 ⎥⎦
3 ⎞ ⎡1.000 ⎤ ⎟ 1 / 3 ⎟ ⎢⎢ 0.131 ⎥⎥ = 3.007 1 ⎟⎠ ⎢⎣ 0.362 ⎥⎦
⎡1.000 ⎤ ⎢ 0.131 ⎥ ⎢ ⎥ ⎢⎣ 0.362 ⎥⎦ ⎡1.000 ⎤ = 3.136 ⎢⎢ 2.080 ⎥⎥ ⎢⎣ 4.327 ⎥⎦
⎛ 1 1 / 3 1 / 3 ⎞ ⎡1.000 ⎤ ⎜ ⎟ 1 1 / 3 ⎟ ⎢⎢ 2.080 ⎥⎥ ⎜3 ⎜3 3 1 ⎟⎠ ⎢⎣ 4.327 ⎥⎦ ⎝ 3 5 ⎞ ⎡1.000 ⎤ ⎛ 1 ⎡1.000 ⎤ ⎜ ⎟⎢ ⎥ 1 ⎟ ⎢ 0.281 ⎥ = 3.029 ⎢⎢ 0.281 ⎥⎥ ⎜ 1/3 1 ⎜1 / 5 1 ⎢⎣ 0.237 ⎥⎦ 1 ⎟⎠ ⎢⎣ 0.237 ⎥⎦ ⎝ ⎛ 1 1 / 3 3 1/3 ⎞ ⎡ 0 . 169 ⎤ ⎡ 0 . 169 ⎤ ⎜ ⎟⎢ ⎥ ⎢ ⎥ 3 1 3 1 0 . 368 ⎜ ⎟⎢ ⎥ = 4 .155 ⎢ 0 .368 ⎥ ⎜ 1/3 1/3 1 1/3 ⎟ ⎢ 0 . 096 ⎥ ⎢ 0 . 096 ⎥ ⎜ ⎟⎢ ⎥ ⎢ ⎥ ⎜ 3 ⎟ 1 3 1 ⎠ ⎣ 0 . 368 ⎦ ⎝ ⎣ 0 .368 ⎦ .
254
T. Ohya and E. Kinoshita A
A
A
A
C
From the above result, the eigenvalues of RI , RII , RIII , RIV , R1 are 3.259, 3.007, 3.136, 3.029, and 4.155, respectively. The C.I. values are 0.147, 0.004, 0.068, 0.015, and 0.052, respectively. Based on these results, Table 2 shows the evaluation value u1c u1ca of each alternative and criterion and the overall evaluation value of each alternative. Table 2 Evaluation values obtained by the eigenvalue method
Criterion I Alternative 1 Alternative 2 Alternative 3
0.169 0.296 0.058
Criterion II 0.368 0.048 0.133
Criterion III Criterion IV 0.096 0.199 0.414
0.368 0.103 0.087
Overall evaluation value 1 0.646 0.692
The evaluation value resulting from the application of DAHP to evaluation values that are obtained from each pairwise comparison matrix shown in Table 2 by the eigenvalue method do not coincide with the evaluation value resulting from the application of the Harker method to SPCM shown in Table 1. With these numerical example, it is shown that the evaluation value resulting from the application of DAHP to the evaluation value obtained from each pairwise comparison matrix by using the eigenvalue method does not necessarily coincide with the evaluation value resulting from the application of the Harker method to the SPCM.
References 1. Saaty, T.L.: The Analytic Hierarchy Process. McGraw-Hill, New York (1980) 2. Kinoshita, E., Nakanishi, M.: Proposal of new AHP model in light of do-minative relationship among alternatives. Journal of the Operations Research Society of Japan 42, 180–198 (1999) 3. Kinoshita, E., Sekitani, K., Shi, J.: Mathematical Properties of Dominant AHP and Concurrent Convergence Method. Journal of the Operations Research Society of Japan 45, 198–213 (2002) 4. Harker, P.T.: Incomplete pairwise comparisons in the Analytic Hierarchy Process. Mathematical Modeling 9, 837–848 (1987)
Reduction of Dimension of the Upper Level Problem in a Bilevel Programming Model Part 1 Vyacheslav V. Kalashnikov, Stephan Dempe, Gerardo A. Pérez-Valdés, and Nataliya I. Kalashnykova*
Abstract. The paper deals with a problem of reducing dimension of the upper level problem in a bilevel programming model. In order to diminish the number of variables governed by the leader at the upper level, we create the second follower supplied with the objective function coinciding with that of the leader and pass part of the upper level variables to the lower level to be governed but the second follower. The lower level problem is also modified and becomes a Nash equilibrium problem solved by the original and the new followers. We look for conditions that guarantee that the modified and the original bilevel programming problems share at least one optimal solution.
1 Introduction Bilevel programming modeling is a new and dynamically developing area of mathematical programming and game theory. For instance, when we study value chains, the general rule usually is: decisions are made by different parties along the chain, and these parties have often different, even opposed goals. This raises the difficulty of supply chain analysis, because regular optimization techniques Vyacheslav V. Kalashnikov ITESM, Campus Monterrey, Monterrey, Mexico e-mail:
[email protected] Stephan Dempe TU Bergacademie Freiberg, Freiberg, Germany e-mail:
[email protected] Gerardo A. Pérez-Valdés NTNU, Trondheim, Norway e-mail:
[email protected] Nataliya I. Kalashnykova UANL, San Nicolás de los Garza, Mexico e-mail:
[email protected] J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 255–264. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
256
V.V. Kalashnikov et al.
(e.g., like linear programming) cannot be readily applied, so that tweaks and reformulations are often needed (cf., [1]). The latter is the case with the Natural Gas Value Chain. From extraction at the wellheads to the final consumption points (households, power plants, etc.), natural gas goes through several processes and changes ownership many a time. Bilevel programming is especially relevant in the case of the interaction between a Natural Gas Shipping Company (NGSC) and a Pipeline Operating Company (POC). The first one owns the gas since the moment it becomes a consumption-grade fuel (usually at wellhead/refinement complexes, from now onward called the extraction points) and sells it to Local Distributing Companies (LDC), who own small, city-size pipelines that serve final costumers. Typically, NGSCs neither engage in business with end-users, nor actually handle the natural gas physically. Whenever the volumes extracted by the NGSCs differ from those stipulated in the contracts, we say an imbalance occurs. Since imbalances are inevitable and necessary in a healthy industry, the POC is allowed to apply control mechanisms in order to avoid and discourage abusive practices (the so called arbitrage) on part of the NGSCs. One of such tools is cash-out penalization techniques after a given operative period. Namely, if a NGSC has created imbalances in one or more pool zones, then the POC may proceed to `move' gas from positiveimbalanced zones to negative-imbalanced ones, up to the point where every pool zone has the imbalance of the same sign, i.e., either all non-negative or all nonpositive, thus rebalancing the network. At this point, the POC will either charge the NGSC a higher (than the spot) price for each volume unit of natural gas withdrawn in excess from its facilities, or pay back a lower (than the sale) price, if the gas was not extracted. Prices as a relevant factor induce us into the area of stochastic programming instead of the deterministic approach. The formulated bilevel problem is reduced to the also bilevel one but with linear constraints at both levels (cf., [2]). However, this reduction involves introduction of many artificial variables, on the one hand, and generation of a lot of scenarios to apply the essentially stochastic tools, on the other hand. The latter makes the dimension of the upper level problem simply unbearable burden even for the most modern and powerful supercomputers. The aim of this paper is a mathematical formalization of the task of reduction of the upper level problem’s dimension without affecting (if possible!) the optimal solution of the original bilevel programming problem.
2 Main Results We start with an example. Consider the following bi-level (linear) programming problem (P1):
Reduction of Dimension of the Upper Level Problem
257
F ( x, y, z ) = x − 2 y + z → min x ,y ,z
s.t. x + y + z ≥ 15, 0 ≤ x, y,z ≤ 10 , and z ∈Ψ ( x, y ) ,
(P1)
where Ψ ( x, y ) = { z solving the lower level (linear) problem} : f 2 ( x, y,z ) = 2 x − y + z → min z
s.t. x + y − z ≤ 5, 0 ≤ z ≤ 10. It is easy to show that problem (P1) has a unique optimal solu-
(
)
(
)
tion x* , y* ,z* = ( 0 ,10 ,5 ) , with F x* , y* ,z* = −15 . By the way, the lower level
(
)
optimal value f 2 x* , y* ,z* = −5 . Now, let us construct a modified problem (MP1), which is, strictly speaking, an MPEC (mathematical program with equilibrium constraints): F ( x, y,z ) = x − 2 y + z → min x,y,z
s.t.
(MP1)
x + y + z ≥ 15, 0 ≤ x, y,z ≤ 10, and
( y,z ) ∈Φ ( x ) , where Φ ( x ) = {( y,z ) solving the lower level equilibrium problem} : Find a Nash equilibrium between two followers: 1) Follower 1 has the problem: f1 ( x, y,z ) ≡ F ( x, y,z ) = x − 2 y + z → min y
s.t. x + y − z ≤ 5, 0 ≤ y ≤ 10;
258
V.V. Kalashnikov et al.
2) Follower 2 has the problem: f 2 ( x, y,z ) ≡ 2 x − y + z → min z
s.t. x + y − z ≤ 5, 0 ≤ z ≤ 10. In other words, in problem (MP1), the leader controls directly only variable x , whereas the lower level is represented with an equilibrium problem. In the latter, there are two decision makers: the second one is the same follower from problem (P1); she/he controls variable z , accepts the values of the leader’s variables x as a parameter, and tries to reach a Nash equilibrium with the first follower, who actually is aiming at finding also an equilibrium with the second follower by controlling only variable y and taking the values of the leader’s variable x as a parameter. Actually, follower 1 is the same leader (her/his objective function is the leader’s objective function’s projection onto the space R 2 of the variables ( y, z ) for each fixed value of the variable x .) Now it is not difficult to demonstrate that problem (MP1) is also solvable and
(
(
)
)
has exactly the same solution: x* , y* ,z* = ( 0 ,10 ,5 ) with F x* , y* ,z* = −15 .
By the way, the lower level equilibrium problem has the optimal solution y* = y* ( x ) = 10, z* = z* ( x ) = min {10, 5 + x} for each value 0 ≤ x ≤ 10 of the leader’s upper level variable. Of course, the optimal value x* = 0 provides for the ■ minimum value of the upper level objective function F. Now more generally, consider the following bi-level programming problem: Find a vector
(
F x* , y* ,z*
(x , y ,z )∈ X ×Y × Z ⊂ R *
)
*
*
n1
× R n2 × R n3 such that (P1):
⎧ ⎫ ⎪ ⎪ ⎪ F ( x, y,z ) w.r.t. ( x, y ) ∈ X × Y , subject to ⎪ ⎪⎪ ⎪⎪ G ( x, y,z ) ≤ 0, where = min ⎨ ⎬ ⎪ ⎪ ⎧⎪ f 2 ( x, y,z ) w.r.t. z ∈ Z and ⎫⎪ ⎪ ⎪ z x, y Arg min ∈ Ψ = ( ) ⎨ ⎬⎪ ⎪ ⎪⎩subject to g ( x, y,z ) ≤ 0. ⎪⎭ ⎭⎪ ⎩⎪
Here F , f 2 : R n → R, and G : R n → R m1 , g : R n → R m2 are continuous functions and mappings, respectively, with n = n1 + n2 + n3 , where ni ,i = 1, 2,3;m j , j = 1, 2,
are some fixed natural numbers. In a relation to problem (P1), let us define the following auxiliary subset:
Φ = {( x, y ) ∈ X × Y : ∃ z ∈ Z such that g ( x, y,z ) ≤ 0} .
(1)
Reduction of Dimension of the Upper Level Problem
259
Now we make the following assumption: A1. The set Φ1 ⊆ Φ defined as the subset of all pairs ( x, y ) ∈Φ , for which there
exists a unique vector z = z ( x, y ) ∈Ψ ( x, y ) satisfying, in addition, the inequality G ( x, y, z ( x, y ) ) ≤ 0 , is nonempty, convex, and compact. Moreover, suppose that
the thus defined function z : Φ1 → R n3 is continuous with respect to all its variables. Next, we introduce another bi-level programming problem, which is actually a so-called mathematical program with equilibrium constraints (MPEC): Find a
(x , y ,z )∈ X ×Y × Z ⊂ R *
*
*
(MP1)
n1
× R n2 × R n3 solving the problem (MP1):
F ( x, y, z ) →
(2)
min
( x,y,z )∈ X ×Y × Z
subject to (3) ( y,z ) ∈ Λ ( x ) , where Λ ( x ) is a collection of generalized Nash equilibrium (GNE) points of the following two-person game. Player 1 selects her strategies from the set Y and minimizes her payoff function f1 ( x, y, z ) ≡ F ( x, y, z ) subject to the con-
straints G ( x, y, z ) ≤ 0 and g ( x, y,z ) ≤ 0 . Player 2 uses the set of strategies Z and minimizes
her
payoff
function
G ( x, y, z ) ≤ 0 and g ( x, y,z ) ≤ 0 .
f 2 ( x, y,z )
subject
to
the
constraints
Remark 1. It is clear that if a vector ( y ,z ) ∈ Y × Z solves the lower level equilib-
rium problem of the MPEC (MP1) for a fixed x ∈ X , then z = z ( x, y ) , with the
mapping z = z ( x, y ) ∈Ψ ( x, y ) from assumption A1. Conversely, if a vector y minimizes the function f1 ( y ) ≡ f1 ( x, y, z ( x, y ) ) over an appropriate set of vec-
tors y, and in addition, G ( x, y ,z ( x, y ) ) ≤ 0 , then lower level problem in (MP1).
( y ,z ) = ( y ,z ( x, y ))
solves the ■
We are interested in establishing relationships between the solutions sets of problems (P1) and (MP1). First, we can prove the following auxiliary result. Theorem 1. Under assumption A1, there exists a nonempty convex compact subset D ⊂ X such that for all x ∈ D , there is a generalized Nash equilibrium (GNE) solution ( y,z ) ∈ Y × Z of the lower level equilibrium problem of the MPEC (MP1). ■
In order to establish relationships between the optimal solutions sets of problems (P1) and (MP1). We start with a rather restrictive assumption concerning problem (MP1), having in mind to relax them in our second paper.
260
V.V. Kalashnikov et al.
A2. Assume that the generalized Nash equilibrium (GNE) state y = y ( x ) , whose ex-
istence for each x ∈ D has been established in Theorem 1, is determined uniquely. Remark 2. In assumption A2, it would be redundant to demand the uniqueness of the GNE state z = z ( x ) , because this has been already required implicitly in as-
y = y ( x)
sumption A1: indeed, if
z = z ( x, y ( x ) ) = z ( x ) .
is determined uniquely, so is the ■
Theorem 2. Under assumptions A1 and A2, problems (P1) and (MP1) are equivalent. ■
Next, we examine certain nonlinear and linear bilevel programs and find out when assumptions A1 and A2 hold in this particular case. Moreover, we try to relax some of the too restrictive conditions in these assumptions.
3 Nonlinear Case First, it is easy to verify that for a problem (P1) with a non-void solutions set, assumption A1 always holds if all the components of the mappings G and g are convex (continuous) functions, and in addition, the lower level objective function f 2 = f 2 ( x, y,z ) is strictly convex with respect to z for each fixed pair of values of
( x, y ) .
■
Lemma 3. Under the above cited conditions, assumption A1 holds.
Assumption A2 is much more restrictive than A1: the uniqueness of a generalized Nash equilibrium (GNE) is quite a rare case. In order to deal with assumption A2, we have to suppose additionally that the upper and lower level objective functions are (continuously) differentiable, and moreover, the combined gradient
(∇
T T y F ,∇ z f2
mapping
fixed x ∈ X .
In
)
T
: R n2 + n3 → R n2 + n3
mathematical
is strictly monotone for each
terms,
the
latter
means
(∇ F ( x, y ,z ) ,∇ f ( x, y ,z )) − ( ∇ F ( x, y ,z ) ,∇ f ( x, y , z )) T y
1
T z 2
1
1
1
T
T y
2
2
T z 2
2
2
that
T
,
>0
( y ,z ) − ( y , z ) for all ( y , z ) ≠ ( y ,z ) from the (convex) subset Ξ = Ξ ( x ) defined below: 1
1
2
1
1
2
2
2
Ξ = Ξ ( x ) = {( y, z ) ∈ Y × Z : G ( x, y,z ) ≤ 0 and g ( x, y,z ) ≤ 0} ,
(4)
which is assumed to be non-empty for some subset K of X. Then it is well-known (cf., [3]), that for each x ∈ K , there exists a unique GNE ( y ( x ) , z ( x ) ) of the LLP in (MP1), which can be found as a (unique) solution of the corresponding variational inequality problem: Find a vector ( y ( x ) , z ( x ) ) ∈ Ξ ( x ) such that
Reduction of Dimension of the Upper Level Problem
( y − y ( x))
T
261
∇ y F ( x, y ( x ) ,z ( x ) ) + ( z − z ( x ) ) ∇ z f 2 ( x, y ( x ) ,z ( x ) ) ≥ 0 T
(5)
for all ( y, z ) ∈ Ξ ( x ) .
4 Linear Case In the linear case, when all the objective functions and the components of the constraints are linear functions and mappings, respectively, the situation with providing that assumptions A1 and A2 hold, is a bit different. For assumption A1 to hold, again it is enough to impose conditions guaranteeing the existence of a unique solution of the lower level LP problem z = z ( x, y ) on a certain compact subset of Z. For instance, the classical conditions will do (cf., [4]). As for assumption A2, here in linear case, the problem is much more complicated. Indeed, the uniqueness of a generalized Nash equilibrium (GNE) at the lower level of (MP1) is much too restrictive a demand. As was shown by Rosen [5], the uniqueness of a so-called normalized GNE is rather more realistic assumption. This idea was further developed later by many authors (cf., [6]). Before we consider the general case, we examine an interesting example (a slightly modified example from [7]), in which one of the upper level variables accepts only integer values. In other words, the problem studied in this example, is a so called mixed-integer bi-level linear programming problem (MIBLP). Consider the following example. Let the upper level problem have the following objective function: F ( x, y, z ) = −60 x − 10 y − 7 z → min
(6)
x ∈ X = {0,1} ; y ∈ [ 0,100] ; z ∈ [0 ,100]
(7)
x,y ,z
subject to and
f 2 ( x, y, z ) = −60 y − 8 z → min
(8)
z
subject to ⎡10 2 3 ⎤ ⎡ x ⎤ ⎡ 225⎤ ⎡0 ⎤ g ( x, y,z ) = ⎢⎢ 5 3 0 ⎥⎥ ⎢⎢ y ⎥⎥ − ⎢⎢ 230 ⎥⎥ ≤ ⎢⎢0 ⎥⎥ . ⎢⎣ 5 0 1 ⎥⎦ ⎢⎣ z ⎥⎦ ⎢⎣ 85 ⎥⎦ ⎢⎣0 ⎥⎦
(9)
We select the mixed-integer bi-level linear program (MIBLP) (6)–(9) as problem (P1). Its modification in comparison to the original example in [7] consists in elevating the lower level variable y (in the original example) up to the upper level in our example. (However, it is curious to notice that the optimal solution of the original example coincides with that of the modified one:
( x*, y*, z * ) = (1; 75; 21 23 )
in both cases!)
262
V.V. Kalashnikov et al.
It is easy to verify that assumption A1 holds in this problem: indeed, the lower level problem (8)–(9) has a unique solution

z = z(x, y) = min{ 85 − 5x, 75 − (10/3)x − (2/3)y }

for any pair of feasible values (x, y) ∈ Φ = {(x, y): x ∈ {0, 1}, 0 ≤ y ≤ 100}, which is in line with the predictions by Mangasarian [4]. However, not all triples (x, y, z(x, y)) satisfy the lower level constraints g(x, y, z(x, y)) ≤ 0, and the feasible subset Φ_1 ⊂ Φ described in assumption A1 becomes here

Φ_1 = {(0, y): 0 ≤ y ≤ 76 2/3} ∪ {(1, y): 0 ≤ y ≤ 55},   (10)

with the optimal reaction function

z = z(x, y) = { 75 − (2/3)y, if x = 0;   71 2/3 − (2/3)y, if x = 1.   (11)

Therefore, assumption A1 would hold completely if the variable x were continuous. However, here the subset Φ_1 ⊂ Φ is non-void, composed of two compact and convex parts, and the function z = z(x, y) is continuous with respect to the continuous variable y over each of the connected parts of Φ_1. Next, comparing the optimal values of the upper level objective function F over both connected parts of the feasible set Φ_1, we come to the conclusion that the triple

(x*, y*, z*) = (1; 75; 21 2/3)

is the optimal solution of problem (P1). Indeed,

F(1, y_1*, z_1*) = F(1, 75, 21 2/3) = −1011 2/3

is strictly less than

F(0, y_0*, z_0*) = F(0, 76 2/3, 23 8/9) = −933 8/9.
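As an illustration of how the lower level reaction z(x, y) in example (6)–(9) can be evaluated numerically, the following sketch solves the follower's problem (8)–(9) in z for a fixed pair (x, y) with an off-the-shelf LP solver; the use of scipy and the helper name lower_level_response are our own choices, not part of the paper.

from scipy.optimize import linprog

def lower_level_response(x, y):
    """Solve the lower level problem (8)-(9) in z for fixed (x, y).

    Minimizing f2 = -60*y - 8*z over z alone is equivalent to maximizing z,
    so the coefficient -8 is passed to the minimizer.  The constraint
    5x + 3y <= 230 does not involve z and is therefore omitted here."""
    c = [-8.0]                                    # objective: minimize -8*z
    A_ub = [[3.0], [1.0]]                         # rows of (9) that involve z
    b_ub = [225.0 - 10.0 * x - 2.0 * y,           # 10x + 2y + 3z <= 225
            85.0 - 5.0 * x]                       # 5x + z <= 85
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0.0, 100.0)])
    return res.x[0]

# Closed-form reaction given in the text: z(x, y) = min(85 - 5x, 75 - (10/3)x - (2/3)y)
print(lower_level_response(1, 75))   # close to 21 2/3, the z* reported above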
Now consider the modified problem:

F(x, y, z) = −60x − 10y − 7z → min_{x, y, z},   (12)

subject to

x ∈ X = {0, 1}; y ∈ [0, 100]; z ∈ [0, 100];   (13)

and

f_1(x, y, z) = −60x − 10y − 7z → min_{0 ≤ y ≤ 100};   f_2(x, y, z) = −60y − 8z → min_{0 ≤ z ≤ 100},
subject to
G(x, y, z) = [10 2 3; 5 3 0; 5 0 1] · (x, y, z)^T − (225, 230, 85)^T ≤ 0.   (14)
We call problem (12)–(14) the modified problem (MP1). It is easy to see that for each value of x, either x = 0 or x = 1, the lower level problem has a continuous set of GNEs. Namely, if x = 0, then all the GNE points (y, z) = (y(0), z(0)) belong to the straight line interval described by the equation

2y + 3z = 225 with 0 ≤ y ≤ 76 2/3.   (15)

In a similar manner, another straight line interval of GNE vectors for x = 1, that is, (y, z) = (y(1), z(1)), can be represented by the linear equation

2y + 3z = 215 with 0 ≤ y ≤ 75.   (16)

As could be expected, the linear upper level objective function F attains its minimum value at the extreme points of the above intervals (15) and (16), corresponding to the greater value of the variable y:
F_0* = F(x_0*, y_0*, z_0*) = F(0, 76 2/3, 23 8/9) = −933 8/9;
F_1* = F(x_1*, y_1*, z_1*) = F(1, 75, 21 2/3) = −1011 2/3.   (17)
As F_1* < F_0*, the global optimal solution of problem (MP1) coincides with that of the original problem (P1): (x*, y*, z*) = (1; 75; 21 2/3), although assumption A2 is clearly invalid in this example.
■
Acknowledgements

The research activity of the first author was financially supported by the R&D Department (Cátedra de Investigación) CAT-174 of the Instituto Tecnológico y de Estudios Superiores de Monterrey (ITESM), Campus Monterrey, and by the SEP-CONACYT project CB-200801-106664, Mexico. The third author was supported by the SEP-CONACYT project CB2009-01-127691.
References

1. Kalashnikov, V., Ríos-Mercado, R.: A natural gas cash-out problem: A bi-level programming framework and a penalty function method. Optimization and Engineering 7(4), 403–420 (2006)
2. Kalashnikov, V., Pérez-Valdés, G., Tomasgard, A., Kalashnykova, N.: Natural gas cash-out problem: Bilevel stochastic optimization approach. European Journal of Operational Research 206(1), 18–33 (2010)
3. Kinderlehrer, D., Stampacchia, G.: An Introduction to Variational Inequalities and Their Applications. Academic Press, New York (1980)
4. Mangasarian, O.: Uniqueness of solution in linear programming. Linear Algebra and Its Applications 25, 151–162 (1979)
5. Rosen, J.: Existence and uniqueness of equilibrium points for concave N-person games. Econometrica 33(3), 520–534 (1965)
6. Nishimura, R., Hayashi, S., Fukushima, M.: Robust Nash equilibria in N-person noncooperative games: Uniqueness and reformulation. Pacific Journal of Optimization 5(2), 237–259 (2005)
7. Saharidis, G., Ierapetritou, M.: Resolution method for mixed integer bilevel linear problems based on decomposition technique. Journal of Global Optimization 44(1), 29–51 (2009)
Reduction of Dimension of the Upper Level Problem in a Bilevel Programming Model. Part 2
Vyacheslav V. Kalashnikov, Stephan Dempe, Gerardo A. Pérez-Valdés, and Nataliya I. Kalashnykova
Abstract. The paper deals with the problem of reducing the dimension of the upper level problem in a bilevel programming model. In order to diminish the number of variables governed by the leader at the upper level, we create a second follower supplied with an objective function coinciding with that of the leader and pass part of the upper level variables to the lower level to be governed by the second follower. The lower level problem is also modified and becomes a Nash equilibrium problem solved by the original and the new followers. We look for conditions that guarantee that the modified and the original bilevel programming problems share at least one optimal solution.
Vyacheslav V. Kalashnikov, ITESM, Campus Monterrey, Monterrey, Mexico; e-mail: [email protected]
Stephan Dempe, TU Bergakademie Freiberg, Freiberg, Germany; e-mail: [email protected]
Gerardo A. Pérez-Valdés, NTNU, Trondheim, Norway; e-mail: [email protected]
Nataliya I. Kalashnykova, UANL, San Nicolás de los Garza, Mexico; e-mail: [email protected]
J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 265–272. springerlink.com © Springer-Verlag Berlin Heidelberg 2011

5 Normalized Generalized Nash Equilibrium

We continue considering the reduction of dimension of the upper level problem in a bilevel program, studied in Part 1 of this paper. Following the line proposed in Rosen [5], we consider the concept of normalized generalized Nash equilibrium (NGNE) defined below. First of all, we have to make our assumptions more detailed:

A3. We assume that each component G_j(x, y, z) and g_k(x, y, z) of the mappings G and g, respectively, is convex with respect to the variable (y, z). Moreover, for each fixed feasible x ∈ X, there exists a point (y^0, z^0) = (y^0(x), z^0(x)) ∈ Y × Z such that G_j(x, y^0(x), z^0(x)) < 0 and g_k(x, y^0(x), z^0(x)) < 0 for every nonlinear constraint G_j(x, y, z) ≤ 0 and g_k(x, y, z) ≤ 0, respectively.
Remark 3. The latter inequalities in assumption A3 give a sufficient (Slater) condition for the satisfaction of the Karush-Kuhn-Tucker (KKT) constraint qualification.

We wish to use the differential form of the necessary and sufficient KKT conditions for a constrained maximum. We therefore make the additional assumption:

A4. All the components G_j(x, y, z) and g_k(x, y, z) of the mappings G and g, respectively, possess continuous first derivatives with respect to y and z for all feasible (x, y, z) ∈ X × Y × Z. We also assume that, for all feasible points, the payoff function f_i(x, y, z) of the i-th player possesses continuous first derivatives with respect to the corresponding variables controlled by that player.

In our problem (P2), for the scalar functions f_i(x, y, z), i = 1, 2, we denote by ∇_y f_1(x, y, z) and ∇_z f_2(x, y, z), respectively, the gradients with respect to the players' variables. Thus ∇_y f_1(x, y, z) ∈ R^{n_2} and ∇_z f_2(x, y, z) ∈ R^{n_3}. The KKT conditions equivalent to (3) from Part 1 can now be stated as follows:
G(x, y, z) ≤ 0 and g(x, y, z) ≤ 0,   (18)

and there exist u = (u_1, u_2) ∈ R_+^{m1} × R_+^{m1} and v = (v_1, v_2) ∈ R_+^{m2} × R_+^{m2} such that

u_i^T G(x, y, z) = 0,  v_i^T g(x, y, z) = 0,  i = 1, 2,   (19)

and

f_1(x, y, z) ≤ f_1(x, w, z) + u_1^T G(x, w, z) + v_1^T g(x, w, z);
f_2(x, y, z) ≤ f_2(x, y, s) + u_2^T G(x, y, s) + v_2^T g(x, y, s)   (20)

for all feasible w and s. Since f_i, i = 1, 2, and the components G_j(x, y, z) and g_k(x, y, z) of the mappings G and g, respectively, are convex and differentiable by assumptions A3 and A4, inequalities (20) are equivalent to

∇_y f_1(x, y, z) + u_1^T ∇_y G(x, y, z) + v_1^T ∇_y g(x, y, z) = 0;
∇_z f_2(x, y, z) + u_2^T ∇_z G(x, y, z) + v_2^T ∇_z g(x, y, z) = 0.   (21)
We shall also use the following relation, which holds as a result of the convexity of the components G_j(x, y, z) and g_k(x, y, z). For every (y^0, z^0), (y^1, z^1) ∈ Y × Z and at each fixed x ∈ X we have

G_j(x, y^1, z^1) − G_j(x, y^0, z^0) ≥ (y^1 − y^0, z^1 − z^0)^T ∇_{(y,z)} G_j(x, y^0, z^0)
   = (y^1 − y^0)^T ∇_y G_j(x, y^0, z^0) + (z^1 − z^0)^T ∇_z G_j(x, y^0, z^0);
g_k(x, y^1, z^1) − g_k(x, y^0, z^0) ≥ (y^1 − y^0, z^1 − z^0)^T ∇_{(y,z)} g_k(x, y^0, z^0)
   = (y^1 − y^0)^T ∇_y g_k(x, y^0, z^0) + (z^1 − z^0)^T ∇_z g_k(x, y^0, z^0).   (22)
A weighted nonnegative sum of the functions f_i, i = 1, 2, is given by

σ(x, y, z; r) = r_1 f_1(x, y, z) + r_2 f_2(x, y, z),  r_i ≥ 0,   (23)

for each nonnegative vector r ∈ R^2. For each fixed r, a related mapping p(x, y, z; r) of R^{n_2+n_3} into itself is defined in terms of the gradients of the functions f_i, i = 1, 2, by

p(x, y, z; r) = ( r_1 ∇_y f_1(x, y, z),  r_2 ∇_z f_2(x, y, z) )^T.   (24)
After Rosen [5], we shall call p ( x, y,z;r ) the pseudo-gradient of σ ( x, y,z;r ) .
An important property of σ ( x, y,z;r ) is given by the following
Definition 1 [5]. The function σ(x, y, z; r) will be called uniformly diagonally strictly convex for (y, z) ∈ Y × Z and fixed r ≥ 0 if for every fixed x ∈ X and for any (y^0, z^0), (y^1, z^1) ∈ Y × Z we have

(y^1 − y^0, z^1 − z^0)^T [ p(x, y^1, z^1; r) − p(x, y^0, z^0; r) ] > 0.   (25)

Repeating and modifying arguments similar to those in [5], we will show later that a sufficient condition for σ(x, y, z; r) to be uniformly diagonally strictly convex is that the symmetric matrix [ P(x, y, z; r) + P^T(x, y, z; r) ] be (uniformly in x over X) positive definite for (y, z) ∈ Y × Z, where P(x, y, z; r) is the Jacobi matrix of p(x, y, z; r) with respect to (y, z).
Now following [5] we consider a special kind of equilibrium point in which each of the nonnegative multipliers u ∈ R_+^{m1}, v ∈ R_+^{m2} involved in the KKT conditions (19)–(20) is given by

u_1 = u^0 / r_1 and v_1 = v^0 / r_1,
u_2 = u^0 / r_2 and v_2 = v^0 / r_2,   (26)

for some r > 0 and u^0 ≥ 0, v^0 ≥ 0. Like Rosen [5], we call this a normalized generalized Nash equilibrium point (NGNE point). Now we establish, by slightly modifying the proofs of Theorems 3 and 4 in [5], the existence and uniqueness results for the NGNE points involved in the modified problem (MP1).

Theorem 4. Under assumptions A3 and A4, there exists a normalized generalized Nash equilibrium point of the lower level equilibrium problem (3) in (MP1) for every specified r > 0.

Proof. For a fixed value r = r̄ > 0, let

ρ(x, y, z; w, s; r̄) = r̄_1 f_1(x, w, z) + r̄_2 f_2(x, y, s).   (27)
Consider the feasible set of the equilibrium problem (3),

Θ(x) = { (y, z) ∈ Y × Z : G(x, y, z) ≤ 0 and g(x, y, z) ≤ 0 },   (28)

and the point-to-set mapping Γ: Θ(x) → Θ(x) given by

Γ(y, z) = { (w, s) ∈ Θ(x) : ρ(x, y, z; w, s; r̄) = min_{(q,t) ∈ Θ(x)} ρ(x, y, z; q, t; r̄) }.   (29)
It follows (by assumptions A3 and A4) from the continuity of ρ(x, y, z; q, t; r̄) and the convexity in (q, t) of ρ(x, y, z; q, t; r̄) for fixed (x, y, z) that Γ is an upper semi-continuous mapping that maps each point of the convex, compact set Θ(x) into a closed compact subset of Θ(x). Then by the Kakutani Fixed Point Theorem, there exists a point (y^0, z^0) ∈ Θ(x) such that (y^0, z^0) ∈ Γ(y^0, z^0), or

ρ(x, y^0, z^0; y^0, z^0; r̄) = min_{(w,s) ∈ Θ(x)} ρ(x, y^0, z^0; w, s; r̄).   (30)

The fixed point (y^0, z^0) ∈ Θ(x) is an equilibrium point satisfying (3). For suppose that it were not. Then, say for player 1, there would be a point y^1 such that (y^1, z^0) ∈ Θ(x) and f_1(x, y^1, z^0) < f_1(x, y^0, z^0). But then we would have ρ(x, y^0, z^0; y^1, z^0; r̄) < ρ(x, y^0, z^0; y^0, z^0; r̄), which contradicts (30).
Now by the necessity of the KKT conditions, (30) implies the existence of u^0 ∈ R_+^{m1}, v^0 ∈ R_+^{m2} such that

(u^0)^T G(x, y^0, z^0) = 0,  (v^0)^T g(x, y^0, z^0) = 0,   (31)

and

r̄_1 ∇_y f_1(x, y^0, z^0) + (u^0)^T ∇_y G(x, y^0, z^0) + (v^0)^T ∇_y g(x, y^0, z^0) = 0;
r̄_2 ∇_z f_2(x, y^0, z^0) + (u^0)^T ∇_z G(x, y^0, z^0) + (v^0)^T ∇_z g(x, y^0, z^0) = 0.   (32)

But these are just the conditions (19) and (21), with

u_1 = u^0 / r̄_1 and v_1 = v^0 / r̄_1,
u_2 = u^0 / r̄_2 and v_2 = v^0 / r̄_2,

which, together with (18), are sufficient to ensure that (y^0, z^0) ∈ Θ(x) satisfies (3); (y^0, z^0) is therefore a normalized generalized Nash equilibrium (NGNE) point for the specified value of r = r̄.
■
Theorem 5. Let assumptions A3 and A4 be valid, and let σ(x, y, z; r) be (uniformly in x over X) diagonally strictly convex for every r ∈ Q, where Q is a convex subset of the positive orthant of R^2. Then for each r ∈ Q there is a unique normalized generalized Nash equilibrium (NGNE) point.

Proof. Assume that for some r = r̄ ∈ Q we have two distinct NGNE points (y^0, z^0) ≠ (y^1, z^1) ∈ Θ(x). Then for ℓ = 0, 1 we have

G(x, y^ℓ, z^ℓ) ≤ 0 and g(x, y^ℓ, z^ℓ) ≤ 0;   (33)

there exist u^ℓ ∈ R_+^{m1}, v^ℓ ∈ R_+^{m2} such that

(u^ℓ)^T G(x, y^ℓ, z^ℓ) = 0,  (v^ℓ)^T g(x, y^ℓ, z^ℓ) = 0,   (34)

and

r̄_1 ∇_y f_1(x, y^ℓ, z^ℓ) + (u^ℓ)^T ∇_y G(x, y^ℓ, z^ℓ) + (v^ℓ)^T ∇_y g(x, y^ℓ, z^ℓ) = 0;
r̄_2 ∇_z f_2(x, y^ℓ, z^ℓ) + (u^ℓ)^T ∇_z G(x, y^ℓ, z^ℓ) + (v^ℓ)^T ∇_z g(x, y^ℓ, z^ℓ) = 0.   (35)

We multiply the first row in (35) by (y^0 − y^1)^T for ℓ = 0 and by (y^1 − y^0)^T for ℓ = 1; in a similar manner, we multiply the second row in (35) by (z^0 − z^1)^T for ℓ = 0 and by (z^1 − z^0)^T for ℓ = 1; finally, we sum all four products. Since the left-hand sides of (35) vanish, this gives β + γ = 0, where
β = (y^1 − y^0, z^1 − z^0)^T [ p(x, y^1, z^1; r̄) − p(x, y^0, z^0; r̄) ],   (36)
( ) ∇ G ( x, y ,z )( y − y ) + (u ) ∇ G ( x, y , z )( y − y ) + + ( v ) ∇ g ( x, y ,z )( y − y ) + ( v ) ∇ g ( x, y ,z )( y − y ) + + ( u ) ∇ G ( x, y ,z )( z − z ) + ( u ) ∇ G ( x, y ,z )( z − z ) + + ( v ) ∇ g ( x, y ,z )( z − z ) + ( v ) ∇ g ( x, y ,z )( z − z ) ≥ ≥ ( u ) ⎡G ( x, y ,z ) − G ( x, y ,z ) ⎤ + ( u ) ⎡G ( x , y ,z ) − G ( x, y ,z ) ⎤ + ⎣ ⎦ ⎣ ⎦ + ( v ) ⎡ g ( x, y ,z ) − g ( x, y ,z ) ⎤ + ( v ) ⎡ g ( x, y , z ) − g ( x, y ,z ) ⎤ = ⎣ ⎦ ⎣ ⎦ = − ( u ) G ( x, y ,z ) − ( u ) G ( x, y ,z ) − ( v ) g ( x, y ,z ) − ( v ) g ( x, y ,z ) ≥ 0.
γ = u0
0
T
0
0
0
1 T
1
y
T
0
0
0
1
1
T
y
0
T
0
1
0
1
1
1
0
0
0
1
1
1
1
1
0
T
z
0 T
0
0
0
1 T
1
1
z
1
1
0
z
0 T
0
T
0
1
y
z
0
1
y
0
T
0
1
0
1
1
1
1
T
1 T
1
1
T
1
0
1
0
1
0
1
0
1
T
0
1
1
1
0
0
T
0
0
(37) Then since σ ( x, y,z;r ) is (uniformly by x from X) diagonally strictly convex we
have β > 0 by (25), which contradicts β + γ = 0 and proves the theorem.
■
We complete this part by giving (similarly to [5]) a sufficient condition on the functions f_i, i = 1, 2, that ensures that σ(x, y, z; r) is (uniformly in x over X) diagonally strictly convex. The condition is given in terms of the (n_2 + n_3) × (n_2 + n_3) matrix P(x, y, z; r), which is the Jacobi matrix of p(x, y, z; r) with respect to (y, z) for fixed r > 0. That is, the j-th column of P(x, y, z; r) is ∂p(x, y, z; r)/∂y_j if 1 ≤ j ≤ n_2, and ∂p(x, y, z; r)/∂z_{j−n_2} if n_2 + 1 ≤ j ≤ n_2 + n_3, where p(x, y, z; r) is defined by (24).

Theorem 6. A sufficient condition for σ(x, y, z; r̄) to be (uniformly in x over X) diagonally strictly convex for (y, z) ∈ Θ(x) and fixed r = r̄ > 0 is that the symmetric matrix [ P(x, y, z; r̄) + P^T(x, y, z; r̄) ] be (uniformly in x over X) positive definite for (y, z) ∈ Θ(x).
Proof. Let (y^0, z^0) ≠ (y^1, z^1) ∈ Θ(x) be any two distinct points in Θ(x), and let (y(α), z(α)) = α(y^1, z^1) + (1 − α)(y^0, z^0), so that (y(α), z(α)) ∈ Θ(x) for 0 ≤ α ≤ 1. Now, since P(x, y, z; r̄) is the Jacobi matrix of p(x, y, z; r̄), we have
d p(x, y(α), z(α); r̄) / dα = P(x, y(α), z(α); r̄) · d(y(α), z(α)) / dα = P(x, y(α), z(α); r̄) (y^1 − y^0, z^1 − z^0),   (38)

or

p(x, y^1, z^1; r̄) − p(x, y^0, z^0; r̄) = ∫_0^1 P(x, y(α), z(α); r̄) (y^1 − y^0, z^1 − z^0) dα.   (39)

Multiplying both sides by (y^1 − y^0, z^1 − z^0)^T gives

(y^1 − y^0, z^1 − z^0)^T [ p(x, y^1, z^1; r̄) − p(x, y^0, z^0; r̄) ]
  = ∫_0^1 (y^1 − y^0, z^1 − z^0)^T P(x, y(α), z(α); r̄) (y^1 − y^0, z^1 − z^0) dα
  = (1/2) ∫_0^1 (y^1 − y^0, z^1 − z^0)^T [ P(x, y(α), z(α); r̄) + P^T(x, y(α), z(α); r̄) ] (y^1 − y^0, z^1 − z^0) dα > 0,

which shows that (25) is satisfied.
■
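Theorem 6 also suggests a simple numerical test: for a given r̄, form the Jacobi matrix P(x, y, z; r̄) of the pseudo-gradient and check whether its symmetric part is positive definite. The toy quadratic payoffs below (and all constants in them) are our own illustrative assumptions, not taken from the paper.

import numpy as np

# Toy instance with n_2 = n_3 = 1:
#   f1(x, y, z) = a*y**2 + c*y*z,   f2(x, y, z) = b*z**2 + d*y*z,
# so p(x, y, z; r) = (r1*(2*a*y + c*z), r2*(2*b*z + d*y)) and its Jacobian
# with respect to (y, z) is the constant matrix P below.
a, b, c, d = 1.0, 1.5, 0.3, -0.2     # assumed payoff coefficients
r = np.array([1.0, 2.0])             # assumed fixed weight vector r̄ > 0

P = np.array([[2.0 * a * r[0], c * r[0]],
              [d * r[1],       2.0 * b * r[1]]])

sym = P + P.T                        # the matrix appearing in Theorem 6
eigenvalues = np.linalg.eigvalsh(sym)
print("P + P^T positive definite:", bool(np.all(eigenvalues > 0)))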
6 Conclusion

The paper (Part 1 and Part 2) deals with the problem of reducing the number of variables at the upper level of bilevel programming problems. Such problems are widely used to model various applications, in particular the natural gas cash-out problems described in [1] and [2]. To solve these problems with stochastic programming tools, it is important that part of the upper level variables be governed at the lower level, so as to reduce the number of (upper level) variables involved in generating the scenario trees.

The paper presents certain preliminary results recently obtained in this direction. In particular, it has been demonstrated that the desired reduction is possible when the lower level optimal response is determined uniquely for each vector of upper level variables. In Part 2, the necessary basis for similar results is prepared for the general case of bilevel programs with linear constraints, when the uniqueness of the lower level optimal response is quite a rare case. However, if the optimal response is defined for a fixed set of Lagrange multipliers, then it is possible to demonstrate (following Rosen [5]) that the so-called normalized Nash equilibrium is unique. The latter gives one hope of obtaining positive results for reducing
the dimension of the upper level problem without affecting the solution of the original bilevel programming problem.
Acknowledgements

The research activity of the first author was financially supported by the R&D Department (Cátedra de Investigación) CAT-174 of the Instituto Tecnológico y de Estudios Superiores de Monterrey (ITESM), Campus Monterrey, and by the SEP-CONACYT project CB-200801-106664, Mexico. The third author was supported by the SEP-CONACYT project CB2009-01-127691.
References

1. Kalashnikov, V., Ríos-Mercado, R.: A natural gas cash-out problem: A bilevel programming framework and a penalty function method. Optimization and Engineering 7(4), 403–420 (2006)
2. Kalashnikov, V., Pérez-Valdés, G., Tomasgard, A., Kalashnykova, N.: Natural gas cash-out problem: Bilevel stochastic optimization approach. European Journal of Operational Research 206(1), 18–33 (2010)
3. Kinderlehrer, D., Stampacchia, G.: An Introduction to Variational Inequalities and Their Applications. Academic Press, New York (1980)
4. Mangasarian, O.: Uniqueness of solution in linear programming. Linear Algebra and Its Applications 25, 151–162 (1979)
5. Rosen, J.: Existence and uniqueness of equilibrium points for concave N-person games. Econometrica 33(3), 520–534 (1965)
6. Nishimura, R., Hayashi, S., Fukushima, M.: Robust Nash equilibria in N-person noncooperative games: Uniqueness and reformulation. Pacific Journal of Optimization 5(2), 237–259 (2005)
7. Saharidis, G., Ierapetritou, M.: Resolution method for mixed integer bilevel linear problems based on decomposition technique. Journal of Global Optimization 44(1), 29–51 (2009)
Representation of Loss Aversion and Impatience Concerning Time Utility in Supply Chains
Péter Földesi, János Botzheim, and Edit Süle
Abstract. The paper deals with the investigation of the critical time factor of the supply chain. The literature review gives a background for understanding and handling the reasons and consequences of the growing importance of time, and the phenomenon of time inconsistency. By using utility functions to represent the value of various delivery times for the different participants in the supply chain, including the final customers, it is shown that the behaviour and willingness to pay of time-sensitive and non time-sensitive consumers are different for varying lead times. Longer lead times not only generate less utility, but impatience also influences the decision makers; that is, the time elasticity is not constant but is a function of time. For the optimization, soft computing techniques (particle swarm optimization in this paper) can be efficiently applied.
Péter Földesi, Department of Logistics and Forwarding, Széchenyi István University, 1 Egyetem tér, Győr, 9026, Hungary; e-mail: [email protected]
János Botzheim, Department of Automation, Széchenyi István University, 1 Egyetem tér, Győr, 9026, Hungary; e-mail: [email protected]
Edit Süle, Department of Marketing and Management, Széchenyi István University, 1 Egyetem tér, Győr, 9026, Hungary; e-mail: [email protected]
J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 273–282. springerlink.com © Springer-Verlag Berlin Heidelberg 2011

1 Introduction

Time has limits; consumers have become time-sensitive and choose the contents of their basket of commodities according to the available time as well. The time necessary to obtain a product/service (access time) is involved in product utility to an increasing extent, and assuring it is the task of logistics. There are several reasons for
the shortening of this access time; one of the most important is the change in customer expectations, which can be related to new trends emerging in the most diverse areas, with the time factor playing the main role [10]. Increasing rapidity is also encouraged by the sellers, who compete against each other on the basis of time, because of the pressure to reduce costs and inventory and to increase efficiency and customer satisfaction [12, 19, 21]. Customer satisfaction is determined by human factors as well, based on the loss aversion and impatience features of human thinking: "future utility is less important" [5].

After identifying the character of time as a resource, it can be seen that new management technologies aimed at time compression and faster service are spreading. In addition, the actors are expected to meet the traditional requirements such as cost reduction, capacity utilization, increase in efficiency, quality improvement and customer satisfaction [4, 18]. The paper is concerned with the behaviour of time-sensitive and non time-sensitive customers, described by means of utility functions, and it tries to find optimal time parameters for different time demands by using logistical performance measures based on time.
2 Time Compression

The time-sensitive segment of the population continuously increases. This segment, which depends on time, expects special services with high time-quality, speed and punctuality. Because of the increasing time-preference, intertemporal decisions are present-asymmetric. Lots of work is coupled with higher income [2, 16, 22], so time becomes scarcer than money, and time sensitivity comes to dominate price sensitivity. The change in the attitude towards time is not a novelty and cannot be related solely to the formation of the pace society [3, 16].

Logistics has to find delivery solutions adjusted to the consumption behaviour of products, which generates many kinds of logistical needs that can already be seen today. The importance of time differs between the production and consumption points of view, but it also differs across customer segments and groups of products. The relevant literature deals with the consequences of time-based competition and with methods that can respond to this challenge. Research [13] has found that there is a close relationship between the entire lead time (defined as the period between a purchase order placement and its receipt by the client), the customers' demands, the willingness to pay and customer loyalty. Karmarkar [6] pointed out that shorter delivery times are most probably inversely related to market shares or price premiums or both. Customers highly appreciate short and punctual delivery times; therefore they will not turn to competitors, and they may be willing to pay a price premium for shorter delivery times. We can find several methods and practices in operations management which produce visible results in manufacturing. These time-based performances include sales growth, return on investment, market share gain, and overall competitive position [13, 14, 17, 20].
The customer's need for fast service points in the same direction as the company's ambition to decrease lead time. By now it has turned out that time itself also behaves like a resource that has to be managed. Therefore, within the supply chain, not only are the interior solutions within the company aimed at time-saving, but the spatial and temporal expansion of remote processes arranged by different actors and with different time-consumption is also of high importance. Literature on time-based competition strategies is likewise aimed at the temporal integration of the different levels of the supply chain. Among these we can find the methods popular nowadays such as just-in-time (JIT), agile production, lean production, Quick Response (QR), Efficient Customer Response (ECR), cross-docking, etc. [3, 13, 17, 19, 20, 21, 22]. Based on these we can distinguish the internal (measurable only by the company) and the external (perceived also by the customers) forms of time performance [19]. Customer responsiveness is the ability to reduce the time required to deliver products and to reorganize production processes quickly in response to requests. Improved customer responsiveness can be achieved through available inventory close to buyers, or through faster delivery with shorter lead times and a good connection to shipment logistics [11, 18].
2.1 Value Factors of Goods

The possession-, consumer-, place- and time-value of products are different but are the result of correlated processes. Consumer-value is created through production and is basically determined by the quality of the product, but it can also be influenced by the time and place of its access. These two latter values are value categories created by logistics. Place- and time-value can be interpreted only in relation to consumer-value, because we can decide the optimal time and place of consumption only by obtaining consumer-value and only in accordance with it. Time-value becomes more important as it is determined by the lead time between the appearance and the satisfaction of demand [19]. It is maximal when the search, production and obtaining of the product have no time requirements, that is to say, the demand can be fulfilled immediately at the moment of its appearance.

Time sensitivity is different for each consumer and product. We can speak about time-sensitive consumer segments and also about kinds of products which are very sensitive to any waiting or delay. The willingness to wait is related to the importance of the product and to its substitutability: it is in direct proportion to the first and in inverse proportion to the latter. This determines the opportunity cost of waiting for a product for the consumer. Waiting means opportunity cost, which comes from wasted time and wasted possibilities. Time, as a resource, behaves as a capacity which we have to use efficiently. The consumer is always willing to wait as long as the advantage of the sacrificed possibilities is lower than the benefit coming from the product, or the cost of waiting does not exceed it (for example, unutilized capacities).
2.2 The Value of Time

Customers tend to make decisions based on acceptability, affordability and accessibility; in the literature this is the 3As framework for assessing potential benefits. Perceived benefits, determined by several elements connected with the product, the provider and the circumstances, are weighed against the perceived sacrifices: factors like cost, risk/uncertainty and time. Time appears like a hidden cost. How we value time depends on several factors. First of all, it depends on the customer type: we distinguish between the end user and the industrial customer. The final buyer gets more and more time-sensitive, so in his case the choice based on time can be described by a utility function which measures product usefulness depending on the length of time it takes to obtain the product. Fig. 1 shows a possible form of such a function. The derivative function can also give information about how the marginal utility of time behaves. If we compare it with the marginal cost function of the service, we can see whether it is worth making efforts to provide faster service in a certain segment.
Fig. 1 Utility of time for the final buyers
Fig. 2 Utility of time for the industrial buyers
For the buyers at higher levels of the supply chain, those who buy for further processing (producers), or for reselling (wholesalers, retailers), there is another kind of utility function to draw. This is shown in Fig. 2. The limited time-utility is due to the larger time consciousness, because time costs money for companies. Like the aim to satisfy the consumer at a high level, the aim to operate efficiently as well leads to optimizing on a time basis.
2.3 Elasticity of Time

There are consumers who are not sensitive to time, who do not want to or cannot afford rapidity. There are also products/services where urgency is not necessary; just the opposite, quality is brought by time (e.g., process-centred services). The behaviour of these consumers is shown in Fig. 3, where price does not increase with faster service (the opposite direction on the lead time axis): the price
is constant, independent of time. The buyer does not pay more, even for quicker service: his relation to time is totally inflexible. Fig. 4 shows the opposite case, where getting something at a certain time is worth everything, which means there is an infinite time-elasticity.
Fig. 3 Absolute time-insensitive consumer
Fig. 4 Infinite time-elasticity
Time-elasticity shows how much value it has for a buyer to get 1% faster service. Fig. 5 shows the behaviour of a consumer who is not willing to reward the acceleration of delivery time to the same degree: cutting the lead time from T1 to T2, he/she is only willing to pay the price P2 instead of the price P1, which means that the relative decrease of T results only in a relative price increment (P2 − P1)/P1.
Fig. 5 Non-time sensitive consumers
Fig. 6 Time sensitive consumers
Time-elasticity appears in a flexible behaviour, which means that a 1% relative decrease in lead time can realize a relatively higher price increment. A consumer surplus can even arise if the reservation price (the maximum price the buyer is willing to pay for a certain time) is higher than the price fixed by the provider. Fig. 6 shows the behaviour of a time-sensitive consumer. Economics and marketing oriented research recognizes that longer lead times might have a negative impact on customer demand. The firm's objective is to maximize profit by optimal selection of price and delivery time.
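In the notation of Fig. 5 and Fig. 6, one convenient way to express this time-elasticity (our formula, added only for illustration) is

ε_T = [(P2 − P1)/P1] / [(T1 − T2)/T1],

i.e., the relative price premium obtained per unit of relative lead time reduction; ε_T < 1 corresponds to the non time-sensitive buyer of Fig. 5, while ε_T > 1 corresponds to the time-sensitive buyer of Fig. 6.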
2.4 Maximizing the Value-Cost Ratio Concerning the Time Factor

Concerning the customers' time sensitivity detailed in the previous sections, the following model can be set. The customer satisfaction (S) is affected by two elements:
• S_U – the actual utility of obtaining the goods
• S_A – the accuracy of the service, i.e., the variance of the lead time

A simple representation of the satisfaction based on utility can be:

S_U(t) = u_0 − u_1 · t^{β_U},   (1)

where u_0 and u_1 are real constants, t is the lead time, and β_U > 1 represents the time sensitivity of customers. (The value of u_0 shows the satisfaction of obtaining the goods with zero lead time. Negative values of S_U mean dissatisfaction.) The accuracy is considered an attractive service element in modern logistic, just-in-time systems. When the supply chain is being extended, that is, the lead time is growing, the accuracy, and hitting the time-window, becomes harder (see also Fig. 2), thus we can write:

A(t) = a_0 − a_1 · t,   (2)

where A(t) is the measure of accuracy, a_0 and a_1 are real constants, and t is the time. The satisfaction measure is progressive:

S_A(t) = (a_0 − a_1 · t)^{β_A},   (3)

where β_A > 1 is the sensitivity. The cost of the actual logistic service depends on the lead time required; the shorter the lead time, the more expensive the service. Since the cost reduction is not a linear function of lead time extension, we can write:

C(t) = c_0 + c_1 / t,   (4)

where C(t) is the cost, t is the lead time, and c_0 and c_1 are real constants. The target is to maximize the total satisfaction over the costs:

max (S_U(t) + S_A(t)) / C(t),  0 < t < ∞.   (5)

Increasing the lead time leads to an "objective" decline in satisfaction. On the other hand, since future utility is less important [5, 15] in a "subjective" sense, another time sensitivity should be applied in the model. Thus instead of considering β_U and β_A as constants, the exponents are interpreted as functions of time [1, 9]. In this paper we suggest

β_U = β_{u0} + β_{u1} · t   (6)

and

β_A = β_{a0} − β_{a1} · t.   (7)

So, the function to be maximized is:

f = [ u_0 − u_1 · t^{β_{u0} + β_{u1}·t} + (a_0 − a_1 · t)^{β_{a0} − β_{a1}·t} ] / ( c_0 + c_1 / t ).   (8)
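The objective (8) is straightforward to transcribe into code. The sketch below is a direct reading of Eqs. (1)–(8) with the default constants of Table 1; the function name and the Python setting are our own choices, not part of the paper.

def total_satisfaction_ratio(t, u0=10.0, u1=0.1, a0=10.0, a1=0.1,
                             c0=10.0, c1=100.0, bu0=1.3, bu1=0.0,
                             ba0=1.3, ba1=0.0):
    """Value of f(t) from Eq. (8): (S_U(t) + S_A(t)) / C(t), for 0 < t < a0/a1."""
    beta_u = bu0 + bu1 * t            # time-dependent utility exponent, Eq. (6)
    beta_a = ba0 - ba1 * t            # time-dependent accuracy exponent, Eq. (7)
    s_u = u0 - u1 * t ** beta_u       # satisfaction from utility, Eq. (1)
    s_a = (a0 - a1 * t) ** beta_a     # satisfaction from accuracy, Eq. (3)
    cost = c0 + c1 / t                # cost of the logistic service, Eq. (4)
    return (s_u + s_a) / cost

With β_{u1} = β_{a1} = 0, evaluating this function at the optimum reported in the first row of Table 3 (t ≈ 15.856) gives f ≈ 1.368, matching the tabulated value.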
3 Particle Swarm Optimization

Particle swarm optimization (PSO) is a population based stochastic optimization technique inspired by the social behavior of bird flocking or fish schooling [7, 8]. In these methods a number of individuals try to find better and better places by exploring their environment, led by their own experiences and the experiences of the whole community. Each particle is associated with a position in the search space which represents a solution for the optimization problem. Each particle remembers the coordinates of the best solution it has achieved so far, and the best solution achieved so far by the whole swarm is also remembered. The particles move in a direction based on their personal best solution and the global best solution of the swarm. The positions of the particles are randomly initialized. The next position of particle i is calculated from its previous position x_i(t) by adding a velocity to it:

x_i(t + 1) = x_i(t) + v_i(t + 1).   (9)

The velocity is calculated as:

v_i(t + 1) = w v_i(t) + γ_1 r_1 [p_i(t) − x_i(t)] + γ_2 r_2 [g(t) − x_i(t)],   (10)

where w is the inertia weight, γ_1 and γ_2 are parameters, and r_1 and r_2 are random numbers between 0 and 1. The second term in equation (10) is the cognitive component containing the best position remembered by particle i (p_i(t)); the third term is the social component containing the best position remembered by the swarm (g(t)). The inertia weight represents the importance of the previous velocity, while the γ_1 and γ_2 parameters represent the importance of the cognitive and social components, respectively. The algorithm becomes stochastic because of r_1 and r_2. After updating the positions and velocities of the particles, the vectors describing the particles' best positions and the global best position have to be updated, too. If the predefined iteration number is reached, the algorithm stops. The number of particles is also a parameter of the algorithm.
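A compact sketch of the update rules (9)–(10), specialized to the one-dimensional search for the lead time t, is given below. The swarm settings follow Table 2; the search interval (0.01, 100), the clipping of positions to that interval, and the reuse of the total_satisfaction_ratio function from the previous sketch are our own assumptions, not the authors' implementation.

import random

def pso_maximize(func, lo, hi, n_particles=30, n_iter=200,
                 w=1.0, gamma1=0.5, gamma2=0.5):
    """Minimal 1-D particle swarm optimizer using the updates (9)-(10)."""
    x = [random.uniform(lo, hi) for _ in range(n_particles)]   # positions
    v = [0.0] * n_particles                                    # velocities
    pbest = list(x)                                            # personal bests
    pbest_val = [func(p) for p in pbest]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = random.random(), random.random()
            v[i] = (w * v[i] + gamma1 * r1 * (pbest[i] - x[i])
                    + gamma2 * r2 * (gbest - x[i]))             # Eq. (10)
            x[i] = min(max(x[i] + v[i], lo), hi)                # Eq. (9), clipped
            val = func(x[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = x[i], val
                if val > gbest_val:
                    gbest, gbest_val = x[i], val
    return gbest, gbest_val

# Example: the row of Table 3 with beta_u1 = beta_a1 = 0.01
t_opt, f_opt = pso_maximize(
    lambda t: total_satisfaction_ratio(t, bu1=0.01, ba1=0.01), 0.01, 100.0)
print(t_opt, f_opt)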
4 Numerical Example

The parameters of function f in Equation (8) used in the simulations are presented in Table 1. Parameters β_{u1} and β_{a1} are varied in the simulations. Table 2 shows the PSO parameters. The obtained results for time using different β_{u1} and β_{a1} parameters are presented in Table 3.

Table 1 Function Parameters

Parameter   Value
u0          10
u1          0.1
a0          10
a1          0.1
c0          10
c1          100
βu0         1.3
βa0         1.3

Table 2 PSO Parameters

Parameter              Value
Number of iterations   200
Number of particles    30
w                      1
γ1                     0.5
γ2                     0.5

Table 3 Results

βu1    βa1    f      t
0      0      1.368  15.856
0.01   0.01   1.076  10.825
0.02   0.02   0.913  8.728
0.03   0.03   0.804  7.509
0.04   0.04   0.724  6.688
0.05   0.05   0.662  6.093
0.06   0.06   0.611  5.635
0.1    0.1    0.478  4.516
0      0.1    0.517  6.578
0.1    0      0.962  6.597
1      1      0.124  1.3
0      1      0.125  1.3
1      0      0.521  2.506
2      2      0.067  0.65
It is clearly shown that increasing impatience presses down the lead time t; however, this effect is degressive, that is, ten times higher impatience (β_{u1} = 0.01 and β_{a1} = 0.01 versus β_{u1} = 0.1 and β_{a1} = 0.1) results in only about a halving of the lead time (t = 10.825 versus t = 4.516). By increasing β_{u1} and β_{a1} the solution shifts from the non time-sensitive case (see Fig. 5) to the time-sensitive situation (Fig. 6) in a subjective sense as well, and it represents the time compression, the "fear" of loss and the impatience of decision makers: they want shorter lead times and they are willing to pay more than can be derived from the objective value/cost ratio.
5 Conclusions

The effects of time sensitivity have objective and subjective features. The utility of possessing goods in time and the accuracy of delivery times can be described by univariate functions, and the cost of providing a given lead time can be considered a hyperbolic function. When maximizing the utility-cost ratio, another, subjective element can be embedded in the model by extending the meaning of the power functions used, with time-dependent exponents. The overall effect of this kind of impatience can be detected by using simulations. Particle swarm optimization is an efficient method to explore the side effects of loss aversion, which turned out to be degressive in our investigation.
References

[1] Bleichrodt, H., Rhode, K.I.M., Wakker, P.P.: Non-hyperbolic time inconsistency. Games and Economic Behavior 66, 27–38 (2009)
[2] Bosshart, D.: Billig: wie die Lust am Discount Wirtschaft und Gesellschaft verändert. Redline Wirtschaft, Frankfurt/M (2004)
[3] Christopher, M.: Logistics and Supply Chain Management (Creating Value-Adding Networks). Prentice-Hall, Englewood Cliffs (2005)
[4] Földesi, P., Botzheim, J., Süle, E.: Fuzzy approach to utility of time factor. In: Proceedings of the 4th International Symposium on Computational Intelligence and Intelligent Informatics, ISCIII 2009, Egypt, pp. 23–29 (October 2009)
[5] Frederick, S., Loewenstein, G., O'Donoghue, T.: Time discounting and time preference: A critical review. Journal of Economic Literature 40, 351–401 (2002)
[6] Karmarkar, U.S.: Manufacturing lead times. Elsevier Science Publishers, Amsterdam (1993)
[7] Kennedy, J., Eberhart, R.C.: Particle swarm optimization. In: Proceedings of the IEEE International Conference on Neural Networks, Perth, Australia, pp. 1942–1948 (1995)
[8] Kennedy, J., Eberhart, R.C., Shi, Y.: Swarm Intelligence. Morgan Kaufmann, San Francisco (2001)
[9] Kahneman, D., Tversky, A.: Prospect theory: An analysis of decision under risk. Econometrica 47(2), 263–292 (1979)
[10] LeHew, M.L.A., Cushman, L.M.: Time sensitive consumers' preference for concept clustering: An investigation of mall tenant placement strategy. Journal of Shopping Center Research 5(1), 33–58 (1998)
[11] Lehmusvaara, A.: Transport time policy and service level as components in logistics strategy: A case study. International Journal of Production Economics 56-57, 379–387 (1998)
[12] McKenna, R.: Real Time (The benefit of short-term thinking). Harvard Business School Press, Boston (1997)
[13] Nahm, A.Y., Vonderembse, M.A., Koufteros, X.A.: The impact of time-based manufacturing and plant performance. Journal of Operations Management 21, 281–306 (2003)
[14] Pangburn, M.S., Stavrulaki, E.: Capacity and price setting for dispersed, time-sensitive customer segments. European Journal of Operational Research 184, 1100–1121 (2008)
[15] Prelec, D.: Decreasing impatience: A criterion for non-stationary time preference and "hyperbolic" discounting. Scandinavian Journal of Economics 106, 511–532 (2004)
[16] Rosa, H.: Social acceleration: Ethical and political consequences of a desynchronized high-speed society. Constellation 10(1), 3–33 (2003)
[17] Saibal, R., Jewkes, E.M.: Customer lead time management when both demand and price are lead time sensitive. European Journal of Operational Research 153, 769–781 (2004)
[18] Stalk, G.: Time-based competition and beyond: Competing on capabilities. Planning Review 20(5), 27–29 (1992)
[19] De Toni, A., Meneghetti, A.: Traditional and innovative path towards time-based competition. International Journal of Production Economics 66, 255–268 (2000)
[20] Tu, Q., Vonderembse, M.A., Ragu-Nathan, T.S.: The impact of time-based manufacturing practices on mass customization and value to customer. Journal of Operations Management 19, 201–217 (2001)
[21] Vanteddu, G., Chinnam, R.B., Yang, K., Gushikin, O.: Supply chain focus dependent safety stock placement. International Journal of Flexible Manufacturing System 19(4), 463–485 (2007)
[22] Waters, C., Donald, J.: Global Logistics and Distribution Planning. Kogan Page Publishers, Corby (2003)
Robotics Application within Bioengineering: Neuroprosthesis Test Bench and Model Based Neural Control for a Robotic Leg
Dorin Popescu, Dan Selişteanu, Marian S. Poboroniuc, and Danut C. Irimia
Abstract. The paper deals with motion analysis of the human body, with robotic leg control and then with a neuroprosthesis test bench. The issues raised in motion analysis are of interest for controlling motion-specific parameters for the movement of the robotic leg. The resulting data are used for further processing in humanoid robotics and in assistive and recuperative technologies for people with disabilities. The results are implemented on a robotic leg, which has been developed in our laboratories and has been used to build a neuroprosthesis control test bench. A model based neural control strategy is implemented, too. The performance of the implemented control strategies for trajectory tracking is analysed by computer simulation.

Keywords: bioengineering, neural networks, robotics, neuroprosthesis control.
Dorin Popescu · Dan Selişteanu, Department of Automation & Mechatronics, University of Craiova, 107 Decebal Blvd., Craiova, Romania; e-mail: [email protected]
Marian S. Poboroniuc · Danut C. Irimia, Faculty of Electrical Engineering, "Gh. Asachi" Technical University of Iasi, 53 Mageron Blvd., Iasi, Romania; e-mail: [email protected]
J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 283–294. springerlink.com © Springer-Verlag Berlin Heidelberg 2011

1 Introduction

Engineering is facing a challenge from the development of a new field, biomedical engineering (the application of engineering principles and techniques to the medical field). The present work seeks to close the gap between engineering and medicine by creating a bridge between motion analysis and a neuroprosthesis test bench. For this reason, the work aims to offer a common tool to orthopedic doctors and to engineers interested in the medical field.

It is estimated that the annual incidence of spinal cord injury (SCI) in the USA, not including those who die at the scene of the accident, is approximately 40 cases per million of population. The number of people living with a SCI in the USA in
2008 was estimated to be approximately 259,000 [1]. For the nations of the EU (803,850,858 population estimate in 2009 [2]) there is a lack of data, but some sources estimate that the annual incidence of SCI is approximately 14 cases per million of population [3]. Spinal cord injury results in damage to the upper motor neurons. If there is no damage to the lower motor neurons, the muscles themselves retain their ability to contract and to produce forces and movements. Functional electrical stimulation (FES) is a technology that uses small electrical pulses to artificially activate peripheral nerves, causing muscles to contract, and this is done so as to restore body functions. The devices that deliver electrical stimulation and aim to substitute for the control of body functions that have been impaired by neurological damage are termed 'neuroprostheses' [4].

An increased effort in terms of organizing and performing experimental tests with a neuroprosthesis and dealing with the required ethics approvals is demanded from those testing a neuroprosthesis as well as from the patient himself. Moreover, a disabled person is not permanently available to test a novel control strategy implemented within a neuroprosthesis. Therefore, simulations and equipment that emulate the effects of a neuroprosthesis aiming to rehabilitate disabled people are required [5].

Image processing can be carried out for various purposes (surface analysis, pattern recognition, size determination, etc.). The issues raised in motion analysis are of interest in industrial research, in ergonomics, for the analysis of movement techniques, and for obtaining motion-specific parameters for movements in various sports [6]–[9]. The resulting data (spatial coordinates, velocities and accelerations) can be used for further processing in design, robotics and animation programs. Compared with most other methods of measurement, image analysis has the advantage that it has no direct repercussions (the determination of quantitative dimensions by means of this measuring system has no influence on the behavior of the measured object).

Rigid robot systems are subjects of research in both the robotics and control fields. The reported research has led to a variety of control methods for rigid robot systems [10]. The present paper addresses robotic leg control. High speed and high precision trajectory tracking are frequent requirements for applications of robots. Conventional controllers for robotic leg structures are based on independent control schemes in which each joint is controlled separately ([10], [11]) by a simple servo loop. This classical control scheme is inadequate for precise trajectory tracking. The performance imposed for industrial applications requires the consideration of the complete dynamics of the robotic leg. Furthermore, in real-time applications, ignoring parts of the robot dynamics or errors in the parameters of the robotic leg may cause the inefficiency of this classical control (such as a PD controller). An alternative solution to PD control is the computed torque technique. This classical method is in fact a nonlinear technique that takes account of the dynamic coupling between the robot links. The main disadvantage of this structure is the assumption of an exactly known dynamic model. However, the basic idea of this
method remains important and it is the basis of the neural and adaptive control structures [11]–[15]. When the dynamic model of the system is not known a priori or is not available, a control law is constructed based on an estimated model; this is the basic idea behind adaptive control strategies. Over the last few years several authors ([14], [16]–[19]) have considered the use of neural networks within a control system for robotic arms.

Our work began with motion image analysis (2D and 3D) of the human body motion and interpretation of the obtained data (joint positions, velocities and accelerations), which are presented in Section 2. Then, in Section 3 the robotic legs designed for this application are described. In Section 4 a model based neural control structure is implemented; the artificial neural network is used to generate an auxiliary joint control torque to compensate for the uncertainties in the computed torque based primary robotic leg controller, and a computer simulation is presented. Section 5 presents a neuroprosthesis control test bench which integrates a robotic leg mimicking the movements of a human body assumed to be under the action of a neuroprosthesis.
2 Motion Analysis

The aim of this work is the design and implementation of a recuperative system (neuroprosthetic device) for people with locomotion disabilities [20]. The stages of the evaluation of people with locomotion disabilities are presented in Fig. 1.
Fig. 1 Evaluation of the people with locomotion disabilities.
Kinematic analysis of human body motion can be done using the SIMI Motion software [21]. The first phase in motion analysis is the description of the movements, with an analytical breakdown of the motion system. The next phase is the recording of the motion features and the analysis of the motion. The obtained data are processed, graphically presented and analysed (Fig. 2). The analysis of the leg movements can be done independently of the forces that produce the motion, for position, speed and acceleration, or with the forces taken into consideration.
Fig. 2 Motion analysis.
After the image analysis of the human body, the obtained results have to be interpreted and used for modelling. We obtained the position, speed and acceleration evolutions that can be used in the modelling and control of the robotic leg.
3 Robotic Leg

Another stage in our work was to design and build two robotic legs. The legs are similar, but one was built in Craiova and the other in Iasi. Later, both will be connected in order to create the lower part of a humanoid robot. The aim is to use the results obtained from the analysis of the human body motion images in order to implement human movements on the robotic legs and to test some control algorithms on them. The kinematic chain of the robotic leg comprises five revolute joints: 2 for the hip, 1 for the knee and 2 for the ankle (Fig. 3).
Fig. 3 The kinematic chain of the robotic leg.
Fig. 4 The robotic leg.
We implemented the actuating system of the robotic leg with five OMRON servomotors and harmonic gearboxes and the control system of the robotic leg with five OMRON servo drivers and Trajexia control unit.
4 Neural Control

In this section, a model based neural control structure for the robotic leg is implemented. Various neural control schemes have been studied, proposed and compared. The differences between these schemes lie in the role that the artificial neural network (ANN) plays in the control system and the way it is trained to achieve the desired trajectory tracking performance. The most popular control scheme is one which uses the ANN to generate an auxiliary joint control torque to compensate for the uncertainties in the computed torque based primary robotic leg controller, which is designed based on a nominal robotic leg dynamic model. This is accomplished by implementing the neural controller in either a feedforward or a feedback configuration, and the ANN is trained on-line. Based on the computed torque method, a training signal is derived for the neural controller. Comparison studies based on a planar robotic leg have been made for the neural controller implemented in both feedforward and feedback configurations. A feedback error based neural controller is proposed, in which a feedback error function is minimized; the advantage over the Jacobian based approach is that Jacobian estimation is not required.
Fig. 5 Neural control.
The dynamic equation of an n-link robotic leg is given by ([10]):
T = J(q) q̈ + V(q, q̇) q̇ + G(q) + F(q̇)   (1)

where:
- T is the (n×1) vector of joint torques;
- J(q) is the (n×n) manipulator inertia matrix;
- V(q, q̇) is an (n×n) matrix representing centrifugal and Coriolis effects;
- G(q) is an (n×1) vector representing gravity;
- F(q̇) is an (n×1) vector representing friction forces;
- q, q̇, q̈ are the (n×1) vectors of joint positions, velocities and accelerations.
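For a quick numerical check of (1), the joint torques for given joint positions, velocities and accelerations can be evaluated directly once the matrices are available. The 3-joint values below (identity-like matrices and zero friction) are placeholders chosen only to make the sketch runnable; they are not the identified parameters of the leg.

import numpy as np

def inverse_dynamics(q, dq, ddq, J, V, G, F):
    """Evaluate Eq. (1): T = J(q)*ddq + V(q, dq)*dq + G(q) + F(dq)."""
    return J(q) @ ddq + V(q, dq) @ dq + G(q) + F(dq)

# Placeholder model of a 3-joint planar leg (hip, knee, ankle):
J = lambda q: np.eye(3)                  # inertia matrix (assumed)
V = lambda q, dq: 0.1 * np.eye(3)        # centrifugal/Coriolis terms (assumed)
G = lambda q: np.zeros(3)                # gravity vector (assumed)
F = lambda dq: np.zeros(3)               # friction neglected

q = np.array([0.1, -0.2, 0.05])
dq = np.array([0.0, 0.1, 0.0])
ddq = np.array([0.2, 0.0, -0.1])
print(inverse_dynamics(q, dq, ddq, J, V, G, F))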
In this approach, a feedback error function is minimized and the advantage over the Jacobian based approach is that Jacobian estimation is not required. The inputs to the neural controller (Fig. 5) are the required trajectories q_d(t), q̇_d(t), q̈_d(t). The compensating signals from the ANN, φ_p, φ_v, φ_a, are added to the desired trajectories. The control law is:

T = Ĵ ( q̈_d + φ_a + K_V (ė + φ_v) + K_P (e + φ_p) ) + Ĥ.   (2)

Combining (2) with the dynamic equation of the robotic leg yields:

u = ë + K_V ė + K_P e = Ĵ^{-1} ( J̃ q̈ + H̃ ) − Φ,   (3)

where Φ = φ_a + K_V φ_v + K_P φ_p. Ideally, at u = 0, the ideal value of Φ is:

Φ = Ĵ^{-1} ( J̃ q̈ + H̃ ).   (4)

The error function u is minimized and the objective function is:

J = (1/2) u^T u.   (5)

The gradient of J is:

∂J/∂w = (∂u^T/∂w) u = −(∂Φ^T/∂w) u.   (6)

The backpropagation updating rule for the weights with a momentum term is:

Δw(t) = −η ∂J/∂w + α Δw(t − 1) = η (∂Φ^T/∂w) u + α Δw(t − 1).   (7)

For simulation, the planar robotic leg with three revolute joints is used (only the hip, knee and ankle joints in the same plane are considered). The control objective is to track the desired trajectory given by:

q_{1d} = 0.4 · sin(0.4πt)   for the hip,
q_{2d} = −0.5 · sin(0.5πt)  for the knee,
q_{3d} = 0.2 · sin(0.2πt)   for the ankle.

For the feedback error based neural controller with the backpropagation update rule (7), the tracking errors for q_1 and q_2 are presented in Fig. 6.
Fig. 6 Tracking errors.
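A sketch of how the control law (2) with the ANN compensation terms can be evaluated at one control step is given below. The gain values, the sign convention e = q_d − q and the zero compensation signals used in the example call are our own assumptions, not the tuned values used in the simulation above.

import numpy as np

def neural_computed_torque(q, dq, q_d, dq_d, ddq_d,
                           J_hat, H_hat, Kp, Kv, phi_p, phi_v, phi_a):
    """Control law (2): T = J_hat*(ddq_d + phi_a + Kv*(de + phi_v) + Kp*(e + phi_p)) + H_hat."""
    e = q_d - q                     # joint position tracking error (assumed sign)
    de = dq_d - dq                  # joint velocity tracking error
    inner = ddq_d + phi_a + Kv @ (de + phi_v) + Kp @ (e + phi_p)
    return J_hat @ inner + H_hat

n = 3                               # hip, knee, ankle in the sagittal plane
Kp, Kv = 100.0 * np.eye(n), 20.0 * np.eye(n)    # assumed PD gains
zeros = np.zeros(n)
T = neural_computed_torque(q=zeros, dq=zeros,
                           q_d=np.array([0.4, -0.5, 0.2]), dq_d=zeros, ddq_d=zeros,
                           J_hat=np.eye(n), H_hat=zeros,
                           Kp=Kp, Kv=Kv, phi_p=zeros, phi_v=zeros, phi_a=zeros)
print(T)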
5 Neuroprosthesis Control Test Bench

The designed robotic leg may be integrated in a test bench which aims to test neuroprosthesis control (Fig. 7).
In our Simulink model we have implemented a three segmental model with nine mono- and biarticular muscle groups, as described in [22]. These muscle groups are modelled in the sagittal plane inducing moments about the ankle, knee, and hip joints. All muscle groups except monoarticular hip flexors can be activated in a real experiment by a proper arrangement of surface electrodes. Each modelled muscle group has its own activation and contraction dynamics. The inputs for the model are the stimulator pulse width and frequency. Muscle activation, muscle contraction and body segmental dynamics are the three main components of the implemented model. The forces computed for any of the nine muscle groups that are activated due to an applied electrical stimulus, are input to the body-segmental dynamics. The interaction (horizontal and vertical reaction forces) with a seat is modelled by means of a pair of nonlinear spring-dampers.
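The paper only states that the seat interaction is modelled by a pair of nonlinear spring-dampers; a generic one-directional element of that kind might look as follows (the functional form, the exponent and all constants here are illustrative assumptions, not the values used in the Simulink model).

def seat_reaction_force(penetration, velocity, k=5000.0, c=200.0, n=1.5):
    """Nonlinear spring-damper normal force for a seat contact model (sketch)."""
    if penetration <= 0.0:          # body not in contact with the seat
        return 0.0
    # stiffness term grows nonlinearly with penetration; damping is scaled by
    # penetration so the force vanishes smoothly at the contact boundary
    return k * penetration ** n + c * penetration * velocity

print(seat_reaction_force(0.01, -0.1))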
Fig. 7 Neuroprosthesis control test bench
The upper body effort has to be taken into account in any model that aims to support the testing of FES-based controllers. Within the patient model developed in [23], shoulder forces and the moment representing the patient's voluntary arm support are calculated on the basis of a look-up table, as functions of the deviations of the horizontal and vertical shoulder joint position and trunk inclination from the desired values, and of their velocities. In fact, the shoulder forces and moment model is based on a reference trajectory of the shoulder position and trunk inclination during the sit-to-stand transfer, obtained in an experiment with a single paraplegic patient. In our case the vertical shoulder forces are modelled as a function of the measured knee angles by means of a fuzzy controller.
Fig. 8 Servo-potentiometers housed in neoprene knee cuffs
A number of experimental tests have been conducted in order to determine the knee angles and knee angular velocities which occur during controlled sitting-down or standing-up. These parameters have been monitored with servo-potentiometers housed in neoprene knee cuffs (Fig. 8). The electronic circuit that measures the knee sensors' analogue data and presents it to the PC via the serial port is shown in Fig. 9. A PIC16F876A microcontroller controls the data acquisition and transmission protocol. The instrumented knee cuff has been intensively tested to verify the reliability of the acquired data during standing-up and sitting-down. Several trials of recorded knee angles and knee angular velocities during sitting-down are presented in Figs. 10, 11 and 12.
Fig. 9 The electronic circuit which collects knee sensorial data
Fig. 10 Knee angle versus time
Fig. 11 Knee angular velocity versus time
Fig. 12 Knee angular velocity versus knee angle
The recorded data are to be compared with those obtained from the Matlab&Simulink human body model and provided to the robotic leg, which mimics the movements of a human body assumed to be under the action of a neuroprosthesis.
6 Conclusion and Future Work

As a non-obtrusive technique, image processing is an ideal method for collecting movement parameters. Mathematical techniques enable the calculation of data using the spatial coordinates from at least two planes. Only a few conditions must be met to achieve good results. After the human motion analysis and interpretation, the human body model was implemented in Matlab&Simulink and tested on the test bench. Then a robotic leg was designed and built, and we implemented and tested some control algorithms in order to reproduce human movements on the robotic legs. Classical and neural strategies have been applied to the control of the robotic leg. In the future these control algorithms will be implemented in hardware in a recuperative system (neuroprosthetic device) for people with locomotion disabilities.

Any newly proposed control strategy that aims to support standing in paraplegia has to be embedded within a neuroprosthesis and intensively tested in order to assess its effectiveness. This is not an easy task: ethical commission approval is required for every trial, and a disabled patient is not available to perform trials at any moment; therefore a neuroprosthesis test bench is a good alternative. Modelling the human body within Matlab&Simulink allows us to include effects such as nonlinear, coupled and time-varying muscle responses in the presence of the electrical stimulus, muscle fatigue, spasticity, etc. Intensively testing a neuroprosthesis controller on a test bench reduces the number of experimental trials to be performed on disabled people. As future work we plan to implement a training platform which provides a repository of training material with real clinical case studies using digital imaging and accompanying notes, and an interactive multimedia database system containing full reports on patients receiving recuperative treatment.

Acknowledgments. This work was supported by CNCSIS–UEFISCSU, Romania, project number PNII–IDEI 548/2008.
References
1. Spinal Cord Injury statistics (updated June 2009), Available from the Foundation for Spinal Cord Injury Prevention, Care and Cure, http://www.fscip.org/facts.htm (accessed March 3, 2011)
2. Internet world stats, http://www.internetworldstats.com/europa2.htm (accessed March 3, 2011)
3. Paraplegic and Quadriplegic Forum for Complete and Incomplete Quadriplegics and Paraplegics and wheelchair users Paralyzed with a Spinal Cord Injury, http://www.apparelyzed.com/forums/index.php?s=2f63834381835165c0aca8b76a0acc74&showtopic=2489 (accessed March 3, 2011)
4. Poboroniuc, M., Wood, D.E., Riener, R., Donaldson, N.N.: A New Controller for FESAssisted Sitting Down in Paraplegia. Advances in Electrical and Computer Engineering 10(4), 9–16 (2010) 5. Irimia, D.C., Poboroniuc, M.S., Stefan, C.M.: Voice Controlled Neuroprosthesis System. In: Proceedings of the 12th Mediterranean Conf. on Medical and Biological Engineering and Computing MEDICON 2010, Chalkidiki, Greece, May 27-30. IFMBE Proceedings, vol. 29, p. 426 (2010) 6. Moeslund, T., Granum, E.: A Survey of Computer Vision-Based Human Motion Capture. Computer Vision and Image Understanding 81, 231–268 (2001) 7. Sezan, M.I., Lagendijk, R.L.: Motion Analysis and Image Sequence Processing. Springer, Heidelberg (1993) 8. Aggarwal, J.K., Cai, Q.: Human motion analysis: a review. In: Proc. of Nonrigid and Articulated Motion Workshop, pp. 90–102 (1997) 9. Jun, L., Hogrefe, D., Jianrong, T.: Video image-based intelligent architecture for human motion capture. Graphics, Vision and Image Processing Journal 5, 11–16 (2005) 10. Ivanescu, M.: Industrial robots, pp. 149–186. Universitaria, Craiova (1994) 11. Ortega, R., Spong, M.W.: Adaptive motion control of rigid robots: a tutorial. Automatica 25, 877–888 (1999) 12. Dumbrava, S., Olah, I.: Robustness analysis of computed torque based robot controllers. In: 5-th Symposium on Automatic Control and Computer Science, Iasi, pp. 228–233 (1997) 13. Gupta, M.M., Rao, D.H.: Neuro-Control Systems. IEEE Computer Society Press, Los Alamitos (1994) 14. Ozaki, T., Suzuki, T., Furuhashi, T.: Trajectory control of robotic manipulators using neural networks. IEEE Transaction on Industrial Electronics 38, 641–657 (1991) 15. Liu, Y.G., Li, Y.M.: Dynamics and Model-based Control for a Mobile Modular Manipulator. Robotica 23, 795–797 (2005) 16. Miyamoto, H., Kawato, M., Setoyama, T.: Feedback error learning neural networks for trajectory control of a robotic manipulator. Neural Networks 1, 251–265 (1998) 17. Pham, D.T., Oh, S.J.: Adaptive control of a robot using neural networks. Robotica, 553–561 (2004) 18. Popescu, D.: Neural control of manipulators using a supervisory algorithm. In: A&Q 1998 Int. Conf. on Automation and Quality Control, Cluj-Napoca,pp. A576–A581 (1998) 19. Zalzala, A., Morris, A.: Neural networks for robotic control, pp. 26–63. Prentice-Hall, Englewood Cliffs (1996) 20. Poboroniuc, M., Popescu, C.D., Ignat, B.: Functional Electrical Stimulation. Neuroprostheses control, Politehnium Iasi (2005) 21. SIMI MOTION Manual 22. Poboroniuc, M., Stefan, C., Petrescu, M., Livint, G.: FES-based control of standing in paraplegia by means of an improved knee extension controller. In: 4th International Conference on Electrical and Power Engineering EPE 2006, Bulletin of the Polytechnic Institute of Iasi, tom LII (LIV), Fasc.5A, pp. 517–522 (2006) 23. Riener, R., Fuhr, T.: Patient-driven control of FES-supported standing up: a simulation study. IEEE Trans. Rehabil. Eng. 6, 113–124 (1998)
The Improvement Strategy of Online Shopping Service Based on SIA-NRM Approach
Chia-Li Lin
Abstract. Online shopping through shopping platforms is becoming increasingly popular. The new shopping styles are more diverse and offer customers more convenient choices. Despite disputes arising from misunderstandings caused by differences between the real products and their virtual presentation (i.e., the information shown on the website), the number of users and shopping platforms has increased continually, and many enterprises that sell through physical channels have started to sell through shopping platforms such as online shopping. Shopping through multiple platforms is therefore becoming a future trend. Understanding customers' attitudes towards multiple-platform shopping helps shopping platform providers not only improve their service quality but also enlarge their market size. This study reveals the major service influencers of the online shopping platform by using the DEMATEL (decision-making trial and evaluation laboratory) technique. The results show that design & service functions (DS), maintenance & transaction security (MS), and searching & recommendation service (SR) are the aspects that mainly influence online shopping platforms. In addition, the study uses satisfaction-importance analysis (SIA) to measure the importance of, and the service gaps in, the platforms, and suggests that online shopping service providers can improve their strategies using the NRM (network relation map). Using this research method, service providers can improve existing functions and plan further utilities for the next generation of shopping platforms. Keywords: Service performance, Improvement strategy, Online shopping, Decision-making trial and evaluation laboratory (DEMATEL), SIA-NRM.
Chia-Li Lin
Department of Resort and Leisure Management, Taiwan Hospitality & Tourism College, No. 268, Chung-Hsing St. Feng-Shan Village, Shou-Feng Township, Hualien County, 974, Taiwan, ROC
e-mail: [email protected]
J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 295–306. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
1 Introduction
E-commerce has changed consumer behavior in recent years. Consumers who were accustomed to shopping in department stores, shopping malls, and retail
stores several years ago now purchase goods and services through online shopping platforms. The trend of e-commerce and Internet services forces retail service providers to change their operation model and sales channels and to adopt an online shopping service. These retail service providers have begun to operate online shopping malls and provide diversified online shopping services. Online shopping has therefore changed customers' shopping habits: (1) Customers can now purchase products and services on an online service platform. They can understand the product information through the electronic catalog and finish the order process using a transaction system. They can then choose a payment method (credit card, cash on delivery, or payment on pick-up) and receive the product via home delivery or at a fixed pick-up point. (2) Customers' searching costs can be reduced by a wide margin because product and price information is available from online shopping platforms. In order to increase customer identification, some service operators limit product service and refund price differences. (3) Customer familiarity gradually influences loyal behavior. If customers are already familiar with a certain shopping platform, their experience with this platform becomes part of their shopping experience. Any transfer to another shopping platform may be inconvenient for customers, except when the original experience is negative, in which case customers move to another service platform that offers better service quality. The remainder of this paper is organized as follows. Section 2 presents the customer needs and the online shopping service system. We establish the development strategy of the online shopping platform service in Section 3. Section 4 presents the SIA-NRM analysis of the online shopping service system. Section 5 presents the conclusion, which uses the empirical results to propose improvement strategies for online shopping platform service providers.
2 The Service Evaluation System of Online Shopping Platform
Service system evaluation is becoming more and more important in the current online shopping environment. To measure the service quality of e-commerce and internet services, two multi-item scale evaluation models, E-S-QUAL and E-RecS-QUAL, were used to measure the service quality of online shopping websites (Parasuraman et al., 2005). The E-S-QUAL evaluation system evaluates four aspects (efficiency, fulfillment, system availability, and privacy) of routine website service quality with twenty-two multi-item scales, and E-RecS-QUAL evaluates three aspects (responsiveness, compensation, and contact) of non-routine website service quality using eleven multi-item scales (Parasuraman et al., 2005). Some researchers suggest that existing e-service quality scales focus on goal-oriented service aspects and ignore hedonic quality aspects; therefore, some aspects are not included in the evaluation models of e-service quality. Hence, an e-service quality evaluation model that integrates utilitarian and hedonic aspects was proposed, with the e-TransQual evaluation model covering all stages of the electronic service delivery process. The e-TransQual model determines five discriminant quality aspects (functionality/design, enjoyment, process, reliability, and responsiveness)
based on exploratory and confirmatory factor analysis. A previous study found that enjoyment plays a key role in relationship duration and repurchase intention and drives customer lifetime value (Bauer et al., 2006). A study of market segmentation for online services noted that online shopping databases providing information on purchasing activity and demographic characteristics help service operators understand customers' consumption attributes (internet usage and service satisfaction). This information also helps service operators establish good customer relations and refine service strategies to match customers' needs. That study proposed a soft clustering method based on a latent mixed-class membership clustering approach and used customers' purchasing data across categories to classify online customers; the soft clustering method can provide better results than hard clustering and better within-segment clustering quality compared to the finite mixture model (Wu and Chou, 2010). In order to understand the factors that influence customers' use of online shopping services, some studies evaluated three aspects (perceived characteristics of the web as a sales channel, online consumer characteristics, and website and product characteristics). The aspect of perceived characteristics of the web as a sales channel includes five criteria (perceived risk of online shopping, relative advantage of online shopping, online shopping experience, service quality and trust), while the aspect of online consumer characteristics also includes five criteria (consumer shopping orientations, consumer demographics, consumer computer/internet experience, consumer innovativeness, and social psychological variables). The aspect of website and product characteristics has two criteria (risk reduction measures and product characteristics).
3 Building the Service Improvement Model Based on SIA-NRM for Online Shopping Service
The analytical process for expanding a firm's marketing imagination capabilities is initiated by collecting, using the Delphi method, the marketing imagination capabilities needed to develop the company as well as the goals to be achieved after enhancing those capabilities. Since the goals derived by the Delphi method may impact each other, the structure of the MCDM problem is derived using DEMATEL. The weights of every goal are based on the structure derived by using the ANP. Finally, the firm's marketing imagination capability expansion process is based on a multiple objective programming approach built on the concept of a minimum spanning tree, introducing the marketing imagination capabilities/competences derived by Delphi and the weights corresponding to each objective derived by ANP in the former stages.
This section introduces the service improvement model based on SIA-NRM for online shopping. First, we define the critical decision problem of the online shopping service; in the second stage, we identify the aspects/criteria that influence the service quality of the online shopping service through a literature review and expert interviews. In the third stage, using SIA analysis, this study indicates that the aspects/criteria that are associated with low satisfaction and high importance
are also linked to low service quality. In the fourth stage, the study determines the relational structure of the online shopping service system and identifies the dominant aspects/criteria of the service system based on NRM analysis. Finally, this study integrates the results of the SIA analysis and the NRM analysis to establish the improvement strategy path and determine an effective service improvement strategy for the online shopping service system. The analytic process thus includes five stages: (1) it clearly defines the critical decision problems of the service system; (2) it establishes the aspects/criteria of the service system; (3) it measures the state of the aspects/criteria based on SIA analysis; (4) it measures the relational structure using the network relation map (NRM); and finally, (5) it integrates the results of the SIA analysis and the NRM analysis to determine the service improvement strategy of the service system. The analytic process uses three analytic techniques (SIA analysis, NRM analysis and SIA-NRM analysis) and five analytic stages, as shown in Fig. 1.
Fig. 1 The analysis process of SIA-NRM
Fig. 2 The analysis map of satisfied and importance (SIA)
Table 1 The analysis table of satisfied and importance (SIA)
Aspects                                    MS      SS       MI      SI       (SS, SI)
Searching & recommendation service (SR)    6.839   -0.687   7.839   -1.177   ▼ (-,-)
Maintenance & transaction security (MS)    7.222    1.349   8.861    1.492   ○ (+,+)
Design & service functions (DS)            7.106    0.734   8.435    0.380   ○ (+,+)
Transaction cost & payment method (CP)     6.896   -0.384   8.146   -0.375   ▼ (-,-)
Reputation & customer relationship (RR)    6.778   -1.011   8.167   -0.321   ▼ (-,-)
Average                                    6.968    0.000   8.290    0.000
Standard deviation                         0.188    1.000   0.383    1.000
Maximum                                    7.222    1.349   8.861    1.492
Minimum                                    6.778   -1.011   7.839   -1.177
Note 1: ○ (+,+) marks criteria with a high degree of satisfaction and a high degree of importance; ● (+,-) marks criteria with a high degree of satisfaction but a low degree of importance; ▼ (-,-) marks criteria with a low degree of satisfaction and a low degree of importance; X (-,+) marks criteria with a low degree of satisfaction but a high degree of importance.
Note 2: MS, SS, MI, and SI stand for satisfaction value, standardized satisfaction value, importance value, and standardized importance value, respectively.
4 The Service Improvement Model Based on SIA-NRM for Online Shopping Service
4.1 The SIA Analysis (Satisfaction and Importance Analysis)
The degree of importance and satisfaction of each criterion is analyzed, and the surveyed data are normalized onto equal measuring scales. According to the results of the surveyed data, we divided the criteria into four categories. The first category is a high degree of satisfaction with a high degree of importance, marked with the symbol ○ (+,+). The second category is a high degree of satisfaction with a low degree of importance, marked with the symbol ● (+,-). The third category is a low degree of satisfaction with a low degree of importance, marked with the symbol ▼ (-,-). The fourth category is a low degree of satisfaction with a high degree of importance, marked with the symbol X (-,+). In this study, the SIA (satisfied importance analysis) proceeds as follows. The first step is to improve those aspects (i.e., SR, CP and RR) falling into the third category [▼ (-,-)]. The fourth-category criteria [X (-,+)] are key factors that affect the overall satisfaction with the online shopping service platform. For the third-category criteria [▼ (-,-)], a higher degree of importance would affect the overall satisfaction with the online shopping service platform in the short run, as shown in Fig. 2 and Table 1.
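For illustration, the quadrant classification just described can be reproduced with a short script. The aspect names and the raw satisfaction/importance means below are taken from Table 1; standardizing with the sample standard deviation is an assumption of this sketch that happens to reproduce the SS and SI columns.

```python
import statistics

# Raw mean satisfaction (MS) and importance (MI) scores per aspect (Table 1).
aspects = {
    "SR": (6.839, 7.839),
    "MS": (7.222, 8.861),
    "DS": (7.106, 8.435),
    "CP": (6.896, 8.146),
    "RR": (6.778, 8.167),
}

def standardize(values):
    """Z-score using the sample standard deviation."""
    mean, std = statistics.mean(values), statistics.stdev(values)
    return [(v - mean) / std for v in values]

names = list(aspects)
ss = standardize([aspects[a][0] for a in names])  # standardized satisfaction
si = standardize([aspects[a][1] for a in names])  # standardized importance

def category(s, i):
    if s >= 0 and i >= 0: return "○ (+,+)"   # high satisfaction, high importance
    if s >= 0 and i < 0:  return "● (+,-)"   # high satisfaction, low importance
    if s < 0 and i < 0:   return "▼ (-,-)"   # low satisfaction, low importance
    return "X (-,+)"                          # low satisfaction, high importance

for a, s, i in zip(names, ss, si):
    print(f"{a}: SS={s:+.3f}, SI={i:+.3f}, {category(s, i)}")
```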
4.2 The NRM Analysis Based on the DEMATEL Technique
DEMATEL is used to construct the structure of the network relation map (NRM) of the shopping platform. When users make decisions about using shopping platforms, there are many criteria they may consider. The most common problem they face is that those criteria have impacts on one another. Therefore, before making improvements on criteria, it is necessary to know the basic criteria
and then make effective improvements to enhance overall satisfaction. When a decision-maker needs to improve many criteria, the best way to handle this is to determine the criteria which impact the others most and improve them. DEMATEL has been widely adopted for such complicated problems. In its early stages it was used for the user interface of monitoring systems (Hori and Shimizu, 1999) and for failure prioritization in system failure analysis (Seyed-Hosseini et al., 2006). In recent years, DEMATEL has drawn a lot of attention in the decision and management domains. Some recent studies considered DEMATEL techniques for solving complex problems, such as developing global managers' competencies (Wu and Lee, 2007) and the evaluation system of vehicle telematics (Lin et al., 2010).
(1) Calculation of the original average matrix
Respondents were asked to indicate the perceived influence of each aspect on the other aspects on a scale ranging from 0 to 4, with "0" indicating no influence and "4" indicating extremely strong influence between aspects/criteria. Scores of "1", "2", and "3" indicate "low influence", "medium influence", and "high influence", respectively. As Table 2 shows, the influence of "searching & recommendation service (SR)" on "design & service functions (DS)" is 2.889, which indicates a "medium influence". On the other hand, the influence of "design & service functions (DS)" on "maintenance & transaction security (MS)" is 3.083, which indicates "high influence".

Table 2 Original average matrix (A)
Aspects                                    SR       MS       DS       CP       RR       Total
Searching & recommendation service (SR)    0.000    2.667    2.889    2.750    2.722    11.028
Maintenance & transaction security (MS)    2.444    0.000    2.583    2.889    2.806    10.722
Design & service functions (DS)            2.889    3.083    0.000    2.556    2.833    11.361
Transaction cost & payment method (CP)     2.556    2.361    2.556    0.000    2.222     9.694
Reputation & customer relationship (RR)    2.722    2.194    2.556    2.278    0.000     9.750
Total                                      10.611   10.306   10.583   10.472   10.583    -
(2) Calculation of the direct influence matrix
From Table 2, we processed the original average matrix (A) using Equations (1) and (2) and obtained the direct influence matrix (D). As shown in Table 3, the diagonal items of D are all 0 and the sum of each row is at most 1. We then obtained Table 4 by adding up rows and columns. In Table 4, the sum of row and column for "design & service functions (DS)" is 1.932, which makes it the most important influence aspect. On the other hand, the sum of row and column for "transaction cost & payment method (CP)" is 1.775, which makes it the least important influence aspect.

D = sA, \quad s > 0   (1)

where

s = \min\Big[ 1 \big/ \max_{1 \le i \le n} \sum_{j=1}^{n} a_{ij},\; 1 \big/ \max_{1 \le j \le n} \sum_{i=1}^{n} a_{ij} \Big], \quad i, j = 1, 2, \ldots, n   (2)
and \lim_{m \to \infty} D^m = [0]_{n \times n}, where D = [x_{ij}]_{n \times n}, when 0 < \sum_{j=1}^{n} x_{ij} \le 1 or 0 < \sum_{i=1}^{n} x_{ij} \le 1, and at least one \sum_{j=1}^{n} x_{ij} or \sum_{i=1}^{n} x_{ij} equals one, but not all. Therefore, we can guarantee \lim_{m \to \infty} D^m = [0]_{n \times n}.
Table 3 The direct influence matrix D
Aspects                                    SR      MS      DS      CP      RR      Total
Searching & recommendation service (SR)    0.000   0.235   0.254   0.242   0.240   0.971
Maintenance & transaction security (MS)    0.215   0.000   0.227   0.254   0.247   0.944
Design & service functions (DS)            0.254   0.271   0.000   0.225   0.249   1.000
Transaction cost & payment method (CP)     0.225   0.208   0.225   0.000   0.196   0.853
Reputation & customer relationship (RR)    0.240   0.193   0.225   0.200   0.000   0.858
Total                                      0.934   0.907   0.932   0.922   0.932   -
Table 4 The degree of direct influence
Aspects                                    Sum of row   Sum of column   Sum of row and column   Importance of influence
Searching & recommendation service (SR)    0.971        0.934           1.905                   2
Maintenance & transaction security (MS)    0.944        0.907           1.851                   3
Design & service functions (DS)            1.000        0.932           1.932                   1
Transaction cost & payment method (CP)     0.853        0.922           1.775                   5
Reputation & customer relationship (RR)    0.858        0.932           1.790                   4
(3) Calculation of the indirect influence matrix
The indirect influence matrix can be derived from Equation (3), as shown in Table 5.

ID = \sum_{i=2}^{\infty} D^i = D^2 (I - D)^{-1}   (3)
Table 5 The indirect influence matrix
Aspects                                    SR       MS       DS       CP       RR       Total
Searching & recommendation service (SR)    2.433    2.331    2.375    2.356    2.381    11.876
Maintenance & transaction security (MS)    2.331    2.311    2.320    2.288    2.314    11.564
Design & service functions (DS)            2.440    2.375    2.484    2.420    2.437    12.156
Transaction cost & payment method (CP)     2.153    2.107    2.147    2.168    2.158    10.732
Reputation & customer relationship (RR)    2.157    2.120    2.156    2.143    2.199    10.774
Total                                      11.513   11.244   11.481   11.375   11.489    -
(4) Calculation of the full influence matrix
The full influence matrix T can be derived from Equation (4) or (5). Table 6 shows the full influence matrix T, whose elements are indicated in Equation (6). The sum vector of row values is {d_i} and the sum vector of column values is {r_i}; the sum of row value plus column value, {d_i + r_i}, indicates the full influence in the matrix T. The higher the value of {d_i + r_i}, the stronger the correlation of the dimension or criterion. The sum of row value minus column value, {d_i - r_i}, indicates the net influence relationship. If d_i - r_i > 0, the degree of influence exerted on others is stronger than the influence received from others. As shown in Table 7, design & service functions (DS) has the highest degree of full influence (d_3 + r_3 = 25.567) as well as the highest degree of net influence (d_3 - r_3 = 0.743). The order of the other net influences is as follows: searching & recommendation service (SR) (d_1 - r_1 = 0.400), maintenance & transaction security (MS) (d_2 - r_2 = 0.357), transaction cost & payment method (CP) (d_4 - r_4 = -0.711), and finally, reputation & customer relationship (RR) (d_5 - r_5 = -0.789).

Table 6 The full influence matrix (T)
Aspects                                    SR       MS       DS       CP       RR       Total
Searching & recommendation service (SR)    2.433    2.566    2.629    2.598    2.621    12.847
Maintenance & transaction security (MS)    2.546    2.311    2.547    2.542    2.561    12.507
Design & service functions (DS)            2.694    2.646    2.484    2.645    2.686    13.155
Transaction cost & payment method (CP)     2.378    2.315    2.372    2.168    2.354    11.586
Reputation & customer relationship (RR)    2.397    2.313    2.381    2.343    2.199    11.632
Total                                      12.447   12.151   12.412   12.296   12.421    -
Table 7 The degree of full influence
Aspects                                    {d}      {r}      {d + r}   {d - r}
Searching & recommendation service (SR)    12.847   12.447   25.294    0.400
Maintenance & transaction security (MS)    12.507   12.151   24.658    0.357
Design & service functions (DS)            13.155   12.412   25.567    0.743
Transaction cost & payment method (CP)     11.586   12.296   23.882    -0.711
Reputation & customer relationship (RR)    11.632   12.421   24.053    -0.789

T = D + ID = \sum_{i=1}^{\infty} D^i   (4)

T = \sum_{i=1}^{\infty} D^i = D (I - D)^{-1}   (5)

T = [t_{ij}], \quad i, j \in \{1, 2, \ldots, n\}   (6)

d = d_{n \times 1} = \Big[ \sum_{j=1}^{n} t_{ij} \Big]_{n \times 1} = (d_1, \ldots, d_i, \ldots, d_n)   (7)

r = r_{n \times 1} = \Big[ \sum_{i=1}^{n} t_{ij} \Big]'_{1 \times n} = (r_1, \ldots, r_j, \ldots, r_n)   (8)
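The DEMATEL steps (1)-(4) above reduce to a few matrix operations. The sketch below is only an illustration of Equations (1)-(8), written against the aspect-level data of Table 2; numpy is assumed, and because the published figures are rounded, small differences from the tables may remain.

```python
import numpy as np

# Original average matrix A (Table 2); row i = influence of aspect i on the others.
labels = ["SR", "MS", "DS", "CP", "RR"]
A = np.array([
    [0.000, 2.667, 2.889, 2.750, 2.722],
    [2.444, 0.000, 2.583, 2.889, 2.806],
    [2.889, 3.083, 0.000, 2.556, 2.833],
    [2.556, 2.361, 2.556, 0.000, 2.222],
    [2.722, 2.194, 2.556, 2.278, 0.000],
])

# Equations (1)-(2): scale by the larger of the maximum row and column sums.
s = min(1.0 / A.sum(axis=1).max(), 1.0 / A.sum(axis=0).max())
D = s * A                                   # direct influence matrix (Table 3)

I = np.eye(len(labels))
ID = D @ D @ np.linalg.inv(I - D)           # indirect influence, eq. (3) (Table 5)
T = D @ np.linalg.inv(I - D)                # full influence, eqs. (4)-(6) (Table 6)

d = T.sum(axis=1)                           # row sums, eq. (7)
r = T.sum(axis=0)                           # column sums, eq. (8)
for name, di, ri in zip(labels, d, r):
    print(f"{name}: d+r={di + ri:.3f}, d-r={di - ri:+.3f}")
```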
(5) The analysis of the NRM (network relation map)
Experts were invited to discuss the relationships and influence levels of criteria under the same aspects/criteria and to score the relationship and influence among criteria based on the DEMATEL technique. Aspects/criteria are divided into different types so that the experts could answer the questionnaire in areas/fields with which they were familiar. The net full influence matrix, C^{net}, is determined by Equation (9).

C^{net} = [t_{ij} - t_{ji}], \quad i, j \in \{1, 2, \ldots, n\}   (9)

The diagonal items of the matrix are all 0. In other words, the matrix consists of a strictly upper triangular matrix and a strictly lower triangular matrix. Moreover, the values of the strictly upper and strictly lower triangular parts are the same while their signs are opposite. This property helps us: we only have to consider one of the strictly triangular matrices. Table 6 shows the full influence matrix, and Equation (9) produces the net full influence matrix shown in Table 8. Using the values of (d + r) and (d - r) in Table 7 as the X and Y values, respectively, the network relation map (NRM) can be drawn as in Fig. 3. Fig. 3 shows that the DS (design & service functions) aspect is the major dimension of net influence while the RR (reputation & customer relationship) aspect is the major dimension being influenced. The DS (design & service functions) aspect is the dimension with the highest full influence while CP (transaction cost & payment method) is the one with the smallest full influence.
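As a sketch of Equation (9): the net influence matrix is simply the full influence matrix minus its transpose, and only one triangle needs to be reported. The matrix T below is assumed to be the numpy array computed in the previous sketch.

```python
import numpy as np

def net_influence(T: np.ndarray) -> np.ndarray:
    """Net full influence, eq. (9): C_net[i, j] = t_ij - t_ji."""
    C_net = T - T.T
    # C_net is antisymmetric, so the strictly lower triangle carries all the
    # information; Table 8 reports exactly this triangle.
    return np.tril(C_net, k=-1)

# Example: net_influence(T)[1, 0] should be close to -0.020 (MS row, SR column in Table 8).
```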
4.3 The Analysis of the SIA-NRM Approach
The analysis process of SIA-NRM includes two stages: the first stage is the satisfaction-importance analysis (SIA) and the second stage is the analysis of the network relation map (NRM). The SIA analysis determines the satisfaction and importance degree of the aspects/criteria of online shopping service platforms; it helps decision makers find the criteria that should be improved, namely those whose standardized satisfaction degree is less than the average satisfaction degree. The three improvement strategies are listed in Table 9. Improvement strategy A (which requires no further improvement) can be applied to the aspects of MS (maintenance & transaction security) and DS (design & service functions) (SS > 0). Improvement strategy B (which requires direct improvements) should be applied to SR (searching & recommendation service). Improvement strategy C (which requires indirect improvements) can be applied to the aspects of CP (transaction cost & payment method) and RR (reputation & customer relationship). The SIA-NRM approach determines the criteria that should be improved based on the SIA analysis and the improvement path using the network relation map (NRM). As shown in Fig. 4, the aspects SR, CP, and RR should be improved. DS is the aspect with the major net influence. Therefore, we
can improve the SR aspect through the DS aspect, and improve the CP aspect through the DS, MS, and SR aspects. The RR (reputation & customer relationship) aspect is the major dimension being influenced; therefore, the RR aspect can be improved once the other four aspects (DS, MS, SR, and CP) are improved, as shown in Table 9 and Fig. 4.
Fig. 3 The NRM of online shopping service (d + r / d - r)

Table 8 The net influence matrix of online shopping service
Aspect                                     SR       MS       DS       CP       RR
Searching & recommendation service (SR)    -
Maintenance & transaction security (MS)    -0.020   -
Design & service functions (DS)            0.065    0.099    -
Transaction cost & payment method (CP)     -0.221   -0.228   -0.273   -
Reputation & customer relationship (RR)    -0.225   -0.248   -0.305   -0.011   -
Table 9 The improvement strategy of aspects for online shopping service
                                           SIA                            NRM
Aspects                                    SS       SI       (SS, SI)     d+r      d-r      (R, D)     Strategies
Searching & recommendation service (SR)    -0.687   -1.177   ▼ (-,-)      25.294   0.400    D (+,+)    B
Maintenance & transaction security (MS)    1.349    1.492    ○ (+,+)      24.658   0.357    D (+,+)    A
Design & service functions (DS)            0.734    0.380    ○ (+,+)      25.567   0.743    D (+,+)    A
Transaction cost & payment method (CP)     -0.384   -0.375   ▼ (-,-)      23.882   -0.711   ID (+,-)   C
Reputation & customer relationship (RR)    -1.011   -0.321   ▼ (-,-)      24.053   -0.789   ID (+,-)   C
Notes: The improvement strategies include three types: Improvement strategy A (which requires no further improvement), Improvement strategy B (which requires direct improvements) and improvement strategy C (which requires indirect improvements).
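The decision rule behind Table 9 can be written down directly: aspects with above-average satisfaction need no further improvement (strategy A), and aspects with below-average satisfaction are improved directly if they are net dispatchers (d - r > 0, strategy B) and indirectly otherwise (strategy C). The helper below is a sketch that assumes the SS and d - r values computed in the earlier snippets.

```python
def improvement_strategy(ss: float, d_minus_r: float) -> str:
    """Return the Table 9 strategy label for one aspect."""
    if ss >= 0:
        return "A"                           # no further improvement required
    return "B" if d_minus_r > 0 else "C"     # direct vs. indirect improvement

# Example values from Tables 1 and 7:
# SR: improvement_strategy(-0.687, 0.400)  -> "B"
# CP: improvement_strategy(-0.384, -0.711) -> "C"
```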
Fig. 4 The SIA-NRM analysis for online shopping service
5 Conclusions
Considering that the volume and frequency of internet transactions increase continually, a complete certification process becomes increasingly important for online shoppers. A complete certification process can reduce the incidence of accounts being hacked by unknown parties and assures users that their personal information and online shopping data will not be jeopardized. Additionally, some online shopping service providers cannot satisfy customers' consulting and appeal service needs. Some customers who purchase products online find that the quality and utility of the products do not meet their expectations. They usually hope that the online shopping service providers will listen to their complaints and help them exchange or return products without additional fees. However, customers cannot always exchange a product or receive a reimbursement, and when they can, the reimbursement process is often quite inconvenient. Customers then complain, and their satisfaction with and loyalty to the provider decrease continually. Consequently, customers adjust their decisions and move to an online shopping provider with better service. Therefore, online shopping service providers need to consider whether their services can satisfy customers' needs and reduce the number of complaints.
References 1. Bauer, H.H., Falk, T., Hammerschmidt, M.: eTransQual: A transaction process-based approach for capturing service quality in online shopping. Journal of Business Research 59(7), 866–875 (2006) 2. Hori, S., Shimizu, Y.: Designing methods of human interface for supervisory control systems. Control Engineering Practice 7(11), 1413–1419 (1999) 3. Lin, C.L., Hsieh, M.S., Tzeng, G.H.: Evaluating vehicle telematics system by using a novel MCDM techniques with dependence and feedback. Expert Systems with Applications 37(10), 6723–6736 (2010)
4. Parasuraman, A., Valarie, A.Z., Arvind, M.: E-S-QUAL: A multiple-Item scale for assessing electronic service quality. Journal of Service Research 7(3), 213–233 (2005) 5. Seyed-Hosseini, S.M., Safaei, N., Asgharpour, M.J.: Reprioritization of failures in a system failure mode and effects analysis by decision making trial and evaluation laboratory technique. Reliability Engineering & System Safety 91(8), 872–881 (2006) 6. Wu, R.S., Chou, P.H.: Customer segmentation of multiple category data in e-commerce using a soft-clustering approach. Electronic Commerce Research and Applications (2010) (in Press, Corrected Proof) 7. Wu, W.W., Lee, Y.T.: Developing global managers’ competencies using the fuzzy DEMATEL method. Expert Systems with Applications 32(2), 499–507 (2007)
The Optimization Decisions of the Decentralized Supply Chain under the Additive Demand Peng Ma and Haiyan Wang**
Abstract. The paper considers a two-stage decentralized supply chain which consists of a supplier and a retailer. Some optimization decisions are studied when the market demand is linear and additive. The optimization decisions include who should decide the production quantity, the supplier or the retailer, and what the production-pricing decision should be. The retailer's share of channel cost is defined as the ratio of the retailer's unit cost to the supply chain's total unit cost. The relationship between the supply chain's total profit and the retailer's share of channel cost is established. The production-pricing decisions are obtained for the cases in which the supplier and the retailer decide the production quantity, respectively. The results show that the total profit of the decentralized supply chain is always less than that of the centralized supply chain, independent of the retailer's share of channel cost. If the retailer's share of channel cost is between and , the supplier should decide the production quantity. If the retailer's share of channel cost is between and , the retailer should decide the production quantity. Keywords: decentralized supply chain, additive demand, production-pricing decision, optimal channel profit, consignment contract.
1 Introduction
There are two kinds of optimization decisions in the decentralized supply chain. One is who should decide the production quantity, the supplier or retailer? The other is to make the production-pricing decisions which maximize their individual profit.
Peng Ma · Haiyan Wang Institute of Systems Engineering, School of Economics and Management, Southeast University, Nanjing, Jiangsu 210096, P.R. China e-mail:
[email protected],
[email protected] * Corresponding author. J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 307–317. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
There is a modification of Vendor Managed Inventory (VMI) in which the supplier makes inventory decisions and owns the goods until they are sold, which is called Consignment Vendor Managed Inventory (CVMI) (Lee and Chu 2005; Ru and Wang 2010). CVMI is used by many retailers, such as Wal-Mart, Ahold USA, Target, and Meijer Stores. There is also a modification of Retailer Managed Inventory (RMI) in which the retailer decides how much inventory to hold in a period, which is labeled Consignment Retailer Managed Inventory (CRMI) (Ru and Wang 2010). Ru and Wang (2010) consider a two-stage supply chain with a supplier and a retailer when market demand for the product has a multiplicative functional form. They find that the supplier and retailer have equal profit under both the CVMI and CRMI programs. They also find that it is beneficial for both the supplier and the retailer if the supplier makes the inventory decision in the channel. Production-pricing decisions in supply chains have been extensively studied in the literature. Petruzzi and Dada (1999) provide a review and extend such problems to the single-period newsvendor setting. Several researchers consider this problem in a multi-period setting (Federgruen and Heching 1999; Chen and Simchi-Levi 2004a, 2004b). In the extended newsvendor setting of decentralized decision-making, some researchers consider a setting where a supplier wholesales a product to a retailer who makes pricing-procurement decisions (Emmons and Gilbert 1998; Granot and Yin 2005; Song et al. 2006). These papers explore how the supplier can improve channel performance by using an inventory-return policy for overstocked items. Granot and Yin (2007) study the price-dependent newsvendor model in which a manufacturer sells a product to an independent retailer facing uncertain demand and the retail price is endogenously determined by the retailer. Granot and Yin (2008) analyze the effect of price and order postponement in a decentralized newsvendor model with multiplicative and price-dependent demand. Huang and Huang (2010) study the price coordination problem in a three-stage supply chain composed of a supplier, a manufacturer and a retailer. Other recent papers related to the joint production-pricing decisions of decentralized supply chains are as follows. Bernstein and Federgruen (2005) investigate the equilibrium behavior of decentralized supply chains with competing retailers under demand uncertainty. Wang et al. (2004) consider a supply chain structure where a retailer offers a consignment-sales contract with revenue sharing to a supplier, who then makes production-pricing decisions. Wang (2006) considers n suppliers each producing a different product and selling it through a common retailer to the market. Ray et al. (2005) study a serial two-stage supply chain selling a procure-to-stock product in a price-sensitive market. Zhao and Atkins (2008) extend the theory of n competitive newsvendors to the case where competition occurs simultaneously in price and inventory. The purpose of this paper is to investigate the optimization decisions of the decentralized supply chain under additive demand, and especially to shed light on the problem of who should decide the production quantity in the supply chain. A game-theoretic model is built to capture the interactions between the supplier and
retailer when different members manage the supply chain inventory. Our key contribution in this paper is to investigate the effect of the retailer's share of channel cost, defined as the ratio of the retailer's unit cost to the supply chain's total unit cost, on the supply chain's total profit. We also obtain some results that differ from Ru and Wang (2010). The paper proceeds as follows. Section 2 details the model assumptions. Section 3 derives the centralized decisions. Section 4 considers the decentralized supply chain where the supplier decides the production quantity. Section 5 considers the decentralized supply chain where the retailer decides the production quantity. Section 6 gives the results and their managerial meaning. Section 7 concludes the paper.
2 Model Assumptions
We assume that the supplier's unit production cost is c_s and the retailer's unit holding and selling cost is c_r. We define c = c_s + c_r as the total unit cost for the channel, and \alpha = c_r / c as the retailer's share of channel cost. Suppose the market demand D(p) is linear and additive, i.e.

D(p) = a - bp + \varepsilon   (1)

where p is the retail price, and \varepsilon is a random variable supported on [A, B] with B > A \ge 0. Let F(\cdot) and f(\cdot) be its cumulative distribution function and probability density function, respectively. Further, we assume that a - bc + A > 0, F(A) = 0, and F(B) = 1. We consider that the supplier produces Q units of the product and delivers them to the retailer, and the retailer sells them to the market at the retail price p. We use a contractual arrangement to specify who makes which decisions. Two different contractual arrangements are considered, namely production quantity decided by the supplier (PQDS) and production quantity decided by the retailer (PQDR). Under PQDS, the decisions are made as follows: first, the supplier chooses the consignment price w and the production quantity Q and delivers the units to the retailer; second, the retailer decides the retail price p. On the contrary, under PQDR, the decisions are made as follows: first, the supplier specifies the consignment price w; second, the retailer decides the production quantity Q for the supplier to deliver and the retail price p.
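As a quick illustration of the additive demand model (1), the snippet below estimates the expected sales E[min(D(p), Q)] by simulation for a uniformly distributed \varepsilon; all parameter values are invented for the example and are not taken from the paper.

```python
import random

# Hypothetical parameters: D(p) = a - b*p + eps, eps ~ Uniform[A, B].
a, b = 100.0, 2.0
A, B = 0.0, 40.0

def expected_sales(p: float, Q: float, n: int = 100_000) -> float:
    """Monte Carlo estimate of E[min(a - b*p + eps, Q)]."""
    total = 0.0
    for _ in range(n):
        eps = random.uniform(A, B)
        total += min(a - b * p + eps, Q)
    return total / n

# Example: expected sales when the retail price is p = 30 and the stock is Q = 55.
print(round(expected_sales(30.0, 55.0), 2))
```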
3 Centralized Decision
We first characterize the optimal solution of the centralized supply chain, in which the retail price p and the production quantity Q are simultaneously chosen by a decision-maker. The expected channel profit can be written as

\Pi(p, Q) = p E[\min(D, Q)] - cQ = p E[\min(a - bp + \varepsilon, Q)] - cQ   (2)
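For intuition, the centralized problem (2) can be explored numerically before any analysis. The brute-force sketch below evaluates the expected channel profit on a grid of prices and quantities for a uniformly distributed \varepsilon; the parameter values are invented for the example and numpy is assumed.

```python
import numpy as np

# Hypothetical parameters for D(p) = a - b*p + eps, eps ~ Uniform[A, B], unit cost c.
a, b, c = 100.0, 2.0, 10.0
A, B = 0.0, 40.0
rng = np.random.default_rng(0)
eps = rng.uniform(A, B, 20_000)

def channel_profit(p: float, Q: float) -> float:
    """Monte Carlo estimate of p*E[min(a - b*p + eps, Q)] - c*Q, i.e. objective (2)."""
    return p * np.minimum(a - b * p + eps, Q).mean() - c * Q

prices = np.linspace(c, a / b, 60)        # search p between unit cost and the choke price
quantities = np.linspace(0.0, a + B, 60)
best = max((channel_profit(p, Q), p, Q) for p in prices for Q in quantities)
print(f"profit={best[0]:.1f} at p={best[1]:.1f}, Q={best[2]:.1f}")
```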
Following Petruzzi and Dada (1999), we regard z = Q - (a - bp) as a stocking factor. The problem of choosing the retail price p and the production quantity Q is equivalent to choosing the retail price p and the stocking factor z. Substituting Q = z + a - bp into (2), we can rewrite the above profit function as

\Pi(p, z) = (p - c)(a - bp) + p[z - \Lambda(z)] - cz   (3)

where

\Lambda(z) = \int_A^z (z - x) f(x)\,dx   (4)

Theorem 1. For any given stocking factor z \in [A, B], the unique optimal retail price p(z) is given by

p(z) = \frac{a + bc + z - \Lambda(z)}{2b}   (5)

and, if the probability distribution function F(\cdot) satisfies the property of increasing failure rate (IFR), the optimal stocking factor z^0 that maximizes \Pi(p(z), z) is uniquely determined by

F(z^0) = \frac{p(z^0) - c}{p(z^0)}   (6)
and, if the probability distribution function . satisfies the property of increasthat maximizes ing failure fate (IFR), the optimal stocking factor , is uniquely determined by (6) Proof. First, for any given z, A ,
Π
Since
Π
a
,
z
2bp
bc
,
Π
0,
B, we have z
,
Π
Λ z ,
2b Λ
0 implies that p z
0
, which is (5),
and p z is the unique maximum of Π p, z . Next we characterize p z , which maximizes Π p z , z . By the chain rule, we have Π
,
Π
Π
,
,
,
Π
.
Λ
1
Λ
, which is (6). z
F z
c
g z Since
0 implies that F z
Λ
always exists in the support interval A, B of F . , because g z is continuous, and g A 0, g B c 0. To verify the uniqueness of z , we have g′ z g" z
1
F z
3h z 1
a F z
g z h z
bc
z
a
bc
2h z 1
Λ z h z z F z
Λ z
h z a
bc
h z z
Λ z h z where h z
is define as the failure rate of the demand distribution.
Now, if h′ z 0, then g" z 0 at g′ z 0, implying that g z itself is a unimodal function. In conjunction with g A 0 and g B 0, it guarantees the uniqueness of z and A z B. This completes the proof of Theorem 1. □ Increasing failure rate, i.e., h z
being increasing in z, as required by
Theorem 1, is a relatively weak condition satisfied by most commonly used
The Optimization Decisions of the Decentralized Supply Chain
311
probability distributions like Normal, Uniform and exponential, etc. Substituting (5) and (6) into (3), we derive the optimal channel profit as Π
4
z
Λ z
cz
(7)
Decentralized Channel Decisions under PQDS
Under PQDS, the supplier chooses the consignment price w and stocking factor z at the first stage, and then the retailer chooses the retail price p at the second stage. Using a backward-induction procedure similar as Ru and Wang (2010), the retailer’s expected profit can be written as Π
p w, z
,
p p
w E min D, Q cαQ w E min a bp ε, Q p w cα a bp p
cαQ w z Λ z
cαz
Theorem 2. For any given stocking factor and consignment price unique optimal retail price , is given by
(8) 0, the
,
(9)
Proof. We take the partial derivative of (8) with respect to p as Π Π
So
Π
Since Π
,
,
, ,
,
,
a
bw
0 implies that p ,
2b
bcα
z
Λ z α
w, z
0 , so p
2bp
w, z
Λ
, which is (9).
is the unique maximizer of
p w, Q . This completes the proof of Theorem 2.
□
The supplier’s profit function is given by Π
w, Q
,
wE min D, Q
After substituting D we have
a
bp
ε, Q
z
w, z
c
cα a
bp
wz
Π
,
w w
c
cα a bw Λ z wz
bcα
c 1 a
bp Λ z
z Λ z c 1 α z
α Q
(10)
and (9) into (10), c 1
α z (11)
Theorem 3. For any given stocking factor , the supplier’s unique optimal consignment price z is given by z
(12)
and, if the demand distribution satisfies increasing failure rate (IFR), the optimal stocking factor that maximizes , is uniquely determined by ,
312
P. Ma and H. Wang
(13) □
Proof. It is similar to the proof of Theorem 1.
Substituting (12) into (9), then the retail price chosen by the retailer in equilibrium is given as p
z
(14)
Substituting (12), (14) into (8) and (11) respectively, then we obtain their optimal expected profits of the retailer and supplier respectively as Π
z
,
Π , where the optimal stocking factor z profit of supply chain is α
Π
Π
Π
,
Λ z
cαz
(15)
z Λ z c 1 α z (16) is determined by (13). So the total channel z
,
Λ z
cz (17)
In conclusion, for the decentralized channel under PQDS, the supplier chooses the consignment price and stocking factor according to (12) and (13), respectively. Then the retailer chooses the retail price as given by (14).
5
Decentralized Channel Decisions under PQDR
Under PQDR, the supplier chooses the consignment price w at the first stage, and then the retailer chooses the stocking factor z and retail price p to maximize his own expected profit which is calculated as Π
,
p, z w p
p w E min D, Q cαQ w E min a bp ε, Q cαQ p w cα a bp p w z
Λ z
cαz
(18)
Theorem 4. For any given stocking factor z, the unique optimal retail price is given by (19) and, if the probability distribution function ing fate (IFR), the optimal stocking factor is uniquely determined by
. satisfies the property of increasthat maximizes , , (20) □
Proof. It is similar to the proof of Theorem 1. The supplier’s profit function is given by Π
,
w
wE min D, Q
c 1
α Q
(21)
The Optimization Decisions of the Decentralized Supply Chain
ε, Q Substituting D a bp we have Π , w w c cα a bp
z
a
wz
313
bp , and (19), (20) into (21), Λ z
c 1
w c cα a bcα wb z Λ z c 1 α z Theorem 5. The supplier’s unique optimal consignment price
α z w
c 1
α (22)
is given by (23)
Proof. Because z chosen by the retailer at the second stage does not depend on the consignment price w set by the supplier at the first stage, the derivative of Π , w with respect to w can be simplified as , ,
So,
,
,
b
0 implies that w
0
⇔ which is (23). □
This completes the proof of Theorem 5. Substituting (23) into (19) and (20), then we have p
z
(24)
F z
(25)
Substituting (23) and (24) into (18), we can derive the retailer’s optimal expected profit as Π
z
,
Λ z
cαz
(26)
where the optimal stocking factor z is determined by (25). Substituting (23) and (24) into (22), we get the supplier’s optimal expected profit as Π
z
,
Λ z
c 1
α z
(27)
Therefore the total channel profit is Π
α
Π
,
Π
,
z
Λ z
cz
(28)
Note that although the expression of the optimal retail price here in (24) is the same as that of (14) under PQDS, but the size relationship between the equilibrium stocking factor z under PQDR and the equilibrium stocking factor z under PQDS is dependent on parameter α.
314
P. Ma and H. Wang
In conclusion,o for the decentralized channel under PQDR, the supplier chooses the consignment price according to (23). Then the retailer chooses the retail price and stocking factor as given by (24) and (25), respectively.
6
Results and Managerial Meaning
The expected profit of the channel depends on the choice for the retail price p and stocking factor z. In this section, we focus on the difference of total profit between the centralized channel and the decentralized channel under PQDS and PQDR, respectively. We also derive the difference of total profit between the decentralized channel under PQDS and PQDR. Proposition 1. Under PQDS, if 0
is increasing with
;
if
;
, and 1.
if
Proof . From (6) and (13), we can deduce that F z
F z It is obvious that F z ing with α . If α ; z 0 Proposition 1.
is increasing with α, so z
F z , then F z
α
z
F z
0, z
z is increasα
z if
1 . This completes the proof of □
if
is increasing with
Proposition 2. Under PQDS,
α
z , so z
, and
.
Proof. From (5) and (14), we derive that p
α
p
Combining (6) and (13), we can deduce that F z is increasing with α, then p so p
α
α
p
Proposition 3. Under PQDR,
p is increasing with α. When α
,z
. This completes the proof of Proposition 2. is decreasing with
if 0 ; if ; Proof. It is similar to the proof of Proposition 1. Proposition 4. Under PQDR,
is increasing with α, thus z □
, and 1.
if
is decreasing with
z ,
□ , and .
Proof. It is similar to the proof of Proposition 2.
□
The Optimization Decisions of the Decentralized Supply Chain
315
Proposition 1 and 3 indicate that: When the supplier’s unit production cost c is more than the retailer’s unit holding and selling cost c , the stocking factor under PQDS is less than that of centralized supply chain, while the stocking factor under centralized supply chain is less than that of the decentralized supply chain under PQDR, vice versa. When the supplier's unit production cost c is equal to the retailer's unit holding and selling cost c , the stocking factor of the decentralized supply chain under PQDS and PQDR is equal to that of the centralized supply chain. Proposition 2 and 4 indicate that: When the supplier's unit production cost c is equal to the retailer's unit holding and selling cost c , the retail price of the decentralized supply chain under PQDS is equal to that under PQDR. Therefore, the production quantity Q of the decentralized supply chain under PQDS is equal to that under PQDR. Proposition 5. Comparing centralized and decentralized supply chain, we have the following results: (i) Under PQDS, with
1. For all 0
if
if 0
is increasing with
and decreasing
1, we have 0.
(ii) Under PQDR, with
1. For all 0
if
if 0
is increasing with
and decreasing
1, we have 0.
Proof. (i) Define z
H z z
G z
Λ z Λ z
cz, cz,
and Π G z . Under PQDS, It is then Π α H z , Π α H z similar to the proof of Theorem 1, there is an unique z and A z B, such 0 in A, z and H z 0 in [z , B]. that H z F z , and we can derive that 0 , so Π α First, we make F z is increasing with α if 0
. Next, we make F z
F z , we can de-
1, so Π α is increasing with α if 1. rive that From the proof of Theorem 1, we know that G z derives maximum at z , then G z G z , so for all 0 1, we have Π
Π
α
Π
Π
G z z
(ii) It is similar to the proof of (i).
H z
z
Λ z
0.
H z □
316
P. Ma and H. Wang
Proposition 6. Comparing decentralized supply chain under PQDS and PQDR, we have the following results: (i)
if
;
(ii)
if
.
Proof. Let H z (i) We know that H z osition 5. We make F z z z Π .
z
if
0 in A, z F z
z Λ z cz. and H z 0 in [z , B] from Prop-
F z
, then we derive
, then we can get H z
H z
. So
, namely Π
F z F z , we have , then we can get (ii) Next, Let F z H z H z , namely Π Π . This completes the proof of Proposition 6. □ Proposition 5 and 6 indicate that: The total profit of decentralized supply chain is always less than that of the centralized supply chain, independent of the retailer’s share of channel cost. The total profit of the decentralized supply chain not only depends on the retailer’s share of channel cost, but also depends on the decision strategy. When the supplier decides the production quantity, the total profit of the . When the retailer decides the production supply chain reaches maximum at α quantity, the total profit of the supply chain reaches maximum at α . The total profit of supply chain under the supplier deciding the production quantity is more . The total than that of the retailer deciding the production quantity if profit of supply chain under the retailer deciding the production quantity is more . than that of the supplier deciding the production quantity if
7
Conclusions
In this paper, two optimization decisions of the decentralized supply chain under the additive demand are considered. The first optimization decision is productionpricing decision. We derive the production-pricing decisions when the supplier and retailer decide the production quantity, respectively. The other optimization decision is to choose the supplier or retailer to decide the production quantity. We consider the effect of the retailer’s share of channel cost on the two decisions which the supplier or retailer decides the production quantity. We derive the relationship between the total profit of supply chain and the retailer’s share of channel cost. By comparing the results of the centralized supply chain, we find that whether the supplier or retailer decides the production quantity is dependent on the retailer’s share of channel cost. Firstly, the total profit of decentralized supply chain is always less than that of the centralized supply chain, independent of the retailer’s share of channel cost. Secondly, the supplier should decide the production quantity if the retailer’s share of channel cost is between and . The retailer
The Optimization Decisions of the Decentralized Supply Chain
317
should decide the production quantity when the retailer’s share of channel cost is between and . The results have significant implication on the supply chain management. For the future research, one of our interests is to extend our model into a multiperiod setting. We also intend to consider how to control inventory in the supply chain when the retailer is subject to both supply uncertainty and random demand.
References 1. Bernstein, F., Federgruen, A.: Decentralized supply chains with competing retailers under demand uncertainty. Manag. Sci. 51, 18–29 (2005) 2. Chen, X., Simchi-Levi, D.: Coordinating inventory control and pricing strategies with random demand and fixed ordering cost: The finite horizon case. Oper. Res. 52, 887–896 (2004a) 3. Chen, X., Simchi-Levi, D.: Coordinating inventory control and pricing strategies with random demand and fixed ordering cost: The infinite horizon case. Math. Oper. Res. 29, 698–723 (2004b) 4. Emmons, H., Gilbert, S.: The role of returns policies in pricing and inventory decisions for catalogue goods. Manage. Sci. 44, 276–283 (1998) 5. Federgruen, A., Heching, A.: Combined pricing and inventory control under uncertainty. Oper. Res. 47, 454–475 (1999) 6. Granot, D., Yin, S.: On the effectiveness of returns policies in the price-dependent newsvendor model. Nav. Res. Logist. 52, 765–779 (2005) 7. Granot, D., Yin, S.: On sequential commitment in the price-dependent newsvendor model. Eur. J. Oper. Res. 177, 939–968 (2007) 8. Granot, D., Yin, S.: Price and order postponement in a decentralized newsvendor model with multiplicative and price-dependent demand. Oper. Res. 56, 121–139 (2008) 9. Huang, Y., Huang, G.: Price coordination in a three-level supply chain with different channel structures using game-theoretic approach. Int. Soc. Manag. Sci. 5, 83–94 (2010) 10. Lee, C., Chu, W.: Who should control inventory in a supply chain? Eur. J. Oper. Res. 164, 158–172 (2005) 11. Petruzzi, N., Dada, M.: Pricing and the newsvendor problem: A review with extensions. Oper. Res. 47, 184–194 (1999) 12. Ray, S., Li, S., Song, Y.: Tailored supply chain decision-making under price-sensitive stochastic demand and delivery uncertainty. Manag. Sci. 51, 1873–1891 (2005) 13. Ru, J., Wang, Y.: Consignment contracting: Who should control inventory in the supply chain? Eur. J. Operl Res. 201, 760–769 (2010) 14. Song, Y., Ray, S., Li, S.: Structural properties of buy-back contracts for price-setting newsvendors. Manuf. Serv. Oper. Manag. 10, 1–18 (2006) 15. Wang, Y., Jiang, L., Shen, Z.: Channel performance under consignment contract with revenue sharing. Manag. Sci. 50, 34–47 (2004) 16. Wang, Y.: Joint pricing-production decisions in supply chains of complementary products with uncertain demand. Oper. Res. 54, 1110–1127 (2006) 17. Zhao, X., Atkins, D.: Newsvendors under simultaneous price and inventory competition. Manuf. Serv. Oper. Manag. 10, 539–546 (2008)
The Relationship between Dominant AHP/CCM and ANP Eizo Kinoshita and Shin Sugiura
Abstract. This paper demonstrates that the equation, Dominant AHP (analytic hierarchy process)+ANP(analytic network process)=Dominant AHP, is valid, when the weights of criteria relative to the evaluation of dominant alternative is chosen as the basis for the evaluation of Dominant AHP. It also substantiates that the equation, CCM(concurrent convergence method)+ANP=CCM, is valid by applying the same approach.
1 Introduction
The paper consists of six chapters. Chapter 2 explains AHP/ANP, proposed by Saaty. Chapters 3 and 4 explain Dominant AHP and CCM, proposed by Kinoshita and Nakanishi [4], [5]. Chapter 5 describes mathematically that Dominant AHP equates with ANP when the weights of criteria relative to the evaluation of the dominant alternative are chosen as the basis for the evaluation of Dominant AHP, and proves that the equation Dominant AHP+ANP=Dominant AHP is valid. The same chapter also shows that the equation CCM+ANP=CCM is valid by applying the same approach. Chapter 6 concludes the paper.
2 AHP/ANP [1] [2]
AHP, proposed by Saaty, can be summarized as follows. For a set of alternatives A1, …, An, several evaluation criteria C1, …, Cm have been defined, and the validities ν1i, …, νni of all alternatives are evaluated under criterion Ci. On the other hand, the significances e1, …, em of C1, …, Cm are determined based on the ultimate goal G, which gives the aggregate score Ej of alternative Aj. Ej = νj1 e1 + νj2 e2 + … + νjm em
(1)
Eizo Kinoshita · Shin Sugiura Meijo University, Kani, Japan e-mail:
[email protected],
[email protected] *
J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 319–328. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
In addition, applying the ANP method proposed by Saaty (though it is also applicable to a wider range of problems than typical AHP as stated above) to the above AHP will change the perspective of significances of criteria to the ones viewed from each alternative rather than viewing from the perspective of the ultimate goal as it is the case with AHP. That is to say, ANP has a structure of mutual evaluation, in which an evaluation criteria Ci determines the validity νji of alternative Aj while Aj determines the significance eij of Ci at the same time, and thus contains a feedback structure. For this reason, the solution requires an operation of solving a kind of equation involving a supermatrix as a coefficient matrix, rather than simple multiplication of weights and addition as in the case of AHP. Using the solution of ANP, if the graph structure is such that any node in the graph can be reached from any other node by tracing arrows, the aggregate score (the validity if the nodes represent alternatives or the significance if the nodes represent evaluation criteria) xT=[x1, …, xn] of graph nodes can be obtained as a solution to the following equation. (2)
Sx=x ∞
(Each column vector of S converges to a single vector and Saaty proposed to use it as the aggregate score. This is the same as the solution x for equation (2).) Since equation (2) can also be viewed as a homogeneous equation (S-I)x =0, it is solvable by ordinary Gaussian elimination. In addition, since S is a probability matrix having the maximum eigenvalue of 1 and the solution x for equation (2) is the eigenvector of S. The power method is also applicable. It is obvious that x is uniquely determined except for its multiples and all its elements are positive from the famous Perron-Frobenius Theorem. Here, the supermatrix for this ANP is defined as (3). Where, W be criteria weight and M be evaluation matrix.
$$S = \begin{pmatrix} 0 & W \\ M & 0 \end{pmatrix} \qquad (3)$$
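To make the computation in (2) and (3) concrete, a minimal numerical sketch is given below; it is not part of the original paper, NumPy is assumed to be available, and the function name is illustrative. It builds the supermatrix from column-stochastic blocks M and W and extracts the eigenvector belonging to eigenvalue 1.

```python
import numpy as np

def anp_priorities(M, W):
    """Solve Sx = x (eq. (2)) for the supermatrix S = [[0, W], [M, 0]] (eq. (3)).

    M : (n_alt x n_crit) evaluation matrix, each column summing to 1.
    W : (n_crit x n_alt) criteria-weight matrix, each column summing to 1.
    Returns the criteria part and the alternative part of the principal
    eigenvector, each normalized so its elements sum to 1.
    """
    n_crit, n_alt = W.shape
    S = np.block([[np.zeros((n_crit, n_crit)), W],
                  [M, np.zeros((n_alt, n_alt))]])
    eigvals, eigvecs = np.linalg.eig(S)
    # take the eigenvector whose eigenvalue is (numerically) 1
    x = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
    p, q = x[:n_crit], x[n_crit:]
    return p / p.sum(), q / q.sum()
```

Because the solution of (2) is determined only up to a scalar multiple, the two sub-vectors are normalized separately at the end.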
3 Dominant AHP [3][4]
In this chapter, the authors describe Dominant AHP, proposed by Kinoshita and Nakanishi. Firstly, the authors explain its model through numerical examples, and then describe its mathematical structure. Table 1 shows the results obtained when there are two evaluation criteria and three alternatives; each figure denotes the evaluated score of the respective alternative.
Table 1 Evaluation of alternatives

        C1     C2     Sum
A1      84     24     108
A2      48     65     113
A3      75     21      96
Sum    207    110
The Dominant AHP method is an approach where the aggregate score is obtained by selecting a particular alternative (called the dominant alternative) out of several alternatives and making it a criterion. The evaluated score of the dominant alternative is normalized to one, and the weights of criteria relative to the evaluation of the dominant alternative are chosen as the basis for the evaluation. The evaluated scores and the weights of criteria when selecting alternative A1 as the dominant alternative are shown in Table 2.

Table 2 Evaluated scores of Dominant AHP (dominant alternative A1)

        C1               C2
A1      84/84 = 1        24/24 = 1
A2      48/84 = 0.571    65/24 = 2.708
A3      75/84 = 0.893    21/24 = 0.875
The weights of criteria of Dominant AHP (dominant alternative A1)

                        C1                      C2
Weights of criteria     84/(84+24) = 0.778      24/(84+24) = 0.222
Based on the results shown in Table-2, the evaluated scores of alternatives are expressed by Formula (4), and the weights of criteria by Formula (5).
$$M = \begin{pmatrix} 1 & 1 \\ 0.571 & 2.708 \\ 0.893 & 0.875 \end{pmatrix} \qquad (4)$$

$$W = \begin{pmatrix} 0.778 \\ 0.222 \end{pmatrix} \qquad (5)$$
The aggregate score is obtained by Formula (6).
$$E_1 = \begin{pmatrix} 1 & 1 \\ 0.571 & 2.708 \\ 0.893 & 0.875 \end{pmatrix} \cdot \begin{pmatrix} 0.778 \\ 0.222 \end{pmatrix} = \begin{pmatrix} 1 \\ 1.046 \\ 0.889 \end{pmatrix} \qquad (6)$$
By normalizing the result obtained by Formula (6) so that the total is one, Formula (7), which denotes the final evaluated score, is acquired.
$$E = \begin{pmatrix} 0.341 \\ 0.356 \\ 0.303 \end{pmatrix} \quad (A_1, A_2, A_3) \qquad (7)$$
Next, the authors describe the case when A2 is picked as the dominant alternative. The evaluated scores and the weights of criteria in this case are shown in Table 3.

Table 3 Evaluated scores of Dominant AHP (dominant alternative A2)

        C1               C2
A1      84/48 = 1.750    24/65 = 0.369
A2      48/48 = 1        65/65 = 1
A3      75/48 = 1.563    21/65 = 0.323
The weights of criteria of Dominant AHP (dominant alternative A2)

                        C1                      C2
Weights of criteria     48/(48+65) = 0.425      65/(48+65) = 0.575
Based on the results shown in Table 3, Formula (8), showing the evaluated scores of alternatives, and Formula (9), denoting the weights of criteria, are acquired.
$$M = \begin{pmatrix} 1.750 & 0.369 \\ 1 & 1 \\ 1.563 & 0.323 \end{pmatrix} \qquad (8)$$

$$W = \begin{pmatrix} 0.425 \\ 0.575 \end{pmatrix} \qquad (9)$$
As a result, the aggregate score is obtained by Formula (10).
$$E_2 = \begin{pmatrix} 1.750 & 0.369 \\ 1 & 1 \\ 1.563 & 0.323 \end{pmatrix} \cdot \begin{pmatrix} 0.425 \\ 0.575 \end{pmatrix} = \begin{pmatrix} 0.956 \\ 1 \\ 0.850 \end{pmatrix} \qquad (10)$$
When the result obtained by Formula (10) is normalized so that the total is one, it matches the result obtained by Formula (7). The same result is achieved when A3 is chosen as the dominant alternative. Next, the authors describe the mathematical structure of the aggregate score of Dominant AHP. Suppose that the evaluated score of alternative i under criterion j is denoted by $a_{ij}$ in conventional AHP. The aggregate score $p_i$ of AHP is acquired by multiplying $a_{ij}$ by $c_j$, the weight of criterion j, and summing, as expressed by Formula (11).
$$p_i = \sum_j c_j a_{ij} \qquad (11)$$
Dominant AHP is a method where the aggregate score is acquired by picking a particular alternative (the dominant alternative, denoted here by the index l) out of several alternatives and making it a criterion. When the evaluated score of alternative i under criterion j is denoted by $a_{ij}$, the normalized evaluated score is $\tilde{a}_{ij} = a_{ij}/a_{lj}$. The weights of criteria, relative to the evaluation of the dominant alternative and normalized so that their total is one, are chosen as the basis for the evaluation; as a result, the weight of criterion j is $\tilde{c}_j = a_{lj}/\sum_k a_{lk}$. Thus the aggregate score of alternative i is

$$\tilde{p}_i = \sum_j \tilde{c}_j \tilde{a}_{ij} = \sum_j \frac{a_{lj}}{\sum_k a_{lk}} \cdot \frac{a_{ij}}{a_{lj}} = \frac{1}{\sum_k a_{lk}} \sum_j a_{ij} \qquad (12)$$
It was found that the result matches that of AHP when the weights of criteria relative to the evaluation of the dominant alternative are chosen as the basis for the evaluation of Dominant AHP.
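As an illustrative check of (12) (not part of the original paper; NumPy assumed, names invented), the data of Table 1 can be run through the Dominant AHP steps for each possible choice of dominant alternative; the normalized aggregate scores always come out proportional to the row sums of the evaluation matrix.

```python
import numpy as np

A = np.array([[84.0, 24.0],    # A1 under C1, C2 (Table 1)
              [48.0, 65.0],    # A2
              [75.0, 21.0]])   # A3

def dominant_ahp(A, l):
    """Dominant AHP aggregate score with alternative l as the dominant alternative."""
    M = A / A[l, :]                  # scores normalized by the dominant row, cf. Tables 2-3
    w = A[l, :] / A[l, :].sum()      # criteria weights from the dominant row
    e = M @ w                        # aggregate scores, cf. Formulas (6) and (10)
    return e / e.sum()               # final normalized scores, cf. Formula (7)

for l in range(A.shape[0]):
    print(l + 1, np.round(dominant_ahp(A, l), 3))
# Every choice prints [0.341 0.356 0.303]: the scores are proportional to the
# row sums of A (108, 113, 96), exactly as Formula (12) states.
```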
4 CCM [5][6]
In this chapter, the authors describe the concurrent convergence method (CCM) proposed by Kinoshita and Nakanishi [4], [5]. In Dominant AHP, no matter which alternative is selected as the dominant alternative, the aggregate scores obtained are the same. However, the aggregate score can differ if there are several sets of weights of criteria, for example when a decision maker declares different weights of criteria for each dominant alternative he or she selects, or when the evaluated score of an alternative is not tangible but intangible. As a result, the CCM [5], [6] becomes vital in order to adjust the weights of criteria. The CCM works as follows. Suppose that there are two criteria and three alternatives. In this case, b1 denotes the weight vector of evaluation criteria
from alternative 1, and b2 denotes the weight vector of evaluation criteria from alternative 2. When the evaluated score A of an alternative is given, the input data is shown as Figure-1.
Fig. 1 Input data of CCM: the weight vectors b1, b2, b3 of the evaluation criteria (Criteria I and II) viewed from alternatives 1, 2 and 3, together with the evaluated-score matrix

$$A = \begin{pmatrix} a_{1I} & a_{1II} \\ a_{2I} & a_{2II} \\ a_{3I} & a_{3II} \end{pmatrix}$$
As a result, the estimation rules for the weight vector of evaluation criteria of dominant alternative 2 from dominant alternative 1 are as follows (here $A_i$ denotes the diagonal matrix formed from the evaluated scores of alternative i):

Estimation rule for the weight vector of evaluation criteria: $b^1 \rightarrow A_2 A_1^{-1} b^1$
Estimation rule for the evaluated score: $AA_1^{-1} \rightarrow AA_2^{-1}$

Similarly, the estimation rules for the weight vector of evaluation criteria relative to dominant alternative 1 from dominant alternative 2 are as follows.

Estimation rule for the weight vector of evaluation criteria: $b^2 \rightarrow A_1 A_2^{-1} b^2$
Estimation rule for the evaluated score: $AA_2^{-1} \rightarrow AA_1^{-1}$
Here, the authors describe the case where there are evaluation "gaps" between the weight vector of evaluation criteria $b^1$ and the estimated weight vector $A_1A_2^{-1}b^2$, and between the weight vector of evaluation criteria $b^2$ and the estimated weight vector $A_2A_1^{-1}b^1$. When there is no "gap," such a state is called "interchangeability of dominant alternatives" [3]. In reality, however, interchangeability is seldom maintained and we often end up with minor evaluation gaps. In order to adjust those evaluation gaps, Kinoshita and Nakanishi proposed CCM. First of all, the adjusted value of the weight vector of evaluation criteria $b^1$ relative to dominant alternative 1 is the average of the original data $b^{11}$, the estimate $b^{12}$ obtained from dominant alternative 2, and the estimate $b^{13}$ obtained from dominant alternative 3, as shown below.
$$b^1 = \frac{1}{3}\left\{ b^{11} + b^{12} + b^{13} \right\} = \frac{1}{3}\left\{ \frac{A_1 A_1^{-1} b^1}{e^T A_1 A_1^{-1} b^1} + \frac{A_1 A_2^{-1} b^2}{e^T A_1 A_2^{-1} b^2} + \frac{A_1 A_3^{-1} b^3}{e^T A_1 A_3^{-1} b^3} \right\} \qquad (13)$$
By the same token, the adjusted values of the weight vectors $b^2$ and $b^3$, relative to dominant alternatives 2 and 3 respectively, are given by Formulas (14) and (15).

$$b^2 = \frac{1}{3}\left\{ b^{21} + b^{22} + b^{23} \right\} = \frac{1}{3}\left\{ \frac{A_2 A_1^{-1} b^1}{e^T A_2 A_1^{-1} b^1} + \frac{A_2 A_2^{-1} b^2}{e^T A_2 A_2^{-1} b^2} + \frac{A_2 A_3^{-1} b^3}{e^T A_2 A_3^{-1} b^3} \right\} \qquad (14)$$

$$b^3 = \frac{1}{3}\left\{ b^{31} + b^{32} + b^{33} \right\} = \frac{1}{3}\left\{ \frac{A_3 A_1^{-1} b^1}{e^T A_3 A_1^{-1} b^1} + \frac{A_3 A_2^{-1} b^2}{e^T A_3 A_2^{-1} b^2} + \frac{A_3 A_3^{-1} b^3}{e^T A_3 A_3^{-1} b^3} \right\} \qquad (15)$$
In CCM, the same process is repeated until the "gap" between the new weight vectors of evaluation criteria $b^i$ and the old weight vectors $b^i$ (i = 1, 2, 3) is eliminated. A detailed explanation of how this process leads to convergence of the weight vectors of evaluation criteria $b^i$ is given in reference [6]. The aggregate score based on the convergence value of CCM is consistent with the Dominant AHP calculation.
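A minimal sketch of the concurrent convergence iteration (13)–(15) follows. It assumes, as in the CCM literature, that $A_i$ is the diagonal matrix formed from the evaluated scores of alternative i; the function name, stopping rule and tolerance are illustrative choices, not taken from the paper, and NumPy is assumed.

```python
import numpy as np

def ccm(A, B, max_iter=1000, tol=1e-10):
    """Concurrent convergence method, cf. Formulas (13)-(15).

    A : (n_alt x n_crit) evaluated scores; row i plays the role of diag(A_i).
    B : (n_alt x n_crit) initial weights; row i is b^i, the criteria weights
        declared with alternative i as the dominant alternative.
    Returns the adjusted weight vectors once the gaps have (numerically) vanished.
    """
    n_alt = A.shape[0]
    B = B / B.sum(axis=1, keepdims=True)             # normalize each b^i
    for _ in range(max_iter):
        B_new = np.empty_like(B)
        for i in range(n_alt):
            acc = np.zeros(A.shape[1])
            for j in range(n_alt):
                est = (A[i, :] / A[j, :]) * B[j, :]  # A_i A_j^{-1} b^j
                acc += est / est.sum()               # divide by e^T A_i A_j^{-1} b^j
            B_new[i, :] = acc / n_alt                # average of the estimates, cf. (13)-(15)
        if np.abs(B_new - B).max() < tol:            # evaluation "gap" eliminated
            return B_new
        B = B_new
    return B
```

When the rows of B are already mutually consistent (interchangeability of dominant alternatives), the iteration leaves them unchanged; otherwise it keeps averaging the estimates until the gaps vanish.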
5 The Relationship between Dominant AHP/CCM and ANP
(1) Dominant AHP and ANP
The Analytic Network Process (ANP) is an improved form of AHP. It is a model which can deal with different weights of criteria at the same time. In this chapter, the authors explain the mechanism of ANP by demonstrating a numerical example, and then describe its mathematical structure. The authors then prove that the equation Dominant AHP+ANP=Dominant AHP is valid when the weights of criteria relative to the evaluation of the dominant alternative are chosen as the basis for the evaluation of Dominant AHP. ANP is a method where the aggregate score is obtained by creating Formula (3), which is called a supermatrix, and by acquiring its principal eigenvector (the eigenvector which corresponds to the eigenvalue 1).
However, in ANP the supermatrix built from W and M in Formula (3) needs to be a stochastic matrix. As a result, M must contain the evaluated scores of the alternatives, normalized so that each column sums to one, and W must contain the criteria weights of Dominant AHP. Formula (16) is the supermatrix for the numerical example shown in Chapter 3.

$$S = \begin{pmatrix} 0 & 0 & 0.778 & 0.425 & 0.781 \\ 0 & 0 & 0.222 & 0.575 & 0.219 \\ 0.406 & 0.218 & 0 & 0 & 0 \\ 0.232 & 0.591 & 0 & 0 & 0 \\ 0.362 & 0.191 & 0 & 0 & 0 \end{pmatrix} \qquad (16)$$

In Formula (16), applying $\binom{p}{q}$ as a principal eigenvector of the supermatrix S and calculating $S\binom{p}{q} = \binom{p}{q}$ yields Formulas (17) and (18), the two parts of the principal eigenvector; p signifies the column vector concerning the criteria, while q is the column vector concerning the evaluated scores.

$$p = \begin{pmatrix} 0.653 \\ 0.347 \end{pmatrix} \qquad (17)$$

$$q = \begin{pmatrix} 0.341 \\ 0.356 \\ 0.303 \end{pmatrix} \qquad (18)$$
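For a purely numerical confirmation of (16)–(18), the following illustrative snippet (not from the paper; NumPy assumed) extracts the eigenvector of the supermatrix for eigenvalue 1 and splits it into its criteria part p and alternative part q.

```python
import numpy as np

# Supermatrix of Formula (16): two criteria rows/columns first, then three alternatives.
S = np.array([
    [0.000, 0.000, 0.778, 0.425, 0.781],
    [0.000, 0.000, 0.222, 0.575, 0.219],
    [0.406, 0.218, 0.000, 0.000, 0.000],
    [0.232, 0.591, 0.000, 0.000, 0.000],
    [0.362, 0.191, 0.000, 0.000, 0.000],
])

eigvals, eigvecs = np.linalg.eig(S)
x = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])  # eigenvector for eigenvalue 1
p, q = x[:2], x[2:]
print(np.round(p / p.sum(), 3))   # [0.653 0.347]       -> Formula (17)
print(np.round(q / q.sum(), 3))   # [0.341 0.356 0.303] -> Formula (18)
```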
The evaluated score vector q of Formula (18) is consistent with the result obtained by Formula (7). In ANP, the aggregate score is obtained by creating the supermatrix S and acquiring its principal eigenvector (the eigenvector corresponding to the eigenvalue one). The submatrix M is the list of evaluated scores $a_{ij}$ and W is a transposed form of it; since each column is normalized so that its total is one, the elements of M and W are expressed as

$$M_{ij} = \frac{a_{ij}}{\sum_k a_{kj}}, \qquad W_{ij} = \frac{a_{ji}}{\sum_k a_{jk}},$$

respectively. $S\binom{q}{p} = \binom{q}{p}$ is obtained when applying $\binom{q}{p}$ as a principal eigenvector of the supermatrix S (in this general argument q denotes the part concerning the criteria and p the part concerning the alternatives). When focusing only on p, which corresponds to the aggregate score, MWp = p is acquired; hence p is a principal eigenvector of the matrix MW.
When applying $p_i = \sum_j a_{ij}$ and calculating row i of MWp, the result is as follows.

$$(MWp)_i = \sum_j (MW)_{ij}\, p_j = \sum_j \left( \sum_l \frac{a_{il}}{\sum_k a_{kl}} \times \frac{a_{jl}}{\sum_k a_{jk}} \right) \times \sum_m a_{jm} = \sum_j \sum_l \frac{a_{il}\, a_{jl}}{\sum_k a_{kl}} = \sum_l a_{il} = p_i \qquad (19)$$
Formula (19) shows that $p_i = \sum_j a_{ij}$ is a principal eigenvector of MW; in other words, it is what the authors attempted to acquire, namely the aggregate score. It is consistent with Formula (12), which signifies the aggregate score of Dominant AHP in Chapter 3. Similarly, it is also possible to prove that $q_j = \sum_i a_{ij}$ is valid for each of the elements of q. As a result, the following relation between Dominant AHP and ANP is found to be valid:

Dominant AHP + ANP = Dominant AHP

(2) CCM and ANP
In this section, the authors show that when the weights of criteria that converged through CCM, together with the evaluated scores of the alternatives, are used as the input values of ANP, the acquired result is consistent with that of CCM. This is not difficult to substantiate using the results obtained in the previous section. It is already known that the calculation that converges through CCM is consistent with the Dominant AHP calculation [5][6]. When the weights of criteria that converged through CCM are applied to a supermatrix of ANP, the acquired result turns out to be that of CCM. This follows because the converged value of CCM equals that of Dominant AHP, which means the situation is structurally the same as Dominant AHP + ANP = Dominant AHP, shown to be valid in the previous section. Thus, the following relationship also applies to CCM:

CCM + ANP = CCM
6 Conclusion
In this paper, it was shown that, when the weights of criteria relative to the evaluation of the dominant alternative are chosen as the basis for the evaluation of Dominant AHP, the equation Dominant AHP+ANP=Dominant AHP is valid. By applying the same approach, CCM+ANP=CCM was also proved valid. However, the authors would like to emphasize that the results would differ unless the weights of criteria relative to the evaluation of the dominant alternative are made the basis for the evaluation, both in AHP/ANP as proposed by Saaty and in Dominant AHP/CCM as proposed by Kinoshita and Nakanishi. The authors believe that the availability of various decision-making support models, such as AHP/ANP and Dominant AHP/CCM, is advantageous in solving a wide range of problems.
References
1. Saaty, T.L.: The Analytic Hierarchy Process. McGraw-Hill, New York (1980)
2. Saaty, T.L.: The Analytic Network Process. Expert Choice (1996)
3. Kinoshita, E., Nakanishi, M.: A Proposal of a New Viewpoint in Analytic Hierarchy Process. Journal of Infrastructure Planning and Management IV-36(569), 1–8 (1997) (in Japanese)
4. Kinoshita, E., Nakanishi, M.: Proposal of New AHP Model in Light of Dominant Relationship among Alternatives. Journal of Operations Research Society of Japan 42(2), 180–197 (1999)
5. Kinoshita, E., Nakanishi, M.: A Proposal of CCM as a Processing Technique for Additional Data in the Dominant AHP. Journal of Infrastructure Planning and Management IV-42(611), 13–19 (1999) (in Japanese)
6. Kinoshita, E., Sekitani, K., Shi, J.: Mathematical Properties of Dominant AHP and Concurrent Convergence Method. Journal of Operations Research Society of Japan 45(2), 198–213 (2002)
The Role of Kansei/Affective Engineering and Its Expected in Aging Society
Hisao Shiizuka and Ayako Hashizume
Abstract. Kansei engineering originally evolved as a way to "introduce human 'Kansei' into manufacturing." The recent trend indicates that the development of Kansei engineering is expanding beyond manufacture and design and is widening into relevant fields, creating one of the most conspicuous features of Kansei engineering. This trend can also be felt in presentations made at the recent annual conferences of the Japan Society of Kansei Engineering. It is needless to say, therefore, that some kind of interdisciplinary development is necessary to find a mechanism for creating Kansei values, which is as important as the Kansei values themselves. This paper consists of three parts. The first part describes the general history of Kansei and the basic stance and concept of Kansei research. The second part discusses the significance and roles of creating Kansei values in Kansei engineering and also provides basic ideas that will serve as future guidelines, debating its positioning from many aspects. Finally, the paper emphasizes the necessity of Kansei communication in the ageing society; it is important for Kansei communication to give birth to empathy.

Keywords: Kansei/Affective engineering, Kansei value creation, Kansei communication, senior people, nonverbal communication.
Hisao Shiizuka, Department of Information Design, Kogakuin University, 1-24-2 Nishishinjuku, Shinjuku-ku, Tokyo 163-8677, Japan, e-mail: [email protected]
Ayako Hashizume, Graduate School of Comprehensive Human Science, University of Tsukuba, 1-1-1 Tennodai, Tsukuba-shi, Ibaraki 305-8577, Japan, e-mail: [email protected]

1 Introduction
The twentieth century was what we call a machine-centered century. The twenty-first century is a human-centered century in every respect, with scientific technologies that are friendly to humans and the natural and social environments of humans being valued. Therefore, research, development, and deployment of
advanced scientific technologies may no longer be determined solely by technological communities, and issues whose solutions have been carried over to the twenty-first century may not be solved by the technologies of only one specific field [1]. For example, tie-ups and cooperation with human and social scientists are greatly needed since the issues can no longer be solved by the scientific efforts of research and development alone. It is no exaggeration to say that such issues today affect all aspects of social technologies. Therefore, these issues can no longer be solved by conventional solutions. Matured technologies that have brought happiness to humans are among interdisciplinary scientific areas, and the mounting issues to be solved in the twenty-first century cannot be fully understood by conventional technological frameworks; the nature of scientific technology issues that we need to understand is obviously changing. Research focused on Kansei is expected to be the solution to these issues with the greatest potential. Meanwhile, it is only recently that interest in Kansei values has rapidly heightened as the Ministry of Economy, Trade and Industry launched the “Kansei Value Creation Initiative [2].” However, it is true that we have many issues to solve including the development of new viewpoints in defining the universal characteristics of Kansei values, detailed guidelines for future research, etc. Kansei engineering originally evolved as a way to “introduce human ‘Kansei’ into manufacturing.” The recent trend indicates that the development of Kansei engineering is expanding beyond manufacture and design and is widening into relevant fields, creating one of the most conspicuous features of Kansei engineering. It is needless to say, therefore, some kind of interdisciplinary development is necessary to find a mechanism for creating Kansei values, which is as important as the Kansei values themselves. This paper consists of three parts. The first part of this paper describes the general history of Kansei and the basic stance and concept of Kansei research. Second part discusses the significance and roles of creating Kansei values in Kansei engineering and also provides basic ideas that will serve as future guidelines, with its positioning in mind debating from many aspects. Finally, the paper emphasizes the necessity for Kansei communication in the ageing society. It is important for the Kansei communication to give birth to the empathy.
2 History and Definition of Kansei The term Kansei that we use today originates from the aesthesis (an ancient Greek word meaning sensitivity/senses) used by Aristotle and is thought to have similar meaning to ethos. The German philosopher Alexander Gottlieb Baumgarten (1714-1762) specified the study of sensible cognition as “aesthetics” for the first time in the history of philosophy, and this influenced Immanuel Kant. Baumgarten defined the term “sensible cognition” using the Latin word Aesthetica as in Aesthetica est scientia cognitionis senstivae [Aesthetics is the study of sensible cognition]. He defined “beauty” as a “perfection of sensible cognition with a coordinated expression” and defined “aesthetics” as “the study of natural beauty and artistic beauty.” Lucien Paul Victor Febvre (1878-1956) understood Kansei as the French word sensibilite, which can be traced back to the early fourteenth century.
He also maintained that Kansei meant human sensitivity to ethical impressions such as “truth” and “goodness” in the seventeenth century, and in the eighteenth century, it referred to emotions such as “sympathy” “sadness,” etc. On the other hand, in Japan, aesthetica was translated as bigaku [aesthetics]. Given this international trend, in Japan, also, Kansei research was invigorated and attempts were made to understand Kansei from various perspectives [1]. A major argument was that it could be interpreted in many ways based on the meaning of its Chinese characters, such as sensitivity, sense, sensibility, feeling, aesthetics, emotion, intuition, etc. Another argument was that the word, from a philosophical standpoint, was coined through translation of the German word Sinnlichkeit in the Meiji period. It consists of two Chinese characters – Kan [feel] and Sei [character], which represent yin and yang respectively, and if combined, constitute the universe. The two different natures of yin and yang influence and interact with each other to exert power [3]. Another interpretation was that “Sinn” in the word Sinnlichkeit includes such meanings as senses, sensuality, feeling, awareness, spirit, perception, self-awareness, sensitivity, intuition, giftedness, talent, interest, character, taste, prudence, discretion, idea, intelligence, reason, judgment, mind, will, true intention, soul, etc. These suggest that the word Kansei has not only innate aspects such as giftedness and talent but also postnatal aspects such as interest, prudence, etc. It is supposed to be polished while exerting its power through interaction with others in various environments. Table 1 shows the result of a survey of Kansei researchers’ views on the word Kansei [4]. These findings lead us to believe that Kansei is multifunctional. Table 1 Interpretations of Kansei by researchers
No.  Description
1.   Kansei has postnatal aspects that can be acquired through expertise, experience, and learning, such as cognitive expressions, as well as an innate nature. Many design field researchers have this view.
2.   Kansei is a representation of external stimuli, is subjective, and is represented by actions that can hardly be explained logically. Researchers in information science often have this view.
3.   Kansei is to judge the changes made by integration and interaction of intuitive creativity and intellectual activities. Researchers in linguistics, design, and information science often have this view.
4.   Kansei is a function of the mind to reproduce and create information based on images generated. Researchers in Kansei information processing have this view.
5.   Kansei is the ability to quickly react and assess major features of values such as beauty, pleasure, etc. Researchers in art, general molding, and robot engineering have this view.
3 Kansei System Framework
Figure 1 shows a two-dimensional surface mapped with the elements required for Kansei dialogue. The vertical axis represents natural Kansei, which depicts the Kansei of real people, and artificial Kansei, which realizes Kansei through artificial means. The left-hand side of the horizontal axis represents "measurement (perception)," which corresponds to perception of understanding people's feelings and ideas, etc. The right-hand side of the horizontal axis represents "expression (creation, representation and action)," which depicts your own feelings and ideas, etc.

Fig. 1 Kansei system framework (vertical axis: artificial Kansei vs. natural Kansei; horizontal axis: measurement/perception vs. expression/representation, creation, action; quadrant labels include Kansei epistemology, Kansei expression theory, design, smart agents, cognitive systems, soft computing, multivariate analysis, cognitive science, and modeling)
The idea behind this is the two underlying elements of “receiving” and “transmitting” ideas, as understanding Kansei requires “a series of information processing mechanisms of receiving internal and external information intuitively, instantly, and unconsciously by using a kind of antenna called Kansei, then selecting and determining only necessary information, and, in some cases, transmitting information in a way recognizable by the five senses.” This space formed as mentioned above and required for Kansei dialogue is called the Kansei system framework [1]. The framework shows that past researches on natural Kansei focused almost entirely on understanding natural Kansei and on methods of accurately depicting this understanding in formulae. These researches analyzed data of specific times. Researches on both the third and fourth quadrant were conducted in this context. Researches on Kansei information processing mainly correspond to the third and fourth quadrants. Artificial Kansei, meanwhile, corresponds to the first and second quadrants. The major point of artificial Kansei is to build a system to flexibly support each individual and each circumstance, or, to put it differently, to realize artificial Kansei whose true value is to provide versatile services to versatile people and to support differences in individuals.
Understanding Kansei as a system is of great significance for future researches on Kansei. First, in natural Kansei, as mentioned earlier, cognitive science and modeling correspond to the third and fourth quadrants, respectively. In artificial Kansei on the upper half, meanwhile, Kansei representation theory corresponds to the first quadrant while Kansei epistemology corresponds to the second quadrant. Kansei representation theory mainly deals with design. Design involves broad subjects. In “universal design” and “information design,” which are gaining increasing attention nowadays, new methodologies, etc. are expected to be developed through increasing awareness about Kansei. Second, Kansei epistemology, which corresponds to the second quadrant, is a field studied very little by researchers so far. While many researches currently underway to make Kansei of robots closer to that of humans are mainly focused on hardware (hardware robots), software robots will be essential for the future development of Kansei epistemology. A major example is non-hardware robots (or software robots) that can surf the internet (cyber space) freely to collect necessary information. Such robots are expected to utilize every aspect of Kansei and to be applied more broadly than hardware robots.
4 History of Kansei Values There will be no single answer to the question of “why create Kansei values now?” [2]. Japan has a long history of magnificent traditional arts and crafts created through masterly skill. It will be necessary to look at Kansei value creation from not only a national (Japanese) viewpoint but also a global perspective by taking the history of Kansei into consideration. Information technology is currently at a major turning point. The “value model” has been changing . The past model in which information technologies brought about values was based on the “automation of works and the resulting energysaving and time-saving.” The value model originates from the “Principles of Scientific Management” released by Taylor in 1911. Seemingly complex works, if broken down into elements, can be a combination of simple processes many of which can be performed efficiently at lower cost and with less time by using computers and machines. In this respect, the twentieth century was a century of machine-centered development with the aim of improving efficiency (productivity) by using computers, etc. The twenty-first century, meanwhile, is said to be a human-centered century in every respect. The quest has already begun, in ways friendly to humans, to streamline business operations that go beyond simple labor-saving. This is represented by such keywords as “inspiration,” “humans,” and “innovation” and requires new types of information technology to support them. Changes have already begun. The most remarkable change, among all the three changes, is the third one or the “change from labor-saving to perception” 4). This changes the relationship between “information technology and humans.” In the twentieth century, the relationship between theory-based IT (information technology) and perception-based humans was represented by the “conflict” in which perception was replaced by theory. In the twentyfirst century, meanwhile, it is expected to be represented by the “convergence of IT
theory and human perception.” Specifically, how to incorporate mental models into IT models will be an essential issue for this convergence. It is worth noting that Drucker, in his book Management Challenges for the 21st Century [5], said that “the greatest deed in the twentieth century was the fiftyfold improvement in the productivity of physical labor in the manufacturing industry” and “expected in the twenty-first century will be improvement in the productivity of intellectual labor.” From these historical perspectives of productivity improvement, we can understand that we currently need to find ways to incorporate “perception” and “human minds” into products. Finding ways of doing so could lead to improving the values of Kansei that people feel.
5 Resonance of Kansei If you look at Kansei value creation from a system methodology point of view, you will find that one Kansei converter is involved in it. The Kansei converter enables the resonance of Kansei to generate (or create) Kansei values. This phenomenon can be explained by the resonance that you can observe in the physical world. The physical phenomenon of “resonance,” especially in electric circuits, enables electric currents containing resonance frequency components to flow smoothly as the circuits that include inductors and capacitors turn into pure resistance with zero at the imaginary part through interaction of these reactive elements. When manufacturers (senders) have stories with a strong message, as shown in Figure 3, the level of “excitement” and/or “sympathy” increases among users (receivers), causing “resonance of Kansei,” which in turn creates values. The “resonance of Kansei” can be interpreted as a phenomenon that could occur when the distance between a sender and a receiver is the shortest (or in a pure state where all the impurities are removed). The distance between the two becomes shortest when, as shown in Figure 2, the level of the story is strong where excite-
ment or sympathy is maximized . How should we quantitatively assess (measure) the “level of a story” and the level of “excitement or sympathy”? This will be the most important issue in the theoretical part of Kansei value creation.
Fig. 2 Resonance of Kansei: manufacturers (senders) who have "stories" with a strong message, and users (receivers) whose level of "excitement" or "sympathy" increases
6 Creativity and Communication
Kansei values are created only when some communication between manufacturers (senders) and users (receivers) takes place. The communication may take the form of explicit knowledge where specific Kansei information is exchanged or may take the form of tacit knowledge. Messages exchanged are treated as information and go through "analytic" processing and "holistic" processing for interpreting, deciding on, and processing the information. Analytic processing generally takes place in the left brain while intuitive processing takes place in the right brain. Holistic processing takes place through interaction of the right and left brains. It is based on the idea that the right and left brains work interactively rather than independently. An idea generated in the right brain is verified by the left brain. Theoretical ideas not backed by creativity or intuitive insight sooner or later flounder. It is essential for the right and left brains to work together to solve increasingly complex issues. It may be no exaggeration to say that the most creative products in human culture, whether it be laws, ethics, music, arts, or scientific technology, are made possible through collaboration of the right and left brains. For this reason, human cultures are the work of the corpus callosum. Figure 3 shows a summary of information processing by the right and left brains [6]. We can understand moral, metaphorical, or emotional words or punch lines of jokes only when the right and left brains work together (Professor Howard Gardner, Harvard University). Therefore, the mechanism of giving the stimulus of Kansei information to create values is made possible only through collaboration of the right and left brains.

Fig. 3 Collaboration of the right and left brains (ways of information processing)

Left brain (analytic): processing data one by one; conscious; emblematic, partial, and quantitative functions; systematic processing.
Combination of left and right brains: processing data one by one and simultaneously; unconscious; combination of emblematic and signal functions; systemized random processing (analysis/modification of ideas).
Right brain (intuitive): processing data simultaneously; unconscious; signal and whole functions; random processing (generation of ideas).
Kansei communication is the foundation of Kansei value creation; creativity is generated through encounters, and these encounters are the source of the creativity (Rollo May, an American psychologist). Therefore, communication and creativity are inseparable and the power of creativity can be fully exerted when knowledge and experience, wisdom and insight regarding the two are freely shared and effectively combined. Kansei values are created through the relationship of “joint creation” between manufacturers and users. “Joint creation” literally means “creation jointly with customers.” To put it differently, “joint creation” is a new word based on the concept of making new agreements or methodologies, through competitive and cooperative relationships among experts and expertise in different fields of study, to solve issues that cannot be solved by experts of any single field of study. Thus, “joint creation” is based on collaboration and, therefore, Kansei encounters and communications with people in different fields of study are essential for “joint creation.”
7 Aging Society and Kansei Communication The United Nations defines an Ageing society as a society that has a population Ageing rate of 7% or higher. The population Ageing rate is defined as the percentage of the overall population that is 65 or older. Japan became an Ageing society in 1970. The population Ageing rate of Japan continued to increase thereafter, and in 2007, it became a super-aged society (a population Ageing rate of 21% or higher). Many European countries and the United States are experiencing an increase in the population Ageing rate, although the rate of Ageing is slower than in Japan. East Asian countries like China, Korea, and Singapore have not reached superaged society status, but the change in the population Ageing rate is similar to Japan, or in some cases, even higher than that of Japan. Ageing population is the biggest issue facing the world today, Japan in particular. Also, because of the trend towards nuclear families and the increase in one-person households, the number of households with only elderly people and the number of elderly people living alone are on the rise in Japan. It is known as an important problem in Aging society that the lack of communications raises the incidence rate of the dementia. We discuss importance of Kansei aspects of communication of the elderly. Kansei communication is “communication that is accompanied by a positive emotional response.” Two important factors that are required for Kansei communication are discussed in some detail below. Empathy is a must in order for a person to feel good when communicating. Empathy is possible only if information is conveyed well from a speaker to a listener. Such effective communication must contain not only the verbal aspects of the story being communicated in a narrative sense, but also nonverbal aspects that are difficult to verbalize, including knowledge, wisdom, and behavior . Mehrabian’s Rule, as proposed by Mehrabian, also states that nonverbal aspects of communication (visual information like facial expressions and body language, as well as auditory information like tone and pitch of voice) determine what is communicated more so than the meanings of the words spoken [7]. Note,
however, that this is true only when there are inconsistencies between the words and facial expressions and tone of voice, for example. Having said that, human verbal communication is generally supported by nonverbal information even without inconsistencies between these aspects. This is because speakers are unable to convey everything that they want to verbally and listeners unconsciously interpolate to fill the gap with nonverbal information given by the speaker. In clear communication with good nonverbal conveyance, good understanding and strong empathy are generated. Common understanding by nonverbal communication is closely connected to Kansei communication. Humans do not want to communicate with everyone. A comfortable distance, or Kansei distance, exists between people within societal relationships. Cultural anthropologist Edward T. Hall divided proxemics into intimate distance, personal distance, social distance, and public distance. He further refined them into relatively close (close phase) and relatively far (far phase), and then explained the significance of each distance. These distances convey how close people feel toward each other, and what desires they have in terms of communication. Kansei communication exists within intimate distance and personal distance. These distances are limited to face-to-face communication. The significance of physical distance is becoming somewhat obsolete with increased communication through electronic equipment with the advance and spread of information technologies. People often desire face-to-face or telephone communication between truly close acquaintances, but nonverbal communication like e-mail is often used to accommodate the surrounding circumstances of the communicators. Emotional closeness, in addition to physical distance, is another factor that shapes Kansei distance. It is safe to assume that the nonverbal aspects of communications between such communication partners are shared. This closeness and the existence of nonverbal aspects of communication help to form a comfortable distance (Kansei distance) in e-mail communication, for example.
8 The Elderly and Need for Kansei Communication Ageing brings various changes to bodily functions including sensory functions and motor functions. These changes to bodily functions, along with changes in social status and economic circumstances that come with getting older, bring various changes to emotional and psychological functions as well. These changes affect communication and relationships with society and elderly people. The onset of the symptoms of Ageing in elderly people is deeply related to the age of an individual, but there is no rule as to when these symptoms may occur. Rather, the timing largely depends on a person’s physical characteristics and life history. Similarly, the Ageing process and its rate of progress for each function show large differences between individuals. When considering elderly people, we need to understand what types of change occur as a person ages, and we must understand that Ageing shows itself in various dimensions and that it does not always fit into a general stereotype.
In this section, we will passively observe the changes that come with Ageing as we consider the social relationships of elderly people and Kansei communication. Also we consider changes that come with Ageing. The psychological state of elderly people is affected by a multitude of factors, including societal factors like social status and economic circumstances, psychological factors like subjective sense of well being, as well as physical factors like deterioration of visual and auditory sensory functions. These life changes create a very stressful environment for elderly people. Therefore, we feel that elderly people potentially have a strong need for Kansei communication. Enriched communication has been reported to have preventative effects on diseases like dementia and depression. Likewise, Kansei communication holds its own promise of showing a preventative effect equal to, or better than, the aforementioned. In this regard, it is very necessary to invigorate communication at Kansei distances utilizing nonverbal aspects of communication.
9 Concluding Remarks It is only recently that the interest in Kansei values has rapidly increased as the Ministry of Economy, Trade and Industry launched the “Kansei Value Creation Initiative [2].” However, it is true that we still have many issues to solve including the development of new viewpoints to extract the universal characteristics of Kansei values and the development of detailed guidelines for future research, etc. It is only natural that scientific discussions on Kansei could specifically lead to the “quantification” of Kansei, and further discussions are required on the quantification issue. The main agenda of such discussions will be on how to quantify Kansei in a general way and extract the essence of Kansei by using masterly skills. Further researches are required on this issue [8]. Appropriate convergence of technology and mentality (Kansei) and recovery of the proper balance are expected to be achieved by shifting discussions on Kansei engineering from individual discussions to integrated discussions. Kansei engineering is a cross-sectional (comprehensive) science and should integrate specific technologies in psychology, cognitive science, neuropsychology, sociology, business management, education, psychophysiology, value-centered designing, ethics engineering, and computer science. To that end, a “technological ability + artistic ability + collaborative spirit” needs to converge appropriately. Kansei value creation, among all aspects of Kansei engineering, requires collaboration with the relevant areas of Kansei and needs to make progress while looking at the whole picture. Kansei/Affective engineering is expected in the Ageing society problem, especially in Kansei communication. If the motivation of elderly people to use new communications media rises through the use of Kansei communication, the gap in new media use that exists between the generations can be somewhat closed. Kansei communication will also strengthen bonding-type social capital, leading to enhanced psychological health and societal welfare. Through this, diseases characteristic of elderly people can be prevented and the quality of life of the elderly population can be improved.
References
1. Shiizuka, H.: Kansei system framework and outlook for Kansei engineering. Kansei Engineering 6(4), 3–16 (2006)
2. Ministry of Economy, Trade and Industry, Japan: Press Release, http://www.meti.go.jp/press/20070522001/20070522001.html
3. Kuwako, T.: Kansei Philosophy. NHK Books (2001)
4. Harada, A.: Definition of Kansei. Research papers by Tsukuba University project on Kansei evaluation structure model and the model building, pp. 41–47 (1998)
5. Atsuo Ueda (translation): What Governs Tomorrow. Diamond (1993)
6. Howell, W.S., Kume, A.: Kansei Communication. Taishukan Shoten (1992)
7. Hashizume, A., Shiizuka, H.: Ageing society and Kansei communication. In: Phillips-Wren, G., et al. (eds.) Advances in Intelligent Decision Technologies. SIST, vol. 4, pp. 607–615. Springer, Heidelberg (2010)
8. Shiizuka, H.: Positioning Kansei value creation in Kansei engineering. Kansei Engineering 7(3), 430–434 (2008)
Part II Decision Making in Finance and Management
A Comprehensive Macroeconomic Model for Global Investment Ming-Yuan Hsieh, You-Shyang Chen, Chien-Jung Lai, and Ya-Ling Wu
Abstract. With the globalization of the macroeconomy, the economies of individual countries have become inseparably interrelated. In order to fully grasp the pulse and tendency of international finance, global investors therefore need vigorous tactics to face global competition. The financial investment environment changes with each passing day, investors are more and more discerning, and market demands can fluctuate unpredictably. This research focuses on the comparison of ten industrial regions, comprising two developed industrial regions (the USA and Japan) and eight high-growth industrial regions (the Four Asia Tigers and BRIC). The measurement objectives consist of twenty-four macroeconomic indicators for these ten industrial regions: fourteen macroeconomic indicators and ten statistical indexes from four academic institutes, the IMD World Competitiveness Yearbook (WCY), the World Economic Forum (WEF), Business Environment Risk Intelligence (BERI), and the Economist Intelligence Unit (EIU), which are the focal dimensions of integrated investigation in these contexts. Significantly, this research also provides a quantitative and empirical analysis of the prominent features and essential conditions of portfolio theory and the macroeconomic model, and evaluates the relative strengths and weaknesses of twelve stock markets in the ten industrial regions through scenario analysis and empirical analysis of the fluctuation percentages of the stock price index and stock market capitalization of the twelve stock markets.

Keywords: Macroeconomic Model, Global Investment.

Ming-Yuan Hsieh, Department of International Business, National Taichung University of Education
You-Shyang Chen, Department of Information Management, Hwa Hsia Institute of Technology
Chien-Jung Lai, Department of Distribution Management, National Chin-Yi University of Technology
Ya-Ling Wu, Department of Applied English, National Chin-Yi University of Technology
1 Introduction
Many investors have confronted greater challenges due to the rapid, capricious development of the world economy and the financial investment environment. The financial investment environment changes with each passing day, investors are more and more discerning, and market demands can fluctuate unpredictably. While facing the constant changes of the global financial markets, it is important to know how to break through the current situation, maintain an advantage and continuously make a profit. Many investors are under pressure to adapt positively, to form a competitive investment strategy, and to have a sound project management strategy. Traditional business investment is not enough to deal with new and varied economic challenges. This study therefore focuses on the topic of this research: reducing systematic risk through portfolio theory and a macroeconomic model. Investors and researchers have become aware that risk has always been the key factor controlling realized returns. In doing research on risk, however, a series of questions appears. For example: how can conceptual risks be measured? How can abstract risks be made concrete? How can national risks be quantified? How can risks be connected with expected returns? How can risks be linked with realized returns? How can the relationship among expected returns, realized returns and risks be defined, measured, and quantified? How can the impact of risks be decreased once the conceptual risks have been materialized? [1] initially claimed that risk can be dispersed through diversified investment activities. Further, [1] advocated that the risk of an invested portfolio is measured by the variance (σ²) of its returns, where the portfolio return is the total of each invested objective's return rate (R_i) multiplied by its selection percentage (invested-objective weight, W_i). In Markowitz's paper, Portfolio Selection, risk was first quantified and connected with expected and realized returns; the initial research linking increased investment return rates with decreased investment risk was Markowitz portfolio theory. After the portfolio theory was published in the Journal of Finance, another economist [2] began to do research on Markowitz portfolio theory and created the Capital Asset Pricing Model ("CAPM"), which links the expected return rate of a stock, the risk-free return rate, the market portfolio return rate and the risk priority number (beta coefficient, "β"). Systematic risk often results in a decline of total portfolio investment value, as most portfolio investments decline in value together. [2][3] claimed that investors can utilize portfolio investment strategies (separating invested objectives) to decrease or eliminate unsystematic risk; for example, investors can combine high-profit/high-risk and low-profit/low-risk invested objectives in order to hedge and create an efficient portfolio. By contrast, systematic risk results from the invested markets themselves, including the economic growth rate, production rate, foreign exchange rate, inflation, government economic policies and other macroeconomic environment factors. A typology of portfolio theory and macroeconomic model is presented in order to make the
meaning of portfolio theory and the macroeconomic model more exact in this research, thus avoiding the confusion and misunderstanding rampant in popular and journalistic discussions of the macroeconomic environment among the ten industrial regions, especially the rapidly developing BRIC (Brazil, Russia, India and China), and of the four international, world-level economy-academic measurement institutions: the IMD World Competitiveness Yearbook (WCY), the World Economic Forum (WEF), Business Environment Risk Intelligence (BERI) and the Economist Intelligence Unit (EIU).
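To make the portfolio quantities mentioned in this introduction concrete, the following sketch (not from the paper; NumPy assumed, and all figures are invented placeholders) computes the expected return and variance of a weighted portfolio from a covariance matrix, which is the quantity Markowitz uses as the risk measure.

```python
import numpy as np

# Hypothetical inputs: expected returns and return covariances of three assets.
mu = np.array([0.08, 0.12, 0.05])           # expected return rates R_i
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.03]])         # covariance matrix of asset returns
w = np.array([0.5, 0.3, 0.2])                # invested-objective weights W_i (sum to 1)

port_return = w @ mu                         # E[R_p] = sum_i W_i * R_i
port_variance = w @ cov @ w                  # sigma_p^2 = w' * Cov * w
print(port_return, port_variance, np.sqrt(port_variance))
```

The variance term is where diversification shows up: off-diagonal covariances smaller than the individual variances pull the portfolio variance down, which is the sense in which unsystematic risk can be dispersed across invested objectives.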
2 Methodologies
2.1 Research Specification of Research Sample and Data Collection
Thomas Bayes (1701–1761) originated the theory of measuring probabilities based on historical event data in order to analyze the expected rate of probabilities ("Bayes's Theorem"). Bayes's Theorem has been utilized for more than two hundred and fifty years, and the Bayesian interpretation of probability is still popular today. In Bayesian theory, the measurement and re-evaluation of probability can be approached in several ways. The preliminary application is based on betting: the degree of belief in a proposition is represented by the odds that the assessor is willing to bet on the success of a trial of its truth. In the 1950s and 1960s, Bayes's Theorem became the preferred and general approach for assessing probability. Spearman [4], an English psychologist, was the forerunner who explored factor analysis for analyzing research factors' rank correlation coefficients and also carried out creative work on statistical models of human intelligence, including his theory that disparate cognitive test scores reflect one single general factor (G-factor), coining the term g factor (covering, for example, personality, willpower, nervous condition, blood circulation and other related mental-energy factors), and a specific factor (S-factor) which is related only to a particular ability. With his strong statistical background, Spearman estimated the intelligence of twenty-four children in a village school. After a series of studies, he discovered that the correlations of the factors were positive and hierarchical, which resulted in the two-factor theory of intelligence.
2.2 Literature Review on Portfolio Theory
Harry Markowitz showed that investors always pursue the efficient portfolio (minimum risk) of maximum returns under limited invested capital, by varying the percentages of the various invested objectives. [2] extended the portfolio concept of [1] to develop the "Sharpe Ratio". The fundamental idea of the Sharpe Ratio is the relationship between the investment return premium and investment risk: the expected return rate minus the risk-free return rate, divided by the standard deviation of returns. Going one step further, [2] adopted the portfolio theory of [1] to develop the single-index model ("SIM"). SIM is an asset pricing model commonly used for calculating the
invested portfolio and for measuring the risk and return of a stock. In the CAPM, portfolio risk is reflected in higher variance (i.e. less predictability). From another standpoint, the beta of the portfolio is the defining factor in rewarding the systematic exposure taken by an investor. In terms of risk, he further argued that the invested risk of a portfolio consists of systematic risk (non-diversifiable risk) and unsystematic risk (diversifiable or idiosyncratic risk). Systematic risk results from the risk common to the whole invested market, because each invested market possesses its own volatility. Unsystematic risk is related to individual invested objectives and can be diversified away to lower levels by including a greater number of invested objectives in the portfolio. [7] first brought macroeconomic factors (market factors) into portfolio theory in order to find out the impact of systematic risks on the expected return rates of invested objectives.
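As a hedged illustration of the quantities named in this subsection (the Sharpe Ratio and the beta coefficient of CAPM/SIM), the sketch below estimates both from return series; the series are simulated placeholders, not data from this study, and NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
r_market = rng.normal(0.006, 0.04, size=60)                    # simulated monthly market returns
r_asset = 0.001 + 1.2 * r_market + rng.normal(0, 0.02, 60)     # asset with a "true" beta near 1.2
r_free = 0.002                                                 # assumed risk-free rate per period

sharpe = (r_asset.mean() - r_free) / r_asset.std(ddof=1)                # (E[R] - R_f) / sigma
beta = np.cov(r_asset, r_market, ddof=1)[0, 1] / r_market.var(ddof=1)   # Cov(R_i, R_m) / Var(R_m)
print(round(sharpe, 3), round(beta, 3))
```

Beta is the slope of the asset's return on the market return, i.e. the systematic-risk loading that diversification across invested objectives cannot remove.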
2.3 Literature Review on Macroeconomic Model Recently, there are a large number of economic and financial scholars and researchers who devoted on doing portfolio researches based on portfolio theory. In the final analysis, the fundamental concept of portfolio theory resulted from the[5]. [5] was one of the extremely impartment and deeply impacted economists in the economic and financial research fields in Europe from 1920s to 1980s. In his released articles and books, in the 1930s, he explored the marvelous concept of elasticity of substitution resulted in a complete restatement of the marginal productivity theory. Initially, in terms of commencement of macroeconomic model, [8] and [9] briefly classified two main research models. One is macroeconomic model (“MEM”) which is complicated to calculate and analyze and another is computable general equilibrium (“CGE”) which can be simply inferred and calculated. Further, [10] assorted MEM to five categories including Keynes–Klein (“KK”) model, Phillips–Bergstrom (“PB”) model, Walras–Johansen (“WJ”) model, Walras–Leontief (“WL”) and Muth–Sargent (“MS”) model. Nowadays, the Keynes–Klein (“KK”) model has been utilized for a large number of academic economic researches [11]. On the other hand, CGE also utilizes the joint equations to explain the correlation between economic factors [12] but though Social Accounting Matrix (“SAM”), CGE accentuates three analytical factors (labor, manufacture product and financial market) in the macroeconomic model [13]. [13] delivered the academic paper, “A Dynamic Macroeconomic Model for short-run stabilization in India”, to do research on India’s complicated economy. The effects of a reform policy package similar to those implemented for Indian’s trade and inflation in 1991 was able to assess through this macroeconomic model in this research. The most different standpoint of this research is that the non-stationarity of the data into this macroeconomic model and estimation procedures with the stationarity assumption. [14] delivered the topic of economic freedom in the Four Asia Tigers. [14] focused on an analysis of the main controversy: the role of the state in their rapid growth in the study. Further, Paldam (2001) briefly produced the economic freedom index in order to quantify the abstract concept of the economic freedom and
he successfully expressed the level of economic freedom by analyzing the economic freedom index in the Four Asia Tigers using survey and additional data. In the conclusion, three of the Four Asia Tigers (Singapore, Hong Kong and Taiwan) have a level of regulation similar to that of west European countries, because their economic freedom indexes are close to the level of conditional laissez-faire. [15] investigated the East Asian economies from 1986 to 2001, including academic literature reviews and empirical observations, through the method of bibliometric research. Their study was organized around "4,200 scholarly articles written about the East Asian economies that were indexed by the Journal of Economic Literature from 1986 to 2001 and included on the CD-ROM EconLit" [15]. Further, they collated these papers on the East Asian economies to identify the leading research authors, remarkable journals and empirically observed papers in detail. After this discussion, they compared the Four Asia Tigers with other emerging market economies (Czech Republic, Hungary, Mexico, and Poland) and a developed market economy (Italy) in order to attempt to uncover a correspondence between the growth in articles and the growth of the economies. [16] clarified the competitiveness of two financial cities, Hong Kong and Singapore, using the rational foundation of the categories of finance centres to expand his research concept and empirical investigation. Since the government of Hong Kong recognized the world finance trend following the Asian financial crisis and regional economic competition, its economic concepts and financial policies aim to make Hong Kong an attractive financial market for investment. That research explored in detail the competition, comparison and evaluation of the two financial cities as attractive financial markets. Its contribution is an evaluation model based on a macroeconomic profile and a financial centre profile of these two cities in both qualitative and quantitative terms.
3 Research Design and Data Collection In order to examine the complexity and uncertainty challenges surrounding portfolio theory and the macroeconomic model, five years of data were analyzed, along with multi-method and multi-measure field research, in order to achieve a retrospective cross-sectional analysis of industrial regions. The sample consists of ten industrial regions: two developed industrial regions (USA and Japan) and eight high-growth industrial regions (the Four Asia Tigers and BRIC). This chapter not only characterizes the overall research design, the empirical contexts, and the research sample and data collection procedures but is also designed to compare the ten industrial regions.
3.1 Research Specification of Research Sample and Data Collection In terms of the representativeness and correctness of the macroeconomic model derived through factor analysis, the research sample must contain all relevant macroeconomic factors, collectively and statistically, as far as possible. Further, the sample
in this research contains a large and complicated set of macroeconomic factors collected from two authoritative and professional channels. One is the official government statistics departments; the other is four economic statistics sources for macroeconomic factor data: the IMD World Competitiveness Yearbook (WCY) National Competitive Index (NCI), the World Economic Forum (WEF) Global Competitiveness Index (GCI), the Business Environment Risk Intelligence (BERI) Business Environment, and the Economist Intelligence Unit (EIU) Business Environment. The content of the research sample consists of a vertical range and a horizontal scope. Specifically, in terms of the validity and reliability of the collected data, this study focused on three important measurement aspects: (1) content validity, which was judged subjectively; (2) construct validity, which was examined by factor analysis; and (3) reliability, which concluded that the seven measures of quality management have a high degree of criterion-related validity when taken together. Given the sensitive nature of the research data, time was devoted to citing the relevant macroeconomic factors published by academic institutions. A database of all macroeconomic factors was created using public and primary economic reports, including press releases, newspapers, articles, journal articles and analytical reports. These sources provided a macroeconomic-level understanding of the motivations and objectives, basic challenges, target characteristics, financial market contexts and general sense regarding portfolio theory and the macroeconomic model. The economic indicator data include the annual economic indicators (leading and lagging indicators). The vertical range stretches over five years, from 2004 to 2008, and the horizontal scope consists of twelve stock markets of ten industrial regions, including the USA, Japan, the Four Asia Tigers and BRIC. With regard to the analysis method, whichever indicator gains the highest score is given full marks, and the others are scored in accordance with their relative values. The measured research data include 24 macroeconomic indicators.
3.2 Research Design The fundamental research design in this research is based on combining portfolio theory and the macroeconomic model in order to create an efficient and effective macroeconomic model to measure the beta priority number (beta coefficient) in the CAPM. Further, in order to produce the macroeconomic model, this research follows the above procedure of the research theory development framework to build the research design framework in Figure 1. This research design framework not only focuses on the application of portfolio theory, the macroeconomic model, and the assessment method but also concentrates on the macroeconomic environment of the industrial regions, which is measured by major statistical macroeconomic factors including GDP, the economic growth rate, imports, exports, the investment environment, and financial trade. These are obtained from official government statistics departments, such as Taiwan's Bureau of Foreign Trade ("MOEA") and the USA's Federal Reserve Board of Governors, and from academic economic institutions such as the IMD, the World Economic Forum (WEF), Business Environment Risk Intelligence (BERI), and the Economist Intelligence Unit (EIU).
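As a rough illustration of how component scores such as those underlying the ISRI could be extracted from a standardized indicator matrix, the following Python sketch applies a varimax-rotated factor analysis. The indicator file, the column names and the six-factor choice are illustrative assumptions, not the authors' actual pipeline.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import FactorAnalysis

# Hypothetical layout: rows = region-year observations (e.g. "USA 2004"),
# columns = the macroeconomic indicators (WEFCBC, BERIFOR, GDPPP, ...).
indicators = pd.read_csv("macro_indicators_2004_2008.csv", index_col=0)

# Standardize the indicators so that loadings are comparable across units.
X = StandardScaler().fit_transform(indicators.values)

# Six factors with varimax rotation, mirroring the rotated solution in the text.
fa = FactorAnalysis(n_components=6, rotation="varimax", random_state=0)
scores = fa.fit_transform(X)            # factor scores per region-year
loadings = pd.DataFrame(fa.components_.T,
                        index=indicators.columns,
                        columns=[f"Factor{i+1}" for i in range(6)])

# Indicators loading highly on a factor correspond to the groupings in Eqs. (1)/(2).
print(loadings.round(3))
```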
Fig. 1 The Research Design Framework (flow: Identifying Research Topic → Collecting Related Literature (outstanding papers and journals on research methodology; fundamental concepts of portfolio theory) → Measuring Invested Systematic Risk Indexes, utilizing factor analysis to produce the effective macroeconomic model for the analyzed industrial regions → Comparison of invested systematic risk among the ten industrial regions (USA, Japan, Four Asia Tigers and BRIC) → Bringing the annual growth rate of the ICI into the CAPM model in order to calculate the beta priority numbers of the twelve stock markets of the ten industrial regions → Conclusion and Recommendations; methods developed and applied: Factor Analysis, Macroeconomic Model, Portfolio Theory (CAPM), Scenario Analysis & Empirical Analysis)
4 Empirical Analysis Based on inductive reasoning across the two groups of industrial regions, technical comparability is established in order to provide a macroeconomic profile and a financial profile of the ten industrial regions in quantitative terms for comparison and evaluation purposes. The analysis suggests that each of the economic and financial competitiveness indicators can have positive or negative implications for the development of portfolio theory and the macroeconomic model; the indicators are constructed based on insights from the research on the twelve stock markets of the ten industrial regions. This chapter presents a quantitative and empirical analysis of the prominent features and essential conditions of portfolio theory and the macroeconomic model and evaluates the relative strengths and weaknesses of the twelve stock markets of the ten industrial regions by examining three hypotheses. Through factor analysis, the component scores of the six selected main factors (Academic Economic Institute Score Factor, Economic Production Factor, Economic Trade Factor, Economic Exchange Rate Factor, Economic Interest Rate Factor and Economic Consumer Price Factor) and twenty analytical variables are presented in the following formulations. According to the statistics, the analytical measurement model
of national competition is presented in formulation (1) and formulation (2) (varimax method), under the assumption that all collected data are correct and the formula inaccuracy $e$ is given and constant:

Invested Systematic Risk Index (competitiveness of invested financial markets) (df)
= Academic Economic Institute Score Factor + Economic Production Factor + Economic Trade Factor + Economic Exchange Rate Factor + Economic Interest Rate Factor + Economic Consumer Price Factor + e (formula inaccuracy)
= (WEFCBC, WEFGE, BERIFOR, BERIOR, GDPPC, EIUER, WEFGR, BERIPR, WEFIF, IMDCEP and BERIER) + (IPY, NRFEG and IPG) + (GDPPP, IP and EP) + (ER) + (IR) + (CPI) + e
= 0.957*WEFCBC + 0.951*WEFGE + 0.913*BERIFOR + 0.898*BERIOR + 0.891*GDPPC + 0.882*EIUER + 0.881*WEFGR + 0.881*BERIPR + 0.868*WEFIF + 0.833*IMDCEP + 0.778*BERIER + 0.857*IPY + 0.792*NRFEG + 0.729*IPG + 0.954*GDPPP + 0.852*IP + 0.677*EP + 0.807*ER + 0.456*IR + 0.527*CPI + e (formula inaccuracy)    (1)

Rotated Invested Systematic Risk Index (competitiveness of invested financial markets) (df)
= Academic Economic Institute Score Factor + Economic Trade Factor + Economic Profit Factor + Economic Reserves Factor + Economic Consumer Price Index Factor + Economic Exchange Rate Factor + e (formula inaccuracy)
= (BERIFOR, BERIPR, BERIOR, WEFCBC, WEFGE, WEFGR, GDPPC, EIUER, IMDCEP, BERIER and WEFIF) + (GDPPP, IP and EP) + (IPY, IPG and EG) + (NRFEG) + (CPI) + (ER) + e
= 0.954*BERIFOR + 0.931*BERIPR + 0.908*BERIOR + 0.893*WEFCBC + 0.886*WEFGE + 0.861*WEFGR + 0.843*GDPPC + 0.835*EIUER + 0.799*IMDCEP + 0.796*WEFIF + 0.778*BERIER + 0.959*GDPPP + 0.984*IP + 0.688*EP + 0.889*IPG + 0.948*IP + 0.751*EG + 0.91*NRFEG + 0.884*CPI + 0.972*ER + e (formula inaccuracy)    (2)

Hence, the Invested Systematic Risk Index (ISRI), the Rotated Invested Systematic Risk Index (RISRI), the growth rating of the ISRI, and the growth rating of the RISRI of each of the ten industrial regions are presented in Table 1. As a further step, the five-year beta priority numbers of the twelve stock markets (USA New York, USA NASDAQ, Japan, Taiwan, Singapore, Korea, Hong Kong, Brazil, Russia, India, China Shanghai and China Shenzhen) from the ten industrial regions (USA, Japan, Taiwan, Singapore, Korea, Hong Kong, Brazil, Russia, India and China) are calculated through the CAPM of portfolio theory, $E(R_{Stock}) = R_f + \beta_{Stock} \times [E(R_M) - R_f]$, and the two equations: the macroeconomic model (1) and the rotated macroeconomic model (2).
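To make the final step concrete, the sketch below estimates a market beta from return series and then applies the CAPM expected-return formula. The file name, the column labels and the use of ordinary least squares on excess returns are illustrative assumptions, not a description of the authors' exact estimation procedure.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: monthly returns of one stock market index and of a
# market benchmark, plus a constant risk-free rate, over 2004-2008.
returns = pd.read_csv("market_returns_2004_2008.csv", index_col=0)
r_stock = returns["taiwan_index"]     # assumed column names
r_market = returns["benchmark"]
r_f = 0.02 / 12                       # assumed monthly risk-free rate

# Beta as the slope of excess stock returns on excess market returns,
# i.e. cov(R_s - R_f, R_m - R_f) / var(R_m - R_f).
ex_stock = r_stock - r_f
ex_market = r_market - r_f
beta = np.cov(ex_stock, ex_market)[0, 1] / np.var(ex_market, ddof=1)

# CAPM expected return: E(R_stock) = R_f + beta * (E(R_m) - R_f).
expected_return = r_f + beta * ex_market.mean()
print(f"beta = {beta:.3f}, expected monthly return = {expected_return:.4f}")
```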
Table 1 Factor Analysis of Invested Performance in the Twelve Stock Markets of the Ten Industrial Regions from 2004 to 2008
Region / Year        ISRI       RISRI      Growth rating of ISRI   Growth rating of RISRI
(ISRI = Invested Systematic Risk Index; RISRI = Rotated Invested Systematic Risk Index)
USA 2004             47939.05   47452.32   5.69%     5.99%
USA 2005             50829.85   50476.59   5.69%     5.99%
USA 2006             53582.36   53399.25   5.14%     5.47%
USA 2007             55452.83   55332.69   3.37%     3.49%
USA 2008             57553.24   57523.32   3.65%     3.81%
JAPAN 2004           30578.81   29678.99   5.24%     5.34%
JAPAN 2005           32269.18   31352.22   5.24%     5.34%
JAPAN 2006           34259      33336.81   5.81%     5.95%
JAPAN 2007           35928.98   35016.01   4.65%     4.80%
JAPAN 2008           37330.76   36426.45   3.76%     3.87%
TAIWAN 2004          22895.45   21885.17   6.80%     6.97%
TAIWAN 2005          24564.83   23524.94   6.80%     6.97%
TAIWAN 2006          26429.42   25309.2    7.05%     7.05%
TAIWAN 2007          28424.24   27234.41   7.02%     7.07%
TAIWAN 2008          29692.29   28442.72   4.27%     4.25%
SINGAPORE 2004       36420.57   34631.95   7.75%     7.81%
SINGAPORE 2005       39479.08   37566.93   7.75%     7.81%
SINGAPORE 2006       42671.31   40631.67   7.48%     7.54%
SINGAPORE 2007       45346.12   43236.08   5.90%     6.02%
SINGAPORE 2008       47350.66   45142.68   4.23%     4.22%
KOREA 2004           20210.04   19636.25   6.21%     6.20%
KOREA 2005           21547.26   20934.21   6.21%     6.20%
KOREA 2006           23266.19   22606.01   7.39%     7.40%
KOREA 2007           24958.46   24273.97   6.78%     6.87%
KOREA 2008           26395.33   25685.36   5.44%     5.49%
HONG KONG 2004       29752.28   28453.7    9.16%     9.23%
HONG KONG 2005       32750.99   31345.82   9.16%     9.23%
HONG KONG 2006       35714.21   34170.09   8.30%     8.27%
HONG KONG 2007       38808.21   37158.84   7.97%     8.04%
HONG KONG 2008       40751.45   39027.63   4.77%     4.79%
BRAZIL 2004          8456.545   8129.662   6.50%     6.63%
BRAZIL 2005          9044.654   8707.226   6.50%     6.63%
BRAZIL 2006          9296.606   8964.129   2.71%     2.87%
BRAZIL 2007          9921.351   9582.526   6.30%     6.45%
BRAZIL 2008          11010.61   10675.02   9.89%     10.23%
RUSSIA 2004          11265.09   10840.45   9.40%     9.52%
RUSSIA 2005          12433.47   11980.63   9.40%     9.52%
RUSSIA 2006          13918.89   13445.71   10.67%    10.90%
RUSSIA 2007          15635.57   15157.75   10.98%    11.29%
RUSSIA 2008          17398.09   16928.12   10.13%    10.46%
INDIA 2004           5955.555   5917.939   7.64%     7.80%
INDIA 2005           6448.421   6418.842   7.64%     7.80%
INDIA 2006           7086.197   7069.151   9.00%     9.20%
INDIA 2007           8015.897   8065.254   11.60%    12.35%
INDIA 2008           6055.779   6207.333   -32.37%   -29.93%
CHINA 2004           10686.48   11005.24   12.84%    13.67%
CHINA 2005           12260.32   12748      12.84%    13.67%
CHINA 2006           14676.83   15261.61   16.46%    16.47%
CHINA 2007           16902.61   17641.32   13.17%    13.49%
CHINA 2008           15107.45   16007.7    -11.88%   -10.21%
5 Concluding Remarks After the measurements in this research, the five research questions are resolved in detail, which yields notable findings for global investors who desire to invest in these twelve stock markets of the ten industrial regions. Further, global investors are able to forecast the variation of fluctuating systematic risk by measuring the ISRI and RISRI through the combined utilization of the CAPM, $E(R_{Stock}) = R_f + \beta_{Stock} \times [E(R_M) - R_f]$, the macroeconomic model and the rotated macroeconomic model. In terms of research limitations, despite the significance of the results, this study comes with some limitations, as expected. The most apparent of these is the generalizability of the findings. The sample consisted of five years of economic indicator data for the ten industrial regions from 2004 to 2008, together with the related macroeconomic factors, collected from the official government statistics departments and four main academic economic institutions. Consequently, the conclusions of this research cannot take into full consideration other macroeconomic sectors (e.g., political, legal, technological); this will require additional data collection, greater discussion and further investigation. Notwithstanding its limitations, this study on reducing invested systematic risk in the twelve stock markets of the ten industrial regions through portfolio theory and the macroeconomic model makes some contributions to the literature and to future research directions. Looking back, this research compares the ten industrial regions by analyzing the five-year beta priority numbers (beta coefficients) of the twelve stock markets (USA New York, USA NASDAQ, Japan Tokyo, Taiwan, Singapore, Korea, Hong Kong, Brazil, Russia, India, China Shanghai and China Shenzhen) from 2004 to 2008. Before 2006, under the pressure of public opinion and the requirements of Taiwan's domestic financial institutions, the Taiwan Government had already started (in 2001) to allow financial institutions to set up offices and branches (in limited numbers) in Mainland China. The Taiwanese financial institutions established 14 sub-banks, 200 branches and 242 offices in 25 cities in Mainland China, including 115 financial institutions directly authorized to run Renminbi monetary business. In the stock market, under the protection of the Mainland Government's financial policies, only 8 overseas stock investment institutions were authorized to establish direct joint-stock companies, and about 24 overseas fund-investment institutions were allowed to establish direct joint-venture fund management companies. After the candidate of the Kuomintang Party, Ma Ying-jeou, won the presidential election in March 2008, the party's main objective was to restart a direct connection channel with Mainland China at the governmental level and to address the concept of the "Cross-Strait Market". It is the first time that the Taiwan Government has seriously considered the concept of a "developing hinterland", which means there will be numerous economic and financial benefits if Taiwan (Taipei) regards Mainland China (Shanghai) as a cooperative development area.
The international funds ("hot money") that flow through Taiwan (Taipei) for subsequent investment in Mainland China (Shanghai) would bring a considerable advantage to Taiwan if the "direct three-point transportation policy" can be implemented effectively, which would restart a high-growth boom in the Taiwan economy.
References
[1] Markowitz, H.M.: The early history of portfolio theory: 1600-1960. Financial Analysts Journal 55(4), 12–23 (1999)
[2] Sharpe, W.F.: Capital asset prices: A theory of market equilibrium under conditions of risk. Journal of Finance 19(3), 425–442 (1964)
[3] Ross, S.A.: The Capital Asset Pricing Model (CAPM), Short-sale Restrictions and Related Issues. Journal of Finance 32, 177 (1977)
[4] David, B., Calomiris, C.W.: Statement of Charles W. Calomiris Before a joint meeting of the Subcommittee on Housing and Community Opportunity and the Subcommittee on Financial Institutions and Consumer Credit of the Committee on Financial Services. U.S. House of Representatives, 1–34
[5] Hicks, J.R.: The Theory of Uncertainty and Profit. Economica 32, 170–189 (1931)
[6] Hicks, J.R.: Value and Capital. Economic Journal 72, 87–102 (1939)
[7] Burmeister, E.W., Kent, D.: The arbitrage pricing theory and macroeconomic factor measures. Financial Review 21(1), 1–20 (1986)
[8] Bautista, R.M.: Macroeconomic Models for East Asian Developing Countries. Asian-Pacific Economic Literature 2(2), 15–25 (1988)
[9] Bautista, C.C.: The PCPS Annual Macroeconomic Model (1993) (manuscript)
[10] Challen, D.W., Hagger, A.J.: Macroeconometric Systems: Construction, Validations and Applications. Macmillan Press Ltd, London (1983)
[11] Lai, C.-S., et al.: Utilize GRA to Evaluate the Investment Circumstance of the Ten Industrial Regions after the Globally Financial Crisis. The Journal of Grey System 13(4) (2010)
[12] Kidane, A., Kocklaeuner, G.: A Macroeconometric Model for Ethiopia: Specification, Estimation, Forecast and Control. Eastern Africa Economic Review 1, 1–12 (1985)
[13] Mallick, S.K., Mohsin, M.: On the Effects of Inflation Shocks in a Small Open Economy. Australian Economic Review 40(3), 253–266 (2007)
[14] McKinnon, R.: Money and Capital in Economic Development. The Brookings Institution, Washington, D.C. (1973)
[15] Arestis, P., et al.: Financial globalization: the need for a single currency and a global central bank. Journal of Post Keynesian Economics 27(3), 507–531 (2005)
[16] Hsieh, M.-Y., et al.: A Macroeconomic Analysis of the Stock Markets of the Ten Industrial and Emerging Regions Utilizing the Portfolio Theory. In: International Conference in Business and Information, Kitakyushu, Japan (2010)
A DEMATEL Based Network Process for Deriving Factors Influencing the Acceptance of Tablet Personal Computers Chi-Yo Huang, Yi-Fan Lin, and Gwo-Hshiung Tzeng
Abstract. The tablet personal computers (Tablet PCs) emerged recently as one of the most popular consumer electronics devices. Consequently, analyzing and predicting the consumer purchasing behaviors of Tablet PCs in order to fulfill customers' needs has become an indispensable task for marketing managers of IT (information technology) firms. However, the predictions are not easy. Consumer electronics technology evolves rapidly, and market leaders including Apple, ASUS, Acer, etc. are competing in the same segment by providing similar products, which further complicates the competitive situation. How consumers' acceptance of novel Tablet PCs can be analyzed and predicted has therefore become an important but difficult task. In order to accurately analyze the factors influencing consumers' acceptance of Tablet PCs and to predict the consumer behavior, the Technology Acceptance Model (TAM) and the Lead User Method will be introduced. Further, the differences in the factors recognized by lead users and by mass customers will be compared. The possible customers' needs will first be collected and summarized by reviewing literature on the TAM. Then, the causal relationships between the factors influencing the consumer behaviors, as recognized by the lead users and by the mass customers, will be derived by the DEMATEL based network process (DNP) and by Structural Equation Modeling (SEM), respectively. An empirical study based on Taiwanese Tablet PC users will be leveraged for comparing the results derived by the DNP and the SEM. Based on the DNP based lead user method, the perceived usefulness, perceived ease of use, attitude and behavioral intention are perceived as the most important factors influencing users' acceptance of Tablet PCs. The research results can serve as a basis for IT marketing managers' strategy definitions. The proposed methodology can be used for analyzing and predicting customers' preferences and acceptance of high technology products in the future.
Keywords: Technology Acceptance Model (TAM), Lead User Method, DEMATEL based Network Process (DNP), Structural Equation Modeling (SEM), Multiple Criteria Decision Making (MCDM), Tablet Personal Computer (Tablet PC).
Chi-Yo Huang · Yi-Fan Lin
Department of Industrial Education, National Taiwan Normal University, No. 162, Hoping East Road I, Taipei 106, Taiwan
e-mail: [email protected]
Gwo-Hshiung Tzeng
Department of Business and Entrepreneurial Administration, Kainan University, No. 1, Kainan Road, Luchu, Taoyuan County 338, Taiwan
Institute of Management of Technology, National Chiao Tung University, Ta-Hsueh Road, Hsinchu 300, Taiwan
e-mail: [email protected]
J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 355–365. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
1 Introduction During the past decades, social and personality psychologists have attempted to study human behaviors. However, given their complexity, the explanation and prediction of human behaviors is a difficult task. Concepts referring to behavioral intentions, such as social attitude, personal intention and personality traits, have played important roles in the prediction of human behaviors (Ajzen 1989; Bagozzi 1981). The tablet personal computer (Tablet PC), a portable PC equipped with a touchscreen as the input device (Beck et al. 2009), emerged recently as one of the most popular consumer electronics devices. Analyzing and predicting the consumer behaviors of Tablet PCs in order to fulfill customers' needs has become an indispensable task for marketing managers. However, the predictions are not easy. Consumer electronics technology evolves rapidly, which shortens product life cycles. Market leaders including Apple, ASUS, Acer, etc. are competing in the same segment by providing similar products. Further, various alternatives, including notebook computers, large-screen smart phones, etc., may replace the Tablet PC. The above phenomena further complicate the competitive situation. How consumers' acceptance of novel Tablet PCs can be analyzed and predicted has thus become an important issue for marketing managers of Tablet PC providers, yet the analysis and prediction of consumer behaviors are not easy given the above-mentioned aspects of product life cycle, competition, and alternative products. In order to accurately derive the factors influencing consumers' acceptance of Tablet PCs and to predict the purchase behaviors, the Technology Acceptance Model (TAM) and the Lead User Method (LUM) will be introduced as the theoretical basis of this analysis. The TAM can be used for illustrating the factors influencing consumers' acceptance of future Tablet PCs. Meanwhile, the LUM is more suitable for evaluating future high technology innovations which are disruptive in nature. However, to demonstrate the differences between the results derived from the mass customers' opinions and those based on the lead users', the mass customers' opinions will also be surveyed. Consequently, a novel DEMATEL based network process (DNP) based multiple criteria decision making (MCDM) framework will be proposed for deriving the acceptance intention of Tablet PCs. The criteria for evaluating the factors influencing the acceptance of Tablet PCs will first be summarized by the
literature review. After that, the causal structure corresponding to the acceptance intention of Tablet PCs will be derived by using the DNP and the SEM methods. The structure versus each criterion from the lead users' perspective will be established by using the DNP, and the criteria weights of the lead users will be calculated by the DNP method, while the traditional SEM based statistical techniques will be used to derive the causal relationships from the mass customers' opinions. Finally, the analytic results based on the lead users' and the mass customers' perspectives will be compared. A pilot study on the feasibility of the DNP based LUM for Tablet PC acceptance predictions and of the SEM based derivations of the TAM will be based on the opinions of four lead users now serving in the world's leading IT firms and thirty Taiwanese consumers with the intention to purchase Tablet PCs. The empirical study results can serve as the basis for marketers' understanding of consumers' intentions of accepting Tablet PCs. Meanwhile, such knowledge can serve as the foundation for future marketing strategy definitions. The remainder of this article is organized as follows. The related literature regarding technology acceptance theories and the TAM will be reviewed in Section 2. The analytic framework based on the DNP and the SEM method will be introduced in Section 3. Then, in Section 4, an empirical study follows, analyzing the acceptance of the Tablet PC with the proposed DNP and SEM based TAM framework. Managerial implications as well as discussion will be presented in Section 5. Finally, the whole article will be concluded in Section 6.
2 Human Behavior Theory – Concepts and Models During the past decades, scholars have tried to develop concepts and models for formulating and predicting human behaviors. Further, to explain and predict the factors which influence users' acceptance of a new product, theories and models including the TRA (Theory of Reasoned Action), the TPB (Theory of Planned Behavior), and the TAM have been developed. In this Section, the related theories, concepts and models will be reviewed to serve as a basis for the development of the analytic framework of this research. The TRA, a prediction model rooted in social psychology, attempts to derive the determinants of intended behaviors regarding the acceptance of users. The relationship versus each variable in the TRA is presented in Fig. 1(a) (Davis 1986; Davis et al. 1989; Fishbein and Ajzen 1974; Hu and Bentler 1999). The TRA is a general theory, which holds that beliefs are operative for human behavior and user acceptance. Therefore, researchers applying the TRA should identify whether a belief is a prominent variable for users regarding the product acceptance behavior (Davis et al. 1989). The Theory of Planned Behavior (TPB) was developed by Ajzen (1987) to improve the prediction capability of the TRA. The TPB framework intends to deal with the complicated social behaviors of users toward a specific product (Ajzen and Fishbein 1980). In order to improve upon and overcome the shortcomings of the TRA, Ajzen concentrated on cognitive self-regulation as an important aspect of human behavior. The relationships between these influencing factors are demonstrated in Fig. 1(b).
The TAM, an adaptation of the TRA, was proposed by Davis (1986) especially for predicting users' acceptance of information systems (Davis et al. 1989). The TAM is, in general, capable of explaining user behaviors across a broad range of end-user computing technologies and user populations, while at the same time being both parsimonious and theoretically justified (Davis 1986; Davis et al. 1989). The TAM posits that two particular beliefs, perceived usefulness and perceived ease of use, are of primary relevance for computer acceptance behaviors (Fig. 1(c)).
Fig. 1 (a) TRA (Source: Fishbein & Ajzen 1974), (b) TPB (Source: Ajzen 1985), (c) TAM (Source: Davis 1986)
In general, the early adopters of a novel product behave differently from the mass consumers. However, the bulk of the users will follow the early adopters when the market matures (Rogers 1962). Consequently, the LUM, a market analysis technique, is applied to the development of new products and services (Urban and von Hippel 1988). The methodology is composed of four major steps based on the work of Urban and von Hippel (1988): (1) specify lead user indicators, (2) identify the lead user group, (3) generate concepts (products) with lead users and (4) test the lead user concept (product). Further details can be found in the earlier work by Urban and von Hippel (1988).
3 Analytic Framework for Deriving the TAM In order to build the analytic framework for comparing the factors influencing the acceptance of Tablet PCs from the aspects of lead users and mass customers, the DNP based MCDM framework and the SEM will be introduced. At first, the criteria suitable for measuring users' acceptance of Tablet PCs will be derived based on literature review, and the factors identified by Davis in the TAM will be introduced. The DNP will then be introduced for deriving the causal relationship and weights versus each criterion from the lead users' aspect. The SEM will be introduced for deriving the causal structure between the factors from the viewpoint of the mass customers at the same time. Finally, the results from the mass customers and the lead users will be compared. In summary, the evaluation framework consists of four main steps: (1) deriving the factors influencing customers' acceptance of the Tablet PCs by literature review; (2) evaluating the determinants of mass customers' acceptance by using SEM; (3) evaluating the
determinants, causal relationship and criteria weights of lead users’ acceptance by applying DNP; and finally, (4) comparing the result of mass customers and lead users.
3.1 The DNP The DNP is an MCDM framework consisting of the DEMATEL and the ANP. The DEMATEL technique was developed by the Battelle Geneva Institute to analyze complex "real world problems" dealing mainly with interactive map-model techniques (Gabus and Fontela 1972) and to evaluate qualitative and factor-linked aspects of societal problems. The DEMATEL technique was developed with the belief that the pioneering and proper use of scientific research methods could help to illuminate specific and intertwined phenomena and contribute to the recognition of practical solutions through a hierarchical structure. The ANP is the general form of the analytic hierarchy process (AHP) (Saaty 1980), which has been used in MCDM problems by releasing the restriction of the hierarchical structure and the assumption of independence between criteria. Combining the DEMATEL and the ANP method, the steps of the method can be summarized based on the work of Prof. Gwo-Hshiung Tzeng (Wei et al. 2010):

Step 1: Calculate the direct-influence matrix by scores. Based on experts' opinions, the relationships between criteria are derived from their mutual influences. The scale ranges from 0 to 4, representing "no influence" (0), "low influence" (1), "medium influence" (2), "high influence" (3), and "very high influence" (4), respectively. Respondents are requested to indicate the direct influence of a factor $i$ on a factor $j$, denoted $d_{ij}$. The direct-influence matrix $D$ can thus be derived.

Step 2: Normalize the direct-influence matrix. Based on the direct-influence matrix $D$, the normalized direct-relation matrix $N$ can be derived by using
$$N = vD, \qquad v = \min\Big\{ 1/\max_i \sum_{j=1}^{n} d_{ij},\; 1/\max_j \sum_{i=1}^{n} d_{ij} \Big\}, \qquad i, j \in \{1, 2, \dots, n\}.$$

Step 3: Attain the total-influence matrix $T$. Once the normalized direct-influence matrix $N$ is obtained, the total-influence matrix $T$ of the NRM can further be derived by using
$$T = N + N^2 + \dots + N^k = N(I - N)^{-1}, \qquad k \to \infty,$$
where $T$ is the total influence-related matrix, $N = [x_{ij}]_{n \times n}$ is the direct-influence matrix, and $\lim_{k \to \infty}(N^2 + \dots + N^k)$ stands for the indirect-influence matrix. Here $0 \le \sum_{j=1}^{n} x_{ij} < 1$ or $0 \le \sum_{i=1}^{n} x_{ij} < 1$, and only one row sum $\sum_{j=1}^{n} x_{ij}$ or column sum $\sum_{i=1}^{n} x_{ij}$ equals 1 for $\forall i, j$, so that $\lim_{k \to \infty} N^k = [0]_{n \times n}$. The element $t_{ij}$ of the matrix $T$ denotes the direct and indirect influences of the factor $i$ on the factor $j$.
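The following Python sketch illustrates Steps 1-3 on a small made-up direct-influence matrix; the matrix values and the number of factors are invented for illustration only.

```python
import numpy as np

# Hypothetical 4-factor direct-influence matrix D (expert scores 0-4).
D = np.array([
    [0, 3, 2, 1],
    [2, 0, 3, 2],
    [1, 2, 0, 3],
    [2, 1, 2, 0],
], dtype=float)

# Step 2: normalize D by the larger of the maximum row sum and column sum,
# i.e. v = min{1/max_i sum_j d_ij, 1/max_j sum_i d_ij}.
v = 1.0 / max(D.sum(axis=1).max(), D.sum(axis=0).max())
N = v * D

# Step 3: total-influence matrix T = N (I - N)^{-1}.
I = np.eye(N.shape[0])
T = N @ np.linalg.inv(I - N)
print(np.round(T, 3))
```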
Step 4: Analyze the result. In this stage, the row and column sums are separately denoted as $r$ and $c$ within the total-influence matrix $T = [t_{ij}]$, $i, j \in \{1, 2, \dots, n\}$:
$$r = [r_i]_{n \times 1} = \Big[\sum_{j=1}^{n} t_{ij}\Big]_{n \times 1}, \qquad c = [c_j]_{1 \times n} = \Big[\sum_{i=1}^{n} t_{ij}\Big]_{1 \times n},$$
where the $r$ and $c$ vectors denote the sums of the rows and columns, respectively. Suppose $r_i$ denotes the row sum of the $i$th row of the matrix $T$. Then, $r_i$ is the sum of the influences dispatched from the factor $i$ to the other factors, both directly and indirectly. Suppose that $c_j$ denotes the column sum of the $j$th column of the matrix $T$. Then, $c_j$ is the sum of the influences that factor $j$ receives from the other factors. Furthermore, when $i = j$, $(r_i + c_i)$ is an index of the strength of the influences both dispatched and received, i.e., the degree of the central role the factor $i$ plays in the problem. If $(r_i - c_i)$ is positive, then the factor $i$ primarily dispatches influence to the other factors; if $(r_i - c_i)$ is negative, then the factor $i$ primarily receives influence from the other factors. Therefore, a causal graph can be obtained by mapping the dataset of $(r_i + c_i, r_i - c_i)$, providing a valuable approach for decision making (Chiu et al. 2006; Huang and Tzeng 2007; Tamura et al. 2002; Huang et al. 2011; Tzeng and Huang 2011).
Let the total-influence matrix of criteria be $T_C = [t_{ij}]$. The matrix $T_D = [t_{ij}^{D_{ij}}]_{m \times m}$ can first be derived based on the dimensions (or clusters) from $T_C$. Then, the weights versus each criterion can be derived by using the ANP based on the influence matrix $T_D$:
$$T_D = \begin{bmatrix} t_{11}^{D_{11}} & \cdots & t_{1j}^{D_{1j}} & \cdots & t_{1m}^{D_{1m}} \\ \vdots & & \vdots & & \vdots \\ t_{i1}^{D_{i1}} & \cdots & t_{ij}^{D_{ij}} & \cdots & t_{im}^{D_{im}} \\ \vdots & & \vdots & & \vdots \\ t_{m1}^{D_{m1}} & \cdots & t_{mj}^{D_{mj}} & \cdots & t_{mm}^{D_{mm}} \end{bmatrix}, \qquad d_i = \sum_{j=1}^{m} t_{ij}^{D_{ij}}, \quad i = 1, \dots, m.$$

Step 5: The original supermatrix of eigenvectors is obtained from the total-influence matrix $T = [t_{ij}]$, using threshold values for the clusters in the matrix $T_D$: if $t_{ij}$ is smaller than the threshold, then $t_{ij}^{D} = 0$; otherwise $t_{ij}^{D} = t_{ij}$, where $t_{ij}$ is taken from the total-influence matrix $T$. The total-influence matrix $T_D$ can be normalized as follows:
$$T_D^{\alpha} = \begin{bmatrix} t_{11}^{D_{11}}/d_1 & \cdots & t_{1j}^{D_{1j}}/d_1 & \cdots & t_{1m}^{D_{1m}}/d_1 \\ \vdots & & \vdots & & \vdots \\ t_{i1}^{D_{i1}}/d_i & \cdots & t_{ij}^{D_{ij}}/d_i & \cdots & t_{im}^{D_{im}}/d_i \\ \vdots & & \vdots & & \vdots \\ t_{m1}^{D_{m1}}/d_m & \cdots & t_{mj}^{D_{mj}}/d_m & \cdots & t_{mm}^{D_{mm}}/d_m \end{bmatrix} = \begin{bmatrix} \alpha_{11}^{D_{11}} & \cdots & \alpha_{1j}^{D_{1j}} & \cdots & \alpha_{1m}^{D_{1m}} \\ \vdots & & \vdots & & \vdots \\ \alpha_{i1}^{D_{i1}} & \cdots & \alpha_{ij}^{D_{ij}} & \cdots & \alpha_{im}^{D_{im}} \\ \vdots & & \vdots & & \vdots \\ \alpha_{m1}^{D_{m1}} & \cdots & \alpha_{mj}^{D_{mj}} & \cdots & \alpha_{mm}^{D_{mm}} \end{bmatrix},$$
where $\alpha_{ij}^{D_{ij}} = t_{ij}^{D_{ij}} / d_i$. This research adopts the normalized total-influence matrix $T_D^{\alpha}$ (hereafter abbreviated to "the normalized matrix") and the unweighted supermatrix $W$ in the following equation; the unweighted supermatrix serves as the basis for deriving the weighted supermatrix:
$$W^{*} = \begin{bmatrix} \alpha_{11}^{D_{11}} \times W_{11} & \alpha_{21}^{D_{21}} \times W_{12} & \cdots & \alpha_{m1}^{D_{m1}} \times W_{1m} \\ \alpha_{12}^{D_{12}} \times W_{21} & \alpha_{22}^{D_{22}} \times W_{22} & \cdots & \alpha_{m2}^{D_{m2}} \times W_{2m} \\ \vdots & \vdots & \alpha_{ji}^{D_{ji}} \times W_{ij} & \vdots \\ \alpha_{1m}^{D_{1m}} \times W_{m1} & \alpha_{2m}^{D_{2m}} \times W_{m2} & \cdots & \alpha_{mm}^{D_{mm}} \times W_{mm} \end{bmatrix}.$$

Step 6: Limit the weighted supermatrix by raising it to a sufficiently large power $k$, i.e., take $\lim_{k \to \infty} (W^{*})^{k}$, until the supermatrix has converged and become a long-term stable supermatrix. The global priority vectors, or the ANP weights, can thus be derived.
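As a rough companion to Steps 4-6, the sketch below computes the (r + c, r - c) causal indices from a total-influence matrix and approximates the limiting supermatrix by repeated squaring; the example matrix and the column-normalization shortcut are illustrative assumptions, not the authors' exact weighting scheme.

```python
import numpy as np

def causal_indices(T):
    """Step 4: return (r + c, r - c) for each factor of a total-influence matrix T."""
    r = T.sum(axis=1)          # influence dispatched by each factor
    c = T.sum(axis=0)          # influence received by each factor
    return r + c, r - c

def limit_supermatrix(W, power=64):
    """Step 6: raise a column-stochastic weighted supermatrix to a large power."""
    M = W.copy()
    for _ in range(int(np.log2(power))):
        M = M @ M
        M /= M.sum(axis=0, keepdims=True)   # re-normalize columns for numerical stability
    return M

# Example with an invented 4x4 total-influence matrix (criteria as factors).
T = np.array([[0.2, 0.4, 0.3, 0.2],
              [0.3, 0.2, 0.4, 0.3],
              [0.2, 0.3, 0.2, 0.4],
              [0.3, 0.2, 0.3, 0.2]])
prominence, relation = causal_indices(T)
print("r+c:", np.round(prominence, 2), " r-c:", np.round(relation, 2))

# A simplified weighted supermatrix (columns sum to 1); its limit gives global weights.
W = T / T.sum(axis=0, keepdims=True)
print("ANP-style weights:", np.round(limit_supermatrix(W)[:, 0], 3))
```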
3.2 The SEM The primary aim of the SEM technique is to analyze latent variables and derive the causal relations between latent constructs in order to verify a theory. To develop research based on the SEM method, the related multivariate methods and procedures will be summarized based on the work by Schumacker and Lomax (1996): (1) theoretical framework development: the model will be derived and developed by literature review; (2) model specification: the hypothetical model will be constructed and the observed and latent variables will be defined; (3) model identification: the most appropriate number of criteria will be estimated; (4) sample and measurement: user opinions will be collected by questionnaires; (5) parameter estimation: the relationships versus the criteria will be estimated by using multiple regression analysis, path analysis, and factor analysis; (6) fitness assessment: the fit between the criteria and the model will be confirmed by calculating the goodness of fit; (7) model modification: poorly fitting criteria will be deleted or modified if the goodness of fit is poor; (8) result discussion: the managerial implications will be discussed.
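Since a full latent-variable SEM is hard to show compactly, the following Python sketch approximates the TAM path structure with a chain of OLS regressions on observed composite scores (a simple path analysis, not the authors' SEM estimation); the file name, the column names and the assumed paths are illustrative assumptions.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical composite scores per respondent for the TAM constructs.
df = pd.read_csv("tam_survey_scores.csv")   # columns: PEU, PU, ATT, BI, B

def path(y, xs):
    """Fit one structural path y ~ xs and return its path coefficients."""
    model = sm.OLS(df[y], sm.add_constant(df[xs])).fit()
    return model.params[xs]

# Assumed TAM chain: PEU -> PU; PEU, PU -> ATT; PU, ATT -> BI; BI -> B.
print("PU  <- PEU     :", path("PU", ["PEU"]).round(3).to_dict())
print("ATT <- PU, PEU :", path("ATT", ["PU", "PEU"]).round(3).to_dict())
print("BI  <- PU, ATT :", path("BI", ["PU", "ATT"]).round(3).to_dict())
print("B   <- BI      :", path("B", ["BI"]).round(3).to_dict())
```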
4 Empirical Study In order to verify the framework presented in Section 3 and to demonstrate how the LUM differentiates the causal relationships derived from lead users from those derived from mass customers, an empirical study is based on pilot research results obtained by surveying four experts now serving in the world's leading IT firms providing Tablet PC related products and thirty mass customers interested in purchasing Tablet PCs. The empirical study consists of five stages: (1) deriving the factors influencing the acceptance of the technology by literature review; (2) deriving the causal relationship versus each requirement of the lead users by using the DNP; (3) selecting the significant factors based on the degree of the central role and the criteria weights in the DNP method; (4) deriving the causal relationship versus each requirement of the mass customers by using the SEM; (5) selecting the significant factors based on the total effects in the SEM method. At first, the factors influencing the acceptance of Tablet PCs were collected based on literature review. These factors include (1) the perceived usefulness (PU), (2) the perceived ease of use (PEU), (3) the subjective norm (SN), (4) the perceived behavioral control (PBC), (5) the attitude (ATT), (6) the behavioral intention (BI), and (7) the actual system use (B), which refers to the actual behaviors of users (Davis 1986). After the literature review, the causal relationship and structure versus each requirement (criterion) from the lead users' perspective were derived by using the DNP. Then, the SEM was applied to derive the relationships and structure from the viewpoint of the mass customers. In order to derive the causal structure based on the opinions of the lead users, four experts now serving in the world's leading IT firms were invited, and the causal structure was derived by using the DNP. After the derivation of the total-influence matrix, the causal relationships were derived by setting 0.599 as the threshold. Based on the empirical study results of the DNP, the PU, PEU, BI, ATT, PBC, SN and B can serve as the factors for predicting the acceptance of Tablet PCs. However, there is no influence between the PBC and the other criteria; consequently, the actual purchase cannot be predicted by the PBC. The causal structure derived by using the DNP is demonstrated in Fig. 2(a). Further, the DNP is applied to derive the weights versus each factor, which are 0.143, 0.155, 0.144, 0.149, 0.133, 0.126, and 0.151 for the PU, PEU, ATT, BI, SN, PBC, and B, respectively. According to the derived weights, the ATT, PU, PEU and BI are important factors from the viewpoint of the lead users, while the PBC is regarded as non-vital. For deriving the mass customers' viewpoints by using the SEM, this research first regards the SN and the PBC as external variables in the TAM. In a good-fit model, the p-value should be greater than 0.5 while the RMSEA should be smaller than 0.05. However, in this pilot study the initial model did not meet these criteria, so the full TAM could not be confirmed from the mass customers' perspective. Therefore, this research deleted the subjective norm and the perceived behavioral control. The fit statistics, chi-square (74.66), p-value (1.00), and RMSEA (0.00), were all indicators of a good fit after the reduction. On the other hand, the path coefficients, which refer to the causal relationship between two latent variables, including a
dependent variable and an independent variable, can be derived by using multiple regression. Namely, the dependent variable can be predicted by the independent variable and the path coefficient. The stronger the influence and the causal relationship are, the higher the path coefficient is. The analytic result derived by using the SEM is demonstrated in Fig. 2(b). The path coefficients versus each latent variable (referring to the influences) are demonstrated in Fig. 2(c). According to the analytic results, the PU and the ATT significantly influenced the PEU. The BI also influences the actual behavior most significantly.
Fig. 2 (a) The causal relationship derived by the DNP, (b) the causal relationship derived by the SEM, (c) the path coefficients derived by the SEM
5 Discussion This research aims to establish an analytic procedure for deriving the factors influencing consumers' intention of accepting future Tablet PCs by using the DNP based LUM, and to examine the differences between the causal structures derived from the opinions of the lead users and those of the mass customers. Managerial implications and advances in research methods are discussed in this Section. Regarding the managerial implications, both the causal relationship structure and the strength of the influences between the factors differ between the lead users and the mass customers. From the aspect of the lead users, the PU, PEU, ATT, and BI are recognized as significant factors influencing customers' acceptance of Tablet PCs. From the perspective of the mass customers, there is no causal relationship between the SN, the PBC and the other criteria. Therefore, the actual usage of Tablet PCs cannot be predicted by the SN and the PBC. On the other hand, the BI influences the actual usage significantly, and the PU and PEU also influence the actual usage indirectly. Further, the SN and the PBC are unsuitable for predicting the acceptance of Tablet PCs. Regarding the importance of the acceptance-predicting factors, the PEU and the BI influence the other factors the most. Consequently,
comparing the analytic results derived from the lead users and the mass customers, the BI was recognized as important by both the lead users and the mass customers. The PU was recognized as important by the lead users only, while the PEU was recognized as important by the mass customers only. Regarding the advances in the research methods, the SEM and the DNP have been verified and compared in this research. The strength of the causal relationships can be derived by the SEM, whereas both the causal structure and the strength of the causal relationships can be derived by the DNP. Both the SEM and the DNP can be used to derive the causal relationships and the corresponding weight versus each criterion. Nevertheless, for the SEM, the causal path versus each criterion must be set in advance according to social science theory (i.e., management, consumer behavior and marketing); the strength of the causal relationships is then derived by the SEM. On the contrary, both the causal structure and the strength of influence can be derived by the DNP. Namely, the SEM can be applied to solve problems with a pre-defined causal structure, whereas the DNP can be applied to solve problems without a pre-defined causal structure.
6 Conclusions The prediction of consumer behaviors toward Tablet PCs is a difficult but indispensable task due to the fast-evolving technology and the severe competition among vendors. This research attempted to predict and compare consumer behaviors based on the acceptance of the mass customers and the lead users by using the SEM and the novel DNP method proposed by Prof. Gwo-Hshiung Tzeng. According to the analytic results, both the preferences and the causal structures diverge between the lead users and the mass customers. On one hand, the perceived ease of use and the behavioral intention are important for predicting the acceptance of the mass customers. On the other hand, the perceived usefulness and the behavioral intention influence the acceptance of the lead users. Last but not least, the feasibility of the SEM and the DNP has been verified in this research. The two methods can be applied to solve different problems: the SEM can be applied to problems with a pre-defined causal structure, while the DNP can be applied to problems without a pre-defined causal structure.
References
Ajzen, I.: Attitudes, traits, and actions: Dispositional prediction of behavior in personality and social psychology. Advances in Experimental Social Psychology 20, 1–63 (1987)
Ajzen, I.: Attitudes, personality, and behavior. Open University Press, Milton Keynes (1989)
Ajzen, I., Fishbein, M.: Understanding Attitudes and Predicting Social Behavior. Prentice-Hall, Englewood Cliffs (1980)
Bagozzi, R.P.: Attitudes, intentions, and behavior: A test of some key hypotheses. Journal of Personality and Social Psychology 41(4), 607–627 (1981)
Beck, H., Mylonas, A., Rasmussen, R.: Business Communication and Technologies in a Changing World. Macmillan Education Australia (2009)
Chiu, Y.J., Chen, H.C., Shyu, J.Z., Tzeng, G.H.: Marketing strategy based on customer behavior for the LCD-TV. International Journal of Management and Decision Making 7(2/3), 143–165 (2006)
Davis, F.D.: A Technology Acceptance Model for Empirically Testing New End-User Information Systems: Theory and Results. Doctoral dissertation (1986)
Davis, F.D., Bagozzi, R.P., Warshaw, P.R.: User acceptance of computer technology: a comparison of two theoretical models. Management Science 35(8), 982–1003 (1989)
Fishbein, M., Ajzen, I.: Attitudes toward objects as predictors of single and multiple behavioral criteria. Psychological Review 81(1), 59–74 (1974)
Gabus, A., Fontela, E.: World Problems, an Invitation to Further Thought Within the Framework of DEMATEL. Battelle Geneva Research Center, Switzerland (1972)
Hu, L.T., Bentler, P.M.: Cutoff criteria for fit indexes in covariance structure analysis. Structural Equation Modeling 6(1), 1–55 (1999)
Huang, C.Y., Tzeng, G.H.: Reconfiguring the Innovation Policy Portfolios for Taiwan's SIP Mall Industry. Technovation 27(12), 744–765 (2007)
Huang, C.Y., Hong, Y.H., Tzeng, G.H.: Assessment of the Appropriate Fuel Cell Technology for the Next Generation Hybrid Power Automobiles. Journal of Advanced Computational Intelligence and Intelligent Informatics (2011) (forthcoming)
Rogers, E.M.: Diffusion of Innovations (1962)
Saaty, T.L.: The Analytic Hierarchy Process. McGraw-Hill, New York (1980)
Schumacker, R.E., Lomax, R.G.: A Beginner's Guide to Structural Equation Modeling. Lawrence Erlbaum Associates, Publishers, Mahwah (1996)
Tamura, H., Akazawa, K., Nagata, H.: Structural modeling of uneasy factors for creating safe, secure and reliable society. Paper presented at the SICE System Integration Division Annual Conference (2002)
Tzeng, G.-H., Huang, C.-Y.: Combined DEMATEL technique with hybrid MCDM methods for creating the aspired intelligent global manufacturing & logistics systems. Annals of Operations Research, 1–32 (2011)
Urban, G., von Hippel, E.: Lead User Analyses for the Development of New Industrial Products. Management Science 34(5), 569–582 (1988)
Wei, P.L., Huang, J.H., Tzeng, G.H., Wu, S.I.: Causal modeling of Web-Advertising Effects by improving SEM based on DEMATEL technique. Information Technology & Decision Making 9(5), 799–829 (2010)
A Map Information Sharing System among Refugees in Disaster Areas, on the Basis of Ad-Hoc Networks Koichi Asakura, Takuya Chiba, and Toyohide Watanabe
Abstract. In disaster areas, some roads cannot be passed through because of road destruction or rubble from collapsed buildings. Thus, information on safe roads that can be used for evacuation is very important and has to be shared among refugees. In this paper, we propose a map information sharing system for refugees in disaster areas. This system stores the roads passed by a refugee as map information. When another refugee comes close to the refugee, they exchange their map information with each other in an ad-hoc network manner. In this exchange, in order to reduce the communication frequency, the Minimum Bounding Rectangle (MBR) of the map information is calculated for comparing the similarity of the map information. Experimental results show that the quantity of map information increases through data exchange between refugees and that the frequency of communication is reduced by using the MBR. Keywords: information sharing, ad-hoc network, disaster area, minimum bounding rectangle. Koichi Asakura Department of Information Systems, School of Informatics, Daido University, 10-3 Takiharu-cho, Minami-ku, Nagoya 457-8530, Japan e-mail:
[email protected] Takuya Chiba Department of Information Systems, School of Informatics, Daido University, 10-3 Takiharu-cho, Minami-ku, Nagoya 457-8530, Japan e-mail:
[email protected] Toyohide Watanabe Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan e-mail:
[email protected] J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 367–376. c Springer-Verlag Berlin Heidelberg 2011 springerlink.com
1 Introduction For communication systems in disaster situations such as a big earthquake, mobile ad-hoc network (MANET) technologies have attracted great attention recently[1, 2]. In such situations, mobile phones and wireless LAN networks cannot be used, since communication infrastructures such as base stations and WiFi access points may be broken or malfunctioning. A MANET is defined as an autonomous network formed by a collection of nodes that communicate with each other without any communication infrastructure[3, 4, 5]. Thus, the MANET is suitable for communication systems in disaster areas since it does not require any communication infrastructure. In this paper, we propose a map information sharing system among refugees in disaster areas which is based on MANET technologies. In order for refugees to evacuate to shelters quickly, correct and up-to-the-minute map information is essential. Namely, information on road conditions, that is, which roads can be passed through safely, which roads should be selected for quick evacuation and so on, is very important for refugees. Our proposed system stores passed roads as map information. When a refugee comes close to another refugee, their systems exchange map information with each other. This exchange is performed in an ad-hoc network manner. Thus, the map information of refugees is merged, and information on roads that can be passed safely in the disaster situation is collected without any communication infrastructure. The rest of this paper is organized as follows. Section 2 describes related work. Section 3 describes our proposed system in detail. Section 4 explains our experiments. Finally, Section 5 concludes this paper and gives our future work.
2 Related Work Many map information systems have been developed so far. Roughly, map information systems can be categorized into two types: static map information systems and networked map information systems. In static map information systems[6, 7], map information such as road networks, building information and so on is installed on users' computers before they are used. Thus, such a system can be used quickly everywhere, even if there is no equipment for network connection. However, the installed map information is not updated in a real-time manner, which decreases the correctness and timeliness of the information. On the other hand, many products for networked map information systems, such as Google Maps[8], Yahoo Local Maps[9], and so on, have been developed. In this type of map information system, map information is stored on the server computers of a service provider and users acquire map information through the Internet. Thus, a ubiquitous network environment must be provided. As mentioned later, providing a network connection environment is very difficult in disaster areas. In some navigation systems, the optimal route is provided by using real-time traffic information. For example, the Vehicle Information and Communication System (VICS) provides traffic information such as traffic jams, traffic accidents and so on,
in real time[10, 11]. However, such traffic information is gathered by static sensors. Namely, information is measured by static sensors at many different locations and gathered to central servers by using static networks. Thus, infrastructure equipment must be provided and operated correctly. However, such static equipment may be broken or malfunctioning in a disaster situation. Thus, refugees cannot use these systems in a disaster situation. Communication systems based on MANET technologies in disaster areas have also been proposed[12, 13]. However, these systems are mainly used by rescuers in government, and thus do not focus on real-time information sharing among refugees.
3 Map Information Sharing among Refugees 3.1 System Overview In a disaster area, such as one hit by a big earthquake, the most important factors with respect to information are correctness and timeliness. In such an area, the situation is different from that in normal times. Furthermore, it changes from moment to moment: for example, roads cannot be used because of spreading fires, rubble from collapsed buildings, and so on. Thus, for developing effective information systems for refugees, correctness and timeliness have to be taken into account. In order to keep information correct and timely in a disaster situation, we must not rely on static infrastructure equipment such as sensors and networks, because such static equipment cannot be used in a disaster situation. Thus, we have to develop an information sharing mechanism that works with client terminals only, namely without any central servers. Figure 1 shows our proposed map information sharing system for refugees in disaster areas. This system provides refugees with information on safe roads that can be used for evacuation to shelters. Our proposed system consists only of client terminals that have static map data for map matching, ad-hoc network communication mechanisms for communicating with neighboring terminals, and the Global Positioning System (GPS) for acquiring the current positions of refugees. When a refugee moves to a shelter for evacuation, the system stores the passed roads as map information. This map information states that the roads stored in the system can be passed through for evacuation. Such timely map information is very useful not only for other refugees but also for rescuers, since information on whether a road can be used or not can be acquired only on the spot. When another refugee comes close, the refugees' terminals communicate with each other in the ad-hoc network manner and exchange their map information. With this system, we can collect and share map information in real time, which enables refugees to move to shelters by using safe and available roads.
Fig. 1 System overview (the trajectories of refugees A, B and C are shared via ad-hoc networks to form a safe road map of the disaster area)
Fig. 2 Map matching: (a) history of positions, (b) extracted road segments (intersections p1–p4 and road segments r12, r23, r34)
3.2 Map Information The map information stored in the system represents a history of positions of a refugee. In order to reduce the amount of data stored as map information, we introduce a map matching method[14, 15, 16] for extracting the road segments passed by a refugee. The road segments used by a refugee are determined based on the sequence of position information acquired by GPS and the road network data in the static map data. The extracted road segments are stored in the system as map information. Definition 1 (Road segment). A road segment $r_{ij}$ is defined as a two-tuple of intersections $p_i$ and $p_j$: $r_{ij} = (p_i, p_j)$. Definition 2 (Map information). Map information $M$ is defined as a sequence of road segments which are passed by a refugee: $M = \langle r_{ij}, r_{jk}, \cdots, r_{mn} \rangle$. Figure 2 shows an example of map information. Figure 2(a) represents the road network data and a history of positions of a refugee captured by GPS. By using the map matching method, intersections $p_1, \cdots, p_4$ are extracted as points passed by the refugee, as shown in Figure 2(b). Then, road segments $r_{12}$, $r_{23}$ and $r_{34}$ are stored as the map information of the refugee.
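The definitions above map naturally onto simple data structures; the Python sketch below is an illustrative interpretation (the class and field names are not taken from the paper).

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass(frozen=True)
class RoadSegment:
    """Definition 1: a road segment r_ij as a two-tuple of intersections."""
    p_i: Tuple[float, float]   # (x, y) of intersection p_i
    p_j: Tuple[float, float]   # (x, y) of intersection p_j

@dataclass
class MapInformation:
    """Definition 2: an ordered sequence of road segments passed by a refugee."""
    segments: List[RoadSegment] = field(default_factory=list)

    def add_matched_segment(self, segment: RoadSegment) -> None:
        # A segment is appended once map matching decides the refugee used it.
        if not self.segments or self.segments[-1] != segment:
            self.segments.append(segment)

# Example: the refugee of Fig. 2 passing intersections p1 -> p2 -> p3 -> p4.
p1, p2, p3, p4 = (0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0)
m = MapInformation()
for seg in (RoadSegment(p1, p2), RoadSegment(p2, p3), RoadSegment(p3, p4)):
    m.add_matched_segment(seg)
print(len(m.segments))  # 3 segments: r12, r23, r34
```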
Fig. 3 A minimum bounding rectangle (MBR)
3.3 Exchanging Map Information When a refugee comes close to another refugee, they exchange their map information. This exchange is performed by ad-hoc network communication, which enables refugees to share map information without any communication infrastructure such as the Internet. The processing flow is as follows. 1. A refugee's terminal periodically sends a beacon packet, which notifies the existence of the refugee to the surrounding refugees' terminals. This beacon packet is called the HELLO packet. 2. When a terminal receives a HELLO packet from another terminal, the terminal checks whether an exchange of map information is required or not based on the information in the HELLO packet (a detailed algorithm is described later). If the exchange is required, the terminal sends back a packet as a reply. This packet is called the REPLY packet. 3. The terminal which receives the REPLY packet sends a packet containing map information. This packet is called the MAP packet. 4. The terminal which receives the MAP packet also sends back a MAP packet. In order to achieve effective information exchange and to reduce the power consumption of terminals, we have to reduce the frequency of communication for the exchange of map information. This is because the size of MAP packets containing map information is relatively large. If two refugees have almost the same map information, they can omit the data exchange. In order to represent the similarity of map information, we introduce the Minimum Bounding Rectangle (MBR) of the road segments in map information. The MBR is a rectangle surrounding all the road segments in the map information. Figure 3 shows an example of the MBR. If the ratio of the overlapped area of two MBRs is high, the similarity of the map information is also regarded as high, and thus the two terminals omit sending MAP packets to each other. HELLO Packet The HELLO packet consists of the following attributes. source ID: This shows a unique identifier of the sender terminal.
ALGORITHM: SIMILARITY
Input: MBR_s. Output: Similarity.
BEGIN
  Calculate the minimum bounding rectangle of the refugee's own map information: MBR_r.
  MBR_ov := overlapped rectangle between MBR_s and MBR_r.
  S_ov := area of MBR_ov.
  S := max(area of MBR_s, area of MBR_r).
  Similarity := S_ov / S.
  return Similarity.
END
Fig. 4 An algorithm for calculating the similarity between two MBRs
MBR: This shows the MBR of the sender's map information. It consists of the x, y coordinates of the upper-left and bottom-right corners of the MBR.
The HELLO packet has no destination ID. Namely, the HELLO packet is received by all neighboring terminals within the communication range of the sender terminal.
REPLY Packet
When a terminal receives a HELLO packet, it calculates the similarity of map information by using the MBR in the HELLO packet. Figure 4 shows the algorithm for calculating the similarity. The MBR of the sender of the HELLO packet is denoted as MBR_s. First, the receiver terminal calculates the MBR of its own map information: MBR_r. Then, MBR_ov, the overlapped part of the two MBRs, is calculated. If the ratio of the area of MBR_ov to the larger area of the two MBRs is lower than a threshold, the similarity of the map information is regarded as low, and thus a REPLY packet is sent back to the sender of the HELLO packet. The REPLY packet consists of the following attributes.
source ID: This shows a unique identifier of the sender terminal.
destination ID: This specifies the destination terminal.
MAP Packet
The MAP packet is used for exchanging the map information between two terminals. The MAP packet consists of the following attributes.
source ID: This shows a unique identifier of the sender terminal.
destination ID: This specifies the destination terminal.
size: This describes the number of road segments in this packet.
Fig. 5 Simulation area
road segments: As shown in Section 3.2, map information consists of a sequence of road segments. Thus, this attribute contains the sequence of road segments. One road segment is expressed as a two-tuple of x, y coordinates. It is clear that the size of the MAP packet is variable and relatively large in comparison with the other packets. Thus, we have to reduce the number of MAP packets by using the MBR described above.
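The similarity test of Figure 4 and the decision of whether to answer a HELLO packet with a REPLY can be sketched as follows; this is a simplified illustration, and the function names and the MBR representation (a (min_x, min_y, max_x, max_y) tuple) are assumptions rather than definitions from the paper.

```python
# Sketch of the SIMILARITY algorithm (Fig. 4) and the REPLY decision: the MBR of
# the receiver's own map information is compared with the MBR carried in the
# HELLO packet, and a REPLY is sent only when the overlap ratio is below the
# threshold. MBRs are assumed to be (min_x, min_y, max_x, max_y) tuples.

def mbr_of(segments):
    # Minimum bounding rectangle of all segment endpoints.
    xs = [x for s in segments for x in (s.start[0], s.end[0])]
    ys = [y for s in segments for y in (s.start[1], s.end[1])]
    return (min(xs), min(ys), max(xs), max(ys))

def area(m):
    return max(m[2] - m[0], 0.0) * max(m[3] - m[1], 0.0)

def similarity(mbr_s, mbr_r):
    # Ratio of the overlapped area to the larger of the two MBR areas (Fig. 4).
    overlap = (max(0.0, min(mbr_s[2], mbr_r[2]) - max(mbr_s[0], mbr_r[0])) *
               max(0.0, min(mbr_s[3], mbr_r[3]) - max(mbr_s[1], mbr_r[1])))
    return overlap / max(area(mbr_s), area(mbr_r))

def should_reply(hello_mbr, own_segments, threshold):
    # Exchange MAP packets only when the two maps are sufficiently dissimilar.
    return similarity(hello_mbr, mbr_of(own_segments)) < threshold
```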
4 Experiments
We conducted a simulation experiment to evaluate the proposed information sharing system for refugees. This section describes the experimental results.
4.1 Simulation Settings In the experiments, we provided a virtual disaster area 2.0 kilometers wide and 1.5 kilometers high. Figure 5 shows the simulation area. In the experiments, 300 refugees were deployed on the road randomly and moved randomly. Refugees are denoted as dots on roads with ID numbers in Figure 5. The communication range was set to 100 [m]. We have two parameters for the experiments: a sending interval of the HELLO packet and a threshold of the similarity of two MBRs. In order to evaluate the system, we measured the following two indicators. The frequency of communication: The number of packets was measured by varying the threshold of the similarity of MBRs in order to evaluate the method using MBRs.
The degree of information sharing: The number of road segments in map information that were acquired from other refugees by the system was measured by varying the threshold of the similarity of MBRs. For each case, we conducted five experiments and calculated the average values.
4.2 Experimental Results
Figure 6, Figure 7 and Figure 8 show the experimental results when the sending intervals of the HELLO packets are 10 [sec], 30 [sec] and 60 [sec], respectively. Figure 6(a), Figure 7(a) and Figure 8(a) show the number of each packet. Figure 6(b), Figure 7(b) and Figure 8(b) show the number of road segments in map information. "Own" describes the number of road segments that are captured by the refugees' own GPS devices, and "Others" describes the number of road segments that are acquired
Fig. 6 Experimental results when the sending interval is 10 seconds: (a) the number of packets (HELLO, REPLY, MAP) and (b) the number of road segments (Own, Others), plotted against the threshold of the similarity of MBRs
Fig. 7 Experimental results when the sending interval is 30 seconds: (a) the number of packets and (b) the number of road segments, plotted against the threshold of the similarity of MBRs
Fig. 8 Experimental results when the sending interval is 60 seconds: (a) the number of packets and (b) the number of road segments, plotted against the threshold of the similarity of MBRs
through data exchange with other refugees. From these experimental results, we can clarify the following.
• The number of packets depends on the threshold of the similarity between two MBRs. However, the number of packets is almost the same when the threshold value is lower than or equal to 70%, whereas it increases exponentially when the threshold value is higher than 70%. This property is independent of the sending interval of the packets.
• Refugees can acquire a large amount of map information from other refugees with our proposed system. The number of road segments that are acquired by exchanging map information is not influenced by the threshold values, although it is highly influenced by the communication frequency. Namely, when the threshold value is lower than 80%, data exchange is achieved effectively with a lower communication frequency. This property is also independent of the sending interval of the packets.
From these results, we can conclude that our proposed system increases the quantity of shared map information and that the communication frequency can be controlled effectively by using the similarity of two MBRs.
5 Conclusion
In this paper, we proposed a map information sharing system for refugees in disaster areas. In this system, refugees record their position histories as map information and share this map information with neighboring refugees in an ad-hoc network manner. By sharing map information, refugees can acquire correct and timely information on safe roads in disaster areas without any central servers. Experimental results show that the proposed ad-hoc-network-based sharing method increases the quantity of shared map information and that the frequency of communication is reduced appropriately by using the MBR.
A remaining problem in comparing map information is that an MBR cannot represent the area of the map information accurately; it represents the area only approximately. Thus, in future work, we plan to introduce the convex hull [17] of the road segments in map information. Furthermore, we have to evaluate the system in practice, not only in computer simulation.
References 1. Midkiff, S.F., Bostian, C.W.: Rapidly-Deployable Broadband Wireless Networks for Disaster and Emergency Response. In: The 1st IEEE Workshop on Disaster Recovery Networks, DIREN 2002 (2002) 2. Meissner, A., Luckenbach, T., Risse, T., Kirste, T., Kirchner, H.: Design Challenges for an Integrated Disaster Management Communication and Information System. In: The 1st IEEE Workshop on Disaster Recovery Networks, DIREN 2002 (2002) 3. Toh, C.-K.: Ad Hoc Mobile Wireless Networks: Protocols and Systems. Prentice-Hall, Englewood Cliffs (2001) 4. Murthy, C.S.R., Manoj, B.S.: Ad Hoc Wireless Networks: Architectures and Protocols. Prentice Hall, Englewood Cliffs (2004) 5. Lang, D.: Routing Protocols for Mobile Ad Hoc Networks: Classification, Evaluation and Challenges. VDM Verlag (2008) 6. Esri: ArcGIS, http://www.esri.com/software/arcgis/ 7. Fudemame: Pro Atlas SV6, http://fudemame.net/products/map/pasv6/ (in Japanese) 8. Google: Goole Maps, http://maps.google.com/ 9. Yahoo: Yahoo Local Maps, http://maps.yahoo.com/ 10. Sugimoto, T.: Current Status of ITS and its International Cooperation. In: International Conference on Intelligent Transportation Systems, p. 462 (1999) 11. Nagaoka, K.: Travel Time System by Using Vehicle Information and Communication System (VICS). In: International Conference on Intelligent Transportation Systems, p. 816 (1999) 12. Mase, K.: Communications Supported by Ad Hoc Networks in Disasters. Journal of the Institute of Electronics, Information, and Communication Engineers 89(9), 796–800 (2006) 13. Umedu, T., Urabe, H., Tsukamoto, J., Sato, K., Higashino, T.: A MANET Protocol for Information Gathering from Disaster Victims. In: The 2nd IEEE PerCom Workshop on Pervasive Wireless Networking, pp. 447–451 (2006) 14. Quddus, M.A., Ochieng, W.Y., Zhao, L., Noland, R.B.: A General Map Matching Algorithm for Transport Telematics Applications. GPS Solutions 7(3), 157–167 (2003) 15. Brakatsoulas, S., Pfoser, D., Salas, R., Wenk, C.: On Map-matching Vehicle Tracking Data. In: The 31st International Conference on VLDB, pp. 853–864 (2005) 16. Quddus, M.A., Ochieng, W.Y., Zhao, L., Noland, R.B.: Current Map-matching Algorithms for Transport Applications: State-of-the-art and Future Research Directions. Transportation Research Part C 15, 312–328 (2007) 17. Berg, M.D., Cheong, O., Kreveld, M.V., Overmars, M.: Computational Geometry: Algorithms and Applications. Springer, Heidelberg (2008)
A Study on a Multi-period Inventory Model with Quantity Discounts Based on the Previous Order Sungmook Lim
*
Abstract. Lee [Lee J-Y (2008). Quantity discounts based on the previous order in a two-period inventory model with demand uncertainty. Journal of Operational Research Society 59: 1004-1011] previously examined quantity discount contracts between a manufacturer and a retailer in a stochastic, two-period inventory model in which quantity discounts are provided on the basis of the previous order size. In this paper, we extend the above two-period model to a k-period one (where k > 2) and propose a stochastic nonlinear mixed binary integer program for it. With the k-period model developed herein, we suggest a solution procedure of receding horizon control style to solve n-period (n > k) order decision problems.
1 Introduction
This paper deals with a single-item, stochastic, multi-period inventory model, in which the retailer places an order with the manufacturer in each of the periods to fulfill stochastic demand. In particular, the manufacturer offers a QDP (quantity discounts based on the previous order) contract, under which the retailer receives a price discount on purchases in the next period in excess of the present-period order quantity. This type of quantity discount scheme is generally referred to as an incremental QDP. We intend to construct a mathematical programming formulation of the model and propose a method to solve the problem. For the past few decades, a number of studies have been conducted to develop decision-making models for various supply chain management problems, and a great deal of specific attention has been paid to inventory models with quantity discounts. The majority of quantity discount models have been studied using deterministic settings. Hadley and Whitin (1963), Rubin et al. (1983), and Sethi (1984) studied the problem of determining the economic order quantity for the buyer, given a quantity discount schedule established by the supplier. Monahan Sungmook Lim Dept. of Business Administration, Korea University, Chungnam 339-700, Republic of Korea e-mail:
[email protected] J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 377–387. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
(1984) discussed a quantity discount policy which maximizes the supplier's profit while not increasing the buyer's cost. Lal and Staelin (1984) presented a fixedorder quantity decision model that assumed special discount pricing structure forms involving multiple buyers and constant demands. Lee and Rosenblatt (1986) generalized Monahan’s model to increase suppliers’ profits by incorporating constraints imposed on the discount rate and relaxing the assumption of a lot-for-lot supplier policy. Weng and Wong (1993) developed a general all-unit quantity discount model for a single buyer or for multiple buyers to determine the optimal pricing and replenishment policy. Weng (1995) later presented models for determining optimal all-unit and incremental quantity discount policies, and evaluated the effects of quantity discounts on increasing demand and ensuring Paretoefficient transactions under general price-sensitive demand functions. Hoffmann (2000) also analyzed the impact of all-unit quantity discounts on channel coordination in a system comprising one supplier and a group of heterogeneous buyers. Chang and Chang (2001) described a mixed integer optimization approach for the inventory problem with variable lead time, crashing costs, and price-quantity discounts. Yang (2004) recently proposed an optimal pricing and ordering policy for a deteriorating item with price-sensitive demand. In addition to these studies, Dada and Srikanth (1987), Corbett and de Groote (2000), and Viswanathan and Wang (2003) evaluated quantity discount pricing models in EOQ settings from the viewpoint of the supplier. Chung et al.(1987) and Sohn and Hwang (1987) studied dynamic lot-sizing problems with quantity discounts. Bregman (1991) and Bregman and Silver (1993) examined a variety of lot-sizing methods for purchased materials in MRP environments when discounts are available from suppliers. Tsai (2007) solved a nonlinear SCM model capable of simultaneously treating various quantity discount functions, including linear, single breakpoint, step, and multiple breakpoint functions, in which a nonlinear model is approximated to a linear mixed 0-1 program that can be solved to determine a global optimum. Recently, several researchers have studied quantity discount models involving demand uncertainty. Jucker and Rosenblatt (1985) examined quantity discounts in the context of the single-period inventory model with demand uncertainty (also referred to as the newsvendor model). Based on a marginal analysis, they provide a solution procedure for determining the optimal order quantity for the buyer. Weng (2004) developed a generalized newsvendor model, in which the buyer can place a second order at the end of the single selling period to satisfy unmet demand, and provides quantity discount policies for the manufacturer to induce the buyer to order the quantity that maximizes the channel profit. Su and Shi (2002) and Shi and Su (2004) examined returns-quantity discounts contracts between a manufacturer and a retailer using a single-period inventory model, in which the retailer is offered a quantity discount schedule and is allowed to return unsold goods. Lee (2008), whose study motivated our own, developed a single-item, stochastic, two-period inventory model to determine the optimal order quantity in each period, wherein the manufacturer offers a QDP contract, under which the retailer receives a price discount on purchases in the next-period in excess of the presentperiod order quantity. 
It can be asserted that Lee's work extended the literature in two directions: firstly, it studied quantity discounts in a two-period inventory
model with demand uncertainty; and secondly, it evaluated quantity discounts based on the previous order. He acquired three main results. First, under the (incremental and all-units) QDP contract, the retailer’s optimal ordering decision in the second period depends on the sum of the initial inventory (i.e. inventory at the beginning of the second period) and first-period order quantity (i.e. price break point). Second, under the QDP contract, the retailer orders less in the first period and more in the second period, as compared with the order quantities under the wholesale-price-only contract. However, the total order quantity may not increase significantly. Third, the QDP contract always increases the retailer's profit, but will increase the manufacturer’s profit only in cases in which the wholesale margin is large relative to the retail margin. He derived an analytical formula for the optimal second-period order quantity, but due to the complicated structure of the profit function he could propose only a simple search method enumerating all possible values to determine the optimal first-period order quantity. Therefore, his model can be viewed as a one-period model. This study (i) extends Lee's work to develop a k-period model, and (ii) proposes a mathematical programming approach to the model. This paper extends the literature in two directions. First, this is the first work to deal with a k-period (k > 2) inventory model with demand uncertainty and quantity discounts on the basis of the previous order quantity. Second, whereas Lee (2008) did not provide an efficient solution method to determine the optimal first-period order quantity, we developed a mathematical optimization model involving all periods under consideration and proposed an efficient procedure for solving the model.
2 Model
We consider a single-item, stochastic, k-period inventory model involving a single manufacturer and a single retailer. The retailer places replenishment orders with the manufacturer to satisfy stochastic demand from the customer, and the customer's demand in each period is distributed independently and identically. The manufacturer provides the retailer with a QDP contract, under which the manufacturer provides the retailer with a price discount on purchases in the next period in excess of the present-period order quantity. Under this contract, the retailer places replenishment orders with the manufacturer at the beginning of each period, and it is assumed that the orders are delivered immediately, with no lead time. While unmet demands are backordered, those in the k-th period are assumed to be lost. Items left unsold at the end of the k-th period are assumed to have no value, although this assumption can be readily relaxed by adjusting the inventory holding cost for the k-th period. The retailer incurs linear inventory holding and shortage costs, and the manufacturer produces the item under a lot-for-lot system. The following symbols and notations will be used hereafter:
- demand in the i-th period, a discrete random variable with a given mean
- probability function of demand
- cumulative distribution function of demand
- inventory level at the beginning of the i-th period
- order quantity in the i-th period
- retail price per unit
- wholesale price per unit
- production cost per unit
- price discount per unit
- inventory holding cost per unit per period for the retailer
- shortage cost per unit per period for the retailer in the i-th period
- retailer's profit in the i-th period
- retailer's sales in the i-th period
- retailer's purchasing cost in the i-th period
- retailer's inventory holding cost in the i-th period
- retailer's shortage cost in the i-th period
With the exception of the shortage costs, all price and cost parameters are fixed over the periods. The following figure diagrams the inventory levels and order quantities over the i-th period and the (i+1)-th period.
Here, the inventory level at the end of the i-th period coincides with the inventory level at the beginning of the (i+1)-th period. The retailer's profit in the i-th period is the sales revenue minus the purchasing cost and the inventory holding and shortage costs incurred in the same period, as expressed in Eqs. (1)–(4). The retailer's total profit is then the sum of the per-period profits, as given in Eq. (5).
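As an illustration of this per-period bookkeeping under the incremental QDP contract, one period's profit can be computed as in the sketch below; the variable names are illustrative placeholders rather than the model's own notation, and the sketch assumes a nonnegative starting inventory.

```python
# Illustrative computation of the retailer's profit in one period under an
# incremental QDP contract: units ordered beyond the previous period's order
# quantity are bought at the discounted price (w - d). Variable names are
# placeholders, not the model's own notation.

def period_profit(inventory, order, prev_order, demand, r, w, d, h, s):
    discounted_units = max(order - prev_order, 0)        # units eligible for the discount
    regular_units = order - discounted_units
    purchasing_cost = w * regular_units + (w - d) * discounted_units
    available = inventory + order                        # stock after delivery
    sales = r * max(min(available, demand), 0)           # revenue from satisfied demand
    holding_cost = h * max(available - demand, 0)        # leftover inventory
    shortage_cost = s * max(demand - available, 0)       # backordered demand
    return sales - purchasing_cost - holding_cost - shortage_cost
```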
A mathematical programming model to maximize the retailer's profit over k periods can be formulated as follows:
(P)   (6)
where the order quantity in the 0-th period is an input; if it is not known or given, it can be set to a sufficiently large number, which means that no price discount is offered in the first period. Since the demand is a random variable, it is impossible to solve the above problem (P) directly. Robust optimization is an approach to solving optimization problems with uncertain parameters such as (P). Robust optimization models can be classified into two categories, depending on how the uncertainty of the parameters is incorporated into the model: stochastic robust optimization and worst-case robust optimization. To illustrate the two models, consider the following optimization problem:
min_x  f0(x, u)
s.t.  fi(x, u) ≤ 0,  i = 1, …, m    (7)
where x is the decision variable, the function f0 is the objective function, the functions fi, i = 1, …, m, are the constraint functions, and u is an uncertain parameter vector. In stochastic robust optimization models, the parameter vector u is modeled as a random variable with a known distribution, and we work with the expected values of the constraint and objective functions, as follows:
min_x  E[f0(x, u)]
s.t.  E[fi(x, u)] ≤ 0,  i = 1, …, m    (8)
where the expectation is with respect to u. In worst-case robust optimization models, we are given a set U in which u is known to lie, and we work with the worst-case values of the constraint and objective functions as follows:
min_x  sup_{u∈U} f0(x, u)
s.t.  sup_{u∈U} fi(x, u) ≤ 0,  i = 1, …, m    (9)
In this study, we assume that the customer's demand in each period has a truncated Poisson distribution, and we solve the problem (P) using stochastic robust optimization. The demand in each period has a given largest attainable value and a given expected value. If we take the expectation of the objective function and the constraint functions with respect to the demand, we obtain the following problem (P1):
(P1)   (10)
Now, let us evaluate the expected values involved in the constraint functions. First of all, it can be readily shown that the identity in Eq. (11) holds. If we introduce two additional sets of variables defined as sums of the order quantities and of the demands over the periods, the constraints can be rewritten in terms of them. The summed demand variable again has a truncated Poisson distribution, whose mean and largest attainable value are the corresponding sums over the periods, and whose probability function is the Poisson probability normalized over this finite support.
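For reference, a right-truncated Poisson probability function on {0, 1, ..., N} renormalizes the Poisson weights over the finite support; the sketch below is a generic illustration, and the names and the exact truncation convention are assumptions rather than the paper's own definitions.

```python
# Generic sketch of a truncated Poisson distribution on {0, 1, ..., N}:
# p(n) = (lam**n / n!) / sum_{m=0}^{N} (lam**m / m!). Illustrative only.
from math import factorial

def truncated_poisson_pmf(n, lam, N):
    if not 0 <= n <= N:
        return 0.0
    norm = sum(lam ** m / factorial(m) for m in range(N + 1))
    return (lam ** n / factorial(n)) / norm

def truncated_poisson_mean(lam, N):
    # Expected value of the truncated demand, e.g. for the expected-value constraints.
    return sum(n * truncated_poisson_pmf(n, lam, N) for n in range(N + 1))
```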
The required expected values can then be evaluated as in Eqs. (12) and (13), where the cumulative distribution function of the truncated Poisson variable appears and empty sums are defined to be zero. Substituting these expressions, the mathematical programming problem (P1) can be transformed to
(P2)   (14)
which is a nonlinear mixed integer optimization problem. Although the objective function is linear, the complicated structure of the constraint functions renders the problem non-convex. Therefore, it is not an easy task to find the global optimal solution to the problem. In order to make the problem more tractable, we propose a technique for the piecewise linearization of nonlinear functions, which will be described in detail in the following section.
3 Solution Procedure
In this section, we develop a linear approximation-based solution procedure for the nonlinear mixed integer optimization problem (P2) derived in the previous section. Furthermore, we propose an algorithm based on the concept of receding horizon control (Kwon and Han, 2005) for solving n-period models using the solutions of k-period models (n > k).
First of all, one nonlinear term can be linearized by introducing a binary variable and the four inequalities in Eq. (15), where M is a sufficiently large number. Similarly, a second nonlinear term can be linearized using another binary variable and the four inequalities in Eq. (16).
On the other hand, as it is difficult to perfectly linearize the remaining nonlinear term, we approximate it by piecewise linearization. The nonlinear function is approximated by a piecewise linear function over l intervals (l an integer, l ≥ 1), as in Eq. (17), where the slopes and the y-intercepts of the piecewise linear function in each interval and the lower and upper bounds of each interval are given. The piecewise linear function can then be expressed with binary variables that select the active interval and sum to one, as in Eq. (18).
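As a generic illustration of this type of big-M linearization (the symbols w, x, y and the bound U are illustrative, and the exact inequalities used in (15)–(16) may differ), the product of a continuous variable x ∈ [0, U] and a binary variable y can be replaced by an auxiliary variable w constrained by four linear inequalities:

w ≤ U·y,   w ≤ x,   w ≥ x − U·(1 − y),   w ≥ 0.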
Based on the above linearization, we can finally obtain the following mathematical programming model:
(P3)   (19)
Up to now, we have discussed the development of a k-period model and its solution procedure. With increasing k values, however, the numbers of constraints and variables introduced for linearization increase significantly, as does the
computation time of the solution procedure. Consequently, there are some practical difficulties associated with the application of the k-period model to real decision-making situations with a long planning horizon. In order to resolve this difficulty, we propose an online algorithm of receding horizon control style for solving an n-period (n > k) model using the solutions of k-period models. The algorithm begins by solving a k-period model with an initial condition to determine the optimal first-period order quantity. As one time period passes, the customer's demand in the first period is realized, and thus the inventory level at the beginning of the second period is determined. Then, the optimal second-period order quantity is determined by solving another k-period model with the input of the initial inventory level and the previous order quantity. These steps are repeated over time. The algorithm can be described, generally, as follows:
< Algorithm: The solution procedure for n-period problems >
Step 0: Set the period index to 1.
Step 1: Set v to the minimum of k and the number of remaining periods.
Step 2: Solve the v-period model (P3) with the input of the initial inventory level and the previous order quantity to determine the optimal order quantities over the next v periods.
Step 3: Place an order of the optimal size computed for the current period.
Step 4: As one time period passes, the customer's demand in the current period is realized, and thus the inventory level at the beginning of the next period is determined.
Step 5: If the end of the planning horizon has been reached, stop. Otherwise, advance the period index by one and return to Step 1.
When v equals 1, that is, when the solution of a one-period model is required, the method proposed by Lee (2008) can be used. Lee's method determines the optimal second-period order quantity given an initial inventory at the beginning of the second period and a previous order quantity. The optimal first-period order quantity is obtained via a simple enumeration of all possible values. Therefore, Lee's two-period model, by its nature, can be seen as a one-period model, as previously noted in Section 1.
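A minimal sketch of this receding-horizon loop is given below; solve_sub_model and sample_demand are hypothetical stand-ins for solving (P3) (or Lee's one-period method when v = 1) and for the realized demand, and the simple inventory-balance update is an assumption of the sketch.

```python
# Receding-horizon driver for an n-period problem using k-period sub-models.
# 'solve_sub_model(v, inventory, prev_order)' is a hypothetical stand-in that
# returns the optimal order quantities for the next v periods (problem (P3),
# or Lee's method when v == 1); 'sample_demand(j)' stands in for the realized
# demand of period j. The inventory-balance update is an assumption.

def receding_horizon(n, k, initial_inventory, initial_order,
                     solve_sub_model, sample_demand):
    inventory = initial_inventory
    prev_order = initial_order
    placed_orders = []
    for j in range(1, n + 1):                          # Steps 0 and 5: loop over periods
        v = min(k, n - j + 1)                          # Step 1: sub-model horizon
        plan = solve_sub_model(v, inventory, prev_order)   # Step 2
        order = plan[0]                                # Step 3: implement only the first decision
        placed_orders.append(order)
        demand = sample_demand(j)                      # Step 4: demand is realized
        inventory = inventory + order - demand         # start-of-next-period inventory
        prev_order = order
    return placed_orders
```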
4 Concluding Remarks
This study established an inventory model with price discounts based on the previous order quantity that determines the optimal order quantities over multiple periods, and developed a solution procedure for the model. This work makes two major contributions to the relevant literature. Firstly, this is the first study to deal with a single-item, stochastic, multi-period inventory model with price discounts on the basis of the previous order quantity. Secondly, this study developed a mathematical programming model that simultaneously determines the optimal order quantities for all periods under consideration, whereas Lee (2008) did not develop any efficient method for the determination of the optimal first-period order quantity.
The proposed k-period inventory model for determining the optimal order quantities was formulated as a mixed integer nonlinear programming problem, and a piecewise linearization technique based on an evolutionary algorithm and linear regression was suggested for the transformation of the nonlinear problem into a linear one. A solution procedure of receding horizon control style was also developed for the solution of n-period problems using the solutions of k-period problems (n > k). Although this study extends the literature meaningfully, it has some inherent limitations, owing principally to the fact that the proposed method is a kind of an approximate approach, based on a linearization of nonlinear functions. Considering the practical importance of multi-period inventory models with QDP, the development of optimal algorithms can be considered worthwhile.
References 1. Bregman, R.L.: An experimental comparison of MRP purchase discount methods. Journal of Operational Research Society 42, 235–245 (1991) 2. Bregman, R.L., Silver, E.A.: A modification of the silver-meal heuristic to handle MRP purchase discount situations. Journal of Operational Research Society 44, 717– 723 (1993) 3. Bowerman, P.N., Nolty, R.G., Scheuer, E.M.: Calculation of the poisson cumulative distribution function. IEEE Transactions on Reliability 39, 158–161 (1990) 4. Chang, C.T., Chang, S.C.: On the inventory model with variable lead time and pricequantity discount. Journal of the Operational Research Society 52, 1151–1158 (2001) 5. Chung, C.-S., Chiang, D.T., Lu, C.-Y.: An optimal algorithm for the quantity discount problem. Journal of Operations Management 7, 165–177 (1987) 6. Corbett, C.J., de Groote, X.: A supplier’ optimal quantity discount policy under asymmetric information. Management Science 46, 444–450 (2000) 7. Dada, M., Srikanth, K.N.: Pricing policies for quantity discounts. Management Science 33, 1247–1252 (1987) 8. Hadley, G., Whitin, T.M.: Analysis of Inventory Systems. Prentice-Hall, Englewood Cliffs (1963) 9. Hoffmann, C.: Supplier’s pricing policy in a Just-in-Time environment. Computers and Operations Research 27, 1357–1373 (2000) 10. Jucker, J.V., Rosenblatt, M.J.: Single-period inventory models with demand uncertainty and quantity discounts: Behavioral implications and a new solution procedure. Naval Research Logistics Quarterly 32, 537–550 (1985) 11. Knuth, D.E.: Seminumerical Algorithms. The Art of Computer Programming, 2nd edn. Addison-Wesley, Reading (1969) 12. Kesner, I.F., Walters, R.: Class—or mass? Harvard Business Review 83, 35–45 (2005) 13. Kwon, W.H., Han, S.: Receding horizon control: Model predictive control for state models. Springer, Heidelberg (2005) 14. Lal, R., Staelin, R.: An approach for developing an optimal discount pricing policy. Management Science 30, 1524–1539 (1984) 15. Lee, J.-Y.: Quantity discounts based on the previous order in a two-period inventory model with demand uncertainty. Journal of Operational Research Society 59, 1004–1011 (2008)
16. Lee, H.L., Rosenblatt, J.: A generalized quantity discount pricing model to increase supplier’s profits. Management Science 33, 1167–1185 (1986) 17. Monahan, J.P.: A quantity pricing model to increase vendor profits. Management Science 30, 720–726 (1984) 18. Rubin, P.A., Dilts, D.M., Barron, B.A.: Economic order quantities with quantity discounts: Grandma does it best. Decision Sciences 14, 270–281 (1983) 19. Sethi, S.P.: A quantity discount lot size model with disposal. International Journal of Production Research 22, 31–39 (1984) 20. Shi, C.-S., Su, C.-T.: Integrated inventory model of returns-quantity discounts contract. Journal of Operational Research Society 55, 240–246 (2004) 21. Sohn, K.I., Hwang, H.: A dynamic quantity discount lot size model with resales. European Journal of Operational Research 28, 293–297 (1987) 22. Su, C.-T., Shi, C.-S.: A manufacturer’ optimal quantity discount strategy and return policy through game-theoretic approach. Journal of Operational Research Society 53, 922–926 (2002) 23. Tsai, J.-F.: An optimization approach for supply chain management models with quantity discount policy. European Journal of Operational Research 177, 982–994 (2007) 24. Viswanathan, S., Wang, Q.: Discount pricing decisions in distribution channels with price-sensitive demand. European Journal of Operational Research 149, 571–587 (2003) 25. Weng, Z.K.: Coordinating order quantities between the manufacturer and the buyer: A generalized newsvendor model. European Journal of Operational Research 156, 148– 161 (2004) 26. Weng, Z.K.: Modeling quantity discounts under general price-sensitive demand functions: Optimal policies and relationships. European Journal of Operational Research 86, 300–314 (1995) 27. Weng, Z.K., Wong, R.T.: General models for the supplier’s all-unit quantity discount policy. Naval Research Logistics 40, 971–991 (1993) 28. Yang, P.C.: Pricing strategy for deteriorating items using quantity discount when demand is price sensitive. European Journal of Operational Research 157, 389–397 (2004)
A Study on the ECOAccountancy through Analytical Network Process Measurement Chaang-Yung Kung, Chien-Jung Lai, Wen-Ming Wu, You-Shyang Chen, and Yu-Kuang Cheng *
Abstract. Enterprises have always sacrificed the benefits of the human and environmental causes in order to pursue the maximization of companies’ profits. With more awareness of the concept of environmental protection and Corporate Social Responsibility (CSR), enterprises have to commence to take the potential duty for the human and the environment beyond the going-concern assumption and the achievement of profits maximization for stockholders. Further, with reference to vigorous environmental issues (unstable variation of climate, huge fluctuation of the Earth’s crust, abnormal raises of the sea and so on), the traditional accounting principles and assumptions do conform to the contemporarily social requirement. Therefore, based on the full disclosure principle, enterprises are further supposed to disclose the full financial cost of complying with environment regulations (regarding the dystrophication resulted from operation, amended honorable of contamination, environmental protection policies and so on) which are going to lead to enterprises are still able to achieve “economy-efficiency” for the environments under the go-concern assumption. Hence, the innovative accounting theory (the ECOAccountancy Theory) is created to establish new accounting principles and regulations for confronting these issues. This paper utilizes three essential relations in consolidation that are evaluated by nineteen assessable sub-criteria of four evaluated criteria through the comparison of the Analytical Network Process (ANP). Specifically, the specific feature of the three-approach models is to calculate the priority vector weights of each assessable characteristic, criteria and subcriteria by pairwise comparative matrix. Furthermore, in the content, the analytical hierarchical relations are definitely expressed in four levels among each criterion Chaang-Yung Kung Department of International Business, National Taichung University of Education *
Chien-Jung Lai · Wen-Ming Wu Department of Distribution Management, National Chin-Yi University of Technology You-Shyang Chen Department of Applied English, National Chin-Yi University of Technology Yu-Kuang Cheng Department of English, National Taichung University of Education J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 389–397. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
which enable enterprises to choose the potential role of ECOAccountancy in CSR in a thriving hypercompetitive commerce environment. Keywords: ECOAccountancy, Analytical Network Process (ANP).
1 Introduction Beyond pursuing maximum profits and developing the giant organization, the enterprises around the world have devoted to collect social resources consisted of the manpower, environmental materials, political supports and such forth which results in direct circumstantial scarification included air pollution, water dystrophication, land contamination and social utilitarianism effect. Hence, in the people’s concentration of environmental concept era, the traditional environmental actions or policies of these enterprises are not enough to deal with the issues regarding various new competitive environment challenges. Enterprises, immediately, have to introspect for the Corporate Social Responsibilities (“CSR”) to have the pressure of competing to positively adapt and to form an effectively and comprehensively environmental strategy. However, in terms of final goals for all enterprises, the financial profits are still the most critical achievement to chase and therefore, the most completely reflected reports are the financial statements contained the income statement, balance sheet, stockholders’ equity statement, retained earnings and cash-flow statement. Further, according to concept of [1], in terms of the development of corporate social responsibilities Phase, the initial CSR development of enterprises focuses on setting up domestic CSR footholds. As the enterprises grow, they then concentrate on establishing central CSR national centers in order to set up the useful accounting department regarding in hypercompetitive commerce environment. Most enterprises employ multiple-national CSR strategies to achieve the most beneficial social responsibilities effectiveness. In terms of general manufacture developmental organization, there are currently three main organization structures which are suited in the multi-national off-shoring CSR strategies: (1) Integrated Device Manufacturer (IDM), (2) Original Design Manufacturer (ODM), and (3) Original Equipment Manufacturer (OEM). [2] Therefore, as long as the environmental concepts are considered into the design of products or procedures of production, the pollution are able to be minimized. However, many of these multi-national enterprises have re-considered their worldwide CSR strategies by analyzing the significant cost of indirect outsourcing and off-shoring of the CSR. Further, in a hypercompetitive and lower profits environment, enterprises are faced with the decision of how to cost down their operational expenditure of the Economy Accountancy (“ECOAccountancy”) through three principle accounting strategies: traditional accounting with passively green accountancy, traditional accounting with actively green accountancy, and effectively diversified ECOAccountancy. In terms of developmental geography for enterprises, enterprises are supposed to institute a complete and competitive global CSR accounting that deal with the pollutions due to cross-industry knowledge and high-contamination technologies with each other
in order to integrate their capacity for the highest benefits of achieving the lowest expenditure strategy of the operational accountancy system ("GAS"), as shown in Figure 1. Up to the present, the new innovation of a successful ECOAccountancy in the global enterprises spreads wealth far beyond the lead position, which bears primary responsibility for conceiving, coordinating, and marketing new products through effective and efficient global accounting in order to create the most beneficial synergy. While the enterprises positioned at the lead level and their shareholders are the main intended beneficiaries of the enterprise's accounting strategic planning, other beneficiaries include partners in the enterprise's accounting, and firms that offer complementary products or services may also benefit [3].
Fig. 1 The CSR and ECOAccountancy Development Trend (from the domestic CSR strategy with domestic accounting, through the national and multiple-national CSR strategies, to the global competitive CSR strategies with global accounting; CSR strategy development plotted against the operational expenditure of the ECOAccountancy and profits centralization)
2 Methodologies
2.1 Literature Review on the CSR Performance Analysis
A large number of qualitative and quantitative papers and journals have studied CSR performance analysis. [4] focuses on an alternative performance measure of the operational expenditure of the ECOAccountancy in new product development ("NPD") by considering the effect of the "acceleration of trap" over a series of time periods to analyze the financial performance of CSR. [5] addresses the top eleven measured metrics out of 33 assessable metrics when creating his
technology value pyramid through evaluating 165 industrial companies. The analytical model bases on the out-put oriented from top-down analytical steps. [1] surveys 150 questionnaires to present the difference between enterprise’s focusspots of the CSR performance and academic research-points because enterprises pay more attention on the based expenditure, time in need, quality of product of the CSR and oppositely, academic researches concentrate on customer-related measure as designing and developing the researches.
2.2 Literature Review on the Analysis Network Process (ANP) Literature on analytical network process (ANP) The initial theory and idea of the analytical network process (“ANP”) is published by the research journal of Thomas L. [6], professor of University of Pittsburgh which is utilized for handling the more complex research questions are not solved by analytical hierarchy process (“AHP”). Due to the original decision hypothesis principle (variable) of AHP defined to the “independence”, AHP is challenged for its fundamental theory by some scholars and decisive leaders because the relationships between characteristic, criteria, sub-criteria and selected candidates are not certain “independence”. [7] develops the new research methodology, positive reciprocal matrix and supermatrix, to pierce out this limited hypothesis in order to implement more complicated hierarchical analysis. More scholars further has combined AHP model into more analytical approach to inductively create the ANP. Afterwards, more researches integrate others analytical methods such as factor analysis to infer more assessable and accurate methods such as data envelopment analysis (“DEA”) and quality function deployment (“QFD”).
3 Learning of Imbalanced Data In terms of assessing the complexity and uncertainty challenges surrounding the ANP model, a compilation of expert’s collection was analyzed along with empirical survey in order to achieve retrospective cross-sectional analysis of the accounting relationship between the enterprises and accounting partners for diminishing the operational expenditure of the ECOAccountancy. This section not only characterizes the overall research design, research specification of analytical and research methodology but also is designed for comparing each assessable criteria of the relationship for characteristic, criteria, sub-criteria and selected candidates.
3.1 Research Design The research design framework, in this research, is presented in Figure 2, which contains four main research design steps: identifying, selecting, utilizing and integrating. Overall research steps includes identifying the research motive, using the research model development, measuring framework, selecting the research methodology, investigating procedures, analyzing empirically collected data, assessing overall analytical criteria through the use of Delphi method, comparing and empirical analysis in order to make a comprehensive conclusion.
Fig. 2 The Research Design Framework [8] (four main steps: identify the research motive in order to define the clear research purpose; select the research methodology; utilize the research methodology to analyze empirical data; integrate the overall analysis to inductively make conclusions)
In terms of the representativeness of the efficient ANP model through transitivity, the comparing-weights principle, evaluated criteria, the positive reciprocal matrix and the supermatrix, the research data source must collectively and statistically constrain all impacted experts' opinions related to each assessable criterion. Based on the assessment of the ANP model, the pairwise comparisons of the evaluation characteristics, criteria and attributes at each level are evaluated with respect to the related interdependence and importance, from equally important (1) to extremely important (9), as expressed in Figure 3.
Fig. 3 The Research Assessable Criteria (pairwise comparisons, on a 0–9 scale from equal to extreme importance, between the characteristics, criteria, attributes, and selected candidates of ECOAccountancy)
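As a generic sketch of how such pairwise judgements are turned into priority weights in AHP/ANP (standard eigenvector prioritization; the matrix entries below are made-up examples, not the study's survey data):

```python
# Generic sketch: derive priority weights (the principal eigenvector) from a
# positive reciprocal pairwise comparison matrix on Saaty's 1-9 scale.
# The example entries are made up for illustration.
import numpy as np

def priority_vector(matrix):
    a = np.asarray(matrix, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eig(a)
    k = int(np.argmax(eigenvalues.real))     # index of the principal eigenvalue
    w = np.abs(eigenvectors[:, k].real)
    return w / w.sum()                       # normalized priority weights

A = [[1.0, 3.0, 5.0],                        # three criteria compared pairwise
     [1/3, 1.0, 2.0],
     [1/5, 1/2, 1.0]]
print(priority_vector(A))
```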
Based on the principle of consistency ratio, the pairwise comparison matrix can be acceptable when the number of C.R. is equal or small than 0.01. Further, the research data source in this research is derived from the scholars and experts who understand the measurement performance of the operational expenditure of the ECOAccountancy and the ANP model, are employed or served in Taiwan and Mainland China. Additionally, according to the fundamental characteristics of the CSR management and the ANP with concepts of [9] and [10], the three basic performance measurement of The operational expenditure of the ECOAccountancy have been considered into the characteristics of the CSR management are costdown policy, strategic demand and business development. Further, based on the collected data of expert’s opinion, this research is organized based on the following five assessable criteria: productive cost, productive technology, human resource, products marketing and company profits together with their homologous
sub-criteria, which are expressed in Figure 4. These criteria are then used in this research to testify and analyze the consistency of three kinds of accounting strategies: traditional accounting with passively green accountancy, traditional accounting with actively green accountancy, and effectively diversified ECOAccountancy.
(1) Productive Cost. For the overall reflection of the performance evaluation of the operational expenditure of the ECOAccountancy for enterprises in production from the three characteristics of CSR management, three principal assessable sub-criteria are considered in this criterion from the financial perspective: direct cost ("DC"), indirect cost ("IC") and manufacture expense ("ME").
(2) Productive Technology. In terms of ensuring rising manufacturing technology after accounting, two assessable sub-criteria, based on the experts' opinion, are considered in this criterion of qualitative and quantitative review: yield rate ("YR") and automatic rate ("AR").
(3) Human Resource. In order to realize the effect of the accounting strategy on human resources, two major sub-criteria, according to effectiveness and efficiency concepts and the experts' discussion, are considered in this criterion: productive rate of human resource ("PR-HR") and capacity-growing rate of human resource ("CGR-HR").
(4) Products Marketing. In terms of evaluating products marketing after accounting, the experts surveyed in this research considered two chief evaluated sub-criteria in this criterion: market share rate ("MSR") and customer satisfaction ("CS").
(5) Company Profits. Based on the discussion of the experts, the three principal and crucial evaluated sub-criteria in this criterion are return on asset ("ROA"), gross profit rate ("GPR") and net income after total expenditure ("NI-ATE").
Fig. 4 The Relationship Among Assessable Attitudes, Criteria, Sub-criteria, and Candidates [11] (hierarchy from the best accounting strategy with the lowest CSR expenditure, through the characteristics of CSR management (cost-down policy, strategic demand, business development) and the assessment criteria with their sub-criteria, down to the three candidate accounting strategies)
4 Classifier for Large Imbalanced Data
In the hierarchical relations at the last level, each potential accounting partner has to be matched against each assessable sub-criterion of each evaluated criterion through the pairwise-compared performance of the potential accounting strategies. In order to reflect the comparative score for the three kinds of accounting strategies, Eq. (1) is applied to compute the comprehensively comparative related priority weight w (eigenvector) in the matrix. Consequently, the appropriate accounting partner is selected by calculating the "accounting comparative index" D_i [11], which is defined by:

D_i = Σ_{j=1}^{s} Σ_{k=1}^{k_j} P_j T_{kj} R_{ikj}    (1)

where P_j is the importance-related priority weight w (eigenvector) for assessable criterion j, T_{kj} is the importance-related priority weight w (eigenvector) for assessable attribute k of criterion j, and R_{ikj} is the importance of potential accounting partner i on attribute k of criterion j. Additionally, based on the manipulation of Eq. (1), the ultimate evaluation step is to combine the overall outcome of the complete importance-related priority weights w (eigenvector), as shown in Table 1.

Table 1 ECOAccountancy Comparative Indexes (Productive cost / Productive technology / Human resource / Products marketing)
- Traditional accounting with passively ECOAccountancy: 0.5115
- Traditional accounting with actively ECOAccountancy: 0.2833
- Effectively diversified ECOAccountancy: 0.2069
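A small sketch of Eq. (1) follows; the weights P, T and the ratings R are made-up illustrative numbers, not the values elicited in this study.

```python
# Sketch of the comparative index D_i = sum_j sum_k P_j * T_jk * R_ijk for one
# candidate strategy i. P[j] is the weight of criterion j, T[j][k] the weight of
# sub-criterion k under criterion j, and R_i[j][k] the candidate's rating on it.
def comparative_index(P, T, R_i):
    return sum(P[j] * T[j][k] * R_i[j][k]
               for j in range(len(P))
               for k in range(len(T[j])))

# Made-up example: two criteria with two sub-criteria each.
P = [0.6, 0.4]
T = [[0.7, 0.3], [0.5, 0.5]]
R_candidate = [[0.5, 0.2], [0.4, 0.3]]
print(comparative_index(P, T, R_candidate))
```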
First, the highest evaluated score of 0.1842 is in sub-criterion of direct cost (DC) of assessable criterion of productive cost during implementing traditional accounting with passively green accountancy strategy. Then, the highest evaluated score of 0.77 is in sub-criterion of direct cost (DC) of assessable criteria of productive cost during practicing traditional accounting with actively green accountancy strategy as well. However, the highest evaluated score of 0.0717 is in sub-criterion of automatic rate (AR) of productive technology during handling effectively diversified ECOAccountancy strategy. Further, consequently, the highest result of the evaluated score of accounting comparative index of 0.5115 is traditional accounting with passively green accountancy strategy which means the best selection of accounting strategy is traditional accounting with passively green accountancy from minimizing the operational expenditure of the ECOAccountancy Y for the enterprises.
5 Concluding Remarks This study has motivated global enterprises to undertake their fundamental CSR activities through accountings with competing companies (local and foreign). Specifically, as a result in this research, the traditional accounting with passively green accountancy is the best competitive strategy under the lowest The operational expenditure of the ECOAccountancy by evaluating the characteristics, assessable criteria and sub-criteria under the current business environment due to the incomplete ECOAccountancy with difficult accountancy (such as unannounced accounting systems from global accounting boards, none impartial auditing from third-party, and enterprises incompletely financial disclosures and so on). Our contention, therefore, not only focuses on the original central concept of three kinds of accounting strategies but also concentrates on the diminishment of The operational expenditure of the ECOAccountancy during the selection of the best potential accounting strategy through new, financial perspective and novel approach (ANP model). The ANP model is used not only to clearly establish comprehensively hierarchical relations between each assessable criterion but also to assist the decision-maker to select the best potential traditional accounting with passively green accountancy strategy with the lowest operational expenditure of the ECOAccountancy influence through the academic Delphi method and expert’s survey. In the content, there are five main assessable criteria which cover keypoint of evaluating the innovative shortcut in competitive accounting strategy. The next step beyond this research is to focus attention on analyzing additional influences of The operational expenditure of the ECOAccountancy which is created in the accounting strategy through more measurement and assessment. As these comprehensive versions are respected, the enterprises will be able to obtain more comparativeness under the lower operational expenditure of the ECOAccountancy through traditional accounting with passively green accountancy strategy to survive in this complex, higher-comparative and lower-profit era.
References [1] Driva, H., et al.: Measuring product development performance in manufacturing organizations. International Journal of Production Economic 66, 147–159 (2000) [2] Lewis, J.D.: The new power of strategic accountings. Planning Review 20(5), 45–46 (1992) [3] Kottolli, A.: Lobalization of CSR. CSR Management 36(2), 21–23 (2005) [4] Curtis, C.C.: Nonfinancial performance measures in new product development. Journal of Cost Management 1, 18–26 (1994) [5] Curtis, C.C.: Balance scorecards for new product development. Journal of Cost Management 1, 12–18 (2002) [6] Saaty, T.L.: Decision Making with Dependence and Feedback: The Analytic Network Process. RWS Publications, Pittsburgh (1996) [7] Saaty, T.L.: Multi-criteria decision making: the analytic hierarchy process. RWS Publications, Pittsburgh (1998)
[8] Hsieh, M.-Y., et al.: Management Perspective on the Evaluation of the ECOAccountancy in the Corporate Social Responsibility. Electronic Trend Publications, ETP (2010) [9] Bidault, Cummings: CSR strategic challenges for multinational corporations. CSR Management 21(3), 35–41 (1997) [10] Ernst, D., Bleeke, J.: Collaborating to Compete: Using Strategic Accountings and Acquisitions in the Global Marketplace. Wiley, New York (1993) [11] Hsieh, M.-Y., et al.: Management Perspective on the Evaluation of the ECOAccountancy in the Corporate Social Responsibility. In: 2010 International Conference on Management Science and Engineering, Wuhan, China, pp. 393–396 (2010)
Attribute Coding for the Rough Set Theory Based Rule Simplifications by Using the Particle Swarm Optimization Algorithm Jieh-Ren Chang, Yow-Hao Jheng, Chi-Hsiang Lo, and Betty Chang
*
Abstract. The attribute coding approach has been used in the Rough Set Theory (RST) based classification problems. The attribute coding defined ranges of the attribute values as multi-thresholds. If attribute values can be defined as appropriate values, the appropriate number of rules will be generated. The attribute coding for the RST based rule derivations significantly reduces unnecessary rules and simplifies the classification results. Therefore, how the appropriate attribute values can be defined will be very critical for rule derivations by using the RST. In this study, the authors intend to introduce the particle swarm optimization (PSO) algorithm to adjust the attribute setting scopes as an optimization problem to derive the most appropriate attribute values in a complex information system. Finally, the efficiency of the proposed method will be benchmarked with other algorithms by using the Fisher’s iris data set. Based on the benchmark results, the simpler rules can be generated and better classification performance can be achieved by using the PSO based attribute coding method. Keywords: Particle Swarm Optimization (PSO); Rough Set Theory (RST); Attribute Coding; optimization. Jieh-Ren Chang Department of Electronic Engineering, National Ilan University No. 1, Sec. 1, Shen-Lung Road, I-Lan, 260, Taiwan, R.O.C e-mail:
[email protected] Yow-Hao Jheng Department of Business and Entrepreneurial Administration, Kainan University No. 1, Sec. 1, Shen-Lung Road, I-Lan, 260, Taiwan, R.O.C e-mail:
[email protected] Chi-Hsiang Lo Institute of Management of Technology, National Chiao Tung University No. 1, Sec. 1, Shen-Lung Road, I-Lan, 260, Taiwan, R.O.C e-mail:
[email protected] Betty Chang Graduate Institute of Architecture and Sustainable Planning, National Ilan University No. 1, Sec. 1, Shen-Lung Road, I-Lan, 260, Taiwan, R.O.C e-mail:
[email protected]
1 Introduction
In recent years, many expert systems have been established for deriving appropriate responses and answers based on knowledge based systems. In real life, however, knowledge is filled with ambiguity and uncertainty, and conflicting and unnecessary rules can usually be found in a knowledge based system. The goal of this research is to avoid excessive system operations through simplification and correct descriptions of inputs for knowledge based systems. In addition, the overall efficiency of classification and computation can be enhanced by reasonable analysis. The Rough Set Theory (RST) [15, 16, 19] was proposed by Zdzislaw Pawlak in 1982. The RST has been widely applied in various fields such as the prediction of business outcomes [4], road maintenance [8], insurance market analysis [18], consumer behavior analysis [13], material property identification [7] and so on. According to the procedures of the RST, redundant attributes can be removed automatically and patterns can be identified directly. Pattern classification is a longstanding problem in various engineering fields, such as radar detection, control engineering, speech identification, image recognition, biomedical diagnostics, etc. Despite the huge progress in artificial intelligence research during the past decades, the gap between artificial-intelligence-based pattern recognition and human recognition is still significant. Thus, novel methods have been proposed for improving the performance of pattern classification. The RST is one of the best methods and can be manipulated easily. However, the RST based approaches may generate too many classification rules, which wastes computation time and database space. The attribute coding technique can be introduced as the initial step of the RST based pattern classification approach. The original information is encoded at the beginning of the RST based procedure. Appropriate attribute code scopes can be defined based on the whole numeric range of the information data. An appropriate definition of the attribute code scopes is helpful for generating an appropriate number of rules and reducing unnecessary rules. The Particle Swarm Optimization (PSO) [11] algorithm was proposed by Kennedy and Eberhart in 1995, based on simulating the social behavior of birds' foraging. The PSO is initiated with a population of candidate solutions, or particles. These particles are then moved around in the search space according to simple mathematical formulae. Since the procedures of the PSO are simple and the parameters can easily be adjusted, the PSO has been widely applied in medicine [10], stock [12], energy [14], construction [9] and so on. The PSO based method is suitable for finding optimal solutions in a wide search space. Therefore, the PSO is appropriate for searching the attribute code scopes in the RST. In this study, a method is proposed for attribute coding in RST based rule simplification by using the PSO algorithm. In the next Section, the basic concepts of the RST and the PSO algorithm are introduced. The overall architecture and the procedures of using the PSO algorithm to improve the RST based pattern classification are demonstrated in the third Section. In addition, the details of the initial conditions, termination criteria and modified steps of the proposed PSO algorithm are also described in Section three. Performance evaluation and comparison of the analytic results with the ones derived by other methods based on the Iris data set are demonstrated in the fourth Section. Finally, concluding remarks of this study are presented in Section five. In recent years, many expert systems have been established which offer users appropriate responses and answers through a knowledge database. However, in the real world, a lot of knowledge is filled with ambiguity and uncertainty, and conflicting rules and unnecessary rules can be found if no proper attribute coding is set. Our goal is to avoid excessive system operation through simplification such that we can improve the overall efficiency by reasonable analysis.
2 Related Work-Preliminary
2.1 The RST
The aim of this Section is to introduce the basic concepts of the RST [15, 16, 19]. Before the RST based analytic procedure, the data collection is assumed to be viewed as an information system IS = (U, A), where U = {u_1, u_2, ..., u_R} is the original data set with R data objects from a target system, and A = {a_1, a_2, ..., a_m} is the attribute set which includes m attributes. Each data object can be represented by u_r = (v_{r,1}, v_{r,2}, ..., v_{r,i}, ..., v_{r,m}), where v_{r,i} is usually a real number in mathematical terminology. For the sake of simplification of the RST rules, we would like to transform the numerical data with a special code. First, we should define the attribute code scopes for each attribute. For example, if there are three codes for an attribute, we should define three scopes for this attribute. That means four margin values confine these three scopes in this attribute. Since the data have their own maximum and minimum value for each attribute, only two boundary values should be decided. After the boundary value decision, all data objects will be represented by the transformed attribute coding. After attribute coding, the information system is represented by IS' = (U', A), where U' = {u'_r | r = 1, 2, ..., R} and u'_r = (y_{r,1}, y_{r,2}, ..., y_{r,i}, ..., y_{r,m}), r = 1, 2, ..., R. For each attribute a_i, the information function is defined as f_i : U' -> V_i, where V_i denotes the set of codes for a_i and is called the domain of attribute a_i. The RST method consists of the following procedures:
(a) transform the data objects of the target system by attribute coding,
(b) establish the construction of elementary sets,
(c) check the difference between the lower approximation and upper approximation of the set which represents a class to select good data for classification rules,
(d) evaluate the core and reducts of attributes,
(e) generate the decision table and rules.
Through the above steps, we can find the reducts and cores of attributes for (U', A). A reduct is the essential part of (U', A) which can discern all objects discernible by the original (U', A). The core is the common part of all reducts. Finally, the decision table and classification rules can be established by the reducts and the cores of attributes.
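To make steps (a) and (b) concrete, the following Python sketch (our illustration, not code from the paper) encodes numeric attribute values against given boundary values and then groups the coded objects into elementary sets; the helper names and the toy data are assumptions.

from collections import defaultdict

def encode_value(v, boundaries):
    # Map a numeric value to an attribute code 1 .. len(boundaries) + 1.
    code = 1
    for b in sorted(boundaries):
        if v > b:
            code += 1
    return code

def encode_objects(data, boundaries_per_attribute):
    # Step (a): transform each data object by attribute coding.
    return [tuple(encode_value(v, b)
                  for v, b in zip(row, boundaries_per_attribute))
            for row in data]

def elementary_sets(coded):
    # Step (b): group object indices that share the same coded description.
    groups = defaultdict(list)
    for idx, row in enumerate(coded):
        groups[row].append(idx)
    return dict(groups)

# Tiny example: two attributes, one boundary value each.
data = [(4.9, 3.0), (6.3, 2.8), (5.1, 3.5)]
coded = encode_objects(data, [[5.5], [3.2]])
print(coded)                  # [(1, 1), (2, 1), (1, 2)]
print(elementary_sets(coded))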
2.2 The PSO
Particle swarm optimization [11] is initialized with a population of random solutions of the objective function. All particles have their own position vector and speed vector at any moment. For the i-th particle, the position vector and speed vector are represented by X_i = (x_{i,1}, x_{i,2}, ..., x_{i,D}) and V_i = (v_{i,1}, v_{i,2}, ..., v_{i,D}) respectively, where D is the dimension of the space. Each particle moves by the regulation of Equation (1) and Equation (2), where v_{i,d}(t) is the velocity, P_i = (p_{i,1}, ..., p_{i,D}) is the fittest solution which has been achieved so far by the particle at the current time t, G = (g_1, ..., g_D) is the global optimum solution at the same time, w is the weight value, c1 and c2 are parameters, and r1 and r2 are random numbers, where 0 <= r1 <= 1 and 0 <= r2 <= 1.

v_{i,d}(t+1) = w v_{i,d}(t) + c1 r1 (p_{i,d} - x_{i,d}(t)) + c2 r2 (g_d - x_{i,d}(t))    (1)
x_{i,d}(t+1) = x_{i,d}(t) + v_{i,d}(t+1)    (2)

The PSO algorithm procedures can be described as follows:
(a) A set of solutions {X_i | i = 1, 2, ..., S} and a set of velocities {V_i | i = 1, 2, ..., S} are initialized randomly, where S is the total number of particles; the local optima are initialized as P_i = X_i for i = 1, 2, ..., S and t = 0.
(b) The fitness function value is calculated for each particle.
(c) If the fitness function value which is calculated in step (b) is better than the local optimal solution P_i, then the current local optimal solution P_i is updated.
(d) Considering the fitness function values of all particles, the best one is selected. If it is better than the global optimum G, then the current global optimal solution G is updated.
(e) The particle's velocity and position are changed by Equations (1) and (2).
(f) If the stopping criteria are satisfied, the repetition of the steps stops. Otherwise, it goes back to step (b).
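Equations (1)-(2) and steps (a)-(f) can be summarized in a minimal Python sketch such as the one below; the swarm size, inertia weight, acceleration coefficients and the sphere objective are illustrative assumptions rather than the settings used in this study.

import random

def pso(objective, dim, bounds, swarm_size=30, w=0.7, c1=2.0, c2=2.0, iters=200):
    lo, hi = bounds
    X = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm_size)]
    V = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [x[:] for x in X]
    pbest_val = [objective(x) for x in X]
    g = min(range(swarm_size), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Equation (1): velocity update from personal and global bests.
                V[i][d] = (w * V[i][d]
                           + c1 * r1 * (pbest[i][d] - X[i][d])
                           + c2 * r2 * (gbest[d] - X[i][d]))
                # Equation (2): position update, clamped to the search range.
                X[i][d] = min(hi, max(lo, X[i][d] + V[i][d]))
            val = objective(X[i])
            if val < pbest_val[i]:                 # step (c)
                pbest[i], pbest_val[i] = X[i][:], val
                if val < gbest_val:                # step (d)
                    gbest, gbest_val = X[i][:], val
    return gbest, gbest_val

# Example: minimize the sphere function in 4 dimensions.
best, value = pso(lambda x: sum(v * v for v in x), dim=4, bounds=(-5.0, 5.0))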
3 Research Methodology
3.1 Methodology Structure
The attribute coding is the key issue for building a classifier based on the RST. Therefore, how to define the attribute boundary values for each attribute code can be seen as a multi-dimensional optimization problem. In this study, we use the PSO algorithm to solve the attribute coding problem in the RST classification algorithm. The purpose of this method is to reduce and simplify the RST classification rules, especially when the original input data have many attributes or the system data are continuous real numbers or widespread numerical values. The flow chart of the proposed algorithm is shown in Fig. 1.
Fig. 1 Flow chart of the proposed algorithm
3.2 Particle Swarm Initialization
Assume there are R data objects, a number of output categories, and m attributes for the information system, as shown in Fig. 2. If we could encode d_i + 1 values for each attribute a_i, then d_i cutting values should be decided for the attribute a_i. The number of classification rules in the RST algorithm could be increased or decreased by changing the boundary values of any attribute code. Finding the exact place of every boundary value of the attribute code for each attribute is an optimization problem. Therefore, this study uses the PSO to solve this problem. First, the total number of boundary values needs to be defined as D = d_1 + d_2 + ... + d_m, which is the number of dimensions for each particle in the PSO algorithm. Then, the position values of all particles, which have D dimensions, are randomly initialized. Each position value is confined between the maximum and minimum border values of the corresponding attribute. The speed is initially set to zero. The whole particle swarm data can be represented as in Fig. 2, where, for 1 <= p <= S, X_p = (x_{p,1}, x_{p,2}, ..., x_{p,D}) and V_p = (v_{p,1}, v_{p,2}, ..., v_{p,D}).

Fig. 2 The structures of input data and particle swarm data.
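A possible sketch of this initialization, assuming the attribute ranges and the number of boundary values per attribute are given (the ranges shown are placeholders, approximately the ranges of the four Iris attributes); drawing each attribute's boundary values in sorted order keeps them ascending from the start.

import random

def init_swarm(attribute_ranges, d_per_attribute, swarm_size):
    # attribute_ranges: list of (min, max) per attribute;
    # d_per_attribute: number of boundary values d_i per attribute.
    positions, velocities = [], []
    for _ in range(swarm_size):
        x = []
        for (lo, hi), d in zip(attribute_ranges, d_per_attribute):
            # Sorted draws keep each attribute's boundaries in ascending order.
            x.extend(sorted(random.uniform(lo, hi) for _ in range(d)))
        positions.append(x)
        velocities.append([0.0] * len(x))
    return positions, velocities

# Example: 4 attributes, 2 boundary values each -> 8 dimensions per particle.
X, V = init_swarm([(4.3, 7.9), (2.0, 4.4), (1.0, 6.9), (0.1, 2.5)],
                  [2, 2, 2, 2], swarm_size=500)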
3.3 Fitness Function for the RST Procedure
Formally, let f(X_p, IS) be the fitness or cost function which must be minimized. The function takes a candidate solution as its argument and produces a real number as output which indicates the fitness of the given candidate solution. The goal is to find a solution X* for which f(X*, IS) <= f(X, IS) for all X in the search space, which would mean that X* is the global minimum. For our problem, the fitness function is constructed to minimize the classifier error rate and the number of classification rules in the RST procedure. The fitness function is described by the following pseudo code:

fitness function = fit(X_p, IS) {
  Number_of_rules = RST(X_p, IS);
  Error_rate = Correct_Check(Number_of_rules, IS);
  if (Error_rate != 0)
    return (1 - Error_rate);
  else
    return (Number_of_rules - Number_of_input_data);
}
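A hedged Python rendering of the pseudo code above; rst() and correct_check() are assumed stand-ins for the paper's RST rule generation and classification check, which are not detailed here.

def fit(particle, IS, rst, correct_check):
    # Fitness to be minimized, mirroring the pseudo code.
    # IS is assumed to be a sized collection of data objects.
    rules = rst(particle, IS)              # RST rules generated for this coding
    error_rate = correct_check(rules, IS)  # fraction of misclassified objects
    if error_rate != 0:
        return 1 - error_rate
    # Error-free particles are ranked by rule count minus the data set size.
    return len(rules) - len(IS)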
The beginning step in the RST analysis is to find the attribute code scopes for each particle X_p. The total number of dimensions D is divided into m groups based on the number of attributes. Suppose the number of dimensions is d_i for the i-th group, which is relative to the attribute a_i. This means the d_i boundary values of a_i should be decided and optimized. The structure of dimensions for particle X_p is shown in Fig. 3. According to Equation (3), the original information data v_{r,i}, 1 <= r <= R and 1 <= i <= m, will be encoded as the new split attribute code y_{r,i}, where s_i = d_1 + d_2 + ... + d_i and s_0 = 0. Based on the new attribute code, the decision table and classification rules can be generated by the RST based analysis.

y_{r,i} = 1, if v_{r,i} <= x_{p, s_{i-1}+1};
y_{r,i} = k, if x_{p, s_{i-1}+k-1} < v_{r,i} <= x_{p, s_{i-1}+k}, for k = 2, ..., d_i;
y_{r,i} = d_i + 1, if v_{r,i} > x_{p, s_i}.    (3)

Fig. 3 The structure of dimensions for particle X_p

3.4 Modification and Stopping Criteria
During the PSO operation, it records the regional optimal solution for each particle and the global optimum solution among all particles; they are iteratively modified by Equations (1) and (2) to minimize the fitness function value until the stopping criteria are satisfied. In a normal condition, the solution values of each particle should follow Equation (4). When the location values of a particle are out of order with respect to Equation (4), the order can be restored by using the bubble sorting method. In this study, the stopping criteria are set as follows: (1) the fitness function values of all particles are identical, and they are smaller than or equal to zero; (2) the iteration counter is greater than the preset maximum number of iterations, which is set to avoid an unlimited iteration condition in the PSO process.

x_{p, s_{i-1}+1} <= x_{p, s_{i-1}+2} <= ... <= x_{p, s_i}, for i = 1, 2, ..., m    (4)
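A small sketch of the repair implied by Equation (4): whenever an attribute's block of boundary values falls out of ascending order after an update, the block is simply re-sorted; the paper mentions bubble sorting, and any sorting routine gives the same result.

def restore_order(position, d_per_attribute):
    # Sort each attribute's block of boundary values in place.
    start = 0
    for d in d_per_attribute:
        position[start:start + d] = sorted(position[start:start + d])
        start += d
    return position

# Example: 2 attributes with 2 boundary values each.
print(restore_order([5.8, 5.1, 3.4, 2.9], [2, 2]))  # [5.1, 5.8, 2.9, 3.4]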
4 Experiment Results
We use the Iris data set to test the effectiveness of the proposed method. The Iris data can be divided into three categories: Setosa, Versicolor, and Virginica. Each flower can be identified by four attributes: sepal length, sepal width, petal length, and petal width. There are 150 data objects in the Iris data set. We employed the Iris data to generate the critical values of the attribute code and the rules, and then obtained accuracy rates from the testing data. The algorithm was programmed in MATLAB R2009b; 90% of the data were applied for training and 10% for testing. The critical values of the attribute code were obtained with the PSO algorithm. In this study, we used a total of 500 particles to find the optimal solution. As the output is divided into 3 categories and there are 4 attributes, there are eight dimensions for each particle. The related parameters of the proposed algorithm were set to 4 and 2, with a maximum of 10,000 iterations; 100 simulation runs were processed each time and the resulting values were averaged. The authors compare the results of this study with those from other researches. The average accuracy rates are listed in Table 1. It is found that the average accuracy rate of this study is 97.75%, which is higher than the others; in other words, the proposed PSO based algorithm is more effective.

Table 1 Comparison of average accuracy rate with 90% training and 10% testing data.

Research Method             Average Accuracy Rate
The Proposed Algorithm      97.75%
Chang-and-Jheng [2]         96.72%
Hong-and-Chen [6]           96.67%
Hirsh [5]                   95.78%
Aha-and-Kibler [1]          94.87%
Dasarathy [3]               94.67%
Quinlan [17]                93.89%

The 90% training data of the RST and the proposed method are used to generate rules. The number of rules was compared with those generated with the original RST method, as shown in Table 2.

Table 2 Comparison of number of rules.

                            Number of rules
The Proposed Algorithm      17
RST by original data        31

5 Conclusion
The original RST has a weakness when searching for the critical values of a large number of attributes. In this study, we combined the PSO algorithm with the RST to overcome this weakness. The proposed method was used to classify the Iris data. With the advantage of the combination, the overall performance was greatly improved. In conclusion, this hybrid method was able to find the critical values for each attribute division scope and had a good performance in classification. In addition, it also significantly reduced the number of output rules.
References [1] Aha, D.W., Kibler, D.: Detecting and removing noisy instances from concept descriptions, Tecnical Report, University of California, Irvine, 88–12 (1989) [2] Chang, J.R., Jheng, Y.H.: Optimization of α-cut Value by using Genetic Algorithm for Fuzzy-based rules extraction. In: The 18th National Conference on Fuzzy and Its Applications, pp. 678–683 (2010) [3] Dasarathy, B.V.: Nosing around the neighborhood: A new system structure and classification rule for recognition in partially exposed environments. PAMI 2-1, 67–71 (1980) [4] Dimitras, A.I., Slowinski, R., Susmaga, R., Zopounidis, C.: Business failure prediction using rough sets. European Journal of Operational Research 114(2), 263–280 (1999) [5] Hirsh, H.: Incremental version-space merging: a general framework for concept learning, Ph.D. Thesis, Stanford University (1990) [6] Hong, T.P., Chen, J.B.: Finding relevant attributes and membership functions. Fuzzy Sets and Systems 103(3), 389–404 (1999) [7] Jackson, A.G., Leclair, S.R., Ohme, M.C., Ziarko, W., Kamhwi, H.A.: Rough sets applied to materials data. ACTA Material 44(11), 4475–4484 (1996) [8] Jang, J.R., Hung, J.T.: Rough set theory inference to study pavement maintenance, National Science Council Research Project. Mingshin University of Science and Technology, Hsinchu, Taiwan (2005) [9] Jia, G., Zhang, W.: Using PSO to Reliability Analysis of PC Pipe Pile. In: The 3rd International Symposium on Computational Intelligence, vol. 1, pp. 68– 71 (2010) [10] Jin, J., Wang, Y., Wang, Q., Yang, B.Q.: The VNP-PSO method for medi-cal image registration.In: The 29th Chinese Control Conference, pp. 5203–5205 (2010) [11] Kennedy, J., Eberhart, R.C.: Particle Swarm Optimization. In: Proceeding of the IEEE International Conference on Neural Networks, pp. 12–13 (1995) [12] Khamsawang, S., Wannakarn, P., Jiriwibhakorn, S.: Hybrid PSO-DE for solving the economic dispatch problem with generator constraints. In: 2010 2nd International Conference on Computer and Automation Engineering, vol. 5, pp. 135–139 (2010) [13] Kim, D.J., Ferrin, D.L., RaghavRao, H.: A study of the effect of consumer trust on consumer expectations and satisfaction: The Korean experience. In: Proceedings of the 5th International Conference on Electronic Commerce, Pittsburgh. ACM International Conference Proceeding Series, pp. 310–315 (2003) [14] Kondo, Y., Phimmasone, V., Ono, Y., Miyatake, M.: Verification of efficacy of PSObased MPPT for photovoltaics. In: 2010 International Conference on Electrical Machines and Systems, pp. 593–596 (2010) [15] Pawlak, Z.: Rough sets. International Journal of Parallel Programming 11(5), 341– 356 (1982) [16] Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishing, Dordrecht (1991) [17] Quinlan, J.R., Compton, P.J., Horn, K.A., Lazarus, L.A.: Inductive knowledge acquisition: A case study. In: Quinlan, J.R. (ed.) Applications of Expert systems, Addison-Wesley, Wokingham (1987) [18] Shyng, J.Y., Wang, F.K., Tzeng, G.H., Wu, K.S.: Rough Set Theory in analyzing the attributes of combination values for the insurance market. Expert Systems with Applications 32(1), 56–64 (2007) [19] Walczak, B., Massart, D.L.: Tutorial: Rough sets theory. Chemometrics and Intelligent Laboratory Systems 47, 1–17 (1999)
Building Agents by Assembly Software Components under Organizational Constraints of Multi-Agent System Siam Abderrahim and Maamri Ramdane
Abstract. This paper presents an attempt to provide a framework for building agents by automatic assembly of software components; the assembly is directed by the organization of the MAS to which the agent belongs. In the proposed model the social dimension of the multi-agent system directs the agent, depending on its location (according to its roles) within the organization, to assemble/reassemble components in order to reconfigure itself automatically; the proposed construction of agents is independent of any component model. Keywords: components, agents, MAS, assembly, organization.
1 Introduction Multi-agent systems (MAS) and software components are two important approaches in the world of software development. Both propose to structure software as a composition of software elements, and this structuring eases its development as well as the addition and replacement of elements in it. MAS, with their self-organization capacity, push the level of abstraction further, and we believe that taking the organization of the MAS as an analytical framework can assist in the use of software components for building agents. Several studies [1], [5], [11], [12], [14], ... were interested in building agents by assembling components, indicating the possibility of using the component approach as a framework for the specification of an agent's behavior. In this paper we propose to build agents by automated assembly of components, where the assembly is directed by the organization of the MAS to which the agent belongs. Siam Abderrahim University of Tebessa Algeria, Laboratory LIRE, University Mentouri-Constantine Algeria e-mail:
[email protected] *
Maamri Ramdane Laboratory LIRE, University Mentouri-Constantine, Algeria e-mail:
[email protected]
In this work we propose a structure of the MAS in terms of roles grouped into alliances. Agents agree to play roles in the organization, and for this the agents should have certain skills implemented as software components. In this proposition the social dimension of the multi-agent system directs the agent, depending on its position in the organization (its roles), to assemble/reassemble the components so as to reconfigure itself automatically and dynamically. In this model we abstract away from the assembly itself and from the component models; the emphasis is on the social dimension as a framework for the management of component assembly by the agent.
2 The Component Paradigm The component approach is an advantageous technique for designing distributed and open applications; it offers a balance between, on the one hand, the production of custom code, which is long to develop and validate, and, on the other hand, the reuse of existing software. One vision of the definition of a software component [13] is that it looks like a piece of software "fairly small to create and maintain, and large enough to install it and reuse it." The component concept is an evolution of the object concept, with practically the same objectives: encapsulation, separation of interface and implementation, reusability and the reduction of complexity. We note the existence of several component models, from which we abstract in this work, as well as from the assembly engine, i.e., the way the components are assembled.
3 Agents and Multi-Agent Systems Several definitions have been attributed to the concept of agent, which has been the subject of several studies. A definition of the agent concept proposed by [8] is the following: an agent is a computer system situated in an environment and acting autonomously and flexibly to achieve the objectives for which it was designed. The concepts of "situated", "autonomous" and "flexible" are defined as follows:
• Situated: the agent can act on its environment from the sensory inputs it receives from that environment. Examples: process control systems, embedded systems, etc.
• Autonomous: the agent is able to act without the intervention of a third party (human or agent) and controls its own actions and its internal state.
• Flexible: the agent in this case is:
  - able to respond in time: the agent must be able to perceive its environment and develop a response within the required time;
  - proactive: the agent must produce a proactive and opportunistic behavior, while being able to take the initiative at the "right" time;
  - social: agents should be able to interact with other agents (software and humans) when the situation requires it, to complete their tasks or to help those agents with theirs.
This definition is consistent with the vision with which we see the agent in the context of our work. A multi-agent system [3] is a distributed system consisting of a set of agents. In contrast to AI systems, which simulate to some extent the capabilities of human reasoning, MAS are ideally designed and implemented as a set of interacting agents, most often cooperating, competing or coexisting. A MAS is usually characterized by: a) each agent has limited information and problem solving abilities, so each agent has a partial point of view; b) there is no overall control of the multi-agent system; c) the data are decentralized; d) the computation is asynchronous. Autonomous agents and multi-agent systems represent a good approach for the analysis, design and implementation of complex computer systems. The vision based on the agent entity provides a powerful repertoire of tools, techniques, and metaphors that have the potential to significantly improve software systems. [8] highlights multi-agent systems as a preferred solution for analyzing, designing and building complex software systems.
3.1 Organization of MAS A key to the design and implementation of multi-agent systems of significant size is to take a social perspective in order to constrain the behavior of agents. In general, the organization [10] is a model that allows agents to coordinate their actions during the resolution of one or more tasks. It defines, on the one hand, a structure (e.g., a hierarchy) with a set of roles that should be awarded to agents and a set of communication paths between these roles. On the other hand, it defines a control system (e.g., master/slave) that dictates the social behavior of agents. Finally, it defines coordination processes that determine the decomposition of tasks into subtasks, the allocation of subtasks to agents, and the consistent realization of dependent tasks. Examples of such organizations are AGR [4] and MOISE+ [7], which are among the first attempts at methodologies for analyzing multi-agent systems focused on the concept of organization.
4 Mutual Contributions: Components, Agents and MAS We can consider the mutual contributions between the two concepts, component and agent. For example, agents can be used, through negotiation techniques, to assist the assembly of components: an extension of the Ugtaze model in the works of [9] defines adjustment points on component interfaces as a way to obtain variability of components, where agents intervene mainly to browse and reuse existing cases and to negotiate the adaptation of an original component, and agents can select the adequate adjustment points. On the other hand, components can help in the building, integration and deployment of multi-agent systems, as well as in structuring the agent itself, as in several works such as the VOLCANO architecture [12], where the decomposition of the agent architecture is based on aspects (environment, organization, ...) and associated treatments (communication, coordination, ...); the MAST architecture [14], which is an extension of the VOLCANO architecture with a more detailed decomposition; the MALEVA component model [2], whose goal is to enable a modular design of agent behavior as an assembly of elementary behaviors; the MADCAR model [5]; and finally [6], which proposes a component-based agent architecture for home automation systems. Note that each of these component-based agent architectures relies on certain decomposition criteria (decomposition by facets and associated treatments, decomposition by behavior, ...). Although the authors of the majority of these studies emphasize the aspects of genericity and flexibility, we believe that these models are appropriate only for certain types of agents and that it is difficult to replace or add components.
5 An Approach Focused on the Organization for the Construction of Agents by Component Assembly What primarily interests us in the component approach is the ability to reuse. In this approach the organization is taken as an explicit framework for the analysis and design of MAS: we obtain the needed behaviors from agents through the structure of the social space, in order to facilitate cooperation and interaction between its members. The organization is primarily a matter of supporting group activity, facilitating the collective action of agents in their areas of action. On the other hand, we propose the construction of agents from software components in which the basic know-how is implemented. In this model we propose a structure of the MAS in terms of roles grouped into alliances; agents agree to play roles in the organization, and for that the agents should have certain skills implemented as software components. To acquire the skills necessary for engagement in a role, the agent must assemble the components that implement these skills.
5.1 Presentation of the Approach This model is a way to describe a structural dimension in the organization of multi-agent systems according to which the agent assembles software components. 5.1.1 Role As in MOISE + [7] a role ρ is the abstract representation of a function. An agent may have several roles; one role can be played by several agents. The role is the expected behavior of the agent in the organization.
5.1.2 Alliance We can define an alliance of agents as a group of agents. Each agent can be a member of an alliance. The term alliance is used to describe a community of agents who play the same role; the roles are played by agents within an alliance. Formally, we can describe an alliance A as A = (R, np, cr, cmp), where:
R: a role;
np: a pair (min, max) in N x N giving the number of times the role must be played in the alliance;
cr: the role compatibilities within the alliance; ρa ≡ ρb means that an agent which is playing the role ρa can also play ρb;
cmp: the set of skills (competencies) that an agent should have in order to commit to play the role of the alliance.
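As an illustration only (independent of any particular component model), an alliance specification A = (R, np, cr, cmp) could be represented by a small data structure such as the following; the field names and the robot-soccer values are assumptions for the example, in the spirit of the specification discussed later in this section.

from dataclasses import dataclass, field

@dataclass
class Alliance:
    role: str                                        # R: the role played in this alliance
    np: tuple                                        # (min, max) number of agents for the role
    compatible_roles: set = field(default_factory=set)       # cr: roles compatible with R
    required_competencies: set = field(default_factory=set)  # cmp: skills needed to commit

# Example: a "back" alliance requiring exactly three agents.
back = Alliance(role='back', np=(3, 3),
                compatible_roles={'leader'},
                required_competencies={'play the offside line', 'marking a striker'})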
5.2 Roles and Components Components contain the necessary know-how (trade) for performing an action, when an agent engages in a role it must have Competencies acquired by assembling components. Implemented competencies Cmp1 :……. ; Cmp3 :……. ; Alliance Rôle
Fig. 1 General schema
To play a role ρ1, an agent must have certain competencies {Cmp1, Cmp2, ...}, i.e., it must contain in its composition a set of components C = {c1, c2, c3, ...}, or a combination of components in case of compatibility of a component Cn with an assembly Ci + Cj + .... All components of C implement the business code corresponding to the competencies {Cmp1, Cmp2, ...}. For example, in robot soccer matches the specification of a defense alliance, defined by the designer, is specified with three alliances:

Def = ( {ρgoalkeeper, ρgoalkeeper -> (1,1), ρLeader ≡ ρgoalkeeper} ; {ρback, ρback -> (3,3), ρLeader ≡ ρback} ; {ρLeader, ρLeader -> (0,1), ρLeader ≡ ρback, ρLeader ≡ ρgoalkeeper} ).

This specification indicates that within a defense the roles are: goalkeeper, back and leader. The leader and back roles are compatible, which means that an agent which plays the role back can also act as leader; the roles leader and goalkeeper are also compatible. In the goalkeeper alliance a single agent can act as goalkeeper, three agents must commit to play the role of back (min = 3, max = 3) in the back alliance, and one agent may play the role of leader, which is compatible with the goalkeeper and back roles. We specify for each role the required competencies:

ρback → (Cmp1: play the offside line, Cmp2: marking a striker, ..., CmpN).

This means that an agent acting as back must have the skills (competencies): play the offside line, marking a striker, .... For each component of the system we specify the implemented competencies, for example: [C1 → Cmp1, Cmp4], [C2 → Cmp1, Cmp6], [C3 → Cmp3], ... Thereby the specification of a defense becomes:

Def = ( {ρgoalkeeper, ρgoalkeeper -> (1,1), ρLeader ≡ ρgoalkeeper, { }} ; {ρback, ρback -> (3,3), ρLeader ≡ ρback, {ρback → (Cmp1: play the offside line, Cmp2: marking a striker, ..., CmpN)}} ; {ρLeader, ρLeader -> (0,1), ρLeader ≡ ρback, ρLeader ≡ ρgoalkeeper} ).
6 Description of Competencies Competencies represent know-how described by a name and a set of functions which implement operations (business code):

Competency: Name_Comp {
  Static data: ...
  Functions:
    Function1 (input parameters) : returned type
    Function2 (input parameters) : returned type
    ...
}

For example we have this competency:

Competency: calculate_area {
  Static data: ...
  Functions:
    Calcul_area_circle (R: real) : real
    Calcul_area_rectangle (R, L: real) : real
    ...
}
7 Select the Components to Assemble The process below summarizes the component selection performed by the agent depending on its roles in the organization.
1. Decision to commit to a role.
2. Identify the required competencies.
3. From the required competencies, identify those already acquired (components already assembled) and those still to be acquired.
4. For the competencies to acquire, identify the components of the components library in which the required competencies are implemented; this yields a set of candidate components.
5. Select components: define the minimum set of components by eliminating the components all of whose competencies are implemented by other components (minimum cover). The agent can keep a history of the minimal sets; the history can be used if it is recent, otherwise this step and the previous one are repeated so as not to lose the benefit of new components if they exist.
6. Assemble the components. The agent may also detach the components that it no longer needs.
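The "minimum cover" selection can be approximated by a greedy set-cover heuristic, sketched below under the simplifying assumption that each component is described only by the set of competencies it implements; the library shown reuses the [C1 → Cmp1, Cmp4], [C2 → Cmp1, Cmp6], [C3 → Cmp3] example from Section 5.2.

def select_components(required, library):
    # Greedy cover: library maps component name -> set of implemented competencies.
    missing = set(required)
    chosen = []
    while missing:
        # Pick the component covering the most still-missing competencies.
        best = max(library, key=lambda c: len(library[c] & missing), default=None)
        if best is None or not (library[best] & missing):
            break                      # some competencies cannot be covered
        chosen.append(best)
        missing -= library[best]
    return chosen, missing

library = {'C1': {'Cmp1', 'Cmp4'}, 'C2': {'Cmp1', 'Cmp6'}, 'C3': {'Cmp3'}}
chosen, uncovered = select_components({'Cmp1', 'Cmp3'}, library)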
8 Discussions Taking a social point of view for specifying multi-agent systems has advantages: the organization is handled as an explicitly manipulated entity and as a means for overcoming complexity. In Fig. 2 the set G represents the behaviors whose implementation corresponds to the satisfaction of the overall objective of the MAS. The set E represents all the possible behaviors of agents in the current environment. A specification of an organization, formed for example of roles and alliances, constrains the agents to implement the behaviors of the set S. Thus the organization provides a restriction on the set E, which implies the elimination of a set of behaviors that do not serve the satisfaction of the overall objective of the MAS.
Fig. 2 Effect of the organization
A problem that seems a real challenge is to ensure, or at least to estimate, that an agent who will be asked to play a role will effectively be able to accomplish it. On this point, we can at least ensure that the agent which undertakes to play a role has the required competencies.
In this paper we abstract away from the assembly of components itself; the emphasis is on the organization. We can use, for example, the assembly engine proposed by [5] at first, while planning to develop our own assembly engine. Adding new components to the system does not disturb the agents' behaviors. The implementation of roles as components adds, to the benefits of the component approach, the advantage of possible reorganization and adaptability if the model is enriched with reorganization mechanisms: an agent that changes position in the organization can do so by detaching and assembling components depending on its new roles in the organization; for example, again in robot soccer, an agent that plays defender can become an attacker. The agents' ability to disassemble (detach) components can also be very useful in the case where agents are mobile.
9 Conclusion In this paper we have tried to combine the two approaches, components and agents, to build agents while taking the organization as a framework for analysis. This work can be considered a preliminary framework around which to build a more robust and more complete one by introducing the dynamic and functional aspects that make the MAS a truly autonomous system that does not require the intervention of external actors; in other words, by enriching the model with the knowledge needed to identify new components and to detect the coherence between them, by building a component assembly engine, and finally by adding to the structural specification a meta-structural specification, that is to say, roles whose role is to identify new roles.
Bibliographies [1] Brazier, F.M., Jonker, T., Trenr, C. M.: Principles of component based design of intelligent agent. Data Knowl. Eng. 1, 1–27 (2002) [2] Briot, J.P., Meurisse, T., Peschanski, F.: Une expérience de conception et de composition de comportements d’agents à l’aide de composants. L’Objet, Revue des Sciences et Technologies de l’Information 12(4), 11–41 (2006) [3] Chaib-draa, b., Jarras, I., Moulin, B.: Département d’Informatique, Pavillon Pouliot, Article à paraître dans. In: Briot, J.P., Demazeau, Y. (eds.) Agent et systèmes multiagents, chez Hermès en (2001) [4] Ferber, J., Gutknecht, O.: A Meta-Model for the Analysis and Design of Organizations in Multi-Agents Systems (1998), http://www.madkit.org [5] Grondin, N., Bouraqadi, L., Vercouter: Assemblage Automatique de Composants pour la Construction d’Agents avec MADCAR. In: Journées Multi-Agents et Composants, JMAC 2006, Nimes, France, March 21 (2006) [6] Hamoui, F., Huchard, M., Urtado, C., Vauttier, S.: Un système d’agents à base de composants pour les environnements domotiques. In: Actes de la 16ème conférence francophone sur les Langages et Modèles à Objets, Pau, France, pp. 35–49 (March 2010) [7] Hübner, J.F., Sichman, J.S., Boissier, O.: Spécification structurelle, fonctionnelle et déontique d’organisations dans les SMA 2002 (jfiadsma 2002)
[8] Jennings, N.R.: On agent-based software engineering. Artificial Intelligence Journal (2000) [9] Lacouture, J., Aniorté, P.: vers l’adaptation dynamique de services: des composants monitorés par des agents. In: Journées Multi-Agents et Composants, JMAC 2006, Nimes, France, March 21 (2006) [10] Malville, E.: L’auto-organisation de groupes pour l’allocation de tâches dans les Systèmes Multi-Agents: Application à CORBA. Thèse, Université SAVOIE France (1999) [11] Occello, M., Baeijs, C., Demazeau, Y., Koning, J.-L.: MASK: An AEIO Toolbox to Develop Multi-Agent Systems. In: Knowledge Engineering and Agent Technology, Amsterdam, The Netherlands. IOS Series on Frontiers in AI and Applications (2002) [12] Ricordel, P.-M., Demazeau, Y.: La plate-forme VOLCANO: modularité et réutilisabilité pour les systèmes multi-agents. Numéro spécial sur les plates-formes de développement SMA. Revue Technique et Science Informatiques, TSI (2002) [13] Sébastien, L.: Architectures à composants et agents pour la conception d’applications réparties adaptables. Thèse2006 Université Toulouse III France [14] Vercouter, L.: MAST: Un modèle de composants pour la conception de SMA. In: Journées Multi-Agents et Composants, JMAC 2004, Paris, France, November 23-23 (2004)
Determining an Efficient Parts Layout for Assembly Cell Production by Using GA and Virtual Factory System Hidehiko Yamamoto and Takayoshi Yamada
Abstract. This paper describes a system that can determine an efficient parts layout for assembly cell production before setting up a real cell production line in a factory. This system is called the Virtual Assembly Cell production System (VACS). VACS consists of two modules, a genetic algorithm (GA) for determining the parts layout and a virtual production system. The GA system utilizes a unique crossover method called Twice Transformation Crossover. VACS is applied to a cell production line for assembling a personal computer. An efficient parts layout is generated, which demonstrates the usefulness of VACS. Keywords: Production System, Virtual Factory, Assembly Line, Cell Production, GA.
1 Introduction Due to the increasing variety of users’ needs, flexible manufacturing system (FMS) have been introduced in recent years. In particular, cell production [1, 2] is useful for the assembly of modern IT products, thus several companies have adopted cell production line techniques. However, during cell production development or implementation, trial and error is required to determine an efficient parts layout. Namely, the operator places the parts on the shelves and gradually adjusts or changes the layout position such that the overall arrangement is comfortable for him. However, it can take a long period of time to make these adjustments. Trial and error is required to decide the parts layout in a cell production line because of several uncertain factors involved. Because a cell production line assembles several product varieties, each product is required to be assembled from different parts and at different production rates. Therefore, the most efficient parts layout cannot be determined beforehand, and a systematic method for finding an efficient parts layout does not yet exist. Hidehiko Yamamoto · Takayoshi Yamada Department of Human and Information System, Gifu University, Japan e-mail:
[email protected],
[email protected]
To solve this problem, our research deals with an assembly cell production line and tries to develop a system that consists of a genetic algorithm (GA) optimizer and a virtual factory system [3] that production engineers can use to plan a cell production line. The GA is used to determine efficient two-dimensional parts layout locations. In particular, the GA utilizes the characteristic crossover method. The system is applied to the development of a cell production line for assembling personal computers, which demonstrates that the proposed system is useful.
2 Assembly Using a Cell Production Line The study deals with an assembly cell production line in which an operator assembles a variety of products individually. The operator takes one part at a time from the shelves, moves to the assembly table, and performs the assembly tasks. The characteristics of an assembly cell production line include the following: {1} Assembly parts (hereafter referred to as “parts”) are located in shelves around an operator and he or she assembles them in front of the assembly table. {2} Some components of the products use the same parts (i.e., common parts) and some components use different parts. {3} The product ratio for each product is different. {4} Depending on the size of each part, an operator will be able to carry either one part at a time or two parts at a time. {5} One by one production is used [4].
3 The VACS System
3.1 Outline of VACS
This paper proposes a system to solve the problems mentioned above. The system is called the Virtual Assembly Cell production System (VACS). VACS is able to solve the problem of determining efficient parts layout locations before an assembly cell production line starts operating. VACS is comprised of two modules, the Virtual factory module (V-module) and the Parts layout GA module (GA-module), as shown in Fig. 1. Using these two modules, VACS can perform the following four functions.
(1) VACS creates a tentative assembly cell production line, the tentative line, in the V-module.
(2) VACS calculates two dimensional data for the shelf location names and distances from the tentative line, and sends them to the GA-module.
(3) The GA-module calculates the distances from the two dimensional data and, using the distance data and the parts names, finds the coordinates of the most efficient parts layout locations.
(4) VACS transfers the found coordinates to the V-module, which generates an updated three dimensional virtual cell production line, the final line. By operating the final line, visual final judgments can be confirmed.

Fig. 1 VACS (the Parts Layout GA Module and the Virtual Factory Module)
3.2 Two Dimensional Data
In function (1), a tentative line is generated by designing a virtual assembly cell production line using three dimensional computer graphics (CG). In particular, engineers design the assembly table and shelves and set them in a three dimensional virtual factory space. From the generated tentative line, VACS extracts the shelf names and the location coordinates of the shelves. S(n) are the shelf names for n number of shelves, and Cn = (xn, yn) represents the coordinates of the n shelves. By using the coordinates Cn and the coordinates of the operator's working standing position (a, b), the distances from the operator's working standing position to each shelf are calculated as Ln. The distances are expressed with a set of distances L as shown in equation (1).

L = {L1, L2, ..., Ln}
  = {√((a − x1)² + (b − y1)²), √((a − x2)² + (b − y2)²), ..., √((a − xn)² + (b − yn)²)}    (1)
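As a short illustration of equation (1), the following sketch computes each Ln as the Euclidean distance from the operator's standing position (a, b) to shelf n; the coordinates used are placeholders.

import math

def distance_set(shelf_coords, operator_pos):
    # shelf_coords: list of (xn, yn); returns [L1, L2, ..., Ln].
    a, b = operator_pos
    return [math.hypot(a - x, b - y) for (x, y) in shelf_coords]

L = distance_set([(0.0, 1.2), (0.8, 1.2), (1.6, 0.4)], operator_pos=(0.8, 0.0))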
VACS transfers the two dimensional data from the tentative line as mentioned above in function (2). The data are a set Q whose components are the shelf names S(n) and the set of distances L. The data are described by equation (2).

Q = {(S(1), L1), (S(2), L2), ..., (S(n), Ln)}    (2)
3.3 Determining the Parts Layout Locations The GA-module determines the two dimensional part layout locations in the shelves. The first job of the GA-module is to create the initial “individuals” using both the shelf names S(n) and the parts names. Each “individual” is a set of components combined with the shelf names corresponding to part layout locations and the part names in each location. Namely, each component of the set is expressed as the combination of the shelf names S(n) of the tentative line and n part names M(j) located in the shelves. The aggregate of the components corresponding to individual I is generated by equation (3). I = {(S(n), M(j)), |n is a sequential integer from 1 to n, j is an integer without repetition from 1 to n} (3)
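As a small illustration of equation (3), an initial individual can be generated by pairing the shelf names with a random permutation of the part names, which guarantees that no part name is repeated; the shelf and part names below are placeholders.

import random

def random_individual(shelf_names, part_names):
    # Pair each shelf with a distinct, randomly chosen part (equation (3)).
    parts = part_names[:]
    random.shuffle(parts)
    return list(zip(shelf_names, parts))

individual = random_individual(['A', 'B', 'C', 'D'],
                               ['CPU', 'memory', 'fan', 'case'])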
Fig. 2 Parent individuals: I(1) = 9 1 4 6 5 7 3 8 2 and I(2) = 5 3 8 7 1 9 6 2 4, with the crossover point after the fourth gene

Fig. 3 Children individuals: I(1') = 9 1 4 6 1 9 6 2 4 and I(2') = 5 3 8 7 5 7 3 8 2
The GA-module generates several individuals corresponding to equation (3). After the completion of the initial generation, the GA operates in a cycle [5-8] that includes calculating fitness, crossover, and mutation. The parts layout locations are determined as the final output of the GA-module. When conventional GA operations are adopted to solve this cell production problem in VACS, a problem occurs. A significant number of lethal genes will be generated if the conventional crossover method [5, 9-11] is used. For example, consider the two parent individuals I(1) and I(2) shown in Fig. 2. If the crossover point is chosen as shown in Fig. 2 and a crossover operation is performed, the next generation individuals, I(1') and I(2'), will be generated as shown in Fig. 3. Because of the crossover operation, the component nine appears twice in the new individual, i.e., an individual that includes a repeated integer component is generated. As equation (3) prohibits repeated components within an individual, I(1') becomes a lethal gene. Thus, when VACS adopts the conventional crossover method, several lethal genes will be generated. To solve this problem, the following crossover method (called Twice Transfer Crossover (TTC)) is proposed to perform conversions twice. Using the following procedures, TTC performs the conversion processes twice so that the two object individuals can undergo crossover.
Procedure {1}: Create the standard arrangement (SA).
Procedure {2}: Convert the two individuals, I1 and I2, by using the SA and express the converted two individuals as two sets that are their replacements. The new sets are T1' and T2'.
Procedure {3}: Perform a crossover between the new sets and create two new sets, T1" and T2".
Procedure {4}: Using the SA, perform a reverse conversion on the sets T1" and T2" to acquire new individual expressions whose components are shelf names and part names. The acquired expressions become the next generation individuals.
The SA of Procedure {1} is expressed with a set whose components are an integer and a part name. Equation (4) shows the set whose element is the sequential number (i.e., "Order") corresponding to the number of the shelf and a part name randomly located on the shelf. The Order is the integer sequence from 1 to n, placed from left to right.

SA = { (Order, M(j')) | j' is an integer randomly selected without repetition from among 1~n }    (4)

In Procedure {2}, the conversion executes the following operations using the individuals and the SA. The initial value of both k and x used in the operations is 1.
Step 2-1: Find the part name M(j) that is the element of locus number x of the individual I1, find the M(j') whose part name is the same as M(j) from among the SA, and find its sequence number, Order(k), in the SA.
Step 2-2: Set Order(k) as the k-th element of the set T1'.
Step 2-3: Renew the SA as follows. Delete M(j') and its Order(k) from among the elements of the SA and move down the Order of each element whose location is behind the location of the deleted element by 1.
Step 2-4: For all the elements that are behind the first location of the individual I1, create the new set, T1', by repeating k←k+1, x←x+1 and performing Step 2-1 through Step 2-3.

T1' = { Order(k) | k is an integer from 1~n }    (5)
As for the individual I2, the same operations are performed and the new converted set T2' is generated. The reverse conversion of Procedure {4} is performed by the following operations. In the operations, the set after the crossover from Procedure {3} is expressed as T1" = { Order(k') | k' is an integer without repetition from among 1~n } and the initial value of y is 1.
Step 4-1: Find the element, Order(k'), of the sequential number y from among the set T1".
Step 4-2: Find the sequential number k' from among the SA and create the new set newI1 whose k'-th element consists of the shelf name S(y) and the part name M(k'). The equation for newI1 is expressed as follows.

newI1 = { (S(y), M(k')) | k' is an integer randomly selected without repetition from among 1~n }    (6)

Step 4-3: Renew the SA as follows. First, delete the element of M(k') and Order(k') from among the SA. Second, move down the Order of each element whose location is behind the location of the deleted element by 1.
Step 4-4: For all the elements that are behind the first location of the set T1", generate the new set, newI1, by repeating y←y+1 and performing Step 4-1 through Step 4-3.
The fitness in the GA uses the principle that the shorter the operator's moving distance, the better the fitness. It is calculated by using the distance Ln, which is one of the elements in equation (2), corresponding to the distance between the operator's working standing position and the shelf location.
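The twice-conversion idea of TTC can be sketched as follows, working directly on the part-name sequences (the pairing with shelf names then follows equation (6)). Each parent is first rewritten as a sequence of positions into a shrinking standard arrangement (Procedure {2}), an ordinary one-point crossover is applied to these converted sequences (Procedure {3}), and the reverse conversion (Procedure {4}) always yields repetition-free children. This is a simplified rendering of the procedures above, not the authors' implementation; indices here are 0-based while the paper's Order starts at 1.

import random

def to_ordinal(perm, standard):
    # Procedure {2}: rewrite a permutation as indices into a shrinking SA.
    sa = standard[:]
    ordinal = []
    for item in perm:
        k = sa.index(item)
        ordinal.append(k)
        del sa[k]
    return ordinal

def from_ordinal(ordinal, standard):
    # Procedure {4}: reverse conversion back to a repetition-free permutation.
    sa = standard[:]
    return [sa.pop(k) for k in ordinal]

def ttc_crossover(parent1, parent2, standard):
    # Procedure {3}: one-point crossover in the converted (ordinal) space.
    o1, o2 = to_ordinal(parent1, standard), to_ordinal(parent2, standard)
    cut = random.randint(1, len(standard) - 1)
    child1 = from_ordinal(o1[:cut] + o2[cut:], standard)
    child2 = from_ordinal(o2[:cut] + o1[cut:], standard)
    return child1, child2

# Example with integer "part names" as in Figs. 2 and 3.
sa = [1, 2, 3, 4, 5, 6, 7, 8, 9]
c1, c2 = ttc_crossover([9, 1, 4, 6, 5, 7, 3, 8, 2],
                       [5, 3, 8, 7, 1, 9, 6, 2, 4], sa)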
4 Application Simulations We applied VACS to the cell assembly production of a personal computer. The V-module utilized GP4 from Lexer Reseach Inc. The computer has 12 parts as shown in Table 1 and the number of product variants is 10. Table 1 shows the part numbers that 5 (P(1)~P(5)) of the 10 variant products require for assembly, and the order number for each product. The locations of the parts are on the 18 shelves behind the assemble table, as shown in Fig.4. The shelves that surround the operator are named with letters A through R. In the GA operations, 100 individuals per population of a generation are calculated. In the GA, the crossover operation utilizes roulette selection with five individuals as an elite preservation group and 5% as the mutation probability. As discussed above, the fitness calculates the distance an operator has to move, where the smaller the distance, the better the fitness. In the simulation, first the shelves and the assembly table were designed with AutoCAD and were placed in the V-module. On the basis of the layout of the shelves and the assembly table, the coordinates of the shelves and the assembly table were automatically acquired, sent to the GA-module, and GA operations were started. Fig.5 shows one example of the fitness curves. After the two hundredth generation, the fitness values became constant. The individual present in this constant situation corresponds to the parts layout locations that are most efficient. Table 2 shows the resulting parts layout locations. As shown in the table, the parts whose frequency of use is high are densely located on shelves A, B, and M through R that are close to the assembly table. These locations are judged to be efficient choices. Sending the parts layout locations acquired in the GA-module back to the V-module, the production line shown in Fig. 6 was generated. Thereby, after determining the most efficient production line in the V-module, which is a virtual assembly production line, we can confirm the appropriateness of the configuration by a visual evaluation.
Table 1 Personal computer parts

parts            P(1)   P(2)   P(3)   P(4)   P(5)
case              1      1      1      1      1
power source      1      1      1      1      1
mother board      1      1      1      1      1
CPU               1      1      1      1      1
memory            2      4      2      4      1
TV tuner          0      0      0      1      0
fan               1      1      1      0      0
Order number     20     17     12      9      8

Table 2 Acquired locations

Locations   Parts           Parts number
A           motor           80
B           case            80
C           case fan        35
D           card reader     49
E           sound card      46
F           TV tuner        13
G           other card      31
H           capture board   21
I           LAN card        35
J           other options   8
K           FD drive        18
L           CPU fan         15
M           CPU             82
N           memory          169
O           hard disk       126
P           mother board    80
Q           video card      76
R           CD/DVD          82
Fig. 4 PC assembly cell production (the assembly table with shelves A to R arranged around the operator)

Fig. 5 Fitness curves (best and average fitness versus generations)
Fig. 6 Example of a final production cell
5 Conclusions This study outlines the development of VACS, a system for determining the most efficient two dimensional locations for parts when planning an assembly cell production line, i.e., before constructing the real production line. VACS incorporates a GA module, which determines the parts layout locations, and a three dimensional virtual factory module. The GA module for determining the parts layout locations adopts a unique crossover procedure that performs the coding of each individual twice. By applying VACS to the development of a cell production line for a personal computer, an efficient parts layout was determined without physically building the cell production line. The usefulness of VACS was thereby demonstrated.
References [1] Zhang, J., Chan, F.T.S., et al.: Investigation of the reconfigurable control system for an agile manufacturing cell. International Journal of Production Research 40(15), 3709–3723 (2002) [2] Solimanpur, M., Vrat, P., Shankar, R.: A multi-objective genetic algorithm approach to the design of cellular manufacturing systems. International Journal of Production Research 42(7), 1419–1441 (2004) [3] Inukai, T., et al.: Simulation Environment Synchronizing Real Equipment for Manufacturing Cell. Journal of Advanced Mechanical Design, Systems, and Manufacturing, The Japan Society of Mechanical Engineers 1(2), 238–249 (2007) [4] Yamamoto, H.: One-by-One Parts Input Method by Off-Line Production Simulator with GA. European Journal of Automation, Hermes Science Publications, Artiba, A. (ed.), 1173–1186 (2000)
[5] Yamamoto, H., Marui, E.: Off-line Simulator to Decide One-by-one Parts Input Sequence of FTL—Method of Keep Production Ratio by Using Recurring Individual Expression. Journal of the Japan Society for Precision Engineering 69(7), 981–986 (2003) [6] Goldberg, D.E.: Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Reading (1989) [7] Yamamoto, H.: One-by-one Production Planning by Knowledge Revised-Type Simulator with GA. Transactions of the Japan Society of Mechanical Engineers, Series C 63(609), 1803–1810 (1997) [8] Ong, S.K., Ding, J., Nee, A.Y.C.: Hybrid GA and SA dynamic set-up planning optimization. International Journal of Production Research 40(18), 4697–4719 (2002) [9] Yamamoto, H., Qudeiri, J.A., Yamada, T., Ramli, R.: Production Layout Design System by GA with One by One Encoding Method. The Journal of Artificial Life and Robotics 13(1), 234–237 (2008) [10] Qiao, L., Wang, X.-Y., Wang, S.-C.: A GA-based approach to machining operation sequencing for prismatic parts. International Journal of Production Research 38(14), 3283–3303 (2000) [11] Fanjoy, D.W., Crossley, W.A.: Topology Design of Planar Cross-Sections with a Genetic Algorithm: Part 1—Overcoming the Obstacles. International Journal of Engineering Optimization 34(1), 1–22 (2002)
Development of a Multi-issue Negotiation System for E-Commerce Bala M. Balachandran, R. Gobbin, and Dharmendra Sharma*
Abstract. Agent-mediated e-commerce is rapidly emerging as a new paradigm for developing intelligent business systems. Such systems are built upon the foundations of agent technology with a strong emphasis on automated negotiation capabilities. In this paper, we address negotiation problems where agreements must resolve several different attributes. We propose a one-to-many multi-issue negotiation model based on Pareto optimality. The proposed model is capable of processing agents' preferences and arriving at an optimal solution from a set of alternatives by ranking them according to the scores they achieve. We present our prototype system architecture, together with a discussion of the underlying negotiation framework. We then describe our implementation efforts using the JADE and Eclipse platforms. Our concluding remarks and possible further work are presented. Keywords: software agents, e-commerce, Eclipse, JADE, scoring functions, multi-issue negotiation, Pareto-efficient.
1 Introduction Recent advances in Internet and web technologies have promoted the development of intelligent e-commerce systems [8]. Such systems are built upon the foundations of agent technology with a strong emphasis on the agent negotiation capabilities. We define negotiation as a process in which two or more parties with different criteria, constraints and preferences jointly reach an agreement on the terms of a transaction [9]. In most e-commerce situations, what is acceptable to an agent can not be described in terms of a single parameter. For example, a buyer of a PC will consider the price, the warranty, the speed of the processor, the size of the memory, etc. A buyer of a service, like Internet access, will look at the speed and reliability of the connection, the disk space offered, the quality of customer Bala M. Balachandran · R. Gobbin · Dharmendra Sharma Faculty of Information Sciences and Engineering The University of Canberra, ACT, Australia e-mail:{bala.balachandran,renzo.gobbin, dharmendra.sharma}@canberra.edu.au J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 429–438. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
service, the pricing scheme, etc. Agreements in such cases are regions in a multidimensional space that satisfy the sets of constraints of both sides [6]. In this paper, our interest is in the development of a multi-issue based negotiation model for e-commerce. We use three agents in our study: a buyer agent, a seller agent, and a facilitator agent. The seller agent allows a seller to determine his negotiation strategies for selling merchandise. Similarly, the buyer agent allows a buyer to determine his negotiation strategies for buying merchandise. The facilitator agent serves to handle the negotiation strategies for both the buyer and the seller agents. In our approach, agents’ preferences are expressed in fuzzy terms. The application domain for our prototype implementation is buying and selling laptop computers. Our paper is organised as follows. First, we review some related works. We then present our proposed negotiation model, discussing its ability to handle customer preferences based on multiple parameters. We describe the model in terms of the negotiation object, the negotiation protocol and the negotiation strategy. Then we show how the principles of fuzzy logic and scoring functions are used in our model to facilitate multi-issue based negotiation. We then describe details of a prototype system we have developed using JADE [3] within the Eclipse platform [12]. Finally, we present our concluding remarks and discuss possible future work.
2 Related Works Automated bilateral negotiation has been widely studied by the artificial intelligence and microeconomics communities. AI-oriented research has focused on automated negotiation among agents. Merlat [6] discusses the potential of agent-based multiservice negotiation for e-commerce and demonstrates a decentralized constraint satisfaction algorithm (DCSP) as a means of multiservice negotiation. Badica et al. [1] present a rule-based mechanism for agent price negotiation. Lin et al. [5] present an automatic price negotiation using a fuzzy expert system. Sheng [10] presents work that offers customers an online business-to-customer bargaining service. There have also been some efforts in applying automated negotiation to tackle travel planning problems [2][7]. Several approaches to agent-mediated negotiation on electronic marketplaces have been introduced in the literature. For example, Kurbel et al. [4] present a system called FuzzyMAN: an agent-based electronic marketplace with a multilateral negotiation protocol. They argued that there is no universally best approach or technique for automated negotiation; the negotiation strategies and protocols need to be set according to the situation and application domain. Ragone et al. [10] present an approach that uses fuzzy logic for automated multi-issue negotiation. They use logic to model relations among issues and to allow agents to express their preferences on them.
3 A Multi-issue Bilateral Negotiation Model for E-Commerce In this section we present a multi-issue negotiation model for e-commerce in which agents autonomously negotiate multi-issue terms of transactions in an e-commerce environment. We have chosen a laptop computer trading scenario for our prototype implementation. The negotiation model we have chosen for our study is illustrated in Figure 1. In this model, issues within both the buyer's request and the seller's offer can be split into hard constraints and soft constraints. Hard constraints are issues that must be satisfied in the final agreement, whereas soft constraints represent issues the parties are willing to negotiate on. We utilise a facilitator agent which collects information from the bargainers and exploits it in order to propose an efficient negotiation outcome.
Fig. 1 One-to-many negotiation scheme (Buyers 1-3 and Sellers 1-3 interacting through the Facilitator Agent).
The negotiation module consists of three components: the negotiation object, the decision making model and the negotiation protocol. The negotiation object is characterised by a number of attributes for which the agents can negotiate. The decision making model consists of an assessment part, which evaluates an offer received and determines an appropriate action, and an action part, which generates and sends a counter-offer or stops the negotiation. The assessment part is based on the fact that different values of negotiation issues are of different value to the negotiating agents. We model the value of negotiating issues by scoring functions [4]. The bigger the value of a scoring function for a certain value of an issue, the more suitable that value is for the negotiating agent.
3.1 Scoring Functions The scoring functions represent private information about the participants' preferences regarding the negotiation issues. This information is not given to other participants in the negotiation process. A scoring function is defined by four values of a negotiation issue: the minimal, maximal, optimum minimal and optimum maximal values, as illustrated in Figure 2 below.
Fig. 2 Scoring Function for the negotiation issue "number of years warranty" (the value rises from 0 at Min to 1 between Opt Min and Opt Max, and falls back to 0 at Max; the horizontal axis is in years)
We also consider the fact that different negotiation issues are of different importance for each participant. To model this situation, we introduce the weighting factor representing the relative importance that a participant assigns to an issue under negotiation. During negotiation, the value of an offer received is calculated using two vectors: a vector-valued offer received by an agent and a vector of relative importance of issues under negotiation. The value of an offer is the sum of the products of the scoring functions for individual negotiation issues multiplied by their relative importance.
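The prototype's actual code is not given in the paper; the Python sketch below is only one possible reading of the scoring functions of Section 3.1 and of the weighted offer evaluation described above. The function names, issue names, and numeric preference values are illustrative assumptions rather than values taken from the system.

```python
def score(value, v_min, opt_min, opt_max, v_max):
    """Trapezoidal scoring function defined by the four points
    (minimal, optimum minimal, optimum maximal, maximal) of Section 3.1."""
    if value <= v_min or value >= v_max:
        return 0.0
    if opt_min <= value <= opt_max:
        return 1.0
    if value < opt_min:                                # rising edge
        return (value - v_min) / (opt_min - v_min)
    return (v_max - value) / (v_max - opt_max)         # falling edge

def offer_value(offer, issues):
    """Value of an offer: sum over issues of weight * score, where `issues`
    maps an issue name to (weight, min, opt_min, opt_max, max)."""
    return sum(w * score(offer[name], lo, olo, ohi, hi)
               for name, (w, lo, olo, ohi, hi) in issues.items())

# Illustrative buyer preferences for two issues (weights sum to 1).
buyer_issues = {
    "price":    (0.6, 0, 0, 900, 1500),   # anything up to 900 is ideal
    "warranty": (0.4, 0, 2, 5, 6),        # 2 to 5 years is ideal
}
print(offer_value({"price": 1100, "warranty": 3}, buyer_issues))  # roughly 0.8
```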
3.2 The Negotiation Protocol The negotiation facilitator receives the customer agent's request and registers the customer. Once this is done, the negotiation process can begin with the suppliers. The negotiation facilitator requests the suppliers to provide offers conforming to the restrictions imposed by the customer agent. Please note that each restriction has an importance rating (0% to 100%), which means there is some leniency in the restrictions
imposed by the customer. For example if the customer wants the colour Red, but provides an importance rating of 50%, it is quite lenient and the negotiation facilitator will request suppliers to make offers for a range of different colours. The negotiation facilitator and suppliers go through several rounds of negotiation until they reach the maximum number of rounds. Then the best offer (optimal set) is sent back to the customer agent. The customer agent then displays the results of the negotiation process to the end user who is ultimately responsible for making the decision on which item to buy.
3.3 The Negotiation Strategy The facilitator's strategy is to gather a set of offers from the listed suppliers which satisfy the customer's wishes. Each offer is compared with the last offer by using a Pareto optimality algorithm. The facilitator has a utility algorithm which indicates how good a particular offer is, and the facilitator may modify the customer's preferences (those which have an importance rating of less than 100%) in order to find other offers which may satisfy the user's needs. Once the set of optimal results is obtained, it is sent back to the customer agent. The seller's strategy is one which aims to maximise their profits on the goods sold to customers. They would also like to sell the goods as fast as possible, but at the highest possible price. The supplier does not want any old stock which cannot be sold.
3.4 The Negotiation Process The negotiation process begins with registered buyers and sellers and a single facilitator. The seller sends a list of all items for sale to the facilitator. These items are registered for sale and are available for all the buyers to bargain on and purchase. The buyer then registers with the facilitator and sends all their preferences. Once the preferences have been received by the facilitator, the negotiation process can begin between the facilitator and the supplier (a minimal sketch of steps 1 and 2 follows the list):
1. The facilitator runs the Pareto optimality algorithm to remove any suboptimal solutions.
2. The facilitator runs the utility function to get the item with the highest utility.
3. The item with the highest utility is selected as the negotiation item and set as the base item with all its properties (price, hard drive space, etc.).
4. The item's properties are changed so that the property with the highest importance factor is improved. If the importance factor of price was highest, the price would be reduced by 10%; if the importance factor of the hard drive space or any other property was highest, that property would be increased by 10%.
5. This counter offer is sent to the supplier to see if they agree with the properties.
6. The counter offer is received by the supplier, who has a negotiable threshold amount (set to 10% by default) by which they are willing to negotiate on the item's properties:
   a. If the negotiable threshold is not crossed, the counter offer is agreed to and sent back to the facilitator.
   b. If the negotiable threshold has been crossed, then check by how much; this difference is added to the price. If the threshold is crossed by 5%, the price is increased by 5% and the offer is sent back to the facilitator.
7. When the facilitator receives this offer, it calculates the utility of the offer and, if it is greater, the offer becomes the new base item. The next round of bargaining begins (back to step 4).
8. The bargaining process runs for a fixed number of rounds, 4 by default.
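As a rough illustration of steps 1 and 2 above, the sketch below removes Pareto-dominated offers and then picks the survivor with the highest weighted utility. It reuses the hypothetical score and offer_value helpers from the scoring-function sketch in Section 3.1; the paper does not publish its Pareto or utility algorithms in code form, so this is an assumed reading of them rather than the actual implementation.

```python
def dominates(a, b, issues):
    """True if offer `a` scores at least as well as `b` on every issue and
    strictly better on at least one (Pareto dominance on scored issues)."""
    scores_a = {n: score(a[n], *p[1:]) for n, p in issues.items()}
    scores_b = {n: score(b[n], *p[1:]) for n, p in issues.items()}
    return (all(scores_a[n] >= scores_b[n] for n in issues)
            and any(scores_a[n] > scores_b[n] for n in issues))

def pareto_front(offers, issues):
    """Step 1: drop any offer that is dominated by another offer."""
    return [o for o in offers
            if not any(dominates(other, o, issues) for other in offers)]

def best_offer(offers, issues):
    """Steps 1-2: Pareto-filter the offers, then return the highest-utility one."""
    return max(pareto_front(offers, issues), key=lambda o: offer_value(o, issues))
```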
4 System Development and Evaluation In this section, we present our implementation efforts toward automating multi-issue negotiations. As described before, there are three different agents, namely the buyer, the seller and the facilitator. Although there can be more than one instance of the buyer and the seller, there can only be one instance of the facilitator running at any one time. This is a limitation imposed on the system to reduce the complexity of the application. The main focus in this implementation has been the negotiation component, which implements the multi-issue bargaining model described in the previous sections. The architecture of the system consists of three modules: a communication module, a module for interaction with the agents' owners, and a negotiation module. The communication module manages the exchange of messages between agents. The interaction module is responsible for communication between an agent and its owner. A user can manage his or her agent through a graphical user interface, and an agent can communicate with its owner in certain situations through e-mail. For example, to initialise a buyer's agent, the user has to specify attributes such as price, memory size, hard disk, warranty, etc., as shown in Figure 3. The user can also specify the tactic the agent is supposed to employ in the negotiation process, as well as the scoring functions and weights for the negotiation issues.
Fig. 3 One sample screen shot used by a buyer
4.1 The Buyer Agent The buyer agent is designed to get the preferences from the user, register with the facilitator and then receive the results of the negotiation process. From the point the user clicks on search, there is no interaction between this agent and the end user, until the negotiation results are returned. The end user specifies their preference values on various negotiation issues and their relative importance factors. This information is used by the facilitator during the bargaining process.
4.2 The Negotiation Object The negotiation object in this case is the item which is being negotiated upon. This item, A, has several properties and each property has a name and value. For the purposes of this project, the name is a string and the value is an integer whole number. These item details are read in by the agents and the properties manipulated during the negotiation process. Figure 4 shows an example XML document showing a sale Item and its properties.
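The XML document shown in Fig. 4 is not reproduced in this text, so the sketch below uses an assumed element layout (an item element with property children carrying name and value attributes) purely to illustrate the name/value convention described above; the real schema used by the prototype may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sale-item document; the element and attribute names are assumptions.
SALE_ITEM_XML = """\
<item name="LaptopA">
  <property name="price" value="1200"/>
  <property name="hardDrive" value="500"/>
  <property name="warranty" value="2"/>
</item>"""

def load_item(xml_text):
    """Return the item as {property name: integer value}, matching the
    'string name / whole-number value' convention of Section 4.2."""
    root = ET.fromstring(xml_text)
    return {p.get("name"): int(p.get("value")) for p in root.findall("property")}

print(load_item(SALE_ITEM_XML))  # {'price': 1200, 'hardDrive': 500, 'warranty': 2}
```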
Fig. 4 A sale item representation in XML.
4.3 The Facilitator Agent The facilitator agent receives registration requests from both the buyer and the seller and then processes each request (either accepting or denying it). The most important functionalities of the facilitator agent are management of other agents, pre-selection of agents for a negotiation, and looking after the negotiation process. During the pre-selection phase, the facilitator selects and ranks those agents of the marketplace that will start a specific negotiation over certain issues with a given agent. Once the maximum number of negotiation rounds has been completed, the facilitator sends the best offer back to the buyer.
4.4 The Seller Agent The seller agent is responsible for registering with the facilitator and sending a list of sale items which are available. This agent also manages the counter offers received from the facilitator. The agent has a threshold limit as to how much it is able to negotiate. All offers where it needs to negotiate more incur an increase in the price of the good.
4.5 JADE Implementation The proposed multi-issue negotiation system was implemented using the JADE framework [3] and the Eclipse platform [12] as our programming environment. Figure 5 shows a screen dump of the development environment. The system provides graphical user interfaces for users (buyers and sellers) to define scoring functions, weighting factors, and negotiation tactics. It also has a customer management system for the system administrator. We have carried out a preliminary evaluation of user satisfaction with the proposed system. The performance of the system is very promising and we are currently investigating further improvements to the system.
Fig. 5 The Development environment using Eclipse and JADE
5 Conclusions and Further Research In this paper, we have attempted to model multi-issue negotiation in the ecommerce environment. We showed how a one-to-many negotiation could be handled by co-ordinating a number of agents via a facilitator agent. The facilitator agent utilises scoring functions, relative importance factors and Pareto optimality principles for the evaluation and ranking of offers. We have also demonstrated a prototype implementation using JADE[3] within the Eclipse environment. The facilitator agent currently uses a simple strategy, providing a proof of concept. In the future, we plan to extend the approach using more expressive logics, namely fuzzy logic[2] to increase the expressiveness of supply/demand descriptions. We
are also investigating other negotiation protocols, without the presence of a facilitator, that allow agents to reach an agreement in a reasonable number of communication rounds.
References [1] Badica, C., Badita, A., Ganzha, M.: Implementing rule-based mechanisms for agentbased price negotiation, pp. 96–100. ACM, New York (2006) [2] Cao, Y., Li, Y.: An intelligent fuzzy-based recommendation system for consumer electronic products. Expert Systems with Applications 33, 230–240 (2007) [3] Bellifemine, F., Caire, G., Greenwood, D.: Developing Multi-Agent Systems with JADE. John Wiley & Sons, UK (2007) [4] Kurbel, K., Loutchko, I., Teuteberg, F.: FuzzyMAN: An agent-based electronic marketplace with a multilateral negotiation protocol. In: Lindemann, G., Denzinger, J., Timm, I.J., Unland, R. (eds.) MATES 2004. LNCS (LNAI), vol. 3187, pp. 126– 140. Springer, Heidelberg (2004) [5] Lin, C., Chen, S., Chu, Y.: Automatic price negotiation on the web: An agent-based web application using fuzzy expert system. Expert Systems with Applications, 142 (2010), doi:10, 1016/j.eswa.2010.09 [6] Merlat, W.: An Agent-Based Multiservice Negotiation for ECommerce. BT Technical Journal 17(4), 168–175 [7] Ndumu, D.T., Collis, J.C., Nwana, H.S.: Towards desktop personal travel agents. BT Technol. J. 16(3), 69–78 (200x); Nwana, H.S, et al.: Agent-Mediated Electronic Commerce: Issues, Challenges and Some Viewpoints. In: Autonomous Agents 1998, MN, USA (1998) [8] Nwana, H.S., et al.: Agent-Mediated Electronic Commerce: Issues, Challenges and Some Viewpoints. In: Autonomous Agents 1998, MN, USA (1998) [9] Paprzycki, M., et al.: Implementing Agents Capable of Dynamic Negotiations. In: Petcu, D., et al. (eds.) Proceedings of SYNASC 2004: Symbolic and Numeric Algorithms for Scientific Computing, pp. 369–380. Mirton Press, Timisoara (2004) [10] Ragone, A., Straccia, U., Di Noia, T., Di Sciascio, E., Donini, F.M.: Towards a fuzzy logic for automated multi-issue negotiation. In: Hartmann, S., Kern-Isberner, G. (eds.) FoIKS 2008. LNCS, vol. 4932, pp. 381–396. Springer, Heidelberg (2008) [11] Takayuki, I., Hattori, H., Klein, M.: Multi-Issue Negotiation Protocol for Agents: Exploring Nonlinear Utility Spaces. In: IJCAI 2007, pp. 1347–1352 (2007) [12] The Eclipse Platform, http://www.eclipse.org/ [13] Tsai, H.-C., Hsiao, S.-W.: Evaluation of alternatives for product customization using fuzzy logic. Information Sciences 158, 233–262 (2004)
Effect of Background Music Tempo and Playing Method on Shopping Website Browsing Chien-Jung Lai, Ya-Ling Wu, Ming-Yuan Hsieh, Chang-Yung Kung, and Yu-Hua Lin *
Abstract. Background music is one of the critical factors that affect browsers’ behavior on shopping website. This study adopted a laboratory experiment to explore the effects of background music tempo and playing method on cognitive response in an online store. The independent variables were background music tempo (fast vs. slow) and playing method of music (playing the same music continuously, re-playing the same music while browsing different web pages, and playing different music while browsing different web pages). The measures of the shifting frequency between web pages, perceived browsing time, and recalling accuracy of commodity were collected. Results indicated that participants had more shifting frequency and shorter perceived browsing time for fast music tempo than for slow music tempo. The effect of music tempo on recalling accuracy was not significant. Continuous playing method and re-playing method had similar shifting frequency and perceived browsing time. Different method had more shifting frequency, longer perceived time, and less recalling accuracy. Continuous playing had greater accuracy than re-playing method and different method. The findings should help in manipulating background music for online shopping website. Keywords: Background music, music tempo, playing method.
Chien-Jung Lai Department of Distribution Management, National Chin-Yi University of Technology Ya-Ling Wu Department of Applied English, National Chin-Yi University of Technology Ming-Yuan Hsieh · Chang-Yung Kung Department of International Business, National Taichung University of Education Yu-Hua Lin Department of Marketing & Distribution Management, Hsiuping Institute of Technology J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 439–447. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
1 Introduction Online retailing has attracted a great deal of attention in recent years due to the rapid development of the Internet. Some researchers have already begun to call for more systematic research on the nature of this format by using established
retailing and consumer behavior theories. Research on consumers' online behavior is more important than ever. As a natural departure from the stimuli present in a traditional retail store, the online retail environment can only be manipulated through visual and auditory cues. In the past, research on online stores focused on the design of website structure and interface from the visual stimulus; few studies carried on the discussion from the auditory stimulus. Recently, many websites place background music on the web to attract browsers' attention, and some studies have started to examine the effect of background music on consumer response [20]. These studies focused on the structures of music, such as rhythm, tempo, volume, melody and mode, and on browsers' affective response. Little research has been done on the measurement of cognitive response to background music in an online setting. The present study addresses the music tempo and playing method of background music and examines browsers' cognitive response.
2 Research on Store Atmospherics 2.1 Brick-and-Mortar Atmospherics The impact of atmospherics on the nature and outcomes of shopping has been examined by researchers for some time. To explain the influence of atmospheres on consumers, atmospheric research has focused heavily on the Mehrabian-Russell affect model [13]. Donovan and Rossiter [5] tested the Stimulus-Organism-Response (S-O-R) framework in a retail store environment and examined Mehrabian and Russell's three-dimensional pleasure, arousal, and dominance (PAD) emotional experience as the intervening organism state. Empirical work in the area has examined specific atmospheric cues and their effects on shopper response. As shown throughout the literature, atmospherics can be a means of influencing the consumer in brick-and-mortar environments.
2.2 Web Atmospherics Dailey [3] defined web atmospherics as "the conscious designing of web environments to create positive effects (e.g., positive affect, positive cognitions, etc.) in users in order to increase favorable consumer responses (e.g., site revisiting, browsing, etc.)." Eroglu et al. [6] propose a typology that classifies web atmospheric cues into high task-relevant cues (i.e., descriptions of the merchandise, the price, navigation cues, etc.) and low task-relevant cues (i.e., the colors, borders and background patterns, typestyles and fonts, music and sounds, etc.). These cues form the atmosphere of a web site. As in brick-and-mortar environments, atmospheric cues have been posited to influence consumers on the web [3] [6]. However, research on web atmospherics thus far is limited in its theoretical explanation of why web atmospherics influence consumers. Dailey [3] indicated that web atmospheric researchers should begin to focus on specific web atmospheric cues (i.e., color cues, navigation cues, etc.) and theoretical explanations of how and why these cues may influence consumers. The present study will address this issue by focusing specifically on the design of music cues.
2.3 Music Cues There have been a number of studies that investigate the effect of background music in the physical environment across various research fields, for example, advertising, retail/manufacturing, and ergonomics [8][15][17]. Bitner [1] argued that music is a critical ambient condition of the servicescape and that music influences people's emotions, physiological feelings, and mood. Various structural characteristics of music, such as time (rhythm, tempo, and phrasing), pitch (melody, keys, mode, and harmony), and texture (timbre, orchestration, and volume), influence consumer response and behavior [2]. Academic research suggests that music affects how much time shoppers spend in a store, and it appears to affect shoppers' perceptions of the amount of time they spent shopping [22]. However, few studies have reported the effect of music on websites.
2.4 Music Tempo and Playing Method Music tempo has been considered representative of an essential dimension of music and has received wide attention in previous research [2] [12]. In empirical studies, the variation in the tempo of background music has been found to have a differential effect on listeners’ affective responses and shopping behavior. Milliman [14] found that fast-tempo background music speeds up in-store traffic flow and increases daily gross sales volume compared with slow tempo music. Further, Milliman [15] found that fast-tempo background music shortens restaurant patrons’ dining time. BrunerII [2] summarized that faster music can induce a more pleasant affective response than slower music. In a meta analysis, Garlin and Owen [7] concluded that tempo is the main structural component of music to induce listeners’ arousal. Besides the study of the traditional physical environment, the study of Wu et al. [20], focusing on the effect of music and color on participants’ emotional response in an online store setting, indicated that participants felt more aroused and experienced greater pleasure when they were exposed to fast-tempo music website than those people who experienced slow-tempo music. Day et al. [4] examined the effects of music tempo and task difficulty on the performance of multi-attribute decision-making. The results indicated participants made more accurately with the presentation of faster than slower tempo music. Further, the faster tempo music was found to improve the accuracy of harder decision-making only, not the easier decision-making. The music is always played as background music in a continuous way in the traditional retail environment. However, the music could be played in various methods in the online setting for the progress advancement of information technology. The music could be played continuously during the browsing time for a browser. It can also be played in different ways, such as re-playing the same music while browsing a different web page, or playing distinct music while browsing a different web page. Various playing method may cause different attention and induce different browsers’ response.
2.5 Music and Cognitive Response A number of studies have been conducted using affective response as measurement of shopper in a retailer store environment [3][5][6]. According to different theoretical perspective, background music has been focus to act on listeners’ cognitive process [11]. Cognitive response such as shoppers’ attention, memory, and perceived time are also critical for the assessment of atmospheric cues. A possible general explanation for the effect of background music can be based on the mediating factor of attention [9][10]. Background music can play at least two different roles in the operation of attention, i.e. the distractor vs. the arousal inducer. From the distractor perspective, both browsing website and processing background music are basically assumed to be cognitive activity demanding the attention resource [9][18]. On the other hand, from the arousal inducer perspective, background music may affect the arousal level of listeners, increasing the total amount of momentary mental resource available to browsing website and thus, impacts the recalling of memory. According to the distractor perspective, the performance of a task is impaired by the interruption of background music, while from the arousal inducer perspective, it is improved by the supplement of resource stimulated by the background music. This contradiction makes the prediction of background music effect problematic. Time is an important factor in retailing because retailers strongly believe in a simple correlation between time spent shopping and amount purchased. Milliman’s [14] study suggests that music choices affect actual shopping times. His study on grocery store indicated that consumers spent 38% more time in the store when exposed to slow music compared with fast music. It is likely that shoppers spent more time in the store during the slow music periods than the fast music periods. Yalch and Spangenberg [21] showed that actual shopping time was longer in the less familiar background music condition, but perceived shopping time was longer in the more familiar foreground music condition. Further, Yalch and Spangenberg [22] indicated that individuals reported themselves as shopping longer when exposed to familiar music but actually shopped longer when exposed to unfamiliar music. Shorter actual shopping times in the familiar music condition were related to increased arousal. Longer perceived shopping times in the familiar music condition appear related to unmeasured cognitive factors. Although substantial studies have been performed on the time perception in traditional retailing setting, few studies have reported the effects in online setting. Here, the present study focused on the effect of music on browsers’ cognitive activities. The measures of shifting frequency between web pages, perceived browsing time, and recalling accuracy of website commodity were collected to explore the relation of attention and memory recall.
3 Research Method The main purposes of this study are to design the background music playing methods and to examine the effects of these methods and of music tempo on browsers' cognitive response. A laboratory experiment was conducted to evaluate these effects.
3.1 Participants A total of 54 university students (18 male and 36 female) who were between 19 and 26 years old (M = 21.3, SD = 1.21) were recruited as participants in the experiment. Each participant had at least one year of experience surfing the Internet. They were paid a cash reward of NT$ 100 for their participation.
3.2 Experimental Design The experiment used a 2 x 3 between-participant factorial design. The first independent variable to be manipulated was the music tempo which had two levels: faster tempo vs. slower tempo. According to the definition of North and Hargreaves [16], we considered music tempo of 80 BPM or less as slow and 120 or more as fast. Two categories of songs from the same music collection were used, one fast and the other slow. The second independent variable was music playing methods. There were three levels: playing the same music continuously (Continuous), re-playing the same music while browsing a different web page (Re-play), playing distinct music while browsing a different web page (Different). All of the three methods have been used as background music on online website. Participants were randomly assigned to one of the six treatments condition. The dependent measures collected in this experiment were the shifting frequency between web pages, perceived browsing time, and recalling accuracy of website commodity. Shifting frequency between web pages was the participants’ shift frequency between web pages during the browsing period. It was used to express the variations of browsers’ attention. The perceived browsing time was the participant’s estimation of the amount of time they had spent browsing the website. Recalling accuracy of commodity was the percentage of the correct recall of commodities attributes after the browsing task.
3.3 Materials The shopping website conducted in the experiment had 3 levels of contents. The first level introduced two types of commodity store for participant to browse. The second level had a brief description of commodity for each store. Each store had 3 commodities. The third level described the detail information for each commodity. In total, there were nine pages for the experimental website. In order to match up the music tempo and playing method of playing a different music while browsing different web pages, nine faster tempo songs and nine slower tempo songs were selected from the same music collection. All of the songs had no lyrics. The volume was controlled at 60 dB which was considered as comfortable level for shopping [12].
3.4 Task and Procedure The participants were tested individually in the experiment setting. A participant was instructed to browse the experimental shopping website for four minutes as he/she was planning to purchase from the site. Background music was played from the computer’s speakers. After browsing the website, participants were asked to estimate the amount of time they had spent on browsing the website by circling the points of minutes (from 1 to 8) and second (from 0 to 60) [19]. Then a short test for the recall of attributes for each commodity was given to participants. Participants’ shift frequency between web pages during the browsing was recorded by the FaceLab eye-tracking system automatically.
4 Results Table 1 shows the means and standard deviations of shifting frequency between web pages, perceived browsing time, and recalling accuracy of commodity under each level of the independent variables. An analysis of variance (ANOVA) was conducted for each of the dependent measures. The factors that were significant were further analyzed using the Tukey test to discuss the differences among the factor levels.

Table 1 Means and standard deviations of shifting frequency between web pages, perceived browsing time, and recalling accuracy of commodity under levels of the independent variables

Independent variable   Level        n    Shifting frequency       Perceived time (s)       Recalling accuracy (%)
                                         Mean   S.D.   Tukey      Mean   S.D.   Tukey      Mean   S.D.   Tukey
Music tempo            Fast         27   33.59  10.46  A          245    51     A          63.1   0.11   A
                       Slow         27   19.64   4.48  B          288    52     B          58.7   0.15   A
Playing method         Continuous   18   25.33   7.76  AB         265    55     AB         71.2   0.09   A
                       Re-play      18   24.06   8.94  A          245    55     A          59.3   0.11   B
                       Different    18   30.44  13.73  B          289    51     B          52.2   0.12   B

Note: Different letters in a Tukey group indicate a significant difference at the 0.05 level.
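The paper does not state which statistical package was used for these analyses; the sketch below shows how the same 2 x 3 between-participants ANOVA and the Tukey comparisons could be run in Python with statsmodels. The data frame and its values are invented for illustration and only mimic the direction of the reported effects.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical tidy layout: one row per participant, with both factors
# and one dependent measure (here, shifting frequency between web pages).
df = pd.DataFrame({
    "tempo":      ["fast"] * 6 + ["slow"] * 6,
    "method":     ["continuous", "replay", "different"] * 4,
    "shift_freq": [35, 30, 41, 33, 29, 38, 21, 18, 24, 20, 17, 23],
})

# 2 (tempo) x 3 (playing method) between-participants ANOVA.
model = ols("shift_freq ~ C(tempo) * C(method)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))

# Tukey HSD comparisons among the three playing methods.
print(pairwise_tukeyhsd(df["shift_freq"], df["method"]))
```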
4.1 Shifting Frequency between Web Pages Results of the analysis of variances for shifting frequency have shown that the main effects of music tempo were significant (F(1, 48) = 44.422, p < 0.01). Participants had more shifting frequency between web pages for fast tempo (33.59) than for slow tempo (19.63). There was also a significant main effects for playing method (F (2, 48) = 3.472, p < 0.05). Multiple comparisons using Tukey test demonstrated that shifting frequency for re-playing method (24.06) was less than for different method (30.44). However, there was no significant difference between
continuous playing method (25.33) and re-playing method (24.06), and between continuous playing method (25.33) and different method (30.44). The interaction of music tempo and playing method was not significant for shifting frequency.
4.2 Perceived Browsing Time Analysis of variance shows that music tempo had a significant effect on perceived browsing time (F(1, 48) = 9.926, p < 0.01). Participants had shorter perceived browsing time for fast tempo (245 s) than for slow tempo (288 s). There was also a significant main effect for playing method (F(2, 48) = 3.447, p < 0.05). The Tukey test demonstrated that perceived browsing time for the re-playing method (245 s) was shorter than for the different method (289 s). However, there was no significant difference between the continuous playing method (265 s) and the re-playing method (245 s), or between the continuous playing method (265 s) and the different method (289 s). The interaction of music tempo and playing method was not significant for perceived browsing time.
4.2 Perceived Browsing Time Analysis of variances shows that music tempo had significant difference on perceived browsing time (F(1, 48) = 9.926, p < 0.01). Participants had shorter perceived browsing time for fast tempo (245 s) than for slow tempo (288 s). There was also a significant main effects for playing method (F (2, 48) = 3.447, p < 0.05). Tukey test demonstrated that perceived browsing time for re-playing method (245 s) was shorter than for different method (289 s). However, there was no significant difference between continuous playing method (265 s) and replaying method (245 s), and between continuous playing method (265 s) and different method (289 s). The interaction of music tempo and playing method was not significant on perceived browsing time.
4.3 Recalling Accuracy of Commodity Analysis of variances shows that playing method had significant difference on recalling accuracy of commodity (F (2, 48) = 15.638, p < 0.01). Tukey test demonstrated that continuous playing method had greater accuracy (71.2%) than replaying method (59.3%) and different method (52.2%). However, there was no significant difference between re-playing method and different method. Music tempo had no significant difference on recalling accuracy of commodity. The interaction of music tempo and playing method was also not significant on recalling accuracy of commodity.
5 Discussion and Conclusion The purpose of the study was to examine the differential effect of music tempo and playing method on cognitive response in an online store. The results found that participants had more shifting frequency and shorter perceived browsing time under the faster tempo music condition than under the slower. The more shifting frequency for fast music tempo revealed that participants browsed a web page in a shorter time, and then changed to another page quickly. The findings may provide support that faster tempo music tends to be an arousal inducer rather than a distractor. According to Kaneman’s capacity model [10], faster tempo arouses the participant, which may motivate the participant to increase the processing resources. As expected, recalling accuracy could be improved by the additional mental resources. However, there was not significant difference for music tempo on recalling accuracy. The results can be explained by the finding of Day et al. [4] which indicated the faster tempo music was found to improve the accuracy of harder decision-making only, not the easier decision-making. In the present study,
participants were asked to browse the same shopping website, including 3 books and 3 notebooks, in four minutes. The online shopping website in the experiment had only 3 levels and 9 pages covering 6 commodities. Apart from the background music, there was no difference between conditions for the participants, so it was not hard to complete the browsing task. With regard to the playing method of background music, the results showed that the continuous method and the re-play method had similar shifting frequency and perceived browsing time, and the continuous method had the greatest recalling accuracy of commodity. Further, the different method had more shifting frequency, longer perceived time, and less recalling accuracy. These findings may imply that the different playing method tends to be a distractor rather than an arousal inducer. Using distinct music while browsing a different web page as background music may have interfered with participants' browsing of the web contents, since extra attention resources are demanded to process the cognitive activities. On the contrary, the continuous playing method may be an arousal inducer, given the shorter perceived time and higher recalling accuracy. The lower shifting frequency may indicate that participants allocated more attention to a website page because of the lower interference from the background music. The results of this study support the belief that browsers' behavior is affected by a retail environmental factor like background music. It confirms that music tempo is an important structural component of music that relates to the arousal response from the cognitive view. However, the relationships between the measures have not been examined clearly in the present study; it is necessary to conduct further research to understand these relationships in the future. Acknowledgments. This research was funded by the National Science Council of Taiwan, Grant No. NSC 99-2221-E-167-020-MY2.
References [1] Bitner, M.J.: Servicescape: The impact of physical surroundings on consumers and employees. Journal of Marketing 56, 57–71 (1992) [2] Bruner II, G.C.: Music, mood, and marketing. Journal of Marketing 54(4), 94–104 (1990) [3] Dailey, L.: Navigational web atmospherics: Explaining the influence of restrictive navigation cues. Journal of Business Research 57, 795–803 (2004) [4] Day, R.F., Lin, C.H., Huang, W.H., Chuang, S.H.: Effects of music tempo and task difficulty on multi-attribute decision-making: An eye-tracking approach. Computers in Human Behavior 25, 130–143 (2009) [5] Donovan, R.J., Rossiter, J.R.: Store atmosphere: an environmental psychology approach. Psychology of Store Atmosphere 58(1), 34–57 (1982) [6] Eroglu, S.A., Machleit, K.A., Davis, L.M.: Atmospheric qualities of online retailing: A conceptual model and implications. Journal of Business Research 54(2), 177–184 (2001) [7] Garlin, F.V., Owen, K.: Setting the tone with the tune: A meta-analytic review of the effects of background music in retail settings. Journal of Business Research 59, 755– 764 (2006)
[8] Herrington, J.D., Capella, L.M.: Practical application of music in service settings. Journal of Services Marketing 8(3), 50–65 (1994) [9] Jones, D.: The cognitive psychology of auditory distraction: The 1997 BPS broad bent lecture. British Journal of Psychology 90(2), 167–187 (1999) [10] Kahneman, D.: Attention and Effort. Prentice-Hall, NJ (1973) [11] Kellaris, J.J., Cox, A.D., Cox, D.: The effect of background music on adprocessing: A contingency explanation. Journal of Marketing 57, 11–125 (1993) [12] Kellaris, J.J., Rice, R.C.: The influence of tempo, loudness, and gender of listener on responses to music. Psychology and Marketing 10(1), 15–29 (1993) [13] Mehrabian, A., Russel, J.A.: An Approach to Environmental Psychology. MIT Press, Cambridge (1974) [14] Milliman, R.E.: Using background music to affect the behaviors of supermarket shoppers. Journal of Marketing 46(3), 86–91 (1982) [15] Milliman, R.E.: The influence of background Music on the behavior of restaurant patrons. Journal of Consumer Research 13, 286–289 (1986) [16] North, A., Hargreaves, D.J.: The effects of music on responses to a dining area. Journal of Environmental Psychology 16(1), 55–64 (1996) [17] Oakes, S.: The influence of the musicscape within service environments. Journal of Services Marketing 14(7), 539–550 (2000) [18] Payne, J.W., Bettman, J.R., Johnson, E.J.: The adaptive decision maker. Cambridge University Press, New York (1993) [19] Seawright, K.K., Sampson, S.E.: A video method for empirically studying waitperception bias. Journal of Operations Management 25(5), 1055–1066 (2007) [20] Wu, C.S., Cheng, F.F., Yen, D.C.: The atmospheric factors of online storefront environment design: An empirical experiment in Taiwan. Information & Management 45, 493–498 (2008) [21] Yalch, R.F., Spangenberg, E.: Using store music for retail zoning: A field experiment. In: McAlister, L., Rothschild, M.L. (eds.) Advances in Consumer Research, vol. 20, pp. 632–636. Association for Consumer Research, Provo (1993) [22] Yalch, R.F., Spangenberg, E.R.: The effect of music in a retail setting on real and perceived shopping times. Journal of Business Research 49, 139–147 (2000)
Forecasting Quarterly Profit Growth Rate Using an Integrated Classifier You-Shyang Chen, Ming-Yuan Hsieh, Ya-Ling Wu, and Wen-Ming Wu
Abstract. This study proposes an integrated procedure based on four components: experiential knowledge, a feature selection method, a rule filter, and rough set theory, for forecasting quarterly profit growth rate (PGR) in the financial industry. To evaluate the proposed procedure, a dataset (called the PGR dataset) collected from Taiwan's stock market in the financial holding industry is employed. The experimental results indicate that the proposed procedure surpasses the listed methods in terms of both higher accuracy and fewer attributes. Keywords: Profit growth rate (PGR), Rough sets theory (RST), Feature selection, Condorcet method.
1 Introduction In economics, financial markets play a major role in the price determination of the financial instrument. The markets enable to set the prices not only for newly issued financial assets but also for an existing stock of financial assets. Financial markets climates and gains/losses can be vastly changed within seconds; thus, the accuracy of information for investment planning is very crucial. A rationally informed investor analyzes expected returns to decide whether they will participate in the stock market or not, but an irrational ‘sentiment’ investor depends on intuition or other factors unrelated to expected returns. In stock market, the most effective way to assess operating performance of a specific company is to conduct a profitability analysis that is the necessary tool for determining how to allocate investor’s resource to ensure profits for themselves best. A profitability analysis is You-Shyang Chen Department of Information Management, Hwa Hsia Institute of Technology Ming-Yuan Hsieh Department of International Business, National Taichung University of Education Ya-Ling Wu Department of Applied English, National Chin-Yi University of Technology Wen-Ming Wu Department of Distribution Management, National Chin-Yi University of Technology J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 449–458. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
the most significant of the financial ratios, which provide a definitive evaluation of the overall effectiveness of management based on the returns generated on sales and investment quarterly and/or annually. Financial ratios are widely used for modeling purposes by both practitioners and researchers. In practice, profit growth rate (PGR) is one of the core financial ratios. Hence, this study aims to further explore quarterly PGR on financial statements. Based on related studies, the performance of proposed classification models may depend strongly on the field of application [1], the study goal [2], or the user experience [3][4]. Therefore, finding more suitable methods applicable in the context of the financial markets is a valuable issue. Recently, artificial intelligence (AI) techniques for classification have gained great popularity not only in research but also in commercialization, and they are flexible enough to perform satisfactorily in a variety of application areas, including finance, manufacturing, health care, and the service industry. It is therefore recommended to employ more efficient classifiers based on artificial intelligence techniques as evaluation tools in the financial industry. Due to the rapid change of information technology (IT) today, the most common AI tools [5] for classification (i.e., prediction or forecasting), such as rough set theory (RST), decision trees (DT), and artificial neural networks (ANN), have become significant research trends for both academicians and practitioners [6][7]. As such, this study proposes an integrated procedure, combining a feature selection method, a rule filter, and the rough sets LEM2 algorithm, for classifying quarterly PGR problems faced by interested parties, particularly investors. A trustworthy forecasting model is strongly desired and expected by them.
2 Related Works This section mainly reviews related studies of profit growth rate, feature selection, rough set theory, and LEM2 rule extraction method and rule filter.
2.1 Profit Growth Rate Profitability represents the ability of a company to raise earnings in a period of time (monthly, quarterly or yearly). Profitability analysis, regarded as one of the financial analyses, is an important tool for investors to judge whether a company is worth including in their investment portfolios. Interestingly, financial ratios are the most common means of profitability analysis for academicians and practitioners [8]. Generally, financial ratios are categorized as profitability, stability, activity, cash flow and growth [9]; growth is the focus of this study. Growth measures include profit growth rate (PGR), revenue growth rate (RGR), sales increase ratio (SIR), year-over-year change on growth rate (YoY), quarter-over-quarter change on growth rate (QoQ), month-over-month change on growth rate (MoM), etc. [10]. Particularly, from the view of stock investors, the PGR is an effective evaluation indicator for seeing how big the potential for future development is, and it measures the growth prospects of a target company that may be selected for investment portfolios [11].
Simply put, in this study PGR refers to the proportion of quarterly variation in operating profits for a specific company; there are four quarters in a year, named the first, second, third and fourth quarter. Quarterly PGR is calculated as follows: Quarterly PGR (%) = ((Current quarter profit - Last quarter profit) * 100) / Last quarter profit. The PGR can be either positive or negative, depending on whether the current quarter's profit is bigger than the last quarter's, and it has no limit with respect to its range. A positive PGR represents a good and optimistic outlook for the company's future, whereas a negative PGR is bad and pessimistic. In application, the PGR is a critical determinant of a firm's success [12]; the higher the PGR of a firm, the better its future is [11]. However, the overall performance of PGR varies greatly by industry and firm size. A higher PGR represents an increase in operating profit, which may be driven by growth in sales volume together with an enlarged market share, and it is regarded as an optimistic sign for the firm's future development; at the same time, a better outlook for a firm will tend to push its stock price higher.
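As a small worked illustration of the formula above, the sketch below computes quarterly PGR and maps it to the F/G/P classes defined later in Step 2 of Section 4. The function names are illustrative, and the sketch assumes a positive last-quarter profit (a negative base would flip the sign of the ratio).

```python
def quarterly_pgr(current_profit, last_profit):
    """Quarterly PGR (%) = (current quarter profit - last quarter profit) * 100
    / last quarter profit; assumes last_profit > 0."""
    return (current_profit - last_profit) * 100.0 / last_profit

def pgr_class(pgr):
    """Class partition used in Step 2: F (Fair, PGR > 100%),
    G (Good, 0% <= PGR <= 100%), P (Poor, PGR < 0%)."""
    if pgr > 100:
        return "F"
    return "G" if pgr >= 0 else "P"

pgr = quarterly_pgr(150, 120)      # (150 - 120) * 100 / 120 = 25.0
print(pgr, pgr_class(pgr))         # 25.0 G
```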
2.2 Feature Selection In order to remove redundant information completely and to enable quicker and more accurate training, a number of preprocessing steps were applied to the dataset. One of the critical aspects of any knowledge discovery process is feature selection. Feature selection helps to evaluate the usefulness of attributes, to select relevant attributes, and to reduce the dimensionality of datasets by deleting unsuitable attributes that may degrade classification performance, and hence improves the performance of data mining algorithms [13]. Feature selection techniques can be roughly categorized into three broad types: the filter model [13][14], the wrapper model [15][16], and the hybrid model [16][17]. Following Witten and Frank [18], filter-model feature selection methods such as ReliefF, InfoGain, Chi-square, Gain Ratio, Consistency, OneR, and Cfs are well known in academic work and in common use. This study mainly introduces the five subset-evaluator methods adopted for selecting features, as follows [18]: (1) Cfs (Correlation-based feature selection) [19]: The method evaluates subsets of attributes that are highly correlated with the class while having low inter-correlation. (2) Chi-squared [20]: The Chi-squared method evaluates the worth of an attribute by computing the value of the chi-squared statistic with respect to the class. (3) Consistency [21]: This method values an attribute subset by using the level of consistency in the class values when the training instances are projected onto the attribute subset. (4) Gain Ratio [22]: The method evaluates the worth of an attribute by measuring the gain ratio with respect to the class. (5) InfoGain [23]: InfoGain is one of the simplest attribute ranking methods and is often used in text categorization applications. The method evaluates the worth of an attribute by measuring the information gain with respect to the class.
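As a concrete illustration of the InfoGain criterion in (5), the sketch below computes the information gain of a single discrete attribute with respect to the class labels. Continuous attributes such as those in the PGR dataset would first need to be discretised (the proposed procedure later uses global cuts for this), and the toy data here are invented.

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(attribute_values, labels):
    """InfoGain(A) = H(class) - H(class | A) for a discrete attribute A."""
    n = len(labels)
    remainder = 0.0
    for v in set(attribute_values):
        subset = [c for a, c in zip(attribute_values, labels) if a == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# Toy example with one discretised attribute and the F/G/P class labels.
attr   = ["low", "low", "high", "high", "high"]
labels = ["P",   "G",   "F",    "F",    "G"]
print(round(info_gain(attr, labels), 3))   # roughly 0.571
```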
2.3 Rough Set Theory Rough set theory (RST), first proposed by Pawlak [24], employs mathematical modeling to deal with data classification problems. For an information system with attribute set A and universe U, let B ⊆ A and X ⊆ U. The set X is approximated using the information contained in B by constructing the lower and upper approximation sets, B_*(X) = {x | [x]_B ⊆ X} and B^*(X) = {x | [x]_B ∩ X ≠ ∅}, respectively. The elements in B_*(X) can be classified with certainty as members of X by the knowledge in B, whereas the elements in B^*(X) can only be classified as possible members of X by the knowledge in B. The set BN_B(X) = B^*(X) − B_*(X) is called the B-boundary region of X, and it consists of those objects that cannot be classified with certainty as members of X with the knowledge in B [25]. The set X with respect to the knowledge in B is called 'rough' (or 'roughly definable') if the boundary region is non-empty.
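A minimal sketch of the lower and upper approximations on a toy decision table is given below; it is meant only to illustrate the definitions above, and the objects and attribute names are invented.

```python
from collections import defaultdict

def partition(universe, B):
    """Group object ids into B-indiscernibility classes [x]_B, where each
    object is a dict of attribute values and B lists the attributes used."""
    blocks = defaultdict(list)
    for x in universe:
        blocks[tuple(x[a] for a in B)].append(x["id"])
    return list(blocks.values())

def approximations(universe, B, X):
    """Return (lower, upper) approximations of the concept X (a set of ids)."""
    lower, upper = set(), set()
    for block in partition(universe, B):
        if set(block) <= X:
            lower.update(block)      # block entirely inside X
        if set(block) & X:
            upper.update(block)      # block overlaps X
    return lower, upper

U = [{"id": 1, "a": 0, "b": 1}, {"id": 2, "a": 0, "b": 1},
     {"id": 3, "a": 1, "b": 0}, {"id": 4, "a": 1, "b": 1}]
lower, upper = approximations(U, ["a", "b"], {1, 3})
print(lower, upper - lower)   # {3} and the boundary region {1, 2}
```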
2.4 The LEM2 Rule Extraction Method and Rule Filter Rough set rule induction algorithms were implemented for the first time in the LERS (Learning from Examples) system [26]. A local covering is induced by exploring the search space of blocks of attribute-value pairs, which are then converted into the rule set. The LEM2 (Learning from Examples Module, version 2) algorithm [27] for rule induction is based on computing a single local covering for each concept. A large number of rules limits the classification capability of the rule set, as some rules are redundant or of 'poor quality.' Rule-filtering algorithms [28] can be used to reduce the number of rules.
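The rule-filtering idea can be shown in a few lines; the rule strings and support counts below are hypothetical, and the threshold of 2 mirrors the support threshold applied later in Step 6 of the proposed procedure.

```python
def filter_rules(rules, min_support=2):
    """Drop 'poor quality' rules whose support is below the threshold, i.e.,
    rules matched by only a single training example."""
    return [r for r in rules if r["support"] >= min_support]

rules = [
    {"rule": "IF A_costs = high AND T_assets = low THEN PGR = P", "support": 14},
    {"rule": "IF F_assets = low THEN PGR = G",                    "support": 1},
]
print(filter_rules(rules))   # the single-example rule is removed
```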
3 Methodology Financial markets are by nature filled with risk and uncertainty: the greater the uncertainty, the greater the price volatility and, conversely, the greater the risk of investment in the financial markets. Changing economic conditions, together with this characteristic uncertainty and risk, make financial forecasting even more difficult for investors. Hence, the need for a more reliable way to forecast quarterly PGR accurately is obvious for investors. Accordingly, this study aims to construct an integrated procedure for classifying PGR and for improving the accuracy of a rough set classification system. Figure 1 illustrates the flowchart of the proposed procedure.
Fig. 1 The flowchart of the proposed procedure (1. collect the practical dataset; 2. preprocess the dataset; 3. select the core attributes (feature selection); 4. build the decision table; 5. extract the decision rules (RST, LEM2); 6. improve rule quality (rule filter); 7. evaluate the experimental results against other methods)
The proposed procedure is introduced step-by-step roughly as follows: Step 1: Collect the attributes of practical dataset by experiential knowledge. Step 2: Preprocess the dataset. Step 3: Select the core attributes by five feature selection methods. Step 4: Build the decision table by global cuts. Step 5: Extract the decision rules by LEM2 algorithm. Step 6: Improve rule quality by rule filter. Step 7: Evaluate the experimental results.
4 Empirical Case Study To verify the proposed procedure, a collected dataset will be used in the experiment. The computing processes of using the target dataset are expressed in detail as follows: Step 1: Collect the attributes of practical dataset by experiential knowledge. At first, based on experiential knowledge of the author (having ever-skillful experiences working in the financial industry about 14 years and individual investments in this field over 20 years) in the financial markets, a practically collected dataset is selected from 70 publicly traded financial holding stocks listed on Taiwan’s TSEC and OTC as experimental dataset that is quarterly financial reports during 2004-2006. There are a total of nine selected attributes: (i) F_assets, (ii)T_assets, (iii) T_liabilities, (iv) O_expenses, (v) O_income, (vi) T_salary, (vii) A_costs, (viii) C_type, and (ix) PGR (Class); except for the ‘C_type,’ all attributes
are continuous data. They refer to total fixed assets, total assets, total liabilities, operating expenses, operating income, total salary, agency costs, company type, and the profit growth rate reported in the quarterly financial statements, respectively. The first eight items are condition attributes and the last, ‘PGR,’ is the decision attribute. Therefore, the target dataset is called the PGR dataset in this study. Table 1 shows the attribute information of the PGR dataset sample. Table 1 The attribute information in the PGR dataset
No. | Attribute name | Attribute information | Number of values | Note
1   | F_assets       | numeric  | continuous | Min: 15990 ~ Max: 47131292
2   | T_assets       | numeric  | continuous | Min: 8390514 ~ Max: 2380547669
3   | T_liabilities  | numeric  | continuous | Min: 5375826 ~ Max: 2297331093
4   | O_expenses     | numeric  | continuous | Min: -1994478 ~ Max: 11852475
5   | O_income       | numeric  | continuous | Min: -34328398 ~ Max: 9318711
6   | T_salary       | numeric  | continuous | Min: -701805 ~ Max: 7535360
7   | A_costs        | numeric  | continuous | Min: -4530602 ~ Max: 188208927
8   | C_type         | symbolic | 4          | A, B, C and D refer to different types of company
9   | PGR (Class)    | numeric  | continuous | Min: -1671.79 ~ Max: 247.28
(Unit: thousands in New Taiwan Dollars). Table 2 The partial raw data of the PGR dataset
No.  | F-assets | T-assets | T-liabilities | …  | A-costs | C-type | Class
1    | 25071    | 1355605  | 1273420       | …  | 11149   | A      | P
2    | 25119    | 1337366  | 1258024       | …  | 4742    | A      | F
3    | 25054    | 1296335  | 1214666       | …  | 4930    | A      | F
4    | 25038    | 1299946  | 1219787       | …  | 4368    | A      | F
5    | 23386    | 1574022  | 1486897       | …  | 7242    | A      | F
…    | …        | …        | …             | …  | …       | …      | …
632  | 36016    | 2380547  | 2297331       | …  | 6982    | D      | F
633  | 26887    | 1880395  | 1814067       | …  | 9612    | D      | F
634  | 25612    | 1674573  | 1583874       | …  | 11558   | D      | F
635  | 25613    | 1674420  | 1586178       | …  | 6677    | D      | F
636  | 25457    | 1656606  | 1571092       | …  | 9017    | D      | F
Step 2: Preprocess the dataset. The target dataset needs to be preprocessed to make knowledge discovery easier; accordingly, the data are transformed into EXCEL format, which is processed more easily and effectively in this study. After preprocessing, the dataset contains a total of 636 instances. Based on the experiential knowledge of the author, the
‘PGR (Class)’ (i.e., the decision attribute) is partitioned into three classes: F (Fair, PGR > 100%), G (Good, PGR = 0% ~ 100%), and P (Poor, PGR < 0%). The raw data of the PGR dataset are shown in Table 2. Step 3: Select the core attributes by five feature selection methods. The key goal of feature selection is to evaluate the usefulness of attributes, select relevant attributes, and remove redundant and/or irrelevant attributes. Hence, five feature selection methods, Cfs, Chi-square, Consistency, Gain Ratio, and InfoGain, are applied to the PGR dataset for these purposes. Three sub-steps are used to select the core attributes. Step 3-1: Use the consistency principle to find the more consistent outcomes for estimating the potential attributes among the five feature selection methods. Step 3-2: Use the Condorcet method [29] to compute scores for the potential attributes over the five feature selection methods; the cut-off point is then set where the difference between two consecutive scores, Score_i - Score_{i+1}, is maximal (a sketch of this scoring is given after Step 7 below). Step 3-3: Integrate the two methods above to find the core attributes. Consequently, only four core attributes remain. The selected key attributes are ‘F-assets,’ ‘T-assets,’ ‘T-liabilities,’ and ‘A_costs,’ which are regarded as input attributes (condition attributes) for the following steps. Step 4: Build the decision table by global cuts. The condition attributes selected in Step 3, together with the class (decision attribute), are used to build a decision table by global cuts [30]. Step 5: Extract the decision rules by the LEM2 algorithm. Based on the derived decision table, the decision rules for classifying PGR are determined by the rough sets LEM2 algorithm. Step 6: Improve rule quality by the rule filter. The more rules are generated, the more complex the prediction becomes. Thus, the rule filter algorithm in the rough sets guides a filtering process in which rules below the support threshold (< 2) are eliminated, because such a rule coincides with only one example. The performance of the refined rules is reported in the next step. Step 7: Evaluate the experimental results. To verify the proposed procedure, the PGR dataset is randomly split into two sub-datasets: a 67% training dataset and a 33% testing dataset; the former thus consists of 426 observations while the latter has 210 observations. The experiments are repeated ten times with the 67%/33% random split using four different methods: Bayes Net [31], Multilayer Perceptron [32], Traditional Rough Sets [33], and the proposed procedure. Next, the average accuracy is calculated together with the standard deviation. Finally, two evaluation criteria, the accuracy (with standard deviation) and the number of attributes, are adopted as the comparison standard. Table 3 illustrates the experimental results on the PGR dataset.
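The following Python sketch illustrates one plausible reading of the Step 3-2 scoring and cut-off; it is an illustration only, not the authors' exact implementation. Each of the five feature-selection methods contributes a ranking, attributes are scored by pairwise wins in the spirit of the Condorcet method, and the cut-off is placed at the largest gap between consecutive sorted scores.

```python
from itertools import combinations

def condorcet_scores(rankings):
    """rankings: list of attribute lists, each ordered from most to least relevant."""
    attrs = set(rankings[0])
    scores = {a: 0 for a in attrs}
    for a, b in combinations(attrs, 2):
        # a "wins" against b if more methods rank it higher than b
        wins_a = sum(r.index(a) < r.index(b) for r in rankings)
        wins_b = len(rankings) - wins_a
        if wins_a > wins_b:
            scores[a] += 1
        elif wins_b > wins_a:
            scores[b] += 1
    return scores

def core_attributes(scores):
    """Cut where the difference between consecutive sorted scores is maximal."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    gaps = [ordered[i][1] - ordered[i + 1][1] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return [a for a, _ in ordered[:cut]]
```

With the rankings produced by the five selectors as input, the attributes ranked above the largest score gap would be retained as core attributes.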
Experimental results. Regarding the comparison, the proposed procedure has the highest accuracy (94.05%) and thus significantly outperforms the other listed methods. Moreover, the proposed procedure uses fewer attributes (5) than the listed methods (9). In a more detailed view, the accuracy is significantly improved when the proposed procedure is compared to traditional Rough Sets. This finding confirms the need for feature selection, which eliminates redundant or useless attributes, and supports the evidence that the reduced attribute set can effectively improve the accuracy rate through this integrated feature selection on the PGR dataset. As to the standard deviation of the accuracy, the proposed procedure has the second lowest value; the lower the standard deviation, the higher the performance, and vice versa. In particular, the proposed procedure appears suitable and viable for classifying PGR in the PGR dataset. Table 3 The comparison results of four methods for running 10 times in the PGR dataset
Method                                           | Testing accuracy | Attributes
Bayes Net (Murphy, 2002)                         | 89.95% (0.024)   | 9
Multilayer Perceptron (Lippmann, 1987)           | 87.95% (0.017)   | 9
Traditional Rough Sets (Bazan and Szczuka, 2001) | 82.87% (0.033)   | 9
The proposed procedure                           | 94.05% (0.018)   | 5
5 Conclusions This study has proposed an effective procedure for classifying the PGR of the financial industry, based on an integrated feature selection approach to select the four condition attributes of financial data and a rough sets LEM2 algorithm to implement the prediction. To verify the proposed procedure, a practical PGR dataset has been employed. The experimental results show that the proposed procedure has satisfactory accuracy (94.05%) and uses fewer attributes (5), and thus outperforms the other listed methods. One suggestion for future research is to apply the proposed procedure to a more in-depth analysis of market structure in other emerging and developed markets.
References [1] Chen, C.C.: A model for customer-focused objective-based performance evaluation of logistics service providers. Asia Pacific Journal of Marketing and Logistics 20(3), 309–322 (2008) [2] Fan, H., Mark, A.E., Zhu, J., Honig, B.: Comparative study of generalized Born models: Protein dynamics. Chemical Theory and Computation Special Feature 102(19), 6760–6764 (2005)
[3] Barber, S.: Creating effective load models for performance testing with incomplete empirical data. In: Proceedings of the Sixth IEEE International Workshop, pp. 51–59 (2004) [4] Dasgupta, C.G., Dispensa, G.S., Ghose, S.: Comparative the predictive performance of a neural network model with some traditional market response models. International Journal of Forecast 10, 235–244 (1994) [5] Zopounidis, C., Doumpos, M.: Multicriteria classification and sorting methods: a literature review. European Journal of Operational Research 138, 229–246 (2002) [6] Dunham, M.H.: Data mining: Introductory and advanced topics. Prentice Hall, Upper Saddle River (2003) [7] Ravi Kumar, P., Ravi, V.: Bankruptcy prediction in banks and firms via statistical and intelligent techniques – A review. European Journal of Operational Research 180, 1– 28 (2007) [8] Andres, J.D., Landajo, M., Lorca, P.: Forecasting business profitability by using classification techniques: A comparative analysis based on a Spanish case. European Journal of Operational Research 167, 518–542 (2005) [9] Min, S.H., Lee, J., Han, I.: Hybrid genetic algorithms and support vector machines for bankruptcy prediction. Expert Systems with Applications 31, 652–660 (2006) [10] Press, E.: Analyzing Financial Statements. Lebahar, Friedman (1999) [11] Loth, R.B.: Select winning stocks using financial statements, IL, Dearborn, Chicago (1999) [12] Covin, J.G., Green, K.M., Slevin, D.P.: Strategic process effects on the entrepreneurial orientation–sales growth rate relationship. Entrepreneurship Theory and Practice 30(1), 57–82 (2006) [13] Hall, M.A., Holmes, G.: Benchmarking feature selection techniques for discrete class data mining. IEEE Transactions on Data Engineering 15(3), 1–16 (2003) [14] Dash, M., Choi, K., Scheuermann, P., Liu, H.: Feature selection for clustering-a filter solution. In: Proceedings of Second International Conference. Data Mining, pp. 115– 122 (2002) [15] Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artificial Intelligence 97(1-2), 273–324 (1997) [16] Liu, H., Yu, L.: Toward integrating feature selection algorithms for classification and clustering. IEEE Transactions on Knowledge and Data Engineering 17(4), 491–502 (2005) [17] Das, S.: Filters, wrappers and a boosting-based hybrid for feature selection. In: Proceedings of 18th International Conference. Machine Learning, pp. 74–81 (2001) [18] Witten, I.H., Frank, E.: Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann Publishers, USA (2005) [19] Hall, M.A.: Correlation-based feature subset selection for machine learning. Thesis submitted in partial fulfilment of the requirements of the degree of Doctor of Philosophy at the University of Waikato (1998) [20] Kononenko, I.: Estimating attributes: Analysis and extensions of relief. In: Proceedings of the Seventh European Conference on Machine Learning, pp. 171–182. Springer, Heidelberg (1994) [21] Liu, H., Setiono, R.: A probabilistic approach to feature selection - A filter solution. In: Proceedings of the 13th International Conference on Machine Learning, pp. 319– 327 (1996) [22] Duda, R.O., Hart, P.E., Stork, D.G.: Pattern classification, 2nd edn. Wiley and Sons, Inc., New York (2001)
[23] Dumais, S., Platt, J., Heckerman, D., Sahami, M.: Inductive learning algorithms and representations for text categorization. In: Proceedings of the International Conference on Information and Knowledge Management, pp. 148–155 (1998) [24] Pawlak, Z.: Rough sets. Informational Journal of Computer and Information Sciences 11(5), 341–356 (1982) [25] Pawlak, Z.: Rough Sets. Theoretical Aspects of Reasoning About Data. Kluwer, Dordrecht (1991) [26] Grzymala-Busse, J.W.: LERS—a system for learning from examples based on rough sets. In: Intelligent Decision Support. Handbook of Applications and Advances of the Rough Sets Theory, pp. 3–18 (1992) [27] Grzymala-Busse, J.W.: A new version of the rule induction system LERS. Fundamenta Informaticae 31(1), 27–39 (1997) [28] Nguyen, H.S., Nguyen, S.H.: Analysis of stulong data by rough set exploration system (RSES). In: Berka, P. (ed.) Proc. ECML/PKDD Workshop, pp. 71–82 (2003) [29] Condorcet method. Wikipedia, http://en.wikipedia.org/wiki/Condorcet_method (retrieved October 31, 2010) [30] Bazan, J., Nguyen, H.S., Nguyen, S.H., Synak, P., Wróblewski, J.: Rough set algorithms in classification problem. In: Polkowski, L., Tsumoto, S., Lin, T. (eds.) Rough Set Methods and Applications, pp. 49–88. Physica-Verlag, Heidelberg (2000) [31] Murphy, K.P.: Bayes Net ToolBox, Technical report, MIT Artificial Intelligence Labortary (2002), Downloadable from http://www.ai.mit.edu/~murphyk/ [32] Lippmann, R.P.: An introduction to computing with neural nets. IEEE Acoustics, Speech and Signal Processing Magazine 4(2), 4–22 (1987) [33] Bazan, J., Szczuka, M.: RSES and RSESlib - A collection of Tools for rough set. LNCS, pp. 106–113. Springer, Berlin (2001)
Fuzzy Preference Based Organizational Performance Measurement Roberta O. Parreiras and Petr Ya Ekel*
Abstract. This paper introduces a methodology for constructing a multidimensional indicator designed for organizational performance measurement. The methodology involves the application of fuzzy models and methods of their analysis. Its use requires the construction of fuzzy preference relations by means of the comparison of performance measures with respect to a reference standard defined as a predetermined scale consisting of linguistic terms. The exploitation of the fuzzy preference relations is carried out by means of the Orlovsky choice procedure. An application example related to organizational performance evaluation with the use of the proposed methodology is considered, in order to demonstrate its applicability.
1 Introduction In response to the challenges raised by a competitive market, there has been an increasing use of organizational performance measurement systems in the business management activities (Nudurupati et al. 2010). Traditionally, organizational performance has been measured with financial indexes (Kaplan and Norton 1996), (Bosilj-Vuksice et al. 2008). Nowadays, non-financial measures also play an important role in business management, being utilized in the strategic planning to reflect the potential of an organization to obtain financial gains in a near future (Kaplan and Norton 1996), (Chapman et al. 2007), and in the manufacturing and distribution management to control the regular operations (Abdel-Maksoud et al. 2005). In parallel, the complexity of processing multiple performance estimates and of generating effective recommendations from their analysis has motivated researchers to develop models and methods for constructing and analyzing multidimensional performance measures. One possible approach for carrying out the performance evaluation is to regard it as a multiple criteria decision-making problem (Bititci et al. 2001; Clivillé et al. 2007; Yu and Hu 2010). Roberta O. Parreiras · Petr Ya Ekel Pontifical Catholic University of Minas Gerais, Av. Dom José Gaspar, 500, 30535-610, Belo Horizonte, MG, Brazil e-mail:
[email protected], [email protected]
In this paper, we present a methodology for constructing a multidimensional performance indicator, which is based on fuzzy models and methods for multicriteria analysis. Its application requires the construction of fuzzy preference relations by means of the comparison of performance measures with respect to reference standards, which are defined as predetermined scales consisting of linguistic terms (linguistic fuzzy scales (LFSs)). The processing of the fuzzy preference relations is carried out by means of the Orlovsky choice procedure (Orlovsky 1978), which provides the degree of fuzzy nondominance of each performance measure with respect to the predetermined references. The aggregated degree of fuzzy nondominance is taken as a measure of multidimensional performance. An advantageous aspect of the proposed methodology lies in the fact that it is based on the same theoretical foundation (the Orlovsky choice procedure) as several multicriteria decision-making methods (Ekel et al. 2006; Pedrycz et al. 2010) as well as some group decision-making methods (Herrera-Viedma et al. 2002; Pedrycz et al. 2010). Such a common foundation serves as a means for the integration of those classes of methods in the development of a computing system of management tools, where each of those classes of technologies can communicate its results to the others without the need for much data processing or data conversion. The need for the development of integrated platforms of management tools has been recognized in the literature (Berrah et al. 2000; Shobrys and White 2002). The paper begins by presenting, in Section 2, some basic issues related to performance measurement and fuzzy models, which are necessary for understanding our proposal. In Section 3, we introduce a version of the fuzzy number ranking index (FNRI) proposed by Orlovsky (1981), which is extended here in order to deal adequately with fuzzy estimates in a context where they may have no intersections. Section 4 briefly describes the Orlovsky choice procedure and shows how it can be applied to construct a multidimensional performance indicator based on the degree of nondominance of assessed quantities compared with reference standards. An application example related to organizational performance evaluation on the basis of the proposed methodology is considered in Section 5. Finally, in Section 6, we draw our conclusions.
2 Fuzzy Performance Measures In the design of performance measurement systems, indicators of very different nature (for instance, "degree of production efficiency", "level of customer satisfaction", etc.) are utilized (Nudurupati et al. 2010). We call as performance indicator each index of quantitative or qualitative character that reflects the status of the organization considering the goals to be achieved (Popova and Sharpankykh 2010). In the methodology being proposed, fuzzy performance indicators are utilized. Each fuzzy performance indicator Ip, p=1,…,m is defined in a universe of discourse Fp and is associated with a LFS in the following way:
$$\mathrm{LFS}(I_p) = \{X_1^p, X_2^p, \ldots, X_{n_p}^p\}, \qquad (1)$$
where each linguistic estimate $X_k^p$, $k = 1,\ldots,n_p$, is characterized by both a word (or sentence) from a linguistic term set and a fuzzy set with membership function $\mu_{X_k^p}(f_p): F_p \to [0,1]$, $k = 1,\ldots,n_p$. The linguistic estimates belonging to LFS($I_p$), p=1,…,m, form a reference standard for the evaluation of the “goal satisfaction” degree. Such a reference standard can be defined, for instance, in periodic meetings of directors or managers during which they review and update the strategic goals of the enterprise. Finally, it is important to highlight the role of a LFS in the construction of the multidimensional performance indicator: it allows the transformation of physical measures (which can be assessed on different scales) into “goal satisfaction” degrees. Despite its importance, the aspect of defining a LFS is not further addressed in this paper. Among the correlated works which address this subject, we can name (Pedrycz 2001; Brouwer 2006; Pedrycz et al. 2010).
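As a toy illustration (the linguistic terms and breakpoints below are invented, not taken from the paper), a LFS can be represented as a family of trapezoidal membership functions over the indicator's universe of discourse:

```python
def trapezoid(a, b, c, d):
    """Membership function of a trapezoidal fuzzy set with support [a, d] and core [b, c]."""
    def mu(x):
        if b <= x <= c:
            return 1.0
        if a < x < b:
            return (x - a) / (b - a)
        if c < x < d:
            return (d - x) / (d - c)
        return 0.0
    return mu

# Hypothetical LFS for a productivity indicator defined on [0, 1]:
LFS_productivity = {
    "Low":     trapezoid(0.0, 0.0, 0.3, 0.5),
    "Average": trapezoid(0.3, 0.5, 0.6, 0.8),
    "High":    trapezoid(0.6, 0.8, 1.0, 1.0),
}
print(LFS_productivity["Average"](0.74))   # degree to which 0.74 is "Average"
```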
3 Construction of Fuzzy Preference Relations for Performance Measurement As it was indicated above, the methodology proposed in the present paper is based on processing of fuzzy preference relations. A fuzzy nonstrict preference relation $R_p$ (Orlovsky 1978) consists in a binary fuzzy relation, which is a fuzzy set with bi-dimensional membership function $\mu_{R_p}(X_k, X_l): X \times X \to [0,1]$. In essence, $\mu_{R_p}(X_k, X_l)$ indicates the degree to which $X_k$ is at least as good as $X_l$ in the unit interval. The fuzzy nonstrict preference relations can be constructed for each fuzzy performance indicator by means of the comparison of an estimated quantity (which can be represented as a real number or a fuzzy estimate) with a standard reference expressed as a LFS. This comparison can be realized (Ekel et al. 1998) by applying the FNRI proposed by Orlovsky (1981). The rationality of the use of this FNRI is justified in (Ekel et al. 2006). In general, when we are dealing with a performance indicator which can be accommodated (measured) on a numerical scale and the essence of preference behind such performance indicator is coherent with the natural order ≥ along the axis of measured values, then the following expressions can be utilized to compare a pair of fuzzy estimates $X_k$ and $X_l$ (Orlovsky 1981):
$$\mu_{R_p}(X_k, X_l) = \max_{f_p(X_k) \ge f_p(X_l)} \min\!\big(\mu_{X_k^p}(f_p(X_k)),\, \mu_{X_l^p}(f_p(X_l))\big), \qquad (2)$$

$$\mu_{R_p}(X_l, X_k) = \max_{f_p(X_k) \le f_p(X_l)} \min\!\big(\mu_{X_k^p}(f_p(X_k)),\, \mu_{X_l^p}(f_p(X_l))\big), \qquad (3)$$
if the considered indicator is associated with the need of maximization.
In (2) and (3), the operation min is related (Zimmermann 1990) to constructing the Cartesian product $F_p(X_k) \times F_p(X_l)$, where $F_p(X_k)$ represents the universe of discourse $F_p$ associated with the evaluation of $\mu_{X_k^p}(f_p(X_k))$ when processing the Cartesian product. The operation max is carried out over the region $f_p(X_k) \ge f_p(X_l)$ if we use (2) and over the region $f_p(X_k) \le f_p(X_l)$ if we use (3). If the performance indicator is associated with the need for minimization, then (2) and (3) are written for $f_p(X_k) \le f_p(X_l)$ and $f_p(X_k) \ge f_p(X_l)$, respectively. Simple examples of applying (2) and (3) are given in (Pedrycz et al. 2010). Characterizing the considered FNRI, it is necessary to distinguish two extreme situations. The first one is associated with the cases when we cannot distinguish two fuzzy quantities, $X_k$ and $X_l$, whose Core areas intersect (the Core of a fuzzy set X is the set of all elements of the universe that come with membership grades equal to 1 (Pedrycz et al. 2010)). In such cases, (2) and (3) produce $\mu_{R_p}(X_k, X_l) = 1$ and $\mu_{R_p}(X_l, X_k) = 1$, respectively. This means that $X_k$ is indifferent to $X_l$. This is to be considered a desirable result and can be interpreted as the impossibility of identifying which is the best (highest or lowest) of the compared alternatives due to the uncertainty of the available information. The second situation is associated with the cases when the fuzzy estimates do not intersect, and (2) and (3) produce one of the following results:
• $\mu_{R_p}(X_k, X_l) = 1$ and $\mu_{R_p}(X_l, X_k) = 0$, if $X_k$ is better than $X_l$;
• $\mu_{R_p}(X_k, X_l) = 0$ and $\mu_{R_p}(X_l, X_k) = 1$, if $X_l$ is better than $X_k$.
In such cases, the FNRI does not reveal how much better Xk (or Xl) is than Xl (or Xk), which is an important aspect to be considered in the construction of a performance indicator. One direct way around this situation is associated with reconstructing the fuzzy estimates of a LFS. However, it is possible to indicate another way, which is more natural and acceptable from the practical point of view to adequately handle fuzzy estimates with no intersection. This way permits one to distinguish how far a performance measure is from the highest degree of goal satisfaction and consists in the inclusion of a term D ( X k , X l ) to expressions (2) and (3), as follows:
$$\mu_{R_p}(X_k, X_l) = \frac{1}{2}\left\{ \max_{f_p(X_k) \ge f_p(X_l)} \min\!\big(\mu_{X_k^p}(f_p(X_k)),\, \mu_{X_l^p}(f_p(X_l))\big) + \frac{D(X_k, X_l)}{\mathrm{Fmax}_p - \mathrm{Fmin}_p} \right\}, \qquad (4)$$

$$\mu_{R_p}(X_l, X_k) = \frac{1}{2}\left\{ \max_{f_p(X_k) \le f_p(X_l)} \min\!\big(\mu_{X_k^p}(f_p(X_k)),\, \mu_{X_l^p}(f_p(X_l))\big) + \frac{D(X_k, X_l)}{\mathrm{Fmax}_p - \mathrm{Fmin}_p} \right\}. \qquad (5)$$
In expressions (4) and (5), $\mathrm{Fmax}_p$ and $\mathrm{Fmin}_p$ represent the maximum value and the minimum value of the universe of discourse $F_p$. The term $D(X_k, X_l)$ is given by (Lu et al. 2006)

$$D(X_k, X_l) = \begin{cases} \min\limits_{\forall a_p \in \mathrm{Supp}(\mu_{X_k}(f_p)),\; \forall b_p \in \mathrm{Supp}(\mu_{X_l}(f_p))} |a_p - b_p|, & \text{if } \mu_{R_p}(X_k, X_l) = 0 \text{ or } \mu_{R_p}(X_l, X_k) = 0; \\ 0, & \text{otherwise,} \end{cases} \qquad (6)$$

where $a_p, b_p \in F_p$ and the operation Supp provides the support of a fuzzy set (the set of all elements of the universe of discourse with nonzero membership degrees in that set (Pedrycz et al. 2010)).
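The following Python sketch (our own illustration over a discretized universe, following the reconstruction of (2)-(6) above, and not the authors' code) shows how the basic and extended comparisons can be computed for two fuzzy estimates given as membership functions:

```python
def fnri(mu_k, mu_l, universe):
    """Basic FNRI of (2)-(3): max-min over the Cartesian product of the universe."""
    r_kl = max((min(mu_k(a), mu_l(b)) for a in universe for b in universe if a >= b),
               default=0.0)
    r_lk = max((min(mu_k(a), mu_l(b)) for a in universe for b in universe if a <= b),
               default=0.0)
    return r_kl, r_lk

def fnri_extended(mu_k, mu_l, universe):
    """Extended comparison of (4)-(6), adding a normalized support-distance term."""
    r_kl, r_lk = fnri(mu_k, mu_l, universe)
    supp_k = [x for x in universe if mu_k(x) > 0]
    supp_l = [x for x in universe if mu_l(x) > 0]
    if r_kl == 0.0 or r_lk == 0.0:                      # the estimates do not intersect
        d = min(abs(a - b) for a in supp_k for b in supp_l)
    else:
        d = 0.0
    span = max(universe) - min(universe)                # Fmax_p - Fmin_p
    return 0.5 * (r_kl + d / span), 0.5 * (r_lk + d / span)
```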
4 Multidimensional Performance Measurement Based on Fuzzy Preference Relations Let us consider a procedure for the construction of a multidimensional performance indicator based on a set of m fuzzy performance indicators $I_1, \ldots, I_m$, each one with its respective LFS. Initially, it is necessary to obtain a set $X^p = \{X_0^p, X_1^p, \ldots, X_n^p\}$ for each criterion, where $X_0^p$ represents the value being measured for $I_p$ and $X_1^p, \ldots, X_n^p$ are the fuzzy estimates corresponding to the linguistic terms of LFS($I_p$). Then, by applying (4) and (5) to all pairs belonging to $X^p \times X^p$, a fuzzy nonstrict preference relation $R_p$ is constructed for each performance indicator $I_p$, $p = 1, \ldots, m$. Finally, those matrices can be exploited by applying the Orlovsky choice procedure, in order to obtain the fuzzy nondominance degree of $X_0^p$ (which is utilized as a performance measure), as described below. The Orlovsky choice procedure requires the construction of fuzzy strict preference relations and of fuzzy nondominance sets. The strict preference relation $P_p(X_k^p, X_l^p)$ corresponds to the pairs $(X_k^p, X_l^p)$ that satisfy $(X_k^p, X_l^p) \in R_p$ and $(X_l^p, X_k^p) \notin R_p$, and can be constructed as (Orlovsky 1978):

$$\mu_{P_p}(X_k^p, X_l^p) = \max\{\mu_{R_p}(X_k^p, X_l^p) - \mu_{R_p}(X_l^p, X_k^p),\, 0\}. \qquad (7)$$

As $P_p(X_l^p, X_k^p)$ describes the set of all fuzzy estimates $X_k^p \in X^p$ that are strictly dominated by (or strictly inferior to) $X_l^p \in X^p$, its complement $\bar{P}_p(X_l^p, X_k^p)$ provides the set of alternatives that are not dominated by other alternatives from $X^p$. Therefore, in order to obtain the set of alternatives from $X^p$ that are not dominated by any other alternative, it suffices to obtain the intersection of all
$\bar{P}_p(X_l^p, X_k^p)$. This intersection is the set of nondominated objects belonging to $X^p$, with the membership function

$$\mu_{ND_p}(X_k^p) = \min_{X_l^p \in X^p}\big(1 - \mu_{P_p}(X_l^p, X_k^p)\big) = 1 - \max_{X_l^p \in X^p} \mu_{P_p}(X_l^p, X_k^p). \qquad (8)$$
Having a collection of measures $X_0 = \{X_0^1, \ldots, X_0^m\}$ at hand, one can obtain the degrees of fuzzy nondominance $\mu_{ND_p}(X_0^p)$ for each performance indicator $p = 1, \ldots, m$, and aggregate them in order to estimate the multidimensional performance degree for the enterprise. Different aggregation operators can be applied in this context. The use of an intersection operation is suitable when it is necessary to verify at which level the organization simultaneously satisfies all the goals associated with the performance indicators $I_1$ and $I_2$ and ... and $I_m$. Among the operators that can be utilized to implement the intersection operation, the min operator allows one to construct a multidimensional performance indicator

$$G_{\min}(X_0) = \min\big(\mu_{ND_1}(X_0^1), \ldots, \mu_{ND_m}(X_0^m)\big), \qquad (9)$$

under a completely noncompensatory approach, in the sense that the high satisfaction of some goals does not relieve the remaining ones from the requirement of being satisfied. Such a pessimistic approach gives emphasis to the worst evaluations, which may be particularly advantageous for identifying the organization's weaknesses. On the other hand, the use of the union operation is also admissible, when it is necessary to verify at which level the organization satisfies at least one goal, which can be associated with $I_1$ or $I_2$ or … or $I_m$. The use of the max operator to implement the union operation allows one to construct a multidimensional performance indicator

$$G_{\max}(X_0) = \max\big(\mu_{ND_1}(X_0^1), \ldots, \mu_{ND_m}(X_0^m)\big), \qquad (10)$$
under an extremely compensatory approach, in the sense that the high level of satisfaction of any goal is sufficient. Finally, it can be useful to apply the so-called ordered weighted aggregation operator (OWA) (Yager 1995), which can produce a result that is more compensatory than min or that is less compensatory than max under a proper adjustment of its weights. An OWA operator of dimension m corresponds to a mapping function $[0,1]^m \to [0,1]$. Here it is utilized to aggregate a set of m normalized values $\mu_{ND_1}(X_0^1), \ldots, \mu_{ND_m}(X_0^m)$, in such a way that

$$G_{OWA}(X_0) = \sum_{i=1}^{m} w_i b_i, \qquad (11)$$

where $b_i$ is the i-th largest value among $\mu_{ND_1}(X_0^1), \ldots, \mu_{ND_m}(X_0^m)$ and the weights $w_1, \ldots, w_m$ satisfy the conditions $w_1 + \ldots + w_m = 1$ and $0 \le w_i \le 1$, $i = 1, \ldots, m$.
The major attractive aspect of using OWA is that it allows one to specify the weights indirectly by using linguistic quantifiers. Here, OWA is utilized with the linguistic quantifier “majority” to indicate at which level the organization satisfies most of the goals (Yager 1995).
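A compact Python sketch of the exploitation stage (7)-(11) is given below as an illustration (not the authors' implementation). The OWA weights shown are derived from one common piecewise-linear "most" quantifier, Q(r) = 0 for r ≤ 0.3, (r − 0.3)/0.5 for 0.3 < r < 0.8, and 1 for r ≥ 0.8; this particular quantifier is an assumption on our part, although with the nondominance degrees of Table 2 it happens to reproduce the aggregated values 0.5, 1 and 0.72 reported there.

```python
def nondominance(R):
    """R[k][l] = mu_R(X_k, X_l); returns mu_ND(X_k) for every k, via (7) and (8)."""
    n = len(R)
    def strict(l, k):                        # mu_P(X_l, X_k), eq. (7)
        return max(R[l][k] - R[k][l], 0.0)
    return [1.0 - max(strict(l, k) for l in range(n)) for k in range(n)]

def owa(values, weights):
    """Ordered weighted aggregation, eq. (11): weights act on the sorted values."""
    return sum(w * b for w, b in zip(weights, sorted(values, reverse=True)))

nd = [0.8, 0.7, 0.71, 1.0, 0.5]          # mu_ND_p(X_0^p) for I1..I5, as in Table 2
w_most = [0.0, 0.2, 0.4, 0.4, 0.0]       # weights from the assumed "most" quantifier
print(min(nd), max(nd), round(owa(nd, w_most), 2))   # 0.5 1.0 0.72
```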
5 Application Example In order to demonstrate the applicability of the proposed methodology, it is applied to measure the performance of an enterprise. Table 1 shows the performance indicators being considered and their corresponding evaluations. The fuzzy performance indicators I1, I2, and I5 have a maximization character, while I3 and I4 have a minimization character. Figure 1 presents the LFS defined for each performance indicator (the LFSs have different granularities to respect the degree of uncertainty in the perception of the managers invited to participate in the definition of the reference standards). With the use of (4) and (5), fuzzy preference relations are constructed for each performance indicator. By subsequently applying expressions (7) and (8) to the fuzzy preference relations, the fuzzy nondominance degrees shown in Table 2 are obtained. Table 2 also shows the fuzzy nondominance degrees aggregated with the use of the min, max and OWA operators. As can be seen, the low value of $G_{\min}(X_0)$ suggests that the enterprise still has to deal with the corresponding weaknesses. The high value of $G_{\max}(X_0)$ indicates that the organization has already achieved at least one of its goals, which means that more aggressive goals can be established for that performance indicator. The majority of goals have been satisfied at a degree of 0.72, as indicated by $G_{OWA}(X_0)$. It is worth noting that the values of $G_{\min}(X_0)$, $G_{\max}(X_0)$, and $G_{OWA}(X_0)$ depend only on the comparison of $X_0^p$ with $X_*^p$, where $X_*^p$ is the fuzzy estimate from LFS($I_p$) associated with the highest level of goal satisfaction. In this way, only the entries $R_p(X_0^p, X_*^p)$ and $R_p(X_*^p, X_0^p)$ are required to obtain the aggregated fuzzy nondominance degree of $X_0$. However, in order to know the position of $X_0^p$ in LFS($I_p$), it is valuable to obtain the complete matrix $R_p$, as well as the values of $G_{\min}(X_k^p)$, $G_{\max}(X_k^p)$, and $G_{OWA}(X_k^p)$ for all $X_k^p \in X^p$. For instance, consider the fuzzy preference relation associated with I1, which is given by
$$\mu_{R_1} = \begin{bmatrix} 1 & 1 & 1 & 0.8 \\ 0.38 & 1 & 0.81 & 0.5 \\ 0.53 & 1 & 1 & 0.725 \\ 1 & 1 & 1 & 1 \end{bmatrix} \qquad (12)$$
Fig. 1 LFS associated with each fuzzy performance indicator.
Table 1 Fuzzy performance indicators.
Performance indicators                                                                      | Assessments
I1: productivity as average monthly ratio (output/input)                                   | 0.74
I2: production amount (value per hour in 10^3 USD)                                         | Gaussian fuzzy number coming from a Gaussian distribution with (μ, σ) = (70, 4)
I3: production costs (per year in 10^3 USD)                                                | 60900
I4: internal and external failure costs (per year in 10^3 USD)                             | 190
I5: customers satisfaction, expressed as a fuzzy estimate (a trapezoidal fuzzy number) coming from LFS(I5) | Average
Table 2 Fuzzy nondominance degrees for each fuzzy performance indicator and global fuzzy nondominance degrees obtained with min, max and OWA.
Indices and operators | I1  | I2  | I3   | I4 | I5  | min | max | OWA
μ_ND(X_0)             | 0.8 | 0.7 | 0.71 | 1  | 0.5 | 0.5 | 1   | 0.72
and the corresponding fuzzy nondominance set

$$\mu_{ND_1} = [\,0.8 \;\; 0.38 \;\; 0.53 \;\; 1\,]. \qquad (13)$$
By analyzing (13), we can see that the level of fuzzy nondominance of $X_0^1$ is between the level of fuzzy nondominance of the fuzzy estimates associated with the linguistic terms “Average” and “High”, being more similar to “High”.
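For completeness, a few lines of Python (ours, not the authors') confirm that applying (7) and (8) to the matrix (12) indeed yields (13):

```python
R1 = [[1.0,  1.0, 1.0,  0.8],
      [0.38, 1.0, 0.81, 0.5],
      [0.53, 1.0, 1.0,  0.725],
      [1.0,  1.0, 1.0,  1.0]]
n = len(R1)
nd = [round(1 - max(max(R1[l][k] - R1[k][l], 0) for l in range(n)), 3) for k in range(n)]
print(nd)   # [0.8, 0.38, 0.53, 1.0] -- reproduces (13)
```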
6 Conclusions We presented a methodology for constructing a multidimensional performance indicator. Among its advantageous aspects, we can name:
• Once the managers have constructed a LFS for each performance indicator, the FNRI makes it possible to compare the assessed values with the standard references without the participation of managers (until new goals for the enterprise and, as a consequence, a new reference standard become needed).
• The proposed methodology does not involve the use of defuzzifying operations, which usually imply loss of information or unjustified simplification of the problem.
• The results of (Pedrycz et al. 2010), which are associated with multicriteria group decision-making in a fuzzy environment, can be utilized to extend this methodology to include the input of a group of managers and experts.
Acknowledgments. This research is supported by the National Council for Scientific and Technological Development of Brazil (CNPq) - grants PQ:307406/2008-3 and PQ:307474/2008-9.
References [1] Abdel-Maksoud, A., Dugdale, D., Luther, R.: Non-financial performance measurement in manufacturing companies. The Br. Account Rev. 37, 261–297 (2005), doi:10.1016/j.bar.2005.03.003 [2] Berrah, L., Mauris, G., Foulloy, L., Haurat, A.: Global vision and performance indicators for an industrial improvement approach. Comput. Ind. 43, 211–225 (2000), doi:10.1016/S0166-3615(00)00070-1 [3] Bititci, U.S., Suwignjo, P., Carrie, A.S.: Strategy management through quantitative modelling of performance measurement systems. Int. J. Prod. Econ. 69, 15–22 (2001), doi:10.1016/S0925-5273(99)00113-9 [4] Bosilj-Vuksice, V., Milanovic, L., Skrinjar, R., Indihar-Stemberger, M.: Organizational performance measures for business process management: a performance measurement guideline. In: Proc. Tenth Conf. Comput. Model. Simul. (2008), doi:10.1109/UKSIM.2008.114 [5] Brouwer, R.K.: Fuzzy set covering of a set of ordinal attributes without parameter sharing. Fuzzy Sets Syst. 157, 1775–1786 (2006), doi:10.1016/j.fss.2006.01.004
[6] Chapman, C.S., Hopwood, A.G., Shields, M.D.: Handbook of management accounting research, vol. 1. Elsevier, Amsterdam (2007) [7] Clivillé, V., Berrah, L., Mauris, G.: Quantitative expression and aggregation of performance measurements based on the MACBETH multi-criteria method. Int. J. Prod. Econ. 105, 171–189 (2007), doi:10.1016/j.ijpe.2006.03.002 [8] Ekel, P., Pedrycz, W., Schinzinger, R.: A general approach to solving a wide class of fuzzy optimization problems. Fuzzy Sets Syst. 97, 49–66 (1998), doi:10.1016/S01650114(96)00334-X [9] Ekel, P.Y., Silva, M.R., Schuffner Neto, F., Palhares, R.M.: Fuzzy preference modeling and its application to multiobjective decision making. Comput. Math Appl. 52, 179–196 (2006), doi:10.1016/j.camwa.2006.08.012 [10] Herrera-Viedma, E., Herrera, F., Chiclana, F.: A consensus model for multiperson decision making with different preference structures. IEEE Trans. Syst. Man Cybern – Part A: Syst. Hum. 32, 394–402 (2002), doi:10.1109/TSMCA.2002.802821 [11] Kaplan, R.S., Norton, D.: The Balanced Scorecard: Translating Strategy into Action. Harvard Business School, Boston (1996) [12] Lu, C., Lan, J., Wang, Z.: Aggregation of fuzzy opinions under group decisionmaking based on similarity and distance. J. Syst. Sci. Complex 19, 63–71 (2006), doi:10.1007/s11424-006-0063-y [13] Nudurupati, S.S., Bititci, U.S., Kumar, V., Chan, F.T.S.: State of the art literature review on performance measurement. Comput. Ind. Eng. (in press), doi:10.1016/j.cie.2010.11.010 [14] Orlovsky, S.A.: Decision making with a fuzzy preference relation. Fuzzy Sets Syst 1, 155–167 (1978), doi:10.1016/0165-0114(78)90001-5 [15] Orlovsky, S.A.: Problems of Decision Making with Fuzzy Information. Nauka, Moscow (1981) (in Russian) [16] Pedrycz, W.: Fuzzy equalization in the construction of fuzzy sets. Fuzzy Sets Syst. 119, 329–335 (2001), doi:10.1016/S0165-0114(99)00135-9 [17] Pedrycz, W., Ekel, P., Parreiras, R.: Fuzzy Multicriteria Decision-Making: Models, Methods, and Applications. Wiley, Chichester (2010) [18] Popova, V., Sharpanskykh, A.: Modeling organizational performance indicators. Inf. Syst. 35, 505–527 (2010), doi:10.1016/jis.2009.12.001 [19] Shobrys, D.E., White, D.C.: Planning, scheduling and control systems: why cannot they work together. Comput. Chem. Eng. 26, 149–160 (2000), doi:10.1016/S00981354(00)00508-1 [20] Yager, R.R.: Multicriteria decision making using fuzzy quantifiers. In: Proc. IEEE Conf. Comput. Intell. Financial Eng. (1995), doi:10.1109/CIFER.1995.495251 [21] Yu, V.F., Hu, K.J.: An integrated fuzzy multi-criteria approach for the performance evaluation of multiple manufacturing plants. Comput. Ind. Eng. 58, 269–277 (2010), doi:10.1016/j.cie.2009.10.005 [22] Zimmermann, H.J.: Fuzzy Set Theory and Its Application. Kluwer Academic Publishers, Boston (1990)
Generating Reference Business Process Model Using Heuristic Approach Based on Activity Proximity Bernardo N. Yahya and Hyerim Bae
Abstract. The number of organizations implementing business process innovation through an approach that involves a Business Process Management (BPM) system has increased significantly. These organizations design, implement and develop BPM systems to such a level of maturity that, consequently, there are large collections of business process (BP) models in the repository. The existence of numerous process variations leads to both process redundancy and process underutilization, which impact business performance in negative ways. Thus, there is a need to create a process reference model that can identify and find a representative model without redundancy. This paper introduces a new heuristic-based approach to generate a valid business process reference model from a process repository. Previous research used a genetic algorithm (GA) to produce a reference process model. However, the GA procedure has a high computational cost when solving such a problem. Near the end of this paper, we show the experimental results of the proposed method, which is conveniently executed using business process structure properties and the proximity of activities. It is believed that this process reference model can help a novice process designer to create a new process conveniently. Keywords: reference process model, heuristic, business process, proximity.
1 Introduction The BP improvement and effectiveness issues have induced many organizations to implement and apply BPM systems. The implementation of BPM in numerous organizations encourages vendors to develop the system to a certain level of maturity and, consequently, there are large collections of BP models in the repository. The existence of numerous process variations leads to both process redundancy and process underutilization. Thus, there is a need to create a process reference model to identify and find a representative model without redundancy. Bernardo N. Yahya · Hyerim Bae Business & Service Computing Lab., Industrial Engineering, Pusan National University 30-san Jangjeon-dong Geumjong-gu, Busan 609-735, South Korea e-mail:
[email protected],
[email protected] *
J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 469–478. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
470
B.N. Yahya and H. Bae
The generation of a generic process model, as a new reference model, is considered necessary for future adaptations and decreasing change costs. In the domain of industrial process, there exists industrial process model collections and reference process models such as ITIL, SCOR, eTOM. However, its high level reference model corresponds to process-guidelines to match different aspects of IT without considering the level of implementation. Moreover, existing approaches only attempted to create reference model based on history of process configurations (Li et al. 2009; Kuster et al. 2006; Holschke et al. 2008). Thus, the present study developed a method for generating the best reference model from a large number of past business process variants without any history information of process configurations.
Fig. 1 Process variants in logistics
Figure 1 illustrates the variation of business processes that having common goals, showing seven process variants as examples. The characteristics of each process example have to be modeled for a specific goal in the logistics process. As the basic process, six activities are initialized: order entry, order review, financial check, stock check, manager review, and purchase order (PO) release. According to the given process context, the concerns of a certain organizational unit and specific customer requirements, different variations of a basic process are needed.
Generating Reference Business Process Model Using Heuristic Approach
471
There had been previous approach for creating a reference model with a combinatorial optimization problem by using a distance measure and considering a process model’s mathematical formulation for optimization of the activity distances in a process model (Bae et al. 2010). It includes safety property which is mentioned that a business process will always satisfy a given property, e.g. it will always run to completion (Aalst 2000). However, there is no guarantee when a process has safety property, it also follows the soundness property as a validation approach. To overcome these limitations, Yahya et al. (2010) have developed GA-based method of generating a BP reference model from a BP repository (GA-BP). It adopted a measure to assess the proximity distance of activities in order to evaluate both integer programming (IP) and GA measurements. However, a high computational cost on execution time is somewhat a problem in GA approach. Thus, this study focuses on developing such heuristic approach to select the best solution with a better execution time. In addition, a valid reference process model considering soundness properties (Aalst 2000) and corresponds to the characteristics of existing process variants (Fettke et al. 2006) is also discussed. This paper proceeds as follows. In section 2, we briefly review the literature on process modeling, reference process and graph theory in the BPM field. In section 3, we incorporate the proposed proximity score measurement (PSM) method into the evaluation function of an IP problem (IP-BP). The heuristic method (HeuristicBP) is proposed as a way to improve GA-BP results. Section 4 examines the experimental results using the IP-BP, GA-BP dan Heuristic-BP. Finally, section 5 concludes this study.
2 Related Work There are some existing business-process-design-related research fields, which usually are titled business process modeling, workflow verification, process change and workflow patterns (Kim and Kim 2010; Zhou and He 2005; Jung et al. 2009; Kim et al. 2010). Process configurations using version management approach was discussed previously (Kim et al. 2010). Research about pattern-based workflow design using Petri net was also proposed (Zhou and He 2005). Kim and Kim (2010) developed a process design tool with regard to fault-tolerant process debugger. Jung et al. (2009) discussed a method to find similar process by clustering stored processes in repository. All researches developed an improved process, however, there is no such issue on modeling BP by means of process reference models. Any discussion of reference model issues begins with process variants (Fettke et al. 2006; Kuster et al. 2006; Holschke et al. 2008; Li et al. 2009; Bae et al. 2010). Recently, a comprehensive heuristics-based approach to discovering new reference models by learning from past process configurations was discussed (Li et al. 2009), and a mathematical programming approach was introduced (Bae et al. 2010). The proposed heuristic (Li et al. 2009) updates process configurations and produces new reference models based on a minimum edit distance from initial reference processes. However, most traditional process design tools have lack
472
B.N. Yahya and H. Bae
functions on storing of process configurations. When there are a lot of processes already stored in the repository, it requires a special method to generate process reference model without any process configuration information. The IP-based mathematical model (Bae et al. 2010) was proposed to address the issue of creating reference processes without initial reference information or process reconfiguration. There remain problems in the presentational and validation aspects of the process using IP formulations, which is solved using GA approach (Yahya et al. 2010). The industry papers using refactoring operations (Kuster et al. 2006) emerged as process configuration tool from AS-IS into TO-BE process models. Scenario-based analysis on application of reference process models in SOA (Holschke et al. 2008) and survey results and classification (Fettke et al. 2006) have all necessary concepts in regard to process reference models with less quantitative techniques.
3 Proposed Model We measured the process structure distance using PSM. For a simpler distance measure, the process variants in Figure 1 were transformed into graph abstractions, as illustrated in Figure 2.
3.1 Proximity Score Measurement (PSM) Definition 1. (Process Model) We define a process model $p_k$, which means the k-th process in a process repository. It can be represented as a tuple $\langle A_k, L_k \rangle$, each element of which is defined below.
• $A_k = \{a_i \mid i = 1, \ldots, I\}$ is a set of activities, where $a_i$ is the i-th activity of $p_k$ and I is the total number of activities in $p_k$.
• A is defined as the set of all activities in the process repository, where A is the union of all $A_k$, $A = \bigcup_{k=1}^{K} A_k$.
• $L_k \subseteq \{l_{ij} = (a_i, a_j) \mid a_i, a_j \in A_k\}$ is a set of links, where $l_{ij}$ is the link between two activities $a_i$ and $a_j$ in the k-th process. The element $(a_i, a_j)$ represents the fact that $a_i$ immediately precedes $a_j$.
Definition 2. (Activity Proximity Score) We have to obtain the Activity Proximity Score (APS) for each process. The APS value, which is denoted by $q_{ij}$, is defined as
$$q_{ij}^k = \frac{h^k(i,j)}{d_{ij}^k}, \qquad (1)$$

where $h^k(i,j) = 1$ if $a_i \to a_j$ in the k-th process; 0, otherwise, and
$d_{ij}^k$ is the average path distance between activity $a_i$ and $a_j$ of the k-th process. Each process has a single value of $q_{ij}^k$, $k = 1, 2, 3, \ldots, K$, where $q_{ij}^k$ is the APS of the k-th process in a process repository, K is the total number of processes, and $a_i \to a_j$ denotes that activity $a_j$ is reachable from $a_i$. Detailed distance calculations can be found in Yahya et al. (2009).
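A minimal Python sketch of this score is given below; it is an illustration only, and it approximates the average path distance $d_{ij}^k$ by the shortest-path length obtained with a breadth-first search (the exact distance definition is in Yahya et al. (2009)).

```python
from collections import deque

def aps(activities, links):
    """Activity Proximity Score q_ij for one process variant.

    activities : iterable of activity names, e.g. ["a1", ..., "a6"]
    links      : set of (ai, aj) pairs meaning "ai immediately precedes aj"
    """
    succ = {a: [b for (x, b) in links if x == a] for a in activities}

    def distance(src, dst):                 # BFS path length; None if unreachable
        queue, seen = deque([(src, 0)]), {src}
        while queue:
            node, d = queue.popleft()
            if node == dst:
                return d
            for nxt in succ[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, d + 1))
        return None

    q = {}
    for ai in activities:
        for aj in activities:
            if ai == aj:
                continue
            d = distance(ai, aj)
            q[(ai, aj)] = 1.0 / d if d else 0.0   # h(i,j)=1 only when aj is reachable
    return q

# Process variant p1 from Fig. 2(a):
links_p1 = {("a1", "a2"), ("a1", "a3"), ("a1", "a4"),
            ("a2", "a5"), ("a3", "a5"), ("a4", "a5"), ("a5", "a6")}
q1 = aps(["a1", "a2", "a3", "a4", "a5", "a6"], links_p1)
print(q1[("a1", "a5")], q1[("a1", "a6")])   # 0.5, 0.3333...
```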
l12 l13 l14
a3 a4
l25 l35
a5
l56
a6
l45
a1
a2
l23 a3
l12
a1
a6 a4
l45
l36
a2
l23
a4
l35
l45 a5 l56
(d). process variant 4 (p4)
a5
l13
a3 l 35 (
)
l14
a4
a5
l56
a6
a1 l13
l46
a6
l34
l34 a3
a4
l35
l46 a6
(
)
l56
(e). process variant 5 (p5)
a1
a3
(c). process variant 3 (p3)
l46 (
)
a4
(
)
l13
a5 l 56
l34 a3
l24
a2
l12
a1
a6
(b). process variant 2 (p2)
l36
a1 l24
a2
a4
(a). process variant 1 (p1) l12
l12
a3
l23 l24
a5
l56
(f). process variant 6 (p6)
a6
l45
(g). process variant 7 (p7)
Fig. 2 Graph Abstraction from Fig. 1
Definition 3. (InDegree and OutDegree of activity) InDegree defines the number of edges incoming to an activity, and OutDegree defines the number of edges outgoing from an activity. We denote the InDegree and the OutDegree of the i-th activity as inDegree(ai) and outDegree(ai), respectively, and according to these concepts, we can define start/end activities and split/merge semantics. Start activity (aS) is an activity with an empty set of preceding activities, inDegree(ai)=0. End activity (aE) is an activity with an empty set of succeeding activities, outDegree(ai)=0. For a split activity ai such that outDegree(ai)>1, fs(ai) = ‘AND’ if all of the succeeding activities should be executed; otherwise, fs(ai)= ‘OR’. For a merge activity ai such that inDegree(ai)>1, fm(ai) = ‘AND’ if all of the preceding activities should be executed; otherwise, fm(ai)= ‘OR’.
3.2 Integer Programming Mathematical Formulation A process of automatic reference model creation finds an optimum reference process by maximizing the sum of proximity scores among the process variants in a process repository. The following notations, extended from [10], are used in the mathematical formulation of our problem. Notice that yi, zj, and xij are decision variables. i,j: activity index (i,j = 1,…, I), where I is the number of activities k: process variant index (k = 1,…, K), where K is the number of process variants
y_i: 1, if the i-th activity is a start activity; 0, otherwise
z_j: 1, if the j-th activity is an end activity; 0, otherwise
x_ij: 1, if the i-th activity immediately precedes the j-th activity; 0, otherwise

Mathematical Formulation

$$\min \sum_{i=1}^{I}\sum_{j=1}^{I}\big((K - c_{ij})\,x_{ij} + c_{ij}\,(1 - x_{ij})\big) \qquad (2)$$

s.t.

$$\sum_{i=1}^{I} y_i = 1, \qquad (3)$$
$$\sum_{j=1}^{I} z_j = 1, \qquad (4)$$
$$y_i + x_{ji} \le 1, \quad \forall k,\; q_{ji}^k = 1, \qquad (5)$$
$$z_i + x_{ij} \le 1, \quad \forall k,\; q_{ij}^k = 1, \qquad (6)$$
$$y_i + \sum_{\{j;\, q_{ji}^k = 1\}} x_{ji} \ge 1, \quad i = 1, \ldots, I, \qquad (7)$$
$$z_i + \sum_{\{j;\, q_{ij}^k = 1\}} x_{ij} \ge 1, \quad i = 1, \ldots, I, \qquad (8)$$
$$x_{ij} \in \{0, 1\}, \quad \forall k,\; q_{ij}^k = 1, \qquad (9)$$
$$y_i, z_i \in \{0, 1\}, \quad i = 1, \ldots, I. \qquad (10)$$
In this model, we link two activities based on the information from the existing process variants. The sum of the number of adjacent links among all of the process variants is denoted as $c_{ij}$; it determines the cost of creating a link between $a_i$ and $a_j$ over all K process variants. When the constraints are satisfied, we minimize the product of the binary link variable ($x_{ij}$) and the negative cost ($-c_{ij}$) of the link between $a_i$ and $a_j$. To avoid unexpected links, we multiply $(1 - x_{ij})$ by the cost. In other words, in order to maximize the sum of proximity scores among process variants, the objective function has to be minimized. Constraints (3) and (4) impose the condition that there is only one start activity ($y_i$) and one end activity ($z_i$) in a process reference model. Constraints (5) and (6) guarantee that there are no immediate predecessors of the start activity and no immediate successors of the end activity, respectively. Constraints (7) and (8) determine that there should be at least one path following the start activity and preceding the end activity, respectively. The adjacency relationship of activities $a_i$ and $a_j$ is denoted as $x_{ij}$ with a binary value, as shown by constraint (9). Constraint (10) reflects the fact that the start/end activity variables take binary values.
Fig. 3 Result from LINGO by IP-BP (Fitness = 31)
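As an illustration of how a model of the form (2)-(10) could be set up in practice, a hedged sketch using the PuLP modeller is given below (the paper itself solves the model with LINGO; the data structures, the treatment of the per-variant reachability data q, and the omission of some bookkeeping are our own simplifying assumptions):

```python
import pulp  # generic MILP modeller, used here only as an illustration

def reference_process(activities, c, q, K):
    """Sketch of model (2)-(10).

    activities : list of activity names
    c          : dict {(i, j): number of variants in which i immediately precedes j}
    q          : dict {k: {(i, j): 1 if j is reachable from i in variant k, else 0}}
    K          : number of process variants
    """
    pairs = [(i, j) for i in activities for j in activities if i != j]
    prob = pulp.LpProblem("reference_process", pulp.LpMinimize)
    x = pulp.LpVariable.dicts("x", pairs, cat="Binary")
    y = pulp.LpVariable.dicts("y", activities, cat="Binary")
    z = pulp.LpVariable.dicts("z", activities, cat="Binary")

    # Objective (2)
    prob += pulp.lpSum((K - c.get(p, 0)) * x[p] + c.get(p, 0) * (1 - x[p]) for p in pairs)

    # (3)-(4): exactly one start and one end activity
    prob += pulp.lpSum(y[i] for i in activities) == 1
    prob += pulp.lpSum(z[j] for j in activities) == 1

    for k in q:
        for (i, j), reachable in q[k].items():
            if reachable:
                prob += y[j] + x[(i, j)] <= 1      # (5): no link into the start activity
                prob += z[i] + x[(i, j)] <= 1      # (6): no link out of the end activity

    # (7)-(8): every activity is the start/end or has an incoming/outgoing link
    for i in activities:
        preds = {j for k in q for (j, t) in q[k] if t == i and q[k][(j, t)]}
        succs = {j for k in q for (s, j) in q[k] if s == i and q[k][(s, j)]}
        prob += y[i] + pulp.lpSum(x[(j, i)] for j in preds) >= 1
        prob += z[i] + pulp.lpSum(x[(i, j)] for j in succs) >= 1

    prob.solve()
    return [(i, j) for (i, j) in pairs if x[(i, j)].value() == 1]
```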
By using the graph example in Figure 2 and applying the mathematical approach, we obtain the result from LINGO, shown in Figure 3. The process
provides us with insight into the new reference process. The safety property, as a part of the soundness property, has been considered in the IP constraints. However, the behavior between the start and end activities may still contain invalid structures. For example, the result is considered to be an invalid process, since activity a3 (the financial check activity) never appears as a merge activity in the variants. A previous study set about solving the validity problem using a GA approach. Due to its computational cost, the present study presents a suitable heuristic algorithm for obtaining the best result with less computational time.
4 Heuristic Approach It is important to generate a valid and sound reference process model. Thus, in order to verify the well-formedness of a business process, this study follows and applies the soundness properties of business processes; three corresponding properties are accommodated to ensure the soundness of a business process (Aalst 2000). The proposed heuristic approach is comprised of two parts: first, an initialization procedure that creates an initial process based on a certain probability condition; second, a revision algorithm that modifies and enhances the process so that it becomes a sound reference process model. The heuristic procedure to obtain the best fitness value is as follows (a Python sketch of the initialization is given after Fig. 4).
1. Identify the N activities from all process variants in the repository.
2. Select the activity properties (Initialize algorithm):
Step 1. Search for a possible start activity. Let a1 be the start activity.
Step 2. Search for a potential next activity after a1 by finding the activity with the greatest activity proximity score. Let n = 2 and denote the next activity as an.
Step 3. Search for a potential next activity after an by finding the activity with the greatest activity proximity score.
Step 4. If n < N, set n = n + 1 and repeat Step 3. Otherwise, continue.
Step 5. Let aN be the end activity.
3. Check the validity of the generated process with the insertion_deletion algorithm (see Fig. 4).
The total time complexity of the insertion_deletion algorithm is O(N²), where N = |A|. Fig. 5 shows the heuristic result obtained when the validity property is considered. By using the insertion_deletion algorithm (Figure 4) to check the process paths, we can produce the result shown in Figure 5, whose objective value is greater than that of IP-BP (Figure 3). Although the heuristic yields a greater objective value, the process model satisfies the soundness and validity properties of the process variants. The experimental results using IP, GA and the heuristic are shown in Table 1. The heuristic obtains the same fitness value as GA-BP (see Table 1), while the problem of GA-BP, namely its high execution time, is solved by Heuristic-BP. Figure 6 (right side) presents a graphical comparison of the three methods, IP-BP, GA-BP and Heuristic-BP.
Algorithm insertion_deletion(pk)
Input: a nominated reference process pk
Output: valid process pk
Begin
  /* Link Deletion */
  FOR each ai ∈ Ak DO                                    // Ak is the set of activities in pk
    FOR each aj ∈ Ak, aj ≠ ai
      IF (qjik = 1) THEN
        IF ((cji = 0) || (outDegree(aj) > maxk(outDegree(aj)))) THEN
          Lk ⟵ Lk − {lji}; outDegree(aj)--; inDegree(ai)--;   // Lk is the set of links in pk
      IF (qijk = 1) THEN
        IF ((cij = 0) || (inDegree(aj) > maxk(inDegree(aj)))) THEN
          Lk ⟵ Lk − {lij}; outDegree(ai)--; inDegree(aj)--;
  /* Link Insertion */
  FOR each ai ∈ Ak DO                                    // link_insert(ai)
    IF ((inDegree(ai) == 0) && (ai ≠ aS)) THEN
      FOR each aj ∈ Ak, aj ≠ ai ∧ (ai, aj) ∉ Lk
        IF ((outDegree(aj) > 0) && (outDegree(aj) < maxk(outDegree(aj)))) THEN
          IF (cji > max_in(ai)) THEN max_in(ai) = cji;
      END FOR
      FOR each aj ∈ Ak, aj ≠ ai
        IF (cji = max_in(ai)) THEN
          Lk ⟵ Lk + {lji}; outDegree(aj)++; inDegree(ai)++;
      END FOR
    IF ((outDegree(ai) == 0) && (ai ≠ aE)) THEN
      FOR each aj ∈ Ak, aj ≠ ai ∧ (ai, aj) ∉ Lk
        IF ((inDegree(aj) > 0) && (inDegree(aj) < maxk(inDegree(aj)))) THEN
          IF (cij > max_out(ai)) THEN max_out(ai) = cij;
      END FOR
      FOR each aj ∈ Ak, aj ≠ ai
        IF (cij = max_out(ai)) THEN
          Lk ⟵ Lk + {lij}; outDegree(ai)++; inDegree(aj)++;
      END FOR
  END FOR
End.
Fig. 4 insertion_deletion algorithm
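A minimal Python sketch of the APS-guided initialization (Steps 1-5 of the heuristic in Section 4) might look as follows; it is an illustration only, and it assumes the aggregated proximity scores are supplied as a dictionary and that the start activity has already been chosen.

```python
def initialize(activities, score):
    """Greedy chain construction: repeatedly pick the unvisited successor
    with the greatest aggregated activity proximity score.

    score[(a, b)] is assumed to be the proximity of b after a, summed over variants.
    """
    start = activities[0]                      # Step 1: the chosen start activity
    chain, remaining = [start], set(activities) - {start}
    while remaining:                           # Steps 2-4: extend by the best APS
        current = chain[-1]
        nxt = max(remaining, key=lambda b: score.get((current, b), 0.0))
        chain.append(nxt)
        remaining.remove(nxt)
    return chain                               # Step 5: the last activity is the end
```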
Fig. 5 Reference Process Model as result of Heuristic-BP (Fitness = 32)
Table 1 Experiment Result
# of avg. act. | IP-BP Obj. Value | IP-BP Exec. Time | GA-BP Best | GA-BP Exec. Time | Heuristic-BP Fitness Value | Heuristic-BP Exec. Time
6.1  | 49  | 0.17 | 50  | 1.19  | 50  | 0.19
10.4 | 63  | 0.17 | 63  | 1.09  | 63  | 0.27
14.8 | 124 | 0.39 | 125 | 1.89  | 125 | 0.33
19   | 152 | 0.31 | 154 | 1.81  | 154 | 0.44
23.2 | 205 | 0.56 | 207 | 2     | 207 | 0.62
28.6 | 199 | 0.92 | 202 | 3.75  | 202 | 0.89
33.1 | 339 | 1.49 | 345 | 5.45  | 345 | 1.53
38.3 | 305 | 2.45 | 306 | 9.77  | 306 | 2.56
43   | 438 | 3.72 | 447 | 14.75 | 447 | 3.44
47.2 | 538 | 5.56 | 548 | 20.69 | 548 | 4.89
Fig. 6 Graphic comparison of IP, GA and heuristic BP based on fitness value and execution time
5 Conclusions This paper presents an enhanced approach to finding a process reference model. Previous works proposed an IP approach, which treats the task as a combinatorial optimization problem, and a GA procedure; a heuristic was still required to solve the problem more efficiently. This study presents a heuristic approach that uses the insertion_deletion algorithm to obtain the same result as the GA approach with less execution time. The presentation limitations of the mathematical formulation and the possibility of an invalid process, which were resolved by the GA procedure, are also solved by this heuristic. The process reference model derived by our approach can be utilized for various purposes. First, it can be a process template for certain process variants. Second, it can address the process reuse issue. Hence, our approach can be a robust decision-making tool for convenient process modeling by novice designers.
Acknowledgement This work was supported by the National Research Foundation of Korea(NRF) grant funded by the Korea government(MEST) (No.2010-0027309).
References 1. van der Aalst, W.M.P.: Workflow verification: Finding control-flow errors using petrinet-based techniques. In: van der Aalst, W.M.P., Desel, J., Oberweis, A. (eds.) Business Process Management. LNCS, vol. 1806, pp. 161–183. Springer, Heidelberg (2000) 2. Bae, J., Lee, T., Bae, H., Lee, K.: Process reference model generation by using graph edit distance. In: Korean Institute Industrial Engineering Conference, D8-5 (2010) (in Korean) 3. Fettke, P., Loos, P., Zwicker, J.: Business process reference models: Survey and classification. In: Bussler, C.J., Haller, A. (eds.) BPM 2005. LNCS, vol. 3812, pp. 469–483. Springer, Heidelberg (2006) 4. Holschke, O., Gelpke, P., Offermann, P., Schropfer, C.: Business process improvement by applying reference process models in SOA – a scenario-based analysis. In: Multikonferenz Wirtschaftsinformatik (2008) 5. Jung, J., Bae, J., Liu, L.: Hierarchical clustering of business process models. International Journal of Innovative Computing, Information and Control 5(12A), 4501–4511 (2009) 6. Kim, D., Lee, N., Kan, S., Cho, M., Kim, M.: Business Process version management based on process change patterns. International Journal of Innovative Computing, Information and Control 6(2), 567 (2010) 7. Kim, M., Kim, D.: Fault-tolerant process debugger for business process design. International Journal of Innovative Computing, Information and Control 6(4), 1679 (2010) 8. Küster, J.M., Koehler, J., Ryndina, K.: Improving business process models with reference models in business-driven development. In: Eder, J., Dustdar, S. (eds.) BPM Workshops 2006. LNCS, vol. 4103, pp. 35–44. Springer, Heidelberg (2006) 9. Li, C., Reichert, M., Wombacher, A.: Discovering reference models by mining process variants using a heuristic approach. In: Dayal, U., Eder, J., Koehler, J., Reijers, H.A. (eds.) BPM 2009. LNCS, vol. 5701, pp. 344–362. Springer, Heidelberg (2009) 10. Yahya, B.N., Bae, H., Bae, J.: Process Design Selection Using Proximity Score Measurement. In: Rinderle-Ma, S., Sadiq, S., Leymann, F. (eds.) BPM 2009. Lecture Notes in Business Information Processing, vol. 43, pp. 330–341. Springer, Heidelberg (2010) 11. Yahya, B.N., Bae, H., Bae, J., Kim, D.: Generating business process reference model using genetic algorithm. In: Biomedical Fuzzy Systems Association 2010, Kitakyushu (2010) 12. Zhou, G., He, Y.: Modelling workflow patterns based on P/T nets. International Journal of Innovative Computing, Information and Control 1(4), 673–684 (2005)
How to Curtail the Cost in the Supply Chain? Wen-Ming Wu, Chaang-Yung Kung, You-Shyang Chen, and Chien-Jung Lai
Abstract. In an era of shrinking manufacturing profits, financial influence has become a key issue in supply chain management; how to increase positive financial benefits and decrease negative financial impacts in the supply chain is therefore a core concern for manufacturers and companies. To build innovative assessment criteria and an evaluation model for supply chain management, this research adopts the Analytical Network Process (ANP) to evaluate key financial assessment criteria, elicited through brainstorming, focus groups, the Delphi method and the nominal group technique, in order to improve the selection of suppliers in supply chain management (SCM). The specific feature of the ANP evaluation model is the construction of pairwise comparison matrices and the measurement of the priority weights (eigenvectors) of each assessed characteristic, criterion and attribute. The analytical hierarchical relations are expressed in four levels covering the characteristics of the supply chain, the criteria and the attributes. According to the empirical analysis, enterprises are able to choose the best potential suppliers through this research in order to minimize negative financial impact. Finally, both practical and academic suggestions are offered to managers as well as researchers for further developing operation strategies for supply chain management. Keywords: Supply Chain Management (SCM), Analytical Network Process (ANP).
Wen-Ming Wu · Chien-Jung Lai: Department of Distribution Management, National Chin-Yi University of Technology
Chaang-Yung Kung: Department of International Business, National Taichung University of Education
You-Shyang Chen: Department of Information Management, Hwa Hsia Institute of Technology
1 Introduction With the rapid development of manufacturing, information and networking technologies, an era of very low manufacturing profit has arrived. Manufacturers and enterprises have therefore begun not only to create the most effective marketing strategies but also to review their most efficient manufacturing
processes in order to execute cost-reduction policies. Supply chain management ("SCM") has accordingly become a well-known doctrine. There are, however, several financially influential elements that must be considered comprehensively in SCM. Two significant ideas underlie SCM in this research. One is that "The Supply Chain ("SC") encompasses all activities associated with the flow and transformation of goods from the raw materials stage (extraction), through to the end user, as well as the associated information flows. Material and information flow both up and down the SC." [1] The other is that "SCM is the systemic, strategic coordination of the traditional business functions and the tactics across these business functions within a particular company and across businesses within the SC, for the purposes of improving the long-term performance of the individual companies and the SC as a whole." [2] The fundamental idea of SCM is to compress the total cost of manufacture, inventory and delivery, once orders have been received, so as to reach the best profit for the enterprise. SCM alone, however, cannot effectively handle two crucial problems for enterprises: cash-flow stress when there are no orders, and accounts-receivable stress when clients pay slowly. In the boom period between 1990 and 2008, the global economy grew rapidly thanks to the steady development of the economy of Mainland China, and cash-flow and accounts-receivable stress did not greatly affect enterprises. Since the 2008 global financial crisis, however, enterprises have suffered financial stress from their clients, because clients are no longer willing to place stable procurement orders and pay for those orders only after six months or longer. Moreover, in this era of rapid transition and lower profits, enterprises confront more serious negative financial and managerial influences. Furthermore, [3] analyzed the comprehensive SC under continuous and complex information flow and concluded that management cannot optimize product flows without implementing a process approach to the business. That research identified the key SC processes as supplier relationship management, demand management, order fulfillment, manufacturing flow management, product development, commercialization and returns management, customer relationship management, and customer service management. In addition, [4] organized a performance evaluation model of SCM with four key evaluation elements: reliability, elasticity and responsiveness, cost, and Return on Assets ("ROA"). Reliability includes two main criteria: order-handling performance and delivery performance. Elasticity and responsiveness contains two major criteria: production elasticity and SCM response time. Cost comprises three principal criteria: SCM management cost, additional production cost under SCM, and error-correction cost. ROA involves two chief criteria: inventory days and cash flow. To reduce negative financial and managerial influences, enterprises not only have to exploit the cost-reduction benefits of SCM but must also take a financially strategic view of sales forecasting, financial review, the inventory system and SC development in order to achieve the best competitive advantage.
In this era of thin profits and unstable sales, business scholars and enterprise leaders and managers have come to recognize that reducing negative financial and managerial influences in order to increase sales is more important than cost reduction alone, because revenue is the critical lifeline of the enterprise. The Asian enterprises that serve as the world's manufacturing factory have been particularly affected by this tendency. Owing to its proximity to Mainland China, Taiwan has depended on export processing and international trade to develop its economy. In the last 20 years, more and more Taiwanese enterprises have invested in Mainland China, creating a highly dependent relationship between enterprises in Taiwan and Mainland China; this dependence, in turn, encumbers the development of Taiwan's enterprises. Because of rigorous political issues, Taiwan has also been marginalized from rapidly integrating regional economic unions, such as the Association of Southeast Asian Nations. Even though the Taiwan government has communicated directly with the Mainland China government since the party in power changed from the Democratic Progressive Party to the Chinese Kuomintang Party, Taiwan's enterprises still face many negative financial and managerial influences from an uncertain business environment. What is an efficient and effective approach to finding the best suppliers while avoiding negative financial and managerial influence? What are the most important criteria for assessing suppliers? Choosing appropriate suppliers to diminish costs and negative financial influence is therefore a crucial analytical factor in SCM. The purpose of this research is to utilize a hierarchical analytical approach, the analytical network process ("ANP"), to measure the key elements and assessment criteria for reducing negative financial influence under SCM, so that enterprises can minimize negative financial and managerial influences.
2 Methodologies 2.1 Measurement To address the complexity and uncertainty surrounding the ANP model, a compilation of expert opinion was analyzed together with an empirical survey, so as to achieve a retrospective cross-sectional analysis of the supply chain relationship between enterprises and suppliers for diminishing negative financial and managerial influence. This section characterizes the overall research design, the specification of the analytical approach and the research methodology, and describes how each assessment element of the relationship (characteristics, criteria, attributes and selected candidates) is compared across the four phases of the research design. The four phases consist of: (1) identify the research motive in order to define a clear research purpose and question, namely selecting apposite suppliers to diminish negative financial and managerial influence; (2) select the research methodology, namely establishing the ANP model to analyze the research question; (3) utilize the research methodology to analyze the empirical survey data, namely using the ANP model to evaluate each assessment criterion through transitivity, the weight-comparison principle, the evaluation criteria, the positive reciprocal matrix and the supermatrix; and (4)
integrate the overall analysis to draw a conclusion inductively, namely selecting the best choice based on the assay results, by employing the research model development, the measurement framework, the selected research methodology, the investigation procedures, the analysis of the empirically collected data, and the assessment of the overall analytical criteria through the Delphi method, comparison and empirical analysis, in order to reach a comprehensive conclusion.
2.2 Measurement For the ANP model to be representative and efficient through transitivity, the weight-comparison principle, the evaluation criteria, the positive reciprocal matrix and the supermatrix, the research data source must collectively and statistically capture all relevant expert opinion on each assessment criterion. In the ANP assessment, the pairwise comparisons of the evaluation characteristics, criteria and attributes at each level are evaluated with reference to their correlation, interdependence and importance, from equally important (1) to extremely important (9), as expressed in Figure 1.
Fig. 1 The evaluation scale of pairwise assessment: each pair of elements (characteristics, criteria, attributes and selected candidates of the supply chain) is compared on a 0-9 scale ranging from equally important to extremely important.
“Once the pairwise comparison are conducted and completed, the local priority vector w (eigenvector) is computed as the unique solution” [5] of equation (1), where w represents the priority vector (relative weights). Additionally, [6] delivered the two-stage algorithm presented in equation (1):

Rw = \lambda_{\max} w, \qquad w_i = \frac{1}{m}\sum_{j=1}^{m}\left(\frac{R_{ij}}{\sum_{i=1}^{m} R_{ij}}\right) \qquad (1)
In each pairwise comparison, the consistency of the compared elements must satisfy transitivity in order to ensure that the collected expert opinion is representative. The Consistency Index ("C.I.") is therefore computed for each pairwise comparison matrix, and the Consistency Ratio ("C.R.") is assessed from the C.I. and the Random Index ("R.I."), obtained from the statistical table of random indices, as presented in equation (2).
C.I. = \frac{\lambda_{\max} - n}{n - 1}, \qquad C.R. = \frac{C.I.}{R.I.} \qquad (2)
Based on the consistency-ratio principle, a pairwise comparison matrix is acceptable when the C.R. is less than or equal to 0.01. The research data in this study were obtained from scholars and experts who understand SCM and ANP and who work or serve in Taiwan and Mainland China. According to the fundamental characteristics of SCM and AHP, the concepts of [7] and the collected expert opinion, this research organizes the following six assessment criteria and their corresponding attributes, expressed in Figure 2, to test and analyze the consistency of each candidate supplier in this research.
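To make equations (1) and (2) concrete, the following Python sketch approximates the priority vector of a positive reciprocal pairwise comparison matrix by column normalization and row averaging, then computes the consistency index and consistency ratio. The 3x3 matrix is purely illustrative and the random-index values are the standard Saaty figures; neither comes from this paper's survey data.

```python
import numpy as np

def priority_vector(R: np.ndarray):
    """Approximate the priority weights of a pairwise comparison matrix R
    using the column-normalisation / row-averaging step of equation (1),
    and report the consistency index and ratio of equation (2)."""
    m = R.shape[0]
    w = (R / R.sum(axis=0)).mean(axis=1)      # w_i = (1/m) * sum_j R_ij / sum_i R_ij
    lam_max = (R @ w / w).mean()              # estimate of the principal eigenvalue
    ci = (lam_max - m) / (m - 1)              # consistency index
    RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
          6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45} # standard Saaty random indices
    cr = ci / RI[m] if RI[m] else 0.0         # consistency ratio
    return w, ci, cr

# Example: a 3x3 positive reciprocal matrix (illustrative values only).
R = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])
w, ci, cr = priority_vector(R)
print(w.round(3), round(ci, 3), round(cr, 3))
```

In practice, this check would be repeated for every comparison matrix used to build the supermatrix, and any matrix whose C.R. exceeded the chosen threshold would be returned to the experts for revision.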
Fig. 2 The research design framework [8]: four sequential phases, from distinguishing the research motive and question, through selecting the research methodology (the ANP model) and applying it to the empirical survey data, to integrating the overall analysis and choosing the best alternative based on the assay results.
2.3 Research Process To employ an effective model for measuring suppliers, the ANP model is applied in this research to handle the interdependence relationships among the characteristics of the SC, the criteria and the attributes in the hierarchy, as presented in Figure 3. [9] pointed out that the major difference between AHP and ANP is that AHP, under its original assumptions, cannot directly evaluate criteria outside strict hierarchical relations, whereas ANP can handle direct interdependence and mutual influence between criteria at the same or different levels by constructing the "supermatrix".
Fig. 3 The research process [10]: the ANP network connects the goal (the selection of the supplier with the best potential communication technology in the supply chain while minimizing financial risk) with the criteria of assessment (financial evaluation, sale review, inventory system, delivery status, customer's service and suppliers' offer), the attributes of each sub-criterion (RG, GM-ROI; SFA, DOIS; IT, OI, IA; WOCSP, OFCOSP; OTS, PRG, OFR; POTSD, PSDMD, PTDMDNRI, PTDMVP-W, PTDMVP-EDI, PTMIMS), and the three candidate suppliers: Supplier 1 without the enterprise's domain (SWTED), Supplier 2 with the enterprise's internal domain and social networking (SWEIDSN), and Supplier 3 with the enterprise's internal and external domain and social networking (SWEIEDSN).
(1) Financial Evaluation. To reflect the overall operational financial status of suppliers, two principal attributes are considered under this criterion: revenue growth ("RG") and gross margin ROI ("GM-ROI").
(2) Sale Review. To verify the revenue position of suppliers, two attributes, based on expert opinion, are considered under this criterion: sale forecast accuracy ("SFA") and days of inventory sales ("DOIS").
(3) Inventory System. To capture the inventory status of suppliers in brief, three major attributes, derived from financial concepts and the experts' discussion, are considered under this criterion: inventory turns ("IT"), obsolete inventory ("OI") and inventory accuracy ("IA").
(4) Delivery Status. To represent delivery status, the surveyed experts considered two chief attributes under this criterion: warehouse operations cost as a percentage of sales ("WOCSP") and outbound freight cost as a percentage of sales ("OFCOSP").
(5) Customer Service. To understand how suppliers handle customer feedback, three basic attributes are considered under this criterion: on-time shipment ("OTS"), percentage of returned goods ("PRG") and order fill rate ("OFR").
(6) Suppliers' Offer. Based on the experts' discussion, six crucial attributes are included under this criterion: percentage of on-time supplier delivery ("POTSD"), percentage of supplier-delivered material defects ("PSDMD"), percentage of total direct material that does not require inspection ("PTDMDNRI"), percentage of total material value purchased using a web-based system ("PTDMVP-W"), percentage of total material value purchased using EDI transactions ("PTDMVP-EDI") and percentage of total material inventory managed by suppliers ("PTMIMS"). The criteria and their attributes are collected in the sketch below.
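The six criteria and their attribute acronyms defined above can be kept in a simple lookup structure. The Python sketch below only restates the names from the text and implies nothing about their weights.

```python
# The six assessment criteria and their attributes, collected from the text
# above into a simple lookup structure (acronyms as defined in the paper).
CRITERIA = {
    "Financial Evaluation": ["RG", "GM-ROI"],
    "Sale Review": ["SFA", "DOIS"],
    "Inventory System": ["IT", "OI", "IA"],
    "Delivery Status": ["WOCSP", "OFCOSP"],
    "Customer Service": ["OTS", "PRG", "OFR"],
    "Suppliers' Offer": ["POTSD", "PSDMD", "PTDMDNRI",
                         "PTDMVP-W", "PTDMVP-EDI", "PTMIMS"],
}

for criterion, attributes in CRITERIA.items():
    print(f"{criterion}: {', '.join(attributes)}")
```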
3 Empirical Measurement Each potential supplier has to be matched against each assessable sub-criterion of every assessed criterion through pairwise comparison of the candidate suppliers. To obtain the comparative scores of the three kinds of potential suppliers with minimum financial impact, equation (3) is employed to compute the overall comparative priority weights w (eigenvectors) reported in Table 1. The appropriate supplier is then selected by calculating the SCM comparative index D_i [11], which is defined by:

D_i = \sum_{j=1}^{s} \sum_{k=1}^{k_j} P_j T_{kj} R_{ikj} \qquad (3)

where P_j is the priority weight (eigenvector) of assessment criterion j, T_{kj} is the priority weight (eigenvector) of attribute k of criterion j, and R_{ikj} is the priority of potential supplier i on attribute k of criterion j.
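A minimal Python sketch of equation (3) is shown below. The criterion weights P_j, attribute weights T_kj and candidate ratings R_ikj are placeholder numbers chosen for illustration, not the values elicited in this study.

```python
# Sketch of the comparative index of equation (3): the weight of each criterion
# (P_j) and of each attribute within it (T_kj) scales the rating of candidate i
# on that attribute (R_ikj). All numbers below are illustrative.
def comparative_index(P, T, R_i):
    """P[j]: criterion weights; T[j][k]: attribute weights within criterion j;
    R_i[j][k]: candidate i's priority on attribute k of criterion j."""
    return sum(P[j] * T[j][k] * R_i[j][k]
               for j in range(len(P))
               for k in range(len(T[j])))

# Two criteria with two and three attributes (made-up weights for illustration):
P = [0.6, 0.4]
T = [[0.7, 0.3], [0.5, 0.3, 0.2]]
R_supplier = [[0.6, 0.5], [0.4, 0.7, 0.3]]
print(round(comparative_index(P, T, R_supplier), 4))
```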
Table 1 The results of the empirical evaluation (Financial Evaluation / Sale Review / Inventory System / Delivery Status / Suppliers' offer / Customer Service)

SCM comparative index:
  Supplier 1 (SWTED)       0.137
  Suppliers 2 (SWEIDSN)    0.6414
  Suppliers 3 (SWEIEDSN)   0.2749
Additionally, after processing equations (1), (2) and (3), the final evaluation step is to combine the overall importance-of-priority weights w (eigenvectors) in Table 1. Among Supplier 1 without the enterprise's domain ("SWTED"), Supplier 2 with the enterprise's internal domain and social networking ("SWEIDSN") and Supplier 3 with the enterprise's internal and external domain and social networking ("SWEIEDSN"), the highest evaluation score is obtained by Supplier 2 (SWEIDSN), not by Supplier 3 (SWEIEDSN). This points to the critical finding that the benefit of developing communication technology in the supply chain lies in the enterprise's internal domain and social networking. The reason is that, when an enterprise selects cooperative suppliers, the relationship between the enterprise and its suppliers emphasizes the rapid establishment of the internal domain, because the suppliers' growth will in turn help the enterprise to grow. Consequently, the highest comparative index score, 0.6414, belongs to Supplier 2 (SWEIDSN).
4 Conclusion A plethora of SCM research has centred on the fundamental idea of cost reduction under the development of communication technology. However, measuring and diminishing the negative financial influence of supplier selection in SCM has not been discussed in detail in the field. Our contribution therefore not only addresses the original central concept of SCM but also concentrates on diminishing negative financial influence when selecting the best potential suppliers, through a new financial perspective and a novel approach (the ANP model). The ANP model is used not only to establish clear and comprehensive hierarchical relations between the assessment criteria but also to assist the decision-maker in selecting the best potential supplier, Supplier 2 with the enterprise's internal domain and social networking ("SWEIDSN"), which carries low negative financial influence, through the academic Delphi method and the experts' survey. Six main assessment criteria are covered: three financial factors (financial evaluation, sale review and inventory system), two SCM factors (delivery status and suppliers' offer) and one customer-service factor (customer service). The next step beyond this research is to focus on minimizing the additional negative influences created within SCM through further measurement and assessment. As these comprehensive extensions are realized, enterprises will be able to create more competitive business strategies to survive in this complex, highly competitive, low-profit manufacturing epoch.
References [1] He, H., Garcia, E.A.: Learning from Imbalanced Data, Knowledge and Data Engineering. IEEE Trans. on Knowledge and Data Engineering 21(9), 1263–1284 (2009) [2] Handfield, R.B., Nichols Jr., E.L.: Introduction to Supply Chain Management, pp. 141–162. Prentice-Hall, Inc., Englewood Cliffs (1999) [3] Mentzer, J., et al.: Defining supply chain management. Journal of Business Logistics 22, 1–25 (2001) [4] Lambert, et al.: Issues in Supply Chain Management. Industrial Marketing Management 29, 65–83 (2000) [5] Yi-Ping, C., Yong-Hong, Y.: Ping Heng Ji Fen Ka Wan Quan Jiao Zhan Shou Ce. Merlin Publishing Co., Ltd., Taiwan (2004) [6] Chen, S.H., et al.: Enterprise Partner Selection for Vocational Education: Analytical network Process Approach. International Journal of Manpower 25(7), 643–655 (2004) [7] Sarkis, J.: Evaluating environment conscious business practices. Journal of Operational Research 107(1), 159–174 (1998) [8] Cheng, L.X.: Gong Ying Lian Feng Xian De Cai Wu Guan Dian. Accounting Research Monthly 285, 66–67 (2009) [9] Hsieh, M.-Y., et al.: How to Reduce the Negative Financial Influence in the Supply Chain, pp. 397–400. Electronic Trend Publications (2010)
[10] Saaty, T.L.: Decision Making with Dependence and Feedback: The Analytic Network Process. RWS Publications, Pittsburgh (1996) [11] Hsieh, M.-Y., et al.: Decreasing Financial Negative Influence in the Supply Chain Management through Integrated Comparison the ANP and GRA-ANP Models. The Journal of Grey System 139(2), 69–72 (2010) [12] Hsieh, M.-Y., et al.: Decreasing Financial Negative Influence in the Supply Chain Management by applying the ANP model. In: 2009 The 3rd Cross-Strait Technology, Humanity Education and Academy-Industry Cooperation Conference, China (2009)
Intelligent Decision for Dynamic Fuzzy Control Security System in Wireless Networks Xu Huang, Pritam Gajkumar Shah, and Dharmendra Sharma
Abstract. Security in wireless networks has become a major concern, as wireless networks are more vulnerable to security threats than wired networks. While elliptic curve cryptography (ECC) offers great potential benefits for wireless sensor network (WSN) security, much work remains to be done, because WSNs operate under severe constraints such as a limited energy source and limited computing capability. It is well known that the scalar multiplication operation in ECC accounts for more than 80% of the key calculation time on wireless sensor network motes. In this paper we present an intelligent decision mechanism for an optimized dynamic window, based on our previous research. The overall quality of service (QoS) is improved under this algorithm, and in particular power is consumed more efficiently. The simulation results show that, owing to the intelligent decision system, the number of fuzzy conditions decreased from the previous 26 to the current 9, and the calculation time decreased by approximately 17.5% in comparison with our previous algorithms in an ECC wireless sensor network.
Xu Huang · Pritam Gajkumar Shah · Dharmendra Sharma: Faculty of Information Sciences and Engineering, University of Canberra, ACT 2601, Australia; e-mail: {Xu.Huang,Pritam.Shah,Dharmendra.Sharma}@canberra.edu.au
1 Introduction The high demand for various sensor applications reflects the fact that, with the rapid progress of wireless communications, such applications have become popular in our daily life. Growth in very large scale integration (VLSI) technology, embedded systems and micro-electro-mechanical systems (MEMS) has enabled the production of inexpensive sensor nodes, which can transmit data over a distance through free media with efficient use of power [1, 22, 23]. In a WSN system, a sensor node detects the information of interest, processes it with the help of an in-built microcontroller and communicates the results to a sink or base station. Normally the base station is a more powerful node, which can be linked to a central station via satellite or internet communication to form a network. There are many deployments of wireless sensor networks depending on the application, such
as environmental monitoring, e.g. volcano detection [2, 3], distributed control systems [4], agricultural and farm management [5], detection of radioactive sources [6], and computing platforms for tomorrow's internet [7]. Generally speaking, a typical WSN architecture is shown in Figure 1. In contrast to traditional networks, a wireless sensor network normally has many resource constraints [4] due to its limited size. As an example, the MICA2 mote consists of an 8-bit ATMega 128L microcontroller running at 7.3 MHz; as a result, WSN nodes have limited computational power. The radio transceiver of MICA motes can normally achieve a maximum data rate of 250 Kbits/s, which restricts the available communication resources. The flash memory available on the MICA mote is only 512 Kbytes. Apart from these limitations, the onboard battery is 3.3 V with a 2 A-hr capacity. Under these restrictions, current state-of-the-art protocols and algorithms are expensive for sensor networks because of their high communication overhead.
Fig. 1 A Typical WSN architecture
Recall that elliptic curve cryptography was first introduced by Neal Koblitz [9] and Victor Miller [10] independently in the mid-1980s. The advantage of ECC over other public-key cryptography techniques such as RSA and Diffie-Hellman is that the best known algorithm for solving the elliptic curve discrete logarithm problem (ECDLP), the underlying hard mathematical problem in ECC, takes fully exponential time, whereas the best algorithms for breaking RSA and Diffie-Hellman take sub-exponential time [11].
2 Elliptic Curve Diffie-Hellman Scheme Proposed for WSN Before introducing our method, we take a closer look at the popular legacy scheme for WSNs. As noted in [13], the original Diffie-Hellman algorithm with RSA requires a key of 1024 bits to achieve sufficient security, whereas Diffie-Hellman based on ECC can achieve the same security level with only a 160-bit key. Two heavily used operations are involved in ECC: scalar multiplication and modular reduction. Gura et al. [14] showed that 85% of execution time is spent on scalar multiplication. Scalar multiplication is the operation of multiplying a point P
on an elliptic curve E defined over a field GF(p) by a positive integer k, which involves point addition and point doubling. The operational efficiency of kP is affected by the type of coordinate system used for the point P on the elliptic curve and by the algorithm used for recoding the integer k in the scalar multiplication. This paper proposes an algorithm based on one's complement representation of the integer k, which accelerates the computation of scalar multiplication in wireless sensor networks. The number of point doubling and point addition operations in a scalar multiplication depends on the recoding of the integer k, and expressing k in binary format highlights this dependency: the number of zeros and ones in the binary form, their positions and the total number of bits all affect the computational cost. The Hamming weight, i.e. the number of non-zero elements, determines the number of point additions, while the bit length of k determines the number of point doubling operations. One point addition with P ≠ Q requires one field inversion and three field multiplications [13], with squaring counted as a regular multiplication; this cost is denoted by 1I + 3M, where I denotes the cost of an inversion and M the cost of a multiplication. One point doubling with P = Q requires 1I + 4M, where we neglect the cost of field additions as well as the cost of multiplication by the small constants 2 and 3 in the above formulae. Binary Method Scalar multiplication is the computation of the form Q = kP, where P and Q are elliptic curve points and k is a positive integer. It is obtained by repeated elliptic curve point addition and doubling operations. In the binary method the integer k is represented in binary form:
k = \sum_{j=0}^{l-1} K_j 2^j, \qquad K_j \in \{0, 1\}
The binary method scans the bits of k either from left to right or from right to left. The cost of multiplication with the binary method depends on the number of non-zero elements and on the length of the binary representation of k. If the representation has k_{l-1} ≠ 0, then the binary method requires (l - 1) point doublings and (W - 1) point additions, where l is the length of the binary expansion of k and W is the Hamming weight of k (i.e., the number of non-zero elements in the expansion of k). For example, if k = 629 = (1001110101)_2, it requires (W - 1) = 6 - 1 = 5 point additions and (l - 1) = 10 - 1 = 9 point doubling operations. Signed Digit Representation Method Subtraction has virtually the same cost as addition in the elliptic curve group: the negative of a point (x, y) is (x, -y) for curves over fields of odd characteristic. This leads to scalar multiplication methods based on addition-subtraction chains, which help to reduce the number of curve operations. When the integer k is represented in the following form, it is a binary signed-digit representation:
k = \sum_{j=0}^{l} S_j 2^j, \qquad S_j \in \{-1, 0, 1\}
When a signed-digit representation has no adjacent non-zero digits, i.e. S_j S_{j+1} = 0 for all j ≥ 0, it is called a non-adjacent form (NAF). A NAF usually has fewer non-zero digits than the binary representation; its average Hamming weight is (n - 1)/3.0, so it generally requires (n - 1) point doublings and (n - 1)/3.0 point additions. The binary method can be revised accordingly, giving another algorithm for NAF; this modified method is called the addition-subtraction method.
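As an illustration of the signed-digit idea, the following Python sketch computes the NAF of a scalar. It is a standard textbook recoding, included here only to make the representation concrete, and is not specific to this paper's implementation.

```python
def naf(k: int):
    """Return the non-adjacent form of k as a list of digits in {-1, 0, 1},
    least significant digit first."""
    digits = []
    while k > 0:
        if k & 1:
            d = 2 - (k % 4)        # +1 if k = 1 (mod 4), -1 if k = 3 (mod 4)
            k -= d
        else:
            d = 0
        digits.append(d)
        k //= 2
    return digits

d = naf(629)
print(d)                                # [1, 0, 1, 0, -1, 0, 0, 1, 0, 1]
print(sum(1 for x in d if x != 0))      # Hamming weight of the NAF: 5
```

For k = 629 the NAF has Hamming weight 5, compared with 6 for the plain binary expansion, so one point addition is saved.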
3 Dynamic Window Based on a Fuzzy Controller in ECC We use an algorithm based on subtraction by means of the 1's complement, which is most common in binary arithmetic. The 1's complement of any binary number may be found by the following equation [19-22]:

C_1 = (2^a - 1) - N \qquad (1)

where C_1 is the 1's complement of the binary number, a is the number of bits of N in binary form, and N is the binary number. A closer observation of equation (1) reveals that any positive integer can be represented using minimal non-zero bits in its 1's complement form, provided that it has a Hamming weight of at least 50%. Minimizing the non-zero bits of a positive integer scalar is very important for reducing the number of intermediate multiplication, squaring and inverse operations used in elliptic curve cryptography, as we have seen in the previous sections. Equation (1) can therefore be rearranged as:

N = 2^a - C_1 - 1 \qquad (2)

For example, take N = 1788, so that N = (11011111100)_2 in binary form, C_1 = (00100000011)_2 is the 1's complement of N, and a = 11. Substituting these values into equation (2) gives 1788 = 2^{11} - (00100000011)_2 - 1, which can be expanded as:

1788 = (100000000000)_2 - (00100000011)_2 - 1 \qquad (3)

so that 1788 = 2048 - 256 - 2 - 1 - 1.
As is evident from equation (3), the Hamming weight of the scalar N has been reduced from 8 to 5, which saves 3 elliptic curve addition operations. One addition operation requires 2 squarings, 2 multiplications and 1 inversion, so in this case a total of 6 squarings, 6 multiplications and 3 inversions are saved.
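The recoding of equations (1)-(3) is easy to check numerically. The short Python sketch below reproduces the N = 1788 example, counting one non-zero term for the leading 2^a, one for each set bit of C_1 and one for the trailing -1; the function name is ours, not from the paper.

```python
def ones_complement_form(n: int):
    """Recode n as 2**a - c1 - 1 (equation (2)), where c1 is the one's
    complement of n over its a-bit binary representation; return the terms
    and the signed-digit Hamming weight of the recoded form."""
    a = n.bit_length()
    c1 = (1 << a) - 1 - n                  # equation (1): C1 = (2^a - 1) - N
    weight = 1 + bin(c1).count("1") + 1    # 2^a term, C1's set bits, trailing -1
    return a, c1, weight

a, c1, w = ones_complement_form(1788)
print(a, bin(c1), w)   # 11 0b100000011 5  ->  1788 = 2**11 - 259 - 1
```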
The above recoding method based on one's complement subtraction, combined with the sliding window method, provides a more optimized result. As an example, let us compute [763]P (in other words k = 763) with a sliding window algorithm, with k recoded in binary form and window sizes ranging from 2 to 10. It is observed that as the window size increases, the number of precomputations increases geometrically, while the numbers of addition and doubling operations decrease. We now present the details for different window sizes, using the following example, to find the optimal window size.
Window Size w = 2: 763 = (1011111011)_2. Precomputations: the odd multiples up to [2^w - 1]P = [3]P. Partitioning 763 as 10 11 11 10 11, the intermediate values of Q are P, 2P, 4P, 8P, 11P, 22P, 44P, 47P, 94P, 95P, 190P, 380P, 760P, 763P. Computational cost: 9 doublings, 4 additions, and 1 precomputation.
Window Size w = 3: precomputations up to [2^w - 1]P = [7]P, i.e. all odd multiples [3]P, [5]P, [7]P. Partitioning 763 as 101 111 101 1 = [5]P [7]P [5]P [1]P, the intermediate values of Q are 5P, 10P, 20P, 40P, 47P, 94P, 188P, 376P, 381P, 762P, 763P. Computational cost: 7 doublings, 3 additions, and 3 precomputations.
We continue to derive the corresponding calculations for the remaining window sizes up to w = 10. The results for all calculations are presented in Table 1.
Algorithm: sliding window scalar multiplication on elliptic curves
Input: point P, l-bit scalar n = (n_{l-1} ... n_0)_2, window width k, precomputed odd multiples [u]P
Output: Q = [n]P
1.  Q ← P_∞ and i ← l - 1
2.  while i ≥ 0 do
3.      if n_i = 0 then Q ← [2]Q and i ← i - 1
4.      else
5.          s ← max(i - k + 1, 0)
6.          while n_s = 0 do s ← s + 1
7.          for h = 1 to i - s + 1 do Q ← [2]Q
8.          u ← (n_i ... n_s)_2        [n_i = n_s = 1 and i - s + 1 ≤ k]
9.          Q ← Q ⊕ [u]P               [u is odd, so [u]P is precomputed]
10.         i ← s - 1
11. return Q
Table 1 Window size vs. number of doublings, additions and precomputations

Window Size   Doublings   Additions   Precomputations
     2            9           4              1
     3            7           3              3
     4            6           2              7
     5            5           1             15
     6            4           1             31
     7            3           1             61
     8            3           1            127
     9            1           1            251
    10            0           0            501
4 Intelligent Decision Fuzzy Controller System in ECC It is clear from the above description that there is a tradeoff between the computational cost and the window size, as shown in Table 1. This tradeoff is underpinned by the balance between the computing cost (the RAM cost) and the pre-computing cost (the ROM cost) of a node in the network. It is also clear that the variety of working states of a wireless network makes this control complex, and the calculations can become relatively expensive.
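One way to see the tradeoff is to convert Table 1 into an approximate on-line cost in field multiplications, using the affine costs quoted in Section 2 (one addition ~ 1I + 3M, one doubling ~ 1I + 4M). The Python sketch below does this; the inversion-to-multiplication ratio is an assumed parameter, not a figure from the paper, and the precomputation count is returned separately because it is paid for in ROM.

```python
# Rough cost comparison across window sizes, using Table 1 and the affine
# operation costs quoted earlier; the I/M ratio below is assumed.
TABLE1 = {2: (9, 4, 1), 3: (7, 3, 3), 4: (6, 2, 7), 5: (5, 1, 15),
          6: (4, 1, 31), 7: (3, 1, 61), 8: (3, 1, 127), 9: (1, 1, 251), 10: (0, 0, 501)}

def runtime_cost(w, inv_to_mul=10.0):
    """Field-multiplication equivalent of the on-line work for window size w,
    plus the number of precomputed points (the ROM-side cost)."""
    doublings, additions, precomp = TABLE1[w]
    add_cost = inv_to_mul + 3          # 1I + 3M
    dbl_cost = inv_to_mul + 4          # 1I + 4M
    return doublings * dbl_cost + additions * add_cost, precomp

for w in (2, 4, 6, 8, 10):
    print(w, runtime_cost(w))
```

A node with ample ROM would prefer a large window, while a node short of storage or with a heavy pre-computation load would stay at a smaller one; this is exactly the decision that the fuzzy controller described next is designed to automate.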
Therefore, we propose a fuzzy dynamic control system that ensures the optimum window size is obtained by trading off pre-computation and computation cost. The fuzzy decision problem introduced by Bellman and Zadeh takes as its goal the maximization of the minimum value of the membership functions of the objectives to be optimized. Accordingly, the fuzzy optimization model can be represented as a multi-objective programming problem as follows [21]:

\max \ \min\{\mu_s(D)\} \ \text{and} \ \min\{\mu_l(U_l)\} \qquad \forall s \in S, \ \forall l \in L
\text{subject to} \quad A_l \le C_l \ \ \forall l \in L, \qquad \sum_{r \in R_p} x_{rs} = 1 \ \ \forall p \in P, \ \forall s \in S, \qquad x_{rs} \in \{0, 1\} \ \ \forall r \in R, \ \forall s \in S
In the above formulation, the objective is to maximize the minimum membership function of all delays, denoted by D, and of the difference between the recommended value and the measured value, denoted by U. The fuzzy control system, which extends our earlier design, is shown in Figure 2. For accurate control, we designed a three-input fuzzy controller. The first input is StorageRoom, which takes one of three states: (a) low, (b) average, and (c) high. The second input is the pre-computing working load (PreComputing), also in one of three states: (a) low, (b) average, and (c) high. The third input is Doubling, expressing the working load of the "doubling" calculation, which again has three cases: (a) low, (b) average, and (c)
high. The single output, called WindowSize, expresses in which direction the next window size should move, and has three states: (a) down, (b) stay, and (c) up.
Fig. 2 Three-input fuzzy window control system
There are only 9 fuzzy rules, listed below (all weights are unity), because StorageRoom in Figure 6 can be ignored on the basis of the results of Figure 5, although it remains a factor that appears in this control system:
1. If (PreComputing is low) and (Doubling is low) then (WindowSize is Up)
2. If (PreComputing is low) and (Doubling is average) then (WindowSize is Up)
3. If (PreComputing is low) and (Doubling is high) then (WindowSize is stay)
4. If (PreComputing is average) and (Doubling is low) then (WindowSize is Up)
5. If (PreComputing is average) and (Doubling is average) then (WindowSize is Up)
6. If (PreComputing is average) and (Doubling is high) then (WindowSize is stay)
7. If (PreComputing is high) and (Doubling is low) then (WindowSize is Up)
8. If (PreComputing is high) and (Doubling is average) then (WindowSize is stay)
9. If (PreComputing is high) and (Doubling is high) then (WindowSize is stay)
The number attached to each fuzzy condition in brackets is its weight; it is currently unity, and later we shall change it according to the running situation, as described next. The three inputs are StorageRoom, PreComputing and Doubling, and the output is WindowSize. Note that if we did not take advantage of Figure 5, at least 26 fuzzy rules would need to be considered, as shown in our previous paper [23], because StorageRoom would contribute its low, average and high states in combination with the other two parameters. A sketch of this reduced rule base is given below.
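The nine rules can be written down directly as a decision table. The crisp Python sketch below maps already-fuzzified PreComputing and Doubling levels to the WindowSize action; it omits membership functions and defuzzification, so it is only a simplified illustration of the rule base, not the authors' full fuzzy controller.

```python
# Simplified, crisp view of the nine window-size rules listed above.
# Inputs are assumed to be already discretized into "low"/"average"/"high";
# StorageRoom is omitted, as in the reduced 9-rule base.
RULES = {
    ("low", "low"): "up",        ("low", "average"): "up",     ("low", "high"): "stay",
    ("average", "low"): "up",    ("average", "average"): "up", ("average", "high"): "stay",
    ("high", "low"): "up",       ("high", "average"): "stay",  ("high", "high"): "stay",
}

def window_size_move(precomputing: str, doubling: str) -> str:
    """Return the window-size action for the given PreComputing and Doubling
    load levels according to the rule table."""
    return RULES[(precomputing, doubling)]

print(window_size_move("average", "high"))  # -> 'stay'
```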
To make the controller run more efficiently, an intelligent decision capability is established via multiple agents, where each individual agent is able to take decisions so that the controller works coherently. The intelligent decision system is designed as below, with the agents defined as follows.
Co-Ordination Agent: the coordination agent of the whole system, which communicates with all other agents within the enterprise and ensures that all agents cooperate coherently.
Window Size Checking Agent: checks the current window size to determine whether it needs to be changed for the system in its current running status.
Fuzzy Controller Agent: carries out the implementation of the fuzzy controller to obtain the optimal window size for the current status.
System State Agent: takes care of the whole system (rather than just the window-size issue) and makes sure the holistic system always sits at the optimal status.
In this solution, no single agent is fully aware of the whole communication process. Instead, all the agents work together to make the whole communication happen and to keep the holistic system in the designed state. With this kind of approach, which is quite suitable to be represented as an agent society, modifications can be made effectively, as shown in Figure 3, where solid lines and dashed lines distinguish the two communication channels, namely messages going out of an agent and messages coming into an agent.
Fig. 3 The Enterprise and the Agent Society
The “System State Agent” looks after the holistic ECC system and ensures that the coding, encoding, cryptography, energy and communications among the nodes, etc., are in the designed states. It frequently talks to the “Co-ordination Agent”, which mainly controls the optimized dynamic window to meet the requirements from the “System State Agent”. The “Co-ordination Agent” is in charge of the whole optimized dynamic window, working with the fuzzy controller to operate the ECC system effectively; its communications are mainly among the “Window Size Checking Agent”, the “Fuzzy Controller Agent” and the “System State Agent”, as shown in Figure 3. The “Window Size Checking Agent” looks after the relations among the “ROM Control Agent”, the “RAM Control Agent” and the “Calculation Control Agent”, keeping the current ECC system in a good condition in which the RAM is used to its full potential while the ROM storage maintains the required pre-calculated results; if any calculation is needed by the ECC system, the “Calculation Control Agent” offers the service as required. The “Fuzzy Controller Agent” in the agent society is the implementation agent for the fuzzy controller, running the framework shown in Figure 2. It has three sub-agents: (1) the “Fuzzy Input Agent”, which looks after the three components, namely “StorageRoom”, “PreComputing” and “Doubling”, shown in Figure 2; (2) the “Fuzzy Rule Agent”, which manages the fuzzy rules in Figure 2 to make sure the fuzzy controller correctly completes the designed functions; and (3) the “Fuzzy Output Agent”, which takes care of the output of the fuzzy controller and ensures that the output is sent correctly, as shown on the right-hand side of Figure 2. The software development is based on a C++ platform, and we have concentrated on developing the MAS application in the same C++ language. We found that most MAS applications only support Java-based development, so we decided to write our own MAS application in C++. This works well, since C++ is an object-oriented language and the agents can easily be represented by C++ classes. At present we are in the process of developing the agent society along with the Co-ordination Agent. Appleby and Steward of BT Labs took a similar approach to prototype a mobile-agent-based system for controlling telecommunication networks. The final simulation result is shown in Figure 4; there are two outcomes, namely “Doubling” and “PreComputing”.
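To give a flavour of the agent society in Figure 3, here is a very small message-passing sketch. The agent names follow the paper, but the class design and the messages are our own illustration in Python, whereas the authors' implementation is written in C++.

```python
# Minimal illustration of the agent society: agents exchange messages and the
# Co-Ordination Agent routes requests between the others. The message-passing
# scheme itself is an assumption made for illustration.
class Agent:
    def __init__(self, name):
        self.name = name

    def send(self, recipient, message):
        print(f"{self.name} -> {recipient.name}: {message}")
        recipient.receive(self, message)

    def receive(self, sender, message):
        pass  # concrete agents override this with their own behaviour


class CoordinationAgent(Agent):
    def __init__(self, name, peers=None):
        super().__init__(name)
        self.peers = peers or []

    def receive(self, sender, message):
        print(f"{self.name} routes '{message}' from {sender.name}")
        for peer in self.peers:          # forward the request to the other agents
            if peer is not sender:
                peer.receive(self, message)


coord = CoordinationAgent("Co-Ordination Agent")
checker = Agent("Window Size Checking Agent")
fuzzy = Agent("Fuzzy Controller Agent")
state = Agent("System State Agent")
coord.peers = [checker, fuzzy, state]
checker.send(coord, "current window size may be suboptimal")
```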
Fig. 4 The output surface for StorageRoom held constant at 0.4 (upper plot) and at 0.8 (lower plot), plotted over PreComputing vs. Doubling.
It is obvious from the above figures that the "addition" factor is less "important" than the others in the fuzzy control system. In our simulations, the proposed method together with the fuzzy window-size controller makes the ECC calculation in the current algorithm about 17% more efficient than the methods in [23] at the same QoS level.
5 Conclusion In this paper we have extended our previous research to an intelligent decision system built on an agent society, increasing the capacity and capability of the fuzzy system and making the original system more efficient and effective. The final simulation in a wireless sensor network shows that an efficiency gain of about 17.5% over our previous method [23] can be obtained in an ECC sensor network.
References [1] Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: Wireless sensor networks: a survey. Computer Networks 38, 393–422 (2002) [2] Chung-Kuo, C., Overhage, J.M., Huang, J.: An application of sensor networks for syndromic surveillance, pp. 191–196 (2005)
[3] Werner-Allen, G., Lorincz, K., Ruiz, M., Marcillo, O., Johnson, J., Lees, J., Welsh, M.: Deploying a wireless sensor network on an active volcano. IEEE Internet Computing 10, 18–25 (2006) [4] Sinopoli, B., Sharp, C., Schenato, L., Schaffert, S., Sastry, S.S.: Distributed control applications within sensor networks. Proceedings of the IEEE 91, 1235–1246 (2003) [5] Sikka, P., Corke, P., Valencia, P., Crossman, C., Swain, D., Bishop-Hurley, G.: Wireless ad hoc sensor and actuator networks on the farm, pp. 492–499 (2006) [6] Stephens Jr., D.L., Peurrung, A.J.: Detection of moving radioactive sources using sensor networks. IEEE Transactions on Nuclear Science 51, 2273–2278 (2004) [7] Feng, Z.: Wireless sensor networks: a new computing platform for tomorrow’s Internet, vol. 1, pp. I–27 (2004) [8] Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: A Survey on Sensor Networks. IEEE Communication Magazine 40, 102–116 (2002) [9] Koblitz, N.: Elliptic Curve Cryptosystems. Mathematics of Computation 48, 203–209 (1987) [10] Miller, V.S.: Use of Elliptic Curves in Cryptography. In: Williams, H.C. (ed.) CRYPTO 1985. LNCS, vol. 218, pp. 417–426. Springer, Heidelberg (1986) [11] Lopez, J., Dahab, R.: An overview of elliptic curve cryptography, Technical report,Institute of Computing, Sate University of Campinas, Sao Paulo, Brazil (May 2000) [12] Lauter, K.: The advantages of elliptic curve cryptography for wireless security. Wireless Communications, IEEE [see also IEEE Personal Communications] 11, 62– 67 (2004) [13] Wang, H., Sheng, B., Li, Q.: Elliptic curve cryptography-based access control in sensor networks. Int. J. Security and Networks 1, 127–137 (2006) [14] Gura, N., Patel, A., Wander, A., Eberle, H., Shantz, S.C.: Comparing elliptic curve cryptography and RSA on 8-bit cPUs. In: Joye, M., Quisquater, J.-J. (eds.) CHES 2004. LNCS, vol. 3156, pp. 119–132. Springer, Heidelberg (2004) [15] http://csrc.nist.gov/CryptoToolkit/dss/ecdsa/NISTReCur.pdf [16] Malan, D.J., Welsh, M., Smith, M.D.: A public-key infrastructure for key distribution in TinyOS based on elliptic curve cryptography. In: 2nd IEEE International Conference on Sensor and Ad Hoc Communications and Networks (SECON 2004)2nd IEEE International Conference on Sensor and Ad Hoc Communications and Networks (SECON 2004), pp. 71–80 (2004) [17] Blake, I., Seroussi, G., Smart, N.: Elliptic Curves in Cryptography, vol. 265 (1999) [18] Hankerson, D., Hernandez, J.L., Menezes, A.: Software implementation of elliptic curve cryptography over binary fields. In: Paar, C., Koç, Ç.K. (eds.) CHES 2000. LNCS, vol. 1965, p. 1. Springer, Heidelberg (2000) [19] Gillie, A.C.: Binary Arithmetic and Boolean algebra, p. 53. McGRAW-HILL Book Company, New York (1965) [20] Bellman, H.R., Zadeh, L.A.: Decision-making in a fuzzy environment. Management Science 17, 141–164 (1970) [21] Huang, X., Wijesekera, S., Sharma, D.: Fuzzy Dynamic Switching in Quantum Key Distribution for Wi-Fi Networks. In: Proceeding of the 6th International Conference on Fuzzy Systems and Knowledge Discovery, Tianjin, China, August 14-16, pp. 302– 306 (2009)
[22] Huang, X., Shah, P.G., Sharma, D.: Multi-Agent System Protecting from Attacking with Elliptic Curve Cryptography. In: The 2nd International Symposium on Intelligent Decision Technologies, Baltimore, USA, July 28-30 (2010) (accepted to be published) [23] Huang, X., Sharma, D.: Fuzzy Controller for a Dynamic Window in Elliptic Curve Cryptography Wireless Networks for Scalar Multipication. In: The 16th Asia-Pacific Conference on Communications, APCC 2010, Langham Hotel, Auckland, New Zealand, October 31-November 3, pp. 509–514 (2010) ISBN: 978-1-4244-8127-9
Investigating the Continuance Commitment of Volitional Systems from the Perspective of Psychological Attachment Huan-Ming Chuang, Chyuan-Yuh Lin, and Chien-Ku Lin
Abstract. This study integrates the IS success model and social influence theory, taking into account both social influences and personal norms, to investigate in depth the critical factors affecting the continuance intention toward an information system used by elementary schools in Taiwan. A questionnaire survey was conducted to collect data for analysis with PLS, with 206 teachers sampled from Yunlin County elementary schools as research subjects. The principal research findings are: (1) perceived net benefits positively affect attitude, and attitude has the same effect on continuance intention; (2) among the three degrees of psychological attachment, compliance shows a negative effect on perceived net benefits and continuance intention, while identification and internalization show positive effects; (3) system quality, information quality and service quality generally promote perceived net benefits, attitude and continuance intention. The conclusions offer practical and valuable guidance on ways to enhance SFS users' continuance commitment and behavior. Keywords: IS success Model, Psychological Attachment, IS continuance.
Huan-Ming Chuang: Associate Professor, Department of Information Management, National Yunlin University of Science and Technology
Chyuan-Yuh Lin · Chien-Ku Lin: Graduate Students, Department of Information Management, National Yunlin University of Science and Technology
1 Introduction Owing to the rapid innovation and popularization of information technology, educational agencies are leveraging it to enhance administrative efficiency and effectiveness. Against this background, Yunlin County has been actively promoting a System Free Software (SFS) for education administration. The SFS system provides teachers and students with a better platform for teaching, learning, communication and evaluation, and
gains increasing acceptance. However, for improving users' continuance intention effectively, prior IS acceptance and usage research, which emphasizes social normative compliance alone, has shown great limitations. As a result, more and more researchers propose that personal norms should also be considered in order to truly understand the important factors determining users' continuance intention. Moreover, no matter how good a system is and how actively it is promoted, if it is not accepted and used it cannot succeed at all. Consequently, the essential factors affecting information system acceptance and continuance are important research issues.
2 Background and Literature Review 2.1 DeLone and McLean's IS Success Model Since IS success is a multi-dimensional concept that can be assessed from different perspectives according to different strategic needs, measuring IS success is not an easy or objective task. Nevertheless, DeLone and McLean made a major breakthrough in 1992 [1]. After conducting a comprehensive review of the related literature, they proposed an IS success model, as shown in Figure 1.
Fig. 1 DeLone and McLean's IS success model [1]
This model suggested that IS success can be represented by system quality, output information quality, use of the output, user satisfaction, the impact of the IS on the individual, and the impact of the IS on the organization. It identified six important dimensions of IS success and suggested the temporal and causal interdependencies between them. After the original model had gone through many validations, DeLone and McLean proposed an updated model in 2003, as shown in Figure 2.
Fig. 2 DeLone and McLean’s updated IS success model [2]
The primary differences between the original and updated models can be listed as follows: (1) the addition of service quality, to reflect the importance of service and support in successful e-commerce systems; (2) the addition of intention to use, to measure user attitude; and (3) the combination of individual impact and organizational impact into a single net benefits construct.
2.2 Psychological Attachment The concept of psychological attachment is based on Kelman's (1958) theory of social influence, which aims to understand the basis for individuals' attitude and belief change [3]. He emphasized the importance of knowing the nature and depth of such changes, which helps to predict the manifestations and consequences of the new attitudes. Kelman noted that individuals can be affected in three different ways: compliance, identification, and internalization, which are termed the three processes of attitude change [3]. Distinguishing the three processes of attitude change is significant because one can ascertain how individuals are influenced and then make meaningful predictions about the consequences of the individuals' change. Compliance occurs when an individual adopts the induced behavior, by conforming, to gain specific rewards or approval and to avoid specific punishments or disapproval. In this case, the individual accepts influence not because he believes in the content of the induced behavior but because he hopes to achieve a favorable reaction from another person or group. Identification occurs when an individual accepts the induced behavior because he wants to obtain a satisfying, self-defining relationship with another person or group. In this case, the individual is motivated simply by the desired relationship, not by the content of the induced behavior. Internalization occurs when influence is accepted because the content of the induced behavior is congruent with the individual's own value system. In this case, internalization is due to the content of the new behavior, and behavior adopted in this fashion is more likely to be integrated into the individual's existing values.
The processes of compliance, identification, and internalization represent three qualitatively different ways of accepting influence. Kelman further explained that behaviors adopted through different processes can be distinguished on the basis of the conditions under which the behavior is performed [3]. He indicated that each of the three processes mediates between a distinct set of antecedents and a distinct set of consequents. Given the set of antecedents or consequents, influence will then take the form of compliance, identification, or internalization, respectively, and each of these corresponds to a characteristic pattern of internal responses in which the individual engages while accepting the influence. Kelman also noted that responses adopted through different processes will be performed under different conditions and will have different properties [3]. For example, behavior adopted through compliance is performed under surveillance by the influencing agent. Behavior adopted through identification is performed under conditions in which the individual's relationship to the influencing agent is salient. Behavior adopted through internalization is performed under conditions in which the issue is relevant, regardless of surveillance or salience; the induced behavior is integrated with the individual's existing value system and becomes a part of his personal norms. These differences between the three processes of social influence may represent separate dimensions of commitment to the group or to IT usage [4]. Some management research has applied the term commitment to the antecedents and consequences of behavior, as well as to the process of becoming attached to specific behaviors and to the state of attachment itself [5][6]. More specifically, it is psychological attachment that is the construct of common interest (O'Reilly and Chatman) in Kelman's social influence theory [4]. Kelman's theory underscores personal norms instead of simple social norms in understanding behavioral commitment to system usage [3]. Personal norms are embedded in one's own value system and beliefs, and therefore allow one to understand the inherent individual reasons why the induced behavior is adopted or rejected. In IT research, the use of an IT is viewed as a continuum that ranges from nonuse, through compliant use, to committed use; the continuum is a function of the perceived fit of the system use with the users' values. Accordingly, this study defines user commitment as the user's psychological attachment in the chosen technology context. Several prior IS studies point out that social influence (referring to subjective norm) is important for predicting users' IS usage and acceptance behavior. However, conceptualizing social influence based only on social normative compliance has theoretical and psychometric problems, because it is difficult to distinguish whether usage behavior is caused by the influence of certain referents on one's intent or by one's own beliefs [7]. Additionally, social normative compliance usually occurs under the power of the influencing agent, so predicting IT usage is likely to require more than simple compliance. Because O'Reilly and Chatman (1986) have shown that psychological attachment can be predicated on compliance, identification, and internalization [4], this study views social influence as the specific behavior adoption process by which an individual fulfills his own instrumental goals in terms of the above three processes.
3 Research Model and Hypotheses 3.1 Research Model Based on the literature review, we propose a research model shown in Figure 3, examining the effects of user commitment and IS success factors on volitional systems usage behavior.
Fig. 3 Research model
3.2 Research Hypotheses Users’ commitment level can be categorized as continuance commitment and affective commitment [8]. Continuance commitment is based on the costs that the system user associates with not adopting the induced behavior, while affective commitment refers to the commitment of the system user based upon congruence of personal values and identification with satisfying self-defining relationships; it is represented by identification and internalization in this study. Since compliance can be said to occur when an individual accepts the induced behavior for the sake of gaining rewards of approval or minimizing costs such as punishments, we can expect its negative effect on continuance-related variables and propose the following hypotheses. 3.2.1 Hypotheses Related to Perceived Net Benefits H1a: Compliance will have a negative influence on perceived net benefits. H1b: Identification will have a positive influence on perceived net benefits. H1c: Internalization will have a positive influence on perceived net benefits. H1d: System quality will have a positive influence on perceived net benefits. H1e: Information quality will have a positive influence on perceived net benefits. H1f: Service quality will have a positive influence on perceived net benefits.
3.2.2 Hypotheses Related to Attitude H2a: Compliance will have a negative influence on attitude. H2b: Identification will have a positive influence on attitude. H2c: Internalization will have a positive influence on attitude. H2d: System quality will have a positive influence on attitude. H2e: Information quality will have a positive influence on attitude. H2f: Service quality will have a positive influence on attitude. 3.2.3 Hypotheses Related to Continuance Intention H3a: Perceived net benefits will have a positive influence on attitude. H3b: Attitude will have a positive influence on continuance intention.
4 Research Method 4.1 Study Setting School Free Software (SFS) is a school affairs system developed through the collaboration of teachers with computer expertise, under the platform “Interoperable Free Opensource,” where system modules developed by skilled teachers, such as a bulletin board, student credits processing and so on, are integrated for free download to meet the goal of avoiding duplicate development and enhancing resource sharing. Major advantages of SFS can be described as follows. First, with its cross-platform feature, all its functional modules can be accessed and processed through an internet browser, meeting the spirit of free software. Second, compared with current commercial software packages, it can greatly lessen the burden on limited budgets. Last, since all the functional modules are developed by incumbent teachers, it fits practical administrative procedures quite well, so synergic system performance and compatibility can easily be attained. Default functions offered by SFS can be categorized as follows: (1) school affairs, (2) academic affairs, (3) student affairs, (4) teaching and administrative staff, (5) system administration, and (6) extra modules. These functions can be easily modified and customized for any special purposes. Though SFS is promoted aggressively, it is volitional in nature, since users can decide their involvement with the system willingly.
4.2 Operationalization of Constructs All constructs and measures were based on items in existing instruments, related literature, and input from domain experts. Items in the questionnaire were measured using a seven-point Likert scale ranging from (1) strongly disagree to (7) strongly agree.
4.3 Data Collection Data for this study were collected using a questionnaire survey administered in Yunlin County, Taiwan. The respondents were sampled from instructors of elementary schools who have experience with SFS. We sent out 250 questionnaires and received 206 useful responses.
5 Data Analysis and Results 5.1 Scale Validation We used PLS-Smart 2.0 software to conduct confirmatory factor analysis (CFA) to assess measurement scale validity. The variance-based PLS approach was preferred over covariance-based structural equation modeling approaches such as LISREL because PLS does not impose sample size restrictions and is distribution-free [9]. 100 records of raw data were used as input to the PLS program, and path significances were estimated using the bootstrapping resampling technique with 200 subsamples. The steps of scale validation are summarized in Table 1.
Table 1 Scale validation
Convergent validity [10]: measures of constructs that theoretically should be related to each other are, in fact, observed to be related to each other. Criteria: all item factor loadings should be significant and exceed 0.70; composite reliabilities (CR) for each construct should exceed 0.80; average variance extracted (AVE) for each construct should exceed 0.50, or the square root of AVE should exceed 0.71.
Discriminant validity [10]: measures of constructs that theoretically should not be related to each other are, in fact, observed to not be related to each other. Criterion: the square root of AVE for each construct should exceed the correlations between that construct and all other constructs.
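As a concrete illustration of the convergent-validity criteria listed in Table 1, composite reliability and AVE can be computed directly from standardized loadings. The short sketch below is illustrative only: the loading values and the single hypothetical construct are invented, not taken from the study's data.

```python
# Illustrative check of the convergent-validity criteria in Table 1.
# The standardized loadings below are hypothetical example values.

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    sum_l = sum(loadings)
    error_var = sum(1.0 - l ** 2 for l in loadings)  # error variance of a standardized item
    return sum_l ** 2 / (sum_l ** 2 + error_var)

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

loadings = [0.82, 0.78, 0.85, 0.74]  # hypothetical standardized CFA loadings for one construct
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(f"CR  = {cr:.3f} (criterion: > 0.80)")
print(f"AVE = {ave:.3f} (criterion: > 0.50), sqrt(AVE) = {ave ** 0.5:.3f}")
```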
As seen from Table 2, standardized CFA loadings for all scale items in the CFA model were significant at p
where
\[
f(x, y, a, b) =
\begin{cases}
1 & (x = y,\ a = b),\ \ (x > y,\ a > b),\ \text{or}\ (x < y,\ a < b) \\
0 & \text{otherwise}
\end{cases}
\qquad (8)
\]
S takes a value between 0 and 1. When S is 1, all the directions from a note to its following note in E are the same as those in T. We define S0 as the value of S calculated just after interval-pitch conversion. We can confirm the validity of the proposed method if A or R is close to 1 with system support whenever S0 is close to 1, namely whenever a subject recognizes the intervals of a target melody correctly.
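To make the agreement measure concrete, the sketch below computes a direction-agreement rate between an externalized melody E and a target melody T, assuming, as the surrounding text suggests, that S is the fraction of consecutive note pairs whose pitch directions agree according to f in Eq. (8). The melodies, given as MIDI note numbers, are invented examples and not data from the experiment.

```python
# Sketch of the direction-agreement rate S between an externalized melody E
# and a target melody T, assuming S averages f of Eq. (8) over consecutive note pairs.

def f(x, y, a, b):
    """Eq. (8): 1 if the pitch direction from x to y matches the direction from a to b."""
    if (x == y and a == b) or (x > y and a > b) or (x < y and a < b):
        return 1
    return 0

def direction_agreement(E, T):
    """Assumed form of S: average of f over consecutive note pairs (requires length > 1)."""
    n = len(E)
    assert n == len(T) and n > 1
    return sum(f(E[i], E[i + 1], T[i], T[i + 1]) for i in range(n - 1)) / (n - 1)

# Hypothetical melodies as MIDI note numbers (not taken from the experiment).
target = [60, 62, 64, 62, 60, 67]
externalized = [60, 63, 65, 62, 60, 69]
print(f"S = {direction_agreement(externalized, target):.2f}")
```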
4.3 Experimental Results and Discussion We recorded operational logs and an externalized melody in each trial. There was a case in which the melody externalized by the subject D was not the same as the melody b. This is partly because the subject D could not recognize the melody b correctly. In what follows, we discuss the 14 trials other than this one. Table 1 shows S0, Ae and Re. Ae and Re are the pitch agreement rates at the end of a trial. When a subject reset the input or converted the input again in a trial, its result has multiple values of S0. We can see that Ae or Re is close to 1 when S0 is close to 1. This result shows that externalization is performed accurately with our system when a subject recognizes the target correctly. In the trials of the subject A, the value Ae is low. This result shows that there is a difference between the pitch recognized by the subject and that of the actual melody. However, the values S0 and Re are high. This means that the subject A recognizes melodies not by the pitches of notes, but by the intervals between notes. When the value Ne is low, the value S0 also tends to be low. After the experiment, we interviewed the subjects B and D to verify whether or not they recognized intervals correctly. Neither of them could guess correctly whether a note is higher or
Table 1 Result of each trial
            Melody a                     Melody b                              Melody c
Subject A   S0 0.94, Re 1, Ae 1          S0 1, Re 1, Ae 0                      S0 0.8 → 0.93, Re 0.81, Ae 0
Subject B   S0 0.69, Re 0.23, Ae 0       S0 0.06 → 0.87, Re 0.75, Ae 0.06      S0 0.67 → 0.67, Re 0.13, Ae 0.06
Subject C   S0 1, Re 1, Ae 1             S0 1, Re 1, Ae 1                      S0 1, Re 1, Ae 1
Subject D   S0 0.5, Re 0.26, Ae 0.06     (excluded trial; see text)            S0 0.47 → 0.4 → 0.67, Re 0.13, Ae 0.13
Subject E   S0 1, Re 1, Ae 1             S0 0.93, Re 1, Ae 1                   S0 1, Re 0.86, Ae 0.86
lower than the previous note. The result agrees with the fact that the values of S0 in their trials were low. This fact shows that supporting externalization with our proposed method is difficult when a subject cannot recognize intervals correctly.
5 Conclusion In this paper, we proposed a notation-support method based on interval-pitch conversion. First, we mentioned the need to support externalization for beginners. We pointed out that the difficulty of externalizing melodies for beginners comes from the difficulty of converting intervals into pitches. Therefore, we proposed a method which assists the user with interval-pitch conversion. We implemented a prototype system and validated our method in a situation where a subject externalized a melody that he/she had memorized by listening to it. The results of the experiment show that our approach is appropriate as a notation-support method for a subject who has relative hearing. Although the supportable users are limited, our method is valid for beginners who have a strong motivation for composition. For future work, we have to consider a method for externalizing the rhythm factor. Observation of the externalized melodies in the experiment revealed that input durations tend to be shorter than intended. One of the reasons is that the subjects lifted the pen off the tablet involuntarily to get the timing of the next tap. One of the solutions to this problem is to adjust the input values.
References 1. Unehara, M., Onisawa, T.: Interactive music composition system - Composition of 16bars musical work with a melody part and backing parts. In: The 2004 IEEE International Conference on Systems, Man & Cybernetics, pp. 5736–5741 (2004) 2. Farbood, M.M., Pasztor, E., Jennings, K.: Hyperscore: A graphical sketchpad for novice composers. Computer Graphic and Applications 24(1), 247–255 (2009) 3. Itou, N., Nishimoto, K.: A voice-to-MIDI system for singing melodies with lyrics. In: The International Conference on Advances in Computer Entertainment Technology, pp. 183–189 (2007)
Numerical Study of Random Correlation Matrices: Finite-Size Effects Yuta Arai, Kouichi Okunishi, and Hiroshi Iyetomi∗
Abstract. We report the numerical calculations of the distribution of maximal eigenvalue for various size of random correlation matrices. Such an extensive study enables us to work out empirical formulas for the average and standard deviation of the maximal eigenvalue, which are accurate in a wide range of parameters. As an application of those formulas, we propose a criterion to single out statistically meaningful correlations in the principal component analysis. The new criterion incorporates finite-size effects into the current method based on the random matrix theory, which gives the exact results in the infinite-size limit.
1 Introduction In the field of econophysics, random matrix theory (RMT) has been successfully combined with the principal component analysis of various economic data. The analytical forms of the eigenvalue distribution and its upper edge λ+ for a random correlation matrix provide a very useful null model for extracting significant information on correlation structures in the data [1, 2, 3, 4, 5, 6, 7]. Here, we should note that the analytical results of RMT are basically obtained for the case where the matrix size is infinite. When the matrix size is finite, the eigenvalues of random correlation matrices may have a certain distribution in the region beyond λ+, which is called the “finite-size effect” here. When the data size is sufficiently large, the finite-size effect can be neglected. In practical situations, however, time series data are often not long enough for the finite-size effects to be neglected. Taking account of the finite-size effect of Yuta Arai Graduate School of Science and Technology, Niigata University, Ikarashi, Niigata 950-2181, Japan Kouichi Okunishi · Hiroshi Iyetomi Faculty of Science, Niigata University, Ikarashi, Niigata 950-2181, Japan e-mail:
[email protected] ∗
Corresponding author.
J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 557–565. c Springer-Verlag Berlin Heidelberg 2011 springerlink.com
the random correlation matrix, therefore, we should improve the RMT criterion in principal component analysis of actual data. In this paper, we numerically analyze the finite-size effect of random correlation matrices. Especially, we focus on the average and standard deviation of the maximum eigenvalue distribution of random correlation matrices. In reference to the Tracy-Widom distribution of order one [8, 9, 10], we show that the average and standard deviation can be described by nontrivial power-law dependences on the matrix size N and the ratio Q ≡ T /N, where T is length of time series data. We then propose a new criterion to single out statistically meaningful correlations in the principal component analysis for a finite size data.
2 Maximal Eigenvalue for Random Correlation Matrix We begin by introducing a random matrix H, an N × T matrix with elements {hij; i = 1, . . . , N; j = 1, . . . , T}. Its elements {hij} are random variables following the normal distribution N(0, 1) and hence mutually independent. The correlation matrix is then defined by
\[
C = \frac{1}{T} H H^{T}. \qquad (1)
\]
In the limit N, T → ∞ with Q ≡ T/N fixed, the probability density function ρ(λ) of the eigenvalue λ of the random correlation matrix C is analytically obtained as
\[
\rho(\lambda) = \frac{Q}{2\pi}\,\frac{\sqrt{(\lambda_{+}-\lambda)(\lambda-\lambda_{-})}}{\lambda}, \qquad (2)
\]
\[
\lambda_{\pm} = 1 + 1/Q \pm 2\sqrt{1/Q}, \qquad (3)
\]
for λ ∈ [λ−, λ+], where λ− and λ+ are the minimum and maximum eigenvalues of C, respectively. The analytic result (2) is valid for N, T → ∞. However, for the case where the matrix size is finite, the maximal eigenvalue has broadening. We numerically generate random correlation matrices with Q = 2 for N = 20, 100, 400 and then calculate their maximal eigenvalues. For each N, the number of samples is 10000. Figure 1 shows the shape of the distribution of the maximal eigenvalue. With increasing matrix size, the average of the maximal eigenvalue certainly approaches λ+ and the broadening of the distribution becomes narrower. To incorporate these finite-size effects into the principal component analysis, we should precisely analyze features of the maximal eigenvalue distribution, namely, the dependence of the average and standard deviation of the maximal eigenvalue on Q and N. Precisely speaking, the distribution is slightly asymmetric around its center with a fatter right tail. Here, we should recall the analytical results [10] of the average and standard deviation for the maximal eigenvalue, which are very helpful for our following numerical analysis. The distribution of the maximal eigenvalue is known to follow
Fig. 1 Distribution of the maximal eigenvalue at Q = 2 for various values of N.
asymptotically the Tracy-Widom distribution of order one with the center constant μ and scaling constant σ, where
\[
\mu = \frac{1}{T}\left(\sqrt{T-1}+\sqrt{N}\right)^{2}, \qquad (4)
\]
\[
\sigma = \frac{1}{T}\left(\sqrt{T-1}+\sqrt{N}\right)\left(\frac{1}{\sqrt{T-1}}+\frac{1}{\sqrt{N}}\right)^{1/3}. \qquad (5)
\]
Expanding the right-hand side of Eq. (4) with respect to T and N with fixed Q, we obtain
\[
\mu = \lambda_{+} - \frac{\lambda_{+}}{NQ}. \qquad (6)
\]
Also we rewrite σ into a convenient form,
\[
\sigma = N^{-2/3}\, Q^{-7/6}\left(1+\sqrt{Q}\right)^{4/3}. \qquad (7)
\]
And then the average lm and standard deviation σm of the maximal eigenvalue are given as
\[
l_{m} = \mu - 1.21\,\sigma, \qquad (8)
\]
\[
\sigma_{m} = 1.27\,\sigma, \qquad (9)
\]
where −1.21 and 1.27 are numerical results [9] for the average and standard deviation of the Tracy-Widom distribution. Equations (8) and (9) thus constitute a leading finite-size correction to the RMT prediction of the maximal eigenvalue.
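The asymptotic formulas (4), (5), (8) and (9) can be checked against a direct Monte Carlo simulation of Eq. (1); the following minimal Python sketch illustrates such a comparison, with the number of samples reduced from the 10000 used in the paper.

```python
import numpy as np

def lambda_plus(Q):
    return 1.0 + 1.0 / Q + 2.0 * np.sqrt(1.0 / Q)            # Eq. (3)

def tw_mean_std(N, T):
    """Asymptotic average l_m and standard deviation sigma_m, Eqs. (4), (5), (8), (9)."""
    mu = (np.sqrt(T - 1) + np.sqrt(N)) ** 2 / T               # Eq. (4)
    sigma = (np.sqrt(T - 1) + np.sqrt(N)) / T * \
            (1 / np.sqrt(T - 1) + 1 / np.sqrt(N)) ** (1 / 3)  # Eq. (5)
    return mu - 1.21 * sigma, 1.27 * sigma                    # Eqs. (8), (9)

def simulate_max_eigenvalues(N, Q, samples=1000, seed=0):
    """Maximal eigenvalues of C = H H^T / T for Gaussian H, Eq. (1)."""
    rng = np.random.default_rng(seed)
    T = N * Q
    out = np.empty(samples)
    for s in range(samples):
        H = rng.standard_normal((N, T))
        C = H @ H.T / T
        out[s] = np.linalg.eigvalsh(C)[-1]   # eigvalsh returns ascending eigenvalues
    return out

N, Q = 100, 2
lam = simulate_max_eigenvalues(N, Q)
lm, sm = tw_mean_std(N, N * Q)
print(f"lambda_+           = {lambda_plus(Q):.4f}")
print(f"simulated mean/std = {lam.mean():.4f} / {lam.std():.4f}")
print(f"asymptotic l_m/s_m = {lm:.4f} / {sm:.4f}")
```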
3 Numerical Results 3.1 Average of the Maximal Eigenvalue In the following, λm denotes the statistical average of the maximal eigenvalue obtained by numerical calculation for N and Q. The number of samples is 10000, for which the statistical error is negligible in our fitting arguments. We define f(N, Q) as
\[
f(N, Q) \equiv \lambda_{+} - \lambda_{m}. \qquad (10)
\]
We calculate f(N, Q) for N = 20, 30, 50, 70, 100, 200, 400 and Q = 1, 2, 3, 4, 5, 7, 10, 20, 30, which are shown by cross symbols in Fig. 2. Then, inspired by Eq. (6), we assume that f(N, Q) takes the empirical form
\[
f_{e}(N, Q) = a N^{b} Q^{c}. \qquad (11)
\]
The parameters a, b and c are determined by the least squares fitting to f(N, Q). Then we obtain
\[
f_{e}(N, Q) = 5.67\, N^{-0.74}\, Q^{-0.68}, \qquad (12)
\]
which is compared with the numerical results in Fig. 2. We remark that the finite-size scaling law has nontrivial fractional exponents with respect to N and Q. The exponent of Q is very close to −2/3.
Fig. 2 Functional behavior of the empirical formula fe (N, Q) together with f (N, Q).
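Since log f_e = log a + b log N + c log Q is linear in the unknowns, the empirical form (11) can be fitted by ordinary least squares in log space. The sketch below illustrates this fitting step; for brevity it fits a synthetic grid generated from the reported result (12) rather than the paper's Monte Carlo averages, which would take the place of f_vals in practice.

```python
import numpy as np

def fit_power_law(N_vals, Q_vals, f_vals):
    """Least-squares fit of f_e(N, Q) = a * N**b * Q**c in log space, Eq. (11)."""
    X = np.column_stack([np.ones(len(f_vals)), np.log(N_vals), np.log(Q_vals)])
    coef, *_ = np.linalg.lstsq(X, np.log(f_vals), rcond=None)
    return np.exp(coef[0]), coef[1], coef[2]   # a, b, c

# In the paper, f(N, Q) = lambda_+ - <lambda_m> comes from the simulation;
# here a synthetic grid built from the reported fit (12) stands in for it.
N_grid, Q_grid = np.meshgrid([20, 30, 50, 70, 100, 200, 400],
                             [1, 2, 3, 4, 5, 7, 10, 20, 30])
N_vals = N_grid.ravel().astype(float)
Q_vals = Q_grid.ravel().astype(float)
f_vals = 5.67 * N_vals ** -0.74 * Q_vals ** -0.68
print(fit_power_law(N_vals, Q_vals, f_vals))   # recovers approximately (5.67, -0.74, -0.68)
```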
Figure 3 depicts how accurately λ+ − lm and fe (N, Q) can reproduce f (N, Q) at N = 20 and 400. Deviation of λ+ − lm from f (N, Q) decreases with increasing N. But the convergence of λ+ − lm to f (N, Q) is very slow. To evaluate accuracy of fe (N, Q) quantitatively, we calculate the absolute and relative errors defined as
\[
\Delta \equiv \left| f_{e}(N, Q) - f(N, Q) \right|, \qquad (13)
\]
Fig. 3 The results of least squares fitting of f e and λ+ − lm . Left panel shows the results at N = 20 and right panel, those at N = 400.
\[
\delta \equiv \frac{\left| f_{e}(N, Q) - f(N, Q) \right|}{\lambda_{+}}. \qquad (14)
\]
For comparison, the same evaluation was done with λ+ − lm replacing fe in Eqs. (13) and (14). The comparison results are summarized in Table 1. We thus see that fe (N, Q) reproduces f (N, Q) quite accurately; the relative errors are well within 1% even for small N. Table 1 Comparison of accuracy of f e and λ+ − lm . Δ and Δ max refer to the average and maximum of Δ . The same notations are used for δ .
        f_e(N, Q)                      λ+ − lm
Δ       3.4 × 10⁻³                     3.0 × 10⁻²
Δmax    2.5 × 10⁻² (Q = 1, N = 50)     1.2 × 10⁻² (Q = 1, N = 20)
δ       1.2 × 10⁻³                     1.3 × 10⁻²
δmax    6.4 × 10⁻³ (Q = 1, N = 50)     3.4 × 10⁻² (Q = 2, N = 20)
3.2 Standard Deviation of the Maximal Eigenvalue The standard deviation of the maximal eigenvalue is calculated for the same combinations of N and Q as its average. We refer to the numerical results as g(N, Q), which are shown by cross symbols in Fig. 4. To obtain an empirical formula ge(N, Q) for the numerical results, we preserve the functional form of the scaling constant in Q as given by Eq. (7):
\[
g_{e}(N, Q) = A(N)\, Q^{-7/6}\left(1+\sqrt{Q}\right)^{4/3}. \qquad (15)
\]
Fig. 4 Functional behavior of g(N, Q) and ge (N, Q).
The prefactor A is determined by the least squares fit to the original data for each N. Figure 5 shows ge (N, Q) so determined along with g(N, Q) and σm at N = 20 and 400. The convergence of σm is much faster than that of λ+ − lm . For N = 400, σm and ge (N, Q) reproduce g(N, Q) quite well. For N = 20, however, we observe σm deviates appreciably from g(N, Q). On the other hand, the empirical formula can rectify the deficiency of σm for such a small value of N.
Fig. 5 Accuracy of the least squares fitting for ge (N, Q) as a function of Q with given N. The Left panel shows the results at N = 20 and the right panel, those at N = 400.
We then determine the N-dependence of A. Recalling Eq. (7) again, we fit the fitted results for A to the form given by
\[
\log A = \log 1.27 - \frac{N}{N - 3.10}\,\log N^{2/3}. \qquad (16)
\]
Equation (15) with this formula recovers Eq. (9) in the limit of large N. Figure 6 shows the N-dependence of A. Equation (15) together with Eq. (16) yields ge (N, Q) in the entire (N,Q) plane, which is presented as lines in Fig. 4.
Fig. 6 The N-dependence of A.
To check accuracy of ge (N, Q) quantitatively, we evaluate Δ and δ for ge (N, Q). Note that the denominator in δ is g(N, Q), instead of λ+ in Eq. (14). The results are summarized in Table 2. Table 2 Comparison of accuracy of ge and σm . The same notations for Δ and δ as in Table 1 are used.
        g_e(N, Q)                       σm
Δ       2.9 × 10⁻³                      1.4 × 10⁻²
Δmax    1.3 × 10⁻² (Q = 1, N = 20)      1.5 × 10⁻¹ (Q = 1, N = 20)
δ       4.2 × 10⁻²                      1.4 × 10⁻¹
δmax    1.7 × 10⁻¹ (Q = 30, N = 20)     5.1 × 10⁻¹ (Q = 1, N = 20)
4 Criterion for Principal Components Taking Account of Finite-Size Effect On the basis of the results in the previous sections, we propose a new criterion for principal component analysis. Let us write the eigenvalue of a correlation matrix obtained from finite-size data as λ. So far, the criterion for a principal component has been given by λ > λ+. However, the eigenvalues of the corresponding random correlation matrix have a certain distribution in the region larger than λ+, which is quantified by fe(N, Q) and ge(N, Q). Adopting the confidence level of 3σ (99.7%), we propose the following new criterion:
\[
\lambda > \lambda_{\mathrm{new}}(N, Q) \equiv \lambda_{+} - f_{e}(N, Q) + 3\, g_{e}(N, Q). \qquad (17)
\]
We infer that the eigenvectors associated with eigenvalues satisfying this criterion contain statistically meaningful information on the correlations. For reference, we also introduce an alternative criterion with the same confidence level as Eq. (17) using the asymptotic formulas (8) and (9):
\[
\lambda > \lambda_{+} \equiv l_{m} + 3\,\sigma_{m}. \qquad (18)
\]
Table 3 compares λnew and the threshold of Eq. (18) with λ+ for various combinations of values of N and Q. The critical value λnew for the principal component analysis is appreciably larger than λ+ over the parametric range covered in Table 3. If we adopted the criterion based on λ+, we would underestimate the number of principal components significantly for such small N as N ≲ 100 and small Q as Q ∼ 1.
Table 3 Comparison of the criteria for the principal component analysis
Q    N     λ+      lm + 3σm (Eq. (18))   λnew (Eq. (17))
1    20    4       4.80                  4.30
1    30    4       4.61                  4.30
1    50    4       4.44                  4.27
1    100   4       4.28                  4.20
2    20    2.91    3.38                  3.05
2    30    2.91    3.27                  3.06
2    50    2.91    3.17                  3.05
2    100   2.91    3.08                  3.02
3    20    2.48    2.84                  2.58
3    30    2.48    2.76                  2.59
3    50    2.48    2.68                  2.57
3    100   2.48    2.61                  2.57
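A compact implementation of the finite-size criterion (17), assembled from the empirical formulas (12), (15) and (16) as reconstructed above, is sketched below. Small differences from the λnew column of Table 3, on the order of 0.01 to 0.02, can arise from rounding of the fitted coefficients.

```python
import numpy as np

def lambda_plus(Q):
    return 1.0 + 1.0 / Q + 2.0 * np.sqrt(1.0 / Q)                        # Eq. (3)

def f_e(N, Q):
    return 5.67 * N ** -0.74 * Q ** -0.68                                 # Eq. (12)

def g_e(N, Q):
    log_A = np.log(1.27) - N / (N - 3.10) * np.log(N ** (2.0 / 3.0))      # Eq. (16)
    return np.exp(log_A) * Q ** (-7.0 / 6.0) * (1 + np.sqrt(Q)) ** (4.0 / 3.0)  # Eq. (15)

def lambda_new(N, Q):
    return lambda_plus(Q) - f_e(N, Q) + 3.0 * g_e(N, Q)                   # Eq. (17)

for Q in (1, 2, 3):
    for N in (20, 30, 50, 100):
        print(f"Q={Q}, N={N}: lambda_+={lambda_plus(Q):.2f}, lambda_new={lambda_new(N, Q):.2f}")
```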
5 Summary We numerically studied the distribution of eigenvalues of random correlation matrices for various matrix sizes. In particular, we investigated the finite-size dependence of the average and standard deviation of the maximal eigenvalue distribution. The main results are summarized as follows. • The finite-size correction fe(N, Q) to the average, given by Eq. (12), has nontrivial power-law behavior in N and Q and reproduces the corresponding results obtained numerically quite well. • The standard deviation ge(N, Q), modeled by Eq. (15) with Eq. (16), is in good agreement with the original numerical results even for N as small as 20. As an application of these results, we finally proposed a new criterion to single out genuine correlations in the principal component analysis. This new criterion thus takes accurate account of the finite-size correction to the RMT prediction. The new criterion is especially useful when both N and Q are small (N ≲ 100 and Q ∼ 1). On the other hand, the similar criterion based on the asymptotic formulas might dismiss an appreciable number of statistically meaningful principal components under the same conditions for N and Q.
Acknowledgements. This work was partially supported by the Program for Promoting Methodological Innovation in Humanities and Social Sciences by Cross-Disciplinary Fusing of the Japan Society for the Promotion of Science and by the Ministry of Education, Science, Sports, and Culture, Grants-in-Aid for Scientific Research (B), Grant No. 22300080 (2010-12).
References 1. Laloux, L., Cizeau, P., Bouchaud, J.P., Potters, M.: Phys. Rev. Lett. 83, 1467 (1999) 2. Santhanam, M.S., Patra, P.K.: Phys. Rev. E 64, 016102 (2002) 3. Plerou, V., Gopikrishnan, P., Rosenow, B., Amaral, L.A.N., Guhr, T., Stanley, H.E.: Phys. Rev. E 65, 066126 (2002) 4. Utsugi, A., Ino, K., Oshikawa, M.: Phys. Rev. E 70, 026110 (2004) 5. Kim, D.H., Jeong, H.: Phys. Rev. E 72, 046133 (2005) 6. Kulkarni, V., Deo, N.: Eur. Phys. J. B 60, 101 (2007) 7. Pan, R.K., Sinha, S.: Phys. Rev. E 76, 046116 (2007) 8. Tracy, C.A., Widom, H.: Comm. Math. Phys. 177, 727–754 (1996) 9. Tracy, C.A., Widom, H.: Calogero-Moser-Sutherland Models, ed. by van Diejen, J., Vinet, L., pp. 461–472. Springer, New York (2000) 10. Johnstone, I.M.: The Annals of Statistics 29, 295–327 (2001)
Predicting of the Short Term Wind Speed by Using a Real Valued Genetic Algorithm Based Least Squared Support Vector Machine Chi-Yo Huang, Bo-Yu Chiang, Shih-Yu Chang, Gwo-Hshiung Tzeng, and Chun-Chieh Tseng *
Abstract. The possible future energy shortage has become a very serious problem in the world. An alternative energy which can replace the limited reservation of fossil fuels will be very helpful. The wind has emerged as one of the fastest growing and most important alternative energy sources during the past decade. However, the most serious problem being faced by human beings in wind applications is the dependence on the volatility of the wind. To apply the wind power efficiently, predictions of the wind speed are very important. Thus, this paper aims to precisely predict the short term regional wind speed by using a real valued genetic algorithm Chi-Yo Huang Department of Industrial Education, National Taiwan Normal University No. 162, Hoping East Road I, Taipei 106, Taiwan e-mail:
[email protected] *
Bo-Yu Chiang Institute of Communications Engineering, National Tsing Hua University No. 101, Sec. 2 , Guangfu Road, Hsinchu 300, Taiwan e-mail:
[email protected] Shih-Yu Chang Institute of Communications Engineering, National Tsing Hua University No. 101, Sec. 2, Guangfu Road, Hsinchu 300, Taiwan e-mail:
[email protected] Gwo-Hshiung Tzeng Department of Business and Entrepreneurial Administration, Kainan University No. 1, Kainan Road, Luchu, Taoyuan County 338, Taiwan Gwo-Hshiung Tzeng Institute of Management of Technology, National Chiao Tung University Ta-Hsuch Road, Hsinchu 300, Taiwan e-mail:
[email protected] Chun-Chieh Tseng Nan-Pu Thermal Power Plant, Taiwan Power Company No. 5, Chenggong 2nd Rd., Qianzhen Dist., Kaohsiung City 806, Taiwan e-mail:
[email protected] J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 567–575. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
(RGA) based least squared support vector machine (LS-SVM). A dataset including the time, temperature, humidity, and the average regional wind speed, measured on a randomly selected date at a wind farm located in Penghu, Taiwan, was selected for verifying the forecast efficiency of the proposed RGA based LS-SVM. In this empirical study, the prediction errors of the wind turbine speed are very limited. In the future, the proposed forecast mechanism can further be applied to wind forecast problems based on various time spans. Keywords: wind power, wind speed forecasting, short term wind prediction, support vector machines (SVMs), genetic algorithm (GA), least squared support vector machine (LS-SVM).
1 Introduction The unbalanced supply of fossil fuels during the past years has aroused the human being’s anxiety of possible shortage or depletion of fossil fuels in the near future. Thus, people strived toward developing alternative energy sources like wind energy, tidal wave energy, solar energy, etc. The wind power is the fastest growing renewable energy (Mathew 2006) and has played a daily significant role in replacing the traditional fossil fuels. According to the statistics of the Global Wind Energy Council (GWEC), the global installed capacity increased from 23.9 gigawatts (GW) at the end of 2001 to 194.4 GW at the end of 2010 (Global Wind Energy Council 2011), at the compound annual growth rate of 23.3%. To fully benefit from a large fraction of wind energy in an electrical grid, it is therefore necessary to know in advance the electricity production generated by the wind (Landberg 1999). The prediction of wind power, along with load forecasting, permits scheduling the connection or disconnection of wind turbine or conventional generators, thus achieving low spinning reserve and optimal operating cost (Damousis et al. 2004). In order to achieve the highest possible prediction accuracy, the prediction methods should consider appropriate parameters and data that may indicate future trends (Mabel and Fernandez 2008). However, one of the largest problems of wind power, as compared to conventionally generated electricity, is its dependence on the volatility of the wind (Giebel et al. 2003). As observed by Sfetsos (2000), wind is often considered as one of the most difficult meteorological parameters to forecast because of the complex interactions between large scale forcing mechanisms such as the pressure and the temperature differences, the rotation of the earth, and local characteristics of the surface. The short-term prediction is a subclass of the wind power time prediction (in opposition to the wind power spatial prediction) with the time scales concerning short-term prediction to be in the order of some days (for the forecast horizon) and from minutes to hours (for the time-step). According to Costa et al. (2008), the short-term prediction of wind power aims to predict the wind farm output either directly or indirectly. The short-term prediction is mainly oriented to the spot (daily and intraday) market, system management and scheduling of some maintenance tasks (Costa et al. 2008).
To precisely predict the short term wind speed based on a data set consisting of the time, temperature, humidity, and regional average wind speed, a real valued genetic algorithm (RGA) based least squared support vector machine (LS-SVM) is proposed. The genetic algorithm (GA) has been widely and successfully applied to various optimization problems (Goldberg 1989). However, for problems solved by using the GA, the binary coding of the data always occupies the computer memory even though only a few bits are actually involved in the crossover and mutation operations (Wu et al. 2007). To overcome this inefficient occupation of the computer memory when using the GA, the real valued genetic algorithm (RGA) was proposed by Huang and Huang (1997). In contrast to the GA, the RGA uses a real value as a parameter of the chromosome in populations without performing the coding and decoding process before calculating the fitness values of individuals. Support vector machines (SVMs) are state-of-the-art tools for linear and nonlinear input–output knowledge discoveries (Vapnik 1998). SVMs were first devised for binary classification problems, and they were later extended for regression estimation problems (Vapnik 1998). The least squares support vector machine (LS-SVM) is a least squared version of the SVM. In this version one finds the solution by solving a set of linear equations instead of the convex quadratic programming (QP) of classical SVMs. The LS-SVM classifier was proposed by Suykens and Vandewalle (1999); it reduces computing complexity and increases the solving speed while still maintaining high accuracy. An empirical study based on the real data set (Wu 2008) consisting of the time, temperature, humidity, and regional average wind speed measured at a wind farm located in Penghu, Taiwan will be provided for verifying the RGA based LS-SVM forecast mechanism. Based on the empirical study results, the wind speed can be precisely predicted. This research is organized as follows. The related literature regarding wind power forecasting and renewable energy will be reviewed in Section 2. The RGA based LS-SVM forecast mechanism will be introduced in Section 3. A prediction of the wind speed by the RGA based LS-SVM forecast mechanism will be presented in Section 4. Discussions of the forecast results as well as future research possibilities will be presented in Section 5. Finally, the whole article will be concluded in Section 6.
2 Literature Review The advantages of the renewable energy (e.g. reduced reliance on imported supplies, reduced emissions of greenhouse and other polluting gases) have led countries around the world to provide support mechanisms for expanding renewable electricity generation capacity (Muñoz et al. 2007). Among the renewable energy sources, the wind is the fastest growing one (Mathew 2006) today and has played a daily significant role. The wind power generation depends on the wind speed while the wind speed can easily be influenced by obstacles and the terrain (Ma et al. 2009). A good wind power prediction technique can help develop well-functional hour-ahead or day-ahead markets while the market mechanism can be more appropriate to
weather-driven resources (Wu and Hong 2007). Many methods have been developed to increase the wind speed prediction accuracy (Ma et al. 2009). The prediction methods can be divided into two categories. The first category of prediction methods introduces a lot of physical considerations to reach the best prediction results. The second category of prediction methods introduces the statistical method, e.g. the ARMA model, which aims at finding the statistical relationship of the measured times series data (Marciukaitis et al. 2008). Physical method has advantages in the long-term predictions while the statistical method does well in the short-term predictions (Ma et al. 2009). The time-scale classification of wind forecasting methods is vague and can be separated as follows according to the work by Soman et al. (2010): (1) very short-term predictions from few seconds to 30 minutes ahead, (2) short-term predictions from 30 minutes to 6 hours ahead, (3) medium-term predictions from 6 hours to 1 day ahead, and (4) long-term predictions from 1 day to 1 week ahead. The short-term wind power prediction is an extremely important field of research for the energy sector, as the system operators must handle an important amount of fluctuating power from the increasing installed wind power capacity (Catalão 2011). According to the summarization by Catalão et al (2011), new methods including data mining, artificial neural networks (NN), fuzzy logic, evolutionary algorithms, and some hybrid methods have emerged as modern approaches for short term wind predictions while the artificial-based models outperformed others.
3 Analytic Framework for the GA Based LS-SVM Method In this section, a GA based LS-SVM model will be presented. First, the optimal parameters in the LS-SVM will be determined by using the GA. Then, the data set will be predicted by introducing the optimal parameters into the SVM. The details of the GA and the LS-SVM will be presented in the following subsections. To precisely establish a GA-based feature selection and parameter optimization system, the following main steps (as shown in Fig. 1) must be performed.
Fig. 1 The flow chart of the GA based LS-SVM
3.1 Genetic Algorithm The RGA is introduced for resolving the optimization problems by coding all of the corresponding parameters in a chromosome directly. The two parameters, c and σ, of the LS-SVM will be coded directly to form the chromosome in the RGA. The
chromosome x is represented as x = {p1, p2}, where p1 and p2 denote the regularization parameter c and sigma σ (the parameter of the kernel function in the LS-SVM), respectively. A fitness function for assessing the performance of each chromosome must be designed before starting to search for the optimal values of the SVM parameters. In this study, the mean absolute percentage error (MAPE) will be used for measuring the fitness. The MAPE is defined as
\[
\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{a_i - f_i}{a_i}\right| \times 100\%,
\]
where ai and fi
represent the actual and forecast values, and n is the number of forecasting periods. The genetic operators in the GA include selection, crossover and mutation. The offspring of the existing population will be generated by using these operators. There are two well-known selection methods: the roulette wheel method and the tournament method. Users can determine which method is to be adopted in the simulation. After the selection, a chromosome survives to the next generation. Then, the chromosome will be placed in a mating pool for crossover and mutation operations. Once a pair of chromosomes is selected for the crossover operation, one or more randomly selected positions will be assigned to the to-be-crossed chromosomes. The newly crossed chromosomes are then combined with the rest of the chromosomes to generate a new population. In this research, the method proposed by Adewuya (1996) is introduced to prevent the overload problem after crossover when a GA with real-valued chromosomes is applied. Let x1old = {x11, x12, …, x1n} and x2old = {x21, x22, …, x2n}. Move closer: x1new = x1old + σ(x1old − x2old), x2new = x2old + σ(x1old − x2old). Move away: x1new = x1old + σ(x2old − x1old), x2new = x2old + σ(x2old − x1old). Here, x1old and x2old represent the pair of populations before the crossover operation, while x1new and x2new represent the pair of new populations after the crossover operation. The mutation operation follows the crossover operation and determines whether a chromosome should be mutated in the next generation. In this study, the uniform mutation method is applied and designed in the presented model. Consequently, researchers can select the method of mutation in the GA-SVM best suited to their problems of interest. Uniform mutation can be represented as follows: xold = {x1, x2, …, xn}, xknew = lbk + r × (ubk − lbk), xnew = {x1, x2, …, xknew, …, xn}, where n denotes the number of parameters, r represents a random number in the range (0, 1), and k is the position of the mutation. lb and ub are the lower and upper bounds on the parameters, respectively; lbk and ubk denote the lower and upper bounds at location k. xold represents the population before the mutation operation; xnew represents the new population following the mutation operation.
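The following minimal Python sketch illustrates the RGA loop described above (tournament selection, a real-valued move-closer/move-away crossover, and uniform mutation) for tuning (c, σ). The fitness function is a stand-in that would be replaced by the MAPE of an LS-SVM trained with the candidate parameters, and the search ranges, step size and population settings are illustrative assumptions rather than the authors' actual configuration.

```python
import random

LB, UB = [0.1, 0.01], [1000.0, 100.0]      # assumed search ranges for (c, sigma)

def fitness(ch):
    # Placeholder: in the actual system this would be the MAPE of an LS-SVM
    # trained with c = ch[0], sigma = ch[1] on the training data (lower is better).
    c, sigma = ch
    return abs(c - 100.0) / 100.0 + abs(sigma - 5.0) / 5.0

def tournament(pop):
    a, b = random.sample(pop, 2)
    return min(a, b, key=fitness)

def crossover(x1, x2, step=0.3):
    # Real-valued crossover following the move-closer / move-away pattern above:
    # the same scaled difference vector is added to both parents.
    d = [(x2[i] - x1[i]) if random.random() < 0.5 else (x1[i] - x2[i]) for i in range(len(x1))]
    c1 = [min(max(x1[i] + step * d[i], LB[i]), UB[i]) for i in range(len(x1))]
    c2 = [min(max(x2[i] + step * d[i], LB[i]), UB[i]) for i in range(len(x2))]
    return c1, c2

def mutate(ch, rate=0.1):
    ch = ch[:]
    for k in range(len(ch)):
        if random.random() < rate:          # uniform mutation: x_k = lb_k + r*(ub_k - lb_k)
            ch[k] = LB[k] + random.random() * (UB[k] - LB[k])
    return ch

pop = [[random.uniform(LB[i], UB[i]) for i in range(2)] for _ in range(20)]
for _ in range(50):
    nxt = []
    while len(nxt) < len(pop):
        c1, c2 = crossover(tournament(pop), tournament(pop))
        nxt += [mutate(c1), mutate(c2)]
    pop = nxt
best = min(pop, key=fitness)
print("best (c, sigma):", best, "fitness:", fitness(best))
```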
3.2 SVM The SVM is a statistical learning theory based on the machine learning algorithm presented by Vapnik (2000). The SVM uses the linear model to implement nonlinear class boundaries through some nonlinear mapping of the input vector x into the high-dimensional feature space. A linear model being constructed in the new space can represent a nonlinear decision boundary in the original space. In the new space, an optimal separating hyperplane is constructed. Thus, the SVM is known as the algorithm that finds a special kind of linear model, the maximum margin hyperplane. The maximum margin hyperplane gives the maximum separation between the decision classes. The training data sets that are closest to the maximum margin hyperplane are called support vectors. All other training data sets are irrelative for defining the binary class boundaries. For the linear separable case, a hyperplane separating the binary decision classes in the three-attribute case can be represented as y = w0 + w1x1 + w2 x2 + w3 x3 , where
y is the outcome, xi are the attribute values, and there are four weights wi to be learned by the learning algorithm. The weights wi are parameters that determine the hyperplane. The maximum margin hyperplane can be represented as
\[
y = b + \sum_{i} \alpha_i\, y_i\, x(i) \cdot x,
\]
where yi is the class value of the training data set x(i) and ⋅ represents the dot product. The vector x represents a test data set and the vectors x(i) are the support vectors. In this equation, b and αi are parameters that determine the hyperplane. From the implementation point of view, finding the support vectors and determining the parameters b and αi are equivalent to solving a linearly constrained quadratic programming problem. As mentioned above, the SVM constructs a linear model to implement nonlinear class boundaries through transforming the inputs into the high-dimensional feature space. For the nonlinear separating case, a high-dimensional version of the equation is simply represented as
\[
y = b + \sum_{i} \alpha_i\, y_i\, K(x(i), x).
\]
The function K(x(i), x) is defined as the kernel function. Any function that meets Mercer’s condition can be used as the kernel function, such as the polynomial, sigmoid, and Gaussian radial basis function (RBF) kernels used in SVMs. In this work, the RBF kernel is defined as
\[
K(x_i, x_j) = \exp\!\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}}\right),
\]
where σ² denotes the variance of the Gaussian kernel. In addition, for the separable case there is a lower bound 0 on the coefficients αi; for the non-separable case, the SVM can be generalized by placing an upper bound c on the coefficients αi. Therefore, the c and σ of an SVM model are important to the accuracy of prediction. The learning algorithm for a non-linear classifier SVM follows the design of an optimal separating hyperplane in a feature space. The procedure is the same as the one associated with hard and soft margin classifier SVMs in x-space. The
dual Lagrangian in z-space is
\[
L_d(\alpha_i) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} y_i\, y_j\, \alpha_i\, \alpha_j\, z_i^{T} z_j
\]
and using the chosen kernels. The Lagrangian is maximized as
\[
\max\ L_d(\alpha_i) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} y_i\, y_j\, \alpha_i\, \alpha_j\, K(x_i, x_j)
\]
\[
\text{s.t.}\quad \alpha_i \ge 0,\ i = 1, \ldots, l, \qquad \sum_{i=1}^{l} \alpha_i y_i = 0.
\]
Note that the constraints must be revised for use in a non-linear soft margin classifier SVM. The only difference between these constraints and those of the separable non-linear classifier is in the upper bound c on the Lagrange multipliers αi. Consequently, the constraints of the optimization problem become:
\[
\text{s.t.}\quad c \ge \alpha_i \ge 0,\ i = 1, \ldots, l, \qquad \sum_{i=1}^{l} \alpha_i y_i = 0.
\]
4 Short Term Wind Speed Predictions by Using GASVM In this section, the GA based LS-SVM model is introduced for predicting the short term wind speeds. An empirical study based on the real hourly data from the work by Wu (2008), which were measured at a wind farm located in Penghu, Taiwan, is used to verify the feasibility of the GA based LS-SVM in the real world short term wind speed prediction problem. Some factors like temperature, humidity, etc., will affect the wind speed. These factors were the input data sets, while the outputs were predictions of the wind speed in the simulation. The prediction results as well as the errors versus the original data are demonstrated in Table 1 and Fig. 2.
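To illustrate how such an hourly data set can be fed to the model, the sketch below trains an RBF-kernel LS-SVM regressor by solving its linear system directly. The wind speeds are taken from Table 1 for a few hours, but the temperature and humidity values, the chosen (c, σ) and the feature layout are invented placeholders; in the actual study the RGA of Section 3 tunes (c, σ) by minimizing the MAPE.

```python
import numpy as np

def rbf_kernel(X1, X2, sigma):
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_train(X, y, c, sigma):
    """LS-SVM regression: solve [[0, 1^T], [1, K + I/c]] [b; alpha] = [0; y]."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:], A[1:, 0] = 1.0, 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / c
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]                          # b, alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

def mape(actual, forecast):
    return np.mean(np.abs((actual - forecast) / actual)) * 100.0

# Hypothetical hourly samples: [hour, temperature (C), humidity (%)] -> wind speed (m/s).
# Wind speeds follow Table 1 for hours 0, 6, 12, 18; temperature and humidity are invented.
X = np.array([[0, 18.0, 80], [6, 19.5, 75], [12, 24.0, 60], [18, 21.0, 70]], float)
y = np.array([17.6, 14.5, 9.7, 6.6])
c, sigma = 100.0, 10.0                              # would be tuned by the RGA in practice
b, alpha = lssvm_train(X, y, c, sigma)
pred = lssvm_predict(X, b, alpha, X, sigma)
print("MAPE on the toy sample: %.2f%%" % mape(y, pred))
```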
5 Discussion According to the forecast results demonstrated in Table 1 and Fig. 2, the short term wind speed can be predicted precisely, with an average error of around 2.27%, by using the GA based LS-SVM. The errors at PM22:00 and PM23:00 are large since the wind speed has very large variation within a day. Thus, to solve the problem of imprecise predictions, the wind speed forecasting should be applied to a specific period of time like daylight, night, etc. Wind speed predictions using an SVM model with fixed parameters will enlarge the forecast errors. Using more factors which affect the wind speed as input data sets can further reduce the
prediction errors. Thus, the accuracy of the forecasts can further be enhanced if some factors influencing the wind speed can be added as the inputs of the GA based LS-SVM. Finally, since the prediction errors are low, the proposed forecast mechanism can be used in predicting the short term wind speeds in any wind farm.
Table 1 The prediction results and errors
Time (Hour)   Original Data (m/s)   Prediction Data (m/s)   Error (%)
0             17.60                 17.57                   0.17
1             15.90                 15.79                   0.70
2             15.50                 15.85                   2.26
3             16.00                 16.01                   0.10
4             16.00                 15.70                   1.88
5             15.10                 15.16                   0.40
6             14.50                 14.48                   0.14
7             13.30                 13.40                   0.75
8             12.60                 12.38                   1.75
9             12.00                 11.97                   0.25
10            10.50                 10.44                   0.57
11            9.50                  9.95                    4.74
12            9.70                  9.73                    0.31
13            9.10                  8.92                    1.98
14            9.00                  8.91                    1.00
15            8.90                  8.89                    0.11
16            8.30                  8.26                    0.48
17            7.20                  7.31                    1.53
18            6.60                  6.56                    0.61
19            6.10                  6.11                    0.16
20            6.00                  6.01                    0.17
21            5.10                  5.30                    3.92
22            4.80                  5.52                    15.00
23            5.00                  5.77                    15.40
Fig. 2 The original versus the prediction results
6 Conclusions The wind power is one of the fastest growing and most widely used alternative energy sources. Efficient wind forecasting methods are very helpful for wind energy integration into the electricity grid. In the past, various wind forecasting methods have been developed for short term wind prediction problems. In the future, introducing factors influencing the wind speed, including the temperature, humidity, as well as other seasonal factors, into the GA based LS-SVM will be very helpful for demonstrating the feasibility of the application of this method in the real world.
References Adewuya, A.A.: New Methods in Genetic Search with Real-Valued Chromosomes: Massachusetts Institute of Technology, Dept. of Mechanical Engineering (1996) Carolin Mabel, M., Fernandez, E.: Analysis of wind power generation and prediction using ANN: A case study. Renewable Energy 33(5), 986–992 (2008)
Catalão, J.P.S., Pousinho, H.M.I., Mendes, V.M.F.: Short-term wind power forecasting in Portugal by neural networks and wavelet transform. Renewable Energy 36(4), 1245–1251 (2011) Costa, A., Crespo, A., Navarro, J., Lizcano, G., Madsen, H., Feitosa, E.: A review on the young history of the wind power short-term prediction. Renewable and Sustainable Energy Reviews 12(6), 1725–1744 (2008) Damousis, I.G., Alexiadis, M.C., Theocharis, J.B., Dokopoulos, P.S.: A fuzzy model for wind speed prediction and power generation in wind parks using spatial correlation. IEEE Transactions on Energy Conversion 19(2), 352–361 (2004) Giebel, G., Landberg, L., Kariniotakis, G., Brownsword, R.: State-of-the-art on methods and software tools for short-term prediction of wind energy production. In: European Wind Energy Conference, Madrid (2003) Global Wind Energy Council. Global wind capacity increases by 22% in 2010 - Asia leads growth (2011) Goldberg, D.E.: Genetic algorithms in search, optimization, and machine learning. Addison-Wesley Pub. Co., Reading (1989) Huang, Y.-P., Huang, C.-H.: Real-valued genetic algorithms for fuzzy grey prediction system. Fuzzy Sets and Systems 87(3), 265–276 (1997), doi:10.1016/s0165-0114(96)00011-5 Landberg, L.: Short-term prediction of the power production from wind farms. Journal of Wind Engineering and Industrial Aerodynamics 80(1-2), 207–220 (1999) Lerner, J., Grundmeyer, M., Garvert, M.: The importance of wind forecasting. Renewable Energy Focus 10(2), 64–66 (2009) Lorenz, E., Hurka, J., Heinemann, D., Beyer, H.G.: Irradiance Forecasting for the Power Prediction of Grid-Connected Photovoltaic Systems. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2(1) (2009) Ma, L., Luan, S., Jiang, C., Liu, H., Zhang, Y.: A review on the forecasting of wind speed and generated power. Renewable and Sustainable Energy Reviews 13, 915–920 (2009) Marciukaitis, M., Katinas, V., Kavaliauskas, A.: Wind power usage and prediction prospects in Lithuania. Renewable and Sustainable Energy Reviews 12, 265–277 (2008) Mathew, S.: Wind energy: fundamentals, resource analysis and economics. Springer, Heidelberg (2006) Muñoz, M., Oschmann, V., David Tàbara, J.: Harmonization of renewable electricity feed-in laws in the European Union. Energy Policy 35(5), 3104–3114 (2007) Sfetsos, A.: A comparison of various forecasting techniques applied to mean hourly wind speed time series. Renewable Energy 21(1), 23–35 (2000) Soman, S.S., Zareipour, H., Malik, O., Mandal, P.: 2010. A review of wind power and wind speed forecasting methods with different time horizons. In: North American Power Symposium, NAPS (2010) Suykens, J.A.K., Vandewalle, J.: Least Squares Support Vector Machine Classifiers. Neural Processing Letters 9(3), 293–300 (1999) Vapnik, V.N.: Statistical learning theory. Wiley, Chichester (1998) Vapnik, V.N.: The nature of statistical learning theory. Springer, Heidelberg (2000) Wu, C.-H., Tzeng, G.-H., Goo, Y.-J., Fang, W.-C.: A real-valued genetic algorithm to optimize the parameters of support vector machine for predicting bankruptcy. Expert Systems with Applications 32(2), 397–408 (2007) Wu, M.T.: The Application of Artificial Neural Network to Wind Speed and Generation Forecasting of Wind Power System. Department of Electrical Engineering, National Kaohsiung University of Application Sciences, Kaohsiung (2008) Wu, Y.K., Hong, J.S.: A literature review of wind forecasting technology in the world. In: Power Tech. IEEE Lausanne (2007)
Selecting English Multiple-Choice Cloze Questions Based on Difficulty-Based Features Tomoko Kojiri, Yuki Watanabe, and Toyohide Watanabe
*
Abstract. English multiple-choice cloze questions require learners to have various kinds of grammatical and lexical knowledge. Since the knowledge of learners differs, it is difficult to provide appropriate questions suitable for learners’ understanding levels. This research determines features that affect the difficulty of questions and proposes a method for selecting questions according to these features for stepwise learning. In order to manage the relations among questions, a question network is introduced in which questions are structured based on differences in each feature. Questions are selected by following appropriate links according to the learners’ answers. By following this question network, learners are able to tackle questions from easier ones to more difficult ones according to their understanding levels.
1 Introduction
Multiple-choice cloze questions are often used in English learning. Such type of question is effective for checking the knowledge of grammar and lexicon and thus it is used in TOEIC or TOEFL. In addition, by tackling these questions repeatedly, the knowledge of English grammar and lexicon is able to be acquired. Only limited number of knowledge is included in one question, many questions need to be solved for the purpose of acquiring whole grammar and lexicon knowledge. However, it is difficult to select questions that are appropriate for individual learners’ understanding levels. If all knowledge in the question has been already acquired, a learner cannot acquire new knowledge from it. If all knowledge is new to a learner, it may be difficult for him to understand plural knowledge from one question. Appropriate questions for learners should contain some acquired knowledge and a few in-acquired one. Tomoko Kojiri Faculty of Engineering Science, Kansai University 3-3-35 Yamate-cho, Suita, Osaka, 564-8680, Japan e-mail:
[email protected] Yuki Watanabe · Toyohide Watanabe Graduate School of Information Science, Nagoya University Furo-cho, Chikusa-ku, Nagoya, 464-8603, Japan J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 577–587. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
Intelligent Tutoring System (ITS) which provides learning contents that fit for learners’ understanding levels has been developed for different learning domain. Suganuma et al. developed the system that estimates dynamically difficulties of exercises and learners’ understanding levels according to the learners’ answers, and provides exercises based on the estimated understanding levels [1]. This system assigned difficulties of exercises based on the learners’ answers. However, the reasons for the incorrect answer may be different among learners. The incorrect knowledge should be evaluated in detail. Since English multiple-choice cloze questions need various knowledge to solve and it is difficult to find out which knowledge is used to derive the correct answer, to specify correctly acquired/inacquired knowledge from learners’ answers may be difficult. We have constructed the knowledge network that arranges multiple-choice cloze questions according to the quantity of their grammatical knowledge [2]. This research assumed that the difficulties of questions become high according to the number of their grammatical knowledge. By solving questions along the knowledge network, English grammar could be learned from a simple question to a complicate one. However, this knowledge network considers only grammatical knowledge and is not able to cope with all types of multiple-choice cloze questions. Other features of questions are needed to characterize questions. This paper proposes features of multiple-choice cloze questions that affect to difficulties of questions (difficulty-based feature). Then, it proposes a method for providing questions based on the features for the stepwise learning. In order to represent the relations among questions, a question network is introduced that structures all questions based on difficulty levels for each difficulty-based feature. In the question network, questions that situate in the same levels for all difficulty-based features form one node, and nodes whose levels are next to each other are connected by links. By following this question network, learners are able to tackle questions from easier one to more difficult one according to their understanding levels. In our system, in order to estimate correctly the learners’ understanding levels in the ongoing learning process, learners are required to solve plural questions at one learning process. In addition, questions are selected not from one node but from several nodes that are similar to the learners’ understanding levels (solvable nodes). Questions in solvable nodes have larger possibilities to be solved by learners and they are determined after each learning process.
2 Approach
2.1 Difficulty-Based Features of English Multiple-Choice Cloze Question
English multiple-choice cloze questions require learners to understand the meaning of English sentences and grammatical structure, and to select word/s for a blank part that forms the correct sentence from choices. Figure 1 is an example of English multiple-choice cloze questions. A question consists of sentence, blank part, and choices. Choices consist of one correct choice and three distracters. Learners select one from choices for filling in the blank part.
Fig. 1 Example of multiple-choice cloze question: (1) question sentence “The company's advertisements always look ( ).”, (2) blank part, and (3) choices (correct choice and distracters): 1) beauties 2) beautiful 3) beautifully 4) beauty
There are various definitions and findings about the difficulty features of English questions. Kunichika et al. defined the difficulty features of English reading questions as the difficulty of understanding the original texts, of understanding the question sentences, and of understanding the answer sentences [3]. This paper follows this definition and defines three difficulty-based features of English multiple-choice questions as illustrated below. Understanding of the original texts for English reading questions corresponds to understanding of the sentence for multiple-choice questions. Understanding of the question sentences is regarded as understanding of the intention of a question, so understanding of the blank part corresponds to this feature. Understanding of the answer sentence is regarded as similar to understanding the differences among choices. 1) Difficulty of sentence: Especially for questions that ask about words, it is important to grasp the meaning of a sentence correctly. Readability is one of the features that prevent learners from understanding the meaning easily. Research on the readability of English sentences has insisted that the lengths of sentences and the difficulties of words affect readability [4]. Based on this result, the length of a sentence and the difficulty of words are defined as difficulty-based features of a sentence. 2) Difficulty of blank part: A blank part indicates the knowledge required to answer the question. In some questions, answers in the blank part can be estimated, most of which ask for grammatical knowledge such as word class. Therefore, the difficulty of the blank part depends on which word class is asked. 3) Difficulty of choices: There are various relations between distracters and a correct choice. One distracter may belong to the same word class as the correct choice and another one may have the same prefix as the correct choice. The difference between choices needs to be grasped in selecting the correct answer. We adopt the types of distracters defined in [5]. The types of distracters represent differences between the correct answer and the distracters. 12 types of distracters exist, which were derived by analyzing
existing questions statistically. A question becomes more difficult if distracters of similar types exist in it. Therefore, the number of distracter types in the choices is defined as a difficulty-based feature.
2.2 Question Network
In learning with multiple-choice cloze questions, it is desirable that learners acquire knowledge step by step according to their understanding levels. If the difficulty of a question is determined without considering the knowledge included in the question, to support learners to acquire knowledge may become difficult. In our research, levels of difficulty-based features are assigned to each question. Such difficulty-based features are more related to the knowledge used to solve the question, but it is still acquired from the features of questions. By determining the level of learners for each difficulty-based features and selecting questions, learners are able to solve questions that are appropriate for their levels.
Fig. 2 Question Network (one network is built for each word class of the blank part, e.g., noun, verb, adverb, pronoun; nodes of questions are arranged along the three difficulty-based features, from the easiest questions to the most difficult questions)
In order to provide questions along levels of difficulty-based features, questions need to be organized according to the levels. In this paper, a question network is introduced that structures questions along the levels of each feature. Nodes in the question network contain the questions of the same levels for all difficulty-based features. Nodes whose levels are next to each other are linked by directed links. Since the difficulty order cannot be defined uniquely for the word class of the blank part, links based on word class are not attached. Instead, question networks
are constructed for all kinds of word classes. Figure 2 illustrates the question network. Nodes without incoming links correspond to the easiest questions. Nodes without outgoing links have the most difficult questions.
Fig. 3 Selecting solvable nodes (solvable nodes are marked around the learner's current node for each of the three difficulty-based features)
Learners' current understanding levels are grasped from their answers, and their current nodes are determined. If a learner's level increases to the next level, the learner's current node in the question network is changed by following the link of the corresponding feature. The perceived difficulty of the difficulty-based features varies among learners: some learners may feel a feature is critical, but others may not. If a learner does not feel that a feature is difficult, the learner should move to higher levels quickly, since it is a waste of time to follow the links one by one in such a case. In order to determine the correct level of a learner, several questions from several nodes that are estimated to be solvable are provided. The solvable nodes for each difficulty-based feature are estimated based on the learner's answers. Figure 3 is an example of selecting questions from solvable nodes. In this figure, two nodes are solvable for feature 1, while only one node is solvable for each of features 2 and 3.
3 Mechanism for Selecting Questions
3.1 Construction of Question Network
Questions that have the same levels for all features are gathered into one node of the question network. Two nodes whose levels differ by exactly one in a single difficulty-based feature are connected by a link. Figure 4 shows a part of a question network.
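As a rough illustration of this construction, the sketch below groups questions into nodes keyed by their tuple of feature levels and links neighbouring nodes; the question records and their field names are assumptions made for the example, not the authors' implementation.

```python
def build_question_network(questions):
    """Group questions by their tuple of difficulty-based feature levels and
    attach a directed link between two nodes whose tuples differ by exactly
    one level in a single feature (easier node -> harder node)."""
    nodes = {}                                    # level tuple -> questions in that node
    for q in questions:                           # q["levels"] is assumed, e.g. (2, 3, 1)
        nodes.setdefault(tuple(q["levels"]), []).append(q)

    links = []                                    # (easier node, harder node, feature index)
    for levels in nodes:
        for i in range(len(levels)):
            harder = levels[:i] + (levels[i] + 1,) + levels[i + 1:]
            if harder in nodes:
                links.append((levels, harder, i))
    return nodes, links
```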
This research defines the length of sentence, the difficulty of words, and the number of distracter types as the difficulty-based features. The levels of these features are acquired from questions as follows.
- Length of sentence: The number of words is regarded as the viewpoint for defining the length of a sentence. Based on an analysis of 1500 questions in the database of our laboratory, sentences consist of 4 to 32 words. Thus, we categorize the length of sentence into four levels according to the number of words. Table 1 shows the levels of the length of sentence.
- Difficulty of words: In this research, the difficulty of words follows SVL12000 [6], a list of word difficulties defined by ALC. In SVL12000, 12000 words that are useful for Japanese learners are selected and classified into 5 levels of difficulty. In addition to these five levels, the most difficult level 6 is attached to words that are not in the list. The level of a question is defined as the highest level among all words in the sentence and the choices.
- The number of distracter types: Distracter types are attached to the questions in the database of our laboratory. People who achieved more than 700 points in TOEIC were asked to attach the distracter types based on the definition in [5]. Since choices of the same distracter type may be more difficult than those of different types, the difficulty level based on the number of distracter types is set as shown in Table 2.

Fig. 4 Example of attaching links (each node is labeled with its levels, e.g., length of sentence: 2, difficulty of word: 3, number of distracter type: 1; along a link, the length of sentence becomes larger, the words become more difficult, or the level based on the number of distracter types becomes higher)
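To make the feature acquisition concrete, the following sketch computes the three levels for one question; the sentence-length boundaries, field names, and SVL lookup table are assumptions for illustration (the intended boundaries are those of Tables 1 and 2 below).

```python
def feature_levels(question, svl_level, length_bounds=(11, 18, 25)):
    """Return (length level, word-difficulty level, distracter-type level).

    svl_level: dict mapping a word to its SVL12000 level (1-5); words not in
    the list are assigned the most difficult level, 6.
    length_bounds: assumed cut points for the four sentence-length levels."""
    words = question["sentence"].split()
    length_level = 1 + sum(len(words) > b for b in length_bounds)         # levels 1..4

    all_words = words + [w for c in question["choices"] for w in c.split()]
    word_level = max(svl_level.get(w.lower().strip(".,"), 6) for w in all_words)

    n_types = len(set(question["distracter_types"]))                      # distinct types among the distracters
    type_level = {3: 1, 2: 2, 1: 3}[n_types]                              # Table 2: fewer distinct types = harder
    return length_level, word_level, type_level
```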
Table 1 Levels of length of sentence
Level # of words
1 26
Table 2 Levels based on the number of distracter types

Level                    1   2   3
# of distracter types    3   2   1

3.2 Selection of Questions
In the beginning of the learning process, the start node that fits the learner's initial understanding level is estimated. The start node is calculated from the result of a pretest. Let θi be the learner's level for difficulty-based feature i. θi is calculated by the following formula:

θi = (1/ni) Σj=1..ni bi,j Pi,j    (1)

where bi,j represents level j of difficulty-based feature i, ni is the number of levels of feature i, and Pi,j indicates the ratio of correctly answered questions whose level of feature i is j. By averaging the levels weighted by the ratio of correctly answered questions, the average level of feature i is derived.

Questions are selected from several solvable nodes. More questions should be selected from nodes that are nearer to the learner's current node. The probability of selecting questions from node i is calculated by the following formula:

S(li) = (β/√(2π)) exp(-li²/2)    (2)

where li is the number of links from the learner's current node to node i, S(li) follows the normal distribution, and β is a normalization factor. According to the probability for each node, questions are selected from that node.

In the learning phase, learners solve several questions. After the learning has finished, the learners' levels for each difficulty-based feature are re-calculated. The learner's level for difficulty-based feature i after the t-th learning, θi,t, is calculated by the following formula:

θi,t = θi,t-1 + (1/|Qt|) Σq∈Q't (bi,q - θi,t-1)    (3)

That is, the average difference between the levels of the correctly solved questions and the current level is added to the current level for each difficulty-based feature. Qt is the set of questions posed in the t-th learning and |Qt| represents the number of those questions. Q't is the set of questions that the learner answered correctly in the t-th learning. bi,q is the level of difficulty-based feature i of question q, and α is the ratio of correctly answered questions used for judging the accomplishment of a node. If the value of θi,t is bigger than the level of the current node, the node reached by following the links of difficulty-based feature i until its level becomes bigger than θi,t is selected as the new current node.
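A minimal sketch of the level estimation and update steps, assuming simple dictionaries for questions and answers; the data structures and field names are illustrative, not the authors' code.

```python
def initial_level(pretest, i):
    """Eq. (1): average over levels j of b_{i,j} weighted by the ratio P_{i,j}
    of correctly answered pretest questions whose feature-i level is j."""
    levels = sorted({q["levels"][i] for q in pretest})
    total = 0.0
    for j in levels:
        same = [q for q in pretest if q["levels"][i] == j]
        p_ij = sum(q["correct"] for q in same) / len(same)
        total += j * p_ij
    return total / len(levels)

def updated_level(theta_prev, posed, i):
    """Eq. (3): add the average difference between the levels of the correctly
    solved questions and the current level."""
    solved = [q for q in posed if q["correct"]]
    diff = sum(q["levels"][i] - theta_prev for q in solved) / len(posed)
    return theta_prev + diff
```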
The solvable nodes for the learner are also re-calculated after each learning. The number of solvable nodes becomes larger if the learner solved questions in farther nodes, and smaller if the learner could only solve questions in nearer nodes. The distance of solvable nodes from the current node is calculated by the following formula:

ri,t = ri,t-1 + (1/|Qt|) Σq∈Qt wq li,q    (4)

This equation adds to the current distance a number of links derived by subtracting the number of links to incorrectly answered nodes from that of correctly answered ones. li,q represents the number of links from the current node to the node that question q belongs to, for difficulty-based feature i. wq corresponds to the correctness of question q: wq is 1 if the answer to question q is correct and -1 if it is incorrect. The initial value of ri,t is set to 1, which means only the next node is solvable. ri,t is reset to 1 if the calculated ri,t becomes smaller than 1.
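The corresponding update of the solvable range (Eq. 4) can be sketched in the same style, again with assumed field names:

```python
def updated_radius(r_prev, posed, i):
    """Eq. (4): widen or narrow the solvable range using the signed link
    distances of correctly (+1) and incorrectly (-1) answered questions."""
    signed = sum((1 if q["correct"] else -1) * q["links"][i] for q in posed)
    r = r_prev + signed / len(posed)
    return max(r, 1)      # at least the next node is always treated as solvable
```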
4 Prototype System
We have developed a web-based prototype system based on the proposed method. When learning starts, the selected questions are shown on a web page, as in Figure 5. Currently, 10 questions are selected in one learning session. Learners answer the questions by selecting the radio button of the choice they believe is correct. If learners cannot determine the answer, they can check the checkbox labeled "I could not solve the question"; if this checkbox is checked, the response to that question is not counted as the learner's answer. After learners select all answers and push the send button, their answers are evaluated, and the result and an explanation are displayed.
Fig. 5 Interface of prototype system (each question shows the sentence, the choices, and a checkbox labeled "I could not solve the question")
5 Experimental Result
Experiments were conducted using the prototype system. This experiment focuses on the question network for verbs. First, examinees were asked to solve a pretest consisting of 20 questions, and their initial levels were calculated based on the result of the pretest. The pretest questions were carefully prepared by the authors to include all levels of the difficulty-based features in equal ratio. In the learning phase, examinees were asked to answer 10 questions in each of 10 learning sessions. In the experiment, α in Equation 3 was set to 0.7, which means a learner was regarded as having accomplished a node if he/she answered more than 70 percent of the posed questions correctly. As comparison methods, we prepared the following two methods:
- Random link selection method (RLSM), which selects links randomly when selecting nodes in the question network;
- Random question posing method (RQPM), which selects questions randomly from the database.
In RLSM, the movement to the next node occurs when the learner correctly solved 70 percent of the questions in the current node. The examinees were 12 members of our laboratory, and 4 of them were assigned to each method: the proposed method, RLSM, and RQPM. The correctly answered questions in each learning session were evaluated. Table 3 shows the average number of correct questions and its variance over the learning sessions. The average numbers are almost the same for all three methods. However, the variance of our method is the smallest of the three. This indicates that the number of correctly answered questions was almost the same in every learning session. This result shows that our method could provide questions whose levels are similar to the learners' levels, even though the understanding levels of the learners changed over the 10 learning sessions.

Table 3 Result of learning phase
                                     Proposed method   RLSM    RQPM
Average # of correct questions       5.725             5.850   5.825
Variance of # of correct questions   1.585             2.057   2.665
The results of a questionnaire assessing the examinees' impressions of the proposed questions are shown in Table 4. For each questionnaire item, 5 is the best score and 1 is the worst. Items 1 and 2 received high values. Based on the result of item 1, examinees felt that the questions became more difficult as the learning proceeded. Based on item 2, they also felt that the words became more difficult. Table 5 shows the number of links that the examinees who used the prototype system with the proposed method followed during the learning. All examinees followed links of difficulty of words more than twice. For item 3, the examinee who answered 4 followed the link of the length of sentence more than twice, and the examinees who answered 3
followed the link only once. The low result of item 4 may be caused by the small number of links followed based on the number of distracter types. These results indicate that when links are followed, learners can feel the change in the difficulty of the questions. Therefore, questions are arranged appropriately by their difficulties in the question network.

Table 4 Questionnaire result
   Contents                                           Average value
1  Did the questions become difficult?                4.00
2  Did the words in questions become difficult?       4.00
3  Did the question sentences become difficult?       3.50
4  Did the distracters become difficult?              2.75
Table 5 # of links that examinees followed

             Difficulty of words   Length of sentence   # of distracter types
Examinee 1   3                     1                    1
Examinee 2   2                     2                    0
Examinee 3   2                     3                    1
Examinee 4   3                     1                    0

6 Conclusion
In this paper, a method for posing questions step by step according to difficulty-based features was proposed. The experimental results show that the defined features are intuitive and match learners' perceptions. In addition, by using the question network, which arranges questions according to the levels of the difficulty-based features, questions that fit learners' levels could be selected even when the learners' situations changed during the learning. Currently, three difficulty-based features have been prepared. However, there are other possible features of questions, such as grammatical structure. Thus, our future work is to investigate whether other features of questions can serve as difficulty-based features. Moreover, our system only provides an explanation of the learner's answer to each question and does not support learners in acquiring the underlying knowledge. If the questions run out before a learner can answer the required proportion of questions correctly, the learner cannot proceed in the learning process. Therefore, we also have to provide a support tool that teaches the necessary knowledge to learners.
References 1. Suganuma, A., Mine, T., Shoudai, T.: Automatic Generating Appropriate Exercises Based on Dynamic Evaluating both Students’ and Questions’ Levels. In: Proc. of EDMEDIA, CD-ROM (2002)
2. Goto, T., Kojiri, T., Watanabe, T., Yamada, T., Iwata, T.: English grammar learning system based on knowledge network of fill-in-the-blank exercises. In: Lovrek, I., Howlett, R.J., Jain, L.C. (eds.) KES 2008, Part III. LNCS (LNAI), vol. 5179, pp. 588–595. Springer, Heidelberg (2008) 3. Kunichika, H., Urushima, M., Hirashima, T., Takeuchi, A.: A Computational Method of Complexity of Questions on Contents of English Sentences and its Evaluation. In: Proc. of ICCE 2002, pp. 97–101 (2002) 4. Kate, R.J., Luo, X., Patwardhan, S., Franz, M., Florian, R., Mooney, R.J., Roukos, S., Welty, C.: Learning to Predict Readability Using Diverse Linguistic Features. In: Proc. of ICCL 2010, pp. 546–554 (2010) 5. Goto, T., Kojiri, T., Watanabe, T., Iwata, T., Yamada, T.: Automatic Generation System of Multiple-choice Cloze Questions and its Evaluation. An International Journal of Knowledge Management and E-Learning 2(3), 210–224 (2010) 6. Standard Vocabulary List 12000: SPACE ALC, http://www.alc.co.jp/eng/vocab/svl/index.html (in Japanese)
Testing Randomness by Means of RMT Formula Xin Yang, Ryota Itoi, and Mieko Tanaka-Yamawaki
Abstract. We propose a new method of testing randomness by applying the method of RMT-PCA, which was originally used for extracting principal components from massive price data in the stock market. The method utilizes the RMT formula derived in the limit of infinite dimension and infinite length of data strings, and can be applied to test the randomness of very long, highly random data strings. Although its level of accuracy is not high in a rigorous sense, it is expected to be a convenient tool for testing the randomness of real-world numerical data. In this paper we show the results of applying this method (the RMT-test) to two machine-generated random number generators (LCG, MT), as well as to artificially distorted random numbers, and examine its effectiveness. Keywords: Randomness, RMT-test, Eigenvalue distribution, Correlation, LCG, MT.
1 Introduction
In spite of the numerous algorithms that have been proposed to generate random numbers [1,2], no artificial method can generate better randomness than naturally generated noise such as radioactive decay [3,4]. Yet many programmers rely on various kinds of random number generators that can be used as subroutines in a main program. Game operators, on the other hand, utilize other means to generate randomness, for the sake of offering equal opportunity to the players and attracting participants to the community, in order to make the participants believe in an equal chance to win or lose. Stock prices are also expected to be highly random, so that investors can believe that everyone has a chance to win in the market. When the purpose is different, the level of accuracy required of the randomness is also different. Meanwhile, random matrix theory (RMT, hereafter) has attracted much attention in many fields of science [5,6]. In particular, a theoretical formula [7,8] for the eigenvalue spectrum of the correlation matrix has been applied to extract principal components from a wide range of multidimensional databases including financial prices [9-14].
Xin Yang, Ryota Itoi, and Mieko Tanaka-Yamawaki
Department of Information and Knowledge Engineering, Graduate School of Engineering, Tottori University, Tottori 680-8552, Japan
e-mail:
[email protected] *
We consider in this paper a new algorithm to test the randomness of marginally random sequences that we encounter in various situations, by applying the RMT-PCA method [13,14] originally developed to extract trends from a massive database of stock prices [9-12]. We name this method the 'RMT-test' and examine its effect on several examples of pseudo-random numbers including LCG [1] and MT [2].
2 Formulation of RMT-Test
The RMT-test can be formulated as follows. The aim is to test the randomness of a long one-dimensional sequence of numerical data, S. At the first step, we cut S into N pieces of equal length T and shape them into an N×T matrix Si,t by placing the first T elements of S in the first row, the next T elements in the second row, and so on, discarding the remainder if the length of S is not divisible by T. Each row of Si,t is a random sequence of length T and can be regarded as an independent T-dimensional vector, Si = (Si,1, Si,2, Si,3, ..., Si,T). We normalize each row by means of

xi,t = (Si,t - <Si>) / σi    (1)

for i = 1, ..., N and t = 1, ..., T, where

<Si> = (1/T) Σt=1..T Si,t ,    σi = √(<Si²> - <Si>²)    (2)

such that every row of the new matrix x has mean 0 and variance 1. Since the original sequence S is random, in general all the rows are independent, i.e., no pair of rows is identical. The cross-correlation matrix Ci,j between two rows i and j is constructed as the inner product of the two time series xi,t and xj,t,

Ci,j = (1/T) Σt=1..T xi,t xj,t    (3)

thus the matrix Ci,j is symmetric under the interchange of i and j. A real symmetric matrix C can be diagonalized by a similarity transformation V⁻¹CV with an orthogonal matrix V satisfying Vᵗ = V⁻¹, each column of which is an eigenvector of C, such that

C vk = λk vk   (k = 1, ..., N)    (4)

where the coefficient λk is the k-th eigenvalue and vk is the k-th eigenvector. According to RMT, the eigenvalue distribution spectrum of the cross-correlation matrix C of random series is given by the following formula [7,8],

PRMT(λ) = (Q/(2πλ)) √((λ+ - λ)(λ - λ-))    (5)

where the upper bound λ+ and the lower bound λ- are given by

λ± = (1 ± Q^(-1/2))²    (6)
in the limit of N → ∞, T → ∞, Q = T/N = const., where T is the length of the time series and N is the total number of independent time series (i.e., the number of stocks considered when the method is applied to stock markets [9-14]). An example of this function is illustrated in Fig. 1 for the case of Q = 3. This means that, if the sequence is random, the eigenvalues of the correlation matrix C between pairs of N normalized time series of length T distribute in the range

λ- < λ < λ+    (7)

The RMT-test can be formulated in the following five steps.
Algorithm of the RMT-test:
1. Prepare a data string to be tested and cut it into N pieces of length T.
2. Convert each piece Si = (Si,1, Si,2, ..., Si,T) into a normalized vector xi = (xi,1, xi,2, ..., xi,T) by means of Eq. (1) and Eq. (2). Placing xi in the i-th row gives an N×T matrix xi,t. Taking the inner products and dividing by T gives the correlation matrix C in Eq. (3).
3. Compute the eigenvalues λk and the eigenvectors vk of the matrix C by solving the eigenvalue equation, Eq. (4).
4. Discard the eigenvalues that satisfy Eq. (7), as the random part.
5. If nothing is left, the data string passes the RMT-test. If any eigenvalue is left over, the data string fails the RMT-test.
6. (Visual presentation of the RMT-test) Compare the histogram of the eigenvalue distribution with the graph of Eq. (5). If the two lines match, as in the left figure below, the data passes the RMT-test. If the two lines do not match, as in the right figure below, it fails the RMT-test.
Fig. 1 The algorithm of the RMT-test is summarized in 5 steps plus a visual presentation in step 6, illustrated by two figures: an example of a good random sequence (left) and an example of a bad random sequence (right)
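As an illustration, steps 1-5 can be written in a few lines of NumPy. This is a sketch of the procedure described above rather than the authors' program; the small tolerance is an added allowance for finite-size fluctuations of the extreme eigenvalues around the theoretical bounds.

```python
import numpy as np

def rmt_test(seq, N=500, T=1500, tol=0.02):
    """Steps 1-5: reshape, normalize, build the correlation matrix, and check
    that every eigenvalue lies inside the RMT bounds of Eq. (6)."""
    x = np.asarray(seq[:N * T], dtype=float).reshape(N, T)                  # step 1
    x = (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True)  # step 2, Eqs. (1)-(2)
    C = x @ x.T / T                                                         # step 3, Eq. (3)
    eig = np.linalg.eigvalsh(C)                                             # Eq. (4)
    Q = T / N
    lam_minus, lam_plus = (1 - Q ** -0.5) ** 2, (1 + Q ** -0.5) ** 2        # Eq. (6)
    return bool(np.all((eig > lam_minus - tol) & (eig < lam_plus + tol)))   # steps 4-5

# the Mersenne Twister behind NumPy's generator should normally pass
print(rmt_test(np.random.default_rng(0).random(500 * 1500)))
```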
3 Applying RMT-Test on Machine-Generated Random Numbers
In this chapter, we apply the RMT-test to two kinds of machine-generated random numbers: LCG, the most popular algorithm for pseudo-random number generation, and MT, a more recently proposed algorithm that is widely used for its astronomically long period.
3.1 Random Sequence by Means of Linear Congruential Generators (LCG)
The most popular algorithm for numerically generating random sequences is the class of linear congruential generators (LCG) [1]:

Xn+1 = (aXn + b) mod M    (8)

The popular function rand( ) uses the following parameters in Eq. (8):

A = 1103515245 / 65536,  B = 12345 / 65536,  M = 32768    (9)
Using Eqs. (8)-(9), we generate a random sequence S of length 500×1500, then cut it into 500 (= N) pieces of length 1500 (= T) each to make the LCG (Q = 3) data. Although LCGs are known to have many problems, the RMT-test cannot detect their off-randomness. As shown in the left figure of Fig. 2, this data passes the RMT-test safely for Q = 3 with a wide variety of seeds. The right figure of Fig. 2 is the corresponding result for Q = 6 with N = 500, T = 3000.
Fig. 2 Examples of random sequences generated by LCG for different Q pass the RMT-test (left:Q = 3, right:Q = 6, both for N = 500)
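One common reading of the parameters in Eq. (9) is the classic ANSI-C rand() construction, in which a full-width LCG state is kept internally and the output is (X/65536) mod 32768. The sketch below follows that reading; the internal modulus 2^31 is an assumption, since it is not stated explicitly in Eq. (8).

```python
def lcg_rand(seed, n):
    """ANSI-C style rand(): X_{k+1} = (1103515245 * X_k + 12345) mod 2^31,
    output (X_{k+1} // 65536) mod 32768, as suggested by Eq. (9)."""
    x, out = seed, []
    for _ in range(n):
        x = (1103515245 * x + 12345) % 2 ** 31
        out.append((x // 65536) % 32768)
    return out

seq = lcg_rand(seed=1, n=500 * 1500)   # data for the Q = 3 test (N = 500, T = 1500)
```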
3.2 Random Sequence by Means of Mersenne Twister (MT)
The Mersenne Twister (MT) [2] is a recently proposed, highly reputed random number generator. The most valuable feature of MT is its extremely long period, 2^19937 - 1. We test the randomness of MT using the same procedure as above. The result is shown in Fig. 3 for Q = 3 and Q = 6. MT also passes the RMT-test over a wide range of N and T.
Fig. 3 Examples of random sequences generated by MT for different Q pass the RMT-test (left : Q = 3, right : Q = 6, both for N = 500)
4 Application of RMT-Test on Artificially Distorted Sequence
In this chapter, we artificially lower the randomness of pseudo-random sequences in three different ways and apply the RMT-test to such data. The first data set is created by collecting the initial 500 numbers generated by LCG, starting from a fixed seed. The second is created by converting the sequences into Gaussian-distributed random numbers by means of the Box-Muller formula. The third is created by taking log-returns of the sequences generated by LCG and MT.
4.1 Artificially Distorted Pseudo-random Sequence (LCG)
Knowing that the initial part of an LCG sequence generally has low randomness, we artificially create off-random sequences by collecting the initial parts of pseudo-random numbers, namely the sequences of 500 iterations after starting from the seeds, in order to see whether the RMT-test can indeed detect the off-randomness. As shown in Fig. 4, the RMT-test detected a sign of deviation from the RMT formula for the case of N = 100 and T = 500 (left), since some eigenvalues are larger than the theoretical maximum. On the other hand, the corresponding case using the data without the first 500 numbers after the seeds passes the RMT-test, having all the eigenvalues within the theoretical curve, as shown
in the right figure of Fig. 4. In contrast to LCG, MT does not have this problem with its initial 500 numbers.
Fig. 4 (left): A collection of first 500 of LCG (N=100,T=500) fails the RMT-test. (right): LCG without first 500 numbers (N=100,T=500) passes the RMT-test.
4.2 Artificially Distorted Pseudo-random Sequence (Box-Muller)
The Box-Muller formula is often used to convert uniform random numbers into Gaussian random numbers. We test such sequences to check the performance of the RMT-test. We apply the Box-Muller formula in two ways. One is to convert to random numbers of the normal distribution with zero mean, N(0,1), which barely pass the RMT-test, as shown in the left figure of Fig. 5. On the other hand, another set of random numbers created to have the off-centered normal distribution N(5,1) fails the RMT-test, as shown in the right figure of Fig. 5.
Fig. 5 (left): N(0,1) Gaussian random number barely passes the RMT-test. (right): N(5,1) Gaussian random number (N=100,T=500) fails the RMT-test.
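A sketch of the Box-Muller conversion used to create the two Gaussian data sets; the underlying uniform generator and the sample sizes are illustrative choices, not the authors' setup.

```python
import math, random

def box_muller(n, mean=0.0, sd=1.0):
    """Convert pairs of uniform random numbers into N(mean, sd^2) numbers."""
    out = []
    while len(out) < n:
        u1, u2 = 1.0 - random.random(), random.random()     # avoid log(0)
        r = math.sqrt(-2.0 * math.log(u1))
        out.append(mean + sd * r * math.cos(2.0 * math.pi * u2))
        out.append(mean + sd * r * math.sin(2.0 * math.pi * u2))
    return out[:n]

centered = box_muller(100 * 500)             # N(0,1): reported to barely pass
shifted  = box_muller(100 * 500, mean=5.0)   # N(5,1): reported to fail the RMT-test
```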
4.3 Artificially Distorted Pseudo-random Sequence (Log-Return)
In this section, we point out the fact that any random sequence loses randomness after being converted to the log-return,

ri = log(Si+1 / Si)    (10)

although this process is inevitable when we deal with financial time series in order to eliminate the unit/size dependence of different stock prices. Taking the log-return of the time series typically results in a significant deviation from the RMT formula, as shown in Fig. 6 for the case of LCG (left) and MT (right).
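The log-return transformation itself is a one-liner; the resulting sequence can then be fed to a test routine such as the rmt_test sketch in Section 2 (the uniform "price" series below is purely illustrative).

```python
import numpy as np

s = np.random.default_rng(1).random(500 * 1500 + 1) + 1.0   # positive "prices"
r = np.log(s[1:] / s[:-1])                                  # Eq. (10): log-returns
```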
Fig. 6 Both log-return sequence of LCG (left) and MT (right) fail the RMT-test.
5 Summary
In this paper, we proposed a new method of testing randomness, the RMT-test, as a byproduct of the RMT-PCA that we have used to extract trends in stock markets. In order to examine its effectiveness, we tested it on two random number generators, LCG and MT, and showed that both pass the RMT-test for various values of the parameters. We further tested the validity of the RMT-test on artificially deteriorated random numbers and showed that it can detect off-randomness in the following three examples: (1) the initial 500 numbers of rand( ), (2) off-centered Gaussian random numbers obtained by means of the Box-Muller algorithm, and (3) log-return sequences of the two kinds of pseudo-random numbers (LCG, MT).
References 1. Park, S.K., Miller, K.W.: Random Number Generators: Good Ones are Hard to Find. Communication of ACM 31, 1192–1201 (1988) 2. Matsumoto, M., Nishimura, T.: Mersenne Twister: a 623-dimensionally equidistributed uniform pseudorandom number generator. ACM Transactions on Modeling and Computer Simulation 8, 3–30 (1998) 3. Tamura, Y.: Random Number Library (The Institute of Mathematical Statistics) (2010), http://random.ism.ac.jp/random/index.php 4. Walker, J.: HotBits: Genuine Random Numbers (2009), http://www.fourmilab.ch/hotbits/ 5. Edelman, A., Rao, N.R.: Acta Numerica, pp. 1–65. Cambridge University Press, Cambridge (2005) 6. Mehta, M.L.: Random Matrices, 3rd edn. Academic Press, London (2004) 7. Marcenko, V.A., Pastur, L.A.: Distribution of eigenvalues for some sets of random matrices. Mathematics of the USSR-Sbornik 1(4), 457–483 (1994) 8. Sengupta, A.M., Mitra, P.P.: Distribution of singular values for some random matrices. Physical Review E 60, 3389 (1999) 9. Plerou, V., Gopikrishnan, P., Rosenow, B., Amaral, L.A.N., Stanley, H.E.: Random matrix approach to cross correlation in financial data. Physical Review E 65, 066126 (2002) 10. Plerou, V., Gopikrishnan, P., Rosenow, B., Amaral, L.A.N., Stanley, H.E.: Physical Review Letters 83, 1471–1474 (1999) 11. Laloux, L., Cizeaux, P., Bouchaud, J.-P., Potters, M.: American Institute of Physics 83, 1467–1470 (1999) 12. Bouchaud, J.-P., Potters, M.: Theory of Financial Risks. Cambridge University Press, Cambridge (2000) 13. Tanaka-Yamawaki, M.: Extracting principal components from pseudo-random data by using random matrix theory. In: Setchi, R., Jordanov, I., Howlett, R.J., Jain, L.C. (eds.) KES 2010. LNCS, vol. 6278, pp. 602–611. Springer, Heidelberg (2010) 14. Tanaka-Yamawaki, M.: Cross Correlation of Intra-day Stock Prices in Comparison to Random Matrix Theory, Intelligent Information Management (2011), http://www.scrp.org
The Effect of Web-Based Instruction on English Writing for College Students Ya-Ling Wu, Wen-Ming Wu, Chaang-Yung Kung, and Ming-Yuan Hsieh
Abstract. The use of network technology in teaching and learning is the latest trend in education and training. The purpose of this study is to investigate the effect of web-based instruction on English writing. In this study, a hybrid course format (part online, part face-to-face lecture) was developed to deliver an English writing course to sophomores majoring in English. The hybrid course was structured to include online assignments, handouts, lecture recording files, and weekly lectures in a classroom. To evaluate the effectiveness of web-based instruction on learning English writing, the participants came from two classes: one class was taught under the hybrid course format while the other was taught with traditional instruction. The findings of the study revealed that: 1. there were statistically significant differences between students' achievement mean scores in English writing skills and concepts attributed to the course setting, in favor of the students in the hybrid course format; 2. there were no statistically significant differences between students' writing mean scores in writing ability attributed to the course setting. Keywords: web-based instruction, CAI, English writing.
Ya-Ling Wu: Department of Applied English, National Chin-Yi University of Technology
Wen-Ming Wu: Department of Distribution Management, National Chin-Yi University of Technology
Chaang-Yung Kung: Department of International Business, National Taichung University of Education
Ming-Yuan Hsieh: Department of International Business, National Taichung University of Education

1 Introduction
At most colleges and universities, English writing courses are required for students majoring in English. In order to improve their learning, many learners are eager to find ways to enhance their learning outcomes. With the progress of technology, using web-based instruction or computers as teaching tools is more prevalent than
before. Hundreds of studies have been generated on the use of computer facilities, e-learning, and networks to enhance learning. Riffell and Sibley (2005) claimed that web-based instruction can improve learning for undergraduates in biology courses.
1.1 Description of the Hybrid Course
The hybrid course incorporates three primary components: (1) computer-assisted instruction (CAI): the WhiteSmoke software provides instant feedback on students' writing; (2) web-based instruction: web-based assignments, handouts, and lectures; (3) lectures in a computer-equipped classroom: weekly meetings in the computer lab focused on teaching the core skills and concepts of text types in English writing.
1.2 Research Questions and Hypotheses
This study attempts to answer the following questions.
1. Are there any statistically significant differences (α < 0.05) between the students' achievement mean scores attributed to the course setting (traditional & hybrid)?
H1: There is no significant difference in achievement between students taught under the hybrid course format and students taught with traditional instruction.
2. Are there any statistically significant differences (α < 0.05) between the students' writing mean scores in writing ability attributed to the course setting (traditional & hybrid)?
H2: There is no significant difference in writing ability between students taught under the hybrid course format and students taught with traditional instruction.
1.3 Limitation
In this study, the participants are sophomores majoring in English. The results of this study therefore cannot be generalized to claim that all college students in Taiwan can improve their English writing ability via web-based instruction. Moreover, the effect of web-based instruction might depend on characteristics of the learners, such as age, personality, attitude, motivation, and so on.
2 Literature Review
Since the first computers were used in schools in the middle of the 20th century, things have changed. At first, computers were used simply as tools to help learners and instructors perform tasks faster and more efficiently. As a result, computers have come to play an important role in processing information as well as in assisting teaching and learning.
2.1 CAI
Several studies on computer-assisted instruction (CAI) indicate a positive influence on learning and teaching. Schaudt (1987) conducted an experiment in which students were taught with CAI in a reading lesson. The result showed that CAI can be an effective tool for improving students' mastery of targeted reading skills. In this study, Schaudt assumed that (1) time is allocated on the computer for sufficient and continuous content coverage, (2) performance is monitored, and (3) the teacher chooses software appropriate for the students' ability levels. Wedell and Allerheiligen (1990) indicated that CAI has significant efficacy in learning English, especially in reading and writing. Learners are conscious of improvement in the areas of conciseness, courtesy, concreteness, clarity, consideration, and correctness, as well as in their overall, general communication skill. However, it is not effective for inexperienced writers, because they do not possess enough ability to understand the corrected errors. Zhang (2007) suggested that incorporating computers into Chinese learning and teaching can offer several important advantages to American learners: it can not only enhance students' learning motivation in the classroom, but also afford them increased opportunity for self-directed learning. According to Cotton (1997), using CAI as a supplement to traditional instruction can bring higher performance than using conventional instruction alone. Besides, students learn instructional content more efficiently and faster when using CAI than when using traditional instruction, and they retain what they have learned better with CAI than with conventional instruction alone. Karper, Robinson, and Casado-Kehoe (2005) reported that computer-assisted instruction has been viewed as a more effective way to improve learners' performance than the conventional instructional method in counselor education. In addition, a study by Akour (2006) showed that the time required for learners using CAI was higher overall than with conventional classroom instruction; students taught by conventional instruction combined with the use of computers performed significantly better than students taught by conventional instruction alone in a college setting. The purpose of Chen and Cheng's (2006) study was to explore the factors that may lead to facilitation or frustration when students are taught by CAI. They suggested that computer-assisted learning programs should be used as a supplement to classroom instruction but never as a replacement for the teacher. Kasper (2002) stated that technology is now regarded as both a necessary component and a means of achieving literacy; as a result, it must become a required part of language courses, and computers ought to be used as a tool to promote linguistic skills and knowledge construction. Forcier and Descy (2005) pointed out that, with an explosion of information and widespread, immediate access to it, today's students are faced with the need to develop problem-solving skills. A computer-equipped classroom offers situations that students will confront in real life and permits them to illustrate their capabilities in completing various tasks.
Purao (1997) stated that a hyper-linked instructional method encouraged students to take a more active role in the classroom and fostered higher levels of learning.
2.2 Web-Based Instruction
Recently, web-based instruction has become increasingly popular and familiar. Online instruction has been broadly applied in education because it may be superior to the traditional learning environment. According to Ostiguy and Haffer (2001), web-based courses may provide students with more flexibility and control over where and when to participate. In addition, it can lead to greater learning motivation (St. Clair, 1999). Furthermore, Hacker and Niederhauser (2000) indicated that learning in web-based courses can be more active than in traditional instruction. It is also more student-centered (Sanders, 2001) and can encourage students to learn (Yazon, Mayer-Smith, & Redfield, 2002). Moreover, it has been reported that online courses yield significant improvements in student performance (Navarro and Shoemaker, 2000), and that a hybrid course can improve learning outcomes (Tuckman, 2002).
3 Methodology
This study was designed to investigate whether web-based instruction significantly enhances learning outcomes and English writing ability for students.
3.1 Subjects
Participants of this study were recruited by convenience sampling because it is hard to obtain permission from instructors to conduct this kind of study. The participants consisted of 41 sophomores majoring in English, in two classes. All participants were taught by the same instructor for a semester.
3.2 Instrument In order to successfully implement this study, achievement tests and writing tests were designed by the instructor. The achievement tests were used to measure the core skills and concepts for different text types in English writing, and writing tests were utilized to evaluate writing ability. 3.2.1 Achievement Test Two achievement tests were developed by the instructor. The first test was used as a pre-test to assess the students’ previous knowledge and the second one as a posttest to find out the impact of the hybrid course format on students’ achievement.
3.2.2 Writing Test
Writing tests were administered on the first day of the course (pre-test) and on the last class day (post-test). The objective of the pre-test was to compare the level of writing ability between the two classes. Both the pre-test and the post-test were assessed by the software WhiteSmoke. WhiteSmoke contains the following functions: Grammar-Checker, Punctuation-Checker, Spell-Checker, Dictionary, Thesaurus, Templates, Translator, Error Explanations, and AutoCorrect. The grading system in WhiteSmoke is a 10-point scale.
3.3 Procedure
The study was conducted in the following steps. First of all, in order to make sure the students in both classes were at the same level of writing ability, a writing test (pre-test) was arranged at the beginning of the semester. Students were asked to write a paragraph in the writing test, and their writings were then evaluated by the software WhiteSmoke. The use of WhiteSmoke provided a consistent criterion for the evaluation of writing ability. In addition, the first achievement test was taken by students in both classes on the first class day to test their previous knowledge. During the semester, one class (the experimental group) was taught under the hybrid course format in the computer lab, whereas the other (the control group) was subjected to traditional instruction in a classroom with a blackboard only. Finally, at the end of the semester, an achievement test (post-test) and a writing test (post-test) were given again.
3.4 Data Analysis
The data analysis consisted of independent-samples t-tests testing the significance of the differences in the mean test scores between the two classes.
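For reference, an independent-samples t-test of this kind can be run with SciPy; the score lists below are made-up illustrations, not the study's data.

```python
from scipy import stats

hybrid      = [81, 76, 88, 79, 84, 90, 77, 83]   # hypothetical achievement scores
traditional = [85, 90, 78, 88, 92, 80, 86, 89]

t, p = stats.ttest_ind(hybrid, traditional)      # independent samples t-test
print(f"t = {t:.3f}, p = {p:.3f}")               # difference is significant at alpha = 0.05 if p < 0.05
```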
4 Results
This study was designed to determine whether a significant difference existed in achievement and writing ability based on the course settings. The first question asks about the existence of statistically significant differences between the students' achievement mean scores attributed to the course setting (traditional & hybrid). Independent-samples t-tests were performed to test the significance of the differences between the experimental group, taught under the hybrid course setting, and the control group, taught with traditional instruction. Table 1 presents the means and standard deviations of the experimental group and control group for the achievement tests and writing tests.
Table 1 Means and Standard Deviations on Tests
Tests                Group         N    Mean    SD
Achievement Test 1   Hybrid        25   81.64   7.233
                     Traditional   16   84.88   7.650
Achievement Test 2   Hybrid        25   73.12   7.844
                     Traditional   16   63.88   11.592
Writing Test 1       Hybrid        24   7.38    1.056
                     Traditional   14   6.93    1.072
Writing Test 2       Hybrid        23   7.39    .839
                     Traditional   16   7.25    1.125
As indicated in Table 1, the mean score on achievement test 1 is 81.64 for students in the hybrid course format and 84.88 for students in the traditional setting; the mean scores on achievement test 2 are 73.12 and 63.88 for the two groups, respectively. The mean score on achievement test 2 is lower than on achievement test 1 for both groups, because the contents tested in achievement test 2 were more difficult than those tested in achievement test 1. Moreover, the mean scores on writing test 1 for the two groups are 7.38 and 6.93, and 7.39 and 7.25 on writing test 2.

Table 2 Independent Samples t-test on Tests

Tests                t        df   p-Value
Achievement Test 1   -1.366   39   .180
Achievement Test 2   3.052    39   .004*
Writing Test 1       1.251    36   .219
Writing Test 2       .450     37   .656
To test hypotheses one and two, independent-samples t-tests were utilized to determine the significance of the differences, if any, between the mean scores of the achievement tests and writing tests of the two groups. As indicated in Table 2, there is no statistically significant difference between the mean scores on achievement test 1 and writing test 1 (the pre-tests). This reveals that there was no significant difference (p = 0.18) in the students' previous knowledge of writing skills or in writing ability level between the two groups. However, there is a significant difference (p = 0.004) on achievement test 2, which indicates that the students in the hybrid course format performed better on the course contents learned in class than the students in the traditional setting. In contrast, there is no significant difference (p = 0.656) between the mean scores on writing test 2. This shows that students in
the hybrid class did not perform better in writing ability than students in the traditional class.
5 Discussions and Conclusions
This study had two major purposes. The first purpose was to examine the effect of web-based instruction on students' achievement. Secondly, the researchers sought to determine whether a significant difference existed in writing ability based on the course setting. According to the t-test analysis shown in Table 2, students who received web-based instruction performed better on the achievement test than students who received traditional instruction. The result indicates that web-based instruction enhanced students' learning of the core skills and concepts of text types in English writing. Students exposed to the hybrid course format had more flexibility and control over where and when to review the contents learned in class. However, the findings of this study also revealed that students in the hybrid course setting did not perform better in writing ability than students in the traditional course setting. This probably implies that students need more time to improve their writing ability. Although available technology can greatly enhance students' learning of the contents of a course, well-planned curricula should be carefully designed by instructors.
References 1. Akour, M.A.: The effects of computer-assisted instruction on Jordanian college student’ achievements in an introductory computer science course. Electronic Journal for the Integration of Technology in Education 5, 17–24 (2008), http://ejite.isu.edu/Volume5/Akour.pdf 2. Chen, C.F., Cheng, W.Y.: The Use of a Computer-based Writing Program: Facilitation or Frustration? Presented at the 23rd International Conference on English Teaching and Learning in the Republic of China (May 2006) 3. Cotton, K.: Computer-assisted instruction. North West Regional Educational Laboratory, http://www.borg.com/rparkany/PromOriginalETAP778/CAI.html 4. Forcier, R.C., Descy, D.E.: The computer as an educational tool: Productivity and problem solving, vol. 17. Pearson Education, Inc., London (2005) 5. Hacker, D.J., Niederhauser, D.S.: Promoting deep and durable learning in the online classroom. New Directions for Teaching and Learning 84, 53–64 (2000) 6. Karper, C., Robinson, E.H., Casado-Kehoe, M.: Computer assisted instruction and academic achievement in counselor education. Journal of Technology in Counseling 4(1) (2005), http://jtc.colstate.edu/Vol4_1/Karper/Karper.htm (Retrieved December 22, 2007) 7. Kasper, L.: Technology as a tool for literacy in the age of information: Implications for the ESL classroom. Teaching English in the two-year College (12), 129 (2002)
8. Navarro, P., Shoemaker, J.: Performance and perceptions of distance learners in cyberspace. American Journal of Distance Education 14, 15–35 (2000) 9. Ostiguy, N., Haffer, A.: Assessing differences in instructional methods: Uncovering how students learn best. Journal of College Science Teaching 30, 370–374 (2001) 10. Purao, S.: Hyper-link teaching to foster active learning. In: Proceedings of the International Academy for Information Management Annual Conference, 12th, Atlanta, GA, December 12-14 (1997) 11. Riffell, S., Sibley, D.: Using web-based instruction to improve large undergraduate biology courses: an evaluation of a hybrid course format. Computers and Education 44, 217–235 (2005) 12. Sanders, W.B.: Creating learning-centered courses for the world wide web. Allyn & Bacon, Boston (2001) 13. Schaudt, B.A.: The use of computers in a direct reading lesson. ReadingPsychology 8(3), 169–178 (1987) 14. St. Clair, K.L.: A case against compulsory class attendance policies in higher education. Innovations in Higher Education 23, 171–180 (1999) 15. Tuchman, B.W.: Evaluating ADAPT: A hybrid instructional model combining webbased and classroom components. Computers and Education 39, 261–269 (2002) 16. Wedell, A.J., Allerheiligen, J.: Computer Assisted Writing Instruction: Is It Effective? The Journal of Business Communication, 131–140 (1991) 17. Yazon, J.M.O., Mayer-Smith, J.A., Redfield, R.J.: Does the medium change the message? The impact of a web-based genetics course on university students’ perspectives on learning and teaching. Computers and Education 38, 267–285 (2002) 18. Zhang, H.Y.: Computer-assisted elementary Chinese learning for American students. US-China Education Review 4(5) (Serial No. 30), 55–60 (2007)
The Moderating Role of Elaboration Likelihood on Information System Continuance Huan-Ming Chuang, Chien-Ku Lin, and Chyuan-Yuh Lin
Abstract. Under the rapid development of the digital age, the government has been promoting the digitization and networking of administrative activities. Against this background, and with its great potential to improve learning effectiveness, uSchoolnet is one of the important information systems being emphasized. Nevertheless, no matter how good a system is and how actively it is promoted, if it cannot be accepted and used, the system cannot succeed at all. Consequently, the essential factors affecting uSchoolnet's acceptance and continuance are important research issues. Based on the case of uSchoolnet used in elementary schools of Yunlin county, Taiwan, this study adopts theories from the elaboration likelihood model (ELM) and the IS success model to build the research framework. Specifically, system quality and information quality from the IS success model represent the argument quality of the ELM, while service quality represents its source credibility. Besides verifying the causal effects of the IS success related constructs, the dynamic moderating effects of task relevance and personal innovativeness (representing motivation) and user expertise (representing ability) on the above relationships are also investigated. A questionnaire survey was conducted to collect relevant data for analysis, with teachers of Yunlin county elementary schools sampled as research subjects. The major research results can offer insightful, valuable, and practical guidance for the promotion of uSchoolnet. Keywords: Elaboration Likelihood Model, IS success Model, Information System Continuance.
Huan-Ming Chuang: Associate Professor, Department of Information Management, National Yunlin University of Science and Technology
Chien-Ku Lin, Chyuan-Yuh Lin: Graduate Students, Department of Information Management, National Yunlin University of Science and Technology

1 Introduction
Under the rapid development of the digital age, the government has been promoting the digitization and networking of administrative activities. Against this background,
with the great potential of information technology to improve learning effectiveness, the Taiwan government has been aggressively promoting information-technology-integrated instruction. uSchoolnet is one of the important information systems being emphasized in elementary schools. Nevertheless, no matter how good a system is and how actively it is promoted, if it cannot be accepted and used, the system cannot succeed at all. Consequently, the essential factors affecting uSchoolnet's acceptance and continuance are important research issues.
2 Background and Literature Review
2.1 DeLone and McLean's IS Success Model
Since IS success is a multi-dimensional concept that can be assessed from different perspectives according to different strategic needs, measuring IS success is not an easy or objective task. Nevertheless, DeLone and McLean made a major breakthrough in 1992. After conducting a comprehensive review of the related literature, they proposed an IS success model, as shown in Fig. 1. This model identified six important dimensions of IS success and suggested the temporal and causal interdependencies between them.
Fig. 1 DeLone and McLean's IS success model [1]
After the original model had gone through numerous validations, DeLone and McLean proposed an updated model in 2003, as shown in Fig. 2.
Fig. 2 DeLone and McLean’s updated IS success model [2]
The primary differences between the original and updated models are as follows: (1) the addition of service quality, to reflect the importance of service and support in successful e-commerce systems; (2) the addition of intention to use, to measure user attitude; and (3) the combining of individual impact and organizational impact into a net benefits construct.
2.2 Elaboration Likelihood Model
In the social psychology literature, the role of influence processes in shaping human perceptions and behavior has been examined by dual-process theories. These theories suggest that there exist two alternative processes of attitude formation or change, namely more versus less effortful processing of information [3]. One representative dual-process theory of interest to this study is the elaboration likelihood model (ELM). ELM posits that, in relation to the above-mentioned dual processes, there are two alternative "routes" of influence, the central route and the peripheral route, which differ in the amount of thoughtful information processing or "elaboration" demanded of individual subjects [3] [4]. ELM supposes that elaboration likelihood is composed of two major component dimensions, motivation and ability to elaborate, both of which are required for extensive elaboration to occur [4]. ELM researchers have typically adopted recipients' personal relevance of the presented information as their motivation, and prior expertise or experience with the attitude target as their ability. If information recipients view a given message as important and relevant to the target behavior, they are more likely to spend time and effort scrutinizing its information content. In contrast, those who view the same message as having little personal relevance may not be willing to spend the same effort, but will instead rely on peripheral cues, such as recommendations from trustworthy people, to build their perceptions. Consequently, personal relevance and prior expertise are presumed to exert moderating effects on the relationship between argument quality or peripheral cues and attitude change, as shown in Figure 3.
Fig. 3 Elaboration Likelihood Model [3]
3 Research Model
3.1 The Application of IS Success Model
3.1.1 The Roles of System Use and Net Benefits
There has been intense debate over the role system use plays in measuring IS success. On the con side, some authors have suggested that system use is not an appropriate IS success variable [5]. But DeLone and McLean argued otherwise. They posited that the source of the problem was an overly simplistic definition of system use, and that researchers have to take its extent, nature, quality, and appropriateness into consideration. For the purpose of this research, since all sampled subjects had experience with the target IS, we adopted continuance intention as a surrogate of system use. Although it may be desirable to measure system benefits from an objective and quantitative perspective, such measures are often hard to conduct due to intangible system impacts and intervening environmental variables [6]. Consequently, system benefits are usually measured by the perceptions of system users. We likewise used perceived system benefits to represent IS success.
3.1.2 The Relationships among System Use, Net Benefits, and User Satisfaction
In terms of the relationship between system use and net benefits, Seddon (1997) contended that system use is a behavior that reflects an expectation of potential system benefits; though system use must precede benefits, it does not cause them. Therefore, system benefit is a precedent variable of system use, not vice versa. User satisfaction results from the feelings and attitudes formed by aggregating all the benefits that a person hopes to receive from interaction with the IS [7]. Therefore, user satisfaction is caused by perceived system benefits. In addition, some researchers [8] have proposed that user satisfaction drives system use rather than vice versa.
3.2 The Application of ELM
Since social psychology research views attitude as a broad construct consisting of three related components, cognition, affect, and conation [9], we expanded the attitude-change variable of ELM into three constructs. First, as the cognition dimension is related to beliefs salient to the target behavior, we used perceived system benefits as its representation. Next, the affect dimension is represented by user satisfaction. Last, for the conation dimension, we adopted continuance intention as a proxy. ELM identified two alternative influencing processes, namely the central route and the peripheral route, toward information recipients' attitude changes. Besides, it recognized the important moderating effects of personal elaboration likelihood on the above-mentioned relationships, where elaboration likelihood refers to users' motivation and ability to elaborate on informational messages [4]. Bhattacherjee and Sanford [3], drawing on prior ELM research, operationalized the motivation dimension of elaboration as job relevance, defined as the information recipient's perceived relevance of an IT system to their job, and the ability dimension as user expertise, defined as the information recipient's IT literacy in general. We not only followed their approach, but also added personal innovativeness as a motivation construct, since in a rapidly developing IT environment this factor can be expected to influence users' involvement with an IT system.
3.3 The Integration of IS Success Model and ELM
3.3.1 The Identification of Central Route
The central route of ELM is represented by argument quality, and the peripheral route by peripheral cues. Argument quality refers to the persuasive strength of arguments embedded in an informational message, while peripheral cues relate to meta-information about the message (e.g., message source) [3]. Sussman and Siegal [10] developed an argument quality scale that examined completeness, consistency, and accuracy as three major dimensions. These dimensions map quite well onto the information quality, as well as the broader system quality, of the IS success model. As a result, we used IS users' assessment of system quality and information quality as our model's argument quality.
3.3.2 The Identification of Peripheral Route
Many peripheral cues have been suggested in the ELM literature, including the number of messages, the number of message sources, source likeability, and source credibility. Of these, source credibility seems to be one of the most frequently referenced cues [3]. Source credibility is defined as the extent to which an information source is perceived to be believable, competent, and trustworthy by information recipients [4] [10]. This definition of source credibility relates quite well to the service quality of the IS success model, which emphasizes the competency and credibility of IS champions.
In sum, our research model can be shown as Fig. 4.
Fig. 4 Research model
3.4 Research Hypotheses

3.4.1 Hypotheses Related to the IS Success Model
Based on the above literature review, the hypotheses regarding the IS success model are as follows.
H1. The extent of system quality is positively associated with user perceived net benefits.
H2. The extent of information quality is positively associated with perceived net benefits.
H3. The extent of service quality is positively associated with perceived net benefits.
H4. The extent of system quality is positively associated with user satisfaction.
H5. The extent of information quality is positively associated with user satisfaction.
H6. The extent of service quality is positively associated with user satisfaction.
H7. The extent of perceived net benefits is positively associated with user satisfaction.
H8. The extent of perceived net benefits is positively associated with user continuance intention.
H9. The extent of user satisfaction is positively associated with user continuance intention.

3.4.2 Hypotheses Related to ELM
Based on the above literature review, the hypotheses regarding ELM are as follows.
H10. User elaboration likelihood has a moderating effect on the relationship between the IS success dimensions and perceived net benefits.
H10a. User elaboration likelihood has a positive moderating effect on the relationship between system quality and perceived net benefits.
H10b. User elaboration likelihood has a positive moderating effect on the relationship between information quality and perceived net benefits.
H10c. User elaboration likelihood has a negative moderating effect on the relationship between service quality and perceived net benefits.
4 Research Method

4.1 Study Setting
uSchoolnet is a leading web-based communication network for K-12 schools sponsored by Prolitech Corp. The goal of this system is to bridge the gap between teachers, students, parents, and administrators by offering products and services that promote and encourage interaction and collaboration. By allowing everyone involved to participate in the creation, maintenance, and growth of their own class website, the class website becomes an extension of their daily lives; it is dynamic and has a life of its own. Furthermore, by focusing on user experience and friendliness, teachers can focus on teaching, students on learning, parents on guiding, and administrators on managing [13]. uSchoolnet is a comprehensive class web suite that offers the following major features: (1) Photo Albums, (2) Class Schedule, (3) Class Calendar, (4) Seating Chart, (5) Homework Assignments, (6) Student Recognition/Awards, (7) Message Board, (8) Content Management System, and (9) Polls and Surveys. Owing to its practicality it is quite popular in Taiwan, and it was chosen as the target IS of this research. The adoption of uSchoolnet is volitional in nature; instructors are encouraged, but not forced, to use it.
4.2 Operationalization of Constructs All constructs and measures were based on items in existing instruments, related literature, and input from domain experts. Items in the questionnaire were measured using a seven-point Likert scale ranging from (1) strongly disagree to (7) strongly agree.
4.3 Data Collection
Data for this study were collected using a questionnaire survey administered in Yunlin County, Taiwan. The respondents were a convenience sample of elementary school instructors who had experience with uSchoolnet. We sent out 200 questionnaires and received 185 usable responses.
5 Data Analysis and Results

5.1 Scale Validation
We used the PLS-Smart 2.0 software to conduct confirmatory factor analysis (CFA) to assess measurement scale validity. The variance-based PLS approach was preferred over covariance-based structural equation modeling approaches such as LISREL because PLS does not impose sample size restrictions and is distribution-free [11]. 100 records of raw data were used as input to the PLS program, and path significances were estimated using the bootstrapping resampling technique with 200 subsamples. The steps of scale validation are summarized in Table 1.

Table 1 Scale validation
Convergent validity: measures of constructs that theoretically should be related to each other are, in fact, observed to be related to each other [12].
- All item factor loadings should be significant and exceed 0.70.
- Composite reliabilities (CR) for each construct should exceed 0.80.
- Average variance extracted (AVE) for each construct should exceed 0.50, or the square root of the AVE should exceed 0.71.
Discriminant validity: measures of constructs that theoretically should not be related to each other are, in fact, observed to not be related to each other [12].
- The square root of the AVE for each construct should exceed the correlations between that construct and all other constructs.
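To make the convergent-validity criteria of Table 1 concrete, the following sketch (ours, not part of the original study, which used PLS-Smart 2.0) computes composite reliability and average variance extracted from standardized loadings with the usual Fornell-Larcker formulas; the loadings are invented for illustration.

import math

def composite_reliability(loadings):
    # CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)
    s = sum(loadings)
    error = sum(1.0 - l ** 2 for l in loadings)   # standardized items: error variance = 1 - loading^2
    return s ** 2 / (s ** 2 + error)

def average_variance_extracted(loadings):
    # AVE = mean of the squared standardized loadings
    return sum(l ** 2 for l in loadings) / len(loadings)

loadings = [0.82, 0.79, 0.88, 0.75]               # hypothetical construct with four items
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(f"CR  = {cr:.3f}  (criterion: > 0.80)")
print(f"AVE = {ave:.3f}  (criterion: > 0.50; sqrt(AVE) = {math.sqrt(ave):.3f} > 0.71)")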
As seen from Table 2, standardized CFA loadings for all scale items in the CFA model were significant.

In Hist-Scal, the self dissimilarities are not given. Here, we assume that the self dissimilarities are given, since we would like to deal with the internal variation of objects.
2.1 Percentile Model and Dissimilarities
In the percentile model, we assume that each object is described by nested hyperboxes that have the same center point in R^p. Fig. 1 illustrates the relationships in terms of a dissimilarity between two objects. In Fig. 1, an object is represented by a set of nested hyperboxes, and d^L_{ijk} and d^U_{ijk} are the lower and upper distances between the kth hyperboxes of objects i and j, respectively.
Fig. 1 Relationships in terms of the dissimilarity between objects i and j (each object is drawn as a set of nested hyperboxes with a common center; the figure indicates the distance between centers d^C_{ij}, the lower and upper distances d^L_{ijk} and d^U_{ijk}, and the side lengths 2r_{i1k}, 2r_{i2k}, 2r_{j1k}, 2r_{j2k})
For the two kth hyperboxes of objects i and j, if the center points and the side lengths are given, we can calculate the lower and upper distances d^L_{ijk} and d^U_{ijk} between these hyperboxes using the following formula [2]:

d^{L}_{ijk} = \Bigl[\sum_{s=1}^{p} \max\bigl\{0,\; |x_{is}-x_{js}| - (r_{isk}+r_{jsk})\bigr\}^{2}\Bigr]^{1/2}, \qquad
d^{U}_{ijk} = \Bigl[\sum_{s=1}^{p} \bigl\{|x_{is}-x_{js}| + (r_{isk}+r_{jsk})\bigr\}^{2}\Bigr]^{1/2},   (2)

where x_i = (x_{i1}, ..., x_{ip}) is the center point of the nested hyperboxes of object i, and r_{isk} is one half of the length of the sth side of the kth hyperbox of object i. Let X = (x_{is})_{n x p} be the matrix whose rows are the coordinates of the center points of the nested hyperboxes of each object, and let R_k = (r_{isk})_{n x p} (k = 0, ..., K) be the matrix whose rows contain one half of the lengths of the sides of the kth hyperboxes of each object. In this paper, we write the lower and upper distances d^L_{ijk} and d^U_{ijk} as d^L_{ij}(X, R_k) and d^U_{ij}(X, R_k), since they are functions of X and R_k (k = 0, ..., K). We can rewrite R_k (k = 0, ..., K) using R_0 and non-negative matrices A_k (k = 1, ..., K), because 0 < r_{is0} <= ... <= r_{isK}; that is, let A_k (k = 1, ..., K) be non-negative matrices satisfying R_k = R_0 + sum_{l=1}^{k} A_l. We can then regard the lower and upper distances d^L_{ijk} and d^U_{ijk} as functions of X, R_0, and A_k (k = 1, ..., K), and we write them as d^L_{ij}(X, R_0, A_1, ..., A_k) and d^U_{ij}(X, R_0, A_1, ..., A_k).
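As an illustration of formula (2), the following sketch (ours, not the authors' code) computes the lower and upper distances between two hyperboxes from their centers and half side lengths.

import numpy as np

def hyperbox_distances(x_i, x_j, r_i, r_j):
    # x_i, x_j: centers (length-p arrays); r_i, r_j: half side lengths (length-p arrays)
    x_i, x_j, r_i, r_j = map(np.asarray, (x_i, x_j, r_i, r_j))
    gap = np.abs(x_i - x_j)
    d_low = np.sqrt(np.sum(np.maximum(0.0, gap - (r_i + r_j)) ** 2))   # lower distance of Eq. (2)
    d_up = np.sqrt(np.sum((gap + (r_i + r_j)) ** 2))                   # upper distance of Eq. (2)
    return d_low, d_up

# Two rectangles (p = 2) with arbitrarily chosen centers and half side lengths.
print(hyperbox_distances([0.0, 0.0], [5.0, 1.0], [1.0, 0.5], [0.5, 0.5]))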
2.2 Stress Function
In the percentile model, the objective of MDS for percentile dissimilarities is to approximate xi^L_{ijk} and xi^U_{ijk} by the lower and upper distances d^L_{ijk} and d^U_{ijk} between the two kth hyperboxes of two sets of nested hyperboxes in the least-squares sense. In other words, we estimate values of X and R_k (k = 0, ..., K) that minimize the following stress function [3]:

\sigma^{2}_{Hist}(X, R_0, \dots, R_K) = \sum_{k=0}^{K}\sum_{i=1}^{n}\sum_{j=i}^{n} w_{ij}\bigl(\xi^{L}_{ijk} - d^{L}_{ij}(X, R_k)\bigr)^{2} + \sum_{k=0}^{K}\sum_{i=1}^{n}\sum_{j=i}^{n} w_{ij}\bigl(\xi^{U}_{ijk} - d^{U}_{ij}(X, R_k)\bigr)^{2}
\quad (\text{subject to } 0 < r_{is0} \le \dots \le r_{isK}),   (3)
where w_{ij} >= 0 is a given weight. The constraint of this function (0 < r_{is0} <= ... <= r_{isK}) makes it difficult to optimize sigma^2_Hist by iterative majorization. In the Hist-Scal algorithm, X and R-bar_k (k = 0, ..., K) are estimated by iterative majorization as an unconstrained optimization, and weighted monotone regression is then applied to R-bar_k to derive values of R_k (k = 0, ..., K) that satisfy the constraint. Therefore, with the Hist-Scal algorithm, an improvement of the solution at each iteration cannot be guaranteed. In this paper, in order to resolve the difficulty caused by the constraint 0 < r_{is0} <= ... <= r_{isK}, we consider sigma^2_Hist as a function of X, R_0, and A_k (k = 1, ..., K). We therefore optimize the following stress function, called Percen-Stress:

\sigma^{2}_{percen}(X, R_0, A_1, \dots, A_K) = \sum_{i=1}^{n}\sum_{j=i}^{n} w_{ij}\bigl(\xi^{L}_{ij0} - d^{L}_{ij}(X, R_0)\bigr)^{2} + \sum_{k=1}^{K}\sum_{i=1}^{n}\sum_{j=i}^{n} w_{ij}\bigl(\xi^{L}_{ijk} - d^{L}_{ij}(X, R_0, A_1, \dots, A_k)\bigr)^{2}
+ \sum_{i=1}^{n}\sum_{j=i}^{n} w_{ij}\bigl(\xi^{U}_{ij0} - d^{U}_{ij}(X, R_0)\bigr)^{2} + \sum_{k=1}^{K}\sum_{i=1}^{n}\sum_{j=i}^{n} w_{ij}\bigl(\xi^{U}_{ijk} - d^{U}_{ij}(X, R_0, A_1, \dots, A_k)\bigr)^{2}
\quad (\text{subject to } a_{isk} > 0;\; k = 1, \dots, K).   (4)

The constraint in Percen-Stress requires only that each element of A_k (k = 1, ..., K) is non-negative. This constrained optimization can therefore be carried out using iterative majorization alone. That is, we can develop an improved algorithm that guarantees that the solution improves at each iteration.
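The following sketch evaluates Percen-Stress (4) for a given configuration; it only illustrates the objective function (it reuses the hyperbox distances of Eq. (2)) and is not the optimization described in the next sections. The array layout and variable names are ours.

import numpy as np

def box_dist(xi, xj, ri, rj):
    gap = np.abs(xi - xj)
    return (np.sqrt(np.sum(np.maximum(0.0, gap - (ri + rj)) ** 2)),
            np.sqrt(np.sum((gap + (ri + rj)) ** 2)))

def percen_stress(X, R0, A, xi_L, xi_U, W):
    # X: (n, p) centers; R0: (n, p) innermost half lengths; A: list of K non-negative (n, p) increments.
    # xi_L, xi_U: (K+1, n, n) lower/upper percentile dissimilarities; W: (n, n) weights.
    n = X.shape[0]
    R = [R0]
    for Ak in A:                       # R_k = R_0 + A_1 + ... + A_k
        R.append(R[-1] + Ak)
    stress = 0.0
    for k, Rk in enumerate(R):
        for i in range(n):
            for j in range(i, n):      # self dissimilarities (j = i) are included
                dl, du = box_dist(X[i], X[j], Rk[i], Rk[j])
                stress += W[i, j] * ((xi_L[k, i, j] - dl) ** 2 + (xi_U[k, i, j] - du) ** 2)
    return stress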
3 Majorizing Function of Stress Function First, we introduce a majorizing function and the framework of iterative majorization. A majorizing function of f : Rn×n → R is defined as a function g : Rn×n × Rn×n → R that satisfies the following conditions:
i) f(X_0) = g(X_0, X_0) for X_0 in R^{n x n},
ii) f(X) <= g(X, X_0) (X in R^{n x n}) for X_0 in R^{n x n}.

This majorizing function has a useful characteristic: for a given X_0 in R^{n x n} and X-tilde = arg min_X g(X, X_0),

f(\tilde{X}) = g(\tilde{X}, \tilde{X}) \le g(\tilde{X}, X_0) \le g(X_0, X_0) = f(X_0).   (5)

The framework of iterative majorization allows an optimum solution to be achieved by minimizing the majorizing function instead of the original function in each iteration. Iterative majorization comprises the following steps:

Step 1  Set X_0 to an initial matrix and epsilon (> 0) to the convergence criterion; k <- 0.
Step 2  If k = 0 or |f(X_k) - f(X_{k-1})| >= epsilon, go to Step 3; otherwise stop.
Step 3  k <- k + 1; compute X-tilde = arg min_X g(X, X_{k-1}).
Step 4  X_k <- X-tilde; go to Step 2.
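As a self-contained illustration of Steps 1-4 (a toy example of ours, not taken from the paper), the sketch below minimizes f(x) = sum_i |x - a_i| using the quadratic majorizer |u| <= u^2/(2|u_0|) + |u_0|/2, whose minimizer has a closed form; the stopping rule mirrors Step 2.

def iterative_majorization(a, x0=0.0, eps=1e-8, max_iter=200):
    # g(x, x0) = sum_i ((x - a_i)^2 / (2|x0 - a_i|) + |x0 - a_i| / 2) majorizes f at x0,
    # and arg min_x g is a weighted mean with weights 1/|x0 - a_i|.
    f = lambda x: sum(abs(x - ai) for ai in a)
    x, k = x0, 0
    while True:
        if k > 0 and abs(f_prev - f(x)) < eps:    # Step 2: stop when the improvement is below eps
            break
        k += 1
        f_prev = f(x)
        w = [1.0 / max(abs(x - ai), 1e-12) for ai in a]        # guard against division by zero
        x = sum(wi * ai for wi, ai in zip(w, a)) / sum(w)      # Step 3: minimize g(., x)
        if k >= max_iter:
            break
    return x, f(x)

print(iterative_majorization([1.0, 2.0, 7.0, 9.0, 10.0]))      # converges toward the median (7)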
From Eq. (5), it is guaranteed that the solution X_k improves at each iteration. The majorizing function used in iterative majorization should be a linear or quadratic function, for which the minimizer can easily be found. For the majorization algorithm for Percen-Stress, we derive a majorizing function. We can expand Percen-Stress sigma^2_percen as

\sigma^{2}_{percen}(X, R_0, A_1, \dots, A_K)
= \sum_{k=0}^{K}\sum_{i=1}^{n}\sum_{j=i}^{n} w_{ij}\,(\xi^{L\,2}_{ijk} + \xi^{U\,2}_{ijk})
+ \sum_{i=1}^{n}\sum_{j=i}^{n} w_{ij}\, d^{L}_{ij}(X,R_0)^{2} - 2\sum_{i=1}^{n}\sum_{j=i}^{n} w_{ij}\,\xi^{L}_{ij0}\, d^{L}_{ij}(X,R_0)
+ \sum_{i=1}^{n}\sum_{j=i}^{n} w_{ij}\, d^{U}_{ij}(X,R_0)^{2} - 2\sum_{i=1}^{n}\sum_{j=i}^{n} w_{ij}\,\xi^{U}_{ij0}\, d^{U}_{ij}(X,R_0)
+ \sum_{k=1}^{K}\sum_{i=1}^{n}\sum_{j=i}^{n} w_{ij}\bigl( d^{L}_{ij}(X,R_0,A_1,\dots,A_k)^{2} - 2\,\xi^{L}_{ijk}\, d^{L}_{ij}(X,R_0,A_1,\dots,A_k)\bigr)
+ \sum_{k=1}^{K}\sum_{i=1}^{n}\sum_{j=i}^{n} w_{ij}\bigl( d^{U}_{ij}(X,R_0,A_1,\dots,A_k)^{2} - 2\,\xi^{U}_{ijk}\, d^{U}_{ij}(X,R_0,A_1,\dots,A_k)\bigr).   (6)

In Eq. (6), w_{ij}, xi^L_{ijk}, and xi^U_{ijk} are fixed. We therefore derive inequalities for each term other than (xi^{L 2}_{ijk} + xi^{U 2}_{ijk}) and obtain the following inequality for Percen-Stress sigma^2_percen:

\sigma^{2}_{percen}(X, R_0, A_1, \dots, A_K) \le
 \sum_{k=0}^{K}\sum_{i=1}^{n}\sum_{j=i}^{n} w_{ij}\,(\xi^{L\,2}_{ijk} + \xi^{U\,2}_{ijk})
+ \sum_{s=1}^{p}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\sum_{k=0}^{K} \bigl(\alpha^{(1)}_{ijsk} + \alpha^{(2)}_{ijk}\bigr)(x_{is}-x_{js})^{2}
- 2\sum_{s=1}^{p}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\sum_{k=0}^{K} \bigl(\alpha^{*(1)}_{ijsk} + \alpha^{*(2)}_{ijsk} + \alpha^{*(3)}_{ijsk}\bigr)(x_{is}-x_{js})(y_{is}-y_{js})
+ \sum_{s=1}^{p}\sum_{i=1}^{n}\sum_{j=i}^{n}\sum_{k=0}^{K} \Bigl[\bigl(\beta^{(1)}_{ijsk}+\beta^{(2)}_{ijsk}+\beta^{(3)}_{ijsk}\bigr) r_{is0}^{2} + \bigl(\beta^{(1)}_{jisk}+\beta^{(2)}_{jisk}+\beta^{(3)}_{jisk}\bigr) r_{js0}^{2}\Bigr]
- 2\sum_{s=1}^{p}\sum_{i=1}^{n}\sum_{j=i}^{n}\sum_{k=0}^{K} \bigl(\beta^{*(1)}_{ijsk} + \beta^{*(2)}_{ijsk}\bigr)(r_{is0} + r_{js0})
+ \sum_{s=1}^{p}\sum_{i=1}^{n}\sum_{j=i}^{n}\sum_{k=1}^{K}\sum_{l=1}^{k} \Bigl[\bigl(\gamma^{(1)}_{ijskl}+\gamma^{(2)}_{ijskl}+\gamma^{(3)}_{ijskl}\bigr) a_{isl}^{2} + \bigl(\gamma^{(1)}_{jiskl}+\gamma^{(2)}_{jiskl}+\gamma^{(3)}_{jiskl}\bigr) a_{jsl}^{2}\Bigr]
- 2\sum_{s=1}^{p}\sum_{i=1}^{n}\sum_{j=i}^{n}\sum_{k=1}^{K}\sum_{l=1}^{k} \bigl(\gamma^{*(1)}_{ijsk} + \gamma^{*(2)}_{ijsk}\bigr)(a_{isl} + a_{jsl})
+ \sum_{s=1}^{p}\sum_{i=1}^{n}\sum_{j=i}^{n}\sum_{k=0}^{K} \bigl(\delta^{(1)}_{ijsk} + \delta^{(2)}_{ijsk}\bigr),   (7)
where the coefficients are non-negative weights determined by the current estimates (Y, Q_0, B_1, ..., B_K), whose elements are denoted y_{is}, q_{is0}, and b_{isl}. For the quadratic terms in (x_{is} - x_{js}),

\alpha^{(1)}_{ijs0} = w_{ij}\,\frac{|y_{is}-y_{js}| + (q_{is0}+q_{js0})}{|y_{is}-y_{js}|}, \qquad \alpha^{(2)}_{ij0} = 2\,w_{ij},

\alpha^{(1)}_{ijsk} = w_{ij}\,\frac{|y_{is}-y_{js}| + (q_{is0}+q_{js0}) + \sum_{l=1}^{k}(b_{isl}+b_{jsl})}{|y_{is}-y_{js}|}, \qquad \alpha^{(2)}_{ijk} = (k+2)\,w_{ij}.

The starred coefficients alpha*^(1), alpha*^(2), alpha*^(3) and the coefficients beta^(1), beta^(2), beta^(3), beta*^(1), beta*^(2), gamma^(1), gamma^(2), gamma^(3), gamma*^(1), gamma*^(2) are defined in the same spirit, piecewise in the current estimate: each combines w_{ij}, the target value xi^L_{ijk} or xi^U_{ijk}, the quantities |y_{is}-y_{js}|, (q_{is0}+q_{js0}), and sum_{l=1}^{k}(b_{isl}+b_{jsl}), and the current lower or upper distance d^L_{ij}(Y, Q_0, B_1, ..., B_k) or d^U_{ij}(Y, Q_0, B_1, ..., B_k); the cases distinguish whether |y_{is}-y_{js}| >= (q_{is0}+q_{js0}) + sum_{l=1}^{k}(b_{isl}+b_{jsl}) and whether the corresponding distance and |y_{is}-y_{js}| are strictly positive, and the coefficient is set to zero (or to a constant such as 2w_{ij} or (k+2)w_{ij}) in the remaining cases, following the same construction as the majorizations used for I-Scal and Hist-Scal [2, 3]. The terms delta^(1)_{ijsk} and delta^(2)_{ijsk} do not depend on X, R_0, or A_1, ..., A_K and are therefore constants of the majorization. Consequently, the right-hand side of (7) is a quadratic function of the elements x_{is}, r_{is0}, and a_{isl}, and it can be minimized in closed form for each column x_s, r_{0s}, and a_{sk}.
Therefore, the hyperbox-model Percen-Scal algorithm is given by the following steps:

Step 1  Set X_0 to an initial matrix and set R_00 and A_k0 (k = 1, ..., K) to non-negative matrices; t <- 0. Set epsilon to a small positive number as the convergence criterion.
Step 2  While t = 0 or |sigma^2_percen(X_t, R_0t, A_1t, ..., A_Kt) - sigma^2_percen(X_{t-1}, R_0(t-1), A_1(t-1), ..., A_K(t-1))| >= epsilon do
Step 3    t <- t + 1; Y <- X_{t-1}; Q_0 <- R_0(t-1); B_k <- A_k(t-1).
          Compute F_s, F*_s, G_s, g*_s, H_sk, and h*_sk.
          For s = 1 to p do
            x_s <- F_s^+ F*_s y_s;  r_0s <- G_s^{-1} g*_s.
            For k = 1 to K do
              a_sk <- H_sk^{-1} h*_sk.
            end for
          end for
          X_t <- X;  R_0t <- R_0;  A_kt <- A_k.
Step 4  end while.
5 Numerical Example
In this section, we demonstrate the utility of the hyperbox model Percen-Scal algorithm. We set convergence criterion ε = 0.0001 and apply the hyperbox model Percen-Scal and Hist-Scal algorithm (modified to deal with self dissimilarities) using a sample of ideal artificial data by 50 random starts and compare their results. Here, the ideal data is the percentile dissimilarities consisting of distances between
Fig. 2 The ideal artificial data and distributions of stress obtained by 50 random starts and the relation between stress and the number of iterations; (a) the ideal artificial data, (b) distribution of Hist-Stress, (c) distribution of Percen-Stress
two nested rectangles in Fig. 2(a). In this case, the value of Percen-Stress (and of Hist-Stress) at the global minimum is 0. Fig. 2(b) and 2(c) show the distributions of Hist-Stress and Percen-Stress over the 50 random starts, respectively. If the number of objects or percentiles is small, the Hist-Scal algorithm is stable. However, the stability of the Hist-Scal algorithm decreases as the number of objects and percentiles increases. In particular, Fig. 3(a) shows that in such cases the Hist-Scal algorithm is not stable. On the other hand, Fig. 3(b) shows that the Percen-Scal algorithm is stable in such cases, and we can obtain a good solution that is close to the global minimum.
Fig. 3 Relation between stress and the number of iterations; (a) relation between Hist-Stress and the number of iterations, and (b) the relation between Percen-Stress and the number of iterations
Acknowledgment  I would like to give heartfelt thanks to the anonymous reviewers. This research was partially supported by the Collaborative Research Program 2010, Information Initiative Center, Hokkaido University, Sapporo, Japan.
References
1. Denœux, T., Masson, M.: Multidimensional scaling of interval-valued dissimilarity data. Pattern Recognition Letters 21, 82-92 (2000)
2. Groenen, P.J.F., Winsberg, S., Rodríguez, O., Diday, E.: I-Scal: Multidimensional scaling of interval dissimilarities. Computational Statistics & Data Analysis 51, 360-378 (2006)
3. Groenen, P.J.F., Winsberg, S.: Multidimensional scaling of histogram dissimilarities. In: Batagelj, V., Bock, H.H., Ferligoj, A., Ziberna, A. (eds.) Data Science and Classification, pp. 581-588. Springer, Heidelberg (2006)
4. Terada, Y., Yadohisa, H.: Hypersphere model MDS for interval-valued dissimilarity data. In: Proceedings of the 27th Annual Meeting of the Japanese Classification Society, pp. 35-38 (2010)
Predictive Data Mining Driven Architecture to Guide Car Seat Model Parameter Initialization Sabbir Ahmed, Ziad Kobti, and Robert D. Kent
Abstract. Researchers in both government and non-government organizations are constantly looking for patterns among drivers that may influence proper use of car seats. Such patterns help them predict the behaviours of drivers that shape their decision to place a child in the proper restraint when traveling in an automobile. Previous work on a multi-agent based prototype, with the goal of simulating car seat usage patterns among drivers, has shown good prospects as a tool for researchers. In this work we aim at exploring the parameters that initialize the model. The complexity of the model is driven by a large number of parameters and a wide array of values. Existing data from road surveys are examined using existing data mining tools in order to explore, beyond basic statistics, which parameters and values are most relevant for a more realistic model run. The intent is to make the model replicate real-world conditions by mimicking the survey data as closely as possible. A data mining driven architecture that can dynamically use data collected from various surveys and agencies in real time can significantly improve the quality and accuracy of the agent model.
1 Introduction Road accidents are one of the most significant health risks all over the world. Car seats are used to protect children from injury during such accidents. Use of car safety seats can significantly reduce injury, yet misuse remains very high even in developed countries like Canada. Many government and non government agencies alike are actively involved in research to investigate how to reduce such misuse in an effort to increase child safety. One approach towards understanding the cause of misuse of car safety seats is to discover patterns among drivers who have higher probability of improper use. For example, does family income play a role or does education have a higher effect? These patterns will help identify the high risk group of drivers and develop the appropriate targeted interventions such as education to effectively reduce the probability of misuse. With this goal in mind health Sabbir Ahmed · Ziad Kobti · Robert D. Kent School of Computer Science, University of Windsor Windsor, ON, Canada, N9B-3P4 e-mail: {ahmedp,kobti,rkent}@uwindsor.ca *
J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 789–797. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
care researchers and computer scientists at the University of Windsor collaborated to produce a multi-agent model for child vehicle safety injury prevention [1] (from here on referred as the “Car Seat Model”). The computer simulation presents the user with a set of parameters in order to define and control various characteristics and behaviours. This however is tedious manual process that requires the researcher prior knowledge of the case study being simulated. For instance, to simulate a particular population group one would need to carry out targeted surveys and collect useful data that describes the group and subsequently build a set of corresponding parameter values to initialize the Car Seat Model. With the advancement of data collection and survey methods enabling live data generation from the field, an ambitious objective is to enable the rapid automation of simulation parameters in an effort to produce a scalable model constantly seeded with real values. An ambitious objective is to enable the model to use new survey data dynamically as they become available. That is if new survey data reflect new patterns in the usage of car seat or if they introduce new parameters (e.g. education level of drivers), the Car Seat Model should be able to make use of this new knowledge and simulate it accordingly. The simulation itself is used as a decision support tool for life agent analysis, outcome prediction and risk estimation. Furthermore, such agent based model needs to validate agent’s behaviour with more up to date real life data, such as field survey data. In this paper we propose a data mining driven architecture which would provide the Car Seat Model with an interface to initialize its agents based on real world data, predict agent’s behaviour, and validate simulation results. In the next section we discuss the need of real world data in multi-agent simulations. Next, we present why data mining techniques are a good fit to answer such needs. Then we present the proposed architecture, tools and survey data used to implement the system with a sample proof of concept experiment.
2 Multi Agent Simulation Modeling A complex system includes qualities such as being heterogeneous, dynamic, and often agent based. While standard mathematical and statistical models can be used to analyse data and systems, they fall short when the degree of realism increases along with the complexity of the system. Agent based models become more prevalent when it comes to more realistic and complex systems. The current Car Seat Model seeks to examine driver behaviour for selecting a car seat and properly positioning it in a vehicle at the time of driving. Events such as accidents, driving conditions are a dynamic part of the model challenging the driver (agent). One goal of the simulation is to provide the observer a first person eye view of the world as it unfolds in the simulated world [3, 4]. In most cases these simulations begin with values taken from a uniform random distribution [2]. This can be a poor choice as it may not necessarily reflect the real world and in turn may affect the output. In other cases such as the current Car Seat Model the initial values are based on statistical measures taken from one particular survey. The limitation of
such approach is that when new set of data become available from new surveys, these parameters need to be recalculated and changed accordingly. Moreover, another key issue inhibiting researchers from fields outside of computer science to accept agent-based modeling systems (ABMS) as a tool for the modeling and simulation of real-life systems is the difficulty, or even lack of developing standard verification and validation methodologies. Verification aims to test whether the implementation of the model is an accurate representation of the abstract model. Hence, the accuracy of transforming a created model into the computer program is tested in the verification phase. Model validation is used to check that the implemented model can achieve the proposed objectives of the simulation experiments. That is, to ensure that the built model is an accurate representation of the modeled phenomena it intends to simulate [8].
2.1 Data Driven Models
Much work has been done on agent-based models that use real-world data to tackle these issues. A popular example is the simulation model of the extinction of the Anasazi civilization, where data collected from observed history are used to compare with the simulation result [5]. Another example is the water demand models of [5, 9], in which data about household location and composition, consumption habits, and water management policies are used to steer the model, with good effect when the model is validated against actual water usage patterns. In [2], the author encourages agent-based designers to adopt the data-driven trend and merge concepts taken from microsimulation into agent-based models by taking the following basic steps:
1. Collection of data from the social world.
2. Design of the model, which should be guided by some of the empirical data (e.g. equations, generalisations and stylised facts, or qualitative information provided by experts) and by the theory and hypotheses of the model.
3. Initialisation of the model with static data (from surveys and the census).
4. Simulation and output of results.
5. Validation, comparing the output with the collected data. The data used in validation should not be the same as that used in earlier steps, to ensure the independence of the validation tests from the model design.
3 Data Mining Potentials Data mining is the process of extracting patterns from data into knowledge. Such extracted knowledge from data will be very useful for multi-agent models, such as the Car Seat Model to address issues discussed in the previous section. Recently data mining driven architecture for agent-based models was presented by [6] [8].
Here the authors proposed several methods for integrating data mining into agentbased models. There are various types and categories of data mining techniques available. Usually these techniques are adopted from various fields of Computer Science, Artificial Intelligence and Statistics. These techniques can be categorized into two major branches of data mining, namely predictive and descriptive. Predictive data mining is used to make prediction about values of data using known results found from different data. Various predictive techniques are available such as Classification, Regression etc. Descriptive data mining is used to indentify relationships in data. Some of the descriptive data mining techniques are clustering, association rule and others. Due to the nature of our goal for this paper we will focus on predictive data mining using decision tree classification. However, other predictive methods can be explored which we intend to do in our future work in this area.
3.1 Classification Using Decision Tree
Classification using a decision tree is one of the most popular data mining techniques used today. A decision tree is a series of questions systematically arranged so that each question queries an attribute (e.g. sex of the driver) and branches based on the value of the attribute. At the leaves of the tree are placed predictions of the class variable (e.g. type of car seat used) [6]. Among the various decision tree algorithms, C4.5 is the most efficient. It is a descendant of the ID3 algorithm and was designed by J. Ross Quinlan. The C4.5 algorithm calculates the entropy of the class attribute, which in general is the measure of the uncertainty associated with the selected attribute. The entropy is calculated using the following formula:

E(S) = -\sum_{i=1}^{n} p(x_i)\,\log_{b} p(x_i)

where p(x_i) is the probability mass function of outcome x_i, b is the number of possible outcomes of the class random variable, and n is the total number of attributes. Once the entropy of the class variable is known, the information gain for each attribute is calculated. Information gain, which is calculated using the formula below, is the expected reduction in entropy caused by partitioning the examples according to the attribute. The C4.5 algorithm puts the attribute with the highest information gain at the top of the tree and recursively builds the tree using the attribute with the next highest information gain value [11].

Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|}\, Entropy(S_v)
where v ranges over the set of possible values of attribute A; for instance, if we are calculating the information gain of the attribute Gender, then v takes two values, Male and Female. S denotes the entire dataset, S_v denotes the subset of the dataset for which attribute A has value v, and |.| denotes the size of a dataset (in number of instances).
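A small sketch of these two formulas (ours, using a made-up toy table rather than the 3,222 survey records, and base-2 logarithms as is usual for decision trees):

import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attr_index, labels):
    total = len(labels)
    gain = entropy(labels)
    for v in set(row[attr_index] for row in rows):
        subset = [labels[i] for i, row in enumerate(rows) if row[attr_index] == v]
        gain -= (len(subset) / total) * entropy(subset)
    return gain

# Toy records: (child age, driver's sex) with an invented car-seat-type label.
rows   = [("1-3", "female"), ("1-3", "male"), ("4-8", "female"),
          ("4-8", "male"), ("9-14", "female"), ("9-14", "male")]
labels = ["F", "F", "B", "S", "S", "S"]
print("Gain(child age) =", round(information_gain(rows, 0, labels), 3))
print("Gain(sex)       =", round(information_gain(rows, 1, labels), 3))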
4 Proposed Architecture
Our goal here is to design an architecture that will collect survey data from a database and generate a decision tree model on the fly. We also need to provide an application program interface that can be used by the Car Seat Model for initialization, prediction, and validation. Based on these requirements, our proposed architecture is presented in Figure 1.
Fig. 1 The proposed architecture.
The system consists of three modules, namely:
1. Data pre-processing module: this module is used to pre-process (e.g. data cleansing) the data collected from the database. The user is able to use this module to select the desired attributes before the data mining algorithm is applied. This module provides flexibility to the agent-based model to select relevant attributes from the real-world data. It is also able to split the data into two sets, namely the training set and the test set. The training set is used to generate the decision tree model, and the test set data are used to initialize the agents in the simulation.
2. Data mining module: the data mining module uses the open source Weka library to generate the decision tree model from the training set data produced by the pre-processing module.
3. API module: this module provides a Java-based application program interface to the Car Seat Model to initialize agents, predict agent behaviour, and validate the simulation. It uses the model generated by the data mining module and returns information (such as predictions) to the Car Seat Model.
A minimal sketch of this flow (pre-process, train, predict) is given below.
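The actual system is implemented in Java around Weka's C4.5 (J48) learner; purely as an illustration of the same pre-process / train / predict flow, here is a sketch using scikit-learn's decision tree (a CART-style learner, not C4.5) on invented records. None of the values or names below come from the real survey or system.

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Invented survey-like records: (location, sex, ethnicity, child age) -> seat type.
records = [
    ("ON", "female", "Caucasian",  "1-3 years",  "F"),
    ("ON", "male",   "Asian",      "<1 year",    "R"),
    ("AB", "female", "Aboriginal", "4-8 years",  "B"),
    ("AB", "male",   "Caucasian",  "4-8 years",  "S"),
    ("BC", "female", "SE Asian",   "9-14 years", "S"),
    ("QC", "male",   "Caucasian",  "1-3 years",  "F"),
    ("MB", "female", "Caucasian",  "<1 year",    "R"),
    ("NF", "male",   "Asian",      "9-14 years", "S"),
]

# 1. Pre-processing module: encode categorical attributes as integers and split 66/34.
columns = list(zip(*records))
encoders = [{v: i for i, v in enumerate(sorted(set(col)))} for col in columns]
X = [[encoders[c][row[c]] for c in range(4)] for row in records]
y = [encoders[4][row[4]] for row in records]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.34, random_state=0)

# 2. Data mining module: fit an entropy-based decision tree on the training set.
model = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X_train, y_train)

# 3. API module: the agent model requests a prediction for a new driver profile.
new_driver = [encoders[0]["AB"], encoders[1]["male"], encoders[2]["Asian"], encoders[3]["4-8 years"]]
seat_codes = {i: v for v, i in encoders[4].items()}
print("predicted seat type:", seat_codes[model.predict([new_driver])[0]])
print("hold-out accuracy:", model.score(X_test, y_test))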
4.1 Tools Used Following tools will be used to implement the proposed architecture: 1. Weka - The Pre processing module and the data mining module used the open source Weka Library[10]. 2. Java using Netbeans to develop the API, and data mining Module. 3. The database we used here is the SQL Server 2008.
4.2 Data Used
To be able to implement the proposed architecture we collected some survey data. The survey was conducted all over Canada. Among the many fields available in the survey, the following fields were used to build the initial decision tree model:
Location Code {ON, NF, BC, QC, AB, YT, MB}
Driver's Sex {male, female}
Ethnicity {Asian, SE Asian, Aboriginal, Middle Eastern, African Canadian, Caucasian}
Child Age {< 1 year, 1-3 years, 4-8 years, 9-14 years}
Car Seat Type {Not restrained (N), Rear-facing infant seat (R), Forward-facing child seat (F), Booster seat (B), Seat belt (S)}
From the above fields the Car Seat Type field was chosen as the class random variable, meaning the decision tree model would be able to predict type of the Car Seat to be used by the driver given values of the other attributes. After initial cleaning and pre processing of the raw survey data we had some 3222 sets of data which we used to feed into the Data Mining module to generate C45 Decision Tree Model.
4.3 Results A total of 3,222 record sets were available after Data Cleaning and Pre-processing. We used 66% to train the model and rest to test. Out of the 34% test cases 836 instances were classified correctly, which is about 74.78% accuracy. Decision tree that was generated is below:
Fig. 2 The decision tree generated by Weka tool.
Based on the decision tree generated from the survey data (Figure 2), we can see that the age of the child is at the root of the tree. This means that the type of car seat used by a driver mostly depends on the age of the child. We also notice that the model looks into other attributes of the drivers, such as ethnicity or sex, for the case where the child's age is between 4 and 8 years. For this age range the survey data show variation in seat selection between provinces and, taking Alberta as an example, we find that ethnicity plays a role. For other ranges, the model predicts the outcome without looking into these attributes. For example, for ages 1-3 years, regardless of the values of the other attributes, the model predicts that the driver will use a forward-facing car seat (F).
The forward-facing seat choice for ages between 1 and 3 is mandated by law. From this we can see that the problem of car seat usage arises mostly for children aged 4-8 years. Obviously more data and more attributes are required to concretely back such a hypothesis. This is why our proposed application allows more data to be fed in, so that it can keep on learning the real-world phenomena. Our application can use the above tree model to predict the behaviour of a new set of drivers. A sample of the Java code is below:

CarSeatC45 newmodel = new CarSeatC45(data.arff)
....
AgentDriver D = new AgentDriver();
D.setGender("Male");
D.setChildAge("

<Step1> Begin with K clusters, each containing only a single PD, K = n. Calculate the distances between the PDs.
<Step2> Search for the minimum distance among the K clusters, and let the pair attaining it be the selected clusters. Combine their PDs into a new cluster, described by the mixture distribution of the members, where the mixture weights are equal. Let K be K - 1. If K > 1, go to Step 3; otherwise go to Step 4.
<Step3> Calculate the distance between the new cluster and the other clusters, and go back to Step 2.
<Step4> Draw the dendrogram.

The Kullback-Leibler divergence is the natural way to define a distance measure between probability distributions [8], but it is not symmetric. We would like to use the symmetric Kullback-Leibler (symmetric KL) divergence as the distance between concepts. The symmetric KL-divergence between two distributions s1 and s2 is
D(s_1(x), s_2(x)) = D(s_1(x) \,\|\, s_2(x)) + D(s_2(x) \,\|\, s_1(x))
= \int_{-\infty}^{\infty} s_1(x)\,\log\frac{s_1(x)}{s_2(x)}\,dx + \int_{-\infty}^{\infty} s_2(x)\,\log\frac{s_2(x)}{s_1(x)}\,dx,   (1)

where D(s_1 \| s_2) is the KL divergence from s_1 to s_2 and D(s_2 \| s_1) is the one from s_2 to s_1.
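As a concrete check of definition (1) for the univariate normal PDs used in this paper, the following sketch (function and variable names ours) evaluates the symmetric KL divergence with the closed form that appears as Eq. (3) in Section 4.2 below.

import math

def sym_kl_gauss1d(mu_i, var_i, mu_j, var_j):
    # Symmetric KL divergence between N(mu_i, var_i) and N(mu_j, var_j); cf. Eq. (3).
    kl_ij = 0.5 * (math.log(var_j / var_i) + (var_i + (mu_i - mu_j) ** 2) / var_j - 1.0)
    kl_ji = 0.5 * (math.log(var_i / var_j) + (var_j + (mu_j - mu_i) ** 2) / var_i - 1.0)
    return kl_ij + kl_ji

# Example with the two PDs N(89, 121) and N(25, 625) that appear in Table 1 below.
print(sym_kl_gauss1d(89.0, 121.0, 25.0, 625.0))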
4.2 Distance between PDs
In Section 4.1, we use the symmetric KL-divergence as the distance between PDs. Let the PDs be d-dimensional N(mu_i, Sigma_i) and N(mu_j, Sigma_j). The symmetric KL-divergence in Step 1 is

D\bigl(p(x|\mu_i, \Sigma_i),\, p(x|\mu_j, \Sigma_j)\bigr) = \tfrac{1}{2}\Bigl\{ \mathrm{tr}(\Sigma_i \Sigma_j^{-1}) + \mathrm{tr}(\Sigma_j \Sigma_i^{-1}) + \mathrm{tr}\bigl((\Sigma_i^{-1} + \Sigma_j^{-1})(\mu_i - \mu_j)(\mu_i - \mu_j)^{T}\bigr) \Bigr\} - d.   (2)

For d = 1,

D\bigl(p(x|\mu_i, \sigma_i),\, p(x|\mu_j, \sigma_j)\bigr) = \tfrac{1}{2}\Bigl( \log\frac{\sigma_j^2}{\sigma_i^2} + \frac{\sigma_i^2 + (\mu_i - \mu_j)^2}{\sigma_j^2} \Bigr) + \tfrac{1}{2}\Bigl( \log\frac{\sigma_i^2}{\sigma_j^2} + \frac{\sigma_j^2 + (\mu_j - \mu_i)^2}{\sigma_i^2} \Bigr) - 1.   (3)

After Step 2, we need the symmetric KL-divergence between Gaussian mixture distributions. However, it cannot be computed analytically. We could instead use Monte-Carlo simulation to approximate the symmetric KL-divergence, but the drawback of Monte-Carlo techniques is the extensive computational cost and the slow convergence properties. Furthermore, due to the stochastic nature of the Monte-Carlo method, the approximations of the distance can vary between computations. In this paper, we use the unscented transform method proposed by Goldberger et al. [5]. We show the approximation of D(s_1 \| s_2) in (1). Let cluster c_1 contain the d-dimensional distributions N_d(mu_m^(1), Sigma_m^(1)) (m = 1, ..., M). The expression for c_1 is s_1(x) = sum_{m=1}^{M} omega_m^(1) p(x | theta_m^(1)), where omega_m^(1) is a mixture weight, p(x | theta_m^(1)) is the m-th probability density function of N_d(mu_m^(1), Sigma_m^(1)), and theta_m^(1) = (mu_m^(1), Sigma_m^(1)). Similarly, cluster c_2 contains the d-dimensional distributions N_d(mu_l^(2), Sigma_l^(2)) (l = 1, ..., L), and the expression for c_2 is s_2(x) = sum_{l=1}^{L} omega_l^(2) p(x | theta_l^(2)). The approximation of the KL-divergence from s_1 to s_2 using the unscented transform method is

D(s_1 \| s_2) \approx \frac{1}{2d} \sum_{m=1}^{M} \omega_m^{(1)} \sum_{k=1}^{2d} \log \frac{s_1(o_{m,k})}{s_2(o_{m,k})},   (4)
where the o_{m,t} are sigma points. They are chosen as follows:

o_{m,t} = \mu_m^{(1)} + \sqrt{d}\,\bigl(\sqrt{\Sigma_m^{(1)}}\bigr)_t, \qquad o_{m,t+d} = \mu_m^{(1)} - \sqrt{d}\,\bigl(\sqrt{\Sigma_m^{(1)}}\bigr)_t,   (5)

where (\sqrt{\Sigma_m^{(1)}})_t is the t-th column of the matrix square root of Sigma_m^(1). Then,

o_{m,t} = \mu_m^{(1)} + \sqrt{d\,\lambda_{m,t}^{(1)}}\; u_{m,t}^{(1)}, \qquad o_{m,t+d} = \mu_m^{(1)} - \sqrt{d\,\lambda_{m,t}^{(1)}}\; u_{m,t}^{(1)},   (6)

where t = 1, ..., d, mu_m^(1) is the mean vector of the m-th normal distribution in s_1, lambda_{m,t}^(1) is the t-th eigenvalue of Sigma_m^(1), and u_{m,t}^(1) is the corresponding eigenvector. If d = 1, the sigma points are simply mu_m^(1) +/- sigma_m^(1). We can calculate the approximation of D(s_2 \| s_1) in the same way. Substituting these approximations into (1), we obtain the symmetric KL-divergence, which we set as the distance between clusters c_1 and c_2.
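A sketch of the unscented approximation (4)-(6) for the univariate case relevant to the VAS data (d = 1, sigma points mu_m +/- sigma_m); the equal-weight mixture representation of a merged cluster follows Step 2 of the clustering procedure, and all function and variable names are ours.

import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(x, components):
    # components: list of (weight, mu, var)
    return sum(w * normal_pdf(x, mu, var) for w, mu, var in components)

def kl_unscented(s1, s2):
    # Eq. (4) with d = 1: the sigma points of each component are mu +/- sigma (Eqs. (5)-(6)).
    total = 0.0
    for w, mu, var in s1:
        sigma = math.sqrt(var)
        for o in (mu + sigma, mu - sigma):
            total += w * math.log(mixture_pdf(o, s1) / mixture_pdf(o, s2))
    return total / 2.0

def sym_kl_unscented(s1, s2):
    return kl_unscented(s1, s2) + kl_unscented(s2, s1)

# c1: a merged cluster of two PDs with equal weights; c2: a cluster with a single PD.
c1 = [(0.5, 89.0, 121.0), (0.5, 25.0, 625.0)]
c2 = [(1.0, 23.5, 342.25)]
print(sym_kl_unscented(c1, c2))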
5 An Application to the VAS Data
In this section, we apply our proposed method to real VAS data from the Keio University School of Medicine. The data are masked and are not tied to any information that would identify a patient. For comparison with a traditional method, we also apply the centroid method to the same data.
5.1 Medical Questionnaire in Keio University School of Medicine
The Center for Kampo Medicine, Keio University School of Medicine, administers a questionnaire to patients to support medical decision making. The questionnaire includes one set of questions about their subjective symptoms. There are 244 yes-no questions and 118 visual analogue scale questions, for example, "How do you feel pain with urination?". Patients answer these questions every time they come to Keio University, so doctors can follow fluctuations in the severity of each patient's symptoms.
5.2 Data Description and Result
For our analysis, we deal with a question that asks how cold a patient feels: "Do you feel cold in your left leg?". The data contain the first and second VAS values of 435 patients. We transform this data set into PDs. The next table shows extracts taken from the original data and their translation.
Table 1 VAS values and PDs

Patient ID   First VAS value   Second VAS value   N(mu, sigma^2)
1            100               78                 N(89, 121)
2            0                 50                 N(25, 625)
...          ...               ...                ...
435          42                5                  N(23.5, 342.25)
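The entries of Table 1 are consistent with describing each patient by the mean and the population variance of his or her two VAS values; since the defining formula for the PD is not reproduced above, the rule in the sketch below is inferred from the table rather than quoted from the paper.

def vas_to_pd(first, second):
    # Inferred from Table 1: mean of the two VAS values and their average squared deviation.
    mu = (first + second) / 2.0
    var = ((first - mu) ** 2 + (second - mu) ** 2) / 2.0
    return mu, var

for first, second in [(100, 78), (0, 50), (42, 5)]:
    print(vas_to_pd(first, second))   # (89.0, 121.0), (25.0, 625.0), (23.5, 342.25)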
The result of our method is shown in Fig. 2. The vertical axis of this dendrogram represents the distance between PDs. There seem to be three large clusters, A, B, and C. The PDs in cluster A have large variances, the members of cluster B have small variances, and the members of cluster C have small variances and large means. The level at which patients express their sense of the symptom thus appears in the features of the clusters.
Fig. 2 Dendrogram for PDs
The result of the centroid method is shown in Fig. 3.
Fig. 3 Dendrogram of Traditional Method
6 Concluding Remarks
In this paper, we defined the PD, obtained by transforming VAS responses into distribution-valued data, and we proposed a hierarchical clustering method for it. Comparing across a group of patients by using the raw VAS is difficult, but our method can do so. Through the simulation, we verified our model. In the future, we will define multidimensional PDs and apply our clustering method to them.
References 1. Billard, L., Diday, E.: Symbolic Data Analysis. Wiley, NewYork (2006) 2. Bock, H.-H., Diday, E.: Analysis of Symbolic Data: Exploratory Methods for Extracting Statistical Information from Complex Data. Springer, Berlin (2000) 3. Diday, E.: The symbolic approach in clustering and related methods of Data Analysis, Classification and Related Methods of Data Analysis. In: Bock, H. (ed.) Proc. IFCS, Aachen, Germany. North-Holland, Amsterdam (1988) 4. Diday, E.: The symbolic approach in clustering and related methods of Data Analysis. In: Bock, H. (ed.) Classification and Related methods Of Data Analysis, pp. 673–684. North-Holland, Amsterdam (1988) 5. Goldberger, J., Gordon, S., Greenspan, H.: An efficient image similarity measure based on approximations of KL-divergence between two gaussian mixtures. In: Proceedings of CVPR, pp. 487–494 (2006) 6. Gowda, K.C., Diday, E.: Symbolic clustering using a new dissimilarity measure. Pattern Recognition 24(6), 567–578 (1991) 7. Katayama, K., Suzukawa, A., Minami, H., Mizuta, M.: Linearly Restricted Principal Components in k Groups. In: Electronic Proceedings of Knowledge Extraction and Modeling, Villa Orlandi, Island of Capri, Italy (2006) 8. Kullback, S.: Information theory and statistics. Dover Publications, New York (1968)
Part IV Miscellanea
Acquisition of User’s Learning Styles Using Log Mining Analysis through Web Usage Mining Process Sucheta V. Kolekar, Sriram G. Sanjeevi, and D.S. Bormane
Abstract. Web Usage Mining is a broad area of Web Mining concerned with extracting patterns from the logging information produced by web servers. Web log mining is an essential part of the Web Usage Mining (WUM) process; it involves the transformation and interpretation of the logging information to predict patterns corresponding to different learning styles. Ultimately these patterns are used to classify the defined profiles. To provide a personalized learning environment to the user, in particular through an Adaptive User Interface, Web Usage Mining is an essential and useful step. In this paper we build the module of an e-learning architecture based on Web Usage Mining to assess the user's behavior through web log analysis. Keywords: E-learning, Log Mining Analysis, Adaptive Learning styles, Web Usage Mining.
1 Introduction Typically e-learning is Web based educational system which provides the same resources to all learners even though different learners need different information according to their level of knowledge, ways of learning style and preferences. Content sequencing of any course is a technology originated in the area of Intelligent / Adaptive Learning System with the basic aim to provide end user/student with the most suitable sequence of knowledge content to learn, and sequence of Sucheta V. Kolekar Research Scholar National Institute of Technology, Warangal, A.P., India e-mail:
[email protected] *
Sriram G. Sanjeevi National Institute of Technology, Warangal, A.P., India e-mail:
[email protected] D.S. Bormane JSPM’s Rajarshi Shahu College of Engineering, Pune, Maharashtra, India e-mail:
[email protected] J. Watada et al. (Eds.): Intelligent Decision Technologies, SIST 10, pp. 809–819. springerlink.com © Springer-Verlag Berlin Heidelberg 2011
810
S.V. Kolekar, S.G. Sanjeevi, and D.S. Bormane
learning tasks (examples, exercise, problems, contents etc) to work with. To implement the Adaptive personalized learning system, different types of knowledge required which is related to learner’s behavior, learning material and the representation of learning process [3]. Several kind of research is already addressed in the field of personalized e-learning; still there is a requirement to concentrate on the adaptation based on learning styles of the user. [7] In fact there are two basic classes of adaptation need to consider: Adaptive User Interface also called Adaptive Navigation and Adaptive content presentation. Some of the Learning Systems focusing on static modules of contents, which can be confine the learner to gain knowledge initiatively in some degree. But because of the difference between learners in study purposes, abilities and cognizant of knowledge, it need to build the intelligent and individual learning platform for all learners to highly improve their enthusiasm for learning. Ultimately the research emphasizes on following objectives for an Adaptive User Interface with respect to E-learning: • • • • • •
Create personalized environment Acquisition of user preferences Take control of task from the user Adaptive display management Reduce information overflow Provide help on new and complex function
To achieve above objectives it is very essential to introduce web data mining technique. Web usage mining is dealing with the extraction of knowledge from web server log files. It mines the useful behavior to define accurate user profiles for the intelligent adaptive personalized e-learning system The objectives of research in this paper: 1. Capture learning styles of an individual user/learner using log file method. 2. Improve the performance of web services. 3. Prepare web site structure to deal with users on an individual basis. 4. Provide accurate and complete picture of user’s web activities. 5. Generate sufficient data like server side and client side logs to perform meaning full mining tasks in the phase of Pattern Analysis. The paper is organized as follows; in section II basic architecture of Web Usage Mining is discussed with different kinds of applications. Section III talked about related work directly and indirectly existing on this issue. Section IV discussed about the proposed architecture about Web Usage Mining and detail description of steps.
2 Web Usage Mining Web Mining is divided into three important categories as per the part of Web based system are Web Content Mining, Web Structure Mining and Web Usage Mining. Web Content Mining deals with the discovery of useful information from the web contents. Web Structure Mining tries to discover the model of links structure from typical applications which are based on linked web pages.
Acquisition of User’s Learning Styles Using Log Mining Analysis
811
2.1 Basic Architecture of Web Usage Mining in E-Learning The general framework of Web Usage Mining is shown in fig. 1 for e-learning environment [12]. The first basic step of WUM is to collect and manage data related to users. It is called Data Preprocessing which includes Web Server Log files and some other important information like Student’s registration details and learning information. Second step is Pattern Discovery which utilizes some mining algorithms to generate the rules and modules to extract the learning patterns of the users based on learning styles which is recorded in log files. Pattern Analysis is the third step which is mainly converting the rules and modules into important knowledge by analyzing the user’s usage which is ultimately the input to Interface component manager to change the GUI according to user’s interest.
Fig. 1 Basic architecture of Web Usage Mining
2.2 Applications of Web Usage Mining 1. Personalization Service: Personalization for a user is achieved by keeping track of previously accessed pages e.g. individualized profiling for E-Learning. Making adaptive interface on the basis of her/his profile in addition to usage behavior is very attractive and essential feature of e-learning in the field of Education. Web usage mining is an excellent approach for achieving this objective which is described in next section. It will classify the user’s patterns as per the learning styles captured in log records. It can be used to find the learner’s interests and preferences by mining single learner’s browsing information such as[6] visiting pages, visiting frequency, content length of visit, time spent on each visit and preferences so as to provide each learner with the personalized adaptive pages which are accurate for his learning style and to forecast the learning behavior of each learned and to offer personalized education environment. 2. System Improvement: The improvement factor of the system is totally based on User’s Satisfaction. The performance and quality of web site are the important measures of user’s satisfaction. Web usage mining can provide useful
812
S.V. Kolekar, S.G. Sanjeevi, and D.S. Bormane
knowledge and patterns to design of Web Server in a better way so that sever can focus on special features like [5] Page Caching, Network Transmission, Load Balancing, Data Distribution and Web Site Security. 3. Site Modification: The structure and interface of web site as per interest and contents are the key factors to attract learners to learn. Web Usage Mining can provide site improvement as per the mining knowledge and modify the structure as per the learner’s navigation path and feedback. In adaptive environment of web site, structure and interface of a web site changes automatically on the basis of usage patterns discovered from server logs. 4. Business Intelligence: Business Intelligence service is related to customer’s information captured on web based system. In e-learning customers is nothing but learners whose learning behavior can be identified by web mining technique which will be ultimately used to increase the learner’s satisfaction and to improve the business.
3 Related Work and Discussion Up till now many papers have been suggested techniques related to Web Usage Mining and Log Analysis in E-learning environment. Xue Sun and Wei Zhao introduced how to use WUM in e-learning system which can be more intelligent and individual learning system and promote the interests of learners [2]. Shahnaz Nina et al. [3] propose technique of Pattern Discovery for web log records to find out the hidden information or predictive pattern by the data mining and knowledge discovery. Navin Kumar et al. surveyed about data preprocessing activities like Data Cleaning, Data Reduction. Some research is also done on personalized elearning using different agents related to domains. Our work is differs from above mentioned research in various aspects. As the main focus of research is to design an e-learning system with personalized adaptive interface, this paper is primarily focusing on the first step of personalization of users which is based on Web Usage Mining. The proposed framework of elearning system is shown in fig. 2 [10] where we are implementing the Learning Style Acquisition phase using the advanced log analysis method of Web Usage Mining framework.
Fig. 2 Architecture diagram of E-learning System
Acquisition of User’s Learning Styles Using Log Mining Analysis
813
The approach of architecture is as follows: 1. Learning Style Acquisition: In this phase Web Usage Mining technique is used to analyze the log data for identification of learning styles of different users/students. 2. User Classification Manager: The learning repository is the input for User classification manager where Back Propagation Neural Network algorithm of Classification is performed to identify different kind of users based Learning style of Felder and Silverman. 3. Interface Component Manager: After identifying the categories of users the Interface component manager is changing the graphical representation of user interface as per user’s need. 4. Adaptive Content Manager: This phase generates the adaptive contents based on user classification with the help of administrative activities and Elearning content repository.
4 Proposed Approach of Learning Style Acquisition In the field of web based e-learning, we are mainly emphasizing on the above mentioned to application areas: (i) Personalization and (ii) Site Modification (Adaptive User Interface). When users visit the site, they are interested in some course material, so they visit different pages. The e-learning sever log the information based on their visits. Through the log analysis and mining we can get the user’s interest and behavior towards the pages visited. When users log on to the portal, the system will classify the users to different classes based on the previous behavior and generates the personalized page interface by adjusting the contents continuously and timely. The idea of the architecture implementation: 1. Activity Recorder: Authentication of the user on e-learning portal and capturing of client side information through Activity Recorder. 2. Log Information: Capturing of Server side logs and proxy side log to pass through the data pre-processing with the additional information of user. 3. Data Pre-processing: Perform data cleaning, data integration, and data reduction steps to generate useful data for mining. 4. Clustering: Apply Usage clustering method for patterns discovery. The advanced k-means clustering algorithm is used to find out appropriate clusters based on user’s usage. 5. Profile Generation: Generate user’s profiles and content profiles according to clusters. The user’s profiles are used to generate the learning styles and content profiles used to find out the domain interest of the user. 4.1 Steps of Web Usage Mining: (i) Data Collection: The first step in the Web usage mining process consists of gathering the relevant Web data, which will be analyzed to provide useful information about the user’s behavior.
814
S.V. Kolekar, S.G. Sanjeevi, and D.S. Bormane
Types of log files: 1. Server Side: The Extended Log Format (W3C) [7][1], which is supported by Web servers such as Apache and Netscape, and the similar W3SVC format, supported by Microsoft Internet Information Server, include additional information such as the address of the referring URL to this page, i.e., the Web page that brought the visitor to the site, the name and version of the browser used by the visitor and the operating system of the host machine. The Server side logs should contain the information of Web server and cached pages. 2. Client Side: [5] Client side data are the local activities collected from the host that is accessing the Web site using JAVA Wrapper technique. The local activities include actions of user on desktop like save/print the page, back/forward/stop the browser, email link/page, add a bookmark etc. This information is additional and reliable to understand the accurate behavior of the learner.
Fig. 3 Learning Style Acquisition Approach
3. Intermediary Side: Proxy Server Logs: The advantage of using these logs is that they allow the collection of information about users operating behind the proxy server, since they record requests from multiple hosts to multiple Web servers. (ii) Data Preprocessing: The captured log files are not suitable directly for data mining techniques. Files must be gone through the three data pre-processing steps 1. Data Cleansing: Useless information removal e.g. graphical page content [6]. An algorithm for clearing the entries of log information: (i) Removal of picture files associated with request for particular pages: (ii) Remove status of error or failure on different pages. (iii) Automatically generated access records should be identified and removed. (iv) Entries with unsuccessful HTTP status code should be removed. Codes in between 200 to 299 are successful entries.
2. Data Integration: Integration of the cleaned data is the process of identifying and reconstructing the user's sessions from the log files. This phase is divided into two basic steps. User Identification: different users are identified in three ways: (i) converting the IP address to a domain name exposes some knowledge; (ii) cookies help to identify individual visitors easily and give information about the usage of the website; (iii) records of cached pages are used to find the profiles. Session Identification: session identification separates the different sessions of the same user by checking a threshold value; the threshold of each session is usually taken as a 30-minute time interval [3]. 3. Data Reduction: the dimension of the data is reduced to decrease the complexity. Access log files on the server side and proxy side contain the log information of user sessions, i.e., the list of pages that a user has accessed in one session. The files are in Extended Log File Format, whose records are sufficient to obtain session information. The set of URLs of particular pages forms a session, which must satisfy the requirement that the time elapsed between two consecutive requests is smaller than a given t, accepted here as a 30-minute threshold value. After preprocessing of the log files, the following fields are used in this research: 1. Users: in an e-learning system, a user is a learner who visits the e-learning portal with a particular learning style. 2. Page view: a single click on a page, which can be taken to represent one learning behavior. 3. Number of click streams per session: click streams are the user's page requests, which can be considered a learning sequence. 4. User session: the whole sequence of clicks from the user's visits across the website, i.e., the aggregated behavior of the user. Evaluation of parameters for the method: 1. Topics (T) are related to the contents of the web site and are defined by the owner of the portal. 2. Weight (W) defines the importance of a topic based on the actions. 3. Actions (A) are the clicks of the student on a particular content type, such as text links, video lectures, downloadable links, etc. Each action is assigned a weight according to its importance, e.g. A1 = PageRequest with weight WA1 = 1, A2 = VideoLecture with weight WA2 = 4, or A3 = DownloadPDF with weight WA3 = 8; action A1 is the default action accompanying any other action. 4. Duration (D) is the time a student spends on a page, which indicates the area of interest based on the actions on the pages. Calculating the exact duration is problematic because one cannot tell whether the student is really reading the page or is idle; to address this, the time duration is counted only up to the timeout of the login session.
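The session splitting and per-page duration computation could look like the following sketch; the field names and the treatment of the last page of a session (assigned the timeout value) are assumptions made for the example.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)   # 30-minute threshold from the text

def split_sessions(requests):
    """Group one user's requests (sorted by time) into sessions and
    attach to every request the time spent on its page."""
    sessions, current = [], []
    for req in sorted(requests, key=lambda r: r["time"]):
        if current and req["time"] - current[-1]["time"] > SESSION_GAP:
            sessions.append(current)
            current = []
        current.append(req)
    if current:
        sessions.append(current)
    # duration of a page = time until the next click; the last page gets the timeout value
    for session in sessions:
        for a, b in zip(session, session[1:]):
            a["duration"] = (b["time"] - a["time"]).total_seconds()
        session[-1]["duration"] = SESSION_GAP.total_seconds()
    return sessions

reqs = [{"time": datetime(2011, 5, 1, 10, 0), "url": "/db/intro"},
        {"time": datetime(2011, 5, 1, 10, 7), "url": "/db/video1"},
        {"time": datetime(2011, 5, 1, 14, 0), "url": "/db/quiz"}]
for s in split_sessions(reqs):
    print([(r["url"], r["duration"]) for r in s])
```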
The interest in a topic is computed as a double sum of the action weights multiplied by the corresponding durations over the pages of that topic:

Interest(T) = Σ_p Σ_a W_a · D_(p,a)    (1)

where p runs over the pages related to topic T and a over the actions performed on page p.
Different pages visited as per learning styles: The Web Usage Mining architecture we propose aims to find a mapping from the students' actions in the browser to the learning styles they fit. Based on the formula in equation (1) we can find the duration spent on the pages of the e-learning portal. The observed actions are as follows: 1. access of contents and reading material; 2. access of examples; 3. exercises or quizzes; 4. chat/forum/e-mail usage: the student may use the chat, forum, or e-mail service for social communication related to the contents. (iii) Clustering: The user profiles and the content profiles need to be clustered based on the log information. Clustering is an unsupervised classification method which groups objects into the same cluster based on feature similarity [2]. Clustering can be done in two ways, partition-based and hierarchy-based. Partitional clustering separates the records of n objects into k clusters that group the most similar objects together; the separation depends on a distance measure. In this approach we use the popular k-means clustering method with some advanced features to find the pages most frequently accessed by the users on specific domain contents.
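A sketch of the usage-clustering step with scikit-learn's k-means is given below; the feature construction (weighted duration per content type) and k = 7 follow the description above, but the exact features used by the authors are not specified, and the random usage matrix is only a placeholder for real log-derived data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# rows = learners, columns = aggregated weighted durations on content types
# (text pages, video lectures, downloads, examples, quizzes, forum, e-mail)
rng = np.random.default_rng(0)
usage = rng.gamma(shape=2.0, scale=50.0, size=(600, 7))   # placeholder data

X = StandardScaler().fit_transform(usage)
kmeans = KMeans(n_clusters=7, n_init=10, random_state=0).fit(X)

profiles = kmeans.labels_            # cluster id per learner -> user profiles
centroids = kmeans.cluster_centers_  # dominant content mix per cluster -> content profiles
print(np.bincount(profiles))
```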
Fig. 4 Web Log Record
Fig. 5 Web page contents and frequency
Fig. 6 Histogram of calculated durations of different pages and contents.
5 Experimental Details and Results Although users register their favorite domains at the time of registration, a user's identity or work influences the style of content reference on each visit. For example, a user may submit "database management systems" as the domain of interest, but to understand database concepts the user will still prefer contents that match his or her individual learning style. These styles can be obtained by analyzing the log records. Defining interests is certainly not easy, and a great deal of experimentation was needed to obtain satisfactory criteria for them. We employ extended data, i.e., client-level and server-level Web log data, since these data contain the behaviors of a large number of users over the investigation period. Based on the web usage mining steps described above, we built a system for extracting users' interests. The experiment used web log data collected from the www.e-learning.rscoe.in web server (see Figure 4). These records give the web pages accessed by each user; the accessed pages contain different types of content for a particular search topic, and users access different links according to their interests. We have recorded 600 users' log
records and the frequency of the contents (see Figure 5). After clustering we define seven types of clusters according to the web page contents, used for user profiling and content profiling. The graph in Figure 6 shows the number of users accessing, and the time spent on, the different pages of the portal, which are useful inputs for the neural network algorithm that classifies the different learning styles.
6 Conclusion In this paper we proposed a Web Usage Mining approach by surveying data pre-processing activities and the different kinds of log records. Web Usage Mining for an e-learning environment mines the log records to find the users' usage patterns and to provide users with personalized and adaptive sessions. The next phase of this research is to use the resulting user profiles as input parameters to a neural-network-based algorithm that classifies the users according to the Felder & Silverman learning style model. Based on the classified users, the interface components of the website can then be changed adaptively using adaptive contents and administrative activities.
References 1. Extended Log File Format, http://www.microsoft.com/technet/prodtechnol/ WindowsServer2003/Library/IIS/ 676400bc-8969-4aa7-851a-9319490a9bbb.mspx?mfr=true 2. Sun, X., Zhao, W.: Design and Implementation of an E-learning Model based on WUM Techniques. In: IEEE International Conference on E-learning, E-business, Enterprises Information Systems and E-government (2009) 3. Nina, S., Rahaman, M.M., Islam, M.K., Ahmed, K.E.U.: Pattern Discovery of Web Usage Mining. In: International Conference on Computer Technology and Development. IEEE Computer Society, Los Alamitos (2009) 4. Oskouei, R.J.: Identifying Student’s Behaviors Related to Internet Usage Patterns. In: T4E 2010. IEEE, Los Alamitos (2010) 978-1-4244-7361-8/ 2010 5. Li, X., Zhang, S.: Application of Web Usage Mining in e-learning Platform. In: International conference on E-business and E-government. IEEE Computer Society, Los Alamitos (2010) 6. Tyagi, N.K., Solanki, A.K., Tyagi, S.: An algorithmic approach to data preprocessing in Web Usage Mining. International Journal of Information Technology and Knowledge Management 2(2), 269–283 (2010) 7. Khiribi, M.K., Jemni, M., Nasraoui, O.: Automatic Recommendations for E-learning Personalization based on Web Usage Mining Techniques and Information Retrieval. In: Eight IEEE International Conference on Advanced Learning Technologies. IEEE Computer Society, Los Alamitos (2009) 8. Chanchary, F., Haque, I., Khalid, M.S.: Web Usage Mining to evaluate the transfer of learning in a Web-based Learning Environment. In: Workshop on Knowledge Discovery and Data Mining. IEEE Computer Society, Los Alamitos (2008)
9. Guo, L., Xiang, X., Shi, Y.: Use Web Usage Mining to Assist Online E-learning Assessment. In: Proceeding of the IEEE International Conference on Advanced Learning Technologies (2004) 10. Kolekar, S., Sanjeevi, S.G., Bormane, D.S.: The Framework of an Adaptive User Interface for E-learning Environment using Artificial Neural Network. In: 2010 WORLDCOMP/ International Conference on e-Learning, e-Business, Enterprise Information Systems, and e-Government (EEE 2010), USA, July 12-15 (2010) 11. Markellou, P., Mousourouli, L., Spiros, S., Tsakalidis, A.: Using Semantic Web Mining Technologies for Personalized E-learning Experiences. International Journal in Elearning (2009) 12. Das, R., Turkoglu, I.: Creating mining data from web logs for improving the impressiveness of a website by using path analysis method. Journal of Elsevier, Expert Systems with Applications 36, 6635–6644 (2009), Science Direct
An Agent Based Middleware for Privacy Aware Recommender Systems in IPTV Networks Ahmed M. Elmisery and Dmitri Botvich
Abstract. IPTV providers are keen to use recommender systems as a serious business tool to gain a competitive advantage over other providers and attract more customers. As indicated in (Elmisery, Botvich 2011b), IPTV recommender systems can use data mashup to merge datasets from different movie recommendation sites such as Netflix or IMDb to improve recommendation performance and prediction accuracy. Data mashup is a web technology that combines information from multiple sources into a single web application. Mashup applications have created a new horizon for services such as real estate, financial services, and recommender systems. On the other hand, mashup applications bring additional requirements related to the privacy of the data used in the mashup process; privacy and accuracy are two contradicting goals that need to be balanced for these services to spread. In this work, we present our efforts to build an agent-based middleware for private data mashup (AMPM) that serves a centralized IPTV recommender system (CIRS). AMPM is equipped with two obfuscation mechanisms to preserve the privacy of the dataset collected from each provider involved in the mashup application. We present a model to measure privacy breaches. We also provide a data mashup scenario for an IPTV recommender system and experimental results. Keywords: privacy, clustering, IPTV networks, recommender system, multi-agent.
1 Introduction
Data mashup (Trojer et al. 2009) is a web technology that combines information from more than one source into a single web application for a specific task or request. Data mashup can be used to merge datasets from external movie recommendation sites to leverage the IPTV recommender system from different perspectives, such as providing more precise predictions and recommendations, improving reliability toward customers, alleviating the cold start problem Ahmed M. Elmisery · Dmitri Botvich, Telecommunications Software & Systems Group, Waterford Institute of Technology, Waterford, Ireland
(Gemmis et al. 2009) for new customers, maximizing the precision of target marketing, and finally improving the overall performance of the current IPTV network by building an overlay that increases content availability, prioritization, and distribution based on customers' preferences. Because of this, providers of the next generation of IPTV services are keen to obtain accurate recommender systems for their IPTV networks. However, privacy is an essential concern for mashup applications in IPTV recommender systems, as the generated recommendations obviously require the integration of different customers' preferences from different providers. This might reveal private customer preferences that were not available before the data mashup. Most movie recommendation sites refrain from joining a mashup process to prevent disclosure of the raw preferences of their customers to other sites or to the IPTV recommender system itself. Moreover, divulging their customers' preferences may infringe personal privacy laws in some of the countries where these providers operate. In this work, we present our ongoing work to build an agent-based middleware for private data mashup (AMPM) that takes privacy into account when mashing up different datasets from movie recommendation sites. We focus on the stages related to dataset collection and processing and omit all aspects related to recommendation, mainly because these stages are critical with regard to privacy as they involve different entities. We present a model that measures privacy breaches as an inference problem between the real dataset and the obfuscated dataset, and we derive a lower bound for the amount of fake items needed to achieve optimal privacy. The experiments show that our approach reduces privacy breaches. In the rest of this work, we generically refer to news programs, movies, and video-on-demand contents as items. In Section 2, we describe related work. In Section 3, we introduce the scenario underlying our AMPM middleware. In Section 4, we give an overview of the obfuscation algorithms used in AMPM. In Section 5, we present a model to measure privacy breaches in the obfuscated dataset. In Section 6, we present experiments and results based on our obfuscation algorithms. Finally, Section 7 contains conclusions and future work.
2 Related Works
The majority of the literature addresses the problem of privacy in recommendation systems, because they are a potential source of leakage of personally identifiable information; however, only a few works have studied privacy for mashup services. The work in (Trojer et al. 2009) discussed private data mashup, where the authors formalize the problem as achieving k-anonymity on the integrated data without revealing detailed information about the process or disclosing data from one party to another. Their infrastructure is ported to web-based mashup applications. In (Esma 2008) a theoretical framework is proposed to preserve the privacy of customers and the commercial interests of merchants; the system is a hybrid recommender that uses secure two-party protocols and a public key infrastructure to achieve the desired goals. In (Polat, Du 2003, 2005) another method is suggested for privacy preservation in centralized recommender systems by adding uncertainty
to the data by using a randomized perturbation technique, while attempting to make sure that the necessary statistical aggregates, such as the mean, are not disturbed much. Hence, the server has no knowledge of the true value of any individual rating of a user. They demonstrate that their method does not essentially decrease the accuracy of the results. However, the work in (Huang et al. 2005; Kargupta et al. 2003) pointed out that randomized perturbation techniques do not provide the levels of privacy previously thought. In (Kargupta et al. 2003) it is pointed out that arbitrary randomization is not safe because it is easy to breach the privacy protection it offers; the authors proposed random-matrix-based spectral filtering techniques to recover the original data from the perturbed data. Their experiments revealed that in many cases random perturbation techniques preserve very little privacy.
3 Data Mashup in IPTV Recommender System Scenario
This work uses the scenario proposed in (Elmisery, Botvich 2011b), which extends the scenarios previously proposed in (Elmisery, Botvich 2011d, c, a). The scenario in (Elmisery, Botvich 2011b) proposed a data mashup service (DMS) that integrates datasets from different movie recommendation sites for the recommender system running at the IPTV provider; Figure 1 illustrates this scenario. We assume that all involved parties follow the semi-honest model, which is a realistic assumption because they need to accomplish business goals and increase their revenues. We also assume that all parties involved in the data mashup have a similar item set (catalogue) but that the customer sets are not identical. The data mashup process based on AMPM can be summarized as follows. The CIRS sends a query to the DMS to start gathering customers' preferences for some genres in order to leverage its recommendation service. At the DMS side, the coordinator determines, based on the providers' cache, which movie recommendation sites could satisfy the query. The coordinator transforms the CIRS query into appropriate sub-queries in the language suitable for each provider's database. The manager unit sends each sub-query to the candidate providers to notify them about the data mashup process. The manager agent at the provider side rewrites the sub-query considering the privacy preferences of its host and produces a modified sub-query over the data that may be accessed by the DMS. This step allows the manager agent to audit all issued sub-queries and to block those that could extract sensitive information. The resulting dataset is obfuscated by the local obfuscation agent (LOA) using the clustering based obfuscation (CBO) algorithm. The synchronize agent sends the results to the coordinator, which in turn integrates them and then performs global perturbation on them using the random ratings generation (RRG) algorithm. Finally, the output of the global perturbation process is delivered to the CIRS to achieve its business goals. The movie recommendation sites earn revenue from the usage of their databases and at the same time are assured that the mashup process does not violate the privacy of their customers. The DMS uses anonymous pseudonym identities to alleviate the providers' identity problems, since the providers do not want to reveal their ownership of the data to competing providers. Moreover, the data mashup service is keen to hide the identities of the participants as a business asset.
Fig. 1 Data Mashup in IPTV Recommender System (the diagram shows the DMS side, with the coordinator, manager unit, delivery and synchronize agents, providers' cache, index, coordination plans, providers' pseudonyms and obfuscated ratings, and the AMPM side at each provider, with the local obfuscation agent, manager agent, learning agent, local catalogue, and the service users' ratings and metadata)
4 Proposed Obfuscation Algorithms
In this section, we give an overview of the algorithms we proposed in (Elmisery, Botvich 2011b), which are used to preserve the privacy of the datasets with minimum loss of accuracy. The core idea of our obfuscation algorithms is to mitigate the attack model proposed in (Narayanan, Shmatikov 2008): the authors state that if a set of user preferences is fully distinguishable from the other users' preferences in the dataset with respect to some items, this user can be identified when an attacker correlates the published preferences with data from other publicly accessible databases. We believe the current anonymity models might fail to provide overall anonymity because they do not consider matching items based on their feature vectors. A key to success for any obfuscation algorithm is to create homogeneous groups inside the published datasets based on feature vectors, in order to make each user's preferences indistinguishable from those of other users. A typical feature vector for a real item includes genres, directors, actors, and so on. We proposed our obfuscation algorithms as a two-stage process that takes advantage of the group formation done by the DMS to fulfil each CIRS query. We use this group to attain privacy for all participants, such that each provider obfuscates its dataset locally and then releases it to the DMS, which performs a global perturbation. The first algorithm, CBO, runs at the provider side and aims to create clusters of fake items that have a feature vector similar to each real item preferred by the provider's customers. The algorithm consists of the following steps: 1. CBO splits the dataset D into two subsets D_h and D_r, where D_h is the subset of highly rated items in the dataset and D_r is the rest of the items. 2. For each real item I in D_h, CBO adds K-1 fake items that have a feature vector similar to that item; the process continues until a new candidate set D_f is obtained.
3. For each real item I in D_h, CBO keeps I with probability α or selects a fake item I_f from the candidate fake item set D_f with probability 1 - α. The selected item, denoted I', is added as a record to the obfuscated subset D':

P(I') = α P(I) + (1 - α) P(I_f)    (1)

4. Finally, D' is merged with D_r to obtain the final obfuscated set D_o.
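A simplified sketch of the CBO idea is given below; the feature-similarity search for candidate fake items is reduced to a plain random choice over the catalogue, and the set names follow the notation introduced above, so this is only an illustration of the mechanism, not the authors' implementation.

```python
import random

def cbo(ratings, catalogue, alpha, high=4):
    """ratings: list of (item_id, rating); catalogue: iterable of all item ids."""
    d_h = [r for r in ratings if r[1] >= high]      # highly rated subset D_h
    d_r = [r for r in ratings if r[1] < high]       # remaining items D_r
    d_prime = []
    for item, rating in d_h:
        if random.random() < alpha:                 # keep the real item with probability alpha
            d_prime.append((item, rating))
        else:                                       # otherwise substitute a fake item
            fake = random.choice([i for i in catalogue if i != item])
            d_prime.append((fake, rating))
    return d_prime + d_r                            # final obfuscated set D_o

random.seed(1)
print(cbo([(1, 5), (2, 2), (3, 4)], catalogue=range(10), alpha=0.3))
```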
The second algorithm, RRG, runs at the DMS side and aims to mitigate the data sparsity problem, which can be exploited to formulate attacks as shown in (Narayanan, Shmatikov 2008). The main aim of RRG is to pre-process the merged dataset by filling the unrated cells in such a way as to improve recommendation accuracy and increase the attained privacy. The algorithm consists of the following steps: 1. RRG determines the items rated by the majority of the providers' customers and the partially rated items in the merged dataset. 2. It then selects a percentage of the partially rated items in the merged dataset and uses KNN to predict the values of the unrated cells in that subset; the remaining unrated cells are filled with random values chosen from a distribution reflecting the ratings in the merged dataset.
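A rough sketch of the RRG pre-processing on a user-item rating matrix is shown below; scikit-learn's KNNImputer is used as a stand-in for the KNN prediction step, and the fraction of cells imputed by KNN is an assumed parameter.

```python
import numpy as np
from sklearn.impute import KNNImputer

def rrg(matrix, knn_fraction=0.5, rng=None):
    """matrix: users x items, np.nan for unrated cells."""
    rng = rng or np.random.default_rng(0)
    filled = matrix.copy()
    # 1) predict a fraction of the missing cells with KNN
    knn_values = KNNImputer(n_neighbors=3).fit_transform(matrix)
    missing = np.argwhere(np.isnan(matrix))
    rng.shuffle(missing)
    for r, c in missing[: int(knn_fraction * len(missing))]:
        filled[r, c] = knn_values[r, c]
    # 2) fill the rest with random ratings drawn from the observed rating distribution
    observed = matrix[~np.isnan(matrix)]
    still_missing = np.isnan(filled)
    filled[still_missing] = rng.choice(observed, size=still_missing.sum())
    return filled

m = np.array([[5, np.nan, 3], [4, 2, np.nan], [np.nan, 1, 4]], dtype=float)
print(rrg(m))
```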
5 Measuring Privacy Breaches
As indicated in (Evfimievski et al. 2003), the distribution of the released set D_o may allow the attacker to learn some real items' preferences. Different distributions of D_o represent different dataset releases, whose properties the attacker exploits to reveal real items' preferences. The attacker can only learn about the subset D_h through D_o. Thus, to prevent privacy breaches, we aim to minimize the amount of information about D_h that can be inferred from D_o. We use the mutual information I(D_h; D_o) as a measure of the privacy breach between D_h and D_o. Let I_1, I_2, ..., I_k be the real items' preferences; then we have

I(D_h; D_o) = Σ_i Σ_j P(I_i, I'_j) log [ P(I_i | I'_j) / P(I_i) ]    (2)
Given two candidate releases D_o and D_o', if I(D_h; D_o) < I(D_h; D_o') we can deduce that D_o is better than D_o' for privacy protection. Therefore, our aim is to find the fake items' preference set which minimizes I(D_h; D_o). Based on the previous discussion, we can conclude the following conditional probability:

P(I'_j | I_i) = α 1[j = i] + (1 - α) P(I_f = I_j | I_i)    (3)
5.1 Privacy Guarantees
Enlightened by (Xiao et al. 2009), in this subsection we derive the lower bound on the amount of fake items that must be added to the real items' preferences to achieve the optimal privacy level. In the rest of this work, we will generically refer to real items' preferences as real items and to added fake items' preferences as fake items. The derived bound is a function of α and the total number of real items |D|. In the CBO algorithm, given α, the more real items in the dataset, the more fake items can be added. The amount of real items in the obfuscated dataset is decided by the provider, so we need an upper bound on α, given by the following theorem:

Theorem 1: I(D_h; D_o) = 0 if α ≤ 1/|D|.

To prove that there exists a fake-item distribution satisfying Theorem 1, let α = 1/|D| and

P(I_f = I_j | I_i) = 1/(|D| - 1) for j ≠ i, and 0 for j = i.    (4)

This can be explained as follows: given a real item I_i, the fake item is equally likely to be any other item except I_i. Based on equation (3), we have

P(I'_j | I_i) = α 1[j = i] + (1 - α)/(|D| - 1) 1[j ≠ i] = 1/|D| for every j.    (5)

Therefore P(I'_j | I_i) is uniformly distributed over the entire real item space, so the released item carries no information about the real one; substituting this into equation (2), we get I(D_h; D_o) = 0. This proves the correctness of the bound given by Theorem 1. To reach this privacy level we need a lower bound on the expected number of fake items |D_f|, which is given by E(|D_f|) ≥ (1 - 1/|D|)|D_h|. Based on the previous analysis we can deduce that it is expensive to achieve the optimal privacy level.
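The privacy measure can also be checked numerically. The sketch below computes I(D;D') for the mixture channel described above and shows that it vanishes when α = 1/|D| with the "any item except the real one" fake distribution; the uniform real-item distribution is an assumption made only for the demonstration.

```python
import numpy as np

def mutual_information(u, channel):
    """u: P(I_i); channel[i, j] = P(I'_j | I_i). Returns I(D; D') in nats."""
    joint = u[:, None] * channel                    # P(I_i, I'_j)
    p_out = joint.sum(axis=0)                       # P(I'_j)
    mask = joint > 0
    indep = (u[:, None] * p_out[None, :])[mask]
    return float((joint[mask] * np.log(joint[mask] / indep)).sum())

def cbo_channel(k, alpha):
    """P(I'_j | I_i) = alpha for j == i, (1 - alpha)/(k - 1) for j != i."""
    off = (1 - alpha) / (k - 1)
    return alpha * np.eye(k) + off * (1 - np.eye(k))

k = 20
u = np.full(k, 1 / k)                               # real-item distribution (assumed uniform here)
print(mutual_information(u, cbo_channel(k, alpha=1 / k)))   # ~0: optimal privacy
print(mutual_information(u, cbo_channel(k, alpha=0.5)))     # > 0: leakage grows with alpha
```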
5.2 Minimizing Number of Fake Items
The number of fake items has to be minimized to decrease the computational complexity and the dataset size. To achieve this, we simplify the obfuscation process by allowing the fake item and the real item to be independent. With this new assumption, equation (3) becomes

P(I'_j | I_i) = α 1[j = i] + (1 - α) P(I_f = I_j)    (6)
Then equation (2) becomes

I(D_h; D_o) = Σ_i Σ_j P(I_i) P(I'_j | I_i) log [ P(I'_j | I_i) / P(I'_j) ]    (7)

where P(I'_j | I_i) is given by (6) and P(I'_j) = α P(I_j) + (1 - α) P(I_f = I_j). To simplify this expression, assume P(I_i) = u_i and P(I_f = I_j) = n_j, and write the objective as a function f(n) of the fake-item distribution:

f(n) = Σ_i Σ_j u_i (α 1[j = i] + (1 - α) n_j) log [ (α 1[j = i] + (1 - α) n_j) / (α u_j + (1 - α) n_j) ]    (8)

Then our task is

arg min_n f(n)   subject to   Σ_i n_i = 1,  n_i ≥ 0,  where n = (n_1, n_2, ..., n_k)    (9)
We first show that f is a convex function of n. Since f is a continuous, twice differentiable function of n, it suffices to prove that its second derivative is positive. Differentiating f twice with respect to each n_i and simplifying (equations (10)-(12)) gives

∂²f/∂n_i² > 0    (13)

Therefore f is convex in n and problem (9) is a convex optimization problem.
Using Lagrange multipliers to solve equation (9), we use the constraint Σ_i n_i = 1 to define the Lagrangian Υ(n, Λ) = f(n) + Λ(Σ_i n_i - 1). Setting ∂Υ/∂n_i = 0 and solving for the critical values of Λ yields the stationarity conditions (14)-(15), which relate each n_i to u_i, α, and Λ. Choosing different values of Λ to solve equation (15), we can obtain the optimal fake set, which is independent of the real items and minimizes the privacy breach. It is computationally expensive to solve equation (15) for a large number of items, so we only select a portion of the dataset, as mentioned before.
6 Experimental Results
The proposed algorithms are implemented in C++; we used the message passing interface (MPI) for a distributed-memory implementation of the RRG algorithm, to mimic a distributed reliable network of peers. In order to evaluate the effect of our proposed algorithms on the mashed-up data used for recommendation, we considered two aspects: privacy breach and accuracy of results. The experiments presented here were conducted using the movielens dataset provided by grouplens (Lam, Herlocker 2006), which contains users' ratings on movies on a discrete scale between 1 and 5. We used the mean average error (MAE) metric proposed in (Herlocker et al. 2004) to measure the accuracy of the results calculated using the obfuscated data. As a measure of the notion of privacy breach we used the mutual information I(D_h; D_o), so a larger value of I(D_h; D_o) indicates a higher privacy breach.
Fig. 2 Privacy breach for optimal and uniform fake sets
In the first experiment, we want to measure the relation between the quantity of real items in the obfuscated dataset D_o and the privacy breach. We select α in the range from 1.0 to 5.5, and we increase the number of real items from 100 to 1000. We select a fake set using a uniform distribution as a baseline. As shown in Figure 2, our generated fake set reduces the privacy breach and performs much better than the uniform fake set. As the number of real items increases, the uniform fake set gets worse as more information is leaked, while our optimal fake set is not affected in this way. Our results are promising, especially when dealing with a large number of real items. In the second experiment, we want to measure the relation between the quantity of fake items in the subset D' (which depends on the value of α) and the accuracy of recommendations. We select a set of real items from movielens and then split it into two subsets D_h and D_r. We obfuscate D_h as described before with a fixed value of α to obtain D'. We append D' with items from either the optimal fake set or the uniform fake set. Thereafter, we gradually increase the percentage of real items in D' selected from the movielens dataset from 0.1 to 0.9. For each possible obfuscation rate value, we measured the MAE for the whole obfuscated dataset D_o. Figure 3 shows the MAE values as a function of the obfuscation rate. The provider selects the obfuscation rate based on the accuracy level required from the recommendation process. We can deduce that with a higher value of the obfuscation rate the CIRS can attain more accurate recommendations. Adding items from the optimal fake set has only a minor impact on the MAE of the generated recommendations, without having to select a higher value for the obfuscation rate.
Fig. 3 MAE of the generated predictions vs. obfuscation rate
However, as we can see from the graph, the MAE decreases slightly, in a roughly linear manner, for high values of the obfuscation rate. In particular, the change in MAE is minor in the range 40% to 60%, which confirms our assumption that accurate recommendations can be provided with lower values of the obfuscation rate.
The optimal fake items are very similar to the real items in the dataset, so the obfuscation does not significantly change the aggregates in the real dataset and has only a small impact on the MAE.
Fig. 4 MAE of the generated predictions for rating groups
In the third experiment, we seek to measure the impact of adding fake items on the prediction accuracy for the various types of ratings. We partitioned the movielens dataset into 5 rating groups, and for each rating group a set of 1300 ratings was separated. CBO was applied using the optimal and uniform fake sets, and then the ratings were pre-processed using RRG. The resulting datasets were submitted to the CIRS to perform predictions for the different rating groups. We repeated the prediction experiment with different values of α and then computed the MAE for these predictions. Figure 4 shows the MAE values of the generated predictions for each rating group. We can clearly see that the impact of adding fake items on the predictions differs for the various types of ratings. For the optimal fake set, the impact is minor, as the MAE remains roughly unchanged regardless of the value of α.
7 Conclusions and Future Work
In this work, we presented our ongoing work on building an agent-based middleware for private data mashup that serves a centralized IPTV recommender system. We gave a brief overview of the mashup process and of our novel algorithms, which give the provider complete control over the privacy of its datasets using a two-stage process. We presented a model of the privacy breach as an inference problem between the real dataset and the obfuscated dataset, and derived a lower bound on the amount of fake items needed to achieve optimal privacy. The experiments show that our approach reduces privacy breaches. We still need to investigate weighted feature vector methods and their impact on forming homogeneous groups, such that the provider not only expresses what kinds of item features can be used to create the fake items dataset, but also expresses the degree to which those features should influence the selection of items for the fake items
dataset. We realize that there are many challenges in building a data mashup service; as a result, we focused on the IPTV recommendation services scenario. This allows us to move forward in building an integrated system while studying issues such as dynamic data release at a later stage, and deferring issues such as schema integration, access control, query execution, and auditing to a future research agenda. Given the complexities of the problem, we believe it is best to focus on simpler scenarios and a subset of the issues at the beginning, and to go ahead with solving the remaining issues in our future work. Acknowledgments. This work has received support from the Higher Education Authority in Ireland under the PRTLI Cycle 4 programme, in the FutureComm project (Serving Society: Management of Future Communications Networks and Services).
References [1] Elmisery, A., Botvich, D.: Agent Based Middleware for Maintaining User Privacy in IPTV Recommender Services. In: 3rd International ICST Conference on Security and Privacy in Mobile Information and Communication Systems, ICST, Aalborg, Denmark (2011a) [2] Elmisery, A., Botvich, D.: Agent Based Middleware for Private Data Mashup in IPTV Recommender Services. In: 16th IEEE International Workshop on Computer Aided Modeling, Analysis and Design of Communication Links and Networks, Kyoto, Japan. IEEE, Los Alamitos (2011b) [3] Elmisery, A., Botvich, D.: Privacy Aware Recommender Service for IPTV Networks. In: 5th FTRA/IEEE International Conference on Multimedia and Ubiquitous Engineering, Crete, Greece. IEEE, Los Alamitos (2011c) [4] Elmisery, A., Botvich, D.: Private Recommendation Service For IPTV System. In: 12th IFIP/IEEE International Symposium on Integrated Network Management, Dublin, Ireland. IEEE, Los Alamitos (2011d) [5] Esma, A.: Experimental Demonstration of a Hybrid Privacy-Preserving Recommender System. In: Gilles, B., Jose, M.F., Flavien Serge Mani, O., Zbigniew, R. (eds.), pp. 161–170 (2008) [6] Evfimievski, A., Gehrke, J., Srikant, R.: Limiting privacy breaches in privacy preserving data mining. Paper Presented at the Proceedings of the Twenty-Second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, San Diego, California [7] Gemmis, M.d., Iaquinta, L., Lops, P., Musto, C., Narducci, F., Semeraro, G.: Preference Learning in Recommender Systems. Paper Presented at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), Slovenia [8] Herlocker, J.L., Konstan, J.A., Terveen, L.G., Riedl, J.T.: Evaluating collaborative filtering recommender systems. ACM Trans. Inf. Syst. 22(1), 5–53 (2004), doi: http://doi.acm.org/10.1145/963770.963772 [9] Huang, Z., Du, W., Chen, B.: Deriving private information from randomized data. Paper Presented at the Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, Baltimore, Maryland (2005)
[10] Kargupta, H., Datta, S., Wang, Q., Sivakumar, K.: On the Privacy Preserving Properties of Random Data Perturbation Techniques. Paper Presented at the Proceedings of the Third IEEE International Conference on Data Mining [11] Lam, S., Herlocker, J.: MovieLens Data Sets. Department of Computer Science and Engineering at the University of Minnesota (2006), http://www.grouplens.org/node/73 [12] Narayanan, A., Shmatikov, V.: Robust De-anonymization of Large Sparse Datasets. Paper Presented at the Proceedings of the 2008 IEEE Symposium on Security and Privacy (2008) [13] Polat, H., Du, W.: Privacy-Preserving Collaborative Filtering Using Randomized Perturbation Techniques. Paper presented at the Proceedings of the Third IEEE International Conference on Data Mining [14] Polat, H., Du, W.: SVD-based collaborative filtering with privacy. Paper Presented at the Proceedings of the 2005 ACM symposium on Applied computing, Santa Fe, New Mexico (2005) [15] Trojer, T., Fung, B.C.M., Hung, P.C.K.: Service-Oriented Architecture for PrivacyPreserving Data Mashup. Paper Presented at the Proceedings of the 2009 IEEE International Conference on Web Services (2009) [16] Xiao, X., Tao, Y., Chen, M.: Optimal random perturbation at multiple privacy levels. Proc. VLDB Endow. 2(1), 814–825 (2009)
An Intelligent Decision Support Model for Product Design Yang-Cheng Lin and Chun-Chun Wei
Abstract. This paper presents a consumer-oriented design approach to determining the optimal form design of character toys that optimally matches a given set of product images perceived by consumers. 179 representative character toys and seven design form elements of character toys are identified as samples in an experimental study to illustrate how the consumer-oriented design approach works. The consumer-oriented design approach is based on the process of Kansei Engineering using neural networks (NNs). Nine NN models are built with different momentum, learning rate, and hidden neurons in order to examine how a particular combination of form elements matches the desirable product images. The NN models can be used to construct a form design database for supporting form design decisions in a new character toy design. The result provides useful insights that help product designers best meet consumers' specific feelings and expectations.
1 Introduction In an intensely competitive market, how to design highly reputable and hot-selling products is an essential issue [1]. Whether consumers choose a product depends largely on their emotional feelings about the product image, which is regarded as something of a black box [7]. Consequently, product designers need to comprehend the consumers' feelings in order to design successful products [6, 12]. Unfortunately, the way that consumers look at product appearances or images is usually different from the way that product designers look at product elements or characteristics [4]. Moreover, it is shown that "aesthetics" plays an important role in new product development, marketing strategies, and the retail environment [1, 2]. The Apple product line (e.g. the iPod or iPhone) is a good example illustrating that visual appearance has become a major factor in consumers' purchase decisions, called the "aesthetic Yang-Cheng Lin Department of Arts and Design, National Dong Hwa University, Hualien, 970, Taiwan e-mail:
[email protected] *
Chun-Chun Wei Department of Industrial Design, National Cheng Kung University, Tainai, 701, Taiwan e-mail:
[email protected]
revolution" [13]. Yamamoto and Lambert [14] also find that aesthetically pleasing properties have a positive influence on consumers' preferences for a product and on their decision processes when they purchase it [3]. In product design, the "visual appearance" (or visual aesthetics) is usually concerned with the "product form" [5]. In order to help product designers work out the optimal combination of product forms for matching consumers' psychological feelings, a consumer-oriented approach, called Kansei Engineering [9, 10], is used to build a design decision support model. Kansei Engineering is an ergonomic methodology and a set of design strategies for affective design that satisfies consumers' psychological feelings [9]. The word "Kansei" indicates the consumers' psychological requirements or emotional feelings about a product. Kansei Engineering has been applied successfully in the product design field to explore the relationship between the consumers' feelings and product forms [4, 5, 6, 7, 8]. To illustrate how the consumer-oriented approach works, we conduct an experimental study on character toys (dolls or mascots), given their great popularity in East Asia (particularly in Taiwan, Japan, and Hong Kong). In subsequent sections, we first present an experimental study with character toys to describe how Kansei Engineering can be used to extract representative samples and form elements as the numerical data sets required for analysis. Then we construct and evaluate nine NN models based on the experimental data. Finally we discuss how the NN models can be used as a design decision support model to help product designers meet consumers' emotional feelings in new product design.
2 Experimental Procedures of a Consumer-Oriented Approach The experimental procedures involve three main steps: (a) extracting representative experimental samples, (b) conducting morphological analysis of design form elements, and (c) assessing product images.
2.1 Extracting Representative Experimental Samples In the experimental study, we investigate and categorize various character toys associated with local and aboriginal cultures in Taiwan. We first collect 179 character toys; a focus group formed by six subjects, each with at least two years' experience of product design, then classifies them based on their degree of similarity and eliminates some highly similar samples through discussion. Hierarchical cluster analysis is then used to extract representative samples of character toys. The 35 representative character toy samples are selected from the cluster tree diagram, including 28 samples as the training set and 7 samples as the test set for building the NN models.
2.2 Conducting Morphological Analysis of Design Form Elements The product form is defined as the collection of design features that the consumers will appreciate [5]. The morphological analysis, concerning the arrangement of
objects and how they conform to create a whole (Gestalt), is used to explore all possible solutions in a complex problem regarding a product form [7]. The morphological analysis is used to extract the product form elements of the 35 representative character toy samples. The five subjects of the focus group are asked to decompose the representative samples into several dominant form elements and form types according to their knowledge and experience. Table 1 shows the result of the morphological analysis, with seven product design elements and 24 associated product form types being identified. The form type indicates the relationship between the outline elements. For example, the "width ratio of head and body (X2)" form element has three form types: "head>body (X21)", "head=body (X22)", and "head<body (X23)". A number of design alternatives can be generated by various combinations of morphological elements.
2.3 Assessing Product Images In Kansei Engineering, emotion assessment experiments are usually performed to elicit the consumers' psychological feelings about a product using the semantic differential method [9]. Image words are often used to describe the consumers' feelings about the product in terms of ergonomic and psychological estimation [6]. Once the form elements of the product are identified, the relationship between the consumers' feelings and the product forms can be established. The procedure for extracting image words includes the following four steps: Step 1: Collect a large set of image words from magazines, product catalogs, designers, artists, and toy collectors. In this study, we collect 110 image words that describe the character toys, e.g. vivid, attractive, traditional, etc. Step 2: Evaluate the collected image words using the semantic differential method. Step 3: Apply factor analysis and cluster analysis to the semantic differential results obtained at Step 2. Step 4: Determine three representative image words, namely "cute (CU)", "artistic (AR)", and "attractive (AT)", based on the analyses performed at Step 3. To obtain the assessed values for the emotional feelings about the 35 representative character toy samples, a 100-point scale (0-100) of the semantic differential method is used. 150 subjects (70 males and 80 females, aged 15 to 50) are asked to assess the form (look) of the character toy samples on an image word scale of 0 to 100, where, for example, 100 is most attractive on the AT scale. The CU, AR, and AT entries of Table 2 give the three assessed image values of the 35 samples, including 28 samples in the training set and 7 samples in the test set (asterisked), and the X1-X7 entries give the corresponding type number of each of the seven product form elements listed in Table 1. Table 2 provides the numerical data source for building the neural network models, which are used to develop a design decision support model for the design and development of new character toys.
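A sketch of Steps 2-4 with scikit-learn is given below; the random rating matrix, the number of factors, and the use of k-means for the word clustering are assumptions made for the illustration, since the paper does not give these details.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
ratings = rng.uniform(0, 100, size=(150, 110))      # subjects x image words (semantic differential)

# factor analysis on the word profiles (columns), then cluster the words
word_profiles = FactorAnalysis(n_components=3, random_state=0).fit_transform(ratings.T)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(word_profiles)

# pick, for each cluster, the word closest to its centroid as the representative image word
centroids = np.array([word_profiles[labels == c].mean(axis=0) for c in range(3)])
representatives = [np.where(labels == c)[0][np.argmin(
    np.linalg.norm(word_profiles[labels == c] - centroids[c], axis=1))] for c in range(3)]
print(representatives)    # indices of e.g. "cute", "artistic", "attractive" in a real word list
```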
Table 1 The morphological analysis of character toys (types listed in order, Type 1 to Type 5)
Length ratio of head and body (X1): ≧ 1:1; 1:1~1:2; <1:2
Width ratio of head and body (X2): head>body; head=body; head<body
Costume style (X3): one-piece; two-pieces; robe
Costume pattern (X4): simple; striped; geometric; mixed
Headdress (X5): tribal; ordinary; flowered; feathered; arc-shaped
Appearance of facial features (X6): eyes only; partial features; entire features
Overall appearance (X7): cute style; semi-personified style; personified style
Table 2 Product image assessments of 35 representative character toy samples (each row lists the values of one variable for samples 1-35; asterisks mark the 7 test samples)
No.: 1 2 3 4* 5 6 7* 8 9 10 11 12 13 14 15 16* 17 18 19 20 21* 22 23 24 25 26* 27 28 29 30* 31 32* 33 34 35
X1: 3 1 2 2 2 2 2 2 2 2 1 1 3 3 3 3 3 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 2 1
X2: 2 1 2 3 2 2 2 3 2 2 1 1 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 3 1
X3: 1 1 1 2 1 2 2 2 3 1 2 3 2 1 2 1 2 3 1 2 2 2 2 1 2 2 1 2 1 1 3 1 1 2 1
X4: 1 1 3 4 1 4 4 4 2 3 3 2 4 4 2 2 4 2 1 1 2 3 1 3 2 2 2 2 3 3 2 4 4 4 3
X5: 4 1 3 2 4 3 5 4 2 2 4 2 4 4 2 3 2 2 2 1 3 2 2 2 2 4 4 1 5 3 2 4 5 2 5
X6: 3 2 1 2 2 2 2 2 2 2 2 2 3 3 3 3 3 2 3 3 3 3 3 3 1 3 3 1 3 3 3 1 1 2 2
X7: 3 1 1 2 1 2 2 2 2 2 1 1 3 3 3 3 3 2 3 3 3 2 3 1 1 2 2 1 2 2 2 1 1 2 1
CU: 73 72 70 63 68 65 52 53 63 55 70 57 48 62 54 62 55 71 41 39 41 44 43 54 63 58 57 62 76 68 71 61 72 38 78
AR: 61 45 64 52 59 66 66 61 59 63 69 54 69 68 63 74 68 65 52 53 50 74 59 60 52 71 61 56 67 59 60 49 59 48 59
AT: 64 43 71 54 55 69 61 60 59 65 67 61 76 78 68 72 66 61 75 63 58 62 74 62 62 68 66 73 74 65 70 51 57 49 79
3 Neural Network Models With its effective learning ability, the neural network model has been widely used to examine complex and non-linear relationships between input variables and output variables [11].
3.1 Building NN Models In this study, we use the multilayered feed-forward NN trained with the back-propagation learning algorithm, as it is an effective and popular supervised learning algorithm [8].
(a) The Number of Neurons To examine how a particular combination of product form element matches the CU, AR, and AT images, we use three most widely used rules [11] for determining the number of neurons in the single hidden layer respectively, given below: (a) (The number of input neurons + the number of output neurons) / 2 (b) (The number of input neurons + the number of output neurons) (c) (The number of input neurons + the number of output neurons) * 2 The seven design elements in Table 1 are used as the seven input variables for the NN models. If the character toy has a particular design element type, the value of the corresponding input neuron is 1, 2, 3, 4 or 5. The assessed average values of the CU, AR, and AT feelings are used as the output neurons. Table 3 gives the neurons of the NN models, including the input layer, hidden layer, and output layer. Table 3 Neurons, learning rate, and momentum of NN models
Model   Input neurons  Hidden neurons  Output neurons  Learning rate  Momentum  Note
NN-a-S  7              5               3               0.9            0.6       Research issue is very simple
NN-a-C  7              5               3               0.1            0.1       Research issue is more complicated
NN-a-N  7              5               3               0.05           0.5       Research issue is complex and very noisy
NN-b-S  7              10              3               0.9            0.6       Research issue is very simple
NN-b-C  7              10              3               0.1            0.1       Research issue is more complicated
NN-b-N  7              10              3               0.05           0.5       Research issue is complex and very noisy
NN-c-S  7              20              3               0.9            0.6       Research issue is very simple
NN-c-C  7              20              3               0.1            0.1       Research issue is more complicated
NN-c-N  7              20              3               0.05           0.5       Research issue is complex and very noisy
(b) The Momentum and Learning Rate In this study, we conduct a set of analyses by using different learning rate and momentum factors for getting the better structure of the NN model. Three pairs of learning rate and momentum factors are used for different conditions based on the complication of the research problem [11]. For example, if the research issue is very simple, a large learning rate of 0.9 and momentum of 0.6 are recommended. On more complicated problems or predictive networks where output variables are continuous values rather than categories, use a smaller learning rate and momentum, such as 0.1 and 0.1 respectively. In addition, if the data are complex
and very noisy, a learning rate of 0.05 and a momentum of 0.5 are used. To distinguish between the NN-a, NN-b, and NN-c models using different input neurons and hidden neurons, all three models are associated with the learning rate and momentum mentioned above, such as -S, -C, -N, as shown in Table 3. As a result, there are totally nine NN models (3*3) built in this study.
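A sketch of the nine configurations with scikit-learn's MLPRegressor is shown below; the solver, iteration cap, and scaling choices are assumptions, while the hidden-neuron rules, learning rates, and momenta follow Table 3, and the original work trains plain back-propagation networks with sigmoid units rather than this particular library.

```python
from sklearn.neural_network import MLPRegressor

n_in, n_out = 7, 3
hidden_rules = {"a": (n_in + n_out) // 2, "b": n_in + n_out, "c": (n_in + n_out) * 2}  # 5, 10, 20
rate_momentum = {"S": (0.9, 0.6), "C": (0.1, 0.1), "N": (0.05, 0.5)}

models = {}
for rule, hidden in hidden_rules.items():
    for tag, (lr, mom) in rate_momentum.items():
        models[f"NN-{rule}-{tag}"] = MLPRegressor(
            hidden_layer_sizes=(hidden,),
            activation="logistic",        # sigmoid transfer function
            solver="sgd",
            learning_rate_init=lr,
            momentum=mom,
            max_iter=25000,
            random_state=0,
        )
print(sorted(models))     # nine models: NN-a-C ... NN-c-S
# each model is then trained as models[name].fit(X_train, y_train) on the normalized Table 2 data
```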
3.2 Training NN Models The learning rule used is the Delta rule and the transfer function is the sigmoid for all layers. All input and output variables (neurons) are normalized before training [8]. The experimental samples are separated into two groups: 28 training samples and 7 test samples. The training process of each model is not stopped until the cumulative training epochs exceed 25,000. Table 4 shows the corresponding root mean square (RMS) error of each model; the lowest RMS error of the nine models is asterisked. As shown in Table 4, the RMS errors of the NN models using the (c) rule are the lowest compared with the other two rules ((a) and (b) in Section 3.1). This result indicates that the more hidden neurons, the lower the RMS error. Furthermore, we find that the RMS errors of the NN models using the same rule are almost the same regardless of the momentum and learning rate used: the RMS errors of NN-a-S, NN-a-C, and NN-a-N differ only slightly (0.0478, 0.0473, and 0.0481), as do those of the NN-b models (NN-b-S, NN-b-C, and NN-b-N) and the NN-c models (NN-c-S, NN-c-C, and NN-c-N). This shows that the momentum and learning rate have no significant impact on training the NN models in this study. Table 4 RMS errors of the NN models for the training set
NN-a NN-b NN-c
-S 0.0478 0.0312 0.0205*
-C 0.0473 0.0406 0.0290
-N 0.0481 0.0426 0.0297
3.3 Testing NN Models To evaluate the performance of the nine NN models in terms of their predictive ability, the 7 samples in the test set are used. Table 5 shows, for each test sample, the average assessed values of the CU, AR, and AT images given by the 150 subjects, followed by the values predicted for the three images by the nine NN models trained in the previous section; the final line of Table 5 gives the RMS error of each model on the test set. As indicated in Table 5, the RMS error (0.0931) of the NN-a-N model is the smallest among the nine models, suggesting that the NN-a-N model has the highest predictive consistency (an accuracy rate of 90.69%, i.e., 100% - 9.31%) for predicting the values of the CU, AR, and AT images of character toys. This suggests that the NN-a-N model is most promising for modeling consumers' feelings on
product images of character toys. However, the other eight models also have a quite similar performance, as the difference between the RMS errors of the nine models is almost negligible. This seems to suggest that the different variables or factors (the momentum, learning rate, or the number of neurons in the hidden layer) have no significant impact on the predictive ability of the NN models. Table 5 Predicted image values and RMS errors of the NN models for the test set
Each line gives, for one test sample, the consumer-assessed CU/AR/AT values followed by the CU/AR/AT values predicted by NN-a-S, NN-a-C, NN-a-N, NN-b-S, NN-b-C, NN-b-N, NN-c-S, NN-c-C, and NN-c-N.
Sample 4:  63 52 54 | 37.94 48.78 48.57 | 41.65 48.79 49.07 | 38.08 50.56 47.26 | 37.84 48.12 49.01 | 38.27 50.02 46.73 | 37.80 48.63 48.82 | 38.01 48.01 49.00 | 38.03 48.76 49.47 | 38.31 48.58 48.43
Sample 7:  52 66 61 | 73.96 63.14 69.27 | 67.84 64.30 66.94 | 76.87 68.48 77.38 | 69.51 64.92 77.89 | 73.22 67.58 75.43 | 61.38 69.31 73.05 | 54.13 65.73 73.25 | 64.32 64.75 43.64 | 68.12 68.43 73.97
Sample 16: 62 74 72 | 48.50 61.18 71.77 | 53.84 66.32 72.20 | 52.09 65.38 71.99 | 50.12 62.35 59.45 | 49.96 64.96 71.93 | 52.57 67.79 72.26 | 71.92 72.53 59.84 | 50.16 66.76 78.13 | 50.41 67.42 78.82
Sample 21: 41 50 58 | 38.51 53.24 76.14 | 59.87 62.26 71.39 | 56.24 67.06 73.56 | 37.47 67.27 66.15 | 67.75 65.07 72.47 | 54.37 70.69 77.76 | 57.35 68.91 79.29 | 53.77 65.55 76.59 | 42.97 70.74 77.23
Sample 26: 58 71 68 | 69.56 59.54 58.30 | 67.44 61.91 66.43 | 55.04 65.56 69.57 | 52.45 73.63 58.01 | 67.75 60.93 61.15 | 54.24 69.47 75.82 | 50.48 66.53 61.77 | 51.37 60.39 74.30 | 43.17 71.08 75.77
Sample 30: 68 59 65 | 51.78 62.80 74.14 | 60.26 65.64 68.22 | 66.46 59.37 64.61 | 64.27 73.07 79.35 | 59.79 57.21 58.99 | 63.35 52.76 50.60 | 57.55 71.00 61.84 | 59.59 64.98 39.81 | 56.28 65.83 59.35
Sample 32: 61 49 51 | 73.43 62.88 69.23 | 69.28 60.58 65.25 | 69.29 49.18 52.30 | 70.79 64.15 77.12 | 64.96 52.92 52.53 | 61.44 52.65 49.46 | 70.83 58.62 58.73 | 75.13 69.76 54.75 | 66.79 51.11 45.89
RMSE (per model, same order): 0.1243 | 0.0995 | 0.0931* | 0.1274 | 0.1091 | 0.0937 | 0.1069 | 0.1288 | 0.1065
4 The Decision Support Model for New Product Forms The NN models enable us to build a design decision support database that can be used to help determine the optimal form design for best matching specific product
images. The design decision support database can be generated by inputting each of all possible combinations of form design elements to the NN models individually for generating the associated image values. The resultant character toy design decision support database consists of 4,860 (=3×3×3×4×5×3×3) different combinations of form elements, together with their associated CU, AR, and AT image values. The product designer can specify desirable image values for a new character toy form design, and the database can then work out the optimal combination of form elements. In addition, the design support database can be incorporated into a computer aided design (CAD) system to facilitate the form design in the new character toy development process. To illustrate, Table 6 shows the optimal combination of form elements for the new character toy design with the most “cute + artistic + attractive” image (the CU value being 75, the AR value being 63, and the AT value being 70). The product designer can follow this design support information to match the desirable product images and satisfy the consumers’ emotional feelings. Table 6 The optimal combination of form elements for the most “cute + artistic + attractive”
X1 Length ratio of head and body · X2 Width ratio of head and body · X3 Costume style · X4 Costume pattern · X5 Headdress · X6 Appearance of facial features · X7 Overall appearance
(Form types shown in the table include: <1:2, head>body, one-piece, two-pieces, robe, mixed, arc-shaped, partial features, entire features, personified style.)
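A sketch of how the 4,860-row decision support database can be generated and queried once a model is trained is given below; the trained `model`, the Euclidean distance criterion, and the target image values are placeholders and assumptions for the illustration.

```python
import itertools
import numpy as np

type_counts = [3, 3, 3, 4, 5, 3, 3]                      # X1..X7 type counts from Table 1
combinations = np.array(list(itertools.product(*[range(1, c + 1) for c in type_counts])))
print(len(combinations))                                  # 4860

def best_design(model, target, combos):
    """Return the form-element combination whose predicted CU/AR/AT
    values are closest (Euclidean distance) to the desired target image."""
    predicted = model.predict(combos)                     # shape (4860, 3)
    return combos[np.argmin(np.linalg.norm(predicted - np.asarray(target), axis=1))]

# e.g. best_design(models["NN-a-N"], target=[75, 63, 70], combos=combinations)
```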
5 Conclusion In this paper, we have demonstrated how NN models can be built to help determine the optimal product form design for matching a given set of product images, using an
experimental study on character toys. The consumer-oriented design approach has been used to build a character toy design decision support model which, in conjunction with a computer-aided design (CAD) system, helps product designers work out the product form in the new product development process. Although character toys are used as the experimental product, the consumer-oriented design approach presented can be applied to other consumer products with a wide variety of design form elements. Acknowledgments. This research is supported in part by the National Science Council of Taiwan, ROC under Grant No. NSC 99-2410-H-259-082.
References 1. Cross, N.: Engineering Design Methods: Strategies for Product Design. John Wiley and Sons, Chichester (2000) 2. Jonathan, C., Craig, M.V.: Creating Breakthrough Products- Innovation from Product Planning to Program Approval, pp. 1–31. Prentice Hall, New Jersey (2002) 3. Kim, J.U., Kim, W.J., Park, S.C.: Consumer perceptions on web advertisements and motivation factors to purchase in the online shopping. Computers in Human Behavior 26, 1208–1222 (2010) 4. Lai, H.-H., Lin, Y.-C., Yeh, C.-H., Wei, C.-H.: User Oriented Design for the Optimal Combination on Product Design. International Journal of Production Economics 100, 253–267 (2006) 5. Lai, H.-H., Lin, Y.-C., Yeh, C.-H.: Form Design of Product Image Using Grey Relational Analysis and Neural Network Models. Computers and Operations Research 32, 2689–2711 (2005) 6. Lin, Y.-C., Lai, H.-H., Yeh, C.-H.: Consumer-oriented product form design based on fuzzy logic: A case study of mobile phones. International Journal of Industrial Ergonomics 37, 531–543 (2007) 7. Lin, Y.-C., Lai, H.-H., Yeh, C.-H.: Consumer Oriented Design of Product Forms. In: Yin, F.-L., Wang, J., Guo, C. (eds.) ISNN 2004. LNCS, vol. 3174, pp. 898–903. Springer, Heidelberg (2004) 8. Lin, Y.-C., Lai, H.-H., Yeh, C.-H.: Neural Network Models for Product Image Design. In: Negoita, M.G., Howlett, R.J., Jain, L.C. (eds.) KES 2004. LNCS (LNAI), vol. 3215, pp. 618–624. Springer, Heidelberg (2004) 9. Nagamachi, M.: Kansei engineering: A new ergonomics consumer-oriented techology for product development. International Journal of Industrial Ergonomics 15, 3–10 (1995) 10. Nagamachi, M.: Kansei engineering as a powerful consumer-oriented technology for product development. Applied Ergonomics 33, 289–294 (2002) 11. Negnevitsky, M.: Artificial Intelligence. Addison-Wesley, New York (2002) 12. Petiot, J.F., Yannou, B.: Measuring consumer perceptions for a better comprehension, specification and assessment of product semantics. International Journal of Industrial Ergonomics 33, 507–525 (2004) 13. Walker, G.H., Stanton, N.A., Jenkins, D.P., Salmon, P.M.: From telephones to iPhones: Applying systems thinking to networked, interoperable products. Applied Ergonomics 40, 206–215 (2009) 14. Yamamoto, M., Lambert, D.R.: The impact of product aesthetics on the evaluation of industrial products. Journal of Product Innovation Management 11, 309–324 (1994)
Compromise in Scheduling Objects Procedures Basing on Ranking Lists Piech Henryk and Grzegorz Gawinowski
Abstract. In this work we analyze the possibility of supporting the ranking of objects (tasks) on the basis of a group of ranking lists. These lists can be obtained either from experts or with the help of simple approximating algorithms of low complexity. To support the analysis we can use elements of neighborhood theory [13], preferential models [5], and rough sets theory [16]. This supporting process is used to create the final list, i.e., the task sequence. Such problems are usually connected with distribution, classification, prediction, game strategies, and compromise-searching operations. The use of preference and domination models permits crisp inferences and constrains the chronological location of an object. In some situations we deal with the dynamic character of list filling, which results from tasks continuously arriving and continuously being assigned to executive elements. The theory of neighborhoods permits locating objects within a range of compromise solutions, bringing the result close to the dominating group of proposals. Our main task is to find the best compromise in the final location of objects. We want to define the advantages and drawbacks of methods based on the mentioned theories and to analyze the possibilities of their cooperation or mutual completion. Keywords: discrete optimization, ranking lists, compromise estimation.
1 Introduction
There are many applications of preference theory to problems of decision support [5, 8, 15]. Dynamic scheduling using preferential models and rough sets theory does not require essential changes to the algorithms based on these theories; it only adjusts the parameters of the data [14, 17].
Piech Henryk · Grzegorz Gawinowski, Czestochowa University of Technology, Dabrowskiego 73, e-mail: [email protected]
Preferences and dominations [7] are used to compare sequences of assigned tasks which need to be run. This, however, requires selecting data and defining profiles which represent the tasks (objects) with respect to execution preference (final location) before the process is started [9]. Domination in the Pareto and Lorenz sense permits settling the basic relations between sequences of well-ordered objects. Preferences of the type "at least as good as", estimated as intervals (by lower and upper bounds), permit defining a zone of uncertain solutions. In such situations, additional criteria (for example the costs of reorganization [2]) are used for decision making. For defining the location of tasks we can use elements of neighborhood theory [13], as well as cooperation, toleration and collision within the range of a neighborhood [9]. They are named according to the researched problems. They were connected, among others, with supporting or rejecting the thesis about a task's location in the centre of a given neighborhood. Closed neighborhoods confirm and support the decision (the thesis) about assigning the task to a specific location. The relation of tolerance has a reflexive and symmetrical character [13]. Cooperating neighborhoods intensify the strength of domination and reduce the influence of passivity or of the small influence of tolerance. The cooperation relation (support of the thesis) and the collision relation (postponement of supporting the thesis, which indirectly means support of the antithesis) strengthen the inference mechanisms. Cooperation has a reversible character. This kind of dependence between relations should simplify the creation of conclusions. When we engage the theory of neighborhoods in the procedure of establishing the sequence, we increase the autonomy of the studied task groups with reference to their distribution. At the same time, the symmetry of inference increases the power of decision support [13]. The next problem is connected with dynamic scheduling and appointing objective solutions (independent of the sequence or of the set of criteria or experts' opinions). Obviously, this is not always possible, but it is convenient to use interval solutions, particularly when solutions lie on the border of location classes according to a given criterion.
2 Compromise Estimation after the Process of Creating the Final Ranking List
A compromise is formed between the ingredient judgment lists, which were built with the help of algorithms or on the basis of experts' opinions. Several types of compromise can be created, for example:
1. minimum concessions and similar levels of concessions (minimum variance of concessions);
\[ cmp_1 = \sum_{j=1}^{m} \sum_{i=1}^{n} \big(loc(i,j) - loc_f(i)\big)^2 \rightarrow \min, \]
\[ \mathrm{var}\Big\{ \sum_{i=1}^{n} \big(loc(i,j) - loc_f(i)\big)^2 \Big\} \rightarrow \min, \qquad j = 1, 2, \ldots, m, \tag{1} \]
or
\[ \mathrm{var}\Big\{ \sum_{j=1}^{m} \big(loc(i,j) - loc_f(i)\big)^2 \Big\} \rightarrow \min, \qquad i = 1, 2, \ldots, n, \]
where var denotes the variance of concessions over the ingredient lists or over the tasks;
2. minimum distances between the centres of the neighborhoods with maximum power (or concentration) and the final task locations:
\[ cmp_2 = \sum_{i=1}^{n} \big(centr\_max\_pow(i) - loc_f(i)\big)^2 \rightarrow \min, \]
or
\[ cmp_2 = \sum_{i=1}^{n} \big(centr\_max\_concentration(i) - loc_f(i)\big)^2 \rightarrow \min, \tag{2} \]
where centr_max_pow denotes the centre of the maximum-power neighborhood and centr_max_concentration the centre of the maximum-concentration (numbering) neighborhood;
3. minimum correction of the final list according to the Lorenz preference locations:
\[ cmp_3 = \sum_{i=1}^{n} \big(Lorenz\_loc(i) - loc_f(i)\big)^2 \rightarrow \min. \tag{3} \]
Generally, we can describe the compromise as follows:
\[ cmp = \sum_{i=1}^{n} \big(criterion\_loc(i) - loc_f(i)\big)^2 \rightarrow \min, \tag{4} \]
where criterion_loc(i) is the location of the i-th object suggested by the chosen criterion. Different criteria, or compositions of criteria, can be used for the estimation of the compromise. As a result of using these criteria we often obtain the same location for different objects. In this case auxiliary criteria, methods or heuristic rules are needed. Sometimes we decide to use different criteria for compromise estimation and resign from the method based on creating final lists (Fig. 1).
Fig. 1 Distinct criteria sets for creating the final list and for the compromise: A ∩ B = ∅
In our convention (1)–(4) the best compromise corresponds to the smallest value of the parameter cmp. To compare the compromises obtained for several final lists we should keep the same criteria in set B.
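As an illustration of formulas (1) and (4), the following minimal Java sketch is added here (it is not part of the original paper; the array layout loc[j][i], with one row per ingredient list, is an assumption). It computes the total squared concession of a final list against the ingredient lists, the variance of the per-list concessions, and the general estimator of formula (4).

public final class CompromiseMetrics {

    // Total squared concession of the final list against all ingredient lists,
    // cmp1 = sum_{j=1..m} sum_{i=1..n} (loc(i,j) - locf(i))^2.
    public static double cmp1(int[][] loc, int[] locf) {
        double total = 0.0;
        for (int[] list : loc) {                       // j = 1..m ingredient lists
            for (int i = 0; i < locf.length; i++) {    // i = 1..n objects
                double d = list[i] - locf[i];
                total += d * d;
            }
        }
        return total;
    }

    // Variance of the concessions computed per ingredient list (formula (1)).
    public static double concessionVariance(int[][] loc, int[] locf) {
        int m = loc.length;
        double[] c = new double[m];
        for (int j = 0; j < m; j++) {
            for (int i = 0; i < locf.length; i++) {
                double d = loc[j][i] - locf[i];
                c[j] += d * d;
            }
        }
        double mean = 0.0;
        for (double v : c) mean += v;
        mean /= m;
        double var = 0.0;
        for (double v : c) var += (v - mean) * (v - mean);
        return var / m;
    }

    // General compromise estimator cmp = sum_i (criterion_loc(i) - locf(i))^2 (formula (4)).
    public static int cmp(int[] criterionLoc, int[] locf) {
        int total = 0;
        for (int i = 0; i < locf.length; i++) {
            int d = criterionLoc[i] - locf[i];
            total += d * d;
        }
        return total;
    }
}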
3 Sets of Criteria for Creating the Final List
It is necessary to define several criteria, because the results obtained from a single criterion are often not unambiguous. This means that several objects may pretend to one location on the final list. We propose several compositions of criteria:
1. sup(ϕ → ψ) → max; centre(ϕ, i) → min
2. cnbh(ϕ, i) → max; zone(ϕ, i) → min; centre(ϕ, i) → min
3. sup(∗ → ψ) + sup(ϕ ← ∗) → min; cnbh(ϕ, i) → max,   (5)
where sup(ϕ → ψ) → max denotes the maximal number of objects in one placement across the ingredient lists, where object ϕi is placed on position ψj; centre(ϕ, i) → min denotes the minimal position of a neighborhood centre: we choose the object ϕi from the neighborhood which is closest to the beginning of the list and locate it in the centre of its neighborhood; cnbh(ϕ, i) → max denotes the maximal neighborhood concentration: we choose the object ϕi with the maximal neighborhood concentration (numbering) and locate it in its centre; zone(ϕ, i) → min denotes the minimal distance of a neighborhood from the beginning of the list: we choose the object ϕi with the minimal neighborhood distance and locate it in its centre; sup(∗ → ψ) + sup(ϕ ← ∗) → min denotes the minimal number of objects pretending to position ψj together with the minimal number of positions to which object ϕi pretends: we choose object ϕi and locate it on position ψj (intuition criterion).
We often obtain the same value of the criteria estimators. In this case we should go to the next criterion in the hierarchy, considering the same object, and then search for the best location for it. A similar situation appears when the chosen location is occupied by previously located objects.
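The tie-breaking rule just described (fall through to the next criterion in the hierarchy when estimator values are equal) maps naturally onto a chained comparator. The sketch below is a minimal Java illustration under assumed estimator fields (the names concentration, zone and centre are placeholders, and the sample values are invented); it orders candidate objects by criteria composition 2 of (5).

import java.util.Arrays;
import java.util.Comparator;

public final class CriterionHierarchyDemo {

    // A candidate object together with its (hypothetical) criterion estimators.
    static final class Candidate {
        final int id;
        final int concentration; // cnbh(phi, i): neighbourhood concentration (numbering)
        final int zone;          // zone(phi, i): distance of the neighbourhood from the list head
        final int centre;        // centre(phi, i): position of the neighbourhood centre
        Candidate(int id, int concentration, int zone, int centre) {
            this.id = id;
            this.concentration = concentration;
            this.zone = zone;
            this.centre = centre;
        }
    }

    public static void main(String[] args) {
        // Criteria composition 2 of (5): cnbh -> max, then zone -> min, then centre -> min.
        Comparator<Candidate> hierarchy =
                Comparator.comparingInt((Candidate c) -> c.concentration).reversed()
                          .thenComparingInt(c -> c.zone)
                          .thenComparingInt(c -> c.centre);

        Candidate[] candidates = {
                new Candidate(6, 6, 1, 3),   // illustrative values only
                new Candidate(3, 4, 1, 2),
                new Candidate(4, 4, 2, 4),
                new Candidate(7, 4, 6, 8)
        };
        Arrays.sort(candidates, hierarchy);

        // Objects earlier in the sorted order are assigned to the final list first;
        // ties on one criterion are resolved by the next criterion in the hierarchy.
        for (Candidate c : candidates) {
            System.out.printf("object %d -> centre position %d%n", c.id, c.centre);
        }
    }
}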
4 Methods and Examples of Creating Final Lists of Scheduled Objects
For scheduling objects we can use rules from the theories of:
– neighborhoods,
– preferences,
– rough sets.
Besides the criteria set, we can use specific methods traditionally applied to the classification, categorization and ordering of objects [16]. We try to enrich every proposed method by showing an example. The exploitation of neighborhood theory to define the criteria set was described above. It is possible to combine elements of the quoted theories in different ways:
1) neighborhoods + rough sets. We can create the lower approximation P(O) [16] as the set of maximal-concentration (or maximal-power) neighborhoods and the upper approximation as the set of all object locations. In this case the main structure (O) is defined by the union of all neighborhoods.
2) neighborhoods + preferences. We can define a preference relation between neighborhoods (or maximal neighborhoods) using their characteristics (concentration, power).
3) neighborhoods + preferences + rough sets. From the set of the upper approximation we choose and remove the extremely located neighborhoods and locate the corresponding objects in their neighborhood centres.
The researched distribution of objects can be exploited by rough sets theory (Pawlak theory). Using Pawlak theory [16] we can adapt the semantics of the terminology to the physical sense of the problem, e.g. the relative zone (O). In our case (ordering objects by several algorithms simultaneously) we can define the relative zone as the range of positions in which the most important neighborhoods representing all objects (the lower approximation) are included. The relative zone has a common part with the less important neighborhoods.
\[ \bigcup_{nbh(i,\max)\,\subseteq\,(O)} nbh(i,\max) = \underline{P}(O) \quad \text{(lower approximation)} \tag{6} \]
\[ \bigcup_{nbh(i,\ast<\max)\,\cap\,(O)\,\neq\,\emptyset} nbh(i,\ast<\max) = \overline{P}(O) \quad \text{(upper approximation)} \tag{7} \]
So, in our case the relative zone (O) can be named the representative zone; it contains objects on all positions, (O) = (1) + (2) + ... + (8). This zone is systematically cut off (from both sides) during the extraction of objects to the final list (tables in Fig. 2). Hence, this zone has a dynamic length.
Fig. 2 Stage tables of the creation of the final list
Example 1: 1 3 2 6 4 5 8 7 (final list, last stage).
The drawback of the method presented above is that it prefers the location of neighborhood centres over their numbering. When we use the Lorenz preference rules [6] we can simply calculate the average locations of all objects. In our example (Fig. 2) we get:
pL(1) = aver(loc(j1)) = (1 + 1 + 3 + 8)/4 = 3.25
pL(2) = aver(loc(j2)) = (1 + 2 + 4 + 7)/4 = 3.5
pL(3) = aver(loc(j3)) = (1 + 2 + 2 + 3)/4 = 2
pL(4) = aver(loc(j4)) = 4.5
pL(5) = aver(loc(j5)) = (4 + 6 + 6 + 8)/4 = 6
pL(6) = aver(loc(j6)) = (2 + 3 + 3 + 5)/4 = 3.25
pL(7) = aver(loc(j7)) = (6 + 7 + 8 + 8)/4 = 7.25
pL(8) = aver(loc(j8)) = (5 + 6 + 7 + 7)/4 = 6.25
where
\[ pL(i) = aver(loc(ji)) = \frac{1}{m}\sum_{j=1}^{m} loc(\varphi(i,j)) \]
is the strength of the Lorenz preference characteristic. After ordering we have the final ranking pL(3), pL(1), pL(6), pL(2), pL(4), pL(5), pL(8), pL(7), or in the form:
Example 2: 3 1 6 2 4 5 8 7 (final list, preference in the Lorenz sense).
The drawback of this approach is that it takes into account less important data (such as single object locations). In order to cover only the essential information we should use neighborhoods alone (not single object locations) and prepare their characteristics:
Fig. 3 Characteristics of neighborhoods for all tasks
\[ pn(i) = \sum_{j=1}^{ln(i)} numb(nbh\,\varphi(i,j)) \cdot centre(\varphi(i,j)) \Big/ \sum_{j=1}^{ln(i)} numb(nbh\,\varphi(i,j)), \tag{8} \]
where ln(i) is the number of neighborhoods for the i-th object, numb(nbh(i, j)) is the numbering (concentration) of the j-th neighborhood for the i-th object (table in Fig. 3), and centre(ϕ(i, j)) is the centre of the j-th neighborhood for the i-th object (Fig. 3):
pn(1) = 2 ∗ 1/2 = 1    pn(5) = 2 ∗ 6/2 = 6
pn(2) = 2 ∗ 1/2 = 1    pn(6) = 6 ∗ 3/6 = 3
pn(3) = 4 ∗ 2/4 = 2    pn(7) = 4 ∗ 8/4 = 8
pn(4) = 4 ∗ 4/4 = 4    pn(8) = 4 ∗ 7/4 = 7
Example 3: 1 2 3 6 4 5 8 7 (final list, gravity points for every object).
To analyze and compare the chosen methods we propose a set of choices (1), for example:
cnbh(ϕ, i) → max; zone(ϕ, i) → min; centre(ϕ, i) → min,
and with their help we formulate the final list. This gives a solution with the following structure:
Example 4: 1 3 6 4 2 5 8 7 (final list, set of criteria).
In this case we have the following sequence of tasks joining the final list:
1) ϕ4 → ψ4  2) ϕ8 → ψ7  3) ϕ7 → ψ8  4) ϕ3 → ψ2
1) ϕ6 → ψ3  2) ϕ1 → ψ1  3) ϕ5 → ψ6  4) ϕ2 → ψ5
According to this method we use the essential data and omit single object placements and deviations.
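A compact Java sketch of the two characteristics used above may help: the Lorenz average location pL(i) and the neighborhood gravity point pn(i) of formula (8). It is an illustration added here, not code from the paper; the example data reproduce the object-3 and object-6 values quoted above, and the method names are invented.

public final class RankingCharacteristics {

    // Lorenz average location: pL(i) = (1/m) * sum_{j=1..m} loc(phi(i,j)).
    public static double lorenzAverage(int[] positionsOfObject) {
        double sum = 0.0;
        for (int p : positionsOfObject) sum += p;
        return sum / positionsOfObject.length;
    }

    // Gravity point of formula (8): weighted mean of neighbourhood centres,
    // weighted by the neighbourhood numbering (concentration).
    public static double gravityPoint(int[] numb, int[] centre) {
        double weighted = 0.0, weights = 0.0;
        for (int j = 0; j < numb.length; j++) {
            weighted += numb[j] * centre[j];
            weights  += numb[j];
        }
        return weighted / weights;
    }

    public static void main(String[] args) {
        // Object 3 occupies positions 1, 2, 2, 3 on the four ingredient lists: pL(3) = 2.0.
        System.out.println(lorenzAverage(new int[]{1, 2, 2, 3}));
        // Object 6 with one neighbourhood of numbering 6 centred at position 3: pn(6) = 3.0.
        System.out.println(gravityPoint(new int[]{6}, new int[]{3}));
    }
}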
5 An Example of Exploiting the Compromise to Judge a Set of Final Lists
For choosing the compromise criterion we can be guided by the quantity of information used in the estimation process. Such an approach suggests exploiting the Lorenz preferences as the compromise criterion. In the next step we estimate the scale of the differences between the final lists and the list created on the basis of the Lorenz preference. According to (4) we do this for all solutions:
1) cmp = Σ_{i=1}^{n} (criterion_loc(i) − loc_f(i))² = (3 − 1)² + (1 − 3)² + (6 − 2)² + (2 − 6)² + (4 − 4)² + (5 − 5)² + (8 − 8)² + (7 − 7)² = 40
3) cmp = Σ_{i=1}^{n} (criterion_loc(i) − loc_f(i))² = (3 − 1)² + (1 − 2)² + (6 − 3)² + (2 − 6)² + (4 − 4)² + (5 − 5)² + (8 − 8)² + (7 − 7)² = 30
4) cmp = Σ_{i=1}^{n} (criterion_loc(i) − loc_f(i))² = (3 − 1)² + (1 − 3)² + (6 − 6)² + (2 − 4)² + (4 − 2)² + (5 − 5)² + (8 − 8)² + (7 − 7)² = 16
min{cmp(1); cmp(3); cmp(4)} = min{40; 30; 16} = 16
To find the final list nearest to the compromise solution we introduce an additional parameter defining the method code. For example, we extend the location attribute name to the form loc_{fk}(i), where k is the code of the method used for creating the final list (corresponding to the examples above). The compromise expression stays simple and might have the following form:
\[ cmp = \sum_{k=1}^{lm}\sum_{i=1}^{n} \big(criterion\_loc(i) - loc_{fk}(i)\big)^2 \rightarrow \min, \tag{9} \]
where lm is the number of ordering methods based on the analysis of the ingredient lists.
The best compromise according to the Lorenz criterion is found in Example 4. Obviously, when we choose a different compromise criterion, the best solution will vary. Sometimes we have at our disposal a set of compromise criteria. In this case the rules of searching for the compromise can be expressed by:
\[ cmp = \sum_{j=1}^{lc}\sum_{k=1}^{lm}\sum_{i=1}^{n} \big(criterion\_loc_j(i) - loc_{fk}(i)\big)^2 \rightarrow \min, \tag{10} \]
where lc is the number of compromise criteria. If we use the same methods (criteria) for creating both the final lists and the compromise stencil list, then the components \(\sum_{i=1}^{n} \big(criterion\_loc_d(i) - loc_{fd}(i)\big)^2\), where d refers to the method (or criterion) chosen for both tasks, will obviously be equal to zero, and they do not influence the final compromise estimator level at all.
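The selection among Examples 1, 3 and 4 can be reproduced with a few lines of code. The following Java fragment is an illustration added here (not from the paper); it compares each final list against the Lorenz stencil list using formula (4) and reports the list with the smallest estimator.

import java.util.LinkedHashMap;
import java.util.Map;

public final class CompromiseSelection {

    static int cmp(int[] stencil, int[] finalList) {
        int total = 0;
        for (int i = 0; i < stencil.length; i++) {
            int d = stencil[i] - finalList[i];
            total += d * d;                 // formula (4), squared positional difference
        }
        return total;
    }

    public static void main(String[] args) {
        int[] lorenzStencil = {3, 1, 6, 2, 4, 5, 8, 7};   // Example 2 (Lorenz preference)
        Map<String, int[]> finalLists = new LinkedHashMap<>();
        finalLists.put("Example 1", new int[]{1, 3, 2, 6, 4, 5, 8, 7});
        finalLists.put("Example 3", new int[]{1, 2, 3, 6, 4, 5, 8, 7});
        finalLists.put("Example 4", new int[]{1, 3, 6, 4, 2, 5, 8, 7});

        String best = null;
        int bestCmp = Integer.MAX_VALUE;
        for (Map.Entry<String, int[]> e : finalLists.entrySet()) {
            int c = cmp(lorenzStencil, e.getValue());
            System.out.println(e.getKey() + ": cmp = " + c);   // 40, 30 and 16 respectively
            if (c < bestCmp) { bestCmp = c; best = e.getKey(); }
        }
        System.out.println("Best compromise: " + best);        // Example 4
    }
}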
6 Conclusions
The experiments show that combining the methods of neighborhoods, preferences and rough sets for the analysis of ranking lists is very convenient and permits exploiting a rich part of the information for creating the final list and the compromise solution. The situation does not become more difficult even when we have at our disposal the same set of methods for creating the final lists and the compromise list. In neighborhood theory we use tools for eliminating inessential information, as opposed to some variants of the preference rules, whereas using the preference methods we can create reference stencils. The specific character of the rough sets description permits not only rejecting objects with inessential attribute values, but at the same time dislocating objects using the current compromise decisions. Neighborhood estimators are less unambiguous but do not regard inessential data.
References
1. Blazewicz, J., Lenstra, J.K., Rinnooy Kan, A.H.G.: Scheduling subject to resource constraints: Classification and complexity. Discrete Appl. Math. 5, 11–24 (1983)
2. Brzezinska, I., Greco, S., Slowinski, R.: Mining Pareto-optimal rules with respect to support and anti-support. Engineering Applications of Artificial Intelligence 20(5), 587–600 (2007)
3. Conway, R.W., Maxwell, W.L., Miller, L.W.: Theory of Scheduling. Addison-Wesley, Reading (1954)
4. Crupi, V., Tentori, K., Gonzalez, M.: On Bayesian confirmation measures of evidential support: Theoretical and empirical issues. Philosophy of Science
5. Finch, H.A.: Confirming Power of Observations Metricized for Decisions among Hypotheses. Philosophy of Science 27, 391–404 (1999)
6. Greco, S., Matarazzo, B., Slowinski, R., Stefanowski, J.: An algorithm for induction of decision rules with the dominance principle. In: Rough Sets and Current Trends in Computing. LNCS (LNAI), pp. 304–313. Springer, Berlin (2005)
7. Greco, S., Matarazzo, B., Slowinski, R.: Axiomatic characterization of a general utility function and its particular cases in terms of conjoint measurement and rough sets decision rules. European J. of Operational Research (2003)
8. Greco, S., Matarazzo, B., Slowinski, R.: Extension of the rough set approach to multicriteria decision support. INFOR 38, 161–196 (2000)
9. Greco, S., Matarazzo, B., Slowinski, R.: Rough sets theory for multicriteria decision analysis. European J. of Operational Research 129, 1–47 (2001)
10. Greco, S., Pawlak, Z., Slowinski, R.: Can Bayesian confirmation measures be useful for rough set decision rules? Engineering Applications of Artificial Intelligence 17, 345–361 (2004)
11. Greco, S., Slowinski, R., Szczech, I.: Assessing the quality of rules with a new monotonic interestingness measure Z. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2008. LNCS (LNAI), vol. 5097, pp. 556–565. Springer, Heidelberg (2008)
12. Hilderman, R., Hamilton, H.: Knowledge Discovery and Measures of Interest. Kluwer Academic Publishers, Dordrecht (2001)
13. Jaron, J.: Systemic Prolegomena to Theoretical Cybernetics. Scient. Papers of Inst. of Techn. Cybernetics 25 (1975)
14. Kent, R.E.: Rough concept analysis: A synthesis of rough sets and formal concept analysis. Fundamenta Informaticae 27, 169–181 (1996)
15. Kleinberg, J.: Navigation in a small world. Nature 406, 845 (2000)
16. Kohler, W.H.: A preliminary evaluation of the critical path method for scheduling tasks on multiprocessor systems. IEEE Trans. Comput. 24, 1235–1238 (1975)
17. Nikodem, J.: Autonomy and Cooperation as Factors of Dependability in Wireless Sensor Networks. IEEE Computer Society P3179, 406–413 (2008)
18. Pawlak, Z., Sugeno, M.: Decision Rules, Bayes' Rule and Rough Sets, New Decisions in Rough Sets. Springer, Berlin (1999)
19. Pawlak, Z.: Rough Sets: Present State and the Future. Foundations, vol. 18(3-4) (1993)
20. Piech, H. (ed.): Analysis of possibilities and effectiveness of combining rough set theory and neighbourhood theories for solving the dynamic scheduling problem, vol. P3674, pp. 296–302. IEEE Computer Society, Washington (2009)
21. Skowron, A.: Extracting laws from decision tables. Computational Intelligence 11(2), 371–388 (1995)
22. Slowinski, R., Brzezinska, I., Greco, S.: Application of Bayesian confirmation measures for mining rules from support-confidence Pareto-optimal set. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2006. LNCS (LNAI), vol. 4029, pp. 1018–1026. Springer, Heidelberg (2006)
23. Szwarc, W.: Permutation flow-shop theory revised. Math. Oper. Res. 25, 557–570 (1978)
24. Syslo, M.M., Deo, N., Kowalik, J.S.: Algorytmy optymalizacji dyskretnej. PWN, Warszawa (1995)
25. Talbi, E.D., Geneste, L., Grabot, B., Previtali, R., Hostachy, P.: Application of optimization techniques to parameter set-up in scheduling. Computers in Industry 55(2), 105–124 (2004)
Decision on the Best Retrofit Scenario to Maximize Energy Efficiency in a Building Ana Campos and Rui Neves-Silva
Abstract. Building owners, or investors, and facility managers, or building technical consultants, have the difficult task of maintaining an infrastructure by selecting the most adequate investments. Nowadays, this maintenance means in most countries to update a building to current regulations regarding energy efficiency. The decision to retrofit a building involves several actors and a diverse set of criteria, covering technical, economical, social and financial aspects. This paper presents a novel approach to support investors and technical consultants in selecting the most appropriate energy-efficient retrofit scenario for a building. The proposed approach uses the actual energy consumption of the building to predict energy profiles of introducing new control strategies to increase energy efficiency. Additionally, the approach uses the Analytic Hierarchy Process combined with benefits, opportunities, costs and risks and a sensitivity analysis to support actors in selecting the best scenario to invest.
1 Introduction
In the last decade, increased attention has been given to energy efficiency and greenhouse gas emissions. According to the International Energy Agency, global energy demand will grow 55% by 2030. In the period up to 2030, the energy supply infrastructure worldwide will require a total investment of USD 26 trillion, with about half of that in developing countries. If the world does not manage to green these investments by directing them into climate-friendly technologies, emissions will go up by 50% by 2050, instead of down by 50%, as science requires (United Nations Framework Convention on Climate Change 2011). The European Commission has prioritized climate change and energy in its new Europe 2020 strategy, establishing ambitious goals: 20% reduction of greenhouse gas emissions, meeting 20% of energy needs from renewable sources, and reducing energy consumption by 20% through increased energy efficiency.
Ana Campos, UNINOVA, FCT Campus, 2829-516 Caparica, Portugal
Rui Neves-Silva, DEE-FCT, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
In order to meet EU targets, the European Commission and the countries have approved legislation to be applied. One of the areas where regulations have been significant is related to buildings. The European Union has now European and national laws that have to be complied particularly by new buildings. However, there is also a growing concern about existing buildings, especially about the possibilities to modernize them, making them more efficient from an energy perspective. Many recent reports on energy efficiency in buildings stress that occupants behavior is one of the most important aspects to achieve energy efficient buildings: “The behavior of building’s occupants can have as much impact on energy consumption as the efficiency of equipment”, (Parker and Cummings 2008) (Vieira 2006), “a heightened energy consumption awareness is expected to stimulate behavioral changes both at household and enterprise level” (European Commission 2008), “a smart building is only as smart as the people running it” (Powell 2009). Recently, several European research projects were approved addressing building’s occupants behavior, such as BeAware - Boosting Energy Awareness with mobile interfaces and real-time feedback (BeAware 2009), DEHEMS - Digital Environmental Home Energy Management System (DEHEMS 2009) or Beywatch - Building Energy Watcher (Beywatch 2009). From the perspective of the building user, all these projects focus on metering energy consumption and advising users in real-time about energy consumptions that are higher than the expected ones. The concept proposed on this paper follows a novel approach by measuring energy consumptions in the building, outdoor environmental data (weather conditions, urban context etc.) and indoor environmental data (luminance, temperature, humidity, CO2 concentration, etc.). These data are collected and processed identify the cause-effect relations between building’s occupant’s behavior and energy consumption in the building. The objective is to identify the most adequate control technologies to be used in retrofitting the building to increase energy efficiency, without disturbing or reducing the comfort level of the building’s occupants. The approach supports the human actors in selecting the best retrofit scenario, considering technical, social, economic and financial criteria.
2 Concept and Objectives When the owner of a building decides to renovate an existing infrastructure, with energy efficiency in mind, would it help to know exactly where energy is being spent, i.e. the real use of the infrastructure? The work here proposed takes this assumption as truth. The key hypothesis is that the data gathered on how an infrastructure is being used may serve to improve the accuracy on prediction of future energy consumption impact of installing alternative sets of available technologies, including controllers. This will also serve to justify the necessary renovation investment based on a financial return-on-investment calculation. The objective of this work is to develop a reliable method to support decisionmaking on energy-efficient investments in building renovations. This objective is critically important to convince building owners to renovate with energy-saving, energy-generating, and energy-storing solutions.
The proposed approach gathers data on energy consumption of an existing infrastructure, crossing it with the building’s use, to define a baseline energy consumption model of the building. This baseline scenario can then be used to predict consumption of different scenarios, comprehending energy-efficient technologies and control solutions. The system monitors the usage of a building, models the building’s energy consumption, and uses these two elements to predict energy consumption under alternative scenarios based on available market solutions and provide recommendations for a best solution, taking into consideration the decision-makers’ criteria and restrictions, as presented in Fig. 1.
Fig. 1 The proposed concept.
The exploration of this hypothesis requires solving the key problem: how to define and prepare the energy consumption software to make use of all this data, and return a coherent answer? The solution proposed includes a detachable sensor network and an energy prediction decision support system. The sensor network is installed to audit the building usage and provide all the necessary information to specify an energy consumption model of the building, which can be extrapolated to cover a pre-defined period of time (usually the models are annual). The building audit data and the information about the infrastructure are then used by the system to identify potential control technologies that could be applied to improve the building. The system constructs renovation scenarios, using one or more control technologies, with the objective of increasing energy efficiency. Furthermore, the system uses the data of the baseline scenario to predict the energy consumption of the proposed solutions. The retrofit scenarios identified can have different complexity levels. One building may only need to exchange luminaires in several rooms to improve light
provided, while another building may have to combine that with presence sensors and heat and ventilation control.
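As an illustration of the prediction step described above, the sketch below is added here as a simplified stand-in, not the project's actual consumption model; the end-use shares, saving factors, prices and investment figures are invented placeholders. It extrapolates audited consumption to an annual baseline, applies per-end-use saving factors for a retrofit scenario, and derives a simple payback period.

public final class ScenarioPrediction {

    // Extrapolate an audited period (e.g. a few metered weeks) to an annual baseline in kWh.
    static double annualBaseline(double auditedKwh, int auditedDays) {
        return auditedKwh * 365.0 / auditedDays;
    }

    // Predicted annual consumption of a retrofit scenario: the baseline reduced by a
    // saving factor per affected end use (lighting, HVAC, ...) weighted by its share.
    static double predictScenario(double baselineKwh, double[] endUseShare, double[] savingFactor) {
        double predicted = 0.0;
        for (int k = 0; k < endUseShare.length; k++) {
            predicted += baselineKwh * endUseShare[k] * (1.0 - savingFactor[k]);
        }
        return predicted;
    }

    public static void main(String[] args) {
        double baseline = annualBaseline(12_000.0, 60);            // 60 audited days
        double[] share  = {0.30, 0.50, 0.20};                      // lighting, HVAC, other
        double[] saving = {0.40, 0.15, 0.00};                      // scenario: new luminaires + presence sensors
        double predicted = predictScenario(baseline, share, saving);

        double energyPrice = 0.15;                                  // EUR per kWh (placeholder)
        double investment  = 9_000.0;                               // EUR (placeholder)
        double annualSaving = (baseline - predicted) * energyPrice;
        System.out.printf("baseline %.0f kWh, scenario %.0f kWh, payback %.1f years%n",
                baseline, predicted, investment / annualSaving);
    }
}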
3 Decision-Making Process The decision-making process starts with defining the objective of the renovation (e.g. invest capital, increase comfort, decrease energy costs) and ends with selecting the most appropriate renovation scenario to be implemented. The purpose of the decision-making process is to select a renovation scenario that best addresses the requirements of a specific building and the expectations of the investors. Within this process, two actors have been identified: the technical consultant who usually manages and/or maintains the building and the investor who owns the building and controls the financial resources. However, from organizational point of view these two roles can even be played by the same person. The decision-making process comprehends several steps, as represented in Fig. 2. The complete process includes four decision points, and additional steps.
Fig. 2 Decision-making process.
Select Renovation Solution: The technical consultant and investor identify the need to renovate a specific building and define the objective to be achieved. The technical consultant defines assessment parameters and requests an audit to the building, to gather data on actual energy consumption. The audit data is extrapolated to a baseline scenario that represents the energy profile of the building before renovation. The technical consultant identifies renovation solutions based on the technical knowledge of the building and the retrofit objectives. The result of this step is a collection of renovation solutions that address the retrofit objectives. Check Technical Criteria: The technical consultant examines the specification of each selected renovation solution and tries to identify if it is compatible with the existing building infrastructure. The result of this step is the list of renovation solutions annotated on their individual applicability for the specific building. Build a scenario based on solution i? The technical consultant studies the renovation solutions and the retrofit objectives, and selects the renovation solutions that should be considered for implementation. Elaborate Renovation Scenario: The technical consultant has expertise to build different scenarios, which are the several alternatives for the final decision. It is recommended to elaborate simple and perhaps cheaper scenarios, but also more complex and more expensive ones. This divergence allows an enlarged decision space for the investor. The result of this step is a collection of renovation scenarios that comprise the renovation solutions selected and filtered in the previous steps. Check Regulations: The technical consultant studies each renovation scenario and checks it against legislation and regulation applicable to the specific location. If one of the scenarios fails to comply with all necessary regulation, it should be reworked, or ultimately disregarded. The result of this step is a list of renovation scenarios annotated on regulatory applicability in the specific building location. Approve scenario j? This second decision point is to select renovation scenarios that will be simulated to calculate envisaged energy consumption. Simulate Scenarios: Each of the scenarios includes detailed technical specification data that can be used to estimate the energy consumption of installed control technologies. This information is aggregated to the baseline scenario that represents the current building infrastructure. The result is an energy consumption profile of the building with the renovation scenario already implemented. The result of this step is a list of renovation scenarios with information about the respective energy consumption pattern in the current building. Approve simulated scenarios? This third decision point is to approve the renovation scenarios, including the simulations of the respective energy consumption for the current building. The technical consultant analyses each scenario and checks if it fits the technical and financial objectives of the current situation. Score and order scenarios: This step elaborates a benefit, opportunity, cost and risk (BOCR) analysis of each renovation scenario and presents it to the investor to select the most appropriate scenario for implementation in the specific building. Moreover, the investor should have the option to prioritize the decision criteria and merits used to order the alternatives.
Re-define and Approve Criteria/Parameters: The investor, alone or with the support of the technical consultant, defines the decision criteria (e.g. aesthetics, comfort, corporate reputation, brand ambitions) and parameters (e.g. interest rates, energy prices, credit conditions, tax incentives). By iterating between this step, the previous one and the following one, the investor has the possibility of performing a sensitivity analysis of the scenarios being considered. The result of this step is a complete set of decision criteria and parameters to be used by the previous step when realizing the cost-benefit analysis of each renovation scenario.
Select one of the renovation scenarios? This is the final decision of choosing the renovation scenario to be implemented for the current building. The investor analyses the scenarios ordered in relation to the decision criteria and parameters defined. One of the scenarios should be the baseline scenario built from the auditing data, so that the investor can assess the cost of “doing nothing” to the building.
4 Benefits, Opportunities, Costs and Risks Analysis
The main objective of the approach presented in this paper is to support an investor or technical consultant in selecting the best investment scenario to increase energy efficiency in a specific building. The support given to the users is in the form of a benefits, opportunities, costs and risks (BOCR) analysis, using the analytic hierarchy process (AHP). This combination is used to order the retrofit scenarios being considered and provide a clear indication of financial viability. The Analytic Hierarchy Process is a theory of measurement concerned with deriving dominance priorities from paired comparisons of homogeneous elements with respect to a common criterion or attribute (Saaty 1994). This process uses a series of one-on-one comparisons to rate a series of alternatives to arrive at the best decision. The problem here is how to identify and formulate the alternatives to be considered. AHP has been used to perform BOCR analysis with diverse discussion (Wedley et al. 2001) and successful results (Saaty 2005) (Lee et al. 2009) (Longo et al. 2009). This paper uses the latest developments in the field, re-using results from critiques made to the use of AHP and BOCR. The AHP is especially suited for complex decisions, involving several actors, even from diverse backgrounds. The hierarchical representation of the problem is very well suited to identifying the criteria to consider. The current decision problem is the choice of the most appropriate technological solution to improve a building’s energetic efficiency. In order to use AHP to perform a BOCR analysis, the approach proposes four hierarchies, where each one represents one of the merits being considered. This means that criteria are separated in four groups, and the users will compare the criteria and alternatives using the four established hierarchies. The authors have elaborated a list of criteria that can be used for this application, with the support of several industrial companies (Campos et al. 2010) (Campos 2010). The list is quite general and can cover different decisions. In each situation, it is possible to discard any of the criteria or add new ones. The method developed is
not in any way tied to this list. The suggested criteria include, on top of technical performance (which equals energy efficiency performance):
• Benefits
o Energy savings
o Comfort level
o Use of government program (e.g. tax incentive)
o Occupants satisfaction
• Opportunities
o Flexibility for reconfiguration
o Compliance with regulations
o Impact on company’s image
• Costs
o Equipment costs
o Operating costs
o Training costs
o Personnel costs
o Expertise (outsourcing costs)
• Risks
o Sensitivity on future energy prices
o Sensitivity on future user behavior
o Technological obsolescence
o Efficiency degradation
o Visual impact on building
o Noise and vibrations impact
Each hierarchy has criteria (in one or several levels) and a lower level with the alternatives to be considered. In the current approach the alternatives are the renovation scenarios identified and simulated, as described in the previous section. The decision-making process can be made individually (e.g. by the investor alone) or in a group involving several actors (investor, technical consultant, facility manager etc.). The process comprehends the following steps:
• actors provide their judgments on the criteria of the four hierarchies, resulting in four matrices for each actor;
• the priorities of each criterion provided by each actor are calculated using the eigenvalue of each matrix;
• the priorities of the several actors are combined in group priorities using a weighted arithmetic mean, resulting in four vectors (one for each of the BOCR aspects);
• the actors judge the alternatives regarding the several criteria of each hierarchy, resulting again in four matrices per actor;
• the priorities of the alternatives provided by each actor are calculated using the eigenvalues of each matrix and then normalized according to the ideal AHP mode, resulting in one alternative scoring 1 for each criterion;
• the priorities of the several actors are again combined using a weighted arithmetic mean, resulting in four matrices;
• the actors define the merits of each of the BOCR aspects, i.e. assigning a weight;
• the alternatives are ordered using the subtractive formula to calculate the overall priorities.
The criteria are judged by the different actors using the scale proposed by Saaty, of numbers 1-9 and reciprocals. This scale uses 1 to define items of equal importance and differences up to 9, representing absolute importance. The use of the ideal AHP mode is important, and has been proven to establish the priority of an alternative independently of any scale used. In this mode, the most relevant alternative always ranks 1 and all the others are related to that. This achieves independence of scale, which will be necessary when combining priorities, and particularly different hierarchies. The final priority of each alternative is calculated using the subtractive BOCR formula, defined as
Pi = b·Bi + o·Oi − c·Ci − r·Ri,     (1)
where b, c, o and r represent the merits (weights) of each aspect and Bi, Oi, Ci and Ri represent the priorities given for alternative i in each of the four hierarchies. This formula has provided successful results in the works referenced before, unlike others, such as the additive or multiplicative formulas. The objective of the formula used is to provide a positive result for alternatives that have more positive aspects (benefits and opportunities) than negative (costs and risks) and a negative result for alternatives that do not reach a breakeven point. One important aspect of the approach is how to identify the merits of each of the four aspects being considered, i.e. how to determine b, c, o and r. Actors use a fifth hierarchy, designated the control hierarchy, containing strategic criteria to rate the merits of benefits, opportunities, costs and risks. The authors suggest the following control criteria:
• Performance
• Sustainability
• Company’s strategy
• Time to implement decision
• Growth
Decision on the Best Retrofit Scenario to Maximize Energy Efficiency in a Building
861
Once the merits of each aspect are identified and the final priorities of the alternatives calculated, the users can perform a sensitivity analysis by adjusting the merits in about 10%.
5 Applications The approach presented in this paper will be applied and validated in real infrastructures in the scope of the following two scenarios: • The owner of the infrastructure requests an evaluation to a construction company on possible solutions to reduce the infrastructure energy intensity. The construction company uses the proposed system to assess the infrastructure usage and offer possible solutions supported by the results of the tool. In case of viable solutions, the owner accepts (or requests further iterations) and the solution is installed in the infrastructure. The end-user of the system is the construction company responsible for the installation of the solution, with the collaboration of the owner that will decide on the investment. • A company, responsible for managing the infrastructure, uses the EnPROVE tool to evaluate possible cost reduction scenarios through the installation of alternative energy-efficient control systems technologies. In case of viable solutions, the infrastructure manager decides for a solution and installs it. The end-user of the EnPROVE tool is the infrastructure manager responsible for deciding the investment and installing the solution.
6 Conclusions and Future Work This paper presents a novel approach to support renovation investment decisions on existing building, aiming at increasing energy efficient. The work has been developed in the scope of the research project EnPROVE, which started in the beginning of 2010. The approach is based on the monitoring of the building usage by a wireless sensor network to build adequate energy consumption models. These models are then used to predict the impact on energy consumption of the eventual installation of several energy efficient technologies. Finally, the decision-support model suggests the best investment alternative taking into consideration the investor’s criteria and possible restrictions. The project is currently on the phase of developing and specifying algorithms, as the ones presented in this paper. This work will be implemented as a serviceoriented software system, which will be tested within the two applications described. Compared to current available energy auditing services and prediction tools, it is foreseen that this approach will increase the cost-effectiveness of renovation investments by 15 to 30%.
862
A. Campos and R. Neves-Silva
Acknowledgments. Authors express their acknowledgement to the consortium of the project EnPROVE, Energy consumption prediction with building usage measurements for software-based decision support. EnPROVE is funded under the Seventh Research Framework Program of the European Union (contract FP7-248061).
References 1. BeAware (2009), http://energyawareness.eu 2. Beywatch (2009), http://www.beywatch.eu 3. Campos, A.R.: Intelligent Decision Support Systems for Collaboration in Industrial Plants. PhD Thesis (2010) 4. Campos, A.R., Marques, M., Neves-Silva, R.: A decision support system for energyefficiency investments on building renovations. In: Energycon 2010: IEEE Energy Conference and Exhibition, Bahrain (2010) 5. DEHEMS (2009), http://www.dehems.eu 6. European Commission. Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions - Addressing the challenge of energy efficiency through information and communication technologies (2008) 7. Lee, A.H.I., Chen, H.H., Kang, H.Y.: Multi-criteria decision making on strategic selection of wind farms. Renewable Energy 34, 120–126 (2009) 8. Longo, G., Padoano, E., Rosato, P., Strami, S.: Considerations on the Application of AHP/ANP Methodologies to Decisions Concerning a Railway Infrastructure. In: International Symposium on the Analytic Hierarchy Process (2009) 9. Parker, D., Cummings, J.: Pilot Evaluation of Energy Savings from Residential Energy Demand Feedback Devices. Florida Solar Energy Center, USA (2008) 10. Powell, K.: Energy Smart Buildings. In: Fourth Annual Green Intelligent Buildings Conference, Santa Clara, USA (2009) 11. Saaty, T.: Fundamentals of Decision Making and Priority Theory with the Analytic Hierarchy Process: Vol. VI of the AHP Series. RWS Publications, USA (1994) 12. Saaty, T.: Theory and Applications of the Analytic Network Process. RWS Publications, Pittsburgh (2005) 13. United Nations Framework Convention on Climate Change. Fact Sheet: The need for strong global action on climate change. United Nations Framework Convention on Climate Change (February 2011), http://unfccc.int/2860.php (accessed 2011) 14. Vieira, R.: The Energy Policy Pyramid - A Hierarchical Tool for Decision Makers. In: Fifteenth Symposium on Improving Building Systems in Hot and Humid Climates, Orlando, USA (2006) 15. Wedley, W.C., Choo, E.U., Schoner, B.: Magnitude adjustment for AHP benefit/cost ratios. European Journal of Operational Research 133, 342–351 (2001)
Developing Intelligent Agents with Distributed Computing Middleware Christos Sioutis and Derek Dominish
Abstract. Intelligent agents embody a software development paradigm that merges theories developed in Artificial Intelligence (AI) research with computer science. The power of agents comes from their intelligence and also from their communication. Current agent development methodologies and resulting frameworks have been developed from an AI perspective. From a developer’s point of view they introduce new programming concepts and provide a specialised execution environment. Considerable emphasis is placed on hiding away the underlying complexity of how agents actually operate. However, the fact is that agent systems are inherently distributed software systems, and this brings significant implications in their application and, more importantly, their integration. This has been largely underestimated by the agent community, resulting in increased development risk in large production systems. The Distributed Object Computing (DOC) development methodology, on the other hand, has been used to successfully build large-scale distributed software systems using standards-based middleware. In this context Objects encapsulate behaviour and are inherently integrated with any system utilising compatible middleware. This paper explores the possibility of leveraging the power of both approaches through a proposed Agent Architecture Framework (AAF) that implements generic agent behaviours and algorithms with DOC middleware using well understood software design patterns.
1 Introduction
The term “agent” is an overloaded term in the literature, the meaning of which depends on how the concept is applied in different application domains. In this paper an agent refers to software of considerable complexity that exhibits behaviour dictated by AI algorithms which are in turn directed by business logic. The process of design and implementation of agents is referred to as agent-oriented development. There are a number of commercial and open source software frameworks available which are specifically designed for agent-oriented development. Typical
Christos Sioutis · Derek Dominish, Air Operations Division, Defence Science and Technology Organisation, e-mail: {Christos.Sioutis,Derek.Dominish}@dsto.defence.gov.au
applications utilise multiple agents running autonomously with each agent executing multiple behaviours in parallel. There are also provisions for managing an agent‘s knowledge and inter-agent communication. However, interfacing with the environment is usually left entirely up to the developer; this is often described as a strength of the technology. Agents are able to work as a part of any system that can be accessed through its underlying software architecture. The above statement also alludes to a problem observed across all agent development frameworks regardless of their maturity. Specifically, there is little emphasis placed on the fact that agents are inherently complex, distributed software systems. Moreover, there is little support to help developers deal with the integration issues of such systems. This results in reluctance to use agents in large production systems and increased integration risk if chosen to do so. With computer networking now commonplace and the rise of multi-processor hardware architectures there has been a steady increase in interest of Distributed Object Computing (DOC). DOC developers build software components that are logically interconnected and are impervious to how communication with other components is achieved. For example, communication could be routed through system memory, via a local area network, or even the internet. This paper argues that agent-oriented development fits well within the DOC paradigm. Agents could be built using DOC techniques and operate through a middleware architecture. Furthermore, it is argued that agents are themselves built by extending the same middleware utilised by a target system thereby reducing their integration risk considerably.
2 Agent Technology
The general concept of an agent involves an entity that is situated within an environment. The environment generates sensations triggering reasoning in the agent, causing it to take actions. These actions in turn generate new sensations, hence forming a sense-reason-act loop. This is reinforced by Wooldridge’s widely referenced definition [1]: “An agent is a computer system that is situated in some environment, and that is capable of autonomous action in this environment in order to meet its design requirements”. Wooldridge continued to define a number of properties that an agent-based system should exhibit: Autonomy, operating without the direct intervention of humans; Social ability, interacting with other agents; Reactivity, perceiving and responding to changes in the environment; Proactiveness, initiating actions to achieve long term goals. Agent-oriented development is concerned with techniques for software development that are specifically suited for developing agent-based systems. These are useful because generic software engineering processes are unsuitable for agents; they fail to capture autonomous behaviour and complex interactions. A very good agent-oriented development methodology is described in [2].
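To make the sense-reason-act loop concrete, the following plain-Java sketch is added as an illustration; it is not taken from any agent framework, and the Percept and Action types are invented placeholders. It shows the loop structure together with the reactive and proactive elements described above.

import java.util.ArrayDeque;
import java.util.Queue;

public final class SimpleAgent {

    // Placeholder percept and action types; a real agent would model its domain here.
    record Percept(String description) {}
    record Action(String description) {}

    private final Queue<Percept> sensations = new ArrayDeque<>();
    private volatile boolean running = true;

    // The environment (or a sensor interface) pushes sensations to the agent.
    public void sense(Percept p) { sensations.add(p); }

    public void stop() { running = false; }

    // Sense-reason-act loop: the agent runs autonomously until stopped.
    // A real implementation would block on an event queue rather than spin.
    public void run() {
        while (running) {
            Percept p = sensations.poll();
            Action a = (p != null)
                    ? reactTo(p)        // reactivity: respond to a change in the environment
                    : pursueGoal();     // proactiveness: work towards long-term goals
            execute(a);
        }
    }

    private Action reactTo(Percept p) { return new Action("handle " + p.description()); }
    private Action pursueGoal()       { return new Action("advance current goal"); }
    private void execute(Action a)    { System.out.println(a.description()); }
}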
3 Distributed Object Computing
Modern computer software systems are inherently complex, with their many layered interactions and patterned idioms combining to form a structured cohesive whole. With the advances of software technologies over the past 20 years it has become increasingly necessary to utilise architectures based on frameworks that adopt a pattern orientated approach, and not just class libraries, to achieve this; frameworks that are purpose built and focused on a particular problem domain. A class library is a collection of re-usable functions and classes where functionality is invoked from application code. A framework provides not only cohesive reusable components but also re-usable behaviour [3]. It is through the adoption of well designed and easy to use architectures that are based on pattern orientated implementation frameworks that significant reductions in overall system complexity can be achieved. Application function and component interactions can also be defined through pattern orientated framework mechanisms. Furthermore, when a pattern orientated approach to both architecture and design is followed it allows for a higher order of re-use and enhanced application developer productivity and reliability. It can serve to facilitate and ‘guide’ developers into adopting a solid and reliable patterned approach to the software artefacts of systems development. Middleware is software that connects application components as shown in Fig 1a. It is in essence those mechanistic elements of software that are “in the middle” between application components. Through these elements components can be concurrently hosted on different operating systems, platforms and environments; middleware commonly consists of a set of services that allows multiple processes running on one or more machines to interact. Middleware technologies have evolved significantly over the past 10 years in support of the move to coherent distributed architectures and standardisation. One example is the Common Object Request Broker Architecture (CORBA) [4] standard published by the Object Management Group (OMG). Coupled with this standardisation is the capability to simplify the complex problem of integrating disparate distributed application components and systems within a heterogeneous system-of-systems context. In many cases the middleware itself is comprised of different layers. Each layer is responsible for providing a new level of functionality. Typically, lower layers provide generic services and capabilities which are utilised by higher layers that are increasingly more domain or application specific. Applications can be developed to integrate with a particular layer that they support and interoperate with other applications operating on other layers. As a result, most of the work in such systems happens in the underlying layers and the applications sitting on top contain only business logic, commonly called services. This is the essence of the Service Oriented Architecture (SOA) approach. Fig 1b illustrates how SOA differs from DOC systems. The dotted horizontal lines in the middleware indicate logical connections that can be routed in different ways. The difference in sizes of the services alludes to the fact that ones built on higher layers need less development because they leverage the functionality of the layers underneath them.
a) Distributed Computing Architecture
b) Service Oriented Architecture
Fig. 1 Connecting software components with middleware
Data distribution is an example of a domain specific layer. It involves the asynchronous dissemination of data between components. The key challenges addressed by data distribution technology are: Real-time, meaning that the right information is delivered at the right place at the right time, all the time; Dependable, thus ensuring availability, reliability, safety and integrity in spite of hardware and software failures; High-Performance, hence able to distribute very high volumes of data with very low latencies. The Data Distribution Service for Real-Time Systems (DDS) [5] is the formalisation through standardisation of traditional Publish/Subscribe (Pub/Sub) capabilities common to many application environments. These Pub/Sub capabilities are expressed through the service as an abstraction for one-to-many communication that provides anonymous, decoupled, and asynchronous communication between a publisher and its subscribers. Different implementations of the Pub/Sub abstraction standard have emerged for supporting the needs of different application domains.
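The decoupled one-to-many pattern described above can be illustrated with a few lines of plain Java. The sketch below is added for illustration and deliberately avoids the real DDS API: it is only a minimal typed topic with anonymous subscribers and asynchronous delivery.

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Minimal publish/subscribe topic: publishers and subscribers never reference each other
// directly (anonymous, decoupled), and delivery happens on a separate thread (asynchronous).
public final class Topic<T> {

    private final List<Consumer<T>> subscribers = new CopyOnWriteArrayList<>();
    private final ExecutorService delivery = Executors.newSingleThreadExecutor();

    public void subscribe(Consumer<T> subscriber) {
        subscribers.add(subscriber);
    }

    public void publish(T sample) {
        // One-to-many: every registered subscriber receives the sample.
        for (Consumer<T> s : subscribers) {
            delivery.submit(() -> s.accept(sample));
        }
    }

    public void shutdown() { delivery.shutdown(); }

    public static void main(String[] args) {
        record Temperature(String sensorId, double celsius) {}

        Topic<Temperature> topic = new Topic<>();
        topic.subscribe(t -> System.out.println("logger:  " + t));
        topic.subscribe(t -> System.out.println("display: " + t.celsius() + " C"));

        topic.publish(new Temperature("room-12", 21.5));
        topic.shutdown();
    }
}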
4 Related Research The DOC methodology provides standardised ways for different applications to communicate and also provides some services (eg. Naming, Trading) which can be very useful when utilised in conjunction with agents. For example, in [6] middleware is utilised for inter agent communication as well as access to additional databases. CORBA-based agents operate the same as traditional CORBA components. When plugged into a large DOC system they instantly have access to other agents, information and services. Legacy systems can also be wrapped with CORBA interfaces and become available to the agents. Extending the above concept, the CORBA Component Model (CCM) is a component modelling architecture built on top of CORBA middleware. Researchers
have already identified the CCM as a possible approach for merging agents with DOC concepts. Melo [7] describes a CCM-based agent as a software component that exposes a defined interface; has receptacles for external plan components; and utilises sinks/sources for asynchronous messaging. Similarly, Otte [8] describes an agent architecture called MACRO built on CCM that introduces additional algorithms for agent planning and tasking in the application of sensor networks. The main advantage of this application is noted as the abstraction of the details of the underlying communication and system configuration from the agents. Specific limitations identified in this application include overheads imposed by the middleware (due to the limited processing capacity in the embedded systems employed for the sensor network) and having limited control data routed around the network [8]. A number of very capable agent frameworks have been developed to aid in implementing agent based systems, for examples see [9-12]. These frameworks primarily provide new constructs and algorithms that aid a developer in designing an agent based system. Each framework however provides its own version of what an agent is and how behaviours are implemented, brings its own advantages and disadvantages and is suited to different applications. This means that developers must choose and learn to use the appropriate framework for their application. After exploring a number of frameworks it quickly becomes evident that there are conceptual similarities but they are implemented differently. This hints at the existence of patterns that describe the mechanisms of agent behaviour in an implementation independent way. An understanding of these patterns allows switching between frameworks and knowing the expected design elements; albeit implemented differently. Researchers have already discovered the possible advantages of using patterns for designing agents. Weiss [13] describes a hierarchical set of patterns for agents. For example, the “Agent as a Delegate” pattern begins to describe how a user delegates tasks to an agent. An attempt is also made to classify agent patterns based on their intended purpose in [14] and a twodimensional classification scheme is proposed with the intent that it is problem driven and logically categorises agent-oriented problems. Although not specifically mentioning patterns, related work by [15] has attempted to describe agent behaviour as a model. Key behavioural elements utilised by the popular Belief Desire Intention (BDI) reasoning model are defined as: Goals, a desired state of the world as understood by the agent; Events, notifications of a certain state of the internal or external environment of the agent; Triggers, an event or goal which invokes a plan; Plans, respond to predefined triggers by sequentially executing a set of steps; Steps, primitive actions or operations that the agent performs; Beliefs, storing the agents view of its internal state and external environment, these can trigger goal completion/failure events as well as influence plan selection through context filtering; Agent, a collection of the above elements designed to exhibit a required behaviour [15]. This work could be further extended by understanding how the BDI relates to well known cognitive reasoning models like Boyd’s Observe Orient Decide Act (OODA) loop and Rasmussen’s decision ladder and employing additional constructs [16].
Fig. 2 Agent Architecture Framework (AAF): a) AAF building blocks; b) AAF layer in a SOA system
5 Agent Architecture Framework It can be concluded that patterns have so far been applied at a very high level in the agent arena. Examples like the agent-as-a-delegate pattern provide a use case rather than a generic solution for implementing an agent. The proposed Agent Architecture Framework (AAF) will implement generic agent behaviours and algorithms through well-understood software design patterns. The aim of the framework is threefold, as shown in Fig 2a. First, it will link to specific base support libraries used to implement the required algorithms. Second, it will merge the library APIs with the workflow and conventions of a larger SOA architecture. Third, it will be built using templates in order to capture the algorithmic logic but allow it to work against any given type. The intent is for the AAF to be used to implement agents utilising the same middleware architecture as larger SOA systems. This way agents and services will be able to exchange data and operate seamlessly; this concept is illustrated in Fig 2b.
6 Concept Demonstrator In order to demonstrate the viability of this approach, a concept demonstrator system has been developed. A simple agent program was written using the JACK Intelligent Agents system and then translated to equivalent constructs using middleware. The agent prints out (draws) shapes in the console based on given commands. Only two shapes have been implemented: squares and triangles. When executed, the program constructs an agent and signals it to print out a small number of shapes of different types and sizes.
 1  agent DrawingAgent extends Agent {
 2      #posts event DRAWEvent ev;
 3      #handles event DRAWEvent;
 4      #uses plan SquarePlan;
 5      #uses plan TrianglePlan;
 6      public DrawingAgent(String name){
 7          super(name);
 8      }
 9      public void drawShape(String type, int size){
10          postEvent(ev.drawShape(type,size));
11      }
12  }
Fig. 3 Code Listing of DrawShape agent using JACK
A source code extract for the JACK agent is shown in Fig 3. The agent is declared by extending the JACK Agent class and calling it DrawingAgent. In lines 2 and 3 it declares that it posts and handles an event of type DRAWEvent. This event is defined in a separate source file and contains two internal variables: a string is used to indicate the type of shape and an integer is used to indicate the size of the shape. In lines 4 and 5 it declares that it uses the plans SquarePlan and TrianglePlan. These are also defined in separate source files. They similarly declare that they handle DRAWEvent and contain an implementation that the agent uses to do the drawing. When the event is posted it is concurrently handled by both plans, and a special JACK relevance construct is used to limit its execution. As a result, the SquarePlan draws squares and the TrianglePlan draws triangles. In line 6 the agent is constructed using a name. The name is not important in this application but is useful for agent discovery in a multi-agent system. In line 9 a method is defined that is called by the main program to post the draw events to the agent. Considering the above, it is easy to deduce that at its very core JACK (similarly to most agent frameworks) exhibits a Reactor pattern [17]. That is, JACK agents respond to events that are triggered by external or internal stimuli and spawn a thread of control to handle these events. Specifically, the agent itself is acting as a reactor of specific events or signals. The JACK kernel is acting as the total system event de-multiplexer and the plans are acting as event handlers. The JACK framework complements the reactor pattern with additional agent constructs and logic. DOC middleware is also very much reactive in nature. When components are "activated" they open themselves to the world via their Object Request Broker (ORB). This means that they expose a defined interface that many external components are able to signal for a service request at any time. A concept demonstrator of the AAF has been implemented mirroring the agent constructs seen in the simple JACK DrawShape agent, using a combination of CORBA and DDS middleware to provide the reactor functionality. Only a minimal set of features has been implemented at this stage, with the intention to grow the AAF to encapsulate all
constructs and behaviour patterns encountered in JACK as well as in other agent frameworks. The agent initialises a domain participant, binds a topic for the draw event and initialises an associated publisher. Similarly, each plan initialises a subscriber for the specific event. Therefore, posting an event means publishing a topic to the domain and handling an event means subscribing to its topic. DDS allows one-to-many connectivity in arbitrary configurations of subscribers and publishers. This means that when an event is published, all plans that have subscribed to it are activated with an independent thread of control.
 1  class DrawingAgent : public virtual Agent {
 2      DRAW::Event ev;
 3      SquarePlan splan;
 4      TrianglePlan tplan;
 5  public:
 6      DrawingAgent(DRAW::Portal &portal, const string &name)
 7          : Agent(name), ev(portal), splan(ev), tplan(ev) {}
 8      void drawShape(const string &shape, const int &size)
 9      {
10          DRAW::Data data(shape, size);
11          ev.post(data);
12      }
13  };
Fig. 4 Code Listing of DrawShape agent using preliminary AAF
The source code for the DrawShape agent using the AAF is shown in Fig 4; when executed it behaves identically to the JACK agent. The agent is defined similarly by extending an Agent class, which internally initialises a DDS DomainParticipant. The DRAW::Event declared in line 2 is generated by a combination of C++ macros and templates and binds the DDS data and topic mechanism with the concept of an event. The content of the data itself is previously defined in another file using the Interface Definition Language (IDL) syntax (part of the CORBA standard) and compiled to generate the appropriate programming structures. The SquarePlan and TrianglePlan are implemented using DDS DataReader objects. Elements have to be explicitly initialised within the DrawingAgent constructor, as shown in line 7. There is a two-step process in posting an event: the event data is first initialised separately in line 10 and subsequently posted in line 11. There is a subtle but important difference here between the JACK and AAF code. In the JACK code the postEvent method is part of the agent's scope; this can be interpreted as "the agent posts an event". In the AAF code, on the other hand, the post method is in the event's scope and this can be interpreted as "the agent asks the event to post itself". This implies that the event can potentially encapsulate behaviour and post itself differently depending on the content and temporal properties of the data passed to it.
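The practical consequence of this design choice can be sketched as follows. The snippet below is a hypothetical Python illustration (not AAF or JACK code): because post lives in the event's scope, the event can filter on the content and timing of the data before it is published.

import time

class SelfPostingEvent:
    """Illustrative event that encapsulates its own posting behaviour."""
    def __init__(self, publish, min_interval=0.0):
        self.publish = publish            # underlying middleware publish call
        self.min_interval = min_interval  # temporal property: a simple rate limit
        self._last_post = 0.0

    def post(self, data):
        # The event, not the agent, decides how to post itself: it can filter
        # on the content or on the timing of the data before publishing.
        now = time.monotonic()
        if data.get("size", 0) <= 0:
            return                        # content-based filtering
        if now - self._last_post < self.min_interval:
            return                        # temporal filtering
        self._last_post = now
        self.publish(data)

ev = SelfPostingEvent(publish=lambda d: print("published", d), min_interval=0.1)
ev.post({"shape": "triangle", "size": 2})
ev.post({"shape": "triangle", "size": 0})   # suppressed by the event itself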
7 Conclusions Agent development frameworks are very good at guiding developers in specifically creating agents. They introduce a number of behavioural concepts (e.g. goals, beliefs) and provide support in the event processing, resource management and structure of their implementation. Great emphasis is placed on hiding the complexity of the underlying AI algorithms upon which the agents operate. Their integration within larger systems, however, remains a challenge and is risk-prone because they provide little support to help developers deal with the complexities of integration. In the context of DOC and specifically SOA, agents can be viewed as autonomous services with specialised algorithms utilised for intelligent behaviour. The DOC middleware can provide the infrastructure upon which agents communicate with one another, as well as sense and act upon the environment. When developing agents it is possible to recognise and decompose the patterns of behaviour that agent frameworks implement. These behaviours can then be described using a combination of software design patterns. This allows the implementation of a generic architecture framework built on DOC middleware that implements the agent patterns in a generic way. A developer can subsequently utilise the same middleware employed in their SOA systems and at the same time introduce agents with very little integration risk.
References 1. Wooldridge, M.: Reasoning About Rational Agents. The MIT Press, Massachusetts (2000) 2. Padgham, L., Winikoff, M.: Developing Intelligent Agent Systems: A Practical Guide. John Wiley and Sons Ltd., West Sussex (2004) 3. Johnson, R.E., Foote, B.: Designing Reusable Classes. Journal of Object-Oriented Programming 1(2), 22–35 (1988) 4. Object Management Group (2008) Common Object Request Broker Architecture (CORBA) Specification 5. Object Management Group (2007) Data Distribution Service for Real-time Systems (DDS) Specification 6. Cheng, T., Guan, Z., Liu, L., Wu, B., Yang, S.: A CORBA-Based Multi-Agent System Integration Framework. In: IEEE International Conference on Engineering of Complex Computer Systems, pp. 191–198 (2004) 7. Melo, F., Choren, R., Cerqueira, R., Lucena, C., Blois, M.: Deploying Agents with the CORBA Component Model. In: Emmerich, W., Wolf, A.L. (eds.) CD 2004. LNCS, vol. 3083, pp. 234–247. Springer, Heidelberg (2004) 8. Otte, W.R., Kinnebrew, J.S., Schmidt, D.C., Biswas, G.: A flexible infrastructure for distributed deployment in adaptive sensor webs. In: Aerospace Conference, March 7-14, pp. 1–12. IEEE, Los Alamitos (2009) 9. Agent Oriented Software (2011) JACK Intelligent Agents User Manual 10. Bellifemine, F., Caire, G., Trucco, T., Rimassa, G.: JADE programmer’s guide. CSELT, TILab and Telecom Italia (2010)
11. Laird, J.E., Congdon, C.B.: The SOAR User’s Manual. University of Michigan (2009) 12. Macal, C.M., North, M.J.: Agent-based modelling and simulation. In: Rossetti, M.D., Hill, R.R., Johansson, B., Dunkin, A., Ingalls, R.G. (eds.) Proceedings of the 2009 Winter Simulation Conference (2009) 13. Weiss, M.: Patterns for Motivating an Agent-Based Approach. In: Jeusfeld, M.A., Pastor, Ó. (eds.) ER Workshops 2003. LNCS, vol. 2814, pp. 229–240. Springer, Heidelberg (2003) 14. Oluyomi, A., Karunasekera, S., Sterling, L.: Description templates for agent-oriented patterns. Journal of Systems and Software 81(1), 20–36 (2008) 15. Jayatilleke, G.B., Padgham, L., Winikoff, M.: A model driven component based development framework for agents. International Journal of Computer Systems Science and Engineering 20(4), 273–282 (2005) 16. Sioutis, C.: Reasoning and learning for intelligent agents. University of South Australia (2006) 17. Schmidt, D.C., Stal, M., Rohnert, H., Buschmann, F.: Pattern-Oriented Software Architecture: Patterns for Concurrent and Networked Objects. Wiley & Sons, West Sussex (2000)
Diagnosis Support on Cardio-Vascular Signal Monitoring by Using Cluster Computing Ahmed M. Elmisery, Martín Serrano, and Dmitri Botvich
Ahmed M. Elmisery · Martín Serrano · Dmitri Botvich: Telecommunications Software & Systems Group, Waterford Institute of Technology, Waterford, Ireland
Abstract. Support for remote data processing and analysis is a necessary requirement in future healthcare systems. Likewise, interconnecting/managing medical devices and the distributed processing of the data collected through these devices are crucial processes for supporting personalised healthcare systems. This work introduces our research efforts to build a monitoring application hosted on a cluster computing environment supporting personalised healthcare systems (pHealth). The application is based on a novel distributed clustering algorithm that is used for medical diagnosis of cardio-vascular signals. The algorithm collects different statistics from the cardiac signals and uses these statistics to build a distributed clustering model automatically. The resulting model can be used for diagnosis purposes of cardiac signals. A cardio-vascular monitoring scenario in a cluster computing environment is presented and experimental results are described to demonstrate the accuracy of cardio-vascular signal diagnosis. The advantages of using data analysis techniques and cluster computing in medical diagnosis are also discussed in this work. Keywords: Personalised Health Systems, ICT enabled Personal Health, Health Monitoring, Pervasive Computing on eHealth.
1 Introduction
Trends in the next generation of healthcare systems demand applications that can allow prevention of diseases even before they are apparent by using assisted sensors and networks (Yanmin et al. 2009; Lupu et al. 2008). Personalised healthcare systems (pHealth) (Gatzoulis, Iakovidis 2008) are one application that can achieve this objective by providing personalised healthcare services. With personalised healthcare services people can receive more accurate diagnostics and early medical assistance. Designing these systems according to individual requirements and based on health data collected from wearable sensors is a challenging task. These systems demand local processing of the health data and capabilities for
distributed data analysis (Herzog et al. 2006), as well as a network infrastructure with high performance (ICTs) to be able to react in real-time to variations in the data. Cluster computing and other distributed computing environments have demonstrated their advantages in pHealth systems by offering scalability and availability, as well as the ability to process massive amounts of data (Neves et al. 2008). However, privacy of health data is a main requirement that must be taken into consideration when developing pHealth systems in these environments. Modern medicine can benefit from pHealth systems by building users' health profiles that can offer personalised support, early assistance, accurate diagnostics and quick response when symptomatic diseases are detected during the local and remote analysis of these profiles. Also, pHealth systems provide procedures to support monitoring the progress of diseases as well as their therapeutic intervention. A key goal in pHealth systems is the ability to perform analysis on either data taken during normal activities or data based on regular medical checks. As a consequence, people's activity/freedom is not affected and accurate results can be attained. Modern pHealth systems allow people to continue their activities and envisage a real-time and interactive environment for patient-doctor information exchange. A clear advantage when using these systems is to offer accurate diagnostics for remote healthcare subscribers. We concentrated on distributed clustering as an analysis tool to support healthcare services. This work presents our efforts to build a framework for personalised healthcare applications management. The main objective of this research is to introduce an application of the distributed learning clustering (DLC) algorithm (Elmisery, Huaiguo 2010) in the diagnosis of cardiovascular signals. The rest of the work is organized as follows: Section 2 discusses cluster computing as a processing environment to support personalised healthcare applications. Section 3 describes research results as part of an integral cardio-vascular monitoring system in the framework for personalised healthcare applications management introduced in this work. Finally, Section 4 summarizes the research advances and concludes this work.
2 Cluster Computing Environment Supporting Personalised Healthcare Applications
This research introduces a framework for personalised healthcare applications management that can manage different healthcare applications running in the same computing environment. This framework is hosted in a cluster computing environment to support massive health data analysis, distributed data storage and health communication networks; see Figure (1). Cluster computing plays an important role as a processing environment for health data, as it empowers the execution of different health applications and the exchange of data between them. The end-user (i.e. patient or healthy person) has the main role of supplying the applications' databases with his/her health data. This allows these applications to build accurate models for diagnosis and monitoring of health status. Also, the end-user has an important role in the evaluation and enhancement of these applications.
The development of user-centred systems is crucial and highlights the end-user role in healthcare research and technological development practices. Personalised healthcare applications require an active role for the end-user: he/she submits health data to the health applications and is then expected to correctly understand the medical information provided by the health application. This feature acts as a playground for the healthcare applications to develop new applications and services.
Fig. 1 Cluster computing to support personalised healthcare applications
We assume that patients are keen to build local knowledge in order to deal with the alternative solutions to their health problems. The information obtained by end-users can help to enrich health knowledge and research activities. For example, if a drug is consumed over a mid-to-long-term period of time, it is difficult and expensive to track its side-effects in order to improve or change that drug; but if patients play the role of self-monitoring assisted with ICTs, they can provide valuable data to assist medical professionals in this task.
3 Personalized Medical Support for Cardio-Vascular Monitoring
This section describes a related interdisciplinary application for cardio-vascular monitoring in the framework of personalised healthcare systems. In this application, we employ data clustering techniques to group different cardiovascular signals in order to assign a patient to a physiological condition using no prior knowledge about disease states. We used a new clustering algorithm called the distributed learning clustering (DLC) algorithm. DLC is based on the idea of stage clustering and offers several advantages over current clustering algorithms, as follows:
• The algorithm produces clusters with acceptable accuracy; these clusters have different shapes, sizes and densities.
• The algorithm was designed with the goal of enabling a privacy-preserving version of the data.
• The algorithm helps the user to select proper values for its parameters, and to tune those parameters for better results.
• The algorithm presents different statistics for clustering validity in each stage, and uses these statistics to enhance the resulting clusters automatically.
• The algorithm is applicable to networked environments (P2P, cluster computing or grid systems).
Figure (2) depicts the different processes inside our proposed personalized medical application that is used for supporting the diagnosis of cardiovascular signals. In order to enhance the model building process in that application, we propose an adaptive strategy that utilizes both patient cardiovascular signals and established ECG medical databases, which is more suitable for remote diagnosis. The process is described as follows:
Fig. 2 Personalized Medical Application for Early Cardio-vascular Diagnostics
1. Use the MIT-BIH Arrhythmia database (Moody, Mark 1990) to build an initial clustering model.
2. Test the model on the patient.
3. Collect the new ECG data from this patient.
4. Store the records whose error values exceed a predefined threshold in a different database.
5. Send these data to cardiologists for detailed analysis. This process is done offline.
6. Collect the cardiologists' annotations and use these data in the model tuning process.
ECG recordings carry significant information about the overall behaviour of the cardiovascular system and the physiological patient conditions. The ECG signal is pre-processed to remove noise and abnormal features, extract features and select
certain features that will have a high influence on our DLC clustering algorithm. The relevant information is encoded in the form of a feature vector that is used as input for the DLC algorithm. The key goal for the DLC algorithm is to be able to find patterns in the ECG signals that effectively discriminate between the different conditions or categories under investigation.
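The adaptive strategy listed above can be sketched, at a very high level, as the following loop. The helper names (build_clusters, cluster_error, request_annotation, tune_model) are placeholders standing in for the DLC model construction, the per-record error measure, the offline cardiologist review and the model tuning step; the paper does not prescribe their interfaces.

def adaptive_diagnosis_loop(reference_beats, patient_stream, error_threshold,
                            build_clusters, cluster_error,
                            request_annotation, tune_model):
    """Sketch of the adaptive model-building process described in steps 1-6."""
    # Step 1: initial clustering model from the established ECG database.
    model = build_clusters(reference_beats)
    review_queue = []

    # Steps 2-4: test the model on the patient and keep poorly fitting records.
    for beat in patient_stream:
        if cluster_error(model, beat) > error_threshold:
            review_queue.append(beat)

    # Steps 5-6: send the difficult records for offline annotation, then tune.
    annotations = [request_annotation(beat) for beat in review_queue]
    if annotations:
        model = tune_model(model, review_queue, annotations)
    return model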
3.1 ECG Signal Analysis
This section introduces the formalism used for data analysis (Clifford et al. 2006). First, each signal is pre-processed by a normalization process, which is necessary to standardize all the features to the same level. After that, we adjust the baseline of the ECG signal to the zero line by subtracting the median of the ECG signal (Yoon et al. 2008). ECG signals can be contaminated with several types of noise, so we need to filter the signal to remove the unwanted noise. ECG signals can be filtered using a low-pass filter, a high-pass filter and a notch filter (Chavan et al. 2008). As shown in figure (3), the ECG signal consists of the P-wave, PR-interval, PR-segment, QRS complex, ST-segment, and T-wave. The QRS complex is a very important signal that is useful in the diagnosis of arrhythmia diseases. In general, a normal ECG rhythm means that there is a regular rhythm and waveform. Correct detection of QRS complexes forms the basis for most of the algorithms used in automated processing and analysis of ECG (Kors, Herpen 2001).
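As an illustration only, a typical pre-processing chain of this kind could be coded as below. The 0.5–40 Hz band-pass and 50 Hz notch cut-offs are common example values and are not specified by this paper.

import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

def preprocess_ecg(signal, fs):
    """Normalise, remove the baseline and filter an ECG signal (illustrative)."""
    x = np.asarray(signal, dtype=float)
    # Normalization: standardize the signal to zero mean and unit variance.
    x = (x - x.mean()) / (x.std() + 1e-12)
    # Baseline adjustment: subtract the median so the baseline sits on the zero line.
    x = x - np.median(x)
    # Band-pass filter (low-pass + high-pass combined) to suppress drift and noise.
    b, a = butter(4, [0.5, 40.0], btype="band", fs=fs)
    x = filtfilt(b, a, x)
    # Notch filter to remove mains interference.
    b, a = iirnotch(w0=50.0, Q=30.0, fs=fs)
    x = filtfilt(b, a, x)
    return x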
Fig. 3 ECG Signal Analysis Process Using QRS Metrics (Atkielski 2006)
However, the ECG rhythm in a patient with arrhythmia will not be regular in certain QRS complexes (Dean 2006). Our QRS detection algorithm must be able to detect a large number of different QRS morphologies in order to be clinically useful, and must be able to follow sudden or gradual changes of the prevailing QRS morphology. It should also help to avoid errors related to false positives due either to artifacts or to high-amplitude T waves. On the other hand, false negatives may occur due to low-amplitude R waves.
3.2 Clustering Analysis for ECG Signal
Clustering analysis aims to group a collection of signals or cases into meaningful clusters without the need for prior information about the classification of patterns. There is no general agreement about the best clustering algorithm (Xu, Wunsch 2005); different algorithms reveal certain aspects of the data based on the objective function used. The clustering algorithm learns by discovering relevant similarity relationships between patterns. The result of applying such algorithms is groups of signals that evince recurrent QRS complexes and/or novel ST segments, where each group can be linked to a significant disease or risk. Detecting relevant relationships between signals has been addressed in the literature using different clustering algorithms. For example, the work in (Iverson et al. 2005) applied the point-wise correlation dimension to the analysis of ECG signals from patients suffering from depression. The results obtained in this study indicate that clustering analysis is able to discriminate clinically meaningful clusters with and without depression based on ECG information. The authors in (Dickhaus et al. 2001; Bakardjian 1992) cluster collected ECG data into clinically relevant groups without any prior knowledge. This emphasises the advantage of clustering in different classification problems, especially in exploratory data analysis or when the distribution of the data is unknown. For detecting the R-peaks in the ECG signal y_k, we use an algorithm proposed in (Sahambi et al. 1997). It starts searching for local modulus maxima at a large scale and then at fine ones. This procedure reduces the effect of high-frequency noise; it also uses an adaptive time-amplitude threshold and refractory period information, and rejects isolated and redundant maximum lines (artifacts, high-amplitude T waves or low-amplitude R waves). Detection of an R-peak starts with calculating the zero crossing of the wavelet between a positive maximum and a negative minimum, which is marked as the R-peak position m(e). Once the R-peaks are found, the RR-interval between each two consecutive heartbeats is computed by:

RR(e) = m(e+1) - m(e)   (1)

where e refers to the heartbeat sequence index. For heartbeat segmentation purposes, the starting and ending points are obtained as follows:

y_k = y\,[\,m(e) - 0.25\,RR(e) \; : \; m(e) + 0.75\,RR(e)\,]   (2)
The length of this interval is different for each heartbeat; figure (4) illustrates the detection of the RR-interval. The length variability is removed by means of trace segmentation. Following that, feature extraction is performed using WT decomposition. The heartbeats are represented as arrays of time-varying duration. In order to compare the heartbeat morphologies it is necessary to use a proper dissimilarity measure for the DLC algorithm. In this work, we used the dynamic time warping (DTW) of (Cuesta-Frau et al. 2007) to find an optimal alignment function between two sequences of different length. A heartbeat is kept if its dissimilarity measure with the other elements in the resulting set is higher than a specific threshold. The DLC clustering can be expressed as follows:
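A plain sketch of the segmentation and dissimilarity computation is given below. It assumes the R-peak positions have already been found by the wavelet-based detector, uses a textbook DTW rather than the exact variant of (Cuesta-Frau et al. 2007), and reads the selection rule as "keep a beat only if it is sufficiently dissimilar from the beats already kept".

import numpy as np

def segment_heartbeats(signal, r_peaks):
    """Cut each heartbeat from 0.25*RR before to 0.75*RR after its R-peak (eqs. 1-2)."""
    beats = []
    for e in range(len(r_peaks) - 1):
        rr = r_peaks[e + 1] - r_peaks[e]                  # RR(e) = m(e+1) - m(e)
        start = int(r_peaks[e] - 0.25 * rr)
        end = int(r_peaks[e] + 0.75 * rr)
        if start >= 0 and end <= len(signal):
            beats.append(np.asarray(signal[start:end], dtype=float))
    return beats

def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of different length."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def select_representatives(beats, threshold):
    """Keep a beat only if it is dissimilar to all previously kept beats."""
    kept = []
    for beat in beats:
        if all(dtw_distance(beat, rep) > threshold for rep in kept):
            kept.append(beat)
    return kept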
Fig. 4 Illustration of the detection of the RR-interval in an ECG signal
Consider the full set of heartbeats. The goal of the local learning and analysis (LLA) step is to find a reduced set of representative beats: every dissimilar heartbeat is retained in the reduced set and similar ones are omitted. Then, in the distributed clustering (DC) step, the reduced set is partitioned into a set of clusters, where each cluster contains proportionate heartbeats. Table 2 shows the resulting heartbeats after the execution of the LLA step.

Table 1 Set of heartbeats used in the experiment
Label:      Normal   Lbbb   Rbbb   PVC    Ap     P      Total
No. beats:  9870     7361   6143   8450   2431   7340   41595

Table 2 Resulting heartbeats after pre-processing and LLA
Label:      Normal   Lbbb   Rbbb   PVC    Ap     P      Total
No. beats:  1730     1320   861    1763   935    843    7452

Table 3 Abbreviations used
Label    Meaning
Normal   Normal beat
Lbbb     Left bundle branch block beat
Rbbb     Right bundle branch block beat
PVC      Premature ventricular contraction
Ap       Atrial premature beat
P        Paced rhythm
Our first experiment was done on DLC to measure its accuracy in determining different heartbeat clusters. Figure (5) shows the relation between the merge error in the DC stage and the number of clusters. As shown in figure (5), the merge error (LET) decreases, which indicates that only equivalent heartbeat clusters are being merged. In order to evaluate the performance of our algorithm, we used two error metrics defined in (Cuesta-Frau et al. 2003). The first metric is the clustering error (CR), which is the percentage of heartbeats in a cluster that do not correspond to the class of that cluster. The second metric is the critical error (CIE), which is the number of heartbeats in a class that do not have a cluster and are therefore included in other classes' clusters.
Fig. 5 Relation between Different Clusters and Merge Error
Fig. 6 (a) The Values of CR for Different No. of Clusters. (b) The Values of CIE for Different No. of Clusters
In the second experiment, we want to measure the relation between different numbers of clusters and the values of the clustering error (CR) and the critical error (CIE). Based on figure 6(a) and (b), we can deduce that both CR and CIE for the DLC algorithm decrease with the increase in the number of clusters until the correct number of clusters is reached. In the third experiment, we compare the results of DLC with other clustering algorithms; here we select BIRCH and k-means, and we tune the parameters in each algorithm to get the same number of clusters. Figure 6(a) and (b) contain both CR and CIE values for each algorithm for different numbers of clusters. The results show the accuracy achieved using DLC compared to the other algorithms.
3.3 Privacy in Clustering Cardiovascular Data
Privacy-aware users consider ECG signals sensitive information, as these signals allow the health application providers to infer different mental conditions of the patients (depressed, afraid, walking or running, etc.). As a consequence, they require certain levels of privacy and anonymity in the handling of their signals. Our aim is to permit clustering of ECG signals without learning any private information about the patient. In reality, these signals do not need to be fully disclosed to the healthcare provider in order to build an accurate model. We pre-process the wavelet coefficients using the LLA step to build up sets of initial clusters, where the end-user patterns are compared with each other locally; we then take the representatives of each initial cluster as input to the distributed clustering (DC) step. These representatives are used as pattern references to associate cluster patterns with the same diseases. LLA also uses the wavelet transformation to preserve privacy for the ECG signals by decomposing the wavelet coefficients. These two steps affect both the accuracy of the results and the privacy level attained.
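The general idea of sharing decomposed wavelet information instead of the raw signal can be illustrated by the toy example below, which keeps only the coarse Haar approximation coefficients of a heartbeat. This is a generic illustration of coefficient decomposition, not the actual privacy mechanism of the LLA step.

import numpy as np

def haar_approximation(beat, levels=3):
    """Return only the coarse Haar approximation coefficients of a heartbeat."""
    x = np.asarray(beat, dtype=float)
    for _ in range(levels):
        if len(x) % 2:                                 # pad to even length
            x = np.append(x, x[-1])
        x = (x[0::2] + x[1::2]) / np.sqrt(2.0)         # low-pass (approximation) only
        # detail coefficients are computed implicitly but never shared
    return x

# Only the reduced, coarse representation leaves the end-user's device.
beat = np.sin(np.linspace(0.0, np.pi, 64))
shared = haar_approximation(beat)
print(len(beat), "->", len(shared))                    # 64 -> 8 coefficients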
4 Conclusions
This work has introduced our vision for personalised health systems, based on monitoring ECG signals as an application example. Research efforts have been conducted to promote a clustering algorithm as an alternative solution for finding data similarities between cardio-vascular patterns and clusters previously diagnosed/detected. We have introduced a novel solution using the DLC algorithm to cluster morphologically similar ECG signals while enforcing privacy when matching these patterns. Experiments were performed on a set of ECG recordings from the MIT database. DLC yielded 99.9% clustering accuracy considering pathological versus normal heartbeats. Both the clustering error and the critical error percentage were 1%. We will continue investigating computing techniques to map cardiac patterns for different heart diseases and produce reactive solutions in the communications systems.
References Atkielski, A.: Electrocardiography. In: Wikipedia (2006) Bakardjian, H.: Ventricular beat classifier using fractal number clustering. Medical and Biological Engineering and Computing 30(5), 495–502 (1992), doi:10.1007/bf02457828
Chavan, M.S., Agarwala, R.A., Uplane, M.D.: Interference reduction in ECG using digital FIR filters based on rectangular window. WSEAS Trans. Sig. Proc. 4(5), 340–349 (2008) Clifford, G.D., Azuaje, F., McSharry, P.: Advanced Methods And Tools for ECG Data Analysis (2006) Cuesta-Frau, D., Biagetti, M., Quinteiro, R., Micó-Tormos, P., Aboy, M.: Unsupervised classification of ventricular extrasystoles using bounded clustering algorithms and morphology matching. Medical and Biological Engineering and Computing 45(3), 229–239 (2007) Cuesta-Frau, D., Pérez-Cortés, J.C., Andreu-Garcıa, G.: Clustering of electrocardiograph signals in computer-aided Holter analysis. Computer Methods and Programs in Biomedicine 72, 179–196 (2003), doi:10.1016/s0169-2607(02)00145-1 Dean, G.: How Web 2.0 is changing medicine, vol. 333. vol. 7582. British Medical Association, London, ROYAUME-UNI (2006) Dickhaus, H., Maier, C., Bauch, M.: Heart rate variability analysis for patients with obstructive sleep apnea. In: Proceedings of the 23rd Annual International Conference of the Engineering in Medicine and Biology Society, vol. 501, pp. 507–510 (2001) Elmisery, A.M., Huaiguo, F.: Privacy Preserving Distributed Learning Clustering of HealthCare Data Using Cryptography Protocols. In: 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops (COMPSACW), July 19-23, pp. 140–145 (2010) Gatzoulis, L., Iakovidis, I.: The Evolution of Personal Health Systems. Paper Presented at the 5th pHealth Workshop on Wearable Micro and Nanosystems for Personalised Health, Valencia-Spain Herzog, R., Konstantas, D., Bults, R., Halteren, A.V., Wac, K., Jones, V., Widya, I., Streimelweger, B.: Mobile Patient Monitoring - applications and value propositions for personal health. Paper Presented at the pHealth 2006, the International Workshop on wearable Micro- and Nanosystems for Personalized Health, Luzern, Switzerland (2006) Iverson, G., Gaetz, M., Rzempoluck, E., McLean, P., Linden, W., Remick, R.: A New Potential Marker for Abnormal Cardiac Physiology in Depression. Journal of Behavioral Medicine 28(6), 507–511 (2005), doi:10.1007/s10865-005-9022-7 Kors, J.A., Herpen, G.: The Coming of Age of Computerized ECG Processing: Can it Replace the Cardiologist in Epidemiological Studies and Clinical Trials?, pp. 1161– 1165 (2001) Lupu, E., Dulay, N., Sloman, M., Sventek, J., Heeps, S., Strowes, S., Twidle, K., Keoh, S.L., Schaeffer-Filho, A.: AMUSE: autonomic management of ubiquitous e-Health systems. Concurr. Comput.: Pract. Exper. 20(3), 277–295 (2008), doi:10.1002/cpe.v20:3 Moody, G.B., Mark, R.G.: The MIT-BIH Arrhythmia Database on CD-ROM and software for use with it. In: Proceedings of the Computers in Cardiology 1990, September 23-26, pp. 185–188 (1990) Neves, P.A.C.S., Fonsec, J.F.P., Rodrigue, J.J.P.C.: Simulation Tools for Wireless Sensor Networks in Medicine: a Comparative Study. Paper Presented at the International Joint Conference on Biomedical Engineering Systems and Technologies, Funchal, MadeiraPortugal
Sahambi, J.S., Tandon, S.N., Bhatt, R.K.P.: Using Wavelet Transforms for ECG Characterization: an On-Line Digital Signal Processing System, vol. 16, no. 1. Institute of Electrical and Electronics Engineers, New York (1997) Xu, R., Wunsch, D.: Survey of Clustering Algorithms. IEEE Transactions on Neural Networks 16(3), 645–678 (2005) Yanmin, Z., Sye Loong, K., Sloman, M., Lupu, E.C.: A lightweight policy system for body sensor networks. IEEE Transactions on Network and Service Management 6(3), 137–148 (2009) Yoon, S.W., Min, S.D., Yun, Y.H., Lee, S., Lee, M.: Adaptive Motion Artifacts Reduction Using 3-axis Accelerometer in E-textile ECG Measurement System. J. Med. Syst. 32(2), 101–106 (2008), doi:10.1007/s10916-007-9112-x
Multiple-Instance Learning via Decision-Based Neural Networks Yeong-Yuh Xu and Chi-Huang Shih
Abstract. Multiple Instance Learning (MIL) is a variation of supervised learning, where the training set is composed of many bags, each of which contains many instances. If a bag contains at least one positive instance, it is labelled as a positive bag; otherwise, it is labelled as a negative bag. The labels of the training bags are known, but those of the training instances are unknown. In this paper, a Multiple Instance Decision Based Neural Network (MI-DBNN) is proposed for MIL, which employs a novel discriminant function to capture the nature of MIL. The experiments were performed on the MUSK1 and MUSK2 data sets. In comparison with other methods, MI-DBNN demonstrates competitive classification accuracy on the MUSK1 and MUSK2 data sets, which are 97.8% and 98.4%, respectively.
Yeong-Yuh Xu: Department of Computer Science and Information Engineering, Hungkuang University, Taichung, Taiwan; e-mail: [email protected]
1 Introduction Over the last few decades, considerable concern has arisen over learning from examples in machine learning research. Supervised learning attempts to learn a concept from labelled training examples. Unsupervised learning attempts to learn the structure of the underlying sources of examples, where the training examples have no labels. In Multiple Instance Learning (MIL), the training set is composed of many bags, each of which contains many instances. If a bag contains at least one positive instance, it is labelled as a positive bag; otherwise, it is labelled as a negative bag. The labels of the training bags are known, but those of the training instances are unknown. The task is to learn the concept from the training set for correctly labelling unseen bags. MIL was first analyzed by Dietterich et al. [4]. They investigated the drug activity prediction problem, trying to predict whether a new molecule was
qualified to make some drug, through analyzing a collection of known molecules. They proposed three axis-parallel rectangle (APR) algorithms to search for the appropriate axis-parallel rectangles constructed by the conjunction of the features extracted from molecules. After Dietterich et al., numerous MIL algorithms have been developed, such as Diverse Density [8], the Bayesian-kNN and Citation-kNN algorithms [14], the EM-DD algorithm [17], etc., and successfully applied to many applications [9, 18, 3, 5, 1, 7, 15]. More work on MIL can be found in [19]. The robustness, adaptation, and ability to automatically learn from examples make neural network approaches attractive and exciting for MIL. When the notion of MIL was proposed, Dietterich et al. [4] indicated that a particularly interesting issue in this area is to design multiple-instance modifications for neural networks. Ramon and De Raedt [12] presented a neural network framework for MIL. Zhang and Zhou proposed a multi-instance neural network named BP-MIP [20, 16], which extended the popular BP [13] algorithm with a global error function defined at the level of bags instead of at the level of instances. How to construct a neural model structure is crucial for successful recognition. All the above neural networks for MIL are based on the all-class-in-one-network (ACON) structure, where all the classes are lumped into one super-network. The super-net has the burden of having to simultaneously satisfy all the teachers, so the number of hidden units tends to be large. In this paper, a Multiple Instance Decision Based Neural Network (MI-DBNN) is proposed for MIL. The proposed MI-DBNN is a probabilistic variant of the Decision Based Neural Networks (DBNN) [6]. The MI-DBNN inherits the one-class-in-one-network (OCON) structure from the DBNN. For each concept to be recognized, MI-DBNN devotes one of its subnets to the representation of that particular concept. Pandya and Macy [11] compared the performance of the ACON and OCON structures, and observed that the OCON model achieves better training and generalization accuracies. Besides, the discriminant function of the proposed MI-DBNN is in the form of a probability density, which yields high accuracy rates compared to other approaches, as discussed in Section 4. The remainder of this paper is organized as follows. In the next section, the proposed discriminant function is presented in detail. Then, in Section 3, the implementation of the proposed MI-DBNN and its learning scheme are introduced. Experimental results are presented and discussed in Section 4. Finally, Section 5 draws some conclusions and future works.
2 Discriminant Function One major difference between DBNN and MI-DBNN is that MI-DBNN follows the MIL constraint. That is, the discriminant function of MI-DBNN is designed to capture the nature of MIL. Given a set of i.i.d. feature patterns x = {x(t);t = 1, 2, · · · , N} extracted from the instances in a bag B, we assume that the likelihood function p(x(t)|ωi ) for the concept ωi is a linear combination of component densities p(x(t)|ωi , Θri ) in the form
p(x(t)|\omega_i) = \sum_{r_i=1}^{R_i} P(\Theta_{r_i}|\omega_i)\, p(x(t)|\omega_i, \Theta_{r_i}),

where R_i is the number of clusters, \Theta_{r_i} represents the r_i-th cluster, P(\Theta_{r_i}|\omega_i) denotes the prior probability of the r_i-th cluster, and p(x(t)|\omega_i, \Theta_{r_i}) is a D-dimensional Gaussian-like distribution with uncorrelated features

p(x(t)|\omega_i, \Theta_{r_i}) = \exp\!\left(-\tfrac{1}{2}\,(x(t) - \mu_{r_i})^T \Sigma_{r_i}^{-1} (x(t) - \mu_{r_i})\right),   (1)

where x(t) = [x_1(t), x_2(t), \cdots, x_D(t)]^T is the input pattern, \mu_{r_i} = [\mu_{r_i 1}, \mu_{r_i 2}, \cdots, \mu_{r_i D}]^T is the mean vector, and the diagonal matrix \Sigma_{r_i} = \mathrm{diag}[\sigma_{r_i 1}^2, \sigma_{r_i 2}^2, \cdots, \sigma_{r_i D}^2] is the covariance matrix.
By definition, \sum_{r_i=1}^{R_i} P(\Theta_{r_i}|\omega_i) = 1. Given that B is a positive bag and x(n) is a feature pattern extracted from a positive instance in B, the desired value of p(x(n)|\omega_i) is 1, and -\log(p(x(n)|\omega_i)) is adopted to measure the error between p(x(n)|\omega_i) and 1. Suppose that (h_1, h_2, \cdots, h_N) is a decreasing sequence obtained by sorting \{p(x(t)|\omega_i); t = 1, 2, \cdots, N\}, and B contains k positive instances. Then, we define the similarity between B and \omega_i as S(x, \omega_i, k) = g(x, \omega_i, k)\,\log(g(x, \omega_i, k)), where g(x, \omega_i, k) = \left(\prod_{n=1}^{k} (-\log h_n)\right)^{-1}. Clearly, if B contains at least one positive instance (i.e. h_1 \to 1), no matter what the values of h_2, \cdots, h_k are, S(x, \omega_i, k) has a large value. On the other hand, if all the instances in B are negative, that is, h_1, \cdots, h_k are far from 1, the value of S(x, \omega_i, k) is small. Consequently, it is S(x, \omega_i, k) that captures the nature of MIL. Since S(x, \omega_i, k) contains a division, in order to translate division into subtraction, we apply the logarithm to S(x, \omega_i, k). Accordingly, the discriminant function of each subnet in MI-DBNN is defined as \phi(x, w_i, k) = \log(S(x, \omega_i, k)), and can be further derived as follows:

\phi(x, w_i, k) = \log\!\Big(\sum_{n=1}^{k} (-\log(-\log h_n))\Big) + \sum_{n=1}^{k} (-\log(-\log h_n)),   (2)

where w_i = \{\mu_{r_i}, \Sigma_{r_i}, P(\Theta_{r_i}|\omega_i), T_i\}. T_i is the output threshold of the i-th subnet in MI-DBNN. Use the data set X = \{x_b; b = 1, 2, \cdots, M\}, where x_b = \{x_b(t); t = 1, 2, \cdots, N_b\} is the feature vector extracted from the instances in the b-th bag. Then, the energy function for MI-DBNN is defined as

E(X, w_i) = \sum_{b=1}^{M} \phi(x_b, w_i, N_b),   (3)
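A direct transcription of the discriminant function (2) might look as follows. The per-instance likelihoods p(x(t)|ω_i) are assumed to have been computed beforehand from the mixture in (1), and they are clipped away from 0 and 1 only to keep the logarithms finite; the example values are invented.

import numpy as np

def discriminant(instance_likelihoods, k):
    """Compute phi(x, w_i, k) of eq. (2) from the likelihoods p(x(t)|omega_i)."""
    p = np.clip(np.asarray(instance_likelihoods, dtype=float), 1e-12, 1.0 - 1e-12)
    h = np.sort(p)[::-1][:k]          # h_1 >= h_2 >= ... : the k largest likelihoods
    terms = -np.log(-np.log(h))       # -log(-log h_n)
    s = terms.sum()                   # eq. (2) assumes this sum is positive
    return np.log(s) + s              # log(sum) + sum, as in eq. (2)

# A bag with one near-positive instance scores much higher than a bag with none.
print(discriminant([0.99, 0.20, 0.10], k=2))   # large value (~5.5)
print(discriminant([0.50, 0.45, 0.40], k=2))   # small value (~0.07)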
If X is a positive set, (3) should have a large value; otherwise, (3) should have a small value. In order to verify the proposed discriminant function, we created an artificial data set: five positive and five negative bags, each with 100 instances. Each instance was chosen uniformly at random from a [0, 1] × [0, 1] ⊂ R^2 domain. The concept was located at two 0.1 × 0.1 squares in the Cartesian plane, one with corners at
(0.15, 0.75), (0.25, 0.75), (0.15, 0.85), and (0.25, 0.85) and the other with corners at (0.75, 0.15), (0.85, 0.15), (0.75, 0.25), and (0.85, 0.25). A bag was labelled positive if at least one of its instances fell within the square, and negative if none did. Each of the squares contains at least one instance from every positive bag and no negative instances. The created data were drawn in Fig. 1, where instances in negative bags are dots, and in positive bags are numbers.
Fig. 1 The artificial data contains five positive and five negative bags. The instances in negative and positive bags are dots and numbers, respectively. The concept was located at two 0.1 × 0.1 squares. Each square contains at least one instance from every positive bag and no negatives.
In order to highlight the advantages of finding the concept squares shown in Fig. 1, we plotted the proposed energy surface, the regular log-likelihood surface, and the corresponding contour plots with gradient vectors across the domain in Fig. 2. It is clear that picking out the global maximum (the desired concept) in Fig. 2(a) is easier than in Fig. 2(b). This phenomenon can be more clearly found if we compare the gradient vectors in Fig. 2(c) to those in Fig. 2(d).
3 Multiple-Instance Decision Based Neural Networks
The proposed MI-DBNN has a modular network structure. One subnet is designated to represent one object concept. For an m-concept MIL problem, MI-DBNN consists of m subnets. The structure of MI-DBNN is depicted in Fig. 3. To approximate the density function in (1), we apply elliptic basis functions to serve as the basis function for each cluster,

\varphi(x(t), \omega_i, \Theta_{r_i}) = -\frac{1}{2} \sum_{d=1}^{D} \frac{(x_d(t) - \mu_{r_i d})^2}{\sigma_{r_i d}^2}.
After passing an exponential activation function, exp{ϕ (x(t), ωi , Θri )} can be viewed as the same distribution as described in (1).
Fig. 2 Energy surfaces over the example data of Fig. 1. (a) is the proposed energy surface, and (b) is the log-likelihood surface over the example data of Fig. 1. (c) and (d) are the contour plots with gradient vectors of the proposed energy and the log-likelihood surfaces, respectively. It is clear that finding the peak which is within the desired concept using the proposed energy function is easier than using the regular log-likelihood function.
The training examples for each concept are from a set of bags with predefined labels (i.e., positive or negative). MI-DBNN adopts decision-based learning rules to learn the concepts. Unlike approximation neural networks, where exact target values are required, the teacher in MI-DBNN only tells the correctness of the classification for each training bag. The detailed description of the learning phase is given in the following.
3.1 The Learning Phase As described in Section 2, (3) should be maximized if the training pattern is from the positive set; otherwise, it has to be minimized. Given a positive training set X+ and a negative training set X−, the following reinforced and antireinforced learning techniques are applied to the corresponding subset.
Fig. 3 Structure of MI-DBNN. Each subnet is designated to recognize one concept.
Reinforced Learning:      w_i^{(m+1)} = w_i^{(m)} + \eta\, \nabla E(X^{+}, w_i),
Antireinforced Learning:  w_i^{(m+1)} = w_i^{(m)} - \eta\, \nabla E(X^{-}, w_i),
where 0 < \eta \le 1 is a user-defined learning rate, and \nabla E are the gradient vectors computed as follows:

\frac{\partial E(X, w_i)}{\partial \mu_{r_i d}}\bigg|_{w_i = w_i^{(m)}} = \left(\frac{1}{\sum_{t=1}^{N_b} \log(-\log p(x_b(t)|\omega_i))} - 1\right) \times \sum_{t=1}^{N_b} \left(\frac{p^{(m)}(\Theta_{r_i}|\omega_i, x_b(t))\,(x_{bd}(t) - \mu_{r_i d}^{(m)})}{\log p(x_b(t)|\omega_i)\,(\sigma_{r_i d}^{(m)})^2}\right),

\frac{\partial E(X, w_i)}{\partial \sigma_{r_i d}^2}\bigg|_{w_i = w_i^{(m)}} = \left(\frac{1}{\sum_{t=1}^{N_b} \log(-\log p(x_b(t)|\omega_i))} - 1\right) \times \sum_{t=1}^{N_b} \left(\frac{p^{(m)}(\Theta_{r_i}|\omega_i, x_b(t))\,(x_{bd}(t) - \mu_{r_i d}^{(m)})^2}{2\,\log p(x_b(t)|\omega_i)\,(\sigma_{r_i d}^{(m)})^4}\right),

where p^{(m)}(\Theta_{r_i}|\omega_i, x_b(t)) is the conditional posterior probability,

p^{(m)}(\Theta_{r_i}|\omega_i, x_b(t)) = \frac{P^{(m)}(\Theta_{r_i}|\omega_i)\, p^{(m)}(x_b(t)|\omega_i, \Theta_{r_i})}{p^{(m)}(x_b(t)|\omega_i)}.

As to the conditional prior probability P(\Theta_{r_i}|\omega_i), since the EM algorithm can automatically satisfy the probabilistic constraints \sum_{r_i=1}^{R_i} P(\Theta_{r_i}|\omega_i) = 1 and P(\Theta_{r_i}|\omega_i) \ge 0, it is applied to update the P(\Theta_{r_i}|\omega_i) values to regulate the influences of different clusters:

P^{(m+1)}(\Theta_{r_i}|\omega_i) = \frac{1}{M \cdot N_b} \sum_{b=1}^{M} \sum_{t=1}^{N_b} p^{(m)}(\Theta_{r_i}|\omega_i, x_b(t)).   (4)
Threshold Updating The threshold value of MI-DBNN can also be learned by the reinforced and antireinforced learning rules. Since the decrement of the discriminant function \phi(x, w_i, k) and the increment of the threshold T_i have the same effect on the decision-making process, the direction of the reinforced and antireinforced learning for the threshold is the opposite of the one for the discriminant function. For example, if an input data set x belongs to the concept \omega_i but \phi(x, w_i, k) < T_i, then T_i should reduce its value. On the other hand, if x does not belong to \omega_i but \phi(x, w_i, k) > T_i, then T_i should increase. The proposed adaptive learning rule to train the threshold T_i is described as follows. Define d(x, \omega_i) \equiv T_i - \phi(x, w_i, k) and a penalty function f(d(x, \omega_i)), which can be either a step function, a linear function, or a fuzzy-decision sigmoidal function. Then, given a positive learning parameter \gamma, at step j the threshold value is trained as follows:

T_i^{(j+1)} = T_i^{(j)} - \gamma\, f(d(x, \omega_i)),  if x \in \omega_i (reinforced learning);
T_i^{(j+1)} = T_i^{(j)} + \gamma\, f(d(x, \omega_i)),  otherwise (antireinforced learning).

In order to verify the proposed learning rules, we trained MI-DBNN to learn the concept from the artificial data created in Section 2. During the learning phase, the trajectory of the predicted positions of the desired concepts is shown in Fig. 4. Clearly, MI-DBNN successfully picked out the peak and learned the desired concept.
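A minimal sketch of the threshold rule is given below. It applies the update only in the two misclassification cases described above and uses a sigmoidal penalty as one of the admissible choices of f; the learning parameters are illustrative.

import math

def update_threshold(T_i, phi, belongs_to_concept, gamma=0.05, sigma=1.0):
    """One step of the threshold learning rule for the two cases described above."""
    d = T_i - phi                                  # d(x, omega_i) = T_i - phi(x, w_i, k)
    f = 1.0 / (1.0 + math.exp(-d / sigma))         # fuzzy-decision sigmoidal penalty
    if belongs_to_concept and phi < T_i:           # positive bag rejected: lower T_i
        return T_i - gamma * f
    if (not belongs_to_concept) and phi > T_i:     # negative bag accepted: raise T_i
        return T_i + gamma * f
    return T_i                                     # correctly classified: no change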
Fig. 4 MI-DBNN is trained to learn the concept from the artificial data of Fig. 1. The concept was located at two 0.1 × 0.1 squares. The red circles show the trajectory of the predicted positions of the desired concepts during the MI-DBNN learning phase. Apparently, MI-DBNN successfully picked out the peak and learned the desired concept.
3.2 The Recognition Phase The goal of the MI-DBNN recognition phase is to obtain instances from the input unlabelled bag, to compare them with the concept models learned before, and to find the concepts the input unlabelled bag belongs to. As shown in Fig. 3, each subnet receives the input patterns extracted from the unlabelled bag and computes the discriminant function as shown in (2). Then, the results of (2) are compared with the threshold T_i. Depending on the application, the unlabelled bag may be labelled as one concept or as multiple concepts. The vector V in MI-DBNN is the recognition vector showing which concepts the unlabelled bag belongs to. The i-th element of V is set to 1 if the output of the i-th discriminant function is larger than T_i, which implies that the given bag belongs to the concept \omega_i. Otherwise, the i-th element of V is set to 0, implying that the given bag does not belong to the concept \omega_i. From the recognition vector, one can recall which concepts the given bag belongs to.
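The recognition step itself reduces to thresholding each subnet's output; the short sketch below uses invented example numbers.

def recognition_vector(subnet_outputs, thresholds):
    """Build V: V[i] = 1 if subnet i's discriminant exceeds its threshold T_i."""
    return [1 if phi > T_i else 0 for phi, T_i in zip(subnet_outputs, thresholds)]

# Example: a bag accepted by concept 0 and rejected by concepts 1 and 2.
print(recognition_vector([5.5, -0.3, 0.1], [1.0, 1.0, 1.0]))   # [1, 0, 0]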
4 Experiments In order to show the proposed MI-DBNN's ability to deal with MIL problems, the MUSK data sets available from the UCI Machine Learning Repository [10] are used in the experiments. The MUSK data sets include MUSK1 and MUSK2. The data sets consist of descriptions of molecules. The target protein is a putative receptor in the human nose, and a molecule binds to the receptor (and is positive) if it smells like a musk. The MI-DBNN is trained to learn what shape makes a molecule musky. MUSK1 has 92 molecules (bags), of which 47 are positive, with an average of 5.17 shapes (instances) per molecule. MUSK2 has 102 molecules, of which 39 are positive, with
an average of 64.69 shapes per molecule. Each instance (in this case a conformation) is represented by 162 rays, along with four additional features that specify the location of a unique oxygen atom common to all the molecules. As a consequence, each instance contains a total of 166 features. Ten-fold cross validation is performed on each MUSK data set. MI-DBNN is then trained ten times, each of which involves a different combination of nine partitions as the training set and the remaining one as the testing set. Table 1 summarizes the prediction accuracy of 8 MIL algorithms in the literature: GFS elim-kde APR, GFS elim-count APR, and iterated-discrim APR [4], Diverse Density [8], Citation-kNN, Bayesian-kNN [14], EM-DD [17]1, and MILES [3]. We see from Table 1 that the proposed MI-DBNN obtains an average accuracy of 97.8% on MUSK1 and 98.4% on MUSK2, which is the best performance on both MUSK1 and MUSK2 data sets. The experimental result tells us that MI-DBNN has the capability of dealing with MIL problems, and demonstrates competitive classification accuracy in comparison with the other methods.
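The evaluation protocol can be sketched as follows: folds are formed over bags (molecules) so that all instances of a molecule stay in one partition. The train and evaluate arguments are placeholders for MI-DBNN training and bag-level prediction; they are not part of any existing library.

import random

def ten_fold_accuracy(bags, labels, train, evaluate, seed=0):
    """Bag-level ten-fold cross validation: nine folds for training, one for testing."""
    indices = list(range(len(bags)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::10] for i in range(10)]        # ten disjoint bag partitions
    accuracies = []
    for k in range(10):
        test_idx = set(folds[k])
        train_idx = [i for i in indices if i not in test_idx]
        model = train([bags[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(evaluate(model, bags[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / max(len(test_idx), 1))
    return sum(accuracies) / len(accuracies)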
Table 1 Comparison of the predictive accuracy (%correct ± standard deviation) on the MUSK data sets.
Algorithms                  MUSK1        MUSK2
EM-DD [17]                  84.8         84.9
iterated-discrim APR [4]    92.4         89.2
GFS elim-kde APR [4]        91.3         80.4
GFS elim-count APR [4]      90.2         75.5
Diverse Density [8]         88.9         82.5
Citation-kNN [14]           92.4         86.3
Bayesian-kNN [14]           90.2         82.4
MILES [3]                   87.0         93.1
MI-DBNN                     97.8±1.14    98.4±1.05
5 Conclusions We have presented a Multiple Instance Decision Based Neural Network (MI-DBNN) for multiple-instance learning (MIL). A novel discriminant function is proposed to capture the nature of MIL. We tested MI-DBNN over the benchmark data sets taken from applications of drug activity prediction. In comparison with other methods, MI-DBNN demonstrates competitive classification accuracy on the MUSK1 and MUSK2 data sets, which are 97.8% and 98.4%, respectively. Since MI-DBNN is a general algorithm that has not been optimized toward any data, applying MI-DBNN to more real-world applications such as content-based image retrieval is an interesting issue for future works. Furthermore, it is also interesting to employ feature selection techniques to test if feature selection can improve the performance of MI-DBNN.
1 The EM-DD results reported in [17] were obtained by selecting the optimal solution using the test data. The EM-DD result cited in this paper was provided by [2] using the correct algorithm.
References 1. Ali, S., Shah, M.: Human action recognition in videos using kinematic features and multiple instance learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 32, 288–303 (2010) 2. Andrews, S., Tsochantaridis, I., Hofmann, T.: Support vector machines for multipleinstance learning. In: Advances in Neural Information Processing Systems 15, pp. 561– 568. MIT Press, Cambridge (2003) 3. Chen, Y., Bi, J., Wang, J.Z.: Miles: Multiple-instance learning via embedded instance selection. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(12), 1931– 1947 (2006) 4. Dietterich, T.G., Lathrop, R.H., Lozano-Perez, T.: Solving the multiple-instance problem with axis-parallel rectangles. Artificial Intelligence 89(1-2), 31–71 (1997) 5. Gu, Z., Mei, T., Hua, X.S., Tang, J., Wu, X.: Multi-layer multi-instance learning for video concept detection. IEEE Transactions on Multimedia 10(8), 1605–1616 (2008) 6. Kung, S., Taur, J.: Decision-based hierarchical neural networks with signal/image classification applications. IEEE Transactions on Neural Networks 6(1), 170–181 (1995) 7. Mandel, M.I., Ellis, D.P.W.: Multiple-instance learning for music information retrieval. In: Proceedings of Ninth International Conference on Music Information Retrieval (2008) 8. Maron, O., Lozano-Perez, T.: A framework for multiple-instance learning. In: Advances in Neural Information Processing Systems, pp. 570–576 (1998) 9. Maron, O., Ratan, A.L.: Multiple-instance learning for natural scene classification. In: Proceedings of the Fifteenth International Conference on Machine Learning, ICML 1998, pp. 341–349. Morgan Kaufmann Publishers Inc., San Francisco (1998) 10. Murphy, P.M., Aha, D.W.: Uci repository of machine learning databases 11. Pandya, A.S., Macy, R.B.: Pattern Recognition with Neural Networks in C++. CRC Press, Boca Raton (1995) 12. Ramon, J., Raedt, L.D.: Multi instance neural networks. In: Proc. ICML 2000 Workshop on Attribute-Value and Relational Learning (2000) 13. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning internal representations by error propagation, pp. 673–695. MIT Press, Cambridge (1988) 14. Wang, J.: Solving the multiple-instance problem: A lazy learning approach. In: Proc. 17th International Conf. on Machine Learning, pp. 1119–1125. Morgan Kaufmann, San Francisco (2000) 15. Zafra, A., Gibaja, E.L., Ventura, S.: Multiple instance learning with multiple objective genetic programming for web mining. Appl. Soft Comput. 11, 93–102 (2011) 16. ling Zhang, M., hua Zhou, Z.: Improve multi-instance neural networks through feature selection. In: Neural Processing Letters, pp. 1–10 (2004)
17. Zhang, Q., Goldman, S.A.: Em-dd: An improved multiple-instance learning technique. In: Advances in Neural Information Processing Systems, pp. 1073–1080. MIT Press, Cambridge (2001) 18. Zhang, Q., Goldman, S.A., Yu, W., Fritts, J.E.: Content-based image retrieval using multiple-instance learning. In: Proceedings of the Nineteenth International Conference on Machine Learning, pp. 682–689. Morgan Kaufmann, San Francisco (2002) 19. Zhou, Z.H.: Multi-instance learning: A survey. Tech. rep., AI Lab, Department of Computer Science and Technology, Nanjing University, Nanjing, China (2004) 20. Zhou, Z.H., Zhang, M.L.: Neural networks for multi-instance learning. Tech. rep., AI Lab, Department of Computer Science and Technology, Nanjing University, Nanjing, China (2002)
Software Testing – Factor Contribution Analysis in a Decision Support Framework
Deane Larkman, Ric Jentzsch, and Masoud Mohammadian
Abstract. A decision support framework has been developed to guide software test managers in their planning and risk management for successful software testing. Total factor contribution analysis for risk management is applied to the decision support framework. Total factor contribution analysis (FCA) is a tool that can be used to analyse risk before and during software testing. This paper illustrates how software test managers can apply FCA to the decision support framework to assess risk management issues, and interpret the results for their implications on successful software testing.
1 Introduction
Software issues are things that are inconsistent, incomplete, inappropriate, or do not conform to the intended good practices of the software (Institute of Electrical and Electronics Engineers 1994). Software testing identifies issues that result when software does not meet its intended requirements in some way before, during, and after it is executed. Most software issues are not obvious: they can be simple or subtle, or both. Often it is hard to distinguish between what is an issue and what is not an issue (Patton 2006). Software testing planning is an essential part of the software testing life cycle (Editorial 2010). However, it is a labour intensive and complex activity (Ammann and Offutt 2008). Planning for successful software testing relies on the expertise and experience of the software test manager (Pinkster et al. 2004). Little to no research has been reported on the development or use of a decision support framework for software testing. Despite an intensive literature search, we are only aware of our previous work on decision support frameworks for software testing (Larkman et al. 2010a, b). Research on assessing risk management in software testing is also scarce. To assist the software test manager in their planning and risk management for successful software testing, a decision support framework has been developed.
Deane Larkman · Masoud Mohammadian, Faculty of Information Sciences and Engineering, University of Canberra, ACT, Australia, e-mail: [email protected], [email protected]
Ric Jentzsch, Business Planning Associates Pty Ltd, ACT, Australia, e-mail: [email protected]
2 Decision Support Frameworks Defined A decision support framework (DSF) establishes a structure and organisation about a phenomenon. The phenomenon is a type of thing, event, or situation (Alter 2002). The framework can be a real or conceptual structure, or it can be an abstract logical structure (Burns and Grove 2009). Generic decision support frameworks use a variety of terms to describe their structure. The structure consists of a high level set of requirements that provides for the inclusion of such concepts as: elements, components, objects, entities, and/or factors. The ability to plug requirements into a decision support framework provides a guide to support analysis and information on achieving an overall goal or objective. A decision support framework is specific to a particular environment, application, business issue, or concept. A decision support framework is based on what is to be achieved, not how it is to be achieved; and therefore the framework implicitly decouples the what from the how (Larkman et al. 2010b). At any particular point in time, a decision support framework identifies an invariant set of concepts and therefore infers a discrete boundary. The framework can be a defined approach, a set of rules, a set of policies, a set of data for the understanding of an issue or domain, a high level definition to achieve some outcome, or a group of outcomes.
3 Decision Support Framework Used for This Study
The development of the decision support framework for software testing has been reported elsewhere (Larkman et al. 2010b). The decision support framework was developed to guide the software test manager in their planning task for successful software testing. The framework consists of:
1. a set of elements that represent major software testing categories that need to be addressed before testing begins;
2. element factors that provide details about the related element that the software manager needs to consider when planning and assessing risk for successful software testing; and
3. a set of directional signed relationships that indicate elements’ influences, directly or indirectly, on successful software testing, and that provide a basis for risk management assessment before and during software testing.
The DSF includes a goal (C0), three primary elements (C1, C2, and C3) and one secondary element (C3.1). The influences of elements are mapped by the directional signed relationships to the goal or other elements. Along each directional signed relationship is an illustrative influence weighting, expressed as a percentage. The percentages shown in Fig. 1 are for illustration and discussion purposes only. These percentages will vary by the type of software to be tested within the organisational context, and by the software test manager’s experience and expertise. Influence weightings are determined by the software test manager, and they define the strength of an element’s influence on achieving successful software testing. For each of the four elements there are a set of factors which define the details of an element, as shown in Fig. 1.
Fig. 1 Decision Support Framework for Software Testing
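To make the structure concrete, the following sketch encodes the DSF of Fig. 1 as a small directed graph. It is illustrative only: the element names C0 through C3.1 and the relationship structure come from the paper, the goal weightings 80%, 70% and 55% are the illustrative percentages used later in the worked examples of Section 5, and the factor names and their contribution percentages are hypothetical placeholders invented here so that each element's factors sum to 100%.

```python
# Illustrative encoding of the DSF in Fig. 1 (not the authors' implementation).
DSF_EDGES = {                # directional signed relationships (who influences what)
    "C1": ["C0", "C2"],      # test management -> goal, test information
    "C2": ["C0", "C3"],      # test information -> goal, test environment
    "C3": ["C0", "C2"],      # test environment -> goal, test information
    "C3.1": ["C3"],          # technical support -> test environment
}

# Illustrative influence weightings of the primary elements on the goal C0,
# taken from the worked examples in Section 5.
GOAL_WEIGHTINGS = {"C1": 80.0, "C2": 70.0, "C3": 55.0}

# Hypothetical factor breakdowns; the paper only requires that each element's
# factor contributions total 100%.
FACTORS = {
    "C1": {"test plan": 40, "test schedule": 30, "test resources": 30},
    "C2": {"requirements": 50, "test data": 50},
    "C3": {"test tools": 60, "test laboratory": 40},
    "C3.1": {"tool support": 50, "infrastructure support": 50},
}

assert all(sum(f.values()) == 100 for f in FACTORS.values())
```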
The DSF is applied in two steps. First the software test manager assigns input (influence weightings and factor contribution percentages) to the DSF, to allow them to model their specific testing situation. Second the software test manager evaluates and interprets the DSF model using one or more analytical techniques. Using the DSF for software testing comes with many advantages:
• First, it provides a template that the software test manager can apply to any type of software that is planned to be tested.
• Second, the DSF serves as a guide for software test managers of those things (in the form of elements and factors) they need to consider when planning for software testing.
• Third, the DSF can be used for assessment of risk management issues. In other words, when things do not go 100% according to plan (such as a resource shortage or an unexpectedly reduced testing schedule), the DSF can be analysed for the risk impact of not achieving successful software testing.
• Finally, the DSF provides a basis for software test project review, used for management reporting. The DSF can be used to compare planning estimates against actual results.
The DSF generates a specific software testing model that is used to analyse the type of software to be tested from two perspectives:
1. Static perspective; and
2. Dynamic perspective.
The static perspective is used to ensure that all software testing considerations have been thought about, and that critical path analysis and total path weight analysis have been done (Larkman et al. 2010a). The static perspective is not part of this paper and is not discussed herein. The dynamic perspective is used to more formally understand the risk management assessment based on the DSF model. The dynamic perspective includes:
• Fuzzy cognitive maps; and
• Factor contribution analysis.
Fuzzy cognitive map analysis, for the DSF, has been discussed elsewhere and is not part of this paper (Larkman et al. 2010a).
4 Factor Contribution Analysis (FCA)
This paper concentrates on factor contribution analysis for risk management assessment. Fig. 2 will be used to discuss the use of FCA for the decision support framework (established in Fig. 1) (Larkman et al. 2010b).
Fig. 2 Decision Support Framework for Factor Contribution Analysis
4.1 Factors and Elements Factors are a set of items that relate to a particular element. Factors provide more detailed information about an element as shown in Fig. 1. Each factor contributes to the success of the element it is associated with. By success we mean the degree that the element is able to fulfil its influence on achieving the goal. Each factor contributes to an element being able to influence the goal directly or indirectly, through its influence on other elements. The sum of factors for each element must be equal to 100%, as shown in Fig. 2. When the sum of an element’s factor contributions is not equal to 100%, then the influence weighting percentage(s) associated with that element will be less than their assigned percentage(s). When the factors contribute less than 100%, then factor contribution analysis can be used to assess the level of risk of not achieving the goal of successful software testing.
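As a simple illustration of the rule that an element's factors must total 100% (a sketch only, not part of the authors' framework), the shortfall of a factor set from 100% can be computed directly; it is this shortfall that is later propagated to the influence weightings.

```python
def factor_contribution_loss(factors: dict[str, float]) -> float:
    """Return the percentage shortfall of an element's total factor
    contribution from 100% (0.0 when the factors sum to 100%)."""
    total = sum(factors.values())
    if total > 100:
        raise ValueError("factor contributions cannot exceed 100%")
    return 100.0 - total

# Hypothetical factor values: this element has lost 10% of its contribution.
print(factor_contribution_loss({"test plan": 40, "test schedule": 30,
                                "test resources": 20}))   # -> 10.0
```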
4.2 Factor Contribution Analysis (FCA) Technique FCA looks at the changes to the total factor contribution of one or more elements in analysing the risk of not being able to achieve the goal. Individual percentages for each factor are not material to the analysis, as only the “total of 100%” is used. Of course, once the analysis has been done, the software test manager would be more concerned with which factor, or factors, contribute to the loss, and which ones require attention.
Some basic rules and issues need to be remembered in factor contribution analysis:
1. The influence weighting attributed to each element will be ≥ 10% and ≤ 90%;
2. Factor contribution analysis is only concerned with an element’s total factor contribution and not the individual factors for that element;
3. Risk management is based only on the intent of achieving the goal, and is not applicable to the individual element;
4. If a factor contribution loss is equal to 100%, the goal cannot be reached. Thus the realistic maximum loss for a factor contribution cannot be greater than 90%;
5. No individual factor can be 100%, as that would make all the other factors for that element 0%, and the goal would not be achieved;
6. Back tracking between elements is not permitted, as factor contribution analysis would result in an endless loop (such would be the case for C2 → C3 → C2 → C3 → C2 → etc.); and
7. The decision support framework for software testing, in its current structure, has a maximum of 16 possible scenarios.
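Rule 7's count of 16 scenarios follows directly from the four elements: in any scenario, each of C1, C2, C3 and C3.1 either retains its full factor contribution or falls below 100%, giving 2^4 = 16 combinations. A short, illustrative check:

```python
from itertools import chain, combinations

elements = ["C1", "C2", "C3", "C3.1"]
# One scenario per subset of elements whose factor contribution falls below 100%
# (the empty subset is the "everything at 100%" baseline).
scenarios = list(chain.from_iterable(
    combinations(elements, k) for k in range(len(elements) + 1)))
print(len(scenarios))   # -> 16, matching rule 7
```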
Risk management assessment begins with what happens when an element’s factor contribution fails to meet 100%. The following table shows risk management criteria, and how they are interpreted against changes in the influence weightings between primary elements and the goal. The influence weighting change criteria are based on the original influence weightings compared with the new influence weightings, which are determined from the factor contribution change.

Table 1 Risk Management Criteria
Risk Category    Influence Weighting Change Criteria: Percentage of Original Weightings
Low              ≥ 90%
Low-Medium       ≥ 80% to < 90%
Medium           ≥ 70% to < 80%
Medium-High      ≥ 60% to < 70%
High             < 60%
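Read as a function, Table 1 maps the ratio of the new to the original goal weightings (expressed as a percentage) onto a risk category. A direct, illustrative transcription:

```python
def risk_category(percent_of_original: float) -> str:
    """Map the new-to-original influence weighting ratio (in %) to the
    risk categories of Table 1."""
    if percent_of_original >= 90:
        return "Low"
    if percent_of_original >= 80:
        return "Low-Medium"
    if percent_of_original >= 70:
        return "Medium"
    if percent_of_original >= 60:
        return "Medium-High"
    return "High"

print(risk_category(83.9))   # -> "Low-Medium", as in the first example of Section 5
print(risk_category(61.5))   # -> "Medium-High", as in the second example of Section 5
```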
4.3 Element Total Factor Contribution Effects
When C1 (test management) total factor contribution falls below 100%, the following influence weightings are affected:
C1 → C0
C1 → C2 → C0
C1 → C2 → C3 → C0
When C2 (test information) factor contribution falls below 100%, the following influence weightings are affected:
C2 → C0
C2 → C3 → C0
When C3 (test environment) factor contribution falls below 100%, the following influence weightings are affected:
C3 → C0
C3 → C2 → C0
When C3.1 (technical support) factor contribution falls below 100%, the following influence weightings are affected:
C3.1 → C3 → C0
C3.1 → C3 → C2 → C0
4.4 Combined Elements Factor Contribution Effects
This is the case when two, three or all four elements’ factor contributions fall below 100%. For example, what if C1’s (test management) and C3.1’s (technical support) factor contributions fall below 100%? Based on the DSF (Larkman et al. 2010b) shown in Fig. 2, and using what has been discussed thus far, the following influences are analysed:
C1 → C0
C1 → C2 → C0
C1 → C2 → C3 → C0
C3.1 → C3 → C0
C3.1 → C3 → C2 → C0
In other words the goal (successful software testing) will be affected by the change to C1 and C3.1. The effect of C1 is from C1 to C0, C2 to C0 via C1, and C3 to C0 via C2. The effect of change to C3.1 adds to the effect of change to C1. The effect of C3.1 is from C3 to C0 and C2 to C0 via C3.
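Taken together, Sections 4.3 and 4.4 imply that an element's weighting on the goal is reduced by its own factor-contribution loss plus the losses of every element that influences it, directly or indirectly, along the signed relationships (with no back-tracking). The sketch below codifies that reading. It is an interpretation rather than the authors' implementation: the additive accumulation of losses along influence paths is inferred from the arithmetic of the worked examples in Section 5, and the goal weightings 80, 70 and 55 are the illustrative values used there.

```python
# Which elements influence each primary element, read off Sections 4.3-4.4:
# C1 has no influencers; C2 is influenced by C1, C3 and (via C3) C3.1;
# C3 is influenced by C2, C3.1 and (via C2) C1.
INFLUENCERS = {
    "C1": set(),
    "C2": {"C1", "C3", "C3.1"},
    "C3": {"C1", "C2", "C3.1"},
}
ORIGINAL = {"C1": 80.0, "C2": 70.0, "C3": 55.0}  # illustrative weightings on C0


def new_goal_weightings(losses: dict[str, float]) -> dict[str, float]:
    """losses: percentage loss in total factor contribution per element,
    e.g. {"C1": 10, "C3.1": 10}. Returns the reduced weightings on the goal,
    accumulating the losses of each element and of all its influencers."""
    result = {}
    for element, weight in ORIGINAL.items():
        affected_by = {element} | INFLUENCERS[element]
        total_loss = sum(losses.get(e, 0.0) for e in affected_by)
        result[element] = weight - weight * total_loss / 100.0
    return result


def overall_percent_of_original(new_weights: dict[str, float]) -> float:
    """Ratio of the summed new weightings to the summed original weightings."""
    return 100.0 * sum(new_weights.values()) / sum(ORIGINAL.values())
```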
5 Illustrated Examples
An example: what happens if the total factor contributions of C1 and C3.1 are each reduced by 10%? The new influence weightings on the goal are:
C1 → C0: 72% (80 minus reduction: 10% of 80)
C2 → C0: 56% (70 minus reduction: 10 + 10 = 20% of 70)
C3 → C0: 44% (55 minus reduction: 10 + 10 = 20% of 55)
C1 is only affected by changes to its own factor contributions. However, C1 affects both the C2 and C3 influence weightings on the goal. C3.1 affects both the C3 and C2 influence weightings on achieving the goal, but not C1. The total effect on achieving the goal has been reduced. If the sum of the new influence weightings on the goal (172) is divided by the sum of the influence weightings on the goal before the change in factor contribution (205), the result is 83.9%. The loss in factor contribution on C1 and C3.1 shows that the influence weighting loss carries a LOW-MEDIUM risk of not achieving the intended goal (see Table 1). As another example, what if the following occurs:

Table 2 Factor Contribution Analysis #2
Elements   Factor Contribution Reduced to   Goal Weight Reduced by   New Influence Weighting on the Goal
C1         95%                              5%                       76.0
C2         75%                              60%                      28.0
C3         90%                              60%                      22.0
C3.1       80%                              n/a                      (via C3 and C2)
TOTAL                                                                126.0
The calculations of the new influence weightings on the goal are: C1 → C0 (80 minus reduction: 5% of 80), C2 → C0 (70 minus reduction: 5 + 25 + 10 + 20 = 60% of 70) and C3 → C0 (55 minus reduction: 5 + 25 + 10 + 20 = 60% of 55). The risk of not achieving the goal has increased substantially. If the sum of the new influence weightings (126.0) is divided by the sum of the influence weightings on the goal before the change in factor contribution (205), the result is 61.5%. Using Table 1, the loss in factor contribution across the board shows that the loss carries a MEDIUM-HIGH risk of not achieving the intended goal.
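Both worked examples can be reproduced with the sketches introduced earlier (new_goal_weightings, overall_percent_of_original and risk_category), under the same illustrative 80/70/55 weightings, whose sum gives the 205 denominator:

```python
# Reuses new_goal_weightings, overall_percent_of_original and risk_category
# from the earlier sketches.

# Example 1: C1 and C3.1 each lose 10% of their factor contribution.
ex1 = new_goal_weightings({"C1": 10, "C3.1": 10})
print(ex1)                                              # {'C1': 72.0, 'C2': 56.0, 'C3': 44.0}
print(round(overall_percent_of_original(ex1), 1))       # 83.9
print(risk_category(overall_percent_of_original(ex1)))  # Low-Medium

# Example 2 (Table 2): contributions fall to 95%, 75%, 90% and 80%.
ex2 = new_goal_weightings({"C1": 5, "C2": 25, "C3": 10, "C3.1": 20})
print(ex2)                                              # {'C1': 76.0, 'C2': 28.0, 'C3': 22.0}
print(round(overall_percent_of_original(ex2), 1))       # 61.5
print(risk_category(overall_percent_of_original(ex2)))  # Medium-High
```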
6 Conclusion
Factor Contribution Analysis (FCA) is a tool that can be used by software test managers in risk management assessment, based on the DSF model for their particular software testing situation. This paper has demonstrated the use of FCA when planning for successful software testing, and how to analyse the risk if the factor contributions of one or more elements are not 100%. It was shown how to interpret the FCA results and understand their meaning for the risk of not achieving successful software testing. The DSF is an important contribution to the tool set needed by modern software test managers.
References [1] Alter, S.: Information systems: Foundation of e-business, 4th edn. Prentice Hall, Upper Saddle River (2002) [2] Ammann, P., Offutt, J.: Introduction to software testing. Cambridge University Press, New York (2008) [3] Burns, N., Grove, S.K.: The practice of nursing research: Appraisal, synthesis, and generation of evidence, 6th edn. Elsevier Saunders, St. Louis (2009) [4] Editorial (2010) Software testing life cycle, http://editorial.co.in/software/ software-testing-life-cycle.php (accessed January 20, 2011) [5] Institute of Electrical and Electronics Engineers (1994) IEEE standard classification for software anomalies, doi:10.1109/IEEESTD.1994.121429 [6] Larkman, D., Mohammadian, M., Balachandran, B., Jentzsch, R.: Fuzzy cognitive map for software testing using artificial intelligence techniques. In: Papadopoulos, H., Andreou, A.S., Bramer, M. (eds.) AIAI 2010. IFIP Advances in Information and Communication Technology, vol. 339, pp. 328–335. Springer, Heidelberg (2010a), doi:10.1007/978-3-642-16239-8_43 [7] Larkman, D., Mohammadian, M., Balachandran, B., Jentzsch, R.: General application of a decision support framework for software testing using artificial intelligence techniques. In: Phillips-Wren, G., Jain, L.C., Nakamatsu, K., Howlett, R.J. (eds.) Second KES International Symposium IDT 2010, July 28-30, pp. 53–63. Springer, Heidelberg (2010b), doi:10.1007/978-3-642-14616-9_5 [8] Patton, R.: Software testing, 2nd edn. Sams Publishing, Indiana (2006) [9] Pinkster, I., van de Gurgt, B., Janssen, D., van Veenendaal, E.: Successful test management: An integral approach. Springer, Berlin (2004)
Sustainability of the Built Environment – Development of an Intelligent Decision System to Support Management of Energy-Related Obsolescence
T.E. Butt and K.G. Jones
Abstract. From the built environment perspective, well over half of what has been built and is being built will be around for many decades to come. For instance, approximately 70% of the UK buildings that were built before 2010 will still exist in the 2050s. The existing built environment (both infrastructures and buildings) suffers obsolescence in many ways and of various types. Obsolescence is being, and will increasingly be, induced in the existing built environment not only by conventional factors (such as ageing, wear and tear) but also by climate change related factors such as global warming / heat waves, wetter and colder winters, hotter and dryer summers, more frequent and more intense flooding and storms, etc. There are complexities and variation of characteristics from one built environment to another in terms of obsolescence. Whatever the type, shape, size, nature and location of a built environment scenario, energy is involved in it one way or another. Existing energy-related systems in built environments are going to become obsolescent due to both climate change and non-climate change related factors such as those listed above. Furthermore, energy in the built environment exists in three different stages, which are the generation end, distribution, and consumption end of the ‘pipeline’. Accommodating the aforesaid complexities and variation of characteristics from one built environment to another in terms of obsolescence specifically due to energy-related systems, this paper presents an intelligent decision making tool in the form of a conceptual but holistic framework for energy-related obsolescence management. At this stage of the research study, the tool provides a conceptual platform where various stages and facets of assessment and management of energy-related obsolescence are assembled together in the form of a sequential and algorithmic system.
T.E. Butt · K.G. Jones, Sustainable Built Environments Research Group (SBERG), University of Greenwich, Avery Hill Campus, Bexley Road, Eltham, London SE9 2PQ, England, UK. Tel.: +44(0)7817 139170, e-mail: [email protected]
Keywords: Intelligent decision technology; intelligent decision making; sustainability; sustainable development; multi-agent system; conceptual framework; obsolescence; built environment; climate change.
1 Background The term built environment means human-made surroundings that provide a setting for human activity, ranging in scale from personal shelter to neighbourhoods and large scale-scale civic surroundings. Thus, whatever is human-made or human-influenced constitutes the built environment. The built environment consists of two main parts which are buildings and infrastructures. The built environment density in an urban environment is more than in a rural environment. The biophysical properties of the urban environment are distinctive with a large building mass (350kg.m-2 in dense residential areas) and associated heat storage capacity, reduced greenspace cover (with its evaporative cooling and rainwater interception and infiltration functions) and extensive surface sealing (around 70% in high density settlement and city centres) which promotes rapid runoff of precipitation (Handley, 2010). Climate change amplifies this distinctive behaviour by strengthening the urban heat island (Gill et. al. 2004). As a general rule, the greater is the density of a built environment, the greater will be the potential of the obsolescence, irrespective of other reasons and drivers. For instance, London is one of the most urbanised parts of the UK built environment in terms of a range of elements such as geographical size, value, economy, human population, diversity, ecology and heritage. Furthermore, London is the capital of the UK and located near the North Sea, stretching around an estuary, with the River Thames running through it, thereby further adding significance and sensitivity to the city in a hydrological context e.g. increased potential of pluvial, fluvial, tidal and coastal floods. In view of these wide-ranging elements together, the overall London share in the total obsolescence to take place in the total UK built environment over time, is most probably to be larger than anywhere else in the UK, and probably one of the largest shares throughout the world. (Butt et. al., 2010a; 2010b). Any constituent (such as a building or infrastructure) of built environment grows to become obsolete or suffers increasing obsolescence over time. Moreover, what is being built now shall predominantly be around as a substantial part of our built environment for decades to come, which are bound to suffer various degrees of obsolescence in different ways (Butt et. al., 2010a; 2010b). In order to render our built environment more sustainable, obsolescence needs to be combated. There is a host of factors which play a role either alone or collectively to cause obsolescence. These factors are not only conventional such as general wear and tear, fatigue, corrosion, oxidation, evaporation, rusting, leaking of gas / water or any other fluid like coolant, breaking, age, etc. These factors are also nonconventional, rather contemporary such as changes in existing or advent of a new environmental legislation; social forces / pressure groups; arrival of new technology; enrichment of knowledge e.g. asbestos is no longer allowed to be used in the built environment; fluctuation in demand; inflation of currency; etc.
In addition to the aforesaid list of factors that cause obsolescence a new driver which is being increasingly realised is climate change (See Section 2 for details). By 2050s the UK is expected to experience: increase in average summer mean temperatures (predicted to rise by up to 3.5oC) and frequency of heat-waves / very hot days; and increases in winter precipitation (of up to 20%) and possibly more frequent severe storms (Hulme et. al., 2002). 70% of UK buildings that will exist in 2050 have already been built. Due to climate change factors (examples of which are indicated above) these existing built assets of the UK are already suffering and will further increasingly suffer from various types of obsolescence (Butt, et. al., 2010a; 2010b). Thus, if sustainable built environment is to accommodate climate change and the investment in these buildings (which was approximately £129 billions in 2007 in the UK alone (UK Status online, 2007)) is to be protected, action needs to be taken now to assess and reduce likely obsolescence of the existing UK built environment; and plan adaptation and mitigation interventions, that continue to support the quality of life and well-being of UK citizens. Failure to act now will mean that the costs of tackling climate change associated obsolescence in future will be much higher (CBI, 2007). The situation with other countries around the globe is not dissimilar, although there may be some variation in nature and quantity of climate change, and the way climate change impacts manifest themselves in relation to the resources and governance of a given country. Thus, managing the sustainability of the existing built environment against obsolescence is of paramount importance to preserve our built assets from local through sub-regional, regional, provincial, national, continental to international and global level.
2 Climate Change Induced Obsolescence Irrespective of whether an obsolescence is internal or external and financial or functional, if a given obsolescence is due to impacts of climate change it is referred to as Climate Change Induced Obsolescence by the authors. The climate change associated obsolescence can be direct or indirect as described below:
2.1 Directly Induced Climate Change Obsolescence
Obsolescence that is caused by the direct impact of climate change factors is termed directly induced climate change obsolescence. For instance:
• Current air conditioning systems in our built environment may not be as effective due to global warming / heat-waves, which are a result of climate change. Thus global warming / heat-waves may bring about obsolescence in a given building’s air conditioning system as a direct impact. These heat-waves can also have direct adverse effects on the structure or fabric of buildings.
• Due to ever higher levels of greenhouse gas emissions in the atmosphere, the interaction of poor air quality with the facade of a given building can induce obsolescence by reducing the refurbishment cycle of the building facade.
• Similarly, due to water level rise as a result of climate change, estimated flood levels are rising. This implies that the current height of electrical cables, power points and appliances above the ground in a given built environment scenario may no longer be sufficient to cope with the estimated higher flood levels, thereby becoming obsolete, should such flooding happen.
2.2 Indirectly Induced Climate Change Obsolescence
Obsolescence that results from the impact of climate change factors in an indirect manner is referred to as indirectly induced climate change obsolescence. For example:
• Irrespective of the degree, one of the reasons for climate change acceleration is anthropogenic activity such as greenhouse gas (GHG) emissions, which include carbon dioxide, concentration levels of which in the global atmosphere are higher than ever before. This has contributed to shaping environmental legislation such as the European Union (EU) Directive on the Energy Performance of Buildings (2002/91/EC) (EC, 2010; EU, 2002); the EU Climate and Energy objectives; and the legally binding carbon reduction targets set up by the Climate Change Act 2008, 2010 (DECC, 2010a; 2010b). Such environmental legislation has begun to cause indirectly induced climate change obsolescence in existing buildings, for they are not able to meet the aforesaid environmental requirements as they stand today.
• Similarly, the advent of Carbon Capture and Storage (CCS) technology, in line with carbon cut demands and targets, is on the verge of introducing a substantial amount of obsolescence to existing fossil fuel power plants operating without CCS. This is yet another case of indirectly induced climate change obsolescence.
3 Energy-Associated Obsolescence in the Built Environment In the built environment energy is generated, distributed and consumed in different amounts and various direct and indirect ways. The built environment without energy is just not possible to function whether it is lighting of or heating in a building, energy consumption in transport means, or even energy generation and distribution infrastructures. On the other hand, among various other factors, energy-related obsolescence also corresponds to a building’s ability to benefit from improvements in energy efficiency (technical and operational) and the provision of low carbon energy solutions. While significant improvements have been made over the years to the energy efficiency of building fabric, demand for power has increased by 24% since 1990 and is predicted to grow by 53% by 2030 ((DTI 2010, BIFM, 2007). If the UK government are to have any chance of achieving their 80% reduction in CO2 emissions by 2050 as legally bound by the Climate
Change Act 2008, 2010 (DECC, 2010a; 2010b) then the impact of energy-related obsolescence needs to be factored in to energy generation, distribution and utilization policy. Given that most of the current built environment will exist in many decades to come, and considering demands that climate change will place on buildings mitigation and adaptation, the challenge will be to find non-fabric ways of addressing energy obsolescence, through new approaches to integrated building services systems (e.g. heating, cooling, lighting, information technology, etc.) and business operations (e.g. remote working, hot desking, etc.) that do not place unbearable costs of refurbishment on a building’s owner. Failure to develop an integrated approach to energy-related asset management will result in ad hoc solutions being retrofitted to existing buildings (e.g. room mounted air conditioning systems) that, whilst addressing the business imperative, doesn't address the wider climate change agenda. A review of literature to date (e.g. Allehaux and Tessier, 2002; Jones and Sharp, 2007; Acclimatise, 2009) reveals a lack of knowledge, models, and holistic approaches towards integrated and intelligent asset management against energy-related obsolescence. (Butt et. al., 2010a; 2010b; Kiaie, 2010). This paper presents a conceptual but holistic framework of intelligent decision system to support management of energy-related obsolescence. The framework categorises wide ranging scenarios of energy and built environments into appropriate groups, and presents stages of the management of energy-related obsolescence in a sequential, logical and algorithmic order.
4 Development of the Holistic Framework of the Intelligent Decision System This section of the paper presents the development of the holistic framework of the intelligent decision system (which is shown in Figure 1) for assessment and management of energy-related obsolescence in built environments. All the items in the framework are shadowed under technical, non-technical, physical and / or nonphysical aspects. The contributing proportions of these four aspects for items in the framework will vary from scenario to scenario of built environments, depending upon a number of characteristics such as nature, size, scope, and type of the built environment scenario under consideration. This is further explained in the next paragraph with examples. Similarly, the framework encapsulates dimensions of sustainability i.e. social, economic, and environmental. This becomes prominently more evident at the cost-benefit analysis stage of the framework (Section 4.2.1). Examples of technical aspects are heating systems; limitations of technologies (such as non-energy saver i.e. conventional lighting bulbs); etc. Whereas, the behaviour of occupants and maintenance staff of a commercial building; energy using patterns of dwellers of a house; etc. are examples of non-technical aspects. Examples of physical aspects are fabric of buildings; facade of buildings; furniture; etc. Whereas for non-physical the following can be taken as examples: energy; ventilation; maintenance or refurbishment schedule; etc. Among various items in the holistic decision framework (Figure 1), some would be physical, some
non-physical, some technical, some non-technical, and some could be various combinations of any of these four aspects. This will depend on various characteristics specific to a given built environment scenario under consideration. For instance, natural lighting is non-physical but may have aspects that are associated with physical design of the building or non-physical phenomenon such as summer sun or winter sun. Similarly, environmental legislation (e.g. the Climate Change Act 2008; 2010 (DECC, 2010a; 2010b; HM, 2010)) regarding carbon cuts is a non-technical entity but may have technical aspects integrated when carbon cut technologies (e.g. carbon capture and storage) are employed to a fossil-fuel power generation plant. Also, the heating system of a building is a physical commodity, but energy consumption efficiency in terms of users’ behaviour is non-physical aspect and the employed heating technology is a technical matter. Management systems (such as maintenance schedule; environmental management system e.g. ISO 14000; quality management system e.g. ISO 9000, etc.) are non-physical matters but may have technical as well as non-technical aspects associated with them.
Fig. 1 Framework of a holistic intelligent decision system for energy-related obsolescence management
Energy-associated obsolescence management can be described as a process of analysis, evaluation and control of obsolescence that is induced or is likely to be induced in energy-related systems of a given built environment scenario. Figure 1 depicts the holistic conceptual framework as a basis of the intelligent decision support system of energy-related obsolescence management. The framework includes all the three phases of the ‘pipeline’ of energy in the built environment i.e. generation, distribution and consumption. That is, the framework can be applied to
any of the three phases of energy in the built environment. This is divided into two main parts i.e. Obsolescence Assessment (OA) and Obsolescence Reduction (OR). The output of OA is input to OR and this is how the former provides foundation for the latter. Therefore, the more robust and stronger the foundation yielded from OA, the more effective the OR is likely to be.
4.1 Obsolescence Assessment (OA) OA consists of two sub-parts i.e. baseline study and, Identification and categorisation. In the baseline study section of the framework an obsolescence assessor is to gather relevant information by carrying out a desk study of various documents such as drawings, engineering design, historic data on various aspects e.g. flooding, etc. This information can be reinforced by paying investigative visits to the site and gathering anecdotes. The baseline study can also play a vital role of screening before scoping is carried out in a later stage of the framework. Based on the information collected, the assessor can identify and categorise various items under a set of two headings i.e. built environment constituents (BEC) and obsolescence nature (Figure 1). 4.1.1 Built Environment Constituents (BEC) In the BEC module, scope or boundaries can be established of a given built environment. For instance, whether it is a building or infrastructure; what is the type of building (e.g. commercial, domestic, industrial, etc.); what is the type of infrastructure e.g. transport (and if transport then road network or buses in specific; railway networks or trains themselves); energy generation or distribution network; etc. At this point, it can also be established whether it is an existing or a future development. The stage of a development can also be further broken down along the following steps: planning, construction, in-operation and / or decommissioning. Sometimes it may be the extension of an already existing building which is being planned, constructed or decommissioned. Therefore, it is better to identify whether planning, construction, in-operation and / or decommissioning are in full or in part. After this, the assessor can break down the given built environment under consideration into all constituting components. These components can be further categorised into energy-related and non-energy-related groups of commodities. The energy-related group is divided further into energy utilizing items (such as heating, cooling, lighting, etc.) and energy embodied constituents which would be almost all constituents of a given built environment scenario. The only items (if any) which would not fall in this group would basically be in the nonenergy-related category (Figure 1). These identified and categorized components can then be looked at from other perspectives for further investigation such as operational, non-operational, physical, non-physical, technical, non-technical, socio-technical, managerial, non-managerial, fabric, non-fabric, etc.
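For illustration only, the identification and categorisation performed in the BEC module can be recorded as a simple taxonomy plus a tagging helper. The category names below paraphrase Section 4.1.1; the component example and the function are hypothetical and not part of the authors' framework.

```python
# Hypothetical, minimal encoding of the BEC categorisation of Section 4.1.1.
BEC_TAXONOMY = {
    "built environment": ["building", "infrastructure"],
    "building type": ["commercial", "domestic", "industrial"],
    "infrastructure type": ["transport", "energy generation", "energy distribution"],
    "development stage": ["planning", "construction", "in-operation", "decommissioning"],
    "stage coverage": ["in full", "in part"],
    "energy relation": ["energy utilizing", "energy embodied", "non-energy-related"],
}

def categorise(component: str, **choices: str) -> dict:
    """Attach BEC categories to a component, rejecting values outside the taxonomy."""
    for key, value in choices.items():
        allowed = BEC_TAXONOMY.get(key.replace("_", " "), [])
        if value not in allowed:
            raise ValueError(f"{value!r} is not a recognised {key.replace('_', ' ')!r} category")
    return {"component": component, **choices}

# Hypothetical example: an energy-utilizing component of an existing commercial building.
record = categorise("office block heating system",
                    built_environment="building",
                    building_type="commercial",
                    development_stage="in-operation",
                    energy_relation="energy utilizing")
print(record)
```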
4.1.2 Obsolescence Nature The next main stage in the framework is identification and categorisation of obsolescence nature for the specified components (Figure 1). The nature of the obsolescence can be characterised as follows: Financial obsolescence means loss in value where as functional obsolescence is loss of usefulness, effectiveness, efficiency or productivity. The financial obsolescence is also termed as social or economic obsolescence, and functional obsolescence as technical obsolescence. (Cooper, 2004; Montgomery Law, 2010; Leeper Appraisal Services, 2010; Richmond Virginia Real Estate, 2003; Nky Condo Rentals, 2010: SMA Financing, 2009). Irrespective of whether obsolescence is in value or function or both, internal obsolescence in a building component or built asset is due to factors that exist within the component or built asset. For instance, general wear and tear, fatigue, corrosion, oxidation, evaporation, rusting, leaking of gas / water or any other fluid like coolant, breaking, age, etc. Where as external obsolescence is temporary or permanent impairment in value or usefulness of a built asset due to factors outside the system such as change in existing or advent of a new environmental legislation; social forces / pressure groups; arrival of new technology; improvement or enhancement of knowledge; fluctuation in demand; inflation of currency; etc. (Landmark Properties, 2009; Salt Lake County, 2004; ESD Appraisal Services, 2010; Drew Mortgage, 2006). Permanent obsolescence is irreversible, for instance, materials of buildings that contain asbestos have become permanently obsolete due to its health impacts. On the contrary, factors which are not permanent such as temporary civil unrest in a society, loss of power for days, flooding, etc. can cause a temporary obsolescence. Furthermore, irrespective of whether obsolescence is internal or external and financial or functional, obsolescence can also be categorised in climate change context as explained above in Section 2.
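As a purely illustrative aid (not part of the authors' conceptual framework), the obsolescence-nature categories described above, together with the climate change categories of Section 2, can be attached to each identified component as a small record:

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class ObsolescenceNature:
    """One possible encoding of the categories in Section 4.1.2 and Section 2."""
    component: str
    kind: Literal["financial", "functional", "both"]        # loss in value vs. loss of usefulness
    origin: Literal["internal", "external"]                  # cause inside or outside the asset
    duration: Literal["permanent", "temporary"]
    climate_change: Literal["direct", "indirect", "none"]    # Section 2 categories

# Hypothetical example, one possible reading of the air conditioning case in
# Section 2.1: usefulness reduced by an external climate factor (heat-waves).
example = ObsolescenceNature(
    component="air conditioning system",
    kind="functional",
    origin="external",
    duration="temporary",
    climate_change="direct",
)
print(example)
```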
4.2 Obsolescence Reduction (OR) As stated earlier, the second main part of the sustainable obsolescence management is OR. Although the whole of the obsolescence management framework may be iterated a number of times depending on various characteristics of a given built environment scenario, this part of the holistic framework certainly needs repeating a number of times until most sustainable and yet realistically possible solution or set of solutions have been derived and implemented. The Obsolescence Assessment (OA) is predominantly around gathering and categorisation of data and information of the given built environment, which does not need much iteration. However, as for the OR, the main reason for repeating the OR more frequently, is that various modules in this part are mutually dependent on each other for mutual information transfer. This is due to the fact that information processed in various OR modules have to be delivered and received between the modules backwards and forwards a number of times. For instance, the information between costbenefit analysis and stakeholder participation modules has to be used backwards and forwards due to various sustainability aspects as well as variation in interests among different stake holders (Figure 2). This iteration aspect becomes clearer in
the discussion below where various modules of the OR part are described in more detail. This part has been divided into two sub-parts, i.e. Obsolescence Evaluation (OE) and Obsolescence Control (OC). Details on them are described below:
4.2.1 Obsolescence Evaluation (OE)
The first unit in the OE section is the ‘selection of component(s)’ module. Based on the information which would have been collated earlier in the OA part of the OM, in this module the components of the built environment scenario which are the point of interest in terms of obsolescence assessment and management can be identified and categorised. In order to assist prioritisation and selection of the components, this module categorises the identified components into three groups based on the rationale around various sustainability aspects. These three groups are:
1. The components which have become obsolescent;
2. The components which are nearing end of life; and
3. The components which have sufficient life.
There is a module allocated for establishing positive and negative impacts of both taking action and not taking action to fix the obsolescence problems, particularly those of from the first two groups above. This can help to further prioritise on which components need more and quick attention as opposed to others. Following this, there is another module in the framework where all possible options to control obsolescence can be identified and their characteristics (both advantages and limitations) can be established. These options could be technical (e.g. some innovative technology); non-technical (such as a new managerial system to control behaviour of energy consumption); or even combinations of technical and non-technical facets with varying proportions. Keeping this in view, there are three categories introduced in the decision support framework. These are: 1. Technological 2. Behaviour / People, and 3. Socio-technical. The first category of options is further classified into fabric and non-fabric technologies. The technical aspect of the socio-technical category could be fabric or non-fabric related, as Figure 1 shows. The information thus established on various options can later also feed into the ‘cost-benefit analysis’ module, which is divided into three sub-modules to address the three principal dimensions of sustainable development philosophy. These three sub-modules are: Social, Environment and Economic. Each of the social and environment sub-modules cover sustainability aspects in two categories, which are financial costs and nonfinancial costs. For the social sub-module, examples of financial costs are fine that a company may face due to not complying with some legislation requirements such as health and safety regulations; compensation which might have to be paid to the relevant stakeholder e.g. an employee who suffers a health problem or an accident at work; compensation might have to be paid to an external customer too; etc. Where as adverse impact on company image, quality of service or product of
the company, poor social corporate responsibility, are examples of non-financial aspects. Similarly for the environment sub-module, lets consider a case in which some spillage of a toxic substance takes place due to some improper or obsolete component. This can cause financial costs such as the cost to fix the environmental damage and compensation to the victims of the environmental damage. Whereas the bad publicity and extra-ordinarily high pressures from the government bodies (e.g. the Environment Agency) as well as various voluntary environmental pressure groups are examples of non-financial environmental costs. For the economic sub-module there are three categories which are: 1. 2. 3.
capital cost, running cost, and payback time.
In the first two categories above, costs of refitting (i.e. maintenance) and / or retrofitting (i.e. refurbishment) of the selected components are to be analysed. The payback time will also play a vital role in decision making. The financial costs of environmental and social aspects can also be tapped into the economics submodule to draw a bigger picture of total costs. Thus, economic sub-module is shown connected with financial costs of the social and environmental sub-modules (Figure 1). The cost-benefit analysis can be reinforced by consulting diverse spectrum of (internal and external) stakeholders ranging from technical to nontechnical. Each and every stakeholder needs not to be consulted for each and every obsolescence scenario but only appropriate and relevant stakeholders depending on characteristics of the scenario. Thus, in the ‘stakeholder participation’ module of the framework, appropriate stakeholders should also be identified prior to consultation. Information from the ‘other evaluations’ module regarding e.g. feasibility report, the company’s policy and mission statement, etc., can also be tapped into the cost-benefit analysis module to render it more holistic. Eventually, in the decision making module, a decision is made in terms selection of an option or a set of options to reduce impacts of the obsolescence in the built environment scenario. 4.2.2 Obsolescence Control (OC) In the OC section of the obsolescence management framework, the selected obsolescence control option(s) is/are designed and planned. If any unexpected implications, these can be reconsulted with the appropriate stakeholders and rechecked via the cost-benefit analysis module. If any problems, another option or set of options can be selected and then designed and planned again. Such iterations can continue till a satisfactory option or set of options has/have been designed and planned, following which the option(s) can be implemented. While implementing monitoring needs to take place for if there are any discrepancies. Frequency of monitoring can also be incorporated into the system at the design and planning stages earlier. If any discrepancies observed, corrective actions need to be taken to control the implementation process. Such corrective actions against discrepancies can also be set during the design and planning stage.
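The sequential, iterative character of the OR part (OE followed by OC, revisited until a satisfactory option or set of options has been designed, planned and implemented) can be summarised in a short outline. This is an interpretation for illustration only; the stage descriptions paraphrase Sections 4.2.1 and 4.2.2 and the modules of Figure 1 rather than quote a defined procedure.

```python
# Illustrative outline of the Obsolescence Reduction (OR) part as an ordered
# sequence of modules, paraphrasing Sections 4.2.1-4.2.2.
OBSOLESCENCE_REDUCTION = {
    "Obsolescence Evaluation (OE)": [
        "select components (obsolescent / nearing end of life / sufficient life)",
        "establish positive and negative impacts of taking vs. not taking action",
        "identify control options (technological, behaviour/people, socio-technical)",
        "cost-benefit analysis (social, environmental and economic sub-modules)",
        "stakeholder participation (identify and consult relevant stakeholders only)",
        "other evaluations (e.g. feasibility report, company policy and mission statement)",
        "decision making (select an option or set of options)",
    ],
    "Obsolescence Control (OC)": [
        "design and plan the selected option(s)",
        "re-consult stakeholders / re-run cost-benefit analysis if problems arise",
        "implement the option(s)",
        "monitor at the frequency set during design and planning",
        "take the pre-set corrective actions against any discrepancies",
    ],
}

# Printing the outline reproduces the sequential order; the authors note that
# OE and OC are iterated until a satisfactory option set is reached.
for part, stages in OBSOLESCENCE_REDUCTION.items():
    print(part)
    for step, stage in enumerate(stages, start=1):
        print(f"  {step}. {stage}")
```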
5 Concluding Remarks
This paper establishes the link between energy-associated obsolescence and the various factors that cause it, ranging from conventional to contemporary factors and climate change challenges. An intelligent support system is developed in the form of a holistic and conceptual framework for sustainable assessment and management of energy-related obsolescence, which has not previously existed in the reported literature. The framework assembles and categorises all appropriate modules and sub-modules from start to end under one umbrella and places them in a logical, sequential and algorithmic order to support carrying out the obsolescence assessment and management process. The framework encapsulates wide-ranging built environment scenarios, whether fully or partly at the planning, construction, in-operation and / or decommissioning stage. The presence of energy in the built environment is considered not only at the consumption end of the ‘pipeline’ but also at the generation and distribution ends. The physical, non-physical, technical and non-technical aspects are also included. This renders the framework useful for a diverse range of stakeholders, from experts to non-technical users. This research work is a step towards making obsolescence management possible in a holistic and sustainable manner. It can also streamline current practices of obsolescence management, which are available only in a non-integrated and piecemeal fashion, and to a limited extent. The framework can attract debate and interest from both practitioners and researchers for further study and research, and can later be converted into a computer-aided system. However, the framework in its current shape can still be effectively used as a decision making tool to select an option or set of options to control obsolescence, thereby assisting in rendering our existing built environment (which will be around for many decades to come) more sustainable against obsolescence drivers ranging from conventional factors to ones as modern as climate change.
References 1. Acclimatise, Building Business Resilience to Inevitable Climate Change, Carbon Disclosure Project Report 2008, Global Oil and Gas, Acclimatise and Climate Risk Management Limited, Oxford (2009) 2. Allehaux, D., Tessier, P.: Evaluation of the functional obsolescence of building services in European office buildings. Energy and Buildings 34, 127–133 (2002) 3. BIFM (British Institute of Facilities Management), Position Paper: Energy, Executive Summary, BIFM, (January 10, 2010) 4. Butt, T.E., Giddings, B., Cooper, J.C., Umeadi, B.B.N., Jones, K.G.: Advent of Climate Change and Energy Related Obsolescence in the Built Environment. In: International Conference on Sustainability in Energy and Buildings, Brighton, UK, May 6-7 (2010a) 5. Butt, T.E., Umeadi, B.B.N., Jones, K.G.: Sustainable Development and Climate Change Induced Obsolescence in the Built Environment. In: International Sustainable Development Research Conference, Hong Kong, China, May 30-June 1 (2010b)
6. CBI (Confederation of British Industry), Climate Change: Everyone’s business, CBI (2007) 7. Cooper, T.: Inadequate Life? Evidence of Consumer Attitudes to Product Obosolescence. Journal of Consumer Policy 27, 421–449 (2004) 8. DECC (Department of Energy and Climate Change), Climate Change Act 2008, http://www.decc.gov.uk/en/content/cms/legislation/ cc_act_08/cc_act_08.aspx, (Viewed April 2010b) 9. DECC (Department of Energy and Climate Change), Legislation, http://www.decc.gov.uk/en/content/cms/legislation/ legislation.aspx, (Viewed April 2010a) 10. Drew Mortgate Inc. Online Mortgage Dictionary (2006), https://www.drewmortgage.com/abargoot/dictionary/ dictionary_e.html (Viewed March 2010) 11. DTI (Department of Trade and Industry), Energy – Its impacts on environment and Society – Chapter 3, http://www.dti.gov.uk/files/file20300.pdf (Viewed May 2010) 12. EC [European Commission – Energy (ManagEnergy)], COM 2002/91/EC: Directive on the Energy Performance of Buildings, http://www.managenergy.net/products/R210.htm, ManagEnergy (last modified March 9, 2010) 13. ESD Appraisal Services, External Obsolescence, http://www.edsappraisalservices.com/Glossary_and_Terms (Viewed March 2010) 14. EU (European Union), Directive 2009/91/EC of the European Parliament and of the Council of 16 December 2002 on the energy performance of buildings. OJ L1/65 (4-12003) (2002) 15. Gill, S., Pauleit, S., Ennos, A.R., Lindley, S.J., Handley, J.F., Gwilliam, J., UeberjahnTritta, A.: Literature review: Impacts of climate change on urban environment. CURE, University of Manchester (2004) (available on line) 16. Handley, J.F.: Adaptation strategies for climate change in the urban environment (ASCCUE), Narrative report for GR/S19233/01, http://www.sed.manchester.ac.uk/research/cure/downloads/ asccue-epsrc-report.pdf (Viewed March 2010) 17. HM Government, Climate Change: Taking Action (Delivering the Low Carbon Transition Plan and preparing for changing climate), Crown Copyright (2010) 18. Hulme, et al.: Climate change scenarios for the United Kingdom: The UKCIP 2002 Scientific Report, Tyndall Centre for Climate Change Research, School of Environmental Sciences, University of East Anglia, p. 120 (2002) 19. Jones, K.G., Sharp, M.: A new performance-based process model for built asset maintenance. Facilities 25(13/14), 525–535 (2007) 20. Kiaie, M., Umeadi, B.B.N., Butt, T.E., Jones, K.G.: Challenges to Sustainable Development: how facility managers can apply intelligent monitoring to maintenance. In: Conference on Sustainable Development and Scientific Research, Integration and Knowledge-Oriented Pars, Pars Special Economic Energy Zone, Assaluyeh, Iran, January 6-8 (2010) 21. Landmark Properties – Commercial Real Estate, Real Estate Dictionary, http://www.allaboutskyscrapers.com/dictionary/e.htm (Viewed March 2010)
22. Leeper Appraisal Services, California Appraisal / Appraisers, http://www.leeperappraisal.com/appraiser_jargon.htm (Viewed March 2010) 23. Montgomery Law, Family Law Glossary, http://www.montylaw.com/family-law-glossaryo.php (Viewed March 2010) 24. Nky Condo Rentals – Rodney Gillum (2010), http://www.nkycondorentals.com/index.cfm/ fuseaction/terms.list/letter/O/contentid/ 511AF257-5436-4595-81BBAA7704C1AC40 (Viewed March 2010) 25. Richmond Virginia Real Estate, Real Estate Dictionary (2003), http://www.therichmondsite.com/Blogs/Dictionary_O.html (Viewed March 2010) 26. Salt Lake County, Tax Administration (2004), http://www.taxadmin.slco.org/boeGlossary/boeGlossaryE.html (Viewed March 2010) 27. SMA Financing, Real Estate Glossary (2009), http://www.smafinancing.com/glossary.htm (Viewed March 2010) 28. UK Status online, Gross Fixed Capital Formation at Chained Volume Measure (2007)
Author Index
Abderrahim, Siam 409 Ahmed, Sabbir 789 Arai, Yuta 557 Arbaiy, Nureize 103 Asakura, Koichi 367 Bae, Hyerim 469, 519 Bae, Joonsoo 629 Balachandran, Bala M. 429, 529 Baohui, Ji 683 Barbu, Marian 155 Bhandari, Gokul 743 ´ Bog´ardi-M´esz¨oly, Agnes 65 Bormane, D.S. 809 Botvich, Dmitri 821, 873 Botzheim, J´anos 165, 273 Brodsky, Alexander 223 Bunciu, Elena M. 95 Butt, T.E. 907 Campos, Ana 853 Caraman, Sergiu 155 Chang, Betty 399, 647 Chang, Jieh-Ren 399, 647 Chang-jun, Han 671 Chang, Shih-Yu 567 Cheng, Yu-Kuang 389 Chen, You-Shyang 343, 389, 449, 479 Chiang, Bo-Yu 567 Chiba, Takuya 367 Chiros¸c˘a, Alina 155 Christodoulou, Spyros 113 Chuang, Huan-Ming 501, 605
Dempe, Stephan 255, 265 Dominish, Derek 863 Dumitras¸cu, George 155 Ekel, Petr Ya 459 Elmisery, Ahmed M.
821, 873
F¨oldesi, P´eter 65, 165, 273 Fukagawa, Daiji 719 Gawinowski, Grzegorz 617, 843 Gobbin, R. 429, 529 Grivokostopoulou, Foteini 135 Hai-cheng, Li 699 Hanaue, Koichi 547 Hashizume, Ayako 329 Hatano, Kenji 707, 719 Hatzilygeroudis, Ioannis 135 Henryk, Piech 617, 843 Hsieh, Ming-Yuan 343, 439, 449, 597 Hu, Xiangpei 13, 37 Huang, Chi-Yo 123, 355, 567 Huang, Minfang 13 Huang, Xu 489 Iino, Takashi 511, 537 Ikeda, Kento 719 Imazu, Yoshihiro 799 Imoto, Seiya 799 Irimia, Danut C. 283 Itoh, Akihiro 213 Itoi, Ryota 589, 637 Iyetomi, Hiroshi 511, 537, 557
Author Index
Jain, Dreama 753 Jentzsch, Ric 897 Jheng, Yow-Hao 399 Jiang, Yiping 145 Jiang, Zhongqiang 37 Jian-hui, Liu 671 Jones, K.G. 907
McDonald, Tom 743 Ming, Huang 683 Miwa, Kanna 213 Miyano, Satoru 799 Mohammadian, Masoud
Kalashnikov, Vyacheslav V. 255, 265 Kalashnykova, Nataliya I. 255, 265 Kanamaru, Masanori 547 Kang, Young Ki 629 Karacapilidis, Nikos 113 Katayama, Kotoe 799 Kent, Robert D. 731, 743, 789 Kido, Takemasa 637 Kinoshita, Eizo 213, 247, 319 Ko, Yu-Chien 23 Kobayashi, Takashi 719 Kobti, Ziad 743, 753, 789 Kojiri, Tomoko 577 Kolekar, Sucheta V. 809 Kovas, Konstantinos 135 Kung, Chaang-Yung 389, 479, 597 Kung, Chang-Yung 439
Ohya, Takao 247 Okunishi, Kouichi 557 Ont, O. 763 Ozaki, Toshimasa 213
Lai, Chien-Jung 343, 389, 439, 479 Lai, Kin Keung 75 Larkman, Deane 897 Lee, Huey-Ming 185 Leitch, Kellie 763 Li, Jianming 37 Lim, Sungmook 377, 519 Lin, Chia-Li 295 Lin, Chien-Ku 501, 605 Lin, Chyuan-Yuh 501, 605 Lin, Lily 185 Lin, Yang-Cheng 85, 833 Lin, Yi-Fan 355 Lin, Yu-Hua 439 Lo, Chi-Hsiang 399 Lo, Mei-Chen 47, 175, 213 Luo, Juan 223 Lv, Renping 37 Ma, Chao 3 Ma, Min-Yuan 85 Ma, Peng 307 Matsuura, Keiko 799 McCarrell, Jason 743
Neves-Silva, Rui
529, 897
853
Park, Jaehun 519 Park, Jennifer J. 763 Parreiras, Roberta O. 459 Pei, Hung-Mei 647 Peng, Li 691 Pérez-Valdés, Gerardo A. 255, 265 Perikos, Isidoros 135 Petre, Emil 191, 201 Poboroniuc, Marian S. 283 Poigné, Axel 113 Popescu, Dan 191 Popescu, Dorin 283 Popovici, Ioana Florina 237 Preney, Paul D. 743 Pugovkin, Aleksey V. 47 Qian, Wang 661
Ramdane, Maamri 409 Roman, Monica 191 Rövid, András 65 Ruan, Junhu 3 Rüping, Stefan 113 Sajjad, Farhan 743 Sanjeevi, Sriram G. 809 Sato, Yuji 57 Selişteanu, Dan 201, 283 Şendrescu, Dorin 191, 201 Serrano, Martín 873 Shah, Pritam Gajkumar 489 Shang, Hongyan 3 Sharma, Dharmendra 429, 489 Shell, Jeremy 763 Shih, Chi-Huang 885 Shiizuka, Hisao 329
Shimizu, Nobuo 769 Sioutis, Christos 863 Snowdon, Anne W. 731, 743, 753, 763 Solehati, Nita 629 Sugiura, Shin 319 Süle, Edit 273 Tamura, Koya 707 Tanaka-Yamawaki, Mieko 589, 637 Terada, Yoshikazu 779 Tokunaga, Hideaki 799 Tsai, Chien-Tzu 47 Tseng, Chun-Chieh 567 Tzagarakis, Manolis 113 Tzeng, Gwo-Hshiung 23, 47, 123, 175, 213, 355, 567 Tzeng, Wei-Chang 123 Wang, Haiyan 307 Wang, Xuping 3 Watada, Junzo 103 Watanabe, Kenji 799 Watanabe, Toyohide 367, 547, 577
Watanabe, Yuki 577 Wei, Chun-Chun 85, 833 Wu, Wen-Ming 389, 449, 479, 597 Wu, Ya-Ling 343, 439, 449, 597 Xu, Liang 683 Xu, Yeong-Yuh 885 Yadohisa, Hiroshi 707, 779 Yahya, Bernardo N. 469 Yamada, Takayoshi 419 Yamaguchi, Rui 799 Yamamoto, Hidehiko 419 Yan, Zhou 691 Yang, Min-Hsien 47 Yang, Xin 589 Yoshikawa, Takeo 511 Yuan, Ming-Cheng 123 Zahid, Atif Hasan 731 Zhang, Lihua 13 Zhao, Lindu 145 Zhi, Qi 699 Zhou, Shifei 75
Index
acidogenesis-methanization 95 ad-hoc network 367 adapted queueing algorithm 65 adaptive control 201 adaptive learning styles 809 additive demand 307 agent-based modeling 237 agents 409 AHP 175, 213, 319 anaerobic process 95 analytical network process 389, 479 analytic hierarchy process 175 analytic network process 47 ANP 47, 213, 319, 390, 479 assemblage 396 assembly line 419 attribute coding 399 background music 439 Bayesian network 691 bee algorithm 683 benchmarking 519 bilevel programming model 255, 265 bioengineering 283 biometric identification 459 budget allocation 57 building 853 building facilities 647 built environment 907 business management 47 business process 469 CAI 597 calculating similarity 720
CCM 319 cell production 419 climate change 908 cloud based banking services 123 cloud computing 123 clustering 821 collaborative filtering model 661 combinational disruption 3 communication 529 components 409 compute unified device architecture 37 conceptual framework 908 Condorcet method 449 consignment contract 307 consistency estimation 617 continuous auditing 731 correlation 589, 637 CUDA 37
data-intensive collaboration 113 data envelopment analysis 519 DCR 175 decentralized supply chain 307 decision analysis 213 decision-based neural network 885 decision-making 237 decision-making trial and evaluation laboratory 47, 295 decision method 13 decision support 705 decision support systems 743 DEMATEL 47, 295 DEMATEL based network process 355 design change requirement 175
difficulty estimation 135 dimension reduction 255, 265 disaster area 367 disruption management 3, 13 distance learning 459 distributed computing middleware 863 distribution-valued data 779, 799 distribution scheduling 145 DNP 355 dominance-based rough set approach 23 dominant AHP 247, 319 domination 617 DRSA 23 dynamic fuzzy control 489 e-commerce 429 e-learning 809 eclipse 429, 529 ECOAccountancy 389 eigenvalue distribution 589 eigenvalues 637 elaboration likelihood model 605 EMD 75 emergency response 145 emotion 753 emotional behavior 237 energy efficiency 853 English writing 597 evidence fusion theory 699 exponential function 145 factor contribution analysis 843, 897 failure diagnosis 185, 691 feature selection 449 finite size effect 557 first order logic 135 fractals 237 fragrance form design 85 functional data analysis 769 fuzzy control 155 fuzzy inference 185 fuzzy preference 459, 897 fuzzy random variable 103 GA 419, 568 genetic algorithm 568, 683 global 459, 466 global investment 343 gold market forecasting 75 GPU 37
Harker’s method 247 Harker method 213 health care 731 health informatics 743 health monitoring 873 health information technology 763 heuristic 469 hierarchical clustering 769 HIT 763 hospital medication administration 753 human and machine reasoning 117 hybrid uncertainty 103 hydro-electrical simulation system 691 hyperlink analysis 707 ICT enabled personal health 873 imperfect matrix 213 improved bee algorithm 683 improvement strategy 295 information sharing 367 information system continuance 605 information technology 123 intelligent agent 831 intelligent decision guidance system 223 intelligent decision making 871 intelligent decision technology 908 inter-organizational business process 629 interaction 529 intermodal freight transport 13 interpretation 165 interval-pitch conversion 547 interval observer 95 investment in safety measures 57 IPTV networks 821 IS continuance 501 IS success model 501, 605 iterative majorization 779 JADE 429, 529
K-L transformation 459 K-means clustering 519 Kano's quality model 165 kansei/affective engineering 329 kansei communication 329 kansei engineering 85 kansei information 103 kansei value creation 329
lactic acid production 201 language model 707 LCG 589 lead user method 355 least squared support vector machine 568 linear programming 37 localization algorithm 671 logarithmic least square method 247 log mining analysis 809 loss aversion 165, 273 LS-SVM 568 macroeconomic model 343 market structure 511 MAS 409 mathematical programming 23 maximizing marginal loss saving 145 MCDM 47, 123, 175, 356 mediated activity 529 membership grade 185 methodology 459 minimum bounding rectangle 367 model predictive control 191 multi-period inventory model 377 MP 23 MT 589 multiple-choice cloze questions 577 multi-agent 821 multi-agent system 908 multi-attribute decision making 103 multi-issue negotiation 429 multiple-instance learning 885 multiple criteria decision making 47, 123, 175, 356 multiscale community analysis 537 music composition 547 music tempo 439 national competitiveness 23 natural language formalization 135 nested partition method 3 neural networks 191, 201, 283 neuroprosthesis control 283 nonlinear systems 191 nonverbal communication 329 notation-support method 547 obsolescence 871 online shopping 292 optimal channel profit 302
optimization 388 organization 396 organizational performance measurement 459 pairwise comparison matrix 247 parallel algorithm 37 Pareto-efficient 429 particle swarm optimization 399 percent complete 57 performance arts 647 performance evaluation 175 personalised health systems 873 personal medicine pervasive computing on ehealth 873 PGR 449 piecewise surface regression 223 playing method 439 predictive data mining 789 preference 617 primary health care 743 principal component 637 privacy 821 process chain 65 product design 833 production-pricing decision 307 production network 537 production system 419 profit growth rate 449 proximity 469 PSO 399 psychological attachment 501 query support 743
random correlation matrix 557 randomness 589 ranking lists 617 real-time assurance 731 recommender system 821 reference process model 469 rescue strategies 3 retrofit scenario 853 RMT-PCA 637 RMT-test 589 robotics 283 rough set theory 399, 449, 647 RSSI 671 RST 399, 449
SCM 479 scoring functions 429 security system 489 SEM 356 semantics 743 semiconductor 175 senior people 329 service oriented architecture 731 service performance 295 short term wind prediction 568 SIA-NRM 295 smart space 671 software agents 429, 529 software test 843, 897 stock market 637 structural analysis 720 structural equation modeling 356 subjectivity 529 supply chain 273 supply chain management 479 support vector machines 568 sustainability 908 sustainable development 908 SVM 568 symbolic data analysis 769, 779 synergy 113 tablet PC 356 tablet personal computer 356 TAM 356 technology acceptance model 356
technology assessment 47 time-space network 145 time utility 273 TOEFL 577 TOEIC 577 transaction management 629 tree structured data 720 trend 637 VIKOR 47 virtual factory 419 virtual university 459 visual analogue scale 779 VlseKriterijumska optimizacija i kompromisno resenje 47 VRPTW 3 wafer fab 175 wastewater treatment bioprocesses 191 wastewater treatment process 155 water turbine model 699 web 459, 597, 707, 809 web-based instruction 597 web search 707 web usage mining 809 weighted link 511 wind power 568 wind speed forecasting 568 wireless networks 489 XML 720