Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Alfred Kobsa University of California, Irvine, CA, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen TU Dortmund University, Germany Madhu Sudan Microsoft Research, Cambridge, MA, USA Demetri Terzopoulos University of California, Los Angeles, CA, USA Doug Tygar University of California, Berkeley, CA, USA Gerhard Weikum Max Planck Institute for Informatics, Saarbruecken, Germany
6677
Derong Liu Huaguang Zhang Marios Polycarpou Cesare Alippi Haibo He (Eds.)
Advances in Neural Networks – ISNN 2011 8th International Symposium on Neural Networks, ISNN 2011 Guilin, China, May 29 – June 1, 2011 Proceedings, Part III
Volume Editors Derong Liu Chinese Academy of Sciences, Institute of Automation Key Laboratory of Complex Systems and Intelligence Science Beijing 100190, China E-mail:
[email protected] Huaguang Zhang Northeastern University, College of Information Science and Engineering Shenyang, Liaoning 110004, China E-mail:
[email protected] Marios Polycarpou University of Cyprus, Dept. of Electrical and Computer Engineering 75 Kallipoleos Avenue, 1678 Nicosia, Cyprus E-mail:
[email protected] Cesare Alippi Politecnico di Milano, Dip. di Elettronica e Informazione Piazza L. da Vinci 32, 20133 Milano, Italy E-mail:
[email protected] Haibo He University of Rhode Island Dept. of Electrical, Computer and Biomedical Engineering Kingston, RI 02881, USA E-mail:
[email protected] ISSN 0302-9743 e-ISSN 1611-3349 ISBN 978-3-642-21110-2 e-ISBN 978-3-642-21111-9 DOI 10.1007/978-3-642-21111-9 Springer Heidelberg Dordrecht London New York Library of Congress Control Number: 2011926887 CR Subject Classification (1998): F.1, F.2, D.1, G.2, I.2, C.2, I.4-5, J.1-4 LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues © Springer-Verlag Berlin Heidelberg 2011 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
Preface
ISNN 2011 – the 8th International Symposium on Neural Networks – was held in Guilin, China, as a sequel of ISNN 2004 (Dalian), ISNN 2005 (Chongqing), ISNN 2006 (Chengdu), ISNN 2007 (Nanjing), ISNN 2008 (Beijing), ISNN 2009 (Wuhan), and ISNN 2010 (Shanghai). ISNN has now become a well-established conference series on neural networks in the region and around the world, with growing popularity and increasing quality. Guilin is regarded as the most picturesque city in China. All participants of ISNN 2011 had a technically rewarding experience as well as memorable experiences in this great city. ISNN 2011 aimed to provide a high-level international forum for scientists, engineers, and educators to present the state of the art of neural network research and applications in diverse fields. The symposium featured plenary lectures given by worldwide renowned scholars, regular sessions with broad coverage, and some special sessions focusing on popular topics. The symposium received a total of 651 submissions from 1,181 authors in 30 countries and regions across all six continents. Based on rigorous reviews by the Program Committee members and reviewers, 215 high-quality papers were selected for publication in the symposium proceedings. We would like to express our sincere gratitude to all reviewers of ISNN 2011 for the time and effort they generously gave to the symposium. We are very grateful to the National Natural Science Foundation of China, the Institute of Automation of the Chinese Academy of Sciences, the Chinese University of Hong Kong, and the University of Illinois at Chicago for their financial support. We would also like to thank the publisher, Springer, for cooperation in publishing the proceedings in the prestigious series of Lecture Notes in Computer Science. Guilin, May 2011
Derong Liu Huaguang Zhang Marios Polycarpou Cesare Alippi Haibo He
ISNN 2011 Organization
General Chairs Marios M. Polycarpou Paul J. Werbos
University of Cyprus, Cyprus National Science Foundation, USA
Advisory Committee Chairs Ruwei Dai Bo Zhang
Chinese Academy of Sciences, China Tsinghua University, China
Advisory Committee Members Hojjat Adeli Shun-ichi Amari Dimitri P. Bertsekas Amit Bhaya Tianyou Chai Guanrong Chen Andrzej Cichocki Jay Farrell Russell Eberhart David B. Fogel Walter J. Freeman Kunihiko Fukushima Marco Gilli Aike Guo Zhenya He Tom Heskes Janusz Kacprzyk Nikola Kasabov Okyay Kaynak Anthony Kuh Deyi Li Yanda Li Chin-Teng Lin Robert J. Marks II Erkki Oja Nikhil R. Pal Jose C. Pr´ıncipe
Ohio State University, USA RIKEN Brain Science Institute, Japan Massachusetts Institute of Technology, USA Federal University of Rio de Janeiro, Brazil Northeastern University, China City University of Hong Kong, Hong Kong RIKEN Brain Science Institute, Japan University of California, Riverside, USA Indiana University-Purdue University, USA Natural Selection, Inc., USA University of California-Berkeley, USA Kansai University, Japan Politecnico di Torino, Italy Chinese Academy of Sciences, China Southeast University, China Radboud University Nijmegen, The Netherlands Polish Academy of Sciences, Poland Auckland University of Technology, New Zealand Bogazici University, Turkey University of Hawaii, USA National Natural Science Foundation of China, China Tsinghua University, China National Chiao Tung University, Taiwan Baylor University, USA Helsinki University of Technology, Finland Indian Statistical Institute, India University of Florida, USA
Leszek Rutkowski Jennie Si Youxian Sun DeLiang Wang Fei-Yue Wang Shoujue Wang Zidong Wang Cheng Wu Donald Wunsch II Lei Xu Shuzi Yang Xin Yao Gary G. Yen Nanning Zheng Jacek M. Zurada
Czestochowa University of Technology, Poland Arizona State University, USA Zhejiang University, China Ohio State University, USA Chinese Academy of Sciences, China Chinese Academy of Sciences, China Brunel University, UK Tsinghua University, Beijing, China Missouri University of Science and Technology, USA The Chinese University of Hong Kong, Hong Kong Huazhong University of Science and Technology, China University of Birmingham, UK Oklahoma State University, USA Xi’An Jiaotong University, China University of Louisville, USA
Steering Committee Chair Jun Wang
Chinese University of Hong Kong, Hong Kong
Steering Committee Members Jinde Cao Shumin Fei Min Han Xiaofeng Liao Bao-Liang Lu Yi Shen Fuchun Sun Hongwei Wang Zongben Xu Zhang Yi Wen Yu
Southeast University, China Southeast University, China Dalian University of Technology, China Chongqing University, China Shanghai Jiao Tong University, China Huazhong University of Science and Technology, China Tsinghua University, China Huazhong University of Science and Technology, China Xi’An Jiaotong University, China Sichuan University, China National Polytechnic Institute, Mexico
Organizing Committee Chairs Derong Liu Huaguang Zhang
Chinese Academy of Sciences, China Northeastern University, China
Program Chairs Cesare Alippi Bhaskhar DasGupta Sanqing Hu
Politecnico di Milano, Italy University of Illinois at Chicago, USA Hangzhou Dianzi University, China
Plenary Sessions Chairs Frank L. Lewis Changyin Sun
University of Texas at Arlington, USA Southeast University, China
Special Sessions Chairs Amir Hussain Jinhu Lu Stefano Squartini Liang Zhao
University of Stirling, UK Chinese Academy of Sciences, China Università Politecnica delle Marche, Italy University of São Paulo, Brazil
Finance Chairs Hairong Dong Cong Wang Zhigang Zeng Dongbin Zhao
Beijing Jiaotong University, China South China University of Technology, China Huazhong University of Science and Technology, China Chinese Academy of Sciences, China
Publicity Chairs Zeng-Guang Hou Manuel Roveri Songyun Xie Nian Zhang
Chinese Academy of Sciences, China Politecnico di Milano, Italy Northwestern Polytechnical University, China University of the District of Columbia, USA
European Liaisons Danilo P. Mandic Alessandro Sperduti
Imperial College London, UK University of Padova, Italy
Publications Chairs Haibo He Wenlian Lu Yunong Zhang
Stevens Institute of Technology, USA Fudan University, China Sun Yat-sen University, China
Registration Chairs Xiaolin Hu Zhigang Liu Qinglai Wei
Tsinghua University, China Southwest Jiaotong University, China Chinese Academy of Sciences, China
Local Arrangements Chairs Xuanju Dang Xiaofeng Lin Yong Xu
Guilin University of Electronic Technology, China Guangxi University, China Guilin University of Electronic Technology, China
Electronic Review Chair Tao Xiang
Chongqing University, China
Symposium Secretariat Ding Wang
Chinese Academy of Sciences, China
ISNN 2011 International Program Committee Jose Aguilar Haydar Akca Angelo Alessandri Lu´ıs Alexandre Bruno Apolloni Marco Antonio Moreno Armend´ariz K. Vijayan Asari Amir Atiya Monica Bianchini Salim Bouzerdoum Ivo Bukovsky Xindi Cai Jianting Cao M. Emre Celebi Jonathan Hoyin Chan Ke Chen Songcan Chen YangQuan Chen Yen-Wei Chen Zengqiang Chen
Universidad de los Andes, Venezuela United Arab Emirates University, UAE University of Genoa, Italy Universidade da Beira Interior, Portugal University of Milan, Italy Instituto Politecnico Nacional, Mexico Old Dominion University, USA Cairo University, Egypt Universit` a degli Studi di Siena, Italy University of Wollongong, Australia Czech Technical University, Czech Republic APC St. Louis, USA Saitama Institute of Technology, Japan Louisiana State University, USA King Mongkut’s University of Technology, Thailand University of Manchester, UK Nanjing University of Aeronautics and Astronautics, China Utah State University, USA Ritsumeikan University, Japan Nankai University, China
Jianlin Cheng Li Cheng Long Cheng Xiaochun Cheng Sung-Bae Cho Pau-Choo Chung Jose Alfredo Ferreira Costa Sergio Cruces-Alvarez Lili Cui Chuangyin Dang Xuanju Dang Mingcong Deng Ming Dong Gerard Dreyfus Haibin Duan Wlodzislaw Duch El-Sayed El-Alfy Pablo Estevez Jay Farrell Wai-Keung Fung John Gan Junbin Gao Xiao-Zhi Gao Anya Getman Xinping Guan Chengan Guo Lejiang Guo Ping Guo Qing-Long Han Haibo He Zhaoshui He Tom Heskes Zeng-Guang Hou Zhongsheng Hou Chun-Fei Hsu Huosheng Hu Jinglu Hu Guang-Bin Huang Ting Huang Tingwen Huang Marc Van Hulle Amir Hussain Giacomo Indiveri
University of Missouri Columbia, USA NICTA Australian National University, Australia Chinese Academy of Sciences, China University of Reading, UK Yonsei University, Korea National Cheng Kung University, Taiwan Federal University, UFRN, Brazil University of Seville, Spain Northeastern University, China City University of Hong Kong, Hong Kong Guilin University of Electronic Technology, China Okayama University, Japan Wayne State University, USA ESPCI-ParisTech, France Beihang University, China Nicolaus Copernicus University, Poland King Fahd University of Petroleum and Minerals, Saudi Arabia Universidad de Chile, Chile University of California Riverside, USA University of Manitoba, Canada University of Essex, UK Charles Sturt University, Australia Helsinki University of Technology, Finland University of Nevada Reno, USA Shanghai Jiao Tong University, China Dalian University of Technology, China Huazhong University of Science and Technology, China Beijing Normal University, China Central Queensland University, Australia Stevens Institute of Technology, USA RIKEN Brain Science Institute, Japan Radboud University Nijmegen, The Netherlands Chinese Academy of Sciences, China Beijing Jiaotong University, China Chung Hua University, Taiwan University of Essex, UK Waseda University, Japan Nanyang Technological University, Singapore University of Illinois at Chicago, USA Texas A&M University at Qatar Katholieke Universiteit Leuven, Belgium University of Stirling, UK ETH Zurich, Switzerland
Danchi Jiang Haijun Jiang Ning Jin Yaochu Jin Joarder Kamruzzaman Qi Kang Nikola Kasabov Yunquan Ke Rhee Man Kil Kwang-Baek Kim Sungshin Kim Arto Klami Leo Li-Wei Ko Mario Koeppen Stefanos Kollias Sibel Senan Kucur H.K. Kwan James Kwok Edmund M.K. Lai Chuandong Li Kang Li Li Li Michael Li Shaoyuan Li Shutao Li Xiaoou Li Yangmin Li Yuanqing Li Hualou Liang Jinling Liang Yanchun Liang Lizhi Liao Alan Wee-Chung Liew Aristidis Likas Chih-Jen Lin Ju Liu Meiqin Liu Yan Liu Zhenwei Liu Bao-Liang Lu Hongtao Lu Jinhu Lu Wenlian Lu
University of Tasmania, Australia Xinjiang University, China University of Illinois at Chicago, USA Honda Research Institute Europe, Germany Monash University, Australia Tongji University, China Auckland University, New Zealand Shaoxing University, China Korea Advanced Institute of Science and Technology, Korea Silla University, Korea Pusan National University, Korea Helsinki University of Technology, Finland National Chiao Tung University, Taiwan Kyuhsu Institute of Technology, Japan National Technical University of Athens, Greece Istanbul University, Turkey University of Windsor, Canada Hong Kong University of Science and Technology, Hong Kong Massey University, New Zealand Chongqing University, China Queen’s University Belfast, UK Tsinghua University, China Central Queensland University, Australia Shanghai Jiao Tong University, China Hunan University, China CINVESTAV-IPN, Mexico University of Macao, Macao South China University of Technology, China University of Texas at Houston, USA Southeast University, China Jilin University, China Hong Kong Baptist University Griffith University, Australia University of Ioannina, Greece National Taiwan University, Taiwan Shandong University, China Zhejiang University, China Motorola Labs, Motorola, Inc., USA Northeastern University, China Shanghai Jiao Tong University, China Shanghai Jiao Tong University, China Chinese Academy of Sciences, China Fudan University, China
Yanhong Luo Jinwen Ma Malik Magdon-Ismail Danilo Mandic Francesco Marcelloni Francesco Masulli Matteo Matteucci Patricia Melin Dan Meng Yan Meng Valeri Mladenov Roman Neruda Ikuko Nishikawa Erkki Oja Seiichi Ozawa Guenther Palm Christos Panayiotou Shaoning Pang Thomas Parisini Constantinos Pattichis Jaakko Peltonen Vincenzo Piuri Junfei Qiao Manuel Roveri George Rovithakis Leszek Rutkowski Tomasz Rutkowski Sattar B. Sadkhan Toshimichi Saito Karl Sammut Edgar Sanchez Marcello Sanguineti Gerald Schaefer Furao Shen Daming Shi Hideaki Shimazaki Qiankun Song Ruizhuo Song Alessandro Sperduti Stefano Squartini Dipti Srinivasan John Sum Changyin Sun
Northeastern University, China Peking University, China Rensselaer Polytechnic Institute, USA Imperial College London, UK University of Pisa, Italy Universit`a di Genova, Italy Politecnico di Milano, Italy Tijuana Institute of Technology, Mexico Southwest University of Finance and Economics, China Stevens Institute of Technology, USA Technical University of Sofia, Bulgaria Academy of Sciences of the Czech Republic, Czech Republic Ritsumei University, Japan Aalto University, Finland Kobe University, Japan Universit¨ at Ulm, Germany University of Cyprus, Cyprus Auckland University of Technology, New Zealand University of Trieste, Italy University of Cyprus, Cyprus Helsinki University of Technology, Finland University of Milan, Italy Beijing University of Technology, China Politecnico di Milano, Italy Aristole University of Thessaloniki, Greece Technical University of Czestochowa, Poland RIKEN Brain Science Institute, Japan University of Babylon, Iraq Hosei University, Japan Flinders University, Australia CINVESTAV, Mexico University of Genoa, Italy Aston University, UK Nanjing University, China Nanyang Technological University, Singapore RIKEN Brain Science Institute, Japan Chongqing Jiaotong University, China Northeastern University, China University of Padua, Italy Universit` a Politecnica delle Marche, Italy National University of Singapore, Singapore National Chung Hsing University, Taiwan Southeast University, China
Johan Suykens Roberto Tagliaferri Norikazu Takahashi Ah-Hwee Tan Ying Tan Toshihisa Tanaka Hao Tang Qing Tao Ruck Thawonmas Sergios Theodoridis Peter Tino Christos Tjortjis Ivor Tsang Masao Utiyama Marley Vellasco Alessandro E.P. Villa Draguna Vrabie Bing Wang Dan Wang Dianhui Wang Ding Wang Lei Wang Lei Wang Wenjia Wang Wenwu Wang Yingchun Wang Yiwen Wang Zhanshan Wang Zhuo Wang Zidong Wang Qinglai Wei Yimin Wen Wei Wu Cheng Xiang Degui Xiao Songyun Xie Rui Xu Xin Xu Yong Xu Jianqiang Yi Zhang Yi
Katholieke Universiteit Leuven, Belgium University of Salerno, Italy Kyushu University, Japan Nanyang Technological University, Singapore Peking University, China Tokyo University of Agriculture and Technology, Japan Hefei University of Technology, China Chinese Academy of Sciences, China Ritsumeikan University, Japan University of Athens, Greece Birmingham University, UK University of Manchester, UK Nanyang Technological University, Singapore National Institute of Information and Communications Technology, Japan PUC-Rio, Brazil Universit´e de Lausanne, Switzerland University of Texas at Arlington, USA University of Hull, UK Dalian Maritime University, China La Trobe University, Australia Chinese Academy of Sciences, China Australian National University, Australia Tongji University, China University of East Anglia, UK University of Surrey, USA Northeastern University, China Hong Kong University of Science and Technology, Hong Kong Northeastern University, China University of Illinois at Chicago, USA Brunel University, UK Chinese Academy of Sciences, China Hunan Institute of Technology, China Dalian University of Technology, China National University of Singapore, Singapore Hunan University, China Northwestern Polytechnical University, China Missouri University of Science and Technology, USA National University of Defense Technology, China Guilin University of Electronic Technology, China Chinese Academy of Sciences, China Sichuan University, China
Dingli Yu Xiao-Hua Yu Xiaoqin Zeng Zhigang Zeng Changshui Zhang Huaguang Zhang Jianghai Zhang Jie Zhang Kai Zhang Lei Zhang Nian Zhang Xiaodong Zhang Xin Zhang Yunong Zhang Dongbin Zhao Hai Zhao Liang Zhao Mingjun Zhong Weihang Zhu Rodolfo Zunino
Liverpool John Moores University, UK California Polytechnic State University, USA Hohai University, China Huazhong University of Science and Technology, China Tsinghua University, China Northeastern University, China Hangzhou Dianzi University, China University of New Castle, UK Lawrence Berkeley National Lab, USA Sichuan University, China University of the District of Columbia, USA Wright State University, USA Northeastern University, China Sun Yat-sen University, China Chinese Academy of Sciences, China Shanghai Jiao Tong University, China University of S˜ ao Paulo, Brazil University of Glasgow, UK Lamar University, USA University of Genoa, Italy
Table of Contents – Part III
Reinforcement Learning and Decision Making An Adaptive Dynamic Programming Approach for Closely-Coupled MIMO System Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jian Fu, Haibo He, Qing Liu, and Zhen Ni
1
Adaptive Dual Heuristic Programming Based on Delta-Bar-Delta Learning Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jun Wu, Xin Xu, Chuanqiang Lian, and Yan Huang
11
A Design Decision-Making Support Model for Prioritizing Affective Qualities of Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chun-Chih Chen and Ming-Chuen Chuan
21
Local Search Heuristics for Robotic Routing Planner . . . . . . . . . . . . . . . Stanislav Slušný and Roman Neruda
31
Action and Motor Control Fuzzy Disturbance-Observer Based Control of Electrically Driven Free-Floating Space Manipulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhongyi Chu and Jing Cui
41
Dynamic Neural Network Control for Voice Coil Motor with Hysteresis Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xuanju Dang, Fengjin Cao, and Zhanjun Wang
50
A New Model Reference Adaptive Control of PMSM Using Neural Network Generalized Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Guohai Liu, Beibei Dong, Lingling Chen, and Wenxiang Zhao
58
RBF Neural Network Application in Internal Model Control of Permanent Magnet Synchronous Motor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Guohai Liu, Lingling Chen, Beibei Dong, and Wenxiang Zhao
68
Transport Control of Underactuated Cranes . . . . . . . . . . . . . . . . . . . . . . . . Dianwei Qian, Boya Zhang, and Xiangjie Liu
77
Sliding Mode Prediction Based Tracking Control for Discrete-time Nonlinear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lingfei Xiao and Yue Zhu
84
Adaptive and Hybrid Intelligent Systems A Single Shot Associated Memory Based Classification Scheme for WSN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Nomica Imran and Asad Khan
94
Dynamic Structure Neural Network for Stable Adaptive Control of Nonlinear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jingye Lei
104
A Position-Velocity Cooperative Intelligent Controller Based on the Biological Neuroendocrine System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chongbin Guo, Kuangrong Hao, Yongsheng Ding, Xiao Liang, and Yiwen Dou
112
A Stable Online Self-Constructing Recurrent Neural Network . . . . . . . . . . Qili Chen, Wei Chai, and Junfei Qiao
122
Evaluation of SVM Classification of Avatar Facial Recognition . . . . . . . . . Sonia Ajina, Roman V. Yampolskiy, and Najoua Essoukri Ben Amara
132
Optimization Control of Rectifier in HVDC System with ADHDP . . . . . . Chunning Song, Xiaohua Zhou, Xiaofeng Lin, and Shaojian Song
143
Network Traffic Prediction Based on Wavelet Transform and Season ARIMA Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yongtao Wei, Jinkuan Wang, and Cuirong Wang
152
Dynamic Bandwidth Allocation for Preventing Congestion in Data Center Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Cong Wang, Cui-rong Wang, and Ying Yuan
160
Software Comparison Dealing with Bayesian Networks . . . . . . . . . . . . . . . . Mohamed Ali Mahjoub and Karim Kalti
168
Adaptive Synchronization on Edges of Complex Networks . . . . . . . . . . . . . Wenwu Yu
178
Application of Dual Heuristic Programming in Excitation System of Synchronous Generators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yuzhang Lin and Chao Lu
188
A Neural Network Method for Image Resolution Enhancement from a Multiple of Image Frames . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shuangteng Zhang and Ezzatollah Salari
200
Multiparty Simultaneous Quantum Secure Direct Communication Based on GHZ States and Mutual Authentication . . . . . . . . . . . . . . . . . . . . Wenjie Liu, Jingfa Liu, Hao Xiao, Tinghuai Ma, and Yu Zheng
209
A PSO-Based Bacterial Chemotaxis Algorithm and Its Application . . . . . Rui Zhang, Jianzhong Zhou, Youlin Lu, Hui Qin, and Huifeng Zhang
219
Predicting Stock Index Using an Integrated Model of NLICA, SVR and PSO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chi-Jie Lu, Jui-Yu Wu, Chih-Chou Chiu, and Yi-Jun Tsai
228
Affective Classification in Video Based on Semi-supervised Learning . . . . Shangfei Wang, Huan Lin, and Yongjie Hu
238
Incorporating Feature Selection Method into Neural Network Techniques in Sales Forecasting of Computer Products . . . . . . . . . . . . . . . . Chi-Jie Lu, Jui-Yu Wu, Tian-Shyug Lee, and Chia-Mei Lian
246
Design of Information Granulation-Based Fuzzy Models with the Aid of Multi-objective Optimization and Successive Tuning Method . . . . . . . . Wei Huang, Sung-Kwun Oh, and Jeong-Tae Kim
256
Design of Fuzzy Radial Basis Function Neural Networks with the Aid of Multi-objective Optimization Based on Simultaneous Tuning . . . . . . . . Wei Huang, Lixin Ding, and Sung-Kwun Oh
264
The Research of Power Battery with Ultra-capacitor for Hybrid Vehicle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jia Wang, Yingchun Wang, and Guotao Hui
274
Neuroinformatics and Bioinformatics Stability Analysis of Genetic Regulatory Networks with Mixed Time-Delays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhanheng Chen and Haijun Jiang
280
Representing Boolean Functions Using Polynomials: More Can Offer Less . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yi Ming Zou
290
The Evaluation of Land Utilization Intensity Based on Artificial Neural Network: A Case of Zhejiang Province . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jianchun Xu, Huan Li, and Zhiyuan Xu
297
Dimension Reduction of RCE Signal by PCA and LPP for Estimation of the Sleeping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yohei Tomita, Yasue Mitsukura, Toshihisa Tanaka, and Jianting Cao
306
Prediction of Oxygen Decarburization Efficiency Based on Mutual Information Case-Based Reasoning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Min Han and Liwen Jiang
313
Off-line Signature Verification Based on Multitask Learning . . . . . . . . . . . You Ji, Shiliang Sun, and Jian Jin
323
Modeling and Classification of sEMG Based on Instrumental Variable Identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xiaojing Shang, Yantao Tian, and Yang Li
331
Modeling and Classification of sEMG Based on Blind Identification Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yang Li, Yantao Tian, Xiaojing Shang, and Wanzhong Chen
340
Micro-Blood Vessel Detection Using K-Means Clustering and Morphological Thinning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhongming Luo, Zhuofu Liu, and Junfu Li
348
PCA Based Regional Mutual Information for Robust Medical Image Registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yen-Wei Chen and Chen-Lun Lin
355
Discrimination of Thermophilic and Mesophilic Proteins via Artificial Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jingru Xu and Yuehui Chen
363
Information Retrieval Using Grey Relation Analysis and TOPSIS to Measure PCB Manufacturing Firms Efficiency in Taiwan . . . . . . . . . . . . . . . . . . . . . . . . . . Rong-Tsu Wang
370
Application of Neural Network for the Prediction of Eco-efficiency . . . . . . Slawomir Golak, Dorota Burchart-Korol, Krystyna Czaplicka-Kolarz, and Tadeusz Wieczorek
380
Using Associative Memories for Image Segmentation . . . . . . . . . . . . . . . . . Enrique Guzm´ an, Ofelia M.C. Jim´enez, Alejandro D. P´erez, and Oleksiy Pogrebnyak
388
Web Recommendation Based on Back Propagation Neural Networks . . . . Jiang Zhong, Shitao Deng, and Yifeng Cheng
397
A Multi-criteria Target Monitoring Strategy Using MinMax Operator in Formed Virtual Sensor Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xin Song, Cuirong Wang, and Juan Wang
407
Data Mining and Knowledge Discovery Application of a Novel Data Mining Method Based on Wavelet Analysis and Neural Network Satellite Clock Bias Prediction . . . . . . . . . . . . . . . . . . Chengjun Guo and Yunlong Teng
416
Particle Competition and Cooperation for Uncovering Network Overlap Community Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Fabricio Breve, Liang Zhao, Marcos Quiles, Witold Pedrycz, and Jiming Liu
426
Part-Based Transfer Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhijie Xu and Shiliang Sun
434
A Parallel Wavelet Algorithm Based on Multi-core System and Its Application in the Massive Data Compression . . . . . . . . . . . . . . . . . . . . . . . Xiaofan Lu, Zhigang Liu, Zhiwei Han, and Feng Wu
442
Predicting Carbon Emission in an Environment Management System . . . Manas Pathak and Xiaozhe Wang
450
Classification of Pulmonary Nodules Using Neural Network Ensemble . . . Hui Chen, Wenfang Wu, Hong Xia, Jing Du, Miao Yang, and Binrong Ma
460
Combined Three Feature Selection Mechanisms with LVQ Neural Network for Colon Cancer Diagnosis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tianlei Zang, Dayun Zou, Fei Huang, and Ning Shen
467
Estimation of Groutability of Permeation Grouting with Microfine Cement Grouts Using RBFNN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kuo-Wei Liao and Chien-Lin Huang
475
Improving Text Classification with Concept Index Terms and Expansion Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . XiangHua Fu, LianDong Liu, TianXue Gong, and Lan Tao
485
Automated Personal Course Scheduling Adaptive Spreading Activation Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yukio Hori, Takashi Nakayama, and Yoshiro Imai
493
Transfer Learning through Domain Adaptation . . . . . . . . . . . . . . . . . . . . . . Huaxiang Zhang
505
The Design of Evolutionary Multiple Classifier System for the Classification of Microarray Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kun-Hong Liu, Qing-Qiang Wu, and Mei-Hong Wang
513
Semantic Oriented Clustering of Documents . . . . . . . . . . . . . . . . . . . . . . . . . Alessio Leoncini, Fabio Sangiacomo, Sergio Decherchi, Paolo Gastaldo, and Rodolfo Zunino
523
Support Vector Machines versus Back Propagation Algorithm for Oil Price Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Adnan Khashman and Nnamdi I. Nwulu
530
Ultra-Short Term Prediction of Wind Power Based on Multiples Model Extreme Leaning Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ting Huang, Xin Wang, Lixue Li, Lidan Zhou, and Gang Yao
539
BursT: A Dynamic Term Weighting Scheme for Mining Microblogging Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chung-Hong Lee, Chih-Hong Wu, and Tzan-Feng Chien
548
Towards an RDF Encoding of ConceptNet . . . . . . . . . . . . . . . . . . . . . . . . . . Marco Grassi and Francesco Piazza
558
Modeling of Potential Customers Identification Based on Correlation Analysis and Decision Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kai Peng and Daoyun Xu
566
An Innovative Feature Selection Using Fuzzy Entropy . . . . . . . . . . . . . . . . Hamid Parvin, Behrouz Minaei-Bidgoli, and Hossein Ghaffarian
576
Study on the Law of Short Fatigue Crack Using Genetic Algorithm-BP Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zheng Wang, Zihao Zhao, Lu Wang, and Kui Wang
586
Natural Language Processing Large Vocabulary Continuous Speech Recognition of Uyghur: Basic Research of Decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Muhetaer Shadike, Xiao Li, and Buheliqiguli Wasili
594
Sentic Medoids: Organizing Affective Common Sense Knowledge in a Multi-dimensional Vector Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Erik Cambria, Thomas Mazzocco, Amir Hussain, and Chris Eckl
601
Detecting Emotions in Social Affective Situations Using the EmotiNet Knowledge Base . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Alexandra Balahur, Jes´ us M. Hermida, and Andr´es Montoyo
611
3-Layer Ontology Based Query Expansion for Searching . . . . . . . . . . . . . . Li Liu and Fangfang Li
621
Chinese Speech Recognition Based on a Hybrid SVM and HMM Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xingxian Luo
629
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
637
An Adaptive Dynamic Programming Approach for Closely-Coupled MIMO System Control
Jian Fu 1, Haibo He 2, Qing Liu 1, and Zhen Ni 2
1 School of Automation, Wuhan University of Technology, Wuhan, Hubei 430070, China
[email protected], [email protected]
2 Department of Electrical, Computer and Biomedical Engineering, University of Rhode Island, Kingston, RI 02881, USA
{he,ni}@ele.uri.edu
Abstract. In this paper, we investigate the application of adaptive dynamic programming (ADP) for a real industrial-based control problem. Our focus includes two aspects. First, we consider the multiple-input and multiple-output (MIMO) ADP design for online learning and control. Specifically, we consider the action network with multiple outputs as control signals to be sent to the system under control, which provides the capability of this approach to be more applicable to real engineering problems with multiple control variables. Second, we apply this approach to a real industrial application problem to control the tension and height of the looper system in a hot strip mill system. Our intention is to demonstrate the adaptive learning and control performance of the ADP with such a real system. Our results demonstrate the effectiveness of this approach. Keywords: adaptive dynamic programming (ADP), multiple-input and multiple-output (MIMO), online learning and control, looper system.
1 Introduction
Adaptive dynamic programming (ADP) has been a key methodology for adaptive control and optimization for a wide range of domains [1],[2],[3],[4]. Briefly speaking, the fundamental idea of ADP is the “principle of optimality” in dynamic programming. Recently, important research results have also suggested that ADP is one of the core technologies to potentially achieve the brain-like general-purpose intelligence and bridge the gap between theoretic study and challenging real-world engineering applications [2].
This work was done when Jian Fu was a visiting scholar at the Department of Electrical, Computer, and Biomedical Engineering at The University of Rhode Island, Kingston, RI 02881, USA. This work was supported in part by the National Science Foundation (NSF) under Grant ECCS 1053717.
Over the past decades, extensive efforts have been devoted to ADP research and tremendous progress has been achieved across different application domains, ranging from power system control and communication networks to helicopter control. For instance, a synchronous generator neuro-controller in a power system with an HDP structure was implemented with multilayer perceptron and radial basis function neural networks [5]. In [6], ADP has been successfully applied to complex helicopter control based on a structured cascade action network and an associated trim network. Tracking applications in engine torque and air-fuel ratio control were presented in [7] using an ADP design with a time-lagged recurrent neural network. In [8], a hierarchical ADP architecture is proposed to adaptively develop multiple levels of internal goal representations to facilitate optimization over time. In this work, our main focus is to demonstrate the application of ADP to a practical industrial problem. The action network in our ADP architecture includes multiple outputs, which makes this approach suitable for applications that require multiple control variables. The application case we tested in this research is a looper system in a hot strip mill, a typical nonlinear closely-coupled MIMO system. The rest of this paper is organized as follows. Section 2 presents the MIMO ADP design and its implementation details, such as the adaptation and tuning of parameters in both the critic network and the action network. Section 3 presents in detail the application to the height and tension control of the looper in a hot strip mill; experimental settings and results are also presented and analyzed there. Finally, a conclusion is given in Section 4.
2 ADP for System with Multiple Control Variables
In this work, we build our research on the ADP architecture presented in [4] (see Fig. 1). In this structure, the temporal difference is obtained from the current cost-to-go J(k) and the previous value J(k-1) in the diagram. The binary reinforcement signal r is provided by the external environment, which we mark explicitly as the reinforcement signal. In our current study, we adopt a common strategy and use "0" and "-1" to represent "success" and "failure", respectively.
Fig. 1. Schematic diagram for implementation of online learning based on ADP
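To make the interplay in Fig. 1 concrete, the following sketch outlines one possible form of the online learning loop in Python. It is an illustration only: the placeholder plant `step_system`, the network sizes, and all numerical values are assumptions rather than the system or parameters used in this paper, and the weight updates of Sections 2.1 and 2.2 are only indicated by a comment.

```python
# Illustrative online ADP loop following Fig. 1 (not the authors' code).
import numpy as np

rng = np.random.default_rng(0)

def step_system(x, u):
    """Placeholder plant: returns the next state and a binary reinforcement."""
    x_next = 0.9 * x + 0.1 * np.tanh(u).sum() * np.ones_like(x)
    r = 0.0 if np.all(np.abs(x_next) < 5.0) else -1.0   # 0 = success, -1 = failure
    return x_next, r

def mlp(x, W1, W2):
    """One-hidden-layer network; tanh(z) equals the paper's bipolar sigmoid at 2z."""
    h = np.tanh(W1 @ x)
    return W2 @ h, h

nx, nu, nh, alpha = 5, 2, 6, 0.95
Wa1, Wa2 = rng.normal(0, 0.3, (nh, nx)), rng.normal(0, 0.3, (nu, nh))
Wc1, Wc2 = rng.normal(0, 0.3, (nh, nx + nu)), rng.normal(0, 0.3, (1, nh))

x = rng.normal(size=nx)
J_prev, r_prev = 0.0, 0.0
for k in range(200):
    v, _ = mlp(x, Wa1, Wa2)
    u = np.tanh(v)                                # action network output
    J, _ = mlp(np.concatenate([x, u]), Wc1, Wc2)  # critic sees states and actions
    e_c = alpha * J.item() - (J_prev - r_prev)    # temporal-difference residual
    # (critic and actor weight updates via Eqs. (6)-(10) and (16)-(21) go here)
    x, r_prev = step_system(x, u)
    J_prev = J.item()
```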
We realize that in many practical situations, the system under control normally requires multiple control variables. To accommodate this, the action network needs to include multiple output neurons, which in turn augments the inputs to the critic network. Therefore, we first derive the weight adaptation procedure for this situation. All discussions in this article assume that a multi-layer perceptron (MLP) neural network is used in both the critic network and the action network design.
2.1 Critic Network
In the critic network, the output can be defined as follows:

q_i(k) = \sum_{j=1}^{N_{cin}} w_{c,ij}^{(1)}(k)\, x_j(k), \qquad i = 1, \dots, N_{ch},   (1)

p_i(k) = \frac{1 - e^{-q_i(k)}}{1 + e^{-q_i(k)}}, \qquad i = 1, \dots, N_{ch},   (2)

\hat{J}(k) = \sum_{i=1}^{N_{ch}} w_{c,i}^{(2)}(k)\, p_i(k),   (3)
where q_i is the ith hidden node input of the critic network, p_i is the corresponding output of the hidden node q_i, N_{cin} is the total number of input nodes in the critic network, including N_x inputs from the system states and N_{aout} inputs from the output nodes of the action network, and N_{ch} is the total number of hidden nodes in the critic network. As discussed in [4], in this structure J can be estimated by minimizing the following error over time:

E_h = \frac{1}{2} \sum_{k} \left[ \hat{J}(k) - r(k) - \alpha \hat{J}(k+1) \right]^2.   (4)
So the objective function to be minimized in the critic network is

E_c(k) = \frac{1}{2} e_c^2(k), \qquad e_c(k) = \alpha \hat{J}(k) - \left[ \hat{J}(k-1) - r(k-1) \right],   (5)
We can derive the adaptation of the critic network by applying the chain rule of backpropagation [9]. This process is summarized as follows.

1) \Delta w_c^{(2)} (hidden to output layer):

\Delta w_{c,i}^{(2)}(k) = l_c(k) \left[ -\frac{\partial E_c(k)}{\partial w_{c,i}^{(2)}(k)} \right],   (6)

\frac{\partial E_c(k)}{\partial w_{c,i}^{(2)}(k)} = \frac{\partial E_c(k)}{\partial e_c(k)} \cdot \frac{\partial e_c(k)}{\partial \hat{J}(k)} \cdot \frac{\partial \hat{J}(k)}{\partial w_{c,i}^{(2)}(k)} = \alpha e_c(k)\, p_i(k),   (7)

where l_c(k) > 0 is the learning rate of the critic network at time k and w_{c,i}^{(2)} is the weight between the ith hidden node and the output node of the critic network.

2) \Delta w_c^{(1)} (input to hidden layer):

\Delta w_{c,ij}^{(1)}(k) = l_c(k) \left[ -\frac{\partial E_c(k)}{\partial w_{c,ij}^{(1)}(k)} \right],   (8)

\frac{\partial E_c(k)}{\partial w_{c,ij}^{(1)}(k)} = \frac{\partial E_c(k)}{\partial e_c(k)} \cdot \frac{\partial e_c(k)}{\partial \hat{J}(k)} \cdot \frac{\partial \hat{J}(k)}{\partial p_i(k)} \cdot \frac{\partial p_i(k)}{\partial q_i(k)} \cdot \frac{\partial q_i(k)}{\partial w_{c,ij}^{(1)}(k)}   (9)

= \alpha e_c(k)\, w_{c,i}^{(2)}(k) \cdot \frac{1}{2}\left(1 - p_i^2(k)\right) \cdot x_j(k),   (10)

where w_{c,ij}^{(1)} is the weight between the jth input node and the ith hidden node of the critic network.
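For readers who prefer code, the critic adaptation of Eqs. (1)-(10) can be sketched in NumPy as follows. This is a minimal illustration under assumed array shapes and an assumed learning rate, not the authors' implementation.

```python
# Minimal sketch of the critic update, Eqs. (1)-(10).
import numpy as np

def critic_forward(w1, w2, z):
    """Eqs. (1)-(3): z stacks the Nx state inputs and the Naout action inputs."""
    q = w1 @ z                                  # hidden pre-activations, Eq. (1)
    p = (1 - np.exp(-q)) / (1 + np.exp(-q))     # bipolar sigmoid, Eq. (2)
    J = w2 @ p                                  # scalar cost-to-go estimate, Eq. (3)
    return J, p

def critic_update(w1, w2, z, J_prev, r_prev, alpha=0.95, l_c=0.05):
    """One gradient step on E_c(k) = 0.5 * e_c(k)^2, Eqs. (5)-(10)."""
    J, p = critic_forward(w1, w2, z)
    e_c = alpha * J - (J_prev - r_prev)                          # Eq. (5)
    dEc_dw2 = alpha * e_c * p                                    # Eq. (7)
    dEc_dw1 = np.outer(alpha * e_c * w2 * 0.5 * (1 - p**2), z)   # Eqs. (9)-(10)
    return w1 - l_c * dEc_dw1, w2 - l_c * dEc_dw2, J

# usage: z = np.concatenate([x, u]); w1 has shape (Nch, Ncin), w2 has shape (Nch,)
```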
2.2 Action Network
We now proceed to investigate the weight tuning process in the action network. Note that in this case the action network has multiple control outputs, N_{aout}. The associated equations for the action network are:

h_i(k) = \sum_{j=1}^{N_{ain}} w_{a,ij}^{(1)}(k)\, x_j(k), \qquad i = 1, \dots, N_{ah},   (11)

g_i(k) = \frac{1 - e^{-h_i(k)}}{1 + e^{-h_i(k)}}, \qquad i = 1, \dots, N_{ah},   (12)

v_m(k) = \sum_{i=1}^{N_{ah}} w_{a,mi}^{(2)}(k)\, g_i(k), \qquad m = 1, \dots, N_{aout},   (13)

u_m(k) = \frac{1 - e^{-v_m(k)}}{1 + e^{-v_m(k)}}, \qquad m = 1, \dots, N_{aout},   (14)
where h_i is the ith hidden node input of the action network, g_i is the ith hidden node output of the action network, v_m is the input of the mth output node of the action network, u_m is the output of the mth output node of the action network, N_{ain} is the total number of input nodes in the action network, N_{ah} is the total number of hidden nodes in the action network, and N_{aout} is the total number of output nodes in the action network. The purpose of adapting the action network is to implicitly backpropagate the error between the desired ultimate objective U_c and the approximate J function from the critic network. In our current study, we set U_c(k) equal to zero. The error of the action network is defined as

E_a(k) = \frac{1}{2} e_a^2(k) = \frac{1}{2}\left[ J(k) - U_c \right]^2.   (15)
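A direct transcription of the forward pass (11)-(14), with assumed weight shapes, reads as follows (an illustration, not the authors' code):

```python
import numpy as np

def bipolar(z):
    """The sigmoidal unit used throughout: (1 - e^-z) / (1 + e^-z)."""
    return (1.0 - np.exp(-z)) / (1.0 + np.exp(-z))

def action_forward(wa1, wa2, x):
    """wa1: (Nah, Nain), wa2: (Naout, Nah); returns the control vector u(k)."""
    g = bipolar(wa1 @ x)      # Eqs. (11)-(12)
    u = bipolar(wa2 @ g)      # Eqs. (13)-(14)
    return u, g
```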
When the classic gradient descent search is used, the generalized approach for the MIMO system can be summarized as follows.

1) \Delta w_a^{(2)} (hidden to output layer):

\Delta w_{a,mi}^{(2)}(k) = l_a(k) \left[ -\frac{\partial E_a(k)}{\partial w_{a,mi}^{(2)}(k)} \right],   (16)

\frac{\partial E_a(k)}{\partial w_{a,mi}^{(2)}(k)} = \frac{\partial E_a(k)}{\partial e_a(k)} \cdot \frac{\partial e_a(k)}{\partial \hat{J}(k)} \cdot \frac{\partial \hat{J}(k)}{\partial u_m(k)} \cdot \frac{\partial u_m(k)}{\partial v_m(k)} \cdot \frac{\partial v_m(k)}{\partial w_{a,mi}^{(2)}(k)}   (17)

= e_a(k)\, g_i(k) \cdot \frac{1}{2}\left(1 - u_m^2(k)\right) \sum_{l=1}^{N_{ch}} w_{c,l}^{(2)}(k) \cdot \frac{1}{2}\left(1 - p_l^2(k)\right) \cdot w_{c,l(N_x+m)}^{(1)}(k).   (18)
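A compact NumPy sketch of the action-network adaptation, covering both the hidden-to-output update above and the input-to-hidden update derived in Eqs. (19)-(21) below, is given here. Weight shapes, the learning rate, and the helper function itself are illustrative assumptions rather than the authors' code.

```python
# Sketch of the MIMO actor update, Eqs. (16)-(21): dJ/du_m is obtained by
# backpropagating through the critic, then pushed through the action network.
import numpy as np

def actor_update(wa1, wa2, wc1, wc2, x, p, Uc=0.0, l_a=0.05):
    """wa1: (Nah, Nx), wa2: (Naout, Nah); wc1: (Nch, Nx+Naout), wc2: (Nch,).
    p: critic hidden-layer output evaluated at [x, u] from the same step (assumed given)."""
    nx = x.size
    h = wa1 @ x
    g = (1 - np.exp(-h)) / (1 + np.exp(-h))        # Eq. (12)
    v = wa2 @ g
    u = (1 - np.exp(-v)) / (1 + np.exp(-v))        # Eq. (14)
    e_a = (wc2 @ p) - Uc                           # Eq. (15)

    # dJ/du_m summed over all critic hidden nodes (the MIMO point stressed in the text)
    dJ_du = (wc2 * 0.5 * (1 - p**2)) @ wc1[:, nx:nx + u.size]    # (Naout,)

    delta_v = dJ_du * 0.5 * (1 - u**2)             # back through the output sigmoid
    dEa_dwa2 = e_a * np.outer(delta_v, g)                                  # Eqs. (17)-(18)
    dEa_dwa1 = e_a * np.outer((wa2.T @ delta_v) * 0.5 * (1 - g**2), x)     # Eqs. (20)-(21)
    return wa1 - l_a * dEa_dwa1, wa2 - l_a * dEa_dwa2
```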
Here u_m(k) is the mth output of the action network, which also serves as the (N_x+m)th input to the critic network (again, N_x is the number of inputs from the system). One should note that under the MIMO scenario, \partial \hat{J}(k)/\partial u_m(k) needs to be calculated by summing the respective partial derivatives over all hidden nodes in the critic network.

2) \Delta w_a^{(1)} (input to hidden layer):

\Delta w_{a,ij}^{(1)}(k) = l_a(k) \left[ -\frac{\partial E_a(k)}{\partial w_{a,ij}^{(1)}(k)} \right],   (19)

\frac{\partial E_a(k)}{\partial w_{a,ij}^{(1)}(k)} = \frac{\partial E_a(k)}{\partial e_a(k)} \cdot \frac{\partial e_a(k)}{\partial \hat{J}(k)} \sum_{m=1}^{N_{aout}} \frac{\partial \hat{J}(k)}{\partial u_m(k)} \cdot \frac{\partial u_m(k)}{\partial v_m(k)} \cdot \frac{\partial v_m(k)}{\partial g_i(k)} \cdot \frac{\partial g_i(k)}{\partial h_i(k)} \cdot \frac{\partial h_i(k)}{\partial w_{a,ij}^{(1)}(k)}   (20)

= e_a(k) \sum_{m=1}^{N_{aout}} \left[ \sum_{l=1}^{N_{ch}} w_{c,l}^{(2)}(k) \cdot \frac{1}{2}\left(1 - p_l^2(k)\right) \cdot w_{c,l(N_x+m)}^{(1)}(k) \right] w_{a,mi}^{(2)}(k) \cdot \frac{1}{2}\left(1 - u_m^2(k)\right) \cdot \frac{1}{2}\left(1 - g_i^2(k)\right) \cdot x_j(k).   (21)
We would like to note that for the MIMO system, the information provided (1) by waij , the hidden node pl of the critic network and output node um are all intermediate variables. They all contribute as a part to ∂J(k) in terms of the (1) ∂waij (k)
nested pattern.
3
Case Study with Looper Tension/Height System
In this section, we investigate the application of the ADP architecture with multiple control variables to an industrial looper system currently deployed at
6
J. Fu et al.
the Wuhan Iron and Steel (Group) Corporation. In such a system, the control of looper tension and angle are important because they affect both the dimensional quality and the mass flow of a strip [10]. Therefore, both of them should be kept to a desired value simultaneously, which is called as a nominal operation point. Intuitively, we intend to adjust roller motor speed ratio to keep looper height constant or to regulate looper motor current to supply a predefined tension. However, in such a closely-coupled system, any single manipulation can affect two target values (angle of the looper and stress of the strip) at the same time. Therefore, it is very difficult to achieve the optimal control performance (angle and stress) simultaneously. Fig. 2 shows the looper system diagram we considered in this article. For space consideration, we only give a summary of the system model here. For detailed model parameters, interested readers can refer to [11] for further details. 1 wM load GR wT
'iref _ i
1 'iact _ i 1 Ti S
Cm
ˉ
'Tref _ i ˉ
1 JS
360 2S
'Zmotori
'T motor _ i
1 S
1 GR
'T i
Looper motor
1 wM load GR wW f _ i 1 GR
dL1 dL2 dT dT 'vref _ i
1 1 TV S
'v0 _ i
1 fi
+ ˉ
'W f _ i
Elp LS
ˉ ˉ
v0i
dfi dW f _ i
v0 _ i 1
d Ei dW f _ i
Fig. 2. System model of looper tension and height
The main symbols in Fig. 2 are described as follows. \Delta i_{ref,i}: current reference increment of the looper motor attached to stand i; \Delta i_{act,i}: actual current increment of the looper motor attached to stand i; \Delta\omega_{motor,i}: angular speed increment of the looper motor attached to stand i; \Delta\theta_i: angle increment of the looper attached to stand i; \Delta v_{ref,i}: line speed reference increment of the rolls in stand i; \Delta v_{0,i}: actual speed increment of the rolls in stand i; \Delta\tau_i: actual stress increment of the strip between stand i and stand i+1; v_{0,i}: actual line speed of the roller in stand i. We select the state variables as

x = [\Delta\tau_i,\ \Delta\theta_i,\ \Delta\omega_{motor,i},\ \Delta i_{act,i},\ \Delta v_{0,i}]^T,   (22)
manipulated variables

u = [\Delta i_{ref,i},\ \Delta v_{ref,i}]^T,   (23)

and output variables as

y = [\Delta\theta_i,\ \Delta\tau_i]^T.   (24)

Thus, a state-space model is obtained as follows [11]:

\dot{x} = A_p x + B_p u, \qquad y = C_p x,   (25)
where

A_p = \begin{bmatrix} a_{11} & 0 & a_{13} & 0 & -\frac{E_{lp}}{L}(1+f_i) \\ 0 & 0 & \frac{1}{G_R} & 0 & 0 \\ a_{31} & a_{32} & -\frac{K_D}{J}\frac{360}{2\pi} & \frac{C_m}{J}\frac{360}{2\pi} & 0 \\ 0 & 0 & 0 & -\frac{1}{T_l} & 0 \\ 0 & 0 & 0 & 0 & -\frac{1}{T_v} \end{bmatrix},   (26)

B_p^{T} = \begin{bmatrix} 0 & 0 & 0 & \frac{1}{T_l} & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{T_v} \end{bmatrix},   (27)

C_p = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix},   (28)

with a_{11} = -\frac{E_{lp}}{L}\left( v_{0,i+1}\frac{d\beta_{i+1}}{d\tau_i} + v_{0,i}\frac{df_i}{d\tau_i} \right);\quad a_{13} = \frac{E_{lp}}{L}\left( \frac{dL_1}{d\theta_i} + \frac{dL_2}{d\theta_i} \right)\frac{1}{G_R};\quad a_{31} = -\frac{1}{G_R}\frac{\partial M_{load}}{\partial\tau_i}\frac{1}{J}\frac{360}{2\pi};\quad a_{32} = -\frac{\partial M_{load}}{\partial\theta_i}\frac{1}{J G_R}\frac{360}{2\pi}.
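As an illustration of how such a model can be driven by the controller of Section 2, the sketch below discretizes the continuous state-space model (25) with a simple forward-Euler step at the 0.02 s control period used later. The numeric entries of A_p and B_p here are arbitrary placeholders for shape only, not the mill parameters, and the failure test merely mirrors the bounds quoted in Section 3.1 (assuming the stress state is expressed in MPa).

```python
import numpy as np

rng = np.random.default_rng(0)
dt = 0.02                                          # control period used in the experiments
Ap = -np.eye(5) + 0.05 * rng.normal(size=(5, 5))   # placeholder dynamics, NOT the mill model
Bp = np.zeros((5, 2)); Bp[3, 0] = 1.0; Bp[4, 1] = 1.0   # inputs drive the current and speed loops
Cp = np.array([[0., 1., 0., 0., 0.],               # Delta-theta_i (looper angle increment)
               [1., 0., 0., 0., 0.]])              # Delta-tau_i (strip stress increment)

def looper_step(x, u):
    """Forward-Euler step of x_dot = Ap x + Bp u; x = [dtau, dtheta, domega, di_act, dv0]."""
    x_next = x + dt * (Ap @ x + Bp @ u)
    y = Cp @ x_next
    failed = abs(y[0]) > 3.3 or abs(y[1]) > 1.029  # angle in degrees, stress in MPa
    return x_next, y, failed
```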
3.1 Experiment 1: Nominal Model with External Noise
In our experiments, we conduct 100 runs for each experiment. A run consists of a maximum of 1000 consecutive trials. A run is considered successful provided that a trial has lasted 6000 time steps and the index of that trial is equal to or less than 1000. Otherwise, the run is considered unsuccessful. The time for each step is 0.02 s, and a trial is a complete process from start to finish, where a trial ends when either the tension is outside the range [-1.029 MPa, 1.029 MPa] or the angle is outside the range [-3.3°, 3.3°]. Several experiments were conducted to evaluate the effectiveness of our proposed learning and control design. The critic neural network is chosen as a 7-10-1 MLP structure and the action neural network is chosen as a 5-6-2 MLP structure. The parameters used in the simulations are summarized in Table 1, with the same notations defined in [4]. Besides, we have added both sensor and actuator noise to the state measurements and action network outputs. In our current study, we consider both Uniform and Gaussian noise on actuators and sensors. We present the experimental results in the left part of Table 2. There are three sub-columns under
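The run/trial bookkeeping described above can be organized as in the following sketch. The helpers reset_plant, reset_learner, and learn_one_step are hypothetical stand-ins for the plant reset, learner initialization, and one combined control-and-learning step; the stub versions given here only make the sketch runnable.

```python
import numpy as np

rng = np.random.default_rng(1)
reset_learner = lambda: {}                      # hypothetical learner state
reset_plant = lambda: np.zeros(5)               # hypothetical plant reset
def learn_one_step(learner, x):                 # stub: small random chance of failure per step
    return x, rng.random() < 1e-4

def evaluate(n_runs=100, max_trials=1000, max_steps=6000):
    """A run succeeds if some trial with index <= 1000 survives 6000 steps of 0.02 s."""
    successes, trials_needed = 0, []
    for _ in range(n_runs):
        learner = reset_learner()
        for trial in range(1, max_trials + 1):
            x = reset_plant()
            for _ in range(max_steps):
                x, failed = learn_one_step(learner, x)
                if failed:
                    break
            else:                                # trial lasted the full 6000 steps
                successes += 1
                trials_needed.append(trial)
                break
    return successes / n_runs, np.mean(trials_needed) if trials_needed else float("nan")
```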
Table 1. Summary of the parameters used in obtaining the results given in Table 2

parameter   lc(0)   la(0)   lc(f)   la(f)   Nc    Na    Tc     Ta
value       0.28    0.28    0.005   0.005   100   200   0.04   0.004
Table 2. Performance evaluation of MIMO method

Nominal model:
Noise type            Success rate   trial
Noise free            96%            224.9
U. 5% a.*             96%            236.3
U. 10% a.             94%            190
U. 5% s.†             92%            208.2
U. 10% s.             93%            196.4
G.‡ σ²(0.1) s.        88%            273.8
G. σ²(0.2) s.         79%            301.3
U. 5% a.s.            95%            208.9
U. 10% a.s.           94%            280.6
G. σ²(0.1) a.s.       84%            324.8
G. σ²(0.2) a.s.       74%            372.6

Generalized model:
Noise type            Perturbation    Success rate   trial
Noise free            U. 5%           96%            212.4
U. 5% a.s.            U. 5%           95%            226.9
U. 10% a.s.           U. 5%           94%            249.9
G. σ²(0.1) a.s.       U. 5%           81%            381.9
G. σ²(0.2) a.s.       U. 5%           71%            402.9
Noise free            G. σ²(0.001)    91%            235.9
U. 5% a.s.            G. σ²(0.001)    95%            228.9
U. 10% a.s.           G. σ²(0.001)    93%            233.4
G. σ²(0.1) a.s.       G. σ²(0.001)    76%            269.1
G. σ²(0.2) a.s.       G. σ²(0.001)    76%            357.3

* a.: actuators are subject to the noise; † s.: sensors are subject to the noise; a.s.: both actuators and sensors are subject to the noise; U.: Uniform noise; ‡ G.: Gaussian noise
this caption. Specifically, the column 'Noise type' indicates which kind of noise is added into the looper system (actuators, sensors, or both), the column 'Success rate' denotes the number of successful runs over the total number of runs, and the column 'trial' gives the average number of trials needed among the successful runs.

3.2 Experiment 2: Generalized Model with Internal Perturbation and External Noise
As a matter of fact, variation exists in the inner looper system due to fluctuations of temperature and reduction ratio. The latter plays an important role in the looper dynamics. Therefore, in this case, we imitate the practical looper object via the generalized model with a perturbation of the forward slip in the elements of matrix A_p in equation (26). In this way, the generalized plant model with the matrix \tilde{A}_p can be presented as follows
\tilde{A}_p = A_p + \Delta A_p = \begin{bmatrix} a_{11}(1+\delta) & a_{12} & a_{13} & a_{14} & a_{15}(1+\delta) \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ a_{51} & a_{52} & a_{53} & a_{54} & a_{55} \end{bmatrix}.   (29)
Since the deviation in the strip thickness is mostly below the 100μm or so, a Gaussian distribution with a mean zero and variance 0.001 is chosen to simulate the perturbation. The right part of Table 2 demonstrates the simulation results. We would like to note that the column of ’perturbation’ implies the kind of imitation for inner perturbation imposed on the looper system. From these results, we can see the proposed approach demonstrates effective performance when the system is subject to not only the external noise in sensors and actuators, but also internal parameter perturbations. Further observations show that the success rate and trial numbers in this case are almost at the same level compared to the system under general distributions (the left part of Table 2), which demonstrates the robustness of this approach.
4 Conclusions
In this paper, we investigate the application of ADP for an industrial application problem. We first derive the learning procedure of both action network and critic network with multiple control variables, and then study its control performance of the tension and height of the looper system in a hot strip mill. Various simulation results under different types of internal perturbations and external noises demonstrated the effectiveness of this approach.
References 1. Werbos, P.J.: Adp: The key direction for future research in intelligent control and understanding brain intelligence. IEEE Transactions on Systems Man and Cybernetics Part B-Cybernetics 38(4), 898–900 (2008) 2. Werbos, P.J.: Intelligence in the brain: A theory of how it works and how to build it. Neural Networks 22(3), 200–212 (2009) 3. Wang, F.Y., Zhang, H., Liu, D.: Adaptive dynamic programming: An introduction. IEEE Computational Intelligence Magazine 4(2), 39–47 (2009) 4. Si, J., Yu-Tsung, W.: Online learning control by association and reinforcement. IEEE Transactions on Neural Networks 12(2), 264–276 (2001) 5. Park, J.W., Harley, R.G., Venayagamoorthy, G.K.: Adaptive-critic-based optimal neurocontrol for synchronous generators in a power system using mlp/rbf neural networks. IEEE Transactions on Industry Applications 39(5), 1529–1540 (2003) 6. Enns, R., Si, J.: Helicopter trimming and tracking control using direct neural dynamic programming. IEEE Transactions on Neural Networks 14(4), 929–939 (2003)
7. Liu, D., Javaherian, H., Kovalenko, O., Huang, T.: Adaptive critic learning techniques for engine torque and air-fuel ratio control. IEEE Transactions on Systems Man and Cybernetics Part B-Cybernetics 38(4), 988–993 (2008) 8. He, H., Liu, B.: A hierarchical learning architecture with multiple-goal representations based on adaptive dynamic programming. In: Proc. IEEE International Conference on Networking, Sensing, and Control, ICNSC 2010 (2010) 9. Werbos, P.J.: Backpropagation through time: What it does and how to do it. Proceedings of the IEEE 78(10), 1550–1560 (2002) 10. Cheng, C.C., Hsu, L.Y., Hsu, Y.L., Chen, Z.S.: Precise looper simulation for hot strip mills using an auto-tuning approach. The International Journal of Advanced Manufacturing Technology 27(5), 481–487 (2006) 11. Fu, J., Yang, W.D., Li, B.Q.: Lmis based h decoupling method and the looper height and tension control. Kongzhi yu Juece/Control and Decision 20, 883–886+891 (2005)
Adaptive Dual Heuristic Programming Based on Delta-Bar-Delta Learning Rule
Jun Wu, Xin Xu, Chuanqiang Lian, and Yan Huang
Institute of Automation, College of Mechatronics and Automation, National University of Defense Technology, Changsha, 410073, China
[email protected]
Abstract. Dual Heuristic Programming (DHP) is a class of approximate dynamic programming methods using neural networks. Although there have been some successful applications of DHP, its performance and convergence are greatly influenced by the design of the step sizes in the critic module as well as the actor module. In this paper, a Delta-Bar-Delta learning rule is proposed for the DHP algorithm, which helps the two modules adjust learning rate individually and adaptively. Finally, the feasibility and effectiveness of the proposed method are illustrated in the learning control task of an inverted pendulum.
Keywords: reinforcement learning, adaptive critic design, dual heuristic programming, Delta-Bar-Delta, neural networks.
1 Introduction Dynamic Programming (DP) [1] is a general approach for sequential decision making under the framework of Markov decision processes (MDPs). However, classical dynamic programming algorithms, such as value iteration and policy iteration, are computationally expensive for MDPs with large or continuous state/action spaces. In recent years, approximate dynamic programming (ADP) and reinforcement learning (RL) have been widely studied to solve the difficulties in DP. Particularly, Adaptive Critic Designs (ACDs) were developed [2][3][4] as a class of learning control methods for MDPs with continuous state and action spaces. ACDs combine the concepts of reinforcement learning and approximate dynamic programming, and neural networks are usually used to approximate the value functions and policies. In various applications, ACDs have been shown to be capable of optimization over time under conditions of noises and uncertainties [5][6]. As depicted in Fig.1, the ACD-based learning controller approximates a nearoptimal control law for a dynamic system by successively adapting two Artificial Neural Networks (ANNs), namely, an actor neural network (which dispenses the control signals) and a critic neural network (which learns to approximate the cost-togo or utility function). These two neural networks approximate the Hamilton-JacobiBellman equation associated with the optimal control theory [6]. According to the critic's inputs and outputs, there are three types of ACD-based learning algorithms, i.e. Heuristic Dynamic Programming (HDP), Dual Heuristic Programming (DHP), and the action dependent versions of HDP and DHP, namely, ADHDP and ADDHP D. Liu et al. (Eds.): ISNN 2011, Part III, LNCS 6677, pp. 11–20, 2011. © Springer-Verlag Berlin Heidelberg 2011
Fig. 1. The schematic diagram of ACDs
[5]. As an important class of ACD-based learning control method, DHP estimates the derivatives of utility functions and has been successfully applied in some real-world problems, such as generator control [6][7], motion controller design [8] , collective robotic search [9] and flight control [10]. As the DHP technique relies on the online learning performance of ANNs, selecting an appropriate learning rate is crucial for its successful implementation. A low learning rate may be good for learning stably in ANNs. However, it may become a bad choice for DHP since DHP requires the ANNs converge quickly enough so as to learn a near-optimal policy successfully. In DHP, the critic module and the actor module depend on each other. The ‘bad’ learning performance in one module will influence the other module, and the ‘bad’ result will be strengthened iteratively. Finally, it may result in failures for learning a good control policy. Therefore, it is desirable to study new methods in the design of learning rates to improve DHP. However, according to the authors’ knowledge, there are very few works on the optimization of learning rates in DHP. In this paper, a new method, which adopts the Delta-Bar-Delta rule [11], is presented to accelerate DHP’s learning performance heuristically. The simulations on the Inverted Pendulum problem show that the new method can regulate the learning rates adaptively and, finally, increase DHP’s success rate.
2 Dual Heuristic Programming In ACDs, the critic network is used to estimate the utility function J or its derivative with respect to the state vector x. When a deterministic MDP with continuous state and action spaces is considered, the utility function J embodies the desired control objective through time and is expressed in the Bellman equation as follows:
$J(t) = U(t) + \gamma J(t+1) = \sum_{k=0}^{\infty}\gamma^k U(t+k)$,  (1)
where γ is a discount factor (0< γ ≤1), and U(.) is a local cost. The aim of DHP is to learn a sequence of actions that maximize or minimize J. The entire process of DHP can be characterized as a simultaneous optimization problem: approximated gradientbased optimization for the critic module together with gradient-based optimization for the actor module. Since the gradient is an important aspect for controller training, DHP uses critics to estimate the derivatives of J directly, i.e. λ = ∂J / ∂x instead of the value function itself. The corresponding Bellman's Recursion is expressed as:
$\frac{\partial J(t)}{\partial x(t)} = \frac{\partial U(t)}{\partial x(t)} + \gamma\frac{\partial J(t+1)}{\partial x(t)}$.  (2)
Then the error in critic training can be expressed as
$E_c = \lambda(t) - \gamma\frac{\partial J(t+1)}{\partial x(t)} - \frac{\partial U(t)}{\partial x(t)} = \frac{\partial J(t)}{\partial x(t)} - \gamma\frac{\partial J(t+1)}{\partial x(t+1)}\frac{\partial x(t+1)}{\partial x(t)} - \frac{\partial U(t)}{\partial x(t)}$.  (3)
In DHP, applying the chain rule for derivatives yields,
$\frac{\partial J(t+1)}{\partial x_i(t)} = \sum_{j=1}^{n}\lambda_j(t+1)\frac{\partial x_j(t+1)}{\partial x_i(t)} + \sum_{k=1}^{m}\sum_{j=1}^{n}\lambda_j(t+1)\frac{\partial x_j(t+1)}{\partial u_k(t)}\frac{\partial u_k(t)}{\partial x_i(t)}$,  (4)
where $\lambda_j(t+1) = \partial J(t+1)/\partial x_j(t+1)$, and n and m are the numbers of outputs of the model network and the actor network, respectively. By exploiting (4), each component of the vector $E_c$ is determined by:
$E_{cj}(t) = \frac{\partial J(t)}{\partial x_j(t)} - \gamma\frac{\partial J(t+1)}{\partial x_j(t)} - \frac{\partial U(t)}{\partial x_j(t)} - \sum_{k=1}^{m}\frac{\partial U(t)}{\partial u_k(t)}\frac{\partial u_k(t)}{\partial x_j(t)}$.  (5)
Fig. 2. The ANN structure used in DHP
To train both the actor and critic networks, a model of the system dynamics that includes all the terms from the Jacobian matrix of the coupled system, i.e. ∂x j (t + 1) / ∂xi (t ) and ∂x j (t + 1) / ∂uk (t ) , is needed. These derivatives can be found from an analytic model directly or from a neural network model (or fuzzy model) indirectly [12]. The actor network is adapted by propagating λ(t+1) back through the model down to the actor network. According to the Bellman optimality principle, to obtain the best control signal u(t ) , the following equation should hold,
$\frac{\partial J(t)}{\partial u(t)} = 0 \;\Rightarrow\; \frac{\partial U(t)}{\partial u(t)} + \gamma\frac{\partial J(t+1)}{\partial u(t)} = \frac{\partial U(t)}{\partial u(t)} + \gamma\frac{\partial J(t+1)}{\partial x(t+1)}\frac{\partial x(t+1)}{\partial u(t)} = 0$.  (6)
As shown in Fig. 2, to ensure non-linear approximation capacity and fault-tolerance capacity, ANNs based on multi-layer perceptrons are usually adopted to build the critic module and the actor module in DHP. The activation functions in the ANNs usually include the linear function Linear(x) = kx and the non-linear sigmoid function $\mathrm{Sigmoid}(x) = (1 + e^{-x})^{-1}$.
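For reference, a minimal sketch of such a three-layer perceptron forward pass (sigmoid hidden layer, linear output layer) is given below. The layer sizes shown here are taken from the experiment section (6 state inputs, 5 hidden nodes, 6 critic outputs or 1 actor output); the variable names and random initialization are illustrative assumptions, not the authors' code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W1, b1, W2, b2):
    """Three-layer perceptron: sigmoid hidden layer, linear output layer."""
    h = sigmoid(W1 @ x + b1)   # hidden activations
    return W2 @ h + b2         # linear output (e.g., the critic estimate of lambda)

# Example sizes: 6 inputs, 5 sigmoid hidden nodes, 6 linear outputs (critic).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(scale=0.1, size=(5, 6)), np.zeros(5)
W2, b2 = rng.normal(scale=0.1, size=(6, 5)), np.zeros(6)
lam = mlp_forward(np.zeros(6), W1, b1, W2, b2)
```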
3 Adaptive Learning Rates Based on the Delta-Bar-Delta Rule
In the original DHP algorithm, the learning rate adopted in the weight update is always fixed and uniform. However, a fixed learning rate is not an optimal choice for ANN learning; obviously, DHP with a fixed learning rate lacks adaptability. The learning rate should be regulated according to the problem at hand and the current learning phase. Furthermore, since an ANN has many weights, it is questionable whether all the weights should be learned with the same learning rate. There is an additional difficulty in adopting a fixed learning rate for DHP: since the critic network and the actor network strongly depend on each other, the learning results of one network are passed to the other network and affect its learning procedure. Because the two learning procedures operate iteratively, 'bad' learning behavior in one module may result in an unsuccessful learning procedure. Therefore, although a conservative learning rate is usually selected in ANNs to achieve good convergence, an adaptive learning rate can be a better choice for DHP. In the following, a Delta-Bar-Delta rule [11] meeting such demands is adopted for DHP.
3.1 The Delta-Bar-Delta Rule
As mentioned above, the choice of learning rates greatly affects the learning procedure. Since the cost surface of a multi-layer network can be complex, a fixed learning rate can be inappropriate: what works in one location may not work well in another. The Delta-Bar-Delta rule regulates the learning rate locally and heuristically as training progresses [11]:
– Each weight has its own learning rate.
– For each weight, the gradient at the current time step is compared with the gradient at the previous step (actually, previous gradients are averaged).
– If the gradient is in the same direction, the learning rate is increased.
– If the gradient is in the opposite direction, the learning rate is decreased.
– The rule should be used in batch mode only.
The Delta-Bar-Delta rule [11] regulates the learning rates according to the gradients of objective function E(.) as follows:
$\frac{\partial E(t)}{\partial \alpha_{ij}(t)} = -\frac{\partial E(t)}{\partial w_{ij}(t)}\frac{\partial E(t-1)}{\partial w_{ij}(t-1)}$,  (7)
where $\alpha_{ij}(t)$ is the learning rate with respect to weight $w_{ij}(t)$. The learning rate can be regulated as
$\Delta\alpha_{ij}(t+1) = -\gamma\frac{\partial E(t)}{\partial \alpha_{ij}(t)} = \gamma\frac{\partial E(t)}{\partial w_{ij}(t)}\frac{\partial E(t-1)}{\partial w_{ij}(t-1)}$,  (8)
where γ is a positive number. Due to the difficulty in choosing an appropriate γ, a revised learning-rate regulating rule is proposed as follows:
$\Delta\alpha_{ij}(t+1) = \begin{cases} a & \text{if } S_{ij}(t-1)D_{ij}(t) > 0 \\ -b\,\alpha_{ij}(t) & \text{if } S_{ij}(t-1)D_{ij}(t) < 0 \\ 0 & \text{else} \end{cases}$  (9)
where $D_{ij}(t) = \partial E(t)/\partial w_{ij}(t)$, $S_{ij}(t) = (1-\xi)D_{ij}(t-1) + \xi S_{ij}(t-1)$, and a, b and ξ are positive numbers usually set as [11]: $10^{-4} \le a \le 0.1$, $0.1 \le b \le 0.5$, $0.1 \le \xi \le 0.7$. Apparently, the learning rate $\alpha_{ij}(t)$ can only increase linearly but decreases exponentially, which protects the learning rate from becoming too large.
3.2 Adopting Delta-Bar-Delta for DHP
According to Fig.2, the expected output of the critic network is:
$\lambda_s^D(t) = \frac{\partial U(t)}{\partial x_s(t)} + \sum_{k=1}^{K}\left\{\frac{\partial U(t)}{\partial u_k(t)}\frac{\partial u_k(t)}{\partial x_s(t)}\right\} + \gamma\sum_{s'=1}^{S}\left\{\frac{\partial J(t+1)}{\partial x_{s'}(t+1)}\times\sum_{k=1}^{K}\left(\frac{\partial x_{s'}(t+1)}{\partial u_k(t)}\frac{\partial u_k(t)}{\partial x_s(t)}\right)\right\}$.  (10)
The critic network learns minimization of the following error measure over time
$E_c(t) = \frac{1}{2}\sum_{s}\left(\lambda_s(t) - \lambda_s^D(t)\right)^2$.  (11)
When a fixed learning rate is used, the update rule for the critic weights is
$\Delta w_{sm}(t) = \alpha\frac{\partial E_c(t)}{\partial w_{sm}(t)} = \alpha(\lambda_s(t) - \lambda_s^D(t))\frac{\partial \lambda_s(t)}{\partial w_{sm}(t)}$,  (12)
where
$w_{sm}$ denotes the weight of the connection between the m-th hidden node and the s-th output of the critic network, and α is a positive learning rate. $\lambda_s(t)$ is the value of the s-th output of the critic network. The actor network tries to minimize the following error measure over time:
$E_a = \sum_{t}\left[\frac{\partial U(t)}{\partial u(t)} + \gamma\frac{\partial J(t+1)}{\partial u(t)}\right]^2$.  (13)
If a fixed learning rate is used, the update rule for the actor weights is
$\Delta v_{km}(n) = \alpha\left(\frac{\partial U(n)}{\partial u_k(n)} + \gamma\frac{\partial J(n+1)}{\partial u_k(n)}\right)\frac{\partial u_k(n)}{\partial v_{km}(n)} = \alpha\left(\frac{\partial U(n)}{\partial u_k(n)} + \gamma\sum_{s}\lambda_s^D(n+1)\frac{\partial x_s(n+1)}{\partial u_k(n)}\right)\frac{\partial u_k(n)}{\partial v_{km}(n)}$,  (14)
where α is a positive learning rate and $v_{km}$ denotes the weight of the connection between the m-th hidden node and the k-th output of the actor network. Generally speaking, if the weights of the actor network finally converge, it can be considered that a near-optimal control policy has been obtained. With the help of the Delta-Bar-Delta rule, the weights of the critic network in DHP are updated as follows:
$\Delta w_{sm}(t) = \alpha_{sm}\frac{\partial E_c(t)}{\partial w_{sm}(t)} = \alpha_{sm}(\lambda_s(t) - \lambda_s^D(t))\frac{\partial \lambda_s(t)}{\partial w_{sm}(t)}$,  (15)
where $\alpha_{sm}$ is the adaptive learning rate for weight $w_{sm}$. Correspondingly, the weights of the actor network are updated as follows:
$\Delta v_{km}(t) = \alpha_{km}\left(\frac{\partial U(t)}{\partial u_k(t)} + \gamma\sum_{s}\lambda_s^D(t+1)\frac{\partial x_s(t+1)}{\partial u_k(t)}\right)\frac{\partial u_k(t)}{\partial v_{km}(t)}$,  (16)
where $\alpha_{km}$ is the adaptive learning rate for weight $v_{km}$.
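As a concrete illustration of (9), (15) and (16), the following sketch maintains one adaptive learning rate per weight and applies it to a gradient matrix such as $\partial E_c/\partial w$ for the critic or the actor gradient in (16). The helper name, NumPy usage and the particular values of a, b and ξ are assumptions for illustration only, not the authors' implementation.

```python
import numpy as np

def delta_bar_delta_update(w, grad, lr, s_bar, a=0.01, b=0.2, xi=0.5):
    """One Delta-Bar-Delta step for a weight matrix w.

    grad  : current gradient D(t) = dE(t)/dw(t)
    lr    : per-weight learning rates alpha(t)
    s_bar : exponentially averaged past gradients S(t-1)
    """
    same_sign = s_bar * grad > 0
    opp_sign = s_bar * grad < 0
    lr = lr + a * same_sign                  # linear increase, per (9)
    lr = lr - b * lr * opp_sign              # exponential decrease, per (9)
    w = w - lr * grad                        # gradient step to reduce the error measure
    s_bar = (1.0 - xi) * grad + xi * s_bar   # update the averaged gradient
    return w, lr, s_bar
```

The same routine can be applied to the critic weights $w_{sm}$ with the gradient of (11) and to the actor weights $v_{km}$ with the gradient in (16).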
4 Experiments and Analysis
4.1 The Inverted Pendulum Problem
In this paper, the Inverted Pendulum problem [1] is used to evaluate the improved DHP algorithm, which requires balancing a pendulum of unknown length and mass at the upright position by applying forces to the cart. The control system of an inverted pendulum is depicted in Fig. 3.
Fig. 3. The control system of an inverted pendulum
The inverted pendulum system consists of a cart moving horizontally and a pole with one end in contact with the cart. Let x denote the horizontal distance between the center of the cart and the center of the track, where x is negative when the cart is in the left part of the track. θ denotes the angle of the pole from its upright position (in degrees) and F is the amount of force (N) applied to move the cart towards its left or right. The state variables $\dot{x}$ and $\dot{\theta}$ are the time derivatives of x and θ, respectively. The state transitions are governed by the nonlinear dynamics of the system:
$\ddot{x} = \alpha\left\{F + ml\left[\dot{\theta}^2\sin\theta - \ddot{\theta}\cos\theta\right] - \mu_c\,\mathrm{sgn}(\dot{x})\right\}$,  (17)
$\ddot{\theta} = \dfrac{g\sin\theta + \cos\theta\,\alpha\left[-F - ml\dot{\theta}^2\sin\theta + \mu_c\,\mathrm{sgn}(\dot{x})\right] - \mu_p\dot{\theta}/(ml)}{l\left(4/3 - \alpha m\cos^2\theta\right)}$,  (18)
where g is the gravity constant (g = 9.8 m/s²), m is the mass of the pendulum (m = 0.1 kg), l is the length of the pendulum (l = 1 m), α = 1.0/(m + Mcart), and Mcart is the mass of the cart (Mcart = 1.0 kg). The friction coefficient between the cart and the track is μc = 0.0005 and the friction coefficient between the cart and the pole is μp = 0.000002. The simulation step is set to 0.02 seconds. A reward of 0 is given as long as the angle of the pendulum does not exceed π/2 and the horizontal distance does not exceed the boundary of 2.4 m in absolute value; otherwise a penalty of -1 is given. The discount factor of the process is set to 0.90. For controlling the inverted pendulum, the task is to construct a control policy that can balance the pendulum only with state feedback and reward signals.
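As an illustrative sketch (not the authors' code), one simulation step implementing (17)-(18) with the parameter values above could look as follows, using Euler integration with the stated 0.02 s step; treating the angles in radians and the sign of zero velocity as +1 are assumptions made here for simplicity.

```python
import math

G, M_CART, M_POLE, L_POLE = 9.8, 1.0, 0.1, 1.0
MU_C, MU_P, DT = 0.0005, 0.000002, 0.02
ALPHA = 1.0 / (M_POLE + M_CART)

def pendulum_step(state, force):
    """Advance (x, x_dot, theta, theta_dot) by one 0.02 s Euler step."""
    x, x_dot, theta, theta_dot = state
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    # Eq. (18): pole angular acceleration
    num = (G * sin_t
           + cos_t * ALPHA * (-force - M_POLE * L_POLE * theta_dot**2 * sin_t
                              + MU_C * math.copysign(1.0, x_dot))
           - MU_P * theta_dot / (M_POLE * L_POLE))
    theta_acc = num / (L_POLE * (4.0 / 3.0 - ALPHA * M_POLE * cos_t**2))
    # Eq. (17): cart acceleration
    x_acc = ALPHA * (force + M_POLE * L_POLE * (theta_dot**2 * sin_t - theta_acc * cos_t)
                     - MU_C * math.copysign(1.0, x_dot))
    x, x_dot = x + DT * x_dot, x_dot + DT * x_acc
    theta, theta_dot = theta + DT * theta_dot, theta_dot + DT * theta_acc
    reward = 0.0 if (abs(theta) < math.pi / 2 and abs(x) <= 2.4) else -1.0
    return (x, x_dot, theta, theta_dot), reward
```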
4.2 Implementation and Evaluation
Based on the above model, the state vector can be defined as
$S(t) = (x_1(t), x_2(t), x_3(t), x_4(t), x_5(t), x_6(t))$,  (19)
where $x_1(t) = \theta(t)$, $x_2(t) = \dot{\theta}(t)$, $x_3(t) = \ddot{\theta}(t)$, $x_4(t) = x(t)$, $x_5(t) = \dot{x}(t)$, and $x_6(t) = \ddot{x}(t)$. The action is defined as $a \in [-10\,\mathrm{N}, 10\,\mathrm{N}]$. The local utility function is defined as:
$U(t) = 0.25\,(x(t) - x_d(t))^2 + 0.25\,(\theta(t) - \theta_d(t))^2$,  (20)
where $x_d(t) = 0$ and $\theta_d(t) = 0$ denote the desired position and desired angle, respectively. The critic network and the actor network are both implemented with three-layer perceptrons, where the hidden layer has 5 nodes with sigmoid activation functions. The number of input nodes is equal to the dimension of the state, which is 6. For the critic network, the output layer has 6 linear nodes, while in the actor network only 1 linear node is needed since the action dimension is 1. In the experiments, the maximum number of episodes is set to 500 and the maximum number of control steps per episode is 4000. One episode is unsuccessful if, at the end of the episode, either the pole has fallen or the cart has exceeded the boundary. The initial position and angle are randomized uniformly within the intervals $x \in [-0.5\,\mathrm{m}, 0.5\,\mathrm{m}]$ and $\theta \in [-1^\circ, 1^\circ]$, respectively. To evaluate the proposed method comprehensively, different initial learning rates, i.e., $\alpha_m$ = 0.01, 0.04, 0.07, 0.10, 0.15, 0.20, are tested. In Fig. 4, the convergence rates of the original DHP and the improved DHP are compared in terms of the norm of the difference between the actor weights in two successive time steps, i.e., $e_w = \|w_{t+1} - w_t\|$. The initial learning rate $\alpha_m$ is set to 0.04. The necessary precondition for comparison is that both of the two
learning processes succeed, i.e., the final learned policy can balance the pole for the maximum number of steps. Generally speaking, the convergence of the actor weights indicates that the DHP learning process converges, so the errors of the actor network's weights are used for comparison.
Fig. 4. A comparison of the convergence rate with/without Delta-Bar-Delta rule
Fig. 5. A comparison of the success rates between the original DHP and the improved DHP
Fig. 4 shows that the learning process of the original DHP algorithm converged after 43 s, whereas the improved DHP algorithm converged after only 21 s. It is evident that the adoption of the Delta-Bar-Delta rule helps the DHP algorithm learn more effectively and, finally, results in better convergence performance. Fig. 5 shows the success rates of the original DHP and the improved DHP. The success rate denotes the ratio of the number of successful learning processes to the total number of learning episodes, which is 500 in the simulation. According to the results, a distinct improvement is achieved by the improved DHP adopting the Delta-Bar-Delta rule. The above experimental results show that the improved DHP gains advantages over the original one. The Delta-Bar-Delta rule helps the networks learn with adaptive learning rates and thus achieve better convergence performance and success rates.
5 Conclusion
DHP has been shown to be a promising approach for online learning control in MDPs with continuous state and action spaces. However, the learning rates greatly influence the performance of DHP. In this paper, a Delta-Bar-Delta rule is adopted to regulate the learning rates of the critic and the actor networks adaptively. The feasibility and the effectiveness of the proposed method are illustrated in experiments on an Inverted Pendulum problem. Future work will include more theoretical and empirical analysis of adaptive learning rates in online reinforcement learning.
Acknowledgments. This work is supported in part by the National Natural Science Foundation of China under Grants 60774076, 61075072 and 90820302, the Fok Ying Tung Education Foundation under Grant 114005, and the Natural Science Foundation of Hunan Province under Grant 2007JJ3122.
References
1. Suykens, J.A.K., de Moor, B., Vandewalle, J.: Stabilizing Neural Controllers: A Case Study for Swing up a Double Inverted Pendulum. In: NOLTA 1993 International Symposium on Nonlinear Theory and Its Application, Hawaii (1993)
2. Werbos, P.J.: A Menu of Designs for Reinforcement Learning Over Time. In: Miller, W.T., Sutton, R.S., Werbos, P.J. (eds.) Neural Networks for Control, ch. 3. MIT Press, Cambridge (1990)
3. Werbos, P.J.: Approximate Dynamic Programming for Real-Time Control and Neural Modeling. In: White, D.A., Sofge, D.A. (eds.) Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, ch. 13. Van Nostrand Reinhold, New York (1992)
4. Liu, D.R.: Approximate Dynamic Programming for Self-Learning Control. Acta Automatica Sinica 31(1), 13–18 (2005)
5. Prokhorov, D., Wunsch, D.: Adaptive Critic Designs. IEEE Trans. Neural Networks 8, 997–1007 (1997)
6. Venayagamoorthy, G.K., Harley, R.G., Wunsch, D.C.: Comparison of Heuristic Dynamic Programming and Dual Heuristic Programming Adaptive Critics for Neurocontrol of a Turbogenerator. IEEE Transactions on Neural Networks 13(3), 763–764 (2002)
7. Park, J.W., Harley, R.G., Venayagamoorthy, G.K., et al.: Dual Heuristic Programming Based Nonlinear Optimal Control for a Synchronous Generator. Engineering Applications of Artificial Intelligence 21, 97–105 (2008)
8. Lin, W.S., Yang, P.C.: Adaptive Critic Motion Control Design of Autonomous Wheeled Mobile Robot by Dual Heuristic Programming. Automatica 44, 2716–2723 (2008)
9. Zhang, N., Wunsch II, D.C.: Application of Collective Robotic Search Using Neural Network Based Dual Heuristic Programming (DHP). In: Wang, J., Yi, Z., Żurada, J.M., Lu, B.-L., Yin, H. (eds.) ISNN 2006. LNCS, vol. 3972, pp. 1140–1145. Springer, Heidelberg (2006)
10. Ferrari, S., Stengel, R.F.: Online Adaptive Critic Flight Control. Journal of Guidance, Control, and Dynamics 27(5), 777–786 (2004)
11. Jacobs, R.A.: Increased Rates of Convergence Through Learning Rate Adaptation. Neural Networks 1, 295–307 (1988)
12. Lendaris, G.G., Shannon, T.T., Schultz, L.J., et al.: Dual Heuristic Programming for Fuzzy Control. In: Proceedings of IFSA / NAFIPS, Vancouver, B.C., vol. 1, pp. 551–556 (2001)
A Design Decision-Making Support Model for Prioritizing Affective Qualities of Product Chun-Chih Chen 1 and Ming-Chuen Chuan 2 1
Department of industrial design, National Kaohsiung Normal University, 62, Shenjhong Rd. , Yanchao Township, Kaohsiung Taiwan 824, R.O.C.
[email protected] 2 Institute of applied arts, National Chiao Tung University, 1001 University Road, Hsinchu, Taiwan 300, R.O.C.
[email protected]
Abstract. Nowadays, the affective qualities elicited by a product's design play a crucial role in determining the likelihood of its success in today's increasingly competitive marketplace. To resolve the trade-off dilemma in multiple-attribute optimization of affective user satisfaction, a decision-making support model based on a two-dimensional quality model, the extended Kano model, is presented in this study. The proposed model can assist designers in prioritizing actions and resource allocation to guide ideal product conception in different competitive situations. The implementation of the proposed model is demonstrated in detail via a case study of mobile phone design.
Keywords: User satisfaction, Kano model, Multiple-attribute decision making, affective design.
1 Introduction
In order to improve competitiveness, a well-designed product should be able not only to meet the basic functionality requirements, but also to satisfy users' psychological needs (or feelings, affects). The affective qualities elicited by the product form design have become a critical determinant of user satisfaction. Considering the multiple-attribute nature of user satisfaction, however, there are difficult trade-offs to be made when deciding which affective qualities should be emphasized in the product design process. Resolving this problem requires understanding which qualities create more satisfaction than others. Kano et al. [6] developed a two-dimensional model describing the linear and non-linear relationship between quality performance and user satisfaction. The Kano model divides qualities into must-be, one-dimensional, indifferent and attractive qualities based on how they satisfy user needs, as Figure 1 illustrates. So far, the application of the Kano model has mostly focused on the effects of physical or technical product qualities on user satisfaction. Considering the different abilities of physical and subjective (affective) qualities to satisfy user needs, this study hypothesizes that, in addition to the existing Kano quality categories, more quality types may be identified for affective qualities. Thus, an extended Kano model is proposed to explore the different effects of affective qualities on user satisfaction.
Fig. 1. Kano model of user satisfaction
A mobile phone design experiment was conducted to demonstrate the benefits of using the proposed design decision-making support model to prioritize affective qualities for enhancing user satisfaction.
2 Theoretical Background 2.1 The Extended Kano Model User satisfaction has mostly been viewed as one-dimensional; that is, product attributes result in user satisfaction when the attributes are fulfilled and dissatisfaction when not fulfilled. However, fulfilling user expectations to a great extent does not necessarily imply a high level of user satisfaction. Kano et al. [6] have proposed a ‘two-dimensional’ quality model characterizing the following categories of product attributes that influence user satisfaction in different ways (Figure 1). It can help designers ascertain the extent to which a given set of product attributes of a product satisfies users’ wants and needs [6]. The Kano model was originally developed to probe the characteristics of objective (physical or technical) product qualities in affecting users’ satisfaction in different ways. For a physical quality, increasing quality performance generally causes monotonous enhancement of satisfaction. But this may not be true for some affective qualities. For example, Berlyne [1] suggested a Wundt (inverted-U) curve function to explain the effect of sense feelings on user preferences. He indicated that the hedonic value (pleasure) associated with the perception of a stimulus, such as complexity, will peak when there is an optimal level of psychological arousal; too little arousal (too simple) causes indifference, while too much arousal (too complex) causes displeasure. To explain the relationships between quality performances and satisfaction in a systematic and more comprehensive manner, an extended Kano model using the linear regression model for classifying Kano categories is proposed in this study. The extended Kano model classifies product qualities into four distinct categories. Each quality category affects users’ satisfaction in a different way, as Figure 2 shows. The different types of qualities are explained as follows: 1. The attractive quality: Here, user satisfaction increases super linearly with increasing attribute performance (quality). There is not, however, a corresponding decrease in user satisfaction with a decrease in attribute performance (Figure 2 (A)).
2. The one-dimensional quality: Here, user satisfaction is a linear function of product attribute performance (quality). Increasing attribute performance leads to enhanced user satisfaction (Figure 2 (B)). 3. The must-be quality: Here, users become dissatisfied when the performance of this product attribute is low or absent. However, user satisfaction does not rise above the neutral level with increased performance of this product attribute (Figure 2 (C)). 4. The ‘nominal-the-better: A specified level of attribute performance will cause the highest user satisfaction (optimization). In other words, the closer a quality comes to the target performance, the higher the user satisfaction will be (Figure 2 (D)). Besides these four qualities, two more quality types can be identified: the indifferent and reversal qualities (to be precise, they should be called characteristics because they are not really a user need or quality). For the indifferent quality, user satisfaction is not affected by the performance of a product quality at all, as Figure 2 (E) illustrates. For the reversal quality, users are more dissatisfied as the attribute performance increases. It can be classified into four sub-types: reversed attractive quality (-A), reversed one-dimensional quality (-O), reversed must-be quality (-M), and reversed nominal-the-better (-NB). The characteristics of -NB quality (nominal-the-worse quality), -A quality, -O quality, and -M quality are simply the reversed situations of the A, O, M and NB, qualities, respectively, as shown in Figure 2 (F), (G), (H), (I).
Fig. 2. The extended Kano model (linear model)
2.2 Identification of the Extended Kano Qualities Ting and Chen [8] proposed a regression model to classify qualities into Kano types of quality. They performed a regression analysis for each quality, using user satisfaction as the dependent variable and the positive/negative performance of a quality as the independent variables. Positive performance of a quality means that this quality is presenting (sufficient), and negative performance means the quality is absent (insufficient). The impact of positive and negative quality performance on user satisfaction can be estimated by the following linear regression model:
$US = C + \beta_1 \times (-K_n) + \beta_2 \times K_p$,  (1)
Table 1. The Kano decision table for the extended Kano model

                   β2 Sig.(+)    β2 n.s.    β2 Sig.(-)
  β1 Sig.(+)          -NB           -A          -O
  β1 n.s.              A             I          -M
  β1 Sig.(-)           O             M           NB

(A: attractive; O: one-dimensional; M: must-be; I: indifferent; R: reversal; NB: nominal-the-better)
Where US is the degree of user satisfaction; negative and positive performance of a quality are represented as Kn and Kp , respectively; and β1 and β 2 are their corresponding regression coefficients. When the performance of a quality is negative, the value of the negative performance is equal to (- Kn ) and Kp is assigned to 0. On the other hand, if the performance of a quality is positive, the value of the positive performance is equal to Kp , and Kn is assigned to 0. Comparing the two regression coefficients ( β1 and β 2 ) can reveal the varied relationship between the quality and user satisfaction, according to the positive or negative value of the coefficients with significance. The greater the absolute value of the coefficient, the greater its effect on user satisfaction. According to the extended Kano model, qualities can be classified into different categories according to whether the regression coefficients reach the significant level or not. The classification guidelines are described as Table 1. In this table, Sig.(+) means the regression coefficient is significantly positive; Sig.(-) means the regression coefficient is significantly negative; n.s. means the regression coefficient is non-significant in the regression equation.
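As an illustrative sketch of this procedure (not the authors' SPSS analysis), the regression of Eq. (1) and the decision rules of Table 1 can be applied per quality as follows; the significance test shown here is a simplified ordinary-least-squares t-test, and all function and variable names are assumptions made for the example.

```python
import numpy as np
from scipy import stats

def classify_quality(perf, satisfaction, alpha=0.05):
    """Classify one affective quality with the extended Kano regression model.

    perf         : mean SD score of the quality for each product sample
    satisfaction : mean overall satisfaction for each product sample
    """
    kn = np.where(perf < 0, -perf, 0.0)   # negative performance Kn (entered as -Kn)
    kp = np.where(perf > 0, perf, 0.0)    # positive performance Kp
    X = np.column_stack([np.ones_like(perf), -kn, kp])
    beta, _, _, _ = np.linalg.lstsq(X, satisfaction, rcond=None)
    dof = len(perf) - 3
    mse = np.sum((satisfaction - X @ beta) ** 2) / dof
    se = np.sqrt(mse * np.diag(np.linalg.inv(X.T @ X)))
    p = 2 * stats.t.sf(np.abs(beta / se), dof)

    def sig(i):  # +1 significantly positive, -1 significantly negative, 0 n.s.
        return 0 if p[i] > alpha else (1 if beta[i] > 0 else -1)

    table = {(-1, 1): "O", (-1, 0): "M", (-1, -1): "NB",
             (0, 1): "A", (0, 0): "I", (0, -1): "-M",
             (1, 1): "-NB", (1, 0): "-A", (1, -1): "-O"}
    return table[(sig(1), sig(2))], beta
```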
3 The Verified Experiment Design for the Extended Kano Model
To illustrate how the extended Kano model can be applied to identify the different relationships between affective qualities and user satisfaction, we conducted an experimental study of mobile phones. The design of the experiment is described as follows. In this study, photos of mobile phone designs were used as the stimuli for eliciting users' affective responses. A total of 35 representative samples (Figure 3) were selected.
Fig. 3. Representative mobile phone designs
Each mobile phone design was represented by a scaled 4"×6" grayscale photographic image. A literature review [2, 4] provided a comprehensive set of affective qualities for selecting the representative semantic differential scales (adjectives). A group of experts and users was then recruited to identify the suitable and important qualities from the collected pool of qualities. As a result, 20 bipolar adjective pairs were selected for the SD survey, as shown in Table 2. Then 32 male and 28 female subjects between 18 and 24 years old were recruited for the SD evaluation experiment. All 60 subjects were asked to evaluate the 35 experimental samples of mobile phones, with a 7-point Likert scale (−3 to +3) of the SD method, on each of the 20 affective qualities and on overall satisfaction.
4 Result and Discussion
4.1 The Classification of Affective Qualities Based on the Extended Kano Model
To identify the different effects of the qualities on user satisfaction, the extended Kano model using the regression method [8], as described in Section 2.2, was used to classify these qualities into different Kano categories. The means of the 20 qualities for each phone design were first transformed into negative and positive performance, respectively. Setting overall satisfaction as the dependent variable and positive and negative quality performance as the independent variables, linear regression analyses were performed in SPSS for each of the 20 affective qualities according to Eq. (1). The significance of the regression coefficients was then used to determine the proper category for each quality, according to Table 1. The result of the Kano classification of each attribute is summarized in Table 2. Note that the qualities related to users' aesthetic perception (No. 1-No. 7) were categorized as one-dimensional qualities; higher performance in these qualities may lead to higher user satisfaction. The three qualities related to the emotion factor, 'lively', 'fun' (interface style) and 'fascinating', were categorized as must-be qualities. If the performance of these qualities is low, young users become dissatisfied; however, high performance of these attributes does not raise the level of their satisfaction. Pleasant emotions might presumably be regarded as an attractive quality or an excitement need for users, but the results indicate that cell phone designs with these qualities are a basic need for young users. The 'anticipated' quality is classified as an attractive quality. That is, an interface design with the characteristic of easy understanding can satisfy users, while, on the contrary, an amazing interface design will not lead to user dissatisfaction. The 'simple' attribute is classified as a reversed must-be (-M) quality; in other words, a 'complex' (not too simple) feeling in phone design is a must-be quality for young users in this case. The other qualities are all categorized as indifferent qualities and therefore do not affect user satisfaction. Furthermore, Table 2 also shows that 'simple', with the greatest absolute value of a significantly negative coefficient β2 (−0.759), should be provided to the minimum acceptable level to avoid user dissatisfaction, whereas 'anticipated', with the greatest absolute value of a significantly positive coefficient β2 (0.552), can be used as a means of differentiating the attribute offering from competitors. The values of the coefficients can be used to identify the relative importance of the qualities affecting user dissatisfaction or satisfaction.
Table 2. Results of the Kano classification
No.  Affective quality               β1           β2           R²     Kano classification
1    Conventional-fashionable        -0.472 *     0.494 *      0.734  O
2    Vulgar-elegant                  -0.464 *     0.494 *      0.673  O
3    Ugly-beautiful                  -0.544 *     0.459 *      0.726  O
4    Common-unique                   -0.488 *     0.401 *      0.597  O
5    Unpleasant-pleasant             -0.561 *     0.346 *      0.610  O
6    Old-young                       -0.590 *     0.288 *      0.601  O
7    Intense-relaxed                 -0.367 *     0.343 *      0.393  O
8    Dull-lively                     -0.627 *     0.197 n.s.   0.557  M
9    Not fun-fun                     -0.464 *     0.267 n.s.   0.384  M
10   Bored-fascinating               -0.478 *     0.241 n.s.   0.367  M
11   Rugged-streamlined               0.130 n.s.   0.284 n.s.   0.057  I
12   Confused-intelligible interface -0.128 n.s.  -0.083 n.s.   0.014  I
13   Difficult to use-easy to use    -0.170 n.s.   0.143 n.s.   0.024  I
14   Amazing-anticipated             -0.008 n.s.   0.552 *      0.308  A
15   Novel-familiar                  -0.099 n.s.  -0.202 n.s.   0.031  I
16   Decorative-functional           -0.215 n.s.   0.172 n.s.   0.110  I
17   Complex-simple                  -0.132 n.s.  -0.759 *      0.494  -M
18   Masculine-feminine              -0.180 n.s.   0.057 n.s.   0.046  I
19   Heavy-light                     -0.039 n.s.   0.156 n.s.   0.033  I
20   Fragile-durable                 -0.156 n.s.  -0.216 n.s.   0.045  I
*: Sig.
0, f (v) > g (v) > αα α v ∫ f (ζ ) − g (ζ ) e−α v dζ v
Based on reference [6], the Duhem model is used to describe the hysteresis of the VCM, in which α = 1, f(v) = 3.1635v, and g(v) = 0.345.
4 Design of Controller
4.1 Improved Dynamic Neural Network Adaptive Inverse Control
To meet the demands of high-speed and high-precision control of the VCM, a direct reference model adaptive inverse control for the VCM is proposed. Fig. 3 shows the structure of the adaptive inverse control, in which external feedback is added to enhance the dynamic behavior of the control system. The Jacobian information of the controlled object is the dynamic gain of the VCM, ∂y/∂u, which is used in the learning program for the adaptive inverse control.
Fig. 3. Direct reference model adaptive inverse control
The Jacobian ∂y/∂u is replaced by sgn[(y(k) − y(k−1)) / (u(k) − u(k−1))], because the gain value is indirectly adjusted by the weight values in the learning program [7].
4.2 Simulation Result for Adaptive Inverse Control
To validate the proposed control method, the improved learning technique is applied to the neural network adaptive control of the VCM. In the VCM (Duhem) model, the desired signal is a sine wave with attenuating amplitude. In the neural network adaptive inverse control, the number of hidden-layer nodes is thirty, and the learning rates are set to 0.3 and 0.8, respectively. The momentum coefficients are β = 0.1 and α = 0.7, respectively.
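A minimal sketch of the sign-based Jacobian approximation described above (hypothetical names and zero-division guard; not the authors' implementation) is:

```python
def jacobian_sign(y_k, y_km1, u_k, u_km1, eps=1e-9):
    """Approximate sgn(dy/du) from two successive samples:
    sgn[(y(k) - y(k-1)) / (u(k) - u(k-1))]."""
    du = u_k - u_km1
    if abs(du) < eps:          # assumed fallback when the input barely changes
        return 1.0
    ratio = (y_k - y_km1) / du
    return 1.0 if ratio >= 0 else -1.0
```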
Fig. 4. Response of output trace
Fig. 5. Output error
Fig. 4 and Fig. 5 show the tracking results for the output signal and the tracking error, respectively. The experimental results show that the proposed control method can effectively track the desired signal, with a mean square error (MSE) of 0.0019. In order to compare with DAFNN [5], DAFNN is applied to the VCM control with the same learning rates and momentum coefficients as the proposed method. For the DAFNN control, the input and output signal curves are shown in Fig. 6, and the tracking error curve is shown in Fig. 7.
Fig. 6. Response of output trace
Fig. 7. Output error
In the DAFNN control, the MSE is 0.0232, which is more than ten times that of the proposed control method.
5 Conclusion
In order to meet the demands of VCM control, a nonlinear dynamic neural network adaptive inverse control is presented, in which the weight values of the neural network are filtered and external feedback is added to improve the dynamic behavior of the control system. Simulation results show that the proposed method can efficiently control the VCM with high precision.
Acknowledgments. This work has been supported by the National Nature Science Foundation Project (60964001), the Science Foundation Project of Guangxi Province (0991019Z), and the Information and Communication Technology Key Laboratory Foundation Project of Guangxi Province (10902).
References
1. Chen, S.X., Li, Q., Man, Y., et al.: A High-Bandwidth Moving-Magnet Actuator for Hard Disk Drives. IEEE Transactions on Magnetics 33(5), 2632–2634 (1997)
2. XuanZe, W., Hong, Y., Bo, T.: Characteristic Test of a Voice Coil Motor Based on Line Cutting. Three Gorges University 26(4), 263–265 (2004)
3. Xiaomei, F., Dawei, Z., Xingyu, Z., Weitao, D.: Design Method of High Speed Precision Positioning System Based on Voice Coil Actuator. China Mechanical Engineering 16(16), 1414–1418 (2005)
4. Wang, Y., Su, C.-Y., Hong, H.: Model Reference Control including Adaptive Inverse Hysteresis for Systems with Unknown Input Hysteresis. In: Proceedings of the 2007 IEEE International Conference on Networking, Sensing and Control, London, UK, pp. 70–75 (2007) 5. Ming, L., Hui-ying, L., Han-sheng, Y., Cheng-wu, Y.: Nonlinear Adaptive Inverse Control Using New Dynamic Neural Networks. Journal System Simulation 19(17), 4021–4024 (2007) 6. Du, J., Feng, Y., Su, C.-Y., Hu, Y.-M.: On the Robust Control of Systems Preceded by Coleman-Hodgdon Hysteresis. In: 2009 IEEE International Conference on Control and Automation, Christchurch, New Zealand, pp. 9–11 (2009) 7. Yan-Yang, L., Shuang, C., Hong-Wei, L.: PID-like neural network nonlinear adaptive control. In: 29th Chinese Control Conference, CCC 2010, Beijing, China, pp. 2144–2148 (2010)
A New Model Reference Adaptive Control of PMSM Using Neural Network Generalized Inverse Guohai Liu, Beibei Dong, Lingling Chen, and Wenxiang Zhao School of Electrical and Information Engineering, Jiangsu University, Zhenjiang, 212013, China
[email protected],
[email protected]
Abstract. A new strategy of model reference adaptive control (MRAC) based on a neural network generalized inverse (NNGI), termed the MRAC-NNGI system, is proposed for the current and speed regulation of permanent magnet synchronous motor (PMSM) drives. Since the PMSM is a multivariable nonlinear system with strong coupling, this paper gives an analysis of generalized reversibility combined with the NN. By connecting the developed NNGI with the motor plant, the system is transformed into a pseudo-linear system, achieving decoupling and linearization; the NNGI is trained off-line with the Levenberg-Marquardt algorithm. On this basis, an adjustable-gain closed-loop adaptive controller is developed by introducing MRAC into the pseudo-linear system, and the self-adaptive law is given for the gain regulation of the linear system. Comparisons of simulation results with other widely used algorithms confirm that the proposed scheme incorporates the merits of model-free learning, high-precision tracking and strong anti-interference capability.
Keywords: Current control, speed control, neural network, generalized inverse, MRAC, PMSM.
1 Introduction
With the development of modern permanent magnet materials and microprocessor technologies, the permanent magnet synchronous motor (PMSM) has many excellent features such as small volume, low cost, simple mechanism, high flux density and high efficiency [1]. Due to these advantages, the PMSM has gained widespread acceptance in research and application fields such as modern aviation, military systems, industrial automation, intelligent robotics and the chemical industry [2]. However, it can be inferred from the dynamic model that the PMSM is a nonlinear, high-order system with inter-axis couplings, which may cause poor static and dynamic performance. Moreover, its operation is strongly affected by rotor magnetic saliency, saturation, and armature reaction effects [3]. It is therefore essential to linearize the system and decouple the current and speed control. With the advancement of power electronics and computer technology, several schemes have been reported to enhance the control performance of PMSM drives against some of the aforementioned problems. For stable control purposes, a design method for a robust speed servo system with a PMSM was presented by applying robust control principles, but there is a bottleneck in
acquiring dynamic decoupling and linearization based on the id = 0 strategy [4]. In particular, the fact that the nominal implementation faces uncertainties and perturbations in the motor parameters complicates the design of a practical high-performance control system; the controller is always model-based, under the same conditions as in [5]. Neural networks (NNs), a multidisciplinary technology, can approximate a complex nonlinear function or process with arbitrary precision using two layers trained with a back-propagation (BP) algorithm [6]. However, being used only as an identifier in many papers [7], [8], [9], the application of NNs to motor control has been restrained considerably in various fields. Furthermore, the common BP algorithm easily falls into local minima and has a relatively low convergence speed during training. To overcome these problems, another kind of BP algorithm, named the Levenberg-Marquardt algorithm, is applied here. It combines gradient descent and the Newton algorithm and has the advantage of an improved convergence rate. The neural network generalized inverse (NNGI) is a method based on the development of the NN inverse and is becoming popular for the linearization and decoupling control of general nonlinear dynamic continuous systems [10], [11]. This paper proposes a model reference adaptive control (MRAC) strategy based on the NNGI method for the current and speed regulation of PMSM drives. The static NN is responsible for approximating the nonlinear mapping described by the analytical expression of the motor, and the linear components are used to represent the dynamics of the GI. Combining the NNGI and the motor plant, the original system can be transformed into a pseudo-linear composite system, which is completely equivalent to two single-input single-output (SISO) linear subsystems. Compared with traditional linear controller designs, this paper applies the MRAC method to the pseudo-linear regulators based on Lyapunov stability theory, according to the characteristics of these linear subsystems. All of the aforementioned characteristics make the proposed scheme an ideal choice for implementation in real-time PMSM drives.
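Since the NNGI is trained off-line with the Levenberg-Marquardt algorithm, a generic sketch of one LM weight update is given below in its standard textbook form (not the authors' implementation); here J is the Jacobian of the network errors with respect to the weights and mu is the damping factor that blends gradient-descent and Gauss-Newton behavior.

```python
import numpy as np

def levenberg_marquardt_step(w, errors, jacobian, mu):
    """One Levenberg-Marquardt update: dw = -(J^T J + mu*I)^{-1} J^T e."""
    JtJ = jacobian.T @ jacobian
    g = jacobian.T @ errors
    dw = -np.linalg.solve(JtJ + mu * np.eye(JtJ.shape[0]), g)
    return w + dw

# Typical damping schedule: decrease mu after a successful step (toward
# Gauss-Newton), increase it when the error grows (toward gradient descent).
```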
2 Generalized Inverse System of PMSM By taking the rotor coordinates (d-q axes) of a motor as reference coordinates, the current model plant of a PMSM can be described in a synchronous state frame as
$\dot{x} = f(x,u) = \begin{bmatrix}\dot{x}_1\\ \dot{x}_2\\ \dot{x}_3\end{bmatrix} = \begin{bmatrix} -\dfrac{R_s}{L_d}x_1 + \dfrac{L_q}{L_d}x_2x_3 + \dfrac{1}{L_d}u_1 \\[1mm] -\dfrac{L_d}{L_q}x_1x_3 - \dfrac{R_s}{L_q}x_2 - \dfrac{\psi_f}{L_q}x_3 + \dfrac{1}{L_q}u_2 \\[1mm] \sigma x_1x_2 + \eta x_2 - \dfrac{n_p}{J}T_L \end{bmatrix}$  (1)
Where, u=[u1, u2]T =[ud, uq]T is the system state control variable, ud and uq are the stator voltages in d- and q-axis respectively; x=[x1, x2, x3]T =[id, iq, ωr]T is the system state variable, id and iq are the stator currents in d- and q-axis, ωr is the angular velocity; Ld and Lq are the stator inductances in d- and q-axis; np is the number of pole
pairs; Rs is the stator resistance; ψf is the rotor flux linkage; J is the moment of inertia; TL is the load torque; y=[y1, y2]T =[id, ωr]T=[x1, x3]T are the system state outputs; σ=3np2(Ld-Lq)/(2J), η=3np2ψf/(2J). To obtain the GI model, differentiation is applied to the outputs y until there are inputs u1 or u2 appear. Firstly, the derivative of output y1 is
y1 = x1 = id = − x1 Rs Ld + x2 x3 Lq Ld + u1 Ld
(2)
Obviously, in y1 , u1 is included. So, the matrix rank t1 is
t1 = rank (∂Y1 ∂uT ) = 1
(3)
Secondly, the derivative of output y2 is y2 = x3 = wr = σ x1 x2 + η x2 − n p J TL
y2 = x3 = wr = −σ Rs x1 x2 (1 Ld − 1 Lq ) + σ x3 ( x12 Lq Ld − x2 2 Ld Lq ) − η x2 Rs Lq − x1 x3 (σψ f + η Ld ) Lq − η x3 ψ f Lq + x2 u1 σ Ld + x1u2 σ Lq + u2 η Lq
(4)
(5)
Obviously, in y2 , u is included. Then Jacobian matrix is 0 ⎤ ⎡ ∂y ∂u ∂y1 ∂u2 ⎤ ⎡ 1 Ld =⎢ J 2 ( x, u ) = ∂Y2 ∂u T = ⎢ 1 1 ⎥ ⎥ ⎣ ∂y2 ∂u1 ∂y2 ∂u2 ⎦ ⎣σ x2 Ld (σ x1 + η ) Lq ⎦
(6)
Where, Y2 = [Y1 , y2 ]T , Y1 = y1 . Det ( J 2 ( x, u )) = (σ x1 + η ) ( Ld Lq ) ≠ 0 , the matrix rank t2 is t2 = rank (∂Y2 ∂u T ) = 2
(7)
When x1≠-η/σ, J2(x, u) is nonsingular. The system relative-order is α=(α1, α2)T=(1, 2)T, α1+α2=3=n (system order), so the GI of system is existent. The system original-order is ne= ne1+ne2=1+2=3=n=α. Therefore, the GI system is described as u = φˆ[( y2 , y2 , y1 ), vˆ]
(8)
Where, a10, a11, a20, a21, a22 are the coefficients of pole-zero assignment of GI system; vˆ = [vˆ1 , vˆ2 ]T are inputs of GI system, vˆ1 = a11 y1 + a10 y1 , vˆ2 = a22 y2 + a21 y2 + a20 y2 .
3 Design of MRAC-NNGI Controller 3.1 Design of NNGI
From what analyzed in 2.2, the pseudo-linear composite system is shown in Fig.1. The transfer function can be represented as
G ( s ) = diag (G11 , G22 ) = diag (1 (a11 s + a10 ) ,1 (a22 s 2 + a21 s + a20 )) Where, a10=1, a11=0.05, a20=1, a21=0.4, a22=0.04.
(9)
Fig. 1. Diagram of pseudo-linear system
In Fig. 1, the pseudo-linear system is completely equivalent to two SISO linear subsystems. The poles can be placed in the left-half complex plane for an ideal open-loop frequency characteristic. One of the SISO subsystems is a first-order current unit and the other is a second-order speed unit. The second-order unit can be described as $G_{22} = 1/(T^2s^2 + 2\varsigma Ts + 1) = 1/(a_{22}s^2 + a_{21}s + 1)$. In order to obtain excellent performance, the damping coefficient of the subsystem is selected as ς = 0.707, termed optimal damping, which gives less than 4.3% overshoot and the shortest regulation time. When $a_{21} = 0.4$, $a_{22}$ should then be 0.08; however, $a_{22} = 0.04$ gives the lower overshoot. In this scheme, the Levenberg-Marquardt algorithm is applied to approximate the nonlinear mapping $\hat{\phi}(\cdot)$. The NNGI system, which captures the dynamic behavior of the system, requires fewer training samples, converges quickly and has a good ability to reach a globally optimal solution. Its controller design is simplified owing to the linearized model [9], [10].
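A quick way to check this damping choice (an illustrative sketch, not taken from the paper) is to simulate the step response of the second-order reference model $G_{22}(s) = 1/(a_{22}s^2 + a_{21}s + a_{20})$ for the two candidate values of $a_{22}$:

```python
def g22_step_response(a22, a21=0.4, a20=1.0, dt=1e-3, t_end=3.0):
    """Euler simulation of a22*y'' + a21*y' + a20*y = u for a unit step input u."""
    y, y_dot, ys = 0.0, 0.0, []
    for _ in range(int(t_end / dt)):
        y_acc = (1.0 - a21 * y_dot - a20 * y) / a22
        y_dot += dt * y_acc
        y += dt * y_dot
        ys.append(y)
    return ys

# Peak overshoot above the unit set-point for the two candidate coefficients.
overshoot = {a22: max(g22_step_response(a22)) - 1.0 for a22 in (0.04, 0.08)}
```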
3.2 Design of MRAC Controller
The MRAC method offers a simpler implementation and requires less computational effort, so it is a popular strategy for motor control [12]. It makes it possible to design a completely decoupled control structure in the synchronous frame in the presence of unknown disturbances, based on Lyapunov stability theory [13]. Two adjustable-gain closed-loop adaptive controllers are formed according to the characteristics of the aforementioned linear subsystems. The control diagram is shown in Fig. 2. Gm1(s) and Gm2(s) are the expected reference models, Gm1(s) = G11(s) and Gm2(s) = G22(s); C1 and C2 are the current and speed controllers, respectively; D(s) represents the internal or external disturbances of the PMSM; e1 and e2 are the errors used by the adaptive control laws; Kp1 and Kp2 are the original gains of the two linear subsystems; Kc1 and Kc2 are the adaptive gains. The desired system performance is given by the reference models with reference inputs [id*, ωr*]T and reference outputs [ym1, ym2]T. In other words, the pseudo-linear composite system is completely equivalent to two SISO linear subsystems and can be controlled separately. The two adaptive control laws are derived from Lyapunov stability theory, and they force the system to follow the reference-model performance when the controlled variable deviates from the response of the reference model.
Fig. 2. Control diagram of the MRAC-NNGI proposed in paper
3.2.1 Adaptive Law of Current Subsystem
In applying the MRAC method to the first-order current subsystem, the reference model is connected in parallel with the pseudo-linear subsystem. So the closed-loop transfer function is
$\Phi_1(s) = G_{m1}(s) - G_{11}(s) = K_1/(a_{11}s + a_{10}) = K_1/(0.5s + 1)$,  (10)
where $K_1 = K_{m1} - K_{c1}K_{p1}$. The time-varying model is $a_{11}\dot{e}_1 + e_1 = K_1 i_d$, so $\dot{e}_1 = (-e_1 + K_1 i_d)/a_{11}$. The Lyapunov function with coefficient λ1 can be described as
$V_1(e_1) = e_1^2 + \lambda_1 K_1^2,\quad \lambda_1 > 0$.  (11)
According to Lyapunov stability theory, the following conditions can be derived:
$\lim_{t\to\infty} e_1(t) = 0$, $V_1(e_1)$ positive definite, $\dot{V}_1(e_1)$ negative definite.  (12)
For the above reasons, the proposed adaptive law of the current regulation is
$K_{c1} = \dfrac{e_1 i_d^*}{\lambda_1 a_{11} K_{p1}} = \dfrac{e_1 i_d^*}{0.5\lambda_1 K_{p1}}$.  (13)
3.2.2 Adaptive Law of Speed Subsystem
For the second-order speed subsystem, the closed-loop transfer function is
$\Phi_2(s) = G_{m2}(s) - G_{22}(s) = K_2/(a_{22}s^2 + a_{21}s + a_{20}) = K_2/(0.04s^2 + 0.4s + 1)$,  (14)
where $K_2 = K_{m2} - K_{c2}K_{p2}$. The Lyapunov function with coefficient λ2 can be described as
$V_2(X) = X^TPX + \lambda_2 K_2^2,\quad \lambda_2 > 0$.  (15)
In order to obtain the symmetric matrix P, the Lyapunov method is applied with the coefficient matrices A and B defined as
$A = \begin{bmatrix}0 & 1\\ -1/a_{22} & -a_{21}/a_{22}\end{bmatrix},\qquad B = \begin{bmatrix}0\\ 1\end{bmatrix}$,
where P can be calculated from the equation $PA + A^TP = -I$ (I is the identity matrix). So the proposed adaptive law of the speed regulation is
$K_{c2} = \dfrac{\omega_r^*}{\lambda_2 K_{p2}}X^TPB = \dfrac{\omega_r^*}{\lambda_2 K_{p2}}\begin{bmatrix}e_2 & \dot{e}_2\end{bmatrix}\begin{bmatrix}1.5 & 0.02\\ 0.02 & 0.052\end{bmatrix}\begin{bmatrix}0\\ 1\end{bmatrix} = \dfrac{\omega_r^*}{\lambda_2 K_{p2}}(0.02e_2 + 0.052\dot{e}_2)$.  (16)
It is clear from the above discussion that both adaptive laws of the MRAC controller design are much simpler and require less effort to synthesize; the selection of the reference models and their stability play very important roles.
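A compact sketch of the two adaptive gain computations, written directly from Eqs. (13) and (16), is given below. The default parameter values are those quoted in the simulation section (λ1 = 0.5, λ2 = 1, Kp1 = Kp2 = 1), a11 = 0.5 follows the 0.5λ1Kp1 form of (13), and all names are assumptions made for illustration.

```python
def adaptive_gains(e1, id_ref, e2, e2_dot, wr_ref,
                   lam1=0.5, lam2=1.0, kp1=1.0, kp2=1.0, a11=0.5):
    """Adaptive gains of the MRAC-NNGI controller, per Eqs. (13) and (16)."""
    kc1 = e1 * id_ref / (lam1 * a11 * kp1)                       # current loop, Eq. (13)
    kc2 = wr_ref / (lam2 * kp2) * (0.02 * e2 + 0.052 * e2_dot)   # speed loop, Eq. (16)
    return kc1, kc2
```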
4 Verifications
To evaluate the performance of the proposed control scheme, simulation results compared with other methods are shown as follows. In the simulation, λ1 = 0.5, λ2 = 1, Kp1 = Kp2 = 1, so Km1 = Km2 = 1; the initial values of the adjustable gains are Kc1 = Kc2 = 1; J = 0.0021 kg·m², Ld = 16.45 mH, Lq = 16.60 mH, np = 3. Fig. 3 shows the square-wave response of the rotor speed when the current in the d-axis (id) is set to 5 A. The maximum absolute value of the speed is set to 80 rad/s and the minimum is 20 rad/s. It can be seen from (b) and (c) that the actual speed response has a smaller tracking error and lower overshoot compared with the other methods, and (a) and (b) show the excellent decoupling performance. Fig. 4 shows the same good current-tracking performance compared with Fig. 6 and Fig. 9 when the speed is stepped to 80 rad/s; the maximum value is set to 6 A and the minimum to 0 A. It can be inferred from Fig. 5 that the anti-interference capability of the proposed algorithm is obviously stronger than those of the other schemes. In this simulation, the load torque is stepped from 3 N to 1 N and then back to 3 N, i.e., about 30% to 10% to 30% of the rated load. The speed response is almost completely unaffected by the interference. Under the same operating conditions, the current interference response is less than 3.7%, while the others are over 12%. (The current interference responses of Figs. 6, 7, 8, 9 and 10 are 50%, 18%, 12%, 18% and 25%, respectively.)
Fig. 3. Decoupling and tracking response of MRAC-NNGI (wr steps). (a) Given id* & Response id. (b) Given wr* & Response wr. (c) Reference response ym2.
Fig. 4. Decoupling and tracking response of MRAC-NNGI (id steps). (a) Given id* & Response id. (b) Reference response ym1. (c) Given wr* & Response wr.
Fig. 5. Anti-interference response of MRAC-NNGI
Fig. 6. Response of IMC-NNGI
Fig. 7. Response of PID-NNGI
Fig. 8. Response of PID-NNI
4
(b) Time (s)
6
8
Fig. 9. Response of PID-NN(RBF)GI
10
0 0
Fig. 10. Response of PID
Where, in Figs.5-10, (a) represents given id* & response id, (b) represents given wr* & response wr. Fig.3, 4 and 5 are the current and speed responses in MRAC-NNGI method; Figs.6,7,8,9,10 are the responses in the methods of Internal Model Control GGNI (IMC-NNGI), PID-NNGI, PID-NNI (NN inverse), PID-NNGI with RBF algorithm and PID respectively.
5 Conclusion This paper has proposed an MRAC-NNGI scheme of a PMSM which includes a current regulation and a speed regulation. From the analysis and simulation results, we can conclude that the NNGI with Levenberg-Marquardt algorithm is simple and suitable for implementation. In addition, we can properly place the poles of the pseudo-linear composite system so as to linearize, decouple and reduce the order of the original
66
G. Liu et al.
system. In other words, the original nonlinear system has been transformed into a pseudo-linear composite system, which includes two SISO pseudo-linear subsystems. Last but not least, based on the aforesaid hypothesis of linearization, it considerably simplifies the design of an additional closed-loop linear controller. Therefore, with simpler design, less computational effort and advance stability, MRAC method is introduced into the algorithm. Simulation results have shown that the implementation is independent of the accurate mathematical model of original system. Also, the proposed control strategy provides an effective and achievable approach to realizing the linearization and decoupling control of nonlinear MIMO systems. Further study will be shown in the next paper. Acknowledgments. This work was supported in part by grants (Project No. 51077066, 608074014 and 50907031) from the National Natural Science Foundation of China and a grant (Project No. BK2010327) from the Natural Science Foundation of Jiangsu Province.
References 1. Mohamed, Y.A.-R.I., El-Saadany, E.F.: A Current Control Scheme with an Adaptive Internal Model for Torque Ripple Minimization and Robust Current Regulation in PMSM Drive Systems. IEEE Transactions on Energy Conversion 23(1), 92–100 (2009) 2. Zhao, W., Chau, K., Cheng, M., Hi, J., Zhu, X.: Remedial brushless AC operation of faulttolerant doubly-salient permanent-magnet motor drives. IEEE Transactions on Industrial Electronics 57(6), 2134–2141 (2010) 3. Rahman, M.A., Zhou, P.: Field Circuit Analysis of Brushless Permanent Magnet Synchronous Motors. IEEE Transactions on Industrial Electronics 43(2), 256–267 (1996) 4. Li, S.H., Liu, Z.: Adaptive Speed Control for Permanent-Magnet Synchronous Motor System with Variations of Load Inertia. IEEE Transactions on Industrial Electronics 56(8), 3050–3059 (2009) 5. Mohamed, Y.A.-R.I.: Design and Implementation of a Robust Current-Control Scheme for a PMSM Vector Drive with a Simple Adaptive Disturbance Observer. IEEE Transactions on Industrial Electronics 54(4), 1981–1988 (2007) 6. Bose, B.K.: Neural Network Applications in Power Electronics and Motor Drives-An Introduction and Perspective. IEEE Transactions on Industrial Electronics 54(1), 14–33 (2007) 7. Gadoue, S.M., Giaouris, D., Finch, J.W.: A Neural Network Based Stator Current MRAS Observer for Speed Sensorless Induction Motor Drives. In: IEEE International Symposium on Industrial Electronics (ISIE), Cambridge, pp. 650–655 (June 2008) 8. Yalcin, B., Ohnishi, K.: Infinite-Mode Neural Networks for Motion Control. IEEE Transactions on Industrial Electronics 56(8), 2933–2944 (2009) 9. Gadoue, S.M., Giaouris, D., Finch, J.W.: Sensorless Control of Induction Motor Drives at Very Low and Zero Speeds Using Neural Network Flux Observers. IEEE Transactions on Industrial Electronics 56(8), 3029–3039 (2009) 10. Dai, X., He, D., Zhang, T., Zhang, K.: ANN Generalized Inversion for the Linearization and Decoupling Control of Nonlinear Systems. IEEE Proceedings: Control Theory and Applications 150(3), 267–277 (2003)
A New Model Reference Adaptive Control of PMSM Using NNGI
67
11. Liu, G.H., Liu, P.Y., Shen, Y., Wang, F.L., Kang, M.: Neural Network Generalized Inverse Decoupling Control of Two-motor Variable Frequency Speed-regulating System. J. Proceeding of the CSEE 28(36), 98–102 (2008) 12. Jin, H., Lee, J.: An RMRAC current regulator for permanent-magnet synchronous motor based on statistical model interpretation. IEEE Transactions on Industrial Electronics 56(1), 169–177 (2009) 13. Rashed, M., Stronach, A.F.: A Stable Back-EMF MRAS-based Sensorless Low Speed Induction Motor Drive Insensitive to Stator Resistance Variation. IEEE Proceedings Electric Power Applications 151, 685–693 (2004)
RBF Neural Network Application in Internal Model Control of Permanent Magnet Synchronous Motor Guohai Liu, Lingling Chen, Beibei Dong, and Wenxiang Zhao 1
School of Electrical and Information Engineering, Jiangsu University, 212013, Zhenjiang, China
[email protected],
[email protected] Abstract. As a significant part of artificial intelligence (AI) techniques, neural network is recently having a great impact on the control of motor. Particularly, it has created a new perspective of decoupling and linearization. With reference to the non-linearization and strong coupling of multivariable permanent magnet synchronous motor (PMSM), this paper presents internal model control (IMC) of PMSM using RBF neural network inverse (RBF-NNI) system. In the proposed control scheme, the RBF-NNI system is introduced to construct a pseudo-linear system with original system, and internal model controller is utilized as a robust controller. Therefore, the new system has advantages of above two methods. The efficiency of the proposed control scheme is evaluated through computer simulation results. By using the proposed control scheme, original system is successfully decoupled, and expresses strong robustness to load torque disturbance, the whole system provides good static and dynamic performance. Index Terms: RBF neural network (RBF-NN); internal model control (IMC); permanent magnet synchronous motor (PMSM); decoupling control.
1 Introduction
With the progress of power electronics technology, permanent magnet synchronous motor (PMSM) drives have overcome their disadvantages in wide-speed operation and have been applied in many industrial areas [1],[2]. Thus, PMSM drives are likely to be strong competitors of induction motors for a wide range of future applications in advanced motion control and drive systems [3]. Despite the many advantageous features of a PMSM drive, precise control of the PMSM speed regulation system remains a great challenge. The PMSM is a multivariable, strongly coupled nonlinear system, so the key to its control is decoupling and linearization. It is well known that vector control, differential geometry and the inverse system method are all commonly used for decoupling. However, differential geometry and the inverse system method both require an accurate mathematical model of the controlled object. Furthermore, the physical concepts of differential geometry are not clearly expressed and are hard to master, and although the inverse system theory is simple to analyze, its disturbance rejection and robustness often cannot meet the requirements [4]. Hence, these two methods are difficult to apply in practice.
As neural network technology advances rapidly, its applications in different areas are expanding [5]. By introducing the neural network method into the inverse system, a method called the neural network inverse (NNI) system was proposed; it incorporates the advantages of the above two methods. So far, most of the reported works have utilized the Back Propagation (BP) neural network to build the inverse system [6],[7]. The disadvantages of the BP neural network (BP-NN) are well known: slow convergence speed, existence of local minima and poor real-time performance [8]. Compared with the BP-NN, the Radial Basis Function (RBF) neural network can greatly accelerate the learning speed and successfully avoid local minima. For the purpose of improving the accuracy, robustness and self-adaptability of the original system, this paper proposes an RBF neural network (RBF-NN) to build the inverse system. Usually, high-performance motor control requires a fast and accurate speed response, quick recovery of speed from load torque disturbances, and insensitivity of the speed response to parameter variations. However, the pseudo-linear system which combines the RBF-NNI system with the original system is not a simple, ideal linear system, and a traditional PI controller often cannot meet the requirements of accurate control [9],[10]. In the present work, internal model control (IMC) theory is utilized to design an additional controller for the pseudo-linear system; it is superior to a common controller in that it can meet the requirements of both control performance and robustness. The performance of the proposed IMC-based RBF-NNI system of the PMSM is investigated by simulations at different operating conditions.
2 PMSM Model and Its Inverse
The mathematical model of a PMSM in the d-q synchronously rotating reference frame is given by the following third-order nonlinear model.
$$\begin{cases}\dfrac{di_{sd}}{dt} = \dfrac{u_{sd}}{L_d} - \dfrac{R_s}{L_d} i_{sd} + \dfrac{L_q}{L_d} i_{sq} w_1 \\[2mm] \dfrac{di_{sq}}{dt} = \dfrac{u_{sq}}{L_q} - \dfrac{R_s}{L_q} i_{sq} - \dfrac{L_d}{L_q} i_{sd} w_1 - \dfrac{\psi_f}{L_q} w_1 \\[2mm] \dfrac{dw_1}{dt} = \dfrac{3 n_p^2 \psi_f}{2J} i_{sq} + \dfrac{3 n_p^2 (L_d - L_q)}{2J} i_{sd} i_{sq} - \dfrac{n_p}{J} T_L\end{cases} \quad (1)$$
where isd and isq are the stator currents of the d-axis and q-axis, respectively; w1 is the rotor electrical angular velocity; usd and usq are the voltages of the d-axis and q-axis, respectively; Ld and Lq are the inductances of the d-axis and q-axis, respectively (note that Ld and Lq are equal in this paper); Rs is the stator resistance; Ψf is the permanent magnet flux linkage of the rotor; np is the number of pole pairs; J is the moment of inertia; TL is the load torque. Choose w1 and isd to be the outputs of the system, then y = [y1, y2]^T = [isd, ω1]^T; choose usd and usq to be the control variables, then u = [u1, u2]^T = [usd, usq]^T; choose isd, isq and w1 to be the state variables, then x = [x1, x2, x3]^T = [isd, isq, ω1]^T. Note that isd influences the flux linkage, so the key of the PMSM decoupling control is the decoupling of isd and ω1. Consequently, equation (1) can be given as:
$$\dot{x} = f(x,u) = \begin{bmatrix}\dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3\end{bmatrix} = \begin{bmatrix}\dfrac{u_1}{L_d} - \dfrac{R_s}{L_d} x_1 + x_2 x_3 \\[2mm] \dfrac{u_2}{L_q} - \dfrac{R_s}{L_q} x_2 - x_1 x_3 - \dfrac{\psi_f}{L_q} x_3 \\[2mm] \dfrac{3 n_p^2 \psi_f}{2J} x_2 - \dfrac{n_p}{J} T_L\end{bmatrix} \quad (2)$$
According to the inverse system theory, differentiate the outputs of the system:
$$\dot{y}_1 = \frac{u_1}{L_d} - \frac{R_s}{L_d} x_1 + x_2 x_3 \quad (3)$$
$$\dot{y}_2 = \frac{3 n_p^2 \psi_f}{2J} x_2 - \frac{n_p}{J} T_L \quad (4)$$
$$\ddot{y}_2 = \frac{3 n_p^2 \psi_f}{2J}\left(\frac{u_2}{L_q} - \frac{R_s}{L_q} x_2 - x_1 x_3 - \frac{\psi_f}{L_q} x_3\right) \quad (5)$$
Then the Jacobian matrix is:
$$A(x,u) = \begin{bmatrix}\dfrac{\partial \dot{y}_1}{\partial u_1} & \dfrac{\partial \dot{y}_1}{\partial u_2} \\[2mm] \dfrac{\partial \ddot{y}_2}{\partial u_1} & \dfrac{\partial \ddot{y}_2}{\partial u_2}\end{bmatrix} = \begin{bmatrix}\dfrac{1}{L_1} & 0 \\[2mm] 0 & \dfrac{3 n_p^2 \psi_f}{2 J L_1}\end{bmatrix} \quad (6)$$
$$\det\big(A(x,u)\big) = \frac{3 n_p^2 \psi_f}{2 J L_1^2} \quad (7)$$
When the rotor flux linkage Ψf ≠ 0, A(x, u) is nonsingular. The relative order of the system is α = (α1, α2) = (1, 2); obviously, α1 + α2 = 1 + 2 = 3 = n (the order of the system), so the inverse of the original system is
$$u = \phi(y_1, \dot{y}_1, y_2, \dot{y}_2, \ddot{y}_2) \quad (8)$$
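To make the model above concrete, the short sketch below evaluates the right-hand side of (2) numerically. The motor parameter values are illustrative assumptions chosen only for this example and are not taken from the paper.

```python
import numpy as np

# Illustrative PMSM parameters (assumed values, not from the paper)
Rs   = 2.875      # stator resistance [ohm]
L1   = 8.5e-3     # d/q-axis inductance, Ld = Lq = L1 [H]
psif = 0.175      # rotor flux linkage [Wb]
n_p  = 4          # number of pole pairs
J    = 8e-4       # moment of inertia [kg*m^2]

def pmsm_f(x, u, TL):
    """State derivative of the PMSM model (2): x = [isd, isq, w1], u = [usd, usq]."""
    x1, x2, x3 = x
    u1, u2 = u
    dx1 = u1 / L1 - Rs / L1 * x1 + x2 * x3
    dx2 = u2 / L1 - Rs / L1 * x2 - x1 * x3 - psif / L1 * x3
    dx3 = 3 * n_p**2 * psif / (2 * J) * x2 - n_p / J * TL
    return np.array([dx1, dx2, dx3])

# One explicit Euler step as a usage example
x = np.array([0.0, 0.0, 0.0])
x = x + 1e-4 * pmsm_f(x, u=[0.0, 10.0], TL=0.0)
print(x)
```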
3 RBF-NNI System Application in Internal Model Control 3.1 RBF Neural Network
As is well known, the RBF-NN is a kind of feed-forward neural network which performs function mapping in localized receptive fields. It has three layers: an input layer, a hidden layer and an output layer, as shown in Fig. 1. A radial basis function is used as the activation function of the hidden-layer neurons. The main advantages of radial basis
Fig. 1. The structure of the RBF-NN
function are its simple form, its very smooth function curve, and its radial symmetry. The Gaussian function is often selected as the radial basis function. It can approximate any continuous function and, in particular, it does not depend on a model of the system. The output of the j-th hidden-layer neuron is:
$$h_j = \exp\left(-\frac{\|X - C_j\|^2}{2 b_j^2}\right), \quad j = 1, 2, \ldots, m \quad (9)$$
where $\|\cdot\|$ is the Euclidean norm, X is the input vector, $C_j$ is the center vector of the j-th hidden-layer neuron, and $b_j$ is the width of the j-th hidden-layer neuron. The output of the RBF-NN is:
$$y_m(k) = \sum_{j=1}^{m} w_j h_j \quad (10)$$
where $w_j$ is the weight between the j-th hidden neuron and the output layer, and m is the number of hidden-layer neurons. The performance function which is approximated by the RBF-NN is given as:
$$E(k) = \frac{1}{2}\big(y(k) - y_m(k)\big)^2 \quad (11)$$
where y(k) is the output of the controlled object. The sensitivity of the output to the input, referred to as the Jacobian information, is given as:
$$\frac{\partial y(k)}{\partial u(k)} \approx \frac{\partial y_m(k)}{\partial u(k)} = \frac{\partial y_m(k)}{\partial x_1} = \sum_{j=1}^{m} w_j h_j \frac{c_{j1} - x_1}{b_j^2} \quad (12)$$
In the above formulation, u(k) is regarded as the first input of RBF-NN, so u(k)=x1.
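A minimal sketch of the Gaussian RBF network of (9)-(10) together with the Jacobian information of (12). The layer sizes, centers and widths are assumptions made purely for illustration.

```python
import numpy as np

class RBFNet:
    """Gaussian RBF network: hidden outputs h_j (9), output y_m (10)."""
    def __init__(self, n_in, n_hidden, rng=np.random.default_rng(0)):
        self.c = rng.uniform(-1.0, 1.0, size=(n_hidden, n_in))  # centers C_j
        self.b = np.ones(n_hidden)                               # widths b_j
        self.w = np.zeros(n_hidden)                              # output weights w_j

    def hidden(self, x):
        # h_j = exp(-||X - C_j||^2 / (2 b_j^2))                  (9)
        return np.exp(-np.sum((x - self.c) ** 2, axis=1) / (2.0 * self.b ** 2))

    def output(self, x):
        # y_m = sum_j w_j h_j                                    (10)
        return self.w @ self.hidden(x)

    def jacobian(self, x):
        # dy_m/du ~ sum_j w_j h_j (c_j1 - x_1) / b_j^2, u = x_1  (12)
        h = self.hidden(x)
        return np.sum(self.w * h * (self.c[:, 0] - x[0]) / self.b ** 2)

net = RBFNet(n_in=2, n_hidden=6)
x = np.array([0.3, -0.1])   # x[0] plays the role of the control input u(k)
print(net.output(x), net.jacobian(x))
```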
3.2 RBF Neural Network Inverse System
In the preceding section, the existence of the inverse system has been proved. Due to the nonlinearity and strong coupling of the multivariable motor system, it is still hard to obtain an analytical inverse. Neural networks offer a good solution to this problem. By introducing the neural network method into inverse system theory, an RBF-NNI system is built to construct a pseudo-linear system with the original system, as shown in Fig. 2.
Fig. 2. Construction of the pseudo-linear system
3.3 Design of the Internal Model Controllers
In practical implementation, due to parametric perturbations, unpredictable disturbances and un-modeled dynamics of the pseudo-linear system, the control effect will deviate from the expected target. IMC offers strong robustness to disturbances. In this paper, IMC theory is adopted to design the outer controllers. Compared with a common PID controller, IMC is more stable and can often meet the requirements of both control and robust performance. The structure of the IMC is shown in Fig. 3.
Fig. 3. Block diagram of IMC
In the above figure, G(s) is the pseudo-linear system, Gm1(s) and Gm2(s) are the internal models of the system, d1 and d2 are the disturbance signals. Gc1(s) and Gc2(s) are the internal model controllers. F1(s) and F2(s) are the filters. According to inverse system theory, the internal models of the system are given by:
$$G_{m1}(s) = \frac{1}{s}, \qquad G_{m2}(s) = \frac{1}{s^2} \quad (13)$$
In order to obtain good static and dynamic performance, based on the simulations, the filters used with $G_{m1}(s)$ and $G_{m2}(s)$ are designed as:
$$F_1(s) = \frac{1}{2s+1}, \qquad F_2(s) = \frac{1}{(3s+1)^2} \quad (14)$$
Then the corresponding internal model controllers are, respectively:
$$G_{c1}(s) = \frac{s}{2s+1}, \qquad G_{c2}(s) = \frac{s^2}{(3s+1)^2} \quad (15)$$
4 Simulation Results
In order to verify the effectiveness of the proposed IMC-based RBF-NNI system of the PMSM, a computer simulation model is developed in Matlab/Simulink according to Fig. 3. After applying appropriate excitation signals to the original system, the responses of the d-axis current and rotational speed are obtained. The sampled sets {id, id(1), wr, wr(1), wr(2)} and {usq, usd} are then normalized to train the RBF-NNI. Finally, by introducing the designed internal model controllers into the pseudo-linear system, the robust control of the whole system is completed. The performance of the proposed IMC-based RBF-NNI system of the PMSM has been investigated through extensive simulations at different dynamic operating conditions. Results are presented below. Fig. 4 shows the simulated response of the rotational speed for a step input of 100 rad/s. It is shown that the speed can follow the reference speed without any overshoot/undershoot and with zero steady-state error. A further simulated result for step changes in the reference speed is shown in Fig. 5. It shows that the proposed scheme is capable of handling step changes in the speed commands.
Fig. 4. Response of rotational speed for a step input of 100 rad/s
Fig. 5. Response of rotational speed for a step change
Fig. 6 shows the simulated responses of both the rotational speed and the d-axis current at step inputs, for both the PID controller system and the proposed IMC-based RBF-NNI system. It can be seen from Fig. 6 that with the proposed scheme the rotational speed and the d-axis current are almost completely decoupled, and both responses have good dynamic performance, whereas with the PID controller there is still coupling between the rotational speed and the d-axis current, and both responses show overshoots and oscillations, especially when the references are changed.
Fig. 6. Responses of both isd and w1 for step inputs: (a) PID control scheme; (b) proposed control scheme
Fig. 7. Speed response for sudden change of load
The robustness of the internal model controller is shown by Fig. 7. It is evident from Fig. 7 that the PMSM with the proposed scheme is almost insensitive to load torque disturbance. These results indicate that the proposed scheme which combines the IMC with RBF-NNI system is robust and suitable for the decoupling control of the PMSM drive.
5 Conclusion
The application of the RBF-NNI system in IMC of the PMSM has been presented in this paper. The proposed control scheme has been verified to realize accurate tracking of the rotational speed. A performance comparison of the proposed control scheme with the common control method has also been presented. The results show that the proposed control scheme not only decouples the nonlinear system successfully, but also has strong robustness to load torque disturbance and un-modeled dynamics. Moreover, the whole system provides good static and dynamic performance. Consequently, it provides a new way to achieve high-performance control of the PMSM. Acknowledgments. This work was supported in part by grants (Project No. 60874014, 50907031 and 51077066) from the National Natural Science Foundation of China, and a grant (Project No. BK2010327) from the Natural Science Foundation of Jiangsu Province.
References 1. Morel, F., Retif, J.M., Lin-Shi, X.F., Valentin, C.: Permanent Magnet Synchronous Machine Hybrid Torque Control. IEEE Transactions on Industrial Electronics 55(2), 501–511 (2008) 2. Zhao, W.X., Chau, K.T., Cheng, M., Hi, J., Zhu, X.: Remedial Brushless AC Operation of Fault-tolerant Doubly-salient Permanent-magnet Motor Drives. IEEE Transactions on Industrial Electronics 57(6), 2134–2141 (2010) 3. Zhu, Z.Q., Howe, D.: Electrical Machines and Drives for Electric, Hybrid, and Fuel Cell Vehicles. Proceedings of the IEEE 95(4), 746–765 (2007) 4. Su, W.T., Liaw, C.M.: Adaptive Positioning Control for a LPMSM Drive Based on Adapted Inverse Model and Robust Disturbance Observer. IEEE Transactions on Power Electronic 21(2), 505–517 (2006) 5. Bose, B.K.: Neural Network Applications in Power Electronics and Motor Drives- An Introduction and Perspective. IEEE Transactions on Industrial Electronics 54(1), 14–33 (2007) 6. Pham, D.T., Yildirim, S.: Design of a Neural Internal Model Control System for a Robot. Robotica 5(5), 505–512 (2000) 7. Li, Q.R., Wang, P.F., Wang, L.Z.: Nonlinear Inverse System Self-learning Control Based on Variable Step Size BP Neural Network. In: International Conference on Electronic Computer Technology (February 2009) 8. Zhang, Y.N., Li, Z., Chen, K., Cai, B.H.: Common Nature of Learning Exemplified by BP and Hopfield Neural Networks for Solving Online a System of Linear Equations. In: Proceedings of 2008 IEEE International Conference on Networking, Sensing and Control, vol. 1, pp. 832–836 (2008)
9. Uddin, M.N., Rahman, M.A.: High-Speed Control of IPMSM Drives Using Improved Fuzzy Logic Algorithms. IEEE Transactions on Industrial Electronics 54(1), 190–199 (2007) 10. Rubaai, A., Castro-Sitiriche, M.J., Ofoli, A.R.: DSP-based Laboratory Implementation of Hybrid Fuzzy-PID Controller Using Genetic Optimization For High-performance Motor Drives. IEEE Transactions on Industry Applications 44(6), 1977–1986 (2008)
Transport Control of Underactuated Cranes Dianwei Qian , Boya Zhang, and Xiangjie Liu School of Control and Computer Engineering, North China Electric Power University, Beijing, 102206, P.R. China {dianwei.qian,liuxj}@ncepu.edu.cn
Abstract. Overhead cranes are important equipment used in many industries. They belong to the class of underactuated mechanical systems. Since it is hard to obtain an accurate model for control design, this paper presents a design scheme for the transport control problem of overhead cranes with uncertainties. In this scheme, a variable structure control law based on sliding mode is designed for the nominal model, and neural networks are utilized to learn the upper bound of the system uncertainties. In the sense of the Lyapunov theorem, the update formulas of the network weights are deduced to approximate the system uncertainties. From the design process and comparisons, it can be seen that: 1) the neural approximator is able to compensate the system uncertainties, 2) the control system possesses asymptotic stability, and 3) better performance can be achieved. Keywords: Variable structure control, Neural networks, Underactuated systems, Approximator, Crane.
1
Introduction
Overhead cranes are important equipment used in many industries, but their performance may be constrained because loads undergo a pendulum-type motion, which is harmful to industrial safety. It is desired that overhead cranes be able to transport loads to the required position as fast and as accurately as possible without free swings [1]. Concerning the control problem of overhead cranes, a variety of control approaches have been proposed in the last two decades, e.g. fuzzy control [1], adaptive coupling control [2], passivity-based control [3], wave-based control [4], adaptive fuzzy control [5], 3-dimensional path planning [6], etc. But most of the referred methods focus only on the accurate model of crane systems to formulate the control input of cranes; their performance may deteriorate when there exist uncertainties in crane systems. With the development of nonlinear control theory, applications of sliding mode control (SMC) have received more attention. SMC [7], belonging to variable structure control, is able to respond quickly and is insensitive to system parameters and external disturbances. It is a good
This work was supported by the NSFC Projects under grants No. 60904008, 60974051, the Fundamental Research Funds for the Central Universities under grant No. 09MG19.
choice to deal with the control problem of overhead crane systems. Some control methods on the basis of SMC have been proposed for overhead crane systems, e.g. coupling SMC [8], second-order SMC [9], high-order SMC [10], adaptive SMC [11], hierarchical SMC [14], etc. Although the robust SMC is invariant to matched disturbances [7], a known bound of the uncertainties is needed to establish the system stability, which may be too strict a requirement in practical applications. Because neural networks possess the ability to identify a broad category of complex nonlinear systems, this paper addresses a neural-network-based SMC method to deal with the control problem of overhead crane systems with unknown bounded uncertainties. The update formulas of the network weights are deduced from the Lyapunov direct method to ensure the asymptotic stability of the control system. Simulation results demonstrate the applications of the presented method.
2
System Model
Fig. 1 shows the coordinate system of an overhead crane system and its load. This system consists of the trolley subsystem and the load subsystem. The former is the trolley driven by a force f; the latter is the load suspended from the trolley by a rope. For simplicity, we assume that the load can be regarded as a material particle, that the rope is inflexible with negligible mass, that there is no friction, etc. In Fig. 1, the symbols are the trolley mass M, the load mass m, the rope length L, the swing angle θ of the load with respect to the vertical line, the trolley position x with respect to the origin, and the force f applied to the trolley. The state space expression of the crane system with uncertain dynamics [3] is written as
Fig. 1. Structure of an overhead crane system
$$\begin{cases}\dot{x}_1 = x_2 \\ \dot{x}_2 = f_1(x) + b_1(x)\,u + d_1(t) \\ \dot{x}_3 = x_4 \\ \dot{x}_4 = f_2(x) + b_2(x)\,u + d_2(t)\end{cases} \quad (1)$$
Here g is the gravitational acceleration; x = [x1, x2, x3, x4]^T; x1 = x; x3 = θ; x2 is the trolley velocity; x4 is the angular velocity of the load; u = f is the control input; d1(t) and d2(t) are bounded lumped disturbances, involving the parameter variations and external disturbances, but both of them have unknown bounds; fi and bi (i = 1, 2) can be derived as
$$f_1 = \frac{m L x_4^2 \sin x_3 + m g \sin x_3 \cos x_3}{M + m\sin^2 x_3}, \qquad b_1 = \frac{1}{M + m\sin^2 x_3}$$
$$f_2 = -\frac{(m + M)\, g \sin x_3 + m L x_4^2 \sin x_3 \cos x_3}{(M + m\sin^2 x_3)\, L}, \qquad b_2 = -\frac{\cos x_3}{(M + m\sin^2 x_3)\, L}$$
Let d1 (t) = 0 and d2 (t) = 0 in (1). We can obtain the nominal system of this uncertain system.
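A small numerical sketch of the crane model (1) with the expressions for f1, f2, b1 and b2 above; the parameter values are the ones given later in the simulation section of this paper, and the Euler step is only for illustration.

```python
import numpy as np

# Parameter values from the simulation section of this paper
M, m, L, g = 37.32, 5.0, 1.05, 9.81

def crane_f(x, u, d1=0.0, d2=0.0):
    """State derivative of the overhead crane model (1): x = [x, xdot, theta, thetadot]."""
    _, x2, x3, x4 = x
    den = M + m * np.sin(x3) ** 2
    f1 = (m * L * x4**2 * np.sin(x3) + m * g * np.sin(x3) * np.cos(x3)) / den
    b1 = 1.0 / den
    f2 = -((m + M) * g * np.sin(x3) + m * L * x4**2 * np.sin(x3) * np.cos(x3)) / (den * L)
    b2 = -np.cos(x3) / (den * L)
    return np.array([x2,
                     f1 + b1 * u + d1,
                     x4,
                     f2 + b2 * u + d2])

# One Euler step from the initial state used later in the simulations
x = np.array([2.0, 0.0, 0.0, 0.0])
x = x + 0.01 * crane_f(x, u=0.0)
print(x)
```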
3
Design of RBF Network-Based Sliding Mode Controller
For designing the sliding mode control law of the uncertain system (1) in the sense of Lyapunov, its sliding surface is defined as s = c1 x1 + c2 x2 + c3 x3 + x4
(2)
here cj (j = 1, 2, 3) are constants and c2 is positive. Employing the equivalent control method [7], we define the control law as u = ueq + usw
(3)
here ueq is the equivalent control and usw is the switching control. After the sliding mode takes place, we have s˙ = 0
(4)
Substituting the equation of the nominal system into (4) yields the equivalent control law $u_{eq}$ as
$$u_{eq} = -\frac{c_1 x_2 + c_3 x_4 + c_2 f_1 + f_2}{c_2 b_1 + b_2} \quad (5)$$
Define a Lyapunov function as $V(t) = \frac{s^2}{2}$. Differentiating V with respect to time t and substituting (1)-(5) into it yields
$$\dot{V} = s\dot{s} = s\big[c_1 x_2 + c_2(f_1 + b_1 u + d_1) + c_3 x_4 + f_2 + b_2 u + d_2\big] = s\big[(c_2 b_1 + b_2)u_{sw} + c_2 d_1 + d_2\big] \quad (6)$$
Let
$$u_{sw} = -\frac{k s + \eta\,\mathrm{sgn}(s)}{c_2 b_1 + b_2} \quad (7)$$
where both k and η are positive constants. Substituting (7) into (6) yields
$$\dot{V} = -k s^2 - \eta|s| + s(c_2 d_1 + d_2) \le -k s^2 - (\eta - c_2|d_1| - |d_2|)|s| \quad (8)$$
From (8), we can choose a positive constant η0 and define $\eta = \eta_0 + c_2\bar{d}_1 + \bar{d}_2$ to guarantee the system stability if the upper bounds of the uncertainties, $\bar{d}_1 = \sup d_1(t)$ and $\bar{d}_2 = \sup d_2(t)$, are known. In practice, it is hard to obtain these upper bounds; thus, it is necessary to approximate them to guarantee the system stability. Since RBF networks have the ability to approximate complex nonlinear mappings directly from input-output data with a simple topological structure [12], we propose an approach to approximate the upper bounds of d1(t) and d2(t) in (1) by RBF networks. Define the vector x as the network input, and $\hat{d}_i$ (i = 1, 2), the estimated value of $d_i(t)$, as the network outputs. Then, the i-th RBF network output is determined as
$$\hat{d}_i(x, w_i) = w_i^T \Phi_i(x) \quad (9)$$
where $w_i \in R^{n^*\times 1}$ is the weight vector of the i-th RBF neural network, $n^*$ is the number of hidden neurons, and $\Phi_i(x) = [\phi_{i1}(x), \phi_{i2}(x), \cdots, \phi_{in^*}(x)]^T$ is a radial basis function vector, where the k-th RBF function of the i-th network is determined as
$$\phi_{ik}(x) = \exp\left(-\frac{\|x - \gamma_{ik}\|^2}{\delta_{ik}^2}\right) \quad (10)$$
where $\gamma_{ik}$ and $\delta_{ik}$ depict the center and width of the k-th hidden neuron of the i-th RBF network. To deduce the update formulas, we make the following assumptions [13].
A1: There exists an optimal weight $w_i^*$ so that the output of the i-th optimal network satisfies $|w_i^{*T}\Phi_i(x) - \bar{d}_i| < \epsilon_{i0}$, where $\epsilon_{i0}$ is a positive constant.
A2: The norm of the system uncertainties and its upper bound satisfy $|\bar{d}_i - d_i(t)| > \epsilon_{i1} > \epsilon_{i0}$.
To get the update formulas, we redefine a Lyapunov function as
$$V_n(t) = \frac{s^2}{2} + \sum_{i=1}^{2} \frac{\alpha_i^{-1}\tilde{w}_i^T \tilde{w}_i}{2} \quad (11)$$
Here $\tilde{w}_i = w_i^* - w_i$, $\dot{\tilde{w}}_i = -\dot{w}_i$, and $\alpha_1$ and $\alpha_2$ are positive constants. Differentiating $V_n$ with respect to time t yields
$$\dot{V}_n(t) = s\dot{s} - \sum_{i=1}^{2}\alpha_i^{-1}\tilde{w}_i^T\dot{w}_i \le -ks^2 - (\eta - c_2|d_1| - |d_2|)|s| - \sum_{i=1}^{2}\alpha_i^{-1}(w_i^* - w_i)^T\dot{w}_i \quad (12)$$
Let
$$\dot{w}_1 = \alpha_1 c_2 \Phi_1(x)|s| \quad (13a)$$
$$\dot{w}_2 = \alpha_2 \Phi_2(x)|s| \quad (13b)$$
Substituting them into (12), we have
$$\begin{aligned}\dot{V}_n(t) &\le -ks^2 - (\eta - c_2|d_1| - |d_2|)|s| - c_2(w_1^* - w_1)^T\Phi_1(x)|s| - (w_2^* - w_2)^T\Phi_2(x)|s| \\ &= -ks^2 - \eta_0|s| - c_2|s|(\bar{d}_1 - \hat{d}_1) - |s|(\bar{d}_2 - \hat{d}_2) - c_2|s|\big[w_1^{*T}\Phi_1(x) - |d_1|\big] - |s|\big[w_2^{*T}\Phi_2(x) - |d_2|\big] \\ &= -ks^2 - \eta_0|s| - c_2(\epsilon_{11} - \epsilon_{10})|s| - (\epsilon_{21} - \epsilon_{20})|s|\end{aligned} \quad (14)$$
4
Simulation Results
The presented method in this section will be applied on transport control of an overhead crane system, whose physical parameters are determined as M = 37.32 kg, m = 5 kg, l = 1.05 m, g = 9.81m · s−2 . The initial and desired states, x0 and xd , are [2 0 0 0]T and [0 0 0 0]T , respectively. The parameters of the sliding surface s and the switching control are picked up as c1 = 1.6, c2 = 2.3, c3 = −8.7, and k = 10, η0 = 0.2. The center γik and width δik of the k-th hidden neuron of the i-th RBF network are designed as random numbers in the interval (0, 1). α1 , α2 , and n∗ are selected as 104 , 104 and 6 after trial and error. Here, d1 (t) and d2 (t) in (1) are assumed as random uncertainties. The simulation results in Fig. 2 illustrate the comparison with and without NNs, where the blue solid depicts the results with NNs approximating the unknown bounds and the black dash illustrates the results with no NNs. Under the conditions of no NNs to approximate the unknown bounds, we select a larger value of η, η = 2, to ensure the stability of this system. As we have pointed out, the presented RBF-network-based SMC method with the update formulas (12) makes the crane system with unknown bounded uncertainties asymptotically stable. Further, although the curves of cart position x1 and load angle x3 in Fig. 2 almost make no difference, the plot of the
D. Qian, B.Zhang, and X. Liu 0.3
1.5 1
0.5 0
−0.5 0
3
6
0.2 0.1 0 −0.1 0
9
Time (s)
5
estimated d
4
estimated d
Control input u [N] with NNs
Output of NNs
6 1 2
3 2 1 0 0
3
6 Time (s)
9
4 with NN with no NN desired
Sliding surface s
with NN with no NN desired
Load Angle x2 [rad]
Cart Position x1 [m]
2
3 6 Time (s)
500
0
−500
−1000 0
with NN
3 6 Time (s)
9
3 2 1 0 −1 0
9
Control input u [N] with no NNs
82
3 6 Time (s)
9
500
0 with no NN −500
−1000 0
3 6 Time (s)
9
Fig. 2. Comparison with and without NNs to estimate the upper bounds. (a) cart position, (b) load angle, (c) sliding surface, (d) outputs of RBF NNs dˆ1 & dˆ2 , (e) control input with NNs, (f) control input with no NNs.
control input u with NNs is superior. This indicates that a larger fixed η with no NNs leads to the chattering phenomenon in the control input. On the contrary, η is variable and adapts during the dynamic process if the upper bounds of the uncertainties are estimated by the outputs of the NNs. This effectively reduces the chattering phenomenon of the control input.
5
Conclusion
This paper has presented a robust sliding mode controller based on RBF neural networks for overhead crane systems with uncertain dynamics. The update formulas of the RBF networks are deduced from the Lyapunov method to ensure the stability of the control system. Compared with the results without the NN approximator, the simulation results illustrate the feasibility and robustness of the presented method. The main contribution of the presented approach is its ability to solve the control problem of overhead crane systems with unknown bounded uncertainties.
References 1. Yi, J.Q., Yubazaki, N., Hirota, K.: Anti-swing and positioning control of overhead traveling crane. Information Sciences 115(1-2), 19–42 (2003) 2. Yang, J.H., Yang, K.S.: Adaptive coupling control for overhead crane systems. Mechatronics 17(2-3), 143–152 (2007)
3. Fang, Y., Dixon, W.E., Dawson, D.M., Zergeroglu, E.: Nonlinear coupling control laws for an underactuated overhead crane system. IEEE/ASME Transactions on Mechatronics 8(3), 418–423 (2003)
4. Yang, T.W., O'Connor, W.J.: Wave based robust control of a crane system. In: Proc. of IEEE/RSJ International Conference on Intelligent Robots & Systems, Beijing, pp. 2724–2729 (2006)
5. Chang, C.Y.: Adaptive fuzzy controller of the overhead cranes with nonlinear disturbance. IEEE Transactions on Industrial Informatics 3(2), 164–172 (2007)
6. Kroumov, V., Yu, J., Shibayama, K.: 3D path planning for mobile robots using simulated annealing neural network. International Journal of Innovative Computing, Information and Control 6(7), 2885–2899 (2010)
7. Utkin, V.I.: Sliding modes in control and optimization. Springer, New York (1992)
8. Shyu, K.K., Jen, C.L., Shang, L.J.: Design of sliding-mode controller for anti-swing control of overhead cranes. In: Proc. of the 31st Annual Conference of IEEE Industrial Electronics Society, pp. 147–152 (2005)
9. Bartolini, G., Pisano, A., Usai, E.: Second-order sliding-mode control of container cranes. Automatica 38(10), 1783–1790 (2002)
10. Chen, W., Saif, M.: Output feedback controller design for a class of MIMO nonlinear systems using high-order sliding-mode differentiators with application to a laboratory 3-D crane. IEEE Transactions on Industrial Electronics 55(11), 3985–3997 (2008)
11. Park, M.S., Chwa, D., Hong, S.K.: Antisway tracking control of overhead cranes with system uncertainty and actuator nonlinearity using an adaptive fuzzy sliding-mode control. IEEE Transactions on Industrial Electronics 55(11), 3972–3984 (2008)
12. Huang, G.B., Saratchandran, P., Sundararajan, N.: A generalized growing and pruning RBF (GGAP-RBF) neural network for function approximation. IEEE Transactions on Neural Networks 16, 57–67 (2005)
13. Man, Z., Wu, H.R., Palaniswami, M.: An adaptive tracking controller using neural networks for a class of nonlinear systems. IEEE Transactions on Neural Networks 9(5), 947–955 (1998)
14. Qian, D.W., Yi, J.Q., Zhao, D.B.: Hierarchical sliding mode control for a class of SIMO under-actuated systems. Control & Cybernetics 37(1), 159–175 (2008)
Sliding Mode Prediction Based Tracking Control for Discrete-time Nonlinear Systems Lingfei Xiao and Yue Zhu College of Energy and Power Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu, 210016, China
Abstract. A novel tracking control algorithm based on sliding mode prediction is proposed for a class of discrete-time nonlinear uncertain systems in this paper. By constructing a sliding mode prediction model at first, the future, present and history information of the system can be used to improve the performance of sliding mode control. Because of sliding mode reference trajectory, the reachability of sliding mode is achieved, and the reaching mode can be determined by designers in advance. Due to feedback correction and receding horizon optimization, the influence of uncertainty can be compensated in time and control law is obtained subsequently. Theoretical analysis proves the closed-loop system possesses robustness to matched or unmatched uncertainty, without requiring the known bound of uncertainty. Simulation results illustrate the validity of the presented algorithm. Keywords: Sliding mode prediction, sliding mode control, discrete-time nonlinear systems, tracking.
1
Introduction
As a powerful robust control method, sliding mode control (SMC) has been researched since the early 1950's [1]. One of the reasons for the popularity of SMC is its remarkably good invariance to matched uncertainty on the sliding surface. With the wide use of micro-controllers, the study of discrete-time SMC (DSMC) is required, because when continuous-time SMC schemes are implemented on digital devices, some advantages of SMC are lost. In [2], Gao et al. specified desired properties of the controlled systems and proposed a reaching-law-based approach to realize DSMC. For a single-input single-output (SISO) linear uncertain plant, [3] addressed a discrete-time variable-structure repetitive control algorithm based on a newly created delay reaching law. In [4], Chen presented a robust adaptive sliding mode tracking controller for discrete-time multi-input multi-output systems; bounded motion of the system around the sliding surface and stability of the global system were guaranteed when all signals were bounded. Using a recursive switching function, [5] presented a DSMC
This paper is supported by Natural Science Foundation of China (NO. 61004079) and NUAA Research Funding (NO. NS2010050).
algorithm based on [2]. The algorithm produced a low-chattering control signal. In [6], a robust controller was developed by employing the reaching law in [2], provided that the upper bound of the uncertainty was known. In [7], the Takagi-Sugeno (T-S) fuzzy model was introduced to transform the considered nonlinear discrete-time systems into a linear uncertain system model, and then a modified reaching law was used; however, the guideline for constructing the T-S model was not given. [8] discussed the problems involved in the discretization of continuous-time terminal sliding mode (TSM) and presented a discrete-time TSM algorithm. None of the above works took optimization into consideration. The purpose of this paper is to design a suitable robust tracking controller for a class of discrete-time nonlinear uncertain systems. Because SMC is an effective nonlinear robust control method, SMC is used to achieve the control objective. Nevertheless, SMC only possesses strong robustness to matched uncertainty; the corresponding control signal is required to be large in order to force arbitrary states to reach the sliding surface, which may result in controller saturation; and the bound of the uncertainty is often required to be known, which may not be available in practice and makes the closed-loop system highly conservative. Considering these shortcomings, the predictive control (PC) strategy is utilized to improve the performance of SMC. According to PC, a prediction model, feedback correction (FC) and receding horizon optimization (RHO) are employed during the design process [9]. Therefore, a sliding mode prediction model (SMPM) is created first, so that the future, present and history information in the system can be used. Considering model mismatch, the error between the history output of the SMPM and the present sliding mode value is used to make a feedback correction for the SMPM. Because of the sliding mode reference trajectory (SMRT), the reachability of the sliding mode is obtained. Since the SMRT is optional, the reaching mode can be determined by designers in advance. By RHO, the control signal is obtained and can be optimized online. Due to FC and RHO, the influence of uncertainty can be compensated in time. Theoretical analysis proves that the closed-loop system possesses robustness to matched or unmatched uncertainty, without requiring the bound of the uncertainty to be known. This paper is organized as follows. Section 2 describes the considered systems and the control objective. Section 3 contains the main results, i.e., the creation of the SMPM, the feedback correction for the SMPM, the selection of the SMRT, the construction of the control law and the analysis of robustness. In Section 4, the advantages of the presented algorithm are verified by a numerical example. Finally, Section 5 draws the conclusions of the paper.
2
Problem Formulation
In the following, we consider the uncertain system
$$x_i(k+1) = x_{i+1}(k) + d_i(k), \quad i = 1, \ldots, n-1$$
$$x_n(k+1) = f\big(x(k)\big) + g\big(x(k)\big)\,u(k) + d_n(k) \quad (1)$$
where $x_i(k)$ and $u(k)$ stand for $x_i(kT)$ and $u(kT)$, respectively, where T is the sampling period; $x(k) = [x_1(k), \ldots, x_n(k)]^T \in R^n$ is the measurable state vector,
$u(k) \in R$ is the control input, $f(x(k))$ and $g(x(k))$ are smooth functions. $d(k) = [d_1(k), \ldots, d_n(k)]^T \in R^n$ is the matched or unmatched uncertainty vector, which includes all parameter perturbations, modeling errors and external disturbances in the system. For brevity, we will use the notations $f(k) = f(x(k))$ and $g(k) = g(x(k))$ in the remainder of the text. Assumption: $g(k)$ is a positive function and $g_L > 0$ is its known lower bound, i.e., $g(k) \ge g_L > 0$. The objective of this paper is to construct a control law $u(k)$ such that the state vector $x(k)$ tracks a desired state vector $x_d(k)$ in a stable manner, where $x_d(k) = [x_{d1}(k), \ldots, x_{dn}(k)]^T \in R^n$ is a known vector and satisfies $x_{di}(k+1) = x_{di+1}(k)$, $x_{dn}(k+1) = a(k)$, $(i = 1, \ldots, n-1)$, with $a(k)$ a known function. Defining the tracking error as $e(k) = x_d(k) - x(k)$, the corresponding error system of (1) is
$$e_i(k+1) = e_{i+1}(k) - d_i(k), \quad i = 1, \ldots, n-1$$
$$e_n(k+1) = a(k) - f(k) - g(k)u(k) - d_n(k) \quad (2)$$
3 Main Results
In terms of SMC theory, there are two steps to accomplish the design: 1) determine a sliding surface which possesses the desired dynamics; 2) design a control law such that the tracking error is driven to the sliding surface and remains on it thereafter. In the following, after specifying a suitable sliding surface, a novel SMPM will be constructed. Based on the SMPM, by means of feedback correction and receding horizon optimization, a satisfactory sliding mode control law will be obtained.
3.1 The Creation of the Sliding Mode Prediction Model (SMPM)
Define the switching function as
$$s(k) = c\,e(k) = \sum_{i=1}^{n-1} c_i e_i(k) + e_n(k) \quad (3)$$
where $c = [c_1, \ldots, c_n]$, with $c_n = 1$. Therefore, the sliding surface is $S = \{e(k)\,|\,s(e(k)) = 0\}$. In (3), the $c_i$ should be chosen such that the roots of $\sum_{i=1}^{n} c_i z^{i-1} = 0$ lie in the open unit disk, where z is the shift operator; thus the stability and dynamic performance of the ideal sliding mode can be guaranteed. In order to obtain the desired control law, we would like to use the future, present and history information in the system. Therefore, the following sliding mode prediction model (SMPM) is created to predict the one-step-ahead value of the switching function:
$$s_m(k+1) = \sum_{i=1}^{n-1} c_i e_{i+1}(k) + a(k) - f(k) - g(k)u(k) + \phi(k)u(k) \quad (4)$$
where $\phi(k) > 0$ satisfies
$$\phi(k) = \frac{g(k) + \sqrt{g^2(k) - 4\lambda}}{2} \quad (5)$$
with λ a designable positive parameter. The guideline for adjusting λ will be given in subsection 3.4. To ensure the existence of φ(k) and φ(k) > 0, $g^2(k) - 4\lambda > 0$ is required at every time instant k, so $0 < \lambda < \frac{g_L^2}{4}$. In SMRT (14), 0 < β < 1 and γ > 0; $\mathrm{sat}\big(\frac{s_r(k)}{\Omega}\big)$ equals $\mathrm{sgn}\big(\frac{s_r(k)}{\Omega}\big)$ when $|s_r(k)| > \Omega$ and $\frac{s_r(k)}{\Omega}$ when $|s_r(k)| \le \Omega$; sgn(·) is the sign function; and $\Omega = \frac{\gamma}{1-\beta}$ is the width of the boundary layer. Clearly, SMRT (14) is in the form of the reaching law [2] with sgn(·) replaced by sat(·) in order to reduce chattering [10]. When SMRT (14) is selected, the reaching mode of the system under the algorithm in this paper is similar to that under the reaching law method.
3.4 The Construction of the Control Law
Generally, a predictive control strategy determines the control signal by optimizing a performance index; therefore, it is a kind of optimization algorithm. However, predictive control is different from traditional optimal control: it solves a finite-horizon optimal control problem and implements receding horizon optimization [9]. Supposing the predictive horizon is M sampling instants, the approach of receding horizon optimization is: for the present sampling instant k, calculate the control
input signals from sampling instant k to sampling instant k + M - 1, namely to obtain $[u(k), u(k+1), \ldots, u(k+M-1)]^T$. However, only the present control input signal u(k) is implemented; the other elements $u(k+1), \ldots, u(k+M-1)$ are not used for control, but serve as initial values for the next optimization. At the next sampling instant, the control input signal u(k+1) is calculated recursively. In this paper, for simplicity and to make the design principle prominent, one-step receding horizon optimization is used in the following. Now, the performance index is given as
$$J = \big(s_r(k+1) - \tilde{s}_m(k+1)\big)^2 + \lambda\Big(u(k) - \frac{\phi(k-1)}{\phi(k)}u(k-1)\Big)^2 \quad (15)$$
where $0 < \lambda < \frac{g_L^2}{4}$ is a weight coefficient, which adjusts the trade-off between the closed-loop SMPM tracking error and the control effort. When the control signal changes sharply, increasing λ will bring better control performance [9]. Define
$$\delta(k) = \frac{\phi(k-1)}{\phi(k)} \quad (16a)$$
$$q(k) = g(k) - \phi(k) \quad (16b)$$
$$p(k) = s_r(k+1) - \Sigma(k) - a(k) + f(k) - \bar{s}(k) \quad (16c)$$
According to (4), (9), (16a), (16b) and (16c), (15) can be rewritten as
$$J = \big(s_r(k+1) - s_m(k+1) - \bar{s}(k)\big)^2 + \lambda\big(u(k) - \delta(k)u(k-1)\big)^2 = \big(p(k) + q(k)u(k)\big)^2 + \lambda\big(u(k) - \delta(k)u(k-1)\big)^2 \quad (17)$$
The solution of minimizing (17) gives the control signal u(k). Obviously, the minimum of J can be computed analytically. By setting the partial derivative of J to zero, i.e., $\frac{\partial J}{\partial u(k)} = 0$, and solving the resulting equation, the optimal solution of u(k) is
$$u(k) = -\frac{p(k)q(k) - \lambda\delta(k)u(k-1)}{q^2(k) + \lambda} \quad (18)$$
The Analysis of Robustness
Consider the uncertain system (2), according to (8) and (18), yields p(k)q(k) − λδ(k)u(k − 1) s(k + 1) = Σ(k) + a(k) − f (k) + g(k) − cd(k) q 2 (k) + λ
90
L. Xiao and Y. Zhu
Because of (5) and (16b), gives φ2 (k)−g(k)φ(k)+λ = 0 ⇒ g 2 (k)+φ2 (k)−2g(k)φ(k)+λ = g 2 (k)−g(k)φ(k) 2 ⇒ q 2 (k) + λ = g(k)q(k) ⇒ g(k) − φ(k) + λ = g(k) g(k) − φ(k) therefore,
g(k)q(k) q2 (k)+λ
= 1 and
g(k) q2 (k)+λ
= q −1 (k). Thus
s(k + 1) = Σ(k) + a(k) − f (k) + p(k) − q −1 (k)λδ(k)u(k − 1) − cd(k) = sr (k + 1) − s¯(k) − q −1 (k)λδ(k)u(k − 1) − cd(k)
(19)
Delaying each term in (8) a step gives s(k) = Σ(k − 1) + a(k − 1) − f (k − 1) − g(k − 1)u(k − 1) − cd(k − 1)
(20)
According to (11), (10) and (20), yields s¯(k) = −cd(k − 1) − φ(k − 1)u(k − 1)
(21)
Substituting (21) into (19) gives s(k + 1) = sr (k + 1) + cd(k − 1) + φ(k − 1)u(k − 1) − q −1 (k)λδ(k)u(k − 1) − cd(k) = sr (k + 1) + cd(k − 1) − cd(k) + φ(k − 1) − q −1 (k)λδ(k) u(k − 1) (22) In terms of (5), q(k) > 0 and φ(k) > 0, one can see q(k)φ(k) = g(k) − φ(k) φ(k) g(k) + g 2 (k) − 4λ g(k) + g 2 (k) − 4λ g 2 (k) − g 2 (k) − 4λ = g(k) − = =λ 2 2 4 λ λ ⇒ = 1 ⇒ 1− φ(k − 1) = 0 ⇒ φ(k − 1) − q −1 (k)λδ(k) = 0 q(k)φ(k) q(k)φ(k) Therefore, (22) reduces to s(k + 1) = sr (k + 1) + cd(k − 1) − cd(k) Theorem 1. The closed-loop error system (2) is stable robustly under the control law (18), such that the following inequality holds, |cd(k) − cd(k − 1)| ≤ ξ
(23)
where ξ is the boundary of the change rate of uncertainty and is a known positive constant. Proof. Because SMRT can be chosen as arbitrary trajectory which converges to sd or a vicinity of sd = 0, that is ∃kN < ∞, ∀k > kN , such that the following inequality holds, |sr (k + 1)| < η (24)
Sliding Mode Prediction Based Tracking Control
91
where η is the boundary of sliding mode reference trajectory and is a positive constant. When SMRT (12) is used, |sr (k + 1)| = 0, thus η = 0; if SMRT (13) is applied, due to 0 < α < 1, obviously ∃kN < ∞, ∀η > 0, k > kN , such that |sr(k+1) | < η; once SMRT (14) is selected, η = Ω. Here, let ε = η + ξ, clearly ε is a positive constant as a result. On account of (23) and (24), one can see when k > kN ,
|s(k + 1)| = sr (k + 1)− cd(k)−cd(k − 1) ≤ |sr (k + 1)|+ cd(k)−cd(k − 1) < η + ξ = ε
Therefore, there exists |s(k + 1)| < ε, namely, system (2) satisfies the reaching condition of quasi-sliding mode [11] in the ε vicinity of sliding surface S = {e(k)|s e(k) = 0}. Hence, the practical sliding mode motion of the closed-loop system will definitely converge to a ε vicinity of sliding surface and stay in it subsequently. Since the stability and dynamic performance of ideal sliding mode has been guaranteed by (3), the closed-loop system (2) with control law (18) is robustly stable. Consequently, the design objective is completed, namely, the state vector x(k) can track desired state vector xd (k) in a stable manner, even when uncertainty appears in the system. Remark 4. Generally speaking, it is sound to assume that the change rate of uncertainty is bounded [11], namely, (23) can be satisfied. Especially, for slowly varying uncertainty or sample period T → 0, quasi-sliding mode band width ε will be tiny, and ξ may be small as a result.
4
Numerical Simulation
Consider the following third order discrete-time nonlinear uncertain system, x1 (k + 1) = x2 (k) + d1 (k) x2 (k + 1) = x3 (k) + d2 (k)
(25)
x3 (k + 1) = f (k) + g(k)u(k) + d3 (k) where f (k) = x1 (k) + x1 (k)x2 (k) + x23 (k), g(k) = 1 + e−k , Δf (k) = 0.2f (k), Δg(k) = −0.3g(k), d1 (k) = 0.1 sin(3k), d2 (k) = 0.05 cos(5k) and d3 (k) = Δf (k) + Δg(k)u(k) + 0.025 sin(10k). ⎧ 1 ≤ k ≤ 50 ⎨ 0.5 + 0.1 sin(50k), Supposing a(k) = 0.25 + 0.1 sin(50k), 51 ≤ k ≤ 150 , choosing c = [ 0.02, −0.3, 1 ], ⎩ −0.75 + 0.1 sin(50k), 151 ≤ k ≤ 200
γ λ = 0.15 and 1−β = 0.5, γ = 0.1, Ω = 1−β in SMRT (14), assuming uncertainty influences system from k = 101 to k = 105. The simulation results of the closedloop system with the presented algorithm are illustrated in Fig. 1–2. Because reaching law method is widespread in SMC area, the system (25) is controlled under reaching law method [2] with boundary layer [10] as well. That
L. Xiao and Y. Zhu
Fig. 1. Algorithm in this paper: e(k)
Fig. 2. Algorithm in this paper: (u(k), s(k))
Fig. 3. [2] with boundary layer: e(k)
Fig. 4. [2] with boundary layer: (u(k), s(k))
Fig. 5. DTSM algorithm in [8]: e(k)
Fig. 6. DTSM algorithm in [8]: (u(k), s(k))
is, let $s(k+1) = (1-\beta)s(k) - \gamma\,\mathrm{sat}\big(\frac{s(k)}{\Omega}\big)$. The same desired state vector $x_d(k)$, switching function, uncertainty, $1-\beta$, $\gamma$ and $\Omega$ are employed. The results are shown in Fig. 3-4. Besides, we apply the discrete-time terminal sliding mode (DTSM) algorithm [8] to the system (25). According to [8], the switching function is $s(k) = e_3(k)$ and the control law is $u(k) = g^{-1}(k)\big(a(k) - f(k)\big)$. The same desired state vector $x_d(k)$ and uncertainty are used. The results are shown in Fig. 5-6. Comparing Fig. 1-2 with Fig. 3-6, after the uncertainty appears, the maximum absolute values of the tracking error |e(k)|, the switching function |s(k)|, and the control signal |u(k)| are all smaller in Fig. 1-2; thus the closed-loop system possesses stronger robustness under the control of the algorithm in this paper.
5
Conclusions
The robust tracking control algorithm presented in this paper is a combination of the predictive control strategy and the sliding mode control method. Because of the designed sliding mode prediction model, the future information of the sliding mode can be used. Due to feedback correction and receding horizon optimization, the desired control law is obtained and can be optimized continually. The analysis of robust stability proves that the closed-loop system has strong robustness to uncertainty with unknown bound. The simulation results verify the advantages of the proposed algorithm.
References 1. Utkin, V.: Variable structure systems with sliding modes. IEEE Trans. Automat. Contr. 22, 212–222 (1977) 2. Gao, W.B., Wang, Y., Homaifa, A.: Discrete-time variable structure control system. IEEE Trans. Ind. Electron. 42, 117–122 (1995) 3. Sun, M., Wang, Y., Wang, D.: Variable-structure repetitive control: A discrete-time strategy. IEEE Trans. Ind. Electron. 52, 610–616 (2005) 4. Chen, X.: Adaptive sliding mode control for discrete-time multi-input multi-output systems. Automatica 42, 427–435 (2006) 5. Munoz, D., Sbarbaro, D.: An adaptive sliding-mode controller for discrete nonlinear systems. IEEE Trans. Ind. Electron. 47, 574–581 (2000) 6. Liao, T.-L., Chien, T.-I.: Variable structure controller design for output tracking of a class of discrete-time nonlinear systems. JSME International Journal Series C 45, 462–469 (2002) 7. Zheng, Y., Dimirovski, G.M., Jing, Y., Yang, M.: Discrete-time sliding mode control of nonlinear systems. In: Proceedings of the 2007 American Control Conference, New York, USA, July 11-13, pp. 3825–3830 (2007) 8. Janardhanan, S., Bandyopadhyay, B.: On discretization of continuous-time terminal sliding mode. IEEE Trans. Automat. Contr. 51, 1532–1536 (2006) 9. Xi, Y.G.: Predictive control, 1st edn., pp. 18–90. National Defence Industry Press, Beijing (1993) 10. Slotine, J.J., Sastry, S.S.: Tracking control of nonlinear systems using sliding surfaces with application to robot manipulator. Int. J. Contr. 38, 465–492 (1983) 11. Bartoszewicz, A.: Discrete-time quasi-sliding-mode control strategies. IEEE Trans. Ind. Electron. 45, 633–637 (1998)
A Single Shot Associated Memory Based Classification Scheme for WSN Nomica Imran and Asad Khan School of Information Technology Monash University, Clayton, Victoria, Australia {nomica.choudhry,asad.khan}@infotech.monash.edu.au
Abstract. Identifier based Graph Neuron (IGN) is a network-centric algorithm which envisages a stable and structured network of tiny devices as the platform for parallel distributed pattern recognition. The proposed scheme is based on a highly distributed associative memory which enables the objects to memorize some of their internal critical states for a real-time comparison with those induced by transient external conditions. The approach not only saves the power resources of sensor nodes but is also effectively scalable to large-scale wireless sensor networks. Besides that, our proposed scheme overcomes the issue of false-positive detection, which existing associative memory based solutions suffer from, and hence assures accurate results. We compare Identifier based Graph Neuron with two of the existing associative memory based event classification schemes, and the results show that Identifier based Graph Neuron correctly recognizes and classifies the incoming events in a comparable amount of time and messages. Keywords: Classification, Pattern Identification, Associated Memory Based Techniques, Wireless Sensor Networks.
1
Introduction
Wireless sensors are spatially distributed tiny battery-operated devices. These sensors work autonomously and monitor physical or environmental conditions (e.g. heat, pressure, sound, light, electro-magnetic field, vibration, etc.). Each sensor comprises a receiver to sense the environment and a transmitter to disseminate this sensed data. Besides sensing and transmitting, current sensor nodes are also capable of storing and processing. Wireless sensor networks play a central role in achieving the goal of truly ubiquitous computing and smart environments. These sensor networks sense their environment and produce large amounts of data. The generated data is usually in raw form and is of little use; to extract any meaning from this data, it must be processed. The data processing required is application specific. Some applications prefer data fusion or aggregation, in which the processed data is combined from multiple sources to achieve inference. This aggregated information is often useful for other complex applications including network monitoring, load balancing, identifying hot spots,
and reaching consensus in distributed control and decision applications. However, these schemes suffer from the issue of a central point of failure. In certain other applications [1], sensors are required to be deployed densely. For example, a large quantity of sensor nodes could be deployed over a battlefield to detect enemy intrusion. Because of the dense deployment of sensors, most of the data generated is redundant and of no use. Transferring all of this data (including the redundant part) to the base station is not a wise choice, as data transmission consumes more energy than any other operation; the radio of a sensor node is usually the most power-hungry component. Data communication takes up approximately 90% of the scarce battery power of a sensor node. Hence, reducing communication is desirable for a solution to achieve efficiency. A simple approach to reduce the communication is to reduce the amount of data collected by sparsely deploying the sensors only at key locations. Identifying such key spots is itself a challenging issue. Moreover, the data collected from sparsely deployed sensors is usually not consistent and is therefore hard to verify for accuracy. Other applications demand an even more generalized approach where information is extracted out of streams of data in the form of patterns. Existing distributed pattern recognition methods are also not considered feasible for wireless networks. Some of these approaches focus on minimizing the error rate by creating feature patterns: the raw patterns are first transformed to lower-dimensional feature vectors preserving the salient class information of the original measurements. However, such a transformation cannot improve the theoretically optimal recognition results. Other methods are based on a probabilistic approach, such as Bayesian classifiers. These recognition methods require a large training set, which in turn requires more resources. This makes it hard to apply these methods in a WSN. Traditional signal-processing-based pattern detection schemes also prove to be too complex to use in a WSN. The current technologies still cannot comprehend the situation; a better approach needs to be developed. In this paper, we present a light-weight pattern classification scheme, called Identifier based Graph Neuron (IGN). It is a neural network based pattern recognition algorithm. The motivation to use neural network based methods in WSNs is the analogy between WSNs and ANNs. The proposed classifier is built while keeping the general characteristics of WSNs in mind. One of the peculiarities of this technique is the employment of parallel in-network processing capabilities to address scalability issues effectively. At the same time, we have tried to make the scheme as close to the network as possible. Most of the processing is carried out in the network through in-network processing. This results in reducing the data as much as possible, thereby saving communication cost.
2
Related Work
One of the well-known neural networks is the recurrent neural network (RNN) [2], [3]. RNN is a class of neural network where connections between units form a directed cycle. This creates an internal state of the network which allows it
to exhibit dynamic behavior. Unlike feed-forward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Though the performance of these techniques is quite reasonable, they still have some intrinsic drawbacks. These drawbacks are more prominent when these networks are applied to the recognition of patterns in noise-rich and high-resolution streams. The major concern is that their training is very slow and they require a lot of memory. At the same time, their execution and analysis are not real-time (statistical methods), and reasonable performance is restricted to specific temporal problems (LSTM). Without extra adaptations these methods suffer from noisy input, making them ill-suited for wireless sensor network applications. Apart from these recognition strategies, the development of Hopfield's associative memory [6] has given a new opportunity in recognizing patterns. Associative memory applications provide means to recall patterns using similar or incomplete patterns. Associative memory is different from conventional computer memory: in conventional computer memory, contents are stored and recalled based on the address, whereas in an associative memory architecture contents are recalled based on how closely they are associated with the input (probe) data. Khan [7] has replaced the term content-addressable, introduced by Hopfield [8], with associative memory (AM); hence, content-addressable memory is interchangeable with AM. The spatio-temporal encoded, or spiking, neurons developed by Hopfield [6] draw upon the properties of loosely coupled oscillators. However, the problem of scaling up the AM capacity still remains. Izhikevich [9] states that, though some new capabilities at differentiating between similar inputs have been revealed, there is no clear evidence to suggest their superiority over the classical Hopfield model in terms of an overall increase in the associative memory capacity. Hence, one of the motivations behind GN development was to create an AM mechanism which can scale better than the contemporary methods. Furthermore, we needed to create an architecture whose accuracy would not depend on preprocessing steps such as pattern segmentation [10] or training [7]. From the discussion of several pattern recognition techniques, it seems that GN [11] is a promising AI architecture for recognizing patterns in a WSN.
3 A Simple GN-Based Classifier
Communication between GN nodes proceeds through the following stages: 1. Stage I: Each p(position, value) pair of the input pattern P is sequentially broadcast to all the nodes in the network. The GN nodes that are predetermined to receive these values respond by storing the input pairs. 2. Stage II: The end of the input pattern is determined by a synchronization phase, which informs all the nodes in the network about the end of an incoming pattern.
3. Stage III: Each of the GN nodes contacts its neighboring node(s) to determine whether these adjacent nodes have encountered any p(position, value) pairs during the current input pattern cycle. These stages can be accomplished in a fully parallel manner, which not only addresses the scalability concerns effectively but also decreases processing delays so that real-time processing is maintained. 3.1 False Positive Detection Problem in GN
Each GN in a GN application requires only three parameters to process an input pattern: its own value and the values of its adjacent GNs. This restricted view of the GNs leads to inaccurate results. For instance, several GNs can recall their sub-patterns simultaneously, which in turn leads to a full recall, even though the pattern being recalled may never have been presented before. Thus, from each active GN's perspective the recall of its sub-pattern is an accurate action, yet the recall of the pattern by the array is not justified, and the overall GN application fails in such a scenario. The following example explains the false-positive detection problem in detail. Example of False Positive Detection in GN. As an example, assume that there is a GN-based classifier that can receive patterns of length L = 5 over E = {x, y}. The constructed GN overlay has two rows, one each for x and y, and five columns. Suppose the GN overlay has already processed two input patterns, xxxyy and yxxyx. These patterns were processed by nodes n(x, 1), n(x, 2), n(x, 3), n(y, 4), n(y, 5) and n(y, 1), n(x, 2), n(x, 3), n(y, 4), n(x, 5), respectively, as shown in Table 1. Now, upon arrival of a third pattern xxxyx, nodes n(x, 1), n(x, 2), n(x, 3), n(y, 4), n(x, 5) change their state from inactive to active. Each active node transmits its value and position to its left and right active neighbors. On receipt of this information, all active nodes locally construct a sub-pattern. Subsequently, all active nodes find their locally constructed sub-patterns in their respective bias arrays and individually flag the incoming pattern as a recall. The recalls of all five sub-patterns by the individual nodes are correct from their local perspective, but the overall recall of the pattern by the GN is incorrect. This leads the GN-based classifier to falsely conclude that the incoming pattern xxxyx is already stored. This problem is commonly known as a false-positive recall. Table 1 shows the bias arrays of the nodes with non-empty bias-array values after the arrival of the third pattern xxxyx.
Table 1. Example: A GN Based Classifier
n(x, 1): -xx — -xx
n(y, 1): — -yx —
n(x, 2): xxx yxx xxx
n(y, 2): xyy xyx xyy
n(x, 3): xxy xxy xxy
n(y, 4): yy— —
n(x, 5): — yxyx-
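To make this failure mode concrete, the following short sketch (our own illustration written for these notes, not code from the paper; names such as `bias_arrays` and the '-' padding for a missing neighbour are our conventions) replays the example: after the sub-patterns of xxxyy and yxxyx are stored, every node that is active for the unseen pattern xxxyx finds its local sub-pattern in its bias array, so the array as a whole wrongly reports a recall.

```python
# Hypothetical illustration of the false-positive recall in a plain GN array.
def sub_patterns(pattern):
    """Return {(element, position): local sub-pattern} for one input pattern."""
    padded = "-" + pattern + "-"                      # '-' marks a missing neighbour
    return {(ch, i + 1): padded[i:i + 3] for i, ch in enumerate(pattern)}

bias_arrays = {}                                      # (element, position) -> stored sub-patterns
for stored in ["xxxyy", "yxxyx"]:                     # the two previously processed patterns
    for node, sub in sub_patterns(stored).items():
        bias_arrays.setdefault(node, set()).add(sub)

probe = "xxxyx"                                       # never stored as a whole pattern
recalls = [sub in bias_arrays.get(node, set())
           for node, sub in sub_patterns(probe).items()]
print(all(recalls))                                   # True -> every node recalls its sub-pattern,
                                                      # so the array falsely reports a stored pattern
```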
4 Proposed Scheme
In an Identifier-based GN (IGN) classifier, each node n(ei, j) is programmed to respond only to a specific element ei at a particular position j in a pattern. That is, node n(ei, j) can only process those patterns in P that have ei at the j-th position. Each node maintains an active/inactive state flag to indicate whether it is processing the incoming pattern. Initially all nodes are inactive. Upon arrival of a pattern, if a node finds its programmed element ei at the given position in the pattern, it switches its state to active; otherwise it remains inactive. Only active nodes participate in our algorithm; inactive nodes remain idle. At any time, there are exactly L active nodes in an IGN. Hence, every node n(ei, j) with j ≠ 0, L has exactly one active left neighbor and exactly one active right neighbor, whereas the terminal nodes n(ei, 0) and n(ei, L) have only one active right or left neighbor, respectively. 4.1 System Model
In this section, we model the wireless sensor network. Table 2 summarizes the notation used in this paper. Let there be N sensor nodes in the wireless sensor network. Each sensor senses particular data from its surroundings. Let E = {e1, e2, . . . , eE} be a non-empty finite set of the data elements that sensors sense from their surroundings. We find it convenient to describe input data in the form of patterns. A pattern over E is a finite sequence of elements from E; the length of a pattern equals the number of sensors in the system. We define P as the set of all possible patterns of length L over E: P = {p1, p2, . . . , pP}, where P is the total number of possible unique patterns in P and is computed as P = E^L. For example, if E = {x, y}, then P of length, say, 3 over E is: P = {xxx, xxy, xyx, xyy, yxx, yxy, yyx, yyy}. We model the IGN as a structured overlay G = {(E × L)} where L = {1, 2, . . . , L}: G = {n(ei, j)} for all ei ∈ E, j ∈ L, where n(ei, j) is the node in G at the i-th row and j-th column. The GN overlay can thus be visualized as a two-dimensional array of E rows and L columns, so the total number of nodes in G is E × L. We refer to all the nodes in the (j − 1)-th column as the left neighbors of any node n(∗, j) in the j-th column. Similarly, all the nodes in the (j + 1)-th column are called the right neighbors of n(∗, j).
Table 2. Notation of the proposed system model
N — number of sensor nodes
E — set of all elements {e1, e2, . . . , eE}
E — size of set E
L — pattern length
P — set of all patterns of length L over E
P — size of set P, equal to E^L
G — GN overlay, a two-dimensional array of E × L
n(ei, j) — a node at the i-th row and j-th column in G
Ls — local state of active node n(ei, j)
4.2 Proposed Protocol
On arrival of an input pattern P, each active node n(ei, j) stores ei for its j-th position. Each node n(ei, j) then sends its matched element ei to its active neighbors in columns (j + 1) and (j − 1); the IGNs at the edges send their matched elements to their single adjacent neighbor only. Upon receipt of the message, the active neighboring nodes update their bias arrays. Each active node n(ei, j) assigns a local state Ls to the received value: the local state Ls is a Recall if the added value is already present in the node's bias array, and a Store if it is a new value. An <ID> is generated for each added value. The following rules are observed by each active node n(ei, j) while creating states and assigning <ID>s to those states: Rule 1: Store(Si) > Recall(Ri). Store Si has a natural superiority over Recall Ri, i.e., Store(Si) > Recall(Ri). If the local state Ls of node n(ei, j) is Recall but the node receives a Store from either of its neighbors (j + 1) or (j − 1), it updates its own state from Recall(Ri) to Store(Si). Rule 2: All New Elements. If any of the elements presented to G does not exist in the bias array of any of the active nodes n(ei, j), the pattern is new. Each active node n(ei, j) creates a new <ID> by incrementing the maximum <ID> already stored in its bias array by 1. Rule 3: All Recalls with the Same ID. If every ei presented to G is a recall of a previously stored pattern with the same <ID>, the pattern is a repetition, and the same <ID> is allocated to it. Rule 4: All Recalls with Different IDs. If all the ei of the pattern P presented to G are recalls of already stored patterns but with different <ID>s, the pattern is new. Each active node n(ei, j) finds the maximum <ID> in its bias array and increments it by 1.
Rule 5: Mix of Store and Recall. If the end decision is a Store because of a mix of Store and Recall states, each active node n(ei, j) again finds the maximum <ID> in its bias array and increments it by 1. After generating local <ID>s for the generated states, each active node n(ei, j) enters the phase-transition mode. During the first round of this mode, all active nodes n(ei, j) share the locally generated <ID> and Ls with their (j + 1) and (j − 1) neighbors. On receiving these values, each node n(ei, j) compares them with its local values. If a received value and the local value are the same, the node's state does not change. If a received value differs from the local value, the node updates its local value according to the rules listed below: Transition Rule 1. If the active node n(ei, j) has received a greater value from its neighbor (j + 1), it upgrades its local state and transfers the updated value to its (j − 1) neighbor only. Transition Rule 2. If the values received from both neighbors (j + 1) and (j − 1) are smaller than the local value, node n(ei, j) upgrades its value and transfers the new value to both of its neighbors. When the pattern is resolved, the generated <ID> is stored in the bias array of each active IGN node. The bias array does not contain duplicate <ID>s, and the <ID>s in the bias array need not be in order. Once the pattern has been stored, a signal is sent out within the network informing all the IGN nodes that the pattern has been stored. This is called the pattern resolution phase.
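As a rough illustration of how a single active node could apply Rules 1–5 when assigning an <ID>, the following sketch (our own simplification, not the authors' implementation; the helper name `assign_id` and the state encoding are ours, and tie-breaking details are omitted) condenses the decision: only an all-Recall situation with one common <ID> reuses that <ID>, while every other case results in a Store with a freshly incremented <ID>.

```python
# Hypothetical per-node ID assignment following Rules 1-5 of the proposed protocol.
def assign_id(bias_array, local_states):
    """bias_array: {sub_pattern: id}; local_states: list of ('Store'|'Recall', id_or_None)
    pairs gathered by this active node for the current pattern."""
    recall_ids = [i for state, i in local_states if state == "Recall"]
    all_recalls = all(state == "Recall" for state, _ in local_states)
    if all_recalls and len(set(recall_ids)) == 1:     # Rule 3: repetition of a stored pattern
        return recall_ids[0], "Recall"
    # Rules 2, 4 and 5: a new element, recalls with different IDs, or a mix of Store
    # and Recall all end in a Store with a new ID (Rule 1: Store dominates Recall).
    new_id = max(bias_array.values(), default=0) + 1
    return new_id, "Store"
```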
5 Simulation Work
To verify the accuracy of our proposed IGN algorithm for pattern recognition, we have conducted a series of tests involving random patterns. We developed an implementation of IGN in Java (Eclipse) and performed several runs of experiments, each time choosing random patterns. We also constructed a database of stored patterns to which patterns can be added manually. The performance metrics are accuracy and time. To estimate the communication overhead for HGN, two factors are considered: 1) the bias-array size and 2) the time required by the HGN array to execute the request. For IGN, the number of bias-array entries is significant but not as high as for HGN, since IGN does not have to create hierarchies; the execution time is also lower than for HGN because each node only communicates with its immediate neighbors. In IGN the decision is not taken by a top node, so a lot of processing and communication cost is saved, and from the scalability point of view the storage requirements of the bias array do not grow significantly with the number of stored patterns. It should be mentioned that the implementation of HGN requires asynchronous communication and mutually exclusive processes; these features can be easily implemented in Java, compared with other real-time programming
Fig. 1. (a) Information about all the active nodes in IGN (b) Recall percentage of active nodes in IGN
Fig. 2. Execution time for selected patterns - (a) Average execution time for 1000 patterns in IGN, GN and HGN, (b) Average execution time for 2000 patterns in IGN and GN
Fig. 3. Execution time for selected patterns - (a) Average execution time for 500 patterns in IGN, GN and HGN, (b) Average execution time for 5000 patterns in IGN and GN
Fig. 4. Execution time for selected patterns - (a) Average execution time for six selected patterns in IGN, GN and HGN, (b) Average execution time for six selected patterns in IGN and GN
languages such as C or Ada. Fig. 1 shows the number of entries in the bias arrays of the IGNs while processing a database of 1000 patterns, in which each pattern consists of 7 elements in 5 positions; the percentage of recalls for each of the GNs is also illustrated. The results in Fig. 2 show that the pattern detection time in IGN is much lower than in HGN. This is mainly because, as the number of stored patterns increases, the search in HGN must traverse higher layers of the hierarchy, which delays the final outcome. To investigate the impact on response time, Fig. 3 illustrates a comparison between GN, IGN and HGN. The response time of IGN is not entirely consistent because the message-communication overhead varies from pattern to pattern. A comparison between the response times of GN and IGN for selected patterns is made in Fig. 4.
6 Conclusion and Future Work
In this paper we have proposed an in-network pattern matching approach for providing scalable and energy-efficient pattern recognition within a sensor network. The IGN algorithm not only provides a single-cycle learning model, which is remarkably suitable for real-time applications, but also overcomes the crosstalk issue of the normal GN approach by delivering accurate results. It is a highly scalable solution that works close to the network and hence saves communication cost, prolonging the lifetime of the network.
References 1. Yu, L., Wang, N., Meng, X.: Real-time forest fire detection with wireless sensor networks. In: Proceedings of the 2005 International Conference on Wireless Communications, Networking and Mobile Computing, vol. 2, pp. 1214–1217 (2005)
2. Obst, O.: Poster abstract: Distributed fault detection using a recurrent neural network. In: Proceedings of the 2009 International Conference on Information Processing in Sensor Networks, IPSN 2009, pp. 373–374. IEEE Computer Society, Washington, DC, USA (2009) 3. Sollacher, R., Gao, H.: Efficient online learning with spiral recurrent neural networks. In: Neural Networks, IJCNN 2008, pp. 2551–2558 (2008) 4. Fu, K.S., Aizerman, M.A.: Syntactic methods in pattern recognition. IEEE Transactions on Systems, Man and Cybernetics 6(8), 590–591 (1976) 5. Doumit, S., Agrawal, D.: Self-organized criticality and stochastic learning based intrusion detection system for wireless sensor networks. In: Military Communications Conference, MILCOM 2003, vol. 1, pp. 609–614. IEEE, Los Alamitos (2003) 6. McEliece, R.J., Posner, E.C., Rodemich, E.R., Venkatesh, S.S.: The capacity of the hopfield associative memory. IEEE Trans. Inf. Theor. 33(4), 461–482 (1987) 7. Muhamad Amin, A.H., Raja Mahmood, R.A., Khan, A.I.: Analysis of pattern recognition algorithms using associative memory approach. In: A Comparative Study Between the Hopfield Network and Distributed Hierarchical Graph Neuron (dhgn), pp. 153–158. IEEE, Los Alamitos (2008) 8. Kim, J., Hopfield, J.J., Winfree, E.: Neural network computation by in vitro transcriptional circuits 2004. In: Saul, L.K., Weiss, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems, pp. 681–688 (2007) 9. Izhikevich, E.M.: Dynamical Systems in Neuroscience: The Geometry of Excitability and Bursting (Computational Neuroscience), 1st edn. The MIT Press, Cambridge (2006) 10. Basirat, A.H., Khan, A.I.: Building context aware network of wireless sensors using a novel pattern recognition scheme called hierarchical graph neuron. In: ICSC 2009: Proceedings of the 2009 IEEE International Conference on Semantic Computing, pp. 487–494. IEEE Computer Society, Washington, DC, USA (2009) 11. Basirat, A.H., Khan, A.I.: Building context aware network of wireless sensors using a novel pattern recognition scheme called hierarchical graph neuron. In: IEEE International Conference on Semantic Computing, ICSC 2009 (2009) 12. Nasution, B.B., Khan, A.: A hierarchical graph neuron scheme for real-time pattern recognition. IEEE Transactions on Neural Networks, 212–229 (2008) 13. Mahmood, R., Muhamad Amin, K.A.: A distributed hierarchical graph neuronbased classifier: An efficient, low-computational classifier. In: First International Conference on Intelligent Networks and Intelligent Systems, ICINIS 2008 (2008) 14. Baqer, M., Khan, A.I., Baig, Z.A.: Implementing a graph neuron array for pattern recognition within unstructured wireless sensor networks. In: Enokido, T., Yan, L., Xiao, B., Kim, D.Y., Dai, Y.-S., Yang, L.T. (eds.) EUC-WS 2005. LNCS, vol. 3823, pp. 208–217. Springer, Heidelberg (2005)
Dynamic Structure Neural Network for Stable Adaptive Control of Nonlinear Systems Jingye Lei College of Engineering, Honghe University, 661100, Mengzi, Yunnan, China
[email protected] Abstract. In this paper an adaptive control strategy based on neural network for a class of nonlinear system is analyzed. A simplified algorithm is presented with the technique in generalized predictive control theory and the gradient descent rule to accelerate learning and improve convergence. Taking the neural network as a model of the system, control signals are directly obtained by minimizing the cumulative differences between a setpoint and output of the model. The applicability in nonlinear system is demonstrated by simulation experiments. Keywords: adaptive control; neural network; nonlinear system; predictive control; gradient descent rule; simulation.
1 Introduction Great progress has been witnessed in neural network (NN) control of nonlinear systems in recent years, and it has evolved into a well-established technique in advanced adaptive control. Adaptive NN control approaches have been investigated for nonlinear systems with matching [1], [2], [3], [4] and nonmatching conditions [5], [6], as well as systems with an output feedback requirement [7], [8], [9], [10], [11]. The main trend in recent neural control research is to integrate NNs, including multilayer networks [2], radial basis function networks [12], and recurrent ones [13], with the main nonlinear control design methodologies. Such integration significantly enhances the capability of control methods in handling many practical systems that are characterized by nonlinearity, uncertainty, and complexity [14], [15], [16]. It is well known that NN approximation-based control relies on the universal approximation property in a compact set in order to approximate unknown nonlinearities in the plant dynamics. The widely used structures of neural-network-based control systems are similar to those employed in adaptive control: a neural network is used to estimate the unknown nonlinear system, the network weights are updated using the network's output error, and the adaptive control law is synthesized based on the output of the network. The major difficulty is that the system to be controlled is nonlinear, with great diversity and complexity and a lack of universal system models. It has been proved that the neural network is a complete mapping. Using this characteristic, an adaptive
predictive control algorithm is developed to solve the problems of tracking control of the systems.
2 Problem Statement Assume that the unknown nonlinear system to be considered is expressed by
y(t+1) = f(y(t), y(t-1), \ldots, y(t-n), u(t), u(t-1), \ldots, u(t-m))    (1)
where y(t) is the scalar output of the system, u(t) is the scalar input to the system, f(\cdot) is an unknown nonlinear function to be estimated by a neural network, and n and m are the known structure orders of the system. The purpose of our control algorithm is to select a control signal u(t) such that the output of the system y(t) is made as close as possible to a prespecified setpoint r(t). Figure 1 shows the overall structure of the closed-loop control system, which consists of the system (1), a feedforward neural network that estimates f(\cdot), and a controller realized by an optimizer.
Fig. 1. Neural-network control system
Figure 2 shows the neural network architecture. A two-layer neural network is used to learn the system and the standard backpropagation algorithm is employed to train the weights. The activation functions are hyperbolic tangent for the first layer and linear for the second layer. Since the input to the neural network is
p = [y(t), y(t-1), \ldots, y(t-n), u(t), u(t-1), \ldots, u(t-m)]    (2)
The neural model for the unknown system (1) can be expressed as
\hat{y}(t+1) = \hat{f}[y(t), y(t-1), \ldots, y(t-n), u(t), u(t-1), \ldots, u(t-m)]    (3)
Where yˆ (t + 1) is the output of the neural network and fˆ is the estimate of f . Since the backpropagation training algorithm guarantees that
[ y (t + 1) − yˆ (t + 1)]2 = min
(4)
Fig. 2. The neural-network structure
yˆ (t + 1) is also referred to as a predicted output of the system (1). Therefore the control signal can be selected such that yˆ (t + 1) is made as close as possible to r (t ) .
3 Adaptive Algorithm Take an objective function J as follows:
J = \frac{1}{2} e^2(t+1)    (5)
where
e(t+1) = r(t+1) - \hat{y}(t+1)    (6)
The control signal u (t ) should therefore be selected to minimize J . Using the neural network structure, (3) can be rewritten to give
\hat{y}(t+1) = w_2[\tanh(w_1 p + b_1)] + b_2    (7)
where w_1, w_2, b_1 and b_2 are the weight and bias matrices of the neural network. To minimize J, u(t) is recursively calculated using a simple gradient descent rule
u(t+1) = u(t) - \eta \frac{\partial J}{\partial u(t)}    (8)
where \eta > 0 is a learning rate. It can be seen that the controller relies on the approximation made by the neural network; therefore it is necessary that \hat{y}(t+1) approaches the real system output y(t+1) asymptotically, which can be achieved by keeping the neural network training online. Differentiating (5) with respect to u(t), it can be obtained that
\frac{\partial J}{\partial u(t)} = -e(t+1) \frac{\partial \hat{y}(t+1)}{\partial u(t)}    (9)
where \partial \hat{y}(t+1) / \partial u(t) is known as the gradient of the neural network model with respect to u(t). Substituting (9) into (8), we have
u(t+1) = u(t) + \eta e(t+1) \frac{\partial \hat{y}(t+1)}{\partial u(t)}    (10)
The gradient can then be analytically evaluated by using the known neural network structure (7) as follows:
\frac{\partial \hat{y}(t+1)}{\partial u(t)} = w_2[\mathrm{sech}^2(w_1 p + b_1)] w_1 \frac{dp}{du}    (11)
where
\frac{dp}{du} = [0, 0, \ldots, 0, 1, 0, \ldots, 0]'    (12)
is the derivative of the input vector p with respect to u(t). Finally, (10) becomes
u(t+1) = u(t) + \eta e(t+1) w_2[\mathrm{sech}^2(w_1 p + b_1)] w_1 \frac{dp}{du}    (13)
Equation (13) can now be used in a computer program for real-time control. To summarize, the adaptive algorithm is: 1) produce \hat{y}(t+1) using (7); 2) find e(t+1) using (6); 3) update the weights using the backpropagation algorithm; 4) compute the new control signal from (13); 5) feed u(t+1) to the system; 6) go to step 1).
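The sketch below (our own illustration, not the paper's code) shows what one pass of steps 1), 2), 4) and 5) could look like with the two-layer tanh network (7) and the control update (13). The network sizes, the random initial weights, the learning rate and the position of u(t) inside p are placeholder assumptions, and the online backpropagation weight update of step 3) is omitted.

```python
import numpy as np

# Hypothetical single control step for the adaptive algorithm (eqs. (6), (7), (11)-(13)).
rng = np.random.default_rng(0)
n_in, n_hidden = 6, 10                      # placeholder sizes for p = [y(t)..y(t-n), u(t)..u(t-m)]
w1 = rng.normal(scale=0.1, size=(n_hidden, n_in)); b1 = np.zeros(n_hidden)
w2 = rng.normal(scale=0.1, size=n_hidden);         b2 = 0.0
eta = 0.05                                  # learning rate of the control update

def control_step(p, r_next, u_index):
    """Predict y^(t+1), evaluate the error, and return the new control signal u(t+1)."""
    y_hat = w2 @ np.tanh(w1 @ p + b1) + b2            # prediction, eq. (7)
    e = r_next - y_hat                                # error e(t+1), eq. (6)
    dp_du = np.zeros(n_in); dp_du[u_index] = 1.0      # dp/du, eq. (12)
    sech2 = 1.0 / np.cosh(w1 @ p + b1) ** 2
    grad = w2 @ (sech2 * (w1 @ dp_du))                # d y^/d u, eq. (11)
    return p[u_index] + eta * e * grad                # control update, eq. (13)

u_next = control_step(np.zeros(n_in), r_next=1.0, u_index=3)   # toy usage with a zero input vector
```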
4 Adaptive Predictive Control Algorithm The algorithm described in section 3 can be improved by using the technique in generalized predictive control theory [5], which considers not only the design of the instant value of the control signal but also its future values. As a result, future values of setpoint and the system output are needed to formulate the control signal. Since the neural network model (3) represents the plant to be controlled asymptotically, it can be used to predict future values of the system output. For this purpose, let T be a prespecified positive integer and denote
R_{t,T} = [r(t+1), r(t+2), \ldots, r(t+T)]'    (14)
as the future values of the setpoint and
\hat{Y}_{t,T} = [\hat{y}(t+1), \hat{y}(t+2), \ldots, \hat{y}(t+T)]'    (15)
as the predicted output of the system using the neural network model (7), then the following error vector.
E_{t,T} = [e(t+1), e(t+2), \ldots, e(t+T)]'    (16)
can be obtained where
e(t+i) = r(t+i) - \hat{y}(t+i)    (17)
Defining the control signals to be determined as
U_{t,T} = [u(t+1), u(t+2), \ldots, u(t+T)]'    (18)
and assuming the following objective function
J_1 = \frac{1}{2}\left[E_{t,T}^{T} E_{t,T}\right]    (19)
then our purpose is to find U_{t,T} such that J_1 is minimized. Using the gradient descent rule, it can be obtained that
U_{t,T}^{k+1} = U_{t,T}^{k} - \eta \frac{\partial J_1}{\partial U_{t,T}^{k}}    (20)
where
\frac{\partial J_1}{\partial U_{t,T}^{k}} = \frac{\partial \hat{Y}_{t,T}}{\partial U_{t,T}^{k}} E_{t,T}    (21)
and
\frac{\partial \hat{Y}_{t,T}}{\partial U_{t,T}^{k}} =
\begin{bmatrix}
\frac{\partial \hat{y}(t+1)}{\partial u(t)} & 0 & \cdots & 0 \\
\frac{\partial \hat{y}(t+2)}{\partial u(t)} & \frac{\partial \hat{y}(t+2)}{\partial u(t+1)} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial \hat{y}(t+T)}{\partial u(t)} & \frac{\partial \hat{y}(t+T)}{\partial u(t+1)} & \cdots & \frac{\partial \hat{y}(t+T)}{\partial u(t+T-1)}
\end{bmatrix}    (22)
It can be seen that each element in the above matrix can be found by differentiating (3) with respect to each element in (18). As a result, it can be obtained that
\frac{\partial \hat{y}(t+n)}{\partial u(t+m-1)} = \frac{\partial \hat{f}(p)}{\partial u(t+m-1)} + \sum_{i=m}^{n-1} \frac{\partial \hat{f}(p)}{\partial \hat{y}(t+i)} \left[\frac{\partial \hat{y}(t+i)}{\partial u(t+m-1)}\right]    (23)
for n = 1, 2, \ldots, T and m = 1, 2, \ldots, T.
Equation(22) is the well-known Jacobian matrix which must be calculated using (23) every time a new control signal has to be determined. This could result in a large computational load for a big T . Therefore a recursive form for calculating the Jacobian matrix is given in the following so that the algorithm can be applied to the real-time systems with fast responses.
\frac{\partial \hat{y}(t+n)}{\partial u(t+m-1)} = \frac{\partial \hat{y}(t+n-1)}{\partial u(t+m-1)} \left[1 + \frac{\partial \hat{f}(p)}{\partial \hat{y}(t+n-1)}\right]    (24)
From (24) it can be seen that, in order to find all the elements in the Jacobian matrix, it is only necessary to calculate the diagonal elements and the partial derivatives of \hat{f}(\cdot) with respect to the previous predicted output. Therefore less calculation is required to formulate the Jacobian matrix. The two derivative terms in (24) are given by
\frac{\partial \hat{y}(t+n-1)}{\partial u(t+m-1)} = w_2\left[\mathrm{sech}^2(w_1 p + b_1)\right] w_1 \frac{dp}{du}    (25)
\frac{\partial \hat{f}(p)}{\partial \hat{y}(t+n-1)} = w_2\left[\mathrm{sech}^2(w_1 p + b_1)\right] w_1 \frac{dp}{d\hat{y}}    (26)
where
\frac{dp}{d\hat{y}} = [1, 0, \ldots, 0, 0, 0, \ldots, 0]'    (27)
Equations (25) and (26) can now be used in a computer program to calculate the Jacobian matrix, and the adaptive predictive algorithm is summarized as follows: 1) select a T; 2) find the new predicted output \hat{y}(t+1) using (7); 3) calculate \partial \hat{y}(t+n-1)/\partial u(t+m-1) and \partial \hat{f}(p)/\partial \hat{y}(t+n-1) via (25) and (26); 4) update the vector p, using the new \hat{y}(t+1) calculated in step 2) and u(t+n-1) from the vector of future control signals (18); 5) use (24) and the results obtained in step 3) to calculate the off-diagonal elements of the Jacobian; 6) use (20) to form a new vector of future control signals; 7) apply the u(t+1) found in step 6) to close the control loop; 8) return to step 1).
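To illustrate how the recursive form could be coded, the following sketch (our own, under stated assumptions; the argument names `p_list`, `u_index` and `y_index` are hypothetical and the paper itself does not prescribe an implementation) fills the lower-triangular Jacobian (22) column by column, using (25) for the diagonal elements and the recursion (24) together with (26) for the off-diagonal elements.

```python
import numpy as np

# Hypothetical construction of the Jacobian (22) via the recursion (24)-(26).
def jacobian(p_list, w1, b1, w2, u_index, y_index, T):
    """p_list[n]: input vector used to predict y^(t+n+1); u_index and y_index give the
    positions of u(.) and of the most recent predicted output inside p."""
    J = np.zeros((T, T))
    for m in range(T):                                   # column: derivative w.r.t. u(t+m)
        for n in range(m, T):                            # row: predicted output y^(t+n+1)
            sech2 = 1.0 / np.cosh(w1 @ p_list[n] + b1) ** 2
            if n == m:                                   # diagonal term, eq. (25)
                e_u = np.zeros_like(p_list[n]); e_u[u_index] = 1.0
                J[n, m] = w2 @ (sech2 * (w1 @ e_u))
            else:                                        # off-diagonal terms, eqs. (24) and (26)
                e_y = np.zeros_like(p_list[n]); e_y[y_index] = 1.0
                df_dy = w2 @ (sech2 * (w1 @ e_y))
                J[n, m] = J[n - 1, m] * (1.0 + df_dy)
    return J
```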
5 Simulation The following simulation with the adaptive algorithm and the adaptive predictive algorithms gives the situations of systems tracking square-wave. Considering a SISO nonlinear system:
y(t+1) = \frac{2.6 y(t) - 1.2 y(t-1) + u(t) + 1.2 u(t-1) + \sin\left(u(t) + u(t-1) + y(t) + y(t-1)\right)}{1 + u^2(t) + u^2(t-1) + y^2(t) + y^2(t-1)}    (28)
Figure 3 and Figure 4 show the simulation results with the adaptive control and the adaptive predictive control, respectively. It can be seen that the adaptive predictive controller has excellent control performance, with better stability and convergence, fewer learning parameters and less computation.
Fig. 3. Adaptive tracking control
Fig. 4. Adaptive predictive tracking control
6 Conclusions A neural network based adaptive control strategy is presented in this paper for general unknown nonlinear systems, where a simplified formulation of the control signals is obtained using the combination of a feedforward neural network and an optimization scheme. The neural network is used online to estimate the system, and the backpropagation training algorithm is applied to train the weights. Taking the resulting neural network estimate as a known nonlinear dynamic model of the system, control signals can be directly obtained using the well-established gradient descent rule. An improved algorithm is also included in which both instant and future values of the control signals are derived. Simulations have successfully demonstrated the use of the proposed method.
References 1. Tan, Y., Keyser, R.D.: Neural-network-based Adaptive Predictive Control. Advances in MBPC, 77–88 (1993) 2. Chen, F.C., Khalil, H.K.: Adaptive Control of Nonlinear Systems Using Neural Networks. Int. J. Cont. 55, 1299–1317 (1994) 3. Man, Z.H.: An Adaptive Tracking Controller Using Neural Networks for a Class of Nonlinear Systems. IEEE Trans. Neural Networks. 9(5), 947–954 (1998) 4. Fabri, S., Dadirkamanathan, V.: Dynamic Structure Neural Networks for Stable Adaptive Control of Nonlinear Systems. IEEE Trans. Neural Networks 6(5), 1151–1165 (1995) 5. Jin, L., Nikiforuk, P.N., Gupta, M.M.: Adaptive Tracking of SISO Nonlinear Systems Using Multilayered Neural Networks. In: Proceeding of American Control Conference, Chicago, pp. 56–62 (1992)
6. Rovithakis, G.A., Christodoulou, M.A.: Direct adaptive regulation of unknown nonlineare dynamical systems via dynamic neural networks. IEEE Trans. Syst., Man, Cybern. 125, 1578–1595 (1995) 7. Hayakawa, T., Haddad, W.M., Hovakimyan, N.: Neural Network Adaptive Control for a Class of Nonlinear Uncertain Dynamical Systems With Asymptotic Stability Guarantees. IEEE Transactions on Neural Networks 19(1), 80–89 (2008) 8. Hayakawa, T., Haddad, W.M., Hovakimyan, N., Chellaboina, V.: Neural network adaptive control for nonlinear nonnegative dynamical systems. IEEE Transactions on Neural Networks 16(2), 399–413 (2005) 9. Deng, H., Li, H.X., Wu, Y.H.: Feedback-Linearization-Based Neural Adaptive Control for Unknown Nonaffine Nonlinear Discrete-Time Systems. IEEE Transactions on Neural Networks 19(9), 1615–1625 (2008) 10. Li, T., Feng, G., Wang, D., Tong, S.: Neural-network-based simple adaptive control of uncertain multi-imput multi-output nonlinear systems. Control Therory & Application 4(9), 1543–1557 (2010) 11. Chen, P.C., Lin, P.Z., Wang, C.H., Lee, T.T.: Robust Adaptive Control Scheme Using Hopfield Dynamic Neural Network for Nonlinear Nonaffine Systems. In: Zhang, L., Lu, B.-L., Kwok, J. (eds.) ISNN 2010. LNCS, vol. 6064, pp. 497–506. Springer, Heidelberg (2010) 12. Sun, H.B., Li, S.Q.: General nonlinear system MRAC based on neural network. Application Research of Computers 26(11), 4169–4178 (2009) 13. Poznyak, A.S., Sanchez, E.N., Yu, W.: Differential Neural Networks for Robust Nonlinear Control. In: Identification, State Estimation and Trajectory Tracking, World Scientific, Singapore (2001) 14. Farrell, J.A., Polycarpou, M.M.: Adaptive Approximation Based Control: Unifying Neural, Fuzzy and Traditional Adaptive Approximation approaches. Wiley, Hoboken (2006) 15. Ge, S.S., Lee, T.H., Harris, C.J.: Adaptive Neural Network control of Robotic Manipulators. World Scentific, River Edge (1998) 16. Lewis, F.L., Jagannathan, S., Yesilidrek, A.: Neural Network Control of Robot Manipulators and Nonlinear systems. Taylor and Francis, London (1999)
A Position-Velocity Cooperative Intelligent Controller Based on the Biological Neuroendocrine System Chongbin Guo1,2, Kuangrong Hao1,2,*, Yongsheng Ding1,2, Xiao Liang1, and Yiwen Dou1 2
1 College of Information Sciences and Technology MOE Engineering Research Center of Digitized Textile & Fashion Technology Donghua University, Shanghai 201620, China
[email protected] Abstract. Based on the multi-loop regulation mechanism of the neuroendocrine system (NES), this paper introduces a novel position-velocity cooperative intelligent controller (PVCIC) that improves the performance of the controlled plant. Corresponding to the NES, the PVCIC structure consists of four sub-units. The planning unit (PU) is the motion input unit for the desired velocity and position signals. The coordination unit (CU) performs position-velocity coordination with a designed soft switching algorithm. The identification optimization control unit (IOCU), inspired by hormone regulation theory, is the key control unit and includes a PID controller, a hormone identifier and a hormone optimizer; the hormone identifier and the hormone optimizer identify the control error and optimize the PID controller parameters, respectively. The execution unit (EU) is the executor and includes the driver and the plant. The promising simulation results indicate that the PVCIC is practical and useful, with better performance than the conventional PID controller. Keywords: Neuroendocrine system; Hormone regulation; Position-velocity controller; Cooperative intelligent control.
1 Introduction Bio-intelligent control systems and algorithms, such as artificial neural networks, evolutionary algorithms and artificial immune systems, usually do not require an accurate mathematical model of the plant and achieve good control performance through physiological regulation [1, 2, 3]. However, some of them are complicated or not easy to realize in engineering practice. Therefore, it is necessary to develop simpler and more efficient control methods from new biological mechanisms for complicated systems [4]. Besides the nervous system and the immune system, the neuroendocrine system (NES) is one of the three major physiological systems in the human body and has some outstanding cooperative modulation mechanisms. It has at least three types of feedback, namely long, short and ultra-short loop feedbacks. Through such a multi-loop feedback mechanism, the NES can control multiple hormones harmoniously,
while maintaining high self-adaptability and stability in the human body [5]. Therefore, novel control methods can be developed by exploring and imitating the functionalities of the NES. Such an approach can also help in designing controllers for cooperative motion control scenarios such as position-velocity cooperative intelligent control. In previous work, several intelligent controllers based on the neuroendocrine or endocrine system have been designed and have achieved good results in the control field. Neal and Timmis [6] proposed the first artificial endocrine system (AES) in 2003; the AES includes hormone secretion, hormone regulation and hormone control, and the theory was applied to design a useful emotional mechanism for robot control. Córdova and Cañete [7] discussed in conceptual terms the feasibility of designing an artificial NES in a robot and reflected upon the bionic issues associated with highly complex automatons. Liu et al. [8] designed a two-level controller based on the neuroendocrine regulation principle, consisting of a primary controller and an optimization controller, which not only achieves accurate control but also adjusts the control parameters in real time. Ding et al. [9] developed a bio-inspired decoupling controller from the bi-regulation principle of growth hormone in the NES. These controllers have good control performance and provide new ideas for the motion control field, but none of them is designed for position-velocity cooperative intelligent control. The position-velocity cooperative intelligent control considered in this paper is defined as follows: the controlled plant may require a fine uniform velocity, a fast start, a smooth stop, an accurate position and strong self-adaptability [10, 11]. In this paper, based on the multi-loop regulation mechanism of the NES [12], a novel position-velocity cooperative intelligent controller (PVCIC) is proposed, together with the related control structure, the detailed algorithm design and the parameter-tuning steps. The PVCIC consists of a planning unit (PU), a coordination unit (CU), an identification optimization control unit (IOCU) and an execution unit (EU). The PU is primarily responsible for processing and transmitting the desired velocity and position signals to the CU. The CU is designed with the long loop feedback mechanism and acts as a coordinator between position and velocity. The IOCU is the key control unit; it is designed with the short loop feedback mechanism and sends the control signal via a PID controller. Meanwhile, based on the ultra-short loop feedback mechanism and hormone regulation mechanisms [13, 14], the control error is identified via a hormone identifier and the PID parameters are optimized via a hormone optimizer. The EU is an execution unit that includes the driver and the plant. The simulation results demonstrate that the PVCIC can not only ensure position accuracy but also keep the plant operating smoothly, and that the accuracy, stability, responsiveness and self-adaptability of the PVCIC are better than those of the conventional PID controller. The main contribution of this paper is that, based on the multi-loop regulation mechanism of the NES, it provides a bio-inspired cooperative intelligent control approach, and that, based on the principle of hormone regulation, it provides an identification and optimization approach to improve the global result.
Furthermore, simulating the biological mechanism of the NES in the control system provides a new and efficient idea for position-velocity cooperative intelligent control, and these approaches could also be extended to achieve more complex multi-objective cooperative intelligent control. The paper is organized as follows. Section 2 introduces the multi-loop regulation mechanism of the NES and then presents the PVCIC structure. In Section 3, the detailed control algorithm design of the PVCIC is elaborated. The parameter tuning methods
are also introduced in Section 4. The simulation experimental results are given to verify the effectiveness of the proposed controller in Section 5. Finally, the work is concluded in Section 6.
2 PVCIC Structure Design Inspired from NES 2.1 Regulation Mechanism of NES A classic instance of NES can be viewed as a multiple closed-loop feedback control system, which is essential for the stability and agility of the inner environment of human body, as shown in Fig. 1. The regulation mechanism of neuroendocrine hormone can be generalized as: central nervous system processes and transmits the nerve impulse as appropriate response to hypothalamus. Hypothalamus receives the nerve impulses and secretes relevant tropic hormone (TH), which stimulates pituitary to secrete releasing hormone (RH). Then RH stimulates gland to secrete hormones. There are at least three types of feedback, which include ultra-short loop feedback, short loop feedback, and long loop feedback. The ultra-short loop feedback means the concentration of RH is fed back to the pituitary; the short loop feedback means the concentration of hormone is fed back to the pituitary; the long loop feedback means the concentration of hormone is fed back to the hypothalamus. Through the multiloop feedback mechanism the hormone concentration is stable and easy [12].
Fig. 1. Regulation mechanism of NES
2.2 Controller Structure Design According to multi-loop regulation mechanism of NES, a novel PVCIC is provided, as shown in Fig. 2. The proposed structure and sub-units of the PVCIC are similar as that of NES. Corresponding to NES, the sub-unit PU, CU, IOCU and EU can be regarded as central nervous system, hypothalamus, pituitary and gland respectively. The PU is primarily responsible for processing and transmitting the desired velocity and position signal to CU. The CU is as a position-velocity coordinator and designed as the long loop feedback mechanism. This unit can receive the actual position feedback signal and transmit the processed signal to PU. The IOCU is as the key control unit which includes a PID controller, a hormone identifier and a hormone optimizer. PID controller is the main control module. Hormone identifier is designed to identify control error and create corresponding corrected factor. Hormone optimizer is responsible for optimizing the control parameters of the PID-controller. There are two kinds loop feedback, i.e. short and ultra-short loop feedback. The short loop feedback is that the
Fig. 2. The structure of PVCIC
actual velocity signal is fed back to the PID controller. The ultra-short loop feedback is that the adjusted parameters are fed back to the PID controller via the hormone identifier and the hormone optimizer. The EU corresponds to the execution unit, which includes the driver module and the plant module.
3 Algorithm Design The PU is designed to plan the desired velocity Vin(t) and the desired position Pin(t). The EU is designed as an executor that outputs the actual position signal P(t); we read P(t) from a sensor and then calculate the actual velocity signal V(t). To describe the control algorithm of the PVCIC clearly, the designs of the CU and the IOCU are detailed in the following. 3.1 CU Design The CU is designed as a coordinator for velocity and position. The position-velocity mode is used when the actual position is far from the desired position, so that the plant can achieve a fine uniform velocity and a fast start. The position-position control mode is used when the actual position is close to the desired position, so that the plant can achieve an accurate position and a smooth stop. The rule for automatic switching is described as follows [10, 11]:
strategy = \begin{cases} \text{position-velocity}, & r > r_c \\ \text{position-position}, & r < r_c \end{cases}    (1)
where r is the distance from the actual position to the desired position and r_c is a constant distance. The control algorithm can be designed as follows:
U_1(t) = \begin{cases} V_{in}(t), & |e_1(t)| > \delta \\ e_1(t) \cdot K_c, & |e_1(t)| < \delta \end{cases}    (2)
where
K_c = \frac{V(t_{switch})}{e_1(t_{switch})}    (3)
where U_1(t) is the output of the CU, e_1(t) is the position error between the desired and the actual position, \delta is a position switching coefficient, K_c is a conversion coefficient, and t_{switch} is the initial time of the switching process. 3.2 IOCU Design
In this unit, a conventional PID controller is chosen as the initial controller, and its control parameters are then adjusted in real time by hormone identification and hormone optimization. (1) Initial PID controller. The initial control law of the PID controller obeys the conventional PID control algorithm
U_2(t) = K_p^0 \cdot e_2(t) + K_i^0 \cdot \int e_2(t)\,dt + K_d^0 \cdot \frac{de_2(t)}{dt}    (4)
where e_2(t) is the error signal between U_1(t) and the actual velocity, U_2(t) is the output of the PID controller, and K_p^0, K_i^0, and K_d^0 are the initial control parameters of the PID controller. (2) Hormone identifier. In the NES, an organ can enhance its identification and secretion precision within its working scope; however, when the stimulating signal goes beyond the controlled scope, the hormone secretion rate stays at its high limit. Similarly, the control-error identification approach follows this principle of hormone secretion. Therefore, the absolute value of the control error e_2(t) is calculated first and then mapped to the corresponding regulation scope. The hormone identification error 0 \le E(t) \le 1 is designed as
E(t) = \begin{cases} \dfrac{|e_2(t)|}{e_{max} - e_{min}}, & |e_2(t)| < e_{max} - e_{min} \\ 1, & |e_2(t)| \ge e_{max} - e_{min} \end{cases}    (5)
where e_{max} and e_{min} are the high and low limit errors of the optimal scope, respectively. The hormone secretion rate is always nonnegative and monotone [13]: it is monotonically increasing if it represents a positive control, and decreasing if the control is negative. The increasing and decreasing hormone secretion regulation mechanisms follow the Hill functions [14]. The corresponding increasing corrected factor is designed as
F_{up}(E(t)) = \frac{E(t)^{n_1}}{E(t)^{n_1} + 1}    (6)
and the decreasing corrected factor is
F_{down}(E(t)) = \frac{1}{E(t)^{n_2} + 1}    (7)
where n_1 and n_2 are Hill coefficients. (3) Hormone optimizer. If the secretion rate of hormone A is regulated by the concentration of hormone B, the relationship between them is [14]
S_A(C_B) = a F_{up,(down)}(C_B) + S_{A0}    (8)
where S_A is the secretion rate of hormone A, C_B is the concentration of hormone B, S_{A0} is the basal secretion rate of hormone A, and 0 \le a \le 1 is a coefficient of F_{up,(down)}. Therefore, the control algorithm of the hormone secretor can be designed as
S(t) = S_0 + F_{up}(E(t)) \cdot (S_H - S_0) + F_{down}(E(t)) \cdot (S_L - S_0)    (9)
where S(t) is the real-time secretion rate and S_0 is the initial secretion rate; when the error e_2(t) is too large the hormone secretion rate takes its limit value S_H, and in the opposite case it takes the value S_L. It should also be noted that if n_1 = n_2, Eq. (9) can be expressed as
S(t) = S_L + F_{up}(E(t)) \cdot (S_H - S_L)    (10)
or
S(t) = S_H - F_{down}(E(t)) \cdot (S_H - S_L)    (11)
Ultimately, Eq. (9) is used to optimize the initial control parameters of the PID controller. For example, when the control error e_2(t) is too large, the proportional gain K_p^0 should decrease to K_p^H to weaken the control action and thus reduce the overshoot. In contrast, the proportional gain should increase to K_p^L to strengthen the control action and eliminate the control error quickly. The correcting regulation of the integral coefficient K_i^0 and the differential coefficient K_d^0 is similar to that of the proportional gain. The parameter-adjustment algorithm is
K_p(t) = K_p^0 + F_{up}(E(t)) \cdot (K_p^H - K_p^0) + F_{down}(E(t)) \cdot (K_p^L - K_p^0)
K_i(t) = K_i^0 + F_{up}(E(t)) \cdot (K_i^H - K_i^0) + F_{down}(E(t)) \cdot (K_i^L - K_i^0)    (12)
K_d(t) = K_d^0 + F_{up}(E(t)) \cdot (K_d^H - K_d^0) + F_{down}(E(t)) \cdot (K_d^L - K_d^0)
where K_p(t) is the optimized proportional gain, K_i(t) is the optimized integral coefficient, K_d(t) is the optimized differential coefficient, and K_j^H and K_j^L (j = p, i, d) are the limit parameter values used when the control error is too large or too small, respectively.
(4) Optimized PID controller. When the initial PID control parameters are adjusted dynamically by the hormone identifier and the hormone optimizer, the optimized control law is as follows:
U_2(t) = K_p(t) \cdot e_2(t) + K_i(t) \cdot \int e_2(t)\,dt + K_d(t) \cdot \frac{de_2(t)}{dt}    (13)
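A minimal sketch of the IOCU computations (5)-(7), (12) and (13) is given below. It is our own illustration, not the authors' code: the function and argument names are ours, and the integral and derivative of the velocity error (`e2_int`, `e2_der`) are assumed to be maintained by the caller. The example call plugs in the values listed in Table 1.

```python
# Hypothetical implementation of the hormone identifier/optimizer and the optimized PID law.
def hormone_pid(e2, e2_int, e2_der, K0, KH, KL, e_max, e_min, n1=2, n2=2):
    E = min(abs(e2) / (e_max - e_min), 1.0)                      # identification error, eq. (5)
    f_up = E ** n1 / (E ** n1 + 1.0)                             # increasing factor, eq. (6)
    f_down = 1.0 / (E ** n2 + 1.0)                               # decreasing factor, eq. (7)
    K = {j: K0[j] + f_up * (KH[j] - K0[j]) + f_down * (KL[j] - K0[j])
         for j in ("p", "i", "d")}                               # adjusted gains, eq. (12)
    u2 = K["p"] * e2 + K["i"] * e2_int + K["d"] * e2_der         # optimized PID law, eq. (13)
    return u2, K

# Example call using the parameter set of Table 1.
u2, gains = hormone_pid(e2=0.15, e2_int=0.02, e2_der=-0.3,
                        K0={"p": 0.4, "i": 40, "d": 0.006},
                        KH={"p": 1.0, "i": 120, "d": 0.005},
                        KL={"p": 0.1, "i": 20, "d": 0.01},
                        e_max=0.2, e_min=-0.2)
```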
4 Parameters Tuning 4.1 Tuning Steps (1) Tune the initial PID control parameters. First, put only the short loop feedback control (desired and actual velocity signals, the PID controller and the EU) into action, and tune the initial control parameters K_p^0, K_i^0 and K_d^0 approximately. (2) Tune the high- and low-limit PID parameters. As in step 1, put only the short loop feedback control into action. When the control error e_2(t) is too large, tune the control parameters K_p^H, K_i^H and K_d^H so that the plant stabilizes faster with little or no overshoot. In contrast, tune the control parameters K_p^L, K_i^L and K_d^L to ensure a more accurate velocity. (3) Determine the high- and low-limit hormone identification errors. According to the results of steps 1 and 2, determine the high limit error e_{max} and the low limit error e_{min} of the optimal working scope. (4) Determine the Hill coefficients. Based on step 1, the ultra-short loop feedback is added to determine the Hill coefficients n_1 and n_2; by choosing suitable Hill coefficients, better velocity control performance can be achieved. (5) Determine the position switching coefficient. Put the whole PVCIC into action, and then determine the position switching coefficient \delta so that the control strategy switches smoothly.
5 Results and Discussion A typical mathematical model of a robot servo control system is chosen as the controlled plant for the simulation experiments, as shown in Eq. (14) [15]:
G(s) = \frac{6000}{s^3 + 128 s^2 + 10240 s}    (14)
To verify the control effectiveness more clearly, the proposed PVCIC is compared with the conventional PID controller (the PVCIC without the hormone identifier and hormone optimizer) to find out whether the hormone regulation method yields better global control results. To make the contrast clearer, the three groups of control parameters (PID-0, PID-H and PID-L) used in the conventional PID controller are the same
Fig. 3. The contrast control effectiveness: (a) position contrast effect; (b) velocity contrast effect
Table 1. Control parameter sets
K_p^0 = 0.4, K_i^0 = 40, K_d^0 = 0.006; K_p^H = 1, K_i^H = 120, K_d^H = 0.005; K_p^L = 0.1, K_i^L = 20, K_d^L = 0.01; e_max = 0.2, e_min = -0.2; n_1 = 2, n_2 = 2; δ = 0.1
as K_j^0, K_j^H and K_j^L (j = p, i, d) in the PVCIC, respectively. The other control parameters are the same, and the control parameter sets are listed in Table 1. The desired input signals are Pin(t) = 1 and Vin(t) = 1 in all cases. All control algorithms are developed in the Matlab/Simulink environment with a fixed sample time of T = 0.001 s. The simulation results show that, with the multi-loop feedback structure and the soft switching algorithm, all controllers achieve cooperative control of position and velocity. Fig. 3(a) shows that the PVCIC makes the position control stable and fast without overshoot, owing to its better responsiveness and adaptability in the switching process. Fig. 3(b) shows that in the starting process the PVCIC achieves a faster velocity response with little overshoot and better velocity stability than the conventional PID controllers. In the stopping and switching processes, thanks to its strong self-adaptability, the PVCIC stops quickly and smoothly without negative overshoot of the velocity. As shown in Fig. 3(b), we can also see that PID-H has a fast velocity response in the starting process but is slow in the switching process; PID-L has no overshoot in the starting process but a large negative overshoot in the stopping process; the PVCIC shows good intelligent control effectiveness and automatically takes advantage of both.
6 Conclusions Based on the structure and working mechanism of the biological neuroendocrine regulation network, a novel and simple PVCIC is proposed for position-velocity cooperative intelligent control, together with the related network structure, the detailed algorithm design and the parameter tuning steps. An effective control structure with a soft switching
algorithm is proposed to achieve the cooperative strategy. Meanwhile, based on the principles of hormone regulation, hormone identification and hormone optimization are proposed to identify the control error and optimize the control parameters. The simulation results demonstrate that the PVCIC can not only ensure position accuracy but also keep the plant operating smoothly, and that the accuracy, stability, responsiveness and self-adaptability of the PVCIC are better than those of the conventional PID controller. To the best of the authors' knowledge, this is the first time that a position-velocity cooperative intelligent control approach based on the NES has been proposed. Designing motion controllers inspired by the NES also provides a new and efficient method for position-velocity cooperative intelligent control.
Acknowledgments This work was supported in part by the National Nature Science Foundation of China (No. 60975059, 60775052), Support Research Project of National ITER Program (No.2010GB108004), Specialized Research Fund for the Doctoral Program of Higher Education from Ministry of Education of China (No. 20090075110002), Project of the Shanghai Committee of Science and Technology (Nos. 10JC1400200, 10DZ0506500, 09JC1400900).
References 1. Mukherjee, A., Zhang, J.: A reliable multi-objective control strategy for batch processes based on bootstrap aggregated neural network models. Journal of Process Control 18, 720– 734 (2007) 2. Bagis, A., Karaboga, D.: Evolutionary algorithm-based fuzzy pd control of spillway gates of dams. Journal of the Franklin Institute 344, 1039–1055 (2007) 3. Xie, F.-w., Hou, Y.-f., Xu, Z.-p., Zhao, R.: Fuzzy-immune control strategy of a hydroviscous soft start device of a belt conveyor. Mining Science and Technology (China) 19, 544– 548 (2009) 4. Mitra, S., Hayashi, Y.: Bioinformatics with soft computing. IEEE Transactions on System, Man and Cybernetics: Part C 36, 616–635 (2006) 5. Savino, W., Dardenne, M.: Neuroendocrine control of thymus physiology. Endocrine Reviews 21, 412–443 (2000) 6. Neal, M., Timmis, J.: A useful emotional mechanism for robot control. Informatica (Slovenia) 27, 197–204 (2003) 7. Córdova, F.M., Cañete, L.R.: The challenge of designing nervous and endocrine systems in robots. International Journal of Computers, Communications & Control I, 33–40 (2006) 8. Liu, B., Ren, L., Ding, Y.: A novel intelligent controller based on modulation of neuroendocrine system. In: Wang, J., Liao, X.-F., Yi, Z. (eds.) ISNN 2005. LNCS, vol. 3498, pp. 119–124. Springer, Heidelberg (2005) 9. Ding, Y.S., Liu, B., Ren, L.H.: Intelligent decoupling control system inspired from modulation of the growth hormone in neuroendocrine system. Dynamics of Continuous, Discrete & Impulsive Systems, Series B: Applications & Algorithms 14, 679–693 (2007) 10. Sun, Z., Xing, R., Zhao, C., Huang, W.: Fuzzy auto-tuning pid control of multiple joint robot driven by ultrasonic motors. Ultrasonics 46, 303–312 (2007)
11. Farkhatdinov, I., Ryu, J.-H., Poduraev, J.: A user study of command strategies for mobile robot teleoperation. Intelligent Service Robotics 2, 95–104 (2009) 12. Keenan, D.M., Licinio, J., Veldhuis, J.D.: A feedback-controlled ensemble model of the stress-responsive hypothalamo-pituitary-adrenal axis. PNAS 98, 4028–4033 (2001) 13. Vargas, P.A., Moioli, R.C., de Castro, L.N., Timmis, J., Neal, M., Von Zuben, F.J.: Artificial homeostatic system: A novel approach. In: Capcarrère, M.S., Freitas, A.A., Bentley, P.J., Johnson, C.G., Timmis, J. (eds.) ECAL 2005. LNCS (LNAI), vol. 3630, pp. 754–764. Springer, Heidelberg (2005) 14. Farhy, L.S.: Modeling of oscillations of endocrine networks with feedback. Methods Enzymol. 384, 54–81 (2004) 15. Guo, C., Hao, K., Ding, Y.: Parallel robot intelligent control system based on neuroendocrine method. Journal of Mechanical Electrical Engineering 27, 1–4 (2010) (chinese)
A Stable Online Self-Constructing Recurrent Neural Network Qili Chen, Wei Chai, and Junfei Qiao Intelligent Systems Institute, Electronic Information and Control Engineering, Beijing University of Technology, Beijing, 100124, China
[email protected],
[email protected] Abstract. A new online self-constructing recurrent neural network (SCRNN) model is proposed, whose network structure can adjust to the specific problem in real time. If the approximation performance of the SCRNN is insufficient, the SCRNN can create a new neural network state to increase the learning ability. If a neural network state of the SCRNN is redundant, it is removed to simplify the structure of the neural network and reduce the computation load; otherwise, if the hidden neuron is significant, it is retained. Meanwhile, the feedback coefficient is adjusted by a synaptic normalization mechanism to ensure the stability of the network state. The proposed method effectively generates a recurrent neural model with a highly accurate and compact structure. Simulation results demonstrate that the proposed SCRNN has a self-organizing ability that can determine the structure and parameters of the recurrent neural network automatically, and that the network has better stability.
Keywords: self-constructing; recurrent neural network; dynamic system.
1 Introduction The RNN theory has been studied for more than ten years because of its wide application in system control, image processing, speech recognition and many other fields. As is known, the size of an RNN depends on the specific real object, so recurrent network design must be addressed whenever the network is applied to real objects. However, the dynamic characteristics of the network bring stability problems in applications; furthermore, the relative complexity of the network structure itself makes the stability conditions calculated by theory difficult to apply. These factors make the design of recursive networks difficult in practical applications. From our literature review, we found that most existing RNNs lack a good mechanism to determine their structure size automatically. Park et al. [1] proposed a self-structuring neural network control method which can create new neurons in the hidden layer to decrease the approximation error; unfortunately, the proposed approach cannot prevent the structure of the neural network from growing unboundedly. To avoid this problem, Gao and Er [2] proposed an error reduction ratio with QR decomposition to prune the rules; however, the design procedure is overly complex. Yu-Liang Hsu [3] constructs the network structure automatically according to the identified dimension, but the dimension of the system is determined
by the input and output data, so this algorithm is not suitable for online adjustment of the network structure. Hsu [4] proposed an algorithm to determine whether existing, inappropriate hidden neurons should be eliminated. The neurons removed by the pruning algorithm contain a small amount of information, which leads to different levels of information loss depending on the threshold value; at the same time, the dynamic characteristics of the recurrent neural network change, because removing a neuron separately destroys the network structure. For the recurrent neural network structure considered in this paper, we design an algorithm for online adjustment of the recurrent network structure. The rest of this paper is organized as follows. In Section 2, we introduce the recurrent network structure. Section 3 presents the online adjustment algorithm and analyzes the dynamic characteristics of the recurrent neural network. Section 4 provides computer simulations of dynamic applications to validate the effectiveness of the proposed network. Finally, the conclusions are presented in Section 5.
2 Structure of Novel Recurrent Neural Network In this paper, we realize a conventional Wiener model with a novel recurrent neural network structure. A large number of studies have indicated the superior capability and effectiveness of Wiener models in nonlinear dynamic system identification and control [5~7]. Based on the basic principle of Wiener models, we propose a simple recurrent neural network. With its dynamic characteristics, this locally recurrent, global-feedback neural network has a strong ability to approximate complex nonlinear dynamic systems. The network structure is shown in Fig. 1.
Fig. 1. The structure of SCRNN
In this structure, recurrence exists only as self-feedback among the neurons of the hidden layer; nevertheless, because of the subsequent static nonlinear layer, the information of the previous time step is still, in essence, passed to the hidden neurons at the next time step. The network can therefore be regarded as equivalent to one with a recurrent hidden layer. The state space of the SCRNN model can be described as follows:
$$X(k+1) = W^h X(k) + W^i U(k) \qquad (1)$$
$$y(k) = W^o f\left(W^{34} X(k)\right) \qquad (2)$$

where $f(\cdot)$ is the sigmoid function.
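For illustration, the following NumPy sketch implements the forward recursion of Eqs. (1)-(2); the layer sizes, the random initial weights and the toy input sequence are assumptions made for this example only and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
m, l, n = 2, 4, 1                            # inputs, hidden states, outputs (assumed)
W_h = np.diag(rng.uniform(-0.9, 0.9, l))     # diagonal self-feedback, |w_ii| < 1
W_i = rng.normal(0.0, 0.3, (l, m))           # input weights
W_34 = rng.normal(0.0, 0.3, (l, l))          # weights into the static nonlinear layer
W_o = rng.normal(0.0, 0.3, (n, l))           # output weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def scrnn_step(X, u):
    y = W_o @ sigmoid(W_34 @ X)              # Eq. (2): output from the current state
    X_next = W_h @ X + W_i @ u               # Eq. (1): linear self-feedback update
    return X_next, y

X = np.zeros(l)
for k in range(5):
    u = np.array([np.sin(0.1 * k), np.sin(0.1 * k)])
    X, y = scrnn_step(X, u)
```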
3 Online Self-Constructing Algorithm The number of neurons in the hidden layer usually has to be determined by trial and error to achieve a favorable approximation. To solve this problem, this paper proposes a simple growing and pruning algorithm to self-structure the neural network online. When the approximation performance of SCRNN is insufficient and the growing condition is satisfied, SCRNN creates a new state to increase its learning ability; when the approximation performance is insufficient but the growing condition is not satisfied, SCRNN trains the network with the learning algorithm to increase its learning ability; when a state of SCRNN is redundant, it is removed to simplify the network structure and reduce the computational load. 3.1 Growing Algorithm and Pruning Algorithm
The growing algorithm splits a neuron and creates a new network state at the k-th sampling time if the following condition is satisfied [1]:

$$\frac{\sum_{k=1}^{l}\left(|w^h_{ki}| + |w^{34}_{ki}|\right) + |w^o_i|}{\sum_{i=1}^{l}\left(\sum_{k=1}^{l}\left(|w^h_{ki}| + |w^{34}_{ki}|\right) + |w^o_i|\right)} > G_{th} \qquad (3)$$
where $G_{th}$ denotes the disjunction threshold value. When the approximated nonlinear function is very complex, the disjunction threshold should be chosen small so that neurons can be created easily. This condition is derived from the observation that if the left-hand side of (3) exceeds the threshold, a precise approximation is hard to obtain because the rate of the i-th neuron is relatively large, which causes large updates of the weight values. For that reason, if condition (3) is satisfied, the output weight $W^o$ is split up and a new network state $x'_i(k)$ is created, as shown in Fig. 2. The new weight connected to the $i'$-th neuron is decided as follows:
$$w'^{\,o}_i(k+1) = \alpha\, w^o_i(k) \qquad (4)$$

where $\alpha$ is a positive constant. The weight $w^o_i$ connected to the original $i$-th neuron is updated as

$$w^o_i(k+1) = (1-\alpha)\, w^o_i(k) \qquad (5)$$

and the new network state $x'_i(k)$ and the original state $x_i(k)$ are determined as follows:
$$\begin{bmatrix} X_i(k+2) \\ X'_i(k+2) \end{bmatrix} = \begin{bmatrix} w^h_i(k) & 0 \\ 0 & w^h_i(k) \end{bmatrix} \begin{bmatrix} X_i(k+1) \\ X_i(k+1) \end{bmatrix} + \begin{bmatrix} w^i_i(k) \\ w^i_i(k) \end{bmatrix} U(k+1) \qquad (6)$$

$$\begin{bmatrix} x_i(k+1) \\ x'_i(k+1) \end{bmatrix} = \begin{bmatrix} (1-\beta)\,w^{34}_i(k) & \beta\, w^{34}_i(k) \\ \beta\, w^{34}_i(k) & (1-\beta)\,w^{34}_i(k) \end{bmatrix} \begin{bmatrix} X_i(k+1) \\ X'_i(k+1) \end{bmatrix} \qquad (7)$$
Fig. 2. Schematic illustration of the growing algorithm
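As an illustration of the growing step, the sketch below checks condition (3) and, if it is met, splits the selected state according to Eqs. (4)-(7). The array shapes, the choice of splitting only the diagonal static-layer connection, and the values of alpha, beta and G_th are simplifying assumptions, not details prescribed by the paper.

```python
import numpy as np

def growing_candidate(W_h, W_34, W_o, G_th=0.4):
    """Return (should_split, index) based on the ratio test of Eq. (3).
    W_h: (l,) diagonal self-feedback weights, W_34: (l, l), W_o: (n, l)."""
    per_unit = np.abs(W_h) + np.abs(W_34).sum(axis=0) + np.abs(W_o).sum(axis=0)
    ratios = per_unit / per_unit.sum()
    i = int(np.argmax(ratios))
    return ratios[i] > G_th, i

def grow_split(W_h, W_i, W_34, W_o, i, alpha=0.5, beta=0.5):
    """Split state/unit i into two, following Eqs. (4)-(7)."""
    W_h = np.append(W_h, W_h[i])                 # Eq. (6): copy self-feedback weight
    W_i = np.vstack([W_i, W_i[i]])               # Eq. (6): copy input weights
    w_ii = W_34[i, i]
    W_34 = np.vstack([W_34, W_34[i]])            # new nonlinear unit x'_i
    W_34 = np.hstack([W_34, np.zeros((W_34.shape[0], 1))])
    W_34[i, i], W_34[i, -1] = (1 - beta) * w_ii, beta * w_ii      # Eq. (7)
    W_34[-1, i], W_34[-1, -1] = beta * w_ii, (1 - beta) * w_ii
    W_o = np.hstack([W_o, alpha * W_o[:, [i]]])  # Eq. (4): new output weight
    W_o[:, i] *= (1 - alpha)                     # Eq. (5): shrink the old one
    return W_h, W_i, W_34, W_o
```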
The pruning algorithm deletes a network state at the k-th sampling time if certain conditions are satisfied. Here $W^i$ can be represented as $[w^i_1\; w^i_2\; \cdots\; w^i_l]^T$, where $w^i_i$ is a row vector of $W^i$, and $W^{34}$ can be represented as $[w^{34}_1\; w^{34}_2\; \cdots\; w^{34}_l]$, where $w^{34}_i$ is a column vector of $W^{34}$. If condition (8) is satisfied, $X_i(k)$ and $X_j(k)$ are considered linearly dependent, so one of them can be deleted, as shown in Fig. 3(a):

$$\begin{cases} X_i(k) = X_j(k), & i \neq j \\ w^h_i(k) = w^h_j(k), & i \neq j \\ X_i(0) = X_j(0), & i \neq j \end{cases} \qquad (8)$$

The states $x_i(k)$ and $x_j(k)$ are considered linearly dependent at the k-th sampling time if the following condition is satisfied, in which case the pruning algorithm deletes the state $x_j(k)$, as shown in Fig. 3(b):

$$w^{34}_j(k) = a\, w^{34}_i(k), \quad a \in \mathbb{R},\; i \neq j \qquad (9)$$

where $w^{34}_i(k)$ is the $i$-th row vector of $W^{34}(k)$.
The pruning algorithm deletes $X_j(k)$ and $x_j(k)$ only if the following condition is satisfied:

$$N_X = l - \operatorname{rank}\left(W^{34}(k+1)\right) \qquad (10)$$

where $N_X$ represents the number of linearly dependent states $X_i(k)$ and $l$ is the number of rows of the weight matrix $W^{34}(k+1)$. After pruning $x_j(k)$ and $X_j(k)$, the weights are updated as:

$$W^i(k+1) = [w^i_1(k)\ \cdots\ w^i_{j-1}(k)\ w^i_{j+1}(k)\ \cdots\ w^i_l(k)]^T \qquad (11)$$

$$W^h(k+1) = [w^h_1(k)\ \cdots\ w^h_{j-1}(k)\ w^h_{j+1}(k)\ \cdots\ w^h_l(k)]^T \qquad (12)$$

$$\hat{W}^{34}(k+1) = [w^{34}_1(k)\ \cdots\ w^{34}_{i-1}(k)\ \ w^{34}_i(k)+w^{34}_j(k)\ \ w^{34}_{i+1}(k)\ \cdots\ w^{34}_{j-1}(k)\ w^{34}_{j+1}(k)\ \cdots\ w^{34}_l(k)] \qquad (13)$$

$$W^{34}(k+1) = [\hat{w}^{34}_1(k)\ \cdots\ \hat{w}^{34}_{j-1}(k)\ \hat{w}^{34}_{j+1}(k)\ \cdots\ \hat{w}^{34}_l(k)]^T \qquad (14)$$

$$W^o(k+1) = [w^o_1(k)\ \cdots\ w^o_{i-1}(k)\ \ w^o_i(k)+\alpha\, w^o_j(k)\ \ w^o_{i+1}(k)\ \cdots\ w^o_{j-1}(k)\ w^o_{j+1}(k)\ \cdots\ w^o_l(k)] \qquad (15)$$
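The following sketch mirrors this pruning idea in simplified form: it looks for a unit whose static-layer weight row is numerically a scalar multiple of another unit's row, folds its output contribution into that unit, and deletes it. The tolerance and the exact merge rule are illustrative assumptions rather than the paper's precise procedure.

```python
import numpy as np

def prune_once(W_h, W_i, W_34, W_o, tol=1e-6):
    """Delete one redundant state, roughly following Eqs. (8)-(15)."""
    l = W_34.shape[0]
    for i in range(l):
        for j in range(i + 1, l):
            denom = W_34[i] @ W_34[i]
            if denom < tol:
                continue
            a = (W_34[j] @ W_34[i]) / denom
            if np.allclose(W_34[j], a * W_34[i], atol=tol):   # Eq. (9)
                keep = [k for k in range(l) if k != j]
                W_o = W_o.copy()
                W_o[:, i] += a * W_o[:, j]                    # fold j into i (cf. Eq. (15))
                return (W_h[keep], W_i[keep],
                        W_34[np.ix_(keep, keep)], W_o[:, keep], True)
    return W_h, W_i, W_34, W_o, False
```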
This method adds and deletes states of the recurrent neural network without affecting the stability of the network state; the detailed analysis is given in Section 3.3, which studies the state stability of SCRNN during the structure adjustment process. 3.2 Learning Algorithm
Gradient descent is used to adjust the parameters in the learning process. The error function for the weight adjustment of the recurrent neural network is defined as:

$$J(k) = \frac{1}{2}E(k)^2 = \frac{1}{2}\sum_j \left[e_j(k)\right]^2 = \frac{1}{2}\sum_j \left[Y_j(k) - y_j(k)\right]^2, \qquad j = 1,2,\ldots,n \qquad (16)$$
where $Y_j(k)$ is the system target output and $y_j(k)$ is the actual output of the recurrent network. The weight adjustment formulas of the recurrent neural network are obtained by gradient descent. To keep the feedback weights within (−1, 1) and thus keep the recurrent network state stable, a synaptic normalization (SN) mechanism [8] is added to the adjustment process. SN proportionally adjusts the values of the feedback connections to a neuron so that they sum up to a constant value. Specifically, the $w^h_i$ connections are rescaled at every time step according to:
$$w^h_i(k) = w^h_i(k) \Big/ \sum_i w^h_i(k) \qquad (17)$$

Fig. 3. Schematic illustration of the pruning algorithm
This rule does not change the relative strengths of the synapses established by the gradient descent algorithm, but regulates the total incoming drive a neuron receives. The general online learning algorithm, written in vector-matrix form, is given by the following equation:

$$W(k+1) = W(k) + \eta\, \Delta W(k) + \alpha\, \Delta W(k-1) \qquad (18)$$

where $W$ is the weight matrix being modified ($W^o$, $W^h$, $W^{34}$, $W^i$), $\Delta W$ is the corresponding weight correction ($\Delta W^o$, $\Delta W^h$, $\Delta W^{34}$, $\Delta W^i$), $\eta$ is a diagonal matrix of normalized learning-rate parameters, and $\alpha$ is a diagonal matrix of normalized momentum parameters.
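A compact sketch of this update rule and of the synaptic normalization of Eq. (17) is given below; the learning rate, momentum factor and example weights are placeholders, and the use of absolute values in the normalization is an assumption made only to keep the rescaling well defined.

```python
import numpy as np

def update_with_momentum(W, dW, dW_prev, eta=0.05, alpha=0.9):
    """Eq. (18): W(k+1) = W(k) + eta*dW(k) + alpha*dW(k-1)."""
    return W + eta * dW + alpha * dW_prev

def synaptic_normalize(w_h):
    """Eq. (17): rescale self-feedback weights so their (absolute) sum is constant."""
    return w_h / np.sum(np.abs(w_h))

w_h = np.array([0.8, -0.5, 0.3])
dW, dW_prev = np.array([0.01, -0.02, 0.005]), np.zeros(3)
w_h = synaptic_normalize(update_with_momentum(w_h, dW, dW_prev))
```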
3.3 The Stability Analysis The stability of the SCRNN state can be analyzed as follows. The necessary and sufficient condition for the stability of a discrete linear system is that all roots of the characteristic equation lie within the unit circle in the z-plane. Since $W^h$ is a diagonal matrix, it is sufficient to keep the absolute values of its diagonal elements below 1 for the state of the recurrent neural network to remain stable. In the growing process, according to (4)~(7), we obtain:

$$X_i(k) = X'_i(k), \quad x_i(k) = x'_i(k), \quad y'(k) = y(k) \qquad (19)$$

In the pruning process, the states $x_i(k)$ and $x_j(k)$ are combined. According to (11)~(15), we obtain:

$$x_j(k) = \alpha\, x_i(k) \qquad (20)$$
$$y'(k) = y(k) \qquad (21)$$
For both the growing and the pruning algorithm, the change of the network structure does not affect the mapping relation between inputs and outputs. Therefore, the stability of SCRNN is not affected while the structure is being adjusted.
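A quick numerical check of the stability condition used in this analysis is sketched below; the diagonal feedback values are arbitrary examples.

```python
import numpy as np

# For a diagonal feedback matrix W^h the state recursion is stable iff every
# |w_ii^h| < 1, i.e. all poles lie inside the unit circle in the z-plane.
w_h_diag = np.array([0.8, -0.95, 0.3])
poles = np.linalg.eigvals(np.diag(w_h_diag))   # equal to the diagonal entries here
print(bool(np.all(np.abs(poles) < 1.0)))       # True -> stable state recursion
```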
4 Experimental Results To validate the performance of the SCRNN, we conducted extensive computer simulations on dynamic system identification problems. SCRNN can adjust its network structure according to the difficulty of the specific problem, so we approximate three dynamic systems to verify the adaptive ability of the SCRNN network. In addition, all simulation results are compared with those of existing fixed-structure neural network methods found in the literature. The system input signals are:

$$u_1(k) = u_2(k) = \sin(0.1k) + \sin(0.167k) \qquad (22)$$

$$u_3(k) = \sin(0.1k) + \sin(1.167k) \qquad (23)$$
System 1 [3], system 2 and system 3 can be expressed as (24), (25) and (26), respectively. For systems 1 and 2, the neural network is trained with the first 1000 steps of data, and the resulting model is then used to predict the following 500 steps; the results are shown in Fig. 4. For system 3, changes of the system inputs have a great influence on the system outputs and the dynamic characteristics are comparatively complicated; the input signal is changed, the network is trained with the first 550 steps of data, and the resulting model is then used to predict the following 200 steps.

$$\begin{cases} x(k) = -1.4318\,x(k-1) - 0.6065\,x(k-2) - 0.9044\,u_1(k-1) + 0.0883\,u_1(k-2) \\[4pt] y_1(k) = f[x(k)] = \dfrac{0.3163\,x(k)}{0.1 + 0.9\,[x(k)]^2} \end{cases} \qquad (24)$$

$$y_2(k) = \frac{1.39\, y_2(k-1)\, y_2(k-2)\,\big[y_2(k-1) + 0.25\big]}{1 + y_2(k-1)^2 + y_2(k-2)^2} + u_2(k-1) \qquad (25)$$

$$y_3(k) = 0.4\,y_3(k-1) + 0.4\,y_3(k-2) + 0.6\,u_3(k-1)^3 + 0.1 \qquad (26)$$
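For reference, a small script that generates data from these three benchmark systems might look as follows; the equation forms follow the reconstruction above and should be checked against the cited sources before reuse.

```python
import numpy as np

def make_data(n_steps, system):
    y, x = np.zeros(n_steps), np.zeros(n_steps)
    u12 = lambda t: np.sin(0.1 * t) + np.sin(0.167 * t)        # Eq. (22)
    u3 = lambda t: np.sin(0.1 * t) + np.sin(1.167 * t)         # Eq. (23)
    for k in range(2, n_steps):
        if system == 1:                                        # Eq. (24)
            x[k] = (-1.4318 * x[k-1] - 0.6065 * x[k-2]
                    - 0.9044 * u12(k-1) + 0.0883 * u12(k-2))
            y[k] = 0.3163 * x[k] / (0.1 + 0.9 * x[k] ** 2)
        elif system == 2:                                      # Eq. (25)
            y[k] = (1.39 * y[k-1] * y[k-2] * (y[k-1] + 0.25)
                    / (1 + y[k-1] ** 2 + y[k-2] ** 2) + u12(k-1))
        else:                                                  # Eq. (26)
            y[k] = 0.4 * y[k-1] + 0.4 * y[k-2] + 0.6 * u3(k-1) ** 3 + 0.1
    return y

train = make_data(1000, system=1)   # first 1000 steps used for training (systems 1, 2)
```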
Fig. 4. The prediction of system 1 and system 2 based on SCRNN. The star line represents the system outputs and the asterisks represent the SCRNN outputs.
The prediction results for system 3 are shown in Fig. 5(b); its dynamic characteristics are comparatively complicated, and the adjustment process of the number of hidden-layer neurons is shown in Fig. 5(a). The experimental results show that comparatively complex dynamic systems require a more complex recurrent neural network to approximate them; for different systems to be approximated, SCRNN organizes its network structure by itself. A fixed-structure recurrent neural network (FSRNN) with 12 hidden neurons was also used to predict the above three systems; the results are shown in Fig. 6. The MSE of the SCRNN and FSRNN predictions on the three systems is given in Table 1. The experimental results show that a fixed-structure neural network is suited only to approximating a specific type of system, and the problem cannot be handled well when the complexity of the system changes.
Fig. 5. The prediction of system 3 based on SCRNN. The star line represents the system outputs and the solid line represents the SCRNN output. The left panel shows the change of the number of neurons in the SCRNN hidden layer.
Fig. 6. (a) and (b) are the predictions of system 1 and system 2 based on FSRNN. The star line represents the system outputs and the asterisks represent the FSRNN outputs. (c) is the prediction of system 3 based on FSRNN. The star line represents the system outputs and the solid line represents the FSRNN output. Table 1. The MSE of SCRNN and FSRNN prediction on the three systems
                          System 1    System 2    System 3
MSE (self-constructing)   0.00058     0.2989      1.9335
MSE (fixed-structure)     0.0017      0.9501      4.5833
5 Conclusion A novel online self-constructing recurrent neural network (SCRNN) model has been proposed for nonlinear identification problems. The major contributions of this paper are: 1) in contrast to fixed-structure recurrent neural networks, the proposed network is self-constructing; for different systems, researchers no longer need to determine the network structure from experience, because SCRNN adjusts its own structure according to the complexity of the problem; 2) the state of SCRNN becomes stable over time, and even during the restructuring process the state stability of the network can still be guaranteed.
Acknowledgments. This work is supported by the National Natural Science Foundation of China under Grant No.60873043 and No. 61034008; the National 863 Scheme Foundation of China under Grant No. 2007AA04Z160; the Doctoral Fund of Ministry of Education of China under Grant No. 200800050004; the Beijing Municipal Natural Science Foundation under Grant No. 4092010, and 2010 PhD Scientific Research Foundation, Beijing University of Technology (No. X0002011201101).
References 1. Park, J.H., Huh, S.H., Seo, S.J., Park, G.T.: Direct adaptive controller for nonaffine nonlinear systems using self-stucturing neural networks. IEEE Transactions on Neural Networks 16(2), 414–422 (2005) 2. Gao, Y., Er, M.J.: Online adaptive fuzzy neural identification and control of a class of MIMO nonlinear systems. IEEE Transactions on Fuzzy Systems 11(4), 462–477 (2003) 3. Hsu, Y.L., Wang, J.S.: A Wiener-type recurrent neural network and its control strategy for nonlinear dynamic applications. Journal of Process Control 19, 942–953 (2008) 4. Hsu, C.F.: Intelligent position tracking control for LCM drive using stable online self-constructing recurrent neural network controller with bound architecture. Control Engineering Practice 17, 714–722 (2009) 5. Hunter, I.W., Korenberg, M.J.: The identification of nonlinear biological systems: Wiener and Hammerstein cascade models. Biol. Cybern. 55, 135–144 (1986) 6. Janczak, A.: Identification of Nonlinear systems using neural networks and polynomial models. Springer, New York (2005) 7. Ni, X., Verhagen, M., Krijgsman, A.J., Verbruggen, H.B.: A new method for identification and control of nonlinear dynamic systems. Eng. Appl. Artif. Intell. 9, 231–243 (1996) 8. Lazer, A., Pipa, G., Triesch, J.: SORN: a self-organizing recurrent neural network. Frontiers in Computational Neuroscience 3(23), 1–9 (2009)
Evaluation of SVM Classification of Avatar Facial Recognition Sonia Ajina1, Roman V. Yampolskiy2, and Najoua Essoukri Ben Amara1 1 National Engineering School of Sousse, University of Sousse, Tunisia
[email protected],
[email protected] 2 Computer Engineering and Computer Science, Louisville University, USA
[email protected] Abstract. The security of virtual worlds has become a major challenge. The huge progress of Internet technologies, the massive growth of media and the increasing use of electronic finance take place in a world where competition pushes financial institutions to invest large amounts of capital in persistent digital worlds such as Second Life or Entropia Universe, whose economic impact is quite real [1]. Virtual communities are therefore rapidly becoming the next frontier for cyber-crime, and it is necessary to develop adequate tools for the protection of virtual environments, similar to those deployed in the real world, such as biometric security systems. In this paper, we present a biometric recognition system for the faces of non-biological entities (avatars), based on the wavelet transform for characterization and Support Vector Machines for classification. The system is able to identify and verify avatars when they access information or system resources in virtual communities. We also evaluate the performance of our avatar authentication approach, focusing specifically on the classification stage. The obtained results are promising and encouraging for a first contribution within this new research field. Keywords: Biometric, Face Recognition (FR), Avatar, Virtual world, Support Vector Machines (SVM), Discrete Wavelet Transform (DWT).
1 Introduction Domestic and industrial robots, intelligent software agents, virtual world avatars and other artificial entities are quickly becoming a part of our everyday life. Just as it is necessary to accurately authenticate the identity of human beings, it is becoming essential to determine the identity of the non-biological entities rapidly infiltrating all aspects of modern society [2]. From the first access to a virtual world, the user is given the opportunity to choose the appearance of his avatar and to benefit from the different assets of the virtual community, such as interaction and communication with other users in several possible ways (business meetings, sports and games, socialization and group activities, live events,
conferences, collaborative projects, e-learning, etc.), even if the users are not physically in the same location [3]. However, the use of these digital metaverses requires special vigilance in view of their vulnerability to run-of-the-mill criminals interested in conducting identity theft, fraud, tax evasion, illegal gambling, smuggling, defamation, money laundering and other traditional crimes that cause real social, cultural and economic damage [4]. With over 1 billion dollars invested in 35 virtual worlds in the past year [5], installing biometric systems for securing virtual worlds, similar to those developed for the real world, is necessary, since there are no existing techniques for identity verification of intelligent entities other than self-identification [1]. This research work, which follows the line of investigation presented by Boukhris [6], sheds light on biometric avatar face recognition, a topic which has not been addressed before. The rest of this paper is organized as follows: in Section 2, we describe our biometric approach for avatar face authentication, focusing on the classification stage, after detailing the typical architecture of a biometric facial recognition system. In Section 3, we present the principal experiments carried out as well as the recorded results. Finally, in the last section we summarize our conclusions and possible future directions.
2 System Description 2.1 Problematic In the literature, several research works [7, 8, 9, 10] have focused on biometric human face recognition. The technological revolution and the appearance of virtual worlds have created a new need: face recognition for virtual characters. These two subfields of research present many similarities as well as differences. The human face is characterized by an almost infinite diversity of expressions and of the physical features forming the face (nose, eyes, mouth, skin, etc.). Thanks to this large variation, the distinction between human faces is easier than the characterization of avatar faces, where facial expressions and characteristics are limited, since they are fixed beforehand by the choices of the creator. Therefore, a biometric recognition system for avatar faces has to ensure a high level of characterization. It should be based on a set of discriminant characteristics providing a good texture description of the faces, in order to cope with the possible resemblance problems in virtual communities. 2.2 Developed Recognition System Our biometric recognition system for avatar faces is inspired by the human face recognition system developed by our team SID (Signal, Image and Document) [6, 11]. As mentioned above, the ultimate goal of an avatar face recognition system is to differentiate between the non-biological entities wishing to access resources in the virtual world. To achieve this goal, it is necessary to obtain a reference dataset for all faces of the non-biological entities known to the system. To collect this dataset, we used a web site devoted to avatar creation [12].
During the processing of a facial image, we use the wavelet transform to extract its global characteristics and then employ the SVM technique as a classifier for recognition. After the collection stage and the acquisition of the different faces, a pre-treatment step is necessary to prepare the feature vectors for the characterization step. The characteristics associated with each face are assumed to be invariant for each subject and different from one subject to another [13, 14]. During the face recognition stage, the system compares the feature vector associated with the face to be recognized with all feature vectors associated with the images of the reference base. This allows the recognition of the entity whose face is the most similar, i.e., whose vector is the closest. 2.3 Extracted Features for Characterization Characterization is a key step in our facial identification system, during which we extract characteristic parameters that describe the main variations of the images to be treated. It allows the identification of images from a set of discriminant features in order to ensure, thereafter, a better separation between the different classes (entities). Generally, the features used for face recognition belong to either a geometric or a global approach [15]. The geometric approach relies on the extraction of local features from different parts of the face (such as the nose, mouth and eyes) [16], whereas the holistic or global approach works on the face as a single entity, based on the pixel information of the texture, in order to extract global information characterizing each facial image [17]. Our approach is a holistic one, since it is based on global features derived from textural analysis. We used the wavelet transform, a tool for image processing and texture analysis, to extract a set of global characteristics. The wavelet transform yields derived images corresponding to the approximation image and to the horizontal, vertical and diagonal details [18]. The choice of the wavelet family and of the corresponding decomposition level greatly influences the recognition results obtained at the output of the system; they were determined experimentally from several tests carried out on a number of wavelet families. The best results were recorded using the Symlet wavelet family with decomposition level 5, as shown in Figure 1.
Fig. 1. Wavelet decomposition using “Symlet 8” wavelet family with decomposition level 5
In order to find the most discriminant parameters for characterizing avatar facial images, we selected several characteristics recommended in the literature as being among the most suitable for textural analysis (contrast, correlation, entropy, standard deviation and energy) and studied their relevance on our dataset. Naturally, a higher diversity of characteristics makes the authentication easier [19]. We chose to use two types of primitives: textural and statistical primitives, resulting respectively from the wavelet decomposition and from the co-occurrence matrix associated with the image. From the wavelet transform we retained the average and the standard deviation of the approximation matrix, the standard deviations of the matrices associated with the horizontal, vertical and diagonal details of each RGB component, as well as the entropy of the color image. To these 16 characteristics we added 12 other parameters in order to define each facial image as well as possible. From the statistical characteristics, we used the third-order moment (skewness) and the fourth-order moment (kurtosis) of the approximation matrices of each color component, and we also considered second-order metrics extracted from the co-occurrence matrix, namely contrast and energy, for each RGB component of the original image. The result of this stage is a feature vector of 28 parameters, illustrated in Figure 2.

Fig. 2. Parameters composing the feature vector: approximation matrix for each RGB component (average, standard deviation, skewness, kurtosis); horizontal, vertical and diagonal detail matrices for each RGB component (standard deviation); color image matrix (entropy); red, green and blue components (contrast, energy)
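An illustrative sketch of how such a 28-parameter vector could be computed with PyWavelets and scikit-image is given below; the exact normalization, the co-occurrence settings and the helper name avatar_features are assumptions for demonstration and are not taken from the paper.

```python
import numpy as np
import pywt
from scipy.stats import skew, kurtosis, entropy
from skimage.feature import graycomatrix, graycoprops

def avatar_features(img_rgb):                      # img_rgb: HxWx3 uint8 array
    feats = []
    for c in range(3):                             # per RGB component
        chan = img_rgb[:, :, c].astype(float)
        cA, (cH, cV, cD) = pywt.wavedec2(chan, 'sym8', level=5)[:2]
        feats += [cA.mean(), cA.std(),             # approximation: mean, std,
                  skew(cA.ravel()), kurtosis(cA.ravel())]   # skewness, kurtosis
        feats += [cH.std(), cV.std(), cD.std()]    # detail sub-bands: std
        glcm = graycomatrix(img_rgb[:, :, c], distances=[1], angles=[0], levels=256)
        feats += [graycoprops(glcm, 'contrast')[0, 0],
                  graycoprops(glcm, 'energy')[0, 0]]
    hist, _ = np.histogram(img_rgb, bins=256, density=True)
    feats.append(entropy(hist + 1e-12))            # entropy of the colour image
    return np.array(feats)                         # 3*(4+3+2) + 1 = 28 parameters
```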
2.4 SVM Classification Numerical classification techniques have always been present in pattern recognition, and in particular in automatic face recognition.
According to Mari and Napoli [20], classification is undertaken to describe the relations between different objects on the one hand, and between each object and its parameters on the other hand, so that accurate authentication can be performed. This process, also called categorization, allows the partition of the whole set of objects into homogeneous classes. We perform the classification step using a supervised learning technique, Support Vector Machines, based on the search for an optimal separating hyperplane between the different types of samples [16]. This technique has proven performance even with a large number of samples, contrary to simpler unsupervised techniques such as k-means clustering. The linear, polynomial and RBF kernels are the kernels most commonly used in SVM-based automatic classification. Various tests led to the choice of an SVM architecture whose kernel is a Radial Basis Function (RBF), defined by (1):

$$K_{RBF}(x, y) = e^{-\frac{\|x-y\|^2}{2\sigma^2}} \qquad (1)$$
where $\sigma$ is the parameter to be tuned for the RBF kernel. This kernel is suitable for almost all types of classification problems and offers a good detection rate [16].
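A minimal sketch of this classification stage with scikit-learn is shown below; the 12/6 train/test split per avatar follows the dataset description later in the paper, while the feature arrays, the value of sigma and the penalty parameter C are placeholders.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# placeholders for the 28-dimensional feature vectors of 100 avatars
X_train, y_train = rng.normal(size=(1200, 28)), np.repeat(np.arange(100), 12)
X_test, y_test = rng.normal(size=(600, 28)), np.repeat(np.arange(100), 6)

sigma = 0.5                                         # assumed RBF width
clf = SVC(kernel='rbf', gamma=1.0 / (2 * sigma ** 2), C=10.0)
clf.fit(X_train, y_train)
print('accuracy on the test images:', clf.score(X_test, y_test))
```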
3 Experimental Tests and Results In this section we briefly present an overview of the collected dataset and expose the different tests and obtained results, focusing on the classification stage. 3.1 Dataset Description In order to evaluate the robustness and reliability of our avatar face recognition system, we collected a large and diversified dataset of virtual characters, which also allows us to deal with problems of disguise or identity usurpation in virtual worlds. By considering the best-known human face databases in the biometric identification literature [21, 22, 23, 24], we noticed the significant effect of variations in pose, contrast and brightness on the recognition results, as well as the importance of having images of the same type (same size, resolution and format) to guarantee the performance of a face recognition system. Our new gallery contains 1800 facial images belonging to 100 classes. Each avatar is represented by a series of 18 different images collected through the web site "MyWebFace" [12]. All images of our database are RGB-encoded BMP files taken in frontal position with a white homogeneous background and under the same lighting conditions. The image size is fixed to 150 x 175 pixels at a resolution of 90 PPI in order to improve the recognition rate obtained by our system. The dataset is divided into two parts: the first contains 1200 avatar faces (12 facial expressions for each subject) used for the training phase, whereas the second part, reserved for the testing phase, contains the remaining 600 images (6 images with accessories for each avatar) used to evaluate the classification performance. The six test images and twelve training images were chosen randomly.
3.2 Tests Tables 1 and 2 summarize the main tests carried out to retain the best wavelet family and its corresponding decomposition level.

Table 1. Recognition rate and classification threshold for tested wavelet families

Wavelet           Recognition Rate   Classification Threshold
Discrete Meyer    94.1517%           0.527
Haar (db1)        93.65%             0.520
Bior1.1           93.6517%           0.518
Bior1.5           93.83%             0.532
Bior2.2           94.3167%           0.539
Bior6.8           94.3333%           0.552
Coif1             94.3467%           0.537
Coif3             94.1683%           0.544
Coif5             93.5217%           0.539
Db2               94%                0.529
Db5               94.3467%           0.538
Db9               94.34%             0.544
Db15              93.335%            0.521
Rbio1.1           93.65%             0.520
Rbio3.3           94.16%             0.540
Rbio4.4           93.855%            0.520
Sym2              94.015%            0.530
Sym5              94.325%            0.539
Sym8              94.5417%           0.557
Sym20             93.4017%           0.533
The tests with these discrete wavelet families showed that the recognition results are very similar (between 93 and 95%), but the best recognition rate was recorded with the "Symlet 8" wavelet family, which yielded a recognition rate of 94.5417% with 0.557 as the optimal classification threshold. Once the wavelet family and its index are selected, the best decomposition level of this family has to be determined.

Table 2. Recognition results of the retained wavelet family for different decomposition levels

Decomposition Level   Recognition Rate   Classification Threshold   Equal Error Rate
Level 1               93.6500%           0.520                      6%
Level 2               94.1517%           0.527                      5.4%
Level 3               94.2283%           0.539                      5.8%
Level 4               94.5083%           0.593                      5.5%
Level 5               95.4917%           0.618                      4.5%
Level 6               94.8367%           0.624                      5.16%
Level 7               94.8983%           0.620                      5.1%
3.3 Evaluation of Our Classification Methodology To identify the problems encountered by our approach in avatar face recognition, Table 3 shows the Recognition Rate (RR), True Rejection Rate (TRR) and False Acceptance Rate (FAR) calculated for a sample of five randomly selected virtual subjects.

Table 3. Recognition results for a sample of avatars

Avatar   RR (%)     TRR (%)    FAR (%)
N°2      92.333     33.333     7.4074
N°3      98.667     0          1.3468
N°42     78.6667    0          21.5488
N°68     99.8333    16.6667    0
N°79     100        0          0
At first glance, the calculated rates show a remarkable variation from one subject to another. Furthermore, the presence of accessories such as caps, glasses and other structural components affects the obtained recognition rates. Comparing these results, we observe that the lowest average Recognition Rate (RR) and, most importantly, the worst TRR and FAR are obtained for avatar N°42 of our dataset. This is explained by the fact that the large accessories present in some images of this avatar considerably affect the authentication results, while unnoticeable or barely visible accessories, as in the case of subject N°79, lead to ideal TRR and FAR (0%) and an optimal RR (100%). Figure 3 illustrates this difference in the nature of the added accessories.
Fig. 3. Illustration of accessory variation ((a): subject N°42, presenting an important accessory variation; (b): subject N°79, presenting an unremarkable accessory)
The False Acceptance experiments also show that misclassified test images can be assigned to more than one reference image, owing to the nature of the classifier used in the classification stage and to the computation of close distances between feature vectors. In addition, the wide variation in facial expressions, image size, digital resolution, lighting conditions and the choice of images used for the training and testing processes greatly influence the recognition accuracy of our system [13, 14]. For a better evaluation of the classification quality, we present the Confusion Matrix (contingency table) evaluated on the first five avatars of our dataset, as shown in Table 4. Table 4. Confusion Matrix
                     Classification Avatar
Reference     N°1    N°2    N°3    N°4    N°5
N°1            6      1      0      2      0
N°2            0      4      1      0      1
N°3            0      2      6      1      0
N°4            2      0      0      5      1
N°5            1      1      0      0      6
This contingency table is obtained by comparing the classified data with reference data, which should be different from the data used to perform the classification process [25]. According to Table 4, the system misclassifies subject N°1 by confusing it with one reference image of subject N°2 and two reference images of subject N°4. Owing to the size limitations of the paper, we show only a 5x5 confusion matrix, a part of our 100x100 confusion matrix covering the total number of subjects in the dataset. In general, the confusion matrix illustrates the errors and the possible risks of confusion of each subject with the other subjects. The risk of errors is relatively high, because a test image of any subject may be assigned to a set of reference images corresponding to different subjects. 3.4 Recorded Results The experiments of Boukhris [6] suggest that the "Daubechies 9" wavelet family with a second decomposition level provides a recognition rate of 92.5%. In our approach, we increased the number of experiments and finally selected the "Symlet 8" wavelet family, among several wavelet families, with a fifth decomposition level. The average recognition rate is 95.4917%, obtained for a classification threshold of 0.618; beyond this threshold, the effectiveness of the classification is no longer acceptable.
Fig. 4. Obtained distribution curve: FAR and TRR as a function of the classification score
The classification threshold was swept automatically from 0 to 1 with a step of 0.001; the optimal threshold is the one that corresponds to the highest recognition rate. It can also be determined from the distribution curve (the intersection between FAR and TRR), as shown in Figure 4, which allows us to judge the performance of our system from a different perspective.
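The threshold sweep can be sketched as follows; the genuine and impostor score arrays are synthetic placeholders standing in for the matcher outputs, so the printed numbers are not the paper's results.

```python
import numpy as np

rng = np.random.default_rng(1)
genuine = np.clip(rng.normal(0.75, 0.10, 600), 0, 1)   # scores of true-identity trials
impostor = np.clip(rng.normal(0.45, 0.10, 600), 0, 1)  # scores of wrong-identity trials

thresholds = np.arange(0.0, 1.001, 0.001)
far = np.array([(impostor >= t).mean() for t in thresholds])  # false acceptance rate
trr = np.array([(genuine < t).mean() for t in thresholds])    # true rejection rate
i = int(np.argmin(np.abs(far - trr)))                         # FAR/TRR crossing point
print('threshold %.3f  FAR %.3f  TRR %.3f' % (thresholds[i], far[i], trr[i]))
```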
4 Conclusion and Future Works This paper introduces a new subfield of security research concerning virtual worlds, in which we describe a facial biometric recognition approach for virtual characters (avatars) based on several global features of the image texture obtained with the DWT, and on SVM techniques for the classification stage. We also focused on the characterization level to determine the wavelet family and decomposition level best suited to the proposed methodology. The system performance and classification effectiveness were tested on a database of 1800 images of avatar faces. The tests led to the choice of the "Symlet 8" wavelet family with decomposition level 5, and of the Radial Basis Function among the linear, polynomial and Gaussian kernels for the SVM. The experimental results are very promising. We intend to improve and consolidate the performance of our authentication system and to address the problems caused by accessory and pose variations by simultaneously using other biometric techniques, in order to produce a multimodal system capable of authenticating both biological (human) and non-biological (avatar) entities.
References 1. Oursler, J.N., Price, M., Yampolskiy, R.V.: Parameterized Generation of Avatar Face Dataset. In: 14th International Conference on Computer Games: AI, Animation, Mobile, Interactive Multimedia, Educational & Serious Games, Louisville, KY, pp. 17–22 (2009) 2. Gavrilova, M.L., Yampolskiy, R.V.: Applying Biometric Principles to Avatar Recognition. In: International Conference on Cyberworlds, pp. 179–186. IEEE Press, Canada (2010) 3. Peachey, A., Gillen, J., Livingstone, D., Smith-Robbins, S.: Researching Learning in Virtual Worlds. Human–Computer Interaction Series, pp. 369–380. Springer, Heidelberg (2010) 4. Klare, B., Yampolskiy, R.V., Jain, A.K.: Face Recognition in the Virtual World: Recognizing Avatar Faces, Technical Report from MSU Computer Science and Engineering (2011) 5. Science & Technology News, http://www.virtualworldsmanagement.com/2007/index.html 6. Boukhris, M., Essoukri Ben Amara, N.: First contribution to Avatar Facial Biometric, Endof-studies Project. National Engineering school of Sousse, Tunisia (2009) 7. Lu, J., Kostantinos, A., Plataniotis, K.N., Venetsanopoulos, A.N.: Face Recognition Using LDA Based Algorithms. IEEE Transactions on Neural Networks 14(1), 195–200 (2003) 8. Turk, M.: Interactive-time vision: face recognition as a visual behavior, PhD Thesis, Massachusetts Institute of Technology (1991) 9. Baron, R.J.: Mechanisms of human facial recognition. International Journal International Journal of Man Machine Studies 15, 137–178 (1981) 10. Kanade, T.: Picture Processing System by Computer Complex and Recognition of Human Faces. PhD Thesis, Kyoto University, Japan, pp. 25–37 (1973) 11. Saidane, W.: Contribution à la biométrie du visage, End-of-studies Project. National Engineering School of Sousse, Tunisia (2008) 12. MyWebFace, http://home.mywebface.com/faceApp/index.pl 13. Ajina, S., Yampolskiy, R.V., Essoukri Ben Amara, N.: Avatar Facial Biometric Authentication. In: 2nd International Conference on Image Processing Theory, Tools and Applications, France, pp. 2–5 (2010) 14. Ajina, S., Yampolskiy, R.V., Essoukri Ben Amara, N.: Authentification de Visages d’Avatars. In: 17th Conception and Innovation Colloquy, CONFERE 2010 Symposium, Tunisia (2010) 15. Heseltine, T.D.: Face Recognition: Two-Dimensional and Three-Dimensional Techniques. PhD Thesis, University of York, UK (2004) 16. Huang, J., Blanz, V., Heisele, B.: Face Recognition Using Component-Based SVM Classification and Morphable Models. In: Lee, S.-W., Verri, A. (eds.) SVM 2002. LNCS, vol. 2388, p. 334. Springer, Heidelberg (2002) 17. Jourani, R.: Face recognition. End-of-studies project, Faculty of Sciences Rabat- Mohammed V - University Agdal (2006) 18. Audithan, S., Chandrasekaran, R.M.: Document Text Extraction from Document Images Using Haar Discrete Wavelet Transform. European Journal of Scientific Research 36(4), 502–512 (2009) 19. Ghomrassi, R.: Analyse de Texture par Ondelettes. End-of-studies project, Tunisia (2008) 20. Mari, J.F., Napoli, A.: Aspects de la classification. In: 7th Journey of Francophone Classification, France (1999) 21. The Color FERET Database, http://face.nist.gov/colorferet/request.html
22. The Yale Face Database, http://cvc.yale.edu/projects/yalefaces/yalefaces.html 23. The AT&T Database, http://www.cl.cam.ac.uk/research/dtg/attarchive/ facedatabase.html 24. The ORL Database, http://www.cl.cam.ac.uk/research/dtg/attarchive/ facedatabase.html 25. Traitement des données de télédétection. Girard M. C, DUNOD Edition, pp. 328–333, Paris (1999)
Optimization Control of Rectifier in HVDC System with ADHDP* Chunning Song1, Xiaohua Zhou1,2, Xiaofeng Lin1, and Shaojian Song1 1
College of Electrical Engineering, Guangxi University, Guangxi Nanning 530004, China 2 Department of Electronic Information and Control Engineering, Guangxi University of Technology, Guangxi Liuzhou 545006, China
Abstract. A novel nonlinear optimal controller for a rectifier in an HVDC transmission system, using artificial neural networks, is presented in this paper. Action dependent heuristic dynamic programming (ADHDP), a member of the adaptive critic designs family, is used for the design of the rectifier neurocontroller. This neurocontroller provides optimal control based on reinforcement learning and approximate dynamic programming (ADP). A series of simulations of a rectifier in a double-ended unipolar HVDC system model with the proposed neurocontroller and with a conventional PI controller were carried out in the MATLAB/Simulink environment. The simulation results show that the proposed controller performs better than the conventional PI controller: the DC line current in the HVDC system with the proposed controller quickly tracks changes of the reference current and avoids DC line current collapse when large disturbances occur. Keywords: Optimal control, rectifier, HVDC transmission system, approximate dynamic programming (ADP), action dependent heuristic dynamic programming (ADHDP).
1 Introduction The HVDC technology is used in transmission systems to transmit electric bulk power over long distances by cable or overhead lines. It is also used to interconnect asynchronous AC systems having the same or different frequency. Figure 1 shows a simplified schematic picture of an HVDC system, with the basic principle of transferring electric energy from one AC system or node to another, in any direction. The system consists of three blocks: the two converter stations and the DC line. Within each station block there are several components involved in the conversion of AC to DC and vice versa. *
This work was supported in part by the Natural Science Foundation of China under Grant 60964002; the Natural Science Foundation of Guangxi Province of China under Grant 0991057; the Science & Research Foundation of Educational Commission of Guangxi Province of China under Grant 200808MS03.
Fig. 1. Structure of an HVDC system
In order to improve the performance of traditional controllers for converters in HVDC systems, many researchers have worked in this field. In recent years, a large number of HVDC controller schemes based on various control techniques have been proposed [1-3] to improve the transient and dynamic stability of power systems. More recently, intelligent control techniques such as fuzzy logic and neural networks have been applied to HVDC control; however, most of these results are still in the early stage of theoretical investigation [4-7]. Artificial neural networks (ANNs) are good at identifying and controlling complex nonlinear systems [8]. They are suitable for multi-variable applications, where they can easily identify the interactions between the inputs and outputs. This paper presents a new technique for the design of a neuro-controller for an HVDC rectifier based on critic training. The design of a neurocontroller using the action dependent heuristic dynamic programming (ADHDP) method, a type of ADP, is discussed, and simulation results are presented to show that critic-trained neural networks have the ability to control the rectifier of an HVDC system.
2 Principle of ADHDP Method Suppose that one is given a discrete-time nonlinear system:
x(t + 1) = F [ x(t ), u (t ), t ]
(1)
where $x$ represents the state vector of the system and $u$ denotes the control action. If an initial state $x(i)$ is given, the goal of dynamic programming is to choose the control sequence $u(k)$, $k = i, i+1, \ldots$, that minimizes the system cost-to-go function

$$J[x(i), i] = \sum_{k=i}^{\infty} \gamma^{k-1}\, U[x(k), u(k), k] \qquad (2)$$

where $U$ is called the utility function and $\gamma$ is the discount factor with $0 < \gamma \leq 1$. The goal of the ADHDP method is to obtain an approximate solution of the Bellman equation:

$$J^*[x(t), t] = \min_{u(t)} \left( U[x(t), u(t), t] + \gamma\, J^*[x(t+1), t+1] \right) \qquad (3)$$
This method obtains a sub-optimal solution and significantly reduces the computational cost of dynamic programming. Figure 2 shows the schematic principle of ADHDP [9,10]; its components are an action network and a critic network. The input of the action network consists of the parameters of the system state $x(t)$, and its output is the control variable $u(t)$ for the current state. The inputs of the critic network are the system state $x(t)$ and the control $u(t)$ in the current state; its role is to evaluate the control, and its output is the approximation of the cost function $J(t)$. The dashed line in Fig. 2 shows the error back-propagation training path. The objective of the action network is to minimize the cost function $J$; its weights are updated using the output of the critic network. The weights of the critic network are determined by the difference between its actual and expected outputs.
Fig. 2. Structure of ADHDP controller
The critic network adopts the idea of supervised learning; its weights are adjusted using the error function defined as follows:
ec (t ) = J (t ) − γJ (t + 1) − U (t )
(4)
The action network is trained using the approximation of the cost function $J(t)$ produced by the critic; its weights are adjusted using the error function defined as follows:
ea (t ) = J (t )
(5)
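These two training signals can be written directly in code; the tiny sketch below, with placeholder values, shows the critic error of Eq. (4) (a one-step Bellman consistency check) and the action error of Eq. (5).

```python
def critic_error(J_t, J_next, U_t, gamma=0.95):
    return J_t - gamma * J_next - U_t    # Eq. (4)

def action_error(J_t):
    return J_t                           # Eq. (5): drive the predicted cost-to-go to zero

print(critic_error(J_t=2.0, J_next=1.8, U_t=0.3), action_error(2.0))
```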
3 Design of ADHDP-Based Rectifier Controller Fig. 3 shows the structure of the ADHDP-based rectifier controller. In this paper we adopt the constant current control mode, and the critic network and action network are both designed as three-layer BP neural networks.
Fig. 3. Structure of ADHDP rectifier controller
Fig. 4 shows the structure of the action neural network (a) and the critic neural network (b) of the ADHDP controller. In order to obtain good dynamic performance and learning results, the inputs of the action network include an error signal obtained from the reference current $I_{dref}$ and the values of the measured DC line current $I_d$ at times t, t−1 and t−2, denoted $\Delta I_d(t)$, $\Delta I_d(t-1)$ and $\Delta I_d(t-2)$; its output is the control signal $u(t)$, which represents the trigger-angle change $\Delta\alpha$ of the rectifier, and its hidden layer has six neurons. In the critic network, the inputs include $\Delta I_d(t)$, $\Delta I_d(t-1)$, $\Delta I_d(t-2)$ and $u(t)$; its output is the cost function $J$ with the error function $e_c(t) = J(t) - \gamma J(t+1) - U(t)$, and its hidden layer has ten neurons.
signal u(t ) ,which represents the trigger angle Δα of the rectifier, the hidden layer has six neurons. In the critic network, the inputs include ΔI d (t ) , ΔId (t −1) , ΔId (t − 2) and u(t ) . Its output is cost function J with the error function ec (t ) = J (t ) − γ J (t + 1) − U (t ) and its hidden layer has ten neurons.
ΔI d (t ) ΔI d (t − 1) ΔI d (t − 2)
• • •
Δα
ΔI d (t ) ΔI d (t − 1) ΔI d (t − 2) Δα
(a)
• • •
J
(b)
Fig. 4. Structure of action neural network (a) and critic neural network (b) for ADHDP controller
The neural network expressions of the critic network are:
⎧ Jh1 = wc1 X c (t ) ⎪ 1 − e − Jh1 ⎪ = f _ Jh 1 ⎨ 1 + e − Jh1 ⎪ ⎪⎩ J = wc 2 f _ Jh1
(6)
Optimization Control of Rectifier in HVDC System with ADHDP
Where X c (t ) is the inputof critic network, J is the output,
147
f _ Jh1 is the layer out-
put wc1 and wc 2 are the weights of the network at first and second layer respectively. According to the error function definited as equation (4), the critic network minimize objective function as follow:
1 1 E c (t ) = ∑ [ J (t ) − γJ (t + 1) − U (t )] 2 = ec2 (t ) 2 t 2
(7)
As the gradient descent, we can get the wc1 and wc 2 with the following formula:
wc1 = wc1 − lc ⋅
∂E ∂wc1
∂E ∂J ∂J ∂f _ Jh1 ∂Jh1 = ec (t) = ec (t) ∂wc1 ∂wc1 ∂f _ Jh1 ∂Jh1 ∂wc1 ∂f _ Jh1 Xc′ (t) = ec (t)wc′2 (t) ∂Jh1 wc 2 = wc 2 − lc ⋅
∂E ∂wc 2
∂E ∂J = ec (t ) = ec (t ) f _ Jh1 ∂wc 2 ∂wc 2 Where γ is the discount factor,
(8)
(9)
(10)
(11)
lc is the learning rate of the critic network.
Similarly to the critic network, the expressions of the action network are:
$$\begin{cases} uh_1 = w_{a1} X_a(t) \\[4pt] f_{uh1} = \dfrac{1 - e^{-uh_1}}{1 + e^{-uh_1}} \\[4pt] u = w_{a2}\, f_{uh1} \end{cases} \qquad (12)$$
where $X_a(t)$ is the input of the action network, $u$ is the output, $f_{uh1}$ is the hidden-layer output, and $w_{a1}$ and $w_{a2}$ are the network weights of the first and second layers.
According to the error function defined in equation (5), the action network minimizes the objective function:
$$E_a(t) = \frac{1}{2}\,e_a^2(t) \qquad (13)$$
Using gradient descent, the weights $w_{a1}$ and $w_{a2}$ are updated with the following formulas:
$$w_{a1} = w_{a1} + \Delta w_{a1} \qquad (14)$$

$$\Delta w_{a1} = -l_a \cdot \frac{\partial E_a}{\partial w_{a1}} = -l_a \cdot \frac{\partial E_a}{\partial e_a}\frac{\partial e_a}{\partial J}\frac{\partial J}{\partial w_{a1}} = -l_a \cdot e_a \cdot \frac{\partial J}{\partial w_{a1}} \qquad (15)$$

$$w_{a2} = w_{a2} + \Delta w_{a2} \qquad (16)$$

$$\Delta w_{a2} = -l_a \cdot \frac{\partial E_a}{\partial w_{a2}} = -l_a \cdot \frac{\partial E_a}{\partial e_a}\frac{\partial e_a}{\partial J}\frac{\partial J}{\partial w_{a2}} = -l_a \cdot e_a \cdot \frac{\partial J}{\partial w_{a2}} \qquad (17)$$
where $l_a$ is the learning rate of the action network. The utility function $U(t)$ reflects the control performance of the controller [11]; in this paper it is given by equation (18):
$$U(t) = \left[\,4\,\Delta I_d(t) + 4\,\Delta I_d(t-1) + 16\,\Delta I_d(t-2)\,\right]^2 \qquad (18)$$

where $\Delta I_d(t)$, $\Delta I_d(t-1)$ and $\Delta I_d(t-2)$ are the values of the action network inputs at times t, t−1 and t−2. The action network optimizes the overall cost over the time horizon of the problem by minimizing the function $J$; it essentially provides the optimal control output to the plant.
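Putting the pieces of Section 3 together, the condensed NumPy sketch below performs one ADHDP training step: forward passes of the action and critic networks (Eqs. (6) and (12)), the utility of Eq. (18), and gradient updates of the critic (Eqs. (8)-(11)) and action (Eqs. (14)-(17)) weights. The network sizes follow the text (3/6/1 action, 4/10/1 critic), while the learning rates, the discount factor, the initial weights and the sample current errors are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
Wa1, Wa2 = rng.normal(0, 0.1, (6, 3)), rng.normal(0, 0.1, (1, 6))
Wc1, Wc2 = rng.normal(0, 0.1, (10, 4)), rng.normal(0, 0.1, (1, 10))
gamma, lc, la = 0.95, 0.05, 0.05
act = lambda z: (1 - np.exp(-z)) / (1 + np.exp(-z))       # hidden activation of Eqs. (6), (12)
dact = lambda h: 0.5 * (1 - h ** 2)                       # its derivative

def forward(x_a):
    h_a = act(Wa1 @ x_a); dalpha = (Wa2 @ h_a)[0]         # action net: firing-angle change
    x_c = np.append(x_a, dalpha)
    h_c = act(Wc1 @ x_c); J = (Wc2 @ h_c)[0]              # critic net: cost-to-go estimate
    return dalpha, J, h_a, h_c, x_c

def utility(dId):                                          # dId = [dI(t), dI(t-1), dI(t-2)]
    return (4 * dId[0] + 4 * dId[1] + 16 * dId[2]) ** 2   # Eq. (18)

dId = np.array([0.05, 0.02, 0.01])                         # current-error history (example)
dalpha, J, h_a, h_c, x_c = forward(dId)
_, J_next, *_ = forward(np.array([0.03, 0.05, 0.02]))      # critic value at the next step
ec, ea = J - gamma * J_next - utility(dId), J              # Eqs. (4)-(5)

# critic update (Eqs. (8)-(11))
Wc2 -= lc * ec * h_c[None, :]
Wc1 -= lc * ec * (Wc2[0] * dact(h_c))[:, None] * x_c[None, :]
# action update (Eqs. (14)-(17)): back-propagate dJ/du through the critic
dJ_du = (Wc2[0] * dact(h_c)) @ Wc1[:, -1]
Wa2 -= la * ea * dJ_du * h_a[None, :]
Wa1 -= la * ea * dJ_du * (Wa2[0] * dact(h_a))[:, None] * dId[None, :]
```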
4 Simulation Results In the simulation tests, a double-ended unipolar HVDC system model is built to verify the performance of the ADHDP rectifier controller. The rectifier is a 12-pulse converter. The tests are carried out in the MATLAB/Simulink environment; the rectifier adopts the constant current control mode and the inverter adopts the constant $\gamma$ control mode. To show the control effect of the proposed controller, the simulation results of the ADHDP controller are compared with those of a traditional PI controller. Case 1: Simulation results during a series of reference current changes. In this case, the reference current changes at different simulation times in order to show the steady-state performance of the controllers; the changes are listed in Table 1. The PI controller and the ADHDP-based controller are applied to the system in turn.
Fig. 5 shows the DC line current under the PI and ADHDP controllers. As can be seen, the maximum DC line current under the PI controller is 1.35 p.u., which considerably exceeds the reference current of 1.0 p.u., whereas the maximum DC line current under the ADHDP controller stays close to the reference current of 1.0 p.u.; in addition, the response of the ADHDP controller is faster than that of the PI controller when the reference current changes.

Table 1. Changing of reference current

Time    Reference current
0 s     1 p.u.
0.4 s   down to 0.8 p.u.
0.7 s   return to 1 p.u.
Fig. 5. Current of DC line during changing of reference current
Case 2: Simulation results during a DC line short-circuit fault. A DC line short circuit is a typical fault of HVDC systems. In this case, the DC line short-circuit fault occurs at 0.5 s and is then cleared. The PI controller and the ADHDP-based controller are applied to the system in turn. Fig. 6 shows the DC line current under the ADHDP controller, and Fig. 7 shows the DC line current under the PI controller. The ADHDP-based controller stabilizes the DC line current back to 1.0 p.u. at 0.6 s, whereas the PI controller does not stabilize the current to 1.0 p.u. until 0.7 s; the ADHDP-based controller thus achieves stability faster and with less overshoot than the traditional PI controller.
Fig. 6. Current of the DC line during a DC line short circuit with the ADHDP controller
Fig. 7. Current of the DC line during a DC line short circuit with the PI controller
5 Conclusions In this paper, a novel optimal neurocontroller based on the adaptive critic designs technique is designed for the rectifier of a double-ended unipolar HVDC system. Its structure is relatively simple and does not depend on the mathematical model of the plant; in addition, it can be trained online in the MATLAB/Simulink environment. The numerical simulation results with the ADHDP-based neurocontroller show that the DC line current can track changes of the reference current and avoid DC line current collapse when large disturbances occur; the proposed controller is more effective at
controlling the rectifier of the HVDC system than the conventional PI controller. The proposed neurocontroller achieves optimal control through online learning and can adapt to changes in the size and complexity of the power system; therefore, the prospects of this controller in this field are very broad.
References 1. Machida, T.: Improving transient stability of AC systems by joint usage of DC system. IEEE Trans. Power Apparatus Syst. 85(3), 226–232 (1996) 2. Jovcic, D., Pahalawaththa, N., Zavahir, M.: Novel current controller design for elimination of dominant oscillatory mode on an HVDC line. IEEE Trans. Power Deliv. 14(2), 543–548 (1999) 3. Li, X.Y.: A nonlinear emergency control strategy for HVDC transmission systems. Electric Power Systems Research 67, 153–159 (2003) 4. Narendra, K.G., Khorasani, K., Sood, V.K., Patel, R.V.: Intelligent current controller for an HVDC transmission link. IEEE Trans. Power Syst. 13(3), 1076–1083 (1998) 5. Narendra, K.G., Sood, V.K., Khorasani, K., Patel, R.V.: Investigation into an artificial neural network based on-line current controller for an HVDC transmission link. IEEE Trans. Power Syst. 12(4), 1425–1431 (1997) 6. Dash, P.K., Routray, A., Mishra, S.: A neural network based feedback linearising controller for HVDC links. Electric Power Systems Research 50, 125–132 (1999) 7. Peiris, H.J.C., Annakkage, U.D., Pahalawaththa, N.C.: Frequency regulation of rectifier side AC system of an HVDC scheme using coordinated fuzzy logic control. Electric Power Systems Research 48, 89–95 (1998) 8. Hunt, K.J., Sbarbaro, D., Zbikowski, R., Gawthrop, P.J.: Neural networks for control systems -a survey. Automatica 28(6), 1083–1112 (1992) 9. Prokhorov, D., Wunsch, D.: Adaptive critic designs. IEEE Transactions on Neural Net-works 8(5), 997–1007 (1997) 10. Liu, D.: Action-dependent Adaptive Critic Designs. In: Proceedings of IEEE International Joint Conference on Neural Networks, pp. 990–995 (2001) 11. Mohagheghi, S., del Valle, Y., Venayagamoorthy, G.K.: A Proportional-Integrator Type Adaptive Critic Design-Based Neurocontroller for a Static Compensator in a Multimachine Power System. IEEE Transactions on Industrial Electronics 54(1), 86–96 (2007)
Network Traffic Prediction Based on Wavelet Transform and Season ARIMA Model Yongtao Wei, Jinkuan Wang, and Cuirong Wang 1
School of Information Science and Engineering, Northeastern University, Shenyang, China
[email protected], {wjk,wcr}@mail.neuq.edu.cn
Abstract. To deal with the characteristics of network traffic, a prediction algorithm based on the wavelet transform and the seasonal ARIMA model is introduced in this paper. The complex correlation structure of the historical network traffic is exploited with the wavelet method. For the traffic series at different time scales, self-similarity is analyzed and a different prediction model is selected for each scale. The resulting series is reconstructed with the wavelet method. Simulation results show that the proposed method achieves higher prediction accuracy than a single prediction model. Keywords: Network Traffic; Wavelet; Season ARIMA.
1 Introduction In the area of data communications, particularly in network traffic prediction, fractal models and wavelet analysis play an important role. The work of Leland et al. [1] and subsequent studies demonstrated that network traffic loads exhibit fractal properties such as self-similarity, burstiness, and long-range dependence (LRD) [2]. Classical Poisson or Markov models are inadequate for such traffic, and these properties strongly influence network performance. For example, performance predictions based on classical traffic models are often far too optimistic when compared with actual performance on real data. Fractal traffic models have provided exciting new insights into network behavior and promise new algorithms for network data prediction and control. Other studies apply wavelet analysis to traffic modeling [3-5]. Accurately predicting the traffic of a virtual network plays an important role in cost control and dynamic bandwidth allocation. However, traffic variations at different timescales are caused by different network mechanisms. Variations at small timescales (on the order of milliseconds or less) are caused by buffers, scheduling algorithms, etc. Variations at larger timescales (on the order of 100 ms) are caused by traffic and congestion control protocols, e.g., TCP. Variations at even larger timescales are caused by routing changes and daily and weekly cyclic shifts in user populations. Finally, long-term traffic changes are caused by long-term growth of the user population as well as increases in the bandwidth requirements of users due to the emergence of new network applications.
Considering the complex scaling behavior of network traffic, several wavelet-based traffic predictors have been proposed. Mao [6] proposed a wavelet-based ARIMA method, predicting the traffic component at each time scale with an ARIMA scheme. Yin et al. [7] used a similar approach to predict wireless traffic. In this paper, we develop a new wavelet-based signal model for positive, stationary, and LRD data, although characterizing positive data in the wavelet domain is problematic for general wavelets. The wavelet transform, which has been studied thoroughly in fields such as economic forecasting, offers a new line of attack on network traffic: it largely decorrelates long-range-dependent traffic, so problems that are hard to handle in the time domain can be tackled in the wavelet domain through a layer-by-layer decomposition. The approximation coefficients of the wavelet decomposition (e.g., with a Haar wavelet filter) can then be treated as a stationary series [6], and ARIMA(p, d, q) modeling applies differencing so that an ARMA model can be established for the differenced approximation coefficients. Combining wavelet analysis with classical time-series modeling in this way provides a new approach to modeling and forecasting non-stationary time series and improves forecast accuracy. The rest of this paper is organized as follows: Section 2 describes the framework of the wavelet-based combination model, including the Mallat algorithm for decomposition and the combination model for prediction. Finally, we present the experiments and conclusions.
2 Wavelet and SARIMA Model Based Prediction
2.1 Wavelet Transformation
The wavelet method is used in time series processing for decomposition and reconstruction [8]. Using multi-scale wavelet decomposition and synthesis, a time series can be decomposed into detail coefficients and an approximation component at different scales. The Mallat decomposition and reconstruction formulas are as follows. Suppose f(t) ∈ L²(R) is the original signal; its wavelet transform can be written as

$(Wf)(b,a) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} f(t)\,\psi\!\left(\frac{t-b}{a}\right) dt, \qquad f(t) \in L^{2}(R),$   (1)

where a, b ∈ R, a ≠ 0. The parameter a is called the dilation factor, b the time-lapse (translation) factor, (Wf)(b, a) the wavelet coefficient, and ψ(t) the basic (mother) wavelet. In practice, f(t) is usually given in discrete form, so the multi-scale pyramid method (the Mallat discrete algorithm [8-9]) can be employed for the computation, which can be expressed as

$f_{j+1} = \theta_j + f_j, \qquad j = N-1, N-2, \ldots, N-M,$   (2)
where N is the initial scale, M is the decomposition depth, and θ_j, f_j can be obtained from formula (3):

$f_j = \sum_m C_m^j \phi_{j,m}, \qquad \theta_j = \sum_m d_m^j \psi_{j,m}, \qquad j = N-1, N-2, \ldots, N-M,$   (3)

where φ_{j,m} is the scaling function, ψ_{j,m} is the wavelet function, and the coefficients C_m^j and d_m^j can be expressed as

$C_m^j = \sum_k h_{k-2m}\, C_k^{j+1}, \qquad d_m^j = \sum_k g_{k-2m}\, C_k^{j+1}, \qquad j = N-1, N-2, \ldots, N-M.$   (4)
The filters h_{k-2m} and g_{k-2m} are determined by the choice of the wavelet generating function. f_j and θ_j (j = N-1, N-2, …, N-M) are called the low-frequency component and the high-frequency component at scale j, respectively.
2.2 SARIMA Model
The ARMA(p, q) model consists of two parts, an autoregressive (AR) part and a moving average (MA) part, where p is the order of the autoregressive part and q is the order of the moving average part [10]. The AR(p) and MA(q) parts are described by formulas (5) and (6):

$X_t = \varphi_1 X_{t-1} + \varphi_2 X_{t-2} + \cdots + \varphi_p X_{t-p} + \mu_t,$   (5)
$\mu_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q}.$   (6)
The ARIMA model [11,12] is built on a Markov random process; it absorbs the advantages of regression analysis while retaining the merits of moving-average modeling. An ARIMA(p, d, q) model is an ARMA(p, q) model that has been differenced d times. In many practical problems, such as network traffic, the observed sample series {X_t, t = 0, 1, 2, …} is generally non-stationary, but after d finite-difference operations it becomes a stationary sequence. For a non-negative integer d, {X_t} is an ARIMA(p, d, q) sequence if

$\Phi(B)(1-B)^{d} X_t = \theta(B)\,\varepsilon_t,$   (7)

where Φ(B) and θ(B) are characteristic polynomials of degrees p and q, respectively, with p and q positive integers:
$\Phi(B) = 1 - \Phi_1 B - \cdots - \Phi_p B^{p}, \qquad \theta(B) = 1 + \theta_1 B + \cdots + \theta_q B^{q},$   (8)
where B is the delay (backshift) operator, B x_t = x_{t-1}, and ε_t is a Gaussian white noise sequence following a WN(0, σ²) distribution. The expression for the SARIMA model is

$U(B^{S})(1-B^{S})^{D} X_t = V(B^{S})\,\varepsilon_t,$   (9)

where $U(B^{S}) = 1 - \Gamma_1 B^{S} - \cdots - \Gamma_k B^{kS}$ and $V(B^{S}) = 1 - H_1 B^{S} - \cdots - H_m B^{mS}$.
2.3 Wavelet Transform and Prediction Model
The network traffic time series X_t is first decomposed into n layers with the wavelet method. For each layer, the traffic is predicted with a corresponding model, and the results are then reconstructed with the wavelet method. A minimal sketch of the whole procedure is given after the following steps.
1. Wavelet decomposition. We use a redundant wavelet transform, i.e., the Mallat wavelet transform, to decompose the history traffic.
2. Multi-scale prediction. The wavelet coefficients and the scaling coefficients are predicted independently at each scale. A single-branch reconstruction projects the signal onto a specific frequency range, so that each reconstructed component can be analyzed without interference from the other frequency components. The reconstructed detail components are generally regarded as smooth (stationary) processes, so the SARIMA and ARIMA models can be used for modeling and prediction.
3. Reconstruction. We use the predicted wavelet coefficients and scaling coefficients to construct the predicted value of the original traffic.
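The following Python sketch illustrates this three-step procedure under assumptions of our own: it relies on the PyWavelets and statsmodels packages, uses a db4 wavelet with three levels, and uses placeholder ARIMA orders rather than the models actually fitted in Section 3 (a seasonal SARIMAX model could be substituted for the approximation branch).

# Sketch of the decompose / predict-per-scale / recombine procedure; the
# wavelet, level, orders, and horizon are illustrative placeholders only.
import numpy as np
import pywt
from statsmodels.tsa.arima.model import ARIMA

def wavelet_arima_forecast(series, horizon=10, wavelet="db4", level=3):
    series = np.asarray(series, dtype=float)
    coeffs = pywt.wavedec(series, wavelet, level=level)

    # Single-branch reconstruction: rebuild the time-domain component that
    # corresponds to each coefficient group, with all other groups zeroed.
    components = []
    for i in range(len(coeffs)):
        masked = [c if j == i else np.zeros_like(c) for j, c in enumerate(coeffs)]
        components.append(pywt.waverec(masked, wavelet)[: len(series)])

    forecast = np.zeros(horizon)
    for i, comp in enumerate(components):
        # The approximation branch (i == 0) is treated as non-stationary
        # (d = 1); the detail branches as zero-mean stationary (d = 0).
        order = (2, 1, 2) if i == 0 else (2, 0, 2)
        fitted = ARIMA(comp, order=order).fit()
        forecast += np.asarray(fitted.forecast(steps=horizon))

    return forecast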
3 Network Traffic Prediction A virtual network often carries a single service, so we use packet traces from the WIDE backbone [13] for the experiment: a daily trace of a trans-Pacific line, /samplepoint-B/2006/, from which we extract the HTTP traffic as the original traffic data. Figure 1 shows the original traffic data over the observed time period.
Fig. 1. Traffic rate of the trace measured by WIDE backbone
Fig. 2. Wavelet coefficients and Scaling coefficients
Using the db4 wavelet and applying formulas (2)-(4) to the time series, we obtain the detail coefficients cd1, cd2, cd3 and the approximation coefficients ca3. Figure 2 shows the wavelet coefficients and the scaling coefficients of the WIDE traffic trace. A visual inspection indicates that the wavelet coefficients can reasonably be treated as a stationary time series with zero mean; they can therefore be modeled using the ARMA(p, q) model, or equivalently the ARIMA(p, 0, q) model. There is significant non-stationarity in the scaling coefficients, which becomes more obvious when examining them over a longer time period. The Box-Jenkins (B-J) forecasting methodology is used to establish the ARIMA(p, d, q) model for prediction at each scale. The B-J methodology involves four steps [12] (a small sketch follows the list):
1. Tentative identification of the model parameters, done by examining the sample autocorrelation function and the sample partial autocorrelation function [12] of the time series X_t.
2. Estimation. Once the model form is chosen, the model parameters can be estimated using either a maximum likelihood approach or a least-mean-square approach. In this paper both approaches were tried and their results are almost exactly the same, so we stick to the least-mean-square approach for its simplicity.
3. Diagnostic check. Diagnostic checks are used to see whether the model that has been tentatively identified and estimated is adequate, by examining the sample autocorrelation function of the error signal, i.e., the difference between the predicted value and the real value. If the model is inadequate, it must be modified and improved.
4. Once a model is determined, it is used to predict the future time series.
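A compact illustration of these four steps for one coefficient series, using statsmodels; the candidate order below is a placeholder, not one of the orders reported in Table 1.

# Sketch of the B-J workflow: identify (ACF/PACF), estimate, diagnose.
from statsmodels.tsa.stattools import acf, pacf
from statsmodels.tsa.arima.model import ARIMA

def box_jenkins_fit(x, order=(1, 0, 2), nlags=20):
    # 1. Identification: inspect the sample ACF/PACF to pick a tentative order.
    sample_acf, sample_pacf = acf(x, nlags=nlags), pacf(x, nlags=nlags)

    # 2. Estimation: fit the tentative model (maximum likelihood by default).
    fitted = ARIMA(x, order=order).fit()

    # 3. Diagnostic check: the residual ACF should look like white noise;
    #    otherwise the tentative order has to be revised.
    resid_acf = acf(fitted.resid, nlags=nlags)

    # 4. The accepted model is then used for forecasting.
    return fitted, sample_acf, sample_pacf, resid_acf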
Table 1 shows the model parameters of the ARIMA model at each scale. Three scales are chosen; the choice of the number of scales is a tradeoff between model complexity and accuracy. Further increasing the number of scales significantly increases the complexity of the algorithm with only a marginal gain in accuracy.

Table 1. The prediction models

  Scale                   Model
  Wavelet coefficient 1   ARIMA(1,0,4)
  Wavelet coefficient 2   ARIMA(3,0,4)
  Wavelet coefficient 3   ARIMA(3,0,7)
  Scale coefficient 3     ARIMA(2,1,7)
Most of the noise in the model comes from the wavelet coefficients at scale 1. Compared with the wavelet coefficients and the scaling coefficients at other scales, the wavelet coefficients at scale 1 have very weak autocorrelations and a white-noise-like power spectral density; they are almost white noise. It is the wavelet coefficients at scale 1 that limit the overall performance that can be achieved by the prediction algorithm. Figure 3 shows the reconstructed prediction result.
Fig. 3. Reconstructed series after prediction
4 Result Analysis To achieve a fair comparison, the same trace is also predicted with a single ARIMA model. The parameters of the ARIMA model without wavelet decomposition are p = 1, d = 1, q = 5. To measure the performance of the prediction algorithm, two metrics are used. One is the normalized mean square error (NMSE):
$\mathrm{NMSE} = \frac{1}{N}\sum_{n=1}^{N}\frac{\big(X(n)-\hat{X}(n)\big)^{2}}{\mathrm{var}(X(n))},$   (10)
where $\hat{X}(n)$ is the predicted value of X(n) and var(X(n)) denotes the variance of X(n). The other is the mean absolute relative error (MARE), which is defined as:
$\mathrm{MARE} = \frac{1}{N}\sum_{n=1}^{N}\frac{\big|X(n)-\hat{X}(n)\big|}{X(n)}.$   (11)
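Both metrics translate directly into code; a minimal sketch, assuming NumPy arrays of equal length for the observed and predicted series:

# Formulas (10) and (11); abs() in MARE assumes the (positive) traffic series.
import numpy as np

def nmse(x, x_hat):
    return np.mean((x - x_hat) ** 2) / np.var(x)      # formula (10)

def mare(x, x_hat):
    return np.mean(np.abs(x - x_hat) / np.abs(x))     # formula (11)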
Table 2 shows the performance of the prediction algorithm. For comparison, the performance of the traffic prediction algorithm using the ARIMA model without wavelet decomposition is also shown in the table.

Table 2. The prediction performance

                    NMSE     MARE
  Proposed Model    0.1526   0.1731
  ARIMA Model       0.1650   0.2132
As shown in the table, the proposed algorithm gives better performance than the ARIMA model without wavelet decomposition in both NMSE and MARE.
5 Conclusion Because of the correlation features of network traffic, a traditional model, or any single network traffic model, cannot make good predictions. This paper introduces wavelet technology: the wavelet transform is applied to traffic with long-range dependence, which is not easy to handle in the time domain but can be treated in the frequency domain. We proposed a combined network traffic prediction algorithm based on a time-scale decomposition. The historical traffic data are first decomposed into different timescales using the Mallat wavelet transform. The prediction of the wavelet coefficients and the scaling coefficients is performed independently at each timescale using the SARIMA and ARIMA models. The predicted wavelet coefficients and scaling coefficients are summed to give the predicted traffic value. As traffic variations at different timescales are caused by different network mechanisms, the proposed time-scale decomposition approach to traffic prediction can better capture the correlation structure of traffic induced by these mechanisms, which may not be obvious when examining the history data directly.
Acknowledgment This work is supported by the National Natural Science Foundation of China under Grants No. 60874108 and No. 60904035.
References 1. Leland, W., Taqqu, M., Willinger, W., Wilson, D.: On the self-similar nature of Ethernet traffic (extended version). IEEE/ACM Trans. Networking, 1–15 (1994) 2. Papagiannaki, K., Taft, N., Zhang, Z.-L.: Long-term forecasting of Internet backbone traffic: Observations and initial models. In: IEEE INFOCOM 2003 (2003) 3. Hongfei, W.: Wavelet-based multi-scale network traffic prediction model. Chinese Journal of Computers 29(1), 166–170 (2006)
4. Leiting, Y.: A network traffic prediction of wavelet neural network model. Computer Applications 26(3), 526–528 (2006) 5. Garcia-Trevino, E.S., Alarcon-Aquino, V.: Single-step prediction of chaotic time series using wavelet-networks. In: Electronics Robotics and Automotive Mechanics Conference, p. 6 (2006) 6. Mao, G.Q.: Real-time network traffic prediction based on a multiscale decomposition. In: Lorenz, P., Dini, P. (eds.) ICN 2005. LNCS, vol. 3420, pp. 492–499. Springer, Heidelberg (2005) 7. Yin, S.Y., Lin, X.K.: Adaptive Load Balancing in Mobile Ad hoc Networks. In: Proceedings of IEEE WCNC 2005, pp. 1982–1987 (2005) 8. Mallat, S., Hwang, W.L.: Singularity detection and processing with wavelets. IEEE Trans. on Information Theory 38(4) (1992) 9. Chatfield, C.: Model uncertainty, data mining and statistical inference. J. Roy. Statist. Soc. A 158 (1995) 10. Zouboxian, L.: ARMA model based on network traffic prediction. Computer Research and Development (2002) 11. Groschwitz, N.K., Polyzos, G.C.: A time series model of long-term NSFNET backbone traffic. In: IEEE International Conference on Communications (1994) 12. Bowerman, B.L., O'Connell, R.T.: Time Series Forecasting: Unified Concepts and Computer Implementation, 2nd edn. PWS Publishers (1987) 13. http://mawi.wide.ad.jp/mawi/
Dynamic Bandwidth Allocation for Preventing Congestion in Data Center Networks Cong Wang1, Cui-rong Wang2, and Ying Yuan1 1
School of Information Science and Engineering, Northeastern University, Shenyang, 11004, China
[email protected],
[email protected] 2 Department of Information, Northeastern University at Qinhuangdao, 066004, China
[email protected] Abstract. Running multiple virtual machines over a real physical machine is a promising way to provide agility in current data centers. Virtual machines belonging to one application are striped over multiple nodes, and the generated traffic often shares the substrate network with the traffic of other applications. In such conditions, clients can experience severely degraded performance, such as TCP throughput collapse and network congestion due to competing network traffic. The basic cause of this problem is that network traffic from multiple sources sharing the same network link can cause transient overloads on the link. In this paper, we make the case that network virtualization opens up a new set of opportunities to solve such congestion problems. We present an architecture which compartmentalizes the virtual machines of the same application into the same virtual network by network slicing, and which divides the role of the traditional ISP into two, infrastructure providers and service providers, to achieve the commercial agility needed in particular by cloud computing. We also present a dynamic bandwidth allocation mechanism, which can prevent congestion and maximize the utilization of the substrate network. Experimental results show that the network slicing mechanism and the bandwidth allocation algorithm can prevent network congestion significantly. Keywords: Virtualization; network slicing; data center network; bandwidth allocation.
1 Introduction In recent years, more and more data centers are being built to provide increasingly popular online application services such as search, e-mail, instant messaging, Web 2.0, and gaming. These data centers often host bandwidth-intensive services such as distributed file systems (e.g., GFS [1]) and structured storage (e.g., BigTable [2]). Another related trend is CPU virtualization, which leads administrators to consolidate more virtual machines (VMs) on fewer physical servers and a single substrate network topology. The substrate network can become overwhelmed by high-throughput traffic that can be bursty and synchronized, leading to packet losses. The result can easily degrade effective throughput and possibly cause congestion collapse. In addition, letting the traffic of many different applications compete in the same substrate network leads to unmanageable operation or even severe breakdowns.
The major cause of congestion is that traffic flows from various services carry multiple custom Quality of Service (QoS) requirements, and the lack of coordination among these flows may crash network links. Hardware and software mechanisms to control network congestion or support QoS are far from well developed [3]. In modern data centers, especially as they move toward cloud computing, not all network traffic is TCP-friendly. In fact, a growing proportion of network traffic, such as streaming media, voice/video over IP, and peer-to-peer traffic, is not TCP-friendly. Although TCP is self-regulating, even a small amount of non-TCP-friendly traffic can disrupt fair sharing of switched network resources [4]. We argue that network virtualization can counteract the ossifying forces of current Ethernet and stimulate innovation by enabling diverse network architectures to cohabit on a shared physical substrate [5]. This paper leverages network virtualization to proactively prevent network congestion and provide more agility. In the presented architecture, the virtual machines of the same application in the data center are partitioned into the same virtual network by network slicing. The business model can therefore be divided into two parts: the infrastructure provider, who builds the data center fabric, and the service provider, who rents virtual machines to run its own applications and provide Internet services. We also give a dynamic bandwidth allocation algorithm, hosted in the switches and based on Particle Swarm Optimization (PSO), which can prevent congestion and maximize the utilization of the substrate network.
2 The Network Virtualization Architecture Network virtualization is a promising way to support diverse applications over a shared substrate by running multiple virtual networks, each customized for different performance objectives. Researchers have proposed that future networks run multiple virtual networks, with separate resources for each virtual network [6]. Leveraging network virtualization technology, the supervisor can group the virtual machines of the same application into the same virtual network so as to achieve the agility needed by cloud computing. In this setting, the fabric owner acts as the substrate provider supplying the physical hardware, while service providers rent virtual machines and virtual networks to run their applications and provide services; this yields a flexible lease mechanism that is more suitable for modern commercial operations. Today, network virtualization is moving from fantasy to reality, as many major router vendors start to support both switch virtualization and switch programmability [7]. Our network virtualization architecture is also based on such switches. Fig. 1 depicts the substrate node architecture, which is composed of three modules: substrate resources, virtual management, and virtual switches. The substrate resources module comprises the physical resources of the real switch, such as network interfaces, RAM, and I/O buffers. The virtual management module is a middleware layer that abstracts the substrate resources for the upper layer. It contains an adaptive-control unit to dynamically build or modify the parameters of virtual networks and a monitoring unit to monitor the running status of each virtual network.
Fig. 1. Substrate switch architecture for virtualization
Virtual switches are logically independent; each of them has its own modules such as a forwarding table, queues, and a virtual CPU (CPU virtualization is omitted in Fig. 1), and therefore each can run a custom network protocol or forwarding strategy on its own. If each virtual switch is assigned appropriate resources, then different virtual networks can be built above the fixed substrate physical network, as in Fig. 2.
Fig. 2. Virtual networks above substrate physical network
Fig. 2 illustrates two virtual networks above a fixed substrate physical network. The substrate network has five switches and six physical links, i.e., its topology is fixed. In contrast, the two virtual networks have different topologies with different numbers of switches and links. That is what network virtualization aims at: it provides much more agility to build and modify the network topology or running parameters. Note that the virtual machines connected to the switches are omitted. In this architecture, the data center owner can slice the substrate network into many virtual networks and lease them to any service provider. Service providers can run their own applications in the rented virtual networks and can change the number of virtual nodes according to their practical requirements. The data center owner, as the substrate provider, only needs to control and monitor the virtual networks running in the data center. With a commercial multi-tenancy strategy and a dynamic bandwidth allocation algorithm between virtual networks, which we present in Section 3, the data center owner can maximize revenue while the service providers gain sufficient QoS guarantees in their virtual networks. A rough sketch of the sliced-switch data model is given below.
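The following Python data model is our own illustrative rendering of the sliced substrate switch of Fig. 1 (per-slice virtual switches with forwarding table and queue, plus monitoring and adaptive-control roles); none of the class or field names come from the paper, and the monitoring logic is stubbed.

# Hypothetical sketch of a sliced substrate switch; names are illustrative.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class VirtualSwitch:                      # one slice of a physical switch
    vnet_id: int
    forwarding_table: Dict[str, str] = field(default_factory=dict)
    queue: List[bytes] = field(default_factory=list)
    assigned_bandwidth: float = 0.0       # set by the adaptive-control unit

@dataclass
class SubstrateSwitch:
    capacity: float                       # physical port capacity C
    slices: Dict[int, VirtualSwitch] = field(default_factory=dict)

    def monitor(self) -> Dict[int, float]:
        # Monitoring unit: return the measured congestion price per slice
        # (stubbed here; a real implementation would use probe packets).
        return {k: 0.0 for k in self.slices}

    def reallocate(self, allocation: Dict[int, float]) -> None:
        # Adaptive-control unit: apply a new bandwidth split to the slices.
        for k, bw in allocation.items():
            self.slices[k].assigned_bandwidth = bw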
3 Dynamic Bandwidth Allocation Virtual networks are constructed over the substrate physical network by subdividing each physical switch and each physical link into multiple virtual switches and virtual links. The substrate runs schedulers that arbitrate access to the substrate resources, giving each virtual network the illusion that it runs on a dedicated physical infrastructure. In this case, the most important element of a QoS guarantee is the bandwidth guarantee on each virtual link. In future cloud-oriented data center networks there will be many virtual machines running on the physical machines. In a non-virtualized network, multiple virtual machines sharing the same physical link with the different types of traffic discussed in Section 1 will suffer serious congestion and connection collapse. On the other hand, if fair sharing of bandwidth cannot be provided in the virtualized network, the results may be even worse, since some virtual networks may obtain little bandwidth on some virtual links. Smoothly running multiple virtual networks therefore needs a dynamic adaptive bandwidth allocation algorithm hosted in the switches. This work is carried out jointly by the adaptive-control unit and the monitoring unit of the virtual management module in Fig. 1, discussed in Section 2. The monitoring unit monitors each physical port of the switch and gathers the status of each virtual link, and the adaptive-control unit calculates how much bandwidth each virtual link should be assigned at that port based on the parameters the monitoring unit delivers. To provide agility, the adaptive-control unit should rebalance the allocations between the virtual networks whenever the monitored parameters change. Let s^(k) denote the congestion price for virtual network k at that virtual port. The monitoring unit periodically sends probe packets from each port to collect the congestion price of each virtual network. A mechanism similar to TCP congestion control is used to calculate s^(k) [8]. The congestion prices are summed over each virtual machine and interpreted as the end-to-end packet loss or delay on each virtual link that maps to the physical port.
$s^{(k)}(t+T) = \big[s^{(k)}(t) - \beta\big(y^{(k)} - z^{(k)}(t)\big)\big]^{+},$   (1)

where t is time, T is the update period, y^(k) is the bandwidth assigned to virtual network k at that physical port, and z^(k)(t) is the rate on that virtual link. Thus s^(k) is updated for virtual network k based on the difference between the virtual-link load and the virtual-link capacity. The step size β moderates the magnitude of the update and reflects the tradeoff between convergence speed (large step sizes) and stability (small step sizes). The operator [·]^+ ensures that s^(k) remains nonnegative. Each virtual edge node updates its path rates based on the local performance objective and the congestion level on its virtual paths, but all virtual networks are subject to the shared bandwidth constraint; a minimal sketch of this price update is given below.
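A minimal sketch of the congestion-price update (1); the step size beta used here is an assumed placeholder, not a value from the paper.

# One update step of s^(k): price rises when load z exceeds allocation y,
# and [.]^+ keeps the price nonnegative.
def update_price(s_k, y_k, z_k, beta=0.1):
    return max(0.0, s_k - beta * (y_k - z_k))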
We model this problem as the following optimization:

minimize    $\sum_k \omega^{(k)} \varphi_k \big[s^{(k)}(t) - \beta\big(y^{(k)} - z^{(k)}(t)\big)\big]^{+}$
subject to  $\sum_k y^{(k)} \le C$                                                       (2)
variables   $y^{(k)}, \ \forall k$

where ω^(k) is a priority of virtual network k assigned by the substrate provider, and φ_k = C / y^(k) is a penalty parameter that prevents a virtual network from subscribing too much bandwidth under heavy load. The goal of the model is to optimize the aggregate bandwidth utility of all virtual networks. The dynamic bandwidth allocation of the substrate provider determines how satisfied each virtual network is with its allocated bandwidth by minimizing the sum of the congestion prices s^(k) weighted by the priority of each virtual network. We use Particle Swarm Optimization (PSO) [9] to solve this optimization problem. Suppose that the search space is N-dimensional and a particle swarm consists of n particles; then the i-th particle of the swarm can be represented by an N-dimensional vector x = {x_1, x_2, ..., x_N}, and its velocity by another N-dimensional vector v = {v_1, v_2, ..., v_N}. Every particle has a fitness determined by the objective function of the optimization problem and knows both the best position Pbest it has previously visited and its present position x_i. Every particle also knows the position of the best individual of the whole swarm, Gbest. The velocity of each particle is influenced by the particle's own flying experience as well as its neighbors' flying experience. This velocity, added to the previous position, generates the current position of the particle as follows:
$v(t+1) = w\,v(t) + \mathrm{rand}()\,c_1\,\big(P_{best}(t) - x(t)\big) + \mathrm{rand}()\,c_2\,\big(G_{best} - x(t)\big),$
$x(t+1) = x(t) + v(t+1).$   (3)
where w is the inertia weight, rand() is a random number drawn uniformly from the interval [0,1], and c_1 and c_2 are two positive constants called acceleration coefficients. In PSO, each particle uses equation (3) to determine its next velocity and position. A violation measure, a function of the constraint, is also introduced. The fitness and violation are expressed as follows:

$F_{fitness} = \min \sum_k \omega^{(k)} \varphi_k \big[s^{(k)}(t) - \beta\big(y^{(k)} - z^{(k)}(t)\big)\big]^{+}$
$F_{violation} = \max\big(0, \ \sum_k y^{(k)} - C\big)$
The particle-competition algorithm for solving the optimization model (2) is:

Function PSO(PBest^{n-1}, GBest^{n-1})
  {assume that each particle in the swarm is already initialized with a random bandwidth allocation, and n is the number of particles}
  PBest^{n-1} = ∅, GBest^{n-1} = ∅;
  for (i = 1 to n) {
    Update velocity and position using expression (3);
    Calculate F_fitness and F_violation of particle i;
    if F_fitness(P_i^{n-1}) > F_fitness(x_i) && F_violation(x_i) = 0 then P_i^n = x_i; else P_i^n = P_i^{n-1};
    if F_fitness(x_i) < F_fitness(GBest^{n-1}) && F_violation(x_i) = 0 then GBest^n = x_i;
  }
  if GBest^n = ∅ then GBest^n = GBest^{n-1};
  return GBest^n, PBest^n;
Given the congestion prices s^(k) of the virtual networks connected to one physical port of the switch, a stable bandwidth allocation is obtained after several iterations, and the substrate provider can set the fair, balanced bandwidth at this port according to this allocation. The monitoring unit sends probe packets to collect the congestion price for each virtual network every T time units. When it finds that the congestion price has changed at a physical port, the bandwidth among the virtual networks is re-allocated by the above algorithm. A rough sketch of this procedure is given below.
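The following Python sketch mirrors the particle-competition procedure for a single physical port. It is only an illustration under assumptions of our own: the swarm size, inertia weight w, acceleration coefficients c1 and c2, step size beta, and iteration count are placeholder values, and the function and variable names are not from the paper.

# PSO search for the allocation model (2) at one port; all constants assumed.
import numpy as np

def pso_allocate(s, z, C, omega=None, beta=0.1, particles=30, iters=100,
                 w=0.7, c1=1.5, c2=1.5):
    K = len(s)                                    # number of virtual networks
    omega = np.ones(K) if omega is None else np.asarray(omega, float)
    s, z = np.asarray(s, float), np.asarray(z, float)

    def fitness(y):
        phi = C / np.maximum(y, 1e-9)             # penalty term phi_k = C / y^(k)
        price = np.maximum(s - beta * (y - z), 0.0)
        return np.sum(omega * phi * price)        # objective of model (2)

    def violation(y):
        return max(0.0, np.sum(y) - C)            # capacity constraint

    x = np.random.uniform(0, C / K, size=(particles, K))   # positions y^(k)
    v = np.zeros_like(x)
    pbest, pbest_fit = x.copy(), np.array([fitness(p) for p in x])
    feasible = [violation(p) == 0.0 for p in x]
    gbest = min((p for p, ok in zip(x, feasible) if ok),
                key=fitness, default=x[0]).copy()

    for _ in range(iters):
        r1, r2 = np.random.rand(particles, K), np.random.rand(particles, K)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, 0.0, C)
        for i in range(particles):
            # Keep only feasible improvements, as in the pseudocode above.
            if violation(x[i]) == 0.0 and fitness(x[i]) < pbest_fit[i]:
                pbest[i], pbest_fit[i] = x[i].copy(), fitness(x[i])
                if fitness(x[i]) < fitness(gbest):
                    gbest = x[i].copy()
    return gbest                                  # bandwidth split y^(k) per slice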
4 Performance Evaluation We implemented the bandwidth allocation algorithm discussed in Section 3 in an OpenFlow VM environment. OpenFlow is an open standard that supports programmable switches; upon its low-level primitives, researchers can build networks with new high-level properties. It is in the process of being implemented by major switch vendors and hence has good prospects for practical application in future network fabrics. From
the 1.0 release, OpenFlow began to support QoS by queue slicing. Although the current release provides only bare-minimum support for QoS-related network slicing, i.e., rate limiting (10 MB) and a minimum bandwidth guarantee, we believe it will not be long before a more advanced version arrives that can be deployed in practical networks. Because of the limitations of the current software, we carried out a simple but sufficiently persuasive experiment. The test topology is shown in Fig. 3, where PC 1 has two ports, eth1 and eth2, and PC 2 has one port connected to the OpenFlow switch.
Fig. 3. Topology for experiment
On PC 1, we use two iperf clients to generate a TCP flow and a UDP flow destined for PC 2, which runs an iperf server. We oversubscribe a link so that traffic becomes congested on Link 3. We test each flow's assigned bandwidth with and without network virtualization to verify whether the network virtualization mechanism works more efficiently. We set both virtual networks' priorities to 1 and use a large timescale T = 1 s for easy counting.
Fig. 4. Assigned bandwidth (MB/s) of both flows over time in (a) the non-virtualized case and (b) the virtualized case
In the non-virtualized case shown in Fig. 4(a), when the UDP flow was sent, the bandwidth assigned to the TCP flow dropped severely. That would be unacceptable in a practical data center network for any TCP application. In Fig. 4(b), because the substrate network was sliced into two virtual networks and the dynamic bandwidth allocation algorithm was applied, the QoS of both flows (i.e., both virtual networks) is guaranteed. In our implementation, both virtual networks have equal priority; therefore they fairly
share the bandwidth when they oversubscribe the link simultaneously. This confirms that network virtualization can provide an efficient mechanism to prevent congestion in data center networks full of virtual machines, and also a more agile way to guarantee QoS.
5 Conclusion This paper argued that network virtualization offers new opportunities to alleviate congestion-driven performance problems in data center networks running large numbers of virtual machines for various applications that generate different types of traffic. We presented a network virtualization architecture and a bandwidth allocation algorithm for virtual networks. Although there is still some distance to a practical, fully mature system, we believe that network virtualization and dynamic bandwidth allocation are well suited to future data center networks, especially in the context of the multi-tenancy mechanism of cloud computing.
References 1. Ghemawat, S., Gobioff, H., Leung, S.: The Google File System. In: ACM SOSP 2003 (2003) 2. Chang, F., et al.: Bigtable: A Distributed Storage System for Structured Data. In: OSDI 2006 (2006) 3. Kant, K.: Towards a virtualized data center transport protocol. In: Proc. of INFOCOM Workshop on High Speed Networks (2008) 4. Rajanna, V.S., Shah, S., Jahagirdar, A., Gopalan, K.: XCo: Explicit Coordination for Preventing Congestion in Data Center Ethernet. In: Proc. of 6th IEEE International Workshop on Storage Network Architecture and Parallel I/Os, Incline Village, NV, USA (2010) 5. Keller, E., Lee, R., Rexford, J.: Accountability in hosted virtual networks. In: ACM SIGCOMM Workshop on Virtualized Infrastructure Systems and Architectures, VISA (2009) 6. He, J., Zhang-Shen, R., et al.: DaVinci: Dynamically Adaptive Virtual Networks for a Customized Internet. In: ACM CoNEXT 2008, Madrid, Spain, December 10-12 (2008) 7. Cisco opening up IOS (2010), http://www.networkworld.com/news/2007/121207-cisco-ios.html 8. Low, S.H.: A duality model of TCP and queue management algorithms. IEEE/ACM Trans. Networking 11, 525–536 (2003) 9. Kennedy, J., Eberhart, R.C.: Particle swarm optimization. In: Proceedings of IEEE International Conference on Neural Networks (1995)
Software Comparison Dealing with Bayesian Networks Mohamed Ali Mahjoub1 and Karim Kalti2 1
Preparatory Institute of Engineer of Monastir (IPEIM) 5019 Monastir, Tunisia
[email protected] 2 University of Monastir (FSM) Monastir, Tunisia
[email protected] Abstract. This paper presents a comparative study of tools dealing with Bayesian networks. Indeed, Bayesian networks are mathematical models now increasingly used in the field of decision support and artificial intelligence. Our study focuses on methods for inference and learning. It presents a state of the art in the field. Keywords: Bayesian network, software, comparison.
1 Introduction Bayesian networks are part of the family of graphical models [1],[3]. They bring together, in one formalism, graph theory and probability theory to provide effective and intuitive tools for representing a probability distribution over a set of random variables. For handling Bayesian networks and running the associated algorithms, several libraries have been developed. The purpose of this paper is first to study these libraries: what each of them provides, the implemented algorithms, the data types supported, and the development interface. We thus present a synthesis of work on Bayesian networks. We first present the formalism of Bayesian networks, covering fundamental notions and associated application areas. In this article, we review some of the more popular and/or recent software packages for dealing with graphical models.
2 Formalism of Bayesian Networks One of the key issues in Artificial Intelligence research is the ability to design and develop dynamic, evolving systems. Such systems must therefore be equipped with intelligent behaviors and be able to learn and reason. In most cases, however, the knowledge acquired is not adequate to allow the system to take the most appropriate decision. To address this, several methodologies have been proposed, and probabilistic approaches are the best suited, not only for reasoning with uncertain knowledge and beliefs, but also for structuring knowledge representation. One such probabilistic approach is the Bayesian network. Bayesian networks are a compact representation of a
joint probability distribution over a set of variables, based on the concept of conditional independence. In addition, they facilitate the description of a collection of beliefs by making explicit the causality and conditional-independence relationships among these beliefs, and they provide an efficient way to update the strength of beliefs when new evidence is observed.
2.1 Basic Concepts
We first recall Bayes' theorem. Given two events A and B, Bayes' theorem can be stated as follows:

P(A|B) = P(B|A) P(A) / P(B)   (1)

where P(A) is the prior probability of A, P(B|A) is the likelihood of A, and P(B) is the prior probability of B; P(A|B) is thus the posterior probability of A given B. A Bayesian network is an acyclic graph: the graph is directed and contains no cycle. Each node of a Bayesian network is labeled with an attribute of the problem. These attributes are binary and can take (with some probability) the value TRUE or FALSE, which means that a random variable is associated with each attribute. A Bayesian network is defined by:
- a directed graph without cycles G = (V, E), where V is the set of nodes of G and E is the set of arcs of G;
- a finite probability space (Ω, Z, p);
- a set of random variables associated with the nodes of the graph, defined on (Ω, Z, p), such that

P(V_1, V_2, ..., V_n) = ∏_i P(V_i | P(V_i)),   (2)

where P(V_i) is the set of causes (parents) of V_i in the graph G. A Bayesian network is thus composed of two components: a causal directed acyclic graph (the qualitative representation of knowledge) and a set of local probability distributions (the quantitative representation of knowledge). After defining the different values that a characteristic can take, the expert must indicate the relationships between these characteristics. Finally, the different probabilities must be defined to complete the network: each value (state) of a given node must be assigned its probability of occurrence. A small numerical illustration of the factorization (2) is given below.
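As a small illustration of the factorization (2), the following Python snippet computes a joint probability of a hypothetical three-node network A -> C <- B from its conditional probability tables; the structure and all numbers are invented purely for illustration.

# Joint probability via equation (2): product of P(node | parents) terms.
cpt = {
    "A": {(): {True: 0.3, False: 0.7}},                  # P(A)
    "B": {(): {True: 0.6, False: 0.4}},                  # P(B)
    "C": {(True, True):   {True: 0.9, False: 0.1},       # P(C | A, B)
          (True, False):  {True: 0.7, False: 0.3},
          (False, True):  {True: 0.5, False: 0.5},
          (False, False): {True: 0.1, False: 0.9}},
}
parents = {"A": (), "B": (), "C": ("A", "B")}

def joint(assignment):
    """P(V_1, ..., V_n) = prod_i P(V_i | parents(V_i)) -- equation (2)."""
    p = 1.0
    for node, value in assignment.items():
        key = tuple(assignment[pa] for pa in parents[node])
        p *= cpt[node][key][value]
    return p

print(joint({"A": True, "B": False, "C": True}))   # 0.3 * 0.4 * 0.7 = 0.084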
3 Problems Associated with Bayesian Networks There are several problems in the use of BNs; we cite the main ones [1]. First, the correspondence between the graphical structure and the associated probabilistic structure allows all inference problems to be reduced to problems in graph theory; however, these problems are relatively complex and give rise to much research. The second difficulty of Bayesian networks lies precisely in transposing the causal graph into a probabilistic representation. Even if the only probability
tables needed to specify the entire probability distribution are those of each node conditioned on its parents, defining these tables is not always easy for an expert. Another problem of Bayesian networks is the automatic learning of the structure, which remains a rather complex problem.
4 Study of Tools for Learning Bayesian Networks We present a study of the tools that manipulate Bayesian networks: what each of them provides, the algorithms offered for inference and learning, and the exchange formats used for interaction between these programs. We add a comparison of these tools in tabular form to ease their understanding. For handling Bayesian networks and running the associated algorithms, several libraries and software packages have been developed; we cite the tools on which the most work has been done.
BAYES NET TOOLBOX (BNT). BNT is an open-source Matlab library, following the work of Kevin Murphy, now supported by many researchers given its integration of several algorithms for inference and learning in Bayesian networks. However, this library is still difficult for non-specialists to use because it requires some knowledge of Matlab and of BNT itself: manipulation is done by writing code in a Matlab text editor. BNT offers several inference algorithms for discrete, Gaussian, or mixed (conditional Gaussian) Bayesian networks, either exact, such as variable elimination, junction tree, Quick Score, and Pearl's exact algorithm (for polytrees), or approximate and sampling-based (likelihood weighting and Gibbs sampling). For parameter learning, BNT offers two settings: maximum likelihood or maximum a posteriori for complete data, and the EM algorithm for incomplete data. For structure learning, BNT uses several scoring functions such as the BIC criterion. The only saving format supported by BNT is the Matlab format with extension ".m".
BAYESIALAB. BayesiaLab is a product of Bayesia (www.bayesia.com), a French company dedicated to decision-support and machine-learning methods from artificial intelligence and their operational applications (industry, services, finance, etc.). BayesiaLab is a complete laboratory for the manipulation and study of Bayesian networks. It is developed in Java and is currently available in French, English, and Japanese. BayesiaLab can handle the entire modeling chain of a system with a Bayesian network. Learning in BayesiaLab uses a text file or an ODBC link describing the set of cases. This application uses exact and approximate inference, but the algorithms are not specified. The saving formats supported by BayesiaLab are "XBL", "BIF", "DNE" and "NET".
NETICA. Netica is the most widely distributed Bayesian network software in the world. It was developed in 1992 by the company Norsys, which offers a free version limited to 15 variables and samples of 1000 cases for learning from data. It is used for diagnosis, prevention, or simulation in the fields of finance, environment,
medicine, industry, and other fields. Netica is very widely used for Bayesian networks, but it implements only one inference algorithm: the junction tree. On the other hand, Netica offers a graphical interface for easy operation; it can easily transform a Bayesian network and explore the relationships between variables in a model built by learning from data, by inverting links or absorbing nodes, while keeping the overall probability distribution of the Bayesian network unchanged. Learning in Netica uses a text file, tab-delimited CSV files, or an ODBC link describing the set of cases. The saving formats supported by Netica are "DNE" and "NETA". This application also allows the import of a Bayesian network in the "DSC", "DXP" and "NET" formats.
HUGIN. Hugin is one of the most widely used Bayesian network tools. It is a commercial product with functionality similar to BNT and was one of the first packages to model DAGs. This tool provides a graphical environment and a development environment that help define and build knowledge bases founded on Bayesian networks. Hugin supports a single inference algorithm, the junction tree; the junction tree can be inspected and the triangulation method can be changed. Learning of structure and parameters can be driven from Java, C, or Visual Basic. Parameter learning in Hugin uses the EM algorithm, while structure learning is provided by two algorithms, PC and NPC. The saving formats supported by Hugin are "OOBN", "HKB" and "NET".
JAVABAYES. JavaBayes, developed by Fabio Gagliardi Cozman in 1998 and licensed under the GNU General Public License, is a set of Java tools that create and manipulate Bayesian networks. The system consists of a graphical editor, an inference core, and a set of parsers. The graphical editor allows the user to create and modify Bayesian networks. The methods in JavaBayes are not commented and many variables have non-meaningful names, which makes the code difficult to understand. In addition, JavaBayes still has many bugs and shortcomings in its interface; it is difficult to handle, for instance when saving a network or importing it into another program. The inference algorithms implemented in this system are variable elimination and the junction tree. JavaBayes can use models with sets of distributions to compute intervals of posterior distributions or of expectations, but it does not propose algorithms for learning parameters or structure. The saving formats supported by JavaBayes are "BIF" and "XML".
GENIE. Released in 1998 by Druzdzel's decision systems group, GeNIe (Graphical Network Interface) is a development environment for decision support and the construction of Bayesian networks, characterized by its inference engine SMILE (Structural Modeling, Inference, and Learning Engine). GeNIe offers several inference algorithms and several saving formats for exchanging networks between different applications. GeNIe essentially uses the junction tree and polytree algorithms for inference, plus several approximate algorithms that can be used if the networks become too large for clustering (logic sampling, likelihood weighting,
self-importance and heuristic importance sampling, and backward sampling). Parameter learning and structure learning are supported. The saving formats supported by GeNIe are "xDSL", "DSL", "NET", "DNE", "DXP", "ERG" and "DSC".
BNJ. BNJ (Bayesian network tools in Java) is a set of Java tools for research and development with Bayesian networks. The project was developed within the KDD laboratory at Kansas State University and is an open-source project licensed under the GNU General Public License; its latest version was published in April 2006. It provides a set of inference algorithms for Bayesian networks. In BNJ it is possible to define two types of probability distribution for the nodes: discrete tabular and continuous. The Bayesian networks created are stored in XML files. BNJ provides two categories of inference algorithms: exact inference ("junction tree", "variable elimination with optimization") and approximate inference built on the exact algorithms. Some methods use sampling, such as "Adaptive Importance Sampling (AIS)", "Logic Sampling" and "Forward Sampling", while others apply the exact algorithms to a selection of arcs of the graph, such as "Kruskal-Polytree", "BCS" and "PTReduction". On the other hand, going through all the source files of this toolkit, we find no implementation of learning algorithms for either parameters or structure. The saving formats supported by BNJ are "NET", "Hugin" and "XML".
MSBNX. MSBNX is a component-based Windows application to create and evaluate Bayesian networks. MSBNX uses the junction tree algorithm for inference. There is no learning algorithm for parameters or structure. The saving formats supported by MSBNX are "XBN", "DSC" and "XML" (the XML format of this application has an MSBNX-specific design, different from the XML formats of the other applications, which follow a standard format).
SAMIAM. SamIam has two main components: a graphical interface and an inference engine. The graphical interface allows users to develop Bayesian network models and save them in a variety of formats. The inference engine covers many tasks: classical inference, parameter estimation, time-space tradeoffs, sensitivity analysis, and explanation based on MAP and MPE. SamIam is free software including a variety of inference utilities, all based on one inference algorithm, the junction tree. It supports three implementations of the junction tree algorithm: the Hugin architecture, the Shenoy-Shafer architecture, and a new architecture that combines the best of the previous ones. SamIam uses the EM (Expectation Maximization) algorithm for estimating the parameters of the network from data. It adopts the Hugin "File" format for specifying data as a set of cases, and it also includes utilities to generate data randomly from a given network and to store data in files. The saving formats supported by SamIam are "NET" and "HUGIN".
UNBBAYES. UnBBayes is an open-source package for modeling, learning, and probabilistic reasoning with networks. It uses a variety of algorithms for inference and learning, but error handling is often buggy: the software can enter an infinite loop that crashes the system, due to the lack of exception generation for software errors. UnBBayes uses the junction tree, Likelihood Weighting, and Correct Classification Review algorithms for inference, and the K2, B, V, and Incremental Learning algorithms for learning. The saving formats supported by UnBBayes are "NET", "UBF" and "XML". This application also allows the import of a Bayesian network in the "OWL" format.
PROBT. ProBT is a C++ library whose commercial and industrial exploitation has been granted on an exclusive basis to the company Probayes. ProBT is very powerful software that offers a free version for research purposes, but it does not provide the source code itself; it offers only documentation of the different classes from which it is built. For exact inference, ProBT uses the Successive Restrictions Algorithm (SRA). For approximate inference, several approximation schemes are used, such as Monte Carlo and simultaneous evaluation and maximization. ProBT also provides algorithms for learning parameters: for complete data it proposes an algorithm based on maximum likelihood, and for incomplete data an algorithm based on the EM principle. On the other hand, ProBT contains no implementation for learning the Bayesian network structure, although recent projects, such as the MWST and K2 algorithms for structure learning from complete data, are under development. The saving format supported by ProBT is an "XML" format specific to this application. The application also offers the possibility of importing an Excel file after the data have been organized in tables.
ANALYTICA. Analytica is shareware for creating, analyzing, and modeling graphical models such as Bayesian networks. A limited trial version is available: one fills out a form with a telephone number on their website and is phoned for an opinion on the software. This software offers a single inference algorithm, specific to the application, that cannot be related to the inference algorithms known in the Bayesian network literature; Analytica uses its own inference engine, ADE (the Analytica Decision Engine). There is no learning algorithm for parameters or structure. The only saving format supported by Analytica is the "ANA" format.
BNET BUILDER. BNet.Builder is Bayesian network software developed in Java and is very easy to use. It relies on an embedded inference engine, and the inference algorithm used is not mentioned. Browsing through all the documentation and help documents and using the software, there is no evidence that this software performs structure or parameter learning. The saving formats supported by BNet.Builder are "XBN", "DNE" and "NET".
But recent projects are under development as the algorithm MWST and K2 for learning structure from complete data.
174
M.A. Mahjoub and K. Kalti
PNL Open Source Probabilistic Networks Library, a tool for working with graphical models, supporting directed and undirected models, discrete and continuous variables, various inference and learning algorithms. BAYES BUILDER BayesBuilder is free software for the construction of Bayesian networks. It is only available on Windows, because the inference engine is written in C + + and has been compiled for Windows yet8. This software uses an inference engine without mentioning the inference algorithm, which classifies it as a black box in algorithmic perspective. There is an inference algorithm not mentioned. He supports neither the training nor the structure of the learning parameters. The backup formats supported by BNetBuilder formats are "bbnet. This application also allows the import of a Bayesian network in the format "NET", "DNET", "DSC" and "BIF". XBAIES This software bayesian networks as dialogs for inference and learning Bayesian networks. This software has no graphical interface for entering the network but can be entered based on rules, not making it difficult to use as new generations of applications of expert systems is now based on graphical models and more on systems based on rules. Also it uses only one inference algorithm which is the junction tree. XBAIES uses the junction tree algorithm. Learning parameters and structure are supported XBAIES does not provide a format for saving a Bayesian network, but entering a network is literally using the buttons. OPENBUGS BUGS is software for Bayesian networks, which do not show the networks built, he has limited command lines similar to those of Matlab without graphical display of Bayesian network. This software is for the older generation of expert systems because it is based on rules and not on graphical models. The inference algorithm used in Bugs is Gibbs sampling. Bugs is the learning parameter but not the structure. The save format is supported by OpenBugs format "BUGS". BAYESIAN KNOWLEDGE DISCOVERER / BAYESWARE DISCOVERER Bayesian Knowledge Discoverer is a free software that has been replaced by a commercial version Bayesware Discoverer which itself offers a trial version of 30 days. This software provides a graphical interface with powerful visualization options and is available for Windows, Unix and Macintosh. This software uses only one inference algorithm which is not mentioned. The software algorithm uses the junction tree for inference. Bayesware been learning structure and parameters from complete data and incomplete. The algorithm used is approached 'bound and collapse'. The save format supported by BKD / BD is the format "DBN". This application also allows the import of a Bayesian network in the format "DSC".
8
The GUI program is written in Java and it is easy to use.
Software Comparison Dealing with Bayesian Networks
175
VIBES VIBES (Variational Inference for Bayesian Networks) is a java program for Bayesian networks. This software uses a single inference algorithm not known in the literature of Bayesian networks. VIBES provides an inference engine to clean it VIBES (Variational Inference for Bayesian Networks). The learning parameters is supported while learning structure is not supported. The save format supported by VIBES format is "XML" which has a specific design for this application.
5 Comparison of Tools Manipulating Bayesian Networks To better understand the tools manipulating Bayesian networks, we propose in this section a comparative analysis, summarized in two tables. First, we propose a comparison of applications of Bayesian networks in the form of a table based on the type Table 1. Table of comparison tools manipulating Bayesian networks
Tools name      Type      Source     License   Web site
BNT             Library   Matlab/C   Free      http://www.cs.ubc.ca/~murphyk/Software/BNT/bnt.html
BayesiaLab      Software  No         PLEV      http://www.bayesia.com
NETICA          Software  No         PLEV      http://www.norsys.com
HUGIN           Software  No         PLEV      http://www.hugin.com
JavaBayes       Software  Java       Free      http://www.cs.cmu.edu/~javabayes/Home
GeNIe           Software  No         Free      http://genie.sis.pitt.edu
BNJ             Software  Java       Free      http://bnj.sourceforge.net
MSBNX           Software  No         Free      http://research.microsoft.com/en-us/um/redmond/groups/adapt/msbnx
SamIam          Software  Java       Free      http://reasoning.cs.ucla.edu/samiam
UnBBayes        Software  Java       Free      http://unbbayes.sourceforge.net
ProBT           Library   C++        Free      http://www.probayes.com
Analytica       Software  No         PLEV      http://www.lumina.com
BNet Builder    Software  No         PLEV      http://www.cra.com
Bayes Builder   Software  No         PLEV      http://www.snn.ru.nl
OpenBugs        Software  No         Free      http://www.mrc-bsu.cam.ac.uk/bugs
BKD/BD          Software  No         PLEV      http://bayesware.com
PNL             Library   Yes        Free      http://sourceforge.net/projects/openpnl
VIBES           Software  Java       Free      http://www.johnwinn.org/ and http://vibes.sourceforge.net/

PLEV: pay, limited evaluation version
Table 2. Technical comparison of tools manipulating Bayesian networks (P/S: parameter/structure learning; C/D: continuous and discrete; E/A: exact/approximate inference)

Tool          | Inference | Learning | Supported formats                              | Variables
BNT           | E/A       | P/S      | m, BIF                                         | C/D
BayesiaLab    | E/A       | P/S      | xbl, dne and net                               | C/D
NETICA        | Exact     | P        | dne, neta, dsc, dxp, net                       | C/D
HUGIN         | Exact     | P/S      | oobn, hkb, net                                 | C/D
JavaBayes     | Exact     | -        | BIF and XML                                    | Discrete
GeNIe         | E/A       |          | xdsl, dsl, net, dne, dxp, erg and dsc          | Discrete
BNJ           | E/A       | P        | XML, net and hugin                             | Discrete
MSBNX         | Exact     |          | xbn, dsc and XML                               | Discrete
SamIam        | Exact     |          | net, hugin, dsl, xdsl, dsc, dne, dnet, erg     | C/D
UnBBayes      | Exact     | S        | XML, net and ubf                               | Discrete
ProBT         | E/A       | P/S      | xml                                            | C/D
Analytica     | Exact     | P        | ana                                            | C/D
AgenaRisk     | E/A       | -        | cmp                                            | C/D
BNet Builder  | Exact     |          | xbn, dne and net                               | C/D
Bayes Builder | E/A       |          | bbnet, net, dnet, dsc, BIF                     | D
XBAIES        | Exact     | P/S      | -                                              | C/D
OpenBUGS      | E/A       | P        | bugs                                           | C/D
BKD/BD        | E/A       | P/S      | dbn and dsc                                    | C/D
VIBES         | E/A       | P        | xml                                            | C/D
6 Conclusion
Graphical models are a way to represent conditional independence assumptions using graphs. The most popular ones are Bayesian networks. To conclude, we can say that the BNT library is the most useful tool for Bayesian networks. We also note that the stand-alone software packages are commercial, while the libraries are free.
References
1. Naïm, P., Wuillemin, P.-H., Leray, P., Pourret, O., Becker, A.: Réseaux Bayésiens, 3rd edn. Eyrolles
2. Murphy, K.: Software for Graphical Models: A Review. ISBA Bulletin (December 2007)
3. Pearl, J.: Causality: Models, Reasoning, and Inference. Cambridge University Press, Cambridge (2000)
Adaptive Synchronization on Edges of Complex Networks Wenwu Yu Department of Mathematics, Southeast University, Nanjing 210096, China
[email protected],
[email protected]
Abstract. This paper studies distributed adaptive control for synchronization on edges of complex networks. Some distributed adaptive strategies are designed on all the coupling weights of the network for reaching synchronization, which are then extended to the case where only a small fraction of coupling weights are adjusted. A general criterion is given and it is found that synchronization can be reached if the corresponding edges and nodes of the updated coupling weights form a spanning tree. Finally, simulation examples are given to illustrate the theoretical analysis.
Keywords: Synchronization; Complex network; Distributed adaptive strategy; Pinning control; Leader-follower control.
1 Introduction
Networks exist everywhere nowadays, such as biological neural networks, ecosystems, social networks, metabolic pathways, the Internet, the WWW, electrical power grids, etc. Typically, each network is composed of nodes representing the individuals in the network and edges representing the connections among them. Recently, small-world and scale-free complex network models were proposed by Watts and Strogatz (WS) [13], and Barabási and Albert (BA) [2]. Thereafter, the study of various complex networks has attracted increasing attention from researchers in physics, mathematics, engineering, biology, and sociology. Synchronization, a typical collective behavior in nature, has received a lot of interest since the pioneering work of Pecora and Carroll [9], due to its potential applications in secure communications, chemical reactions, biological systems, and so on. Usually, there are large numbers of nodes in real-world complex networks. Therefore, much attention has been paid to the study of synchronization in various large-scale complex networks [1,7,8,10,11,14,15,16,18,19,20,21]. In [10,11], local synchronization was investigated through the transverse stability to the synchronization manifold, and a distance from the collective states to the synchronization manifold was defined in [14,15], based on which some results were obtained for global synchronization of complex networks and systems [7,8,16,18,19,20,21].
Recently, in order to save control cost, pinning control has been widely investigated in complex networks [3,12,16,20,21]. If the network cannot synchronize by itself, controllers may be applied to a small fraction of the nodes to force the network to synchronize, which is known as pinning control. However, many derived conditions for ensuring network synchronization are only sufficient and thus somewhat conservative. Consequently, many works have been devoted to using adaptive strategies to adjust network parameters and obtain better conditions for reaching network synchronization, building on previous results on adaptive synchronization of nonlinear systems [17,22]. For example, in [3,20,24,25], adaptive laws were applied to the control gains from the leader to the nodes, and in [3,20], adaptive schemes were designed for the network coupling strength, which are centralized algorithms. However, there are few works on how to update the coupling weights of the network for reaching synchronization. In [23], an algorithm was proposed for updating the coupling weights to reach network synchronization, but no theoretical analysis was given. Recently, a general distributed adaptive strategy on the coupling weights was proposed and some theoretical conditions were derived for reaching network synchronization in [4]. However, the conditions in [4] are not easy to apply and, in addition, it is impossible in practice to update all the coupling weights as in [4,23]. A basic contribution of this paper is that a simple distributed adaptive strategy is applied to the coupling weights of the network and the derived conditions are very easy to apply. For example, if the corresponding edges and nodes with updated coupling weights form a spanning tree, synchronization can be reached in the network, i.e., updating N − 1 coupling weights is enough to achieve network synchronization. Another contribution of this paper is that pinning control is applied both to a small fraction of controlled nodes and to the coupling weights, which originates from pinning node control in the literature [3,12,16,20,21,24,25]. Additionally, some novel distributed adaptive control laws are proposed for reaching network synchronization in this paper.
The rest of the paper is organized as follows. In Section 2, some preliminaries are briefly outlined. Distributed adaptive control strategies are proposed in Section 3. In Section 4, simulation examples are given on random and scale-free networks to illustrate the theoretical analysis. Conclusions are finally drawn in Section 5.
2 Preliminaries
Let G = (V, E, G) be a weighted undirected graph of order N, with the set of nodes V = {v_1, v_2, \ldots, v_N}, the set of undirected edges E \subseteq V \times V, and a weighted adjacency matrix G = (G_{ij})_{N \times N}. An undirected edge in graph G is denoted by e_{ij} = (v_i, v_j). If there is an edge between node v_i and node v_j, then G_{ij}(t) = G_{ji}(t) > 0 is the time-varying weight associated with the edge e_{ij}; otherwise, G_{ij}(t) = G_{ji}(t) = 0 (j \neq i). As usual, we assume there is no self-loop in G. Consider the following complex dynamical network consisting of N nodes with linearly diffusive coupling [7,8,10,11,14,15,18,19].
\dot{x}_i(t) = f(x_i(t), t) + \sum_{j=1, j \neq i}^{N} G_{ij}(t) \Gamma (x_j(t) - x_i(t)), \quad i = 1, 2, \ldots, N,   (1)

where x_i(t) = (x_{i1}(t), x_{i2}(t), \ldots, x_{in}(t))^T \in R^n is the state vector of the ith node, f: R^n \times R^+ \to R^n is a nonlinear vector function, \Gamma = diag(\gamma_1, \ldots, \gamma_n) \in R^{n \times n} is a semi-positive definite inner coupling matrix where \gamma_j > 0 if two nodes can communicate through the jth state, G(t) = (G_{ij}(t))_{N \times N} is the coupling configuration matrix representing the topological structure of the time-varying network, and the Laplacian matrix L is defined by

L_{ij}(t) = -G_{ij}(t), \ i \neq j; \qquad L_{ii}(t) = -\sum_{j=1, j \neq i}^{N} L_{ij}(t),   (2)

which ensures the diffusion property \sum_{j=1}^{N} L_{ij}(t) = 0 for all t. Equivalently, network (1) can be rewritten in a simpler form as follows:

\dot{x}_i(t) = f(x_i(t), t) - c \sum_{j=1}^{N} L_{ij}(t) \Gamma x_j(t), \quad i = 1, 2, \ldots, N.   (3)

The objective of control here is to find some adaptive control laws acting on G_{ij}(t) under fixed network topology such that the solutions of the controlled network (3) reach synchronization, i.e.,

\lim_{t \to \infty} \|x_i(t) - x_j(t)\| = 0, \quad \forall \, i, j = 1, 2, \ldots, N.   (4)

Here, it is emphasized that the network structure is fixed and only the coupling weights can be time-varying, i.e., if there is no connection between nodes i and j, then G_{ij}(t) = G_{ji}(t) = 0 for all t.

Assumption 1. There exist a constant diagonal matrix H = diag(h_1, \ldots, h_n) and a positive value \varepsilon > 0 such that

(x - y)^T (f(x, t) - f(y, t)) - (x - y)^T H \Gamma (x - y) \leq -\varepsilon (x - y)^T (x - y), \quad \forall x, y \in R^n.   (5)

Note that Assumption 1 is very mild. For example, all linear and piecewise linear functions satisfy this condition. In addition, if \partial f_i / \partial x_j (i, j = 1, 2, \ldots, n) are bounded and \Gamma is positive definite, the above condition is satisfied, which includes many well-known systems. Without loss of generality, only connected networks are considered throughout the paper. Otherwise, one can consider synchronization on each connected component of the network.
Lemma 1. [6] (1) The Laplacian matrix L of the undirected network G is semi-positive definite. It has a simple eigenvalue 0, and all the other eigenvalues are positive if and only if the network is connected.
(2) The second smallest eigenvalue \lambda_2(L) of the Laplacian matrix L satisfies
\lambda_2(L) = \min_{x^T 1_N = 0, \, x \neq 0} \frac{x^T L x}{x^T x}.
(3) For any \eta = (\eta_1, \ldots, \eta_N)^T \in R^N, \quad \eta^T L \eta = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} G_{ij} (\eta_i - \eta_j)^2.
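As a concrete illustration of Lemma 1, the following Python sketch (ours, not part of the paper; the graph and all names are illustrative) builds the Laplacian of a small weighted undirected graph and numerically checks the zero eigenvalue, the second smallest eigenvalue \lambda_2, and the quadratic-form identity of item (3).

```python
import numpy as np

# Weighted adjacency matrix of a small undirected graph (G_ij = G_ji >= 0, no self-loops).
G = np.array([[0.0, 1.0, 0.5, 0.0],
              [1.0, 0.0, 2.0, 0.0],
              [0.5, 2.0, 0.0, 1.5],
              [0.0, 0.0, 1.5, 0.0]])

# Laplacian as in (2): L_ij = -G_ij for i != j, L_ii = sum_j G_ij.
L = np.diag(G.sum(axis=1)) - G

eigvals = np.sort(np.linalg.eigvalsh(L))
print("eigenvalues:", eigvals)          # the smallest is 0; the rest are positive (connected graph)
print("lambda_2:", eigvals[1])

# Check Lemma 1(3): eta^T L eta = 1/2 * sum_ij G_ij (eta_i - eta_j)^2.
eta = np.random.randn(4)
lhs = eta @ L @ eta
rhs = 0.5 * sum(G[i, j] * (eta[i] - eta[j]) ** 2 for i in range(4) for j in range(4))
print("quadratic form check:", np.isclose(lhs, rhs))
```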
3 Distributed Adaptive Control in Complex Networks
In this section, some distributed adaptive laws on the weights L_{ij} for i \neq j are proposed, which result in adaptive laws on G_{ij} since L_{ij} = -G_{ij} for i \neq j. Let \bar{x} = \frac{1}{N} \sum_{j=1}^{N} x_j. Then, one gets

\dot{\bar{x}}(t) = \frac{1}{N} \sum_{j=1}^{N} f(x_j(t), t).   (6)

Subtracting (6) from (3) yields the following error dynamical network

\dot{e}_i(t) = f(x_i(t), t) - \frac{1}{N} \sum_{j=1}^{N} f(x_j(t), t) - c \sum_{j=1}^{N} L_{ij}(t) \Gamma e_j(t), \quad i = 1, 2, \ldots, N,   (7)

where e_i = x_i - \bar{x}, i = 1, \ldots, N.

3.1 Distributed Adaptive Control in Complex Networks
Theorem 1. Suppose that Assumption 1 holds. The network (3) is synchronized under the following distributed adaptive law

\dot{L}_{ij}(t) = -\alpha_{ij} (x_i - x_j)^T \Gamma (x_i - x_j), \quad 1 \leq i \neq j \leq N,   (8)

where \alpha_{ij} = \alpha_{ji} are positive constants, 1 \leq i \neq j \leq N.

Proof. Consider the Lyapunov functional candidate:

V(t) = \frac{1}{2} \sum_{i=1}^{N} e_i^T(t) e_i(t) + \sum_{i=1}^{N} \sum_{j=1, j \neq i}^{N} \frac{c}{4 \alpha_{ij}} (L_{ij}(t) + \sigma_{ij})^2,   (9)

where \sigma_{ij} = \sigma_{ji} are positive constants to be determined, 1 \leq i \neq j \leq N.
The derivative of V(t) along the trajectories of (7) gives

\dot{V} = \sum_{i=1}^{N} e_i^T(t)\dot{e}_i(t) + \sum_{i=1}^{N}\sum_{j=1, j\neq i}^{N} \frac{c}{2\alpha_{ij}} (L_{ij}(t)+\sigma_{ij}) \dot{L}_{ij}(t)
 = \sum_{i=1}^{N} e_i^T(t)\Big[f(x_i(t),t) - \frac{1}{N}\sum_{j=1}^{N} f(x_j(t),t) - c\sum_{j=1}^{N} L_{ij}(t)\Gamma e_j(t)\Big] - \frac{c}{2}\sum_{i=1}^{N}\sum_{j=1, j\neq i}^{N} (L_{ij}(t)+\sigma_{ij})(x_i-x_j)^T\Gamma(x_i-x_j)
 = \sum_{i=1}^{N} e_i^T(t)\Big[f(x_i(t),t) - f(\bar{x},t) + f(\bar{x},t) - \frac{1}{N}\sum_{j=1}^{N} f(x_j(t),t) - c\sum_{j=1}^{N} L_{ij}(t)\Gamma e_j(t)\Big] - \frac{c}{2}\sum_{i=1}^{N}\sum_{j=1, j\neq i}^{N} (L_{ij}(t)+\sigma_{ij})(e_i-e_j)^T\Gamma(e_i-e_j).   (10)

Since \sum_{i=1}^{N} e_i^T(t) = 0, one has \sum_{i=1}^{N} e_i^T(t)\big[f(\bar{x},t) - \frac{1}{N}\sum_{j=1}^{N} f(x_j(t),t)\big] = 0. From Assumption 1, it follows that

\sum_{i=1}^{N} e_i^T(t)\big[f(x_i(t),t) - f(\bar{x},t)\big] \leq -\varepsilon \sum_{i=1}^{N} e_i^T(t)e_i(t) + \sum_{i=1}^{N} e_i^T(t) H\Gamma e_i(t).   (11)

Define the Laplacian matrix \tilde{\Sigma} = (\tilde{\sigma}_{ij})_{N\times N}, where \tilde{\sigma}_{ij} = -\sigma_{ij} for i \neq j and \tilde{\sigma}_{ii} = -\sum_{j=1, j\neq i}^{N} \tilde{\sigma}_{ij}. In view of Lemma 1, one obtains

c\sum_{i=1}^{N}\sum_{j=1, j\neq i}^{N} (L_{ij}(t)+\sigma_{ij})(e_i-e_j)^T\Gamma(e_i-e_j) = -2c\sum_{i=1}^{N}\sum_{j=1}^{N} L_{ij}(t)\, e_i^T\Gamma e_j + 2c\sum_{i=1}^{N}\sum_{j=1}^{N} \tilde{\sigma}_{ij}\, e_i^T\Gamma e_j.   (12)

Therefore, combining (10)-(12) and by Lemma 1, one finally has

\dot{V} \leq -\varepsilon \sum_{i=1}^{N} e_i^T(t)e_i(t) + \sum_{i=1}^{N} e_i^T(t)H\Gamma e_i(t) - c\sum_{i=1}^{N}\sum_{j=1}^{N} \tilde{\sigma}_{ij}\, e_i^T\Gamma e_j
 = e^T(t)\big[-\varepsilon(I_N\otimes I_n) + (I_N\otimes H\Gamma) - c(\tilde{\Sigma}\otimes\Gamma)\big]e(t)
 \leq e^T(t)\big[-\varepsilon(I_N\otimes I_n) + (I_N\otimes H\Gamma) - c\lambda_2(\tilde{\Sigma})(I_N\otimes\Gamma)\big]e(t).   (13)

By choosing \sigma_{ij} sufficiently large such that c\lambda_2(\tilde{\Sigma}) > \max_j(h_j), one gets (I_N\otimes H\Gamma) - c\lambda_2(\tilde{\Sigma})(I_N\otimes\Gamma) \leq 0. Therefore, \dot{V} \leq -\varepsilon e^T(t)e(t), where e(t) = (e_1^T(t), e_2^T(t), \ldots, e_N^T(t))^T. This completes the proof. □
Remark 1. In [23], the dynamical coupling weights are adjusted according to the local synchronization property between a node and its neighbors. A new type of decentralized local strategy was then proposed, with detailed theoretical analysis, in [4]. However, the derived conditions are conservative and not easy to apply. In Theorem 1, by designing a simple distributed adaptive law on the coupling weights, the conditions in [4] are totally removed. As long as Assumption 1 is satisfied, the network can reach synchronization under (8). In addition, one can let the convergence rate \alpha_{ij} be very small if the communication ability between nodes i and j is limited.
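A minimal Python sketch of network (3) under the fully distributed adaptive law (8) is given below (our illustration; the integration scheme, parameter values, and all names are ours, not part of the paper). It uses simple Euler steps and updates the weights only on existing edges, since the topology is fixed.

```python
import numpy as np

def simulate_adaptive_sync(f, x0, G0, Gamma, c=1.0, alpha=0.1, dt=1e-3, steps=5000):
    """Euler simulation of network (3) with the adaptive law (8) on every coupling weight.

    f     : node dynamics, f(x, t) -> dx/dt for a single node state (n,)
    x0    : (N, n) initial node states
    G0    : (N, N) initial symmetric adjacency matrix (fixed topology, time-varying weights)
    Gamma : (n, n) inner coupling matrix
    """
    x, G = x0.copy(), G0.copy()
    topology = G0 > 0                      # edges are fixed; only weights on existing edges adapt
    for k in range(steps):
        t = k * dt
        L = np.diag(G.sum(axis=1)) - G     # Laplacian built from the current weights
        dx = np.array([f(xi, t) for xi in x]) - c * (L @ x) @ Gamma.T
        # Law (8): Ldot_ij = -alpha (x_i - x_j)^T Gamma (x_i - x_j), i.e. Gdot_ij = +alpha (...)
        diff = x[:, None, :] - x[None, :, :]
        dG = alpha * np.einsum('ijk,kl,ijl->ij', diff, Gamma, diff) * topology
        x += dt * dx
        G += dt * dG
    return x, G
```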
3.2 Distributed Adaptive Pinning Weights Control in Complex Networks
In Theorem 1, all the coupling weights are adjusted according to the distributed adaptive law (8). Though the algorithm is local and decentralized, it is practically impossible to update all the coupling weights of the network. Next, we aim to update a minority of the coupling weights so that synchronization can still be reached, which is called pinning weights control here.

Assume that the pinning strategy is applied on a small fraction \delta (0 < \delta < 1) of the coupling weights in network (3). Suppose that the coupling weights L_{k_1 m_1}, L_{k_2 m_2}, \ldots, L_{k_l m_l} are selected, where l = \lfloor \delta N \rfloor represents the integer part of the real number \delta N. Let \tilde{G} = (\tilde{V}, \tilde{E}, \tilde{G}) be a weighted undirected graph, with the set of nodes \tilde{V} = \{v_{k_1}, \ldots, v_{k_l}, v_{m_1}, \ldots, v_{m_l}\}, the set of undirected edges \tilde{e}_{k_i m_i} = (v_{k_i}, v_{m_i}) \in \tilde{E}, and a weighted adjacency matrix \tilde{G} = (\tilde{G}_{ij})_{N \times N} where \tilde{G}_{k_i m_i}(t) = \tilde{G}_{m_i k_i}(t) > 0, i = 1, \ldots, l. It is clear that \tilde{G} is a subgraph of G. Let \tilde{L} be the corresponding Laplacian matrix of the adjacency matrix \tilde{G}.

Theorem 2. Suppose that Assumption 1 holds. Under the following distributed adaptive law

\dot{L}_{m_i k_i}(t) = \dot{L}_{k_i m_i}(t) = -\alpha_{k_i m_i} (x_{k_i} - x_{m_i})^T \Gamma (x_{k_i} - x_{m_i}), \quad i = 1, \ldots, l,   (14)

where \alpha_{k_i m_i} = \alpha_{m_i k_i} are positive constants, the network (3) is synchronized if there are positive constants \tilde{\sigma}_{m_i k_i} such that

(I_N \otimes H\Gamma) - c\lambda_2(\bar{L})(I_N \otimes \Gamma) \leq 0,   (15)

where \bar{L} = (\bar{L}_{js})_{N \times N} is a Laplacian matrix with \bar{L}_{js} = \bar{L}_{sj} = -\tilde{\sigma}_{m_i k_i} when j = m_i and s = k_i; otherwise \bar{L}_{js} = \bar{L}_{sj} = L_{js}, i = 1, \ldots, l, 1 \leq j \neq s \leq N.

Proof. Consider the Lyapunov functional candidate:

V(t) = \frac{1}{2} \sum_{i=1}^{N} e_i^T(t) e_i(t) + \sum_{i=1}^{l} \frac{c}{2\alpha_{k_i m_i}} (L_{m_i k_i}(t) + \tilde{\sigma}_{m_i k_i})^2,   (16)
where \tilde{\sigma}_{m_i k_i} = \tilde{\sigma}_{k_i m_i} are positive constants to be determined, i = 1, 2, \ldots, l. The derivative of V(t) along the trajectories of (7) gives

\dot{V} = \sum_{i=1}^{N} e_i^T(t)\Big[f(x_i(t),t) - f(\bar{x},t) - c\sum_{j=1}^{N} L_{ij}(t)\Gamma e_j(t)\Big] - c\sum_{i=1}^{l} (L_{m_i k_i}(t) + \tilde{\sigma}_{m_i k_i})(e_{k_i} - e_{m_i})^T \Gamma (e_{k_i} - e_{m_i})
 \leq e^T(t)\big[-\varepsilon(I_N\otimes I_n) + (I_N\otimes H\Gamma) - c(\bar{L}\otimes\Gamma)\big]e(t).   (17)

The proof is completed. □

In Theorem 2, a general criterion is given for reaching synchronization by using the designed adaptive law (14). Certainly, one can solve (15) by using the LMI approach. However, sometimes the condition in (15) is difficult to apply in a very large scale network. Next, we aim to simplify the condition in (15).

Corollary 1. Suppose that Assumption 1 holds. If the network \tilde{G} contains a spanning tree, the network (3) is synchronized under the following distributed adaptive law

\dot{L}_{m_i k_i}(t) = \dot{L}_{k_i m_i}(t) = -\alpha_{k_i m_i} (x_{k_i} - x_{m_i})^T \Gamma (x_{k_i} - x_{m_i}), \quad i = 1, \ldots, l,   (18)

where \alpha_{k_i m_i} = \alpha_{m_i k_i} are positive constants.

Proof. Consider the Lyapunov functional candidate in (16). Taking the derivative of V(t) along the trajectories of (7) gives

\dot{V} \leq e^T(t)\big[-\varepsilon(I_N\otimes I_n) + (I_N\otimes H\Gamma) - c(\hat{L}\otimes\Gamma) - c(\tilde{\Sigma}\otimes\Gamma)\big]e(t),   (19)

where \hat{L} and \tilde{\Sigma} are Laplacian matrices defined by \hat{L} = L - \tilde{L} and \tilde{\Sigma} = (-\tilde{\sigma}_{ij})_{N\times N} with \tilde{\sigma}_{m_i k_i} > 0 and 0 otherwise, i = 1, \ldots, l. It is easy to see that each off-diagonal entry in \hat{L} is the same as in L except that \hat{L}_{m_i k_i} = \hat{L}_{k_i m_i} = 0. Since \tilde{G} contains a spanning tree, \lambda_2(\tilde{\Sigma}) can be made sufficiently large by choosing sufficiently large positive constants \tilde{\sigma}_{k_i m_i}, i = 1, \ldots, l. Therefore, one finally gets

\dot{V} \leq e^T(t)\big[-\varepsilon(I_N\otimes I_n) + (I_N\otimes H\Gamma) - c\lambda_2(\tilde{\Sigma})(I_N\otimes\Gamma)\big]e(t).   (20)

This completes the proof. □
Remark 2. In Corollary 1, if the corresponding edges and nodes of the updated coupling weights form a spanning tree, then synchronization in network (3) can be reached. Therefore, the minimal number of updated coupling weights under this scheme is N − 1, which is much lower than updating all the coupling weights. Note that only a general scheme is proposed here, at this first stage of adaptive control on the coupling weights of complex networks; detailed pinning weights strategies will be investigated later.
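One simple way to pick N − 1 edges that satisfy the spanning-tree condition of Corollary 1 is a breadth-first traversal of the (connected) graph. The Python sketch below is ours and only illustrates the edge selection; the function name and the BFS choice are assumptions, not a scheme prescribed by the paper.

```python
import numpy as np
from collections import deque

def spanning_tree_edges(G):
    """Pick N-1 edges of a connected undirected graph (adjacency matrix G) forming a spanning tree.

    Updating only the coupling weights on these edges suffices for synchronization
    according to Corollary 1. Returns a list of (i, j) index pairs.
    """
    N = G.shape[0]
    visited = {0}
    edges = []
    queue = deque([0])
    while queue:                           # breadth-first search from node 0
        i = queue.popleft()
        for j in range(N):
            if G[i, j] > 0 and j not in visited:
                visited.add(j)
                edges.append((i, j))
                queue.append(j)
    return edges                           # exactly N-1 edges when the graph is connected
```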
4 Simulation Examples
In this section, a simulation example is provided to demonstrate the theoretical analysis in this paper.
Fig. 1. Orbits of the states x_{ij} in the network, i = 1, 2, \ldots, 100 and j = 1, 2, 3

Fig. 2. Orbits of the weights G_{ij} in the network, i, j = 1, 2, \ldots, 100
Suppose that the nonlinear dynamical system \dot{s}(t) = f(s(t), t) is described by Chua's circuit

\dot{s}_1 = \eta(-s_1 + s_2 - l(s_1)), \quad \dot{s}_2 = s_1 - s_2 + s_3, \quad \dot{s}_3 = -\beta s_2,   (21)

where l(x_1) = b x_1 + 0.5 (a - b)(|x_1 + 1| - |x_1 - 1|). The system (21) is chaotic when \eta = 10, \beta = 18, a = -4/3, and b = -3/4. In view of Assumption 1, by computation, one obtains \theta = 5.1623.
Consider an ER random network [5], and suppose there are N = 100 nodes with connection probability p = 0.15, which gives about pN(N − 1)/2 ≈ 742 connections. If there is a connection between nodes i and j, then G_{ij}(0) = G_{ji}(0) = 1 (i \neq j), i, j = 1, 2, \ldots, 100, and the coupling strength is c = 0.2. A simulation-based analysis of the random network is performed by using a random weights pinning scheme, in which one randomly selects \delta p N(N − 1)/2 weights with a small fraction \delta = 0.35 to pin. By Corollary 1, since the network G is connected, the network is synchronized under the designed distributed adaptive law. The orbits of the states x_{ij} are shown in Fig. 1 for i = 1, 2, \ldots, 100 and j = 1, 2, 3. The updated weights G_{ij} are illustrated in Fig. 2.
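The two ingredients of this example, the Chua dynamics (21) and the ER random graph, can be set up as in the following Python sketch (ours, not part of the paper; the function names and the random seed are illustrative). It could be combined with the adaptive-law sketch given after Remark 1.

```python
import numpy as np

def chua(s, t, eta=10.0, beta=18.0, a=-4.0/3.0, b=-3.0/4.0):
    """Chua's circuit (21); chaotic for the parameter values used in this example."""
    l = b * s[0] + 0.5 * (a - b) * (abs(s[0] + 1) - abs(s[0] - 1))
    return np.array([eta * (-s[0] + s[1] - l),
                     s[0] - s[1] + s[2],
                     -beta * s[1]])

def er_adjacency(N=100, p=0.15, seed=0):
    """ER random graph: each pair of nodes connected with probability p, unit initial weights."""
    rng = np.random.default_rng(seed)
    A = (rng.random((N, N)) < p).astype(float)
    A = np.triu(A, 1)
    return A + A.T                         # symmetric adjacency matrix, zero diagonal
```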
From Fig. 2, it is easy to see that the coupling weights increase only slightly above 1, a behavior that would be very hard to establish from other results in the literature for such a small coupling strength c. Therefore, by using distributed adaptive pinning weights control, one finds that synchronization can be reached in the network under very mild conditions.
5 Conclusions
In this paper, distributed adaptive control for synchronization in complex networks has been studied. Some distributed adaptive strategies have been designed on all the coupling weights of the network for reaching synchronization, which were then extended to the case where only a small fraction of coupling weights are adjusted. A general criterion was given and it was found that synchronization can be reached if the corresponding edges and nodes with updated coupling weights form a spanning tree. Some novel distributed adaptive strategies on the coupling weights of the network have been proposed. However, it is still not understood what kind of pinning schemes should be chosen for a given complex network to realize synchronization if the number of selected coupling weights is less than N − 1. The study of adaptive control on the coupling weights of complex networks is still at an early stage, and detailed pinning weights strategies will be further investigated in the future.
Acknowledgments. The author would like to thank his supervisor Prof. Guanrong Chen, Dr. Pietro DeLellis, Prof. Mario di Bernardo, and Prof. Jürgen Kurths for inspiring discussions and helpful comments.
References
1. Arenas, A., Diaz-Guilera, A., Kurths, J., Moreno, Y., Zhou, C.: Synchronization in complex networks. Physics Reports 468(3), 93–153 (2008)
2. Barabási, A.L., Albert, R.: Emergence of scaling in random networks. Science 286, 509–512 (1999)
3. Chen, T., Liu, X., Lu, W.: Pinning complex networks by a single controller. IEEE Trans. Circuits Syst. I 54(6), 1317–1326 (2007)
4. DeLellis, P., di Bernardo, M., Garofalo, F.: Novel decentralized adaptive strategies for the synchronization of complex networks. Automatica 45, 1312–1318 (2009)
5. Erdős, P., Rényi, A.: On random graphs. Pub. Math. 6, 290–297 (1959)
6. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (1985)
7. Lu, W., Chen, T.: Synchronization of coupled connected neural networks with delays. IEEE Trans. Circuits Syst. I 51(12), 2491–2503 (2004)
8. Lü, J., Chen, G.: A time-varying complex dynamical network model and its controlled synchronization criteria. IEEE Trans. Auto. Contr. 50(6), 841–846 (2005)
9. Pecora, L.M., Carroll, T.L.: Synchronization in chaotic systems. Phys. Rev. Lett. 64(8), 821–824 (1990)
10. Wang, X., Chen, G.: Synchronization in scale-free dynamical networks: robustness and fragility. IEEE Trans. Circuits Syst. I 49, 54–62 (2002)
11. Wang, X., Chen, G.: Synchronization in small-world dynamical networks. Int. J. Bifurcation and Chaos 12, 187–192 (2002)
12. Wang, X., Chen, G.: Pinning control of scale-free dynamical networks. Physica A 310, 521–531 (2002)
13. Watts, D.J., Strogatz, S.H.: Collective dynamics of 'small-world' networks. Nature 393, 440–442 (1998)
14. Wu, C.: Synchronization in arrays of coupled nonlinear systems with delay and nonreciprocal time-varying coupling. IEEE Trans. Circuits Syst. II 52(5), 282–286 (2005)
15. Wu, C., Chua, L.O.: Synchronization in an array of linearly coupled dynamical systems. IEEE Trans. Circuits Syst. I 42(8), 430–447 (1995)
16. Xiang, J., Chen, G.: On the V-stability of complex dynamical networks. Automatica 43, 1049–1057 (2007)
17. Yu, W., Cao, J., Chen, G.: Robust adaptive control of unknown modified Cohen-Grossberg neural networks with delay. IEEE Trans. Circuits Syst. II 54(6), 502–506 (2007)
18. Yu, W., Cao, J., Chen, G., Lü, J., Han, J., Wei, W.: Local synchronization of a complex network model. IEEE Trans. Systems, Man, and Cybernetics-Part B 39(1), 230–441 (2009)
19. Yu, W., Cao, J., Lü, J.: Global synchronization of linearly hybrid coupled networks with time-varying delay. SIAM Journal on Applied Dynamical Systems 7(1), 108–133 (2008)
20. Yu, W., Chen, G., Lü, J.: On pinning synchronization of complex dynamical networks. Automatica 45, 429–435 (2009)
21. Yu, W., Chen, G., Wang, Z., Yang, W.: Distributed consensus filtering in sensor networks. IEEE Trans. Systems, Man, and Cybernetics-Part B (in press)
22. Yu, W., Lü, J., Chen, G., Duan, Z., Zhou, Q.: Estimating uncertain delayed genetic regulatory networks: an adaptive filtering approach. IEEE Trans. Auto. Contr. 54(4), 892–897 (2009)
23. Zhou, C., Kurths, J.: Dynamical weights and enhanced synchronization in adaptive complex networks. Phys. Rev. Lett. 96, 164102 (2006)
24. Zhou, J., Lu, J., Lü, J.: Adaptive synchronization of an uncertain complex dynamical network. IEEE Trans. Auto. Contr. 51(4), 652–656 (2006)
25. Zhou, J., Lu, J., Lü, J.: Pinning adaptive synchronization of a general complex dynamical network. Automatica 44(4), 996–1003 (2008)
Application of Dual Heuristic Programming in Excitation System of Synchronous Generators* Yuzhang Lin and Chao Lu Department of Electrical Engineering, Tsinghua University, 10084 Beijing, China
[email protected]
Abstract. A new method, Dual Heuristic Programming (DHP), is introduced for the excitation control system of synchronous generators in this paper. DHP is one type of Approximate Dynamic Programming (ADP), which is an adaptive control method. A nonlinear function fitting method is used to approximate the performance index function of dynamic programming, so that ADP can obtain optimal control for the plant. In this paper, DHP implements the adaptive control of the excitation system of synchronous generators. Results show that the DHP method performs better than the conventional PID method in excitation control of synchronous generators: it obtains a more rapid response. Moreover, it can optimize the performance globally, which reduces the swinging of the excitation voltage and the energy consumption.
Keywords: Dual Heuristic Programming, Adaptive Dynamic Programming, Excitation System, Synchronous Generator, Neural Network.
1 Introduction

The excitation system of a generator provides the excitation current to the generator in normal operation. It also regulates the excitation current according to the load of the generator, so that the terminal voltage remains stable. This kind of regulation is called Automatic Voltage Regulation (AVR). The conventional automatic voltage regulator is a linear control method: the model is linearized at the set point, and the control algorithm is usually PID. The conventional method depends on an accurate mathematical model, and it is effective only near the set point; away from it, the performance degrades. A better control method is therefore desirable.

Approximate dynamic programming (ADP) was proposed by Werbos [1], based on dynamic programming. It is a cross-disciplinary approach involving artificial neural networks, dynamic programming, and reinforcement learning. ADP can deal with disturbances and uncertain situations. It approximates the cost function in the Bellman equation to avoid the curse of dimensionality of dynamic programming, so it can be applied to large-scale and complicated nonlinear systems. In [1], ADP approaches were classified into four main schemes: Heuristic Dynamic Programming (HDP), Dual Heuristic Programming (DHP), Action Dependent Heuristic Dynamic Programming (ADHDP), and Action Dependent Dual Heuristic Programming (ADDHP).

In this paper, Dual Heuristic Programming is applied to the excitation control system of synchronous generators, where it serves as the Automatic Voltage Regulator. The proposed control algorithm is designed for a single machine infinite bus power system, and a numerical simulation is carried out. The results show that the response of the proposed method is more rapid than that of the conventional PID method, and that it can optimize the performance globally.

* This work was supported by the Students Research Training of Tsinghua University.
2 Dual Heuristic Programming

Dynamic programming can, in principle, solve problems that take the form of a maximization or minimization in engineering and economic fields. In practice, however, the curse of dimensionality prevents its direct adoption in many real-world control problems, and DP's dependence on a numerical system model is another disadvantage. So the DP algorithm works well only for small-scale or simple problems. Approximate Dynamic Programming employs parameterized function structures to approximate the performance index function and the control strategy, under the guarantee of the principle of optimality. The optimal control and optimal performance index can thus be obtained with the ADP algorithm while the curse of dimensionality is avoided. This provides a feasible method for solving optimal control problems of large-scale and complicated nonlinear systems.

2.1 Principle of Dual Heuristic Programming

Consider the following discrete-time nonlinear system
x(k+1) = F(x(k), u(k), k)   (1)

where x \in R^n denotes the system state and u \in R^m denotes the control variables. The performance index function (cost function) is

J(x(k), k) = \sum_{t=k}^{\infty} \gamma^{t-k} U(x(t), u(t), t)   (2)

where U is the utility function, \gamma is the discount factor, and 0 < \gamma \leq 1. The function J is the cost function of the system, and it depends on the initial time step k and the initial system state x(k). Formula (2) can be transformed as follows:

J(x(k)) = U(x(k), u(k)) + \gamma J(x(k+1))   (3)

The DHP algorithm does not approximate the cost function J directly; instead it approximates the co-state \lambda(k) = \partial J(x(k)) / \partial x(k), the derivative of J with respect to the system state x. The co-state can be represented by

\frac{\partial J(x(k))}{\partial x(k)} = \frac{\partial U(x(k), u(k))}{\partial x(k)} + \gamma \frac{\partial J(x(k+1))}{\partial x(k)}.   (4)

According to the Bellman equation, the problem can be solved as follows:

J^*(x(k), k) = \min_{u(k)} \big[ U(x(k), u(k), k) + \gamma J^*(x(k+1), k+1) \big]   (5)

The optimal control must satisfy the first-order derivative condition, that is,

\frac{\partial J^*(x(k))}{\partial u(k)} = \frac{\partial U(x(k), u(k))}{\partial u(k)} + \frac{\partial J^*(x(k+1))}{\partial u(k)} = \frac{\partial U(x(k), u(k))}{\partial u(k)} + \frac{\partial J^*(x(k+1))}{\partial x(k+1)} \frac{\partial F(x(k), u(k))}{\partial u(k)}   (6)

So the optimal control is

u^* = \arg\min_{u} \left( \frac{\partial J^*(x(k))}{\partial u(k)} - \frac{\partial U(x(k), u(k))}{\partial u(k)} - \frac{\partial J^*(x(k+1))}{\partial x(k+1)} \frac{\partial F(x(k), u(k))}{\partial u(k)} \right)   (7)
2.2 Neural Network Implementation for DHP
The structure of the neural network implementation of DHP is shown in Fig. 1. It contains three neural networks: the Action Network, the Model Network, and the Critic Network.
Fig. 1. The structure of DHP
The Action Network is similar to a conventional controller: its input is the system state x(k) and its output is the current control variable u(k). The Model Network estimates the system state at the next time step; its inputs are the system state x(k) and the control variable u(k), and its output is the estimated next state. The input of the Critic Network is the system state x(k), and its output \hat{\lambda}(k) is the estimate of the co-state \lambda(k) = \partial J(x(k)) / \partial x(k). The three networks can be replaced by any other parameterized function structures, such as polynomials, neural networks, fuzzy logic, and so on. In this paper, each is a three-layered feed-forward neural network. Assume the number of neurons in the hidden layer is denoted by l, the weight matrix between the input layer and the hidden layer is denoted by V, and the weight matrix between the hidden layer and the output layer is denoted by W. This network can be represented by

F(X, V, W) = W^T \sigma(V^T X)   (8)

where \sigma(V^T X) \in R^l and [\sigma(z)]_i = \frac{e^{z_i} - e^{-z_i}}{e^{z_i} + e^{-z_i}}, i = 1, \ldots, l, are the activation functions.

2.2.1 Model Network
The functions of the model network are to estimate the system state of the next time step and to transmit the learning error to the other networks, so it must approximate the plant as closely as possible. Before the whole algorithm runs, the model network should be trained with a large amount of data containing rich information about the plant. In addition, the model network can also be trained online, so that it can adapt to the system dynamics when the set point or the environment changes. The error function is defined as
e_M(k+1) = x(k+1) - \hat{x}(k+1)   (9)

E_M(k+1) = \frac{1}{2} e_M^T(k+1) e_M(k+1)   (10)

where \hat{x}(k+1) is the output of the model network. According to the gradient-based algorithm minimizing the defined error function E_M, the weight update rule for the model network is given by

w_M(k+1) = w_M(k) + \Delta w_M(k)   (11)

\Delta w_M(k) = l_M \left[ -\frac{\partial E_M(k+1)}{\partial w_M(k+1)} \right]   (12)

where l_M is the learning rate of the model network and w_M is the weight vector of the model network.
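The following numpy sketch (ours, not part of the paper) shows one gradient step of the model network update (9)-(12), assuming a single hidden layer with the bipolar activation (8) and a linear output layer; the function and variable names are illustrative.

```python
import numpy as np

def sigma(z, lam=1.0):
    # Bipolar activation: 2/(1 + exp(-lam*z)) - 1  (equivalent to tanh(lam*z/2))
    return 2.0 / (1.0 + np.exp(-lam * z)) - 1.0

def model_network_step(x, u, x_next, V, W, lr=0.01, lam=1.0):
    """One gradient step of the model network, following eqs. (9)-(12).

    The network predicts x(k+1) from (x(k), u(k)); V and W are the input-to-hidden
    and hidden-to-output weight matrices of a three-layered feed-forward net.
    """
    inp = np.concatenate([x, u])
    h = sigma(V.T @ inp, lam)                   # hidden-layer output
    x_hat = W.T @ h                             # linear output layer: predicted next state
    e = x_next - x_hat                          # eq. (9), prediction error e_M(k+1)
    # Gradient of E_M = 0.5 * e^T e, applied as Delta w = -lr * dE_M/dw (eq. (12))
    W += lr * np.outer(h, e)
    dz = (W @ e) * 0.5 * lam * (1.0 - h ** 2)   # error backpropagated to the hidden layer
    V += lr * np.outer(inp, dz)
    return 0.5 * float(e @ e)                   # current value of E_M, for monitoring
```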
2.2.2 Critic Network
The critic network is trained to approximate the co-state \partial J(x(k)) / \partial x(k), the derivative of the cost function J. For a controllable system, if the neural network converges to the cost function correctly, the controller can keep the system stable. In order to obtain an approximation of the co-state, define the error of the network as

e_C(k) = \lambda(k) - \frac{\partial U(x(k), u(k))}{\partial x(k)} - \gamma \frac{\partial J(x(k+1))}{\partial x(k)}   (13)

E_C = \frac{1}{2} e_C^T(k) e_C(k)   (14)

The critic network is trained by minimizing the objective function E_C, where

\frac{\partial U(x(k), u(k))}{\partial x(k)} = \frac{\partial U(x(k))}{\partial x(k)} + \frac{\partial U(u(k))}{\partial u(k)} \frac{\partial u(k)}{\partial x(k)}   (15)

\frac{\partial J(x(k+1))}{\partial x(k)} = \frac{\partial J(x(k+1))}{\partial \hat{x}(k+1)} \left( \frac{\partial \hat{x}(k+1)}{\partial x(k)} + \frac{\partial \hat{x}(k+1)}{\partial u(k)} \frac{\partial u(k)}{\partial x(k)} \right) = \lambda(k+1) \frac{\partial \hat{x}(k+1)}{\partial x(k)} + \lambda(k+1) \frac{\partial \hat{x}(k+1)}{\partial u(k)} \frac{\partial u(k)}{\partial x(k)}   (16)

According to the gradient-based algorithm, the weight update rule for the critic network is given by

w_C(k+1) = w_C(k) + \Delta w_C(k)   (17)

\Delta w_C(k) = l_C \left[ -\frac{\partial E_C(k)}{\partial w_C(k)} \right] = l_C \left[ -\frac{\partial E_C(k)}{\partial e_C(k)} \frac{\partial e_C(k)}{\partial \lambda(k)} \frac{\partial \lambda(k)}{\partial w_C(k)} \right]   (18)

where l_C is the learning rate of the critic network and w_C is the weight vector of the critic network.

2.2.3 Action Network
In order to obtain optimal performance, DHP obtains the optimal control by training the action network, whose objective is to minimize the cost function J. Because the function J is positive, zero is its minimum. The error function is therefore defined as
E_A = J(t) - 0 = J(t)   (19)

According to the gradient-based algorithm, the weight update rule for the action network is given by

w_A(k+1) = w_A(k) + \Delta w_A(k)   (20)

\Delta w_A(k) = l_A \left[ -\frac{\partial J(k)}{\partial w_A(k)} \right]   (21)

where l_A is the learning rate of the action network and w_A is the weight vector of the action network.

2.3 Procedure of DHP Method
Given the above preparation, the DHP method is summarized as follows:
Step 1: Given the computation precision ε, initialize the weights of each network.
Step 2: According to the model network weight update rule, train the model network.
Step 3: Do the forward computations of the action network, model network, and critic network.
Step 4: According to the critic network weight update rule, train the critic network.
Step 5: Do the forward computation of the critic network.
Step 6: According to the action network weight update rule, train the action network.
Step 7: Jump to Step 3, until the desired performance is satisfied.
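The Python outline below (ours, not part of the paper) mirrors Steps 2-7 as a training loop. It is a structural sketch only: `plant`, `model_net`, `critic_net`, and `action_net` are hypothetical objects assumed to expose `forward()`, `train()`, and `last_error()` methods; these names and interfaces are our assumptions, not APIs defined in the paper.

```python
def dhp_train(plant, model_net, critic_net, action_net, x0, episodes=100, horizon=500, eps=1e-4):
    """Skeleton of the DHP procedure of Section 2.3 (assumed object interfaces, see text)."""
    for _ in range(episodes):
        x = x0.copy()
        for k in range(horizon):
            u = action_net.forward(x)                      # Step 3: forward pass of the actor
            x_next_hat = model_net.forward(x, u)           # Step 3: model prediction
            x_next = plant.step(x, u)                      # measured next state
            model_net.train(x, u, x_next)                  # Step 2: refine the model online
            lam_next = critic_net.forward(x_next_hat)      # Step 5: co-state estimate at k+1
            critic_net.train(x, u, lam_next, model_net)    # Step 4: critic update, eqs. (13)-(18)
            action_net.train(x, critic_net, model_net)     # Step 6: actor update, eqs. (19)-(21)
            x = x_next
        if critic_net.last_error() < eps and action_net.last_error() < eps:
            break                                          # Step 7: stop at the desired precision
```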
3 The Application of DHP in the Excitation System of Synchronous Generators

3.1 Single Machine Infinite Bus Power System

In order to facilitate the analysis, the power system is simplified into a single machine infinite bus system in this paper. The single machine infinite bus system includes the synchronous generator, the turbine, the speed governor, the transmission line connected to the infinite bus, and the exciter with its AVR excitation control system.
In order to facilitate the analysis, the power system is simplified into a single machine infinite bus system in this paper. The single machine infinite bus system includes synchronous generator, turbine, speed governor, transmission line connected to infinite bus and exciter and AVR excitation control system. ΔP
–
Pref
+
Σ
turbine
governor
Pm
Δω
Synchronous generator
Z=Re+jXe
Vfd
exciter
Ve
Um
Vt
– AVR
Σ
+
Vref
Fig. 2. Single machine infinite bus power system
In Fig. 2, Re and Xe represent the transmission line parameters, Δω the speed variation with respect to the operating speed, Um the voltage of the infinite bus, Vt the terminal voltage of the generator, Ve the input voltage of the exciter, Vfd the output voltage of the exciter, Vref the reference terminal voltage, Pm the mechanical power, and Pref the reference mechanical power.
3.2 The Excitation Controller Design Based on DHP

In this paper, the DHP method replaces the conventional PID AVR of the excitation system. The three networks are three-layered feed-forward neural networks; the activation function of the hidden layer is the bipolar sigmoidal function, and the activation function of the output layer is the pure linear function. The training strategy is as follows: first train the model network, then train the action network and critic network simultaneously. The most important objective of the AVR is to make the terminal voltage track the reference terminal voltage quickly and then maintain the voltage at the set point. The utility function of the system is therefore defined as

U(t) = \big[ 0.4 \Delta V_t(t) + 0.4 \Delta V_t(t-1) + 0.16 \Delta V_t(t-2) \big]^2   (22)
3.2.1 The Design of the Model Network
The input of the model network is selected as the voltage variation with respect to the reference terminal voltage, \Delta V_t, the excitation adjusting voltage, \Delta V_e, and two time steps of their delayed values, that is, [\Delta V_t(k), \Delta V_t(k-1), \Delta V_t(k-2), \Delta V_e(k), \Delta V_e(k-1), \Delta V_e(k-2)]^T. The output is the terminal voltage variation at the next time step, \Delta V_t(k+1). There are 15 neurons in the hidden layer, so the structure of the model network is 6-15-1.

Fig. 3. The topology of the model network
First, define the input of the model network as Min = [\Delta V_t(k), \Delta V_t(k-1), \Delta V_t(k-2), \Delta V_e(k), \Delta V_e(k-1), \Delta V_e(k-2)]^T. The forward computation is

Mh1_j(k) = \sum_{i=1}^{6} Min_i(k)\, Wm1_{ij}(k), \qquad Mh2_j(k) = \frac{1 - e^{-Mh1_j(k)}}{1 + e^{-Mh1_j(k)}}, \qquad \Delta\hat{V}_t(k+1) = \sum_{j=1}^{15} Mh2_j(k)\, Wm2_j(k)   (23)

where Mh1 is the output vector of the input layer, Mh2 is the output vector of the hidden layer, Wm1 is the weight matrix between the input and hidden layers, and Wm2 is the weight matrix between the hidden and output layers. The update rule of the model network is

\Delta Wm2(k) = l_M(k)\, Mh2^T(k)\, e_M(k+1)   (24)

\Delta Wm1(k) = \frac{1}{2} l_M(k)\, Min^T(k) \big\{ [e_M(k+1)\, Wm2^T(k)] \cdot [1 - Mh2(k) \cdot Mh2(k)] \big\}   (25)

where "\cdot" denotes the entrywise product.
Let the input of the critic network Cin = ⎡⎣ ΔVt ( t ) ΔVt ( t − 1) ΔVt ( t − 2 ) ⎤⎦ , then T
the forward computation is 3
Ch1 j ( k ) = ∑ Cin ( k ) Wc1ij ( k ) i =1
Ch2 j ( k ) =
1− e
− Ch1 j ( k )
1+ e
− Ch1 j ( k )
(26)
8
Jˆ ( k ) = ∑ Ch2 j ( k ) Wc 2 j ( k ) j =1
Where Ch1 is the output vector of input layer. Ch2 is the output vector of hidden layer. Wc1 is the weight matrix between input and hidden layer. Wc 2 is the weight matrix between hidden and output layer.
The update rule of the critic network is

\Delta Wc2(k) = -l_C(k)\, Ch2^T(k)\, e_C(k)   (27)

\Delta Wc1(k) = -\frac{1}{2} l_C(k)\, \Delta V_t^T(k) \big\{ [e_C(k)\, Wc2^T(k)] \cdot [1 - Ch2(k) \cdot Ch2(k)] \big\}

where "\cdot" denotes the entrywise product.
3.2.3 The Design of the Action Network
The input of the action network is the same as that of the critic network. The output is the excitation adjusting voltage \Delta V_e(k), which is the control variable. The hidden layer also has 8 neurons, so the structure of the action network is 3-8-1. Let Ain = [\Delta V_t(k), \Delta V_t(k-1), \Delta V_t(k-2)]^T; then the forward computation is

Ah1_j(k) = \sum_{i=1}^{3} Ain_i(k)\, Wa1_{ij}(k), \qquad Ah2_j(k) = \frac{1 - e^{-Ah1_j(k)}}{1 + e^{-Ah1_j(k)}}, \qquad \Delta V_e(k) = \sum_{j=1}^{8} Ah2_j(k)\, Wa2_j(k)   (28)

where Ah1 is the output vector of the input layer, Ah2 is the output vector of the hidden layer, Wa1 is the weight matrix between the input and hidden layers, and Wa2 is the weight matrix between the hidden and output layers. The update rule of the action network is
\Delta Wa2(k) = -l_A(k)\, Ah2^T(k) \left( 2\Delta V_e(k) + \gamma \frac{\partial J(k+1)}{\partial \Delta V_e(k)} \right)   (29)

\frac{\partial J(k+1)}{\partial u(k)} = \frac{1}{2} \lambda(k+1)\, Wm2^T(k)\, Wm1_u(k) \cdot \big[ 1 - Mh2(k) \cdot Mh2(k) \big]   (30)

\Delta Wa1(k) = -\frac{1}{2} l_A(k)\, \Delta V_t^T(k) \left( 2\Delta V_e(k) + \frac{\gamma}{2} \lambda(k+1)\, Wm2^T(k) \big( Wm1_u(k) \cdot (1 - Mh2(k) \cdot Mh2(k)) \big)^T \right) \big( Wa2(k) \cdot (1 - Ah2(k) \cdot Ah2(k)) \big)^T   (31)

where "\cdot" denotes the entrywise product.
3.3 Simulation and Results
According to the description in the previous sections, a numerical simulation is set up in the MATLAB/SIMULINK environment, containing a DHP controller module and the single machine infinite bus power system module. A conventional PID simulation is set up as well. In the single machine infinite bus system, the parameters of the generator are as follows: Pn = 200 MVA, Vn = 16.8 kVrms, fn = 50 Hz. All values in the simulation are expressed in standard per unit (p.u.). The simulation places the system in a fault condition to test the DHP-based excitation controller; the fault is a three-phase short circuit during the time span 6.0 s – 6.1 s. The results compare the control performance of the DHP-based controller and the conventional PID-based controller and are shown in Fig. 4, Fig. 5, and Fig. 6.
Fig. 4. The terminal voltage with three phase fault

Fig. 5. The speed variation with three phase fault

Fig. 6. The excitation adjusting voltage with three phase fault
Fig. 4 shows the terminal voltage response when the system experiences the three-phase fault. During the fault, the terminal voltage drops significantly. When the fault is cleared, the DHP-based controller recovers the voltage more rapidly than the conventional PID controller. Fig. 5 shows the speed variation response under the fault; it also shows that the DHP-based controller has a slightly more rapid response than the conventional PID. Fig. 6 shows the controller output, the excitation adjusting voltage. The DHP-based controller responds more rapidly than the conventional PID, and its output does not swing up, which reduces actuator wear and energy consumption.
4 Conclusions

This paper introduced a DHP-based automatic voltage regulator to replace the conventional PID-based one. DHP does not depend on an accurate numerical model of the system; it learns the system dynamics from existing data. In addition, the model network can be updated online to learn new dynamics caused by disturbances and environment changes, and the action network also adapts to environment changes. As the results show, the control signal does not swing up. This advantage shows that the DHP method can optimize the system performance globally, and it also reduces actuator wear and energy consumption.
References
1. Werbos, P.J.: Approximate dynamic programming for realtime control and neural modeling. In: White, D.A., Sofge, D.A. (eds.) Handbook of Intelligent Control. Van Nostrand Reinhold, New York (1992)
2. Liu, W., Venayagamoorthy, G.K., Wunsch, D.C.: Design of Adaptive Neural Network Based Power System Stabilizer. Neural Networks 16, 891–898 (2003)
3. Wang, F.-Y., Zhang, H., Liu, D.: Adaptive Dynamic Programming: An Introduction. Computational Intelligence Magazine, 39–47 (2009)
4. Seong, C.-Y., Widrow, B.: Neural Dynamic Optimization for Control Systems, Part I: Background. IEEE Trans. on Systems, Man and Cybernetics, Part B 31(4), 482–489 (2001)
5. Liu, D., Xiong, X., Zhang, Y.: Action-dependent adaptive critic designs. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN 2001), July 15-19, vol. 2, pp. 990–995 (2001)
6. Si, J., Wang, Y.-T.: On-line learning control by association and reinforcement. IEEE Trans. on Neural Networks 12(2), 264–276 (2001)
7. Hunt, K.J., Sbarbaro, D., Zbikowski, R., Gawthrop, P.J.: Neural networks for control systems: a survey. Automatica 28(6), 1083–1112 (1992)
8. Prokhorov, D., Wunsch, D.: Adaptive critic designs. IEEE Transactions on Neural Networks 8(5), 997–1007 (1997)
9. Mohagheghi, S., del Valle, Y., Venayagamoorthy, G.K.: A Proportional-Integrator Type Adaptive Critic Design-Based Neurocontroller for a Static Compensator in a Multimachine Power System. IEEE Transactions on Industrial Electronics 54(1), 86–96 (2007)
10. Wei, Q., Zhang, H., Liu, D.: An Optimal Control Scheme for a Class of Discrete-Time Nonlinear Systems with Time Delays Using Adaptive Dynamic Programming. Acta Automatica Sinica (2008)
11. Lu, C., Si, J., et al.: Direct heuristic dynamic programming for damping oscillations in a large power system. IEEE Trans. Syst., Man, Cybern. B 38(4), 1008–1013 (2008)
A Neural Network Method for Image Resolution Enhancement from a Multiple of Image Frames

Shuangteng Zhang1 and Ezzatollah Salari2

1 Department of Computer Science, Eastern Kentucky University, Richmond, KY 40475
2 Department of Electrical Eng. & Comp. Science, University of Toledo, Toledo, OH 43607
Abstract. This paper presents a neural network based method to estimate a high resolution image from a sequence of low resolution image frames. In this method, a multiple of layered feed-forward neural networks is specially designed to learn an optimal mapping from a multiple of low resolution image frames to a high resolution image through training. Simulation results demonstrate that the proposed neural networks can successfully learn the mapping in the presented examples and show good potential in solving the image resolution enhancement problem, as they can adapt themselves to various changing conditions. Furthermore, due to the inherently parallel computing structure of neural networks, this method can be applied in real time, alleviating the computational bottlenecks associated with other methods and providing a more powerful tool for image resolution enhancement.
Keywords: Image resolution, image enhancement, neural network, image sequence analysis.
1 Introduction

Super-resolution images are very important in many image processing applications. One way to obtain such high-resolution images is to apply image reconstruction or restoration techniques to reconstruct or restore the needed high-resolution images from the corresponding low-resolution ones. From a mathematical point of view, this reconstruction or restoration problem is highly ill-posed when an improved resolution must be obtained from a single low-resolution image. However, the problem admits a more stable solution when a sequence of images of the same object or scene is available. A common technique is to use the relative motions between consecutive low-resolution image frames and reconstruct a high-resolution image from a multiple of low-resolution images.
Several approaches to enhancing the image resolution using a multiple of low-resolution images have been proposed in recent years. Tsai and Huang [1] introduced a Discrete Fourier Transform (DFT) based algorithm to reconstruct a high-resolution image from a multiple of input image frames, and the method was further enhanced by Kim [2] to handle the noise and the blurring introduced by the imaging device. Stark and Oskoui proposed a convex projection method to obtain high-resolution images from coarse detector data [3]. Hardie et al. [4] presented a method in which translational and rotational motion is considered and a high-resolution image is obtained by minimizing a cost function in an iterative algorithm, and this method was further
enhanced by exploiting non-global scene motion [5]. Freeman et al. [6] developed a learning-based technique in which fine details of image regions are learned from low-resolution examples in a training set; the learned relationships are then used to predict the fine details in other images. Techniques based on maximum a posteriori or maximum likelihood estimation have also been used for image resolution improvement [7], [8]. Most of the methods mentioned above are expressed in terms of an optimization problem and try to minimize some sort of cost function. Solutions to such optimization problems require intensive computations and are very time consuming.
In this paper, a neural network based method is proposed to estimate a high-resolution image from a sequence of low-resolution image frames. This method uses a multiple of layered feed-forward neural networks that is particularly designed to be capable of learning an optimal mapping from a multiple of low-resolution image frames to a high-resolution image through training. Simulation results demonstrate that the proposed neural network has good potential in solving the image resolution enhancement problem, as it can adapt itself to various changing conditions. Furthermore, due to the inherently massively parallel computing structure of neural networks, this new method can be used for real-time application, alleviating the computational bottlenecks associated with other methods and providing a more powerful tool for image resolution enhancement.
2 Problem Statement

When the detector array of the imaging system is not sufficiently dense to meet the Nyquist criterion, the obtained digital images are undersampled and have a lower resolution. Assuming that the downsampling factor is L, a simple model of obtaining a low-resolution image of size (N1, N2) from a high-resolution image of size (L x N1, L x N2) can be expressed by the following equation,

g(m, n) = \sum_{k=1}^{L \times N_1} \sum_{l=1}^{L \times N_2} h_{m,n,k,l} \, z(k, l) + \eta(m, n)   (1)

where g(m, n) and z(k, l) are the low-resolution and the high-resolution images, respectively, h_{m,n,k,l} is the contribution of the (k, l)th pixel in the high-resolution image to the (m, n)th pixel in the low-resolution image, and \eta(m, n) is the Gaussian noise added to the (m, n)th pixel in the low-resolution image. The problem here is to recover the high-resolution image z from P low-resolution images g_p, p = 1, 2, \ldots, P.
A neural network method is proposed to solve this problem. This method takes advantage of the powerful learning ability of feed-forward neural networks and of the relative motions between the low-resolution images to recover the information lost due to undersampling and, therefore, generate the reconstructed high-resolution image. Assume that the relative motion between a reference low-resolution image g_r (which can be any one of the low-resolution images) and each other image g_p is represented by the rotational and translational parameter triplet (\theta_p, h_{px}, h_{py}). The details of the proposed neural network method are provided in the following sections.
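A simple way to synthesize low-resolution frames according to model (1) is sketched below in Python (ours, not part of the paper). It assumes that the weights h_{m,n,k,l} simply average each L x L block, which is only one common special case of (1), and that the image dimensions are divisible by L; the function name and parameters are illustrative.

```python
import numpy as np

def degrade(z, L=4, noise_sigma=1.0, rng=None):
    """Generate one low-resolution frame from a high-resolution image z via model (1),
    assuming L x L block averaging for h_{m,n,k,l} and additive Gaussian noise eta(m, n)."""
    rng = np.random.default_rng() if rng is None else rng
    H, W = z.shape
    g = z.reshape(H // L, L, W // L, L).mean(axis=(1, 3))    # block averaging (downsampling)
    return g + rng.normal(0.0, noise_sigma, g.shape)          # add eta(m, n)
```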
3 The Proposed Neural Network Structure

The proposed neural network is a multiple-layered feed-forward network and can be functionally divided into three sub-networks: the preliminary resolution enhancement (PRE), additional information (AI), and output networks. Figure 1 shows the overall diagram of the proposed network. Note that Ia and Ib denote the two sets of external inputs for the preliminary resolution enhancement and additional information sub-networks, respectively, and O represents the set of outputs of the whole network. The PRE network computes high-resolution image pixel information from the reference low-resolution image. The AI network is constructed to extract additional high-resolution image information from the other low-resolution images. The output network combines the computed high-resolution image pixel information with the additional information from the AI network to obtain the expected high-resolution image.

Fig. 1. Diagram of the proposed neural network
3.1 The PRE Network

As mentioned above, the PRE network is used to compute preliminary high-resolution image pixel information from the reference low-resolution image. This function can be achieved by using a feed-forward neural network: the learning capability of a multi-layered feed-forward network makes it a universal approximator, and with a proper structure design it can be trained to learn the optimal mapping from low-resolution image pixels to high-resolution image pixels. The proposed PRE network is designed as a fully connected three-layered feed-forward neural network, as shown in Figure 2. Assume that I_k (k = 1, 2, \ldots, 9) denotes the k-th pixel value of a 3x3 window in the reference low-resolution image frame g_r; the network takes the vector Ia = (I_1, I_2, \ldots, I_9) as input and generates M = L^2 high-resolution image pixel values y_k^a (k = 1, 2, \ldots, M) corresponding to the central pixel of the 3x3 window in the reference image g_r. The behavior of the network can be formulated as
y_k^a = f\Big( \sum_{j=1}^{Q} u_{jk}^a \, f\Big( \sum_{i=1}^{9} v_{ij}^a I_i \Big) \Big), \quad k = 1, 2, \ldots, M   (2)

Fig. 2. Neural network structure of the PRE network
where y_k^a is the output of the k-th neuron in the output layer, u_{jk}^a is the weight of the connection between the j-th neuron in the hidden layer and the k-th neuron in the output layer, v_{ij}^a is the connection weight between the i-th neuron in the input layer and the j-th neuron in the hidden layer, and Q is the number of neurons in the hidden layer. The activation function of the neurons, f(.), is

f(x) = 2 / (1 + e^{-\lambda x}) - 1   (3)
where \lambda > 0 is the neuron activation function coefficient determining the steepness of the function.

3.2 The AI Network

Using only the PRE network cannot generate a high-quality high-resolution image, since not enough information can be obtained from the reference low-resolution image alone to recover the information lost due to the undersampling. The AI network makes use of the relative motion between the reference image g_r and the other low-resolution images g_p to extract additional information for the further enhancement of the image resolution in the output network. The proposed AI network structure is shown in Figure 3. It is also a fully connected three-layered feed-forward network. The input of the AI network consists of P-1 vectors, each having the elements I_{p1}, I_{p2}, \ldots, I_{p9}, \theta_p, h_{px}, and h_{py}. Note that I_{pk} (k = 1, 2, \ldots, 9) is the k-th pixel value of a 3x3 window in the p-th low-resolution image, and \theta_p, h_{px}, and h_{py} are the rotational and translational motion parameters between the p-th low-resolution image and the reference image g_r. Representing the input of the AI network by the vector Ib = (I_1^b, I_2^b, \ldots, I_{12(P-1)}^b) = (I_{11}, I_{12}, \ldots, I_{19}, \theta_1, h_{1x}, h_{1y}, I_{21}, I_{22}, \ldots, I_{29}, \theta_2, h_{2x}, h_{2y}, \ldots, I_{(P-1)1}, I_{(P-1)2}, \ldots, I_{(P-1)9}, \theta_{(P-1)}, h_{(P-1)x}, h_{(P-1)y}), the output of the network can be expressed as
Fig. 3. Neural network structure of the AI network

y_k^b = f\Big( \sum_{j=1}^{R} u_{jk}^b \, f\Big( \sum_{i=1}^{12(P-1)} v_{ij}^b I_i^b \Big) \Big), \quad k = 1, 2, \ldots, M   (4)
where y_k^b is the output of the k-th neuron in the output layer, u_{jk}^b is the connecting weight between the j-th neuron in the hidden layer and the k-th neuron in the output layer, v_{ij}^b is the weight of the connection between the i-th neuron in the input layer and the j-th neuron in the hidden layer, R and M are the numbers of neurons in the hidden and output layers, respectively, and P is the number of low-resolution images. The activation function of the neurons is the same as in equation (3).
The AI network requires the motion parameters between the reference low-resolution image g_r and the other images g_p as part of the input. In practical applications, however, these parameters are usually unknown. There exist some approaches for estimating these parameters [9], [10]. Here we propose a neural network to estimate the motion parameters and incorporate it into the AI network. We use P-1 neural networks, each calculating the parameters (\theta_p, h_{px}, h_{py}) for one of the P-1 low-resolution images and all having the same structure, as shown in Figure 4. As in the PRE and AI networks, assume that I_k and I_{pk} (k = 1, 2, \ldots, 9) denote the k-th pixel value of a 3x3 window in the reference low-resolution image frame g_r and in the p-th low-resolution image frame g_p, respectively. The output Y_c = (y_1^c, y_2^c, y_3^c) = (\theta_p, h_{px}, h_{py}) of the motion parameter network with input I_c = (I_1^c, I_2^c, \ldots, I_{18}^c) = (I_1, I_2, \ldots, I_9, I_{p1}, I_{p2}, \ldots, I_{p9}) can be obtained by using the following equation,
y kc
S
18
j 1
i 1
f (¦ u cjk f (¦ vijc I ic )), k
1, 2, 3
(5)
where u^c_jk and v^c_ij are, as in equation (4), the connection weights between the hidden and output layers and between the input and hidden layers, respectively, and S is the number of neurons in the hidden layer.
Fig. 4. Neural network structure of the motion parameter network
3.3 Output Network The output network is a two-layered feed-forward network, as shown in Figure 5. It combines the outputs of the PRE and AI networks as the input vector, (y^a_1, y^a_2, …, y^a_M, y^b_1, y^b_2, …, y^b_M), to generate a set of high-resolution image pixel values, Z = (z1, z2, …, zM). The output zm can be calculated by the following equation,
z_m = f( y^a_m w_{mm} + ∑_{k=1}^{M} w_{(P+k)m} y^b_k )    (6)
Note that w_{km} is the connecting weight between the k-th neuron in the input layer and the m-th neuron in the output layer.
Fig. 5. The neural network structure of the output network
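To make the data flow through the three sub-networks concrete, the following Python sketch wires a PRE-style and an AI-style two-layer pass into the single-layer output network of Fig. 5. It is only an illustration of equations (3)–(6): the layer sizes, λ, and the random weights are placeholder assumptions, and the output layer is written with a generic weight matrix rather than the specific w_{mm} / w_{(P+k)m} pattern of equation (6).

```python
import numpy as np

def f(x, lam=1.0):
    # Activation of Eq. (3): f(x) = 2 / (1 + exp(-lambda * x)) - 1
    return 2.0 / (1.0 + np.exp(-lam * x)) - 1.0

def sub_network(x, V, U, lam=1.0):
    # Three-layer feed-forward pass of the PRE / AI / motion-parameter form:
    # y_k = f( sum_j U[j, k] * f( sum_i V[i, j] * x_i ) )
    return f(U.T @ f(V.T @ x, lam), lam)

# Placeholder sizes: P = 16 frames, M = 16 output pixels, Q/R hidden neurons
P, M, Q, R = 16, 16, 32, 32
rng = np.random.default_rng(0)

ya = sub_network(rng.standard_normal(9),             # 3x3 reference window
                 rng.standard_normal((9, Q)),
                 rng.standard_normal((Q, M)))
yb = sub_network(rng.standard_normal(12 * (P - 1)),  # AI input of Eq. (4)
                 rng.standard_normal((12 * (P - 1), R)),
                 rng.standard_normal((R, M)))

W = rng.standard_normal((2 * M, M))                  # output-layer weights
z = f(W.T @ np.concatenate([ya, yb]))                # M high-resolution pixel values
```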
4 Training the Proposed Network The proposed neural network consists of three multi-layered feed-forward sub-networks. These three networks work together to form one large feed-forward network that reconstructs a high-resolution image from multiple low-resolution images. The capability of the network is achieved by training it to learn the optimal mapping from a set of low-resolution image frames to the high-resolution image through adjusting the neuron connection weights. The standard Error Back-Propagation Training algorithm (EBPT) [11] can be used for the training. EBPT is a gradient descent algorithm that minimizes a cost function. The following error functions are used for the training,
E_a = (1/2) ∑_{i=1}^{M} (O_i − z_i)² + (γ/2) ∑_{i=1}^{M} ( ∑_{j=1}^{M} α_{ij} z_j )²    (7)
E_b = (1/2) ∑_{i=1}^{M} (O_i − y^a_i)² + (γ/2) ∑_{i=1}^{M} ( ∑_{j=1}^{M} α_{ij} y^a_j )²    (8)
E_c = (1/2) ∑_{i=1}^{3} (D_i − y^c_i)²    (9)
where O_i is the expected/desired output of the proposed network, D_i is the desired output of the motion parameter network, and z_i, y^a_i and y^c_i are the observed outputs of the whole network, the PRE sub-network and the motion parameter network, respectively. Note that in equations (7) and (8) the second term is the sum of the squared Laplacian of the generated high-resolution pixel values. By adding this term to (7) and (8), the training searches for the optimal solution in a reduced search space and thus yields a better training result. The coefficient α_ij is used for calculating the Laplacian of each generated high-resolution image pixel, and the coefficient γ, which balances the two terms, is set to a small value. As mentioned earlier, M is the number of neurons in the output layer of the output network, where M = L² (L is the downsampling factor).
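As an illustration of the regularized error in equation (7), the sketch below computes E_a for a block of M = L² output pixels. It is not the paper's code: the 4-neighbour Laplacian stencil used for α_ij and the value of γ are assumptions.

```python
import numpy as np

def laplacian_coefficients(L):
    # alpha_ij: an assumed 4-neighbour Laplacian stencil over the L x L
    # block of generated high-resolution pixels (M = L * L outputs).
    M = L * L
    A = np.zeros((M, M))
    for r in range(L):
        for c in range(L):
            i = r * L + c
            A[i, i] = -4.0
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                if 0 <= r + dr < L and 0 <= c + dc < L:
                    A[i, (r + dr) * L + (c + dc)] = 1.0
    return A

def error_Ea(O, z, alpha, gamma=1e-3):
    # Eq. (7): 0.5 * sum_i (O_i - z_i)^2 + 0.5 * gamma * sum_i (sum_j alpha_ij * z_j)^2
    return 0.5 * np.sum((O - z) ** 2) + 0.5 * gamma * np.sum((alpha @ z) ** 2)

L = 4
alpha = laplacian_coefficients(L)
O = np.random.rand(L * L)     # desired high-resolution pixel values
z = np.random.rand(L * L)     # observed network output
print(error_Ea(O, z, alpha))
```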
5 Preliminary Experimental Results The proposed neural network was implemented. In our experiments, we used 16 low-resolution images to generate a high-resolution image whose size is 4 times that of a low-resolution image in each dimension. Therefore, P equals 16, the downsampling factor L = 4, and M = L² = 16. Synthetically generated image frames were used in our experiments. To generate the low-resolution images synthetically, we used an eight-bit gray-level image of size 512 x 512. At first, sixteen sets of motion parameters (θp, hpx, hpy) with subpixel magnitudes were randomly generated. Then, part of the original image was rotated and translated using the sixteen sets of motion parameters, respectively, to obtain sixteen variations of high-resolution images of size 320 x 320. These sixteen images
were further blurred by a 5 x 5 moving-average filter and then decimated by a factor of 4 to produce sixteen low-resolution images of size 80 x 80. Following this procedure, two sets of sixteen low-resolution images and their corresponding high-resolution images were generated for the 'lenna' and 'boat' images, respectively. The original high-resolution images, the generated low-resolution images, and the generated motion parameter sets were used for training the neural network. After the training process, the network was used to reconstruct the high-resolution image from the sixteen low-resolution lenna images. Figures 6, 7, 8 and 9 show the original high-resolution image, a single low-resolution image, and the lenna images reconstructed using bilinear interpolation and the proposed neural network, respectively.
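A rough sketch of this synthetic data generation is given below (assuming SciPy; the rotation angle is taken in degrees, and the crop location and motion magnitudes are placeholder choices, not the values used in the paper).

```python
import numpy as np
from scipy.ndimage import rotate, shift, uniform_filter

def make_low_res(hr, theta_deg, hx, hy, factor=4, crop=320):
    # Rotate/translate part of the high-resolution image, blur it with a
    # 5x5 moving-average filter, then decimate by `factor`.
    moved = shift(rotate(hr, theta_deg, reshape=False, order=1), (hy, hx), order=1)
    patch = moved[:crop, :crop]                 # 320 x 320 high-resolution variation
    blurred = uniform_filter(patch, size=5)     # 5 x 5 moving average
    return blurred[::factor, ::factor]          # 80 x 80 low-resolution frame

rng = np.random.default_rng(0)
hr = rng.random((512, 512))                     # stand-in for the 512 x 512 test image
params = rng.uniform(-1.0, 1.0, size=(16, 3))   # sixteen (theta, hx, hy) triples
low_res = [make_low_res(hr, *p) for p in params]
```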
Fig. 6. Original high-resolution image
Fig. 7. A single low-resolution image
Fig. 8. Reconstructed lenna image using bilinear interpolation
Fig. 9. Reconstructed lenna image using the proposed neural network
6 Summary and Future Work A novel neural-network-based method for high-resolution image reconstruction from multiple low-resolution image frames has been proposed. The preliminary experimental results show that the proposed neural network can successfully learn the mapping from a set of low-resolution image frames to the high-resolution image, and it is therefore promising for solving the ill-posed high-resolution image reconstruction problem. To further investigate the proposed neural network and its capability for image resolution enhancement, more research is needed. Directions for future work include: (1) investigating a better network structure that is more suitable for the task; (2) designing a complete training scheme for better training results; and (3) testing the network on a variety of images.
References 1. Tsai, R.Y., Huang, T.S.: Multiframe Image Restoration and Registration. In: Huang, T.S. (ed.) Advances in Computer Vision and Image Processing, pp. 317–339. JAI Press Inc., United Kingdom (1984) 2. Kim, S.P., Su, W.: Recursive High-resolution Reconstruction of Blurred Multi-frame Images. IEEE Trans. Image Processing 2, 534–539 (1993) 3. Stark, H., Oskoui, P.: High-resolution Image Recovery From Image-plane Arrays, Using Convex projections. J. Opt. Soc. Amer. A. 6, 1715–1726 (1989) 4. Hardie, R.C., Barnard, K.J., Bognar, J.G., Watson, E.A.: High-resolution Image Reconstruction from a Sequence of Rotated and Translated Frames and its Application to an Infrared Imaging System. Opt. Eng. 37, 247–260 (1998) 5. Tuinstra, T.R., Hardie, R.C.: High Resolution Image Reconstruction from Digital Video by Exploitation of Non-global Motion. Optic. Eng. 38, 806–814 (1999) 6. Freeman, W.T., Jones, T.R., Pasztor, E.C.: Example-Based Super-Resolution. IEEE Comput. Graph. Appl. 22, 56–65 (2002) 7. Schultz, R.R., Stevenson, R.L.: Extraction of High-resolution Frames from Video Sequences. IEEE Trans. Image Processing 5, 996–1011 (1996) 8. Hardie, R.C., Barnard, K.J., Armstrong, E.E.: Joint MAP Registration and Highresolution Image Estimation Using a Sequence of Undersampled Images. IEEE Trans. Image Processing 6, 1621–1633 (1997) 9. Irani, M., Peleg, S.: Improving resolution by image registration. CVGIP: Graph. Models and Image Process. 53, 231–239 (1991) 10. Lucas, B.D., Kanade, T.: An iterative image registration technique with an application to stereo vision. In: Proc. 7th International Joint Conference on Artificial Intelligence, pp. 674–679. Morgan Kaufmann Publishers Inc., San Francisco (1981) 11. Haykin, S.: Neural Networks: A Comprehensive Foundation, 2nd edn. Prentice Hall, Upper Saddle (1999)
Multiparty Simultaneous Quantum Secure Direct Communication Based on GHZ States and Mutual Authentication Wenjie Liu1, Jingfa Liu1, Hao Xiao2, Tinghuai Ma1, and Yu Zheng1 1
School of Computer and software, Nanjing University of Information Science and Technology, Nanjing, China 2 School of Information and Engineering, Huzhou Teachers College, Huzhou, China {wenjieliu,jfliu}@nuist.edu.cn,
[email protected], {thma,yzheng}@nuist.edu.cn
Abstract. In modern communication, multiparty communication plays an important role, and it is becoming more and more popular in the e-business and e-government society. In this paper, an efficient multiparty simultaneous quantum secure direct communication protocol is proposed by using GHZ states and dense coding, which is composed of two phases: the quantum state distribution process and the direct communication process. In order to protect against impersonators, a mutual identity authentication method is involved in both processes. Analysis shows that the protocol is secure against the eavesdropper's attacks, the impersonator's attacks, and some special Trent's attacks (including the attack using different initial states). Moreover, compared with other analogous protocols, the authentication method is simpler and more feasible, and the present protocol is more efficient. Keywords: quantum cryptography communication, QSDC, multiparty simultaneous communication, mutual authentication, GHZ states.
1 Introduction The principles of quantum mechanics, such as the no-cloning theorem, the uncertainty principle, and entanglement, provide some interesting ways to achieve secure communication. For example, quantum key distribution (QKD), the most mature branch of quantum communication, provides a novel way for two legitimate parties to share a common secret key with unconditional security [1-3]. Quantum secret sharing (QSS) is another branch, which supplies a secure way not only for sharing a classical message [4, 5], but also for sharing an unknown quantum state [6, 7]. Different from QKD and QSS, quantum direct communication (QDC) transmits the secret message directly, without generating a secret encryption key between the communication parties in advance. Due to its immediacy, QDC may be used in some urgent circumstances, and it has been proposed and actively pursued by several groups [8-15]. In essence, these QDC schemes can be divided into two categories: quantum secure direct communication (QSDC) [8-11] and deterministic secure quantum
communication (DSQC) [12-15]. Their essential difference is that in a DSQC protocol the receiver can read out the secret message only after he exchanges at least one bit of classical information for each qubit with the sender. Recently, Lee et al. [16] proposed two QSDC protocols with authentication (LLY for short). However, Zhang et al. [17] indicated that these protocols were insecure against the authenticator Trent's attacks, and put forward two modified protocols using the Pauli σZ operation instead of the original bit-flip operation σX (ZLW). Liu et al. [18] then put forward two corresponding efficient protocols (LCL) using the four Pauli operations {I, σX, iσY, σZ}. Unfortunately, in 2009, Qin et al. [19] claimed that in the LCL protocols Trent and an eavesdropper Eve can elicit half of the information about the secret message from the public declaration. Yen et al. [20] also pointed out that the ZLW protocols are still insecure when Trent prepares different initial states, and presented a new DSQC protocol with mutual authentication and entanglement swapping (YHG) to make up for this security leakage. In the protocols proposed above [8-20], the one-to-one communication mode is widely discussed, but the multiparty simultaneous communication mode is rarely discussed in depth. In this paper, we present an efficient multiparty simultaneous QSDC protocol (MSQP) with multi-particle Greenberger-Horne-Zeilinger (GHZ) states and mutual authentication. Moreover, the mutual authentication procedure is included in both the quantum state distribution process and the direct communication process. In order to protect against some special Trent's attacks, such as the attacks proposed by Zhang et al. [17] (ZLW attacks) and the attack using different initial states proposed by Yen et al. [20] (YHG attack), the communication users do not send qubits to Trent. That is, Trent acts as a "genuine" authenticator who is only used to authenticate the users' identities.
2 Preliminaries The quantum-mechanical analog of the classical bit is the qubit, a two-level quantum system, which can be written as
|φ⟩ = α|0⟩ + β|1⟩    (1)
where |0⟩ represents the horizontal polarization, |1⟩ represents the vertical polarization, and α and β are complex numbers that specify the probability amplitudes of the corresponding states, with |α|² + |β|² = 1. Some other essential notations are defined as follows:
(1) |φ±⟩, |ψ±⟩ (Bell states): |φ±⟩ = (1/√2)(|00⟩ ± |11⟩), |ψ±⟩ = (1/√2)(|01⟩ ± |10⟩).
(2) |P±⟩, |Q±⟩, |R±⟩, |S±⟩ (GHZ states): |P±⟩ = (1/√2)(|000⟩ ± |111⟩), |Q±⟩ = (1/√2)(|001⟩ ± |110⟩), |R±⟩ = (1/√2)(|010⟩ ± |101⟩), |S±⟩ = (1/√2)(|011⟩ ± |100⟩).
(3) I, σx, iσy, σz, H (unitary operations): I = |0⟩⟨0| + |1⟩⟨1|, σx = |0⟩⟨1| + |1⟩⟨0|, iσy = |0⟩⟨1| − |1⟩⟨0|, σz = |0⟩⟨0| − |1⟩⟨1|, H = (1/√2)(|0⟩⟨0| − |1⟩⟨1| + |0⟩⟨1| + |1⟩⟨0|).
(4) h(·) (one-way hash function): h: {0, 1}* × {0, 1}^l → {0, 1}^c, where the asterisk, l, and c represent an arbitrary length, the length of a counter, and a fixed number, respectively.
(5) ID_user (user's identity number): the identity number ID_A (ID_B, or ID_C) is known only to the authenticator Trent and to Alice (Bob, or Charlie), respectively.
(6) AK_user (user's authentication key): AK_user = h(ID_user, c_user), where c_user is the counter of calls on the hash function, and the length of AK_user is l_user.
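The GHZ states and unitary operations defined above can be checked numerically; the short sketch below (my own illustration, not part of the protocol) builds them as NumPy arrays and verifies one of the dense-coding relations used later, namely that σx applied to particle A maps |P+⟩ to |S+⟩.

```python
import numpy as np

ket0, ket1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
I2  = np.eye(2)
sx  = np.array([[0.0, 1.0], [1.0, 0.0]])     # sigma_x
isy = np.array([[0.0, 1.0], [-1.0, 0.0]])    # i * sigma_y = |0><1| - |1><0|
sz  = np.array([[1.0, 0.0], [0.0, -1.0]])    # sigma_z

def kron(*ops):
    out = ops[0]
    for op in ops[1:]:
        out = np.kron(out, op)
    return out

P_plus = (kron(ket0, ket0, ket0) + kron(ket1, ket1, ket1)) / np.sqrt(2)
S_plus = (kron(ket0, ket1, ket1) + kron(ket1, ket0, ket0)) / np.sqrt(2)

# sigma_x on particle A maps |P+> to |S+>
assert np.allclose(kron(sx, I2, I2) @ P_plus, S_plus)
```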
3 Multiparty Simultaneous QSDC Protocol For simplicity, we take the three-party simultaneous QSDC as an example and assume there are four parties: the senders (Alice and Bob) want to send secret messages to the distant receiver Charlie simultaneously, and Trent is the third-party authentication center who supplies the three-particle GHZ states and authenticates the users. The protocol is composed of two phases: the quantum state distribution process and the many-to-one direct communication process. 3.1 Quantum State Distribution Process
The purpose of the quantum state distribution process is to let the three users (Alice, Bob and Charlie) safely share the three-particle GHZ states (as shown in Fig. 1). Suppose AK_A (AK_B, AK_C) is Alice's (Bob's, Charlie's) authentication key, shared between Trent and Alice (Bob, Charlie) in advance. The procedures are as follows:
Fig. 1. The quantum state distribution process of three-party MSQP
Step 1: Alice prepares an ordered sequence of v decoy photons, namely the VA sequence, each of which is randomly chosen from {|0⟩, |1⟩}. These decoy photons will be used for verifying the identity of Trent. It should be noted that the length v is sufficiently large for checking the noise of the quantum channel; in general, v >> l_A, where l_A is the length of Alice's authentication key AK_A. Alice applies an operation H or I to each particle of the VA sequence according to the bit value of AK_A being 1 or 0. For instance, if the bit value is 1, an H operation is performed on the corresponding particle; otherwise, an I operation is applied (i.e., nothing is done). Since the number v is normally larger than the length l_A of AK_A, the key bits of AK_A are regenerated from ID_A and a new c_A when needed. The process is repeated until all v particles in the VA sequence are encoded with the corresponding H or I operations. Finally, Alice sends the VA sequence to Trent.
Step 2: After receiving the VA sequence, Trent decodes each of the particles by an operation H or I, also according to the bit value 1 or 0 of AK_A, as Alice did in Step 1. If this "Trent" is the genuine authenticator Trent, the state of each of the ordered v decoy photons will return to its initial state. Trent then measures these decoy photons in the Z-basis {|0⟩, |1⟩} and publishes the outcomes to Alice over the public channel.
Step 3: Alice authenticates Trent and checks for the presence of Eve. She compares her initial states of the VA sequence with Trent's measurement outcomes. Since only the "real" Trent knows her authentication key AK_A, if a sufficiently large number of results are the same, Alice accepts that Trent is the genuine authenticator; otherwise, if the error rate is too high, she simply stops the procedure. Simultaneously, Bob and Charlie authenticate Trent in the same way. If all the users are convinced that Trent is the "genuine" authenticator, they continue with the following steps so that Trent can in turn authenticate them.
Step 4: Trent prepares an ordered sequence consisting of N (i.e., n + v) three-particle GHZ states, each of which is
|φ_i⟩ = |P+⟩ = (1/√2)(|000⟩_ABC + |111⟩_ABC),  i = 1, 2, …, N    (2)
where the subscript i indicates the ordering number of the triplets, and A, B and C denote the three qubits of each GHZ state. Trent then takes particle A from each GHZ state to form an ordered particle sequence, called the SA sequence. Similarly, the remaining partner particles compose the SB and SC sequences. He then encodes SA (SB, SC) with Alice's (Bob's, Charlie's) authentication key AK_A (AK_B, AK_C).
For example, if the bit value of AK A is 1, an H operation is performed on the corresponding particle; otherwise, an I operation is applied. The new authentication keys will be regenerated when li is not long enough.
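The decoy-photon rule of Steps 1–4 — apply H when the corresponding authentication-key bit is 1 and I when it is 0, and undo it with the same operation before a Z-basis measurement — can be simulated classically as in the following sketch (an illustration only; the function names and the use of NumPy are my own choices).

```python
import numpy as np

H   = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
I2  = np.eye(2)
KET = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 1.0])}

def encode(bits, key):
    # H when the key bit is 1, I when it is 0
    return [(H if k else I2) @ KET[b] for b, k in zip(bits, key)]

def decode_and_measure(states, key, rng):
    results = []
    for psi, k in zip(states, key):
        psi = (H if k else I2) @ psi         # the correct key undoes the encoding
        p1 = abs(psi[1]) ** 2                # Z-basis probability of outcome |1>
        results.append(int(rng.random() < p1))
    return results

rng = np.random.default_rng(1)
decoy = rng.integers(0, 2, 20).tolist()      # initial decoy states |0> / |1>
key   = rng.integers(0, 2, 20).tolist()      # authentication-key bits
assert decode_and_measure(encode(decoy, key), key, rng) == decoy
```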
Step 5: At the same time, Trent prepares two ordered sequences of v decoy photons each, randomly chosen from {|0⟩, |1⟩}, namely V_AC and V_BC. He encodes the V_AC and V_BC sequences according to the authentication key AK_C, as described in Step 1. Trent inserts the decoy photons of V_AC (V_BC) into the SA (SB) sequence at random positions, and keeps these positions secret. Then he sends the (SA + V_AC), (SB + V_BC) and SC sequences to Alice, Bob and Charlie, respectively.
Step 6: After making sure that Alice, Bob and Charlie have received their sequences, Trent authenticates Alice, Bob and Charlie and checks the security of the channels. Taking Alice as an example, Trent tells Alice the positions of V_AC, and Alice decodes SA by the operations {H, I} according to AK_A, as Trent did in Step 4. If this "Alice" is the real user Alice, the state of each qubit in SA will return to its initial state. At the same time, Bob and Charlie perform similar operations on SB and SC with AK_B and AK_C, respectively.
Step 7: Alice randomly selects v (sufficiently many) qubits from the decoded SA sequence and makes Z-basis measurements on them; Bob and Charlie do the same at the same positions.
Step 8: Alice, Bob and Charlie compare their measurement results through the public channel. If the error rate is higher than expected, they abort the protocol; otherwise they can confirm that their counterparts are legitimate and the channels are secure.
In this process, Trent is verified by all users (Alice, Bob and Charlie) in Steps 1-3, and each user is verified by Trent in Steps 4-8. If the error rate of Alice's, Bob's and Charlie's outcomes is below expectation, the GHZ states have been safely distributed, and the users can continue to the next phase. 3.2 Many-to-One Direct Communication Process
In this process, Alice and Bob utilize the remaining n GHZ states as message carriers: they encode their messages on the particles in the SA and SB sequences and
Fig. 2. The many-to-one direct communication process of three-party MSQP
transmit them to Charlie. The procedures are as follows (also shown in Fig. 2): Step 1: Alice (Bob) prepares an ordered sequence of v decoy photons, namely the V'_A (V'_B)
sequence, and encodes V'_A (V'_B) according to AK_A (AK_B). Step 2: Alice encodes her message on the remaining n-length SA sequence by performing the four unitary operations {I, σx, iσy, σz}: a two-bit message corresponds to one unitary operation on a qubit, "00" → I, "01" → σx, "10" → iσy, and "11" → σz. At the same time, Bob encodes his message on the remaining SB sequence as follows: "0" → I, "1" → iσy. Then, the original |φ_i⟩ becomes one of the following states:
I ⊗ I ⊗ I |φ_i⟩ = (1/√2)(|000⟩ + |111⟩)_ABC = |P+⟩_ABC    (3)
σx ⊗ I ⊗ I |φ_i⟩ = (1/√2)(|100⟩ + |011⟩)_ABC = |S+⟩_ABC    (4)
iσy ⊗ I ⊗ I |φ_i⟩ = (1/√2)(−|100⟩ + |011⟩)_ABC = |S−⟩_ABC    (5)
σz ⊗ I ⊗ I |φ_i⟩ = (1/√2)(|000⟩ − |111⟩)_ABC = |P−⟩_ABC    (6)
I ⊗ iσy ⊗ I |φ_i⟩ = (1/√2)(−|010⟩ + |101⟩)_ABC = |R−⟩_ABC    (7)
σx ⊗ iσy ⊗ I |φ_i⟩ = (1/√2)(−|110⟩ + |001⟩)_ABC = |Q−⟩_ABC    (8)
iσy ⊗ iσy ⊗ I |φ_i⟩ = (1/√2)(|110⟩ + |001⟩)_ABC = |Q+⟩_ABC    (9)
σz ⊗ iσy ⊗ I |φ_i⟩ = (1/√2)(−|010⟩ − |101⟩)_ABC = |R+⟩_ABC    (10)
Step 3: Alice inserts the decoy photons of V_AC and V'_A into the SA sequence at random positions, and keeps these positions secret. It should be noted that the new positions of V_AC are known only to Alice, not to Trent. At the same time, Bob also
inserts the photons of V_BC and V'_B into the SB sequence in the same way. Note that V_AC and V_BC will be used to verify the receiver's identity, while V'_A and V'_B will be used to verify the senders' identities. Step 4: Alice sends the message-encoded (SA + V_AC + V'_A) sequence to Charlie, and Bob sends the (SB + V_BC + V'_B) sequence to Charlie. Step 5: After Charlie has received the (SA + V_AC + V'_A) sequence, Alice announces the
positions of V_AC to Charlie over the public channel. According to the announcement,
the receiver (i.e., Charlie) will be authenticated first. Charlie performs the corresponding H or I operation on the particles of V_AC according to the bit value of AK_C being 1 or 0. If the receiver is the "true" Charlie, each qubit in V_AC will return to the initial state Trent prepared. Charlie then measures the decoy photons of V_AC in the Z-basis and tells Trent the measurement results, so Trent can judge the receiver's identity by comparing the outcomes with the initial states. If the receiver is the "true" Charlie, Charlie in turn authenticates the sender (i.e., Alice). After Alice tells Charlie the positions of V'_A, Charlie measures the decoy photons of V'_A randomly in the Z-basis {|0⟩, |1⟩} or the X-basis {|+⟩, |−⟩} and tells Trent the outcomes and the corresponding bases. Trent then draws out those outcomes measured in the right basis; that is, if the bit of AK_A is 0, the corresponding qubit should be measured in the Z-basis, and if the bit of AK_A is 1, in the X-basis. Trent asks Alice for the corresponding initial states, and he can judge the sender's identity by comparing the outcomes with the initial states. At the same time, Bob carries out the same authentication procedures with V_BC and V'_B. In a word, if the sender (or receiver) is not legitimate or the channel is not safe, the users abort the process; otherwise they continue to the next step. Step 6: Charlie performs GHZ-basis measurements on the triplets of particles from the SA, SB and SC sequences. From the outcomes, she can deduce the operations performed by Alice and Bob, respectively (shown in Table 1).
For example, if Charlie's measurement outcome is |S+⟩, she can infer that Alice performed a σx operation and Bob chose an I operation; that is, Alice's message is "01" and Bob's message is "0".
Table 1. Relations of Alice's operation, Bob's operation and Charlie's measurement in the many-to-one direct communication process of the three-party MSQP
Charlie's measurement | Alice's operation | Alice's message | Bob's operation | Bob's message
|P+⟩ | I   | 00 | I   | 0
|S+⟩ | σx  | 01 | I   | 0
|S−⟩ | iσy | 10 | I   | 0
|P−⟩ | σz  | 11 | I   | 0
|R−⟩ | I   | 00 | iσy | 1
|Q−⟩ | σx  | 01 | iσy | 1
|Q+⟩ | iσy | 10 | iσy | 1
|R+⟩ | σz  | 11 | iσy | 1
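Charlie's final decoding step amounts to a lookup in Table 1; a minimal sketch of that mapping (the dictionary keys and value encoding are my own convention) is:

```python
# GHZ-basis outcome -> (Alice's two message bits, Bob's message bit), per Table 1
DECODE = {
    "P+": ("00", "0"), "S+": ("01", "0"), "S-": ("10", "0"), "P-": ("11", "0"),
    "R-": ("00", "1"), "Q-": ("01", "1"), "Q+": ("10", "1"), "R+": ("11", "1"),
}
alice_msg, bob_msg = DECODE["S+"]   # ("01", "0"), matching the worked example above
```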
4 Security and Efficiency Analysis 4.1 Security Analysis
Since the present protocol is similar to the analogous protocols [16-20], which have been proven secure against outer Eve's attacks and the impersonator's attacks, we only analyze the security against the special Trent's attacks here. First, we consider the Trent's attack proposed by Zhang et al. [17] (ZLW attack). Under this kind of attack, Trent intercepts all the particles transmitted from the senders to the receiver and attempts to make a certain measurement in the direct communication process in order to steal the secret message. Fortunately, no redundant particle of any entangled state is kept in Trent's hands; even if he impersonates Charlie to gain all the transmitted particles, he still cannot get any information by merely acting in the impersonator role. So, the present scheme is secure against the ZLW attack. Finally, we take account of the YHG attack, where Trent uses a different initial state instead of the original state (1/√2)(|000⟩ + |111⟩). Under this attack, in order to avoid being detected by the channel-checking procedure in Step 8 of the quantum state distribution process, Trent prepares the different initial state |φ'⟩ = (1/√2)(|+++⟩ + |−−−⟩) instead of (1/√2)(|000⟩ + |111⟩). Then, after Alice and Bob have finished encoding their messages, the GHZ state becomes one of the following:
I ⊗ I ⊗ I |φ'⟩ = (1/√2)(|+++⟩ + |−−−⟩)_ABC = (1/√2)(|φ+⟩_AB |0⟩_C + |ψ+⟩_AB |1⟩_C)    (11)
σx ⊗ I ⊗ I |φ'⟩ = (1/√2)(|+++⟩ − |−−−⟩)_ABC = (1/√2)(|ψ+⟩_AB |0⟩_C + |φ+⟩_AB |1⟩_C)    (12)
iσy ⊗ I ⊗ I |φ'⟩ = (1/√2)(|−++⟩ − |+−−⟩)_ABC = (1/√2)(|ψ−⟩_AB |0⟩_C + |φ−⟩_AB |1⟩_C)    (13)
σz ⊗ I ⊗ I |φ'⟩ = (1/√2)(|−++⟩ + |+−−⟩)_ABC = (1/√2)(|φ−⟩_AB |0⟩_C + |ψ−⟩_AB |1⟩_C)    (14)
I ⊗ iσy ⊗ I |φ'⟩ = (1/√2)(|+−+⟩ − |−+−⟩)_ABC = (1/√2)(|φ−⟩_AB |1⟩_C − |ψ−⟩_AB |0⟩_C)    (15)
σx ⊗ iσy ⊗ I |φ'⟩ = (1/√2)(|+−+⟩ + |−+−⟩)_ABC = (1/√2)(|φ−⟩_AB |0⟩_C − |ψ−⟩_AB |1⟩_C)    (16)
iσy ⊗ iσy ⊗ I |φ'⟩ = (1/√2)(|−−+⟩ + |++−⟩)_ABC = (1/√2)(|φ+⟩_AB |0⟩_C − |ψ+⟩_AB |1⟩_C)    (17)
σz ⊗ iσy ⊗ I |φ'⟩ = (1/√2)(|−−+⟩ − |++−⟩)_ABC = (1/√2)(|φ+⟩_AB |1⟩_C − |ψ+⟩_AB |0⟩_C)    (18)
Trent will intercept the qubits transmitted from Alice and Bob and draw out the SA and SB sequences (because he knows all users' authentication keys), and measure them in the Bell basis to steal some information about the message. For example, suppose Trent's measurement result is |φ+⟩; according to Equations (11), (12), (17) and (18), Alice's
possible operation is one of {I, σx, iσy, σz}, each with probability 1/4, and Bob's possible operation is one of {I, iσy}, each with probability 1/2; that is to say, no message is leaked to Trent. In summary, our MSQP is secure against the eavesdropper's attacks, the impersonator's attacks, the ZLW attacks and the YHG attack. Finally, we summarize the security of some analogous protocols in Table 2. 4.2 Efficiency Analysis
Recently, Li et al. [15] presented a useful efficiency coefficient to describe the total efficiency of a quantum communication, which is defined as follows.
η_total = m_u / (q_t + b_t)    (19)
where m_u is the number of message bits transmitted, q_t is the total number of qubits used, and b_t is the number of classical bits exchanged. In the present MSQP, the communication users do not need any classical information to help the receiver read out the secret message, and three bits of quantum information (a three-qubit GHZ state) are used to communicate three bits of secret message; that is, m_u = 3, q_t = 3 and b_t = 0. Thus, its total efficiency is η_our = 3/(3+0) = 100% in theory. In the same way, the efficiency of the other analogous protocols can be computed (see Table 2).
Table 2. Security and efficiency of the other analogous protocols and our protocol
Protocol | Eve's attacks | Impersonator's attacks | ZLW attacks | YHG attack | Total efficiency
LLY  | Yes | Yes | No  | No  | 25%
ZLW  | Yes | Yes | Yes | No  | 25%
LCL  | Yes | Yes | No  | Yes | 50%
YHG  | Yes | Yes | Yes | Yes | 25% or 50%
Ours | Yes | Yes | Yes | Yes | 100%
From Table 2, it is clear that our proposed protocol is more efficient than the other analogous protocols.
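For reference, equation (19) is straightforward to evaluate; the one-line sketch below reproduces the 100% figure claimed for the present MSQP (the counts m_u = 3, q_t = 3, b_t = 0 are those stated above; the other protocols' entries in Table 2 follow from their own counts, which are not listed here).

```python
def total_efficiency(m_u, q_t, b_t):
    # Eq. (19): eta_total = m_u / (q_t + b_t)
    return m_u / (q_t + b_t)

print(total_efficiency(3, 3, 0))   # present MSQP: 1.0, i.e. 100%
```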
5 Conclusions As we all know, multiparty communication plays an important role in modern communication, and it is becoming more and more popular in the e-business and e-government society. In this paper, an efficient MSQP using GHZ states and dense coding has been proposed, which is secure against the eavesdropper's attacks, the impersonator's attacks, and some special Trent's attacks (including the ZLW attacks and the YHG attack). By introducing decoy-photon checking to verify the channel as well as to authenticate the participants, the present scheme is feasible and economical with present-day techniques. Moreover, the total efficiency of the present scheme is higher than that of the analogous protocols.
Acknowledgment This work is supported by the Research Foundation of NUIST (20080298), the Natural Science Foundation of Jiangsu Province, China(BK2010570) and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (09KJB520008).
References 1. Bennett, C.H., Brassard, G.: Quantum cryptography: Public-key distribution and tossing. In: Proc. of IEEE International Conference on Computers, Systems and Signal Processing, Bangalore, India, pp. 175–179. IEEE Press, Los Alamitos (1984) 2. Ekert, K.: Quantum cryptography based on Bell’s theorem. Phys. Rev. Lett. 67, 661–663 (1991) 3. Bennett, H.: Quantum cryptography using any two nonorthogonal states. Phys. Rev. Lett. 68, 3121–3124 (1992) 4. Hillery, M., Bužek, V., Berthiaume, A.: Quantum secret sharing. Phys. Rev. A 59, 1829 (1999) 5. Deng, F.G., Long, G.L., Zhou, H.Y.: An efficient quantum secret sharing scheme with Einstein–Podolsky–Rosen pairs. Phys. Lett. A 340, 43–50 (2005) 6. Cleve, R., Gottesman, D., Lo, H.K.: How to Share a Quantum Secret. Phys. Rev. Lett. 83, 3 (1999) 7. Lance, M., Symul, T., Bowen, W.P., Sanders, B.C., Lam, P.K.: Tripartite Quantum State Sharing. Phys. Rev. Lett. 92, 17 (2004) 8. Boström, K., Felbinger, T.: Deterministic Secure Direct Communication Using Entanglement. Phys. Rev. Lett. 89, 187902 (2002) 9. Wójcik, A.: Eavesdropping on the ‘ping-pong’ quantum communication protocol. Phys. Rev. Lett. 90, 157901 (2003) 10. Deng, F.G., Long, G.L., Liu, X.S.: Two-step quantum direct communication protocol using the Einstein-Podolsky-Rosen pair block. Phys. Rev. A 68, 042317 (2003) 11. Wang, C., Deng, F.G., Long, G.L.: Multi-step quantum secure direct communication using multi-particle Green-Horne-Zeilinger state. Opt. Commun. 253, 15–20 (2005) 12. Beige, A., Englert, B.G., Kurtsiefer, C., Weinfurter, H.: Secure communication with single-photon two-qubit states. Acta Phys. Pol. A 101, 357–361 (2002) 13. Yan, F.L., Zhang, X.: A scheme for secure direct communication using EPR pairs and teleportation. Euro. Phys. J. B 41, 75–78 (2004) 14. Gao, T., Yan, F.L., Wang, Z.X.: Deterministic secure direct communication using GHZ states and swapping quantum entanglement. J. Phys. A: Math. Gen. 38, 5761–5770 (2005) 15. Li, X.H., Deng, F.G., Li, C.Y., Liang, Y.J., Zhou, P., Zhou, H.Y.: Deterministic Secure Quantum Communication Without Maximally Entangled States. J. Kerean Phys. Soc. 49, 1354–1360 (2006) 16. Lee, H., Lim, J., Yang, H.: Quantum direct communication with authentication. Phys. Rev. A 73, 042305 (2006) 17. Zhang, Z.J., Liu, J., Wang, D., Shi, S.H.: Comment on Quantum direct communication with authentication. Phys. Rev. A 75, 026301 (2007) 18. Liu, W.J., Chen, H.W., Li, Z.Q., Liu, Z.H.: Efficient quantum secure direct communication with authentication. Chin. Phys. Lett. 25, 2354–2357 (2008) 19. Qin, S.J., Wen, Q.Y., Meng, L.M., Zhu, F.C.: High Efficiency of Two Efficient QSDC with Authentication Is at the Cost of Their Security. Chin. Phys. Lett. 26, 020312 (2009) 20. Yen, A., Horng, S.J., Goan, H.S., Kao, T.W., Chou, Y.H.: Quantum direct communication with mutual authentication. Quantum Information and Computation 9, 376–394 (2009)
A PSO-Based Bacterial Chemotaxis Algorithm and Its Application Rui Zhang, Jianzhong Zhou*, Youlin Lu, Hui Qin, Huifeng Zhang College of Hydropower Information Engineering, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
[email protected],
[email protected] Abstract. In this paper, a new two-phased optimization algorithm called BC-PSO is presented. Bacterial Chemotaxis Algorithm is a biologically inspired optimization method which is analyzing the way bacteria react to chemoattractants in concentration gradients. Aiming at the shortcomings of BC which is lack of global searching ability with low speed, PSO is introduced to accelerate convergence before BC works. With some famous test functions been used, the numerical experiment results and comparative analysis prove that it outperforms standard PSO and GA. Finally, the hybrid algorithm is applied to solve the problem of mid-long term optimal operation of cascade hydropower stations. Comparing with the other algorithms, the operation strategy shows its higher efficiency. Keywords: bacterial chemotaxis algorithm; particle swarm operation; optimal operation; cascade hydropower station.
1 Introduction The Bacterial Chemotaxis Algorithm (BC) is a biologically inspired optimization method which was pioneered by Bremermann [1] in 1974 and then developed by Müller [2] in 2002. This novel optimization strategy is derived from bacterial processes by analogy to the way bacteria react to chemoattractants in concentration gradients [2]. On account of its simplicity and robustness, this evolution strategy performs well on many optimization problems. However, addressing the limitation of the strategy that bacteria are treated as isolated individuals, some scholars have proposed modified algorithms by introducing social interaction into the model [3]. Others introduce a chemotaxis mechanism into PSO by using both attraction and repulsion forces [5]. Recently, a modified BC method was developed to solve the shop scheduling problem [4-6]. In this paper, we focus on improving the global search ability and the search speed of the bacterial chemotaxis algorithm, and develop a new two-phased optimization algorithm called BC-PSO. Particle Swarm Optimization (PSO), first introduced by Kennedy and Eberhart [7], is one of the population-based optimization techniques. In the PSO model, all the individuals, called particles, change their positions
Corresponding author.
according to environmental information. Aiming at the shortcomings of BC, which lacks global searching ability and converges slowly, PSO is introduced to execute the global search first, and a stochastic local search is then performed by BC. Meanwhile, elitism preservation and an orthogonal initialization method are used in this paper. Combining PSO with BC achieves a global search with high speed and prevents the search process from falling into local optimal solutions. The hybrid algorithm is applied to solve the problem of mid-long term optimal operation of cascade hydropower stations. Compared with other algorithms, the operation strategy shows its feasibility and efficiency. This paper is organized as follows. In Section 2, we introduce the bacterial chemotaxis algorithm and standard PSO. In Section 3, a new optimization algorithm called BC-PSO is proposed. In Section 4, we verify the effectiveness of the novel algorithm through numerical experiments. In Section 5, BC-PSO is applied to the problem of mid-long term optimal operation of cascade hydropower stations. Finally, we conclude the paper in Section 6.
2 Bacterial Chemotaxis Algorithm The Bacterial Chemotaxis Algorithm is a newly developed stochastic gradient evolutionary algorithm. Unlike the interaction models for the behavior of social insects, each bacterium is considered an individual, and social interaction is not used in the model. The movement of a bacterium depends on its direction and on the duration of the next movement step, and this information is obtained by comparing an environmental property at two different time steps. On account of its simplicity and robustness, this evolution strategy is worthy of further research. In order to describe the bacterial chemotaxis optimization algorithm clearly, the motion of a single bacterium in two dimensions is given below; the strategy in n dimensions can be found in [2].
Step 1: Compute the velocity v, which is assumed to be a constant value,
v = const.    (1)
Step 2: Compute the duration of the trajectory τ from an exponential probability density function:
P(X = τ) = (1/T) e^(−τ/T)    (2)
The time T is given by
T = T0, for f_pr / l_pr ≥ 0;  T = T0 (1 + b |f_pr / l_pr|), for f_pr / l_pr < 0    (3)
Score>0.9
Rate
18%
46%
36%
4.2 The Land Intensive Degree Exhibits Spatial Differences According to Table 3, the spatial distribution of land utilization intensity is mapped. Considering the actual situation in Zhejiang Province, we can draw the following characteristics: the land intensive degree is high in the east, moderate in the middle, and low in the south and north.
Fig. 3. Directional distribution of the land intensive utilization evaluation in development zone of Zhejiang
As illustrated in Fig. 3, the scores of the eastern coastal cities, such as Ningbo, Wenzhou and Shaoxing, are all above 0.9; they are in the range of highly intensive use. The scores of the cities in the middle of Zhejiang are between 0.85 and 0.9; they are at an intermediate level. The scores of the cities in the north and in the south of Zhejiang are generally low, all below 0.85. 4.3 Intensive Degree Is Highly Correlated with Economic Development Optimal and intensive land use is an important measure for handling the relationship between economic development and resource protection. Rapid development of the economy is a
prerequisite for land intensive utilization, and intensive use of land provides a guarantee for the economy to develop sustainably. As illustrated in Fig. 4, as the GDP curve goes from high to low, the intensive degree curve also shows a downward trend. In order to study the relationship between GDP and intensive degree more accurately, we performed a correlation analysis with SPSS 17.0. The result is illustrated in Table 5: the correlation coefficient is up to 0.903. Therefore, the intensive degree and economic development are linked and influence each other. On the one hand, only a developed economy can meet the needs of land intensive utilization; on the other hand, land intensive utilization provides the conditions for inclusive growth.
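The correlation reported in Table 5 can be reproduced with a few lines of SciPy; the sketch below uses made-up GDP and score values purely to show the call, since the actual city-level data are not reproduced here.

```python
from scipy.stats import pearsonr

# Made-up GDP (billion) and intensive-use scores for 11 cities, for illustration only
gdp   = [5100, 4300, 2900, 2700, 2300, 2000, 1800, 1300, 900, 700, 600]
score = [0.93, 0.92, 0.91, 0.88, 0.87, 0.86, 0.85, 0.84, 0.83, 0.82, 0.80]

r, p = pearsonr(gdp, score)
print(r, p)   # the study reports r = 0.903, significant at the 0.01 level
```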
Fig. 4. Relationship between economic development and land intensive use
Table 5. The table of Correlation Analysis
                               GDP      Score
GDP    Pearson Correlation     1        .903**
       Sig. (2-tailed)                  .000
       N                       11       11
Score  Pearson Correlation     .903**   1
       Sig. (2-tailed)         .000
       N                       11       11
** At the .01 level (bilateral) is significantly correlated.
5 Summary and Conclusions Artificial Neural Networks can be considered flexible, non-linear, all-purpose statistical tools, capable of learning the complex relations that occur in the social processes associated with appraising land intensive utilization. In this study, an Artificial Neural Network model is used to estimate the intensive degree of development zones in Zhejiang Province. 45 samples were used for training the model and 9 samples were used to verify it. Inputs for training the model include 14 indexes. Compared
with traditional evaluation methods such as AHP and regression analysis, this technique presents a series of advantages. One of the most prominent advantages of the Artificial Neural Network is objectivity: it avoids the influence of subjective factors on the evaluation results. In summary, the Artificial Neural Network with 14 input nodes is able to estimate the intensive degree of development zones. However, it should be pointed out that the accuracy of an Artificial Neural Network model is data sensitive, and the model should not be applied to conditions outside of the data range used in training without verification.
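A minimal sketch of such an estimator — a small feed-forward network with 14 input nodes trained on 45 samples and applied to 9 verification samples — is shown below using scikit-learn; the random data, hidden-layer size and other hyper-parameters are placeholders, not the configuration used in this study.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_train, y_train = rng.random((45, 14)), rng.random(45)   # 45 samples, 14 indexes
X_verify = rng.random((9, 14))                            # 9 verification samples

model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0)
model.fit(X_train, y_train)
estimated_degree = model.predict(X_verify)                # intensive-degree scores
```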
Dimension Reduction of RCE Signal by PCA and LPP for Estimation of the Sleeping Yohei Tomita1 , Yasue Mitsukura1 , Toshihisa Tanaka1,3 , and Jianting Cao2,3 1
Tokyo University of Agriculture and Technology, 2-24-16, Nakacho, Koganei, Tokyo, Japan
[email protected], {mitsu e,tanakat}@cc.tuat.ac.jp 2 Saitama Institute of Technology, 1690, Okabe, Saitama, Japan
[email protected] 3 RIKEN Brain Science Institute, 2-1, Hirosawa, Wako, Saitama, Japan
Abstract. Irregular hours and stress cause driver drowsiness and falling asleep during important situations. Therefore, it is necessary to understand the mechanism of sleep. In this study, we distinguish sleep conditions by rhythmic component extraction (RCE). With this method, a particular EEG component is extracted as the weighted sum of multi-channel signals. This component concentrates the energy in a certain frequency range. Furthermore, when the weight of a specific channel is high, this channel is thought to be significant for extracting the focused frequency range. Therefore, the sleep conditions are analyzed by the power and the weights of RCE. For the weight analysis, principal component analysis (PCA) and locality preserving projection (LPP) are used to reduce the dimension. In the experiment, we measure the EEG in two conditions (before and during sleep). Comparing these EEGs by RCE, the power of the alpha wave component decreased during sleep and the theta power increased. The weight distributions under the two conditions did not significantly differ; this remains to be addressed in further study. Keywords: EEG, RCE, PCA, LPP.
1
Introduction
Modern people have become more interested in sleep, for two reasons [1], [2]. One is that the importance of sleep has been revealed by advances in brain science. The other is that there are many harmful effects of sacrificing sleep. In modern society, many people keep irregular hours and suffer from considerable stress, so that they cannot get enough sleep. This causes driver dozing and falling asleep during important situations. Therefore, it is necessary to reveal the mechanism of sleep and to prevent falling asleep at critical moments.
In order to judge whether a subject is asleep or not, behavioral observation or the electroencephalogram (EEG) is used. Generally, a doctor visually assesses the sleep state from the EEG recorded for at least 8–9 hours. In computer-based analyses, many researchers pay attention to frequency components such as the δ, θ, α and β waves through frequency analysis [3]. With the Fourier transform, rhythmic components are extracted by decomposing a signal into a set of frequency components. For example, Yokoyama et al. suggest that the transitional EEG wave caused by sleep is important [4]. Recent studies of rhythmic components show that very slow EEG fluctuations, in addition to these frequency bands, are also important [5]. From another point of view, if multi-channel signals are used, principal component analysis (PCA) and independent component analysis (ICA) are effective; they were developed for extracting meaningful components in a noisy environment. Furthermore, Tanaka et al. advocate that if the frequency of interest is known in advance, it is more natural to directly estimate a component with respect to that frequency range ([6]–[8]). For this purpose, they proposed rhythmic component extraction (RCE), which outputs a weighted sum of the multi-channel signals. The weights are optimized to maximize the power of a specific frequency component in the linearly summed variable. We assume that the rhythmic component in the EEG reflects the sleep condition, so RCE may be effective for the analysis of sleep. Furthermore, effective electrodes can be found from the weights of the RCE results. However, there is no established method for analyzing these weights. The variation of the weights is high-dimensional, because the RCE signal is based on the multi-channel signal. For analyzing the features of the weights, we consider principal component analysis (PCA) and locality preserving projection (LPP) to be effective. PCA is a classical linear technique that projects the data along the directions of maximal variance. LPP is a linear dimensionality reduction algorithm that builds a graph incorporating neighborhood information of the data set; its effectiveness is reported in [9]. Therefore, we analyze the time variation of these weights by reducing the feature dimensionality with PCA and LPP.
2
Rhythmic Component Extraction
RCE is a method to extract a particular EEG component that concentrates the energy in a certain frequency range. It is expressed as a weighted sum of the multi-channel signals. The observed signal x_i[k] (k = 0, ..., N − 1) is recorded from channel i (i = 1, ..., M). The RCE signal is expressed as
x̂[k] = ∑_{i=1}^{M} w_i x_i[k]    (1)
308
Y. Tomita et al.
P =
Ω1
ˆ −jω )|2 dω. |X(e
(2)
Then, the weight wi is determined to maximize the power of specific frequency component, whereas the power of the other frequency component is minimized. Ω1 ⊂ [0, π] is frequency component we want to extract, and Ω2 ⊂ [0, π] is the other frequency component we want to suppress. Therefore, the cost function to be maximized is given as the following. ˆ −jω )|2 dω |X(e J[w] = Ω1 (3) ˆ −jω )|2 dω |X(e Ω2
The maximization of the above cost function is reduced to a generalized eigen value problem in the following way. Define X∈RM×N as [X]ik = xi [k], and matrices W1 as: [W1 ]l,m = e−jω(l−m) dω, (4) Ω1
where l, m = 0, ..., N − 1. In the same way, W2 is defined by replacing Ω1 with Ω2 in (4). Then, J[w] in (3) can be described in the matrix-vector form as J[w] =
wT XW1 X T w . wT XW2 X T w
(5)
The maximizer of J[w] is given by the eigen vector corresponding to the maximum eigen value of the following generalized eigen value problem: XW1 X T w = λXW2 X T w.
(6)
The problem can be solved by using a matrix square root of XW2 X T . Since XW2 X T is symmetric, a matrix square root, S, exists such that XW2 X T = SS T . Note that S is not uniquely determined. Then, the optimal solution, w∗ , is given by ˆ (7) w∗ = S −T w, where w ˆ is the eigenvector corresponding to the largest eigen value of the matrix S −1 XW1 X T S −T .
3
Electroencephalograph
The EEG data is obtained via the electrodes based on the international 10/20 system shown in Fig. 1. It is a system which describe the locations of electrodes on the human skull. Based on this system, the EEG is measured by the electroencephalograph. In this study, the electroencepharograph shown in Fig. 2 is used as a measurement devise. This system enables the measurement of brain activity over the whole head, because it is the cup type electroencephalograph released by NEUROSCAN. There are 32 channels (including two reference electrodes on the ears), and sampling frequency is 1000 Hz. The electrodes’ impedance should be set to less than 8 kΩ for measurement.
Dimension Reduction of RCE Signal
309
Fig. 1. The international 10/20 system
Fig. 2. The electroencephalograph
Measurement (Before sleeping)
Measurement (Sleeping)
If he notices 0min
4min
If he notices 8min
If he doesn't notice 12min
Experimenter drums on the table every 4 min.
Fig. 3. The experimental procedure
4
Experimental Procedure
In the experiment, we investigate EEG features just after falling asleep. There is one subject, the EEG is measured in two states: before the sleeping and during the sleeping. The EEG data is obtained by a NEUROSCAN system. After the attachment of the electroencephalograph, EEG is measured for one minute. This EEG data is defined as the data before the sleeping. After that, the subject keeps relaxing and sleeps. For the definition of falling asleep, we drums on the table by the forefinger every four minutes (Fig. 3). If he notices this small sound, the subject is regarded as being still waking. Otherwise, if he does not notice, EEG is measured for one minute. It is defined as the data during the sleeping. Approval from ethics committee had been obtained from Tokyo University of Agriculture and Technology.
5
Results and Discussions
Among the experiment, the two types of EEG data ware obtained (the EEG before the sleeping and that during the sleeping). First of all, differences of the frequency features of these were investigated. After that, the rhythmic component extracted by RCE was investigated.
310
Y. Tomita et al. 2200 2000 1800 1600 1400 1200 1000 800 600 400 200
before sleep during sleep
4
6
8
10 12 14 16 18 20 22 Frequency [Hz]
Fig. 4. Spectrum of Fp1 power before the sleeping and that during the sleeping 3.5
4.5 4 3.5 3 2.5 2 1.5 1 0.5 0
before sleep during sleep
3 2.5 2 1.5 1 0.5 0 0
10
20
30 Time [s]
40
50
before sleep during sleep
0
10
20
30
40
50
Time [s]
Fig. 5. Power spectrum of the theta com- Fig. 6. Power spectrum of the alpha component by RCE ponent by RCE
At first, frequency features were analyzed by short time Fourier transform. The window size is set as N = 1000, where sampling frequency is 1000 Hz in this experiment. In Fig. 4, one minutes averaged power spectrum is shown. The vertical axis represents the power spectrum and the abscissa axis represents the frequency. Compared with the power spectra of two conditions, the power of the alpha wave component (8–13 Hz) decreased during the sleeping. On the other hand, the power of the theta wave component (4–7 Hz) increased. These features were also confirmed at the other 29 electrodes. These results suggest that the alpha and theta wave components are promising candidates of doze features. From these results, theta component and the alpha component were chosen for the parameters of RCE analysis. In RCE analysis, the theta component and the alpha component were extracted. To extract the theta wave component by RCE, Ω1 is set to the range corresponding to 4–7 Hz. To extract the alpha wave component, Ω1 is set to the range corresponding to 8–13 Hz. The RCE signals are extracted from all 30 channels per 1 second. Moreover, to investigate the variation of the frequency component of the RCE signals, Fourier transform was applied per 1 second. One minute spectrum average of the theta wave (4–7 Hz) and the alpha wave (8–13 Hz) are shown in Figs. 5–6. The abscissa axis represents
Dimension Reduction of RCE Signal 1
before sleep during sleep
0.8 0.6 0.4 0.2 0 -0.2 -0.4 -0.6 -0.8 -0.5
0
0.5
1
1.5
2
2.5
1.2 1 0.8 0.6 0.4 0.2 0 -0.2 -0.4 -0.6 -0.8 -0.5
311
before sleep during sleep
0
0.5
1
1.5
2
2.5
Fig. 7. PCA mapping of the weight ex- Fig. 8. PCA mapping of the weight extracted by the theta component tracted by the alpha component 0.4 0.3 0.2 0.1 0 -0.1 -0.2 -0.3 -0.4 -0.5 -0.6 -0.3 -0.2 -0.1
before sleep during sleep
0.4 0.3
before sleep during sleep
0.2 0.1 0 -0.1 -0.2 -0.3 -0.4 0
0.1 0.2 0.3 0.4 0.5
-0.5 -0.3-0.2-0.1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8
Fig. 9. LPP mapping of the weight ex- Fig. 10. LPP mapping of the weight extracted by the theta component tracted by the alpha component
the time range and the vertical axis represents the average of the power spectrum. Time range is set as 60 seconds. From the RCE result, the power of the theta wave component during sleeping tends to be higher, although there was no great difference in the theta wave component between conditions. On the contrary, the power of the alpha wave during the sleeping is lower in the condition of during sleeping. Finally, the weights distribution is analyzed. The RCE weights are thought to be the extracted power distribution. Therefore , if we want to know the effectiveness of the specific electrodes, we should analyze the features of the weight. For mapping of the weight features into a 2-dimensional space, there are linear projective maps methods (PCA and LPP). PCA is to reduce the dimensionality of a data set consisting of a large number of interrelated variables. LPP is linear projective maps that arise by solving variational problem that optimally preserves the neighborhood structure of the data set. The variation of the weight in 30 channels is mapped to a 2–dimensional space by first two directions of each method. The weight variation among 30 electrodes by the PCA is shown in Figs. 7–8, and the results by the LPP are shown in Figs. 9– 10. Abscissa and vertical axes are 1st factor and 2nd factor, respectively. In
these results, the two conditions are difficult to discriminate in the mapped space, so it is difficult to confirm that there are specific important electrodes. This may be caused by the analysis methods themselves (PCA and LPP), or because there are too many clusters in the EEG features.
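A minimal sketch of this mapping step, assuming the RCE weights are collected into a matrix with one row per one-second extraction and 30 columns (one per electrode); only the PCA branch is shown, and an LPP projection would be applied to the same matrix in the same way.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical weight matrices: one row per 1-second RCE extraction, 30 electrodes.
weights_before = np.random.rand(60, 30)   # placeholder: before sleep
weights_during = np.random.rand(60, 30)   # placeholder: during sleep

pca = PCA(n_components=2)
mapped = pca.fit_transform(np.vstack([weights_before, weights_during]))
before_2d, during_2d = mapped[:60], mapped[60:]   # the point clouds of Figs. 7-8
```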
6 Conclusions
In this paper, we investigated sleeping EEG features by RCE. In the experiment, we obtained EEG data in two conditions (before and during sleeping). In order to extract the sleeping features, we used the theta and alpha waves in the calculation of the RCE. From the RCE results, the power spectrum of the alpha wave decreased during sleeping while the theta power increased. Furthermore, the weight distribution was analyzed by PCA and LPP to find effective electrodes; however, no specific electrodes could be confirmed. As a next step, the weights will be analyzed by pattern recognition simulations to confirm differences between sleep conditions.
Prediction of Oxygen Decarburization Efficiency Based on Mutual Information Case-Based Reasoning
Min Han and Liwen Jiang
School of Electronic and Information Engineering, Dalian University of Technology, Dalian 116023, China
[email protected]
Abstract. An oxygen decarburization efficiency prediction model based on mutual-information case-based reasoning is proposed, and the blowing oxygen of the static and dynamic phases is calculated from the forecasting results. First, a new prediction method for the blowing oxygen is proposed in which the oxygen decarburization efficiency is taken as the solution property of case-based reasoning. Then, mutual information is introduced into the determination of the attribute weights, which solves the problem that information shared between the problem properties and the solution property is ignored by the traditional case retrieval method. The proposed model is applied to actual production data from a 150-ton converter. The results show that the model has high prediction accuracy; on this basis the calculation accuracy of the blowing oxygen in the two phases is ensured and the requirements of actual production are met.
Keywords: basic oxygen furnace; case-based reasoning; mutual information; oxygen decarburization efficiency; blowing oxygen.
1 Introduction
Steelmaking is a critical process in the steel industry. It is a complex oxidation process in which carbon is removed and the bath is heated by blowing oxygen at high speed into the pool, where it reacts with the molten iron and releases heat. In this way the carbon content is reduced, the molten iron is heated, and impurity elements such as phosphorus and sulfur are removed, so that steel meeting the technological requirements can finally be obtained [1]. The blowing oxygen and the amount of coolant are the dominant factors in process control; they directly affect the effectiveness and quality of steel smelting and are the key factors in the control of element oxidation, bath temperature, slagging, the removal of inclusions and the control of splashing. They ultimately determine the endpoint carbon content and the temperature of the molten steel, and thus play an important role in strengthening refining and improving the technological indicators. With the improvement of converter technology, the vice-gun measurement has been widely used in industrial steelmaking. Because of the limits of the technological level, there are only two vice-gun measurements in a smelting process: one is
carried out late in the blowing and the other at the end of the blowing. The boundary is defined by the time of the vice-gun measurement: the first stage is called the main blowing stage and the second stage is known as the auxiliary blowing stage. The main blowing stage is the basic stage, in which most of the complicated reactions take place; in this stage the prediction of blowing oxygen is based on the initial information and is referred to as the static prediction. The forecast accuracy of the static blowing oxygen is the foundation of the follow-up process. The auxiliary blowing stage is the final phase; here the prediction of blowing oxygen is mainly based on the vice-gun information and is known as the dynamic prediction, and its blowing oxygen directly determines whether the endpoint carbon content and temperature can achieve the requirements.
Fig. 1. Schematic diagram of the converter steel production process (molten iron and scrap steel → main blowing stage with static prediction → first vice-gun measurement → auxiliary blowing stage with dynamic prediction → second vice-gun measurement)
At present, the blowing oxygen is controlled using artificial experience, or through mechanism models, statistical methods, intelligent methods and so on, in which the blowing oxygen is forecast directly from the initial conditions and information, as in [2,3,4]. However, these intelligent methods do not consider how the oxygen decarburization efficiency changes in the different reaction phases. In addition, the intermediate process is ignored and modeled as a black box, many unknown parameters are introduced, the history database lacks interpretability, and the use of case experience for adjustment is not considered. In this paper a new model for computing the blowing oxygen is proposed in which the oxygen decarburization efficiency is used as the solution of case-based reasoning. Through mechanism analysis, the factors affecting the oxygen decarburization efficiency in the primary and secondary blowing stages are found and the problem properties are identified. Case-based reasoning is used to predict the oxygen decarburization efficiency; because the traditional case-based reasoning method ignores a large amount of information, a case retrieval method based on mutual information is proposed, from which the oxygen decarburization efficiency of the current case can be obtained. Finally, the blowing oxygen of the static and dynamic phases is calculated with the mechanism formula.
2 Oxygen Decarburization Efficiency and Case-Based Reasoning Based on Mutual Information
Case-based reasoning consists of the following steps: case description, case retrieval, case reuse, and case correction and preservation [5,6].
Fig. 2. Schematic model of case-based reasoning
2.1 Case Description
Case description is the basis of case-based reasoning, and it ensures the effectiveness of the subsequent steps.
2.1.1 The Identification of the Problem Properties
According to the working conditions of converter steel production and the process data, the problem attributes can be constructed. In the converter steelmaking process, the changes of the elements in the molten pool are usually divided into three stages, namely the early, middle and late blowing periods. In the early blowing period the average pool temperature is below 1400 °C. This period is dominated by the oxidation of silicon and manganese, although a small amount of carbon is also oxidized because of the high temperature in the reaction zone. At the same time, the relatively low temperature and the rapid formation of a basic oxidizing slag provide the thermodynamic conditions for dephosphorization, so the early slag has a strong dephosphorization ability. In the middle of the blowing, silicon and manganese have mostly been oxidized and the bath temperature has risen to more than 1500 °C; the intense oxidation of carbon begins. In this phase the decarburization rate is so high that the oxygen blown into the molten pool is mostly consumed by the decarburization reaction. At the end of the blowing, as the decarburization reaction proceeds, the carbon content in the molten steel decreases and the decarburization rate also decreases. The basicity of the slag determines the slagging rate. The blowing process lasts only about ten minutes, so the appropriate alkalinity must be ensured in order to
remove phosphorus, sulfur and other elements effectively and to improve the efficiency of oxygen decarburization. The bath temperature and the amount of coolant also affect the carbon–oxygen reaction: a high bath temperature intensifies the reaction and improves the oxygen decarburization efficiency. Thus, in the main blowing stage the factors that affect the oxygen decarburization efficiency are the carbon content in the hot metal, the target carbon content, the temperature of the molten iron, the target temperature, the silicon content in the hot metal, the manganese content in the hot metal, the phosphorus content in the hot metal, the dolomite, the lime, the iron ball, the molten iron and the scrap steel. In the auxiliary blowing stage the carbon–oxygen reaction dominates, so the factors affecting the oxygen decarburization efficiency of the dynamic model are only the carbon content and the temperature measured by the vice-gun.
2.1.2 The Determination of the Solution Property
The traditional CBR process usually takes the control target directly as the solution property. However, analysis of the actual data shows that the actual oxygen decarburization efficiency and the blowing oxygen are strongly related. To predict the blowing oxygen, a definition of the oxygen decarburization efficiency is first given: the oxygen decarburization efficiency η of a stage is the ratio of the oxygen consumed by carbon oxidation in the bath to the actual amount of oxygen.
η = Q(C) / Q × 100%    (1)
where Q(C) is the oxygen consumed by carbon oxidation in the melt pool and Q is the actual blowing oxygen.
The traditional formula for calculating the blowing oxygen is shown below [7]:
Q_static = ( [w(C)_iron × β − w(C)_aim] × μ / η_static ) × W    (2)
According to equation (2), the formula for calculating the dynamic blowing oxygen is:
Q_dynamic = ( [w(C)_sub − w(C)_aim] × μ / η_dynamic ) × W    (3)
where Q_static is the total blowing oxygen (static model), Q_dynamic is the blowing oxygen of the auxiliary blowing phase (dynamic model), W is the total load, and η_static is the oxygen decarburization efficiency of the static model.
η_dynamic is the oxygen decarburization efficiency of the dynamic model, μ = 22.4 / (2 × 12) = 0.933, and β is the proportion of hot metal:
β = W_iron / (W_iron + W_scrap) × 100%    (4)
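The two formulas translate directly into code; the sketch below (the numerical values and unit conventions are purely illustrative, and β is handled as a fraction rather than a percentage) computes the static and dynamic blowing oxygen once η_static and η_dynamic have been predicted.

```python
MU = 22.4 / (2 * 12)   # = 0.933, oxygen required per unit of removed carbon

def hot_metal_ratio(w_iron, w_scrap):
    """beta of equation (4), returned as a fraction (0-1)."""
    return w_iron / (w_iron + w_scrap)

def static_oxygen(w_c_iron, w_c_aim, eta_static, w_total, beta):
    """Equation (2): total blowing oxygen of the static model."""
    return (w_c_iron * beta - w_c_aim) * MU / eta_static * w_total

def dynamic_oxygen(w_c_sub, w_c_aim, eta_dynamic, w_total):
    """Equation (3): blowing oxygen of the auxiliary blowing phase."""
    return (w_c_sub - w_c_aim) * MU / eta_dynamic * w_total

# Hypothetical numbers, only to show the call pattern.
beta = hot_metal_ratio(w_iron=135.0, w_scrap=15.0)
q_static = static_oxygen(w_c_iron=0.045, w_c_aim=0.0008,
                         eta_static=0.74, w_total=150_000.0, beta=beta)
```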
Practical production shows that an accurate judgment and estimate of the oxygen decarburization efficiency is necessary to calculate the blowing oxygen and achieve the control goals. Generally, the oxygen decarburization efficiency is judged from production experience and is usually taken to be between 0.70 and 0.75. However, analysis shows that the oxygen decarburization efficiency of actual data often lies outside this experience range; it is affected by different factors at different stages, and there are interactions between the various factors. Therefore, the oxygen decarburization efficiency is chosen as the solution property of case-based reasoning.
2.2 Case Retrieval
Case retrieval is the key step of case-based reasoning: several cases similar to the current problem are retrieved, and the solutions of the retrieved cases are then used for case reuse.
2.2.1 The Traditional Method of Case Retrieval
Typical case retrieval methods are the nearest neighbor method, the knowledge seeker method and the induction index method [8]. Because the data are continuous, the traditional approach here uses the nearest neighbor method. First the local similarity is calculated:
sim_ij = 1 − |x_ij − x_j'| / ( max(x_ij − x_j') − min(x_ij − x_j') )    (5)
Then the global similarity is calculated from the local similarities. Since the attribute values are continuous variables in all similar cases, the following formula is used:
Sim_i = Σ_{j=1}^{m} ( w_j Σ_{i=1}^{n} sim_ij )    (6)
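A sketch of the retrieval computation of equations (5) and (6), assuming the case base and the current case are stored as numeric attribute vectors and interpreting (6) as a weighted sum of each stored case's local similarities; the names and the placeholder data are ours.

```python
import numpy as np

def local_similarity(case_base, query):
    """Equation (5): per-attribute similarity of every stored case to the query."""
    diff = case_base - query                      # x_ij - x_j'
    span = diff.max(axis=0) - diff.min(axis=0)    # max - min per attribute
    span[span == 0] = 1.0                         # guard against constant attributes
    return 1.0 - np.abs(diff) / span

def global_similarity(local_sim, weights):
    """Equation (6), read per case: weighted aggregation of local similarities."""
    return local_sim @ weights

# Hypothetical case base: 200 historical heats, 12 problem attributes.
case_base = np.random.rand(200, 12)
query = np.random.rand(12)
weights = np.full(12, 1.0 / 12)                   # placeholder attribute weights
sim = global_similarity(local_similarity(case_base, query), weights)
top_l = np.argsort(sim)[::-1][:5]                 # the l most similar cases
```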
2.2.2 The Determination of the Weights Based on Mutual Information
Currently, the standard deviation of the local similarity is used to calculate the weights: the smaller the differences of a property across cases, the smaller its role in distinguishing the similarity of the cases, and the smaller the weight it is given; conversely, a larger weight is given. The formula for w_j is as follows:
w_j = σ_j / Σ_{j=1}^{m} σ_j    (7)
σ_j = sqrt( (1/n) Σ_{i=1}^{n} (sim_ij − sim̄_j)² )    (8)
Here sim̄_j is the mean of sim_ij over the cases in the case base. A difficulty in the traditional CBR approach is the choice of the feature weights. Usually the weights are based on the standard deviation of the local similarity: the larger the standard deviation of the local similarity, the greater the corresponding weight of the feature. In practical applications, however, the standard deviation of the local similarities of a single property cannot determine that property's contribution to the overall similarity, and the result may deviate considerably from the actual situation. The data are collected in an industrial field and contain much redundant information, and as the feature dimension increases the data density decreases; there is therefore no sufficient reason to assume that properties with a smaller standard deviation of local similarity contribute less to the control objective. In the study of the actual data it was found that sometimes the standard deviations of the local similarity of each attribute in a case were large and yet had no impact on the prediction of the control objective, while at other times the standard deviations of the local similarities of the individual properties were very close and yet the predictions of the control objective differed greatly. The amount of information shared between the input attributes and the control objective therefore needs to be considered. Mutual information can be defined based on entropy theory as follows:
I(X, Y) = H(X) + H(Y) − H(X, Y) = H(X) − H(X | Y) = H(Y) − H(Y | X)    (9)
where H(X), H(X | Y), H(Y | X) and H(X, Y) are defined as follows:
H(X) = −Σ_i p_x(x_i) log( p_x(x_i) )    (10)
H(X | Y) = −Σ_i p_{x,y}(x_i, y_i) log( p_{x|y}(x_i | y_i) )    (11)
H(X, Y) = −Σ_{i,j} p_{xy}(x_i, y_j) log( p_{xy}(x_i, y_j) )    (12)
Mutual information is used as the basis to determine the attribute weights, which are calculated as follows:
w_j = I(x_j, y) / Σ_{j=1}^{m} I(x_j, y)    (13)
In this way, both the amount of information revealed about the control property and the redundancy contained in the input attributes are fully considered. The smaller the mutual information between the control objective and an input property, the smaller the relative weight given to it; conversely, a greater weight is given. Formula (13) is substituted into (6) so that the global similarity can be calculated; the results are ordered by similarity, and the l cases with the greatest similarity to the current case are chosen as the similar cases to be reused in the next step.
2.3 Case Reuse, Modification and Saving
Case reuse proposes the current or final solution for the current problem. With the l most similar cases obtained in the previous section, the solution of the current problem is constructed by the following formula:
y = Σ_{k=1}^{l} (Sim_k · y_k) / Σ_{k=1}^{l} Sim_k    (14)
The correction of the proposed solution is handed over to the operators to complete. Case saving is necessary for the normal and effective operation of a case-based reasoning system, since it enriches the cases in the case library. However, as production progresses the case library can expand excessively and the reliability of case-based reasoning will decrease, so maintenance is necessary: if the current blowing results meet the technical requirements and the number of stored cases has reached the upper limit, the earliest similar case in the case base is replaced with the current case.
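The reuse step of equation (14) and the simple replace-the-oldest maintenance rule described above can be sketched as follows; the case-base size limit is a placeholder.

```python
import numpy as np

def reuse_solution(similarities, solutions):
    """Equation (14): similarity-weighted average of the l retrieved solutions."""
    similarities = np.asarray(similarities, dtype=float)
    solutions = np.asarray(solutions, dtype=float)
    return float(np.sum(similarities * solutions) / np.sum(similarities))

def save_case(case_base, new_case, max_cases=500):
    """Append the current case; drop the earliest one once the limit is reached."""
    if len(case_base) >= max_cases:
        case_base.pop(0)
    case_base.append(new_case)
    return case_base
```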
3 Simulations
The predicted oxygen decarburization efficiency of the two stages is shown in Fig. 3.
[Fig. 3 panels: (a) static phase, (b) dynamic phase; horizontal axis: the number of samples, vertical axis: the efficiency of oxygen decarburization; curves: forecast vs. actual.]
Fig. 3. Predicted value of oxygen decarburization efficiency compared with the actual value
Fig. 3 shows that the predicted value of the oxygen decarburization efficiency fits the actual value well, which provides a strong guarantee for accurately calculating the blowing oxygen of the two stages. Substituting the predicted oxygen decarburization efficiency of the two stages into formulas (2) and (3) gives the static and dynamic blowing oxygen; the comparison between the actual and forecast blowing oxygen is shown in Fig. 4.
[Fig. 4 panels: (a) static phase, (b) dynamic phase; horizontal axis: predicted blowing oxygen, vertical axis: actual blowing oxygen.]
Fig. 4. Deviation between the calculated and the actual blowing oxygen
The solid line in Fig. 4 indicates that the calculated value equals the actual value; the points within the dotted lines satisfy the specified absolute error range (±500 for the static model, ±300 for the dynamic model), and the points within the dashed lines fall within absolute errors of ±700 and ±400, respectively. Fig. 4 shows that the predicted results are distributed evenly on both sides of the solid lines, and almost all of the absolute errors between the calculated and the actual values lie within the required range. The data before and after the improvement are compared in Table 1.

Table 1. Comparison before and after introducing the oxygen decarburization efficiency

Prediction model              Static RMSE   Static accuracy (±500 m³)   Dynamic RMSE   Dynamic accuracy (±300 m³)
CBR                           429.64        0.84                        195.93         0.84
Decarburization efficiency    362.25        0.88                        172.41         0.92
As seen from Table 1, when the traditional case-based reasoning method is used directly to predict the blowing oxygen, the mean square error and accuracy of the two stages show no obvious improvement over the statistical and intelligent methods (shown in Table 3). However, with the introduction of the blowing oxygen model based on the oxygen decarburization efficiency, the mean square error of the two phases decreases significantly and the accuracy is also improved significantly.
Table 2. Comparison before and after introducing the mutual information

Prediction model              Static RMSE   Static accuracy (±500 m³)   Dynamic RMSE   Dynamic accuracy (±300 m³)
Decarburization efficiency    362.25        0.88                        172.41         0.92
This method                   328.73        0.88                        157.70         0.94
Table 2 shows the results when the case-based reasoning method based on mutual information is further introduced to determine the attribute weights. The root mean square error (RMSE) of the static model falls to 328.73, and that of the dynamic model falls to 157.70; the percentage of test samples whose absolute error in the static model is less than 500 m³ is 88%, while the percentage whose absolute error in the dynamic model is less than 300 m³ is 94%. Compared with the traditional methods, this method shows an obvious advantage in all indicators. The method is also compared with other methods on the same data, and the results are shown in Table 3.

Table 3. Comparison between this method and the existing methods

Prediction model   Static RMSE   Static accuracy (±500 m³)   Dynamic RMSE   Dynamic accuracy (±300 m³)
Regression [2]     454.64        0.78                        242.34         0.86
BP [3]             423.28        0.80                        187.70         0.90
SVM                405.85        0.82                        190.81         0.88
This method        328.73        0.88                        157.70         0.94
4 Conclusions
Through mechanism analysis, the factors affecting the blowing oxygen in the two stages of converter steelmaking are determined. Case-based reasoning with attribute weights obtained from mutual information in the case retrieval process is then proposed to determine the oxygen decarburization efficiency, and finally the blowing oxygen of the two stages is calculated. Relating the amount of information contained in an attribute to that attribute's contribution to case similarity avoids the defect of the traditional case-based reasoning method, which relies solely on the standard deviation of the local similarity to calculate the attribute weights; the accuracy of predicting the oxygen decarburization
efficiency is thereby improved. Experimental results show that the accuracy of calculating the static and dynamic blowing oxygen reaches 88% and 94%, respectively, which ensures that molten steel meeting the requirements can be obtained.
Acknowledgements This research was supported by the project (60674073) of the National Nature Science Foundation of China, the project (2006BAB14B05) of the National Key Technology R&D Program of China and the project (2006CB403405) of the National Basic Research Program of China (973 Program). All of these supports are appreciated.
References
[1] Valyon, J., Horvath, G.: A sparse robust model for a Linz–Donawitz steel converter. IEEE Transactions on Instrumentation and Measurement 58(8), 2611–2617 (2009)
[2] Zhu, G.J., Liang, B.C.: Optimum model of static control on BOF steelmaking process. Steelmaking 15(4), 25–28 (1999)
[3] Cox, I.J., Levis, R.W., Ransing, R.S.: Application of neural computing in basic oxygen steelmaking. Journal of Material Processing Technology 120(1), 310–315 (2002)
[4] Ullman, S., Basri, R.: Recognition by linear combinations of models. IEEE Transactions on Pattern Analysis and Machine Intelligence 13(10), 992–1006 (1991)
[5] Chang, P.C., Liu, C.H., Lai, R.K.: A fuzzy case-based reasoning model for sales forecasting in print circuit board industries. Expert Systems with Applications 34(3), 2049–2058 (2008)
[6] Huang, M.J., Chen, M.Y., Lee, S.C.: Integrating data mining with case-based reasoning for chronic diseases prognosis and diagnosis. Expert Systems with Applications 32, 856–867 (2007)
[7] Stephane, N., Marc, L.L.J.: Case-based reasoning for chemical engineering design. Chemical Engineering Research and Design 86, 648–658 (2008)
[8] Virkki, T., Reners, G.L.L.: A case-based reasoning safety decision-support tool: nextcase/safety. Expert Systems with Applications 36(7), 10374–10380 (2009)
Off-line Signature Verification Based on Multitask Learning
You Ji, Shiliang Sun, and Jian Jin
Department of Computer Science and Technology, East China Normal University, 500 Dongchuan Road, Shanghai 200241, P.R. China
[email protected], {slsun,jjin}@cs.ecnu.edu.cn
Abstract. Off-line signature verification is very important to biometric authentication. This paper presents an effective strategy for off-line signature verification based on multitask support vector machines, which achieves a clear separation between skilled forgeries and genuine signatures. First, the modified direction feature is extracted from the signature boundary. Second, Principal Component Analysis is used to reduce the dimensionality. Helpful assistant tasks chosen from the other tasks are then added to each person's task, and multitask support vector machines are used to build the model. The proposed model is evaluated on the GPDS and MCYT data sets, and the experiments demonstrate the effectiveness of the proposed strategy.
Keywords: Off-line Signature Verification, Multitask Learning, Support Vector Machines, Machine Learning.
1 Introduction
Handwritten signatures, as a behavioral biometric, are widely used in the field of human identification [1]. Off-line signature verification differs from on-line signature verification, which can extract dynamic features such as time, pressure, acceleration and stroke order. Because of the lack of dynamic information, off-line signature verification is more difficult than on-line verification, and automated off-line signature verification has been a challenging problem for researchers. In an off-line signature verification system, three kinds of forgery are considered: random, simple and skilled forgeries [2]. A random forgery is usually a signature sample that belongs to a different writer than the signature model. A simple forgery is a signature sample with the same shape as the genuine writer's name. A skilled forgery is a suitable imitation of the genuine signature model. Because skilled forgeries are very similar to genuine signatures, it is difficult to distinguish them from genuine signatures [2]. To address this problem, global shape features based on the projection of the signature and local grid features were proposed in [3]. Local features such as stroke and sub-stroke are discussed in [4]. Structural features [5,6] have also been used to improve the accuracy of off-line signature verification. The
Gradient, Structural and Concavity (GSC) features [7] have also been successful in off-line signature verification. Various classifiers, such as neural networks [8], support vector machines (SVMs), or threshold decision methods, have been employed, and effective models such as fuzzy models [9] and hidden Markov models [10] have been used to improve the accuracy of forgery detection. So far, off-line signature verification is still an open problem which needs more effort to address [2]. In this paper, we mainly aim at designing a multitask learning strategy that addresses the problem of distinguishing skilled forgeries from genuine signatures. If we take each person's signature verification as a single task, some common information may be shared between tasks; if we take one person's signature verification as the target task, we may add other people's signature verification problems as assistant tasks to help improve the classification accuracy. Because common information is shared between tasks, the generalization of the classifiers is greatly improved. The rest of the paper is organized as follows. Section 2 discusses feature extraction and dimensionality reduction. Section 3 introduces multitask SVMs. The proposed strategy is described in Section 4. Section 5 reports and analyzes the experimental results. Finally, Section 6 presents our conclusions and future work.
2 Feature Extraction
Feature extraction plays an important role in a signature verification system, and a suitable feature extraction method improves the accuracy of classification. Because of its effectiveness, we use the Modified Direction Feature (MDF) [6,11,12] as our feature extraction method; here we only outline it, and details may be found in [6,11,12]. The main steps are listed in Table 1. This feature extraction strategy is based on two other feature extraction techniques, the Direction Feature (DF) [11,12] and the Transition Feature (TF) [11,12]: DF mainly aims at the extraction of direction features, while TF records the locations of the transitions between foreground and background along the signature boundary. Reducing the dimensionality of the feature vectors extracted by MDF is an essential part of signature verification; we use Principal Component Analysis (PCA) [13] to reduce the dimensions and obtain a lower-dimensional vector, as sketched after Table 1.

Table 1. MDF Feature Extraction Strategy
Input: Gray image
1. Image preprocessing.
2. Retrieve the boundary of the image.
3. Replace foreground pixels with direction numbers.
4. Extract location and direction features in pairs.
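The reduction step mentioned above can be sketched with scikit-learn's PCA; the feature matrix and the variance threshold used to pick the number of components are our own illustrative choices (the experiments later report reductions from 2400 to around 100 dimensions).

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical MDF feature matrix: one 2400-dimensional row per signature image.
features = np.random.rand(500, 2400)

pca = PCA(n_components=0.95)        # keep enough components for 95% explained variance (our choice)
reduced = pca.fit_transform(features)
print(reduced.shape)                # a much lower dimensionality, as reported in Section 5
```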
3 Multitask Support Vector Machines
If we take each person's signature verification as a single task, some common information between tasks may be found by multitask learning [14,15]. Because of the success of regularized multitask learning [14], we use multitask SVMs as our classifiers. We consider the following setup. Assume there are T persons' signatures, each with two categories: genuine signatures and skilled forgeries. We assume all the data for the tasks come from the same space X × Y; for simplicity, X ⊂ R^d and Y ⊂ {−1, 1} (−1 stands for skilled forgeries and 1 for genuine signatures). For the t-th person's signature verification, the n samples
{(x_t1, y_t1), (x_t2, y_t2), ..., (x_tn, y_tn)} ⊂ P_t    (1)
are drawn from the distribution P_t on X × Y. All samples together are
{ {(x_11, y_11), ..., (x_1n, y_1n)}, ..., {(x_T1, y_T1), ..., (x_Tn, y_Tn)} }    (2)
For single-task SVMs, the goal is to find a hyperplane that maximizes the margin between the two classes, that is, f_t(x) = w_t · x (t stands for the t-th person's learning task and · for the dot product); kernel methods can extend this to nonlinear models. Because the T tasks are sampled from the same distribution on X × Y, we may assume that all w_t are close to a mean function w_0. Then w_t, for all t ∈ {1, ..., T}, can be written as
w_t = w_0 + v_t    (3)
where w_0 stands for the information shared among the T signature verification tasks. To this end, we solve the following optimization problem [14], which is analogous to single-task SVMs:
min_{w_0, v_t, ε_it} J(w_0, v_t, ε_it) := Σ_{t=1}^{T} Σ_{i=1}^{n} ε_it + (λ_1 / T) Σ_{t=1}^{T} ||v_t||² + λ_2 ||w_0||²
subject to y_it (w_0 + v_t) · x_it ≥ 1 − ε_it, ε_it ≥ 0, ∀i ∈ {1, 2, ..., n_t}, ∀t ∈ {1, 2, ..., T}.    (4)
With the notion of a feature map [16], we may rewrite the problem as follows:
Φ(x, t) = ( x_t / √u, 0, ..., 0, x_t, 0, ..., 0 ),  with t − 1 zero blocks before x_t and T − t after,
u = T λ_2 / λ_1,  w = ( √u w_0, v_1, ..., v_T )    (5)
Then we have
w · Φ(x, t) = (w_0 + v_t) · x_t,  ||w||² = Σ_{t=1}^{T} ||v_t||² + u ||w_0||²    (6)
Y. Ji, S. Sun, and J. Jin
Solving the multitask SVMs problem (4) is equivalent to solving a standard SVMs which uses the linear kernel below.
Kst (Φ(x, s) , Φ (z, t)) = u1 + δst x · z 0 if s = t (7) δst = 1 if s = t Then the multitask SVMs can be solved by multitask kernels. We may use multitask kernels to replace regular kernels in SMVs’ toolbox such as LibSVM [17]. More details about nonlinear multitask kernels are given in [14].
4 The Strategy for Signature Verification
For simplicity, we treat simple forgeries as skilled forgeries, so that signature verification splits into two classification problems: classification between skilled forgeries and genuine signatures, and classification between random forgeries and genuine signatures. Here we focus only on the first problem. Although the experiments in [14] demonstrate the usefulness of the multitask learning strategy, we may get worse results if we simply put all the tasks together and learn them by multitask learning, because not all tasks are sampled from the same distribution. If we want to obtain better results than single-task learning, we have to add some helpful assistant tasks to the target task, assuming that these tasks are sampled from the same distribution.

Table 2. Algorithm to Find Helpful Assistant Tasks
Input: integer i stands for the ith task; integer T stands for the number of tasks.
Output: set A, which contains the ith task and the helpful assistant tasks for the ith task.
Functions: SVMs means multitask SVMs; if the input is just one task, it acts as single-task SVMs.
  A = {i}
  while (1)
    MaxAccuracy = SVMs(A), HelpTask = 0
    for j = 1 to T
      S = A ∪ {j}
      Accuracy = SVMs(S)
      if (Accuracy > MaxAccuracy)
        MaxAccuracy = Accuracy; HelpTask = j
      end if
    end for
    if (MaxAccuracy > SVMs(A))
      A = A ∪ {HelpTask}
    else
      break
    end if
  end while
How to find helpful assistant tasks is described by the pseudocode in Table 2. The main idea of the algorithm is to keep adding other persons' signature verification tasks as assistant tasks until the maximum accuracy on the training data set is reached. In the end, each task obtains a set of helpful assistant tasks, and these tasks are then handled together by multitask SVMs; a compact sketch follows.
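A compact version of the selection loop of Table 2; the evaluation function is a placeholder for the training-set accuracy of the multitask SVM on the candidate task set.

```python
def find_assistant_tasks(target, n_tasks, evaluate):
    """Greedy forward selection of assistant tasks for one target task.

    evaluate(task_set) -> training accuracy of the multitask SVM on that set.
    """
    selected = {target}
    best = evaluate(selected)
    while True:
        candidate, candidate_acc = None, best
        for j in range(n_tasks):
            if j in selected:
                continue
            acc = evaluate(selected | {j})
            if acc > candidate_acc:
                candidate, candidate_acc = j, acc
        if candidate is None:        # no remaining task improves the accuracy
            return selected
        selected.add(candidate)
        best = candidate_acc
```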
5 Experiments
Experiments are carried out on the GPDS [18] and MCYT [19] data sets, respectively.
5.1 The Evaluation of Experiments
The Error Rate (ER), False Acceptance Rate (FAR) and False Rejection Rate (FRR) are the three parameters used for measuring the performance of a signature verification method. They are calculated as follows:
ER = (number of wrongly predicted images / number of images) × 100%
FAR = (number of forgeries accepted / number of forgeries tested) × 100%
FRR = (number of genuine signatures rejected / number of genuine signatures) × 100%    (8)
We compute the Average ER (AER), Average FAR (AFAR) and Average FRR (AFRR) over the T persons as follows:
AER = (1/T) Σ_{t=1}^{T} ER_t,  AFAR = (1/T) Σ_{t=1}^{T} FAR_t,  AFRR = (1/T) Σ_{t=1}^{T} FRR_t    (9)
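The evaluation quantities of equations (8) and (9) are simple counts; a sketch with our own label convention (1 = genuine, −1 = forgery) follows.

```python
import numpy as np

def error_rates(y_true, y_pred):
    """ER, FAR and FRR of equation (8); labels: 1 = genuine, -1 = forgery."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    er = np.mean(y_pred != y_true) * 100
    far = np.mean(y_pred[y_true == -1] == 1) * 100    # forgeries accepted
    frr = np.mean(y_pred[y_true == 1] == -1) * 100    # genuine signatures rejected
    return er, far, frr

def average_rates(per_person_rates):
    """Equation (9): AER, AFAR, AFRR averaged over the T persons."""
    return tuple(np.mean(per_person_rates, axis=0))
```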
For comparison, four kernel methods are used, as listed in Table 3. In our experiments, we use cross-validation to choose the parameters for each task.
5.2 Experiments on GPDS Database
Details about the GPDS Database. The GPDS database contains images from 300 signers. Each signer has 24 genuine signatures and 30 skilled forgeries. The 24 genuine specimens were produced in a single-day writing session. The skilled forgeries were produced from the static image of a genuine signature chosen randomly from the 24 genuine signatures; forgers were allowed to practice the signature for as long as they wished, and each forger imitated 3 signatures of 5 signers in a single-day writing session. So for each genuine signature there are 30 skilled forgeries made by 10 forgers from 10 different genuine specimens. After MDF feature extraction, PCA reduces the dimensionality from 2400 to 113. For every person, we randomly choose 12 genuine signatures and 15 skilled forgeries for training; the remaining images (12 genuine signatures and 15 skilled forgeries) are used for testing.
Table 3. Common Kernels

Kernel Function   Formula
Linear            (x_i · x_j)
Polynomial        (x_i · x_j + 1)^p
RBF               exp( −||x_i − x_j||² / 2σ² )
Sigmoid           tanh( k x_i · x_j − δ )

Table 4. Comparison between Single-task SVMs and Multitask SVMs

              AER(%)            AFAR(%)           AFRR(%)
              SSVMs   MSVMs     SSVMs   MSVMs     SSVMs   MSVMs
Linear        20.30   6.72      15.46   4.67      26.36   9.29
Polynomial    15.46   4.67      20.00   7.40      25.81   10.04
RBF           26.36   9.29      25.81   10.04     29.33   13.13
Sigmoid       20.00   7.40      11.92   5.56      25.89   8.53
We obtain the average evaluation parameters (AER, AFAR, AFRR) by repeating this procedure 10 times. For simplicity, we let u in (7) equal 1, although better results might be obtained if this parameter were also chosen by cross-validation. Table 4 shows that our strategy is much better than the single-task SVM solution. For instance, with single-task SVMs the smallest AER is 15.46%, but we obtain much better results (4.67%) with multitask SVMs after finding and adding helpful assistant tasks to the target task. For all kernels used in our experiments, all evaluation parameters are better than the single-task SVM results.
5.3 Experiments on MCYT Database
Details about the MCYT Database. To evaluate the effectiveness of our method, a sub-corpus of the larger MCYT bimodal database containing 2250 images is used. Each person has 15 genuine signatures and 15 forgeries contributed by three different user-specific forgers. The 15 genuine signatures were acquired at different times (between three and five) of the same acquisition session; at each time, between one and five signatures were acquired consecutively [2]. After PCA is applied to the MDF features, the dimensionality is reduced from 2400 to 102. For every person, we randomly choose 7 genuine signatures and 7 skilled forgeries for training; the remaining images (8 genuine signatures and 8 skilled forgeries) are used for testing. We obtain the average evaluation parameters (AER, AFAR, AFRR) by repeating the procedure 10 times, and for simplicity we let u in (7) equal 1.
Table 5. Comparison between Single-task SVMs and Multitask SVMs

              AER(%)            AFAR(%)           AFRR(%)
              SSVMs   MSVMs     SSVMs   MSVMs     SSVMs   MSVMs
Linear        15.26   4.22      20.20   4.29      10.32   3.35
Polynomial    20.20   4.29      15.50   4.75      11.17   3.87
RBF           10.32   3.35      11.17   3.87      14.58   9.33
Sigmoid       15.50   4.75      16.37   5.85      13.08   10.00
Table 5 shows that the proposed method also outperforms single-task SVMs on the MCYT corpus. For instance, with single-task SVMs the smallest AER is 10.32%, but we obtain better results (3.35%) with multitask SVMs after finding and adding helpful assistant tasks to the target task. For all kernels used in our experiments on the MCYT corpus, all evaluation parameters are better than the single-task SVM results.
6 Conclusion and Future Work
This paper mainly addresses the classification between skilled forgeries and genuine signatures, which is a difficult problem for signature verification. We built our model based on multitask learning and proposed an effective algorithm to select helpful assistant tasks for the main task. Experiments on GPDS and MCYT demonstrated that the proposed methods achieve favorable verification accuracy. Verification between random forgeries and genuine signatures will be addressed in future work, which will also consider classifier combination, feature selection and active learning for signature verification.
Acknowledgments. This work is supported in part by the National Natural Science Foundation of China under Project 61075005, the 2011 Shanghai Rising-Star Program, and the Fundamental Research Funds for the Central Universities.
References
1. Fierrez, J., Ortega, G.J., Ramos, D., Gonzalez, R.J.: HMM-based on-line Signature Verification: Feature Extraction and Signature Modeling. Pattern Recognition Letters 28, 2325–2334 (2007)
2. Wen, J., Fang, B., Tang, Y., Zhang, T.P.: Model-based Signature Verification with Rotation Invariant Features. Pattern Recognition 42, 1458–1466 (2009)
3. Qi, Y., Hunt, B.R.: Signature Verification Using Global and Grid Features. Pattern Recognition 27, 1621–1629 (1994)
4. Fang, B., Leung, C.H., Tang, Y.Y., Tse, K.W., Kwok, P.C.K., Wong, Y.K.: Off-line Signature Verification by the Tracking of Feature and Stroke Positions. Pattern Recognition 36, 91–101 (2003)
5. Huang, K., Yan, H.: Off-line Signature Verification Using Structural Feature Correspondence. Pattern Recognition 35, 2467–2477 (2002)
6. Nguyen, V., Blumenstein, M., Leedham, G.: Global Features for the Off-Line Signature Verification Problem. In: International Conference on Document Analysis and Recognition, pp. 1300–1304 (2009)
7. Kalera, M.K., Srihari, S., Xu, A.: Off-line Signature Verification and Identification Using Distance Statistics. International Journal of Pattern Recognition and Artificial Intelligence 18, 1339–1360 (2004)
8. Armand, S., Blumenstein, M., Muthukkumarasamy, V.: Off-line Signature Verification Using the Enhanced Modified Direction Feature and Neural-based Classification. In: International Joint Conference on Neural Networks, pp. 684–691 (2006)
9. Madasu, H., Yusof, M.H.M., Madasu, V.K.: Off-line Signature Verification and Forgery Detection Using Fuzzy Modeling. Pattern Recognition 38, 341–356 (2005)
10. Coetzer, J., Herbst, B.M., Preez, J.A.: Offline Signature Verification Using the Discrete Radon Transform and a Hidden Markov Model. EURASIP Journal on Applied Signal Processing 2004, 559–571 (2004)
11. Blumenstein, M., Liu, X.Y., Verma, B.: An Investigation of the Modified Direction Feature for Cursive Character Recognition. Pattern Recognition 40, 376–388 (2007)
12. Blumenstein, M., Liu, X.Y., Verma, B.: A Modified Direction Feature for Cursive Character Recognition. In: International Joint Conference on Neural Networks, vol. 4, pp. 2983–2987 (2005)
13. Esbensen, K., Geladi, P., Wold, S.: Principal Component Analysis. Chemometrics and Intelligent Laboratory Systems 2, 37–52 (1987)
14. Evgeniou, T., Pontil, M.: Regularized Multi-task Learning. In: International Conference on Knowledge Discovery and Data Mining, pp. 109–117 (2004)
15. Sun, S.: Multitask Learning for EEG-based Biometrics. In: Proceedings of the 19th International Conference on Pattern Recognition, pp. 1–4 (2008)
16. Vapnik, V.N.: Statistical Learning Theory, vol. 2 (1998)
17. Chang, C.C., Lin, C.J.: LIBSVM: a Library for Support Vector Machines (2001)
18. Vargas, J.F., Ferrer, M.A., Travieso, C.M., Alonso, J.B.: Off-line Handwritten Signature GPDS-960 Corpus. In: International Conference on Document Analysis and Recognition, vol. 2, pp. 764–768 (2007)
19. Ortega, G.J., Fierrez, A.J., Simon, D., Gonzalez, J., Faundez, Z.M., Espinosa, V., Satue, A., Hernaez, I., Igarza, J.J., Vivaracho, C.: MCYT Baseline Corpus: a Bimodal Biometric Database. Vision Image and Signal Processing 150, 395–401 (2004)
Modeling and Classification of sEMG Based on Instrumental Variable Identification*
Xiaojing Shang 1, Yantao Tian 1,2, and Yang Li 1
1 School of Communication Engineering, Jilin University, Changchun {130025,312511605}@qq.com
2 Key Laboratory of Bionic Engineering, Ministry of Education, Jilin University
Abstract. sEMG is a biological signal produced by muscle. According to the characteristics of the myoelectric signal, a single-input multiple-output FIR model is proposed in this paper. Because the input of the model is unknown, instrumental-variable blind identification is used to identify the model's transfer functions. The model parameters are then used as the input of a neural network to classify six types of forearm motions: extension of thumb, extension of wrist, flexion of wrist, fist grasp, side flexion of wrist, and extension of palm. The experimental results demonstrate that this method has better classification accuracy than the classical AR method.
Keywords: sEMG; Instrumental variable; Blind identification; Probabilistic neural networks; Pattern recognition.
1 Introduction
sEMG [1] is obtained from neuromuscular activity and reflects the function and status of nerves and muscles. Different actions correspond to different sEMG signals, so sEMG can be used to identify different movement patterns; how to extract sEMG features effectively and accurately is therefore the key to controlling an artificial limb with myoelectric signals. At present, work on analyzing the characteristics and building models of the sEMG signal is still rare at home and abroad. In order to make full use of the correlation of signals between channels, Doerschuk [2] proposed a four-channel multivariate AR model based on the original AR model proposed by Graupe, and better recognition was obtained by using the model parameters as the input of the classifier; because of its computational complexity, however, this method is not suitable for real-time implementation. Therefore, Zhizhong Wang et al. [3] proposed an IIR method to establish the sEMG model, which is computationally simple, is easy to implement online, and has good prospects.
This paper is supported by the Key Project of Science and Technology Development Plan of Jilin Province (Grant No.20090350), Chinese College Doctor Special Scientific Research Fund (Grant No.20100061110029) and the Jilin University "985 project" Engineering Bionic Sci. & Tech. Innovation Platform.
In this paper, a single-input multiple-output FIR model of two-channel sEMG is proposed after analyzing the myoelectric signals. The whole pathway of sEMG then has only zeros, which is simpler to compute than the all-pole IIR model. Because the input of the model is unknown, blind identification is used to identify the model's transfer functions. It is difficult to determine the form of the noise, which creates difficulties for modeling; the blind identification method based on instrumental variables, which does not need to know the specific form of the noise, can identify the transfer function parameters easily and quickly. These parameters are used as the input of a neural network to recognize six types of forearm motions: extension of thumb, extension of wrist, flexion of wrist, fist grasp, side flexion of wrist, and extension of palm. The experimental results show that good performance is achieved.
2 Modeling of Two-Channel Electromyography Signals
EMG is under the control of the central nervous system [4]. In this paper, the whole generating process of EMG is treated as equivalent to an FIR system: the input is an electric pulse sequence generated by the motor units; the conduction of the muscle fibers and the filtering of the skin and electrodes act as the transfer function H of the FIR system; and the output is the EMG that we acquire. Figure 1 describes this FIR model.
Fig. 1. Model of the two-channel EMG signal (the input u passes through the channels H(1) and H(2) to give x(1) and x(2); observation noise N(1) and N(2) is added to produce the outputs y(1) and y(2))
Here u is the excitation signal produced by the central nervous system, H(1) and H(2) are the transfer functions of the muscle fibers' transmission channels, x is the system output, N is the observation noise, and y is the actually measured EMG signal.
3 Instrumental Variable Blind Identification
Blind identification is a basic parameter identification method that relies only on the system output data, without directly depending on the input signal, to estimate the system parameters. Blind identification has received increasing attention in the fields of communication and signal processing, where a series of techniques and methods have been produced, and it also has important applications in seismic signal analysis and image processing [5]. In this paper a two-channel sEMG model is proposed in which the input is unknown and unpredictable and only the output can be measured, so the identification method is not the same as traditional system identification, in which both the input and the output
signals are known. Therefore, the blind channel identification method is used to establish the muscle signal model for action recognition. A SIMO FIR system is used here, where y(t) = [y1(t), y2(t), ..., ym(t)]^T are the output signals, u is the input, N(t) = [N1(t), N2(t), ..., Nm(t)]^T are mutually uncorrelated random noise vectors, and H is the transfer function:
H^(i)(z) = b_i(0) + b_i(1) z^{-1} + b_i(2) z^{-2} + ... + b_i(n_i) z^{-n_i}    (1)
where b_i(l) are the system parameters to be identified. From Fig. 1, the relation between the input and the outputs is
y^(i)(t) = H^(i)(z) u(t) + N^(i)(t),  i = 1, 2    (2)
Because the input u is unknown, for any non-zero constant α we have
y^(i)(t) = [ (1/α) H^(i)(z) ] ( α u(t) ) + N^(i)(t),  i = 1, 2    (3)
Equations (2) and (3) have the same output, so no method can identify the transfer function H uniquely. In order to identify H uniquely, according to the literature [6], we can first normalize the transfer function or the input signal, for example by setting the norm of the unknown coefficients of H to 1, that is, Σ_{l=0}^{n_i} b_i(l)² = 1. We can also assume that the transfer functions are polynomials whose leading coefficients are 1, that is, b_i(0) = 1, i = 1, 2; this assumption is adopted in this paper. From Fig. 1 we obtain
y^(1)(t) − N^(1)(t) = H^(1)(z) u(t),  y^(2)(t) − N^(2)(t) = H^(2)(z) u(t)    (4)
Dividing the two equations, we get
( y^(1)(t) − N^(1)(t) ) / ( y^(2)(t) − N^(2)(t) ) = H^(1)(z) / H^(2)(z)    (5)
Through cross multiplication, there is: H (1) ( z) ( y(2) (t ) − N(2) ( t ) ) = H (2) ( z) ( y(1) ( t ) − N(1) ( t ) )
(6)
Writing the above equation in difference-equation form gives
y^(2)(t) + Σ_{l=1}^{n1} b_1(l) y^(2)(t−l) = y^(1)(t) + Σ_{l=1}^{n2} b_2(l) y^(1)(t−l) − N^(1)(t) − Σ_{l=1}^{n2} b_2(l) N^(1)(t−l) + N^(2)(t) + Σ_{l=1}^{n1} b_1(l) N^(2)(t−l)    (7)
Define the vectors
λ^T = [ −y^(2)(t−1), −y^(2)(t−2), ..., −y^(2)(t−n1), y^(1)(t−1), y^(1)(t−2), ..., y^(1)(t−n2) ]    (8)
σ = [ b_1(1), b_1(2), ..., b_1(n1), b_2(1), b_2(2), ..., b_2(n2) ]^T
e(t) = N^(2)(t) + Σ_{l=1}^{n1} b_1(l) N^(2)(t−l) − N^(1)(t) − Σ_{l=1}^{n2} b_2(l) N^(1)(t−l)
l=1
(9)
It can be seen that noise e ( t ) is the superposition of the two-channel noise through above equations, and the specific form of the noise can not be determined in the course of the analysis. Therefore, the auxiliary variable technique, which does not need to know the specific form of noise to identify the model parameters and can simplify the process of identification, can be used to recognize here. But the traditional auxiliary variable method takes the input vector as an auxiliary variable matrix, then the key of the problem is how to construct the auxiliary variable matrix in the case of unknown input signal. The choice of auxiliary variable should meet certain conditions: the selected auxiliary variables should be independent of the noise signal of N (1) ( t ) and N (2) ( t ) , but relate to y (1) ( t ) and y (2) ( t ) . For this system model, the basic idea is that assuming the system rank n1 , n2 is known, under the conditions of known output, combining the instrumental variable blind identification method , and linear combination of the double channel signals is used to construct instrumental variables to identify the model parameters. We can construct instrumental variate matrix according to the above ideas and derivation instrumental variate blind identification method. From formula (9) ⎡ y (2) (1) − y (1) (1) ⎤ ⎡e (1) ⎤ ⎡ λ T (1) ⎤ ⎢ (2) ⎥ ⎢ ⎥ ⎢ T ⎥ (1) ⎢ y ( 2) − y ( 2)⎥ , ⎢λ ( 2 )⎥ E = ⎢e ( 2 ) ⎥ Y =⎢ A=⎢ ⎥ ⎢ # ⎥ ⎥ # ⎢ ⎥ ⎢ ⎥ ⎢ # ⎥ ⎢ y (2) ( n ) − y (1) ( n ) ⎥ ⎢e ( n )⎥ ⎢λ T ( n )⎥ ⎣ ⎦ ⎣ ⎦ ⎣ ⎦
Then, the formula (9) can be written as Y = Aσ + E
(10)
Where n is the length of data, assume that there exists a matrix F, which has the same dimension with A, make the formula (11) like this E ⎡⎣FT E⎤⎦ = lim⎡⎣FT E⎤⎦ = O E ⎡⎣FT A⎤⎦ = lim⎡⎣FT A⎤⎦ = Q n→∞ n→∞
(11)
Where Q is singular matrix, from the formula we can see that F is not related to E, but related to A. In formula (10) , both sides of the equation are multiplied by at the same time, then the unbiased estimation of can be got σ : ∧
σ = ( F T A) F T Y −1
(12)
That is E ⎡⎢σ ⎤⎥ = σ .Traditional instrumental variables matrix is structured by input ⎣ ⎦ signal which is unknown in this paper. Therefore, the traditional method can not be used here. In order to satisfy the conditions of formula (11), we can use any channels signal to construct instrumental variables. This paper adopts linear combination of double channel output signal to construct instrumental variable matrix, it is more in line to the conditions of the instrumental variables F and A are correlated strongly, then F and E are independent of each other, its form is as follows: ∧
⎡−y(1) ( −n2 ) "− y(1) (1− n2 − n1 ) ⎢ (1) y 1 n − − − y(1) ( 2 − n2 − n1) " ( 2) ⎢ F =⎢ # # ⎢ ⎢−y(1) ( n −1− n ) "− y(1) ( n − n − n ) 2 2 1 ⎣
y(2) ( 0)
" y(2) (1− n2 ) ⎤ ⎥ " y(2) ( 2 − n2 ) ⎥ ⎥ # # ⎥ (2) (2) y ( n −1) "y ( n − n2 ) ⎥⎦ y(2) (1)
(13)
Modeling and Classification of sEMG Based on Instrumental Variable Identification
335
Auxiliary matrix F has constructed, combine with formula (12), the unbiased estima∧
tion of model coefficient vector σ will be easily to obtained . From the experiment we found that when n1 = n2 = 6 , the signal waveform produced by model is similar to the actual one, which means the model coefficients we got can represent signal characteristics well. Figure 2 is the comparison between actual signals and model signals by selecting actions and activity period signals random. Table 1 is the coefficient of part of the signal model. As Figure 2 shows that output signals of model are similar to actual ones, and by experiment we find that adopting the linear combination of the double channel signal to construct the instrumental variables is better than using any channels signal to construct instrumental variate modeling, which means that the instrumental variable method is feasible to identify model coefficients. The data in Table 1 also illustrate that there has big differences among all kinds of features. 4
2
x 10
actual measured signal model output
1.5
signal amplitude
mv
1 0.5 0 -0.5 -1 -1.5 -2 -2.5
0
100
200 300 400 sEMG signal points sampled
500
600
Fig. 2. The contrast between actual signal and model signal of any action Table 1. Part signal model coefficients model coefficients
extension of thumb
extension of wrist
fist grasp
side flexion of wrist
b1 1
-2.7112
-2.5559
-2.8729
-2.9918
b1 2
3.8379
2.9515
4.5583
5.0158
b1 3
-3.6107
-2.1715
-4.0603
-5.1942
b1 4
2.3765
1.2032
3.2487
3.7148
b1 5
-1.0448
-0.5534
-1.4935
-1.7414
b1 6
0.2726
0.1872
0.4253
0.4344
b2 1
-3.9609
2.1759
-2.7531
-4.1177
b2 2
5.1284
0.605
7.1993
6.9812
b2 3
-5.0144
-3.5277
-5.2593
-6.2875
b2 4
5.8978
1.6458
1.3623
3.024
b2 5
-3.2873
-1.7778
-1.0567
-0.6082
b2 6
0.8377
0.4889
0.2225
0.4837
4 Movement Pattern Identification
The probabilistic neural network (PNN) is an important variant of the radial basis function (RBF) network, proposed by Specht in 1990. It has four layers: the input layer, the pattern layer, the summation layer and the output layer. The input layer receives the training sample, expressed as X = (x_1, x_2, ..., x_q)^T; its number of neurons equals the dimension of the sample vector. The pattern layer first calculates the distance ||X − IW_ij||² between the training sample and the weight vectors and then obtains the output vector M through a radial basis nonlinear mapping, usually a Gaussian function; M represents the probability that the sample belongs to each class. The calculation is given by equation (14):
M_ij(X) = 1 / (2πσ²)^{n/2} · exp[ −||X − IW_ij||² / (2σ²) ],  i ∈ {1, 2, ..., n}, j ∈ {1, 2, ..., m}    (14)
where n is the total number of pattern classes, m is the number of neurons in the hidden layer (equal to the number of input samples), and σ is the smoothing parameter, whose value determines the width of the bell-shaped curve centered at the sample points. The summation layer computes the weighted sum S of the vector M:
S_i(X) = Σ_{i=1}^{N_i} ω_ij M_ij(X),  ω_ij ∈ [0, 1], i ∈ {1, 2, ..., n}, j ∈ {1, 2, ..., m}    (15)
where ω_ij denotes the mixing weight and satisfies
Σ_{i=1}^{N_i} ω_ij = 1,  i ∈ {1, 2, ..., n}, j ∈ {1, 2, ..., m}    (16)
and N_i is the number of neurons of the i-th class in the pattern layer. The output layer obtains the network output O through the maximum response in S:
O(X) = arg max_S ( S_i ),  i ∈ {1, 2, ..., n}    (17)
The output neuron corresponding to the winning category is set to 1 and all other output neurons to 0. The parameter of the radial basis function is the spread rate, which has a significant effect on the experiment; setting it manually complicates the experimental process, so this paper adopts the PSO method to optimize this parameter, which simplifies the experiments and makes the results more effective [10]. During network training, S is taken as the target vector T, an n-dimensional vector in which each component corresponds to a pattern category; exactly one component is 1 and the rest are 0, meaning that the corresponding input vector belongs to that pattern. The structure of the PNN network is determined first: according to the characteristics of the radial basis function, the number of input-layer neurons equals the dimension of the input sample vector, and the number of output-layer neurons equals the number of classes in the training data. The output layer of the network is a competition layer, with each neuron corresponding to one data type. The structure of
The designed PNN has 4 neurons in the input layer and 6 neurons in the output layer; the transfer function of the intermediate layer is the Gaussian function and the transfer function of the output layer is a linear function. This paper collected 6 kinds of gestures: extension of thumb, extension of wrist, flexion of wrist, fist grasp, side flexion of wrist, and extension of palm, with 10 groups of data for each kind. After establishing the model, the instrumental-variable blind identification method is used to identify the model coefficients. Of the 10 groups, 5 groups are used for neural network training and the other 5 groups for testing.
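To make the pattern-layer and summation-layer computations of equations (14)-(17) concrete, the following minimal sketch (not from the paper; the class counts, spread value, and all variable names are illustrative assumptions) implements a basic PNN classifier in Python/NumPy with a Gaussian kernel and uniform mixing weights. The constant factor (2πσ^2)^(n/2) is omitted because it does not change the arg-max; in the paper the spread σ is tuned by PSO, whereas here it is simply passed in as a fixed value.

```python
import numpy as np

def pnn_predict(X_train, y_train, x, sigma=0.5):
    """Minimal PNN: Gaussian pattern layer (14), class-wise averaging
    as the summation layer (15)-(16), and arg-max output layer (17)."""
    classes = np.unique(y_train)
    scores = []
    for c in classes:
        # Pattern layer: Gaussian response of every training sample of class c
        Xc = X_train[y_train == c]
        d2 = np.sum((Xc - x) ** 2, axis=1)       # ||X - IW_ij||^2
        M = np.exp(-d2 / (2.0 * sigma ** 2))     # radial basis outputs
        # Summation layer with uniform weights omega_ij = 1 / N_i
        scores.append(M.mean())
    # Output layer: the winning class gets 1, the rest 0
    return classes[int(np.argmax(scores))]

# Illustrative usage with random 12-dimensional "model coefficient" features
rng = np.random.default_rng(0)
X_train = rng.normal(size=(30, 12))              # 6 gestures x 5 training groups
y_train = np.repeat(np.arange(6), 5)
print(pnn_predict(X_train, y_train, X_train[7], sigma=0.5))
```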
5 Experimental Results
After modeling the two-channel sEMG, the model coefficients are used to classify the movement patterns. First, 5 groups of data for every movement are taken as the training samples of the network; each channel has a 6th-order model, which gives 6 characteristic parameters per channel and therefore a 12-dimensional feature vector for the two channels. The training data of the 6 movements thus form a 12 × 30 characteristic matrix. The remaining 30 groups of data are then used to test the trained network. Figure 3 shows the feature space distribution of the 6 kinds of gesture actions, with the axes being the second, the third, and the seventh model coefficients, respectively. Figures 4 and 5 show the network training results and the error charts. Table 2 compares the double-channel sEMG recognition results.
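As a rough illustration of how the training matrix described above could be assembled (the identification routine, array names, and group ordering are assumptions, not taken from the paper), one might stack the 6 coefficients of each channel's identified model into a 12-dimensional feature vector per trial:

```python
import numpy as np

def feature_vector(coeffs_ch1, coeffs_ch2):
    """Concatenate the 6 model coefficients of each channel (12-dim feature)."""
    return np.concatenate([coeffs_ch1, coeffs_ch2])

def build_dataset(trials, identify_model):
    """trials: list of 6 gesture lists, each holding (ch1_signal, ch2_signal) pairs.
    identify_model stands in for the instrumental-variable identification step
    and is assumed here; it should return the 6 model coefficients of a channel."""
    X, y = [], []
    for label, gesture_trials in enumerate(trials):
        for ch1_sig, ch2_sig in gesture_trials:
            X.append(feature_vector(identify_model(ch1_sig), identify_model(ch2_sig)))
            y.append(label)
    return np.array(X), np.array(y)

# Training would use the first 5 trials per gesture and testing the remaining 5,
# giving a 30 x 12 training matrix (written as 12 x 30 in the paper).
```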
[Figure 3 plot: 3-D scatter of the feature vectors; axes: model coefficients b(2), b(3), and b(7)]
Fig. 3. Feature space distribution of three randomly chosen model coefficients
[Figure 4 plots: classification results (forecasting category vs. actual category) against predict number, and the error after network training]
Fig. 4. Network training results and error chart after training
[Figure 5 plots: classification results (forecasting category vs. actual category) against predict number for the test data, and the corresponding error]
Fig. 5. Test results and error chart after training
Figure 3 indicates that the patterns are well separated from each other and that the clustering effect is good. Using the model coefficients obtained by the instrumental variable method as multi-channel sEMG features is therefore suitable for classification, i.e., the feature extraction is effective. Figures 4 and 5 show that the probabilistic neural network recognizes the gesture actions well: when the trained network is tested, only 3 of the 30 samples are misclassified, so the recognition accuracy is quite high.

Table 2. Comparison of the double-channel sEMG recognition results (correct number / test number)
classification method          extension of thumb   extension of wrist   flexion of wrist   fist grasp   side flexion of wrist   extension of palm
instrumental variable method   4/5                  5/5                  5/5                5/5          4/5                     4/5
AR coefficient                 4/5                  3/5                  5/5                3/5          3/5                     3/5
To compare the features obtained by the instrumental variable method with those of the traditional AR method, third-order AR coefficients of the two channels are used as signal features. The recognition results of the two methods are shown in Table 2. The average recognition rate of the instrumental variable method is 90%, while the average recognition rate of the AR coefficients is only a little over 70%. The classification performance of the instrumental variable method is better than that of the AR method for wrist extension, fist grasping, and forefinger extension. Clearly, using the instrumental variable method to classify sEMG signals is effective.
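For reference, third-order AR coefficients of the kind used as the baseline features above can be estimated per channel by ordinary least squares on lagged samples; this sketch is a generic AR fit, not the paper's exact procedure, and the function name and test signal are assumptions.

```python
import numpy as np

def ar_coefficients(x, order=3):
    """Fit an AR(order) model x(t) = a1*x(t-1) + ... + ap*x(t-p) + e(t)
    by least squares and return the coefficients [a1, ..., ap]."""
    x = np.asarray(x, dtype=float)
    # Lagged regressor matrix: row for time t holds [x(t-1), ..., x(t-order)]
    Phi = np.column_stack([x[order - k - 1:len(x) - k - 1] for k in range(order)])
    target = x[order:]
    coeffs, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    return coeffs

# Two channels -> 6 AR features per trial (vs. 12 model coefficients above)
sig = np.sin(np.linspace(0, 20, 500)) + 0.1 * np.random.default_rng(1).normal(size=500)
print(ar_coefficients(sig, order=3))
```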
6 Conclusion
Based on the characteristics of sEMG, models of the two-channel sEMG signals are built by the instrumental variable method, and the model coefficients are used as the input of a neural network to classify six gesture movements. At the same time, traditional AR coefficients are also used as signal features for comparison. The experimental results show that the features obtained through blind identification modeling outperform the traditional AR coefficients in classification. The instrumental variable blind identification method is commonly used in communication and signal processing (for example, multiuser communication systems, multi-sensor radar, sonar systems, and microphone arrays). This paper is an attempt to apply blind identification theory to physiological signals, and the experiments show that the method has good prospects in movement pattern recognition.
References
1. Jin, D.W., Wang, R.C.: Artificial intelligent prosthesis. Chinese Journal of Clinical Rehabilitation 5(20), 2994-2995 (2002)
2. Doerschuk, P.C., Gustafson, D.E., Willsky, A.S.: Upper extremity limb function discrimination using EMG signal analysis. IEEE Transactions on Biomedical Engineering 30(1), 18-29 (1983)
3. Cai, L., Wang, Z., Liu, Y.: Modelling and classification of two-channel electromyography signals based on blind channel identification theory. Journal of Shanghai Jiao Tong University 34(11), 1468-1471 (2000)
4. Deluca, C.: Physiology and mathematics of myoelectric signals. IEEE Transactions on Biomedical Engineering 26(6), 313-325 (1979)
5. Luo, H., Li, Y.D.: Application of blind channel identification techniques to prestack seismic deconvolution. Proceedings of the IEEE 86(10), 2082-2089 (1998)
6. Ding, F., Chen, T.: Identification of Hammerstein nonlinear ARMAX systems. Automatica 41(9), 1479-1489 (2005)
7. Shang, X., Tian, Y., Li, Y., Wang, L.: The recognition of gestures and movements based on MPNN. Journal of Jilin University (Information Science Edition) 28(5), 459-466 (2010)
Modeling and Classification of sEMG Based on Blind Identification Theory*

Yang Li1, Yantao Tian1,2,**, Xiaojing Shang1, and Wanzhong Chen1

1 School of Communication Engineering, Jilin University, Changchun, 130025
2 Key Laboratory of Bionic Engineering, Ministry of Education, Jilin University
[email protected]

Abstract. The surface electromyography (sEMG) signal is non-stationary and susceptible to external interference. For this situation, a cyclostationary input is combined with the inverse nonlinear mapping of the Hammerstein-Wiener model to build an sEMG model and to realize blind identification of the discrete nonlinear system. The parameters of the model are used as the input of an improved BP neural network. The experimental results demonstrate the effectiveness of this approach.

Keywords: sEMG, Blind Identification, Hammerstein-Wiener Model.
1 Introduction
sEMG is obtained from neuromuscular activity and reflects the function and status of nerves and muscles. Since different actions correspond to different sEMG signals, sEMG can be used to identify different movement patterns. Therefore, sEMG has not only been widely used in clinical medicine and sports medicine, but also serves as the control signal of multi-degree-of-freedom artificial limbs. sEMG is a nonlinear biological signal that is non-stationary, lacks robustness, and is susceptible to external interference [1,2]. The key to controlling a myoelectric prosthesis is extracting effective features of sEMG so that different movement patterns can be identified accurately. Researchers usually process sEMG with traditional methods to obtain features, such as time-domain, frequency-domain, and time-frequency-domain methods. The characteristics of the sEMG itself are not studied intensively, so the extracted features are not typical, the computation time is long, and the recognition rates are not high [3,4]. In this paper, a mathematical model of dual-channel forearm sEMG is established based on the characteristics of sEMG. The output obtained by the electrodes is the only measurable signal; the input is unknown and cannot be predicted. Therefore, the Hammerstein-Wiener blind identification method is used to establish the sEMG model.
* This paper is supported by the Key Project of the Science and Technology Development Plan for Jilin Province (Grant No. 20090350), the Chinese College Doctor Special Scientific Research Fund (Grant No. 20100061110029), and the Jilin University "985 Project" Engineering Bionic Sci. & Tech. Innovation Platform.
** Corresponding author.
The parameters of the model are used as the input of an improved BP neural network for multi-pattern recognition. The experiments demonstrate that good results are achieved.
2 Hammerstein-Wiener Model
The nonlinear discrete Hammerstein-Wiener model is shown in Figure 1.

Fig. 1. Structure of the Hammerstein-Wiener model
The linear part is

G(z) = (β_1 z^{-1} + β_2 z^{-2} + ... + β_n z^{-n}) / (1 − α_1 z^{-1} − α_2 z^{-2} − ... − α_n z^{-n})    (1)
The linear relationship between input and output can be expressed as

x(t) = Σ_{i=1}^{n} α_i x(t−i) + Σ_{i=1}^{n} β_i u(t−i)    (2)
The nonlinear part is composed of a continuous, invertible mapping, shown as (3):

y = f(x) = Σ_{i=1}^{q} l_i s_i(x)    (3)
The model needs to meet the following conditions:
1) The nonlinear mapping f is continuous and invertible, and its inverse model x = f^{-1}(y) can be expressed as

x = f^{-1}(y) = Σ_{i=1}^{m} γ_i p_i(y)    (4)
2) If the discrete input is a bounded sequence, the model output is also bounded.
3) The model orders m, n, q are known.
Under these conditions, the blind identification of the nonlinear system can be stated as follows: the input u(t) of the nonlinear system is cyclostationary and cannot be measured; from the measured output y(t), the model parameters α_i, β_i, l_i can be identified, provided the cyclic period T and the statistical properties of u(t) are known.
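The following sketch illustrates the model structure of equations (1)-(3) by simulating it forward in Python; the basis functions, orders, and coefficient values are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def hammerstein_wiener(u, alpha, beta, l, basis):
    """Simulate x(t) = sum_i alpha_i x(t-i) + sum_i beta_i u(t-i)   (linear part, eq. 2)
    followed by y(t) = sum_i l_i * s_i(x(t))                        (static nonlinearity, eq. 3)."""
    n = len(alpha)
    x = np.zeros(len(u))
    for t in range(n, len(u)):
        x[t] = sum(alpha[i] * x[t - 1 - i] for i in range(n)) \
             + sum(beta[i] * u[t - 1 - i] for i in range(n))
    y = sum(li * s(x) for li, s in zip(l, basis))
    return x, y

# Illustrative 2nd-order linear part and polynomial output nonlinearity
alpha, beta = [0.5, -0.2], [1.0, 0.3]
l, basis = [1.0, 0.2], [lambda v: v, lambda v: v ** 3]
u = np.random.default_rng(0).normal(size=300)
x, y = hammerstein_wiener(u, alpha, beta, l, basis)
```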
3 Cyclostationary Signals and Statistical Properties
A cyclostationary process is a signal whose statistical properties vary cyclically with time. It can better characterize human biological signals such as sEMG. It has the following features:
1) Its statistical properties vary cyclically with time, which reflects the characteristics of non-stationary signals; it is a generalized description of wide-sense stationary signals.
2) Because the statistical properties vary cyclically with time, it is a reasonable simplification of non-stationary signals based on an objective description.
3) It contains the correlation of frequency-shifted signals, which is peculiar to signals with cyclic characteristics.
In this paper, the statistical characteristics of cyclostationary signals are combined with the Hammerstein-Wiener model to solve the blind identification problem of the nonlinear system.
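As a small illustration of the cyclic-mean property exploited later (E[u(t+T−i)] = E[u(t−i)]), the sketch below generates a simple cyclostationary signal by periodic amplitude modulation and checks that its time-varying mean is approximately periodic; the period and modulation are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50                                   # assumed cyclic period (samples)
n_periods = 2000
t = np.arange(T * n_periods)

# Periodic amplitude modulation of noise with nonzero mean -> cyclostationary signal
u = (1.0 + 0.5 * np.sin(2 * np.pi * t / T)) * rng.normal(loc=0.3, size=t.size)

# Estimate the cyclic mean by averaging samples that share the same phase
cyclic_mean = u.reshape(n_periods, T).mean(axis=0)

# The cyclic mean repeats with period T, i.e. E[u(t + T)] ~ E[u(t)]
theory = 0.3 * (1.0 + 0.5 * np.sin(2 * np.pi * np.arange(T) / T))
print(np.allclose(cyclic_mean, theory, atol=0.15))
```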
4 Hammerstein-Wiener Model Identification
The statistical characteristics of cyclostationary signals can be combined with the nonlinear inverse mapping of the Hammerstein-Wiener model to obtain the parameters and complete the model identification.

4.1 Identification of Parameters α_i, γ_i
Substituting the model structure (4) into (2) gives

Σ_{j=1}^{m} γ_j p_j(y(t)) = Σ_{i=1}^{n} α_i Σ_{j=1}^{m} γ_j p_j(y(t−i)) + Σ_{i=1}^{n} β_i u(t−i)    (5)

Assume that γ_1 = 1; with this assumption the results can be uniquely identified.
Then (5) can be expressed as

p_1(y(t)) = φ^T(t) · θ + Σ_{i=1}^{n} β_i u(t−i)    (6)

where φ(t) and θ are the vectors

φ^T(t) = [−p_2(y(t)), −p_3(y(t)), ..., −p_m(y(t)),
          p_1(y(t−1)), p_1(y(t−2)), ..., p_1(y(t−n)),
          p_2(y(t−1)), p_2(y(t−2)), ..., p_2(y(t−n)),
          ...,
          p_m(y(t−1)), p_m(y(t−2)), ..., p_m(y(t−n))]    (7)

θ^T = [γ_2, γ_3, ..., γ_m, α_1, α_2, ..., α_n,
       α_1 γ_2, α_2 γ_2, α_3 γ_2, ..., α_n γ_2, ...,
       α_1 γ_m, α_2 γ_m, α_3 γ_m, ..., α_n γ_m]    (8)
Now consider the next period t + T; the corresponding equation for p_1(y(t+T)) is

p_1(y(t+T)) = φ^T(t+T) · θ + Σ_{i=1}^{n} β_i u(t+T−i)    (9)

Subtracting (6) from (9) gives
p_1(y(t+T)) − p_1(y(t)) = [φ^T(t+T) − φ^T(t)] · θ + Σ_{i=1}^{n} β_i [u(t+T−i) − u(t−i)]    (10)
Since the statistics of the cyclostationary input are periodic, E[u(t+T−i)] = E[u(t−i)], and taking expectations of (10) yields

E[p_1(y(t+T)) − p_1(y(t))] = E[φ^T(t+T) − φ^T(t)] · θ    (11)

From this we can see that the parameter vector θ is related only to the output y(t) and the structure of p_i(·), and is independent of the input u(t). Then θ can be obtained by the least squares method, which yields the model parameters α_i, γ_i.
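A rough sketch of this least-squares step follows; the basis functions p_i, the orders, and the way the cyclic relation is exploited (stacking the period-differenced equations over many t and letting the input differences average out) are assumptions made for illustration, not the paper's exact implementation.

```python
import numpy as np

def phi(y, t, p_basis, n):
    """Regressor vector of eq. (7) built from the output sequence y."""
    m = len(p_basis)
    row = [-p_basis[j](y[t]) for j in range(1, m)]                  # -p_2 ... -p_m at time t
    row += [p_basis[j](y[t - i]) for j in range(m) for i in range(1, n + 1)]
    return np.array(row)

def estimate_theta(y, T, p_basis, n):
    """Stack eq. (10) over many t: since the input differences u(t+T-i)-u(t-i)
    have zero mean for a cyclostationary input (eq. 11), theta is estimated by
    least squares on p_1(y(t+T)) - p_1(y(t)) ~ [phi(t+T) - phi(t)] . theta."""
    rows, rhs = [], []
    for t in range(n, len(y) - T):
        rows.append(phi(y, t + T, p_basis, n) - phi(y, t, p_basis, n))
        rhs.append(p_basis[0](y[t + T]) - p_basis[0](y[t]))
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return theta

# Illustrative basis p_1(y)=y, p_2(y)=y^3 (m=2) with a 2nd-order linear part (n=2);
# theta then stacks [gamma_2, alpha_1, alpha_2, alpha_1*gamma_2, alpha_2*gamma_2] as in eq. (8).
```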
4.2 Identification of Parameters l_i
According to equations (6) and (7), l_i can be obtained by the least squares method:

min L = min Σ_t [ y(t) − Σ_{i=1}^{q} l_i · s_i( Σ_{j=1}^{m} γ_j p_j(y(t)) ) ]^2    (12)
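Because the s_i enter linearly once x̂ = Σ γ̂_j p_j(y) is known, the minimization (12) reduces to an ordinary linear least-squares fit. The sketch below assumes the basis functions and the estimated γ̂ are already available; the function and variable names are illustrative.

```python
import numpy as np

def estimate_l(y, gamma_hat, p_basis, s_basis):
    """Solve min_l sum_t ( y(t) - sum_i l_i * s_i(x_hat(t)) )^2  with
    x_hat(t) = sum_j gamma_hat_j * p_j(y(t))   (cf. eq. 12)."""
    y = np.asarray(y, dtype=float)
    x_hat = sum(g * p(y) for g, p in zip(gamma_hat, p_basis))   # inverse-map estimate
    S = np.column_stack([s(x_hat) for s in s_basis])            # one column per s_i
    l_hat, *_ = np.linalg.lstsq(S, y, rcond=None)
    return l_hat
```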
4.3 Identification of Parameters β_i
The nonlinear inverse model can be obtained once α_i, γ_i are identified. The estimate is

x̂(t) = Σ_{i=1}^{m} γ̂_i p_i(y(t))    (13)

Substituting (13) into (2), it follows that

x̂(t) − Σ_{i=1}^{n} α̂_i x̂(t−i) = Σ_{i=1}^{n} β_i u(t−i)    (14)
Since

Σ_{i=1}^{n} β_i u(t−i) = [ Σ_{i=1}^{n} β_i δ(t−i) ] · u(t)    (15)

equation (14) can be expressed as

x̂(t) − Σ_{i=1}^{n} α̂_i x̂(t−i) = [ Σ_{i=1}^{n} β_i δ(t−i) ] · u(t)    (16)
Let

Q(t) = x̂(t) − Σ_{i=1}^{n} α̂_i x̂(t−i),    H = Σ_{i=1}^{n} β_i δ(t−i)

Then (16) can be rewritten as

Q(t) = H · u(t)    (17)

In equation (17), H can be calculated by linear blind identification using second-order statistics, and then β_i can be obtained. The calculation steps are as follows:
1) Centralize Q(t) to make it zero-mean.
2) Since the mixed signal Q(t) contains no noise, the correlation matrix can be estimated as

R̂_Q(0) = (1/N) Σ_{t=1}^{N} Q(t) Q^T(t)

3) Apply singular value decomposition to R̂_Q(0):

R̂_Q(0) = V_u Λ_u V_u^T

4) Pre-whitening:

Q̄(t) = Λ_u^{-1/2} · V_u^T · Q(t)

Λ̂_u^{-1/2} = diag{ (λ_1 − σ̂_N^2), (λ_2 − σ̂_N^2), ..., (λ_n − σ̂_N^2) }^{-1/2}

Since Q(t) contains no noise, σ̂_N^2 = 0 and Λ̂_u^{-1/2} reduces to Λ_u^{-1/2}.
5) For a time delay p ≠ 0, apply singular value decomposition to the covariance matrix

R̂_Q(p) = (1/N) Σ_{t=1}^{N} Q(t) Q^T(t−p) = U_Q Σ_Q V_Q^T

6) If the singular values of Σ_Q are not distinct, choose a different p and repeat from step 4); otherwise

Ĥ = W^+ · U_Q = (Λ̂_u^{-1/2} · V_u^T)^{-1} · U_Q = V_u · Λ̂_u^{1/2} · U_Q

5 Improved BP Neural Network Classifier
An improved BP neural network algorithm is proposed in this paper based on the original BP algorithm and the simulated annealing (SA) algorithm. The new algorithm uses the BP algorithm as its main frame and introduces the SA strategy into the learning process. In this way it not only uses supervised gradient-descent learning to improve local search performance, but also exploits the ability of SA to probabilistically jump out of local minima and achieve global convergence [5]. This greatly improves the learning performance of the network. The new algorithm is used in the neural network classifier to recognize five hand gestures in this paper [6]. Let the learning step be η, whose adjustment is based on the error change

ΔE = E(k) − E(k−1)    (18)

When ΔE