Communications in Computer and Information Science
15
De-Shuang Huang Donald C. Wunsch II Daniel S. Levine Kang-Hyun Jo (Eds.)
Advanced Intelligent Computing Theories and Applications With Aspects of Contemporary Intelligent Computing Techniques 4th International Conference on Intelligent Computing, ICIC 2008 Shanghai, China, September 15-18, 2008 Proceedings
Volume Editors

De-Shuang Huang
Institute of Intelligent Machines, Intelligent Computing Laboratory
Chinese Academy of Sciences, Hefei, Anhui 230031, China
E-mail: [email protected]

Donald C. Wunsch II
Missouri University of Science & Technology
Department of Electrical and Computer Engineering, Applied Computational Intelligence Laboratory
Rolla, MO 65409-0040, USA
E-mail: [email protected]

Daniel S. Levine
University of Texas at Arlington, Department of Psychology
Arlington, TX 76019-0528, USA
E-mail: [email protected]

Kang-Hyun Jo
University of Ulsan, Graduate School of Electrical Engineering
Ulsan 680-749, South Korea
E-mail: [email protected]

Library of Congress Control Number: 2008933737
CR Subject Classification (1998): G.1.6, H.2.8, H.3.3, I.2.11, I.5.1
ISSN: 1865-0929
ISBN-10: 3-540-85929-2 Springer Berlin Heidelberg New York
ISBN-13: 978-3-540-85929-1 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media
springer.com

© Springer-Verlag Berlin Heidelberg 2008
Printed in Germany

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
SPIN: 12512744 06/3180 543210
Preface
The International Conference on Intelligent Computing (ICIC) was formed to provide an annual forum dedicated to emerging and challenging topics in artificial intelligence, machine learning, bioinformatics, and computational biology. It aims to bring together researchers and practitioners from both academia and industry to share ideas, problems, and solutions related to the multifaceted aspects of intelligent computing. ICIC 2008, held in Shanghai, China, September 15–18, 2008, was the 4th International Conference on Intelligent Computing. It built upon the success of ICIC 2007, ICIC 2006, and ICIC 2005, held in Qingdao, Kunming, and Hefei, China, in 2007, 2006, and 2005, respectively. This year, the conference concentrated mainly on the theories and methodologies as well as the emerging applications of intelligent computing. Its aim was to unify the picture of contemporary intelligent computing techniques as an integral concept that highlights the trends in advanced computational intelligence and bridges theoretical research with applications. The theme for this conference was therefore "Emerging Intelligent Computing Technology and Applications". Papers focusing on this theme were solicited, addressing theories, methodologies, and applications in science and technology. ICIC 2008 received 2336 submissions from 31 countries and regions. All papers went through a rigorous peer-review procedure, and each paper received at least three review reports. Based on these reports, the Program Committee finally selected 401 high-quality papers for presentation at ICIC 2008, of which 373 are included in three volumes of proceedings published by Springer: one volume of Lecture Notes in Computer Science (LNCS), one volume of Lecture Notes in Artificial Intelligence (LNAI), and one volume of Communications in Computer and Information Science (CCIS). The other 28 papers will be included in two international journals.
This volume of the Communications in Computer and Information Science (CCIS) series includes 70 papers. The organizers of ICIC 2008, including the Center for International Scientific Exchanges of the Chinese Academy of Sciences, Shanghai University, and the Institute of Intelligent Machines of the Chinese Academy of Sciences, made an enormous effort to ensure the success of the conference. We would hereby like to thank the members of the ICIC 2008 Advisory Committee for their guidance and advice, and the members of the Program Committee and the referees for their collective effort in reviewing and soliciting the papers. We would like to thank Alfred Hofmann, Executive Editor at Springer, for his frank and helpful advice and guidance throughout, and for his support in publishing the proceedings. In particular, we would like to thank all the authors for contributing their papers. Without the high-quality submissions from the authors, the success of the conference would not have
been possible. Finally, we are especially grateful to the IEEE Computational Intelligence Society, the International Neural Network Society, and the National Natural Science Foundation of China for their sponsorship.
July 2008
De-Shuang Huang Donald Wunsch Daniel S. Levine Kang-Hyun Jo
Organization
General Chair: Donald Wunsch, USA
Steering Committee Chair: De-Shuang Huang, China
Program Committee Chair: Daniel S. Levine, USA
Organizing Committee Co-chairs: Min-Rui Fei, China; Shi-Wei Ma, China
Award Committee Chair: Chun-Hou Zheng, China
Publication Chair: Ji-Xiang Du, China
Special Session Chair: Laurent Heutte, France
Tutorial Co-chairs: Kang-Hyun Jo, Korea; Marco Loog, Denmark
International Liaison Chair: Fakhri Karray, Canada
Publicity Co-chairs: Prashan Premaratne, Australia; Frank Neumann, Germany; Vitoantonio Bevilacqua, Italy; Wanquan Liu, Australia; Sanggil Kang, Korea; Plamen Angelov, UK; Xin Li, China
Exhibition Chair: Si-Liang Chen, China
Steering Committee Members
Luonan Chen, Japan; Laurent Heutte, France; Kang Li, UK; Marco Loog, Denmark; Guangrong Ji, China; Kang-Hyun Jo, Korea; Jun Zhang, China; Xiao-Ping Zhang, Canada
Organizing Committee Members
Jian Fan, China; Zhi-Hua Li, China; Li-Xiong Li, China; Qun Niu, China; Yang Song, China; Xin Sun, China; Ling Wang, China; Yu-Lin Xu, China; Bang-Hua Yang, China
Program Committee Members Khalid Mahmood Aamir, Pakistan Andrea Francesco Abate, Italy Shafayat Abrar, UK
Uwe Aickelin, UK Adel M. Alimi, Tunisia Peter Andras, UK Plamen Angelov, UK Sabri Arik, Turkey
Vasily Aristarkhov, Russian Federation Costin Badica, Romania Vitoantonio Bevilacqua, Italy
Salim Bouzerdoum, Australia Martin Brown, UK Jinde Cao, China Uday K., USA Pei-Chann Chang, Taiwan Peng Chen, China Shyi-Ming Chen, Taiwan Shih-Hsin Chen, Taiwan Weidong Chen, China Wen-Sheng Chen, China Xiyuan Chen, China Yuehui Chen, China Min-Sen Chiu, Singapore Michal Choras, Poland Tommy Chow, Hong Kong Jose Alfredo F. Costa, Brazil Kevin Curran, UK Mingcong Deng, Japan Gabriella Dellino, Italy Salvatore Distefano, Italy Ji-Xiang Du, China Meng Joo Er, Singapore Karim Faez, Iran Jianbo Fan, China Minrui Fei, Canada Wai-Keung Fung, Canada Max H. Garzon, USA Liang Gao, China Ping Guo, China Qing-Wei Gao, China Xiao-Zhi Gao, Finland Chandan Giri, India Kayhan Gulez, Turkey Fei Han, China Kyungsook Han, Korea Aili Han, China Jim Harkin, UK Haibo He, USA Francisco Herrera, Spain Laurent Heutte, France Wei-Chiang Hong, Taiwan Yuexian Hou, China
Guang-Bin Huang, Singapore Peter Chi Fai Hung, Ireland Won Joo Hwang, Korea Estevam Rafael Hruschka J., Brazil Myong K. Jeong, USA Guangrong Ji, China Zhenran Jiang, China Kang-Hyun Jo, Korea Jih-Gau Juang, Taiwan Dah-Jing Jwo, Taiwan Janusz Kacprzyk, Poland Fakhri Karray, Canada Hirotaka Inoue Kure, Japan Jia Li, China Visakan Kadirkamanathan, UK Hee-Jun Kang, Korea Sanggil Kang, Korea Uzay Kaymak, Netherlands Ziad Kobti, Canada Mario Koeppen, Japan Muhammad Khurram Khan, Pakistan Donald H. Kraft, USA Harshit Kumar, Korea Takashi Kuremoto, Japan Hak-Keung Lam, UK Sungshin Kim, Korea In-Soo Koo, Korea Yoshinori Kuno, Japan Turgay Ibrikci, Turkey Richard Lathrop, USA Choong Ho Lee, Korea Vincent C.S. Lee, Australia Dalong Li, USA Guo-Zheng Li, China Peihua Li, China Xiaoli Li, China Xin Li, China Xueling Li, China
Hualou Liang, USA Chunmei Liu, USA Ju Liu, China Van-Tsai Liu, Taiwan Wanquan Liu, Australia Yanzhang Liu, China Ahmad Lotfi, UK Hongtao Lu, China Jinwen Ma, China Shiwei Ma, China Hiroshi Mamitsuka, Japan Filippo Menolascina, Italy Tarik Veli Mumcu, Turkey Roman Neruda, Czech Republic Frank Neumann, Germany Minh Nhut Nguyen, Singapore Ngoc Thanh Nguyen, Poland Sim-Heng Ong, Singapore Francesco Pappalardo, Italy Sung-Joon Park, Korea Daniel Patino, Argentina Girijesh Prasad, UK Prashan Premaratne, Australia Nini Rao, China Miguel Alberto Melgarejo Rey, Colombia Peter Rockett, UK Fariba Salehi, Iran Angel Sappa, Spain Karadeniz, Turkey Aamir Shahzad, Pakistan Li Shang, China Nobutaka Shimada, Japan Jiatao Song, China Anantaporn Srisawat, Thailand Nuanwan Soonthornphisaj, Thailand Joao Miguel da Costa Sousa, Portugal
Min Su, USA Zhan-Li Sun, Singapore Maolin Tang, Australia Antonios Tsourdos, UK Naoyuki Tsuruta, Japan Athanasios Vasilakos, Greece Anhua Wan, China Chao-Xue Wang, China Jeen-Shing Wang, Taiwan Jiang-Qing Wang, China Yong Wang, Japan Zhi Wang, China
Hong Wei, UK Zhi Wei, China Ling-Yun Wu, China Shunren Xia, China Yu Xue, China Ching-Nung Yang, Taiwan Jun-Heng Yeh, Taiwan Myeong-Jae Yi, Korea Xinge You, China Tina Yu, Canada Zhi-Gang Zeng, China Guisheng Zhai, Japan
Jun Zhang, China Xi-Wen Zhang, China Hongyong Zhao, China Xiaoguang Zhao, China Zhongming Zhao, USA Bo-Jin Zheng, China Fengfeng Zhou, USA Byoung-Tak Zhang, Korea Xing-Ming Zhao, Japan Chun-Hou Zheng, China Daqi Zhu, China Xiaojin Zhu, China
Reviewers Rahat Abbas, Janos Abonyi, Giuseppe M.C. Acciani, Ali Ahmed Adam, Alimi Adel, Muhammad Zubair Afzal, H. Agaeinia, Hassan Aghaeinia, Ali Aghagolzadeh, Chang Wook Ahn, Lifeng Ai, Ayca Gokhan Ak, Waseem Akhtar, Mustafa Aktas, Songul Albayrak, Davide Alemani, Rahim Ali, Ibrahim Aliskan, Muhammad Alkarouri, Abdullah Al-Malaise, Rui Jorge Almeida, Khareem Almo, Dario Aloise, Pablo Javier Alsina, Roberto T. Alves, Saleh Aly, Marco Alzate, Hamidreza Amindavar, Plamen Angelov, Dennis Barrios Aranibar, Nestor Arana Arexolaleiba, Salvatore Arinisi, Vasily Aristarkhov, Ali Ashraf-Modarres, Krassimir Atanassov, Mutlu Avci, Phillipa Avery, Erel Avineri, Thouraya Ayedi, Pedro Paulo Ayrosa, Amelia Badica, Hyeon Bae, Aditya Bagchi, Chenggang Bai, Meng Bai, Amar Balla, Lucia Ballerini, Rajib Bandyopadhyay, Sudhirkumar Barai, Peter Baranyi, Nicola Barbarini, Jose Joel Gonzalez Barbosa, Andres Eduardo Gaona Barrera, Guilherme Barreto, Lucia Barron, Ying L. Becker, Nur Bekiroglu, Ammar Belatreche, Domenico Bellomo, Umesh Bellur, Tomas Beran, Saul Bertuccio, Alvaro Betancourt, Vitoantonio Bevilacqua, Fiachra Mac Giolla Bhríde, M.R. Bhujade, Rongfang Bie, Gennaro Nicola Bifulco, Laurentiu Biscu, P.K. Biswas, Santosh Biswas, Antonino Biundo, Dario de Blasiis, S.M. Bohte, Danail Bonchev, Andreia G. Bonfante, Olaf Booij, Giuseppe Borzi, Janez Brank, Agostinho de Medeiros Brito Junior, Dimo Brockhoff, Dario Bruneo, Ni Bu, Mari Angelica Camargo-Brunetto, Louis-Claude Canon, Galip Cansever, Anne Magali de Paula Canuto, Jianting Cao, Jinde Cao, Yang Cao, Yuan Cao, Lucia Cariello, Leonarda Carnimeo, Bianca di Angeli Carreras Simoes Costa, Bruno Motta de Carvalho, Matthew Casey, Ssa Giovanna Castellano, Marcello Castellano, Filippo Castiglione, Oscar Castillo, Pablo de Castro Roberto Catanuto, Zhiwei Cen, Jes de Jesus Fiais Cerqueira, Mark Chadwick, P. P. 
Chakrabarty, Mandira Chakraborty, Sandipan Chakpoborty, Chien-lung Chan, Chuan-Yu Chang, Yeong-Chan Chang, Dong Eui Chang, Kuei-Hsiang Chao, Kuei-Hsiang Chao, Liheng Chao, Hassan Tariq Chattha, Santanu Chattopadhyay, Rizwan Chaudhry, Saurabh Chaudhury, Dongsheng Che, Jiuhua Chen, Chun-Hao Chen, Cycer Chen, Chuyao Chen, Dan Chen, Shi-Jay Chen, Dongsheng Chen, Ziyi Chen, Feng-Chi Chen, Tin-Chih Chen, Yen-Ping Chen, Xuedong Chen, Zhi-Jie Chen, GS Chen, Li-Wen Chen, Miller Chen, Xinkai Chen,
Xinyu Chen, Peter Chen, Sheng Chen, Zehua Chen, Gang Chen, Ming Chen, Peng Chen, Yong Chen, Hui Chen, Ken Chen, Lin Chen, Qisong Chen, Yiming Chen, Qiming Cheng, Ming-Yang Cheng, Mu-Huo Cheng, Victor Cherepanov, Ching-Tsan Cheung, Chi Chiu Chiang, Jen-Chieh Chiang, Jerry Chien, C. H. Chin, Chaochang Chiu, Chih-Hui Chiu, Min-Sen Chiu, Leszek Chmielewski, Dong-Yeon Cho, ChangSik Choi, Sungsoo Choi, Sungsoo Choi, Won Ho Choi, Michal Choras, Smitashree Choudhary, Yun-Kung Chung, Andrzej Cichocki, Vincent Cicirello, Alessandro Cincotti, Guilherme Coelho, Leandro Coelho, Dorian Cojocaru, Joan Condell, Oscar Cordon, Luciano da Fontoura Costa, Jose Alfredo, F. Costa, Mirel Cosulschi, Deborah Cravalho, Valentin Cristea, Cuco Cristiano, Jie Cui, Feipeng Da, Keshav Dahal, Zhifeng Dai, Hong-Yi Dai, Domenico Daleno, Nabanita Das, Bijan Davvaz, Kaushik Deb, Jayanta Kumar Debnath, Alberto Del, Bimbo Haibo Deng, Glad Deschrijver, Michael Dewar, Sajal Dey, Habib Dhahri, Jianli Ding, Alessia D'Introno, Banu Diri, Salvatore Distefano, Adriana Dobriceanu, Wenyong Dong, Yan Dong, Guy Drouin, Yongping Du, Xin Du, Mojie Duan, Fuqing Duan, Yunsuo Duan, Li Duan, Wieslaw A. Dudek, Martyn Durrant, Nees Jan van Eck, John Economou Shinto Eguchi, Chen Ei, Mehmet Kubilay Eker, Atilla Elçi, Meisam Emamjome, Seref N. Engin, Tolga Ensari, Zeki Erdem, Koksal Erenturk, Kadir Erkan, Osman Erol, Andrés Escobar, Imen Essafi, Charles Eugene, Eugene C. 
Ezin, Mehdi Ezoji, Umar Faiz, Alexandre Xavier Falcão, Ivanoe De Falco, Chun-I Fan, Chin yuan Fan, Shaojing Fan, Jian Fan, Xiang Fan, Kai Fan, Ping-An Fang, Yong Fang, Yi Fang, Adel Omran Farag, Sheyla Farias, Maria Fazio, Joseana Macedo Fechine, Jun Fei, Balazs Feil, Naizhang Feng, Jan Feyereisl, Sevan Ficici, Juan Carlos Figueroa, Simone Fiori, Robert Fisher, Kenneth Ford, Girolamo Fornarelli, Carlos Henrique Forster, Flavius Frasincar, Chaojin Fu, Shengli Fu, Hong Fu, Yu Fu, John Fulcher, Wai-keung Fung, Colin Fyfe, Sebastian Galvao, Zhaohui Gan, zunhai Gao, Jianxin Gao, Xiao-Zhi Gao, Qingwei Gao, Shouwei Gao, Tiehong Gao, Haibin Gao, Xin Gao, Andres Gaona, Juan Carlos Figueroa García, Alexandru Gartner, Vicente Zarzoso Gascon-Pelegri, António Gaspar-Cunha, Dingfei Ge, Fei Ge, Pando Georgiev, David Geronim, Adam Ghandar, Arfan Ghani, Pradip Ghanty, Hassan Ghasemian, Supratip Ghose, R. K. Ghosh, Marco Giannini Gustavo, Gimenez Mark Girolami, Adrian Giurca, Brendan Glackin, Cornelius Glackin, Amin Yazdanpanah Goharrizi, Jackson Gomes, Márcio Leandro Gonçalves, Feng Gong, Xing Gong, Xiujun Gong, Adilson Gonzaga, Flavius Gorgonio, Diganata Goswami, Victor Hugo Grisales, André Grüning, Feng Gu, Ricardo Ribeiro Gudwin, Andrea Guerriero, Jie Gui Kayhan Gülez, Kayhan Gulez, Ge Guo, Feng-Biao Guo, Lanshen Guo, Tiantai Guo, Weiping Guo, Zheng Guo, A K Gupta, A. Gupta, Indranil Gupta, Dan Gusfield, Giménez-Lugo Gustavo, Taeho Ha, Javad Haddadnia, Tarek M. Hamdani, Yousaf Hamza, A. Han, Kyungsook Han, KukHyun Han, Lianyi Han, Kijun Han, Santoso Handri, yuanling Hao, Edda Happ, Jim Harkin, Pitoyo Hartono, Nada Hashmi, Mark Hatcher, Jean-Bernard Hayet, Guoliang He, Zhaoshui He, Zhongkun He, Zhiyong He, Hanlin He, Jun He, Liu He, Yu He, Martin Hermanto, Emilio Del Moral Hernandez, Carlos Herrera, Christian W. 
Hesse, Hidehiro Ohki Hidehiro, John Ho, Murillo Rodrigo Petrucelli Homem, Murillo Homem, Wei-Chiang Hong, Dihui Hong, Xia Hong, Gen Hori, Keiichi Horio, Shijinn Horng, Christian Horoba, Alamgir Hossain, Yuexian Hou, Zhixiang Hou, Guolian Hou, Estevam R. Hruschka Jr., Chen-Huei Hsieh, Jih-Chang Hsieh, Jui-chien Hsieh, Sun-Yuan Hsieh, Chi-I Hsu, Yu-Liang Hsu, Dan Hu, Yongqiang Hu, Xiaolin
Hu, Ting Hu, YAN Hua, Chuanxiu Huang, Jian Huang, Wei-Hsiu Huang, Sun-Jen Huang, Weichun Huang, Weitong Huang, Ying J. Huang, Yuefei Huang, Jian Huang, Ping Huang, Di Huang, Evan J Hughes, Yung-Yao Hung, Changyue Huo, Knut Huper, Saiful Huq, Kao-Shing Hwang, I-Shyan Hwang, Won-Joo Hwang, Mintae Hwang, Hwang, Wonju Hwang, Muhammad Usman Ilyas, Anca Ion, Ahmad Ali Iqbal, Zahid Irfan, Y. Ishida, Ivan Nunes Silva, Kuncup Iswandy, Marcin Iwanowski, Yumi Iwashita, Sridhar Iyer, Gonçalves, J. F., Beirão, N., Saurabh Jain, Lakhmi Jain, Sanjay Kumar Jana, D. Janakiram, Jun-Su Jang, Marko Jankovic, Mun-Ho Jeong, Zhi-Liang Ji, Hongjun Jia, Wei Jia, Jigui Jian, Cizhong Jiang, Chang-An Jiang, Yuncheng Jiang, Minghui Jiang, Xingyan Jiang, Lihua iang, Bin Jiao, Kyohong Jin, Zhong Jin, Rong Jin, Geunsik Jo, Jang Wu Jo, Torres-Sospedra Joaquin, Daniel Johannsen, Colin Johnson, José Demisio Simões da Silva, R.K. Joshi, Tejal Joshi, Koo Joungsun, Jih-Gau Juang, Carme Julià, Young Bae Jun, Heesung Jun, Khurum Nazir Junejo, Jinguk Jung, Francisco Madeiro Bernardino Junior, Roberto Marcondes Cesar Junior, Dah-Jing Jwo, Osvaldo Mafra Lopes Junio, E. 
Kabir, Visakan Kadirkamanathan, Salim Kahveci, kaka, Ilhem Kallel, Habib Kammoun, Hamid Reza Rashidy Kanan, Hyunduk Kang, Hyun-Deok Kang, Hee-June Kang, Hyunduk Kang, Henry Kang, Yasuki Kansha, Cihan Karakuzu, Ghader Karimian, Bekir Karlik, Shohreh Kasaei, Faisal M Kashif, Boer-Sorbán Katalin, H Kawasaki, Olesya Kazakova, Christel Kemke, Tamas Kenesei, Selami Kesler, Muhammad Khurram Khan, Malik Jahan Khan, Shehroz Khan, Pabitra Mohan Khilar, Pabitra Khilar, Chin Su Kim, Chungsan Kim, Dae-Nyeon Kim, Myung-Kyun Kim, Kane Kim, Pil Gyeom Kim, Seong Joo Kim, Eunchan Kim, Gwan-Su Kim, Hak Lae Kim, Kanghee Kim, Il Kon Kim, Sung S Kim, Taeho Kim, Christian Klein, Chun-Hsu Ko, Yoshinori Kobayashi, Kunikazu Kobayashi, Andreas Koenig, Mario Koeppen, Andrew Koh, xiangzhen Kong, Insoo Koo, Murakami Kouji, Vladik Kreinovich, Ibrahim Kucukdemiral, Rajeev Kumar, Chao-Lin Kuo, Tzu-Wen Kuo, Wen-Chung Kuo, Simon Kuo, Takashi Kuremoto, Zarei-Nia Kurosh, Janset Kuvulmaz, Yung-Keun Kwon, Chien-Yuan Lai, Franklin Lam, H.K. Lam, Andrey Larionov, Pietro Larizza, M. Mircea Lazar, Vincenzo Di Lecce, Yulia Ledeneva, Bore-Kuen Lee, Chiho Lee, Kyung Chang Lee, Vincent C S Lee, Myung-Joon Lee, Guanling Lee, Hong-Hee Lee, Ka-keung Lee, Shao-Lun Lee, Eun-Mi Lee, In-Hee Lee, Sangho Lee, Minho Lee, N.Y. Lee, Peter Lee, Lee, Lee, Suwon Lee, Vincent Lee, Per Kristian Lehre, Yujun Leng, Agustin Leon, Carson K. 
Leung, Alexandre Levada, Ao Li, Caiwei Li, Chen Li, Chia-Hsiang Li, Chien-Kuo Li, Bo Li, Mingdong Li, Hualiang Li, Weigang Li, KeQing Li, Xinyu Li, Heng-Chao Li, Guozheng Li, Hongchun Li, Kangshun Li, Qingfeng Li, Xiaodong Li, zhisheng Li, HuiFang Li, Renwang Li, Shanbin Li, Xueling Li, Yueping Li, Liyuan Li, Rewang Li, Shutao Li, Yiyang Li, Fuhai Li, Li Erguo, Jian Li, Yong Li, Lei Li, Min Li, Feng-Li Lian, Yun-Chia Liang, Hualou Liang, Han Liang, Liao, Wudai Liao, Hee-Woong Lim, Cheng-Jian Lin, Chih-Min Lin, Feng-Yan Lin, Jyun Jie Lin, Jyun-Yu Lin, Jun-Lin Lin, Yu-Chen Lin, Jimmy Lin, Lin, Hao Lin, Junjie Lin, Yingbiao Ling, Steve Ling, Chang Liu, Che-Wei Liu, Bingqiang Liu, Yubao Liu, Xingcheng Liu, Yongmei liu, Jing Liu, Mei-qin Liu, Qingshan Liu, Van-Tsai Liu, KunHong Liu, liangxu liu, Shiping Liu, Weiling Liu, Xiaomin Liu, Xiaoyue Liu, Yu-ling Liu, Zhiping Liu, Hongbo Liu, Jizhen Liu, Liu, Yifan Liu, Qian Liu, Xiao Liu, Jin Liu, Jun Liu, Yue Liu, Joe K. W. Lo, Asim Loan, Andrey Logvinov, Francesco Longo, Milan Lovric, Baoliang Lu, Yixiang Lu, Junguo
Lu, Feng Lu, June Lu, Wei Lu, CJ Luh, Luiz Marcos Garcia Gonçalves, Andrew Lumsdaine, Tom Lunney, Jingchu Luo, Yan Luo, Leh Luoh, Yan Lv, Chuang Ma, Yinglong Ma, Liyong Ma, Irwin Ma, Jin Ma, Sakashi Maeda, Sakashi Maeda, Sudipta Mahapatra, Sydulu Maheswarapu, Andre Laurindo Maitelli, A.K. Majumdar, Chandan Majumdar, Terrence Mak, Hiroshi Mamitsuka, Qing-Kui Man, Achintya Kumar Mandal, Danilo Mandic, Mata-Montero ManriqueAtif Mansoor, Chengxiong Mao, Zhiming Mao, Fenglou Mao, Zhihong Mao, Weihua Mao, Kezhi Mao, Joao Fernando Marar, Márcio Leandro Gonçalves Mario Marinelli, Francescomaria Marino Urszula Markowska-Kaczmar, Alan Marshall, Allan de Medeiros Martins, Nelson Delfino d Avila Mascarenhas, Emilio Mastriani, Giuseppe Mastronardi, Francesco Masulli, Mohammad Ali Maud, Giancarlo Mauri, Joseph McClay, Liam McDaid, Malachy McElholm, Adelardo A. Dantas de Medeiros, Claudio Medeiros, Reginald Mehta, Jorge Dantas de Melo, Luis Mendonca, Weixiao Meng, Filippo Menolascina, Jianxun Mi, Hirvensalo Mika, Nikolay Mikhaylov, Claudia Milaré, Viorel Milea, Milos Radovanovic, Mihoko Minami, Tsunenori Mine, Giuseppe Minutoli, Sushmita Mitra, Mandar Mitra, Yasue Mitsukura, Jinqiu Mo, Asunción Mochón, Hamid Abrishami, Moghaddam Hamid, Abrishami Moghaddam, Nurul Haque Mollah, Marina Mongiello, Inhyuk Moon, Fearghal Morgan, Yasamin Mostofi, Santo Motta, saeed Mozaffari, Mikhail Mozerov, Krishnendu Mukhopadhyay, J. Mukhopadhyay, Hamid Mukhtar, Tarik Veli Mumcu, T. Murakami, C. Siva Ram Murthy, Muhammad Aziz Muslim, Kazuo Nakamura, Sukumar Nandi, David Naso, Pedro L.K.G Navarro, Duarte Dória Neto, Frank Neumann, WK Ng, Hoi Shing Raymond NG, Tian-Tsong Ng, Vinh Hao Nguyen, Tam Nguyen, Ni, Oana Nicolae, Li Nie, Ke Ning, Luis F. Nino, Fauzia Nisar, Maria Nisar, Takuichi Nishimura, Qun Niu, Shimada Nobutaka, Lars Nolle, Clement Nyirenda, Masanao Obayashi, Hasan Ocak, Richard Oentaryo, Jaewon Oh, Halil Ibrahim Okumus, M. 
Sorin Olaru, Luiz Affonso H Guedes de Oliveira, Pietro Oliveto, Onat, Kok-Leong Ong, Johan Oppen, Denis Orel, Ajiboye Osunleke, Gaoxiang Ouyang, Ali Ozen, Oprao Pag, Umapada Pal, luca Paladina, Sarbani Palit, Shanliang Pan, Tianhong Pan, Wan-Ling Pan, Paolo Pannarale, Maurizio Paone, Angelo Paradiso, Emerson Paraiso, Daniel Paraschiv, Sang Kyeong Park, Jintae Park, Swapan Kumar Parui,Halit Pastaci, Giuseppe Patanè, Athanasios Pavlou, Jeronimo Pellegrini, Jeronimo Pellegrini, Wei Peng, Marzio Pennisi, Graziano Pesole, Emil Petre, Alfredo Petrosino, Minh-Tri Pham, Vinhthuy Phan, Francesco Piazzaa, Aderson Pifer, Pinar, Huseyin Polat, Alexander Ponomarenko, Alisa Ponomarenko, Elvira Popescu, Girijesh Prasad, Prashan Premaratne, Adam Prugel_bennett, Andrzej Przybyszewski,Viswanath Pulabaigari, Alfredo Pulvirenti, Liu Qian, Haiyan Qiao, Lishan Qiao, Yu Qiao, Hong Qin, Jun Qin,Ying-qiang Qiu, ying qiu, Dong-Cai Qu, Tho Quan, Paulo Quintiliano, Ijaz Mansoor Qureshi, Tariq Rasheed Qureshi, Anas Quteishat, S.V. Raghavan, Carmelo Ragusa, Mkm Rahman, Anca Ralescu, Ramon Zatarain-Cabada, Milton Ramos, Zeeshan Rana, Raquel Esperanza Patiño Escarcina, Jiangtao Ren, Jian Ren, Alberto Rey, Orion Fausto Reyes-Galaviz, Robert Reynolds, Gianbattista Rocco, Peter Rockett, Liu Rong, A.K. Roy, Kaushik Roy, Uttam Roy, Changhai Ru, XiaoGang Ruan, Tomasz Rutkowski, Khalid Saeed, Doris Sáez, Alaa Sagheer, G. Saha, Ratnesh Sahay, Halil Ibrahim Sahin, Mohamed Sahmoudi, G Sajith, Pijush Samui, Saeid Sanei, David Sankoff, Edimilson B. dos Santos, Jose Santos, Brahmananda Sapkota, Angel Sappa, P.Saratchandran, Yoshiko Sato, Gerald
Schaefer, Giuseppe Scionti, Dan Selisteanu, S. Selvakumar, Kirusnapillai Selvarajah, Amitava Sen, Sibel Senan, Dorin Sendrescu, Indranil Sengupta, D.Y. Sha, A Shah, Syed Faisal Ali Shah, Syed Ismail Shah, Suleman Shahid, Bilal Shams, Shahnoor Shanta, Li Shao, Qadeer Sharif, Shahzad Amin Sheikh, Hao Shen, Xianjun Shen, Yantao Shen, Yehu Shen, Jinn-Jong Sheu, Chuan Shi, MingGuang Shi, Yongren Shi, Ke Shi, Horng-Lin Shieh, Motoki Shiga, Atsushi Shimada, Tetsuya Shimamura, SooYong Shin, Woochang Shin, Tae zi Shin, Takahashi Shinozaki, Dipak Lal Shrestha, Bi Shuhui, Leandro Augusto da Silva, Fulvio Simonelli, Leszek Sliwko, Kate A.Smith, Grant Smith, Heliana B. Soares, Zhuoyue Song, Qiankun Song, Yinglei Song, Ong Yew Soon, Nuanwan Soonthornphisaj, Jairo Soriano, Joao M. C. Sousa, Marcilio Carlos P. de Souto, Jackson Gomes de Souza, Birol Soysal, Stefano Squartini, Mscislaw Srutek, Cristina Stoica, Umberto Straccia, Antony Streklas, Zheng Su, Min Su, Ahlada Sudersan, Akira Suganuma, Youngsoo Suh, Ziwen Sun, Tsung-Ying Sun, Tien-Lung Sun, Xiangyang Sun, Jingchun Sun, Shiwei Sun, Lily Sun, Yude Sun, Nak Woon Sung, Seokjin Sung, Worasait Suwannik, Aqeel Syed, Duong Ta, Abdullah Taha, Chen Tai, Oluwafemi Taiwo, Shin-ya Takahashi, B. 
Talukdar, Hakaru Tamukoh, Guangzheng Tan, Ping Tan, Toshihisa Tanaka, Chunming Tang, Hong Tang, David Taniar, Zou Tao, Liang Tao, Imran Tasadduq, Peter Tawdross, Mohammad Teshnehlab, Niwat Thepvilojanapong, Daniel Thiele, Quan Thanh Tho, Jingwen Tian, Jiang Tian, Yun Tian, Ye Tian, Huaglory Tianfield, Ching-Jung Ting, Massimo Tistarelli, Stefania Tommasi, Ximo Torres, Farzad Towhidkhah, Cong Tran-Xuan, Roque Mendes Prado Trindade, Hoang-Hon Trinh, Gianluca Triolo, Giuseppe Troccoli, Chieh-Yuan Tsai, Chi-Yang Tsai, Chueh-Yung Tsao, Norimichi Tsumura, Naoyuki Tsuruta, Hang Tu, Hung-Yi Tu, Luong Trung Tuan, Petr Tuma, Cigdem Turhan, Francesco Tusa, Bulent Tutmez, Seiichi Uchida, Muhammad Muneeb Ullah, Nurettin Umurkan, Mustafa Unel, Ray Urzulak, Ernesto Cuadros Vargas, Andrey Vavilin, Simona Venuti, Silvano Vergura, Susana Vieira, Geoffrey Vilcot, Massimo Villari, Boris Vintimilla, Holger Voos, Juan Wachs, John Wade, Hiroshi Wakuya, Julie Wall, Li Wan, Bohyeon Wang, Chao Wang, Chengyou Wang, Xingce Wang, Jia-hai Wang, Jiasong Wang, Guoli Wang, Yadong Wang, Xiaomin Wang, Jeen-Shing Wang, Zhongsheng Wang, Guoren Wang, Xiangyang Wang, Zhongxian Wang, Jianying Wang, LingLing Wang, Ruisheng Wang, Xiaodong Wang, XiaoFeng Wang, Xiaojuan Wang, Xiaoling Wang, Xuan Wang, Zhengyou Wang, Haijing Wang, Hesheng Wang, Hongxia Wang, Hongyan Wang, Jianmin Wang, Junfeng Wang, Linshan Wang, Shuting Wang, Yanning Wang, Zhisong Wang, Huimin Wang, Huisen Wang, Mingyi Wang, Shulin Wang, Zheyou Wang, Haili Wang, Jiang Wang, Kejun Wang, Linze Wang, Weiwu Wang, Jina Wang, Jing Wang, Ling Wang, Meng Wang, Qifu Wang, Yong Wang, Yan Wang, Yoshikazu Washizawa, Shih-Yung Wei, Shengjun Wen , Shenjun Wen, Guozhu Wen, Seok Woo, Derek Woods, Chao Wu, Christine Wu, Zikai Wu, Hsiao-Chun Wu, Quanjun Wu, YongWei Wu, Ing-Chyuan Wu, Shiow-yang Wu, Shiqian Wu, Shaochuan Wu, Wen-Chuan Wu, JianWu Wu, Weimin Wu, Qiong Wu, Sitao Wu, Peng Wu, Min Wu, Jun-Feng Xia, Li Xia, Yongkang Xiao, Jing Xiao, Lijuan Xiao, 
Renbin Xiao, Gongnan Xie, Zhijun Xie, Caihua Xiong, Wei Xiong, ChunGui Xu, Chunsui Xu, Weidong Xu, Wenlong Xu, Xiaoyin Xu, Zeshui Xu, Huan Xu, Wei Xu, Yun Xu, Xuanli Wu, Quan Xue, Yu Xue, Xuesong Yan, Li Yan, Banghua Yang, Junghua Yang, Wuchuan Yang, Yingyun Yang, Hyunho Yang, Junan Yang, Shixi
Yang, Sihai Yang, Song Yang, Yan Yang, Ming-Jong Yao, Xingzhong Yao, Daoxin Yao, Obilor Yau, Xiaoping Ye, Liang Ye, Chia-Hsuan Yeh, Ming-Feng Yeh, JunHeng Yeh, James Yeh, Yang Yi, Tulay Yildirim, Jian Yin, Zhouping Yin, Qian Yin, Yang Yong, Murilo Lacerda Yoshida, Norihiko Yoshida, Kaori Yoshida, Kenji Yoshimura, Mingyu You, Yu Sun Young, Changrui Yu, Gwo-Ruey Yu, Xinguo Yu, Ming Yu, Tina Yu, Zhiyong Yuan, Guili Yuan, Fang Yuan, Jing Yuan, Jing Yuan, Eylem Yucel, Lu Yue, Masahiro Yukawa, Mi-ran Yun, C. Yung, Anders Zachrison, Aamer Zaheer, Kun Zan, Yossi Zana, Rafal Zdunek, Zhigang Zeng, Wenyi Zeng, Chuan-Min Zhai, Byoung-Tak Zhang, Chuan Zhang, Dabin Zhang, Guangwei Zhang, Ping Zhang, Xianxia Zhang, Yongmin Zhang, Xiangliang Zhang, Zhiguo Zhang, Jingliang Zhang, De-xiang Zhang, Xiaowei Zhang, Xiaoxuan Zhang, Yongping Zhang, Jianhua Zhang, Junpeng Zhang, Shanwen Zhang, Si-Ying Zhang, Weigang Zhang, Yonghui Zhang, Zanchao Zhang, Zhiyong Zhang, Guohui Zhang, Guowei Zhang, Jiacai Zhang, Li-bao Zhang, Liqing Zhang, Yunong Zhang, Zhijia Zhang, LiBao Zhang, Wenbo Zhang, Jian Zhang, Ming Zhang, Peng Zhang, Ping Zhang, Zhen Zhang, Fei Zhang, Jie Zhang, Jun Zhang, Li Zhang, Bo Zhao, Xiaoguang Zhao, Quanming Zhao, Xiaodong Zhao, Yinggang Zhao, Zengshun Zhao, Yanfei Zhao, Ting Zhao, Yaou Zhao, Qin Zhao, Xin Zhao, Yi Zhao, Bojin Zheng, Xin Zheng, Yi Zheng, Aimin Zhou, Chi Zhou, Chunlai Zhou, Xiaocong Zhou, Fengfeng Zhou, Qinghua Zhou, Jiayin Zhou, Zekui Zhou, Qiang Zhou, Wei Zhou, Dao Zhou, Hao Zhou, Jin Zhou, Wen Zhou, Zhongjie Zhu, Quanmin Zhu, Wei Zhu, Hankz Zhuo, Majid Ziaratban.
Table of Contents
Evolutionary Computing and Genetic Algorithms

Adaptive Routing Algorithm in Wireless Communication Networks Using Evolutionary Algorithm . . . . . 1
Xuesong Yan, Qinghua Wu, and Zhihua Cai

A New GA-Based and Graph Theory Supported Distribution System Planning . . . . . 7
Sajad Najafi Ravadanegh

Sequencing Mixed-Model Assembly Lines with Limited Intermediate Buffers by a GA/SA-Based Algorithm . . . . . 15
Binggang Wang, Yunqing Rao, Xinyu Shao, and Mengchang Wang

Solving Vehicle Routing Problem Using Ant Colony and Genetic Algorithm . . . . . 23
Wen Peng and Chang-Yu Zhou
Knowledge Discovery and Data Mining

A Research on the Association of Pavement Surface Damages Using Data Mining . . . . . 31
Ching-Tsung Hung, Jia-Ray Chang, Jian-Da Chen, Chien-Cheng Chou, and Shih-Huang Chen

An Integrated Method for GML Application Schema Match . . . . . 39
Chao Li, Xiao Zeng, and Zhang Xiong

Application of Classification Methods for Forecasting Mid-Term Power Load Patterns . . . . . 47
Minghao Piao, Heon Gyu Lee, Jin Hyoung Park, and Keun Ho Ryu

Design of Fuzzy Entropy for Non Convex Membership Function . . . . . 55
Sanghyuk Lee, Sangjin Kim, and Nam-Young Jang

Higher-Accuracy for Identifying Frequent Items over Real-Time Packet Streams . . . . . 61
Ling Wang, Yang Koo Lee, and Keun Ho Ryu

Privacy Preserving Sequential Pattern Mining in Data Stream . . . . . 69
Qin-Hua Huang
Methods of Computing Optimization

A General k-Level Uncapacitated Facility Location Problem . . . . . 76
Rongheng Li and Huei-Chuen Huang

Fourier Series Chaotic Neural Networks . . . . . 84
Yao-qun Xu and Shao-ping He

Numerical Simulation and Experimental Study of Liquid-Solid Two-Phase Flow in Nozzle of DIA Jet . . . . . 92
Guihua Hu, Wenhua Zhu, Tao Yu, and Jin Yuan

Shape Matching Based on Ant Colony Optimization . . . . . 101
Xiangbin Zhu
Fuzzy Systems and Soft Computing

A Simulation Study on Fuzzy Markov Chains . . . . . 109
Juan C. Figueroa García, Dusko Kalenatic, and Cesar Amilcar Lopez Bello

A Tentative Approach to Minimal Reducts by Combining Several Algorithms . . . . . 118
Ning Xu, Yunxiang Liu, and Ruqi Zhou

Ameliorating GM (1, 1) Model Based on the Structure of the Area under Trapezium . . . . . 125
Cuifeng Li

Comparative Study with Fuzzy Entropy and Similarity Measure: One-to-One Correspondence . . . . . 132
Sanghyuk Lee, Sangjin Kim, and DongYoup Lee

Low Circle Fatigue Life Model Based on ANFIS . . . . . 139
Changhong Liu, Xintian Liu, Hu Huang, and Lihui Zhao

New Structures of Intuitionistic Fuzzy Groups . . . . . 145
Chuanyu Xu
Intelligent Computing in Pattern Recognition An Illumination Independent Face Verification Based on Gabor Wavelet and Supported Vector Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xingming Zhang, Dian Liu, and Jianfu Chen Hardware Deblocking Filter and Impact . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hao Lian and Mohammed Ghanbari
153 161
Table of Contents
Medical Image Segmentation Using Anisotropic Filter, User Interaction and Fuzzy C-Mean (FCM) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . M.A. Balafar, Abd. Rahman Ramli, M. Iqbal Saripan, Rozi Mahmud, and Syamsiah Mashohor Medical Image Segmentation Using Fuzzy C-Mean (FCM), Learning Vector Quantization (LVQ) and User Interaction . . . . . . . . . . . . . . . . . . . . . M.A. Balafar, Abd. Rahman Ramli, M. Iqbal Saripan, Rozi Mahmud, and Syamsiah Mashohor New Data Pre-processing on Assessing of Obstructive Sleep Apnea Syndrome: Line Based Normalization Method (LBNM) . . . . . . . . . . . . . . . Bayram Akdemir, Salih G¨ une¸s, and S ¸ ebnem Yosunkaya Recognition of Plant Leaves Using Support Vector Machine . . . . . . . . . . . Qing-Kui Man, Chun-Hou Zheng, Xiao-Feng Wang, and Feng-Yan Lin
XVII
169
177
185 192
Region Segmentation of Outdoor Scene Using Multiple Features and Context Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Dae-Nyeon Kim, Hoang-Hon Trinh, and Kang-Hyun Jo
200
Two-Dimensional Partial Least Squares and Its Application in Image Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mao-Long Yang, Quan-Sen Sun, and De-Shen Xia
208
Intelligent Computing in Bio/Cheminformatics A Novel Method of Creating Models for Finite Element Analysis Based on CT Scanning Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Liulan Lin, Jiafeng Zhang, Shaohua Ju, Aili Tong, and Minglun Fang
216
Accelerating Computation of DNA Sequence Alignment in Distributed Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tao Guo, Guiyang Li, and Russel Deaton
222
Predicting Protein Function by Genomic Data-Mining . . . . . . . . . . . . . . . . Changxin Song and Ke Ma
229
Tumor Classification Using Non-negative Matrix Factorization . . . . . . . . . Ping Zhang, Chun-Hou Zheng, Bo Li, and Chang-Gang Wen
236
Intelligent Control and Automation A Visual Humanoid Teleoperation Control for Approaching Target Object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Muhammad Usman Keerio, Altaf Hussain Rajpar, Attaullah Khawaja, and Yuepin Lu
244
XVIII
Table of Contents
An Intelligent Monitor System for Gearbox Test . . . . . . . . . . . . . . . . . . . . . Guangbin Zhang, Yunjian Ge, Kai Fang, and Qiaokang Liang Development of Simulation Software for Coal-Fired Power Units Based on Matlab/Simulink . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chang-liang Liu, Lin Chen, and Xiao-mei Wang Inconsistency Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sylvia Encheva and Sharil Tumin
252
260 268
Neural Network-Based Adaptive Optimal Controller – A Continuous-Time Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Draguna Vrabie, Frank Lewis, and Daniel Levine
276
On Improved Performance Index Function with Enhanced Generalization Ability and Simulation Research . . . . . . . . . . . . . . . . . . . . . . Dongcai Qu, Rijie Yang, and Yulin Mi
286
Intelligent Fault Diagnosis A Fault Diagnosis Approach for Rolling Bearings Based on EMD Method and Eigenvector Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jinyu Zhang and Xianxiang Huang
294
An Adaptive Fault-Tolerance Agent Running on Situation-Aware Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . SoonGohn Kim and EungNam Ko
302
Dynamic Neural Network-Based Pulsed Plasma Thruster (PPT) Fault Detection and Isolation for Formation Flying of Satellites . . . . . . . . . . . . . A. Valdes and K. Khorasani
310
Model-Based Neural Network and Wavelet Packets Decomposition on Damage Detecting of Composites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhi Wei, Huisen Wang, and Ying Qiu
322
Intelligent Computing in Communication A High Speed Mobile Courier Data Access System That Processes Database Queries in Real-Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Barnabas Ndlovu Gatsheni and Zwelakhe Mabizela A Scalable QoS-Aware VoD Resource Sharing Scheme for Next Generation Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chenn-Jung Huang, Yun-Cheng Luo, Chun-Hua Chen, and Kai-Wen Hu Brain Mechanisms for Making, Breaking, and Changing Rules . . . . . . . . . Daniel S. Levine
329
337
345
Table of Contents
Implementation of a Landscape Lighting System to Display Images . . . . . Gi-Ju Sun, Sung-Jae Cho, Chang-Beom Kim, and Cheol-Hong Moon
XIX
356
Intelligent Sensor Networks Probability-Based Coverage Algorithm for 3D Wireless Sensor Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Feng Chen, Peng Jiang, and Anke Xue
364
Simulating an Adaptive Fault Tolerance for Situation-Aware Ubiquitous Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . EungNam Ko and SoonGohn Kim
372
A Hybrid CARV Architecture for Pervasive Computing Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . SoonGohn Kim and Eung Nam Ko
380
Intelligent Image/Document Retrievals Image and Its Semantic Role in Search Problem . . . . . . . . . . . . . . . . . . . . . Nasir Touheed, Muhammad Saeed, M. Atif Qureshi, and Arjumand Younus Color Image Watermarking Scheme Based on Efficient Preprocessing and Support Vector Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ˙ O˘guz Fındık, Mehmet Bayrak, Ismail Babao˘glu, and Emre C ¸ omak Multiple Ranker Method in Document Retrieval . . . . . . . . . . . . . . . . . . . . . Dong Li, Maoqiang Xie, Yang Wang, Yalou Huang, and Weijian Ni
388
398 407
Special Session on Image Processing, Analysis, and Vision Technology Based Intelligent Robot Systems An Elimination Method of Light Spot Based on Iris Image Fusion . . . . . . Yuqing He, Hongying Yang, Yushi Hou, and Huan He An Improved Model of Producing Saliency Map for Visual Attention System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jingang Huang, Bin Kong, Erkang Cheng, and Fei Zheng Multiple Classification of Plant Leaves Based on Gabor Transform and LBP Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Feng-Yan Lin, Chun-Hou Zheng, Xiao-Feng Wang, and Qing-Kui Man Research on License Plate Detection Based on Wavelet . . . . . . . . . . . . . . . Junshan Pan and Zhiyong Yuan
415
423
432
440
XX
Table of Contents
Stereo Correspondence Using Moment Invariants . . . . . . . . . . . . . . . . . . . . . Prashan Premaratne and Farzad Safaei The Application of the Snake Model in Carcinoma Cell Image Segment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhen Zhang, Peng Zhang, Xiaobo Mao, and Shanzhong Zhang
447
455
Special Session on Data Mining and Fusion in Bioinformatics Data Clustering and Evolving Fuzzy Decision Tree for Data Base Classification Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Pei-Chann Chang, Chin-Yuan Fan, and Yen-Wen Wang
463
Multivariate Polynomials Estimation Based on GradientBoost in Multimodal Biometrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mehdi Parviz and M. Shahram Moin
471
Special Session on Advances in Multidimensional Signal Processing An Introduction to Volterra Series and Its Application on Mechanical Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C. Bharathy, Pratima Sachdeva, Harish Parthasarthy, and Akash Tayal Skin Detection from Different Color Spaces for Model-Based Face Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Dong Wang, Jinchang Ren, Jianmin Jiang, and Stan S. Ipson
478
487
Other Topics Applying Frequent Episode Algorithm to Masquerade Detection . . . . . . . Feng Yu and Min Wang
495
An Agent-Based Intelligent CAD Platform for Collaborative Design . . . . Quan Liu, Xingran Cui, and Xiuyin Hu
501
Design of a Reliable QoS Requirement Based on RCSM by Using MASQ Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Eung Nam Ko and SoonGohn Kim Minimization of the Disagreements in Clustering Aggregation . . . . . . . . . . Safia Nait Bahloul, Baroudi Rouba, and Youssef Amghar Prediction of Network Traffic Using Multiscale-Bilinear Recurrent Neural Network with Adaptive Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . Dong-Chul Park
509 517
525
Table of Contents
Replay Attacks on Han et al.’s Chaotic Map Based Key Agreement Protocol Using Nonce . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Eun-Jun Yoon and Kee-Young Yoo The Short-Time Multifractal Formalism: Definition and Implement . . . . . Xiong Gang, Yang Xiaoniu, and Zhao Huichang
XXI
533 541
Modified Filled Function Method for Resolving Nonlinear Integer Programming Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yong Liu and You-lin Shang
549
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
557
A New GA-Based and Graph Theory Supported Distribution System Planning

Sajad Najafi Ravadanegh
Islamic Azad University, Ilkhichy Branch, Tabriz, Iran
[email protected]

Abstract. After optimal distribution substation placement, distribution feeder routing is the main problem in distribution system planning and expansion. This paper presents a new approach based on the simultaneous application of graph theory and a genetic algorithm to solve optimal high-voltage (HV) substation placement and feeder routing in distribution systems. The proposed method solves a hard constrained optimization problem with several kinds of operational and optimization constraints. Since it is formulated as a combinatorial optimization problem, it is difficult to solve at large scale. A minimum spanning tree algorithm is used to generate a set of feasible initial individuals. To reduce computation time and avoid infeasible solutions, a special coding is designed for the GA operators, crossover and mutation. This coding guarantees the validity of solutions on the way toward the global optimum. The method is examined on two large-scale distribution systems.

Keywords: Genetic Algorithm, Minimum Spanning Tree, DSP, Graph Theory.
1 Introduction

Planning a distribution network while minimizing the installation and operation costs is a complicated task. In [1], an application method to enhance distribution planning over a 20-year horizon period is described. In [2], the authors presented a new multiobjective Tabu search (NMTS) algorithm to solve a multiobjective fuzzy model for optimal planning of distribution systems. This algorithm obtains multiobjective nondominated solutions for three objective functions: fuzzy economic cost, level of fuzzy reliability, and exposure (maximization of robustness), also including the optimal size and location of reserve feeders to be built for maximizing the level of reliability at the lowest economic cost. The planned project must satisfy the electric consumption demands with acceptable reliability, at minimum cost, taking into account the distribution substation loading levels and feeder current limits. Each solution should have acceptable voltage levels at the nodes of the system, supply all loads, and keep the radial structure of the system during operation [3], [4], [5], [6], [7]. The planning problem of distribution networks has basically been stated as a constrained multiobjective optimization problem, where an objective function that includes both the investment and the operation costs of the network is minimized subject to technical constraints related to the characteristics of the
D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 7–14, 2008. © Springer-Verlag Berlin Heidelberg 2008
electric service [8], [9], [10]. The formulation of the problem includes a set of electric distribution system constraints, such as load flow, as well as optimization criteria, for instance minimizing the losses and the total installation and operating cost. The main reason for the application of alternative approaches such as GAs, which are classified as heuristic methods, is that they are able to find good solutions with reduced computational effort. In general, the planning problem of distribution systems may be considered an optimization problem over a given geographical area with a set of previously estimated MV (medium-voltage) substations. The main part of the problem is finding the locations of the HV (high-voltage) substations and the feeder routing required for the load supply, minimizing the total installation and operation costs of both HV substations and feeders, subject to the technical requirements and geographical constraints. The geographical limitations and other constraints make the distribution system planning problem more restricted and complicated and reduce the degrees of freedom of the project. The proposed method is mainly based upon the application of graph theory for the generation and evaluation of feasible solutions for the initial population of the genetic algorithm. In this step, only the main constraints, such as the radial character of the network, the supply of all loads, and geographical feasibility, are considered. In this work, the proposed technique is applied to two test cases, the first taken from [8] with 201 loads and the other from the Tabriz Electric Distribution Company, Golestan section, with 90 loads. The results show an interesting potential of the simultaneous application of GA and MST (minimum spanning tree) algorithms to very large-scale optimization problems with many constraints.
The problem of distribution system planning can be divided into three general stages:
A) long-term load forecasting;
B) optimal distribution substation placement;
C) optimal HV substation locating and feeder routing.
In this paper, the final stage of distribution system planning, called "optimal HV substation locating and feeder routing", is considered.
2 Graph Theory and Minimum Spanning Tree

A connected undirected acyclic graph is called a tree. A spanning tree is a tree that is a subgraph of G and contains every vertex of G, as shown in Fig. 1. In a weighted connected graph G ≡ G(V, E), it is often of interest to determine a spanning tree with minimum total edge weight, that is, one for which the sum of the weights of all edges is minimum. Such a tree is called a minimum spanning tree. In this paper, all possible candidate routes and the existing feeders are entered as input data. These planning and basic data construct a graph representation of the study system. The topology of the network is fully specified by the node-branch connection information, or by the incidence matrix of the system graph. In a graph consisting of k feeder sections and n nodes (MV substations) there are many different trees. Among all the trees of this network, the minimum spanning tree is the one for which
Fig. 1. Concept of minimum spanning tree
the total length of its branches is minimum. This algorithm is applied to generate the initial population in order to guarantee the first two major constraints of the problem and the feasibility of the solutions. It should be noticed that the MST algorithm is applied only in the initialization of the GA; during the GA run, only the special coding of crossover and mutation guarantees the feasibility of the solutions [10].
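The paper gives no pseudocode for this initialization step. A minimal sketch follows, using Kruskal's algorithm with a union-find structure; the paper does not say which MST algorithm is used, so the choice of Kruskal and all names below are illustrative assumptions. The key point is that a spanning tree is radial (loop-free) and reaches every node, which is exactly the feasibility condition imposed on initial plans:

```python
def kruskal_mst(n_nodes, edges):
    """Return a minimum spanning tree of an undirected weighted graph.

    edges: list of (weight, u, v) tuples; nodes are 0..n_nodes-1.
    The result is radial (acyclic) and serves every node, matching the
    feasibility constraints the paper imposes on initial solutions.
    """
    parent = list(range(n_nodes))

    def find(x):                      # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:                  # adding this edge creates no loop
            parent[ru] = rv
            tree.append((u, v, w))
    return tree

# Tiny illustrative example: 4 load nodes, candidate feeder routes
# given as (length, node, node).
routes = [(4, 0, 1), (1, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
mst = kruskal_mst(4, routes)
total = sum(w for _, _, w in mst)
print(len(mst), total)   # a spanning tree on 4 nodes has 3 branches
```

In practice, a set of different feasible trees (e.g., obtained by perturbing edge weights or candidate edge sets) would seed the GA population, as the paper describes.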
3 Description of Optimization

In this paper, special crossover and mutation operators are designed to guarantee proper and feasible solutions in the GA process while their probabilistic nature is preserved. The binary representation for simultaneously solving HV substation placement and MV feeder routing is shown in Fig. 2. Each chromosome, or solution, is a vector with binary entries. According to Fig. 2, the vector contains two sections: the first includes the candidate HV substations and the second the candidate MV feeders. The length of each chromosome vector is equal to the sum of the number of candidate HV substations and the number of feasible MV feeders. The MV feeder string should include a set of ones whose count is equal to the number of loads. In fact, during the simulation process the number of "1"s in each chromosome is constant and equal to the number of system nodes.
Fig. 2. Design of GA operator for the problem
For example, suppose that the chromosome in Fig. 2 is selected for mutation and that the altered genes are those within the ellipse. This means the array 10110 should be changed by mutation. In this case, to satisfy the condition, the number of "1"s must not change; hence this array can be replaced by 10101. In both arrays the number of "1"s is equal to 3. The same method is used to construct the crossover operator. This is the key fact considered at this stage of DSP in the design of the special GA operators. Roulette wheel selection with probability 0.4 is used as the selection operator.
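The count-preserving mutation described in the 10110 -> 10101 example can be sketched as a swap of one "1" gene and one "0" gene within the feeder string, so the number of selected feeders never changes. The function name and the use of Python lists are illustrative assumptions:

```python
import random

def mutate_preserving_ones(feeder_bits, rng=random):
    """Mutate a binary feeder string without changing its number of 1s.

    Instead of flipping a single bit (which would alter the count of
    selected feeders), pick one '1' position and one '0' position at
    random and swap their values, as in the paper's 10110 -> 10101
    example: both arrays keep exactly three 1s.
    """
    ones = [i for i, b in enumerate(feeder_bits) if b == 1]
    zeros = [i for i, b in enumerate(feeder_bits) if b == 0]
    if not ones or not zeros:
        return feeder_bits[:]          # nothing to swap
    child = feeder_bits[:]
    i, j = rng.choice(ones), rng.choice(zeros)
    child[i], child[j] = 0, 1
    return child

chromosome = [1, 0, 1, 1, 0]           # three feeders selected
child = mutate_preserving_ones(chromosome)
print(sum(child))                       # still three 1s
```

A crossover operator built on the same idea exchanges segments between parents and then repairs the offspring so each feeder string again contains exactly as many "1"s as there are loads.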
3.1 Mathematical Formulation of Optimization Problem
In this section, the formulation of optimal HV substation placement and feeder routing is presented in detail. The cost function for optimal distribution system planning is given by (1); the constraints of the optimization problem are given by (2).

\[
CFFR = \sum_{n=1}^{TSN} SC(S_n) + \sum_{n=1}^{TFN} \left[ SC(F_n) + I^2(F_n)\, R_n\, \alpha \right] \tag{1}
\]

\[
\begin{aligned}
&\text{Minimize } CFFR \\
&\text{s.t.} \quad \sum_{n=1}^{I_l} F_n < I_l, \qquad l = 1, 2, \dots, L \\
&\qquad\; I(F_n) < I_{\max}(F_n), \qquad n = 1, 2, \dots, n \\
&\qquad\; \sum_{n=1}^{K_n} R_n\, I(F_n) < VD_{MV\max}, \qquad m = 1, 2, \dots, N \\
&\qquad\; \sum_{n=1}^{K_j} \sqrt{3}\, V_{LL}\, I(F_n) < CAP(S_j), \qquad j = 1, 2, \dots, J
\end{aligned} \tag{2}
\]

The fitness function, which should be maximized, is given by (3):

\[
F = \frac{1}{CFFR} \tag{3}
\]

The parameters used in equations (1)-(3) are defined as follows:
CFFR: cost function to be minimized
TSN: total substation number
TFN: total MV feeder number
SC(S_i): the cost of HV substation S_i
SC(F_i): the cost of feeder F_i
I(F_i): the line current of feeder F_i
F_i: feeder i
R_i: resistance of feeder i
n: number of feeder sections
L: number of loops in the network
K: number of feeder sections in the network
N: number of MV substations connected to a feeder
J: number of HV substations
I_max: maximum loading of feeder sections
VD_MVmax: acceptable voltage drop in downstream feeders
V_LL: line voltage
CAP: HV substation capacity
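Once the substation and feeder data are known, the cost (1) and fitness (3) can be evaluated directly. The sketch below uses illustrative data structures (a list of selected-substation costs and per-feeder cost/current/resistance tuples) that are not given in the paper; only the structure of the formulas comes from (1) and (3):

```python
def cffr(substation_costs, feeders, alpha):
    """Cost function of Eq. (1): the sum of substation costs plus, for
    each feeder, its installation cost and an I^2 * R loss term
    weighted by the loss-cost coefficient alpha."""
    sub = sum(substation_costs)
    fdr = sum(sc + (i ** 2) * r * alpha for sc, i, r in feeders)
    return sub + fdr

def fitness(cost):
    """Fitness of Eq. (3): the GA maximizes 1 / CFFR."""
    return 1.0 / cost

# Illustrative numbers: two HV substations and three feeders given as
# (installation cost, current, resistance) tuples.
cost = cffr([1000.0, 800.0],
            [(50.0, 10.0, 0.2), (40.0, 8.0, 0.3), (30.0, 5.0, 0.1)],
            alpha=1.0)
print(cost, fitness(cost))
```

The constraint checks of (2) (feeder current limits, voltage drop, and substation capacity) would be applied to each chromosome before this evaluation, rejecting or penalizing infeasible individuals.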
Table 1. HV and MV substations data

Sn  Snc   Sx    Sy     |  Sn  Snc   Sx    Sy
1   0     4900  9900   |  49  315   4884  10895
2   0     4400  10500  |  50  315   3659  10451
3   0     5400  9600   |  51  315   4272  11130
4   315   5561  8517   |  52  315   4757  10246
5   315   4152  8925   |  53  315   4790  9075
6   315   4425  8862   |  54  315   4339  9023
7   315   4431  9156   |  55  315   5200  11070
8   315   4435  9730   |  56  400   5382  10709
9   315   4217  9428   |  57  315   4862  8883
10  250   4194  9199   |  58  315   4450  11150
11  315   3897  9456   |  59  315   4500  10543
12  315   3920  9741   |  60  630   3910  9352
13  400   4014  10185  |  61  630   4049  8474
14  315   4297  9988   |  62  315   3990  8866
15  315   4163  10318  |  63  800   4409  9404
16  315   4420  10665  |  64  500   3850  9988
17  315   4015  10465  |  65  1000  4666  10845
18  315   4128  10855  |  66  800   3800  11285
19  315   3825  10870  |  67  500   4288  11500
20  315   3645  10689  |  68  800   5109  11273
21  315   3711  10952  |  69  630   5055  10559
22  315   4690  11293  |  70  400   4719  9772
23  315   5083  10725  |  71  630   5615  10090
24  315   5373  10798  |  72  630   6204  9877
25  315   5373  10574  |  73  400   6609  10114
26  315   5398  10479  |  74  630   6332  9555
27  315   5512  9868   |  75  315   5860  9315
28  315   5396  10168  |  76  315   5506  8792
29  315   5510  9694   |  77  315   5191  8723
30  315   5395  9802   |  78  315   4871  8546
31  315   5510  9520   |  79  315   4212  10557
32  315   5419  9307   |  80  315   3855  10420
33  315   5127  9457   |  81  315   4374  10214
34  315   4813  9193   |  82  400   4951  9418
35  315   5865  10217  |  83  315   5614  9282
36  400   6118  10130  |  84  315   5264  9494
37  315   6325  10081  |  85  315   4705  9406
38  315   5765  9894   |  86  630   5231  8992
39  315   5893  9644   |  87  630   5111  10092
40  315   6099  9461   |  88  315   4646  9647
41  315   5997  9003   |  89  800   4684  9075
42  315   5509  8997   |  90  500   4625  8755
43  315   4969  8995   |  91  1000  4267  8432
44  400   4512  9040   |  92  800   4717  8622
45  315   4730  8882   |  93  500   5079  8560
46  315   4508  8829   |
47  315   4216  9695   |
48  315   4874  9575   |
The proposed algorithm is designed such that the following necessary conditions of the network are checked:
- radial configuration of the network and servicing of all loads;
- maximum capacity limits of elements and keeping voltages within acceptable limits.
In (1), the cost of the HV substations as well as the cost of new feeders and the cost of losses in the feeders is minimized. The minimization is done with respect to the electrical constraints in (2) and some geographical constraints that are implemented in the candidate feeder routes and HV substation locations. Since the variables in the equipment-sizing formulation are continuous while the real equipment sizes are discrete, the obtained results should be modified after optimization, rounding up to standard sizes, to ensure reliable normal system operation. For example, in MV feeder routing the sizes of the proposed feeders are selected or modified such that the
oversize rating of the cables is considered. In order to demonstrate the presented algorithm, a large test case and a real network are adopted for testing. For test case 2, the candidate HV and MV substation data are given in Table 1. The first three substations (numbers 1-3, listed with capacity 0) are the HV substation candidates. It should be mentioned that HV substation 1 in Table 1 exists at the study time, and the other two substations are proposed as new HV substations. The MV substation data, including the substation number, the geographical coordinates of all MV substations, and their nominal rating and loading, are also given in Table 1. The new HV substations may be selected and the existing HV substation may be modified according to the optimization algorithm. There are ninety medium-voltage substations, numbers 4-93, listed in Table 1. In Table 1, S_n stands for the substation number, S_nc is the nominal capacity of the substation, and
S_x and S_y are the coordinates of the substation in Cartesian coordinates. The GA described in this section has been intensively tested in large computational experiments for a multiobjective model. Initialization of the GA starts with the minimum spanning tree algorithm, which provides a set of feasible nondominated solutions. In order to avoid infeasible solutions during the GA optimization process, special crossover and mutation operators are provided.
4 Simulation and Results

This section contains the main results of the optimal planning of the distribution network, represented in Fig. 3, Fig. 4 and Fig. 5. Figure 3 shows the results of our algorithm for the first case, which is confirmed by the results of the ant colony algorithm reported in [8]. To ensure the optimality of the final results, the simulation was repeated many times with different probabilities for the GA operators and different GA termination criteria. In almost all runs, the observed trajectory of the GA fitness function was the same. Fig. 4 shows the results of optimal feeder routing for the 20-kV feeder
Fig. 3. Optimal system configuration (first test case)
Fig. 4. Optimal network configuration (second test case)
Fig. 5. Trajectory of best solution of GA
networks (continuous segments) and the feasible routes (dashed segments) available to build future feeder configurations. The feasible feeders (dashed lines) in the figure include both existing and newly proposed feeder routes. The feasible routes for MV feeders and the feasible locations for the HV substations are determined by considering both topological and geographical constraints of the city map and expert engineering experience. Fig. 4 shows that among the three HV candidate substations, substation number 1 (black square) is selected again, which indicates that the location of the existing HV substation is a relevant and acceptable place. In this case, only the capacity of the existing HV substation needs to be modified. Moreover, among seven candidate outgoing feeders, six feeders are selected. Fig. 4 also shows that the algorithm preserves the
radial structure of the network. The current in each feeder section is within its limit, and all nodes have acceptable voltages. The trajectory of the best solution at each iteration of the GA for test case 2 is shown in Fig. 5.
5 Conclusion

In this paper, a new method based upon the simultaneous use of the MST algorithm from graph theory and a GA is proposed for optimal HV substation allocation and MV feeder routing. The application of the methodology to the first and second test cases showed the feasibility of the proposed method, presenting a significant reduction of the computational effort by providing a valid and feasible initial population for the GA. The proposed method finds the HV substation locations as well as the routes of the MV feeders simultaneously. The proposed MST and improved GA algorithms are used for the evaluation of the fitness function. Simulation results show the capability of the method for application to large-scale DSP problems.
References

[1] Fletcher, R., Strunz, K.: Optimal Distribution System Horizon Planning – Part I: Formulation. IEEE Trans. Power Sys. 22(2), 791–799 (2007)
[2] Ramírez, J., Domínguez, J.: New Multiobjective Tabu Search Algorithm for Fuzzy Optimal Planning of Power Distribution Systems. IEEE Trans. Power Sys. 21(1), 224–231 (2006)
[3] Gönen, T.: Electric Power Distribution Systems Engineering. McGraw-Hill, New York (1986)
[4] Lakervi, E., Holmes, E.J.: Electricity Distribution Network Design. Stevenage, U.K. (1995)
[5] Pansini, A.J.: Electrical Distribution Engineering. McGraw-Hill, New York (1993)
[6] Willis, L.: Power Distribution Planning Reference Book. Marcel Decker, New York (1997)
[7] Khator, S.K., Leung, L.C.: Power Distribution Planning: A Review of Models and Issues. IEEE Trans. Power Sys. 12, 1151–1159 (1997)
[8] Gómez, J.F., Khodr, H.M., De Oliveira, P.M., Ocque, L., Yusta, J.M., Villasana, R., Urdaneta, J.: Ant Colony System Algorithm for the Planning of the Primary Distribution Circuits. IEEE Trans. Power Sys. 13(2) (2004)
[9] Parada, V., Ferland, J.A., Arias, M., Daniels, K.: Optimization of Electrical Distribution Feeders Using Simulated Annealing. IEEE Trans. Power Deliver. 19(3), 1135–1141 (2004)
[10] Chachra, P., Ghare, M., Moore, J.M.: Applications of Graph Theory Algorithms. Elsevier North Holland, New York (1979)
Adaptive Routing Algorithm in Wireless Communication Networks Using Evolutionary Algorithm

Xuesong Yan(1), Qinghua Wu(2), and Zhihua Cai(1)
(1) School of Computer Science, China University of Geosciences, Wuhan 430074, China
(2) Faculty of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan 430074, China
[email protected]

Abstract. At present, mobile communications traffic routing designs are complicated because more systems are inter-connected to one another. For example, mobile communication in wireless communication networks has two routing design conditions to consider, i.e., circuit switching and packet switching. The difficulty in packet-switched routing design is its use of high-speed transmission links and its dynamic routing nature. In this paper, an evolutionary algorithm is used to determine the best solutions and the shortest communication paths. We developed a genetic optimization process that helps network planners find the best solutions, or the best paths of the routing table in wireless communication networks, easily and quickly. From the experimental results it can be noted that the evolutionary algorithm not only obtains good solutions, but also has a more predictable running time when compared to a sequential genetic algorithm.
1 Introduction

There has been growing general interest in infrastructureless, or "ad hoc", wireless networks recently, as evidenced by such activities as the MANET (Mobile Ad hoc NETworking) working group within the Internet Engineering Task Force (IETF). Other examples are the plans unveiled for NASA's Earth-orbit satellite constellation networks, and the Mars network, consisting of a "web" of satellites, rovers, and sensors within a ubiquitous information network [1]. Intelligent network routing, bandwidth allocation, and power control techniques are thus critical for such networks, which have heterogeneous nodes with different data rate requirements and limited power and bandwidth. Such techniques coordinate the nodes to communicate with one another while exercising power control, using efficient protocols, and managing spectral occupancy to achieve the desired Quality of Service (QoS). They also let the network adapt to the removal and addition of different high- and low-rate communication sources, changing activity patterns, and the incorporation of new services. For the present work, a reliable network design problem is stated in terms of all-terminal network reliability (also known as uniform or overall reliability). In this approach,
D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 1–6, 2008. © Springer-Verlag Berlin Heidelberg 2008
every pair of nodes needs a communication path to each other [2,3]; that is, the network forms at least a spanning tree. Thus, the primary design problem is to choose enough links to interconnect a given set of nodes at minimal cost, given a minimum network reliability to be attained. This minimization problem is NP-hard [4], and, as a further complication, the calculation of all-terminal reliability is also NP-hard. Although many research papers have been published on this problem or similar ones, no known method is efficient enough to deal with real large networks [5-11]. Considering the complexity of designing reliable networks, and the number of different published methods, this problem seems to be a good candidate for an evolutionary algorithm. Evolutionary algorithms are based on the ideas of Darwinian evolution and Mendelian genetics, simulating the processes of nature to solve complex search problems. They adopt the strategy of encoding the population and applying genetic operations, so as to direct the individuals' heuristic search direction. Since evolutionary algorithms have the traits of self-organization, self-adaptation, self-learning, etc., they break away from the restrictions of the search space and other auxiliary information. However, when facing concrete problems (e.g., NP-hard problems), it is always necessary to seek better genetic operators and more efficient control strategies, due to the gigantic solution space and the limits of computational capacity.
2 Statement of the Problem A network is modeled[12] by a probabilistic undirected graph G=(N, L, p), in which N represents the set of nodes, L a given set of possible links, and p the reliability of each link. It is assumed one bi-directional link between each pair of nodes; that is, there is no redundancy between nodes. The optimization problem may be stated as:
Minimize Z = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} c_{ij} x_{ij}    (1)

Subject to: R(x) ≥ R0

where x_{ij} is a {0, 1} decision variable, c_{ij} is the cost of link (i, j), R(x) is the network reliability, and R0 is the minimum reliability requirement. To solve the problem, the following assumptions are made:
1) The N nodes are perfectly reliable; a problem with a node may be simulated by the failure of its incident links.
2) The cost c_{ij} and the reliability p_{ij} of each link (i, j) are known.
3) Each link has two states, operational or failed (a link is included in the design when x_{ij} = 1).
4) Link failures are independent.
5) No repair is considered.
6) Two-connectivity is required.
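Objective (1) and the constraint R(x) ≥ R0 can be evaluated for a candidate link set x as follows. Since the exact all-terminal reliability is NP-hard to compute, the sketch below estimates R(x) by Monte Carlo simulation, as the paper itself does for evaluation; function names and data shapes are illustrative:

```python
import random

def cost(c, x):
    """Objective (1): total cost of the selected links x (list of node pairs)."""
    return sum(c[e] for e in x)

def reliability_mc(n, x, p, trials=10000, seed=1):
    """Monte Carlo estimate of all-terminal reliability R(x): the fraction of
    trials in which the surviving links still connect all n nodes."""
    rng = random.Random(seed)

    def connected(links):
        parent = list(range(n))
        def find(a):
            while parent[a] != a:
                parent[a] = parent[parent[a]]
                a = parent[a]
            return a
        for i, j in links:
            parent[find(i)] = find(j)
        return all(find(v) == find(0) for v in range(n))

    ok = sum(connected([e for e in x if rng.random() < p[e]]) for _ in range(trials))
    return ok / trials

# Fully connected 3-node design, each link 90% reliable; the exact
# all-terminal reliability is 3*(0.9^2)*0.1 + 0.9^3 = 0.972.
x = [(0, 1), (1, 2), (0, 2)]
print(reliability_mc(3, x, {e: 0.9 for e in x}, trials=20000))
```

The estimate converges to 0.972 as the number of trials grows, illustrating why a Monte Carlo estimate can only approximately enforce R(x) ≥ R0.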
Adaptive Routing Algorithm in Wireless Communication Networks
3 Evolutionary Algorithm for Wireless Communication Networks

Wireless communication networks require high-speed data links and highly efficient routing algorithms. OSPF is the standard algorithm for finding shortest paths, but it is not fully efficient in wireless communication networks, because there are many conditions to consider, i.e., the shortest path, link cost, link speed, load sharing or load balancing, and, in particular, the need to find the best solution or best path in very little time. In this paper, we use an evolutionary algorithm to optimize the route path from source node s to destination node t. The motivation for using an evolutionary algorithm is that it can reach a near-best solution quickly, and conditions or variables under consideration are easy to add or remove.

3.1 Representation
Representation is one of the first problems when starting to solve a problem with genetic algorithms (GAs). The encoding varies depending on the problem. However, this is not the case for the link cost and link weight setting problem. A solution to the link cost setting is represented by the number of nodes on the link path, and the link weight setting problem is represented by the shortest path and link speed. All points in the search space represent feasible solutions.

3.2 Initial Population and Fitness Function
The initial population is generated by randomly choosing feasible points in the search space [1, 65535]^|A|, represented as integer vectors. The population size determines how many chromosomes are in the population (in one generation). If there are too few chromosomes, the evolutionary algorithm (EA) has few opportunities to perform crossover and only a small part of the search space is explored. On the other hand, if there are too many chromosomes, the EA slows down. A fitting population size is one to two times the size of the problem. The association of each solution with a fitness value is done through the fitness function. We associate the link cost and link weight with each individual through the function Φ. The evaluation function is complex and computationally demanding, as it includes the shortest-path and best-routing computations needed to determine the arc loads resulting from a given set of weights. This evaluation function is the computational bottleneck of the algorithm. Another basic computation needed by the genetic algorithm is the comparison of different solutions.

3.3 Evolutionary Algorithms
In this paper, we use the Inver-over operator in our evolutionary algorithm. The Inver-over operator has proved to be a highly efficient operator in evolutionary algorithms [13] and very useful for combinatorial optimization [14]. The novelty of this operator is that it adopts the operation of inversion among the genetic operators, which effectively broadens the variety of the population, prevents premature convergence to local minima, and leads to the best solutions quickly and accurately. Our algorithm can be perceived as a set of parallel hill-climbing procedures. Fig. 1 provides a more detailed description of the whole algorithm in general and of the proposed operator in particular.
4
X. Yan, Q. Wu, and Z. Cai
Random initialization of the population P
while (termination condition not satisfied) do
begin
    for each individual S_i ∈ P do
    begin
        S' ← S_i
        select (randomly) a node C from S'
        repeat
        begin
            if (rand() ≤ p)
                select the node C' from the remaining nodes in S'
            else
            begin
                select (randomly) an individual in P
                assign to C' the node next to node C in the selected individual
            end
            if (the next node or the previous node of node C in S' is C')
                exit repeat loop
            invert the gene section from the node next to C to the node C' in S'
            C ← C'
        end
        if (eval(S') ≤ eval(S_i))
            S_i ← S'
    end
end

Fig. 1. The outline of the Evolutionary Algorithm
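A runnable sketch of the Inver-over operator outlined in Fig. 1, applied to cyclic permutations; the inversion probability p and the toy fitness function below are illustrative, not values from the paper:

```python
import random

def inver_over_step(pop, i, eval_fn, p=0.02, rng=random):
    """Apply one Inver-over pass (as in Fig. 1) to individual i of pop in place."""
    s = pop[i][:]
    c = rng.choice(s)
    while True:
        if rng.random() <= p:
            # mutation-like branch: pick C' among the remaining nodes of s
            c2 = rng.choice([v for v in s if v != c])
        else:
            # crossover-like branch: take the successor of C in another individual
            other = pop[rng.randrange(len(pop))]
            c2 = other[(other.index(c) + 1) % len(other)]
        n = len(s)
        ic, ic2 = s.index(c), s.index(c2)
        if s[(ic + 1) % n] == c2 or s[(ic - 1) % n] == c2:
            break  # C' already adjacent to C: stop
        # invert the cyclic section from the node next to C up to C'
        lo, hi = (ic + 1) % n, ic2
        if lo <= hi:
            s[lo:hi + 1] = reversed(s[lo:hi + 1])
        else:  # wrap-around section
            seg = s[lo:] + s[:hi + 1]
            seg.reverse()
            s[lo:], s[:hi + 1] = seg[:n - lo], seg[n - lo:]
        c = c2
    if eval_fn(s) <= eval_fn(pop[i]):
        pop[i] = s

# Toy usage: minimize the number of cyclic descents in a permutation of 0..5.
rng = random.Random(0)
pop = [rng.sample(range(6), 6) for _ in range(8)]
fit = lambda t: sum(t[k] > t[(k + 1) % 6] for k in range(6))
for _ in range(100):
    for i in range(len(pop)):
        inver_over_step(pop, i, fit, rng=rng)
print(min(fit(t) for t in pop))  # never worse than the initial population's best
```

Because a replacement is only made when the new individual is no worse, each individual improves monotonically, matching the parallel hill-climbing view above.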
The evolutionary algorithm includes the following two genetic operators. One is the mutation operator: we randomly select two nodes C and C' in a parent S and perform inversion on the nodes between the node next to C and C' (C' included). The other
is the crossover operator: we randomly select a node C in a parent S, then select another parent S' and let C' be the node next to C in S'. If C' is already next to C in parent S, we quit; otherwise, we perform inversion on the nodes between the node next to C and C' in S.
4 Experimental Results

The experiments were performed over a 10 Mbps Ethernet network, on three personal computers with Intel Pentium 2.0 GHz processors and 256 MB RAM. The programs were written in VC++. For the present work, the reliability constraint is relaxed to allow performance comparisons between the sequential GA and the evolutionary algorithm. As discussed before, the penalty function is not sufficient to prevent almost-reliable networks from being chosen as the best solution, especially considering that Monte Carlo simulation only gives a good approximation of a given network's reliability. Table 1 presents results over 10 runs when designing a 10-node network that may be fully interconnected, i.e., there are 45 possible links. Each link has a reliability of 90%. The test network design problem is extracted from a test set provided in [15]. It can be noted that the evolutionary algorithm not only obtains good solutions, but also a more predictable running time compared to the sequential GA.
Table 1. Experimental results over 10 runs
Run      Evolutionary Algorithm            Sequential Genetic Algorithm
         Best Cost  Reliability  Time (s)  Best Cost  Reliability  Time (s)
1        140        0.9410       330       147        0.9416        230
2        141        0.9458       314       149        0.9470        719
3        145        0.9375       158       140        0.9382        341
4        135        0.9444       380       135        0.9323        983
5        139        0.9374       192       140        0.9446       1480
6        138        0.9360       155       142        0.9364        334
7        142        0.9379       177       150        0.9381        640
8        140        0.9371       294       142        0.9370       1755
9        138        0.9467       345       139        0.9340        652
10       138        0.9464       198       139        0.9388        669
Average  139.6      0.94102      254.3     142.3      0.9388        780.3
5 Conclusion

The wireless communication network design problem subject to a reliability constraint is especially complex because not only the design itself but also the exact reliability calculation is NP-hard. For this reason, several different methods have been published, but none of them is efficient enough to handle networks of today's size. In this paper, we have solved the routing optimization with an evolutionary algorithm. We developed a genetic optimization process that can help network planners find the best solutions, or the best routing-table paths, in wireless communication networks easily and quickly.
One future direction of this research is developing evolutionary methodologies for power-aware routing optimization. Here robustness, scalability, and routing efficiency may trade off against power efficiency in a wireless system.

Acknowledgements. This paper is supported by the Astronautics Research Foundation of China (No. C5220060318).
References

1. Woo, M., Singh, S., Raghavendra, C.S.: Power-Aware Routing in Mobile Ad Hoc Networks. In: Proc. 4th Annual ACM/IEEE Intl. Conf. on Mobile Computing and Networking, pp. 181–190 (1998)
2. Colbourn, C.J.: The Combinatorics of Network Reliability. Oxford Univ. Press, Oxford (1987)
3. Jan, R.H.: Design of Reliable Networks. Comput. Oper. Res. 20, 25–34 (1993)
4. Jan, R.H., Hwang, F.J., Cheng, S.T.: Topological Optimization of a Communication Network Subject to a Reliability Constraint. IEEE Trans. Reliab. 42, 63–70 (1993)
5. Venetsanopoulos, A.N., Singh, I.: Topological Optimization of Communication Networks Subject to Reliability Constraints. Probl. Contr. Inform. Theor. 15, 63–78 (1986)
6. Atiqullah, M.M., Rao, S.S.: Reliability Optimization of Communication Networks Using Simulated Annealing. Microelectron. Reliab. 33, 1303–1319 (1993)
7. Pierre, S., Hyppolite, M.A., Bourjolly, J.M., Dioume, O.: Topological Design of Computer Communication Networks Using Simulated Annealing. Eng. Appl. Artif. Intel. 8, 61–69 (1995)
8. Glover, F., Lee, M., Ryan, J.: Least-Cost Network Topology Design for a New Service: An Application of a Tabu Search. Ann. Oper. Res. 33, 351–362 (1991)
9. Beltran, H.F., Skorin-Kapov, D.: On Minimum Cost Isolated Failure Immune Networks. Telecommun. Syst. 3, 183–200 (1994)
10. Koh, S.J., Lee, C.Y.: A Tabu Search for the Survivable Fiber Optic Communication Network Design. Comput. Ind. Eng. 28, 689–700 (1995)
11. Davis, L. (ed.): Genetic Algorithms and Simulated Annealing. Morgan Kaufmann Publishers, San Mateo (1987)
12. Barán, B., Laufer, F.: Topological Optimization of Reliable Networks Using A-Teams. In: Proceedings of the International Conference on Systemics, Cybernetics and Informatics, Orlando, Florida, USA (1999)
13. Guo, T., Michalewicz, Z.: Inver-over Operator for the TSP. In: Parallel Problem Solving from Nature (PPSN V), pp. 803–812. Springer, Heidelberg (1998)
14. Yan, X.S., Li, H., et al.: A Fast Evolutionary Algorithm for Combinatorial Optimization Problems. In: Proceedings of the Fourth International Conference on Machine Learning and Cybernetics, pp. 3288–3292. IEEE Press, Los Alamitos (2005)
15. Dengiz, B., Altiparmak, F., Smith, A.E.: Local Search Genetic Algorithm for Optimal Design of Reliable Networks. IEEE Trans. Evolut. Comput. 1(3), 179–188 (1997)
Sequencing Mixed-Model Assembly Lines with Limited Intermediate Buffers by a GA/SA-Based Algorithm Binggang Wang, Yunqing Rao, Xinyu Shao, and Mengchang Wang The State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
[email protected]

Abstract. This study is concerned with how to optimize the input sequence of product models in Mixed-Model Assembly Lines (MMALs) with limited intermediate buffers. Two objectives are considered simultaneously: minimizing the variation in parts usage and minimizing the makespan. The mathematical model is presented by incorporating the two objectives according to their relative importance weights. A hybrid algorithm (GASA), based on a genetic algorithm (GA) and a simulated annealing algorithm (SA), is proposed for solving the model. The performance of the GASA is compared with that of the GA over various test problems. The results show that, in terms of solution quality, both the GASA and the GA can find the same best solution for small-sized problems, and the GASA performs better than the GA for medium- and large-sized problems. Moreover, the impact of buffer size on the MMAL's performance is investigated.

Keywords: Mixed-model assembly lines, Limited intermediate buffers, Sequencing, GA/SA-based algorithms, Genetic algorithms
1 Introduction

MMALs are increasingly accepted in industry to cope with the diversified demands of customers without holding large end-product inventories. In order to maximize the utilization of MMALs, many researchers have investigated the sequencing problem in MMALs, and numerous exact and approximate algorithms have been proposed for different optimization goals. For the goal of minimizing the variation in parts usage, Toyota [1] sequenced its MMALs by the Goal Chasing Algorithm (GCA). Cakir and Inman [2] modified GCA to sequence products with non-zero/one product-part usage matrices. Miltenburg and Sinnamon [3] improved GCA to schedule mixed-model multi-level production systems. This problem has also been investigated by other researchers [4-6]. For the multicriteria sequencing problem in MMALs, much research work has been done [7-11]. The optimization objectives have included: minimizing the overall line length, keeping a constant rate of parts usage, minimizing total utility work, minimizing total setup cost, minimizing tardiness and earliness penalties, minimizing the production rate variation cost, and minimizing the makespan, etc. Many heuristic algorithms, such as tabu search, genetic algorithms, memetic algorithms, multi-objective genetic algorithms, etc.,

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 15–22, 2008. © Springer-Verlag Berlin Heidelberg 2008
B. Wang et al.
were proposed to solve these problems. Although much research work has been done on the sequencing problem in MMALs, only a few researchers have studied the sequencing problem with the combined optimization goals of leveling parts usage and minimizing the makespan. Moreover, to the best of our knowledge, little work has addressed these optimization goals in MMALs with limited intermediate buffers. So we study this problem in this paper. The remainder of this paper is organized as follows: Section 2 presents the mathematical models. The algorithm procedures are described in Sect. 3. Case studies and discussions are reported in Sect. 4. The last section gives the conclusions.
2 Mathematical Models

2.1 Minimizing the Variation in Parts Usage

This problem can be formulated as below [1]. Minimize:

\sum_{j=1}^{J} \sum_{k=1}^{K} ( x_{jk} - k \cdot N_j / K )^2    (1)

Subject to:

N_j = \sum_{i=1}^{I} d_i \cdot b_{ij}    (2)
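Spelled out in code, objective (1) with the totals (2) can be computed for a candidate sequence; the sketch below uses the part-requirement vectors of Table 2 and the notation defined in the following paragraph (function names are illustrative):

```python
def parts_usage_variation(seq, b):
    """Objective (1): sum over parts j and stages k of (x_jk - k*N_j/K)^2,
    where x_jk is the cumulative usage of part j after stage k and N_j is
    the total demand of part j over the whole sequence (Eq. (2))."""
    K = len(seq)
    J = len(b[seq[0]])
    N = [sum(b[i][j] for i in seq) for j in range(J)]
    total, x = 0.0, [0] * J
    for k, prod in enumerate(seq, start=1):
        for j in range(J):
            x[j] += b[prod][j]
            total += (x[j] - k * N[j] / K) ** 2
    return total

# b[product] = parts X(1)..X(5) needed, in the shape of Table 2:
b = {1: [1, 0, 1, 1, 0], 2: [0, 1, 1, 1, 1], 3: [1, 1, 1, 0, 1],
     4: [1, 0, 0, 1, 1], 5: [0, 1, 1, 1, 0]}
print(parts_usage_variation([1, 2, 3, 4, 5], b))
```

A perfectly level schedule drives every term to zero, which is what the Goal Chasing family of heuristics approximates.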
where x_{jk} is the number of parts of type j required to assemble the products scheduled in stages 1 to k, d_i is the number of products of type i to be assembled, b_{ij} is the number of parts of type j required by product type i, and N_j is the total demand for part type j to produce all the products in a production plan.

2.2 Minimizing the Makespan

The mathematical model for this objective is formulated as follows [12].
Minimize:

SP(T)M + AP(T)M    (3)

Subject to:

SP(1)1 = 0    (4)

SP(1)m = SP(1)(m-1) + AP(1)(m-1),  m = 2, 3, ..., M    (5)

SP(t)m = max{ SP(t)(m-1) + AP(t)(m-1), SP(t-1)m + AP(t-1)m },  t = 2, 3, ..., B(m+1)+1    (6)

SP(t)m = max{ SP(t)(m-1) + AP(t)(m-1), SP(t-1)m + AP(t-1)m, SP(t-B(m+1)-1)(m+1) },  t > B(m+1)+1    (7)
where SP(T)M is the starting time of the last product, P(T), processed on the last machine, machine M; AP(T)M is the processing time of the last product on the last machine; and Bm is the size of the buffer between the two successive machines m and m-1.
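The recursion (4)–(7) can be evaluated directly for a given product sequence. A sketch with 0-based indices, in which buf[m] denotes the buffer between machines m and m+1 (the data are illustrative, not taken from Table 3):

```python
def makespan(p, buf):
    """Makespan of a permutation flow line with limited intermediate buffers,
    following the starting-time recursion (4)-(7): p[t][m] is the processing
    time of the t-th sequenced product on machine m; buf[m] is the buffer
    size between machines m and m+1 (all indices 0-based)."""
    T, M = len(p), len(p[0])
    S = [[0.0] * M for _ in range(T)]
    for t in range(T):
        for m in range(M):
            cands = [0.0]
            if m > 0:
                cands.append(S[t][m - 1] + p[t][m - 1])   # finished previous machine
            if t > 0:
                cands.append(S[t - 1][m] + p[t - 1][m])   # machine m free again
            if m < M - 1 and t > buf[m]:
                # buffer after m stays full until product t-buf[m]-1 starts on m+1
                cands.append(S[t - buf[m] - 1][m + 1])
            S[t][m] = max(cands)
    return S[T - 1][M - 1] + p[T - 1][M - 1]

# Three identical products on two machines, no buffer in between:
p = [[2, 2], [2, 2], [2, 2]]
print(makespan(p, buf=[0]))  # 8.0
```

With buf[m] set large enough the third candidate never binds and the formula reduces to the classical unbuffered permutation flow shop recursion.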
3 Algorithms for Solving the Model

3.1 GA/SA-Based Algorithm Procedure

The procedure of the proposed algorithm is described as follows.
Step 1: Generate an initial population, Pop(0).
Step 2: Calculate each individual's fitness value in Pop(0); let the best solution and its objective function value be the global best S* and C*, respectively; determine the initial temperature t0; and let generation g = 0.
Step 3: If the termination condition is satisfied, output S* and C*; otherwise, continue with the following steps.
Step 4: Genetic operations (selection, crossover, mutation).
Step 5: Metropolis sampling process.
Step 6: Evaluate each individual in the temporary population, pt(g), obtained after Step 5, and update S* and C* if necessary.
Step 7: Keep the best.
Step 8: g = g+1, tg = α×t(g-1); go to Step 3.

3.2 Implementation of the GA/SA-Based Algorithm

The main steps for implementing the proposed algorithm are described as follows.
⑴ Encoding scheme: A job-permutation-based encoding scheme is adopted in this paper. When generating the individuals, however, the number of each type of product in each individual cannot break the demand constraints of the production plan.
⑵ Normalize the two objective function values: In order to bring the two objective function values to the same level, the following steps are taken. ① Generate a group of feasible solutions randomly; the number of generated solutions is determined according to the problem scale, and here it is set to 4 times the population size, POPSIZE. ② For each solution, calculate the two objective function values, and then sum them respectively; denote the sum of the first objective function values, OFV1, as V1, and that of the second, OFV2, as V2. ③ If V2/V1 ≥ 1, multiply OFV1 by V2/V1; otherwise, multiply OFV2 by V1/V2.
⑶ Fitness calculation: Since a minimization problem is considered, the fitness value can be calculated as follows:

f(j, k, m, t, E, g) = 1 / F(j, k, m, t, E, g)    (8)

where f(j, k, m, t, E, g) and F(j, k, m, t, E, g) are the fitness and objective function value, respectively, of the Eth feasible solution in the gth generation.
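The normalization of the two objective function values described above can be sketched as follows (function names are illustrative):

```python
def normalize_factors(samples, f1, f2):
    """Scale factors that bring two objectives to the same level: sum each
    objective over a random sample of solutions and scale the smaller one up."""
    v1 = sum(f1(s) for s in samples)
    v2 = sum(f2(s) for s in samples)
    if v2 / v1 >= 1:
        return v2 / v1, 1.0   # multiply OFV1 by V2/V1
    return 1.0, v1 / v2       # multiply OFV2 by V1/V2

# Toy objectives on integer 'solutions':
w1, w2 = normalize_factors([1, 2, 3], lambda s: s, lambda s: 10 * s)
print(w1, w2)  # 10.0 1.0
```

The weighted total objective then combines the rescaled values with the relative importance weights (both 0.5 in the case studies of Sect. 4).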
⑷ Initial temperature: Letting the Boltzmann constant be 1, the initial temperature can be calculated from:

t0 = ΔF / ln(Pa^(-1))    (9)

ΔF = Fmax - Fmin    (10)

where Fmax and Fmin are the maximum and minimum objective function values, respectively, in Pop(0).
⑸ Genetic operators: The proportional selection method is used in this paper to give better solutions more chances of being chosen to enter the next generation. The modified order crossover (modOX) operator is adopted; after the crossover operation, the best two among the four solutions (two parents and two offspring) are selected to replace the original two parents. For the mutation operation, the INV mutation operator is employed.
⑹ Metropolis sampling process: Using each individual, S, in the temporary population, pt(g), as the initial solution, a local search process is performed. The scheme has the following working steps:
① Let l = 0, current best solution S** = S*, q = 0, current state S'(0) = S.
② Generate a new solution, S', from S'(l) using the INV mutation operator mentioned above, and calculate ΔC' = C(S') - C(S'(l)). If ΔC' ≤ 0, let S'(l+1) = S'; and if C(S**) > C(S'), let S** = S'.
③ If ΔC' > 0, accept S' with probability Pa; if S' is accepted, let S'(l+1) = S' and q = q+1, else let S'(l+1) = S'(l).
④ l = l+1. If the termination condition (q > qmax or l > lmax) is satisfied, continue with the following step; else go to ②.
⑤ Replace S with S**.
where l is the length of the Markov chain, lmax is the maximum length of the Markov chain, Pa is the probability of inferior solutions being accepted, q is the number of times the current best solution has remained unchanged, and qmax is the maximum number of times the current best solution may remain unchanged.

3.3 GA Algorithm

A GA is designed for comparison with the GASA. In this GA, the proportional selection method, the modOX crossover operator, and the INV mutation operator are adopted.
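The Metropolis sampling process of Sect. 3.2 can be sketched as follows; the INV move is modeled as a simple segment inversion, and all names and parameter values are illustrative:

```python
import random

def metropolis_sample(s, cost, pa, l_max, q_max, rng=random):
    """Local search on a permutation s: always accept improving INV moves,
    accept worsening moves with probability pa, and stop after l_max moves
    or after q_max consecutive moves without improving the best solution."""
    cur, best = s[:], s[:]
    q = 0
    for _ in range(l_max):
        i, j = sorted(rng.sample(range(len(cur)), 2))
        cand = cur[:i] + cur[i:j + 1][::-1] + cur[j + 1:]  # INV mutation
        delta = cost(cand) - cost(cur)
        if delta <= 0 or rng.random() < pa:
            cur = cand
        if cost(cur) < cost(best):
            best, q = cur[:], 0
        else:
            q += 1
        if q > q_max:
            break
    return best

# Toy usage: reduce the number of pairwise inversions in a permutation.
rng = random.Random(3)
cost = lambda t: sum(t[a] > t[b] for a in range(len(t)) for b in range(a + 1, len(t)))
s = rng.sample(range(8), 8)
out = metropolis_sample(s, cost, pa=0.15, l_max=500, q_max=50, rng=rng)
print(cost(out) <= cost(s))  # True
```

The returned best solution can never be worse than the starting individual, which is why the GASA's Step 6 can safely replace each individual with its sampled result.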
4 Case Studies and Discussions

The GASA and the GA are coded in C++ and run on a PC (Pentium (R) 4 CPU, 2.80 GHz, 512 MB). The computational data are listed in Table 1 to Table 3. Assuming that the two objectives are of the same importance, the two weight values are both set to 0.5. The parameter values for the GASA are: POPSIZE=50, G=200, Pc=0.85, Pm=0.20, lmax=20, qmax=3, Pa=0.15; the selected GA parameter values are: POPSIZE=50, G=200, Pc=0.85, Pm=0.20. Each experiment is repeated 10 times.
Table 1. Production plans (MPS)

Products  Y(1)  Y(2)  Y(3)  Y(4)  Y(5)
Plan 1     2     1     3     2     2
Plan 2     4     2     5     3     6
Plan 3     7     5     5     8     5

Table 2. Parts needed for assembling different products

Parts   Y(1)  Y(2)  Y(3)  Y(4)  Y(5)
X(1)     1     0     1     1     0
X(2)     0     1     1     0     1
X(3)     1     1     1     0     1
X(4)     1     1     0     1     1
X(5)     0     1     1     1     0
Table 3. The assembly time

Product  Machine code
         1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20
Y(1)     4  6  8  9  5  6  8  9  8   7   5   5   6   7   8   8   3   4   5   6
Y(2)     2  5  6  8  4  5  7  9  6   8   9   5   4   4   5   6   7   8   9   9
Y(3)     3  4  6  7  9  6  8  7  6   9   9   6   6   5   6   7   7   6   7   8
Y(4)     2  3  4  4  4  5  7  8  8   9   6   7   7   5   6   6   7   8   8   9
Y(5)     3  5  7  8  9  7  6  8  8   8   9   9   6   4   5   5   5   6   8   8
Table 4. Computational results by different algorithms

Machines  Production plan  TOFV (GA)  TOFV (GASA)  Best solution (GA)                Best solution (GASA)              Time (GA)  Time (GASA)
4         (2,1,3,2,2)      53.102     53.102       2134531453                        2134531453                        1 s        51 s
4         (4,2,5,3,6)      88.9554    87.5969      2135453513 4515345312             2135453154 3512315435             2 s        94 s
4         (7,5,5,8,5)      184.925    175.472      5435421142 3312415435 4121345412  4351241534 2142135412 4351421354  3 s        136 s
10        (2,1,3,2,2)      88.148     87.648       4351321345                        4351354312                        3 s        168 s
10        (4,2,5,3,6)      128.325    122.325      4351253154 2351531354             4535135453 1213545312             7 s        257 s
20        (2,1,3,2,2)      132.31     131.811      4531231453                        4531321345                        8 s        486 s
20        (4,2,5,3,6)      169.285    161.637      4535315425 1312135345             4535132153 4521315345             12 s       659 s
Fig. 1. Convergence curves for different production plans (4 machines)
Firstly, assume that the MMAL has 4 machines and that the buffer size between every two successive machines is 1. The computational results obtained by the GASA and the GA for different production plans are shown in Table 4, and Fig. 1 shows the convergence curves. It can be seen that, for small-sized problems, the same best total objective function value, TOFV, is obtained by both the GASA and the GA. However, for medium- and large-sized problems, the best solutions obtained by the GASA are better than those obtained by the GA. We can also find that the GA costs less time for all the problems. Secondly, we consider MMALs consisting of 10 and 20 machines, respectively, again with all buffer sizes set to 1. The best solutions for different production plans and algorithms
Fig. 2. Convergence curves for different production plans (10 machines)
Fig. 3. Convergence curves for different production plans (20 machines)

Table 5. Computational results for the impact of buffer size on MMAL's performance

Production plan  Bm=0                Bm=1                Bm=2                Bm=3
                 TOFV    solution    TOFV    solution    TOFV    solution    TOFV    solution
(2,1,3,2,2)      53.922  4351321354  53.102  2134531453  53.102  2134531453  53.102  2134531453
are also listed in Table 4. The convergence curves are shown in Fig. 2 for 10 machines and in Fig. 3 for 20 machines. From these calculations and comparisons, we can conclude that the GASA performs better than the GA in terms of solution quality, but takes much longer computational time than the GA. Finally, we consider the impact of buffer size on the MMAL's performance. Table 5 shows the computational results obtained by the GASA for the above small-sized problem with different buffer sizes. It can be found that the best solution with buffer size 1 is better than that with buffer size 0, and that the best solutions remain unchanged when the buffer size increases to 2 and 3. This implies that choosing the buffer size reasonably can smooth the production process and decrease the makespan, but an overly large buffer size is of no use in improving the MMAL's performance.
5 Conclusions

A hybrid algorithm is proposed for solving the sequencing problem in MMALs with limited intermediate buffers. The algorithm's performance is tested by comparing it with a GA. The computational results show that, for small-sized problems, both the proposed algorithm and the GA obtain the same best solution, but the hybrid algorithm performs better than the GA in finding the best solution for medium- and large-sized problems. Efforts are also made to investigate the impact of buffer size
on the MMAL's performance. We can conclude from the computational results that a reasonable design of the buffer size can improve the MMAL's performance and at the same time keep a shorter line length.

Acknowledgements. This research work is supported by the 863 High Technology Plan Foundation of China under Grant No. 2007AA04Z186 and the National Natural Science Foundation of China under Grant No. 50775089.
References

1. Monden, Y.: Toyota Production System, 2nd edn. Institute of Industrial Engineers, Norcross, Georgia (1993)
2. Cakir, A., Inman, R.R.: Modified Goal Chasing for Products with Non-Zero/One Bills of Material. Int. J. Prod. Res. 31(1), 107–115 (1993)
3. Miltenburg, J., Sinnamon, G.: Scheduling Mixed-Model Multi-Level Just-In-Time Production Systems. Int. J. Prod. Res. 27, 1487–1509 (1989)
4. Miltenburg, J., Sinnamon, G.: Algorithms for Scheduling Multi-Level Just-In-Time Production Systems. IIE Transactions 24, 121–130 (1992)
5. Leu, Y.-Y., Matheson, L.A., Rees, L.P.: Sequencing Mixed Model Assembly Lines with Genetic Algorithms. Comput. Indus. Engin. 30(4), 1027–1036 (1996)
6. Leu, Y.-Y., Huang, P.Y., Russell, R.S.: Using Beam Search Techniques for Sequencing Mixed-Model Assembly Lines. Ann. Oper. Res. 70, 379–397 (1997)
7. Bard, J.F., Shtub, A., Joshi, S.B.: Sequencing Mixed-Model Assembly Lines to Level Parts Usage and Minimize Line Length. Int. J. Prod. Res. 32(10), 2431–2454 (1994)
8. Hyun, C.J., Kim, Y., Kim, Y.K.: A Genetic Algorithm for Multiple Objective Sequencing Problems in Mixed Model Assembly Lines. Comput. Oper. Res. 25(7/8), 675–690 (1998)
9. Guo, Z.X., Wong, W.K., Leung, S.Y.S., Fan, J.T.: A Genetic-Algorithm-Based Optimization Model for Scheduling Flexible Assembly Lines. Int. J. Adv. Manuf. Technol. (2006), DOI: 10.1007/s00170-006-0818-6
10. Tavakkoli-Moghaddam, R., Rahimi-Vahed, A.R.: Multi-Criteria Sequencing Problem for a Mixed-Model Assembly Line in a JIT Production System. Appl. Math. Comput. 181, 1471–1481 (2006)
11. Yu, J.F., Yin, Y.H., Chen, Z.N.: Scheduling of an Assembly Line with a Multi-Objective Genetic Algorithm. Int. J. Adv. Manuf. Technol. 28, 551–555 (2006)
12. Nowicki, E.: The Permutation Flow Shop with Buffers: A Tabu Search Approach. European J. Operat. Res. 116, 205–219 (1999)
Solving Vehicle Routing Problem Using Ant Colony and Genetic Algorithm Wen Peng and Chang-Yu Zhou School of Computer Science and Technology, North China Electric Power University, Beijing 102206
[email protected],
[email protected]

Abstract. The vehicle routing problem is becoming more prominent with the development of modern logistics. In this paper, an ant colony algorithm and a genetic algorithm are combined for solving the vehicle routing problem. The GA can overcome the premature convergence and weak exploitation capabilities of the ant colony and converge to the global optimum quickly. The performance of the proposed method compared with that of genetic-based approaches is very promising.

Keywords: ant colony, vehicle routing problem, genetic algorithm.
1 Introduction

Many heuristic methods currently used in combinatorial optimization are inspired by adaptive natural behaviors or natural systems, such as genetic algorithms, simulated annealing, neural networks, etc. Ant colony algorithms belong to this class of biologically inspired heuristics. The basic idea is to imitate the cooperative behavior of ant colonies, which can be used to solve several discrete combinatorial optimization problems within a reasonable amount of time. Dorigo and his colleagues were the first to apply this idea to the traveling salesman problem [1]; this algorithm is referred to as the ant colony algorithm (ACA). ACA has achieved widespread success in solving different optimization problems, such as job shop scheduling [2], the cell assignment problem [3] and the multiple objective JIT sequencing problem [4]. Finding efficient vehicle routes is an important logistics problem which has been studied for the last 40 years. A typical vehicle routing problem (VRP) can be described as the problem of finding a set of minimum-cost routes for several vehicles that start from a depot, serve a number of customers, and return to the depot without exceeding the capacity constraints of each vehicle. Since the process of selecting vehicle routes allows the selection of any combination of customers, the VRP is considered a combinatorial optimization problem where the number of feasible solutions increases exponentially with the number of customers to be serviced [5]. Heuristic algorithms such as simulated annealing [6], genetic algorithms [7], tabu search [8] and ant colony optimization [9] are widely used for solving the VRP. In this paper, an ant colony algorithm and a genetic algorithm are combined for solving the vehicle routing problem. The vehicle routing problem is analyzed deeply and decomposed so as to apply the ant colony model, in which one ant represents a vehicle and

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 23–30, 2008.
© Springer-Verlag Berlin Heidelberg 2008
W. Peng and C.-Y. Zhou
carries all the tasks. To satisfy all customers, one ant travels from the depot to the customers and returns to the depot several times. A genetic algorithm is then used to improve the ant colony model; it can overcome the premature convergence and weak exploitation capabilities of the ant colony and converge to the global optimum quickly. The proposed algorithm can obtain the optimal solution in a reasonably short period of time.
2 Ant Colony Model

The ant colony algorithms were introduced with Dorigo's Ph.D. thesis. They are based on the principle that, by using very simple communication mechanisms, an ant group is able to find the shortest path between any two points. During their trips a chemical trail (pheromone) is left on the ground; the role of this trail is to guide the other ants towards the target point. For one ant, the path is chosen according to the quantity of pheromone. Furthermore, this chemical substance evaporates over time, and the quantity left by one ant depends on the amount of food found and the number of ants using the trail. The general principles of the ant colony simulation of real ant behavior are as follows.

(1) Initialization. The initialization of the AC includes two parts: the problem graph representation and the initial ant distribution. First, the underlying problem should be represented in terms of a graph, G = <N, E>, where N denotes the set of nodes and E the set of edges. The graph is connected, but not necessarily complete, such that the feasible solutions to the original problem correspond to paths on the graph which satisfy problem-domain constraints. Second, a number of ants are arbitrarily placed on randomly chosen nodes. Each of the distributed ants then performs a tour on the graph by constructing a path according to the node transition rule described next.

(2) Node transition rule. The ants move from node to node based on a node transition rule. According to the problem-domain constraints, some nodes may be marked as inaccessible for a walking ant. The node transition rule is probabilistic. For the kth ant on node i, the selection of the next node j to follow is according to the node transition probability:

p_{ij}^k = (τ_{ij})^α (η_{ij})^β / Σ_{h ∉ tabu_k} (τ_{ih})^α (η_{ih})^β  if j ∉ tabu_k;  p_{ij}^k = 0 otherwise    (1)

where τ_{ij} is the intensity of pheromone laid on edge (i, j), η_{ij} is the visibility value of edge (i, j), α and β are control parameters, and tabu_k is the set of currently inaccessible nodes for the kth ant according to the problem-domain constraints.

(3) Pheromone updating rule. The ant keeps walking through edges to different nodes by iteratively applying the node transition rule until a solution to the original problem
is constructed. We define that a cycle of the AC algorithm is completed when every ant has constructed a solution. At the end of each cycle, the intensity of pheromone trails on each edge is updated by the pheromone updating rule:

τ_{ij} ← ρ τ_{ij} + Σ_{k=1}^{m} Δτ_{ij}^k    (2)

where ρ ∈ (0,1) is the persistence rate of previous trails, Δτ_{ij}^k is the amount of pheromone laid on edge (i, j) by the kth ant in the current cycle, and m is the number of distributed ants. If we define L_k, the total length of the kth ant's tour in a cycle, as the fitness value of the solution, then Δτ_{ij}^k can be given by

Δτ_{ij}^k = Q / L_k  if edge (i, j) is traversed by the kth ant in this cycle;  Δτ_{ij}^k = 0 otherwise    (3)

where Q is a constant.

(4) Stopping criterion. The stopping criterion of the AC algorithm could be the maximal number of running cycles or a CPU time limit.
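The pheromone updating rule (2)–(3) can be sketched as follows (edge keys and data shapes are illustrative):

```python
def update_pheromone(tau, tours, lengths, rho=0.5, Q=1.0):
    """Pheromone updating rule (2)-(3): evaporate every trail by rho, then
    deposit Q/L_k on each edge traversed by ant k (tours are closed node lists)."""
    for e in tau:
        tau[e] *= rho
    for tour, L in zip(tours, lengths):
        for a, b in zip(tour, tour[1:] + tour[:1]):
            e = (min(a, b), max(a, b))       # undirected edge key
            tau[e] = tau.get(e, 0.0) + Q / L
    return tau

tau = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}
tau = update_pheromone(tau, tours=[[0, 1, 2]], lengths=[4.0])
print(tau[(0, 1)])  # 0.5*1.0 + 1.0/4.0 = 0.75
```

Shorter tours deposit more pheromone per edge, biasing later cycles toward good solutions while evaporation (the ρ factor) forgets stale trails.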
3 Vehicle Routing Problem

The Vehicle Routing Problem (VRP) is described on a weighted graph G = <N, E>, where the nodes are N = (N0, N1, …, NM) and the arcs are E = {(Ni, Nj): i ≠ j}, as shown in Fig. 1. In this graph model, N0 is the central depot and the other nodes are the M customers to be served. Each node is associated with a fixed quantity qi of goods to be delivered (a quantity q0 = 0 is associated with the depot N0). To each arc (Ni, Nj) is associated a value dij representing the distance between Ni and Nj. Each tour starts from and terminates at the depot N0, each node Ni must be visited exactly once, and the quantity of goods to be delivered on a route should never exceed the vehicle capacity Q. The other notations for describing the model are summarized below:
k: the vehicle identification.
K: the total number of vehicles.
Qk: the maximum capacity of the kth vehicle.
Dk: the maximum distance of the kth vehicle.
nk: the number of customers dispatched to the kth vehicle.
Rk: the set of customers dispatched to the kth vehicle. When nk = 0, Rk = Φ. When nk ≠ 0, Rk = {r_k1, …, r_ki, …, r_knk} ⊆ {1, 2, …, M}, where r_ki is the ith customer in the dispatched sequence of the kth vehicle.
S: the total cost of one solution.
26
W. Peng and C.-Y. Zhou
Fig. 1. An example of VRP
The model is: F = min(S), where

S = Σ_{k=1..K} ( Σ_{i=1..nk} d_{r_k,i−1, r_k,i} + d_{r_k,nk, 0} ) · sgn(nk),  sgn(nk) = 1 if nk ≥ 1, 0 if nk = 0    (4)

with the constraints

Σ_{i=1..nk} q_{r_k,i} ≤ Qk,  nk ≠ 0    (5)

Σ_{i=1..nk} d_{r_k,i−1, r_k,i} + d_{r_k,nk, 0} ≤ Dk,  nk ≠ 0    (6)

R_{k1} ∩ R_{k2} = Φ,  k1 ≠ k2    (7)

∪_{k=1..K} Rk = {1, 2, …, M},  0 ≤ nk ≤ M,  Σ_{k=1..K} nk = M    (8)

where r_k0 = 0 denotes the depot.
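For concreteness, the objective (4) can be evaluated as below (our illustrative sketch, not the authors' code; `d` is a distance matrix with index 0 for the depot, and each route lists one vehicle's customers in dispatch order). Applied to the distance data of Table 1, the best solution reported in Table 3 evaluates to 67.5:

```python
def total_cost(routes, d):
    """Eq. (4): sum over vehicles of depot -> r_k1 -> ... -> r_k,nk -> depot."""
    S = 0.0
    for route in routes:      # route is R_k in dispatch order
        if not route:         # sgn(n_k) = 0: unused vehicles add nothing
            continue
        prev = 0              # start at the depot N0
        for c in route:
            S += d[prev][c]
            prev = c
        S += d[prev][0]       # return to the depot
    return S


# Distance matrix of Table 1 (nodes 0..8, node 0 is the depot):
d = [
    [0, 4, 6, 7.5, 9, 20, 10, 16, 8],
    [4, 0, 6.5, 4, 10, 5, 7.5, 11, 10],
    [6, 6.5, 0, 7.5, 10, 10, 7.5, 7.5, 7.5],
    [7.5, 4, 7.5, 0, 10, 5, 9, 9, 15],
    [9, 10, 10, 10, 0, 10, 7.5, 7.5, 10],
    [20, 5, 10, 5, 10, 0, 7, 9, 7.5],
    [10, 7.5, 7.5, 9, 7.5, 7, 0, 7, 10],
    [16, 11, 7.5, 9, 7.5, 9, 7, 0, 10],
    [8, 10, 7.5, 15, 10, 7.5, 10, 10, 0],
]
S = total_cost([[6, 7, 4], [1, 3, 5, 8, 2]], d)   # two routes of Table 3
```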
4 Ant Colony for VRP

In the ant colony model, an individual ant simulates a vehicle, and its route is constructed by incrementally selecting customers until all customers have been visited. The customers that have already been visited by an ant, or that would violate its capacity constraints, are stored in the infeasible customer list (tabu).

Graph Representation
To apply the ant colony, the underlying problem should be represented as a directed graph G = (N, E). Apparently, the VRP can easily be represented as such a graph.
Node Transition Rule
The node transition rule is probabilistic, determined by the pheromone intensity τ_ij and the visibility value η_ij of the corresponding edge. In the proposed method, τ_ij is initialized to the same small positive constant on every edge, and is gradually updated at the end of each cycle according to the average quality of the solutions that involve this edge. The value of η_ij, on the other hand, is determined by a greedy heuristic, which encourages the ants to walk along edges of minimal cost. We define the transition probability from node i to node j at time t as

p_ij(t) = [τ_ij(t)]^α · [η_ij]^β / Σ_{s ∉ tabu_k} [τ_is(t)]^α · [η_is]^β,  if j ∉ tabu_k
p_ij(t) = 0,  otherwise    (9)

where tabu_k is the list of nodes currently infeasible for the kth ant, so the sum runs over the accessible nodes; the other symbols have the same meaning as in Eq. (1). tabu_k must satisfy Eqs. (5), (6) and (7).

Pheromone Updating Rule
The intensity of the pheromone trail on an edge is updated at the end of each cycle according to the average quality of the solutions that traverse this edge. We simply apply Eqs. (2) and (3), modified as follows:

τ_ij ← ρ·τ_ij + Σ_{k=1..m} Δτ_ij^k    (10)

Δτ_ij^k = Q / S_k,  if the kth ant walks edge (i, j)
Δτ_ij^k = 0,        otherwise    (11)

where S_k is the total distance traveled by the kth ant in the current cycle.
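The probabilistic choice of Eq. (9) amounts to a roulette-wheel selection over the nodes outside the ant's infeasible list. A minimal sketch (ours, with `feasible` assumed to be the complement of tabu_k):

```python
import random

def choose_next(i, feasible, tau, eta, alpha=1.0, beta=2.0, rng=random):
    """Pick j with probability proportional to tau_ij^alpha * eta_ij^beta (Eq. (9))."""
    weights = [(tau[(i, j)] ** alpha) * (eta[(i, j)] ** beta) for j in feasible]
    r = rng.random() * sum(weights)
    for j, w in zip(feasible, weights):
        r -= w
        if r <= 0:
            return j
    return feasible[-1]   # guard against floating-point rounding


tau = {(0, 1): 1.0, (0, 2): 1.0}
eta = {(0, 1): 1.0, (0, 2): 0.0}   # node 2 has zero visibility, so it is never chosen
j = choose_next(0, [1, 2], tau, eta)
```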
5 Improved Ant Colony

When the pheromone intensity of one route becomes much higher than that of the other routes, the ant colony model converges prematurely and cannot find the optimal solution. To overcome this drawback, we improve the ant colony model by introducing a genetic algorithm; its purpose is to strengthen the weak local exploitation capability. In each cycle, the best solutions produced by the ant colony are selected and fed into the genetic algorithm. A solution is represented as an individual: a string of customers in which each substring bounded by separators is a route served by a single vehicle. For example, for a dispatch task with six customers, {0, 1, 2, 7, 3, 4, 8, 5, 6} can be a solution, where the numbers larger than six (i.e., larger than M) are separators. This solution is
interpreted as 0→1→2→0, 0→3→4→0, 0→5→6→0. After the chromosome is defined, selection, crossover and mutation are applied to escape from local optima. In crossover, the initial position and the crossover length are generated randomly; the crossover method can be described as follows. Given s1: P1|P2|P3 and s2: Q1|Q2|Q3, where P2 and Q2 are the crossover sections, Q2 is inserted into s1 before P2, giving s3: P1|Q2|P2|P3. The duplicate numbers are then deleted from s3 to obtain one child individual. The other child is obtained by the same method.
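The crossover just described can be sketched as follows (our illustration, not the authors' code; we keep the first occurrence of each duplicated gene when cleaning s3, which is one reasonable reading of the deletion step):

```python
def crossover(s1, s2, start, length):
    """Insert Q2 = s2[start:start+length] into s1 before position start, then drop duplicates."""
    q2 = s2[start:start + length]
    s3 = s1[:start] + q2 + s1[start:]      # P1 | Q2 | P2 | P3
    seen, child = set(), []
    for gene in s3:                        # delete the duplicate numbers from s3
        if gene not in seen:
            seen.add(gene)
            child.append(gene)
    return child


# In the GA the start position and length are drawn at random; fixed here for illustration:
s1 = [1, 2, 7, 3, 4, 8, 5, 6]
s2 = [3, 1, 7, 5, 2, 8, 6, 4]
child = crossover(s1, s2, start=2, length=3)   # Q2 = [7, 5, 2]
```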
6 Experimental Results

The proposed algorithm was programmed in VC++ and run under Windows XP. For comparison, we also implemented the algorithm of Ref. [10]. Table 1 shows the distances between the customers; there are eight customers, and Qk = 8T, Dk = 40km for k = 1, …, K. Table 2 lists the requirement of each customer. The experiments were run independently five times; the results of Ref. [10] and of our algorithm are shown in Table 3.

Table 1. Distances between customers
i\j   0     1     2     3     4     5     6     7     8
0     0     4     6     7.5   9     20    10    16    8
1     4     0     6.5   4     10    5     7.5   11    10
2     6     6.5   0     7.5   10    10    7.5   7.5   7.5
3     7.5   4     7.5   0     10    5     9     9     15
4     9     10    10    10    0     10    7.5   7.5   10
5     20    5     10    5     10    0     7     9     7.5
6     10    7.5   7.5   9     7.5   7     0     7     10
7     16    11    7.5   9     7.5   9     7     0     10
8     8     10    7.5   15    10    7.5   10    10    0

Table 2. Requirement of each customer

Customer ID    1   2   3   4   5   6   7   8
requirement    1   2   1   2   1   4   2   2
Table 3. Comparison between Ref. [10] and our method
Index     Ref. [10]   Our     Details of our results
1         72.5        67.5    0→6→7→4→0 / 0→1→3→5→8→2→0
2         76          67.5    0→6→7→4→0 / 0→1→3→5→8→2→0
3         67.5        67.5    0→1→3→5→8→2→0 / 0→6→7→4→0
4         72          67.5    0→6→7→4→0 / 0→1→3→5→8→2→0
5         73.5        67.5    0→6→7→4→0 / 0→1→3→5→8→2→0
average   72.3        67.5
Table 4. Coordinates and requirements of the other customers

ID   X coordinate   Y coordinate   requirement     ID   X coordinate   Y coordinate   requirement
1    12.8           8.5            0.1             11   6.7            16.9           0.9
2    18.4           3.4            0.4             12   14.8           2.6            1.3
3    15.4           16.6           1.2             13   1.8            8.7            1.3
4    18.9           15.2           1.5             14   17.1           11.0           1.9
5    15.5           11.6           0.8             15   7.4            1.0            1.7
6    3.9            10.6           1.3             16   0.2            2.8            1.1
7    10.6           7.6            1.7             17   11.9           19.8           1.5
8    8.6            8.4            0.6             18   13.2           15.1           1.6
9    12.5           2.1            1.2             19   6.4            5.6            1.7
10   13.8           5.2            0.4             20   9.6            14.8           1.5
Fig. 2. Results of our method

Table 5. Details of our result

Index   distance (km)   Details
1       109.627         0→5→14→2→12→9→10→7→0 / 0→1→8→19→15→16→13→6→0 / 0→4→0 / 0→18→20→11→17→3→0
2       110.187         0→6→13→16→15→19→8→1→0 / 0→4→0 / 0→18→3→17→11→20→0 / 0→5→14→2→12→9→10→7→0
3       109.139         0→18→0 / 0→5→14→2→12→9→10→1→7→0 / 0→8→19→15→16→13→6→0 / 0→4→3→17→11→20→0
4       109.627         0→18→20→11→17→3→0 / 0→6→13→16→15→19→8→1→0 / 0→4→0 / 0→5→14→2→12→9→10→7→0
5       107.84          0→4→3→17→11→20→0 / 0→8→19→15→16→13→6→0 / 0→5→14→2→12→9→10→7→1→0 / 0→18→0
Average distance: 109.284
We then apply the presented algorithm to a real vehicle routing problem with parameters Qk = 8T, Dk = 50km, k = 1, …, 5. The coordinates of the central depot are (14.5km, 13km), and the coordinates and requirements of the other customers are given in Table 4. Fig. 2 and Table 5 show our results.
7 Conclusion

We have devised a hybrid approach integrating ant colony optimization and a GA, so that their respective intensifying and diversifying processes are both exploited. The experimental results show that this combination is feasible and effective. Further research is needed, however, on how to handle VRP variants with multiple central supply depots or deadline constraints within this framework.
References 1. Dorigo, M.: Optimization, learning and natural algorithms. Ph.D. Thesis, Italy (1992) 2. Huang, K., Liao, C.: Ant colony optimization combined with taboo search for the job shop scheduling problem. Computers and Operations Research 35, 1030–1046 (2008) 3. Jian, S., Jian, S., Lin, B.M.T., Hsiao, T.: Ant colony optimization for the cell assignment problem in PCS networks. Computer and Operations Research 33, 1731–1740 (2006) 4. McMullen, P.R.: An ant colony optimization approach to addressing a JIT sequencing problem with multiple objectives. Artificial Intelligence 15(3), 309–317 (2001) 5. Bell, J.E., McMullen, P.R.: Ant Colony Optimization Techniques for the Vehicle Routing Problem. Advanced Engineering Informatics 1(8), 41–48 (2004) 6. Tavakkoli-Moghaddam, R., Safaei, N., Gholipour, Y.: A hybrid simulated annealing for capacitated vehicle routing problems with the independent route length. Applied Mathematics and Computation 176, 445–454 (2006) 7. Prins, C.: A simple and effective evolutionary algorithm for the vehicle routing problem. Computers & Operations Research 31, 1985–2002 (2004) 8. Brandao, J., Mercer, A.: A Tabu Search Algorithm for the Multi-Trip Vehicle Routing and Scheduling Problem. European Journal of Operational Research 100, 180–191 (1997) 9. Doerner, K.F., Hartl, R.F., Kiechle, G., Lucka, M., Reimann, M.: Parallel Ant Systems for the Capacitated Vehicle Routing Problem. In: Gottlieb, J., Raidl, G.R. (eds.) EvoCOP 2004. LNCS, vol. 3004, pp. 72–83. Springer, Heidelberg (2004) 10. Liu, L., Zhu, J.: The Research of Optimizing Physical Distribution Routing Based on Genetic Algorithm. Computer Engineering and Application 27, 227–229 (2005)
A Research on the Association of Pavement Surface Damages Using Data Mining Ching-Tsung Hung1, Jia-Ray Chang2, Jian-Da Chen3, Chien-Cheng Chou4, and Shih-Huang Chen5 1
Assistant Professor, Department of Transportation Technology and Supply Chain Management, Kainan University
[email protected] 2 Associate Professor, Department of Civil Engineering, Minghsin University of Science and Technology
[email protected] 3 Ph.D. Candidate, Department of Civil Engineering, National Central University
[email protected] 4 Assistant Professor, Department of Civil Engineering, National Central University
[email protected] 5 Assistant Professor, Department of Traffic Engineering and Management, Feng Chia University
[email protected]

Abstract. The association of pavement surface damages used to rely on the judgment of experts. However, with the accumulation of data in pavement maintenance databases and the progress of Data Mining, more and more methods are available to explore the association of pavement surface damages. This research adopts the Apriori algorithm to conduct an association analysis of pavement surface damages. From expert experience, the association of road damages has been believed to be complicated. Through case studies, however, it has been found that associations exist among longitudinal cracking, alligator cracking and pen-holes, and that their influence is unidirectional. In addition, with the help of the association rules, it has been learned that in preventative pavement maintenance the top priority should be the repair of longitudinal cracking and alligator cracking, which can greatly reduce the occurrence of pen-holes and the risk of state compensation.
1 Introduction

In the past, pavement distress data were used only to determine the cause of a distress or how to repair the pavement. The relations among pavement distresses were not considered, and maintenance strategies were determined by experts' experience. Since it is difficult to extract knowledge from experts, such experience cannot easily be passed down. Meanwhile, the documents generated by pavement maintenance activities have made the pavement database huge, and with the development of data mining technology, useful information can now be extracted from it. This research attempts to use the method of association rules in Data Mining to

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 31–38, 2008. © Springer-Verlag Berlin Heidelberg 2008
analyze the association of pavement surface damages and, based on the method of decision tree, determine what maintenance methods should be taken. Section 2 looks into the application of different Data Mining categories in pavement surface maintenance. Section 3 introduces Association Analysis. In section 4, based on the result of pavement surface survey, association rules are established. In the end, further discussions on the application of Data Mining in pavement engineering are provided.
2 Data Mining Application in Pavement Maintenance

Data Mining means the process of finding the important information hidden in data, such as trends, patterns and relationships; that is, exploring the information or knowledge in the data. As a result, there are several different names for Data Mining, including Knowledge Discovery in Databases (KDD), Data Archaeology, Data Pattern Analysis and Functional Dependency Analysis. Many researchers view Data Mining as an important field combining database systems and machine learning technologies. However, Data Mining is not omnipotent. It does not monitor the development process of the data and then pinpoint the special cases in the database. Nor does adopting Data Mining remove the need to understand statistical principles, the background of the issue and what the data itself really means. We should not assume that the information obtained with Data Mining is all accurate and can be applied without any verification. In fact, Data Mining is used to help planning and analysis personnel find hypotheses; it is not responsible for verifying such hypotheses, nor does it determine their real value. In some countries there are cases where Data Mining has been used successfully in civil engineering, but they are not common. In 2000, Vanessa Amado [1] applied Data Mining to the pavement management data of MoDOT (the Missouri Department of Transportation). She used a great amount of pavement surface condition data collected between 1995 and 1999, in a format of 28,231 data items and 49 columns, to predict future PSR and thereby determine the remaining lifespan of the pavement surface. This pavement database contains pavement service data collected by automatic testing vehicles and structural data collected by structural testing equipment. The goal of the study was preparing and exploring the data; the analysis process was carried out as follows.
The first step in establishing the analysis model is converting the database files to Excel files, where the data type of each column is related to the measurements. Then the software to be used is selected. Because IBM Intelligent Miner for Data provides association and other Data Mining functions, this study used it for the analysis. IBM Intelligent Miner for Data can handle great amounts of relevant data and is compatible with ASCII, Dbase, Oracle and Sybase formats. In addition, it can execute many Data Mining applications, such as prediction, data pre-processing, regression, classification, clustering and association, and it offers Decision Trees and Artificial Neural Networks as methods of exploring data. The analysis methods adopted in that study were association, neural clustering and tree classification. Analysis results are provided regarding the pavement surface
characteristics of two groups, PSR (Pavement Serviceability Rating) > 24 and PSR < 24; that is, with Data Mining, the analysis data are separated into two pavement surface types, PSR > 24 and PSR < 24. (a) Association is used to reveal the numeric value of each attribute. For a cut-and-dried data set, such a method can identify the PSR of each specific pavement surface. (b) Neural clustering is used to find the central location of clusters with similar characteristics. This technique is used to analyze the PSR of a specific pavement surface and the similarity of each cluster; when a new pavement surface is assigned to a certain cluster, it means that this pavement surface is most similar to the center of that cluster. (c) The model generated by Tree Classification is based on the known data. This technique separates pavement surfaces into two categories, "not good" (PSR < 24) or "good" (PSR > 24). The classification process has three parts: the training model, the testing model and the application model. The training model learns from the user how to classify. The testing model applies the model to the testing data, in which the specific pavement surface level has already been determined, to test the accuracy of the model generated during training. The application model is used to predict the future PSR of a pavement surface. In 2004, Bayrak et al. [2] adopted Neural Networks to establish a prediction model for the flatness of cement concrete pavement surfaces. The study included pavement surfaces in 83 sections in 9 states. Seven variables, covering various kinds of data including traffic volume data and road surface damage data, were used, and a flatness prediction model based on a 7-10-10-10-1 network was established. The model had a coefficient of determination of 0.83 on the training data and 0.81 on the testing data, which means it had a good predictive ability.
In 2007, Khaled [3] adopted Data Mining to analyze the transportation project database of Illinois State. Using Association Analysis to analyze 21 data groups with several characteristics, including general data, project specific data, traffic control data and contract data, Khaled generated 9 rules. For example, one of the rules is, a new surface will be built if the final bid amount is less than $508,391 and the total traffic cost is less than $9,125 (This rule is 93% accurate in this database and there are at least 13 cases to support this rule). Therefore, the use of Data Mining can effectively help pavement surface engineers make the right decisions for their projects. The information technology is indispensable to the collection and analysis of pavement surface data. Data, both collected automatically or by people, can be used to determine M&R (maintenance & rehabilitation). However, the amount of data contained in PMS database is very large, and, therefore, there is a need to further explore this data to obtain more unknown and precious knowledge to help make the right M&R decisions.
3 Association Analysis

There are several Data Mining techniques, and more and more techniques targeting different fields of application and different types of databases have been introduced. Each technique has its own characteristics and applications; the main and most
34
C.-T. Hung et al.
popular techniques include Characterization and Discrimination, Association Analysis, Classification and Prediction, Cluster Analysis, Outlier Analysis and Evolution Analysis. Han and Kamber [4] pointed out that Association Rules is the most mature and widely used technique in Data Mining. Association Rules was first proposed by Agrawal et al. [5], and was mainly used to find associations among database items. Brin et al. [6] pointed out that Association Rules was initially used to study Market Basket Data. By analyzing customers' purchasing behaviors, the associations among products can be found, which business owners can use as a reference to decide how to shelve products, what to buy and how much inventory to hold. By doing this, the products become more competitive, so their sales turnover improves and profits increase. For example, a customer is very likely to purchase bread after he buys milk; therefore, milk products should be shelved next to bread products. Such information is called an "Association Rule" and is written as: milk → bread [minsup = 2%, minconf = 80%]. There are two important parameters in Association Rules, support and confidence, which are used to evaluate whether an association rule meets the expectations of the users. The most common algorithms for obtaining Association Rules include the Apriori algorithm, DHP, AprioriTid, AprioriHybrid, Boolean, FP-Tree, ICI and AIM. The Apriori algorithm is the most representative among all Association Rules algorithms, and many related algorithms are based on, improved from or extended from it. The improved algorithms include AprioriTid, AprioriHybrid, Boolean, Partition, DIC, Column-Wise Apriori, Multiple-Level and so on. The Apriori algorithm includes the following steps:
Step 1. Use the (k−1)-frequent item sets L_{k−1} to generate the candidate item sets C_k.
Step 2. Scan database D and calculate the support of every candidate item set. All candidate item sets whose support is larger than or equal to the minimum support become the frequent item sets L_k of length k.
Step 3. Repeat Steps 1 and 2 until no new candidate item sets can be generated.
(a) The rules for joining and pruning candidate item sets: (1) following Step 1, join two (k−1)-frequent item sets that share k−2 identical items to form a k-item set; (2) check whether every (k−1)-item subset of this k-item set has appeared as frequent; if so, keep the k-item set.
(b) Two bottlenecks of the Apriori algorithm: (1) a great number of item sets are generated: 2-candidate items are produced by combining pairs of 1-frequent items, so if there are k items in the 1-frequent item set, (k−1) + (k−2) + … + 1 = k(k−1)/2 2-candidates are generated; with 1,000 frequent 1-items, that is 499,500 2-candidates. (2) Scanning through the database several times is necessary: because there are many candidate item sets, and each pass over the whole database is needed to obtain the supports, efficiency is low. The goal of this study is to shorten the time needed to generate frequent item sets.
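Steps 1–3 above can be sketched compactly as follows (our own illustration, not the paper's code; itemsets are frozensets and support is counted in absolute occurrences):

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frequent itemset: support} by the generate-and-test loop of Steps 1-3."""
    transactions = [frozenset(t) for t in transactions]
    counts = {}
    for t in transactions:                     # 1-item candidates and their supports
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result, k = dict(frequent), 2
    while frequent:
        # Step 1: join (k-1)-frequent sets; prune candidates with an infrequent subset
        candidates = set()
        for a, b in combinations(frequent, 2):
            u = a | b
            if len(u) == k and all(frozenset(s) in frequent
                                   for s in combinations(u, k - 1)):
                candidates.add(u)
        # Step 2: scan the database, keep candidates meeting the minimum support
        frequent = {}
        for c in candidates:
            sup = sum(1 for t in transactions if c <= t)
            if sup >= min_support:
                frequent[c] = sup
        result.update(frequent)                # Step 3: repeat until nothing new appears
        k += 1
    return result


freq = apriori([{'a', 'b'}, {'a', 'b', 'c'}, {'a', 'c'}, {'b'}], min_support=2)
```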
4 Case Study

4.1 The Application of Road Repairing Data

This study uses 92 groups of data obtained on Line 110A in 1999 for the following association analysis of pavement surface damages. There are 18 different kinds of pavement distress. The severity of a damage is divided into 3 categories: S (minor), M (medium) and H (severe). The range of the damaged area is also divided into 3 categories: a (minor), b (medium) and c (extensive). The Apriori algorithm, applied to the database studied, first establishes the candidate item sets with only one item. Next, the database is scanned to count how many times each candidate item set appears, which is called its support. If we set the minimum support at 10, then the candidate item sets that appear 10 times or more become the large item set; in this example, L1 = {1sa}{1ma}{1mb}{3sa}{3sb}{3mb}{4sa}. Candidate item sets with 2 items are then generated. When we count how many times each 2-item candidate appears in the database, we get {1mb, 3sb} as the only large item set of two items. Since no large item set with 3 items can be formed, the first stage of the Apriori algorithm ends. The next step is finding association rules in the large item sets with at least 2 items. In this example, the only such item set is {1mb, 3sb}, so there are 2 possible association rules: (a) if alligator cracking (severity of damage: medium; range of damaged area: medium) is found, then longitudinal cracking (severity: minor; range: medium) is very likely to appear; (b) if longitudinal cracking (severity: minor; range: medium) is found, then alligator cracking (severity: medium; range: medium) is very likely to appear.
Taking rule (a) as an example, the confidence of this rule can be calculated as follows:

Confidence(1mb → 3sb) = Support(1mb, 3sb) / Support(1mb) = 11/32 = 0.34    (1)

Taking rule (b) as an example, the confidence of this rule can be calculated as follows:

Confidence(3sb → 1mb) = Support(1mb, 3sb) / Support(3sb) = 11/17 = 0.65    (2)
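Plugging in the supports read off Tables 1 and 2 (Support(1mb) = 32, Support(3sb) = 17, Support(1mb, 3sb) = 11) reproduces the two values (a trivial check of ours):

```python
def confidence(support_xy, support_x):
    """Confidence(X -> Y) = Support(X, Y) / Support(X)."""
    return support_xy / support_x


conf_a = confidence(11, 32)   # rule (a): 1mb -> 3sb, about 0.34
conf_b = confidence(11, 17)   # rule (b): 3sb -> 1mb, about 0.65
```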
If we set the minimum confidence at 0.5, then only the second rule's confidence value exceeds this threshold. Therefore, based on the Apriori algorithm, if longitudinal cracking (severity: minor; range: medium) is found, then alligator cracking (severity: medium; range: medium) is very likely to appear. Based on this result, it can be concluded that longitudinal cracking will affect a pavement surface's load bearing ability, resulting in the appearance of alligator
cracking. Alligator cracking has no obvious influence on the serviceability of a pavement surface, but it has a major negative impact on the structure of the pavement. The reason is that alligator cracking allows water invasion, so it will further develop into pen-holes and dents. Therefore, based on the result of Data Mining, this study concludes that, to prevent future structural damage, longitudinal cracking has to be prevented. Longitudinal cracking is caused when rolling compaction is not done properly, so on-site quality control should be enhanced to prevent longitudinal cracking from appearing.

Table 1. Frequencies of the candidate item sets with 1 item
Damage Type   Support     Damage Type   Support     Damage Type   Support
1sa           15          4sa           27          7sa           1
1sb           4           4sb           2           7sb           2
1ma           17          4ma           7           7sc           1
1mb           32          4mb           1           7ma           1
1mc           2           4ha           3           7mb           5
2sa           1           5sa           4           7mc           1
2sb           4           5sb           3           8sa           3
2mb           9           5ma           1           8ma           1
3sa           15          5mb           5           9sb           1
3sb           17          6sa           2           10mb          1
3ma           1           6ma           1           10mc          1
3mb           22          6mb           1           13sa          2
                                                    13mb          2
Table 2. Frequencies of the candidate item sets with 2 items

Damage Type   Support     Damage Type   Support     Damage Type   Support
1sa,1ma       0           1ma,3sa       2           1mb,4sa       3
1sa,1mb       0           1ma,3sb       1           1mb,3sb       0
1sa,3sa       2           1ma,3mb       6           1mb,3mb       0
1sa,3sb       1           1ma,4sa       5           1mb,4sa       7
1sa,3mb       3           1mb,3sa       2           3sb,3mb       0
1sa,4sa       8           1mb,3sb       11          3sb,4sa       1
1ma,1mb       0           1mb,3mb       8           3sm,4sa       5
4.2 Applications of Road Damage Data

The same method was used to find the relationships in another survey of pavement distresses. There are five kinds of pavement distress, with items different from those of Section 4.1. With a minimum support of 10 occurrences and a minimum confidence of 0.5, the Apriori algorithm produced one rule: when a pavement has alligator cracking, pen-holes will follow. This means that repairing alligator cracking will help reduce the incidence of pen-holes. Therefore, when it comes to choosing pavement surface maintenance methods, the preventative
maintenance methods used in other countries should be adopted. If repair work is done on alligator cracking the moment it appears, the incidence of pen-holes on road surfaces can be reduced greatly. As for repair materials, high-quality and durable ones should be used. Some road maintenance departments have used the latest repair materials, and their records show that these materials are more durable. How to choose materials that show results in the early stage of repair and are also durable is a question that requires further study.

4.3 Discussion

With the use of the Apriori algorithm, we have a better understanding of the associations among different pavement surface damages. Based on these associations, preventative maintenance methods can be adopted to lengthen the lifespan of pavement surfaces. When studying how the reliability threshold changes the result in the second case, we find that the lower the reliability, the larger the number of association rules but the weaker their links. For instance, when the reliability is lowered to 10%, the rule that pen-holes will result in alligator cracking is concluded. The incidence of both damages appearing at the same time accounts for 83.70% of all the data; that is, there is a high percentage of both damages occurring together. However, the reliability of the rule that pen-holes will result in alligator cracking is only 10%, which is not reasonable in practice. There is even an association rule that alligator cracking will lead to manhole distress, although these two damages are not actually related. Therefore, setting the reliability still relies on the judgment of experts to conclude better association rules. Compared with expert judgment alone, however, the Apriori algorithm can conclude more accurate association rules on a scientific basis, and with the increase of data in the database, more associations can be found.
Therefore, Data Mining can yield great results in concluding association rules of pavement surface damages.
5 Conclusion

With the development of information technology, more advanced testing equipment for collecting civil engineering data has been introduced in recent years. For the interpretation and processing of the test data, management decision-making systems based on soft computing or artificial intelligence have been established, providing reasonable and viable maintenance and repair decisions. An effective public construction management information system should have a complete database with large amounts of data; the data in such a database must be reliable, objective and appropriate, so that it can assist with maintenance planning and budget decisions. With the development of information technology and the rapid growth of public construction, automatic data collection has become more and more common. As the capacity of databases continues to increase, new methods and techniques are needed to help engineers and policymakers discover the useful information and knowledge in the databases. This study adopts the Apriori algorithm to conduct an association analysis of pavement surface damages and thereby understand that the presence of certain damages is a result of other damages.
38
C.-T. Hung et al.
Acknowledgements. This study is the partial result of the 2003 Project Plan “The Study of Data Collection in the Database of Pavement Surface Management System (NSC92-2211-E-159-005)” by the National Science Council (NSC), and “The Study of the Rapid Re-construction and Repair Techniques of Public Roads(MOTC-IOT-96EDB009)” by the Institute of Transportation of the Ministry of Transportation and Communications in 2007. We would like to thank both NSC and the Institute of Transportation for their financial support.
References 1. Amado, V.: Expanding the Use of Pavement Management Data. In: 2000 MTC Transportation Scholars Conference, Ames, Iowa (2000) 2. Sarimollaoglu, M., Dagtas, S., Iqbal, K., Bayrak, C.: A Text-Independent Speaker Identification System Using Probabilistic Neural Networks. In: Proceedings of the International Conference on Computing, Communication and Control Technologies CCCT 2004, Austin, Texas, USA, vol. 7, pp. 407–411 (2004) 3. Nassar, K.: Application of data-mining to state transportation agencies. IT con. 12, 139–149 (2007) 4. Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Morgan Kaufman, San Francisco (2000) 5. Agrawal, R., Imilienski, T., Swami, A.: Mining association rules between sets of items in large datasets. In: Buneman, P., Jajodia, S. (eds.) Proc. of the 1996 ACM SIGMOD Int’l Conf. on Management of Data, pp. 207–216. ACM Press, New York (1993) 6. Brin, S., Motwani, R., Ullman, J.D., Tsur, S.: Dynamic Itemset Counting and Implication Rules for Market Basket Analysis. In: Proceeding of 1997 ACM-SIGMOD (SIGMOD 1997), Tucson, AZ, pp. 255–264 (1997)
An Integrated Method for GML Application Schema Match Chao Li1, Xiao Zeng2, and Zhang Xiong1 Computer Application Institute, School of Computer Science and Engineering, Beihang University, 37th Xueyuan Road, Haidian District, Beijing, China, 100083 {licc,xiongz}@buaa.edu.cn, {zengxiao29}@gmail.com
Abstract. GML has been a standard in geographical information area for enhancing the interoperability of various GIS systems for data mining. In order to share geography information based on GML, problems in application schema match need to be overcome first. This paper introduces an integrated multistrategy approach on GML application schema match. It combines existing scheme match algorithm with GML3.0 application schema. Firstly, it transforms the input GML application schemas into a GSTree according to linguistic-based and constraint-based match rules. Similarity between two elements is calculated trough different rules separately, and merged into element-level similarity. Secondly, the element-level similarity is rectified by a structure-level match algorithm based on similarity flooding. Finally, the mapping table of GML application schema elements is obtained. The experiment result shows that the approach can effectively discovery the similarity of schema elements, and improve the match results with a high degree of accuracy.
1 Introduction

With the development of technologies for multimedia, network communication, data mining and spatial information, WebGIS has become the main trend in building open, interoperable and internationalized Geographical Information Systems (GIS). Commercial GIS manufacturers have released products in succession, such as MapInfo ProServer from MapInfo and GeoMedia Web Map from Intergraph. Since there is no common development standard among these companies, each built its own spatial data structure independently. These diverse data formats have to be transformed when data are shared; however, because the companies' spatial objects lack a standard description, information is usually missing or lost after transformation. To overcome these problems in sharing multi-source heterogeneous spatial data, the OpenGIS Consortium (OGC) established an encoding standard, the Geography Markup Language (GML) [1], which is used for the storage, modeling and transport of geographical information. As a communication medium between different GIS applications, GML defines a universal data format. Different applications can therefore communicate with each other using this data description method, and geographical information can be shared semantically among different areas as a basis for deeper spatial data mining.

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 39–46, 2008. © Springer-Verlag Berlin Heidelberg 2008
40
C. Li, X. Zeng, and Z. Xiong
GML defines various geographical elements using XML Schema. It provides some basic schemas as meta-schemas, from which users can choose the necessary elements to build their own application schemas. However, although the sources of spatial data are fairly broad and their structure is complicated, GML allows users to model without restriction. As a result, the application models built by different users may differ from each other in countless ways, in namespace, data type, modeling structure and so forth, even when they define the same geographical element. GML application schema match is therefore essential for the sharing of GML-based geographical information [2]. This paper puts forward an integrated multi-strategy method, based on the GML 3.0 specification, for GML application schema match. It combines element-level and structure-level schema match, and includes linguistic-based and constraint-based match rules. It uses a similarity-flooding-based structure match method to rectify the similarity, taking into account the interaction between neighboring nodes of the GML schema trees.
2 Related Work and Techniques

2.1 Schema Match

The process of schema match can be summarized simply as follows: two schemas are input, a certain algorithm is used to match their elements, and the resulting mapping between the elements of the two schemas is output. As shown in Fig. 1, match methods can be classified into schema-level match, instance-level match, element-level match and structure-level match in terms of the diversity of match objects; schema match methods can also be classified into linguistic-based match and constraint-based match [3].
Fig. 1. Classification of schema match methods
A matcher can be built on one single match algorithm; however, every single algorithm has its own limits. Recent research therefore mainly focuses on building mixed matchers based on several match rules, or on combining the match results of several matchers with weights. Furthermore, a large amount of auxiliary information, such as data dictionaries, knowledge libraries, users' input and the reuse of previous match results, is utilized during real schema match.
An Integrated Method for GML Application Schema Match
41
2.2 GML Application Schema Match

GML 1.0 was released officially in April 2000; it described a variety of geography elements using Document Type Definitions (DTD). GML 2.0 was released in February 2001 and began to use XML Schema to define geography elements and their attributes; it contained only three meta-schemas and mainly focused on simple geography elements. GML 3.0 was released in January 2003, and the number of its meta-schemas grew to 32. Beyond the simple elements, GML 3.0 also describes geography elements that are not 2D linetype elements, including complex, non-linear 3D elements, 2D elements with topological structure, temporal elements, dynamic elements and layers. Support for complex geometrical entities, topology, spatial reference systems, metadata, temporal characteristics and dynamic elements was added in GML 3.0 [4]. To date, most methods for GML application schema match are based on GML 2.0 [2][4][5]. Since the modeling mechanism of GML 2.0 is comparatively simple, the main differences between models of the same geography element lie in element naming and data types; thus most of these methods can only be classified as element-level match [2][5]. GML 3.0 provides a much richer suite of basic labels, public data types and mechanisms for users to build their application schemas, so GML 3.0-based models of the same element may differ not only in element naming and data type but also in how the element is organized; element-level match methods alone therefore cannot meet the requirements of such complex matches. Reference [4] proposed a structure match method: it first sets the similarity of sub-nodes to their linguistic similarity and then obtains the similarity of two nodes by comparing the similarities of their sub-nodes.
Although this method considers the mutual influence between the similarities of different elements to some extent, it only captures the influence that sub-nodes exert on their parent nodes and overlooks the influence that parent nodes exert on sub-nodes. Given the complexity of GML 3.0 application schema match, it is necessary to consider element-level and structure-level factors comprehensively, pre-storing a large amount of GIS auxiliary information and GML 3.0 meta-schema information in a database.
3 GML Application Schema Match

This paper proposes a multi-strategy method for GML application schema match. Firstly, the input GML application schemas are transformed into GML schema trees, and the similarity of the element pairs of the schema trees is calculated using the linguistic-based match rules and the constraint-based match rules separately. The element-level similarity is then obtained by a weighted combination of the two results. Secondly, the similarity is refined by structure-level match based on similarity flooding. Finally, the mapping tables of the two inputs are generated.
Fig. 2. GML application schema match
3.1 GML Application Schema Match

Since GML inherits the characteristics of XML, we choose a tree structure as the model for GML application schema match and call this tree a GSTree. The root node of a GSTree represents the root element of the GML application schema, and a leaf node represents an element that contains no other object; the data type of a leaf node is therefore a basic GML data type. An edge between a parent node and a child node in a GSTree represents the "Contain" relationship between the corresponding elements in the GML application schema. By using a GSTree, loops are removed and unbounded similarity flooding is thus avoided. Furthermore, every node in a GSTree has at most one parent, which guarantees that there is only one route from the root node to any leaf node. A typical GSTree is shown in Fig. 3.
Fig. 3. A typical GML schema tree
In this paper we defined only a single sub-element named "Feature" for the "FeatureMember" element; in real applications there may be multiple sub-elements. The GSTree is built by adding sub-nodes to the "FeatureMember" node according to the same rules.
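To make the GSTree construction concrete, the following is a minimal Python sketch. The toy schema fragment, the class layout and all names are our own illustration under the stated assumptions, not the paper's implementation.

```python
import xml.etree.ElementTree as ET

# Toy GML application schema fragment (illustrative, not from the paper).
SCHEMA = """<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <element name="CityModel">
    <complexType><sequence>
      <element name="FeatureMember">
        <complexType><sequence>
          <element name="Feature" type="AbstractFeatureType"/>
        </sequence></complexType>
      </element>
    </sequence></complexType>
  </element>
</schema>"""

XS = "{http://www.w3.org/2001/XMLSchema}"

class GSTreeNode:
    """One schema element; edges to children model the 'Contain' relationship."""
    def __init__(self, name, dtype=None):
        self.name, self.dtype = name, dtype
        self.parent, self.children = None, []   # at most one parent per node

def build_gstree(elem):
    """Recursively turn nested <element> declarations into GSTree nodes."""
    node = GSTreeNode(elem.get("name"), elem.get("type"))
    # Direct sub-elements follow the element/complexType/sequence/element nesting.
    for child in elem.findall(f"{XS}complexType/{XS}sequence/{XS}element"):
        sub = build_gstree(child)
        sub.parent = node
        node.children.append(sub)
    return node

root = build_gstree(ET.fromstring(SCHEMA).find(XS + "element"))
print(root.name, "->", [c.name for c in root.children])  # CityModel -> ['FeatureMember']
```

Because each node records exactly one parent, the resulting structure is acyclic by construction, which is the property the paper relies on to avoid unbounded similarity flooding.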
3.2 Linguistic-Based Element Match

Based on element names, we introduce match rules from two aspects: semantic match and string match [4][5].

Definition 1: The binary bidirectional relational operator ≌ represents the match relation between two elements. The similarity value of elements e1 and e2 calculated by rule R is written λR(e1, e2).

Definition 2: If the input data cannot meet the restrictions set by rule R, rule R is said to be invalidated.

Semantic Match Rules
Rule 1: in the same namespace, if elements e1 and e2 have the same name, λR1(e1, e2) = 1; otherwise λR1(e1, e2) = 0.
Rule 2: in different namespaces, if elements e1 and e2 have the same name after proper-noun pretreatment, then λR2(e1, e2) = 1; otherwise rule 2 is invalidated.
Rule 3: in different namespaces, if elements e1 and e2 have the same name after synonym pretreatment, then λR3(e1, e2) = S; otherwise rule 3 is invalidated.
Rule 4: in different namespaces, if elements e1 and e2 have the same name after near-synonym pretreatment, then λR4(e1, e2) = H; otherwise rule 4 is invalidated.
The rules above need to be supported by a data dictionary, a proper-noun library and a synonym library for the geography information area. S denotes the similarity assigned to synonyms and H the similarity assigned to near-synonyms, with S > H.

String Match Rule
Rule 5: if str1 is the name string of element e1 and str2 is the name string of element e2, then λR5(e1, e2) = 1 / F(str1, str2), where the function F is the edit distance between str1 and str2, implemented with the Levenshtein distance algorithm [7].
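Rule 5 can be sketched as follows. The Levenshtein implementation is the standard dynamic-programming edit distance; the guard for identical names (edit distance 0) is our addition, since that case would normally already have been caught by rule 1.

```python
def levenshtein(s1: str, s2: str) -> int:
    """Edit distance F(str1, str2), standard two-row dynamic programming."""
    if len(s1) < len(s2):
        s1, s2 = s2, s1
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, 1):
        curr = [i]
        for j, c2 in enumerate(s2, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (c1 != c2)))   # substitution
        prev = curr
    return prev[-1]

def rule5_similarity(name1: str, name2: str) -> float:
    """Rule 5: lambda_R5(e1, e2) = 1 / F(str1, str2).
    The distance-0 guard is ours; identical names match under rule 1."""
    d = levenshtein(name1, name2)
    return 1.0 if d == 0 else 1.0 / d

print(rule5_similarity("kitten", "sitting"))   # edit distance 3 -> 0.333...
```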
Fig. 4. PRI and combination relations of the rules
Fig. 4 shows the priority (PRI) and combination relations of the five rules above. If a rule with higher PRI takes effect, rules with lower PRI are not invoked; if rule 3 or rule 4 takes effect, rule 5 is invoked for combined
computing. For a combination of two rules, if one of them is invalid, the combination is invalid; otherwise, the maximum of the two rules' results is taken as the similarity. For example, if rule 4 takes effect, the similarity value is set to Max(H, λR5(e1, e2)).

3.3 Constraint-Based Element Match

A schema usually contains constraints that define attributes of elements, such as key marks, data types, ranges, uniqueness, selectivity and so forth. Considering the characteristics of GML application schemas, we build a basic label library and a basic data-type library based on XML Schema and GML 3.0. The match rules are as follows:
Rule 6: in the same namespace, if elements e1 and e2 have the same data type and label type, then λR6(e1, e2) = 1; otherwise λR6(e1, e2) = 0.
Rule 7: in different namespaces, if elements e1 and e2 have the same non-empty data type, then λR7(e1, e2) = 1; otherwise rule 7 is invalidated.
Rule 8: in different namespaces, if elements e1 and e2 have the same label type, then λR8(e1, e2) = 1; otherwise λR8(e1, e2) = 0.
Among these three rules, rule 6 has the highest PRI and rule 8 the lowest.

3.4 Weighted Combination of Similarity

Constraint-based element match is usually combined with other match methods, which helps limit the number of candidate matches [3]. Let e1 be a node of input schema A and e2 a node of input schema B, let L be the similarity calculated by the linguistic-based match method and C the similarity calculated by the constraint-based match method, and let ω be a weight input by users. Then:
λ(e1, e2) = ω · L + (1 − ω) · C    (1)
3.5 Structure-Level Match Based on Similarity Flooding

Element-level schema match methods only compute the similarity of the two input schemas' elements and neglect the mutual influence of similarity between those elements. In a GSTree there are abundant "contain" and "be contained" relations among element nodes: a change in the similarity of one pair of nodes may lead to changes in the similarities of its parent pair and its child pairs, so we use structure-level match to modify the result of element-level match. The idea of similarity flooding comes from the Similarity Flooding Algorithm [6], based on which reference [8] proposed a general structure match method, Structure Match (SM), which includes a flooding mechanism based on directed graphs and a method of similarity modification. The difference between a GSTree and a general directed graph is that there is only one route between the root node and a leaf node, and the
similarity of each node on this route influences the similarities of the others. We improved the SM algorithm to make it applicable to the tree structure of the GSTree.

Definition 3: for a node e of a GSTree, P(e) represents its parent node and C(e) represents its child node. For two GSTrees, the initial similarity of a node pair (e1, e2) is λ(e1, e2)0, the similarity calculated by the element-level match method. After k iterations, the similarity of (e1, e2) is obtained by the following expression:
λ(e1, e2)k = θ · λ(e1, e2)k−1 + θP · λ(P(e1), P(e2))k−1 + θC · λ(C(e1), C(e2))k−1    (2)
In Expression (2), θ, θP and θC are weights input by users, with θ + θP + θC = 1. The similarity of node pair (e1, e2) is thus determined by its own similarity in the previous iteration together with the similarities of its parent pair and its child pair.
|λ(e1, e2)k − λ(e1, e2)k−1| < ε    (3)
The loop ends when Expression (3) is satisfied, where ε is a threshold input by users. Through this iterative calculation, the similarity of node pair (e1, e2) propagates along the whole route. After similarity rectification there may be several candidate results, of which the maximum value is chosen as output. To improve match precision, a similarity threshold can be set so that only similarities above the threshold are accepted as final results.
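The iteration of Expressions (2) and (3) can be sketched as follows. The paper does not specify how the root pair (no parent) and leaf pairs (no child) are handled; folding the missing neighbour's weight into θ is our assumption, and the toy pair names are illustrative.

```python
def similarity_flooding(sim0, parent_of, child_of,
                        theta=0.6, theta_p=0.2, theta_c=0.2, eps=1e-4):
    """Iterate Expression (2) until Expression (3) holds for every node pair.

    sim0:               dict mapping a node pair to its element-level similarity.
    parent_of/child_of: dicts mapping a pair to its parent/child pair.
    Boundary handling (our assumption): at the root or a leaf, the missing
    neighbour's weight is folded into theta so the weights still sum to 1.
    """
    assert abs(theta + theta_p + theta_c - 1.0) < 1e-9
    sim = dict(sim0)
    while True:
        new = {}
        for pair, val in sim.items():
            p, c = parent_of.get(pair), child_of.get(pair)
            w = theta + (theta_p if p is None else 0.0) + (theta_c if c is None else 0.0)
            new[pair] = (w * val
                         + (theta_p * sim[p] if p is not None else 0.0)
                         + (theta_c * sim[c] if c is not None else 0.0))
        if all(abs(new[q] - sim[q]) < eps for q in sim):   # Expression (3)
            return new
        sim = new

# Toy chain of matched node pairs: root pair -> middle pair -> leaf pair.
result = similarity_flooding({"root": 0.9, "mid": 0.4, "leaf": 0.8},
                             parent_of={"mid": "root", "leaf": "mid"},
                             child_of={"root": "mid", "mid": "leaf"})
print({k: round(v, 3) for k, v in result.items()})
```

With θ = 0.6 the self term dominates, so the flooding converges quickly; the weakly matched middle pair is pulled up by its strongly matched neighbours, which is exactly the rectification effect the paper exploits.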
4 Experiments

We implemented the above algorithm in a prototype system and carried out experiments on practical data sets. We chose 6 groups of experimental data; each group contained two GML application schemas describing city geography information. As the evaluation criterion we chose the correct detection ratio of element-level match, i.e. the ratio of the number of correct matches detected by the system to the number of real correct matches.
Fig. 5. Correct match ratio with different thresholds of similarity (A: threshold of similarity = 0.7; B: threshold of similarity = 0.8)
According to the experimental results, the average correct detection rate is 71.3% when the similarity threshold is 0.7 and only the element-level match method is used; when structure-level match is introduced to perform similarity modification, the average correct detection rate increases to 94.5%. With a similarity threshold of 0.8, the average correct detection rate is 67.3% when only the element-level match method is applied; after rectification it increases to 92.6%. The experimental results indicate that the multi-strategy method for GML application schema match is an effective way to detect the element match relations of GML application schemas.
5 Conclusion

This article proposes a multi-strategy method for GML application schema match based on GML 3.0 application schemas. The method uses the GSTree as the model for schema match, applies linguistic-based match and constraint-based match separately to calculate the similarity of element pairs, and then combines the two match results with weights. Considering the mutual influence of similarities between neighboring nodes, we adopt a structure-level match method based on similarity flooding to rectify the similarity of element pairs. The experimental results indicate that this method can effectively detect the element match relations of different GML application schemas, achieves a higher correct match ratio, and can be widely applied to the integration of geography information and to GML-based spatial data mining.
References
1. OpenGIS Consortium Inc.: Geographic Information – Geography Markup Language (GML) (2003)
2. Guan, J.H., Zhou, S.G., Chen, J.P.: Ontology Based GML Schema Match for Spatial Information Integration. In: Proceedings of the Second International Conference on Machine Learning and Cybernetics, pp. 2240–2245 (2003)
3. Rahm, E., Bernstein, P.A.: A Survey of Approaches to Automatic Schema Matching. The VLDB J. 10(4), 334–350 (2001)
4. Guan, J.H., Yu, W., An, Y.: Geography Markup Language Schema Match Algorithm. J. Wuhan Univ. 29(2), 169–174 (2004)
5. Zhang, Q., Sun, S., Yuan, P.P.: Fuzzyset-based Schema Match Algorithm for Geographic Information. J. Huazhong Univ. Sci. Technol. 34(7), 46–48 (2006)
6. Melnik, S., Hector, G.M., Rahm, E.: Similarity Flooding: A Versatile Graph Matching Algorithm and Its Application to Schema Matching. In: Proceedings of the 18th International Conference on Data Engineering, pp. 117–128 (2002)
7. Zhou, J.T., Zhang, S.S., Wang, M.W.: Element Match by Concatenating Linguistic-based Matchers and Constraint-based Matcher. In: Proceedings of the 17th IEEE International Conference on Tools with Artificial Intelligence, pp. 265–269 (2005)
8. Cheng, W., Zhou, L.X., Sun, Y.F.: A Multistrategy Generic Schema Match Approach. Computer Science 31(11), 121–123 (2004)
Application of Classification Methods for Forecasting Mid-Term Power Load Patterns

Minghao Piao, Heon Gyu Lee, Jin Hyoung Park, and Keun Ho Ryu*

Database/Bioinformatics Laboratory, Chungbuk National University, Cheongju, Korea
{bluemhp,hglee,neozean,khryu}@dblab.chungbuk.ac.kr
Abstract. An automated methodology based on data mining techniques is presented for the prediction of customer load patterns in long-duration load profiles. The proposed approach consists of three stages: (i) data preprocessing: noise and outliers are removed and the continuous attribute-valued features are transformed into discrete values; (ii) cluster analysis: k-means clustering is used to create load pattern classes and the representative load profile for each class; and (iii) classification: several supervised learning methods are evaluated in order to select a suitable prediction method. In the proposed methodology, power load measured by an AMR (automatic meter reading) system, together with customer indexes, is used as input for clustering, whose output is the set of representative load profiles (classes). To evaluate the forecasting of load patterns, several classification methods were applied to a set of high-voltage customers of the Korean power system, with the class labels derived from clustering and the other features used as input to produce the classifiers. Lastly, the results of our experiments are presented.
1 Introduction

Prediction of electrical customer load patterns has been an important issue in the power industry. Load pattern prediction deals with the discovery of power load patterns from load demand data. It attempts to identify existing customer load patterns and develop new load forecasting methods, employing techniques from fields such as statistical analysis [1], [2] and data mining [3], [4], [5]. In power systems, data mining is the most commonly used approach to determine load profiles and to extract regularities from load data for load pattern forecasting. In particular, it promises to help detect previously unseen load patterns by establishing sets of observed regularities in load demand data; these sets can be compared with the current load pattern for deviation analysis. Load pattern prediction using data mining is usually performed by building models on related information: weather, temperature and previous load demand data. Such prediction is usually aimed at the short term [6, 7, 8, 9, 10, 11], since mid- and long-term predictions may not be reliable because they contain high forecasting errors. However, mid- and long-term [12] forecasting of load demand (load patterns over a longer period) is very useful and of great interest.

* Corresponding author.
D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 47–54, 2008. © Springer-Verlag Berlin Heidelberg 2008
48
M. Piao et al.
Fig. 1. Load pattern prediction framework (input data: customer information, temperature, AMR load data; preprocessing: removing noise and outliers, discretization; cluster analysis with k-means to generate load profiles and classes; class label assignment and representative monthly load patterns for each customer; classifier method acquisition, building the model on a training set, validating it on a testing set, and reporting the results of validation)
The main objective of our work is to forecast monthly load patterns, in terms of classification accuracy, from ten months of daily power-usage data and customer information. A framework of our approach is shown in Fig. 1. The main tasks are the following:

1. Cluster analysis is performed to detect load pattern classes and the load profiles for each class.
2. A classification module uses customer load profiles to build a classifier able to assign different customer load patterns to the existing classes.
3. The classifiers are evaluated to select a suitable classification method.
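The three-stage pipeline can be sketched end to end on synthetic data. The curve shapes, the plain k-means implementation and the nearest-centroid rule standing in for stage (iii) are illustrative assumptions; they are not the KEPRI data or the classifiers evaluated in the paper.

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain k-means with greedy farthest-point seeding (stage ii stand-in)."""
    centroids = [X[0]]
    for _ in range(k - 1):
        dist = np.min([((X - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(X[np.argmax(dist)])
    centroids = np.array(centroids)
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1), axis=1)
        centroids = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                              else centroids[j] for j in range(k)])
    return labels, centroids

# Synthetic stand-in for daily load curves: 60 customers x 24 hourly readings,
# drawn from two consumption shapes (a daytime peak and a night peak).
rng = np.random.default_rng(0)
t = np.arange(24)
day = 1.0 + 0.8 * np.exp(-((t - 13) ** 2) / 18.0)
night = 1.0 + 0.8 * np.exp(-((t - 2) ** 2) / 18.0)
X = np.vstack([day + 0.05 * rng.standard_normal((30, 24)),
               night + 0.05 * rng.standard_normal((30, 24))])

labels, profiles = kmeans(X, k=2)   # classes + representative load profiles
# Stage (iii) stand-in: assign a new customer to the nearest representative profile.
new_customer = day + 0.05 * rng.standard_normal(24)
pred = int(np.argmin(((profiles - new_customer) ** 2).sum(axis=1)))
print("predicted class:", pred)
```

In the paper, stage (iii) is a proper supervised classifier trained on (features, cluster label) pairs; the nearest-centroid assignment above only illustrates how the cluster output feeds the classification stage.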
2 Data Collection and Preprocessing

A case study is considered concerning a database of load patterns and power usage from 1049 high-voltage consumers; this information was collected by KEPRI (Korea Electric Power Research Institute). The load patterns from AMR were recorded over a period of ten months (from January to October) of 2007, with the instant power consumption of each consumer sampled at 15-minute intervals. A commercial index related to the customer electricity use code, the maximum load demand and temperatures are also used. To compare load patterns and build the classifier, we use load shape features [13] that capture relevant information about consumption behavior. These features describe the daily load curve shape of each consumer for each month and are presented in Table 1. Lastly, since the extracted features contain continuous variables, entropy-based discretization is used, because the intervals are selected according to the information they contribute to the target variable. Following the decision-tree discretization approach [14], all continuous variables are cut into a number of intervals. Let T partition the set D of examples into subsets D1 and D2, and let there be k classes C1, ..., Ck.
Table 1. Load curve shape features

  Shape Feature                        Definition
  L1: Load Factor (24h)                s1 = PatternAvg for day / PatternMax for day
  L2: Night Impact (8h: 23pm~07am)     s2 = (1/3) · (PatternAvg for night / PatternAvg for day)
  L3: Lunch Impact (3h: 12am~03pm)     s3 = (1/8) · (PatternAvg for lunch / PatternAvg for day)

Fig. 2. Data preprocessing for AMR data. Before preprocessing, the features are: Customer Electricity Use Code (nominal, 21 different values); Max Load Demand (continuous, min 0.32 ~ max 5544); Temperature (continuous, min −15.34 ~ max 35.23); the AMR daily power usage at 15-minute intervals (0, 15, ..., 2345) for each day from 1st Jan. to 30th Oct. (continuous, min 0.32 ~ max 5544); and the class, Cluster (nominal, {cluster1, ..., cluster12}). After preprocessing, the Customer Electricity Use Code keeps its 21 nominal values, while Max Load Demand, Temperature and the daily load factors L1, L2, L3 of each of the 304 days are all discretized to nominal values; the class remains Cluster (nominal, {cluster1, ..., cluster12}).
Fig. 3. Sample of preprocessed input data. Each row contains the customer electricity use code (CUD), the discretized max load demand (MLD) and temperature (TEM), the discretized daily load factors L1, L2, L3 for each day from 1st Jan. to 30th Oct. (days 1 to 304), and the class label (e.g. cluster1, cluster3, cluster6).
Let P(Ci, Dj) be the proportion of examples in Dj that have class Ci. The class entropy of a subset Dj, j = 1, 2, is defined as

Ent(Dj) = − Σ_{i=1}^{k} P(Ci, Dj) log(P(Ci, Dj))    (7)

Suppose the subsets D1 and D2 are induced by partitioning a feature A at point T. Then the class information entropy of the partition, denoted E(A, T; D), is given by:

E(A, T; D) = (|D1| / |D|) Ent(D1) + (|D2| / |D|) Ent(D2)    (8)

A binary discretization for A is determined by selecting the cut point TA for which E(A, T; D) is minimal among all the candidate cut points. The same process can be applied recursively to D1 and D2 until some stopping criterion is reached. The Minimum Description Length Principle is used to stop partitioning: recursive partitioning within a set of values D stops if Gain(A, T; D) does not exceed the MDLP threshold.
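Equations (7) and (8) and the selection of the best cut point can be sketched directly; the toy feature values and the function names are illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """Class entropy Ent(D) of Eq. (7), in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def partition_entropy(values, labels, t):
    """Class information entropy E(A, T; D) of Eq. (8) for cut point t."""
    left = [l for v, l in zip(values, labels) if v <= t]
    right = [l for v, l in zip(values, labels) if v > t]
    n = len(labels)
    return len(left) / n * entropy(left) + len(right) / n * entropy(right)

def best_cut(values, labels):
    """Select the cut point T_A minimising E(A, T; D); candidates are the
    midpoints between consecutive distinct sorted values."""
    vs = sorted(set(values))
    cuts = [(a + b) / 2 for a, b in zip(vs, vs[1:])]
    return min(cuts, key=lambda t: partition_entropy(values, labels, t))

vals = [1, 2, 3, 9, 11, 20, 25, 30]     # toy continuous feature
labs = ["low"] * 4 + ["high"] * 4       # target classes
print(best_cut(vals, labs))             # 10.0 (the clean low/high split)
```

The cut at 10.0 separates the classes perfectly, so E(A, T; D) drops to zero there; in the recursive MDLP scheme, each resulting subset would then be tested again before further splitting.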
If N >> |U| = n, the information system has enough attributes and distinct attribute values to discern all objects, so that in attribute reduction more attributes can be removed while the classification is maintained. Otherwise, if |U| ≈ N, only a few attributes can be removed from the information system, because the attributes are only just sufficient to discern all objects.
A Tentative Approach to Minimal Reducts by Combining Several Algorithms
121
Number N gives much information on attribute reduction. On the other hand, when C = {c1, c2, ..., cm}, Vci = {ci1, ci2, ..., ci,mi} and |U| = n are known, the number of attributes needed to distinguish the n objects can also be estimated. For an information system S with m condition attributes, suppose that among {c1, c2, ..., cm}, t1 attributes have s1 different values (|U/c| = s), t2 attributes have s2 different values, t3 attributes have s3 different values, ..., and tr attributes have sr different values. Then the system can describe N different objects:

N = ∏_{i=1}^{r} s_i^{t_i}    (4-1)

In (4-1): 2 ≤ s_i ≤ |U| and ∑_{i=1}^{r} t_i = m.

If |U| = n and ci ∈ C with |U/ci| = si, index the si from big to small, s1 ≥ s2 ≥ s3 ≥ ... ≥ sm. Then there exists p0 ∈ I, I = {1, 2, 3, ..., m}, that makes the following two formulas true:

∏_{i=1}^{p0−1} s_i ≤ n,   ∏_{i=1}^{p0} s_i ≥ n    (4-2)

Formula (4-2) shows that at least p0 attributes are needed to describe the different objects in the system; with fewer than p0 attributes the system certainly cannot reach |U/ind(C)| = n. Because a reduct must meet (2-3), the number of attributes in a reduct generally satisfies |redD(C)| ≈ p0 or |redD(C)| ≥ p0. This result gives the lower limit p0: it is the least number of attributes needed to discern any two objects in the system, and hence the least reduct size at which to search for minimal reducts.
5 Algorithm and Examples
Following is the reduction analysis algorithm PARA (Pre-Analysis of attribute Reduction Algorithm). Input: information system S = {U, A, V, f}.

1. Get the ti and si of C, and compute N by (4-1);
2. Index the si in descending order, and compute p0 by (4-2);
3. IF |C| − p0 = 0 THEN stop and exit, ELSE continue;
4. Compute ind(C) of the system;
5. Compute pos_ind(C)(D) of the system;
6. For every c ∈ C: IF pos_ind(C−c)(D) ≠ pos_ind(C)(D) THEN core(C) = core(C) ∪ {c};
7. C' = C − core(C);
8. Output N, n, p0, |C| − p0, core(C), p0 − |core(C)|, C'.

122
N. Xu, Y. Liu, and R. Zhou
This is the classical CTR (Car Test Result) dataset [6], discretized as shown in Table 1; its reduction analysis by PARA proceeds as follows:
Table 1. Classified CTR Dataset

  U    a  b  c  d  e  f  g  h  i   D
  1    0  1  1  1  1  1  1  1  0   1
  2    0  1  0  1  1  0  1  0  0   1
  3    0  1  0  1  1  1  1  0  0   1
  4    0  0  1  1  1  1  1  0  1   2
  5    0  1  0  1  1  0  0  0  0   1
  6    0  1  0  0  1  0  0  1  2   0
  7    0  1  0  1  1  0  1  0  2   0
  8    1  0  0  0  0  1  2  0  1   2
  9    0  0  0  0  0  1  2  0  0   1
  10   0  0  0  0  0  1  0  1  0   1
  11   1  0  0  1  0  1  2  0  1   2
  12   1  0  0  1  1  0  0  0  0   2
  13   0  0  0  0  1  0  0  0  0   1
  14   1  0  1  1  0  1  1  0  0   2
  15   1  0  0  0  0  0  2  0  0   2
  16   0  0  1  1  1  0  1  0  0   1
  17   0  1  0  1  1  0  1  1  0   1
  18   0  0  0  1  1  0  1  1  0   1
  19   1  0  0  1  0  1  0  0  0   2
  20   0  0  0  1  0  1  0  0  0   2
  21   0  0  0  0  0  1  0  0  0   1
1. Getting: t1 = 2, s1 = 3; t2 = 7, s2 = 2; by (4-1): N = 3^2 × 2^7 = 1152;
2. si in descending order: 3, 3, 2, 2, 2, 2, 2, 2, 2; using (4-2) with n = 21: p0 = 4;
3. m = 9, |C| − p0 = 5;
4. |U/ind(C)| = |{1}, {2}, {3}, {4}, ..., {21}| = |U|;
5. pos_ind(C)(D) = {1, 2, 3, 4, ..., 21} = U;
6. pos_ind(C−d)(D) ≠ U and pos_ind(C−i)(D) ≠ U, so coreD(C) = {d, i};
7. C' = C − coreD(C) = {a, b, c, e, f, g, h};
8. Output: N = 1152, n = 21, p0 = 4, |C| − p0 = 5, coreD(C) = {d, i}, p0 − |coreD(C)| = 2, |C'| = 7.

Because N >> n (1152 >> 21), the system has enough condition attributes, and perhaps half of them are redundant. Because p0 = 4, four attributes could form a reduct. As the core has two attributes, two further attributes are needed for a minimal reduct.
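Steps 4 to 6 of the example can be checked mechanically. The sketch below recomputes the positive region and the core on the Table 1 data; the function names and the data encoding are ours.

```python
from collections import defaultdict

ATTRS = list("abcdefghi")
# Table 1 rows: values of a..i followed by the decision D, for objects 1..21.
ROWS = [
    (0,1,1,1,1,1,1,1,0, 1), (0,1,0,1,1,0,1,0,0, 1), (0,1,0,1,1,1,1,0,0, 1),
    (0,0,1,1,1,1,1,0,1, 2), (0,1,0,1,1,0,0,0,0, 1), (0,1,0,0,1,0,0,1,2, 0),
    (0,1,0,1,1,0,1,0,2, 0), (1,0,0,0,0,1,2,0,1, 2), (0,0,0,0,0,1,2,0,0, 1),
    (0,0,0,0,0,1,0,1,0, 1), (1,0,0,1,0,1,2,0,1, 2), (1,0,0,1,1,0,0,0,0, 2),
    (0,0,0,0,1,0,0,0,0, 1), (1,0,1,1,0,1,1,0,0, 2), (1,0,0,0,0,0,2,0,0, 2),
    (0,0,1,1,1,0,1,0,0, 1), (0,1,0,1,1,0,1,1,0, 1), (0,0,0,1,1,0,1,1,0, 1),
    (1,0,0,1,0,1,0,0,0, 2), (0,0,0,1,0,1,0,0,0, 2), (0,0,0,0,0,1,0,0,0, 1),
]
TABLE = [dict(zip(ATTRS + ["D"], row)) for row in ROWS]

def positive_region(table, conds, decision="D"):
    """pos_ind(conds)(D): objects whose indiscernibility block is pure in D."""
    blocks = defaultdict(list)
    for idx, row in enumerate(table):
        blocks[tuple(row[a] for a in conds)].append(idx)
    return {i for objs in blocks.values()
            if len({table[j][decision] for j in objs}) == 1 for i in objs}

def core_attributes(table, conds, decision="D"):
    """PARA step 6: c is in the core iff dropping c changes the positive region."""
    full = positive_region(table, conds, decision)
    return {c for c in conds
            if positive_region(table, [a for a in conds if a != c], decision) != full}

# The paper reports a full positive region (21 objects) and core {d, i}.
print(len(positive_region(TABLE, ATTRS)), sorted(core_attributes(TABLE, ATTRS)))
```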
So now, the lower bound for searching for a minimal reduct is 4 attributes. Because |coreD(C)| = 2, finding a reduct among the 511 subsets of the 7 remaining attributes is first reduced to checking the C(7,2) = 21 two-attribute subsets combined with the core. If a reduct can be found among them, further depth-first search is unnecessary: it is the minimal reduct of the IS. If there is no reduct among the 21 subsets, this gives a lower bound on the number of attributes in a reduct; a depth-first search then finds some reduct, whose attribute number becomes the upper bound, and a binary search can proceed between the two bounds. For the CTR dataset, the minimal reduct is redD(C) = {d, i, a, e}.

Another example is from [7]. The dataset comes from medical treatment records: there are 20 inspected attributes and 568 cases, and five experts divided the cases into 5 classes. Using the reduction analysis algorithm PARA, the output is: N ≈ 8.0 × 10^14, N >> n, p0 = 4, and |coreD(C)| = 1, |C'| = 19.

The lower bound for finding minimal reducts is to check the C(19,3) = 969 attribute subsets. Any heuristic algorithm based on attribute significance, for example using attribute dependency as the significance, can obtain a reduct with 14 attributes; this becomes the upper bound for searching for minimal reducts. The next step is searching the C(19,8) = 75582 attribute subsets, where several reducts are obtained; then the C(19,6) = 27132 attribute subsets, where no reduct is found; and the last step is searching the C(19,7) = 50388 attribute subsets, where 13 reducts are obtained, all of them minimal reducts. In total, 154071 attribute subsets are searched, which is greatly reduced compared with the 19! ≈ 1.2 × 10^17 subsets of an exhaustive search, less than 1/(7.89 × 10^11) of that number. A quicker heuristic algorithm obtains a reduct of 8 attributes; by binary search, only C(19,5) + C(19,6) = 38760 attribute subsets then need to be searched, less than 1/(3.13 × 10^12) times 19!.
6 Conclusion

Attribute reduction, especially high-dimensionality reduction, is of great importance. This paper discusses a reduction analysis algorithm based on the indiscernibility relation and equivalence classes. As attributes are added to an indiscernibility relation, its number of equivalence classes increases; ind(C) always attains the largest number of equivalence classes, with |U/ind(C)| = |U|. From this point of view, the data structure of an information system can be used to analyze its ability to distinguish any two objects. Based on this structure, the reduction pre-analysis algorithm PARA provides much information and determines the lower limit for searching for minimal reducts, while any heuristic algorithm can provide the upper limit. The examples show that the algorithm is efficient and fast; it may play some role in dimensionality reduction and in finding minimal reducts. The study attempts to give a simple line of thought on dimensionality reduction.
Acknowledgements. The project is supported by China Guangdong Natural Science Foundation (No.06301299) and Professor & Doctor Special Research Funds of Guangdong Institute of Education.
References
1. Pawlak, Z.: Rough Sets. Int. J. Comput. Inform. Sci. 11(5), 341–356 (1982)
2. Pawlak, Z.: Rough Sets and Their Applications. Microcomputer Applications 13(2), 71–75 (1994)
3. Wong, S.K.M., Ziarko, W.: On Optimal Decision Rules in Decision Tables. Bullet. Polish Acad. Sci. 33, 693–696 (1995)
4. Xu, N.: The Theory and Technique Research of Attribute Reduction in Data Mining Based on Rough Sets. PhD dissertation, Guangdong University of Technology (2005)
5. Ni, Z., Cai, J.: Discrete Mathematics. Science Press (2002)
6. Zhang, W., Wu, W., Liang, J., Li, D.: Theory and Method of Rough Sets. Science Press (2001)
7. Guo, J.: Rough Set-based Approach to Data Mining. PhD dissertation, Department of Electrical Engineering and Computer Science, Case Western Reserve University, USA (2003)
8. Hu, X.: Knowledge Discovery in Database: An Attribute-oriented Rough Set Approach (Rules, Decision Matrices). PhD dissertation, The University of Regina, Canada (1995)
9. Wang, J., Miao, D.: Analysis on Attribute Reduction Strategies of Rough Set. J. Comput. Sci. Technol. 13(2), 189–193 (1998)
10. Shi, Z.: Knowledge Discovery. Tsinghua University Press, Beijing (2002)
11. Duntsch, I., Gediga, G., Orlowska, E.: Relational Attribute Systems II: Reasoning with Relations in Information Structures. In: Peters, J.F., Skowron, A., Marek, V.W., Orłowska, E., Słowiński, R., Ziarko, W. (eds.) Transactions on Rough Sets VII. LNCS, vol. 4400, pp. 16–35. Springer, Heidelberg (2007)
Ameliorating GM (1, 1) Model Based on the Structure of the Area under Trapezium

Cuifeng Li

Zhejiang Business Technology Institute, 315012, Ningbo, Zhejiang
[email protected]

Abstract. Based on research into the structure of the background value in the GM(1,1) model, an exact formula for the background value of x^{(1)}(t) on the interval [k, k+1], used when establishing GM(1,1), is derived by integrating x^{(1)}(t) from k to k+1. The improved background value raises both the modeling precision and the prediction precision, and enlarges the application area of the GM(1,1) model. Finally, a model of China's per capita power consumption is set up. Simulation examples show the effectiveness of the proposed approach.

Keywords: grey theory, background value, precision.
1 Introduction

The grey system theory has attracted great attention from researchers since 1982 and has been widely used in many fields, such as industry, agriculture, ecology, the market economy, and so on. The GM(1,1) model has been greatly improved by many scholars at home and abroad. Grey system theory can effectively deal with incomplete and uncertain information systems. The background value is an important factor in both fitting precision and prediction precision. Based on research into the structure of the background value in the GM(1,1) model, an exact formula for the background value of x^{(1)}(t) on the interval [k, k+1], used when establishing GM(1,1), is derived by integrating x^{(1)}(t) from k to k+1. The improved background value raises both the modeling precision and the prediction precision, and enlarges the application area of the GM(1,1) model. Finally, a model of China's per capita power consumption is set up. Simulation examples show the effectiveness of the proposed approach.
2 Modeling Mechanism of the Ameliorating GM (1, 1) Model

2.1 GM(1,1) Model

Let the non-negative original data sequence be denoted by

X^{(0)} = \{x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\}. \quad (1)

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 125–131, 2008. © Springer-Verlag Berlin Heidelberg 2008
Then the 1-AGO (accumulated generating operation) sequence X^{(1)} can be obtained as follows:

X^{(1)} = \{x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n)\}, \quad (2)

where

x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, n. \quad (3)
The grey GM(1,1) model can be constructed by establishing a first-order differential equation for x^{(1)}(t):

\frac{dx^{(1)}(t)}{dt} + a\,x^{(1)}(t) = u. \quad (4)
where a and u are the parameters to be estimated. Integrating (4) from k to k+1 gives

\int_k^{k+1} dx^{(1)}(t) + a \int_k^{k+1} x^{(1)}(t)\,dt = u \int_k^{k+1} dt. \quad (5)
Then

\int_k^{k+1} dx^{(1)}(t) = x^{(1)}(t)\big|_k^{k+1} = x^{(1)}(k+1) - x^{(1)}(k) = x^{(0)}(k+1). \quad (6)
Suppose

z^{(1)}(k+1) = \int_k^{k+1} x^{(1)}(t)\,dt \quad (7)

is the background value of x^{(1)}(t) on the interval [k, k+1]. Thus, (5) can be rewritten in the following form:

x^{(0)}(k+1) + a\,z^{(1)}(k+1) = u. \quad (8)
From (7), it is observed that the value of z^{(1)}(k+1) can be established by integrating x^{(1)}(t) from k to k+1. The parameters a and u are solved by means of LS (least squares):

\begin{pmatrix} \hat{a} \\ \hat{u} \end{pmatrix} = [B^T B]^{-1} B^T Y, \quad (9)

where

B = \begin{bmatrix} -\int_1^2 x^{(1)}(t)\,dt & 1 \\ -\int_2^3 x^{(1)}(t)\,dt & 1 \\ \vdots & \vdots \\ -\int_{n-1}^{n} x^{(1)}(t)\,dt & 1 \end{bmatrix}, \qquad Y = [x^{(0)}(2), x^{(0)}(3), \ldots, x^{(0)}(n)]^T.
Therefore, the time response function is obtained by solving (4):

\hat{x}^{(1)}(k+1) = \left[x^{(1)}(1) - \frac{\hat{u}}{\hat{a}}\right] e^{-\hat{a}k} + \frac{\hat{u}}{\hat{a}}. \quad (10)
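The fitting pipeline of Section 2.1 — 1-AGO (3), background value, least squares (9), and time response (10) — can be sketched as follows. The data are the first few real values from Table 1, and the traditional averaged background value z^{(1)}(k+1) = [x^{(1)}(k) + x^{(1)}(k+1)]/2 is used here:

```python
import math

def gm11_fit(x0):
    """Fit a traditional GM(1,1) model to a non-negative sequence x0.
    Returns (a, u) estimated by least squares, per equations (8)-(9)."""
    n = len(x0)
    # 1-AGO sequence, equation (3)
    x1 = [sum(x0[:k + 1]) for k in range(n)]
    # Traditional background value: mean of consecutive AGO values
    z = [(x1[k] + x1[k + 1]) / 2.0 for k in range(n - 1)]
    # Solve the 2x2 normal equations [B^T B](a, u)^T = B^T Y by hand;
    # B rows are (-z_k, 1) and Y entries are x0[k+1]
    szz = sum(zk * zk for zk in z)
    sz = sum(z)
    sy = sum(x0[1:])
    szy = sum(zk * yk for zk, yk in zip(z, x0[1:]))
    m = n - 1
    det = szz * m - sz * sz
    a = (sz * sy - m * szy) / det
    u = (szz * sy - sz * szy) / det
    return a, u

def gm11_predict(x0, a, u, k):
    """Time response (10) for x_hat1, then difference back to x_hat0(k+1)."""
    xh1 = lambda t: (x0[0] - u / a) * math.exp(-a * t) + u / a  # t = 0, 1, ...
    return xh1(k) - xh1(k - 1) if k >= 1 else x0[0]

x0 = [306.35, 311.2, 324.9, 343.4, 361.61]  # first values from Table 1
a, u = gm11_fit(x0)
print(a, u, gm11_predict(x0, a, u, 1))
```

For this growing sequence the estimated development coefficient a is negative, and the fitted values track the originals to within a few tenths of a percent.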
2.2 The Improved Structure of the Background Value
In the traditional GM(1,1) model, Z^{(1)}(k+1) is taken as the average of x^{(1)}(k) and x^{(1)}(k+1). As Fig. 1 shows, this traditional background value can be regarded as the area of the trapezium abcd, whereas the real background value is the integral of x^{(1)}(t) over [k, k+1]. Building the model with the traditional background value therefore brings lower precision and higher error. A background value reconstructed by the method of rectangle was proposed by Tan Guanjun; this method achieves better precision, but it still has a rather large error, as can be seen from Fig. 2. This paper proposes a new background value that combines a high-precision interpolation formula with the trapezium method, which can improve the prediction precision of GM(1,1). The idea of the method is as follows: divide the interval [k, k+1] into N equal subintervals of length \Delta t = 1/N, and let the values of the function x^{(1)}(t) at the division points be x^{(1)}(k), x_1, x_2, x_3, \ldots, x_{N-1}, x^{(1)}(k+1), as in Fig. 3.
Fig. 1. Z^{(1)}(k+1) using the traditional background value
Fig. 2. Z (1) (k + 1) reconstructed by the method of rectangle
Fig. 3. Z (1) (k + 1) reconstructed by the method of trapezium
The total of the N trapezium areas is regarded as an approximation of the actual area. Obviously, the larger N is, the closer the total of the N areas is to the actual area, as in Fig. 3. Thus, the background value obtained by the method proposed in this paper is nearer to the actual area than the traditional one. The total of the N areas, denoted S_N, is deduced as follows.
In every subinterval, the area of a narrow trapezium is substituted for the area under the curve. According to the formula for the area of a trapezium, we obtain

S_N = \int_k^{k+1} x^{(1)}(t)\,dt \approx \frac{1}{2}[x^{(1)}(k) + x_1]\Delta t + \frac{1}{2}(x_1 + x_2)\Delta t + \cdots + \frac{1}{2}[x_{N-1} + x^{(1)}(k+1)]\Delta t = \frac{1}{2N}[x^{(1)}(k) + 2x_1 + 2x_2 + \cdots + 2x_{N-1} + x^{(1)}(k+1)]. \quad (11)
Suppose

z_N^{(1)}(k+1) = S_N = \frac{1}{2N}[x^{(1)}(k) + 2x_1 + 2x_2 + \cdots + 2x_{N-1} + x^{(1)}(k+1)], \quad (12)

where k = 1, 2, \ldots, n-1, and x_i is the ordinate value of the corresponding curve at abscissa k + i/N:

x_i = x^{(1)}\!\left(k + \frac{i}{N}\right), \quad i = 1, 2, \ldots, N-1. \quad (13)
Obviously, the following equality is obtained when N = 1:

z_N^{(1)}(k+1) = S_N = \frac{1}{2}[x^{(1)}(k) + x^{(1)}(k+1)]. \quad (14)
2.3 Calculate the Background Value
From the above, if the new background value is to be constructed, the values x_i must be obtained first. But the values x_i do not exist in the data, so the Newton divided-difference interpolation formula is introduced to estimate them. Suppose Y(k) = k, k = 1, 2, \ldots, n, and let [Y(k), x^{(1)}(k)], k = 1, 2, \ldots, n, be points of the corresponding curve; Newton interpolation then gives the value x^{(1)}(k + i/N) at the corresponding abscissa Y(k) + i/N, i = 1, 2, \ldots, N-1.

Definition 3.1 [6]. The function f[x_0, x_k] = \frac{f(x_k) - f(x_0)}{x_k - x_0} is defined as the first-order divided difference of f(x) about x_0, x_k. The function f[x_0, x_1, x_k] = \frac{f[x_0, x_k] - f[x_0, x_1]}{x_k - x_1} is defined as the second-order divided difference of f(x) about x_0, x_k. In general, f[x_0, x_1, \ldots, x_{k-1}] = \frac{f[x_0, \ldots, x_{k-3}, x_{k-1}] - f[x_0, x_1, \ldots, x_{k-2}]}{x_{k-1} - x_{k-2}} is the (k-1)-order divided difference, and f[x_0, x_1, \ldots, x_k] = \frac{f[x_0, \ldots, x_{k-2}, x_k] - f[x_0, x_1, \ldots, x_{k-1}]}{x_k - x_{k-1}} is the k-order divided difference of f(x) about x_0, x_k.

The Newton interpolation formula in [6] is as follows. Suppose x is a point in [a, b]; then

f(x) = f(x_0) + f[x, x_0](x - x_0),
f[x, x_0] = f[x_0, x_1] + f[x, x_0, x_1](x - x_1),
\ldots
f[x, x_0, x_1, \ldots, x_{n-1}] = f[x_0, x_1, \ldots, x_n] + f[x, x_0, \ldots, x_n](x - x_n). \quad (15)

Substituting each later formula into the former gives

f(x) = f(x_0) + f[x_0, x_1](x - x_0) + f[x_0, x_1, x_2](x - x_0)(x - x_1) + \cdots + f[x_0, x_1, \ldots, x_n](x - x_0)(x - x_1)\cdots(x - x_{n-1}) + f[x, x_0, \ldots, x_n]\,\omega_{n+1}(x) = N_n(x) + R_n(x), \quad (16)

where

R_n(x) = f(x) - N_n(x) = f[x, x_0, x_1, \ldots, x_n]\,\omega_{n+1}(x), \quad (17)

with \omega_{n+1}(x) = (x - x_0)(x - x_1)\cdots(x - x_n). The Newton interpolating polynomial is

N_n(x) = f(x_0) + f[x_0, x_1](x - x_0) + f[x_0, x_1, x_2](x - x_0)(x - x_1) + \cdots + f[x_0, x_1, \ldots, x_n](x - x_0)(x - x_1)\cdots(x - x_{n-1}). \quad (18)

Then the new background value is obtained easily:

z^{(1)}(k+1) = \frac{1}{2N}[x^{(1)}(k) + 2x_1 + 2x_2 + 2x_3 + \cdots + 2x_{N-1} + x^{(1)}(k+1)]. \quad (19)

Generally, the larger N is, the more accurate the GM(1,1) model is.
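The procedure of Sections 2.2–2.3 — interpolate the 1-AGO sequence with Newton's divided-difference polynomial (18), then form the background value by the composite trapezium rule (19) — can be sketched as follows; the data sequence is illustrative:

```python
def newton_interp(xs, ys):
    """Return a callable Newton divided-difference interpolating polynomial
    through the points (xs[i], ys[i]) -- equation (18)."""
    n = len(xs)
    coef = list(ys)                      # divided-difference table, in place
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    def p(x):
        val = coef[-1]
        for i in range(n - 2, -1, -1):   # Horner-like evaluation
            val = val * (x - xs[i]) + coef[i]
        return val
    return p

def background_value(x1, k, N=10):
    """Improved background value z(k+1) of equation (19): composite trapezium
    over [k, k+1] with N subintervals, interior values by Newton interpolation.
    x1 is the 1-AGO sequence, 1-indexed via x1[k-1]."""
    p = newton_interp(list(range(1, len(x1) + 1)), x1)
    interior = sum(p(k + i / N) for i in range(1, N))
    return (x1[k - 1] + 2 * interior + x1[k]) / (2 * N)

x1 = [1.0, 4.0, 9.0, 16.0]        # illustrative: x1(t) = t^2 at t = 1..4
z = background_value(x1, 2, N=100)
print(z)                           # close to the exact integral of t^2 on [2, 3]
```

With N = 1 the function reduces exactly to the traditional average (14); for this quadratic sequence the interpolant is exact, so the result converges to the true integral 19/3 as N grows.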
3 Example

Per capita power consumption is a measure of economic development level and people's living standards; thus it is necessary to build a model of it and to predict its developmental tendency. Now, the method proposed in this paper is used to build
Table 1. Comparison of two modeling methods (* marks predicted values)

Year  | Real value | Method in [2]: model value | Method in [2]: relative error (%) | This paper: model value | This paper: relative error (%)
1980  | 306.35  | 306.35  | 0     | 306.35  | 0
1981  | 311.2   | 303.04  | 2.62  | 300.07  | 3.58
1982  | 324.9   | 325.16  | -0.07 | 321.83  | 0.94
1983  | 343.4   | 348.89  | -1.60 | 345.18  | -0.52
1984  | 361.61  | 374.35  | -3.52 | 370.22  | -2.38
1985  | 390.76  | 401.67  | -2.79 | 397.08  | -1.62
1986  | 421.36  | 430.99  | -2.29 | 425.88  | -1.07
1987  | 458.75  | 462.45  | -0.81 | 456.78  | 0.43
1988  | 494.9   | 496.2   | -0.26 | 489.91  | 1.01
1989  | 522.78  | 532.42  | -1.81 | 525.45  | -0.51
1990  | 547.22  | 571.28  | -4.40 | 563.57  | -2.99
1991  | 588.7   | 612.97  | -4.11 | 604.45  | -2.68
1992  | 647.18  | 657.71  | -1.63 | 648.30  | -0.17
1993  | 712.34  | 705.71  | 0.93  | 695.33  | 2.39
1994  | 778.32  | 757.22  | 2.71  | 745.78  | 4.18
1995  | 835.31  | 812.49  | 2.79  | 799.87  | 4.24
1996  | 888.1   | 871.79  | 1.84  | 857.90  | 3.40
1997  | 923.16  | 935.42  | -1.33 | 920.14  | 0.33
1998  | 939.48  | 1003.7  | -6.83 | 986.89  | -5.05
1999* | 988.60  | 1076.9  | -8.94 | 1058.49 | -7.07
2000* | 1073.62 | 1155.5  | -7.65 | 1135.27 | -5.74
2001* | 1164.29 | 1239.9  | -6.49 | 1217.63 | -4.58
the model of China's per capita power consumption from 1980 to 1998 and to predict it from 1999 to 2001. The model built by the method proposed in this paper is as follows; Table 1 gives the comparison of the two modeling methods.
\hat{x}^{(1)}(k) = 4136.36\,e^{0.070033(k-1)} - 3830.00, \quad k \ge 1,
\hat{x}^{(0)}(k+1) = 279.77\,e^{0.070033 k}, \quad k \ge 1,
\hat{x}^{(0)}(1) = 306.35.
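The restored-value equation above can be checked directly against the entries of Table 1 (with k = year − 1980); a quick sketch:

```python
import math

def x0_hat(k):
    """Restored value x_hat0(k+1) = 279.77 * exp(0.070033 k), k >= 1."""
    return 279.77 * math.exp(0.070033 * k)

# (year, real value) pairs taken from Table 1
for year, real in [(1981, 311.2), (1990, 547.22), (1998, 939.48)]:
    k = year - 1980
    model = x0_hat(k)
    rel_err = (real - model) / real * 100
    print(year, round(model, 2), round(rel_err, 2))
```

The computed model values reproduce the "This paper" column of Table 1 (300.07 for 1981, 563.57 for 1990, 986.89 for 1998) to within rounding.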
The post-sample error test can be used to inspect the model quantitatively. The post-sample error ratio is c = S_1 / S_0, where S_1 is the standard deviation of the residuals and S_0 is the standard deviation of the original sequence. For the model proposed in this paper c_1 = 0.0867, while for the model proposed in [2] c_2 = 0.1186. We can therefore conclude that the proposed method improves the fitting precision and is much better than the method proposed in [2]. The small-error probability is p = P\{|e^{(0)}(i) - \bar{e}^{(0)}| < 0.6745\,S_0\} = 1. Thus, the practical application results show the effectiveness of the proposed approach.
4 Conclusion

Based on research into the structure of the background value in the GM(1,1) model, an exact formula for the background value of x^{(1)}(t) on the interval [k, k+1], used when establishing GM(1,1), is derived by integrating x^{(1)}(t) from k to k+1. The improved background value raises both the modeling precision and the prediction precision, and enlarges the application area of the GM(1,1) model. Finally, a model of China's per capita power consumption is set up. Simulation examples show the effectiveness of the proposed approach.
References
1. Liu, S.F., Guo, T.B., Dang, Y.G.: Grey System Theory and Its Application. Science Press, Beijing (1999)
2. Tan, G.J.: The Structure Method and Application of Background Value in Grey System GM (1, 1) Model (I). Systems Engineering - Theory & Practice, 98–103 (2000)
3. Chen, T.J.: A New Development of Grey Forecasting Model. Systems Engineering, 50–52 (1990)
4. Fu, L.: Grey Systematic Theory and Application. Technical Document Publishing House, Beijing (1992)
5. Shi, G.H., Yao, G.X.: Application of Grey System Theory in Fault Tree Diagnosis Decision. Systems Engineering - Theory & Practice 144, 120–123 (2001)
6. Gong, W.W., Shi, G.H.: Application of Gray Correlation Analysis in the Fe-spectrum Analysis Technique. Journal of Jiangsu University of Science and Technology (Natural Science) 1, 59–61 (2001)
Comparative Study with Fuzzy Entropy and Similarity Measure: One-to-One Correspondence

Sanghyuk Lee, Sangjin Kim, and DongYoup Lee

School of Mechatronics, Changwon National University, #9 sarim-dong, Changwon, Gyeongnam 641-773, Korea
{leehyuk,aries756,dongyeuplee}@changwon.ac.kr
Abstract. In this paper we survey the relation between fuzzy entropy measures and similarity measures, which quantify the uncertainty and the similarity of data, respectively. By one-to-one correspondence, distance measure and similarity measure have complementary characteristics. First we construct a similarity measure using a distance measure and verify its usefulness. Furthermore, the derivation of a similarity measure from a fuzzy entropy measure is also discussed.

Keywords: Similarity measure, distance measure, fuzzy entropy, one-to-one correspondence.
1 Introduction

Fuzzy entropy and similarity measure are both used for quantifying the uncertainty and the similarity of data [1,2]. Certainty and uncertainty of data are usually expressed from a probabilistic point of view: the probability of an event carries the meaning of certainty and uncertainty simultaneously. The degree of similarity between two or more data sets plays a central role in decision making, pattern classification, and related fields [3-8]. The design of similarity measures has been studied by numerous researchers [8-12]. Two design methods have been introduced: the fuzzy-number approach [8-11] and the distance-measure approach [12]. The fuzzy-number method makes it easy to design a similarity measure; however, the resulting similarity measures are restricted to triangular or trapezoidal membership functions [8-11]. A similarity measure based on a distance measure, by contrast, is applicable to general fuzzy membership functions, including non-convex ones [12]. For a fuzzy set, uncertain knowledge is contained in the fuzzy set itself; hence the uncertainty of the data can also be obtained by analyzing the fuzzy membership function. This uncertainty is described by fuzzy entropy. Characterization and quantification of fuzziness are important issues that affect the management of uncertainty in many system models and designs. That the entropy of a fuzzy set is a measure of the fuzziness of that fuzzy set has been established by previous researchers [14-16]. Liu proposed axiomatic definitions of entropy, distance measure, and similarity measure, and discussed the relations between these three concepts [13]. Kosko considered

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 132–138, 2008. © Springer-Verlag Berlin Heidelberg 2008
the relation between distance measure and fuzzy entropy. Bhandari and Pal provided a fuzzy information measure for the discrimination of a fuzzy set relative to some other fuzzy set. Pal and Pal analyzed the classical Shannon information entropy. In this paper we analyze the relations between fuzzy entropy and similarity. With the help of a distance measure, we design a similarity measure. The obtained similarity measure produces a fuzzy entropy based on the one-to-one correspondence between distance measure and similarity measure. The fuzzy entropy obtained from the similarity measure is proved by verifying the definition of fuzzy entropy. We also discuss similarity derived from fuzzy entropy. In the following chapter, we discuss the definitions of fuzzy entropy and similarity measure of fuzzy sets, and introduce previously obtained fuzzy entropies and similarity measures. In Chapter 3, fuzzy entropy is induced with a similarity measure and vice versa. Conclusions follow in Chapter 4.
2 Fuzzy Entropy and Similarity Measure Analysis

Fuzzy entropy represents the fuzziness of a fuzzy set. The fuzziness of a fuzzy set is represented through its degree of ambiguity; hence the entropy is obtained from the fuzzy membership function itself. Liu presented axiomatic definitions of fuzzy entropy and similarity measure [13], and these definitions have the meaning of difference or closeness for different fuzzy membership functions. First we introduce fuzzy entropy. We design fuzzy entropy based on a distance measure satisfying the definition of fuzzy entropy. The notation of Liu is used in this paper [13].

Definition 2.1 [13]. A real function e : F(X) \to \mathbb{R}^+ is called an entropy on F(X) if e has the following properties:

(E1) e(D) = 0, \forall D \in P(X);
(E2) e([1/2]) = \max_{A \in F(X)} e(A);
(E3) e(A^*) \le e(A), for any sharpening A^* of A;
(E4) e(A) = e(A^C), \forall A \in F(X);

where [1/2] is the fuzzy set in which the value of the membership function is 1/2 everywhere, \mathbb{R}^+ = [0, \infty), X is the universal set, F(X) is the class of all fuzzy sets of X, P(X) is the class of all crisp sets of X, and D^C is the complement of D. Many fuzzy entropies satisfying Definition 2.1 can be formulated. We designed fuzzy entropies in our previous literature [1]. Now two fuzzy entropies are illustrated without proofs.

Fuzzy Entropy 1. If the distance d satisfies d(A, B) = d(A^C, B^C), A, B \in F(X), then

e(A) = 2d((A \cap A_{near}), [1]) + 2d((A \cup A_{near}), [0]) - 2

is a fuzzy entropy.
Fuzzy Entropy 2. If the distance d satisfies d(A, B) = d(A^C, B^C), A, B \in F(X), then

e(A) = 2d((A \cap A_{far}), [0]) + 2d((A \cup A_{far}), [1])

is also a fuzzy entropy.

The exact meaning of the fuzzy entropy of a fuzzy set A is the fuzziness of A with respect to a crisp set; the crisp set is commonly taken as A_{near} or A_{far}. In the above fuzzy entropies, the well-known Hamming distance is used as the distance measure between fuzzy sets A and B:

d(A, B) = \frac{1}{n} \sum_{i=1}^{n} |\mu_A(x_i) - \mu_B(x_i)|,

where X = \{x_1, x_2, \ldots, x_n\}, |k| is the absolute value of k, and \mu_A(x) is the membership function of A \in F(X). Basically, fuzzy entropy means the difference between two fuzzy membership functions. Next we introduce the similarity measure, which describes the degree of closeness between two fuzzy membership functions; it is also found in the literature of Liu [13].
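Fuzzy Entropy 2 and the Hamming distance above can be evaluated directly on discrete membership vectors. In the sketch below the membership values are illustrative, and A_far is taken as the crisp set farthest from A (membership 1 where μ_A < 1/2, else 0 — an assumed tie-breaking rule):

```python
def hamming(mu_a, mu_b):
    """Normalized Hamming distance between membership vectors."""
    n = len(mu_a)
    return sum(abs(a - b) for a, b in zip(mu_a, mu_b)) / n

def far_crisp(mu_a):
    """Crisp set farthest from A: 1 where mu_A < 1/2, else 0 (assumed rule)."""
    return [0.0 if m >= 0.5 else 1.0 for m in mu_a]

def fuzzy_entropy2(mu_a):
    """Fuzzy Entropy 2: e(A) = 2 d(A ∩ A_far, [0]) + 2 d(A ∪ A_far, [1])."""
    far = far_crisp(mu_a)
    inter = [min(a, f) for a, f in zip(mu_a, far)]
    union = [max(a, f) for a, f in zip(mu_a, far)]
    n = len(mu_a)
    return 2 * hamming(inter, [0.0] * n) + 2 * hamming(union, [1.0] * n)

print(fuzzy_entropy2([0.0, 1.0, 1.0, 0.0]))   # crisp set: entropy 0
print(fuzzy_entropy2([0.5, 0.5, 0.5, 0.5]))   # [1/2]: maximal entropy 1
print(fuzzy_entropy2([0.2, 0.8, 0.6, 0.4]))   # an intermediate value
```

The first two calls check axioms (E1) and (E2) numerically: a crisp set has zero entropy and [1/2] attains the maximum.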
Definition 2.2 [13]. A real function s : F^2 \to \mathbb{R}^+ is called a similarity measure if s has the following properties:

(S1) s(A, B) = s(B, A), \forall A, B \in F(X);
(S2) s(D, D^C) = 0, \forall D \in P(X);
(S3) s(C, C) = \max_{A,B \in F} s(A, B), \forall C \in F(X);
(S4) \forall A, B, C \in F(X), if A \subset B \subset C, then s(A, B) \ge s(A, C) and s(B, C) \ge s(A, C).

With Definition 2.2, we propose the following similarity measures.

Similarity Measure 1. For any sets A, B \in F(X), if d satisfies the Hamming distance measure and d(A, B) = d(A^C, B^C), then

s(A, B) = 1 - d((A \cap B^C), [0]) - d((A \cup B^C), [1]) \quad (1)

is a similarity measure between set A and set B.

We have thus proposed a similarity measure induced from a distance measure. This similarity is useful for non-interacting pairs of fuzzy membership functions. Another similarity measure is also obtained; it can be found in our previous literature [2].

Similarity Measure 2. For any sets A, B \in F(X), if d satisfies the Hamming distance measure, then

s(A, B) = 2 - d((A \cap B), [1]) - d((A \cup B), [0]) \quad (2)

is also a similarity measure between set A and set B.

To be similarity measures, (1) and (2) do not need the assumption d(A, B) = d(A^C, B^C). Liu also pointed out that there is a one-to-one relation between all
distance measures and all similarity measures, d + s = 1. In the next chapter, we derive a similarity measure generated by a distance measure; furthermore, an entropy is derived through the similarity measure by the properties of Liu. It is obvious that the Hamming distance can be represented as

d(A, B) = d((A \cap B), [1]) - (1 - d((A \cup B), [0])), \quad (3)

where A \cap B = \min(\mu_A(x_i), \mu_B(x_i)) and A \cup B = \max(\mu_A(x_i), \mu_B(x_i)). With Proposition 3.4 of Liu [13], we generate the similarity measure (or distance measure) from the distance measure (or similarity measure).

Proposition 2.1 [13]. There exists a one-to-one correspondence between all distance measures and all similarity measures, and a distance measure d and its corresponding similarity measure s satisfy s + d = 1.

With the property s = 1 - d, we can construct the similarity measure from the distance measure d, denoted s<d>. From (3) it is natural to obtain

d(A, B) = d((A \cap B), [1]) + d((A \cup B), [0]) - 1 = 1 - s(A, B).

Therefore we propose the similarity measure

s<d> = 2 - d((A \cap B), [1]) - d((A \cup B), [0]). \quad (4)

This similarity measure is exactly the same as (2). At this point, we have verified the one-to-one relation between distance measure and similarity measure. In the next chapter, we verify that fuzzy entropy is derived through similarity (2).
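The correspondence s<d> + d = 1 claimed above can be verified numerically with the normalized Hamming distance; the membership vectors below are illustrative:

```python
def hamming(mu_a, mu_b):
    """Normalized Hamming distance between membership vectors."""
    n = len(mu_a)
    return sum(abs(a - b) for a, b in zip(mu_a, mu_b)) / n

def similarity2(mu_a, mu_b):
    """Similarity measure (2)/(4): s = 2 - d(A ∩ B, [1]) - d(A ∪ B, [0])."""
    inter = [min(a, b) for a, b in zip(mu_a, mu_b)]
    union = [max(a, b) for a, b in zip(mu_a, mu_b)]
    n = len(mu_a)
    return 2 - hamming(inter, [1.0] * n) - hamming(union, [0.0] * n)

a = [0.1, 0.7, 0.9, 0.3]
b = [0.2, 0.6, 0.8, 0.5]
s = similarity2(a, b)
d = hamming(a, b)
print(s, d, s + d)   # s + d equals 1: the one-to-one correspondence
```

The identity holds exactly here because 2 − [1 − (1/n)Σ min] − (1/n)Σ max = 1 − (1/n)Σ|μ_A − μ_B|.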
3 Entropy Derivation with Similarity Measure

Liu also suggested propositions about entropy and similarity measure. He insisted that entropy can be generated by a similarity measure or a distance measure, denoted e<s> and e<d> respectively.

3.1 Entropy Generation by Similarity

Propositions 3.5 and 3.6 of reference [13] are summarized as follows.

Proposition 3.1 [13]. If s is a similarity measure on F, define

e(A) = s(A, A^C), \forall A \in F.

Then e is an entropy on F.

Now we check whether our similarities (1) and (2) satisfy Proposition 3.1. The proof is obtained by checking whether

s(A, A^C) = 2 - d((A \cap A^C), [1]) - d((A \cup A^C), [0])

satisfies (E1) to (E4) of Definition 2.1.
For (E1), \forall D \in P(X),

s(D, D^C) = 2 - d((D \cap D^C), [1]) - d((D \cup D^C), [0]) = 2 - d([0], [1]) - d([1], [0]) = 0.

(E2) states that [1/2] has the maximum entropy value. The entropy e([1/2]) satisfies

s([1/2], [1/2]^C) = 2 - d(([1/2] \cap [1/2]^C), [1]) - d(([1/2] \cup [1/2]^C), [0]) = 2 - d([1/2], [1]) - d([1/2], [0]) = 2 - 1/2 - 1/2 = 1.

In the above equation, [1/2]^C = [1/2] is satisfied.

(E3) shows that the entropy of a sharpened version A^* of fuzzy set A is less than or equal to e(A):

s(A^*, A^{*C}) = 2 - d((A^* \cap A^{*C}), [1]) - d((A^* \cup A^{*C}), [0]) \le 2 - d((A \cap A^C), [1]) - d((A \cup A^C), [0]) = s(A, A^C).

Finally, (E4) is proved directly:

s(A, A^C) = 2 - d((A \cap A^C), [1]) - d((A \cup A^C), [0]) = 2 - d((A^C \cap A), [1]) - d((A^C \cup A), [0]) = s(A^C, A).

From the above proof, our similarity measure

s(A, A^C) = 2 - d((A \cap A^C), [1]) - d((A \cup A^C), [0])

generates a fuzzy entropy. Next, the other similarity, (1), between A and A^C,

s(A, A^C) = 1 - d((A \cap A), [0]) - d((A \cup A), [1]) = 1 - d(A, [0]) - d(A, [1]),

is also satisfied and proved easily.

3.2 Relation of Similarity and Distance
With the property of one-to-one correspondence between similarity and distance, we have derived a similarity measure from a distance measure, and from the similarity measure we have also obtained a fuzzy entropy. For the derivation of the similarity measure, s = 1 - d is used. If we use the distance measure (3),

d(A, B) = d((A \cap B), [1]) - (1 - d((A \cup B), [0])),

we obtain the corresponding similarity measure

s<d> = 2 - d((A \cap B), [1]) - d((A \cup B), [0]),

and this similarity is identical to (2).
From the other similarity, (1),

s(A, B) = 1 - d((A \cap B^C), [0]) - d((A \cup B^C), [1]),

is

d(A, B) = d((A \cap B^C), [0]) + d((A \cup B^C), [1])

satisfied as a distance measure? By the definition of distance measure of Liu [13],

d(A, B) = d((A \cap B^C), [0]) + d((A \cup B^C), [1]) = d((A \cap B^C)^C, [0]^C) + d((A \cup B^C)^C, [1]^C) = d((A^C \cup B), [1]) + d((A^C \cap B), [0]) = d(B, A).

Also,

d(A, A) = d((A \cap A^C), [0]) + d((A \cup A^C), [1]) = d([0], [0]) + d([1], [1]) = 0.

Furthermore,

d(A, B) = d((A \cap B^C), [0]) + d((A \cup B^C), [1]) \le d((D \cap D^{CC}), [0]) + d((D \cup D^{CC}), [1]) = d(D, [0]) + d(D, [1]) = 1.

Hence it is natural that the distance between a crisp set and its complement attains the maximal value. Finally,

d(A, B) = d((A \cap B^C), [0]) + d((A \cup B^C), [1]) \le d((A \cap C^C), [0]) + d((A \cup C^C), [1]) = d(A, C)

and

d(B, C) = d((B \cap C^C), [0]) + d((B \cup C^C), [1]) \le d((A \cap C^C), [0]) + d((A \cup C^C), [1]) = d(A, C)

are satisfied because of the inclusion property A \subset B \subset C.
4 Conclusions

We have discussed the similarity measure derived from a distance measure and proved the usefulness of the proposed similarity. Furthermore, with the relation between fuzzy entropy and similarity measure, we verified that fuzzy entropy is induced through the similarity measure. In this paper our proposed similarity measures are provided for the design of fuzzy entropy. Among the proposed similarity measures, one satisfies the fuzzy entropy definition trivially: even though a similarity measure satisfies the similarity definition, the fuzzy entropy it generates can be trivial. Finally, the proposed similarity measures can be applied to general types of fuzzy membership functions.
Acknowledgments. This work was supported by 2nd BK21 Program, which is funded by KRF(Korea Research Foundation).
References 1. Lee, S.H., Cheon, S.P., Kim, J.: Measure of Certainty with Fuzzy Entropy Function. In: Huang, D.-S., Li, K., Irwin, G.W. (eds.) ICIC 2006. LNCS (LNAI), vol. 4114, pp. 134– 139. Springer, Heidelberg (2006) 2. Lee, S.H., Kim, J.M., Choi, Y.K.: Similarity Measure Construction Using Fuzzy Entropy and Distance Measure. In: Huang, D.-S., Li, K., Irwin, G.W. (eds.) ICIC 2006. LNCS (LNAI), vol. 4114, pp. 952–958. Springer, Heidelberg (2006) 3. Yager, R.R.: Monitored Heavy Fuzzy Measures and Their Role in Decision Making under Uncertainty. Fuzzy Sets and Systems 139(3), 491–513 (2003) 4. Rébillé, Y.: Decision Making over Necessity Measures through the Choquet Integral Criterion. Fuzzy Sets and Systems 157(23), 3025–3039 (2006) 5. Sugumaran, V., Sabareesh, G.R., Ramachandran, K.I.: Fault Diagnostics of Roller Bearing Using Kernel Based Neighborhood Score Multi-class Support Vector Machine. Expert Syst. Appl. 34(4), 3090–3098 (2008) 6. Kang, W.S., Choi, J.Y.: Domain Density Description for Multiclass Pattern Classification with Reduced Computational Load. Pattern Recognition 41(6), 1997–2009 (2008) 7. Shih, F.Y., Zhang, K.: A Distance-based Separator Representation for Pattern Classification. Image Vis. Comput. 26(5), 667–672 (2008) 8. Chen, S.M.: New Methods for Subjective Mental Workload Assessment and Fuzzy Risk Analysis. Cybern. Syst. 27(5), 449–472 (1996) 9. Hsieh, C.H., Chen, S.H.: Similarity of Generalized Fuzzy Numbers with Graded Mean Integration Representation. In: Proc. 8th Int. Fuzzy Systems Association World Congr., vol. 2, pp. 551–555 (1999) 10. Lee, H.S.: An Optimal Aggregation Method for Fuzzy Opinions of Group Decision. In: Proc. 1999 IEEE Int. Conf. Systems, Man, Cybernetics, vol. 3, pp. 314–319 (1999) 11. Chen, S.J., Chen, S.M.: Fuzzy Risk Analysis Based on Similarity Measures of Generalized Fuzzy Numbers. IEEE Trans. Fuzzy Syst. 11(1), 45–56 (2003) 12. Lee, S.H., Kim, Y.T., Cheon, S.P., Kim, S.S.: Reliable Data Selection with Fuzzy Entropy. 
In: Wang, L., Jin, Y. (eds.) FSKD 2005. LNCS (LNAI), vol. 3613, pp. 203–212. Springer, Heidelberg (2005) 13. Liu, X.: Entropy, Distance Measure and Similarity Measure of Fuzzy Sets and Their Relations. Fuzzy Sets and Systems 52, 305–318 (1992) 14. Bhandari, D., Pal, N.R.: Some New Information Measure of Fuzzy Sets. Inform. Sci. 67, 209–228 (1993) 15. Kosko, B.: Neural Networks and Fuzzy Systems. Prentice-Hall, Englewood Cliffs (1992) 16. Pal, N.R., Pal, S.K.: Object-background Segmentation Using New Definitions of Entropy. IEEE Proc. 36, 284–295 (1989)
Low Circle Fatigue Life Model Based on ANFIS

Changhong Liu1, Xintian Liu1, Hu Huang1, and Lihui Zhao1,2

1 College of Automobile Engineering, Shanghai University of Engineering Science, 201620, Shanghai, China
[email protected]
2 School of Mechanical Engineering, Shanghai Jiao Tong University, 200240, Shanghai, China
Abstract. Using the adaptive network-based fuzzy inference system (ANFIS), this paper presents a method of building a low-cycle fatigue life model. According to real experimental data obtained in a low-cycle fatigue experiment, a fatigue life model is built. Finally, by comparison with the Manson-Coffin equation, it is concluded that the ANFIS model is accurate and effective.
1 Introduction

A fuzzy inference system (FIS) is based on expertise expressed in terms of 'IF-THEN' rules [1, 2]. An FIS can be used to predict uncertain systems, and its application does not require knowledge of the underlying physical process as a precondition [3]. ANNs are inspired by the biological sciences, attempting to emulate the behavior and complex functioning of the human brain in recognizing patterns; they are based on a schematic representation of biological neurons in the human brain and attempt to emulate the processes of thinking, remembering and problem solving [4, 5]. ANNs have many inputs and outputs and allow nonlinearity in the transfer function of the neurons; therefore they can be used to solve multivariate and nonlinear modeling problems. In recent years the two methods have been combined with one another, and a popular research field has emerged. In 1993, a hybrid ANFIS algorithm based on the Sugeno system, improved by Jang, was used to acquire optimal output data. ANFIS is an outstanding method in this research field. At present, ANFIS applications are generally encountered in the areas of function approximation, fault detection, medical diagnosis, control, and so on. Estimating a material's low-cycle fatigue life is a frequent and important problem in the engineering field, and it has long attracted the attention of science and engineering. There are many effective formulae for low-cycle fatigue life estimation, for example the Manson-Coffin formula. This paper presents an ANFIS-based method for low-cycle fatigue life estimation.
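For reference, the Manson-Coffin relation mentioned above is commonly written as the sum of an elastic and a plastic strain-amplitude term (standard notation, not taken from this paper):

```latex
\frac{\Delta\varepsilon}{2}
  = \frac{\Delta\varepsilon_e}{2} + \frac{\Delta\varepsilon_p}{2}
  = \frac{\sigma_f'}{E}\,(2N_f)^{b} + \varepsilon_f'\,(2N_f)^{c}
```

where N_f is the number of cycles to failure, E is Young's modulus, σ'_f and b are the fatigue strength coefficient and exponent, and ε'_f and c are the fatigue ductility coefficient and exponent.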
2 Adaptive Network Based Fuzzy Inference Systems (ANFIS)

An adaptive network based fuzzy inference system (ANFIS) is an FIS implemented in the framework of an adaptive fuzzy neural network. Such a framework makes the

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 139–144, 2008. © Springer-Verlag Berlin Heidelberg 2008
ANFIS modeling more systematic and less reliant on expert knowledge. The main aim of ANFIS is to optimize the parameters of the equivalent FIS by applying a learning algorithm using input-output data sets [6-8]. The parameter optimization is done in such a way that the error measure between the target and the actual output is minimized [9]. To present the ANFIS architecture, fuzzy if-then rules based on a complex learning process are considered [10, 11]:

Rule 1: if (x is A_1) and (y is B_1), then (f_1 = p_1 x + q_1 y + r_1); \quad (1)
Rule 2: if (x is A_2) and (y is B_2), then (f_2 = p_2 x + q_2 y + r_2); \quad (2)

where x and y are the inputs, A_i and B_i are the fuzzy sets, f_i are the outputs within the fuzzy region specified by the fuzzy rule, and p_i, q_i and r_i are the design parameters determined during the training process. The ANFIS architecture implementing these two rules is shown in Fig. 1, in which a circle indicates a fixed node whereas a square indicates an adaptive node. ANFIS has a 5-layer feed-forward neural network [8].
Fig. 1. The architecture of ANFIS
Layer 1: All the nodes are adaptive nodes. The outputs of layer 1 are the fuzzy membership grades of the inputs, which are given by

O^1_i = μA_i(x),  i = 1, 2   (3)
O^1_i = μB_{i−2}(y),  i = 3, 4   (4)

where μA_i(x) and μB_{i−2}(y) can adopt any fuzzy membership function (MF), and O^1_i denotes the output of layer 1. For example, if the bell-shaped membership function is employed, μA_i(x) is given by

μA_i(x) = 1 / (1 + {[(x − c_i)/a_i]^2}^{b_i})   (5)
Low Circle Fatigue Life Model Based on ANFIS
where a_i, b_i and c_i are the parameters of the membership function, governing the bell-shaped function accordingly.

Layer 2: Every node in this layer is a fixed node whose task is to multiply the incoming signals and send the product out. This product represents the firing strength of a rule. For example, in Fig. 1,

O^2_i = w_i = μA_i(x) μB_i(y),  i = 1, 2   (6)

Layer 3: The nodes are fixed nodes. They play a normalization role for the firing strengths from the previous layer. The outputs of this layer can be represented as

O^3_i = w̄_i = w_i / (w_1 + w_2),  i = 1, 2   (7)

which are the so-called normalized firing strengths.

Layer 4: The nodes are adaptive nodes. The output of each node in this layer is simply the product of the normalized firing strength and a first-order polynomial (for a first-order Sugeno model). Thus the outputs of this layer are given by

O^4_i = w̄_i f_i = w̄_i (p_i x + q_i y + r_i),  i = 1, 2   (8)

Layer 5: There is only one single fixed node, which performs the summation of all incoming signals. Hence the overall output of the model is given by

O^5 = Σ_{i=1}^{2} w̄_i f_i = (Σ_{i=1}^{2} w_i f_i) / (w_1 + w_2)   (9)
It can be observed that there are two adaptive layers in this ANFIS architecture, namely the first layer and the fourth layer. In the first layer there are three modifiable parameters {a_i, b_i, c_i}, which are related to the input membership functions; these are the so-called premise parameters. In the fourth layer there are also three modifiable parameters {p_i, q_i, r_i}, pertaining to the first-order polynomial; these are the so-called consequent parameters. In short, once the model is built, the user only needs to supply the relevant parameters as input, and the model will export the corresponding simulated low cycle fatigue life. Users need neither to know the working principle nor to possess fuzzy theory; in other words, a favorable and convenient precondition is provided.
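To make the five layers concrete, here is a minimal sketch of the forward pass of the two-rule ANFIS above in Python. The parameter values are illustrative only, not taken from the paper; in ANFIS they would be tuned by the hybrid learning algorithm.

```python
import numpy as np

def gbellmf(x, a, b, c):
    # Generalized bell membership function of eq. (5)
    return 1.0 / (1.0 + (((x - c) / a) ** 2) ** b)

def anfis_forward(x, y, premise, consequent):
    """Forward pass of the two-rule, first-order Sugeno ANFIS of Fig. 1."""
    # Layers 1-2: membership grades and rule firing strengths w_i, eq. (6)
    w = np.array([gbellmf(x, *premise[0][0]) * gbellmf(y, *premise[0][1]),
                  gbellmf(x, *premise[1][0]) * gbellmf(y, *premise[1][1])])
    w_bar = w / w.sum()                          # Layer 3: normalization, eq. (7)
    f = np.array([p * x + q * y + r for p, q, r in consequent])
    return float((w_bar * f).sum())              # Layers 4-5, eqs. (8)-(9)

# Illustrative premise (a, b, c) and consequent (p, q, r) parameters
premise = [[(2.0, 2.0, 0.0), (2.0, 2.0, 0.0)],
           [(2.0, 2.0, 5.0), (2.0, 2.0, 5.0)]]
consequent = [(1.0, 1.0, 0.0), (0.5, -0.2, 3.0)]
print(anfis_forward(2.0, 3.0, premise, consequent))
```

The output is a convex combination of the two rule consequents, weighted by the normalized firing strengths.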
3 Low Cycle Fatigue Life Estimation Model Based on ANFIS

Low cycle fatigue belongs to the class of short-life fatigue problems and has a higher stress level [13]. The breaking stress always exceeds the elastic limit, and in every cycle a fairly large plastic deformation may occur [14, 15]. Because the material lies in the plastic yielding regime, stress is one of the important control parameters in the low cycle fatigue test
[16, 17]. According to the literature [18], the low cycle fatigue experiment results below are for 2.25Cr-1Mo steel at 500 °C. Using the Manson-Coffin formula,

Δεp / 2 = C N_f^d   (10)

where Δεp is the plastic strain range, C and d are material constants, and N_f is the cycle life. From the data in Table 1, the parameters can be confirmed as C = 1.566×10^5 and d = −0.6576.

Table 1. Experiment results of low cycle fatigue
(Δε/2)(με)    (Δεp/2)(με)    Nf
1280          387            9437
1454          768            3664
1600          1456           1000
1746          2143           609
1790          2654           514
1868          3668           323
1935          5287           175
2032          8801           84
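As a quick cross-check of these constants, C and d can be re-fitted from the Table 1 data by ordinary least squares in log-log space. This is a sketch assuming NumPy; the exact fitting procedure used in [18] may differ.

```python
import numpy as np

# Plastic strain range (in microstrain) and cycle life N_f from Table 1
d_ep_half = np.array([387, 768, 1456, 2143, 2654, 3668, 5287, 8801], float)
N_f = np.array([9437, 3664, 1000, 609, 514, 323, 175, 84], float)

# Manson-Coffin, eq. (10): d_ep/2 = C * N_f**d
# Taking logs gives a straight line: log(d_ep/2) = log(C) + d * log(N_f)
d_coef, logC = np.polyfit(np.log(N_f), np.log(d_ep_half), 1)
C = np.exp(logC)
print(C, d_coef)   # close to the paper's C = 1.566e5, d = -0.6576
```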
Using ANFIS, the model of low cycle fatigue life is built. First, the plastic strain range is taken as the input sample and the corresponding fatigue cycle life as the output result. The membership function of the input variable adopts gbellmf with nine fuzzy rules, the output membership function is of constant type, and the fuzzy inference system is generated by the grid method. A hybrid learning algorithm is used to train the network, with a target error of zero. Then, comparing the trained low cycle fatigue life model with the Manson-Coffin model, the difference between them is small, as Table 2 shows. In addition, using the two parameters in Table 1, namely the elastic strain range and the plastic strain range, as input parameters yields a result close to both mentioned above. In essence, Table 1 indicates that the elastic strain part increases as the test load enlarges, but has less influence on the cycle fatigue life than the plastic strain range. Therefore, the low cycle fatigue life model built using ANFIS is feasible. According to Table 2, the results of Manson-Coffin are similar to those of ANFIS, so it can be concluded that the ANFIS model is accurate and effective.

Table 2. The results from two low cycle fatigue experiments
(Δεp/2)(με)      1112    1799.5    2398.5    3171
Manson-Coffin    1851    890       575       376
ANFIS            1817    828       573       384
4 Discussions

To sum up, the characteristics of the low cycle fatigue life model are as follows.
1. The fatigue life model is built easily, because ANFIS merely requires training input and output data, so it is not necessary to analyze the internal mechanism. However, the model is a black box between input and output; that is to say, the internal mechanism remains unclear.
2. Although this model involves membership functions of fuzzy variables and other fuzzy concepts, it is not necessary to understand the related fuzzy knowledge deeply in practical use. Generally speaking, the bell-shaped membership function is suitable for non-fuzzy parameters. The number of fuzzy rules is related to the number of iterations and the training precision: commonly, the more fuzzy rules are used, the fewer iterations are needed and the higher the training precision, but the more time each training run costs. This is especially obvious with multivariable input.
3. The more input variables there are, the longer the model takes to train. Sometimes adding one variable increases the training time greatly, so the number of input variables should be reduced as far as possible.
4. This work puts forward a method to build a model as a black box, which differs from the traditional establishment of a constitutive relationship. It is not necessary to analyze the internal mechanism; only the relevant parameters are taken as input data, and there is no need to know which parameters are the main variables.
5. Although a spline function can be used to fit the relationship of low cycle fatigue life, the ANFIS model is better at fitting fluctuating conditions with the available data. Among spline functions, ANNs and ANFIS, the ANFIS model has very good adaptability and precision.

Acknowledgment. This work was supported by the Research Fund for University Excellent Young Teachers in Shanghai (GJD-07021) and the Shanghai Leading Academic Discipline Project (P1045).
References
1. Kazazian, H.H., Phillips, J.A., Boehm, C.D., Vik, T.A., Mahoney, M.J., Ritchey, A.K.: Prenatal Diagnosis of Beta-thalassemia by Amniocentesis: Linkage Analysis Using Multiple Polymorphic Restriction Endonuclease Sites. Blood 56, 926–930 (1980)
2. Esragh, F., Mamdani, E.H.: A General Approach to Linguistic Approximation. In: Fuzzy Reasoning and Its Application, London (1981)
3. Kazeminezhad, M.H., Etemad-Shahidi, A., Mousavi, S.J.: Application of Fuzzy Inference System in the Prediction of Wave Parameters. Ocean Engineering 32, 1709–1725 (2005)
4. Haykin, S.: Neural Networks: A Comprehensive Foundation. Macmillan Publishing, New York (1999)
5. Fu, J.Y., Liang, S.G., Li, Q.S.: Prediction of Wind-induced Pressures on a Large Gymnasium Roof Using Artificial Neural Networks. Computers and Structures 85, 179–192 (2007)
6. Guler, I.: Adaptive Neuro-fuzzy Inference System for Gap Discontinuities in Coplanar Waveguides. Int. J. Electron. 92, 173–188 (2005)
7. Übeyli, E.D., Güler, İ.: Adaptive Neuro-Fuzzy Inference Systems for Analysis of Internal Carotid Arterial Doppler Signals. Comput. Biol. Med. 35, 687–702 (2005)
8. Shalinie, S.M.: Modeling Connectionist Neuro-Fuzzy Network and Applications. Neural Comput. Applic. 14, 88–93 (2005)
9. Jang, J.S.R., Sun, C.T., Mizutani, E.: Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence. Prentice-Hall, Upper Saddle River (1997)
10. Jang, J.S.R.: ANFIS: Adaptive Network-based Fuzzy Inference Systems. IEEE Trans. Systems, Man and Cybernetics 23(3), 665–685 (1993)
11. Stepnowski, A., Moszyński, M., Tran, V.D.: Adaptive Neuro-Fuzzy and Fuzzy Decision Tree Classifiers as Applied to Seafloor Characterization. Acoust. Physics 49(2), 193–202 (2003)
12. Ertuğrul, Ç., Osman, Y.: Prediction of Wind Speed and Power in the Central Anatolian Region of Turkey by Adaptive Neuro-Fuzzy Inference Systems (ANFIS). J. Eng. Env. Sci. 30, 35–41 (2006)
13. Miyano, Y., Nakada, M., McMurray, M.K., Muki, R.: Prediction of Flexural Fatigue Strength of CFRP Composites under Arbitrary Frequency, Stress Ratio and Temperature. Journal of Composite Materials 31, 619–638 (1997)
14. Miyano, Y., McMurray, M.K., Enyama, J., Nakada, M.: Loading Rate and Temperature Dependence on Flexural Fatigue Behavior of a Satin Woven CFRP Laminate. Journal of Composite Materials 28, 1250–1260 (1994)
15. Qi, H.Y., Wen, W.D., Sun, L.W.: Fatigue Life Prediction and Experiment Research for Composite Laminates with Circular Hole. J. Cent. South Univ. Technol. 11(1), 19–22 (2004)
16. Caprino, G., Amore, A.: Fatigue Life of Graphite/Epoxy Laminates Subjected to Tension-Compression Loadings. Mechanics of Time-Dependent Materials 4, 139–154 (2000)
17. Novozhilov, N.I.: Prediction of Fatigue Life and the Technicoeconomic Efficiency of High-Strength Steel Railway Bridge Structures. Strength of Materials 10(1), 43–47 (1978)
18. Dai, Z.Y.: Fatigue Damage Criterion and Damage Locality. In: Wang, G.G., Gao, Q. (eds.) Solid Damage and Destroy, pp. 75–81. Chengdu University of Science and Technology Press, Chengdu (1993)
New Structures of Intuitionistic Fuzzy Groups Chuanyu Xu Department of Math, Zhejiang Gongshang University 310035 Hangzhou, China
[email protected]

Abstract. An intuitionistic fuzzy (IF) set is a generalization of the concept of a 'fuzzy set'. An intuitionistic fuzzy group is an IF set with a kind of operation. However, few structures of intuitionistic fuzzy groups (IFGs) are known. Aimed at this, this paper gives and proves four theorems about the following structures: 1. The Cauchy theorem of IF groups. 2. The sufficient and necessary condition for an IF p-group, namely that the order of the IF group is a power of p. 3. The number of elements of a conjugate class in an IFG equals the number of cosets in the IF quotient group. 4. The condition under which there exist fixed elements in a conjugate class in an IFG, and the number of such fixed elements. Compared with related works: the sets and operations of classical groups are classical, whereas in this paper the sets are IFSs and the operations are based on IF relations. Similar work has not been seen in the available IF group literature.
1 Introduction

After intuitionistic fuzzy sets (simply, IFSs) were presented [1, 2], a new type of IF groups was put forward [3-5]. However, among their structures only homomorphisms have been studied; other structures have not been reported. Some important structures should be studied, for example: What is the relation between the structure and the order of IF groups? How many elements does a conjugate class in an IF group have? Is there any fixed element in the sets on which IF groups act, and how many are there? In order to solve these problems, this paper gives and proves four theorems about the following structures: 1. The Cauchy theorem of IF groups. 2. The sufficient and necessary condition for an IF p-group, namely that the order of the IF group is a power of p. 3. The number of elements of a conjugate class in an IFG equals the number of cosets in the IF quotient group.
D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 145–152, 2008. © Springer-Verlag Berlin Heidelberg 2008
4. The condition under which there exist fixed elements in a conjugate class in an IFG, and the number of such fixed elements.
A comparison of this paper with related works is as follows: 1. The difference from classical groups: the sets and operations of classical groups are classical, whereas in this paper the sets are IFSs and the operations are based on IF relations. 2. The difference from available IF groups: similar work has not been seen in the available IF group literature.
The rest of the paper is organized as follows: Section 2 presents preliminaries, Section 3 some structures of IF groups, and Section 4 the conclusion.
2 Preliminaries

Definition 2.1 [1,2] (Intuitionistic Fuzzy Set, IFS). Let a set E be fixed. An IFS A in E is an object having the form A = {<x, μA(x), νA(x)> | x∈E}, where the functions μA(x): E→[0,1] and νA(x): E→[0,1] define the degree of membership and the degree of non-membership, respectively, of the element x∈E to the set A, which is a subset of E, and for every x∈E: 0 ≤ μA(x) + νA(x) ≤ 1. ┃
Note. Obviously every fuzzy set has the form {<x, μA(x), 1−μA(x)> | x∈E}.
In Definitions 2.2-2.4, 0 ≤ μ + ν ≤ 1.
Definition 2.2 (Intuitionistic Fuzzy mapping, IF mapping). Let X and Y be two nonvoid sets, (x, y)∈X×Y, and ∃ θ1>0, θ2>0. If (1) ∀x∈X, ∃y∈Y, such that μ(x, y)>θ1 and ν(x, y)>θ2; (2) ∀x∈X, ∀y1, y2∈Y, μ(x, y1)>θ1 and ν(x, y1)>θ2, μ(x, y2)>θ1 and ν(x, y2)>θ2 ⇒ y1=y2, then the vector function (μ, ν) is called an IF mapping (μ, ν): X→Y, x ⊢→ y, denoted as (μ, ν)(x)=y, or for simplicity, f(x)=y. ┃
Definition 2.3. If (μ, ν) satisfies that ∀y∈Y, ∃x∈X and ∃θ1>0, θ2>0 such that μ(x, y)>θ1, ν(x, y)>θ2, then (μ, ν) is called an IF surjection. If ∀x1, x2∈X, ∀y∈Y, μ(x1, y)>θ1, ν(x1, y)>θ2, and μ(x2, y)>θ1, ν(x2, y)>θ2 ⇒ x1=x2, then (μ, ν) is called an IF injection. If (μ, ν) is both an IF surjection and an IF injection, then (μ, ν) is called an IF bijection. ┃
Definition 2.4 [4,5,9-11] (IF Binary operation). Let G be a nonvoid set, (μ, ν): G×G×G → [0,1]×[0,1] an IF mapping, and θ1, θ2 ∈ [0,1]. If (1) ∀x, y∈G, ∃z∈G, such that μ(x, y, z)>θ1 and ν(x, y, z)>θ2; (2) ∀x, y∈G, ∀z1, z2∈G, μ(x, y, z1)>θ1, ν(x, y, z1)>θ2, and μ(x, y, z2)>θ1, ν(x, y, z2)>θ2 ⇒ z1=z2, then the vector function (μ, ν) is called an IF binary operation on G. Denote (x○y)(z) ≜ <μ(x, y, z), ν(x, y, z)>; here '○' is called the IF binary operation. ┃
In Definitions 2.5, 2.6 and 2.10, 0 ≤ *μ + *ν ≤ 1 and 0 ≤ μ* + ν* ≤ 1.
Definition 2.5. The IF composition operation between elements in G is defined as follows:
=
*
*
∨ν(x,c,a))> ≜ *
*
Definition 2.6 [1,2,4,5] (IF group). Let G be a nonvoid set and ∃ θ1 > 0, θ2 > 0. If
(1) ((x○y)○z)(a1) = <*μ, *ν>, (x○(y○z))(a2) = <μ*, ν*>, *μ, μ* > θ1, *ν, ν* > θ2 ⇒ a1 = a2; then '○' is said to satisfy the associative law; (2) ∀x∈G, ∃e∈G, (e○x)(x) = <μ(e, x, x), ν(e, x, x)>, (x○e)(x) = <μ(x, e, x), ν(x, e, x)>, μ(•, •, •) > θ1, ν(•, •, •) > θ2; e is called an identity element; (3) ∀x∈G, ∃y∈G, (x○y)(e) = <μ(x, y, e), ν(x, y, e)>, (y○x)(e) = <μ(y, x, e), ν(y, x, e)>, μ(•, •, •) > θ1, ν(•, •, •) > θ2; y is called an inverse element of x, denoted x⁻¹;
then G is called an IF group. ┃
IF group G is an IF subgroup of G, denoted by ┃
Definition 2.8 [1,2,4,5]. Suppose that H is an IF subgroup of an IF group G and x, z∈G; define (xH)(z) = <∨_{h∈H} μ(x, h, z), ∧_{h∈H} ν(x, h, z)>; if, ∀h∈H,
(a○(h○a⁻¹))(b) = <μ*, ν*>, μ* > θ1, ν* > θ2 ⇒ b∈H,
then H is called an IF normal subgroup of G, denoted H⊳G. ┃
Definition 2.11. Suppose H is an IF normal subgroup of an IF group G; ∀x∈G, G/H ≜ {xH | ∀x∈G}. Let the operation on G/H be
(xH○yH)(zH) = <∨μ(x′, y′, z′), ∧ν(x″, y″, z″)>,
where x′H∽x″H∽xH, y′H∽y″H∽yH, z′H∽z″H∽zH; then G/H is an IF group under this operation, and G/H is called the IF quotient group. ┃
μ > θ1, ν > θ2, then ϕ is called an IF homomorphism. If ϕ is an IF injection, surjection, or bijection, respectively, then ϕ is called an IF injection homomorphism, surjection homomorphism, or isomorphism, respectively. ┃
Lemma 2.1 [6-8]. If an IF group H with order p^n (p is a prime) acts on a finite set S, and S0 = {x∈S | hx = x for all h∈H}, then |S| ≡ |S0| (mod p). ┃
3 Some Structures of IF Groups

In this section, θ1* > 0, θ2* > 0, 0 ≤ θ1* + θ2* ≤ 1.
Definition 3.1 (Order of element). For an element a in an IF group G, if there is a positive integer p such that (…(a1○a2)○…○ap)(e) = <θ1*, θ2*> with a1 = a2 = … = ap = a, then p is called the order of the element a. If there is no such p, then a is called an element of infinite order. ┃
Definition 3.2. Suppose the action of an IF group upon a nonempty set is G×X→X, (a, x) ⊢→ a(x); then Gx = {a(x) | a∈G, (a○x)(a(x)) = <θ1*, θ2*>} is called the orbit of x. If Gx = {x}, then x is called a fixed element of G. If X = G, and
((a○x)○a⁻¹)(a(x)) = <θ1*, θ2*>,
a, x∈G, then the orbit of x is called the conjugate class of x. Because e∈G and ((e○x)○e⁻¹)(x) = <θ1*, θ2*>, the orbit includes x. Denote x~y ⇔ ∃a∈G such that ((a○x)○a⁻¹)(y) = <θ1*, θ2*>; "~" is an equivalence relation, the orbit Gx is just the equivalence class determined by "~", and x is its representative element.
Remark. The notation "~" is different from the notation "∽" of the equivalence relation of posets. ┃
Definition 3.3 (IF centralizer). Suppose G is an IF group. For any element x in G, Stab_G x = {a∈G | (a○(x○a⁻¹))(x) = <θ1*, θ2*>} is an IF subgroup; it is called the stable IF subgroup, or the centralizer, and is denoted by Z_G(x). ┃
Definition 3.4 (IF index). The number of the left (right) IF cosets of H is called the index of H in G, denoted [G:H]. ┃
Definition 3.5. For an IF group, if the order of each element is a power of some constant prime p, then the group is called an IF p-group. ┃
The Cauchy theorem describes the relation between the structure and the order of IF groups.
Theorem 3.1 (Cauchy theorem of IF groups). If G is a finite IF group and p | |G|, where p is a prime, then there is an element whose order is p.
Proof. Let n = |G|. Construct a set of p-dimensional vectors S = {(a1, a2, …, ap) | ai∈G, 1 ≤ i ≤ p, (…(a1○a2)○…○ap)(e) = <θ1*, θ2*>}, where (…(a1○a2)○…○ap)(e) = <θ1*, θ2*> ⇔ ((…(a1○a2)○…○ap−1)○ap)(e) = <θ1*, θ2*>. The first p−1 components of a vector in S can be chosen freely and the last one is then determined, so |S| = n^{p−1}. It is known that p | n. Therefore |S| ≡ 0 (mod p).
Suppose Zp is the residue-class additive group modulo p, with the set of elements of Zp denoted by {0, 1, 2, …, p−1}. For k∈Zp and (a1, a2, …, ap)∈S, let the action of Zp upon the set S be the following cyclic permutation: (k, (a1, a2, …, ap)) ⊢→ k((a1, a2, …, ap)) = (ak+1, ak+2, …, ak+p) ∈ S. The action satisfies: 0((ai, ai+1, …, ap, a1, …, ai−1)) = (ai, ai+1, …, ap, a1, …, ai−1), where the unit element is 0∈Zp; and (k+k′)((ai, ai+1, …, ap, a1, …, ai−1)) = k(k′((ai, ai+1, …, ap, a1, …, ai−1))), k, k′∈Zp. That (ak+1, ak+2, …, ak+p)∈S can be verified: because each element in an IF group has its inverse element, (a1○(a2○…○(ap−1○ap)…))(e) = (a2○(a3○…○(ap○a1)…))(e) = … = (ak+1○(ak+2○…○(ak−1○ak)…))(e) = <θ1*, θ2*>, ⇒ (ak+1, ak+2, …, ak+p)∈S; the last step is due to the definition of S. On the other hand, S0 = {x∈S | hx = x, ∀h∈Zp}, where x = (a1, a2, …, ap). Since (e, e, …, e)∈S0, |S0| ≠ 0. Moreover, (a1, a2, …, ap)∈S0 ⇔ a1 = a2 = … = ap.
From Lemma 2.1, 0 ≡ |S| ≡ |S0| (mod p). Hence |S0| ≥ p, so there is a non-identity element a with a1 = a2 = … = ap = a, that is, an element of order p.
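The counting argument in this proof mirrors the classical tuple-counting proof of Cauchy's theorem. As an illustrative sketch (not part of the paper), the same count can be checked on an ordinary cyclic group in Python:

```python
from itertools import product

# Classical sanity check of the counting argument above, using the ordinary
# cyclic group Z_n under addition (identity 0) in place of an IF group.
n, p = 6, 3                      # p is a prime dividing n
S = [t for t in product(range(n), repeat=p) if sum(t) % n == 0]
assert len(S) == n ** (p - 1)    # first p-1 entries free, last determined
assert len(S) % p == 0           # hence p | |S|

# Fixed points of the cyclic Z_p action are the constant tuples (a, ..., a):
S0 = [t for t in S if len(set(t)) == 1]
print(len(S), len(S0))           # → 36 3
```

The three constant tuples correspond to the solutions of 3a ≡ 0 (mod 6), two of which are non-identity elements of order 3.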
And Zhigeng et al. [12] used the following Gaussian function as the diffusion function:

C(∇I) = e^(−|∇I|² / 2K²)   (3)

where the parameter K is the average gradient magnitude in the neighbourhood of each pixel and specifies the degree of diffusion. Catte et al. [9, 13] used |∇(Gσ * u)| as the input to the diffusion function, which smooths the image with a Gaussian filter. Jijun Ren and Mingyi He [14] proposed the following equation:

C(s) = 1 / (1 + K)   (4)
where the parameter K is the average of the difference between the gradient magnitude and the maximum gradient magnitude in the neighbourhood of each pixel.

2.2 FCM

FCM is a clustering algorithm introduced by Bezdek, based on minimizing an objective function as follows [8]:

J_q = Σ_{i=1}^{n} Σ_{j=1}^{m} u_ij^q d(x_i, θ_j)   (5)
Medical Image Segmentation Using Anisotropic Filter, User Interaction and FCM
where d is the distance between the data point x_i and the centre θ_j of cluster j, and U is the fuzzy membership matrix: u_ij is the membership of x_i to the cluster with centre θ_j, with u_ij ∈ [0, 1] and Σ_{j=1}^{m} u_ij = 1.

5) and 25 with a negative OSAS (that is, normal subjects) were examined. The Review Board on Human Studies at our institution approved the protocol, and each patient gave his or
New Data Pre-processing on Assessing of Obstructive Sleep Apnea Syndrome
Table 1. Mean values of the statistical measures of the clinical features and characteristics of the subjects

The used features                          Non-OSAS    OSAS
Age                                        49          49
BMI (kg/m2)                                30.85       38.15
ARI index                                  24.666      150.45
AHI index                                  4.05        33.51
Sat-O2 minimum value in stage of REM       87.24       79.35
PST in SaO2 intervals bigger than 89%      94.81       62.92
her informed consent to participate in the study. Table 1 presents the mean values of the statistical measures of the clinical features used and the characteristics of the subjects [7]. Readers can refer to [7] for more information about the OSAS dataset.
3 The Proposed Method

In this work, we have proposed a data normalization method called the Line Based Normalization Method (LBNM) and combined it with classifier methods, including the C4.5 decision tree and the LM artificial neural network, for the diagnosis of OSAS. LBNM is used as the data pre-processing method, and then the classifier is run to classify the normalized OSAS dataset. Both processes run offline. The method used is shown in Figure 1.
Fig. 1. Block diagram of the proposed method
The proposed method consists of two stages: first, to pre-process the data, LBNM transforms the OSAS dataset to values in the range [0,1]; second, as classifier algorithms, the C4.5 decision tree and an ANN trained with LM are used to classify the normalized OSAS dataset.

3.1 Line Based Normalization Method (LBNM) and Data Scaling Methods

All the attributes in a dataset may not always have a linear distribution among classes. If a non-linear classifier system is not used, data scaling or cleaning methods are needed to transform the data from the original format to another space to
B. Akdemir, S. Güneş, and Ş. Yosunkaya
improve the classification performance in pattern recognition applications. In this study, we propose a new data pre-processing method, LBNM, for pattern recognition and medical decision-making systems. The proposed data scaling method consists of two steps. In the first step, the data are weighted using equation (1) below. In the second step, the weighted data are normalized to the range [0,1]. In this way, the data are scaled on the basis of the features used in the dataset. An advantage of LBNM is that it can be used on datasets with missing class labels; it can also be used to find missing feature values. Figure 2 shows the pseudo code of the normalization method.
Input: d matrix with n rows and m columns
Output: weighted d matrix obtained through column-based data weighting
1. The data are weighted by means of the following equation:
   for i = 1 to n (n is the number of rows in the d matrix)
      for j = 1 to m (m is the number of features (attributes) in the d matrix)
         D_column(i, j) = d(i, j) / sqrt((d(i,1))^2 + (d(i,2))^2 + … + (d(i,m))^2)   (1)
      end
   end
2. Apply the data normalization process to the weighted 'D_column' matrix.
Fig. 2. The pseudo code of LBNM
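A minimal Python sketch of the two steps follows. Two points are assumptions, since the paper does not spell them out: the row norm in equation (1) is taken over all m features, and the second step is min-max scaling to [0,1].

```python
import numpy as np

def lbnm(d):
    """Line Based Normalization sketch: weight each row (line) by its
    Euclidean norm, then min-max scale each column to [0, 1]."""
    d = np.asarray(d, dtype=float)
    weighted = d / np.sqrt((d ** 2).sum(axis=1, keepdims=True))  # step 1, eq. (1)
    lo, hi = weighted.min(axis=0), weighted.max(axis=0)
    return (weighted - lo) / (hi - lo)                           # step 2

# Hypothetical rows of (AHI, minimum SaO2) values, for illustration only
X = np.array([[4.05, 87.24], [33.51, 79.35], [20.0, 85.0]])
print(lbnm(X))
```

Each output column then spans exactly [0, 1], matching the range the classifiers expect.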
3.2 C4.5 Decision Tree Classifier

A decision tree is a hierarchical data structure using the divide-and-conquer method. Decision trees can be used for both classification and regression and are non-parametric methods. Here, we have used the C4.5 decision tree, a type of decision tree with pruning and the ability to work with missing data. C4.5 decision tree learning is one of the most often used and practical methods for inductive inference. It is a method for approximating discrete-valued functions that is robust to noisy data and capable of learning disjunctive expressions [8, 9]. The learned tree structure can be expressed as sets of if-then rules to improve human readability. These learning methods are among the most popular inductive inference algorithms and have been successfully applied to a wide range of problems [10]. C4.5 decision tree learning is a heuristic, hill-climbing, non-backtracking search through the space of all possible decision trees [7, 8]. The objective of C4.5 decision tree learning is to recursively partition the data into sub-groups. At the end of learning, C4.5 generates if-then rules to achieve classification; these if-then rules make the tree classifier fast and simple.
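The split-selection quantity at the heart of C4.5, the gain ratio, can be sketched in a few lines. This toy example uses made-up labels, not the paper's data:

```python
import math

def entropy(labels):
    # Shannon entropy of a list of class labels
    n = len(labels)
    return -sum((c / n) * math.log2(c / n)
                for c in (labels.count(v) for v in set(labels)))

def gain_ratio(labels, groups):
    """Information gain of a candidate split divided by its split info,
    the selection criterion C4.5 uses when growing the tree."""
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups)
    gain = entropy(labels) - conditional
    split_info = -sum(len(g) / n * math.log2(len(g) / n) for g in groups)
    return gain / split_info if split_info else 0.0

# Toy example: a split that separates the two classes perfectly
labels = ['non-OSAS', 'non-OSAS', 'OSAS', 'OSAS']
groups = [labels[:2], labels[2:]]
print(gain_ratio(labels, groups))  # → 1.0
```

A perfect split of a balanced two-class sample yields a gain ratio of 1; weaker splits score lower, so the attribute maximizing this ratio is chosen at each node.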
3.3 Levenberg-Marquardt Artificial Neural Network (ANN)

An ANN is constructed for a specific application, such as pattern recognition or data classification, by way of a learning process. ANNs are inspired by human brain activity: they consist of a number of nodes named neurons and their connections. After data are applied to the inputs, the ANN tries to obtain the best result by reducing the output error level via adjusting the weights. The back-propagation (BP) algorithm is the most widely used training procedure that adjusts the connection weights of a Multi-Layer Perceptron (MLP) [11]. The LM algorithm is a least-squares estimation algorithm that uses the maximum neighborhood idea to obtain the desired weights for solving the problem. The smallest MLP is composed of three layers: an input layer, an output layer, and one hidden layer. The input signals spread from the first neurons toward the output neurons, affecting each other through the estimated weights. Each layer consists of a predefined number of neurons; the neurons in the input layer work as a buffer, serving to distribute the input signals to the neurons in the hidden layer [12]. In our application, the input layer, hidden layer, and output layer consist of 4, 10, and 2 neurons, respectively. Also, we used values of 0.9 and 0.8 as the learning rate and momentum rate in the ANN with LM.
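The LM weight update itself (not the full MLP training loop used in the paper) can be sketched as follows; the data here form a made-up linear toy problem chosen so that one step converges:

```python
import numpy as np

def lm_step(w, jacobian, residuals, mu=0.01):
    """One Levenberg-Marquardt update:
    w_new = w - (J^T J + mu*I)^(-1) J^T e.
    Large mu behaves like gradient descent, small mu like Gauss-Newton."""
    J, e = jacobian(w), residuals(w)
    H = J.T @ J + mu * np.eye(len(w))
    return w - np.linalg.solve(H, J.T @ e)

# Toy linear least-squares fit y = w0 + w1*x, where the Jacobian is exact,
# so a single step with tiny mu reaches the solution [1, 2]
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])
residuals = lambda w: (w[0] + w[1] * x) - y
jacobian = lambda w: np.column_stack([np.ones_like(x), x])
w = lm_step(np.zeros(2), jacobian, residuals, mu=1e-9)
print(np.round(w, 3))  # → [1. 2.]
```

In a real MLP the Jacobian of the network outputs with respect to all weights would be computed by back-propagation, and mu would be adapted between iterations.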
4 Empirical Results and Discussion

Data normalization is an important issue in many classifier systems, since many classifier algorithms work only on normalized or scaled data. In this study, we have proposed a novel data scaling method called the Line Based Normalization Method and applied it to the diagnosis of obstructive sleep apnea syndrome, a common and important disease among the public. Here, we have investigated the effect of LBNM on the classification accuracy of the classifiers used in the diagnosis of OSAS. In order to compare the proposed normalization method, various normalization methods were used: min-max normalization, z-score normalization, and decimal scaling. In diagnosing OSAS, we used the clinical features ARI, AHI, SaO2 minimum value in the REM stage, and PST in SaO2 intervals bigger than 89%, obtained from polysomnography device records. Table 2 shows the results obtained from the C4.5 decision tree, the LM back-propagation algorithm, the combination of the C4.5 decision tree classifier and LBNM, and the combination of the LM back-propagation algorithm and LBNM, using 10-fold cross validation on the diagnosis of OSAS. The best method for diagnosing OSAS was the combination of the C4.5 decision tree classifier and LBNM. The effect of the proposed normalization method on the classification accuracy of the classifiers used in the diagnosis of OSAS is also shown. The Line Based Normalization Method was compared with other normalization or scaling methods, including min-max normalization, z-score normalization, and decimal scaling; the classifier accuracy and the 95% confidence interval were used to compare these methods. Table 3 presents the results obtained from the C4.5 decision tree classifier on the classification of OSAS using LBNM and the various scaling or normalization methods, with 10-fold cross validation.
Table 2. The obtained results from the classifiers used on the diagnosis of OSAS

Method                         PD (Recall)   Precision   Prediction Accuracy (%)   F-measure   AUC
C4.5 Decision Tree             0.965         0.965       95.12                     0.965       0.941
ANN with LM                    0.933         0.965       92.68                     0.948       0.899
C4.5 Decision Tree and LBNM    1.00          1.00        100                       1.00        1.00
ANN with LM and LBNM           0.966         1.00        97.56                     0.982       0.958
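As a consistency check on the metric columns of Table 2, the F-measure is the harmonic mean of precision and recall:

```python
def f_measure(precision, recall):
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

# C4.5 row of Table 2: precision 0.965, recall 0.965
print(round(f_measure(0.965, 0.965), 3))  # → 0.965
```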
Table 3. Comparison of the obtained results from the C4.5 decision tree classifier on the classification of OSAS using LBNM and various normalization methods

Method                                     Prediction Accuracy (%)   95% Confidence Interval
Min-Max Normalization                      49                        49
Z-score Normalization                      30.85                     38.15
Decimal Scaling                            24.666                    150.45
Line Based Normalization Method (LBNM)     94.81                     62.92
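The three baseline scalings that LBNM is compared against can be sketched as follows (the sample values are made up, not from the dataset):

```python
import numpy as np

def min_max(x):          # rescale to the range [0, 1]
    return (x - x.min()) / (x.max() - x.min())

def z_score(x):          # zero mean, unit variance
    return (x - x.mean()) / x.std()

def decimal_scaling(x):  # divide by 10^k so magnitudes fall below 1
    k = int(np.ceil(np.log10(np.abs(x).max())))
    return x / 10 ** k

x = np.array([4.05, 33.51, 12.0, 27.3])  # made-up AHI-like values
print(min_max(x).round(3))
print(z_score(x).round(3))
print(decimal_scaling(x))
```

Unlike LBNM, all three operate feature-by-feature (column-wise) and ignore the relationship between the features of one sample, which is exactly what the row-based weighting step of LBNM adds.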
These results show that the proposed normalization method can be useful in many pattern recognition and medical diagnostic applications, as can be seen in the diagnosis of obstructive sleep apnea syndrome. The method can also be used in many applications such as speech recognition, text categorization, image processing, etc. We believe that the proposed method can be very helpful to physicians in making their final decisions on their patients.

Acknowledgments. This study has been supported by the Scientific Research Project of Selcuk University (Project number: 08701258).
5 Conclusion

In this paper, we have proposed a novel data normalization method, LBNM, to assess obstructive sleep apnea syndrome using clinical features obtained from a polysomnography device, as a diagnostic tool in patients clinically suspected of suffering from a sleep disorder. The proposed normalization method was compared with the min-max normalization, z-score normalization, and decimal scaling methods in the literature on the diagnosis of OSAS. While combining the C4.5 decision tree classifier with min-max normalization, z-score normalization, and decimal scaling obtained a classification accuracy of 95.89% using 10-fold cross validation, combining the C4.5 decision tree classifier with LBNM achieved
an accuracy of 100% under the same conditions. Here, we have given a medical application of this normalization method. In the future, this data pre-processing method can be used in many pattern recognition applications.
References
1. AASM: Sleep-Related Breathing Disorders in Adults: Recommendations for Syndrome Definition and Measurement Techniques in Clinical Research. The Report of an American Academy of Sleep Medicine Task Force. Sleep 22(5) (1999)
2. Eliot, S., Janita, K., Cheryl Black, L., Carole, L.M.: Pulse Transit Time as a Measure of Arousal and Respiratory Effort in Children with Sleep-Disordered Breathing. Pediatric Research 53(4), 580–588 (2003)
3. Al-Ani, T., Hamam, Y., Novak, D., Pozzo Mendoza, P., Lhotska, L., Lofaso, F., Isabey, D., Fodil, R.: Noninvasive Automatic Sleep Apnea Classification System. Bio. Med. Sim. 2005, Linköping, Sweden, May 26–27 (2005)
4. Al-Angari, H.M., Sahakian, A.V.: Use of Sample Entropy Approach to Study Heart Rate Variability in Obstructive Sleep Apnea Syndrome. IEEE Transactions on Biomedical Engineering 54(10), 1900–1904 (2007)
5. Campo, F.d., Hornero, R., Zamarrón, C., Abasolo, D.E., Álvarez, D.: Oxygen Saturation Regularity Analysis in the Diagnosis of Obstructive Sleep Apnea. Artificial Intelligence in Medicine 37, 111–118 (2006)
6. Kwiatkowska, M., Schmittendorf, E.: Assessment of Obstructive Sleep Apnea Using Pulse Oximetry and Clinical Prediction Rules: A Fuzzy Logic Approach. BMT (2005)
7. Polat, K., Yosunkaya, Ş., Güneş, S.: Pairwise ANFIS Approach to Determining the Disorder Degree of Obstructive Sleep Apnea Syndrome. Journal of Medical Systems 32(3), 243–250 (2008)
8. Mitchell, T.M.: Machine Learning. McGraw-Hill, Singapore (1997)
9. Quinlan, J.R.: Induction of Decision Trees. Machine Learning 1, 81–106 (1986)
10. Akdemir, B., Polat, K., Güneş, S.: Prediction of E. coli Promoter Gene Sequences Using a Hybrid Combination Based on Feature Selection, Fuzzy Weighted Pre-processing, and Decision Tree Classifier. In: Apolloni, B., Howlett, R.J., Jain, L. (eds.) KES 2007, Part I. LNCS (LNAI), vol. 4692, pp. 125–131. Springer, Heidelberg (2007)
11. Haykin, S.: Neural Networks: A Comprehensive Foundation. Macmillan College Publishing Company, New York (1994)
12. Kara, S., Guven, A.: Neural Network-Based Diagnosing for Optic Nerve Disease from Visual-Evoked Potentials 31, 391–396 (2007)
Recognition of Plant Leaves Using Support Vector Machine

Qing-Kui Man1,2, Chun-Hou Zheng3,*, Xiao-Feng Wang2,4, and Feng-Yan Lin1,2

1 Institute of Automation, Qufu Normal University, Rizhao, Shandong 276826, China
2 Intelligent Computing Lab, Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, Anhui 230031, China
3 College of Information and Communication Technology, Qufu Normal University
4 Department of Computer Science and Technology, Hefei University, Hefei 230022, China
[email protected], [email protected]

Abstract. A method that uses both color and texture features to recognize plant leaf images is proposed in this paper. After image preprocessing, the color and texture features of the plant images are extracted, and then a support vector machine (SVM) classifier is trained and used for plant image recognition. Experimental results show that recognizing plant images with both color and texture features is feasible, and that the recognition accuracy is encouraging.

Keywords: Support vector machine (SVM), Image segmentation, Discrete wavelet transform.
1 Introduction

There are many kinds of plants living on the earth. Plants play an important part both in human life and in the lives of the other organisms on the planet. Unfortunately, the number of plant species is steadily decreasing. People are realizing the importance of protecting plants and try every way they can to protect the species that still exist, but how can they do this work without knowing which category a plant belongs to? Since computers are more and more widely used in our daily life, a natural question arises: how can we recognize different kinds of leaves using a computer?

Plant classification is an old subject in human history, and it has developed rapidly, especially since the arrival of the computer era. Plant taxonomy not only recognizes and names different plants, but also describes the differences between them and builds systems for classifying plants. It can also help researchers find the origins and relations of species and trends in evolution. At present, there are many modern experimental methods in the plant classification area, such as plant cellular taxonomy and plant cladistics. Yet all these methods are difficult for non-professional staff, because they cannot be easily used and their operation is very complex.

* Corresponding author.

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 192–199, 2008. © Springer-Verlag Berlin Heidelberg 2008
With the development of computer technology, digital image processing has advanced rapidly, so people want to use image processing and pattern recognition techniques to make up for the deficiencies of our own recognition ability, so that non-professional staff can use computers to recognize a variety of plants. According to the theory of plant taxonomy, plant leaves are the most useful and direct basis for distinguishing one plant from another; what is more, leaves can be easily found and collected everywhere. By computing some efficient features of leaves and using a suitable pattern classifier, it is possible to recognize different plants successfully.

Till now, many works have focused on leaf feature extraction for plant recognition. In [1], a method of recognizing leaf images based on shape features using a hyper-sphere classifier was introduced. In [5], the authors presented a method combining different features based on the centroid-contour distance curve and adopted the fuzzy integral for leaf image retrieval. Gu et al. [6] used the segmented leaf skeleton for leaf recognition. Among these methods, using leaf shape features is the most effective way to recognize plant images [1], and the reported recognition accuracy is impressive. Since color and texture are the two image features most sensitive to human vision, we select both of them as the features for recognizing plant images in this paper.

In this paper, a method using color and texture features to recognize plant images is proposed: color moments serve as the color feature, and the texture feature is extracted from the plant leaf image after wavelet high-pass filtering. The wavelet transform maps an image into a low-resolution image space and a series of detail image spaces. For the majority of images, the detail images carry the noise or the less useful parts of the originals. In this paper, the leaf vein information was extracted as the texture feature. Therefore, after extracting these features of the leaves, different species of plants can be classified by using an SVM.

The remainder of this paper is organized as follows: Section 2 describes the image segmentation step and the definitions of the color moments and texture features, especially the wavelet transform. Section 3 describes the support vector machine (SVM) in detail. Section 4 presents the experimental results and demonstrates the feasibility and validity of the proposed method. Conclusions are given in Section 5.
2 Extracting Leaf Features

In this section, we first introduce our image segmentation step; after segmentation, color moments and the wavelet transform are used to represent the images of plant leaves.

2.1 Image Segmentation

The images of plant leaves, captured with a camera, often come with complex backgrounds. The purpose of image segmentation is to obtain the region of interest (ROI), from which the color moments and texture features will be extracted. There are two kinds of backgrounds in the leaf images: simple and complicated. In this paper we select leaf images with simple backgrounds to test our leaf recognition algorithm. After the image segmentation procedure, we obtain a binary image in which the ROI is displayed as 1 and the background as 0.

For a leaf image with a simple background, the gray level of pixels within the leaf object is distinctly different from that of pixels in the background. Since the leaf images we collected ourselves have simple backgrounds, we use an adaptive threshold method [10] to segment them, and experimental results show that this method works very well.

There are many kinds of image features that can be used to recognize leaf images, such as shape features [1], color features and texture features. In this paper, we select color and texture features to represent the leaf image.

2.2 Color Feature Extraction

Color moments have been successfully used in many color-based image retrieval systems [2], especially when the image contains just the leaf. The first-order (mean), second-order (variance) and third-order (skewness) color moments have been proved efficient and effective in representing the color distributions of images. Mathematically, the first three moments are defined as:
\mu_k = \frac{1}{sum} \sum_{i=1}^{sum} p_{ik}    (1)

\sigma_k = \left( \frac{1}{sum} \sum_{i=1}^{sum} (p_{ik} - \mu_k)^2 \right)^{1/2}    (2)

\delta_k = \left( \frac{1}{sum} \sum_{i=1}^{sum} (p_{ik} - \mu_k)^3 \right)^{1/3}    (3)

where p_{ik} is the value of the k-th color component of the i-th image pixel, and sum is
the number of pixels contained in the region of interest. Because the HSV color space is much closer to human vision than the RGB color space [12], we extract the color moments from the HSV color space in this paper.

2.3 Image Normalization

Texture is another important feature for representing an image. In this paper, we use the wavelet transform to obtain the leaf vein, on which the texture feature is based. Before the wavelet transform, we preprocess the leaf image to normalize it [4]. The normalization method is summarized as follows:

(1) Compute the center coordinate A(x0, y0) of the plant image.
(2) Find the coordinate B(x1, y1) which is farthest from the center coordinate.
(3) From coordinates A(x0, y0) and B(x1, y1), compute θ = arctan((y1 − y0) / (x1 − x0)).
(4) Rotate the plant image by θ.
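The normalization steps above can be sketched as follows (a minimal pure-Python version; the binary-mask input and the function names are our own assumptions, not from the paper):

```python
import math

def normalization_angle(mask):
    """Steps (1)-(3): centroid of the leaf mask, farthest foreground
    point, and the rotation angle theta (in radians)."""
    pts = [(x, y) for y, row in enumerate(mask)
                  for x, v in enumerate(row) if v == 1]
    x0 = sum(p[0] for p in pts) / len(pts)   # (1) center A(x0, y0)
    y0 = sum(p[1] for p in pts) / len(pts)
    # (2) farthest point B(x1, y1) from the center
    x1, y1 = max(pts, key=lambda p: (p[0] - x0) ** 2 + (p[1] - y0) ** 2)
    return math.atan2(y1 - y0, x1 - x0)      # (3) theta

def rotate_point(x, y, cx, cy, theta):
    """Step (4) for a single pixel coordinate: rotate by -theta about
    the center, as an image library would do for the whole image."""
    dx, dy = x - cx, y - cy
    c, s = math.cos(-theta), math.sin(-theta)
    return cx + c * dx - s * dy, cy + s * dx + c * dy
```

In practice step (4) would be done with an image-processing library's rotation routine; `rotate_point` only illustrates the underlying coordinate transform.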
The results of this preprocessing are shown in Fig.1.
Fig. 1. Leaf image after normalization
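The color moments of Eqs. (1)–(3) in Section 2.2 can be sketched as follows (a minimal pure-Python version operating on one color channel of the ROI; the function name is ours):

```python
def color_moments(channel):
    """Mean, standard deviation and skewness (Eqs. (1)-(3)) of one
    color component over the region-of-interest pixels."""
    n = len(channel)                        # 'sum' in the paper
    mu = sum(channel) / n                   # Eq. (1)
    var = sum((p - mu) ** 2 for p in channel) / n
    sigma = var ** 0.5                      # Eq. (2)
    m3 = sum((p - mu) ** 3 for p in channel) / n
    # real cube root, keeping the sign of the third moment (Eq. (3))
    delta = (abs(m3) ** (1.0 / 3.0)) * (1 if m3 >= 0 else -1)
    return mu, sigma, delta
```

Applied to the H, S and V channels separately, this yields a nine-dimensional color feature vector per leaf.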
2.4 Texture Feature Extraction

The wavelet transform (WT), a linear integral transform that maps L²(R) → L²(R), has emerged over the last two decades as a powerful theoretical framework for the analysis and decomposition of signals and images at multiple resolutions [7]. Moreover, because it is localized in both time/space and frequency, this transform differs fundamentally from the Fourier transform [8, 9]. The wavelet transform decomposes a signal f(t) using a series of elemental functions, called wavelets and scaling factors, which are created by scaling and translating a kernel function ψ(t) referred to as the mother wavelet:

\psi_{ab}(t) = \frac{1}{\sqrt{a}} \, \psi\left(\frac{t - b}{a}\right)    (4)

where a, b ∈ R and a ≠ 0. The discrete wavelet transform (DWT) can be defined as:

W_f^d(j, k) = \int_{-\infty}^{\infty} \bar{\psi}_{j,k}(x) f(x) \, dx = \langle \psi_{j,k}, f \rangle, \quad j, k \in Z    (5)

In this paper we use the 2D wavelet transform, which simply applies the 1D wavelet transform along each dimension separately. The 2D transform of an image I = A_0 = f(x, y) of size M × N is:

A_j = \sum_x \sum_y f(x, y) \, \varphi(x, y)
D_{j1} = \sum_x \sum_y f(x, y) \, \psi^H(x, y)
D_{j2} = \sum_x \sum_y f(x, y) \, \psi^V(x, y)
D_{j3} = \sum_x \sum_y f(x, y) \, \psi^D(x, y)
That is, four quarter-size output sub-images, A_j, D_{j1}, D_{j2} and D_{j3}, are generated by the wavelet transform. After the discrete wavelet transform (DWT), we apply a high-pass filter to obtain the leaf vein. Then we calculate the leaf image's co-occurrence matrix, from which the texture features are computed. The result of this transform is shown in Fig. 2. In the image after wavelet high-pass filtering, the leaf vein is clearly more distinctive than in the original image, and the approximation part of the original image has been filtered out.
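As an illustration, one level of a 2D Haar DWT producing the four sub-images can be sketched as follows (a pure-Python toy with our own sub-band sign conventions; a real system would use a wavelet library such as PyWavelets):

```python
def haar_dwt2(img):
    """One level of the 2D Haar DWT: returns the approximation A and
    the horizontal/vertical/diagonal detail sub-images D1, D2, D3,
    each a quarter of the input size (input dimensions must be even)."""
    def step(a, b, c, d):
        # one 2x2 block -> (average, horizontal, vertical, diagonal)
        return ((a + b + c + d) / 4.0,   # A:  local average
                (a - b + c - d) / 4.0,   # D1: variation along x
                (a + b - c - d) / 4.0,   # D2: variation along y
                (a - b - c + d) / 4.0)   # D3: diagonal variation
    A, D1, D2, D3 = [], [], [], []
    for y in range(0, len(img), 2):
        ra, r1, r2, r3 = [], [], [], []
        for x in range(0, len(img[0]), 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            va, v1, v2, v3 = step(a, b, c, d)
            ra.append(va); r1.append(v1); r2.append(v2); r3.append(v3)
        A.append(ra); D1.append(r1); D2.append(r2); D3.append(r3)
    return A, D1, D2, D3
```

Discarding A and keeping only the detail sub-images corresponds to the high-pass filtering step that isolates the leaf vein.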
Fig. 2. Leaf image after wavelet high pass filter transform
Then we use the transformed image to extract the co-occurrence matrix. The texture features we compute are defined as follows:

Entropy: ent = -\sum_{i=0}^{L-1} \sum_{j=0}^{L-1} p(i, j) \log_2 p(i, j)    (6)

Homogeneity: h = \sum_{i=0}^{L-1} \sum_{j=0}^{L-1} \frac{p(i, j)}{0.1 + |i - j|}    (7)

Contrast: cont = \sum_{i=0}^{L-1} \sum_{j=0}^{L-1} p(i, j) \, |i - j|    (8)

Based on the co-occurrence matrix computed in four different directions of the image (angles of 0, 45, 90 and 135 degrees), we obtain the texture features of the plant images.

All the data extracted as described in Section 2 are raw data. Both the color-feature and texture-feature data will be further processed before training the classifier.
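The features of Eqs. (6)–(8) can be sketched as follows (a minimal pure-Python version taking a normalized co-occurrence matrix `p`; the function name is ours):

```python
import math

def glcm_features(p):
    """Entropy, homogeneity and contrast (Eqs. (6)-(8)) of a
    normalized L x L co-occurrence matrix p (entries sum to 1)."""
    ent = hom = con = 0.0
    for i, row in enumerate(p):
        for j, pij in enumerate(row):
            if pij > 0:
                ent -= pij * math.log2(pij)   # Eq. (6)
            hom += pij / (0.1 + abs(i - j))   # Eq. (7)
            con += pij * abs(i - j)           # Eq. (8)
    return ent, hom, con
```

Evaluating these three features on co-occurrence matrices for the four directions gives a twelve-dimensional texture vector per leaf.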
3 Support Vector Machine (SVM)

The support vector machine (SVM) [11] is a popular technique for classification, and using SVMs for multi-class problems is one of the current research focuses. A classification task usually involves training and testing data consisting of data instances. Each instance in the training set contains one "target value" (class label) and several "attributes" (features). The goal of the SVM is to produce a model that predicts the target value of the data instances in the testing set, given only their attributes.
Given a training set of instance-label pairs (x_i, y_i), i = 1, ..., l, where x_i ∈ R^n and y_i ∈ {1, −1}^l, the support vector machine requires the solution of the following optimization problem:

\min_{w, b, \xi} \; \frac{1}{2} w^T w + c \sum_{i=1}^{l} \xi_i

subject to y_i (w^T \phi(x_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0.

Here the training vectors x_i are mapped into a higher (maybe infinite) dimensional space by the function φ. The SVM then finds a linear separating hyperplane with the maximal margin in this higher-dimensional space; c > 0 is the penalty parameter of the error term. Furthermore, K(x_i, x_j) ≡ φ(x_i)^T φ(x_j) is called the kernel function. Though new kernels are continually being proposed by researchers, the following are the four basic kernels:

Linear: K(x_i, x_j) = x_i^T x_j.
Polynomial: K(x_i, x_j) = (γ x_i^T x_j + r)^d, γ > 0.
Radial basis function (RBF): K(x_i, x_j) = exp(−γ ||x_i − x_j||²), γ > 0.
Sigmoid: K(x_i, x_j) = tanh(γ x_i^T x_j + r).

Here, γ, r and d are kernel parameters.
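The four basic kernels can be sketched as plain functions (pure Python, vectors as lists; parameter names follow the text above):

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def linear(x, y):
    return dot(x, y)

def polynomial(x, y, gamma=1.0, r=0.0, d=2):
    return (gamma * dot(x, y) + r) ** d

def rbf(x, y, gamma=1.0):
    # exp(-gamma * ||x - y||^2)
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def sigmoid(x, y, gamma=1.0, r=0.0):
    return math.tanh(gamma * dot(x, y) + r)
```

In practice one would rely on an SVM package (e.g. LIBSVM) rather than hand-rolled kernels; the functions above only make the definitions concrete.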
4 Experimental Results

In this section, we select the features extracted through the procedures described above (image segmentation and the wavelet transform) for the classification experiments, and choose K(x_i, x_j) = exp(−γ ||x_i − x_j||²), γ > 0, as the SVM kernel.

The following experiments were programmed in Microsoft Visual C++ 6.0 and run on a Pentium 4 with a 2.6 GHz clock and 2 GB of RAM under Microsoft Windows XP. All of the results in the following figures and tables are averages over 50 runs. The database of leaf images was built by ourselves in our lab with a scanner and a digital camera, and includes twenty-four species. We take 500 leaf samples corresponding to these 24 classes, such as seatung, ginkgo, etc. (as shown in Fig. 3). We selected the color-feature and texture-feature data as the input for training the SVM classifier. Before training the classifier, we preprocess the raw data [3] using z-score normalization, which is defined as:
v' = (v - \bar{A}) / \sigma_A    (9)

where \bar{A} and \sigma_A are the mean and standard deviation of component A, respectively.
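The z-score normalization of Eq. (9) can be sketched as follows (pure Python, applied per feature component; the function name is ours):

```python
def zscore(values):
    """z-score normalization of Eq. (9): subtract the mean of the
    component and divide by its standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / std for v in values]
```

After this step each feature component has zero mean and unit variance, so no single feature dominates the SVM training.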
Fig. 3. Leaf images used for experiment
Firstly, we used only the color features and found that the accuracy is above 90 percent when the number of categories is small, yet when the number grew to five or six the accuracy dropped to about 60 percent. This is because most plant leaves are green, and in the HSV color space, which is similar to human vision, the difference between the leaf images of any two plants is very small; that is to say, the color feature alone is not a good feature for plant leaf recognition. Secondly, we used only the texture features, and the recognition rate was satisfactory; from this result we conclude that texture is a good feature for recognizing plant images. Thirdly, because color is nevertheless an important property of plant images, we also used both features together, color and texture, in the experiments. The result is encouraging: the recognition accuracy reaches 92% even for 24 categories.

Table 1. Results of leaf image recognition
Accuracy                 4 categories   6 categories   10 categories   24 categories
Using color feature      90%            63%            40%             Very low
Using texture feature    98%            96%            93.5%           84.6%
Using both features      100%           100%           97.9%           92%
The results of our experiments are shown in Table 1. From the table we can see that our method is competitive. In [1], the authors proposed a method using shape features that can recognize more than 20 categories of plants with an average correct recognition rate of up to 92.2%. Compared with that method, our approach using the color and texture features of plant images performs very well.
5 Conclusions

In this paper, a method using color and texture features to recognize plant images was proposed, i.e., using color moments together with the texture features of the plant leaf image after wavelet high-pass filtering. The wavelet transform maps an image into a low-resolution image space and a series of detail image spaces; in this paper, the leaf vein information extracted after wavelet high-pass filtering represents the texture feature. After computing these features of the leaves, different species of plants were classified using an SVM, and the recognition rate of this method is satisfactory. Our future work includes selecting the most suitable color and texture features, as well as better preprocessing of the raw data extracted from the leaf images, which should further raise the recognition accuracy.

Acknowledgements. This work was supported by the grants of the National Science Foundation of China, Nos. 60772130 & 60705007, the grant of the Graduate Students' Scientific Innovative Project Foundation of CAS (Xiao-Feng Wang), the grant of the Scientific Research Foundation of Education Department of Anhui Province, No. KJ2007B233, and the grant of the Young Teachers' Scientific Research Foundation of Education Department of Anhui Province, No. 2007JQ1152.
References

1. Wang, X.F., Du, J.X., Zhang, G.J.: Recognition of Leaf Images Based on Shape Features Using a Hypersphere Classifier. In: Huang, D.-S., Zhang, X.-P., Huang, G.-B. (eds.) ICIC 2005. LNCS, vol. 3644, pp. 87–96. Springer, Heidelberg (2005)
2. Han, J.H., Huang, D.S., Lok, T.M., Lyu, M.R.: A Novel Image Retrieval System Based on BP Neural Network. In: The 2005 International Joint Conference on Neural Networks (IJCNN 2005), Montreal, Quebec, Canada, vol. 4, pp. 2561–2564 (2005)
3. Liu, Z.W., Zhang, Y.J.: Image Retrieval Using Both Color and Texture Features. J. China Instit. Commun. 20(5), 36–40 (1999)
4. Liu, J.L., Gao, W.R., Tao, C.K.: Distortion-invariant Image Processing with Standardization Method. Opto-Electronic Engin. 33(12), 75–78 (2006)
5. Wang, Z., Chi, Z., Feng, D.: Fuzzy Integral for Leaf Image Retrieval. In: Proc. IEEE Int. Conf. Fuzzy Systems, pp. 372–377 (2002)
6. Gu, X., Du, J.X., Wang, X.F.: Leaf Recognition Based on the Combination of Wavelet Transform and Gaussian Interpolation. In: Huang, D.-S., Zhang, X.-P., Huang, G.-B. (eds.) ICIC 2005. LNCS, vol. 3644, pp. 253–262. Springer, Heidelberg (2005)
7. Vetterli, M., Kovacevic, J.: Wavelets and Subband Coding. Prentice Hall, Englewood Cliffs (1995)
8. Akansu, A.N., Haddad, R.A.: Multiresolution Signal Decomposition: Transforms, Subbands, and Wavelets. Academic Press, London (1992)
9. Vetterli, M., Herley, C.: Wavelets and Filter Banks: Theory and Design. IEEE Trans. Signal Proc. 40, 2207–2231 (1992)
10. Chan, F.H.Y., Lam, F.K., Zhu, H.: Adaptive Thresholding by Variational Method. IEEE Trans. Image Proc. 7(3), 468–473 (1998)
11. Cortes, C., Vapnik, V.: Support-Vector Networks. Machine Learning 20, 273–297 (1995)
12. Plataniotis, K.N., Venetsanopoulos, A.N.: Color Image Processing and Applications. Springer, Heidelberg (2000)
Region Segmentation of Outdoor Scene Using Multiple Features and Context Information

Dae-Nyeon Kim, Hoang-Hon Trinh, and Kang-Hyun Jo

Graduate School of Electrical Engineering, University of Ulsan, San 29, Mugeo-Dong, Nam-Gu, Ulsan 680-749, Korea
{dnkim2005,hhtrinh,jkh2008}@islab.ulsan.ac.kr
Abstract. This paper presents a method to segment the regions of objects in outdoor scenes for autonomous robot navigation. The method segments images taken by a moving robot in an outdoor scene. It begins with object segmentation, which uses multiple features to obtain the segmented region of each object. The multiple features are color, edge, line segments, hue co-occurrence matrix (HCM), principal components (PCs) and vanishing points (VPs). We model the objects of the outdoor scene by defining their characteristics individually, and segment the regions with a mixture of the proposed features and methods: objects can be detected when the predefined multiple features are combined. The next stage classifies each object as natural or artificial; we detect sky and trees as natural objects and buildings as artificial objects. The final stage combines appearance and context information. We confirm the results of object segmentation through experiments using multiple features and context information.

Keywords: object segmentation, outdoor scene, multiple features, context information.
1 Introduction

When an autonomous robot navigates an outdoor scene, it typically sets a specific target. It also needs to avoid objects when it encounters obstacles, to know where it is, and to know which path to take next. For object segmentation, we classify objects into artificial and natural ones [9], and then define their characteristics individually. The method begins with object segmentation, which uses multiple features to obtain the segmented region of each object. The multiple features are color, edge, line segments, PCs, vanishing points and HCM. Among these, we present a method that exploits texture and color information.

Image segmentation can become very difficult, as the image gray value or color alone is rarely a good indicator of object boundaries due to noise, texture, shading, occlusion, or simply because the colors of two objects are nearly the same. Zhang et al. [3] proposed a color image segmentation method based on intensity and color. Such methods give good results for images containing a simple object in a single form, such as a building. But the case where one object is complex, or where

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 200–207, 2008. © Springer-Verlag Berlin Heidelberg 2008
different objects have identical colors, is problematic: different objects may be merged into one region, or one object may be split into several regions. To overcome such defects, we present a method that combines various features in complex images. We also propose a method for detecting the faces of buildings using line segments and their geometrical vanishing points [6, 9]. Haralick et al. [1] used statistical features extracted from objects using the gray level co-occurrence matrix (GLCM) for texture analysis [1,4,2]. We developed and evaluated different implementations of the GLCM, using a co-occurrence matrix of hue values instead of gray levels. This paper shortens the processing time by taking into account a displacement vector with the specific direction of 135° in the HCM [9]. In addition, we use the HCM to detect the regions of trees. The method combines features according to the characteristics of the objects and segments the images accordingly.

We consider images of outdoor scenes and would like to label each pixel as sky, trees, building, etc. To achieve this goal, the object segmentation task requires knowledge of the objects contained in the image. We propose a probabilistic method that takes contextual information into account to segment regions belonging to scenes primarily containing objects; indeed, it is increasingly being recognized in the vision community that context information is necessary for reliable extraction of image regions and objects.

This paper is organized as follows. Section 2 describes feature extraction for the objects of an image: color, edge, line segments, PCs, VPs and HCM. Section 3 describes a probabilistic method using contextual information to segment regions belonging to scenes primarily containing objects. Section 4 presents the methods of region segmentation. Experimental results are shown in Section 5. Section 6 concludes the paper.
2 Multiple Features

When the robot navigates an outdoor scene, we classify the objects in the acquired images using prior knowledge about them. We present candidate segmented regions for natural and artificial objects such as sky, trees and buildings, and segment the regions using multiple features: color, edge, line segments, PCs, vanishing points and HCM. For the color feature we use the Hue, Saturation and Intensity (HSI) color model. Many line segment components appear in artificial objects such as buildings. The PCs are formed by merging neighborhoods of basic parallelograms which have similar colors [6], and the regions of PCs are detected. An edge is the boundary between two regions with relatively distinct gray-level properties [9]. We use M-estimator SAmple Consensus (MSAC) to group parallel line segments which share a common vanishing point [6]; we calculate one dominant vanishing point for the vertical direction and at most five dominant vanishing points for the horizontal direction. The HCM captures spatial dependence frequencies as a function of the angular relationship between neighboring resolution pixels as well as of the distance between them [9]. We use the six extracted features in combination. To extract the regions of sky and cloud, we use color features and context information. The tree-region extraction method uses color features, context information and HCM. We also use color, edge, line segments, PCs and VPs to extract buildings.
3 Contextual Probability

For each object we record its habitual location in the image, described by the percentages of its being at the top, middle and bottom of an image (LTi, LMi and LBi, respectively). The y position of every pixel is obtained, and the probability of each pixel belonging to a certain position is computed. The main drawback of not using context is the overlap between classes, e.g. sky and water, both blue: the system can easily confuse a water region at the bottom of the image with sky, since they have very similar appearance. Two small image patches may be ambiguous at a very local scale but clearly identifiable inside their context. Specifically, we distinguish two kinds of context information: (i) absolute context, referring to the location of objects in the image (sky is at the top of the image, and water at the bottom); (ii) relative context, the position of objects with respect to other objects in the image (grass tends to be next to the road, and clouds in the sky). Some proposals consider both kinds of context [5], while only relative context is considered by He et al. [7]. Fuzzy rules are used to express the position of pixels in a fuzzy way. The probabilities PT(yj), PM(yj) and PB(yj) are the beliefs that a pixel at position yj is at a certain location (top, middle and bottom) of the image. Therefore, Eq. (1) gives the probability that a pixel j at position yj belongs to an object O_Li considering its absolute position:

P_L(j | O_Li) = max(LTi · PT(yj), LMi · PM(yj), LBi · PB(yj))    (1)

In this paper, the pixels with the highest probability of belonging to an object (P_L > 0.8) constitute the region.
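Eq. (1) can be sketched as follows (pure Python; the triangular membership functions for PT, PM and PB are our own illustrative assumption, since the text only says fuzzy rules are used):

```python
def position_memberships(y, height):
    """Fuzzy beliefs that a pixel at row y is at the top, middle or
    bottom of an image (simple triangular memberships; illustrative)."""
    t = y / float(height - 1)            # 0 at top, 1 at bottom
    top = max(0.0, 1.0 - 2.0 * t)
    bottom = max(0.0, 2.0 * t - 1.0)
    middle = 1.0 - top - bottom
    return top, middle, bottom

def absolute_context_prob(y, height, LT, LM, LB):
    """Eq. (1): P_L(j|O_Li) for an object with location percentages
    LT, LM, LB (fractions of the object seen at top/middle/bottom)."""
    pt, pm, pb = position_memberships(y, height)
    return max(LT * pt, LM * pm, LB * pb)
```

For a sky-like object (LT = 0.9, LM = 0.1, LB = 0.0), a pixel at the very top of the image scores 0.9 and one at the very bottom scores 0, matching the intuition behind absolute context.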
Fig. 1. Flowchart for segmentation of natural and artificial object
4 Segmentation of Object Region

The goal of segmentation is to simplify and/or change the representation of an image into something more meaningful and easier to analyze. We consider images of outdoor scenes and would like to segment each object, such as sky, trees and buildings. Region segmentation uses a mixture of the multiple features according to the characteristics of the objects. The flowchart of the segmentation process for natural and artificial objects is described in Fig. 1.

4.1 Segmentation of Sky and Cloud Region
Several color spaces are in wide use, including RGB, HSI, CIE, YIQ, YCbCr, etc. We convert the RGB color space to HSI [3]. This paper uses the HSI color model and finds the HSI value ranges of sky and cloud in the image, determined through repeated experiments. We also use absolute context information for the location of objects in the image: the image is divided into three parts, top, middle and bottom. If the robot travels at regular intervals in an outdoor scene, we can assume that sky and cloud appear at the top of the image, so we add the context information that the sky position is at the top; if a different object appears surrounded by sky, it is regarded as part of the sky region. The ranges of sky and cloud correspond to the hue, saturation and intensity values of Table 1. Region segmentation extracts the cloud region after the sky region has been extracted; the procedure is shown in Fig. 1. The segmented sky and cloud regions are shown in Fig. 2(b) and Fig. 2(c), and their merger in Fig. 2(d).
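The HSI range test described above can be sketched as follows (pure Python; the numeric ranges are the sky and cloud values of Table 1, and the function names are ours):

```python
def is_sky(h, s, i):
    """Sky test with the Table 1 ranges: hue 170-300, saturation
    10-50, intensity >= 160."""
    return 170 <= h <= 300 and 10 <= s <= 50 and i >= 160

def is_cloud(h, s, i):
    """Cloud test with the Table 1 ranges: hue 170-300, saturation
    <= 15, intensity >= 200."""
    return 170 <= h <= 300 and s <= 15 and i >= 200
```

Applying these tests only to the top part of the image implements the absolute-context constraint that sky and cloud appear at the top.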
4.2 Segmentation of Trees Region

We use the HSI color model and find the part of the image corresponding to the value ranges of trees; these HSI ranges were found through repeated experiments. Additionally, in order to estimate the similarity between different gray level co-occurrence matrices (GLCM), Haralick [1] proposed statistical features extracted from them. The GLCM, one of the best-known texture analysis methods,
Fig. 2. Segmentation of sky and cloud region: (a) original image (b) sky (c) cloud (d) the merger of sky and cloud
Fig. 3. Comparison results of diverse cues of segments region with trees: (a) original images (b) trees detection using HSI (c) trees detection using HCM (d) trees detection using HSI+HCM
estimates image properties related to second-order statistics. Each entry (i, j) in the GLCM corresponds to the number of occurrences of the pair of gray levels i and j at a given distance apart in the original image. We use a co-occurrence matrix of hue values instead of gray levels. To reduce the computational complexity, only some of these features were selected. We analyze the spatial characteristics using the HCM [9]. The HCM P[i, j] is defined by specifying a displacement vector and counting all pairs of pixels separated by distance d and direction φ having hue levels i and j. Kim et al. [9] illustrated how to obtain the HCM in the 135° diagonal direction from a simple original image having hue levels 0, 1 and 2. We perform image segmentation using the displacement vector of the 135° diagonal direction in the HCM. This paper thus attempts an alternative reading of the GLCM and proposes the HCM algorithm, which analyzes the counts of hue-value pixel pairs in the original image. First, we use HSI and find the range of hue. Then, we define a range from the co-occurrence matrix for high-frequency regions. Finally, we obtain the value of the HCM and use HCM and HSI together. The method using HSI alone produces much noise in the segmentation; we decrease this noise by combining HCM with HSI. The HSI ranges were derived from repeated experimental trials to segment tree regions as the natural object.

4.3 Segmentation of Building Face Region

A building face is a planar surface which contains PCs such as doors, windows, wall regions and columns. The first step detects the region of trees with the HSI and HCM algorithm described in Section 4.2. The second step detects line segments using the Canny edge detector; a line segment is a part of an edge which satisfies two conditions [9]. In the experiments, we choose T1 and T2 as 10 and √2 pixels, respectively. The result of line segment detection is shown in Fig. 4(b). Most of the low-contrast lines usually do not lie on the edges of PCs, because a PC edge separates the image into two regions of high color contrast. We rely on the intensity of the two regions beside a line to discard the low-contrast lines [9]; the result is illustrated in Fig. 4(c). The vertical group contains line segments which make an acute angle of 20° in
Table 1. Region segmentation of objects using ranges of HSI values

Object             Hue        Saturation   Intensity
Sky (1)            170∼300    10∼50        I ≥ 160
Cloud (2)          170∼300    S ≤ 15       I ≥ 200
Merge of (1),(2)   170∼300    S ≤ 10       I ≥ 160
Trees              60∼140     S ≤ 15       I ≥ 65
Fig. 4. The result of building detection: (a) original images (b) line segments detection and trees region (c) survived line segments reduction (d) dominant vanishing points detected by MSAC (e) mesh of basic parallelograms of face
maximum with the vertical axis. The remaining lines are treated as the horizontal group. For the fine separation stage, we use MSAC [6] to robustly estimate the vanishing points. Suppose the end points of a line segment are x1 and x2; then [2]

l = (a, b, c)^T;  l = x1 × x2,  with x1 = (x′1, y′1, 1)^T, x2 = (x′2, y′2, 1)^T.

Given two lines, a common normal is determined by v = l_i × l_j, where v = (v1, v2, v3)^T. Hence, given a set of n line segments belonging to lines that are parallel in 3D, the vanishing point v is obtained by solving Eq. (2):

l_i^T v = 0,  i = 1, 2, …, n.  (2)
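The estimation of v from Eq. (2) is a homogeneous least-squares problem; a NumPy sketch follows. The segment endpoints below are hypothetical, and the SVD-based solver is a standard substitute for the inner step of MSAC, not the authors' exact implementation:

```python
import numpy as np

def line_through(p1, p2):
    """Homogeneous line l = x1 x x2 through two image points (x, y)."""
    x1 = np.array([p1[0], p1[1], 1.0])
    x2 = np.array([p2[0], p2[1], 1.0])
    return np.cross(x1, x2)

def vanishing_point(lines):
    """Least-squares solution of l_i^T v = 0: the right singular vector
    of the stacked line matrix with the smallest singular value."""
    L = np.vstack(lines)
    _, _, vt = np.linalg.svd(L)
    v = vt[-1]
    return v / v[2] if abs(v[2]) > 1e-12 else v  # dehomogenize when finite

# Hypothetical segments, all converging toward the point (100, 50):
segs = [((0, 0), (50, 25)), ((0, 20), (60, 38)), ((0, 40), (80, 48))]
lines = [line_through(a, b) for a, b in segs]
vp = vanishing_point(lines)  # ≈ (100, 50, 1)
```

In practice MSAC would wrap this solver, scoring random minimal subsets and re-estimating v from the inliers.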
Robust estimation of v by MSAC has proven the most successful. We calculate at most five dominant vanishing points for the horizontal direction [9]. The algorithm proceeds in three steps [8,9]. The priority of a horizontal vanishing point depends on the number Ni of parallel lines in the corresponding group. The groups are marked in red, green, blue, yellow and magenta, as illustrated by Fig. 4(d). The vertical line segments are extended
D.-N. Kim, H.-H. Trinh, and K.-H. Jo
to detect the vertical vanishing point. We use the number of intersections of vertical lines with horizontal segments to detect and separate planes as the faces of the building. Fig. 4(e) shows the results of face detection. The boundaries of the faces are defined in three steps by Kim et al. [9]. Let Nl be the minimum number of horizontal lines in the left and right faces and Ni the number of intersection points. A face is accepted when the ratio of Ni to Nl exceeds the threshold NT = 0.35, satisfying Eq. (3):

N = Ni / Nl ≥ NT.  (3)
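The acceptance test of Eq. (3) reduces to a one-line ratio check; a minimal sketch, assuming the stated threshold NT = 0.35 (the function and argument names are ours):

```python
def accept_face(n_intersections, n_horizontal_lines, n_t=0.35):
    """Eq. (3): keep a candidate face when N = Ni / Nl >= N_T."""
    if n_horizontal_lines == 0:
        return False  # no horizontal support at all
    return n_intersections / n_horizontal_lines >= n_t

# e.g. 7 intersection points against 10 horizontal lines passes (0.7 >= 0.35),
# while 3 against 10 fails (0.3 < 0.35).
```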
Finally, the mesh of basic parallelograms is created by extending the horizontal lines. Each mesh represents one face of the building. Fig. 4(e) shows the resulting meshes of the detected faces.
5 Experiment
The image database used in the experiment consists of about 1300 images. Since the region around tree leaves normally has high spatial frequency, we search for trees with the proposed HCM algorithm; the result is shown in Fig. 3. We also convert RGB to the HSI color model and find the parts of the image whose values correspond to trees. Finally, we find the tree regions by combining the HSI features, context information and the HCM. Segmenting the tree regions is a preprocessing step for building detection: we remove the high-frequency content in the tree regions, which reduces noise for line segment detection. The faces of the building are then detected from line segments and their vanishing points. The MSAC algorithm finds the vanishing points not only for multiple building faces but also for faces disturbed by noise such as tree branches or electrical lines; a good result can be seen in Fig. 4(e). The meshes of parallelograms can help us detect more PCs such as windows and doors. In addition, geometrical relations such as the height and the number of windows can be exploited to infer more information about the building, for example how many rooms it has.
6 Conclusion
This paper proposed a method of object segmentation in outdoor scenes using multiple features and context information. The features are color, edges, line segments, the HCM, PCs and VPs. Mixing these features, we segment the image into several regions: sky and trees as natural objects and buildings as artificial objects. We use color features and absolute context information to extract the sky and cloud regions; color, edge and HCM features to extract the tree regions; and color, edges, line segments, PCs and VPs to extract buildings. We then remove the high-frequency content in the tree regions. The meshes of parallelograms help us detect more PCs such as windows and doors. Overall, the system segments the regions of the objects as a mixture by
using multiple features. We accomplished the preprocessing needed for a mobile robot to recognize objects in an image of an outdoor scene. In future work, we will study how objects in outdoor scenes relate geometrically to one another, and apply the method to image sets containing more objects (cars, people, animals, etc.). In addition, we want to characterize the properties of trees accurately across seasons, times of day and weather conditions. Acknowledgments. The authors would like to thank Ulsan Metropolitan City and the MOCIE and MOE of the Korean Government, which partly supported this research through the NARC and post BK21 project at the University of Ulsan.
References

1. Haralick, R.M., Shanmugam, K., Dinstein, I.: Texture Features for Image Classification. IEEE Trans. on Syst. Man Cybern. SMC-3(6), 610–621 (1973)
2. Li, J., Wang, J.Z., Wiederhold, G.: Classification of Textured and Non-textured Images Using Region Segmentation. In: Int'l Conf. on Image Processing, pp. 754–757 (2000)
3. Zhang, C., Wang, P.: A New Method of Color Image Segmentation Based on Intensity and Hue Clustering. In: Int'l Conf. on Pattern Recognition, vol. 3, pp. 613–616 (2000)
4. Partio, M., Cramariuc, B., Gabbouj, M., Visa, A.: Rock Texture Retrieval Using Gray Level Co-occurrence Matrix. In: Proc. of 5th Nordic Signal Processing Symposium (2002)
5. Singhal, A., Jiebo, L., Weiyu, Z.: Probabilistic Spatial Context Models for Scene Content Understanding. In: IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 1, pp. 235–241 (2003)
6. Hartley, R., Zisserman, A.: Multiple View Geometry in Computer Vision. Cambridge University Press, Cambridge (2004)
7. He, X., Zemel, R.S., Carreira-Perpinan, M.A.: Multiscale Conditional Random Fields for Image Labeling. In: IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. 695–702 (2004)
8. Zhang, W., Kosecka, J.: Localization Based on Building Recognition. In: Int'l Conf. on Computer Vision and Pattern Recognition, vol. 3, pp. 21–28 (2005)
9. Kim, D.N., Trinh, H.H., Jo, K.H.: Object Recognition by Segmented Regions Using Multiple Cues on Outdoor Environment. International Journal of Information Acquisition 4(3), 205–213 (2007)
Two-Dimensional Partial Least Squares and Its Application in Image Recognition

Mao-Long Yang 1,2, Quan-Sen Sun 1, and De-Shen Xia 1

1 Institute of Computer Science, Nanjing University of Science & Technology, Nanjing 210094, China
2 International Studies University, Nanjing 210031, China
[email protected]

Abstract. Extracting optimal discriminant features is a critical step in image recognition. Algorithms such as classical iterative partial least squares (NIPALS and CPLS), non-iterative partial least squares based on orthogonal constraints (NIPLS), and partial least squares based on conjugation orthogonality constraints (COPLS) are introduced briefly. NIPLS and COPLS are then extended to operate on original image matrices, constructing the image covariance matrices directly from the original image matrices just like 2DPCA and 2DCCA; we call these extensions 2DNIPLS and 2DCOPLS in this paper. In theory, any two optimal discriminant features extracted by 2DCOPLS are uncorrelated, thanks to the uncorrelated-score constraints. At the same time, we point out that the 2DCOPLS algorithm is more complicated than the other PLS-based algorithms. The results of experiments on the ORL face database, the Yale face database, and a partial FERET face sub-database show that the presented 2DPLS algorithms are efficient and robust. Keywords: Partial Least Squares (PLS), Uncorrelated Constraints, 2DPCA, Optimal Projection, Image Recognition.
1 Introduction

Partial Least Squares Regression (PLSR) is a multivariable analysis method that arose from application fields; it was conceived by Herman Wold for econometric modeling of multivariate time series, in order to reduce the impact of noise in the data and obtain a robust model [1]. It has become a tool widely used in chemometrics [2]. PLS has developed quickly in theory, algorithms and applications since the 1980s. Its properties make PLS a powerful tool for regression analysis and dimension reduction, and it has been employed successfully in many fields such as process control, data analysis and prediction, and image processing and classification [3]. Classical iterative PLS (CPLS) based on singular value decomposition (SVD) was proposed because of the uncertain solutions of nonlinear iterative PLS (NIPALS) [4,5]. The first d (d = rank(X)) projective vectors (loading vectors) α_1, …, α_d produced by CPLS are orthogonal, and the PLS components corresponding to them are orthogonal, too. On the other hand, non-iterative PLS

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 208–215, 2008. © Springer-Verlag Berlin Heidelberg 2008
(NIPLS) based on orthogonal constraints can extract PLS scores (PLS projective features) effectively by solving one SVD, but the PLS scores may be correlated. PLS based on conjugation orthogonality constraints (COPLS), instead of orthogonal constraints, can extract uncorrelated PLS scores in theory [6-13]. The criterion function of two-dimensional PLS (2DPLS) can be established with the original image covariance matrices directly, similar to 2DPCA [14] and 2DCCA [15], instead of reshaping the images into vectors. In the case of image matrices, 2DPLS involves iterative problems and eigenvalue problems for much smaller matrices than 1DPLS, which reduces the complexity dramatically, and the PLS scores can be extracted more effectively. We first introduce the basic ideas of CPLS, NIPLS and COPLS briefly, then present 2D extensions of PLS, referred to as 2DNIPLS and 2DCOPLS, which are used to extract the PLS scores of images for recognition. The results of experiments on the ORL face database, the Yale face database, and a partial FERET face sub-database show that the presented algorithms are more efficient and robust than 1DPLS.
2 Partial Least Squares Considering
two
centered
( X , Y ) = {( xi , yi )}i =1 ∈ R × R n
vectors), α and β
p
q
sample
sets
with
n
samples,
, PLS finds pairs of projective vectors(loading
, which make the projections x
*
= X α , y * = Y β cover their *
*
variation information as much as possible, and the correlation between x , y are maximized at the same time. In general, PLS creates orthogonal score vectorsby CPLS. In other words, the criterion function to be maximized is given by
Cov ( x* , y* ) = α T E ( X T Y ) β = α T Gxy β → max .
(
Where Gxy = E X Y T
(1)
) denotes the covariance matrix between X and Y . Then
PLS is formulated as
J PLS (α , β ) = α Gxy β = T
Subject to α
T
α = β T β = 1.
The unit projective vectors,
α T Gxy β
(α α ⋅ β β ) T
1/ 2
T
.
(2)
α and β , which maximizing the function, are called *
*
PLS loading vectors. The projective vectors, x and y , have the largest covariance when the original sample vectors are projected on the loading vectors. From the idea of PLS modeling, it is easy to see how PCA and canonical correlations analysis (CCA) work in PLS, and the advantages of PCA and CCA are integrated in PLS. Besides, it is easy to see how PLS can be thought of as “penalized” CCA, with
210
M.-L. Yang, Q.-S. Sun, and D.-S. Xia
basically two PCAs (one in the X space and the other in the Y space) providing the penalties[3,9]. Formula (2) based on the orthogonal constraints on transformed
Gxy Gyxα = λα
in order (or G yx Gxy β
to
= λβ
solve ). The first
α kT α i = β kT β i = 0 can
be
eigenvalue problem of k (k ≤ r ) pairs of PLS projective
vectors are the eigenvectors of Gxy G yx (and G yx Gxy )corresponding to the first
k th
largest eigenvalues. We call the algorithm NIPLS. If, instead of orthogonal constraints, conjugate orthogonal constraints are imposed, formula (2) can be transformed in order to solve eigenvalue problem of
( I − (Gx Dx )((Gx Dx )T (Gx Dx )) −1 (Gx Dx ))Gxy Gyxα k +1 = λα k +1 , where I is a
,and D
unit matrix
x
= (α1 , α 2 ,L , α k )T . There is a completely similar expression
for the Y space structure[3,9]. We call the algorithm COPLS.
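Since the NIPLS loading pairs are eigenvectors of G_xy G_yx and G_yx G_xy, they coincide with the left and right singular vectors of G_xy, so one SVD yields all pairs at once. A minimal sketch on random stand-in data (the matrix shapes are illustrative only):

```python
import numpy as np

# Random stand-ins for centered sample sets X (n x p) and Y (n x q).
rng = np.random.default_rng(0)
n, p, q = 100, 8, 5
X = rng.standard_normal((n, p)); X -= X.mean(axis=0)
Y = rng.standard_normal((n, q)); Y -= Y.mean(axis=0)

G_xy = X.T @ Y / n                       # cross-covariance matrix, p x q

# NIPLS: loading pairs are the singular vectors of G_xy, because
# G_xy G_yx alpha = s^2 alpha and G_yx G_xy beta = s^2 beta.
U, s, Vt = np.linalg.svd(G_xy, full_matrices=False)
alphas, betas = U, Vt.T                  # columns are alpha_i, beta_i

cov_1 = alphas[:, 0] @ G_xy @ betas[:, 0]  # equals s[0], the maximal covariance
```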
3 2DPLS

3.1 2DNIPLS

Let X = [x_{1,1}, …, x_{c,n_c}] be the image sample matrix, where x_{i,j} is an image matrix of size h × l, n_i (i = 1, …, c) is the number of samples belonging to the i-th class, and N = n_1 + n_2 + … + n_c is the total number of samples. Thus we can obtain the mean matrix of the samples, X̄ = (1/N) Σ_{i=1}^c Σ_{j=1}^{n_i} x_{i,j}.

For image recognition tasks, the sample images can be considered one variable set in 2DPLS, called the sample matrix. The other variable set is the class membership matrix, which represents the relationship between samples and classes. Similarly to the definitions in traditional CCA and PLS methods [3], the class membership matrix can be coded in two equally reasonable ways [3,15]:

Z_1 = diag(P_1, P_2, …, P_c), of size (h·c) × (l·N);
Z_2 = [diag(P_1, P_2, …, P_{c−1}) | 0], coding the c-th class by zero blocks, of size (h·(c−1)) × (l·N).  (3)

where P_i indicates that there are n_i samples in the i-th class, and each sample corresponds to a matrix Q of size h × l, as large as a sample image (in general we presume that the number of rows is larger than the number of columns, namely h > l). So the matrix P_i can be denoted P_i = [Q, …, Q], of size h × (l·n_i), i = 1, …, c. Such a class membership matrix not only shows the membership between samples and classes but also maintains the spatial information of the sample images.

To obtain the mean of the class membership matrix in the sense of two-dimensional sample representation, the matrix Y is rewritten as Y = [y_{1,1}, …, y_{c,n_c}], where y_{i,j} is a matrix of size (h·c) × l. Then the mean of the class membership matrix is Ȳ = (1/N) Σ_{i=1}^c Σ_{j=1}^{n_i} y_{i,j}, and the covariance matrices of X and Y are denoted

G_x = (1/N) Σ_{i=1}^c Σ_{j=1}^{n_i} (x_{i,j} − X̄)(x_{i,j} − X̄)^T,
G_y = (1/N) Σ_{i=1}^c Σ_{j=1}^{n_i} (y_{i,j} − Ȳ)(y_{i,j} − Ȳ)^T,
G_xy = G_yx^T = (1/N) Σ_{i=1}^c Σ_{j=1}^{n_i} (x_{i,j} − X̄)(y_{i,j} − Ȳ)^T,

respectively.
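As a concrete sketch, the 2D covariance matrices above can be accumulated directly from stacks of image matrices. The data below are random stand-ins, and the membership block Q is simply taken as a matrix of ones, which is our assumption, since the paper leaves Q generic:

```python
import numpy as np

rng = np.random.default_rng(1)
h, l, c, n_per = 6, 4, 3, 5                 # image size, classes, samples/class
X = rng.standard_normal((c, n_per, h, l))   # x_{i,j}: image matrices
Q = np.ones((h, l))                         # membership block (assumed choice)

# y_{i,j}: (h*c) x l, block Q in the row band of class i, zeros elsewhere
Y = np.zeros((c, n_per, h * c, l))
for i in range(c):
    Y[i, :, i * h:(i + 1) * h, :] = Q

Xm = X.reshape(-1, h, l).mean(axis=0)       # mean image matrix
Ym = Y.reshape(-1, h * c, l).mean(axis=0)   # mean membership matrix

N = c * n_per
Gx = sum((x - Xm) @ (x - Xm).T for x in X.reshape(-1, h, l)) / N          # h x h
Gxy = sum((x - Xm) @ (y - Ym).T
          for x, y in zip(X.reshape(-1, h, l), Y.reshape(-1, h * c, l))) / N  # h x (h*c)
```

Note the payoff of the 2D formulation: G_x is only h × h, regardless of how many samples there are.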
Then formula (2) can be transformed into the two matrix eigenvalue problems below:

G_xy G_yx α = λ² α.  (4)
G_yx G_xy β = λ² β.  (5)

Under the orthogonal constraints α_k^T α_i = β_k^T β_i = 0 (1 ≤ i < k), the number of available projective vectors is r pairs (r is the number of nonzero eigenvalues of the matrix G_xy G_yx), and any subsequent PLS projective vectors, say α_k, β_k (k ≤ r), are the eigenvectors of equations (4) and (5) corresponding to the k-th largest eigenvalue. Since G_xy G_yx and G_yx G_xy are symmetric matrices and rank(G_xy G_yx) = rank(G_yx G_xy) ≤ rank(G_xy), the nonzero eigenvalues of eigen-equations (4) and (5) coincide, and their number is not greater than rank(G_xy). Let λ_1² ≥ λ_2² ≥ … ≥ λ_r² > 0; the r pairs of eigenvectors corresponding to them are orthogonal, namely α_i^T α_j = β_i^T β_j = δ_ij, and we can also deduce

α_i = λ_i^{-1} G_xy β_i.  (6)
β_i = λ_i^{-1} G_yx α_i.  (7)
α_i^T G_xy β_j = α_i^T G_xy (λ_j^{-1} G_yx α_j) = λ_j^{-1} α_i^T (λ_j² α_j) = λ_j δ_ij.  (8)
Generally we solve whichever of equations (4) and (5) has the smaller rank, and calculate the other eigenvector with formula (6) or (7). We call the method above 2D non-iterative PLS (2DNIPLS).

3.2 2DCOPLS
The covariance between the sample feature vectors x_i* and x_j* (y_i* and y_j*) obtained with 2DNIPLS can be defined as

E[(x_i* − E(x_i*))^T (x_j* − E(x_j*))] = α_j^T G_x α_i.  (9)
E[(y_i* − E(y_i*))^T (y_j* − E(y_j*))] = β_j^T G_y β_i.  (10)

Generally, equations (9) and (10) are not equal to 0; that is, the feature vectors projected by the loading vectors of NIPLS may be correlated. In order to obtain uncorrelated projective features, the (k+1)-st (k ≥ 1) pair of optimal projective directions, {α_{k+1}; β_{k+1}}, should be the one that satisfies the conjugate orthogonality constraints (11) and maximizes criterion function (2), after the first pair of optimal discriminative projective directions given by 2DNIPLS in Sect. 3.1:

α_{k+1}^T G_x α_i = β_{k+1}^T G_y β_i = 0,  i = 1, 2, …, k.  (11)
If we calculate r (r ≤ n) pairs of optimal projective directions with the method above, the improved optimal projective features x*, y* will be obtained, and any two projective features x_i* and x_j* are uncorrelated. The optimal projective directions {α_{k+1}; β_{k+1}} which satisfy the conjugate orthogonality constraints (11) and maximize criterion function (2) are the eigenvectors corresponding to the largest eigenvalues of the two eigen-equations (12) and (13) [3,7]:

P G_xy G_yx α_{k+1} = λ α_{k+1}.  (12)
Q G_yx G_xy β_{k+1} = λ β_{k+1}.  (13)

where P = I − (G_x D_x)((G_x D_x)^T (G_x D_x))^{-1} (G_x D_x)^T, Q = I − (G_y D_y)((G_y D_y)^T (G_y D_y))^{-1} (G_y D_y)^T, I is the identity matrix, D_x = (α_1, α_2, …, α_k)^T, and D_y = (β_1, β_2, …, β_k)^T.
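One COPLS step can be sketched directly from Eq. (12): project out the G_x-span of the directions already found, then take the dominant eigenvector. This is our illustrative reading of the update, using a pseudo-inverse for numerical safety and random stand-in data:

```python
import numpy as np

def copls_next_alpha(Gx, Gxy, alphas_found):
    """One COPLS step: dominant eigenvector of P Gxy Gyx (Eq. (12)),
    with P = I - (Gx Dx)((Gx Dx)^T (Gx Dx))^-1 (Gx Dx)^T."""
    p = Gx.shape[0]
    if alphas_found:
        A = Gx @ np.column_stack(alphas_found)          # Gx Dx, p x k
        P = np.eye(p) - A @ np.linalg.pinv(A.T @ A) @ A.T
    else:
        P = np.eye(p)                                   # no constraints yet
    w, V = np.linalg.eig(P @ Gxy @ Gxy.T)               # Gxy Gyx = Gxy Gxy^T
    a = np.real(V[:, np.argmax(np.real(w))])
    return a / np.linalg.norm(a)

rng = np.random.default_rng(2)
p, q, n = 6, 4, 50
X = rng.standard_normal((n, p)); X -= X.mean(0)
Y = rng.standard_normal((n, q)); Y -= Y.mean(0)
Gx, Gxy = X.T @ X / n, X.T @ Y / n

a1 = copls_next_alpha(Gx, Gxy, [])
a2 = copls_next_alpha(Gx, Gxy, [a1])
# a2 satisfies the conjugate orthogonality of Eq. (11): a2^T Gx a1 = 0
```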
After the optimal projections {α_i; β_i}_{i=1}^k are calculated, we can use the matrix W_x = (α_1, α_2, …, α_d) (d = 1, …, k) to extract 2D features of images. For example, for a given image x of size h × l, we have x̃ = W_x^T x. The sizes of W_x and x̃ are h × d and d × l respectively, and we call x̃ the projective feature matrix of the given image.
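Feature extraction is then a single matrix product; a sketch with hypothetical sizes (h = 8, l = 5, d = 3; the orthonormal W_x below is random, standing in for learned loading vectors):

```python
import numpy as np

rng = np.random.default_rng(3)
h, l, d = 8, 5, 3
Wx = np.linalg.qr(rng.standard_normal((h, d)))[0]  # stand-in loading matrix, h x d
x = rng.standard_normal((h, l))                    # one image matrix

feat = Wx.T @ x                                    # projective feature matrix, d x l
```

For recognition, the nearest-neighbor distance between two images is then computed between their d × l feature matrices instead of the full h × l images.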
4 Experiments and Discussion

In this section, we design image recognition experiments to test the performance of the 2DCOPLS method on the ORL database, the Yale database and part of a FERET sub-database, respectively. All experiments are carried out on a PC with an Intel Core 2 at 1.83 GHz and 1.5 GB of memory, on the MATLAB 7.5 software platform.

The ORL database (http://www.cam-orl.co.uk) contains images of 40 individuals, each providing 10 different images. For some subjects, the images were taken at different times. The facial expressions (open or closed eyes, smiling or non-smiling) and facial details (glasses or no glasses) also vary. The images were taken with a tolerance for tilting and rotation of the face of up to 20 degrees, and there is also some variation in scale of up to about 10 percent. All images are grayscale and normalized to a resolution of 92 × 112 pixels. The Yale database (http://cvc.yale.edu/projects/yalefaces/yalefaces.html) contains 165 grayscale images of 15 individuals, one image per individual for each of the following facial expressions or configurations: center-light, happy, left-light, with/no glasses, normal, right-light, sad, sleepy, surprised, and wink. All images are cropped to a size of 120×91 pixels. The partial FERET face sub-database comprises 400 gray-level frontal-view face images of 100 individuals; each individual has two images (fa and fb) with different facial expressions. The images are pre-processed by the methods presented in [16]: they are normalized with respect to eye locations and cropped to a size of 130×150 pixels.

We randomly select five image samples per individual for training and the remaining five for testing on the ORL and Yale databases. On the FERET sub-database, two samples per individual are randomly selected for training and two for testing. In the experiments with 1DPCA, 1DCCA and 1DPLS, we first reduce the image dimension by PCA until 90% of the image energy is kept. In the 2D case, the image size is reduced to 1/4 of the original on the ORL and Yale databases, and to 1/5 on the FERET database. In the experiments, we use the image samples and the class membership matrix Z_1 given in Sect. 3.1, and the nearest neighbor classifier is employed. The experiments are repeated 20 times and the best average results are shown in Table 1; data in parentheses are the feature dimensions corresponding to the accuracy. The results obtained by PCA and CCA with the same samples and conditions are also shown. The time elapsed corresponding to the best accuracy on each database is shown in Table 2.
Table 1. The best results on the databases (recognition accuracy; feature dimensions in parentheses)

Database  PCA          CCA          CPLS         NIPLS        COPLS        2DPCA        2DCCA        2DNIPLS      2DCOPLS
ORL       0.9405 (39)  0.9535 (18)  0.9490 (38)  0.9490 (38)  0.9533 (36)  0.9545 (16)  0.9590 (6)   0.9503 (23)  0.9638 (2)
Yale      0.7433 (17)  0.8893 (41)  0.7647 (14)  0.7700 (15)  0.7900 (15)  0.8160 (22)  0.9453 (23)  0.8213 (16)  0.9093 (17)
FERET     0.7876 (62)  0.8255 (66)  0.7874 (61)  0.7884 (63)  0.7942 (61)  0.8354 (21)  0.8388 (25)  0.8202 (24)  0.8331 (15)
From Table 1, we can see that both 2DNIPLS and 2DCOPLS work effectively in image recognition. Their efficiency is comparable to 2DPCA and 2DCCA, each method having its strong points on the three databases. Compared with 1DPLS, the best recognition accuracy of 2DCOPLS rises by 1%, 12% and 4% on the ORL, Yale and FERET databases respectively. In the experiments we also found that the error rate of 2DCOPLS descends more quickly than that of the other PLS methods as the feature dimension increases. From Table 2, we can see that the time elapsed with 2DNIPLS is less than with the other PLS methods. 2DCOPLS becomes less efficient as the number of training samples and the image size grow. For example, with 200 training samples, 200 testing samples, and image sizes of 28×23 (scale 1/4) on ORL and 30×26 (scale 1/5) on FERET, the time elapsed for 2DCOPLS feature extraction on FERET is more than tenfold that on ORL.

Table 2. Time elapsed corresponding to the best accuracy on the databases (s)

Database  Sample numbers  Image size  CPLS  NIPLS  COPLS  2DNIPLS  2DCOPLS
ORL       400             112×92      14.7  15.3   17.8   8.8      226.7
Yale      165             120×91      3.3   3.7    4.0    3.2      19.5
FERET     400             150×130     19.1  19.4   20.8   15.1     2458
From the process of solving for the projective vectors, we know that 2DCOPLS is an effective method for image recognition whether or not the total-class scatter matrices are singular. On the other hand, 2DCOPLS consumes more space and time than the other PLS methods mentioned in this paper. So we should consider all the factors, such as image size and sample number, when selecting an appropriate method for recognition; for example, 2DNIPLS may be a good choice in some cases.
5 Conclusion

We presented improved PLS methods, called 2DNIPLS and 2DCOPLS, which are efficient and robust for image recognition. The proposed methods use the image matrices directly to extract features, instead of a matrix-to-vector transformation, which effectively avoids the singularity of the total-class scatter matrices. Furthermore, 2DCOPLS can achieve better recognition accuracy than other
PLS-based methods, since conjugate orthogonality constraints are imposed on the directions in both the X and Y spaces. In theory, any two projective features extracted by 2DCOPLS are uncorrelated, so the optimal discriminant projective features can be extracted. We also point out that 2DCOPLS is more complicated than the other PLS-based methods, and its space and time costs grow quickly with increasing sample size and number. Acknowledgements. We wish to thank the National Science Foundation of China, under Grant No. 60773172, for supporting our research.
References

1. Wold, H.: Estimation of Principal Components and Related Models by Iterative Least Squares. In: Multivariate Analysis. Academic Press, New York (1966)
2. Wold, S., Sjölström, M., Erikson, L.: PLS-Regression: A Basic Tool of Chemometrics. Chemometrics and Intelligent Laboratory Systems 58, 109–130 (2001)
3. Barker, M., Rayens, W.: Partial Least Squares for Discrimination. Journal of Chemometrics 17, 166–173 (2003)
4. Wold, H.: Path Models with Latent Variables: The NIPALS Approach. In: Blalock, H.M. (ed.) Quantitative Sociology: International Perspectives on Mathematical and Statistical Model Building, pp. 307–357. Academic Press, London (1975)
5. Höskuldsson, A.: PLS Regression Methods. Journal of Chemometrics 2, 211–228 (1988)
6. Liu, Y.-S., Rayens, W.: PLS and Dimension Reduction for Classification. Computational Statistics 22, 189–208 (2007)
7. Yang, J., Yang, J.-Y., Jin, Z.: A Feature Extraction Approach Using Optimal Discriminant Transform and Image Recognition. Journal of Computer Research & Development 38, 1331–1336 (2001)
8. Frank, I.E., Friedman, J.H.: A Statistical View of Some Chemometrics Regression Tools. Technometrics 35, 109–135 (1993)
9. Han, L.: Kernel Partial Least Squares for Scientific Data Mining. PhD thesis, Rensselaer Polytechnic Institute, Troy, New York (2007)
10. Arenas-García, J., Petersen, K.B., Hansen, L.K.: Sparse Kernel Orthonormalized PLS for Feature Extraction in Large Data Sets. In: Advances in Neural Information Processing Systems, vol. 19. MIT Press, Cambridge (2007)
11. Baek, J.-S., Kim, M.: Face Recognition Using Partial Least Squares Components. Pattern Recognition 37, 1303–1306 (2004)
12. Jacob, A.: A Survey of Partial Least Squares Methods, with Emphasis on the Two-block Case. Technical Report, Department of Statistics, University of Washington, Seattle (2000)
13. Trygg, J., Wold, S.: Orthogonal Projections to Latent Structures. Journal of Chemometrics 16, 119–128 (2002)
14. Yang, J., Zhang, D., Frangi, A.F., Yang, J.-Y.: Two-Dimensional PCA: A New Approach to Appearance-Based Face Representation and Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 26, 131–137 (2004)
15. Lee, S.-H., Choi, S.: Two-Dimensional Canonical Correlation Analysis. IEEE Signal Processing Letters 14, 735–738 (2007)
16. Bolme, D.S., Beveridge, J.R., Teixeira, M., Draper, B.A.: The CSU Face Identification Evaluation System: Its Purpose, Features, and Structure. In: Proceedings of 3rd International Conference on Computer Vision Systems (ICVS), pp. 304–313 (2003)
A Novel Method of Creating Models for Finite Element Analysis Based on CT Scanning Images

Liulan Lin, Jiafeng Zhang, Shaohua Ju, Aili Tong, and Minglun Fang

Rapid Manufacturing Engineering Center, Shanghai University, 99 Shang Da Road, 200444 Shanghai, China
{linliulan}@staff.shu.edu.cn
Abstract. A novel method of creating models for finite element analysis (FEA) from medical images is proposed in this paper. CT scanning images of a human right hand were imported into the medical image processing software Mimics, and the 3D STL model of the bone framework was reconstructed by selecting a proper threshold value. A piece of the radius was cut from the bone framework model and remeshed in Magics to obtain triangles of higher quality and optimized quantity. The remeshed radius model was exported into the FEA software ANSYS to create the volume mesh, and a unidirectional loading simulation was analyzed. This method eliminates the need for extensive, long-duration experiments and provides a helpful tool for biomedicine and tissue engineering. Keywords: Finite element analysis; CT scanning images; STL.
1 Introduction

Recently, the finite element (FE) modeling technique in combination with computed tomography (CT) imaging has become an important tool for characterizing bone mechanics [1,2]. Although the resolution of CT images is not as good as that obtained from micro-imaging techniques, it is sufficient to provide a basis for generating FE models that represent bones in vivo. FEA methods have been used to determine mechanical stresses during anatomical function and the strength of tissue segments, to predict failure modes and causes, and to suggest possible remedies [3,4]. To generate FE models, traditional meshing procedures have been developed, the most commonly applied being the voxel conversion technique, which provides meshes of hexahedral elements, and the marching cubes algorithm, which provides meshes of tetrahedral elements [5-7]. However, these methods are inefficient, exceeding the time desired for optimal clinical treatment, and they create a large number of elements and nodes to represent the FE model, which hinders interactive operation by the users. An alternative meshing strategy for creating models for FEA from CT images is proposed in this paper, comprising an area mesh optimization and solid mesh

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 216–221, 2008. © Springer-Verlag Berlin Heidelberg 2008
Fig. 1. Schematic of the computer assisted analysis method
creation method. The entire process of the method is shown in Fig. 1. The method was demonstrated by a uniaxial pressure analysis simulating the stress distribution in a human hand bone, with accurate results.
2 Methods

2.1 Modeling for FE

One human hand was scanned by computed tomography (CT) with a scan distance of 0.1 mm. A total of 208 slices were acquired in about 10 min. The different bone tissues visible on the scans were segmented using an interactive medical image control system (MIMICS 10.01, Materialise, Leuven, Belgium). MIMICS imports CT data in a wide variety of formats and allows extended visualization and segmentation functions based on image density thresholding. 3D objects were automatically created by growing a threshold region on the entire stack of scans (Fig. 2A). These objects were then exported as STL files into rapid prototyping software (Magics X, Materialise, Leuven, Belgium), where a part of the radius was cut out with the cutting operation (Fig. 2B).
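Mimics' density thresholding has a straightforward analogue that can be sketched with NumPy alone; the synthetic volume and the Hounsfield-style threshold of 600 below are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Synthetic CT stack: air ~ -1000 HU, soft tissue ~ 40 HU, "bone" block ~ 1200 HU
rng = np.random.default_rng(4)
vol = np.full((20, 32, 32), -1000.0)
vol[2:18, 4:28, 4:28] = 40.0             # soft tissue envelope
vol[5:15, 8:24, 8:24] = 1200.0           # dense bone block
vol += rng.normal(0.0, 20.0, vol.shape)  # acquisition noise

threshold = 600.0                        # bone threshold (assumed value)
mask = vol > threshold                   # binary segmentation of all slices
bone_voxels = int(mask.sum())            # region to grow into a 3D object
```

A full pipeline would additionally keep only the largest connected component of `mask` (the "region growing" step) before extracting an STL surface from it.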
Fig. 2. FE modeling. (A) CT scan data as seen in MIMICS 10, and the 3D representation of the human hand bone reconstructed in MIMICS. (B) A part of the radius (green) cut out with the cutting operation in Magics.
2.2 Mesh Generation

The REMESH module attached to Magics was used to reduce the number of triangles automatically while simultaneously improving their quality and maintaining the geometry. During remeshing, the tolerated variation from the original data can be specified. Because of the loaded and constrained faces, local optimizations were applied to those faces. The overall quality is defined as a measure of the triangle height/base ratio, so that the file can be imported into the finite element analysis software package without generating any problems (Fig. 3B). This step took about 15 min.
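The triangle height/base quality measure can be sketched for a single STL facet. The exact definition below (height over the longest edge, divided by that edge's length) is our assumption, chosen so that an equilateral triangle scores the optimum √3/2 ≈ 0.866 while degenerate slivers score near 0:

```python
import numpy as np

def triangle_quality(p0, p1, p2):
    """Height/base ratio of a triangle given its 3D vertices
    (assumed quality definition: height over longest edge / edge length)."""
    p0, p1, p2 = (np.asarray(p, dtype=float) for p in (p0, p1, p2))
    edges = [p1 - p0, p2 - p1, p0 - p2]
    base = max(np.linalg.norm(e) for e in edges)        # longest edge
    area = 0.5 * np.linalg.norm(np.cross(p1 - p0, p2 - p0))
    height = 2.0 * area / base                          # area = base*height/2
    return height / base

good = triangle_quality([0, 0, 0], [1, 0, 0], [0.5, np.sqrt(3) / 2, 0])
bad = triangle_quality([0, 0, 0], [1, 0, 0], [0.5, 0.01, 0])  # sliver
```

A remesher would iterate edge flips, collapses and vertex relocations until every facet's score clears a user-specified bound.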
Fig. 3. Meshing. (A) STL file of the part of radius obtained through Magics. (B) Radius STL file optimized for FEA using the REMESH module within Magics.
Fig. 4. Volumetric meshes (element type: Solid186, 20-node)
The optimized STL file of the radius was then imported into the finite element analysis software (ANSYS Inc., USA) for generation of the volumetric mesh and assignment of material properties (Fig. 4). Before the volumetric mesh operation, the loading and
constraining faces should each be merged into a single entity by a Boolean operation. The radius model was meshed with tetrahedral elements (20 nodes).

2.3 FEA Validation

For validation of the model, a pressure of 1 MPa was applied to the top face of the radius (Fig. 5). Material properties obtained in other studies were also considered. A finite element model for strength analysis of the radius under compression was thus established, and the characteristics of the stress distribution and its location are determined according to the model.
Fig. 5. Loading and constraining model
3 Results and Discussion

A series of meshing operations on the STL file of the radius were implemented before the stress distribution analysis. The numbers of elements and nodes in these operations are given in Table 1. Before optimization of the STL file obtained through Magics (Fig. 3A), the number of elements was 4806, with no nodes. Using the Magics REMESH module, the number of elements was reduced by 20%. The two meshing steps have no node counts because the mesh is only a surface mesh. The optimized STL file of the radius was then imported into the finite element analysis software (ANSYS Inc., USA) for generation of the volumetric mesh: the number of elements increased from 3826 to 25082, and the number of nodes is 37791. The meshing approach used in this study suggests that maximum anatomical detail is obtained by surface/interface-based meshing using stereolithography (STL) surface data. The different parts of the model featuring different mechanical properties are identified first (segmentation process) and meshed accordingly. The very
Table 1. Comparison of each meshing step by number of elements and nodes

            Before optimization   After optimization   Volumetric mesh
  Elements  4806                  3826                 25082
  Nodes     –                     –                    37791
Fig. 6. Stress distribution
user-friendly graphic interface allows rapid modification of the different parts and generation of a new STL file that can be instantly exported and volumetrically meshed in the FEA program. Results of the finite element analysis show the stress distribution in the radius of the reconstructed model, which was based on CT scan data (Fig. 6). The applicability of the method is supported by the stress distribution results: the minimum stress was distributed on the constraining face of the model, and the stress distribution of this 3D digital model was continuous. The potential use of the model was demonstrated using nonlinear contact analysis to simulate compression loading. It has proven to be a useful tool for understanding the biomimetic approach in restorative bone grafts.
4 Conclusion

This method of creating models for finite element analysis (FEA) from medical images could eliminate the need for extensive and time-consuming experiments. The efficiency and accuracy of the image processing, 3D reconstruction, STL file remeshing and FEA volume mesh generation of this method were validated in this paper. This methodology
could facilitate optimization and understanding of biomedical devices prior to animal and human clinical trials.

Acknowledgments. The authors would like to acknowledge the support of the Shanghai Academic Excellent Youth Instructor Special Foundation and the Postdoctoral Science Fund (No. 20070410715).
References 1. Pistoia, W., Rietbergen, B.V., Lochmuller, E.M., Lill, C.A., Eckstein, F., Rüegsegger, P.: Estimation of Distal Radius Failure Load with Micro-finite Element Analysis Models Based on Three-Dimensional Peripheral Quantitative Computed Tomography Images. Bone 30(6), 842–848 (2002) 2. Zannoni, C., Mantovani, R., Viceconti, M.: Material Properties Assignment to Finite Element Models of Bone Structures: A New Method. Medical Engineering & Physics 20, 735– 740 (1998) 3. Cattaneo, P.M., Dalstra, M., Melsen, B.: The Finite Element Method: A Tool to Study Orthodontic Tooth Movement. J. Dent. Res. 84(5), 428–433 (2005) 4. Su, R., Campbell, G.M., Boyd, S.K.: Establishment of an Architecture-Specific Experimental Validation Approach for Finite Element Modeling of Bone by Rapid Prototyping and High Resolution Computed Tomography. Medical Engineering & Physics 29, 480–490 (2007) 5. Chevalier, Y., Pahr, D., Allmer, H., Charlebois, M., Zysset, P.: Validation of a Voxel-Based FE Method for Prediction of the Uniaxial Apparent Modulus of Human Trabecular Bone Using Macroscopic Mechanical Tests and Nanoindentation. Journal of Biomechanics 40, 3333–3340 (2007) 6. MacNeil, J.A., Boyd, S.K.: Bone Strength at the Distal Radius can Be Estimated from HighResolution Peripheral Quantitative Computed Tomography and the Finite Element Method. Bone 42, 1203–1213 (2008) 7. Ulrich, D., Rietbergen, B.V., Weinans, H., Rüegsegger, P.: Finite Element Analysis of Trabecular Bone Structure: A Comparison of Image-Based Meshing Techniques. Journal of Biomechanics 31, 1187–1192 (1998)
Accelerating Computation of DNA Sequence Alignment in Distributed Environment Tao Guo1, Guiyang Li1, and Russel Deaton2 1
College of Computer Science, Sichuan Normal University, 610066 Chengdu, China {tguo,gyli}@sicnu.edu.cn 2 College of Computer Science and Engineering, University of Arkansas, 72701 Fayetteville, USA
[email protected]

Abstract. Sequence similarity and alignment are among the most important operations in computational biology. However, analyzing large sets of DNA sequences is impractical on a regular PC. Using multiple threads with the JavaParty mechanism, this project successfully extended the capabilities of regular Java to a distributed environment for the simulation of DNA computation. With the aid of JavaParty and the design of multiple threads, the results of this study demonstrate that the modified regular Java program can perform parallel computing without using RMI or socket communication. In this paper, an efficient method for modeling and comparing DNA sequences with dynamic programming and JavaParty is proposed, and results of this method in a distributed environment are discussed.
1 Introduction

DNA contains the genetic information of cellular organisms. It consists of polymer chains, or DNA strands, each containing a linear chain of nucleotides or bases. With the development of modern methods for DNA sequencing, a huge amount of DNA sequence data has been generated so far. However, mining these voluminous sequence databases for useful information lags behind because of the problem's complexity [1]. During the past years, various heuristic methods like FASTA [2] and BLAST [3], as well as dynamic programming methods such as Smith-Waterman [4], have been reported for identifying homologous sequences. Some of these methods have been shown to be very promising. Janaki pointed out that it is impossible for a current single-processor computer to handle such voluminous DNA sequences [5]. JavaParty provides a distributed platform and can be used for DNA computation when an appropriate computing algorithm is selected. Currently, no literature has reported DNA sequence comparison using JavaParty combined with a dynamic programming algorithm in a distributed environment for parallel computation. In this paper, a dynamic programming algorithm running in a distributed JavaParty environment is proposed to accelerate DNA sequence computation. The dynamic programming algorithm, thread generation, concurrent DNA computation, and the validity of this method are addressed below.

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 222–228, 2008. © Springer-Verlag Berlin Heidelberg 2008
Accelerating Computation of DNA Sequence Alignment in Distributed Environment
223
In this method, each generated thread is sent to a virtual machine in the JavaParty runtime environment to perform DNA sequence comparison concurrently. JavaParty classes in this method can be declared as remote objects targeted to different standard virtual machines, implementing a distributed computation of DNA sequences. The outline of this new method of DNA computing is described by the flow diagram in Figure 1.

Fig. 1. DNA computing in the JavaParty distributed environment (flow: JavaParty environment → dynamic programming for DNA sequence comparison → multiple threads → DNA sequence comparison)
2 Methods for Sequence Computation

Sequence alignment is one of the most important operations in computational biology, facilitating everything from identification of gene function to structure prediction of proteins. An alignment of two sequences shows how similar the two sequences are, where there are differences between them, and the correspondence between similar subsequences; sequence alignment thus represents important information for biologists. To find the optimal alignment score Mi,j of two sequences X[1...i] and Y[1...j], three steps of sequence alignment computation are considered in this method:

1) Create a matrix and perform an initialization
To find the alignment, the first step of the dynamic programming approach is to create a matrix with M + 1 columns and N + 1 rows, where M and N correspond to the sizes of the sequences to be aligned.

2) Calculate the score of each cell in the matrix
The matrix fill step finds the maximum global alignment score by starting in the upper left-hand corner of the matrix and finding the maximal score Mi,j for each position. In order to find Mi,j for any i, j it is necessary to know the scores for the matrix positions to the left of, above, and diagonal to i, j; in terms of matrix positions, these are Mi-1,j, Mi,j-1 and Mi-1,j-1.
224
T. Guo, G. Li, and R. Deaton
For each position, Mi,j is defined to be the maximum score at position i,j:

Mi,j = max[ Mi-1,j-1 + Si,j,  Mi,j-1 + w,  Mi-1,j + w ]    (1)

where Si,j denotes match or mismatch on the diagonal (Si,j = 1 for a match, Si,j = -1 otherwise), and w is the gap penalty in sequences M and N (default value 0).
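As an illustrative sketch (ours, in plain Python rather than the paper's JavaParty implementation), the initialization and matrix-fill steps with the recurrence of Eq. (1) can be written as follows; the scoring values 1/-1 and the default gap penalty w = 0 are those stated in the text.

```python
def fill_matrix(x, y, match=1, mismatch=-1, w=0):
    """Build the (len(x)+1) x (len(y)+1) score matrix of Eq. (1):
    M[i][j] = max(M[i-1][j-1] + S, M[i][j-1] + w, M[i-1][j] + w)."""
    m, n = len(x), len(y)
    # initialization: the first row and first column stay zero
    M = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1] + s, M[i][j - 1] + w, M[i - 1][j] + w)
    return M

# maximum global alignment score for the example pair used in this section
score = fill_matrix("GAATTCAGTTA", "GGATCGA")[-1][-1]
```

With these parameters the bottom-right cell for the example pair is 6, matching the maximum value of the score matrix shown in Fig. 2.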
3) Trace back to get the sequence alignment and compute the length of a Longest Common Subsequence (LCS)
After the matrix is filled with scores, the maximum alignment score for the two test sequences is obtained. The traceback step determines the actual alignment(s) that result in the maximum score. Assume two DNA sequences X = <GAATTCAGTTA> and Y = <GGATCGA>. One possible maximum alignment is shown in Figure 2, from which an alignment of the two DNA sequences is deduced as follows:

GAATTCAGTTA
| | ||  |  |
GGA_TC_G__A

Fig. 2. A matrix of scores comparing the two DNA sequences; continuous high-scoring matches are highlighted
One way to measure the similarity of strands M and N is to find a third strand L in which the bases of L appear in each of M and N; these bases must appear in the same order, but not necessarily consecutively. The longer the strand L is, the more similar M and N are [5]. The running time of the procedure is O(mn) (O(n²) when the two sequences have equal length n), and constructing an LCS takes O(m+n) time [5].
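A minimal sketch of the LCS computation just described (our plain-Python illustration, not the paper's JavaParty code): the O(mn) fill is followed by an O(m+n) traceback that reconstructs one longest common subsequence.

```python
def lcs(m_seq, n_seq):
    """Return the LCS length and one LCS string via the O(mn) dynamic
    program; the traceback itself takes O(m+n) steps."""
    m, n = len(m_seq), len(n_seq)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if m_seq[i - 1] == n_seq[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    # trace back from the bottom-right corner to build one LCS
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if m_seq[i - 1] == n_seq[j - 1]:
            out.append(m_seq[i - 1]); i -= 1; j -= 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return L[m][n], "".join(reversed(out))

length, sub = lcs("GAATTCAGTTA", "GGATCGA")  # length is 6
```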
3 JavaParty for Parallel Computing

JavaParty was first designed and built by Michael Philippsen and Matthias Zenger in 1996 [6]. It combines Java-like programming and the concepts of distributed shared
memory in heterogeneous networks. JavaParty is a "two-purpose platform" [7]: it serves as a programming environment for cluster or parallel applications, and it provides a basis for computer science research into optimization techniques for improving performance. With JavaParty, remote classes and their instances are visible and accessible through the entire distributed JavaParty environment. This mechanism allows objects to be used locally at the cost of a pointer indirection instead of expensive OS communication overhead [7]. JavaParty makes it easy to turn a multi-threaded Java program into a distributed one by identifying the relevant classes and threads. In this respect, JavaParty is an optimal way to program clusters of workstations and workstation-based parallel computers with Java. Haumacher [8] pointed out that JavaParty has already been used successfully for transparent distributed threads.
4 Multi-threads and Concurrent Computing

4.1 Multi-threads for Sequence Comparison

In this new method, a distributed DNA sequence comparison with dynamic programming was designed to run in separate multiple threads under the JavaParty environment. Each thread performs sequence alignment concurrently. The algorithm for generating multiple threads is shown below.

Function for multi-thread generation
Function GENERAT-THREAD (int-seq, tar_seq, parameter)
    interest_seq ← interestGen
    target_seq1

For n > 1 the kernels take into account the non-linearity of the system.

2.1 First Order Volterra Systems

A first order Volterra system is one for which the superposition principle holds. It is equivalent to a linear system. For input x(t) =
Σ_n c_n x_n(t), the output is

y(t) = H1[x(t)] = H1[ Σ_n c_n x_n(t) ] = Σ_n c_n H1[x_n(t)] = Σ_n c_n y_n(t)    (2.1.1)

where H1 is the first order Volterra operator and y_n(t) is the output of the system corresponding to each x_n(t).
Consider a system with sinusoidal input

x(t) = A cos(ωt)    (2.1.2)

Expanding this expression by Euler's formula, x(t) = (A/2)e^{jωt} + (A/2)e^{-jωt}, we get

y1(t) = ∫ h1(τ) e^{jω(t-τ)} dτ = H1ω(jω) e^{jωt}

Similarly, y2(t) = H1ω*(jω) e^{-jωt} = y1*(t), where H1ω(-jω) = H1ω*(jω). Now,

y(t) = (A/2)[y1(t) + y1*(t)] = A Re{ H1ω(jω) e^{jωt} }    (2.1.3)

This result will be used for the application of Volterra series to mechanical systems.
480
C. Bharathy et al.
2.2 Second Order Volterra Systems

A non-linear system can be formed by the multiplication of homogeneous linear systems [1], as shown in Fig. 1.

Fig. 1. A second-order system: the input x(t) feeds two linear systems ha(t) and hb(t), whose outputs za(t) and zb(t) are multiplied to give the output y(t)
The responses of the individual homogeneous systems are

Za(t) = ∫ ha(τ1) x(t - τ1) dτ1
Zb(t) = ∫ hb(τ2) x(t - τ2) dτ2

The final response of the system is

y(t) = Za(t) × Zb(t) = ∫ ha(τ1) x(t - τ1) dτ1 · ∫ hb(τ2) x(t - τ2) dτ2
     = ∫∫ h2*(τ1, τ2) x(t - τ1) x(t - τ2) dτ1 dτ2    (2.2.1)
in which h2*(τ1, τ2) = ha(τ1) hb(τ2). This kernel is not symmetric if ha(τ1) hb(τ2) ≠ hb(τ1) ha(τ2), and hence h2*(τ1, τ2) ≠ h2*(τ2, τ1). If asymmetric kernels are taken into account, the results obtained may not be unique. Any asymmetric kernel can be symmetrized by summing the kernel over all permutations of its arguments and dividing by the number of permutations [3]. A second order operator is one for which the response to a linear combination of signals is a bilinear operation [4]. For input

xδ(t) = Σ_k δ x(kδ) uδ(t - kδ)

where xδ(t) is the staircase approximation of x(t) and uδ(t) is the basic waveform defined as uδ(t) = 1/δ for |t| < δ and 0 elsewhere.
An Introduction to Volterra Series and Its Application on Mechanical Systems
481
Then the response of the system is

yδ(t) = H2[xδ(t)] = Σ_m Σ_n δ x(mδ) δ x(nδ) H2[uδ(t - mδ), uδ(t - nδ)]

where H2 is the second order Volterra operator. The time function corresponding to each bilinear operation is h2(t - mδ, t - nδ, δ) = H2[uδ(t - mδ), uδ(t - nδ)]. Using the above equations, y(t) reduces to

y(t) = ∫∫ h2(τ1, τ2) x(t - τ1) x(t - τ2) dτ1 dτ2    (2.2.2)
where h2(τ1, τ2) is the Volterra kernel. Now, when the system of Fig. 1 is excited with a sinusoidal input, we get

y(t) = H2[x(t)] = H2[x1(t) + x2(t)]
     = H2[x1(t)] + H2[x2(t)] + H2[x1(t), x2(t)] + H2[x2(t), x1(t)]    (2.2.3)

where x1(t) = (A/2) e^{jωt} and x2(t) = (A/2) e^{-jωt}. Expanding each term of Eq. 2.2.3 we get

H2[x1(t)] = ∫∫ h2(τ1, τ2) x1(t - τ1) x1(t - τ2) dτ1 dτ2
          = A²/4 ∫∫ h2(τ1, τ2) e^{jω(t-τ1)} e^{jω(t-τ2)} dτ1 dτ2
          = A²/4 e^{j2ωt} ∫∫ h2(τ1, τ2) e^{-jωτ1} e^{-jωτ2} dτ1 dτ2
          = A²/4 H2ω(jω, jω) e^{j2ωt}

Similarly,

H2[x2(t)] = A²/4 H2ω(-jω, -jω) e^{-j2ωt}

Now,

H2[x1(t), x2(t)] = ∫∫ h2(τ1, τ2) x1(t - τ1) x2(t - τ2) dτ1 dτ2 = A²/4 H2ω(jω, -jω)

and similarly

H2[x2(t), x1(t)] = A²/4 H2ω(-jω, jω)
Adding these terms we obtain

y2(t) = A²/2 [ Re{ H2ω(jω, jω) e^{j2ωt} } + Re{ H2ω(jω, -jω) } ]    (2.2.4)
Similarly, applying the third order operator to the same input, we get

y3(t) = 2(A/2)³ Re{ H3ω(jω, jω, jω) e^{j3ωt} } + 6(A/2)³ Re{ H3ω(jω, jω, -jω) e^{jωt} }    (2.2.5)
By inspection it can be found that the argument of the system function of a linear system following a multiplier is the sum of the arguments of the system functions of the linear systems preceding it. So from Fig. 2 it can be deduced that H3ω(p, q, r) can be expressed in terms of H1ω as

H3ω(p, q, r) = (b/3!) H1ω(p) H1ω(q) H1ω(r) H1ω(p+q+r)    (2.2.6)

2.3 Determination of Kernels

When an equation giving the input-output relationship of a system is given, it is possible to determine the Volterra kernels using the Harmonic Input method and the Direct Expansion method [2].

2.3.1 Harmonic Input Method
The harmonic input method is used for determining the kernels in the frequency domain; it is best suited when n is small. For an input

x(t) = e^{jω1t} + e^{jω2t} + ... + e^{jωnt}

n! Hnω(jω1, ..., jωn) = coefficient of e^{j(ω1 + ω2 + ... + ωn)t}

2.3.2 Direct Expansion Method
The direct expansion method is used for determining the kernels in the time domain; the system equations are manipulated in order to obtain the Volterra form. Suppose that in Fig. 1 another linear homogeneous system is added. Then by Eq. 2.2.1 we get

y(t) = y1(t) y2(t) y3(t) = ∫∫∫ h1(τ1) h2(τ2) h3(τ3) Π_{i=1..3} x(t - τi) dτ1 dτ2 dτ3

Comparing this equation with the Volterra series expression we obtain hn(τ1, ..., τn) = 0 for all n except n = 3, where we get h3(τ1, τ2, τ3) = h1(τ1) h2(τ2) h3(τ3) (ignoring the factorial). This kernel can further be symmetrized. The Direct Expansion method is best suited when the value of n is large.
3 Simulation

Algorithm:
Step 1: Represent the system in the form of a differential equation.
Step 2: Express its solution as a truncated Volterra series expansion.
Step 3: Determine the values of the kernels using the methods described in Section 2.3.
Step 4: Substitute the values of the kernels into this equation to determine the linear and the non-linear parts.

3.1 Volterra System for a Simple Pendulum

The equation of motion of a simple pendulum with linear damping [1] is given by

y(t) = x″(t) + a x′(t) + b sin x(t)    (3.1.1)

Until now we assumed sin x(t) ≈ x(t) for small x(t); here we use Volterra systems to take into account the non-linearity caused by the sine. Now y(t) can be expressed as

y(t) = H[x(t)] = H1[x(t)] + Hr[x(t)]

where H1[x(t)] is the linear term and Hr[x(t)] is the non-linear term. So

H1[x(t)] = x″(t) + a x′(t) + b x(t)

and

Hr[x(t)] = b [ sin x(t) - x(t) ] = b Σ_{n≥1} (-1)ⁿ [x(t)]^{2n+1} / (2n+1)!
Thus the solution is x(t) = H⁻¹[y(t)]:

x(t) = Σ_n Kn[y(t)],  where Kn = Hn⁻¹  and  K1(s) = 1/(s² + as + b)

The next step is replacing the forcing function y(t) by cy(t) and equating the coefficients of like powers of c. Now

x(t) = Σ_n cⁿ Kn[y(t)] = Σ_n cⁿ xn(t),  where xn(t) = Kn[y(t)]

By substitution we obtain

cy(t) = Σ_n cⁿ [ xn″(t) + a xn′(t) + b xn(t) ] - (b/3!) Σ c^{n1+n2+n3} xn1(t) xn2(t) xn3(t) + (b/5!) Σ c^{n1+...+n5} xn1(t) ... xn5(t) - ... and so on

Now equating coefficients of c, c², c³, c⁴, c⁵ we get

x1(t) = K1[y(t)],  x2(t) = 0
x3(t) = H1⁻¹[(b/3!) x1³(t)] = (b/3!) K1[x1³(t)]  (as shown in Fig. 2)
x4(t) = 0
x5(t) = (b/3!) K1[x1²(t) x3(t)] - (b/5!) K1[x1⁵(t)]

This was simulated for A = 1, b = 0.2, a = 2, and graphs were obtained as shown in Fig. 3.
Fig. 2. Block diagram of the operator K3: y(t) → K1 → x1(t) → cube-law device → [x1(t)]³ → K1 → K1{[x1(t)]³} → multiply by b/3! → x3(t) = K3[y(t)]
We can also analyze the simple pendulum using the results obtained before for the first and third order Volterra operators. In this case we take the response to be the sum of the first two non-zero harmonics (the even harmonics being zero):

x(t) = K1[y(t)] + K3[y(t)]

From the results in Eqs. 2.1.3, 2.2.5 and 2.2.6, and substituting K1ω(-jω) = K1ω*(jω), we get

x(t) = A Re{ K1ω(jω) e^{jωt} } + (3A³/4)(b/3!) Re{ K1ω³(jω) K1ω*(jω) e^{jωt} } + (A³/4)(b/3!) Re{ K1ω³(jω) K1ω(j3ω) e^{j3ωt} }
     = A |K1ω(jω)| cos(ωt - φ(ω)) + (bA³/8) |K1ω(jω)|⁴ cos(ωt - 2φ(ω)) + (bA³/24) |K1ω(jω)|³ |K1ω(j3ω)| cos(3ωt - 3φ(ω) - φ(3ω))

where

|K1ω(jω)| = [ (b - ω²)² + a²ω² ]^{-1/2}

and φ(ω) = tan⁻¹[aω/(b - ω²)] for ω² < b, and π - tan⁻¹[aω/(ω² - b)] for ω² > b.
Fig. 3. Simulation graphs of a simple pendulum: (a) linear part; (b) non-linear third harmonic
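The closed-form amplitudes above can be evaluated numerically, as in the sketch below (our illustration; the test frequency ω = 0.5 rad/s is an assumption, while A = 1, b = 0.2, a = 2 are the section's values and K1 comes from K1(s) = 1/(s² + as + b)).

```python
def K1(w, a=2.0, b=0.2):
    # first-order kernel transform K1w(jw) = 1 / ((jw)^2 + a*(jw) + b)
    s = 1j * w
    return 1.0 / (s * s + a * s + b)

def pendulum_amplitudes(A=1.0, w=0.5, a=2.0, b=0.2):
    """Amplitudes of the three cosine terms of the pendulum response:
    A|K1|, (b A^3/8)|K1|^4 and (b A^3/24)|K1|^3 |K1(j3w)|."""
    k1, k3 = abs(K1(w, a, b)), abs(K1(3 * w, a, b))
    fundamental = A * k1                                # linear part, at w
    correction = (b * A**3 / 8) * k1**4                 # non-linear term at w
    third = (b * A**3 / 24) * k1**3 * k3                # third harmonic, at 3w
    return fundamental, correction, third
```

For these parameters the non-linear corrections come out much smaller than the fundamental, consistent with the relative scales of the two panels in Fig. 3.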
3.2 Volterra System for a Non-Linear Spring

The equation of a non-linear spring can be expressed as

y(t) = m x″(t) + b [x′(t)]² - k x(t)    (3.2.1)

Now, taking y(t) to be the sum of the first two non-zero Volterra operators, we get

y(t) = H1[x(t)] + H2[x(t)]
Taking x(t) to be a sinusoidal input and using the results of Eqs. 2.1.3 and 2.2.4, we get

y(t) = A Re{ H1ω(jω) e^{jωt} } + A²/2 [ Re{ H2ω(jω, jω) e^{j2ωt} } + Re{ H2ω(jω, -jω) } ]

where H1ω and H2ω for the different arguments can be found by the harmonic input method for kernel determination explained in Section 2.3.1. Thus we get

H1ω(jω) = -mω² - k
H2ω(jω, jω) = -4bω²
H2ω(jω, -jω) = 2bω²
This problem was simulated for A = 1, b = 0.2, m = 200 kg, k = 5, and graphs were obtained as shown in Fig. 4.

Fig. 4. Simulation graphs of spring: (a) linear part; (b) non-linear second harmonic
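The term amplitudes of the spring response follow directly from the kernels above, as in this numeric sketch (our illustration; the frequency ω = 1 rad/s used in the example call is an assumption, the other values are the section's).

```python
def spring_terms(A=1.0, w=1.0, m=200.0, b=0.2, k=5.0):
    """Amplitudes of the terms of y(t) = A Re{H1 e^{jwt}}
    + (A^2/2)[Re{H2(jw,jw) e^{j2wt}} + Re{H2(jw,-jw)}], using
    H1 = -m w^2 - k, H2(jw,jw) = -4 b w^2, H2(jw,-jw) = 2 b w^2."""
    H1 = -m * w**2 - k
    fundamental = A * abs(H1)                 # linear part, at frequency w
    second = (A**2 / 2) * abs(-4 * b * w**2)  # second harmonic, at 2w
    dc = (A**2 / 2) * (2 * b * w**2)          # constant (zero-frequency) offset
    return fundamental, second, dc

terms = spring_terms()  # (205.0, 0.4, 0.2) for the parameters above
```

As in Fig. 4, the non-linear second harmonic is tiny relative to the stiff linear term dominated by m.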
4 Results

It is seen from the results obtained that incorporating the non-linearity of the system improves the accuracy of the results compared with those obtained by neglecting the non-linear terms. In the case of the simple pendulum, the non-linearity is incorporated by taking into account the non-linear terms up to the fifth order, thus considering the contribution of the sine term. Fig. 3 shows the results obtained for the simple pendulum by expanding the sine term, and Fig. 4 shows the results for the non-linear spring-mass system.
5 Conclusion

The Volterra series can be used as an effective means of developing a model of a non-linear system. We studied the Volterra series representation and the use of the Volterra series to represent first order and second order systems. This was then applied to non-linear mechanical systems, i.e., the simple pendulum and the spring-mass system, for which the Volterra kernels were determined. It is seen that as the order of the non-linearity increases, the complexity of kernel determination increases. Two methods were used to determine the kernels, namely the harmonic input method and the direct expansion method. The Volterra series can also be used to incorporate non-linearity in other systems such as electrical RLC networks. Further, the application of the Volterra series can be extended to 2-D space for noise removal and edge enhancement in images.
References 1. Schetzen, M.: The Volterra and Wiener Theories of Nonlinear Systems. Wiley, Chichester (1980) 2. Cherry, J.A.: Distortion Analysis of Weakly Nonlinear Filters Using Volterra Series 3. Rugh, W.J.: The Volterra/Wiener Approach. Johns Hopkins University Press, Baltimore (1981) 4. Alper, P.: A Consideration of the Discrete Volterra Series. IEEE Transactions Automatic Control 10 (July 1965)
Skin Detection from Different Color Spaces for Model-Based Face Detection Dong Wang, Jinchang Ren, Jianmin Jiang, and Stan S. Ipson School of Informatics, University of Bradford, BD7 1DP, U.K. {d.wang6,j.ren,j.jiang1,s.s.ipson}@bradford.ac.uk http://dmsri.inf.brad.ac.uk/
Abstract. Skin and face detection has many important applications in intelligent human-machine interfaces, reliable video surveillance and visual understanding of human activities. In this paper, we propose an efficient and effective method for frontal-view face detection based on skin detection and knowledge-based modeling. Firstly, skin pixels are modeled by supervised training, and boundary conditions are then extracted for skin segmentation. Faces are further detected by shape filtering and knowledge-based modeling. Skin results from different color spaces are compared. In addition, experimental results have demonstrated that our method is robust in detecting skin and face regions even under varying lighting conditions and poses.

Keywords: Skin detection, face detection, performance evaluation, semantic image indexing and retrieval.
1 Introduction

Automatic detection of skin and faces plays a very important role in many vision applications, such as face and gesture recognition in intelligent human-machine interfaces and visual surveillance [1,2,4], naked adult image detection [3,8], video telephony and sign language recognition [14,15], as well as content-based multimedia retrieval [5,13]. Usually, skin regions are segmented or detected by histogram matching, statistical classification and pixel-based thresholding or clustering. In [8], Jones and Rehg developed a general color model from color histograms in the R, G and B channels and adopted supervised learning, manually labeling skin pixels in 4675 images to acquire the probability that a given color belongs to the skin or non-skin class; they then tested the method on another 8965 images to detect skin and judge naked images. Saber and Tekalp employed a YES model to detect skin in color images by linear weighting of the R, G and B values, as in YUV space [9]. In [10], Hsu et al. used YCbCr space for skin detection in their face detection system and found that, after lighting compensation, their algorithm could detect skin pixels more accurately. In [11], Garcia and Tziritas compared skin detection results obtained from color clustering and found the results in YCbCr and HSV spaces quite equivalent. They also concluded that the cluster of skin colors is less compact in HSV space than in YCbCr space, and

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 487–494, 2008. © Springer-Verlag Berlin Heidelberg 2008
488
D. Wang et al.
HSV space is more sensitive to lighting variations. In addition, some adaptive models have also been proposed for skin detection [16-19]. After skin detection, faces can be detected via shape constraints, template matching, knowledge-based reasoning and statistical classification by Neural Networks (NN), Hidden Markov Models (HMM) or Support Vector Machines (SVM) [6, 9-11]. Since statistical or neural classification is always implemented by supervised or unsupervised learning, in which many face features are applied internally, face features and knowledge of the facial layout are central to face detection. Saber and Tekalp introduced an interesting algorithm to detect faces by locating the eyes, nose and mouth [9]; at the same time, they used an ellipse to approximate the shape of a face, as many others have [6, 10]. In [10], Hsu et al. also detected faces from skin regions based on the spatial arrangement of skin patches such as eyes, mouth and boundary maps, obtaining the face ellipse and the triangle of feature points formed by the eyes and mouth. In [11], Garcia and Tziritas presented another face detection method based on skin detection, region merging and constraints on shape, size and intensity texture analyzed by wavelet-packet decomposition, and they used rectangles to mark the detected faces. In [12], faces are detected using a fuzzy pattern matching scheme. In [20], merging skin regions for face detection by using wavelet analysis is proposed. From skin detection to face detection, quite a few algorithms exist with different results and conclusions [8-12], some of which even disagree with each other. Therefore, we investigate skin detection comparatively, employing the same methodology (supervised clustering) in different color spaces for the comparison, before our knowledge-based face modeling and detection.
2 Color Space Transform

Both linear and non-linear color spaces are examined in this paper for comparison. To provide some direct information about these color spaces, we summarize below how typical color space transforms, linear and non-linear, are defined. The transforms from RGB to YCbCr or YUV are linear: the three components of the new space are defined simply by linear weighting of the R, G and B values, with Y referring to the illumination intensity defined in (1),
Y = Σ_λ wλ·λ,  λ = R, G, B,  with Σ wλ = 1, wλ ≥ 0
U = B - Y,  V = R - Y    (1)

Generally, these linear transforms can be defined as

[ Y ]   [ wr   wg   wb  ] [ R ]
[ A ] = [ a10  a11  a12 ] [ G ]    (2)
[ B ]   [ a20  a21  a22 ] [ B ]
As hue is more effective than illumination intensity in distinguishing different colors, the HSV (Hue, Saturation, Value) and HSI (Hue, Saturation, Intensity) transforms
are taken as suitable color spaces that correspond to human visual perceptions and have been widely utilized in color clustering for image segmentation and coding. The RGB to HSV transform can be defined as [7]:
V = max(R, G, B);
S = V ' /V
V ' = V − M , M = min( R, G, B)
(3)
Let r ' = (V − R) / V ' , g ' = (V − G ) / V ' and b' = (V − B) / V ' , then H is given by
⎧5 + b' ⎪1 − g ' ⎪ 1 ⎪⎪1 + r ' H= ⎨ 6 ⎪3 − b' ⎪3 + g ' ⎪ ⎪⎩5 + r '
if R = V & G = M if R = V & G ≠ M if G = V & B = M if G = V & B ≠ M if B = V & R = M
(4)
otherwise
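The transform of Eqs. (3)-(4) can be sketched as below (our illustration: greys with V′ = 0, where hue is undefined, are mapped to H = 0, and the final branch uses the hexcone form 5 − r′). The result agrees with the standard HSV conversion, with H and S in [0, 1] before the later rescaling of H to [0, 255].

```python
def rgb_to_hsv_paper(r, g, b):
    """RGB -> (H, S, V) per Eqs. (3)-(4); H is folded into [0, 1)."""
    v = max(r, g, b)
    m = min(r, g, b)
    vp = v - m                      # V' = V - M
    if v == 0 or vp == 0:
        return 0.0, 0.0, v          # black or gray: hue is undefined
    s = vp / v                      # S = V'/V
    rp, gp, bp = (v - r) / vp, (v - g) / vp, (v - b) / vp
    if r == v:
        h = (5 + bp) if g == m else (1 - gp)
    elif g == v:
        h = (1 + rp) if b == m else (3 - bp)
    elif r == m:                    # B = V and R = M
        h = 3 + gp
    else:                           # B = V, otherwise
        h = 5 - rp
    return (h / 6.0) % 1.0, s, v
```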
3 Skin Segmentation

Since skin detection is a classification problem defined on color similarity, supervised clustering is applied to obtain exact rules for effective skin color clustering and pixel classification. By manually specifying representative skin and non-skin pixels, we can learn linear relationships between different components in the new color spaces, and finally obtain several main boundary conditions for skin pixel classification in the different color spaces. Firstly, skin pixels are modeled using the histogram-based approach, in which the probability, or likelihood, that each color represents skin is estimated by checking its occurrence ratio in the training data. In (5), Vskin indicates the volume, or total occurrences, of all skin colors in the manual ground truth of the training data:

p(color | skin) = sum(color | skin) / Vskin    (5)
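A minimal sketch of this histogram estimate (our illustration; colors are quantized RGB tuples, and the pixel values below are made up for the example):

```python
from collections import Counter

def skin_likelihood(skin_pixels):
    """Estimate p(color | skin) as in Eq. (5): the occurrence count of each
    color among labeled skin pixels, divided by Vskin (total occurrences)."""
    counts = Counter(skin_pixels)
    v_skin = sum(counts.values())
    return {color: n / v_skin for color, n in counts.items()}

# toy ground-truth sample of manually labeled skin pixels (hypothetical values)
model = skin_likelihood([(224, 172, 105), (224, 172, 105),
                         (198, 134, 66), (141, 85, 36)])
```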
Then, boundary conditions in the skin model are extracted so that more than 98% of skin pixels are covered. Using the boundary conditions, test images are segmented into skin and non-skin regions accordingly. For the different color spaces, these boundary conditions are found as follows. For YUV space, the boundary conditions are

148 ≤ V ≤ 185
189 ≤ U + 0.6V ≤ 215    (6)

Considering the variation of illumination intensity, we have the additional boundary conditions

Y > 85, or
Y < 85, U > 104, Y + U − V > 2    (7)
In HSV space, we scale H into [0, 255] and let H = 255 − H if H > 128. We also find several boundary conditions for skin pixels in HSV space, given in (8):

S ≤ 21, V ≥ 2.5S
158 ≤ H + V ≤ 400, H + V > 13S
H > 0.2V, H > 4S    (8)
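The boundary conditions of Eqs. (6)-(8) translate directly into per-pixel classifiers, as sketched below (our illustration; we read the rows of each equation as a conjunction and assume all components are on the 0-255 scale).

```python
def is_skin_yuv(y, u, v):
    """Eqs. (6)-(7): the U-V box test plus the illumination condition."""
    box = 148 <= v <= 185 and 189 <= u + 0.6 * v <= 215
    lighting = y > 85 or (y < 85 and u > 104 and y + u - v > 2)
    return box and lighting

def is_skin_hsv(h, s, v):
    """Eq. (8), with H pre-folded: H = 255 - H when H > 128."""
    if h > 128:
        h = 255 - h
    return (s <= 21 and v >= 2.5 * s
            and 158 <= h + v <= 400 and h + v > 13 * s
            and h > 0.2 * v and h > 4 * s)
```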
Fig. 1 gives skin results from the YUV and HSV spaces. There are three people in the background, who can be seen clearly in the histogram-equalized image. The skin regions can be successfully detected in HSV space from the original image, while they cannot be found in YUV space.
Fig. 1. Comparison of skin regions detected from YUV and HSV color spaces: (a) original image; (b) histogram equalization; (c) YUV skin results; (d) HSV skin results
4 Knowledge-Based Face Modeling and Detection

After skin detection, we need to locate faces in the candidate skin regions. Again, we detect faces of nearly frontal view, but there are no constraints on their leaning angles. Knowledge about the size, the size ratio, and the locations of the ears and mouth is used. Firstly, the detected skin regions are labeled to obtain the outer bounding rectangle and the pixel count of every region. Then, small regions with fewer pixels than a given threshold, i.e. 300, are removed. Finally, the skin regions are filtered by an SR parameter (width/height ratio) defined as
SR = width / height   if width ≤ height
SR = height / width   if width > height    (9)
In (9), the width and height of a region are determined by its rectangular bounding box, and we find that the valid SR for candidate face regions should lie in [0.55, 0.90]. To acquire more reasonable widths and heights of the regions, the main axis is extracted by moment calculation for each region; the skin regions are then rotated by the main axis angle so that the final main axis is in the vertical direction. Fig. 2(a) is the filtering result obtained by thresholding with the size of 300. In Fig. 2(b), the main axis of each labeled region is marked with a white line, and the angle and region number are also given. In Figs. 2(d) and 2(e), we give the candidate face regions in the rotated skin results and the skin regions before rotation in HSV space. Besides, Fig. 2(f) gives the face candidates in the original RGB image for comparison.
Fig. 2. Face filtering from the detected skin regions by thresholding of size and W/H ratio: (a) thresholded by size; (b) main axis detection; (c) rotation by main axis; (d) thresholded by W/H ratio; (e) face candidates; (f) face in original image
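The size and SR filtering described above can be sketched as follows (our illustration; regions are represented as simple dicts with a pixel count and bounding-box dimensions):

```python
def filter_face_candidates(regions, min_pixels=300, sr_lo=0.55, sr_hi=0.90):
    """Keep regions that pass the size threshold and whose SR parameter of
    Eq. (9) lies in [sr_lo, sr_hi]."""
    kept = []
    for reg in regions:
        if reg["pixels"] < min_pixels:
            continue                      # remove small regions
        w, h = reg["width"], reg["height"]
        sr = w / h if w <= h else h / w   # Eq. (9): always <= 1
        if sr_lo <= sr <= sr_hi:
            kept.append(reg)
    return kept

candidates = filter_face_candidates([
    {"pixels": 120, "width": 30, "height": 40},   # too small
    {"pixels": 900, "width": 60, "height": 80},   # SR = 0.75, kept
    {"pixels": 900, "width": 95, "height": 100},  # SR = 0.95, too square
])
```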
Three basic rules are used in the further face modeling and detection. First, there are one or two ears near the half height of every candidate face region, which makes the width of the skin region larger there than at other lines. Second, there are one or two eyes above the height of the ear line, which form one or two dark holes. Third, an open mouth forms a dark hole near the middle of the eyes, below the ear line. Our algorithm for face detection follows, with results given in Fig. 3:

1) Detect the ear line by extracting the local maximum width near the center of the candidate face region, see Fig. 3(a);
2) Detect the holes by illumination intensity difference: holes contain those pixels whose intensity is lower than the average intensity of the candidate region, say, less than 80% of the average intensity, see Fig. 3(b);
3) Judge the relative positions of the holes and the ear line, and determine whether the candidate region is a valid face or not.
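Step 2 of the algorithm (dark-hole detection) can be sketched as follows (our illustration with tiny hand-made arrays; a hole pixel is one inside the region whose intensity falls below 80% of the region's mean):

```python
def find_feature_holes(intensity, mask, ratio=0.8):
    """Mark pixels darker than `ratio` times the mean intensity of the
    masked candidate region; `intensity` and `mask` are 2-D lists."""
    vals = [intensity[i][j]
            for i in range(len(mask)) for j in range(len(mask[i])) if mask[i][j]]
    mean = sum(vals) / len(vals)
    return [[bool(mask[i][j]) and intensity[i][j] < ratio * mean
             for j in range(len(mask[i]))] for i in range(len(mask))]

holes = find_feature_holes(
    [[100, 100, 20], [100, 100, 100]],
    [[True, True, True], [True, True, True]],
)
```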
Fig. 3. Ear location (white line) and feature-hole detection for face detection: (a) ears location; (b) feature holes; (c) detected face; (d) mapped back
5 Results and Discussions

In our experiments, statistical models of skin colors are estimated through a histogram-based approach using a subset of the ECU database [5], in which 500 images are used for training.

492
D. Wang et al.

Afterwards, we generate 100 test images in office environments for evaluation. Results on skin detection from both the training images and our own images are summarized in Table 1 below. Although the results from the different color spaces are very comparable, YUV and HSV seem to yield slightly better performance among the linear and nonlinear color spaces, respectively. More results on skin and face detection are given in Fig. 4, along with detailed discussion.

Table 1. Skin detection results from different color spaces (TPR on skin, FPR on background)

              |  Linear color space             |  Nonlinear color space
Test data     |  YUV            |  YCbCr        |  HSV            |  HSI
              |  TPR     FPR    |  TPR    FPR   |  TPR     FPR    |  TPR     FPR
Trained       |  93.7%   8.1%   |  93.5%  7.9%  |  94.1%   8.2%   |  93.9%   8.3%
Non-trained   |  91.2%   10.2%  |  91.1%  9.9%  |  93.0%   10.1%  |  92.7%   10.1%
Overall       |  93.3%   9.3%   |  92.9%  9.2%  |  93.6%   9.5%   |  93.5%   9.6%

Fig. 4. Skin and face detection on an image of Peter and Tommy: (a) original image; (b) YUV skins; (c) HSV skins; (d) final face detected
As for skin detection, we find that skin regions detected in HSV space are more accurate and robust than those detected in YUV space, and skin regions in the background can also be detected easily in HSV space (see the face in Fig. 1 and the hand near the middle head in Fig. 4), which means HSV space is less sensitive to variations of illumination intensity. Thresholding by size and W/H ratio is very effective in removing non-face regions. Moreover, the face model composed of rules on the relative positions of the ears and the holes of the eyes or mouth is also very practical in face detection: ears can be found in almost every face image and remain robust for detection even when the face is rotated and the eyes are difficult to detect. Although our face detection algorithm achieves quite satisfactory results even under pose variations, several additional strategies could be applied for more robust face detection in our model, such as obtaining the W/H ratio more accurately when skin regions and holes are connected, and detecting the eyes and mouth when no holes can be found, especially for faces in the background.

With the detected regions of skin and face, semantic indexing and retrieval of images are achieved as follows:

1) According to whether skin and face regions can be detected, all images are automatically annotated as with or without skin/face regions, respectively;
2) For those with skin or face regions, the size and number of regions are also recorded;
3) For images with face regions, the estimated positions of the ears, etc. are also taken into the semantic index, which can be further used to estimate the pose of faces;
4) Finally, these indexes are utilized in semantic retrieval of images.
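The four indexing steps above can be sketched as simple record-building and filtering. This is a hypothetical sketch: all the names (`build_index`, `retrieve`, the record fields) are ours, not the authors'.

```python
# Hypothetical sketch of the four indexing steps: annotate each image by
# detected skin/face regions, record region statistics and ear positions,
# then retrieve images through the index.

def build_index(detections):
    """detections: {image: {"skin": [list of pixel lists],
                            "faces": [{"ears": (row, col)}]}}"""
    index = {}
    for image, det in detections.items():
        skin, faces = det.get("skin", []), det.get("faces", [])
        index[image] = {
            "has_skin": bool(skin),                      # step 1: annotation
            "has_face": bool(faces),
            "num_regions": len(skin),                    # step 2: size/number
            "region_sizes": [len(r) for r in skin],
            "ear_positions": [f["ears"] for f in faces]  # step 3: pose cue
        }
    return index

def retrieve(index, with_face=True):                     # step 4: retrieval
    return [img for img, rec in index.items() if rec["has_face"] == with_face]

idx = build_index({
    "office1.jpg": {"skin": [[(0, 0), (0, 1)]], "faces": [{"ears": (12, 30)}]},
    "empty.jpg":   {"skin": [], "faces": []},
})
print(retrieve(idx))  # → ['office1.jpg']
```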
6 Conclusions

Through a comparative study of skin detection in different color spaces, we find that nonlinear color spaces, such as HSV, can obtain more accurate and robust skin results, especially in detecting background faces. Moreover, we find shape filtering and knowledge-based modeling very useful in face detection. Furthermore, the detected skin and face regions can be utilized for semantic indexing and retrieval of images. How to improve the quantitative analysis of the shape filters and the face modeling for more accurate and robust face detection, especially for the separation of connected faces and the detection of background faces, will be investigated as the next step in the near future.

Acknowledgements. The authors wish to acknowledge the financial support of the EU IST FP-6 Research Programme through the integrated project LIVE (Contract No. IST-4-027312).
References

1. Hunke, M., Waibel, A.: Face Locating and Tracking for Human-Computer Interaction. IEEE Computer, 1277–1281 (1996)
2. Cui, Y., Weng, J.: Appearance-Based Hand Gesture Sign Recognition from Intensity Image Sequences. Comput. Vis. Image Und. 78, 157–176 (2000)
3. Forsyth, D., Fleck, M.: Automatic Detection of Human Nudes. Int. J. Comput. Vis. 32(1), 63–77 (1999)
4. Wren, C.R., Azarbayejani, A., Darrel, T., Pentland, A.P.: Pfinder: Real-Time Tracking of the Human Body. IEEE T-PAMI 19(7), 780–785 (1997)
5. Phung, S.L., Bouzerdoum, A., Chai, D.: Skin Segmentation Using Color Pixel Classification: Analysis and Comparison. IEEE T-PAMI 27(1), 148–154 (2005)
6. Yang, M.H., Kriegman, D., Ahuja, N.: Detecting Faces in Images: A Survey. IEEE T-PAMI 24(1), 34–58 (2002)
7. Palus, H.: Representation of Color Images in Different Color Spaces. In: Sangwine, S.J., Horne, R.E.N. (eds.) The Color Image Processing Handbook, London (1998)
8. Jones, M.J., Rehg, J.M.: Statistical Color Models with Application to Skin Detection. Int. J. Comput. Vis. 46(1), 81–96 (2002)
9. Saber, E., Tekalp, A.M.: Frontal-View Face Detection and Facial Feature Extraction Using Color, Shape and Symmetry Based Cost Functions. Pattern Recognition Letters 19(8), 669–680 (1998)
10. Hsu, R.L., Abdel-Mottaleb, M., Jain, A.K.: Face Detection in Color Images. IEEE T-PAMI 24(5), 696–706 (2002)
11. Garcia, C., Tziritas, G.: Face Detection Using Quantized Skin Color Regions Merging and Wavelet Packet Analysis. IEEE T-Multimedia 1(3), 264–277 (1999)
12. Wu, H., Chen, Q., Yachida, M.: Face Detection from Color Images Using a Fuzzy Pattern Matching Model. IEEE T-PAMI 21(6), 557–563 (1999)
13. Kakumanu, P., Makrogiannis, S., Bourbakis, N.: A Survey of Skin-Color Modeling and Detection Methods. Pattern Recognition 40, 1106–1122 (2007)
14. Habili, N., Lim, C.C., Moini, A.: Segmentation of the Face and Hands in Sign Language Video Sequences Using Color and Motion Cues. IEEE T-CSVT 14(8), 1086–1097 (2004)
15. Chai, D., Ngan, K.N.: Face Segmentation Using Skin-Color Map in Videophone Applications. IEEE T-CSVT 9(4), 551–564 (1999)
16. Cho, K.-M., Jang, J.-H., Hong, K.-S.: Adaptive Skin-Color Filter. Pattern Recognition 34, 1067–1073 (2001)
17. Zheng, Q.-F., Gao, W.: Fast Adaptive Skin Detection in JPEG Images. In: Ho, Y.-S., Kim, H.-J. (eds.) PCM 2005. LNCS, vol. 3768, pp. 595–605. Springer, Heidelberg (2005)
18. Zhu, Q., Cheng, K.-T., Wu, C.-T., Wu, Y.-L.: Adaptive Learning of an Accurate Skin-Color Model. In: Proc. 6th IEEE Internat. Conf. on Automatic Face and Gesture Recognition, pp. 37–42. ACTA Press, Calgary, AB, Canada (2004)
19. Zhang, M.-J., Gao, W.: An Adaptive Skin Color Detection Algorithm with Confusing Background Elimination. In: Proc. ICIP, vol. II, pp. 390–393 (2005)
20. Garcia, C., Tziritas, G.: Face Detection Using Quantized Skin Color Regions Merging and Wavelet Packet Analysis. IEEE T-Multimedia 1(3), 264–277 (1999)
21. Albiol, A., Torres, L., Delp, E.J.: Optimum Color Spaces for Skin Detection. In: Proc. ICIP, vol. I, pp. 122–124 (2001)
An Agent-Based Intelligent CAD Platform for Collaborative Design

Quan Liu, Xingran Cui, and Xiuyin Hu

Department of Information Engineering, Wuhan University of Technology, Wuhan 430070, China
[email protected]

Abstract. Collaborative design can create added value in the design and production process by bringing the benefits of teamwork and cooperation in a concurrent and coordinated manner. However, distributed design knowledge and product data make the design process cumbersome. To facilitate collaborative design, an agent-based intelligent CAD platform is implemented, applying intelligent agents to collaborative design. Adopting the JADE platform as its framework, an intelligent collaborative design software system (the Co-Cad platform for short) is designed. In this platform, every person, design software package, management software package, piece of equipment and resource is regarded as a single agent, and legacy design activities can be abstracted as interactions between agents. Multimedia technology is integrated into the Co-Cad platform, making communication and identity authentication among collaborative designers in different areas more convenient. Finally, an instance of collaborative design using the Co-Cad platform is presented.

Keywords: Agent, Collaborative design, JADE, Co-Cad, Multimedia.
1 Introduction

Product design is increasingly becoming a collaborative task among designers or design teams that are physically, geographically, and temporally distributed. Plenty of product modeling tools and engineering knowledge from various disciplines spread across different design phases, making effective capture, retrieval, reuse, sharing and exchange of this heterogeneous design knowledge a critical issue [1]. An ideal product design environment which is both collaborative and intelligent must enable designers and manufacturers to respond quickly to commercial market pressures [2]. Compared with current standalone CAD, collaborative CAD is "not generally accepted" because of both technical and non-technical problems [3]. While cultures, educational backgrounds and design habits are the non-technical problems, weaknesses in interactive capabilities and convenient collaboration are identified as the major technical problems.

As an emergent approach to developing distributed systems, agent technology has been employed to develop collaborative design systems and to handle the aforementioned challenges and limitations [4]. An intelligent agent consists of self-contained knowledge-based systems capable of perceiving, reasoning, adapting, learning, cooperating, and delegating in a dynamic environment to tackle specialist problems. The

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 501–508, 2008.
© Springer-Verlag Berlin Heidelberg 2008
502
Q. Liu, X. Cui, and X. Hu
way in which intelligent software agents residing in a multi-agent system interact and cooperate with one another to achieve a common goal is similar to the way that human designers collaborate with each other to carry out a product design project. Thus, we believe that a collaborative design environment implemented by taking an agent-based approach will be capable of assisting human designers or design teams effectively and efficiently in product design [5].

In order to make collaborative design more convenient, multimedia information is absolutely necessary. Integrating multimedia technology into a collaborative design platform can express information more quickly and naturally than text alone. This paper presents an ongoing project on the application of intelligent agents to collaborative design. In the project, an intelligent CAD platform (the Co-Cad platform) is implemented, adopting multi-agent technology, collaborative design technology and multimedia technology. The platform uses JADE as middleware and integrates a multimedia system. Based on the platform, intelligent agents can interact with each other; thus, ideal digital collaborative design is implemented.

The rest of the paper is organized as follows: Section 2 gives an overview of the work related to our research. The basic principle and key method are described in Section 3. Section 4 introduces the implementation of the Co-Cad platform. A case study is presented in Section 5. Finally, a number of conclusions are drawn in Section 6.
2 Related Works

Web- and agent-based approaches have been dominant during the past decade for the implementation of collaborative product environments. An earlier review of multi-agent collaborative design systems can be found in ref. [6]. Shen et al. [7] provide a detailed discussion of issues in developing agent-oriented collaborative design systems and a review of significant related projects and systems. The interesting aspects of PACT include its federation architecture using facilitators and wrappers for legacy system integration. SHARE [8] was concerned with developing open, heterogeneous, network-oriented environments for concurrent engineering, particularly for design information and data capturing and sharing through asynchronous communication. SiFAs [9] was intended to address the issues of patterns of interaction, communication, and conflict resolution using single-function agents. DIDE [10] was a typical autonomous multi-agent system and was developed to study system openness, legacy system integration, and distributed collaboration. Co-Designer [11] was a system that can support localized design agents in the generation and management of conceptual design variants. A-Design [12] presented a new design generation methodology, which combines aspects of multi-objective optimization, multi-agent systems, and automated design synthesis. It provides designers with a new search strategy for the conceptual stages of product design, which incorporates agent collaboration with an adaptive selection of designs. Multi-agent systems provide a cooperative environment for the sharing of design information, data, and knowledge among distributed design team members.

TILab proposes a software development framework, named JADE (JADE, 2007), aimed at developing multi-agent systems and applications. This software framework uses the ACL specifications proposed by FIPA and provides a set of graphical tools that supports the debugging and
deployment phases. JADE is middleware useful for developing agent-based applications in distributed environments and includes a FIPA-compliant agent platform and a package for developing Java agents (Bellifemine, Caire, Trucco, Rimassa, & Mungenast, 2007). However, these technologies only provide fundamental infrastructures for collaborative design systems by standardizing communications between individual systems. The interaction among components is predefined and falls short of supporting the integration of multidisciplinary design environments.
3 The Basic Principle and Key Method

3.1 Basic Principle

Usually, each agent is regarded as a physical or abstract entity. Distributed in the network environment, each agent is independent and can act on itself and the environment, manipulate its part of the environment's state and react to changes in the environment. More importantly, through communication and cooperation with other agents, agents can perform mutual work to complete an entire task. People, design software, management software, as well as equipment and resources can all be viewed as agents. Legacy design activities can be abstracted as informational communication between agents, which includes not only communication between homogeneous agents but also communication between heterogeneous agents through technology-aided design software. We can further abstract this to informational exchange. Theoretically, all of this information can be digital. In other words, if we provide a suitable platform for interaction, digital collaborative design can be realized.

This paper seeks to build a multi-agent middleware through which people, software and manufacturing equipment within a collaborative organization can carry out informational communication. Different types of agents can communicate with their agent middleware through their respective forms of communication and achieve information interaction with agents in other organizations. This relieves designers from dealing with various types of software and lets them achieve collaborative design efficiently and effectively.

In light of the above principle, this paper constructs a collaborative design software platform for agents, the Co-Cad platform, and integrates multimedia technology into it. The Co-Cad platform, written in pure Java, is independent and flexible. Over the network it realizes collaboration between designers: "you see what I see". Each designer's operation is reflected on the other designers' platforms.
Designers in different places can exchange their ideas on the interactive design in the form of video chat. Video chat is the most direct way to confirm the identity of the other party, which guarantees the safety of the collaborative design; it is also the fastest and most natural form of information expression.

3.2 Multi-Agent System Development Based on the JADE Platform

JADE (Java Agent Development Framework) is a middleware that facilitates the development of multi-agent systems in compliance with the FIPA (Foundation for Intelligent Physical Agents) specifications.
Each running instance of the JADE runtime environment is called a container, as it can contain several agents. The set of active containers is called a platform. A single special main container must always be active in a platform, and all other containers register with it as soon as they start. JADE agents are identified by a unique name; provided they know each other's names, they can communicate regardless of their actual location. A main container differs from normal containers in that it holds two special agents: the AMS and the DF. The AMS (Agent Management System) provides the naming service and represents the authority in the platform. The DF (Directory Facilitator) provides a yellow-pages service by means of which an agent can find other agents providing the services it requires in order to achieve its goals.

One of the most important features that JADE provides for agents is the ability to communicate. The communication paradigm adopted is asynchronous message passing. Each agent has a sort of mailbox (the agent message queue) where the JADE runtime posts messages sent by other agents. Whenever a message is posted in the message queue the receiving agent is notified. If and when the agent actually picks the message up from the queue to process it is completely up to the programmer.
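The mailbox model described above can be illustrated in a few lines. JADE itself is a Java framework; the sketch below is language-neutral pseudocode in Python, and every name in it (`Agent`, `registry`, `send`, `receive`) is ours, not JADE's API.

```python
# Illustrative sketch of JADE's asynchronous mailbox model (not JADE's
# actual API). Each agent owns a message queue; send() only posts into the
# receiver's mailbox and returns at once, and the receiver decides if and
# when to pick messages up.
import queue

class Agent:
    registry = {}                      # stands in for the platform's naming service

    def __init__(self, name):
        self.name = name
        self.mailbox = queue.Queue()   # the agent message queue
        Agent.registry[name] = self

    def send(self, receiver_name, content):
        # Asynchronous: post into the receiver's mailbox and return.
        Agent.registry[receiver_name].mailbox.put((self.name, content))

    def receive(self):
        # Picking up a message is entirely up to the receiving agent.
        try:
            return self.mailbox.get_nowait()
        except queue.Empty:
            return None

a, b = Agent("designerA"), Agent("designerB")
a.send("designerB", "please review part 42")
print(b.receive())  # → ('designerA', 'please review part 42')
print(b.receive())  # → None (mailbox now empty)
```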
4 Implementation of the Co-Cad Platform

4.1 Communication Model

Under the currently popular mode of communication services, if A is going to discuss certain parts of a design with B, A first produces his own design using computer-aided design software, such as AutoCAD, and then uploads it to the FTP server, while B, using the same design software, uploads his own design to the FTP server. They then download each other's designs. Through traditional telephone or e-mail, or other network communication tools, they exchange their opinions and finally reach a consensus. When the design is completed, it is uploaded to the FTP server, and the WWW server then issues a note that the component design is completed, for other people to use. Under this model, the exchange and sharing of production data do not fit the requirements of the network and take place at a low level of intelligence. It cannot meet the needs of increasingly complex product design.

In order to overcome these shortcomings, the platform's interactive design process is based on the multi-agent communication mode shown in Fig. 1. First, A starts his agent middleware, through which he can see that B is currently also online. (They may, of course, already have agreed that their software communicates through certain middleware.) A starts his software, such as AutoCAD, and notifies B. Both of them use the voice and video program and AutoCAD to carry out real-time interactive design through the agent middleware. Then the agent middleware submits the design to the FTP server automatically for others to use, and automatically publishes the news that the component design is finished on the WWW server. In this paper, the middleware provides a platform for interaction, but each middleware component is itself an agent; they must first interact with other middleware to complete their tasks.
Fig. 1. Communication model based on agent middleware
4.2 Design and Integration of the Multimedia System

In recent years, Web-based video conferencing systems have been widely used in remote collaborative design systems, but these systems use ready-made video conferencing products for long-distance transmission of audio and video. Such video conferencing systems are separate in function from the remote collaborative design system, which makes it inconvenient to integrate them seamlessly with the collaborative design system. As the Co-Cad platform is developed entirely in Java, in order to make seamless integration of the multimedia audio and video system possible, this paper applies multimedia technology to achieve the video chat function and adopts JMF (Java Media Framework) as the development environment. In order to write Java programs that handle multimedia files and equipment, the JMF installation package must be downloaded and installed.

Media Capture. Time-based media can be captured from a live source for processing and playback. Capturing can be thought of as the input phase of the standard media processing model. A capture device might deliver multiple media streams. For example, a video camera might deliver both audio and video. These streams might be captured and manipulated separately or combined into a single, multiplexed stream that contains both an audio track and a video track.

Media Processing. In most instances, the data in a media stream is manipulated before it is presented to the user. Generally, a series of processing operations occur before presentation. The tracks are then delivered to the appropriate output device. If the media stream is to be stored instead of rendered to an output device, the processing stages might differ slightly, for example, if you wanted to capture audio and video from a video
camera, process the data, and save it to a file. JMF also includes a viewer that displays a graphical overview of a processor's tracks and plug-ins. This graph enables you to monitor the media flow during playback, capturing, or transcoding.
5 Case Study

The only software requirement to execute the platform is the Java Runtime Environment, version 1.4. All the software is distributed under the LGPL license. The process of collaborative design using the Co-Cad platform to carry out a product design project proceeds as follows:

Step 1: The host computer used as the server launches JADE and then Co-Cad, and starts the test container in JADE. Other collaborative designers only need to run Co-Cad. While the JADE platform is running, all collaborative designers can start the Co-Cad platform and see the Co-Cad software interface. Clicking on the "CodesignStart" submenu begins the collaborative design session.

Step 2: Each designer can launch a codesign request and select a codesign partner, after which the audio and video window is shown. The main user interface of the agent manager in JADE is shown in Fig. 2.
Fig. 2. The main user interface of the agent manager
Step 3: Collaborative designers can communicate in the form of text or video conference spontaneously and freely. If one of the designers edits or modifies the design, the same operation shows on the others' platforms.

As presented in Fig. 3, Selene and Jujumao are designing a mechanical accessory using the Co-Cad platform; their interfaces are shown separately. When Selene modifies the plan, the same modification shows on Jujumao's platform. There is also a small window for text chat, for settings where speaking aloud would not be polite.
Fig. 3. Interface of Selene’s and Jujumao’s Co-Cad platform
6 Conclusion

In this paper, on the basis of the problem identification and the analysis of the requirements for a collaborative design platform, an agent-based platform supporting collaborative design via the cooperation of a network of intelligent agents is presented. As the platform is still being fully implemented, more experiments are required in order to test and improve it. Some challenging problems, such as task assignment, conflict detection and conflict
resolution, need to be carefully addressed, and further development efforts are required before the technology can be widely deployed. In our project, ongoing efforts are being made to refine the coordination agent and its underlying methodology in detail.

Acknowledgements. This project is supported by International Science and Technology Cooperation Project No. 2006DFA73180 from China's Ministry of Science and Technology.
References

1. Wang, J.X., Tang, M.X.: A Multi-agent Framework for Collaborative Product Design. In: Shi, Z., Sadananda, R. (eds.) PRIMA 2006. LNCS (LNAI), vol. 4088, pp. 514–519. Springer, Heidelberg (2006)
2. Wang, J.X., Tang, M.X.: An Agent-Based System Supporting Collaborative Product Design. In: Gabrys, B., Howlett, R.J., Jain, L.C. (eds.) KES 2006. LNCS (LNAI), vol. 4252, pp. 670–677. Springer, Heidelberg (2006)
3. Wang, J.X., Tang, M.X.: Knowledge Representation in an Agent-Based Collaborative Product Design Environment. In: Proceedings of the 9th International Conference on Computer Supported Cooperative Work in Design (CSCWD 2005), vol. 1, pp. 423–428. IEEE Computer Society Press, Los Alamitos (2005)
4. Rosenman, M.A., Wang, F.: A Component Agent Based Open CAD System for Collaborative Design. Automation in Construction 10(4), 383–397 (2001)
5. Wu, S.M., Ghenniwa, H., Zhang, Y., Shen, W.M.: Personal Assistant Agents for Collaborative Design Environments. Computers in Industry 57, 732–739 (2006)
6. Lander, S.E.: Issues in Multi-Agent Design Systems. IEEE Expert 12(2), 18–26 (1997)
7. Shen, W., Norrie, D.H., Barthe, J.P.: Multi-Agent Systems for Concurrent Intelligent Design and Manufacturing. Taylor & Francis, London, UK (2001)
8. Toye, G., Cutkosky, M.R., Leifer, L., Tenenbaum, J., Glicksman, J.: A Methodology and Environment for Collaborative Product Development. In: Proceedings of the Second Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, pp. 33–47 (1993)
9. Brown, D.C., Dunskus, B., Grecu, D.L., Berker, I.: Support for Single Function Agents. In: Proceedings of Applications of Artificial Intelligence in Engineering, Udine, Italy (1995)
10. Shen, W., Barthe, J.P.: An Experimental Environment for Exchanging Engineering Design Knowledge by Cognitive Agents. In: Mantyla, M., Finger, S., Tomiyama, T. (eds.) Knowledge Intensive CAD-2, pp. 19–38. Chapman & Hall, Boca Raton (1997)
11. Hague, M.J., Taleb-Bendiab, A.: Tool for Management of Concurrent Conceptual Engineering Design. Concurrent Engineering: Research and Applications 6(2), 111–129 (1998)
12. Campbell, M.I., Cagan, J., Kotovsky, K.: A-Design: An Agent-Based Approach to Conceptual Design in a Dynamic Environment. Research in Engineering Design 11, 172–192 (1999)
Applying Frequent Episode Algorithm to Masquerade Detection

Feng Yu (1) and Min Wang (2)

(1) School of Computer, Northwestern Polytechnical University, Xi'an, China
(2) Department of Information Antagonism, Air Force Engineering University, Xi'an, China
[email protected], [email protected]

Abstract. Masquerade attacks are attempts by unauthorized users to gain access to critical data or higher access privileges while pretending to be legitimate users. Detection of masquerade attacks plays an important role in system security. In this paper, we discuss a formula to evaluate the effectiveness of masquerade detection algorithms and present an effective approach to masquerade detection using a frequent episode algorithm. We evaluate our method by performing experiments over UNIX command records from the SEA dataset. The results show that our approach is quite effective for masquerade detection.

Keywords: Masquerade detection, computer security, frequent episode.
1 Introduction

The masquerade attack is one of the most serious security problems. It commonly appears as spoofing, where an intruder impersonates another person and uses that person's identity; a typical example of a masquerade is a hacker who has gained a legitimate user's password or who forges their e-mail address. Masqueraders can be insiders or outsiders. As an outsider, the masquerader may try to gain superuser access from a remote location and can cause considerable damage or theft. An insider attack can be executed against an unattended machine within a trusted domain. From the system's point of view, all of the operations executed by an insider masquerader may be technically legal and hence not detected by existing access control or authentication schemes.

A well-known instance of masquerader activity is the case of Robert Hanssen of the FBI, who allegedly used agency computers to ferret out information that was later sold. Hanssen was a regular user, but his behavior was improper. Thus, while we protect against external intruders, there is a strong need to monitor authorized users for anomalous behavior. As stated, many serious intrusions and attacks come from within an organization, as these intruders are familiar with the system architecture and its loopholes. To catch such a masquerader, the only useful evidence is the operations he executes, i.e., his behavior. Thus, we can compare a user's recent behavior against their profile of typical behavior and recognize a security breach if the recent behavior departs sufficiently from the profiled behavior, indicating a possible masquerader.

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 495–500, 2008.
© Springer-Verlag Berlin Heidelberg 2008
496
F. Yu and M. Wang
2 Related Work

The insider problem in computer security is shifting the attention of the research and commercial community away from intrusion detection at the perimeter of network systems. Research and development is ongoing in the area of modeling user behaviors in order to detect anomalous misbehaviors of importance to security. Schonlau et al. have summarized six approaches: Uniqueness, Bayes one-step Markov, Hybrid multi-step Markov, Compression model, Incremental Probabilistic Action Modeling (IPAM), and Sequence-match. Maxion and Townsend later achieved better results by using naive-Bayes classifiers with updating, which serve as a basis for comparison. The same author shows that there is a loss of information due to truncation of system call arguments, which yields inferior results when compared with his work on enriched UNIX command-line system calls. Coull et al. propose a novel technique based on pair-wise sequence alignment. Kim et al. proposed a new and efficient masquerade detection technique based on Support Vector Machines (SVM). Wilson et al. present a highly effective approach to masquerade detection using Hidden Markov Models (HMM).
3 Formula of Effectiveness

In masquerade detection methods, the detection rate (DR) is defined as the ratio of detected "dirty blocks" to the total number of dirty blocks in the data set. Another crucial performance parameter is the false positive rate (FPR), which is the ratio of the number of clean blocks classified as dirty to the total number of clean blocks. Any formula that evaluates the performance of masquerade detection necessarily needs to account for DR and FPR. Maxion and Townsend [1] created a scoring formulation to rate the overall goodness of a masquerade detection algorithm. They define the cost of masquerade detection as:

cost = 6 × FPR + (1.00 − DR)

Table 1 shows the results of previous approaches with all the performance factors used to compare one with another. Comparing by cost, POMDP would be the best method, having the lowest cost; but the DR of POMDP is quite low compared with the other methods. Cost is a biased measure: it rewards reducing the FPR but does not reward the detection rate, which should be the main task of masquerade detection. So we cannot use cost as the performance factor for selecting the best method. One has to focus on maximum DR with low FPR for an efficient masquerade detection approach. With these two factors, we present the following formula for calculating the overall effectiveness of a masquerade detection approach:

effectiveness = (α + (DR − β)) / (α + FPR)
In this formula, α is a constant that keeps the denominator from being zero when FPR is 0; we set α = 0.001. β is a threshold for DR; we set β = 0.6 because a system whose detection rate is lower than 0.6 would be useless in practice. With this measure, we can see that the SVM method is the most effective.
Applying Frequent Episode Algorithm to Masquerade Detection
497
Table 1. Results of some methods for masquerade detection

Method                       DR      FPR     Cost
POMDP                        61.8%   1.0%    44.2
Naive Bayes (updating)       61.5%   1.3%    46.3
Naive Bayes (no updating)    66.2%   4.6%    61.4
Uniqueness                   39.4%   1.4%    69.0
Hybrid Markov model          49.3%   3.2%    69.9
Semi-global Alignment        75.8%   7.7%    70.4
Bayes one-step Markov        69.3%   6.7%    70.9
IPAM                         41.1%   2.7%    75.1
SVM                          80.1%   9.7%    78.1
Recursive Data mining        75.0%   10.0%   85.0
Sequence Matching            36.8%   3.7%    85.4
Compression                  34.2%   5.0%    95.8
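As a check, the two scores can be recomputed from the DR/FPR columns of Table 1. The snippet below is an illustrative sketch (only three of the methods are transcribed); it applies the Maxion-Townsend cost and the effectiveness formula with α = 0.001 and β = 0.6, and confirms that SVM ranks highest on effectiveness even though POMDP has the lowest cost.

```python
# Recomputing Table 1's cost and the proposed effectiveness score for a few
# of the listed methods (DR and FPR transcribed as fractions; alpha and beta
# as chosen in the text).
ALPHA, BETA = 0.001, 0.6

methods = {  # name: (DR, FPR)
    "POMDP": (0.618, 0.010),
    "Semi-global Alignment": (0.758, 0.077),
    "SVM": (0.801, 0.097),
}

def cost(dr, fpr):
    # Maxion-Townsend cost, in percentage points: 6*FPR + (100 - DR)
    return 6 * fpr * 100 + (1.0 - dr) * 100

def effectiveness(dr, fpr):
    # (DR - beta) / (alpha + FPR); alpha keeps the denominator nonzero
    return (dr - BETA) / (ALPHA + fpr)

for name, (dr, fpr) in methods.items():
    print(f"{name}: cost={cost(dr, fpr):.1f} effectiveness={effectiveness(dr, fpr):.2f}")

best = max(methods, key=lambda m: effectiveness(*methods[m]))
# best is "SVM": its effectiveness (~2.05) edges out Semi-global Alignment (~2.03)
```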
4 Frequent Episode Mannila et al. described an algorithm to discover serial frequent episodes from event sequences. In Mannila's method, S = {E1, E2, …, En} is an event sequence of n events and A = {a1, a2, …, am} is the set of all event attributes. Each event E = {Ea1, Ea2, …, Eam} in S consists of m values, one for each event attribute. The timestamp of E is denoted by E.T. A simple serial episode P(e1, e2, …, ek) represents a sequential occurrence of k event variables, where each ei (1 ≤ i ≤ k) is an event variable and, for all i and j (1 ≤ i < j ≤ k), ei.T < ej.T. Usually, k is much smaller than n. Here eq denotes an event variable consisting of q event attributes, i.e., eq = {attr1=v1, attr2=v2, …, attrq=vq}, where {attr1, attr2, …, attrq} ⊆ A and 1 ≤ q ≤ m. An episode P(e1, e2, …, ek) is said to occur in an interval [t, t′] if t ≤ e1.T and ek.T ≤ t′. An occurrence of P(e1, e2, …, ek) in interval [t, t′] is minimal if there does not exist another occurrence of P(e1, e2, …, ek) in a subinterval [u, u′] ⊂ [t, t′]. Given a window threshold (a bound on timestamps), the frequency of P(e1, e2, …, ek) in the event sequence S is the total number of its minimal occurrences in intervals smaller than window. In order to find "frequent" episodes, a second threshold, minfrequency, is used. An episode P(e1, e2, …, ek) is called frequent if frequency(P)/(n−k+1) ≥ minfrequency. Since k
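The minimal-occurrence counting just described can be sketched as follows. This is an illustrative reconstruction, assuming unit timestamps (event index = timestamp) and single-attribute events, not Mannila et al.'s original implementation.

```python
# Sketch of minimal-occurrence counting for a serial episode over a simple
# event sequence; event index plays the role of the timestamp.

def minimal_occurrences(seq, episode):
    """Return (start, end) index pairs of minimal occurrences of the
    serial episode (a list of event types) in the event sequence."""
    candidates = []
    for j, ev in enumerate(seq):
        if ev != episode[-1]:
            continue
        # Greedily match the remaining episode elements as late as possible,
        # which yields the latest-starting occurrence ending at j.
        pos, ok = j, True
        for sym in reversed(episode[:-1]):
            pos -= 1
            while pos >= 0 and seq[pos] != sym:
                pos -= 1
            if pos < 0:
                ok = False
                break
        if ok:
            candidates.append((pos, j))
    # An occurrence is minimal iff it does not properly contain another one;
    # with latest starts per end, keep those whose start strictly increases.
    minimal, last_start = [], -1
    for start, end in candidates:
        if start > last_start:
            minimal.append((start, end))
            last_start = start
    return minimal

def frequency(seq, episode, window):
    """Count minimal occurrences that fit inside the window threshold."""
    return sum(1 for s, e in minimal_occurrences(seq, episode)
               if e - s + 1 <= window)

occ = minimal_occurrences(list("abacbc"), list("abc"))
# occ == [(0, 3), (2, 5)]
```

With frequency in hand, an episode is declared frequent exactly when frequency(P)/(n−k+1) reaches the minfrequency threshold defined in the text.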
Fig. 2. MASQ modular Architecture (modules shown: QoS Manager, MASQ C/S stub, RTP flow sender and receiver, Accounting, QoS Adaptation, Admission Control, QoS Monitoring, Discovery Client, CC/PP, LDAP Client)
The QoS Monitoring module has the duty of observing the state of resources and services that are local to its hosting node. The Admission Control module maintains resource allocation information about all VoD flows currently served. The Accounting module exploits the monitoring functions to keep a local log of the QoS level actually provided to the different receivers. The QoS Adaptation module is responsible for any transformation of data depending on the negotiated QoS level. The QoS Manager module coordinates the other modules and decides the QoS levels for the MASQ components in the VoD path [11].

Our proposed IPM_RQOS model aims at supporting adaptive, reliable QoS requirements, defined in application-level Missions described by sets of Actions of objects, by reserving, allocating, and reallocating the necessary Resources under dynamically changing situations. A high-level IPM_RQOS conceptual architecture to support adaptive reliable QoS requirements is shown in Figure 3. The Situation-aware Agent (SMA), Resource Agent (RA), and Fault-Tolerance QoS Agent (FTQA) are the main components of the Situation-Aware Middleware. Applications request the Situation-aware Middleware to execute a set of missions with various QoS requirements. The Situation-aware Manager analyzes and synthesizes context information captured by sensors over a period of time and derives a situation. The Resource Agent simultaneously analyzes resource availability by dividing the resources requested by missions by the available resources. It is also responsible for monitoring, reserving, allocating, and deallocating each resource. Given the derived situations, the Fault-Tolerance QoS Agent (FTQA) controls resources through the Resource Agent (RA) when errors are encountered, in order to guarantee the requested QoS requirements. If some resources fail due to low resource availability, the FTQA performs QoS resource error detection and recovery, and the RA resolves the errors by recovering resources to support high-priority missions.
514
E.N. Ko and S. Kim
Fig. 3. Conceptual Architecture of Our Proposed IPM_RQOS Model (shown: Missions 1…n; Sensors capturing Situations 1…n; the Situation-aware Agent issuing Actions 1…n with QoS requirements; the Resource Agent and the Fault-Tolerance QoS Agent managing Resources 1…n)
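The RA/FTQA interaction around Figure 3 might be sketched as below. Every class name, method name, and the over-demand rule are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of the Resource Agent / Fault-Tolerance QoS Agent
# interaction in the IPM_RQOS model; all names and rules are assumptions.

class ResourceAgent:
    def __init__(self, available):
        self.available = dict(available)   # resource -> available capacity

    def availability(self, requested):
        # "Dividing requested resources by available resources":
        # a ratio > 1.0 signals that the request cannot be satisfied.
        return {res: req / self.available.get(res, 1e-9)
                for res, req in requested.items()}

    def recover(self, resource, amount):
        # Reclaim capacity, e.g. by preempting low-priority missions.
        self.available[resource] = self.available.get(resource, 0) + amount

class FaultToleranceQoSAgent:
    def __init__(self, ra):
        self.ra = ra

    def check_and_recover(self, requested):
        # Detect resource errors (over-demand) and ask the RA to recover.
        errors = []
        for res, ratio in self.ra.availability(requested).items():
            if ratio > 1.0:
                deficit = requested[res] - self.ra.available[res]
                self.ra.recover(res, deficit)
                errors.append(res)
        return errors

ra = ResourceAgent({"cpu": 50, "bandwidth": 100})
ftqa = FaultToleranceQoSAgent(ra)
failed = ftqa.check_and_recover({"cpu": 80, "bandwidth": 60})
# failed == ["cpu"]; the RA has recovered capacity so that cpu now covers 80
```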
4 Simulation Results and Conclusion Our approach has distinct features such as support for the SoC principle and for the dynamism of situation awareness. Among existing QoS management techniques: in OS-level management, QoS management schemes are limited to CPU, memory, disk, network, and so on [12, 13]. In application-level management, a limitation of the schemes is in steadily or periodically monitoring the states of the resources the applications need. To overcome this limitation, QoS can be managed at the middleware level to satisfy the integrated QoS of several applications over the network [14, 15]. However, these approaches are still limited and inflexible in dynamically changing situations, compared with our situation-aware QoS management using the IPM_RQOS model. An example of a situation-aware application is a multimedia education system. QoS guarantees must be met in the application, system, and network to gain the acceptance of the users of a multimedia communication system. There are several constraints which must be satisfied to provide guarantees during multimedia transmission: time, space, device, frequency, and reliability constraints.
Design of a Reliable QoS Requirement Based on RCSM
515
We proposed a method for increasing reliability through an Adaptive Reliable QoS for Resource Errors model for ubiquitous computing environments; the model aims at guaranteeing reliability through application-level QoS. However, since this new education system must be developed in a way that combines various fields of technology, including group communication and distributed multimedia processing, which are the basis of packet-based videoconferencing systems, integrated service functions such as middleware are required to support it. Our future work concerns QoS-aware middleware for ubiquitous and heterogeneous environments.
References 1. Weiser, M.: The computer for the 21st century. Scientific American 265(30), 94–104 (1991) 2. Yau, S., Karim, F., Wang, Y., Wang, B., Gupta, S.: Reconfigurable Context-Sensitive Middleware for Pervasive Computing. IEEE Pervasive Computing 1(3), 33–40 (2002) 3. Yau, S.S., Karim, F.: Adaptive Middleware for Ubiquitous Computing Environments. Design and Analysis of Distributed Embedded Systems. In: Proc. IFIP 17th WCC, vol. 219, pp. 131-140 (August 2002) 4. Yau, S.S., Karim, F.: Contention-Sensitive Middleware for Real-time Software in Ubiquitous Computing Environments. In: Proc. 4th IEEE Int’l Symp. on Object-Oriented Realtime Distributed Computing (ISORC 2001), pp. 163–170 (May 2001) 5. Agnew, P.W., Kellerman, A.S.: Distributed Multimedia. ACM Press, New York (1996) 6. Ahn, J.Y., Lee, G., Park, G.C., Hwang, D.J.: An implementation of Multimedia Distance Education System Based on Advanced Multi-point Communication Service Infrastructure: DOORAE. In: proceedings of the IASTED International Conference Parallel and Distributed Computing and Systems, October 16-19, Chicago, Illinois, USA (1996) 7. ITU-T Recommendation T.122: Multipoint Communication Service for Audiographics and Audiovisual Conferencing Service Definition.ITU-T SG8 Interim Meeting mertlesham (October 18, 1994) (issued March 14th 1995) 8. Ko, E., Hwang, D., Kim, J.: Implementation of an Error Detection-Recovery System based on Multimedia Collaboration Works: EDRSMCW. In: MIC 1999 IASTED International Conference, Innsbruck Austria ( Febraury 1999) 9. Steinmetz, R., Nahrstedt, K.: Multimedia: computing, communications & Applications. Prentice Hall P T R, Englewood Cliffs 10. Baschieri, F., Bellavista, P., Corradi, A.: Mobile Agents for QoS Tailoring, Control and Adaptation over the Internet: the ubiQoS Video-on-Demand Service. In: To be published in 2nd IEEE Int. Symposium on Applications and the Internet (SAINT 2002), Japan, (January 2002) 11. 
Bellavista, P., Corradi, A., Montanari, R., Stefanelli, C.: An Active Middleware to Control QoS Level of Multimedia Services. In: Proceedings of the 8th IEEE Workshop on Future Trends of Distributed Computing Systems (FTDCS 2001). IEEE, Los Alamitos (2001)
12. Xu, D.: QoS and Contention-Aware Muiti-Resource Reservation. In: 9th IEEE International Symposium on High Performance Distributed Computing (HPDC 2000) (2000) 13. Bellavista, P., Corradi, A., Montanari, R.: An Active Middleware to Control QoS Level of Multimedia Services. In: Proceedings of the Eight IEEE Workshop on Future Trends of Distributed Computing System (2001) 14. Xu, D., Wichadakul, D., Nahrstedt, K.: Resource-Aware Configuration of Ubiquitous Multimedia Services. In: Proceedings of IEEE International Conference on Multimedia and EXPO 2000 (ICME 2000) (2000) 15. Nahrstedt, K., Xu, D., Wichadakul, D.: QoS-Aware Middleware for Ubiquitous and Heterogeneous Environments. In: IEEE Communications Magazine (2001)
Minimization of the Disagreements in Clustering Aggregation Safia Nait Bahloul1, Baroudi Rouba1, and Youssef Amghar2 1
Computer department, Faculty of Science, Es-Sénia, Oran University, Algeria 2 INSA de Lyon – LIRIS UMR 5205 CNRS, 7 avenue Jean Capelle 69621, Villeurbanne – France
[email protected] Abstract. Several experiments have proved the impact of the choice of the selected parts of documents on the result of the classification, and consequently on the number of requests these clusters can answer. The aggregation process gives a very natural method of data classification: it considers the m classifications produced by the m attributes and tries to produce a classification called "optimal" which is as close as possible to the m classifications. The optimization consists in minimizing the number of pairs of objects (u, v) such that a classification C places them in the same cluster whereas another classification C′ places them in different clusters. This number corresponds to the concept of disagreements. We propose an approach which exploits the various elements of an XML document, participating in various views, to give different classifications. These classifications are then aggregated into a single classification minimizing the number of disagreements. Our approach is divided into two steps: the first consists in applying the k-means algorithm on the collection of XML documents, considering each time a different element of the document. The second step aggregates the various classifications obtained previously to produce the one that minimizes the number of disagreements. Keywords: XML, classification, aggregation, disagreements.
1 Introduction The number of XML documents exchanged on the internet increases continuously, and the available tools for searching for information in documents are not sufficient. Tools for summarizing or classifying wide collections of documents have become indispensable. Unsupervised automatic classification (clustering) aims to group similar documents. Searching for relevant information in a wide collection then means interrogating sets (classes) of reduced size. This is based on the idea that if a document is relevant to a request, its neighbors (the similar documents of the same class) have a better chance of also being relevant. D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 517–524, 2008. © Springer-Verlag Berlin Heidelberg 2008
518
S. Nait Bahloul, B. Rouba, and Y. Amghar
Several experiments on the classification of XML documents of homogeneous structure were carried out by [1]. These experiments showed the impact of the choice of the selected parts of documents on the result of the classification, and consequently on the number of requests satisfied by the resulting clusters. Aggregating these classifications therefore allows obtaining more relevant clusters. In this context we propose an approach for optimizing the aggregated clusters by minimizing the number of disagreements coming from a classification process based on a set of attributes considered relevant.
2 The Classification of XML Documents Classification consists in analyzing data and assigning them, according to their characteristics or attributes, to one class or another. There is an important number of document classification methods. These methods can generally be classified, according to their objectives, into two types: supervised classification (classification) and unsupervised classification (clustering). The various presentations [2] of clustering methods are due, on the one hand, to the fact that the classes of algorithms overlap (certain methods rely, for example, on probability models to propose partitions) and, on the other hand, to the form of the clustering result (hierarchy vs. partitions, hard clustering vs. fuzzy clustering, etc.) and to the method used to reach this result (probability functions vs. graphs, etc.). Several works concerning the clustering [3, 4, 5], [6, 7] and the similarity [8, 9, 10] of XML documents have been carried out, with different objectives. Some works aim to identify the most used part of the DTD [11]; others try to identify frequent structures in a wide collection [12]. Among the objectives one also finds the identification of the DTD for heterogeneous collections [13] and, finally, clustering [14] in which the combination of the structure and the content of documents is taken into consideration. Certain classification methods reduce XML documents to their purely textual part [4, 15], without taking advantage of the structure, which carries rich information. The interest of [1] concerns the impact of the choice of the selected document parts on the result of the classification. Two levels of selection were applied: one using the structure of the document, the other at the level of the text, called a linguistic selection.
A classification algorithm of the k-means type [16, 17] builds a partition of documents, assigns documents to classes, and shows the list of the words which allowed the classification. It has thus been shown that the quality of the classification depends strongly on the selected parts of documents. Several approaches use the concept of aggregation in classification in various domains, such as machine learning [18, 19], pattern recognition [20], bioinformatics [21], and data mining [22, 23]. Aggregation supplies a very natural method for data classification. Considering a set of tuples T1, …, Tn characterized by a set of attributes A1, …, Am, our idea consists in seeing every attribute as an element able to produce a simple classification of the data set; if attribute Aj contains Kj different values, then Aj groups the data into Kj clusters. The aggregation process then considers the
m classifications produced by the m attributes and tries to produce a classification called "optimal" which is as close as possible to the m classifications; that means minimizing the number of pairs of objects (u, v) such that one classification C places them in the same cluster whereas another classification C′ places them in different clusters. This number corresponds to the concept of disagreements [24]. Our approach thus consists in aggregating a set of classifications, each based on a relevant attribute extracted from the DTD of the documents to be classified. Every classification arises from the application of the k-means algorithm [16, 17]. The quality of the obtained clusters is assured, on the one hand, by the efficiency of the k-means algorithm as a reference classification algorithm and, on the other hand, by the optimization (minimization of disagreements) assured by the aggregation concept. The following sections describe in detail the steps and concepts of our approach.
3 Description of Our Approach The proposed approach follows four steps:
Step 1: Determination of the set of relevant elements (the relevance of an element is determined by its frequency of appearance in requests).
Step 2: Inventory, for every attribute, of its representative words (the attribute's possible values).
Step 3: Application of the k-means algorithm [16, 17] on the collection for every attribute extracted in the first step, considering each time a different attribute. One will have, for example, the clusters "Comedy", "Horror", and "Action" for the attribute "Kind".
Step 4: Aggregation of the results obtained in Step 3. This last step aggregates the classifications obtained during the third step and allows building clusters of the type "All the Action films directed by Spielberg and edited by 20 Century in which Tom Cruz played".
3.1 Illustrative Example We illustrate our approach through an example. Let a collection of XML documents be based on the following DTD: The first step in our approach is to identify the most important parts of the DTD, those able to produce relevant clusters. It is evident that Title and Budget do not constitute classification elements. On the other hand, films can be grouped into classes according to their kind, their director, their actors, or their editor. The third step consists in applying the k-means algorithm [16, 17] on the collection, considering each time a different attribute. The last step is the aggregation phase, which aggregates the classifications obtained during the third step. This step allows building clusters of
the type "All the Action films directed by Spielberg and edited by 20 Century in which Tom Cruz played". The following step in the process is the choice of the representative words of every attribute. The result of this step can take the shape of the following table: Table 1. Example of representative words of attributes
Attribute    Representative words
Kind         Comedy, action, horror…
Actor        Tom Cruz, Kevin Kosner…
Director     Spielberg, Newman…
Editor       20 Century, 3 stars…
3.2 Definitions Certain authors [24] define aggregation as an optimization problem aiming to minimize the number of disagreements among the m classifications.
3.2.1 Aggregation Let CL = {C1, …, Cm} be a set of m classifications. The concept of aggregation consists in producing a classification C which realizes a compromise with the m classifications.
3.2.2 Disagreement Let C and C′ be two classifications; a disagreement is defined as a pair of objects (u, v) such that C places them in the same cluster whereas C′ places them in different clusters. If d(C′, C) is the number of disagreements between C and C′, the aggregation will then consist in finding a classification C which minimizes:

D(C) = Σ_{i=1}^{m} d(C, Ci)    (1)
Equation (1) computes the distance between a classification C and the set of classifications. This distance represents the number of pairs of objects (Vi, Vj) on which two classifications disagree.
Example of aggregation [24]. Let C1, C2 and C3 be classifications of the objects V1, …, V6. The value k in entry (Vi, Cj) expresses that object Vi belongs to cluster k of classification Cj. The column C corresponds to the optimal classification, which minimizes the number of disagreements with the classifications C1, C2, C3.
In this example the total number of disagreements is 5: one with classification C2, for the pair (v5, v6), and four with classification C1, for the pairs (v1, v2), (v1, v3), (v2, v4), (v3, v4). It is not difficult here to determine the classification which minimizes the number of disagreements; in this example it corresponds to classification C3. In general, determining this classification can be defined as an optimization problem aiming to minimize the number of disagreements. We realize our approach with the Clust-Agregat algorithm, which we describe in the following section. Table 2. Example of an optimal classification
      C1   C2   C3   C
V1    1    1    1    1
V2    1    2    2    2
V3    2    1    1    1
V4    2    2    2    2
V5    3    3    3    3
V6    3    4    3    3
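The disagreement count of Section 3.2.2 and the Table 2 example can be checked mechanically. The sketch below (Python; names are illustrative) recovers the five disagreements quoted above.

```python
# Disagreement count d(C, C') as defined in Sec. 3.2.2, checked against the
# Table 2 example (5 total disagreements for the optimal classification C).
from itertools import combinations

def disagreements(c1, c2):
    """Number of object pairs on which the two classifications disagree:
    together in one clustering, apart in the other."""
    return sum(1 for u, v in combinations(list(c1), 2)
               if (c1[u] == c1[v]) != (c2[u] == c2[v]))

C1 = {"v1": 1, "v2": 1, "v3": 2, "v4": 2, "v5": 3, "v6": 3}
C2 = {"v1": 1, "v2": 2, "v3": 1, "v4": 2, "v5": 3, "v6": 4}
C3 = {"v1": 1, "v2": 2, "v3": 1, "v4": 2, "v5": 3, "v6": 3}
C = C3  # the optimal aggregate in the example

total = sum(disagreements(C, Ci) for Ci in (C1, C2, C3))
# total == 5: four pairs against C1 and one pair, (v5, v6), against C2
```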
4 Clust-Agregat Algorithm In what follows, we present an algorithm which summarizes the various steps of our approach. The algorithm accepts as input a set V of objects, each characterized by a set A of attributes, and builds a set C of classifications by taking a different attribute into account each time.
General process of Clust-Agregat:
- Input: V, the set of objects to be classified;
- Let A = {A1, …, Am} be the set of attributes;
- For every attribute Ai:
  • apply the k-means algorithm;
  • add the obtained classification Ci to the set of classifications;
- Apply the function Aggregate to the set of obtained classifications;
- Output: the set of clusters forming the optimal classification.
Algorithm: Clust-Agregat
Input: V {the set of objects to be classified}
Output: Cf {the set of clusters forming the optimal classification}
A = {A1, …, Am} {the set of attributes}
Begin
  C := ∅;  {the set of classifications to be optimized}
  For i from 1 to m do
    Ci := K-means/Ai;  {apply K-means by considering attribute Ai}
    C := C ∪ {Ci};
  End For
  Cf := Aggregate(C, V);
End

Function Aggregate(C, V)  {returns one classification; u, v are two objects of V}
Begin
  For i from 1 to m-1 do
    For j from i+1 to m do
      dV(Ci, Cj) := 0;
      For each (u, v) ∈ V²
        du,v(Ci, Cj) := 1 if (Ci(u) = Ci(v) and Cj(u) ≠ Cj(v))
                            or (Ci(u) ≠ Ci(v) and Cj(u) = Cj(v)),
                        0 otherwise;
        dV(Ci, Cj) := dV(Ci, Cj) + du,v(Ci, Cj);  {distance between classifications Ci and Cj}
      End For
    End For
  End For
  D(C) := Σ_{i=1}^{m} dV(C, Ci);
  Cf := the classification minimizing D(C);
  Return Cf;
End.
The function Aggregate returns an optimal classification. The optimality criterion of the result corresponds to the minimization of the number of disagreements; in other words, this function returns the classification which best agrees with all the classifications of the set C.
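A runnable sketch of the whole Clust-Agregat flow is given below, under two simplifying assumptions: the attributes are categorical, so the per-attribute k-means step degenerates to grouping objects by attribute value, and the aggregation step uses the BESTCLUSTERING 2-approximation from [24] (return the input classification with the least total disagreement). All data and names are illustrative.

```python
# Simplified Clust-Agregat: one classification per categorical attribute,
# then BESTCLUSTERING-style aggregation (pick the candidate classification
# with the smallest total disagreement against all candidates).
from itertools import combinations

def cluster_by_attribute(objects, attr):
    """One classification per attribute: objects sharing a value share a cluster."""
    values = sorted({obj[attr] for obj in objects.values()})
    label = {v: k for k, v in enumerate(values, 1)}
    return {name: label[obj[attr]] for name, obj in objects.items()}

def disagreements(c1, c2):
    return sum(1 for u, v in combinations(list(c1), 2)
               if (c1[u] == c1[v]) != (c2[u] == c2[v]))

def clust_agregat(objects, attrs):
    classifications = [cluster_by_attribute(objects, a) for a in attrs]
    # 2-approximation: the candidate closest to all the others.
    return min(classifications,
               key=lambda c: sum(disagreements(c, ci) for ci in classifications))

films = {
    "f1": {"kind": "action", "editor": "20 Century"},
    "f2": {"kind": "action", "editor": "20 Century"},
    "f3": {"kind": "comedy", "editor": "3 stars"},
}
best = clust_agregat(films, ["kind", "editor"])
# best groups f1 with f2 and separates f3
```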
5 Complexity of Clust-Agregat Algorithm Global complexity of the Clust-Agregat algorithm: the complexity of Clust-Agregat depends, on the one hand, on the complexity of the k-means algorithm and, on the other hand, on that of the function Aggregate. The complexity of a variant of the k-means algorithm (fuzzy k-medoid) has been evaluated at O(n²) [25]. The aggregation process is NP-complete in nature [24], but it has since been shown that a 2-approximation is easy to find, which reduces the complexity of the aggregation process to O(mn), where m is the number of classifications to be aggregated (see the details of the BESTCLUSTERING algorithm in [24]). In general, we can say that the complexity of the proposed Clust-Agregat algorithm does not exceed O(n²).
Practical aspects: an implementation of the Clust-Agregat algorithm on the Iris database [26] is in progress. The purpose of this practical work is to compare the results with existing algorithms, in order to demonstrate the efficiency and the advantages of our algorithm over k-means. In the same framework, we envisage a study aiming to aggregate classification algorithms such as k-means and POBOC [25].
6 Conclusion In this paper we exploited the fact that the various elements of an XML document participate in various views and lead to different classifications. This method produces clusters which constitute partial views of the data set. We proposed an algorithm aiming to improve the quality of the obtained clusters by exploiting the notion of aggregation. Our approach is based on an optimization process minimizing the disagreements among the classifications obtained by applying the k-means algorithm. The quality of the obtained clusters is guaranteed, on the one hand, by the optimization process and, on the other hand, by the reference character of the k-means algorithm.
References 1. Despeyroux, T., Lechavellier, Y., Trousse, B., Vercoustre, A.: Expériences de Classification d’une Collection de Documents XML de Structure Homogène. IEEE Computer Society, Washington (2004) 2. Mitchell, T.M.: Machine Learning. McGraw Hill, New York (1997) 3. Guillaume, D., Murtagh, F.: Clustering of XML Documents. Computer Physics Communications 127(2-3), 215–227 (2000) 4. Denoyer L., Vittaut J.-N., Allinari P., Brunessaux S.: Structured Multimedia Document Classification. In: DocEng 2003, Grenoble, France, pp. 153–160 (2003) 5. Despeyroux, T., Lechavellier, Y., Trousse, B., Vercoustre, A.: Experiments in Clustering Homogeneous XML Documents to Validate an Existing Typology. In: Proceedings of the 5th International Conference on Knowledge Management (I-Know), Vienna, Autriche (July 2005) 6. Lee, M.L., Liang Huai Yang, L.H., Wynne Hsu, W., Yang, X.: XClust: Clustering XML Schemas for Effective Integration. In: CIKM 2002: Proceedings of the eleventh international conference on Information and knowledge (2002) 7. Steinbach, M., Karypis, M., Kumar, G.V.: A comparison of document clustering techniques. In: KDD Workshop on Text Mining (2000) 8. Bertino, E., Guerrini, G., Mesiti, M.: Measuring the Structural Similarity among XML Documents and DTDs. Technical report, DISI-TR-02-02 (2001) 9. Flesca, S., Manco, G., Masciari, E., Pontieri, L., Pugliese, A.: Detecting Structural Similarities between XML Documents. In: WebDB, pp. 55–60 (2002)
10. Nierman, A., Jagadish, H.V.: Evaluating Structural Similarity in XML Documents. In: Proceedings of the Fifth International Workshop on the Web and Databases (WebDB 2002), Madison, Wisconsin, USA (June 2002) 11. Lian, W., Cheung, D.W.-L.: An Efficient and Scalable Algorithm for Clustering XML Documents by Structure. IEEE Trans. Knowl. Data Eng. 16(1), 82–96 (2004) 12. Termier, A., Rousset, M.C, Sebag, M.: TreeFinder: A First Step towards XML Data Mining. In: ICDM 2002: Proceedings of the 2002 IEEE International Conference on Data Mining, p. 450 (2004) 13. McQueen, J.: Some methods for classification and analysis of multivariate observations. In: the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297 (1967) 14. Doucet, A., Ahonen-Myka, H.: Naive Clustering of a large XML Document Collection. In: INEX Workshop, pp. 81–87 (2002) 15. Yi, J., Sundaresan, N.: A classifier for semi-structured documents. In: Proc. of the 6th International Conference on Knowledge Discovery and Data mining, pp. 340–344 (2000) 16. Duda, R.O., Hart, P.E.: Pattern Classification and Scene Analysis. John Wiley & Sons, Chichester (1973) 17. Fukunaga, K.: Introduction to Statistical Pattern Recognition. Academic Press, San Diego (1990) 18. Strehl, A., Ghosh, J.: Cluster ensembles: A knowledge reuse framework for combining multiple partitions. Journal of Machine Learning Research (2002) 19. Fern, X.Z., Brodley, C.E.: Random projection for high dimensional data clustering: A cluster ensemble approach. In: ICML (2003) 20. Fred, A.L.N., Jain, A.K.: Data Clustering using evidence accumulation. In: ICPR (2002) 21. Filkov, V., Skeina, S.: Integrating microarray data by consensus clustering. In: International Conference on tools with Artificial Intelligence, pp. 418–426 (2003) 22. Topchy, A., Jain, A.K., Punch, W.: A mixture model of clustering ensembles. In: SDM (2004) 23. Boulis, C., Ostendorf, M.: Combining multiple clustering systems. In: PKDD (2004) 24. 
Gionis, A., Mannila, H., Tsaparas, P.: Clustering Aggregation. In: International Conference on Data Engineering (ICDE) (2005) 25. Cleuziou, G.: Une méthode de classification non-supervisée pour l’apprentissage de règles et la recherche d’information Thèse de doctorat, Université d’Orléan (2004) 26. Merz, C.J., Murphy, P.M.: UCI repository of machine learning databases (1998)
Modified Filled Function Method for Resolving Nonlinear Integer Programming Problem Yong Liu1 and You-lin Shang2 1 Henan University of Science and Technology, Department of Computer Science, Luoyang, 471003 China
[email protected] 2 Henan University of Science and Technology, Department of Mathematics, Luoyang, 471003 China
[email protected] Abstract. The filled function method is an approach to finding the global minimum of nonlinear functions. Many problems in real applications, such as computing, communication control, and management, naturally result in global optimization formulations in the form of nonlinear global integer programming. This paper gives a modified filled function method to solve the nonlinear global integer programming problem. The properties of the proposed modified filled function are also discussed, and the results of preliminary numerical experiments are reported. Keywords: Filled function, global optimization, local minimizer, communication control, nonlinear integer programming.
1 Introduction Communication control, management, and decision-making problems in real applications, such as capital budgeting, production planning, capacity planning, reliability networks, and chemical engineering processes, naturally result in global optimization formulations in the form of nonlinear global integer programming. For example, a plant upgrade problem, a heat exchange network optimization problem, and a loading and dispatching problem in a random flexible manufacturing system all aim at searching for the global minima of an objective function under constraints. The solutions of these problems all involve the following mathematical problem
( DP )
min f ( x) , s.t. x ∈ Ω
Many methods have been proposed to solve this problem, including the filled function method [1-8], the tunneling method [11], etc. The filled function method was first put forward in Ge's paper [3], and many other filled functions have been proposed since [4-8]. The idea of this method is to construct a filled function P(x) and, by minimizing P(x), to escape from a given local minimizer x1* of the original objective function f(x).
D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 549–556, 2008. © Springer-Verlag Berlin Heidelberg 2008
550
Y. Liu and Y.-l. Shang
With regard to the discrete nonlinear programming problem, continuity approaches are presented in Ge's paper [9] and Zhang's paper [10]. This paper proposes a modified filled function to solve the discrete nonlinear programming problem. The paper is organized as follows: Section 2 presents a modified filled function and discusses its properties. In Section 3, a modified filled function algorithm is presented and the results of preliminary numerical experiments are reported. Finally, conclusions are given in Section 4.
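The generic filled-function iteration sketched in this introduction can be written out as follows. This is an illustrative sketch only: the filled function P used here (a distance reward plus a heavy reward for strictly better points) is an assumed stand-in, not the paper's modified filled function.

```python
# Generic filled-function loop for a 1-D instance of problem (DP); the
# filled function P below is an assumed illustrative construction.

def discrete_local_min(f, x0, box):
    """Steepest descent over the +/-1 unit neighborhood inside an integer box."""
    x = x0
    while True:
        nbrs = [x[:i] + (x[i] + d,) + x[i + 1:]
                for i in range(len(x)) for d in (-1, 1)
                if box[i][0] <= x[i] + d <= box[i][1]]
        best = min(nbrs, key=f)
        if f(best) >= f(x):
            return x
        x = best

def filled_function_method(f, x0, box, rounds=10):
    x_star = discrete_local_min(f, x0, box)
    for _ in range(rounds):
        def P(x, x_star=x_star):
            # Reward moving away from x_star; heavily reward any strictly
            # better point -- an assumed stand-in for a real filled function.
            dist = sum(abs(a - b) for a, b in zip(x, x_star))
            return (-1000 - dist) if f(x) < f(x_star) else -dist
        y = discrete_local_min(P, x_star, box)   # escape the basin of x_star
        x_new = discrete_local_min(f, y, box)    # descend f again
        if f(x_new) >= f(x_star):
            return x_star                        # no better minimizer found
        x_star = x_new
    return x_star

# Toy objective with two local minima: x = -3 (global, value 0) and
# x = 2 (local, value 1); starting from x = 5 the loop escapes x = 2.
table = [2, 1, 0, 1, 2, 3, 2, 1, 2, 3, 4]        # f on x = -5..5
f = lambda x: table[x[0] + 5]
# filled_function_method(f, (5,), [(-5, 5)]) returns (-3,)
```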
2 A Modified Filled Function and Its Properties In this section, we propose a modified filled function of f(x) at a local minimizer x1* and discuss its properties. Let
x1* be the current local minimizer of problem (DP) and integer set S1 = {x ∈ Ω, f ( x) ≥ f ( x1* )} , S 2 = Ω \ S1 .
For the unconstrained continuous global optimization problem

(CP)    min f(x), s.t. x ∈ R^n,

where f(x) is a (twice) continuously differentiable function on R^n, assume that f(x) is a globally convex function. From the globally convex property of f(x), we know that there exists a closed and bounded domain Ω that contains all the global minimizers of f(x). Paper [1] gives a definition of the globally convexized filled function of f(x) at its local minimizer x1* as follows.
Definition 2.1 [1]. A continuous function U(x) is called a globally convexized filled function of f(x) at its minimizer x1* if U(x) has the following properties:

(i) U(x) has no stationary point in the region S1 = {x ∈ Ω : f(x) ≥ f(x1*)} except a prefixed point x0 ∈ S1 that is a minimizer of U(x).
(ii) U(x) does have a minimizer in the region S2.
(iii) U(x) → +∞ as ||x|| → +∞.
With regard to the nonlinear integer programming problem (DP) in Section 1 of this paper, a definition of the filled function of f(x) at its local minimizer x1*, on the basis of Definition 2.1 [1], is given in [8] as follows.

Modified Filled Function Method    551

Definition 2.2 [8]. A function U(x) is called a filled function of f(x) at its minimizer x1* for the nonlinear integer programming problem if U(x) has the following properties:

(i) U(x) has no minimizer in the set S1 = {x ∈ Ω : f(x) ≥ f(x1*)} except a prefixed point x0 ∈ S1 that is a minimizer of U(x).
(ii) If S2 ≠ ∅, then U(x) does have a minimizer in the set S2.
This paper gives a modified definition of the filled function of f(x) at its local minimizer x1* for the nonlinear integer programming problem (DP), on the basis of Definition 2.1 [1] and Definition 2.2 [8], as follows.

Definition 2.3. P(x) is called a filled function of f(x) at a local minimizer x1* for nonlinear integer programming if P(x) has the following properties:

(i) P(x) has no local minimizer in the set S1 \ {x0}. The prefixed point x0 is in the set S1 and is not necessarily a local minimizer of P(x).
(ii) If x1* is not a global minimizer of f(x), then there exists a local minimizer x1 of P(x) such that f(x1) < f(x1*), that is, x1 ∈ S2.
Definition 2.3 is different from Definitions 2.1 and 2.2: it is based on the discrete set in Euclidean space, and x0 is not necessarily a local minimizer of P(x). We can therefore present a modified filled function of f(x) at its local minimizer x1* as follows:

P(x) = η(||x − x0||) − A · [min{f(x) − f(x1*), 0}]²,    (2.1)

where A > 0 is a parameter, the prefixed point x0 ∈ Ω satisfies the condition f(x0) ≥ f(x1*), and the function η(t) needs to satisfy the following conditions:

(i) η(t) is a strictly monotone increasing function for t ∈ [0, +∞);
(ii) η(0) = 0.
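For concreteness, the modified filled function (2.1) with the simplest admissible choice η(t) = t can be sketched as follows; the toy objective and numeric values used below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def filled_function(x, x0, f, f_x1_star, A):
    """P(x) = eta(||x - x0||) - A * [min{f(x) - f(x1*), 0}]^2  with eta(t) = t,
    which is strictly increasing on [0, +inf) and satisfies eta(0) = 0."""
    eta = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(x0, dtype=float))
    penalty = min(f(x) - f_x1_star, 0.0)  # vanishes on S1, negative on S2
    return eta - A * penalty ** 2
```

On S1 the penalty term vanishes, so P(x) reduces to the distance ||x − x0|| and descent is pulled toward the prefixed point x0; on S2 a sufficiently large A (condition (2.3) below) makes P strongly negative, which is what lets a local minimization of P escape the basin of x1*.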
Lemma 2.1 [2]. For any integer point x ∈ Ω, if x ≠ x0, there exists d ∈ D such that

||x + d − x0|| < ||x − x0||.    (2.2)

Theorem 2.1. P(x) has no local minimizer in the set S1 \ {x0} for any A > 0.
Proof: From Lemma 2.1, we know that, for any x ∈ S1 with x ≠ x0, there exists a d ∈ D such that ||x + d − x0|| < ||x − x0||.
Consider the following two cases:

(1) If f(x1*) ≤ f(x + d) ≤ f(x) or f(x1*) ≤ f(x) ≤ f(x + d), then

P(x + d) = η(||x + d − x0||) − A[min{f(x + d) − f(x1*), 0}]²
         = η(||x + d − x0||)
         < η(||x − x0||)
         = η(||x − x0||) − A[min{f(x) − f(x1*), 0}]²
         = P(x).

Therefore, x is not a local minimizer of the function P(x).

(2) If f(x + d) < f(x1*) ≤ f(x), then

P(x + d) = η(||x + d − x0||) − A[min{f(x + d) − f(x1*), 0}]²
         = η(||x + d − x0||) − A[f(x + d) − f(x1*)]²
         ≤ η(||x + d − x0||)
         < η(||x − x0||)
         = η(||x − x0||) − A[min{f(x) − f(x1*), 0}]²
         = P(x).

Therefore, in this case too, x is not a local minimizer of P(x).

From Theorem 2.1, we know that the function P(x) satisfies the first property of Definition 2.3 without any further assumption on the parameter A > 0. A question arises: how large should the parameter A be so that P(x) has a local minimizer in the set S2? To answer this question, we have the following theorem.
Theorem 2.2. Let S2 ≠ ∅. If the parameter A > 0 satisfies the condition

A > C / [f(x*) − f(x1*)]²,    (2.3)

where C ≥ max_{x∈Ω} η(||x − x0||) and x* is a global minimizer of f(x), then P(x) has a local minimizer in the set S2.

Proof: Since the set S2 is nonempty and x* is a global minimizer of f(x), f(x*) < f(x1*) holds and

P(x*) = η(||x* − x0||) − A[min{f(x*) − f(x1*), 0}]²
      = η(||x* − x0||) − A[f(x*) − f(x1*)]²
      ≤ C − A[f(x*) − f(x1*)]².

When A > 0 satisfies the condition (2.3), we have P(x*) < 0.
On the other hand, for any y ∈ S1, we have

P(y) = η(||y − x0||) − A[min{f(y) − f(x1*), 0}]² = η(||y − x0||) ≥ 0.

Therefore, the global minimizer of P(x) belongs to the set S2; that is, the function P(x) has a local minimizer in the set S2.
Theorem 2.3. Suppose that ε is a small positive constant and A > C/ε². Then for any local minimizer x1* of f(x) such that f(x1*) ≥ f(x*) + ε, P(x) has a local minimizer in the set S2, where x* is a global minimizer of f(x).

Proof: Since f(x1*) − f(x*) ≥ ε, we have

C / [f(x*) − f(x1*)]² ≤ C / ε².

It follows from Theorem 2.2 that the conclusion of this theorem holds.

We construct the following auxiliary nonlinear integer programming problem (ADP) related to the problem (DP):

(ADP)    min P(x), s.t. x ∈ Ω.    (2.4)
3 Modified Filled Function Algorithm and Numerical Results

In this section, we embed our modified filled function in the following algorithm to solve the problem (DP). A local minimizer of f(x) over Ω is obtained by the following algorithm.

Algorithm 1 [7]
Step 1. Choose any integer point x0 ∈ Ω.
Step 2. If x0 is a local minimizer of f(x) over Ω, then stop; otherwise, search the neighborhood N(x0) and obtain a point x ∈ N(x0) ∩ Ω such that f(x) < f(x0).
Step 3. Let x0 := x and go to Step 2.
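Algorithm 1 is a plain discrete descent; a minimal sketch, assuming the common choice of unit coordinate moves for the neighborhood N(x) (the paper does not fix N):

```python
def discrete_local_search(f, x0, lower=-5, upper=5):
    """Algorithm 1: from an integer point x0, repeatedly move to a strictly
    better point in the unit-coordinate neighbourhood N(x) = {x +/- e_i}
    until x is a discrete local minimizer of f over the box [lower, upper]^n."""
    x = list(x0)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            for step in (-1, 1):
                y = list(x)
                y[i] += step
                # accept any strictly improving feasible unit move
                if lower <= y[i] <= upper and f(tuple(y)) < f(tuple(x)):
                    x, improved = y, True
    return tuple(x)
```

Each accepted move strictly decreases f on a finite grid, so the loop terminates at a discrete local minimizer.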
Algorithm 2 (The modified filled function method)
Step 1. Choose: (a) choose a function η(t) satisfying the conditions in Section 2 of this paper; (b) choose a constant NL > 0 as the tolerance parameter for terminating the minimization process of problem (ADP); (c) choose a small constant ε as the desired optimality tolerance.
Step 2. Input: (a) input an integer point x0 ∈ Ω; (b) input a constant A satisfying the condition (2.3) or A > C/ε².
Step 3. Starting from the point x0, obtain a local minimizer x1* of f(x) over Ω: (a) if x0 is a local minimizer of f(x) over Ω, let x1* = x0 and go to Step 4; (b) if x0 is not a local minimizer of f(x) over Ω, search the neighborhood N(x0) and obtain a point x ∈ N(x0) ∩ Ω such that f(x) < f(x0); (c) let x0 = x and go to (a) of Step 3.
Step 4. Let η(t) = t and construct the filled function P(x) as follows: P(x) = η(||x − x0||) − A[min{f(x) − f(x1*), 0}]².
Step 5. Let N = 0.
Step 6. If N > NL, then go to Step 11.
Step 7. Set N = N + 1. Choose an initial point in the set Ω. Starting from this point, minimize P(x) over the set Ω using any local minimization method. Suppose that x′ is the obtained local minimizer.
Step 8. If x′ = x0, go to Step 6; otherwise, go to Step 9.
Step 9. Minimize f(x) over the set Ω from the initial point x′ and obtain a local minimizer x2* of f(x).
Step 10. Let x1* = x2* and go to Step 4.
Step 11. Output x1* and f(x1*) as an approximate global minimal solution and global minimal value of problem (DP), respectively.

Example 1 (in [6] and [8])

min f(x) = (x1 − 1)² + (xn − 1)² + n Σ_{i=1}^{n−1} (n − i)(xi² − x_{i+1})²,
s.t. |xi| ≤ 5, xi integer, i = 1, 2, …, n.
This problem has 11^n feasible points and many local minimizers (4, 6, 7, 10, and 12 local minimizers for n = 2, 3, 4, 5, and 6, respectively), but only one global minimum solution: x_g* = (1, 1, …, 1) with f(x_g*) = 0, for all n. We considered two sizes of the problem: n = 2 and n = 5.
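Putting Algorithms 1 and 2 together for Example 1 with n = 2, the following sketch uses η(t) = t as in Step 4. The restart strategy (trying every grid point as an initial point for minimizing P) and the value A = 300, which comfortably satisfies condition (2.3) for this instance, are illustrative choices rather than the paper's exact settings.

```python
import itertools

def example1(x, n=2):
    """Example 1 objective for n = 2: global minimum f = 0 at (1, 1)."""
    return ((x[0] - 1) ** 2 + (x[-1] - 1) ** 2
            + n * sum((n - i) * (x[i - 1] ** 2 - x[i]) ** 2 for i in range(1, n)))

def local_min(g, x, lo=-5, hi=5):
    """Algorithm 1: descend to a discrete local minimizer of g via unit moves."""
    x = list(x)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            for s in (-1, 1):
                y = list(x)
                y[i] += s
                if lo <= y[i] <= hi and g(tuple(y)) < g(tuple(x)):
                    x, improved = y, True
    return tuple(x)

def filled_function_method(f, x_start, A=300.0):
    """Algorithm 2 (sketch): alternate between minimizing f and minimizing
    the filled function P with eta(t) = t."""
    x0 = tuple(x_start)            # prefixed point x0; f(x0) >= f(x1*) holds here
    x1 = local_min(f, x0)          # current local minimizer x1* of f
    def P(x):
        eta = sum((a - b) ** 2 for a, b in zip(x, x0)) ** 0.5
        return eta - A * min(f(x) - f(x1), 0.0) ** 2
    for start in itertools.product(range(-5, 6), repeat=len(x0)):
        xp = local_min(P, start)   # Step 7: minimize the filled function
        if f(xp) < f(x1):          # escaped to the lower set S2
            x1 = local_min(f, xp)  # Steps 9-10: improve f and rebuild P
    return x1
```

Because P reads the current x1 through a closure, it is rebuilt implicitly each time a better local minimizer is found, mirroring the loop back to Step 4.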
Example 2 (in [6] and [7])

min f(x) = Σ_{i=1}^{n−1} [100(x_{i+1} − xi²)² + (1 − xi)²],
s.t. |xi| ≤ 5, xi integer, i = 1, 2, …, n.

This problem has 11^n feasible points and many local minimizers (5, 6, 7, 9, and 11 local minimizers for n = 2, 3, 4, 5, and 6, respectively), but only one global minimum solution: x_g* = (1, 1, …, 1) with f(x_g*) = 0, for all n. We considered two sizes of
the problem: n = 5 and n = 6. In the following, the computational results of some test problems using the above algorithm are summarized. The computer runs Windows XP with a 900 MHz CPU. The symbols used in the tables are defined as follows: n: the number of variables; TS: the index of the chosen initial point;

Table 1. Results of numerical Example 1, n=5, A=283
For each initial point TS = 1, 2, 3, the table lists the iterates x_i^k, the local minimizers x_f^k of (DP) with their values f(x_f^k), the minimizers x_p^k of (ADP) with their values f(x_p^k), and the iteration numbers QIN; every run terminates at the global minimizer (1, 1, 1, 1, 1) with f = 0 after at most three local minimizations.
Table 2. Results of numerical Example 2, n=5, A=448
For each initial point TS = 1, …, 4, the table lists the same quantities as Table 1; every run again terminates at the global minimizer (1, 1, 1, 1, 1) with f = 0 after at most three local minimizations.
k: the index of the local minimization process of problem (DP);
x_i^k: the initial point for the k-th local minimization process of problem (DP);
x_f^k: the minimizer obtained by the k-th local minimization process of problem (DP);
x_p^k: the minimizer obtained by the k-th local minimization process of problem (ADP);
QIN: the iteration number of the k-th local minimization process of problem (ADP).
4 Conclusions

This paper gives a modified filled function method to solve nonlinear global integer programming problems arising in applications such as computing, communication control, and management. The properties of the proposed modified filled function are discussed, and the results of preliminary numerical experiments with the proposed method are reported.

Acknowledgements. This work was supported by the National Natural Science Foundation of China (No. 10771162) and the Natural Science Foundation of Henan University of Science and Technology (No. 2005ZD06).
References
1. Lucidi, S., Piccialli, V.: New Classes of Globally Convexized Filled Functions for Global Optimization. J. Global Optimiz. 24, 219–236 (2002)
2. Ge, R.P., Qin, Y.F.: The Globally Convexized Filled Functions for Global Optimization. Applied Mathematics and Computation 35, 131–158 (1990)
3. Ge, R.P.: A Filled Function Method for Finding a Global Minimizer of a Function of Several Variables. Mathematical Programming 46, 191–204 (1990)
4. Shang, Y.L., Zhang, L.S.: A Filled Function Method for Finding a Global Minimizer on Global Integer Optimization. J. Computat. Appl. Math. 181, 200–210 (2005)
5. Shang, Y.L., Zhang, L.S.: Finding Discrete Global Minimizer with a Filled Function for Integer Programming. Europ. J. Operat. Res. 189, 31–40 (2008)
6. Shang, Y.L., Pu, D.G., Jiang, A.P.: Finding Global Minimizer with One-parameter Filled Function on Unconstrained Global Optimization. Appl. Math. Comput. 191, 176–182 (2007)
7. Shang, Y.L., Han, B.S.: One-parameter Quasi-filled Function Algorithm for Nonlinear Integer Programming. J. Zhejiang Univers. SCIENCE 6A, 305–310 (2005)
8. Zhu, W.X.: A Filled Function Method for Nonlinear Integer Programming. Chinese ACTA of Mathematicae Applicatae Sinica 23, 481–487 (2000)
9. Ge, R.P., Huang, H.: A Continuous Approach to Nonlinear Integer Programming. Appl. Math. Comput. 34, 39–60 (1989)
10. Zhang, L.S., Gao, F., Yao, Y.R.: Continuity Methods for Nonlinear Integer Programming. OR Transactions 2, 59–66 (1998)
11. Levy, A.V., Montalvo, A.: The Tunneling Algorithm for the Global Minimization of Functions. SIAM J. Science Statistical Comput. 6(1), 15–29 (1985)
Prediction of Network Traffic Using Multiscale-Bilinear Recurrent Neural Network with Adaptive Learning

Dong-Chul Park

Center for Intelligent Imaging Systems Research, Dept. of Information Engineering, Myong Ji University, Korea
[email protected] Abstract. A prediction scheme for network traffic using a Multiscale-Bilinear Recurrent Neural Network (M-BLRNN) with an adaptive learning procedure is proposed in this paper. The proposed predictor is a combination of M-BLRNN and an adaptive learning procedure. In M-BLRNN, the wavelet transform is employed to decompose the original traffic signal into several simpler traffic signals. In addition, the adaptive learning procedure is applied to improve the learning process at each resolution level in M-BLRNN with adaptive learning (M-BLRNN(AL)). Experiments on an Ethernet network traffic prediction problem show that the proposed M-BLRNN(AL) scheme converges faster than M-BLRNN. The prediction accuracies of M-BLRNN and M-BLRNN(AL) are very similar in terms of the normalized mean square error (NMSE). Keywords: prediction, time-series, recurrent, neural network.
1
Introduction
Because of new services, an increasing number of subscribers, and newly developed technologies, the network traffic prediction problem is an important issue that has recently received much attention from the computer networks community. A proper strategy for capacity planning and overload warning, obtained through accurate traffic prediction, is important for reducing operational costs. The network traffic prediction task, one of the typical issues in measured information-based network control, is to forecast future traffic variation, and it can be considered a time-series prediction problem. Various models have been proposed to model and predict the future behavior of time series. Statistical models such as moving average and exponential smoothing methods, linear regression models, autoregressive (AR) models, autoregressive moving average (ARMA) models, and Kalman filtering-based methods have been widely used in practice [1,2]. Generally, most statistical models are based on linear analysis techniques. However, the use of a linear analysis technique to approximate a nonlinear function may lead to inaccurate prediction of a time series.
D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 525–532, 2008. © Springer-Verlag Berlin Heidelberg 2008
Since models based
on linear analysis techniques may not be suitable for modeling and predicting time series, many nonlinear models have been proposed to deal with highly nonlinear data. Chun and Chandra tried to address the non-stationarity and non-linearity exhibited by Internet traffic using the threshold autoregressive (TAR) model [3]; but this model is used only to simulate Internet data traffic and cannot be directly applied for prediction. Aimin and Sanqi proposed the autoregressive moving average (ARMA) and Markov modulated Poisson process (MMPP) models to analyze the predictability of network traffic [4], but these models ignore the non-stationary property of the network traffic. Various nonlinear models have been proposed for time series prediction. In contrast with the previous methods, neural network (NN)-based models are attractive for their universal approximation capabilities and have been successfully applied to time-series prediction problems [5,6]. In this paper, the proposed prediction scheme employs a Bilinear Recurrent Neural Network (BLRNN) for predicting the network traffic at each resolution of the Multiscale-BLRNN (M-BLRNN) [7]. The BLRNN was proposed to overcome the inherent limitations of the Multi-Layered Perceptron type Neural Network (MLPNN) [8]. Since the BLRNN is able to model a complex nonlinear system with a minimum number of parameters, it is an appropriate method for time series prediction, especially for the network traffic forecasting problem. Furthermore, M-BLRNN with Adaptive Learning (M-BLRNN(AL)) improves the learning process of the BLRNN model by using an adaptive learning algorithm. For each BLRNN model, the learning rates are adjusted to the input signal during the learning process. The remainder of this paper is organized as follows: a brief review of multiresolution analysis with the wavelet transform is presented in Section 2, and Section 3 summarizes M-BLRNN(AL). Section 4 presents experiments and results on a network traffic data set, including a performance comparison with M-BLRNN. Section 5 concludes the paper.
2
Multiresolution Wavelet Analysis
The aim of a multiresolution analysis is to analyze the signal at different frequencies with different resolutions. Recently, several neural network models have applied multiresolution wavelet analysis to time series prediction and signal filtering [9,10,11]. In particular, the so-called à trous wavelet transform has been proposed and applied widely. The formulation of the à trous wavelet transform can be described as follows. First, the signal is passed through a system of low-pass filters to suppress its high frequency components while allowing the low frequency components to pass through. A scaling function associated with the low-pass filters is used to calculate a local average, which results in a smoother signal. The signal cj(t) at resolution j can be obtained by performing convolutions between c_{j−1}(t) and the discrete low-pass filter h:

cj(t) = Σ_k h(k) c_{j−1}(t + 2^{j−1} k)    (1)
Fig. 1. Example of wavelet and scaling coefficients for Ethernet traffic data: (a) original signal, (b) w1, (c) w2, (d) w3, and (e) c3
where h is a discrete low-pass filter associated with the scaling function and c0(t) is the original signal. From the sequence of smoothed versions of the signal, the wavelet coefficients are obtained by calculating the difference between two successive signals:

wj(t) = c_{j−1}(t) − cj(t)    (2)

Conversely, the original signal can be reconstructed from the wavelet coefficients and the scaling coefficients as follows:

c0(t) = cJ(t) + Σ_{j=1}^{J} wj(t)    (3)

where J is the number of resolution levels and cJ is the smoothest version of the signal.
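Equations (1)-(3) can be sketched directly. The B3-spline filter h = (1/16, 1/4, 3/8, 1/4, 1/16) and the clamped boundary handling are common choices assumed here, since the paper does not fix them:

```python
import numpy as np

def atrous_decompose(signal, levels=3):
    """A trous wavelet transform, eqs. (1)-(2): returns ([w_1 ... w_J], c_J).
    At level j the low-pass filter is applied with holes of size 2**(j-1)."""
    h = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # B3-spline scaling filter
    offsets = range(-2, 3)                          # filter taps k = -2 .. 2
    c = np.asarray(signal, dtype=float)
    n = len(c)
    wavelets = []
    for j in range(1, levels + 1):
        dilation = 2 ** (j - 1)
        c_next = np.zeros(n)
        for t in range(n):
            # eq. (1): c_j(t) = sum_k h(k) * c_{j-1}(t + 2**(j-1) * k)
            for hk, k in zip(h, offsets):
                idx = min(max(t + dilation * k, 0), n - 1)  # clamp at edges
                c_next[t] += hk * c[idx]
        wavelets.append(c - c_next)                 # eq. (2): w_j = c_{j-1} - c_j
        c = c_next
    return wavelets, c
```

By construction the differences telescope, so eq. (3) recovers the original signal exactly: c0(t) = cJ(t) + Σ_j wj(t).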
Fig. 1 illustrates the wavelet coefficients and the scaling coefficients for three resolution levels of network traffic data. The original signal, the three levels of wavelet coefficients, and the finest scaling coefficients of the Ethernet network traffic data are shown from top to bottom, respectively.
3
Multiscale-BLRNN with Adaptive Learning
The M-BLRNN model is a combination of several BLRNN models [8], where each one is used to predict the signal at one resolution level obtained by the wavelet transform. Fig. 2 shows an example of the M-BLRNN model. In the M-BLRNN model, the original signal is decomposed into several signals at different resolutions, and each signal is predicted by a BLRNN. The prediction of the signal is the summation of the predictions from all of the resolutions:

x̂(t) = ĉJ(t) + Σ_{j=1}^{J} ŵj(t)    (4)

where ĉJ(t), ŵj(t), and x̂(t) are the predicted values of the finest scaling coefficients, the wavelet coefficients at level j, and the original signal, respectively. The M-BLRNN(AL) is a combination of M-BLRNN and an adaptive learning algorithm. The adaptive learning algorithm is employed to improve the learning speed and prediction accuracy. In M-BLRNN(AL), each resolution has its own activation function with a corresponding slope parameter. The slope parameters of the activation functions are adjusted during the training period. Assume that we have the following cost function:

E = (1/2) Σ_l (t_l − y_l)²    (5)
Fig. 2. Example of Multiscale Bilinear Recurrent Neural Network model with 3 resolution levels
At the output layer, the slope parameter λl at each output neuron l can be iteratively updated by:

λl(n + 1) = λl(n) + μλ (tl − yl) · sl e^{−λl sl} / (1 + e^{−λl sl})²    (6)

Similarly, at the hidden layer, the slope parameter λp at each hidden neuron p can be iteratively updated by:

λp(n + 1) = λp(n) + μλ Σ_l (tl − yl) · [λl e^{−λl sl} / (1 + e^{−λl sl})²] wlp · sp e^{−λp sp} / (1 + e^{−λp sp})²    (7)

By using the adaptive algorithm, the learning process of M-BLRNN(AL) can be improved.
4
Experiments and Results
The performance of the M-BLRNN(AL)-based prediction model is evaluated and compared with prediction models based on conventional algorithms such as MLPNN, BLRNN, and M-BLRNN on real Ethernet traffic data. This data series, one of the most popular benchmarks for evaluating network traffic predictors, is part of a data set collected at Bellcore in August 1989 [12]. The amplitude of the Ethernet traffic data was adjusted to lie in the range of the logistic function. The first 4,000 samples are used for training and the remaining data are used for testing. The BLRNN employs the structure 5-3-1, which denotes the numbers of inputs, hidden units, and outputs, respectively. For the M-BLRNN and M-BLRNN(AL) used in the experiments, the resolution levels vary from 2 to 5. In this assessment, the number of training iterations for the predictors is set to 1,000. In the experiments, the normalized mean square error (NMSE) is used for measuring the prediction performance. The input-output relation for this experiment is as follows:
(8)
where x[n] denotes the traffic at time n and Δ = 1. In the experiments, on-line training and testing are performed: at time n, x̂[n + 1] is predicted, and the predictors are then trained with the actual x[n + 1] as the target output for the prediction of x̂[n + 2]. In this on-line training procedure, the number of training epochs is set to 3. Learning curves of M-BLRNN and M-BLRNN(AL) are shown in Fig. 3. As can be seen from Fig. 3, M-BLRNN(AL) converges much faster than M-BLRNN. Experiments on multi-step predictions (Δ values in Eq. (8) up to 100) were carried out to evaluate the generalization capability of M-BLRNN(AL). The results show that M-BLRNN(AL) achieves prediction accuracy very similar to that of M-BLRNN when compared with the results in [7]. In order to investigate the effect of resolution levels on the performance of the M-BLRNN(AL)-based predictor, further experiments were performed by varying
Fig. 3. Learning curves of M-BLRNN and M-BLRNN(AL) (error vs. training epoch)
Table 1. Effect of resolution levels on prediction accuracy in NMSE (mean ± sd)

Resolution level | 10 steps    | 30 steps    | 60 steps    | 100 steps
2                | 1.86 ± 0.06 | 1.42 ± 0.05 | 1.28 ± 0.03 | 1.87 ± 0.08
3                | 1.21 ± 0.08 | 1.12 ± 0.01 | 1.04 ± 0.03 | 1.30 ± 0.04
4                | 1.18 ± 0.07 | 1.07 ± 0.03 | 1.09 ± 0.02 | 1.27 ± 0.03
5                | 2.71 ± 0.28 | 1.49 ± 0.07 | 1.53 ± 0.05 | 2.35 ± 0.12
the resolution levels up to 5. For each combination of prediction step and resolution level, 20 experiments were performed and the resulting NMSEs were obtained. The results are summarized in Table 1 in terms of the mean and standard deviation of the obtained NMSEs. As can be seen from Table 1, the best performance was achieved when the resolution level was 3 or 4. This is a somewhat interesting result, which can be attributed to the highly nonlinear and random characteristics of Ethernet network traffic data.
5
Conclusions
An Ethernet network traffic prediction model using the Multiscale-Bilinear Recurrent Neural Network with adaptive learning (M-BLRNN(AL)) is proposed in this paper. The performance of the M-BLRNN(AL)-based prediction model is evaluated and compared with the M-BLRNN-based prediction model on real Ethernet traffic data. The M-BLRNN(AL)-based predictor shows an improvement in convergence speed, an important issue in BLRNN training, over M-BLRNN. As far as prediction accuracy is concerned, M-BLRNN(AL) and M-BLRNN are very similar in terms of NMSE in the multi-step prediction experiments. For nonlinear network traffic prediction problems, M-BLRNN(AL) appears to perform well enough to be of practical use. However, M-BLRNN(AL) still produces some inaccurate predictions at the peaks. This implies that more study on how to achieve higher accuracy at peaks should be carried out in future research.
Acknowledgments. This work was supported by the Korea Science and Engineering Foundation (KOSEF) grant funded by the Korean government (MOST) (Grant No.: R012007-000-20330-0). The author would like to thank Intelligent Research Laboratory (ICRL) members including J.-Y. Kim and V.-L. Huong for their help in preparing this manuscript.
References
1. Wu, W.R., Chen, P.C.: Adaptive AR Modeling in White Gaussian Noise. IEEE Trans. on Signal Processing 45, 1184–1192 (1997)
2. Kiruluta, A., Eizenman, M., Pasupathy, S.: Predictive Head Movement Tracking using a Kalman Filter. IEEE Trans. on Systems, Man and Cybernetics 27, 326–331 (1997)
3. Chun, Y., Chandra, K.: Time Series Models for the Internet Data Traffic. In: 24th Conference on Local Computer Networks, pp. 164–171 (1999)
4. Aimin, S., Sanqi, L.: A Predictability Analysis of the Network Traffic. INFOCOM 1, 342–351 (2000)
5. Park, D.C., El-Sharkawi, M.A., Marks II, R.J., Atlas, L.E., Damborg, M.J.: Electric Load Forecasting using an Artificial Neural Network. IEEE Trans. Power System 6, 442–449 (1991)
6. Leung, H., Lo, T., Wang, S.: Prediction of Noisy Chaotic Time Series using an Optimal Radial Basis Function Neural Network. IEEE Trans. on Neural Networks 12, 1163–1172 (2001)
7. Park, D.C., Tran, C.N., Lee, Y.: Multiscale BiLinear Recurrent Neural Networks and Their Application to the Long-Term Prediction of Network Traffic. In: Wang, J., Yi, Z., Zurada, J.M., Lu, B.-L., Yin, H. (eds.) ISNN 2006. LNCS, vol. 3973, pp. 196–201. Springer, Heidelberg (2006)
8. Park, D.C., Jeong, T.K.: Complex Bilinear Recurrent Neural Network for Equalization of a Satellite Channel. IEEE Trans. on Neural Network 13, 711–725 (2002)
9. Mallat, S.G.: A Theory for Multiresolution Signal Decomposition: The Wavelet Representation. IEEE Trans. on Pattern Analysis and Machine Intelligence 11, 674–693 (1989)
10. Liang, Y., Page, E.W.: Multiresolution Learning Paradigm and Signal Prediction. IEEE Trans. Sig. Proc. 45, 2858–2864 (1997)
11. Renaud, O., Starck, J.L., Murtagh, F.: Wavelet-Based Combined Signal Filtering and Prediction. IEEE Trans. on Systems, Man and Cybernetics 35, 1241–1251 (2005)
12. Fowler, H.J., Leland, W.E.: Local Area Network Traffic Characteristics with Implications for Broadband Network Congestion Management. IEEE JSAC, pp. 1139–1149 (1991)
Replay Attacks on Han et al.’s Chaotic Map Based Key Agreement Protocol Using Nonce

Eun-Jun Yoon1 and Kee-Young Yoo2

1 School of Electrical Engineering and Computer Science, Kyungpook National University, 1370 Sankyuk-Dong, Buk-Gu, Daegu 702-701, South Korea
[email protected]
2 Department of Computer Engineering, Kyungpook National University, 1370 Sankyuk-Dong, Buk-Gu, Daegu 702-701, South Korea
Tel.: +82-53-950-5553; Fax: +82-53-957-4846
[email protected]

Abstract. In 2008, Han et al. proposed two key agreement protocols based on chaotic maps: a timestamp based protocol and a nonce based protocol. The current paper, however, demonstrates the vulnerability of Han et al.’s nonce based key agreement protocol to replay attacks. Keywords: Cryptanalysis, Key agreement, Chaotic maps, Protocol, Nonce.
1
Introduction
In 2003, Kocarev et al. [1] proposed a new encryption system, a cryptographic system using chaotic maps, especially Chebyshev chaotic maps [2]. Following Kocarev et al.’s work, Xiao et al. [3] proposed a chaos-based deniable authentication scheme. In 2005, however, Bergamo et al. [5] showed an attack on Xiao et al.’s protocol. In 2006, Chang et al. [7] proposed a new key agreement protocol using a chaotic map and a passphrase. Chang et al.’s protocol, however, can only work in a clock-synchronized environment. In 2007, Xiao et al. [4] proposed a novel key agreement protocol based on chaotic maps using a nonce. Xiao et al.’s new protocol, however, was compromised by Han’s attack methods [6]. In 2008, in order to enhance security and extend flexibility and usability, Han et al. [8] proposed two key agreement protocols based on chaotic maps. The first one works in a clock-synchronized environment (timestamp based protocol), and the second one can work without clock synchronization (nonce based protocol). Han et al. claimed that even if a replaying attacker interferes with the communication, the user and the server can still establish a shared session key securely. The current paper, however, demonstrates the vulnerability of Han et al.’s nonce based key agreement protocol to the replay attacks [6]. That is, if there
Corresponding author.
D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 533–540, 2008. © Springer-Verlag Berlin Heidelberg 2008
exists a replaying attacker interfering with the communication, the user and the server cannot establish a correct shared session key securely. Therefore, Han et al.’s protocol does not satisfy the important security property of resistance against replay attacks, contrary to their claims. This paper is organized as follows: In Section 2, we briefly review Han et al.’s nonce based key agreement protocol based on chaotic maps. An outline of the replay attacks on Han et al.’s protocol is presented in Section 3. Finally, our conclusions are given in Section 5.
2
Review of Han et al.’s Nonce Based Protocol
This section briefly reviews Han et al.’s chaotic map based key agreement protocol using nonce [8]. Some of the notations used in Han et al.’s protocol are defined as follows:

– A, IDA: a user and his/her identity number, respectively.
– B, IDB: a server and his/her identity number, respectively.
– PW: a password of user A.
– H(·): a chaotic hash function.
– β: a private key of server B, where β ∈ [−1, 1].
– j0, i0: threshold values such that the semi-group property holds for any j ≥ j0 and any i ≥ i0, respectively.
– Eh(·): a symmetric encryption algorithm with h, where h = H(IDB, IDA, β, PW) is the encryption/decryption key.
Assume that the server B and the user A secretly share the hash value h = H(IDB, IDA, β, PW), where IDB, IDA, β and PW are concatenated from left to right as the message to be hashed. Han et al.’s protocol proceeds as follows:

(1) A → B: AU1, r1, n1, IDA
A chooses a random number r1 ∈ [−1, 1] and a random nonce n1, and computes AU1 = H(h, r1, n1, IDA). A then sends AU1, r1, n1 and IDA to B.
(2) After receiving AU1, r1, n1 and IDA, B computes AU2 = H(h, r1, n1, IDA) and then checks whether AU2 = AU1. If not, then B stops this protocol; otherwise, A is authenticated and B goes to the next step.
(3) B → A: AU3, r2, n2, IDB
B chooses a random number r2 ∈ [−1, 1] and a random nonce n2, and computes AU3 = H(h, r2, n2, IDB). B then sends AU3, r2, n2 and IDB to A.
(4) After receiving AU3, r2, n2 and IDB, A computes AU4 = H(h, r2, n2, IDB) and then checks whether AU4 = AU3. If not, then A stops this protocol; otherwise, B is authenticated and A goes to the next step.
(5) A → B: X
A chooses a random integer j ≥ j0, computes X = Eh(n1, Tj(x)), and sends it to B.
Replay Attacks on Han et al.’s Chaotic Map
535
(6) B → A: Y
B chooses a random integer i ≥ i0, computes Y = Eh(n2, Ti(x)), and sends it to A.
(7) After receiving X, B obtains n1 and Tj(x) by decrypting X, and checks whether the decrypted n1 equals the n1 received in Step (1). If so, B computes the shared secret session key as Ti(Tj(x)) = Tij(x) = Tji(x) = Tj(Ti(x)); otherwise, B stops here and restarts the key agreement process with A.
(8) After receiving Y, A obtains n2 and Ti(x) by decrypting Y, and checks whether the decrypted n2 equals the n2 received in Step (3). If so, A computes the shared secret session key as Tj(Ti(x)) = Tji(x) = Tij(x) = Ti(Tj(x)); otherwise, A stops here and restarts the key agreement process with B.
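An honest run of steps (1)-(8) can be sketched as follows. This is an illustrative model only: SHA-256 stands in for the chaotic hash H(·), the concrete identities, β and PW are made up, the seed x is taken to be r1, and the symmetric cipher E_h is elided since both honest parties hold h:

```python
import hashlib
import math
import random

def T(n, x):
    """Chebyshev polynomial T_n(x) via the trigonometric form, x in [-1, 1]."""
    return math.cos(n * math.acos(x))

def H(*parts):
    """Stand-in for the chaotic hash H(.): SHA-256 over the joined parts."""
    return hashlib.sha256("|".join(map(str, parts)).encode()).hexdigest()

ID_A, ID_B = "alice", "server"
h = H(ID_B, ID_A, 0.37, "pw")      # shared long-term secret (illustrative values)

random.seed(0)
# Steps (1)-(2): A -> B, then B verifies AU1.
r1, n1 = random.uniform(-1, 1), random.getrandbits(64)
AU1 = H(h, r1, n1, ID_A)
assert H(h, r1, n1, ID_A) == AU1
# Steps (3)-(4): B -> A, then A verifies AU3.
r2, n2 = random.uniform(-1, 1), random.getrandbits(64)
AU3 = H(h, r2, n2, ID_B)
assert H(h, r2, n2, ID_B) == AU3

# Steps (5)-(8): exchange T_j(x) and T_i(x), then derive the key on both sides.
j, i, x = 7, 11, r1
key_B = T(i, T(j, x))   # B: T_i(T_j(x)) = T_ij(x)
key_A = T(j, T(i, x))   # A: T_j(T_i(x)) = T_ji(x)
print(abs(key_A - key_B) < 1e-9)  # both sides agree -> True
```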
3 Replay Attacks on Han et al.'s Nonce Based Protocol
This section demonstrates that Han et al.'s nonce based key agreement protocol is vulnerable to three replay attacks. These attacks show that Han et al.'s protocol cannot help the user and the server fulfil their purpose of establishing a secure secret session key. We call each full run of Han et al.'s key agreement protocol a protocol run, and the kth full run the kth protocol run. Suppose the seeds for the Chebyshev polynomial map in the kth protocol run and the tth protocol run are rk1 and rt1, respectively. Here, k < t and rk1 ≠ rt1 (rk1, rt1 ∈ [−1, 1] are random numbers).

3.1 Replay Attack 1
kth protocol run: With different seeds rk1 and rt1, we inspect Step (k1) and Step (k5).
(k1) A → B: AUk1, rk1, nk1, IDA
A chooses a random number rk1 ∈ [−1, 1] and a random nonce nk1, and computes AUk1 = H(h, rk1, nk1, IDA). A then sends AUk1, rk1, nk1 and IDA to B.
(k5) A → B: Xk
A chooses a random integer jk ≥ j0, computes Xk = Eh(nk1, Tjk(x)), and sends it to B.
An adversary can easily intercept AUk1, rk1, nk1, IDA and Xk. The adversary cannot obtain Tjk(x), since it is encrypted under h. However, the adversary can exploit AUk1, rk1, nk1, IDA and Xk to prevent user A and server B from establishing a shared session key in the tth protocol run. In the following, we walk through the tth protocol run step by step to demonstrate how the adversary achieves this.
536
E.-J. Yoon and K.-Y. Yoo
tth protocol run:
(t1) A → B: AUt1, rt1, nt1, IDA
A chooses a random number rt1 ∈ [−1, 1] and a random nonce nt1, and computes AUt1 = H(h, rt1, nt1, IDA). A then sends AUt1, rt1, nt1 and IDA to B.
(t1.1) The adversary intercepts AUt1, rt1, nt1 and IDA, and does not let them arrive at server B.
(t1.2) Adversary → B: AUk1, rk1, nk1, IDA
The adversary replaces AUt1, rt1, nt1, IDA with AUk1, rk1, nk1, IDA, which were intercepted in the kth protocol run, and sends AUk1, rk1, nk1, IDA to server B.
(t2) After receiving AUk1, rk1, nk1 and IDA, B will compute AUt2 = H(h, rk1, nk1, IDA) and then check whether AUt2 = AUk1. Because AUt2 is always equal to AUk1, the adversary is authenticated and B will go to the next step.
(t3) B → A: AUt3, rt2, nt2, IDB
B will choose a random number rt2 ∈ [−1, 1] and a random nonce nt2, and compute AUt3 = H(h, rt2, nt2, IDB). B then will send AUt3, rt2, nt2 and IDB to A.
(t4) After receiving AUt3, rt2, nt2 and IDB, A will compute AUt4 = H(h, rt2, nt2, IDB) and then check whether AUt4 = AUt3. Because AUt4 is always equal to AUt3, B is authenticated and A will go to the next step.
(t5) A → B: Xt
A chooses a random integer jt ≥ j0, computes Xt = Eh(nt1, Tjt(rt1)), and sends it to B.
(t5.1) The adversary intercepts Xt, and does not let it arrive at server B.
(t5.2) Adversary → B: Xk
The adversary replaces Xt with Xk, which was intercepted in the kth protocol run, and sends Xk to server B.
(t6) B → A: Yt
B will choose a random integer it ≥ i0, compute Yt = Eh(nt2, Tit(rt1)), and send it to A.
(t7) After receiving Xk, B will obtain nk1 and Tjk(rk1) by decrypting Xk, and check whether the decrypted nonce equals the nk1 received in Step (t1.2). Because it always does, B will compute the shared secret session key as follows:

Tit(Tjk(rk1)) = Tit jk(rk1)
(1)
(t8) After receiving Yt, A will obtain nt2 and Tit(rt1) by decrypting Yt, and check whether the decrypted nonce equals the nt2 received in Step (t3). Because it always does, A will compute the shared secret session key as follows:

Tjt(Tit(rt1)) = Tit jt(rt1)
(2)
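The mismatch between the two derived keys can be checked numerically. A minimal sketch of this attack, with SHA-256 standing in for the chaotic hash, E_h modelled as a plain pair, and fixed illustrative seeds and degrees (none of these values come from the paper):

```python
import hashlib
import math

def T(n, x):
    """Chebyshev polynomial T_n(x) via T_n(cos t) = cos(n t)."""
    return math.cos(n * math.acos(x))

def H(*parts):
    """SHA-256 stand-in for the chaotic hash (assumption)."""
    return hashlib.sha256("|".join(map(str, parts)).encode()).hexdigest()

h = H("server", "alice", 0.37, "pw")   # shared secret, illustrative values

# k-th run: the adversary records A's messages.
rk1, nk1, jk = 0.3, 1111, 7
AUk1 = H(h, rk1, nk1, "alice")
Xk = (nk1, T(jk, rk1))                 # E_h(n_k1, T_jk(r_k1)) modelled as a pair

# t-th run: the adversary replays AUk1, rk1, nk1 in step (t1.2) ...
assert H(h, rk1, nk1, "alice") == AUk1  # step (t2): B's check still passes
# ... and replays Xk in step (t5.2); the decrypted nonce matches nk1, so B
# accepts and derives equation (1), while A derives equation (2).
rt1, it, jt = -0.6, 11, 5
key_B = T(it, Xk[1])                   # T_it(T_jk(r_k1))
key_A = T(jt, T(it, rt1))              # T_jt(T_it(r_t1))
print(abs(key_A - key_B) > 1e-6)       # the keys disagree -> True
```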
After the tth protocol run is completed, it is easy to see from equations (1) and (2) that Tit jk(rk1) ≠ Tit jt(rt1), because of the randomness of rt1 and rk1 as well as nt1 and nk1. Therefore, the adversary has successfully prevented user A and server B from establishing a shared session key for subsequent cryptographic applications and communications.

3.2 Replay Attack 2
kth protocol run: With different seeds rk1 and rt1, we inspect Step (k3) and Step (k6).
(k3) B → A: AUk3, rk2, nk2, IDB
B chooses a random number rk2 ∈ [−1, 1] and a random nonce nk2, and computes AUk3 = H(h, rk2, nk2, IDB). B then sends AUk3, rk2, nk2 and IDB to A.
(k6) B → A: Yk
B chooses a random integer ik ≥ i0, computes Yk = Eh(nk2, Tik(rk1)), and sends it to A.
An adversary intercepts AUk3, rk2, nk2, IDB and Yk in the kth protocol run.
tth protocol run:
(t1) A → B: AUt1, rt1, nt1, IDA
A chooses a random number rt1 ∈ [−1, 1] and a random nonce nt1, and computes AUt1 = H(h, rt1, nt1, IDA). A then sends AUt1, rt1, nt1 and IDA to B.
(t2) After receiving AUt1, rt1, nt1 and IDA, B will compute AUt2 = H(h, rt1, nt1, IDA) and then check whether AUt2 = AUt1. Because AUt2 is always equal to AUt1, A is authenticated and B will go to the next step.
(t3) B → A: AUt3, rt2, nt2, IDB
B will choose a random number rt2 ∈ [−1, 1] and a random nonce nt2, and compute AUt3 = H(h, rt2, nt2, IDB). B then will send AUt3, rt2, nt2 and IDB to A.
(t3.1) The adversary intercepts AUt3, rt2, nt2 and IDB, and does not let them arrive at user A.
(t3.2) Adversary → A: AUk3, rk2, nk2, IDB
The adversary replaces AUt3, rt2, nt2, IDB with AUk3, rk2, nk2, IDB, which were intercepted in the kth protocol run, and sends AUk3, rk2, nk2 and IDB to user A.
(t4) After receiving AUk3, rk2, nk2 and IDB, A will compute AUt4 = H(h, rk2, nk2, IDB) and then check whether AUt4 = AUk3. Because AUt4 is always equal to AUk3, B is authenticated and A will go to the next step.
(t5) A → B: Xt
A chooses a random integer jt ≥ j0, computes Xt = Eh(nt1, Tjt(rt1)), and sends it to B.
(t6) B → A: Yt
B will choose a random integer it ≥ i0, compute Yt = Eh(nt2, Tit(rt1)), and send it to A.
(t6.1) The adversary intercepts Yt, and does not let it arrive at user A.
(t6.2) Adversary → A: Yk
The adversary replaces Yt with Yk, which was intercepted in the kth protocol run, and sends Yk to user A.
(t7) After receiving Xt, B will obtain nt1 and Tjt(rt1) by decrypting Xt, and check whether the decrypted nonce equals the nt1 received in Step (t1). Because it always does, B will compute the shared secret session key as follows:

Tit(Tjt(rt1)) = Tit jt(rt1)
(3)
(t8) After receiving Yk, A will obtain nk2 and Tik(rk1) by decrypting Yk, and check whether the decrypted nonce equals the nk2 received in Step (t3.2). Because it always does, A will compute the shared secret session key as follows:

Tjt(Tik(rk1)) = Tik jt(rk1)
(4)
After the tth protocol run is completed, it is easy to see from equations (3) and (4) that Tit jt(rt1) ≠ Tik jt(rk1), because of the randomness of rt2 and rk2 as well as nt2 and nk2. Therefore, the adversary has successfully prevented user A and server B from establishing a shared session key for subsequent cryptographic applications and communications.

3.3 Replay Attack 3
kth protocol run: With different seeds rk1 and rt1, we inspect Steps (k1), (k3), (k5) and (k6).
(k1) A → B: AUk1, rk1, nk1, IDA
A chooses a random number rk1 ∈ [−1, 1] and a random nonce nk1, and computes AUk1 = H(h, rk1, nk1, IDA). A then sends AUk1, rk1, nk1 and IDA to B.
(k3) B → A: AUk3, rk2, nk2, IDB
B chooses a random number rk2 ∈ [−1, 1] and a random nonce nk2, and computes AUk3 = H(h, rk2, nk2, IDB). B then sends AUk3, rk2, nk2 and IDB to A.
(k5) A → B: Xk
A chooses a random integer jk ≥ j0, computes Xk = Eh(nk1, Tjk(x)), and sends it to B.
(k6) B → A: Yk
B chooses a random integer ik ≥ i0, computes Yk = Eh(nk2, Tik(rk1)), and sends it to A.
An adversary intercepts (AUk1, rk1, nk1, IDA), (AUk3, rk2, nk2, IDB), Xk and Yk in the kth protocol run.
tth protocol run:
(t1) A → B: AUt1, rt1, nt1, IDA
A chooses a random number rt1 ∈ [−1, 1] and a random nonce nt1, and computes AUt1 = H(h, rt1, nt1, IDA). A then sends AUt1, rt1, nt1 and IDA to B.
(t1.1) The adversary intercepts AUt1, rt1, nt1 and IDA, and does not let them arrive at server B.
(t1.2) Adversary → B: AUk1, rk1, nk1, IDA
The adversary replaces AUt1, rt1, nt1, IDA with AUk1, rk1, nk1, IDA, which were intercepted in the kth protocol run, and sends AUk1, rk1, nk1, IDA to server B.
(t2) After receiving AUk1, rk1, nk1 and IDA, B will compute AUt2 = H(h, rk1, nk1, IDA) and then check whether AUt2 = AUk1. Because AUt2 is always equal to AUk1, the adversary is authenticated and B will go to the next step.
(t3) B → A: AUt3, rt2, nt2, IDB
B will choose a random number rt2 ∈ [−1, 1] and a random nonce nt2, and compute AUt3 = H(h, rt2, nt2, IDB). B then will send AUt3, rt2, nt2 and IDB to A.
(t3.1) The adversary intercepts AUt3, rt2, nt2 and IDB, and does not let them arrive at user A.
(t3.2) Adversary → A: AUk3, rk2, nk2, IDB
The adversary replaces AUt3, rt2, nt2, IDB with AUk3, rk2, nk2, IDB, which were intercepted in the kth protocol run, and sends AUk3, rk2, nk2 and IDB to user A.
(t4) After receiving AUk3, rk2, nk2 and IDB, A will compute AUt4 = H(h, rk2, nk2, IDB) and then check whether AUt4 = AUk3. Because AUt4 is always equal to AUk3, B is authenticated and A will go to the next step.
(t5) A → B: Xt
A chooses a random integer jt ≥ j0, computes Xt = Eh(nt1, Tjt(rt1)), and sends it to B.
(t5.1) The adversary intercepts Xt, and does not let it arrive at server B.
(t5.2) Adversary → B: Xk
The adversary replaces Xt with Xk, which was intercepted in the kth protocol run, and sends Xk to server B.
(t6) B → A: Yt
B will choose a random integer it ≥ i0, compute Yt = Eh(nt2, Tit(rt1)), and send it to A.
(t6.1) The adversary intercepts Yt, and does not let it arrive at user A.
(t6.2) Adversary → A: Yk
The adversary replaces Yt with Yk, which was intercepted in the kth protocol run, and sends Yk to user A.
540
E.-J. Yoon and K.-Y. Yoo
(t7) After receiving Xk, B will obtain nk1 and Tjk(rk1) by decrypting Xk, and check whether the decrypted nonce equals the replayed nk1. Because it always does, B will compute the shared secret session key as follows:

Tit(Tjk(rk1)) = Tit jk(rk1)

(5)

(t8) After receiving Yk, A will obtain nk2 and Tik(rk1) by decrypting Yk, and check whether the decrypted nonce equals the replayed nk2. Because it always does, A will compute the shared secret session key as follows:

Tjt(Tik(rk1)) = Tik jt(rk1)

(6)
After the tth protocol run is completed, it is easy to see from equations (5) and (6) that Tit jk(rk1) ≠ Tik jt(rk1), because of the randomness of rt1, rk1, rt2 and rk2 as well as nt1, nk1, nt2 and nk2. Therefore, the adversary has successfully prevented user A and server B from establishing a shared session key for subsequent cryptographic applications and communications.
4 Conclusions
In 2007, Han et al. proposed two key agreement protocols based on chaotic maps: a timestamp based protocol and a nonce based protocol. The current paper demonstrated the vulnerability of Han et al.'s nonce based key agreement protocol to replay attacks.
Acknowledgements. Eun-Jun Yoon was supported by the 2nd Brain Korea 21 Project in 2008. Kee-Young Yoo was supported by the MKE (Ministry of Knowledge Economy) of Korea, under the ITRC support program supervised by the IITA (IITA-2008-C1090-0801-0026).
References
1. Kocarev, L., Tasev, Z.: Public-key Encryption based on Chebyshev Maps. In: Proc. IEEE Symp. Circuits Syst. (ISCAS 2003), vol. 3, pp. 28–31 (2003)
2. Rivlin, T.J.: Chebyshev Polynomials. John Wiley and Sons, Inc., New York (1990)
3. Xiao, D., Liao, X.F., Wong, K.: An Efficient Entire Chaos-based Scheme for Deniable Authentication. Chaos, Solitons & Fractals 23(4), 1327–1331 (2005)
4. Xiao, D., Liao, X.F., Deng, S.J.: A Novel Key Agreement Protocol based on Chaotic Maps. Inform. Sci. 177, 1136–1142 (2007)
5. Bergamo, P., D'Arco, P., Santis, A., Kocarev, L.: Security of Public Key Cryptosystems based on Chebyshev Polynomials. IEEE Trans. Circ. Syst.-I 52(7), 1382–1393 (2005)
6. Han, S.: Security of a Key Agreement Protocol based on Chaotic Maps. Chaos, Solitons & Fractals 38(3), 764–768 (2008)
7. Chang, E., Han, S.: Using Passphrase to Construct Key Agreement. CBS-IS-, Technical Report, Curtin University of Technology (2006)
8. Han, S., Chang, E.: Chaotic Map based Key Agreement With/out Clock Synchronization. Chaos, Solitons & Fractals, doi:10.1016/j.chaos.2007.06.030 (in press, 2007)
The Short-Time Multifractal Formalism: Definition and Implement*

Xiong Gang1,2, Yang Xiaoniu1, and Zhao Huichang2

1 No.36 Research Institute of CETC, National Laboratory of Information Control Technology for Communication Systems, Jiaxing, Zhejiang 314001, China
2 Department of Electronic Engineering, NJUST, Nanjing 210094, China
[email protected],
[email protected] Abstract. Although multifractal descirbles the singularity distribution of SE, there is no time information in the multifractal formalism, and the time-varying singularity distribu-tion indicates the spatial dynamics character of system. Therefore, the definition and implement of the short-time multifractal formalism is proposed, which is the prelude of time time-singularity spectra distribution.In this paper, the singularity analysis of windowed signal was given, further the short-time hausdorff spectum was deduced. The Partition Function and Short-time Legendre Spectrum was fractal statistical distribution of SE. WTMM method is popular in implement of MFA, and in section ,Short-time multifractal spectra based on WTMM is brough forward..
Ⅳ
1 Introduction

Biosignals such as the electroencephalogram (EEG) and electrocardiogram (ECG), as well as other signals such as turbulent flows, lightning strikes, DNA sequences, and geographical objects, represent some of many natural phenomena which are very difficult to characterize using traditional signal processing. Such signals are mostly nonstationary in time and/or space, and have a nonlinear behaviour. Thus, spectral methods (e.g., the Fourier transform) are insufficient for analyzing them. There is strong evidence indicating that such signals have similar behaviour at multiple scales. This property is often referred to as fractality (aka self-affinity, long-range dependence, or long-range autocorrelation). Characterization of such fractal signals can be achieved through a measure of singularity α (aka the Hölder or Lipschitz exponent), the Mandelbrot singularity spectrum (MS) f(α), and the generalized fractal dimensions D(q). For a monofractal signal, the MS shows only one point in the spectrum. The MS of a multifractal signal represents a spectrum of singularities and their dimension. The characteristics of α and f(α) are deeply rooted in thermodynamics, and have been discussed extensively from the mathematical point of view.

* This paper is supported by the Post-doctoral Research Foundation of Zhejiang Province (No. 2006-bsh-27) and National Science Foundation Research: Time-Dimension Spectral Distribution and the Affine Class Time-Frequency Processing of Stochastic Multifractal (No. 60702016).

D.-S. Huang et al. (Eds.): ICIC 2008, CCIS 15, pp. 541–548, 2008. © Springer-Verlag Berlin Heidelberg 2008
542
X. Gang, Y. Xiaoniu, and Z. Huichang
Although the multifractal formalism describes the singularity distribution of SE, it contains no time information, whereas the time-varying singularity distribution reflects the spatial dynamics of a system. Therefore, the definition and implementation of the short-time multifractal formalism is proposed, as a prelude to the time-singularity spectral distribution. In Section 2, the singularity analysis of a windowed signal is given, and the short-time Hausdorff spectrum is deduced. The partition function and short-time Legendre spectrum in Section 3 provide a fractal statistical description of the SE. The WTMM method is popular for implementing multifractal analysis, and in Section 4 a short-time multifractal spectrum based on the WTMM is brought forward.
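As a numerical illustration of the singularity measure α introduced above, the local Hölder exponent at a point can be estimated from the log-log slope of the local oscillation around that point. A minimal sketch, in which the cusp |u − 0.5|^0.4 (whose exponent at u = 0.5 is 0.4), the grid density and the scale range are all illustrative choices:

```python
import math

def holder_estimate(f, tau, scales):
    """Estimate the local Holder exponent of f at tau as the slope of
    log2 osc(eps) against log2(2*eps), osc being the local oscillation."""
    xs, ys = [], []
    for k in scales:
        eps = 2.0 ** (-k)
        # Oscillation sup_{|u - tau| <= eps} |f(tau) - f(u)| on a fine grid.
        grid = [tau - eps + 2 * eps * i / 400 for i in range(401)]
        osc = max(abs(f(tau) - f(u)) for u in grid)
        xs.append(math.log2(2 * eps))
        ys.append(math.log2(osc))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)

# The cusp |u - 0.5|**0.4 has Holder exponent 0.4 at u = 0.5.
f = lambda u: abs(u - 0.5) ** 0.4
print(round(holder_estimate(f, 0.5, range(4, 12)), 2))  # 0.4
```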
2 Short-Time Singularity Exponent and Hausdorff Spectrum

In a macroscopic sense, the short-time multifractal formalism provides the instantaneous singularity distribution, which makes its definition difficult. Over a short time interval the singularity distribution is time-varying, and the premise of the analysis below is that a superposition of singular signals retains a linear additive character under time-varying analysis.

2.1 Singularity Analysis of Windowed Signal

Assume the characteristic window of the singular signal is h(τ − t); the windowed signal is

v(t, τ) = u(τ)h(τ − t)

Definition 2.1. A function or the path of a process v(t, τ) is said to be in C_τ^h if there is a polynomial P_u(τ) such that

|v(t, τ) − P_u(τ)| ≤ C|u − τ|^h

for u sufficiently close to τ. Then the degree of local Hölder regularity of v(t, τ) at τ is

H(t, τ) := sup{h : v(t, τ) ∈ C_τ^h}

Of special interest for our purpose is the case when the approximating polynomial P_t is a constant, i.e., P_u(τ) = v(t, τ), in which case H(t, τ) can be computed easily. To that end:

Definition 2.2. Let us agree on the convention log(0) = −∞ and set

h(t, τ) = liminf_{ε→0} log2( sup_{|u−τ|<ε} |v(t, τ) − v(t, u)| ) / log2(2ε)

When X ∈ C^h(t0), there exists a constant C > 0 such that

|d_X(j, k)| ≤ C 2^{jh}(1 + |2^{−j} t0 − k|^h).

Loosely speaking, it is commonly read as the fact that when X has Hölder exponent h at
t0 = 2^j k, the corresponding wavelet coefficients d_X(j, k) are of the order of magnitude |d_X(j, k)| ~ 2^{jh}. This is precisely the case of the cusp-like function mentioned above. Further results relating the decrease of wavelet coefficients along scales to the Hölder exponent can be found in, e.g., [8].

4.2 Wavelet Coefficient Based Multifractal Formalism

Wavelet coefficient based structure functions and scaling exponents are defined as:
S_d(t, q, j) = (1/n_j) Σ_{k=1}^{n_j} |d_X(t, j, k)|^q

τ_d(t, q) = liminf_{j→0} ( log2 S_d(t, q, j) / j )
where n_j is the number of available d_X(t, j, k) at octave j: n_j ≈ n_0 2^{−j}. By definition of the multifractal spectrum, there are about (2^j)^{−D(h)} points with Hölder exponent h, hence with wavelet coefficients of the order d_X(t, j, k) ≈ (2^j)^h. They contribute to S_d(t, q, j) as

~ 2^j (2^j)^{qh} (2^j)^{−D(h)} = (2^j)^{1+qh−D(h)}.

Therefore, S_d(t, q, j) will behave as ~ c_q (2^j)^{τ_d(t,q)}, and a standard steepest descent argument yields a Legendre transform relationship between the multifractal spectrum D(h) and the scaling exponents τ_d(t, q):

τ_d(t, q) = inf_h (1 + qh − D(h)).

The Wavelet
Coefficient based Multifractal Formalism (hereafter WCMF) is standardly said to hold when the following equality is valid:
D(t, h) = inf_{q≠0} (1 + qh − τ_d(t, q))
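This Legendre transform can be evaluated numerically for a given family of scaling exponents. A small sketch using an illustrative quadratic (log-normal-like) τ(q) with H = 0.6 and curvature c = 0.05, both assumed values rather than figures from the paper:

```python
def legendre_spectrum(tau, h, qs):
    """Numerical Legendre transform D(h) = inf_{q != 0} (1 + q*h - tau(q))."""
    return min(1 + q * h - tau(q) for q in qs)

# Grid of moments q in [-10, 10] excluding q = 0, step 0.1.
qs = [i / 10 for i in range(-100, 101) if i != 0]

# Illustrative quadratic scaling exponents tau(q) = H*q - (c/2)*q**2 with
# H = 0.6, c = 0.05; analytically D(h) = 1 - (h - H)**2 / (2*c).
tau = lambda q: 0.6 * q - 0.025 * q * q

print(round(legendre_spectrum(tau, 0.6, qs), 3))  # 1.0
print(round(legendre_spectrum(tau, 0.7, qs), 3))  # 0.9
```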
5 Experimental Results and Discussion

Fig. 1 shows the original radar sea clutter data, and Fig. 2 shows the partition function and the multifractal spectrum of the fractal sea clutter. From the computed multifractal spectrum we can see that the sea clutter is multifractal.
Fig. 1. The sea clutter of radar
Fig. 2. The partition function and the multifractal spectrum of the sea clutter of radar
Fig. 3. The short-time multifractal spectrum of the sea clutter
Fig. 3 gives the short-time multifractal spectral distribution of the sea clutter, from which it can be seen that the multifractality of the sea clutter is time-varying; the short-time multifractal spectrum can extract more multifractal characteristics than the global multifractal spectrum, especially when the multifractality changes over time.
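The sliding-window computation behind such a time-varying spectrum can be sketched as follows, here with a hand-rolled Haar transform on synthetic white noise rather than sea clutter; the window length, octave range and moment order q are all illustrative assumptions:

```python
import math
import random

def haar_details(x):
    """Haar DWT detail coefficients per octave for a window of dyadic length."""
    levels, a = [], list(x)
    while len(a) >= 4:
        levels.append([(a[2 * k] - a[2 * k + 1]) / math.sqrt(2)
                       for k in range(len(a) // 2)])
        a = [(a[2 * k] + a[2 * k + 1]) / math.sqrt(2) for k in range(len(a) // 2)]
    return levels

def tau_hat(x, q):
    """Least-squares estimate of the scaling exponent tau_d(q) in one window."""
    pts = [(j + 1, math.log2(sum(abs(d) ** q for d in lev) / len(lev)))
           for j, lev in enumerate(haar_details(x))]
    n = len(pts)
    mj = sum(j for j, _ in pts) / n
    ms = sum(s for _, s in pts) / n
    return sum((j - mj) * (s - ms) for j, s in pts) / \
        sum((j - mj) ** 2 for j, _ in pts)

# Slide a window over the signal and estimate tau(2) in each window, giving a
# time-indexed family tau_d(t, q) -- the short-time analogue of one global
# scaling exponent.
random.seed(0)
signal = [random.gauss(0, 1) for _ in range(2 ** 12)]
win = 2 ** 9
taus = [tau_hat(signal[s:s + win], 2.0)
        for s in range(0, len(signal) - win + 1, win)]
print(len(taus))  # 8
```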
References
1. Arneodo, A., Audit, B., Bacry, E., Manneville, S., Muzy, J.F., Roux, S.G.: Thermodynamics of Fractal Signals Based on Wavelet Analysis: Application to Fully Developed Turbulence Data and DNA Sequences. Physica A 254, 24–45 (1998)
2. Arneodo, A., Bacry, E., Muzy, J.F.: The Thermodynamics of Fractals Revisited with Wavelets. Physica A 213(1-2), 232–275 (1995)
3. Bacry, E.: LastWave Package. Web document (1997), http://www.cmap.polytechnique.fr/~bacry/LastWave/
4. Donoho, D., Duncan, M.R., Huo, X.: WaveLab Documents (1999), http://www.stat.stanford.edu/~wavelab/
5. Faghfouri, A., Kinsner, W.: 1D Mandelbrot Singularity Spectrum, Ver. 1.0 (2005), http://www.ee.umanitoba.ca/~kinsner/projects
6. Grassberger, P., Procaccia, I.: Dimensions and Entropies of Strange Attractors from a Fluctuating Dynamics Approach. Physica D 13(1-2), 34–54 (1984)
7. Hentschel, H., Procaccia, I.: The Infinite Number of Generalized Dimensions of Fractals and Strange Attractors. Physica D 8, 435–444 (1983)
8. Kinsner, W.: Fractal: Chaos Engineering Course Notes. Dept. of Electrical & Computer Engineering, University of Manitoba, Winnipeg, MB (2003)
9. Mallat, S.G., Hwang, W.L.: Singularity Detection and Processing with Wavelets. IEEE Trans. Inform. Theory 38, 617–643 (1992)
10. Mallat, S.: A Wavelet Tour of Signal Processing, 2nd edn. Academic Press, Chestnut Hill (2001)
11. Mandelbrot, B.B.: Fractals and Multifractals: Noise, Turbulence and Galaxies. Springer, New York (1989)
12. Muzy, J.F., Bacry, E., Arneodo, A.: Wavelets and Multifractal Formalism for Singular Signals: Application to Turbulence Data. Phys. Rev. Lett. 67(25), 3515–3518 (1991)
13. Muzy, J.F., Bacry, E., Arneodo, A.: Multifractal Formalism for Fractal Signals: The Structure Function Approach Versus the Wavelet-transform Modulus-maxima Method. Phys. Rev. E 47(2), 875–884 (1993)
14. Muzy, J.F., Bacry, E., Arneodo, A.: The Multifractal Formalism Revisited with Wavelets. Int. Jrnl. Bif. Chaos 4(2), 245–302 (1994)
15. Oppenheim, A.V., Schafer, R.W., Buck, J.R.: Discrete-Time Signal Processing, 2nd edn. Prentice Hall, Englewood Cliffs (1999)
16. Proakis, J.G., Manolakis, D.G.: Digital Signal Processing: Principles, Algorithms and Applications, 2nd edn. Macmillan, New York (1996)
17. Van den Berg, J.: Wavelets in Physics, 2nd edn. Cambridge University Press, Cambridge (2004)