Communications in Computer and Information Science
236
Min Zhu (Ed.)
Information and Management Engineering International Conference, ICCIC 2011 Wuhan, China, September 17-18, 2011 Proceedings, Part VI
13
Volume Editor Min Zhu Nanchang University 235 Nanjing Donglu Nanchang, 330047, China E-mail:
[email protected] ISSN 1865-0929 e-ISSN 1865-0937 ISBN 978-3-642-24096-6 e-ISBN 978-3-642-24097-3 DOI 10.1007/978-3-642-24097-3 Springer Heidelberg Dordrecht London New York Library of Congress Control Number: Applied for CR Subject Classification (1998): C.2, H.4, I.2, H.3, D.2, J.1, H.5
© Springer-Verlag Berlin Heidelberg 2011 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
Preface
The present book includes extended and revised versions of a set of selected papers from the 2011 International Conference on Computing, Information and Control (ICCIC 2011) held in Wuhan, China, September 17–18, 2011. The ICCIC is the most comprehensive conference focused on the various aspects of advances in computing, information and control providing a chance for academic and industry professionals to discuss recent progress in the area. The goal of this conference is to bring together researchers from academia and industry as well as practitioners to share ideas, problems and solutions relating to the multifaceted aspects of computing, information and control. Being crucial for the development of this subject area, the conference encompasses a large number of related research topics and applications. In order to ensure a high-quality international conference, the reviewing course is carried out by experts from home and abroad with all low-quality papers being rejected. All accepted papers are included in the Springer LNCS CCIS proceedings. Wuhan, the capital of the Hubei province, is a modern metropolis with unlimited possibilities, situated in the heart of China. Wuhan is an energetic city, a commercial center of finance, industry, trade and science, with many international companies located here. Having scientific, technological and educational institutions such as Laser City and the Wuhan University, the city is also an intellectual center. Nothing would have been achieved without the help of the Program Chairs, organization staff, and the members of the Program Committees. Thank you. We are confident that the proceedings provide detailed insight into the new trends in this area. August 2011
Yanwen Wu
Organization
Honorary Chair Weitao Zheng
Wuhan Institute of Physical Education, Key Laboratory of Sports Engineering of General Administration of Sport of China
General Chair Yanwen Wu
Huazhong Normal Universtiy, China
Program Chair Qihai Zhou
Southwestern University of Finance and Economics, China
Program Committee Sinon Pietro Romano
Azerbaijan State Oil Academy, Azerbaijan
International Program Committee Ming-Jyi Jang Tzuu-Hseng S. Li Yanwen Wu Teh-Lu Liao Yi-Pin Kuo Qingtang Liu Wei-Chang Du Jiuming Yang Hui Jiang Zhonghua Wang Jun-Juh Yan Dong Huang JunQi Wu
Far-East University, Taiwan National Cheng Kung University, Taiwan Huazhong Normal University, China National Cheng Kung University, Taiwan Far-East University, Taiwan Huazhong Normal University, China I-Shou University, Taiwan Huazhong Normal University, China WuHan Golden Bridgee-Network Security Technology Co., Ltd., China Huazhong Normal University, China Shu-Te University, Taiwan Huazhong University of Science and Technology, China Huazhong Normal University, China
Table of Contents – Part VI
Output Feedback Stabilization for Networked Control Systems with Packet Dropouts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hao Dong, Huaping Zhang, and Hongda Fan
1
Study of the Fuzzy Nerve Network Control for Smart Home Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . GaoHua Liao and JunMei Xi
7
The Study on RF Front-End Circuit Design Based on Low-Noise Amplifier Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhao San-ping
13
Balanced Ridge Estimator of Coefficient in Linear Model under a Balanced Loss Function (I) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wenke xu and Fengri Li
20
SEDE: A Schema Explorer and Data Extractor for HTML Web Pages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xubin Deng
26
Application of Artificial Neural Network (ANN) for Prediction of Maritime Safety . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xu Jian-Hao
34
Embedded VxWorks System of Touch Screen Interrupt Handling Mechanism Design Based on the ARM9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Han Gai-ning and Li Yong-feng
39
A New Architectural Design Method Based on Web3D Virtual Reality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhang Jun
45
Design and Implement of a Modularized NC Program Interpreter . . . . . . Chen Long, Yu Dong, Hong Haitao, Guo Chong, and Han Jianqi
50
Parallel Computing Strategy Design Based on COC . . . . . . . . . . . . . . . . . . Jing-Jing Zhou
58
Preliminary Exploration of Volterra Filter Algorithm in Aircraft Main Wing Vibration Reduction and De-noising Control . . . . . . . . . . . . . . . . . . . Chen Yu, Shi Kun, and Wen Xinling
66
VIII
Table of Contents – Part VI
Development Strategy for Demand of ICTs in Business-Teaching of New and Old Regional Comprehensive Higher Education Institutes . . . . . Hong Liu
74
A Novel Storage Management in Embedded Environment . . . . . . . . . . . . . Lin Wei and Zhang Yan-yuan
79
Development Strategy for Demand of ICT in Small-Sized Enterprises . . . Yanhui Chen
84
Development Strategy for Demand of ICT in Medium-Sized Enterprises of PRC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yanhui Chen
89
Diagnosing Large-Scale Wireless Sensor Network Behavior Using Grey Relational Difference Information Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hongmei Xiang and Weisong He
94
Mining Wireless Sensor Network Data Based on Vector Space Model . . . Hongmei Xiang and Weisong He
100
Influencing Factors of Communication in Buyer-Supplier Partnership . . . Xudong Pei
105
An Expanding Clustering Algorithm Based on Density Searching . . . . . . . Liguo Tan, Yang Liu, and Xinglin Chen
110
A Ship GPS/DR Navigation Technique Using Neural Network . . . . . . . . . Yuanliang Zhang
117
Research of Obviating Operation Modeling Based on UML . . . . . . . . . . . . Lu Bangjun, Geng Kewen, Zhang Qiyi, and Dai Xiliang
124
The Study of Distributed Entity Negotiation Language in the Computational Grid Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Honge Ren, Yi Shi, and Jian Zhang
131
Study and Application of the Smart Car Control Algorithm . . . . . . . . . . . Zhanglong Nie
138
A Basis Space for Assignment Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shen Maoxing, Li Jun, and Xue Xifeng
148
The Analysis on the Application of DSRC in the Vehicular Networks . . . Yan Chen, Zhiyuan Zeng, and Xi Zhu
152
Disaggregate Logit Model of Public Transportation Share Ratio Prediction in Urban City . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Dou Hui Li and Wang Guo Hua
157
Table of Contents – Part VI
IX
Design of Calibration System for Vehicle Speed Monitoring Device . . . . . Junli Gao, Haitao Song, Qiang Fang, and Xiaoqing Cai
166
Dynamic Analysis and Numerical Simulation on the Road Turning with Ultra-High . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Liang Yujuan
173
Solving the Aircraft Assigning Problem by the Ant Colony Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tao Zhang, Jing Lin, Biao Qiu, and Yizhe Fu
179
Generalization Bounds of Ranking via Query-Level Stability I . . . . . . . . . Xiangguang He, Wei Gao, and Zhiyang Jia
188
Generalization Bounds for Ranking Algorithm via Query-Level Stabilities Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhiyang Jia, Wei Gao, and Xiangguang He
197
On Harmonious Labelings of the Balanced Quintuple Shells . . . . . . . . . . . Xi Yue
204
The Study of Vehicle Roll Stability Based on Fuzzy Control . . . . . . . . . . . Zhu Maotao, Chen Yang, Qin Shaojun, and Xu Xing
210
Fast Taboo Search Algorithm for Solving Min-Max Vehicle Routing Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chunyu Ren
218
Research on the Handover of the Compound Guidance for the Anti-ship Missile beyond Visual Range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhao Yong-tao, Hu Yun-an, and Lin Jia-xin
224
Intelligent Traffic Control System Design Based on Single Chip Microcomputer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xu Lei, Ye Sheng, Lu Guilin, and Zhang Zhen
232
Calculation and Measurement on Deformation of the Piezoelectric Pump Actuator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xing Wang, Linhua Piao, and Quangang Yu
239
FEM Analysis of the Jet Flow Characteristic in a Turning Cavity . . . . . . Xing Wang, Linhua Piao, and Quangang Yu
246
Software Compensation of the Piezoelectric Fluidic Angular Rate Sensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xing Wang, Linhua Piao, and Quangang Yu
253
Finite Element Analysis for Airflow Angular Rate Sensor Temperature Field and Pressure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xing Wang, Linhua Piao, and Quangang Yu
261
X
Table of Contents – Part VI
Control System of Electric Vehicle Stereo-Garage . . . . . . . . . . . . . . . . . . . . Wang Lixia, Yang Qiuhe, and Yang Yuxiang
267
Research the Ameliorative Method of Wavelet Ridgeline Based Direct Wavelet Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yan Zhe and Li Ping
273
Study on the Transportation Route Decision-Making of Hazardous Material Based on N-Shortest Path Algorithm and Entropy Model . . . . . Ma Changxi, Guo Yixin, and Qi Bo
282
Encumbrance Analysis of Trip Decision Choosing for Urban Traffic Participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Li Zhen-fu, He Jian-tong, and Zhao Chang-ping
290
Study on Indicators Forecasting Model of Regional Economic Development Based on Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yang Jun-qi, Gao -xia, and Chen Li-jia
297
An Adaptive Vehicle Rear-End Collision Warning Algorithm Based on Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhou Wei, Song Xiang, Dong Xuan, and Li Xu
305
A kind of Performance Improvement of Hamming Code . . . . . . . . . . . . . . . Hongli Wang
315
Intelligent Home System Based on WIFI . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhang Yu-han and Wang Jin-hai
319
A Channel Optimized Vector Quantizer Based on Equidistortion Principal and Wavelet Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wang Yue
328
ESPI Field Strength Data Processing Based on Circle Queue Model . . . . Hongzhi Liu and Shaokun Li
335
The Research on Model of Security Surveillance in Software Engineering Based on Ant Colony Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hongzhi Liu and Xiaoyun Deng
343
Realization on Decimal Frequency Divider Based on FPGA and Quartus II 350 Hu XiaoPing and Lin YunFeng Design of Quality Control System for Information Engineering Surveillance Based on Multi-agent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hongzhi Liu, Li Gao, and GuiLin Xing
357
Table of Contents – Part VI
XI
A Study about Incentive Contract of Insurance Agent . . . . . . . . . . . . . . . . Hu Yuxia
364
Scientific Research Management/Evaluation/Decision Platform for CEPB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhang Shen, Liu Zhongjing, and Wang Hui-li
370
Necessary and Sufficient Condition of Optimal Control to Stochastic Population System with FBM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . RenJie and Qimin Zhang
376
The Research on Newly Improved Bound Semi-supervised Support Vector Machine Learning Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xue Deqian
383
The Application of Wireless Communications and Multi-agent System in Intelligent Transportation Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wei Xiaowei
391
Study on Actuator and Generator Application of Electroactive Polymers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jia Ji, Jianbo Cao, Jia Jiang, Wanlu Xu, Shiju E., Jie Yu, and Ruoyang Wang
398
Research on Chinese Mobile e-Business Development Based on 3G . . . . . Li Chuang
404
The Statistical Static Timing Analysis of Gate-Level Circuit Design Margin in VLSI Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhao San-ping
410
Forensic Analysis Using Migration in Cloud Computing Environment . . . Gang Zhou, Qiang Cao, and Yonghao Mai
417
Research on Constitution and Application of Digital Learning Resources of Wu Culture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Minli Dai, Caiyan Wu, Hongli Li, Min Wang, and Caidong Gu
424
Research on Digital Guide Training Platform Designing . . . . . . . . . . . . . . . Minli Dai, Caidong Gu, Jinxiang Li, Fengqiu Tian, Defu Zhou, and Ligang Fang
430
A Hypothesis Testing Using the Total Time on Test from Censored Data as Test Statistic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shih-Chuan Cheng
436
Collaborative Mechanism Based on Trust Network . . . . . . . . . . . . . . . . . . . Wei Hantian and Wang Furong
445
XII
Table of Contents – Part VI
Design for PDA in Portable Testing System of UAV’s Engine Based on Wince . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . YongHong Hu, Peng Wu, Wei Wan, and Lu Guo
452
Adaptive Particle Swarm Optimizers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Li Li and Qin Yang
458
Based on Difference Signal Movement Examination Shadow Suppression Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hu ChangJie
461
Application of Clustering Algorithm in Intelligent Transportation Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Long Qiong, Yu Jie, and Zhang Jinfang
467
Exploration and Research of Volterra Adaptive Filter Algorithm in Non-linear System Identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wen Xinling, Ru Yi, and Chen Yu
474
Application of Improved Genetic Algorithms in Structural Optimization Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shengli Ai and Yude Wang
480
Research on Intelligent Schedule of Public Traffic Vehicles Based on Heuristic Genetic Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Liangguo Yu
488
The Integration Framework of Train Scheduling and Control Based on Model Predictive Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chao Mi and Yonghua Zhou
492
A Design of Anonymous Identity Generation Mechanism with Traceability for VANETs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . An-Ta Liu, Henry Ker-Chang Chang, and Herbert Hsuan Heng Lai
500
A Improvement of Mobile Database Replication Model . . . . . . . . . . . . . . . . Yang Chang Chun, Ye Zhi Min, and Shen Xiao Ling
511
Software Design and Realization of Altimeter Synthetically Detector . . . . Shi Yanli, Tan Zhongji, and Shi Yanbin
517
Emulational Research of Spread Spectrum Communication in the More-Pathway Awane Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shi Yanli, Shi Yanbin, and Yu Haixia
522
A Pilot Study on Virtual Pathology Laboratory . . . . . . . . . . . . . . . . . . . . . . Fan Pengcheng, Zhou Mingquan, and Xu Xiaoyan
528
Research and Practice on Applicative “Return to Engineering” Educational Mode for College Students of Electro-mechanical Major . . . . Jianshu Cao
536
Table of Contents – Part VI
XIII
Engineering Test of Biological Aerated Filter to Treat Wastewater . . . . . Weiliang Wang
544
The Design of Propeller LED Based on AT89S52 . . . . . . . . . . . . . . . . . . . . . Xu zelong, Zhang Hongbing, Hong Hao, and Jiang Lianbo
551
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
559
Output Feedback Stabilization for Networked Control Systems with Packet Dropouts* Hao Dong1, Huaping Zhang1, and Hongda Fan2 2
1 Network Center, Yantai University, Yantai 264005, China Information Engineering, Naval Aeronautical and Astronautical University, Yantai 264001, China
[email protected] Abstract. The problems of stability and stabilization for the networked control systems (NCS) with stochastic packet dropouts are investigated. When packet dropouts occur between sensor and controller,the networked control system is modeled as a markov jump linear system with two operation modes .Based on this model, the sufficient condition for the stability of the system is presented, then static output feedback controller is obtained in terms of LMIs condition.A number example illustrates the effectiveness of the method in this paper. Keywords: Networked control system, packet dropout, stochastically stable, linear matrix inequality(LMI).
1 Introduction Networked control systems (NCSs) are control loops closed through a shared communication network[1-3].That is, in networked control systems, communication networks are employed to exchange the information and control signals (reference input, plant output, control input, etc.) between control system components (sensors, controllers, actuators, etc.) .The main advantages of networked control systems are low cost, reduced weight, simple installation and maintenance, and high reliability. As a result, networked control systems have been widely applied to many complicated control systems, such as, manufacturing plants, vehicles, and spacecraft. However, the insertion of communication network in the feedback control loop complicates the application of standard results in analysis and design of an NCS because many ideal assumptions made in the traditional control theory can not be applied to NCSs directly. The packet dropout is one of the most important issues in the NCSs. Data packet dropout can degrade performance and destabilize the system. In recent years, NCSs with packet dropout have been a hot research topic and obtained more concern. Some work on the effect of dropout on NCS has been published [4-5].The augmented state space method is an important method for dealing with the problem of data packet dropout provided in [4].[3] models NCSs with data packet dropout as asynchronous dynamic systems, but the stability condition derived in [3] is in bilinear matrix *
This work was supported by Educational Commission of Shandong Province, China (J08LJ19-02).
M. Zhu (Ed.): ICCIC 2011, Part VI, CCIS 236, pp. 1–6, 2011. © Springer-Verlag Berlin Heidelberg 2011
2
H. Dong, H. Zhang, and H. Fan
inequalities, which are difficult to solve. The issue of data packet dropout is modeled as a Markov process in [6], but no rigorous analysis is carried out. In this paper, we consider the stabilization problem of networked control systems with a discrete-time plant and the time driven controller. The packet dropout occurs between sensor and controller,and the networked control system here is modeled by a markov jump linear system(MJLS) with two modes. Then we can apply some MJLS theories to analysis stability and stabilization problems of NCS.
2 Problem Formulation The framework of NCS with data packet dropouts is depicted in Fig. 1, where the plant is described by the following discrete-time linear time-invariant system model ⎧ x k +1 = Axk + Buk , ⎨ ⎩ yk = Cxk
(1)
where k ∈ ], xk ∈ \ n is the system state, uk ∈ \ p is the control input , yk ∈ \ m is the measurement output. When date dropouts occur between sensor and controller , the dynamics of the switch S can be described as : When S is closed, the sensor output yk is successfully transmitted to the controller, the switch output yk is changed to yk ,and when it is open, the switch output is held at the previous value yk −1 and the packet is lost.
Actuatorr
Plant
Sensor
uk
yk
Network with packet dropouts
Controller
S
yk
Fig .1. Networked control system with packet dropouts
Thus the dynamics of the switch S can be modeled as ⎧⎪ yk , S is closed , yk = ⎨ ⎪⎩ y( k −1) ,S is open
(2)
Here,we consider the following static output feedback controller with packet dropouts: uk = Kyk .
(3)
Output Feedback Stabilization for Networked Control Systems with Packet Dropouts
3
T
Let zK = ⎡⎣ xkT ykT−1 ⎤⎦ be the augmented state vector. Then by depiction of the network channels and use of the models (1)–(3), the closed-loop networked control system with the packet dropout can be represented by the following two subsystems. (a) No packet dropouts exist in between the sensor and the controller. z k +1 = A1 zk , A = ⎡ A + BKC 0 ⎤ , 1 ⎢ 0⎥ C ⎣
(4)
⎦
(b) Packet dropouts occur between the sensor and the controller. z k +1 = A2 z k , A = ⎡ A BK ⎤ . 2 ⎢0 I ⎥⎦ ⎣
(5)
Now taking all the subsystems into consideration, the two subsystems can be lumped into a general framework that is described by the following discrete-time Markov jump linear system: zk +1 = Ark zk ,
(6)
{rk , k ∈ ]} is
a Markov chain taking value in finite space ℵ = {1, 2} ,with transition probability from mode i at time k to mode j at time k + 1 as:
pij = Pr {rk +1 = j | rk = i} , with pij ≥ 0 , i , j ∈ℵ ,and
2
∑p j =1
ij
= 1.
Lemma 1[7]. System xk +1 = A(rk ) xk is stochastically stable, if and only if for each mode ∀rk ∈ℵ , there exists matrix
Xi > 0
N
such that AT (i )∑ Pij X j A(i ) − X i < 0 holds. j =1
Lemma 2[8]. Given matrices X , Y , Z , with appropriate dimensions, and Y > 0 ,then
− X T Z − Z T X ≤ X T YX + Z T Y −1 Z
3 Controller Design In this section, stability analysis and static output feedback controller are considered for the NCS with packet dropouts. A sufficient condition is established via the theory from the discrete-time Markov jump linear system, and the corresponding controller design technique is provided. Theorem 1. For given controller (3), system (6) is stochastically stable, if for each mode i ∈ S , there exist matrices Xi > 0 , Si satisfying the following coupled LMIs: ⎡− Xi ⎢ Φi = ⎢ SA ⎢⎣ i i
⎤ ⎥ 0 , it follows that Si + SiT < 0 ,then Si is nonsingular for each mode i ∈ℵ . Based on Lemma 1, system (6) is stochastically stable, if and only if for each mode i ∈ℵ , there exists matrix Xi > 0 such that 2
Ai T ∑ pij X j Ai − X i < 0 j =1
(8)
.
In the following, we prove that if (7) holds, then (8) holds. Since Si is nonsingular, preand post-multiply (7) by diag { I , Si−1} and diag { I , Si−T } , respectively, and let Li = Si−T , inequality (7) is equivalent to ⎡− X i ⎢ ⎢ A ⎢⎣ i
⎤ ⎥ 0 dω ω =1 d ω ω =1 dω ω =1
f (ω ) and dfd(ωω ) are all continuous functions. So when ω < 1 and is sufficiently large, then df (ω ) > 0 , that is to say when ω < 1 and is large, When ω ≤ 1 ,
dω
24
W. Xu and F. Li
f (ω ) = MSE (αˆω ) is monotone increasing function of ω , then there is
ω ∗ 0 , 0 ≤ ω ≤ 1 ,then (5)
2
αˆω ≤ αˆ 2
2
≤ 0,So λi (ωλi + 1 − ω + ω k ) ≤ ( λi + k ) (ωλi + 1 − ω )
i = 1, 2, " p λi 2 (ωλi + 1 − ω + ω k )
As well as Since αˆω
≤1
( λi + k ) (ωλi + 1 − ω ) −1 = αˆ ′Λ ⎡ω I + (1 − ω )( Λ + kI )−1 ⎤ ⎡⎣ωΛ + (1 − ω ) I ⎤⎦ ⎣ ⎦ 2
2
2 2
⎡⎣ωΛ + (1 − ω ) I ⎤⎦
−1
⎡ω I + (1 − ω )( Λ + kI ) −1 ⎤ Λαˆ ⎣ ⎦
2 ⎛ λ 2 ωλ + 1 − ω + ω k 2 λ p 2 (ωλ p + 1 − ω + ω k ) ⎞ ) 1 ( 1 ⎜ ⎟ αˆ = αˆ ′ diag ," , 2 2 ⎜ ( λ1 + k )2 (ωλ1 + 1 − ω )2 ( λp + k ) (ωλ p + 1 − ω ) ⎟⎠ ⎝
≤ αˆ ′αˆ = αˆ
Lemma
1[10]:
2
For
linear
model
(1),
Then
Aβˆ ~ Cβ
A( X ′X ) A′ ≤ A( X ′X ) C ′ . −1
if
and
only
if
−1
Theorem 4: For linear model (1), and arbitrary scalar
k >0
, 0 ≤ ω ≤ 1 . Then
within linear estimation class, βˆω is an admissible estimator of β , that is βˆω Proof: By theorem 1
(
βˆω
= φ ( ωΛ + (1 − ω ) I p )
To definite D= ωΛ + (1 − ω ) I p
)
−1
(ω I
p
−1
(ωI
p
)
−1 + (1 − ω )(Λ + kI p ) Λφ ′βˆ
+ (1 − ω ) ( Λ + kI p )
−1
)Λ
So φ Dφ ′( X ′X ) −1φ Dφ ′ = φ DΛ Dφ ′ −1
=φ
2 ⎛ λ ωλ + 1 − ω + ω k 2 ( 1 ) ," , λ p (ωλ p + 1 − ω + ω k ) ⎞⎟ φ ′ diag ⎜ 1 2 2 ⎜ ( λ1 + k ) 2 (ωλ1 + 1 − ω ) 2 ( λ p + k ) (ωλ p + 1 − ω ) ⎟⎠ ⎝
And φ Dφ ′( X ′X )−1 = φ DΛ −1φ ′
~β.
Balanced Ridge Estimator of Coefficient in Linear Model =φ
25
⎛ ωλ + 1 − ω + ω k ωλ p + 1 − ω + ω k ⎞ 1 ⎟φ ′ diag ⎜ ," , ⎜ ( λ1 + k )(ωλ1 + 1 − ω ) ( λ p + k )(ωλ p + 1 − ω ) ⎟⎠ ⎝
Because k − ω k = k (1 − ω ) ε0, So λi (ωλi + 1 − ω + ω k ) δ ( λi + k ) (ωλi + 1 − ω ) As well as
λi (ωλi + 1 − ω + ω k )
2
( λi + k ) (ωλi + 1 − ω ) 2
2
ωλi + 1 − ω + ω k
δ λ + k ωλ + 1 − ω , i = 1, 2," p ( i )( i )
Therefore φ Dφ ′( X ′X ) −1φ Dφ ′ < φ Dφ ′( X ′X )−1 , by lemma 1, βˆω is an admissible estimate of β . Theorem 1 gives the expression of the Balanced Ridge Estimator. Theorem 2 shows the Balanced Ridge Estimator is superiority over Least Squares Estimator under Mean Square Error criterion. Theorem 3 shows the length of Balanced Ridge Estimator is smaller than the length of Least Square Estimator, so Balanced Ridge Estimator is the compression toward a origin for Least Square Estimator, and also is a compression estimation. Theorem 4 shows the Balanced Ridge Estimator is an admissible estimation.
References 1. Zellner, A.: Bayesian and non-Bayesian estimation using balanced loss functions. In: Gupta, S.S., Berger, J.O. (eds.) Statistical decision theory and related topics V, pp. 377–390. Spring, New York (1994) 2. Wan, A.T.K.: Risk comparison of inequality constrained least squares and other related estimators under balanced loss. Economics Letters 46, 203–210 (1994) 3. Rodrignes, J., Zellner, A.: Weighted balanced loss function and estimation of the mean time to failure. Communications in Statistics-Theory and Methods 23, 3609–3616 (1994) 4. Giles, J.A., Giles, D.E.A., ohtani, K.: The exact risk of some pretest and stein-type regression estimators under balanced loss. Communications in Statistics-Theory and Methods 25, 2901–2919 (1996) 5. Xu, X., Wu, Q.: Linear Admissible Estimators of Regression Coefficient Under Balanced Loss. Acta Mathematiea Scientia 20(4), 468–473 (2000) 6. Luo, H., Bai, C.: The Balanced LS Estimation of the Regressive Coefficient in a Linear Model. Journal of Hunan University (Natural Sciences) 33(2), 122–124 (2006) 7. Qiu, H., Luo, J.: Balanced Generalized LS Estimation of the regressive coefficient. Joural of East China Normal University (Natureal Science) (5), 66–71 (2008) 8. Hoerl, A.E., Kennard, R.W.: Ridge Regression: Biased Estimation for Non-orthogonal Problems. Technometrics 12(1), 55–68 (1970) 9. Wang, S., Shi, J., Yin, S., et al.: Introduction Linear Model, 3rd edn. Science Press, Beijin (2004) 10. Wang, S.: Linear Model Theory and its application. Anhui education Press (1987)
SEDE: A Schema Explorer and Data Extractor for HTML Web Pages Xubin Deng School of Information, Zhejiang University of Finance & Economics, Hangzhou, 310018, China
[email protected] Abstract. We present an approach for automatically exploring relation schema and extracting data from HTML pages. By abstracting a DOM-tree constructed from a HTML page into a set of generalized lists, this approach automatically generates a relation schema for storing data extracted from the page. Based on this approach, we have developed a software system named as SEDE (Schema Explorer and Data Extractor for HTML pages), which can reduces the workload of extracting and storing data objects within HTML pages. This paper will mainly introduce SEDE. Keywords: DOM-tree abstraction, HTML page, relational database, relation schema.
1 Introduction As HTML pages contain useful data objects, how to extract them from ill-structured HTML pages is now a hot research topic. To this goal, there are three classes of approaches. The first class uses a set of predefined extraction rules to search for data objects [1,2]. The second class finds semantic data blocks based on page structure and appearance [3,4,5]. The third class finds frequent subtrees in HTML parse trees [6,7]. These approaches still have limitations such as the requirement of manual efforts, the neglect of relationships between data objects, the overlook of how to organize, store and query data objects, etc. In order to partly overcome the above limitations, this paper presents a new approach to automatically transform HTML pages into relational database (RDB), which includes the following steps. 1) Transformation. Transform an HTML page into a set of correlated relation tables, which serves as the first RDB schema and the data source for Web-based applications. 2) Schema integration. Integrate new schema with current RDB schema when page changes. 3) View generation. Extract web data via views of the RDB when necessary. Based on this approach, we have developed a software system named as SEDE (Schema Explorer and Data Extractor for HTML pages), which can reduce the workload of extracting, storing and querying data objects within HTML pages. In this paper, we shall mainly introduce SEDE. Related Work. Most close to this work is the web data extraction algorithm given in [8], which employs a HTML parse tree to search for contiguously-repeat structures and M. Zhu (Ed.): ICCIC 2011, Part VI, CCIS 236, pp. 26–33, 2011. © Springer-Verlag Berlin Heidelberg 2011
SEDE: A Schema Explorer and Data Extractor for HTML Web Pages
27
extract data records into tables using a partial tree alignment approach. The differences are the following. 1) [8] needs to filter much information that may be useful for some applications, while this work can losslessly transform an HTML page into a set of correlated relation tables. 2) [8] cannot obtain a whole schema for the HTML page, while this work can.
2 Foundation of SEDE Transformation Algorithm. This algorithm includes three steps. 1) DOM-tree creation. This step first uses a Web Browser control of Microsoft Visual Basic to obtain an HTML parse tree of an HTML page, and then transforms the HTML parse tree into a DOM-tree. 2) DOM-tree abstraction. This step finds continually repeat structures in HTML DOM-tree and represents them using a set of generalized lists. 3) DOM-tree transformation and data extraction. This step constructs a schema tree for an abstracted HTML DOM-tree and fills these relation tables in the schema tree using data extracted from the abstracted HTML DOM-tree. We omit the detail discussion of this algorithm; readers can refer to [9] for detailed discussion. Schema Integration. As HTML pages are changeful and the schema obtained from the latest version of a page may be different from that obtained from the last version of the page, anytime a new schema is obtained, we integrates it with the current RDB schema using an algorithm similar with tree edit distance algorithm given in [10]. This algorithm can compute an optimal sequence of operations (i.e., the edit script) that turns ordered forest F1 into ordered forest F2. The following algorithm realizes this notion. Algorithm 1. Schema Integration INPUT: S1: old schema tree; S2: new schema tree OUTPUT: Turn S1 into S2, adjust relevant views, and return true; or trigger modification alert to the user and return false. BOOL Integrate (TNode S1, TNode S2 ){ Script SC = (); //to store a sequence of operations on S1. Forest F1= (S1), F2= (S2); ED (NULL, F1, NULL, F2, SC); IF(exits op∈SC that deletes high score information){ Trigger alert; RETURN(FALSE); } ELSE { Execute SC; adjust views as S1 turns into S2. RETURN(TRUE); } } // Integrate float ED (TNode P1, Forest F1, TNode P2, Forest F2, Script &SC){//P1(P2) is the parent of F1 (F2) //compute a script turning F1 into F2; return its edit distance. TNode v = the rightmost tree root in F1; TNode w = the rightmost tree root in F2; IF (F1 is empty and F2 is empty)dist=0; ELSE IF (F2 is empty){
28
X. Deng
SC=SC+(delete); dist=ED(P1, F1-v, NULL, empty, SC)+Cost(delete v); } ELSE IF (F1 is empty){ SC=SC+(insert); dist=ED (NULL, empty, P2, F2-w, SC)+Cost(insert w); } ELSE { Script SC1= (), SC2= (), SC3= (); dist1==ED(P1, F1-v, P2, F2, SC1)+Cost(delete v); SC1= SC1+(delete); dist1==ED(P1, F1, P2, F2-w, SC2)+Cost(insert w); SC2= SC2+(insert); dist3=ED(v, CF(v), w, CF(w), SC3)+ED(P1, F1-T(v), P2, F2-T(w), SC3)+Cost(modify v to w); SC3= SC3+(modify v to w); IF (dist31) return error; //The word belong to the same group cannot be together } The error handing module can be constructed based on this errors detection mechanism which can be achieved easily.
5
Implementation Verification
So far, a prototype system to verify the feasibility of proposed NC program interpreter has been implemented according to the ISO6983 standard. Without the support of PLC and motion control module, machining command converting module can’t implement the corresponding canonical machining functions, so we just print the names of these canonical machining functions to replace this process. With the NC program input, the interpretation results are shown in Figure 3. For example, the word G00 in the line of N60 corresponds to spindle’s clockwise rotation, so the machining function START_SPINDLE_ CLOCKWISE () was printed. While the word G01 and G02 in the line of N70 belong to the same modal group, an error is displayed.
56
C. Long et al.
Fig. 3. NC program input and Interpretation results
Furthermore, to test interpretation rate, in Red Hat Linux @200MHz machine, the prototype system interpret a line cutting program of 50,000 lines needs 9372ms. In general, the processing time of a line is more than 0.1s, while the interpretation of a single line is less than 0.01s, so the interpreter does not significantly affect the rate of implementation of the system speed.
6
Conclusion
A modularized NC program interpreter is proposed for the CNC system of machine tool. In the syntax analysis module, the standard NC program rules which eliminated the ambiguity effectively were described as EBNF. In the error handing module, an error detection mechanism is put forward. So far, a NC program interpreter prototype system has already been built to verify the feasibility of the proposed design. The interpretation results indicated that the system development efficiency can be significantly improved. In future, more research activities will be put on the expansion of the NC program through the extension of the paradigm. Acknowledgment. This work was supported by the Major National S&T Program (High-grade CNC Machine Tools and basic manufacturing equipment- The Innovation Platform Construction for Supporting Technology of Open Numerical Control(ONC) System: No. 2011ZX04016-071).The foundation’s support is greatly appreciated.
References 1. Xiao, T.Y., Han, X.L., Wang, X.L.: General NC code translation techniques. Journal of System Simulation 10, 1–7 (1998) 2. Zhao, D.L., Fang, K., Qian, W.: Design and realization of NC code explaining. Manufacturing Automation 28, 43–45 (2006)
Design and Implement of a Modularized NC Program Interpreter
57
3. Kong, Z.Y., Ma, J.: CNC wire cutting of ISO code interpreter. Electrical Discharge Machining 1, 21–23 (1997) 4. Wu, K.N., Li, B., Chen, J.H.: Implementation of NC code interpreter of open architecture NC system platform. China Mechanical Engineering 17, 168–171 (2006) 5. Zhang, Q., Yao, X.F.: Design and Implement of a NC Code Interpreter for Open Architecture CNC System. Modular Machine Tool & Automatic Manufacturing Technique 2, 59–61 (2010) 6. Liu, Y.D., Guo, X.G.: An intelligent NC program processor for CNC system of machine tool. Robotics and Computer-Integrated Manufacturing 23, 160–169 (2007) 7. SO6983. Numerical control of machines – program format and definition of address words – Part 1.Data format for positioning, line motion and contouring control system. International Standards Organisation (1982)
Parallel Computing Strategy Design Based on COC Jing-Jing Zhou College of Information & Electronic Engineering, Zhejiang Gongshang University
[email protected] Abstract. Sparse and unstructured computations are widely involved in engineering and scientific applications. It means that data arrays could be indexed indirectly through the values of other arrays or non-affine subscripts. Data access pattern would not be known until runtime. So far all the parallel computing strategies for this kind of irregular problem are single network topology oriented, which cannot fully exploit the advantages of modern hierarchical computing architecture, like grid. We proposed a hybrid parallel computing strategy RP, shorted for “Replicated and Partially-shared”, to improve the performance of irregular applications in the COC (Cluster of Clusters) environment. Keywords: Irregular Applications, Replicated and Partially-shared, Cluster of Clusters.
1 Introduction Irregular data access is widely involved in many scientific computing and engineering applications in the form of indirect data indexing or non-affine subscriptions. The data access pattern could only be determined at runtime. This irregular issue becomes extremely important when we would like to parallelize these applications. So far, there are mainly three models for parallelizing this kind of irregular applications. The first one is the HPF-like model. High Performance FORTRAN(HPF)[1] is a big contribution towards the programming model on distributed memory machines. It allows programmers to specify the parallelism of the program either explicitly with an independent directive or implicitly by specifying the data distribution. But it can be a tedious task for programmers to deduce an optimal data distribution [2]. The HPF-like model is actually an extension of HPF by employing the inspector/executor scheme [3] to handle the irregular problem. However, the step of specific data partition is still in the first place. Generally each datum has a single owner. Computations that could change the datum’s value would be performed only on the owning node of this datum [4]. Input operands to the computations are received via messages from their owners. Therefore, the overhead of operations on large numbers of remote data can be involved. Some works have focused on the optimization of this model in recent years. P. Brezany [5] gave a runtime library which could on some level properly handle the irregular issue. A new minimized computation partition scheme was proposed in [6], as well as the cache-hit oriented computation partition scheme in [7]. This HPF-like model has been applied to many applications in practice, and mostly been implemented with a low M. Zhu (Ed.): ICCIC 2011, Part VI, CCIS 236, pp. 58–65, 2011. © Springer-Verlag Berlin Heidelberg 2011
Parallel Computing Strategy Design Based on COC
59
level runtime-support library. However, as the data partition is always the first step and only static analysis can be performed during compilation, the uncertainty of data access pattern still exists. Therefore, this model cannot guarantee load balance among all the compute nodes. The second one is the idea of Software Distributed Shared Memory (SDSM) proposed in [8]. By using software mechanisms to trace remote data access and perform communication, SDSM provides programmers a unified view of global memory on a distributed memory system. But every time after the update of data an implicit global synchronization is needed. Since every node that takes part in the computation keeps a complete and valid copy of the entire data set, this model definitely eases the burden on programmers and compilers. However, huge overhead can be introduced by unnecessary global synchronization. Experiments show that applications employed this SDSM model perform rather poor on clusters, especially when irregular computation is involved [9]. Continuous efforts have been made to solve this problem [10], but the original SDSM model gets complicated when new rules are employed. The well known replicating computation model takes in the idea of sharing data. Its essence is to eliminate all the network communication cost by doing redundant computation of fully shared data set. However, the lack of available memory space could become an issue as the program size grows, there might raise the Out-of-Core problem. Therefore, good scalability cannot be guaranteed by this SDSM or replicating computation model. The third one is the producer-consumer based partially shared model, “P-C PS” for short later in this paper, proposed by Ayon Basumallik [11]. In his work, a specific dependency between data and computation is studied to determine which part of the entire data set should be shared among computing nodes. As only part of the data would be shared, it will not occupy too much memory space. And better load balancing state can be achieved since the computation partition is considered prior to the data partition. However, there still are operations on remote data which bring about communication and synchronization cost. Table 1. Simple comparison among three traditional models
Mo HPF-like SDSM P-C PS
Simple
Scale
Load Balanced
√ √ √
√
COC (Cluster of Clusters), which is composed of different sized clusters, can provide innumerous storage capacity and unified huge computing resource. These clusters could employ different network protocols or topologies and be loosely organized. The intra- and inter-cluster network topology can be quite different. Then how to take the advantage of this system for irregular computations becomes an issue. Therefore, we are dealing with the problem of how to run irregular applications efficiently in a heterogeneous network based computing environment.
60
J.-J. Zhou
Our contribution is to propose an irregular problem solving scheme adapted to the COC computing environment: “Replicated and Partially-shared” hybrid computing strategy, RP for short later in this paper. We try to incorporate the advantages of the traditional models and eliminate the negative side-effects they may bring about. The essence of our method is to do computation redundantly inter-clusters while adopt the partially shared model for intra-clusters. The following benefits could be obtained: z z z
enable irregular applications to well fit the heterogeneous network; achieve shorter execution time owing to balanced workload; obtain good scalability of irregular applications.
The rest of this paper is organized as follows: some necessary background information of irregular computing is covered in section 2. RP hybrid computing scheme is described in section 3, along with detailed comparison between our method and other traditional ones. Experimental results are presented in section 4 and section 5 gives all the conclusions.
2 RP Model Based on COC As a COC system is composed of clusters with different storage and computing capacities, the ground work of our model is to divide the workload among these clusters with the help of hypergraph partition technology. Cost of this stage can be amortized. The hypergraph is defined in [12]. A hypergraph H = (V, N) consists a set of vertices V and a set of hyper-edge N connected those vertices, where each nj ∈Nis a set of vertices from V. Noticed that each datum in an irregular problem could be accessed by a set of computations, therefore, an irregular computing problem could be mapped into a hypergraph, where V stands for the data and N stands for data related computing task. Weights ( w ) can be associated to the edges ( n j ∈ N ), which represent the cost of doing task n j on certain machine. As shown in figure 1, the data elements are presented as squares and the tasks are presented as a set of edges are connected by dots, where the edges with same color represent the same task. Assuming this hypergraph is partitioned into three parts according to the number of clusters in a COC environment, as indicated by three enclosing rectangles. Our strategy is to minimize the number of cut-edges as long as load balancing among clusters is satisfied. Let N be the set of nets in domain A, CNA be the set of cut-edges involved in domain A, the workload of cluster A could be defined as j
A
WL A = ∑ n
j ∈N A
w j ( A) + ∑ n
j ∈CN A
w j ( A)
Assuming the number of clusters is n, and the average workload of these clusters is WL
avg
=
∑ WL i
j
/n
For 0 ≤ i ≤ n
We say the workload is balanced as long as the following condition is satisfied: WL i ≤ WL avg * (1 + ε ) For 0 ≤ i ≤ n ,
Parallel Computing Strategy Design Based on COC
61
where ε is a predetermined ratio presenting the maximum imbalance in task division. After this hypergraph partition, the application could be totally mapped to the COC system in the form of different sets of tasks as shown in Figure 2.
Fig. 1. Sketch-map for the partition of hypergraph abstracted from irregular problems
Fig. 2. Map the irregular problem to a COC system with hypergraph partition
Considering the heterogeneity of inter- and intra- cluster network topology, a hybrid strategy should be employed to accommodate irregular problems in the computing environment. Extremely huge computation capacity could be obtained as long as we adopt different parallel computing models to different levels of a COC system. 2.1 Inter-cluster Strategy
It is quite clear that any task turned out to be a cut-edge in the hypergraph will introduce communication and synchronization overhead. Suppose all the clusters in a COC system are connected with Internet and geographically distributed, and then the communication overhead could become the bottle neck of improving the program’s execution efficiency. As a matter of fact, many clusters are loosely organized to form a grid-like uniformed computing resource. The cost of large scale communication among clusters is unacceptable. Both partially and fully shared models will bring about more
62
J.-J. Zhou
or less communication overhead. But replicate computing model could eliminate all the possible network overhead by doing redundant computation. And the cost of replicating computation can be controlled since the number of cut-edges is minimized in the hypergraph partition stage. As long as the overhead of communication among clusters over beats the cost of redundant computation, the replicating computation model is a better choice. According to experimental results, in most cases, replicating computation among clusters can greatly decrease the total execution time of applications. 2.2 Intra-cluster Strategy
As all the compute nodes within a cluster are connected with high productive network, like InfiniBand, communication and synchronization cost among these nodes is relatively acceptable. Execution efficiency on this level is mainly constrained to memory size and cache miss rate, etc. With all these factors taken into consideration, producer-consumer oriented partially shared framework is the best choice.
Task Partition perm1
permN
dataN
data1 Inspector CommTable1
CommTableN Production
data1
dataN Gather
data1
dataN
Computation targetN
target1 Reduction Final Result
t
Fig. 3. Proceeding flow of producer-consumer oriented partially shared model
Here, producer and consumer together define a logical relationship of the data production and consumption. Take the code in Figure 1 for example, data set p is produced in loop L1 and consumed in loop L2. Ayon [13] studied on the producer-consumer graph and gave some reasonable analyzing strategies. By studying this logical relationship, we can specify the data needed to be shared among compute nodes. Therefore, it is unnecessary to keep a complete and valid copy of the entire data set in each node, which can save quite a lot of memory space and promise a more efficient execution. The proceeding flow of partially shared model is given in Figure 3.
Parallel Computing Strategy Design Based on COC
63
In a word, combined with two traditional models, the RP hybrid computing strategy can well fit the irregular computing problems into a heterogeneous network of COC environment. Not only does it avoid the unacceptable data transformation delay on the inter cluster level, but also reduces the storage cost and ensures balanced workload and high execution efficiency on the intra cluster level.
3 Experimental Results Our benchmarks are based on the CG application and are implemented in MPICH, version 1.2.7. We did the experiments on a COC system composed of two clusters, which are geographically distributed and connected through Internet. The configuration is as follows: Cluster A CPU
Intel Xeon 3.0GH Z
L1 Data Cache
16K
L2 Cache
1024K
Me mory
2GB
Disk Size
60GB
Operati ng System
Fedora 3
Net work
Gigabyte Switch Cluster B
CPU
Intel Xeon 2.8GH Z
L1 Data Cache
8K
L2 Cache
512K
Me mory
2GB
Disk Size
40GB
Oper ating System
Redhat 9
Net work
Infini-band
Fig. 4. Configurations of Cluster A and B
According to our test, the intra-cluster network is generally over 10 times faster then the inter-cluster connection. Obviously, a much shorter total execution time is achieved by employing RP strategy, as shown in Figure 4.
64
J.-J. Zhou CG 70 )d 60 no 50 ce 40 s( 30 em it 20 10 0
partially replicate r-p 100
500 1000 2000 size of data set(*1000)
Fig. 5. Experiment result on CG
4 Conclusions and Future Work A COC oriented “Replicated and Partially-shared” hybrid computing strategy is proposed to fit irregular applications into the hierarchical and heterogeneous network based computation environment. A class of irregular problems employed our strategy could obtain better performance and scalability than using traditional models. A new hybrid parallel computing model is under construction, and we hope it can be more general purposed and self-adaptive to the characteristic of irregular applications and execution environment. Acknowledgement. This work was supported by National Natural Science Foundation of China (No. 60903214, 60970126), the National High Technology Development 863 Program of China (No. 2008AA01A323), Scientific Research Fund of Zhejiang Provincial Education Department Project(Y200908196).
References 1. High Performance Fortran Forum, High Performance Fortran language specification, version 1.0, Technical Report CRPC-TR92225, Houston, Tex (1993) 2. Frumkin, M., Jin, H., Yan, J.: Implementation of NAS Parallel Benchmarks in High Performance Fortran, Technical Report NAS-98-009 3. Das, R., Uysal, M., Saltz, J., Hwang, Y.-S.S.: Communication Optimizations for Irregular Scientific Computations on Distributed Memory Architectures 4. Koelbel, C.H., Loveman, D.B., Schreiber, R.S., Steel Jr., G.L., Zosel, M.E.: The High Performance Fortran Handbook. MIT Press, Cambridge (1994) 5. Brezany, P., Bubak, M., Malawski, M., Zajaac, K.: Large-Scale Scientific Irregular Computing on Clusters and Grids. In: Proceedings of the International Conference on Computational Science (2002) 6. Minyi, G.: Automatic Parallelization and Optimization for Irregular Scientific Applications. In: Proceedings of the 18th International Parallel and Distributed Processing Symposium (2004) 7. Han, H., Tseng, C.-W.: Exploiting Locality for Irregular Scientific Codes. IEEE Transactions on Parallel and Distributed Systems (2006)
Parallel Computing Strategy Design Based on COC
65
8. Hu, Y.C., Lu, H., Cox, A.L., Zwaenepoel, W.: OpenMP for Networks of SMPs. Journal of Parallel and Distributed Computing 60(12), 1512–1530 (2000) 9. El-Ghazawi, T., Carlson, W., Draper, J.: UPC Language Specifications V1.0 (February 2001) 10. Min, S.-J., Basumallik, A.y., Eigenmann, R.: Optimizing OpenMP programs on Software Distributed Shared Memory Systems. International Journal of Parallel Programming 31(3), 225–249 (2003) 11. Basumallik, A., Eigenmann, R.: Optimizing Irregular Shared-Memory Applications for Distributed-Memory Systems. In: Proceedings of the 11th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (March 2006) 12. Feng, X.-B., Chen, L., Wang, Y.-R., An, X.-M., Ma, L., Sang, C.-L., Zhang, Z.-Q.: Integrating Parallelizing Compilation Technologies for SMP Clusters. Journal of Computer Science and Technology 20(1), 125–133 (2005) 13. Catalyurek, U.V., Aykanat, C.: A Fine-Grain Hypergraph Model for 2D Decomposition of Sparse Matrices, ipdps. In: 15th International Parallel and Distributed Processing Symposium (IPDPS 2001) Workshops, vol. 3, p. 30118b (2001)
Preliminary Exploration of Volterra Filter Algorithm in Aircraft Main Wing Vibration Reduction and De-noising Control Chen Yu1, Shi Kun2, and Wen Xinling1 1
Zhengzhou Institute of Aeronautical Industry Management, China 2 Zhengzhou Thermal Power Corporation, China
[email protected] Abstract. In order to reduce vibration and noising of the aircraft main wing through the active control algorithm, research of non-linear adaptive filter algorithm is very important. This paper carries out variable step-size uncorrelated algorithm study based on traditional Volterra LMS algorithm. Through simulation of weakness and moderate intensity correlation input signal, the improved algorithm achieved good effect, and convergence speed is greatly improved, after 500 times iteration, weight coefficient mean-square error norm (NSWE) can achieve -60dB, but traditional Volterra LMS algorithm can only achieve -6dB, which proved the convergence of the rapidity and precision. Keywords: Volterra series, LMS, non-linear adaptive filter, convergence speed, convergence precision.
1 Introduction

The main wing of an aircraft is subject to unsteady, separated vortex loads when the aircraft flies at large or moderate angles of attack. These loads can cause strong vibration on the surface of the main wing and, in serious cases, fatigue damage to the wing. Because the vortex loading of the main wing is very complex and usually exhibits non-linear characteristics, research on non-linear adaptive filter algorithms is essential for vibration reduction and de-noising of the main wing. Many non-linear adaptive filtering methods have been established for such problems, including neural networks, Kalman filters, particle filters, and Volterra filters. The Volterra series is a functional series, and most non-linear systems can be described with high precision by a Volterra series; the study of Volterra adaptive filter algorithms has therefore attracted the attention of researchers at home and abroad. The LMS algorithm has many advantages, such as simple structure and good stability; it is one of the classic, effective adaptive filter algorithms and is widely used in adaptive control, radar, system identification, and signal processing. However, the fixed step-size LMS adaptive algorithm involves an inherent trade-off among convergence rate, tracking speed, and misadjustment noise. To overcome this contradiction, various variable step-size LMS adaptive filter algorithms have been developed [1]. Building on an analysis of the traditional Volterra LMS adaptive filter algorithm, this paper introduces an improved algorithm.
2 Volterra LMS Adaptive Filter Algorithm

The relationship between the input x(n) and output y(n) of a discrete non-linear Volterra system can be expressed by the Volterra series in formula (1) [2]:

y(n) = h_0 + \sum_{m_1=0}^{N-1} h_1(m_1)\, x(n-m_1)
     + \sum_{m_1=0}^{N-1} \sum_{m_2=m_1}^{N-1} h_2(m_1, m_2)\, x(n-m_1)\, x(n-m_2)
     + \sum_{m_1=0}^{N-1} \sum_{m_2=m_1}^{N-1} \sum_{m_3=m_2}^{N-1} h_3(m_1, m_2, m_3)\, x(n-m_1)\, x(n-m_2)\, x(n-m_3) + \cdots
     + \sum_{m_1=0}^{N-1} \sum_{m_2=m_1}^{N-1} \cdots \sum_{m_d=m_{d-1}}^{N-1} h_d(m_1, m_2, \ldots, m_d)\, x(n-m_1)\, x(n-m_2) \cdots x(n-m_d) + \cdots   (1)
From formula (1) we can see that the Volterra series captures the dynamic behavior of the system; it can be regarded as a Taylor series with memory whenever the expansion exists, and it can therefore describe non-linear dynamic systems. In the formula, h_d(m_1, m_2, \ldots, m_d) is called the d-th order Volterra kernel and N is the memory length. Formula (1) shows that a non-linear system has infinitely many Volterra kernel terms, so in practical applications the series must be truncated, both in the order d and in the memory length N. The truncation should follow the actual system type and the required precision; usually a second-order Volterra truncation model is adopted. Assuming h_0 = 0, formula (1) simplifies to formula (2):

y(n) = \sum_{m_1=0}^{N-1} h_1(m_1)\, x(n-m_1) + \sum_{m_1=0}^{N-1} \sum_{m_2=m_1}^{N-1} h_2(m_1, m_2)\, x(n-m_1)\, x(n-m_2)   (2)
As formulas (1) and (2) show, when the order d and the memory length N are large, the computational load of the filter is large. The identification problem of a non-linear system based on the Volterra series is, for an unknown non-linear system, to use observations of the input and output signals and an identification rule to recursively estimate the kernels of each order online, determine the Volterra kernel coefficients, and thereby define the non-linear system. The structure of adaptive identification of a non-linear system based on the Volterra series model is shown in Figure 1 [3].
Fig. 1. Structure diagram of Volterra adaptive filter algorithm
By applying the Volterra adaptive filter algorithm we can identify an unknown non-linear system, making the error signal e(n) minimal in some sense, i.e., driving some cost function J(n) of e(n) to a minimum. In Figure 1, W(n) is the coefficient vector of the Volterra filter; when the cost function J(n) reaches its minimum, we can take the kernel vector H(n) ≈ W(n). Different choices of the cost function J(n) yield different adaptive algorithms. Because the Volterra input vector is defined differently from a linear filter input vector, the convergence conditions also differ; the desired signal of the Volterra series adaptive filter (i.e., the signal to be estimated) is a(n). The cost function J(n) is defined as in formula (3):

J(n) = e^2(n) = [y(n) - W^T(n) X(n)]^2   (3)
This corresponds to the traditional least-mean-square (LMS) criterion; the Volterra LMS adaptive algorithm is given by formula (4):

e(n) = y(n) - W^T(n) X(n)
W(n+1) = W(n) + \mu X(n) e(n)   (4)
The LMS algorithm has the advantage of a small computational load, but the non-linearity of the system enlarges the eigenvalue spread of the input-signal correlation matrix, so its convergence becomes slow. In addition, interference noise inevitably present in the input of the LMS algorithm produces parameter misadjustment, and the misadjustment is proportional to the iteration step size; a variable step-size LMS scheme can therefore be used to improve the performance of the algorithm. In a fixed step-size adaptive filter the requirements on convergence speed, tracking speed for time-varying systems, and convergence accuracy conflict with one another, so the convergence of the fixed-step Volterra LMS filter algorithm is slow. To speed up convergence, different step sizes can be adopted for the linear and non-linear parts [4]. The step-size adjustment of a variable step-size adaptive filter should satisfy the following: the step size should be large during the initial convergence stage or when the unknown system parameters change, giving a fast convergence rate and good tracking of time-varying systems; after the algorithm has converged, even when the interfering signal v(n) at the main input is large, the algorithm should keep the step size small to achieve small steady-state misadjustment noise.
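To make the update concrete, the following minimal Python sketch implements the second-order truncated Volterra LMS filter of formulas (2) and (4). It is an illustration only: the function names, the toy identification system, and the step size are our assumptions, not the authors' implementation.

import numpy as np

def volterra_input(x, n, N):
    # Second-order Volterra input vector X(n) for memory length N:
    # linear terms x(n-m1), then products x(n-m1)x(n-m2) with m2 >= m1,
    # matching the truncated series of formula (2).
    lin = [x[n - m] for m in range(N)]
    quad = [x[n - m1] * x[n - m2] for m1 in range(N) for m2 in range(m1, N)]
    return np.array(lin + quad)

def volterra_lms(x, d, N=3, mu=0.01):
    # Fixed-step Volterra LMS identification, formula (4): W tracks the
    # kernel vector of the unknown system with desired output d(n).
    num_taps = N + N * (N + 1) // 2      # first- plus second-order kernel terms
    W = np.zeros(num_taps)
    for n in range(N - 1, len(x)):
        X = volterra_input(x, n, N)
        e = d[n] - W @ X                 # a-priori error e(n)
        W = W + mu * e * X               # LMS weight update
    return W

# Toy usage on a small hypothetical second-order system:
rng = np.random.default_rng(0)
x = rng.standard_normal(4000)
d = np.zeros(4000)
for n in range(2, 4000):
    d[n] = 0.5 * x[n] - 0.3 * x[n - 1] + 0.2 * x[n] * x[n - 1]
W = volterra_lms(x, d, N=3, mu=0.02)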
3 LMS Variable Step-Size Decorrelation Algorithm

In the LMS algorithm we can define a correlation coefficient a(n) between X(n) and X(n-1), similar to a projection coefficient, as in formula (5):

a(n) = \frac{X^T(n) X(n-1)}{X^T(n-1) X(n-1)}   (5)
a(n) represents the degree of association between X(n) and X(n-1): the larger a(n) is, the stronger the connection between them. We can therefore write the improved update direction vector as in formula (6):
b(n) = X(n) - a(n) X(n-1)   (6)
Clearly, a(n) X(n-1) is the part of X(n) correlated with X(n-1); subtracting this part from X(n) is equivalent to a decorrelation operation. For example, suppose the input signal x(n) is generated by the model x(n) = a x(n-1) + v(n), where v(n) is Gaussian with mean 0 and variance 1. Then the correlation coefficient of x(n) is given by formula (7):

c(n) = \frac{E[x(n) x(n-1)]}{E[x^2(n-1)]} = \frac{E[(a x(n-1) + v(n)) x(n-1)]}{E[x^2(n-1)]} = a + \frac{E[v(n) x(n-1)]}{E[x^2(n-1)]}   (7)

Evidently, the larger the absolute value of a, the larger c(n) is and the stronger the correlation of x(n). Adjusting the weight coefficients with this correlation removed allows faster and more accurate convergence. Formula (4) can therefore be amended to formula (8):

W(n+1) = W(n) + \mu e(n) b(n)   (8)
To resolve the contradiction between steady-state error and convergence speed, we replace the constant step factor \mu of formula (8) with a variable step-size factor, given by formulas (9) and (10):

p(n) = \chi\, p(n-1) + (1 - \chi)\, e(n)\, e(n-1)   (9)

\mu(n+1) = \begin{cases}
\delta \mu(n) + \varepsilon p^2(n), & n = kL,\ \mu_{\min} \le \mu(n+1) \le \mu_{\max} \\
\mu_{\max}, & n = kL,\ \mu(n+1) > \mu_{\max} \\
\mu_{\min}, & n = kL,\ \mu(n+1) < \mu_{\min} \\
\mu(n), & n \ne kL
\end{cases}   (10)

In formula (10), L is the segment length; 0 < \chi < 1 and 0 < \delta < 1 with \delta close to 1; \varepsilon \ge 0 is a small positive constant; and \mu_{\min} and \mu_{\max} are the lower and upper limits of the step size. Substituting \mu(n+1) from formula (10) into formula (8) and normalizing, we obtain the weight-vector update of the variable step-size decorrelation LMS algorithm, formula (11):

W(n+1) = W(n) + \frac{\mu(n+1)}{\zeta + \|b(n)\|^2}\, e(n)\, b(n)   (11)
In formula (11), \zeta is a very small positive constant. Following [5], the weight-coefficient mean-square error norm (NSWE) is defined in formula (12):

NSWE = 10 \log_{10} \frac{\sum_{i=0}^{N-1} \left| h_i(n) - h_i^* \right|^2}{\sum_{i=0}^{N-1} \left| h_i^* \right|^2}   (12)

We plan to apply this adaptive filter algorithm to vibration reduction and de-noising of the aircraft main wing, and to seek filter algorithms with still better convergence [6].
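A minimal Python sketch of the complete variable step-size decorrelation update, formulas (5)-(11), together with the NSWE metric of formula (12), is given below. The initial step size mu0 and the small guard added to the denominator of a(n) are our assumptions, since the paper does not state them; parameter defaults follow the simulation settings of Section 4.

import numpy as np

def volterra_input(x, n, N):
    # Second-order Volterra input vector, as in the Section 2 sketch.
    lin = [x[n - m] for m in range(N)]
    quad = [x[n - m1] * x[n - m2] for m1 in range(N) for m2 in range(m1, N)]
    return np.array(lin + quad)

def decorrelated_vslms(x, d, N=3, L=12, chi=0.95, delta=0.95, eps=1e-6,
                       mu_min=0.001, mu_max=0.5, zeta=1e-6, mu0=0.01):
    # Variable step-size decorrelation Volterra LMS, formulas (5)-(11).
    num_taps = N + N * (N + 1) // 2
    W = np.zeros(num_taps)
    mu, p, e_prev = mu0, 0.0, 0.0
    X_prev = volterra_input(x, N - 1, N)
    for n in range(N, len(x)):
        X = volterra_input(x, n, N)
        e = d[n] - W @ X
        a = (X @ X_prev) / (X_prev @ X_prev + zeta)  # projection coeff. (5); zeta guards the denominator
        b = X - a * X_prev                           # decorrelated direction (6)
        p = chi * p + (1 - chi) * e * e_prev         # smoothed error product (9)
        if n % L == 0:                               # segment-wise step update (10)
            mu = min(max(delta * mu + eps * p ** 2, mu_min), mu_max)
        W = W + mu * e * b / (zeta + b @ b)          # normalized update (11)
        X_prev, e_prev = X, e
    return W

def nswe_db(W, h_true):
    # Weight-coefficient mean-square error norm, formula (12), in dB.
    return 10 * np.log10(np.sum((W - h_true) ** 2) / np.sum(h_true ** 2))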
4 Algorithm Simulation and Performance Analysis

Suppose the expected output signal a(n) of the non-linear system whose kernel coefficients of each order are to be identified is

a(n) = -0.75 x(n) + 0.42 x(n-1) - 0.34 x(n-2) + 0.5 x^2(n) + 0.23 x^2(n-1) - 1.51 x^2(n-2) - 0.54 x(n) x(n-1) + 1.74 x(n-1) x(n-2) - 0.9 x(n) x(n-2) + v2(n).

The input signal is x(n) = a x(n-1) + v1(n), where v1 and v2 are mutually independent Gaussian white noise signals with mean 0 and variance 1. The adaptive Volterra filter order is 2 and the memory length N is 3. We select \mu_{\max} = 0.5, \mu_{\min} = 0.001, L = 12, \delta = 0.95, \chi = 0.95, \varepsilon = 0.000001, and \zeta = 0.000001 (the parameters of formulas (9)-(11)); every simulation curve is the average of 20 independent simulation runs. When the input signal is weakly correlated, i.e., x(n) = 0.3 x(n-1) + v1(n), the convergence curves of each Volterra kernel coefficient under the improved LMS algorithm are shown in Figure 2.
Note: "―" denotes the improved Volterra LMS algorithm and "--" the traditional Volterra LMS filter algorithm.
Fig. 2. Convergence of the Volterra kernel coefficients for a weakly correlated input signal
Fig. 2. (continued)
The simulation results in Figure 2 show that the improved algorithm is superior to the traditional Volterra LMS algorithm in both convergence rate and steady-state misadjustment. Within 500 iterations the improved algorithm essentially converges, with good behavior and high convergence precision. The convergence speed of the traditional Volterra LMS algorithm lags far behind: only after 3,000 iterations do its Volterra weight coefficients reach basic convergence to an optimal value. A comparison of the weight-coefficient mean-square error norm (NSWE) is shown in Figure 3.
Note: "―" denotes the improved Volterra LMS algorithm and "--" the traditional Volterra LMS filter algorithm.
Fig. 3. Comparison of the weight-coefficient mean-square error norm (NSWE)
Figure 3 shows that after 500 iterations the weight-coefficient mean-square error norm of this paper's algorithm reaches -60 dB, whereas the traditional Volterra LMS algorithm reaches only -6 dB; the proposed algorithm thus performs far better than the traditional Volterra LMS algorithm. When the input signal has medium correlation strength, i.e., x(n) = 0.6 x(n-1) + v1(n), the improved algorithm is also superior to the traditional Volterra LMS algorithm. However, as the correlation strength increases (from 0.3 and 0.6 to 0.9), neither the traditional LMS algorithm nor this paper's Volterra algorithm achieves rapid convergence. Hence, under the complex non-linear vibration conditions of the aircraft main wing, the algorithm cannot yet achieve the vibration-reduction and de-noising purpose; this remains the direction and focus of future research.
5 Conclusions

This paper studied the Volterra LMS algorithm and proposed a variable step-size decorrelation improvement of the Volterra LMS algorithm, which greatly improves performance, especially when the input signal is weakly or moderately correlated. When the input signal is strongly correlated, however, the performance of the algorithm degrades greatly and it may not even converge. In future work we will use correlation theory and update the weight vector with orthogonal components of the input signal in the adaptive filter, carrying out decorrelation processing of the signal, so as to meet the demands of active control for vibration reduction and de-noising of the complex aircraft main wing.

Acknowledgments. This paper is supported by the Aeronautical Science Foundation of China (No. 2009ZD55001 and No. 2010ZD55006).
References
1. Jin, J., Li, D., Xu, Y.: New uncorrelated variable step LMS algorithm and its application. Computer Engineering and Applications, 57–58 (2008)
2. Liu, L., Hu, P., Han, J.: A Modified LMS Algorithm for Second-order Volterra Filter. Journal of China Institute of Communications, 122–123 (2002)
3. Long, J., Wang, Z., Xia, S., Duan, Z.: An Uncorrelated Variable Step-Size Normalized LMS Adaptive Algorithm. Computer Engineering & Science, 60–61 (2006)
4. Aboulnasr, T., Mayyas, K.: A robust variable step-size LMS-type algorithm: Analysis and simulations. IEEE Trans. on Signal Processing, 631–639 (1997)
5. Lee, J., Mathews, V.J.: A fast recursive least squares adaptive second-order Volterra filter and its performance analysis. IEEE Trans. on Signal Processing, 1087–1102 (1993)
6. Li, F., Zhang, H.: A new variable step size LMS adaptive filtering algorithm and its simulations. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 593 (2009)
Development Strategy for Demand of ICTs in Business Teaching of New and Old Regional Comprehensive Higher Education Institutes

Hong Liu
School of Business, Linyi University, Linyi 276005, P.R. China
[email protected]

Abstract. In China, many regional higher education institutes (mainly regional universities), regarded as catalysts for economic progress (especially for commerce), have been founded since 1998. From 2006, most of these regional higher education institutes were promoted to new comprehensive regional higher education institutes (CRHEIs) in succession. More and more students now study at these CRHEIs, so the quality of teaching must be improved. In this paper we analyze ICT (information and communication technology)-based teaching and learning issues at the level of CRHEIs and of old regional higher education institutes (ORHEIs, mainly regional universities). Using the normative Delphi method, we discuss ICT-based issues concerning ICT integration in CRHEIs and ORHEIs, and conclude that CRHEIs need a proper strategy in order to obtain the true benefits of ICT.

Keywords: ZPD incidence development strategy, ICT, comprehensive regional higher education.
1 Introduction
Comprehensive regional higher business education institutes need ICT systems to facilitate the exchange of ideas and information [1]. Many researchers recognize that the use of ICT tools and applications is an important business skill for any worker. ICT usage at an institute helps students to continue their learning beyond the classroom [2]. ICT-skilled teachers in comprehensive regional higher education institutes should adopt the right pedagogical tools and practices in their teaching and enable their students to embrace these new technologies. Over time, ICT has changed the whole process of teaching and learning. As ICT usage in institutes becomes a standard, teachers will become more informed, more interactive, and more confident in the use of various kinds of hardware and software to encourage and challenge students. Some teachers, however, lag far behind the others in adopting ICT innovation [3]. Change in the professional domain, in ICT innovation, and in teaching methods is nevertheless ineluctable. To keep pace with the times, teachers ought to change their pedagogical tools to adapt to such change. In China, many regional higher education institutes (mainly regional colleges) have been founded since 1999. From 2006, most of these regional higher
education institutes were promoted to comprehensive regional higher education institutes (CRHEIs) in succession. The number of students at these CRHEIs keeps growing, so the CRHEIs should improve their quality of teaching. In this study, teaching and learning with and without the help of ICT at CRHEIs and at old regional higher education institutes (ORHEIs) is explored. The ability gaps in tackling and solving problems are recorded, so that a proper strategy or mechanism can be figured out to reduce these ability gaps to a minimum. In the following, we take Linyi University as an example in our research on CRHEIs in P.R. China. In this research, the personnel of the School of Business of Linyi University were asked to devise a development strategy with the help of the normative Delphi technique. The purpose of these development strategies is to enrich teachers' and learners' experiences through the use of ICT in CRHEIs and thereby improve the output of CRHEIs. The School of Business of Linyi University now has 130 faculty members, of whom 11 are Professors and 47 are Vice Professors; 32 hold doctoral degrees, 10 are completing doctoral degree programs, and 73 hold master's degrees. The school now has 6,030 full-time undergraduates and non-degree students. We take Qingdao Science University as an example in our research on ORHEIs in P.R. China. The School of Business of Qingdao Science University was founded about 60 years ago and now has 101 faculty members, of whom 30 are Professors, 38 are Vice Professors, and 11 are lecturers; 88% of the Professors and Vice Professors have achieved a PhD degree. The school now has 3 degree programs and 2,081 full-time undergraduates. The organization of this paper is as follows. In Section 1 we introduce ICT in CRHEIs. In Section 2 we discuss the data analysis for the Schools of Business of Linyi University and Qingdao Science University. In Section 3 we present our conclusions: the development strategy for the requirement of ICT proposed by the panelists.

2 Data Analysis
In this paper we obtained the following data for the Schools of Business of Linyi University and Qingdao Science University in the manner of [4]. Following the normative Delphi technique, a questionnaire was prepared and hand-delivered to the 230 members of staff of the School of Business of Linyi University, and 180 panelists answered the questionnaire. Of the 126 members of staff of the School of Business of Qingdao Science University, 78 panelists answered the questionnaire. The data showing the ZPD gaps obtained through the questionnaires are given in Table 1. The concept of the Zone of Proximal Development (ZPD) was coined by [5]; a ZPD gap is the difference between the future/maximum and current state of any development or use of ICT. In this research, teaching and learning with and without the help of ICT at the CRHEI and ORHEI levels are explored and their ZPD gaps are recorded, so that a strategy that can reduce these gaps to a minimum can be devised. Please refer to Table 1 for the issues.
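As a concrete reading of this definition, the short Python sketch below computes a ZPD gap from panelists' ratings; the rating scale and the sample numbers are purely hypothetical, not taken from the survey data.

import numpy as np

def zpd_gap(current, future):
    # ZPD gap as defined above: mean difference between the desired
    # (future/maximum) and the current level of ICT use for one issue.
    return float(np.mean(np.asarray(future) - np.asarray(current)))

# Hypothetical ratings from five panelists on a 1-5 scale:
print(zpd_gap(current=[2, 1, 3, 2, 2], future=[5, 4, 5, 5, 5]))  # prints 2.8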
Table 1. ZPD gaps of Linyi University (new regional comprehensive higher education institute) and Qingdao Science University (old regional comprehensive higher education institute)

No.  Issue                                ZPD gap, Linyi University  ZPD gap, Qingdao Science University
1    Preparing/Developing Class Lecture   3.08                       2.87
2    Presenting/Sharing Material          1.98                       2.06
3    Assessing Student's Learning         1.51                       0.96
4    Managing Student Conduct             1.74                       1.78
5    Administrative Support               2.13                       1.75
6    Academic Research                    2.30                       1.97
7    Social Networks                      1.99                       1.00
1. A teacher prepares and develops class lectures by reading online and searching for information through ICT before the lecture. The ZPD gap of 3.08 for CRHEIs and 2.87 for ORHEIs shows the level at which teachers use ICT tools and applications for these tasks.
2. For developing course material, sharing educational content, and communication between teachers and the outside using ICT and applications, a ZPD gap of 1.98 for CRHEIs and 2.06 for ORHEIs is obtained.
3. Checking exam papers, recording grades, and announcing results take a lot of teachers' time. The ZPD gaps of teachers of CRHEIs and ORHEIs are 1.51 and 0.96 respectively.
4. For managing student conduct with the help of ICT, a ZPD gap of 1.74 for CRHEIs and 1.78 for ORHEIs is obtained.
5. Teachers spend a lot of time accomplishing administrative tasks such as keeping student records, issuing books, and supporting students with ICT and applications in their studies. The ZPD gaps of teachers of CRHEIs and ORHEIs regarding these tasks are 2.13 and 1.75 respectively.
6. For finding research information, communicating with researchers, and sharing ideas with other teachers, a ZPD gap of 2.03 for CRHEIs and 1.97 for ORHEIs is obtained.
7. Teachers quest for knowledge using social networks and learner forums. The ZPD gaps of teachers of CRHEIs and ORHEIs regarding these tasks are 1.99 and 1.00 respectively.

We now discuss the above data in the manner of [6]. In comprehensive regional higher education institutes, teachers usually perform a number of tasks. It takes a teacher a lot of time to prepare a class lecture for the day-to-day teaching task. A teacher must develop course material and share educational content using ICT and applications. If a teacher is effective at checking exam papers, recording grades, and announcing results, the record-keeping tasks become much easier. Administrative tasks require a teacher to spend time keeping student records, issuing books, and supporting students. There are many ICT tools and applications that a teacher can use while finding research information and sharing ideas with other teachers. In almost all aspects relating to these teaching and learning issues at CRHEIs in the PRC, big ability gaps are measured. Such significant gaps show the differing levels of staff in using ICT and applications. By contrast, small ability gaps are measured for ORHEIs in the PRC. The main causes of this difference are lack of funding, unavailability of resources, and lack of attitude or vision at CRHEIs compared with ORHEIs. The spread of ICT and applications is considered necessary in the CRHEIs of developing countries so that they can cope with the pedagogical challenges brought by the latest developments. However, few strategies have been devised to address these issues in developing countries (especially the PRC). Accordingly, in this study we try to devise a strategy including some important measures for ICT enhancement.
3 Conclusions

Comparing the ZPD gaps of Linyi University and Qingdao Science University, some development strategies for the requirement of ICT are listed below, most of them proposed by panelists of the School of Business of Linyi University.
1. ICT for teaching. RHE teachers need to be proficient in the use of ICT and applications to work effectively. Recommendations suggested by our panelists in this regard are: (1) design a persistent training program for faculty/staff in the use of ICT; (2) support and encourage teaching/support staff to use innovative methods of teaching in their routine work.
2. ICT for the proper attainment of students. Actions suggested by our panelists in this dimension are: (1) enable high-speed ICT access for management, faculty, and administrative staff; (2) develop and consummate the local ICT infrastructure; (3) measure students' progress between key stages through a management information system; (4) make the available software and hardware convenient to use while delivering a lecture or performing an administrative task.
3. ICT for teachers' development. Recommendations suggested by our panelists in this dimension are: (1) give RHE teaching/support staff opportunities to study and gain higher qualifications through scholarships/fellowships and by attending research-oriented events; (2) establish an information management strategy to be shared with other related stakeholders; (3) design a mechanism that provides other related stakeholders with access to appropriate information using portal technology.
References
1. Information and Communication Technology Strategic Plan, 2005-06 to 2009-10, University of Oxford, http://www.ict.ox.ac.uk/strategy/plan/ICT_Strategic_Plan_March2007pdf (retrieved August 2009)
2. Argyll and Bute Community Services: ICT Strategy for Education, http://www.argyllbute.gov.uk/pdffilesstore/infotechnologystrategy (retrieved August 2009)
3. Deshpande, P.: Connected – where next?: A strategy for the next phase in the development of education ICT in Bournemouth, http://www.bournemouth.gov.uk/Library/PDF/Education/Education_ICT_Strategy_2004_to_2009.pdf (retrieved August 2009)
4. Shaikh, Z.A.: Usage, acceptance, adoption, and diffusion of information & communication technologies in higher education: a measurement of critical factors. Journal of Information Technology Impact 9(2), 63–80 (2009)
A Novel Storage Management in Embedded Environment

Lin Wei and Zhang Yan-yuan
Dept. of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an, China
[email protected]

Abstract. Flash memory has been widely used in various embedded devices, such as digital cameras and smart cell phones, because of its fast access speed, high availability, and low power consumption. However, replacing hard drives with flash memory in current systems often either requires major file system changes or causes performance degradation, owing to the limitations of the block-based interface and the out-of-place updates required by flash. We introduce a management framework for an object-based storage system that can optimize performance for the underlying implementation. Based on this model, we propose a data allocation method that takes richer information from the object-based interface. Using simulation, we show that cleaning overhead can be reduced by up to 11% by separating data and metadata; segregating hot and cold data further reduces the cleaning overhead by up to 19%.

Keywords: Flash memory, Object-based flash file system, Data allocation.
1 Introduction

The demand for storage capacity has been increasing exponentially due to the recent proliferation of multimedia content. Meanwhile, NAND flash memory has become one of the most popular storage media for portable embedded systems such as MP3 players, cellular phones, PDAs (personal digital assistants), PMPs (portable media players), and in-car navigation systems. However, replacing hard drives with flash memory in current systems often either requires major file system changes or causes performance degradation due to the limitations of the block-based interface and the out-of-place updates required by flash. Flash memory requires intelligent algorithms to handle its unique characteristics, such as out-of-place update and wear-levelling. Thus, the use of flash memory in current systems falls into two categories: Flash Translation Layer (FTL [9,10])-based systems and flash-aware file systems. An FTL is usually employed between the operating system and the flash memory. The main role of the FTL is to emulate the functionality of a block device with flash memory, hiding the erase-before-write characteristic as much as possible. Once an FTL is available on top of NAND flash memory, any disk-based file system can be used. However, since the FTL operates at the block device level, it has no access to file-system-level information; this limits file system performance and wastes computing resources. On the other hand, several flash-aware file systems, such as JFFS2 [1], YAFFS2 [2], ELF [3], and TFFS [4], have been developed to simplify the file system design
without the need for an FTL and to extract maximum performance from flash memory. Currently, JFFS2 and YAFFS2 serve as the most widely used general-purpose flash file systems in embedded environments. However, flash-aware file systems are designed to be generic rather than tuned for specific hardware; they are therefore relatively inflexible and cannot easily optimize performance for a range of underlying hardware. To solve these problems, we propose an object-based model for flash. In this model, files are maintained in terms of objects of variable size. The object-based storage model offloads the storage management layer from the file system to the device firmware without sacrificing efficiency. Thus, object-based storage devices can have intelligent data management mechanisms and can be optimized for dedicated hardware such as SSDs. We simulate an object-based flash memory and propose two data placement policies based on a typical log-structured policy. Our first approach separates data and metadata, assuming that metadata changes more frequently than data. The second approach segregates hot metadata and cold metadata to avoid additional cleaning. We compare the cleaning overhead of these approaches to identify the optimal placement policies for an object-based flash memory.
2 Background

2.1 Current Flash Memory File Systems

JFFS2 is a log-structured file system designed for flash memories. The basic unit of JFFS2 is a node, in which variable-sized data and metadata of the file system are stored. Each node in JFFS2 maintains metadata for a given file, such as the physical address, its length, and pointers to the next nodes belonging to the same file. Using these metadata, JFFS2 constructs in-memory data structures that link the whole directory tree of the file system. This design was tolerable when JFFS2 was originally targeted at small flash memories. However, as the capacity of flash memory increases, the large memory footprint of JFFS2, caused mainly by keeping the whole directory structure in memory, becomes a severe problem. The memory footprint is usually proportional to the number of nodes; thus the more data the file system holds, the more memory is required. YAFFS2 is another variant of the log-structured file system [2]. The structure of YAFFS2 is similar to that of the original JFFS2. The main differences are that node header information is moved to the NAND spare area and that every data unit, called a chunk, has the same size as a NAND page, to utilize NAND flash memory efficiently. Similar to JFFS2, YAFFS2 keeps data structures in memory for each chunk to identify the physical location of the chunk on flash memory. It also maintains the full directory structure in main memory, since the chunk representing a directory entry has no information about its children. In order to build these in-memory data structures, YAFFS2 scans all the spare areas across the whole NAND flash memory. Therefore, YAFFS2 faces the same problems as JFFS2.

2.2 Object-Based Flash Translation Layer

In a system built on object-based storage devices (OSDs) [5,6], the file system offloads the storage management layer to the OSDs, giving the storage device more flexibility in
data allocation and space management. Recently, Rajimwale et al. proposed the use of an object-based model for SSDs [7]. The richer object-based interface has great potential to improve performance not only for SSDs but also for other new technologies. Our object-based model on flash can be divided into two main components: an object-based file system and one or more OSDs. The object-based file system maintains a mapping table between file names and unique object identifiers for name resolution. A flash-based OSD consists of an object-based FTL and the flash hardware. The object-based FTL in turn contains two parts: a data placement engine that stores data into available flash segments, and an index structure that maintains the hierarchy of physical data locations. A cleaning mechanism is embedded to reclaim obsolete space and manage wear levelling. The status of each object is maintained in a data structure called an onode (object inode), which is managed internally in the OSD.
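For illustration, a possible shape of the onode is sketched below in Python; the field names and types are our assumptions for the sketch, not the authors' on-flash layout.

from dataclasses import dataclass, field
from typing import Dict

@dataclass
class Onode:
    # Per-object state kept inside the OSD, as described above.
    object_id: int
    size: int = 0
    page_map: Dict[int, int] = field(default_factory=dict)  # object page -> flash page (index structure entry)
    write_count: int = 0  # update frequency, usable by the placement engine as a hot/cold hint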
3 Data Allocation Method

One optimization enabled by the object-based model is the exploration of intelligent data placement policies to reduce cleaning overhead. In a typical log-structured policy, data and metadata are written sequentially to a segment to avoid erase-before-write; we term this the combined policy. The problem is that different data types are stored together: since metadata is usually updated more frequently than user data, this approach forces the cleaner to move a large amount of live user data out before erasing a victim segment.

3.1 Centralized Policy

Our first approach, the centralized policy, separates metadata and data into different segments, as was done in systems such as DualFS [7] and hFS [8]. Figure 1 shows the data layout under the centralized policy. The SB (Super Block) occupies the first segment of the flash memory, and the SIB (Segment Information Block) records the status of each segment. The metadata of each object is centralized into the onode segment. Unlike systems that do not manage file metadata internally, this can easily be accomplished in OSDs, which receive sufficient information from the file system.
Fig. 1. Data location with centralized policy
3.2 Cold-Hot Model

When write requests are served, the system must allocate free pages to store the data, and the data can be classified as hot or cold. If the system stores hot and cold data in the same segment, the cleaning activity has to copy valid objects to other free flash space when reclaiming segments, which causes considerable extra overhead. To address this problem, hot and cold data are stored separately in different segments. When the system writes new data to flash memory, the data are written to the cold segment. If the data are later updated, they are considered hot and are written to the hot region. A segment containing obsolete data is then moved to the appropriate dirty list, according to its number of invalid pages, once it has no free pages left.
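The following Python sketch contrasts the three placement policies compared later in Section 4 (combined, centralized, and centralized plus hot-cold). The metadata/data separation and the first-write-is-cold rule follow the description above; the segment size, region names, and class interface are illustrative assumptions.

class SegmentAllocator:
    # Toy model of segment selection under the three placement policies.
    def __init__(self, policy="hot-cold", pages_per_segment=32):
        self.policy = policy
        self.pages_per_segment = pages_per_segment
        self.open = {}       # region name -> pages of the currently open segment
        self.sealed = []     # full segments handed to the cleaner
        self.seen = set()    # object ids written at least once

    def _region(self, obj_id, is_meta):
        if self.policy == "combined":
            return "all"                            # data and metadata mixed
        if self.policy == "centralized":
            return "meta" if is_meta else "data"    # Section 3.1 separation
        hot = obj_id in self.seen                   # Section 3.2: rewrites are hot
        base = "meta" if is_meta else "data"
        return base + ("-hot" if hot else "-cold")

    def write(self, obj_id, is_meta=False):
        region = self._region(obj_id, is_meta)
        self.seen.add(obj_id)
        seg = self.open.setdefault(region, [])
        seg.append((obj_id, is_meta))
        if len(seg) == self.pages_per_segment:      # segment full: seal it
            self.sealed.append((region, seg))
            self.open[region] = []
        return region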
4 Experiments

We have implemented a simulator consisting of 512 MB of NAND-type flash memory in the MTD [11] module of Linux. Table 1 lists our experimental environment and settings. We implement three different data placement policies: the combined policy, the centralized policy, and the centralized plus hot-cold policy. The workload generator converts file-system call-level traces into object-based requests and passes them to the OSDs. The FTL contains an index structure, the data placement policies, and a cleaner. The evaluation focuses on the cleaning overhead, in terms of the number of segments cleaned and the number of bytes copied during cleaning, under the three data placement policies.

Table 1. Experimental environment and settings

Experimental environment           NAND flash
CPU: Pentium 4 3.2 GHz             Block size: 16 KB
Memory: 512 MB                     Page size: (512 + 16) B
Flash memory: 512 MB               Page read time: 35.9 us
OS: Linux 2.6.11                   Page write time: 226 us
MTD module: blkmtd.o               Block erase time: 2 ms
For each policy in Figure 2, the left bar indicates the total number of segments cleaned and the right bar the number of bytes copied during garbage collection; each bar is normalized to the combined policy. The centralized policy reduces cleaning overhead by up to 11%, and the centralized plus hot-cold policy reduces it further, by up to 19%.
Fig. 2. Cleaning overhead of three placement policies
The amount of live data copied under the centralized policy in the read-heavy workload is reduced because dirty metadata segments contain less live data than data segments, so fewer pages are copied out of victim segments. By further segregating access-time updates from the rest of the metadata, the hot-cold policy avoids frequent metadata rewrites and reduces the cleaning overhead significantly more.
5 Conclusions

The performance of flash memory is limited by the standard block-based interface. To address this problem, we have proposed the use of an object-based storage model for flash memory. We have explored a data allocation method for an object-based file system that separates frequently updated metadata from data. The experiments show that the centralized and hot-cold data placement policies reduce cleaning overhead compared with the typical log-structured scheme.
References
1. Woodhouse, D.: JFFS: The Journaling Flash File System. In: Proc. Ottawa Linux Symposium (2001)
2. Aleph One Ltd.: YAFFS: Yet another flash file system, http://www.yaffs.net
3. Dai, H., Neufeld, M., Han, R.: ELF: an efficient log-structured flash file system for micro sensor nodes. In: ACM Conference on Embedded Networked Sensor Systems (SenSys), pp. 176–187 (2004)
4. Douglis, F., Cáceres, R., Kaashoek, M., Li, K., Marsh, B., Tauber, J.: Storage Alternatives for Mobile Computers. In: Symposium on Operating Systems Design and Implementation (OSDI), pp. 25–37 (1994)
5. Rajimwale, A., Prabhakaran, V., Davis, J.D.: Block management in solid-state devices. In: USENIX Annual Technical Conference 2009 (June 2009)
6. Woodhouse, D.: The journaling flash file system. In: Ottawa Linux Symposium, Ottawa, ON, Canada (July 2001)
7. Piernas, J., Cortes, T., García, J.M.: DualFS: a new journaling file system without meta-data duplication. In: Proceedings of the 16th International Conference on Supercomputing, pp. 84–95 (2002)
8. Zhang, Z., Ghose, K.: hFS: A hybrid file system prototype for improving small file and metadata performance. In: Proceedings of EuroSys 2007 (March 2007)
9. Intel Corporation: Understanding the Flash Translation Layer (FTL) Specification (1998), http://developers.intel.com
10. M-Systems: Flash-Memory Translation Layer for NAND Flash (NFTL)
11. Woodhouse, D.: Memory Technology Device (MTD) subsystem for Linux, http://www.linux-mtd.infradead.org/
Development Strategy for Demand of ICT in Small-Sized Enterprises Yanhui Chen School of Engineering, Linyi University, Linyi 276005, Shandong, P.R. China
[email protected]

Abstract. As time passes, information and communication technology (ICT) penetrates all aspects of industrial production and changes the whole process of industry. Over the past years, small-sized enterprises have played an important role in the economy, but little attention has been paid to the usage of ICT in small-sized enterprises in China. In this paper we analyze ICT-based issues raised by technicians of Linyi Xinyuan Friction Material Limited Company. The methodology mainly consists of questionnaires following the normative Delphi technique. The paper ends with recommendations that small-sized enterprise authorities should adopt in order to integrate ICT into their production.

Keywords: information and communication technology, ZPD gaps, small-sized enterprises.
1 Introduction
During the past 20 years, a wave of new information and communication technology (ICT) has been introduced with great impact on almost all aspects of industrial production worldwide [1]. ICT offers a new way in which ideas can be generated, communicated, and assessed [2]. Advances in the field of ICT, including e-mail, Internet bulletin boards, Internet-based courseware delivery, and video conferencing, have together changed the whole process of industry. In the past few years, small and medium-sized enterprises have played an important role in the economy and in easing employment pressure [3,4]. Small and medium-sized enterprises comprise more than 99% of all enterprises and more than 73% of the entire workforce in China (and similarly in other countries); they are vital for China as well as for other countries worldwide. Small-sized industrial enterprises in China are identified as follows: a small-sized enterprise employs fewer than 300 employees, and its annual income is not more than 30 million yuan or its balance value of assets is not more than 40 million yuan. As the importance of small-sized enterprises has increased, it has been accompanied by an increase in the amount of research attention paid to them. Small-sized enterprises now face enormous pressure as China integrates further into the world economy. How small-sized enterprises can develop in an increasingly competitive market has become one key issue [5]. How to stay competitive is a question that bothers most of
the enterprises, small-sized enterprises included, as they cannot compete on mass production [6]. One possible answer is innovation through lifelong and informal learning by the use of ICT. ICT is needed to facilitate the exchange of ideas and information about industrial production [7]. Over time, ICT has changed the whole process of industry. As ICT usage in industry becomes a standard, workers will become more informed, more interactive, and more confident in the use of various kinds of hardware and software. Many researchers recognize that the use of ICT tools and applications is an important life skill for any worker, yet some workers lag far behind the others in adopting ICT. A lot of research is dedicated to the usage of ICT in large enterprises, since large enterprises are able to invest more in ICT; little attention has been paid to the usage of ICT in small-sized enterprises in China, and the significance of ICT usage in small-sized enterprises is not highlighted. In this study, we take Linyi Xinyuan Friction Material Limited Company as an example in our research on Chinese small-sized industrial enterprises. In the manner of [5], we obtained the following data on Linyi Xinyuan Friction Material Limited Company. The company covers an area of more than 260 acres, with assets of more than 36 million yuan. There are 280 employees, including 59 high- and middle-level technicians. The main product of the company is automobile disc brake shoes, in more than four hundred models; annual production has reached 1 million sets of automobile disc brake shoes, applied in more than 500 different kinds of vehicles. Working and learning with and without the help of ICT in Chinese small-sized enterprises is explored. The ZPD gaps in tackling and solving problems are recorded, so that a proper strategy or mechanism can be figured out to reduce these gaps to a minimum. Following the normative Delphi technique, a questionnaire was prepared and hand-delivered to the 59 high- and middle-level technicians, and 54 of the staff answered the questionnaire [8]. Over the next two months, these 54 members completed further questionnaires for three rounds. The same group was asked to devise a development strategy that small-sized enterprise authorities should take in order to integrate ICT in their company. The purpose of these development strategies is to enrich technicians' experiences through the use of ICT, and thereby improve the output of small-sized enterprises. The concept of the Zone of Proximal Development (ZPD) was coined by [9,10]. A ZPD gap is the difference between the future/maximum and current state of any development or use of information technology. In this research, working and learning with and without the help of ICT are explored and the ZPD gaps are recorded, so that a strategy that can reduce these gaps to a minimum can be devised. The data showing the ZPD gaps obtained through the questionnaires are given in Table 1; please refer to Table 1 for the issues. The organization of this paper is as follows. In Section 1 we introduce ICT in small-sized enterprises. In Section 2 we discuss the data analysis of Linyi Xinyuan Friction Material Limited Company. In Section 3 we present our conclusion: the development strategy for the requirement of ICT proposed by the technicians.
2 Data Analysis
The data below show the ZPD gaps obtained through the questionnaires from the 54 technicians of Linyi Xinyuan Friction Material Limited Company. Please refer to Table 1 for the issues.

Table 1. ZPD gaps
No.  Issue                                                    ZPD gap
1    Learning Professional Knowledge                          2.16
2    Finishing Routine Tasks / Sharing Material               1.45
3    Communication with Leaders and Colleagues                1.03
4    Innovative Research                                      2.85
5    Use of Common ICT Tools                                  0.05
6    Reliance on ICT Tools in Small-Sized Enterprises         0.97
7    Use of ICT Tools in Small-Sized Enterprises              0.86
8    Getting Help from ICT Tools in Small-Sized Enterprises   0.90
9    ICT Demand in Small-Sized Enterprises                    1.36
10   ICT Supply in Small-Sized Enterprises                    2.90
1. A technician learns professional knowledge by reading online and searching for information on the Internet. A high ZPD gap (2.16) is recorded, which shows the level at which technicians of small-sized enterprises use ICT tools and applications for these tasks.
2. For finishing routine tasks and sharing material using ICT tools and applications, a ZPD gap of 1.45 is obtained.
3. For communication with colleagues, leaders, and the outside using ICT tools and applications, the ZPD gap is 1.03.
4. For finding professional information, communicating with researchers, and questing for knowledge using learner forums, a large ZPD gap (2.85) is obtained.
5. A very small ZPD gap of 0.05 is recorded for the use of common ICT tools such as MS Office, web browsers, e-mail, and search engines.
6. Regarding how much technicians of small-sized enterprises should rely on ICT tools and applications, a low ZPD gap (0.97) is recorded.
7. Regarding how much technicians of small-sized enterprises should use ICT tools and applications, a low ZPD gap (0.86) is recorded.
8. Regarding how much help technicians of small-sized enterprises get while using ICT tools and applications, a low ZPD gap (0.90) is recorded.
9. A ZPD gap of 1.36 is measured for the issue of the demand for ICT in small-sized enterprises of the PRC.
10. A ZPD gap of 2.90 is recorded for ICT supply in response to its demand in small-sized enterprises.

Technicians of small-sized enterprises usually perform a number of tasks. It takes a technician a lot of time to finish his routine tasks using ICT tools and applications. There are many ICT tools that a technician can use while finding research information and sharing ideas with other technicians. If a technician is effective at communicating with his colleagues, his tasks become much easier. In several aspects relating to these issues for technicians in small-sized enterprises, big ZPD gaps are measured; such significant gaps show the differing levels of staff in using ICT tools and applications. In this study, the use of MS Office, web browsers, search engines, and e-mail is popular among most staff. Beyond these skills, however, staff are less practiced in using other tools that are essential to the development of their career profiles. The main causes of such significant gaps are lack of funding, unavailability of resources, and lack of attitude or vision. The spread of ICT tools and applications is considered necessary in the small-sized enterprises of developing countries; however, few strategies have been devised to solve these issues in developing countries (especially the PRC). Accordingly, in this study we try to devise a strategy including some important measures for ICT enhancement.
3 Conclusions

Some development strategies for the requirement of ICT in small-sized enterprises were proposed by the technicians of Linyi Xinyuan Friction Material Limited Company. It is necessary for the government and small-sized enterprises to realize the full potential of ICT. All members of a small-sized enterprise, that is, management, technicians, and common laborers, should be involved in the readiness effort. Opinion leaders could be used as effective promotional vehicles in the implementation of ICT among small-sized enterprises. Small-sized enterprises currently implementing ICT should be identified and supported, and those implementing ICT successfully should be showcased as success models. While some small-sized enterprises want to adopt ICT to facilitate improved performance and subsequent growth, they may be constrained by finances. Small-sized enterprises should therefore be given the necessary attention in policy, financial, and general business support, and it is necessary for them to pursue more funds from the government. The local Internet needs to be developed and consummated. A technician-to-computer ratio of 3:2 should be reached. High-speed Internet access for management, technicians, and common laborers should be enabled, and technicians should be supported in using the Internet in their routine tasks. Technicians also need a lot of attention in adopting an effective and efficient e-learning environment, and must be keen to collaborate with other colleagues inside and outside the factory. Teams comprising management, technicians, and common laborers need to be formed to develop task-based educational content. ICT training centers that fulfill the training needs of technicians need to be established, and a persistent training program for technicians in the use of ICT should be designed.
References 1. Eyitayo, O.T., Ogwu, F.J., Ogwu, E.N.: Information Communication Technology (ICT) Education in Botswana: A Case Study of Students’ Readiness for University Education. Journal of Technology Integration in the Classroom 2(2), 117–130 (2010)
2. Bader, M.B., Roy, S.: Using Technology to Enhance Relationship in Interactive Television Classroom. Journal of Education for Business 74(6), 357–364 (1999)
3. Anderson, A.R., Li, J., Harrison, R.T., Robson: The Increasing Role of Small Business in the Chinese Economy. Journal of Small Business Management 41, 310–316 (2003)
4. Zhang, W.: Zhongguo Zhongxiao Qiye Fazhan Xianzhuang [The Development Status of China's Small and Medium-Sized Enterprises] (2005), http://www.ccw.com.cn
5. Cunningham, L.X., Rowley, C.: Small and Medium-Sized Enterprises in China: A Literature Review, Human Resource Management and Suggestions for Further Research. Asia Pacific Business Review 16(3), 319–337 (2010)
6. Raimonda, A., Pundziene, A.: Increasing the Level of Enterprise Innovation through Informal Learning: The Experience of Lithuanian SMEs. International Journal of Learning 16(11), 83–102 (2009)
7. Information and Communication Technology Strategic Plan, 2005-06 to 2009-10, University of Oxford, http://www.ict.ox.ac.uk/strategy/plan/ICT_Strategic_Plan_March2007pdf (retrieved August 2009)
8. Shaikh, Z.A.: Usage, acceptance, adoption, and diffusion of information & communication technologies in higher education: a measurement of critical factors. Journal of Information Technology Impact 9(2), 63–80 (2009)
9. Cyphert, F.R., Gant, W.L.: The Delphi technique: A case study. Phi Delta Kappan 52, 272–273 (1971)
10. Vygotsky, L.S.: Mind in Society: The Development of Higher Psychological Processes. Harvard University Press, Cambridge (1978)
Development Strategy for Demand of ICT in Medium-Sized Enterprises of PRC Yanhui Chen School of Engineering, Linyi University, Linyi 276005, Shandong, P.R. China
[email protected]

Abstract. As time passes, information and communication technology (ICT) has penetrated all aspects of industrial production worldwide. ICT offers a new way in which ideas can be generated, communicated, and assessed. During the past years, medium-sized enterprises have played an important role in the economy; however, little attention has been paid to the usage of ICT in medium-sized enterprises in China. In this paper we analyze ICT-based issues raised by technicians of Shandong Linyi Lingong Automobile Drive Axle Limited Company. The methodology mainly consists of questionnaires following the normative Delphi technique. Some recommendations are proposed for medium-sized enterprise authorities in order to properly integrate ICT into their production.

Keywords: information and communication technology, Delphi, ZPD gaps, medium-sized enterprises.
1 Introduction

In the past 20 years, a wave of new information and communication technology (ICT) has been introduced into almost all aspects of industrial production. Advances in the field of ICT, including e-mail, Internet bulletin boards, Internet-based courseware delivery, and video conferencing, have together changed the whole process of industry. ICT offers a new way in which ideas can be generated, communicated, and assessed [1]. During the past few years, small and medium-sized enterprises have played an important role in the economy and in easing employment pressure [2]. Small and medium-sized enterprises comprise more than 99% of all enterprises and more than 73% of the entire workforce in China; they are vital for China as well as for other countries worldwide. Medium-sized enterprises in China are identified as follows: a medium-sized industrial enterprise employs from 300 to 2,000 employees, and its annual income is from 30 million to 300 million yuan or its balance value of assets is from 40 million to 400 million yuan. As the importance of medium-sized enterprises has increased, it has been accompanied by an increase in the amount of attention paid to them. Medium-sized enterprises now face enormous pressure as China gradually integrates into the world economy. How medium-sized enterprises can develop in an increasingly competitive market has become one main problem [3]. How to stay
competitive is a question that bothers most of these enterprises, because they cannot compete on mass production [4]. One possible answer is innovation through lifelong and informal learning by the use of ICT. ICT is needed to facilitate the exchange of ideas and information about industrial production [5]. ICT has changed the whole process of industry as time has passed. As ICT usage in industry becomes a standard, workers will become more informed, more interactive, and more confident in the use of various kinds of hardware and software. Many researchers recognize that the use of ICT tools and applications is an important life skill for any worker, yet some workers lag far behind the others in adopting ICT. A lot of research is dedicated to the usage of ICT in large enterprises, since large enterprises are able to invest more in ICT. Little attention has been paid to the usage of ICT in medium-sized enterprises in China, and the importance of ICT usage in medium-sized enterprises is not highlighted.
2 Methodology

In this study, Shandong Linyi Lingong Automobile Drive Axle Limited Company is taken as an example in our research on Chinese medium-sized enterprises. The company covers an area of more than 46 acres, with assets of more than 400 million yuan. There are more than 1,200 employees, including 265 high- and middle-level technicians. The main products of the company are light automobile transmissions, agricultural-equipment transaxle cases, and engineering machinery component assemblies; annual production has reached 600,000 sets of automobile drive axles for various well-known automobile factories. Following the normative Delphi technique, a questionnaire was prepared and hand-delivered to the 265 high- and middle-level technicians, and 198 of the staff answered the questionnaire [6]. Over the next two months, these 198 members completed further questionnaires for three rounds. The same group was asked to devise a development strategy that medium-sized enterprise authorities should take in order to integrate ICT in their company. The purpose of these development strategies is to enrich technicians' experiences through the use of ICT, and thereby improve the output of medium-sized enterprises. The concept of the Zone of Proximal Development (ZPD) was coined by [7]. A ZPD gap is the difference between the future/maximum and current state of any development or use of information technology. In this research, working and learning with and without the help of ICT in Chinese medium-sized enterprises are explored. The ZPD gaps in solving problems are recorded, so that a proper mechanism can be figured out to reduce these gaps to a minimum. The data showing the ZPD gaps obtained through the questionnaires are given in Table 1.
3 Data Analysis

The data below show the ZPD gaps obtained through the questionnaires from the 198 technicians of Shandong Linyi Lingong Automobile Drive Axle Limited Company. Please refer to Table 1 for the issues.
Table 1. ZPD gaps

No.  Issue                                                     ZPD gap
1    Learning Professional Knowledge                           2.01
2    Finishing Routine Tasks / Sharing Material                1.23
3    Communication with Leaders and Colleagues                 0.96
4    Innovative Research                                       2.65
5    Use of Common ICT Tools                                   0.04
6    Reliance on ICT Tools in Medium-Sized Enterprises         0.86
7    Use of ICT Tools in Medium-Sized Enterprises              0.83
8    Getting Help from ICT Tools in Medium-Sized Enterprises   0.82
9    ICT Demand in Medium-Sized Enterprises                    1.22
10   ICT Supply in Medium-Sized Enterprises                    2.42
1. A technician learns professional knowledge by reading online and searching for information on the Internet. A high ZPD gap (2.01) is recorded, which shows the level at which technicians of medium-sized enterprises use ICT tools and applications for these tasks.
2. For finishing routine tasks and sharing material using ICT tools and applications, a ZPD gap of 1.23 is obtained.
3. For communication with colleagues, leaders, and the outside using ICT tools and applications, the ZPD gap is 0.96.
4. For finding professional information, communicating with researchers, and questing for knowledge using learner forums, a large ZPD gap (2.65) is obtained.
5. A very small ZPD gap of 0.04 is recorded for the use of common ICT tools such as MS Office, web browsers, e-mail, and search engines.
6. Regarding how much technicians of medium-sized enterprises should rely on ICT tools and applications, a low ZPD gap (0.86) is recorded.
7. Regarding how much technicians of medium-sized enterprises should use ICT tools and applications, a low ZPD gap (0.83) is recorded.
8. Regarding how much help technicians of medium-sized enterprises get while using ICT tools and applications, a low ZPD gap (0.82) is recorded.
9. A ZPD gap of 1.22 is measured for the issue of the demand for ICT in medium-sized enterprises of the PRC.
10. A ZPD gap of 2.42 is recorded for ICT supply in response to its demand in medium-sized enterprises.
4 Discussion
Technicians of medium-sized enterprises usually perform a number of tasks, and finishing routine tasks takes a technician a lot of time. There are many ICT tools a technician can use when finding research information and sharing ideas with other technicians. If a technician communicates effectively with his colleagues, his tasks become much easier.
In several aspects relating to these issues, big ZPD gaps are measured for technicians in medium-sized enterprises. Such significant gaps show the differing levels of staff in using ICT tools and applications. In this study, the use of MS Office, web browsers, search engines, and e-mail is popular among most staff. Beyond these skills, however, staff are less proficient in using other tools essential for their development. The main causes of such significant gaps are lack of funding, unavailability of resources, and lack of attitude or vision. The spread of ICT tools and applications is considered necessary in medium-sized enterprises of developing countries. However, few strategies have been devised to solve these issues in developing countries. Accordingly, in this study we devise a strategy comprising some important measures for ICT enhancement.
5 Conclusions
A development strategy for the ICT requirements of medium-sized enterprises is proposed by technicians of Shandong Linyi Lingong Automobile Drive Axle Limited Company. It is necessary for the government and medium-sized enterprises to realize the full potential of ICT. All members of the medium-sized enterprise, that is, management, technicians and common laborers, should be involved in readiness efforts. Opinion leaders could serve as effective promotional vehicles in the implementation of ICT among medium-sized enterprises. Medium-sized enterprises currently implementing ICT should be identified and supported, and those implementing it successfully should be showcased as success models. While some medium-sized enterprises want to adopt ICT to facilitate improved performance and subsequent growth, they may be constrained financially. Medium-sized enterprises should be given the necessary attention in policy, financial and general business support, and they themselves need to invest more funds in this field. The local internet needs to be developed and improved. A 1:1 technician-to-computer ratio should be reached, and high-speed internet access for management, technicians and common laborers should be enabled. Technicians also need considerable attention in adopting an effective and efficient e-learning environment. A persistent training program should be designed for technicians in the use of ICT, and ICT training centers that fulfill technicians' training needs should be established. Technicians must be keen to collaborate with colleagues inside and outside the factory. Teams comprising management, technicians and common laborers need to be formed to develop task-based education content.
References 1. Bader, M.B., Roy, S.: Using Technology to Enhance Relationship in Interactive Television Classroom. Journal of Education for Business 74(6), 357–364 (1999) 2. Anderson, A.R., Li, J., Harrison, R.T., Robson: The Increasing Role of Small Business in the Chinese Economy. Journal of Small Business Management 41(3), 310–316 (2003)
3. Cunningham, L.X., Rowley, C.: Small and Medium-Sized Enterprises in China: A Literature Review, Human Resource Management and Suggestions for Further Research. Asia Pacific Business Review 16(3), 319–337 (2010) 4. Raimonda, A., Pundziene, A.: Increasing the Level of Enterprise Innovation through Informal Learning: The Experience of Lithuanian SMEs 5. Information and Communication Technology Strategic Plan, 2005-06 to 2009-10. University of Oxford (retrieved August 2009) 6. Cyphert, F.R., Gant, W.L.: The Delphi Technique: A Case Study. Phi Delta Kappan 7. Vygotsky, L.S.: Mind in Society: The Development of Higher Psychological Processes. Harvard University Press, Cambridge (1978)
Diagnosing Large-Scale Wireless Sensor Network Behavior Using Grey Relational Difference Information Space Hongmei Xiang and Weisong He Chongqing College of Electronic Engineering, Chongqing, P.R. China
[email protected] Abstract. Grey relational difference information space (GRDIS) theory is an effective tool for wireless sensor network diagnosis and situation prediction in poor-information network systems. This paper discusses the application of GRDIS to poor-information systems in which the sampled wireless sensor network data are few or incomplete, the sample data update quickly, or the whole sample data set is very complex yet, in some spatial or temporal region, the sample data obey a regularity. Some experiments are presented. Keywords: GRDIS, Wireless Sensor Network, Behavior Diagnosis.
1 Introduction
A sensor is commonly viewed as a programmable, low-cost, low-power, functional tiny mobile or stationary device which usually has a much shorter working life span [1]. The deployment of sensor nodes is now close to ubiquitous: objects ranging from small ones such as insects to very large systems such as state highway infrastructure carry large numbers of embedded sensor nodes. In fact, almost any kind of automation relies heavily on a set of programmed sensor nodes. For example, programmed sets of sensor nodes drive fire detectors and remote health care, and immersive sensors monitor crop growth. There are very few units that do not use sensor nodes to perform their functions. The wide applications of wireless sensor networks and the challenges in designing such networks have attracted many researchers to develop protocols and algorithms for sensor networks [2][3][4][5][6][7]. However, network administrators often face systems whose running mechanisms are incompletely known and whose inherent behavior is hidden. This situation presents itself in the following ways: valid sampled network data are few and incomplete; sampled data update frequently and contradict each other; the overall data are complex, but the data in some temporal or spatial region exhibit regularity. In this paper, GRDIS is proposed as a simple scheme for analyzing large-scale wireless sensor network behavior under the condition that the given information is scarce.
The roadmap of the paper is as follows. To begin with, we motivate the need for diagnosing wireless sensor network behavior. Second, we illustrate grey relational difference information space theory. Furthermore, we provide the experimental design. In the end, we conclude.
1.1 Grey Relational Difference Information Space
Grey theory [8] was first proposed by Deng Julong in 1982 in China, and is used for prediction from small-scale, uncertain data. The research target of grey theory is the indeterminable, poor-information system, i.e., a system whose information is partly known and partly unknown. Grey relational difference information space is the basis of grey relational analysis.
Definition 2.1. Assume $@_{GRF}$ is a grey relational factor set and $\Delta_{GR}$ is the grey relational difference information space it induces, $@_{GRF} \Rightarrow \Delta_{GR}$, where

$$\Delta_{GR} = \{\Delta, \zeta, \Delta_{0i}(\max), \Delta_{0i}(\min)\}, \quad \Delta_{0i}(k) = |x_0(k) - x_i(k)|, \; k \in K = \{1, 2, \ldots, n\}.$$

Let $\gamma(x_0(k), x_i(k))$ be the comparison metric at the $k$-th point of $\Delta_{GR}$, where $x_0$ is the reference column and $x_i$ is the compare column, and let $\gamma(x_0, x_i)$ be the average value of $\gamma(x_0(k), x_i(k))$ over $k \in K$. Suppose $\gamma$ satisfies:

1) Normality: $0 < \gamma(x_0, x_i) \le 1$; $\gamma(x_0, x_i) = 1 \Leftrightarrow x_0 = x_i$, or $x_i$ and $x_0$ are isomorphic; $\gamma(x_0, x_i) = 0 \Leftrightarrow x_0, x_i \in \varphi$.

2) Symmetry: $\gamma(x, y) = \gamma(y, x)$ iff $X = \{x, y\}$.

3) Wholeness: $\gamma(x_i, x_j) \ne \gamma(x_j, x_i)$ for $x_i, x_j \in X$, $X = \{x_i \mid i \in I, \; \mathrm{POT}.I \ge 3\}$.

4) Closeness: the smaller the difference information $\Delta_{0i}(k)$ is, the bigger $\gamma(x_0(k), x_i(k))$ is, denoted $\Delta_{0i}(k) \downarrow \; \Rightarrow \; \gamma(x_0(k), x_i(k)) \uparrow$.

Then we refer to $\gamma(x_0(k), x_i(k))$ as the grey relational coefficient of $x_i$ to $x_0$ at point $k$, and to $\gamma(x_0, x_i) = \frac{1}{n} \sum_{k=1}^{n} \gamma(x_0(k), x_i(k))$ as the grey relational grade of $x_i$ to $x_0$. We call these four conditions the four axioms of grey relation.

Theorem 2.1. Let $(\Delta_{GR}, \Gamma)$ be the state of the grey relational difference information space, with $\Delta_{GR} = \{\Delta, \zeta, \Delta_{0i}(\max), \Delta_{0i}(\min)\}$, $\Delta = \{\Delta_{0i}(k) \mid i \in I, k \in K = \{1, 2, \ldots, n\}\}$ or $\Delta = \{\Delta_{0i} \mid i \in I\}$, $\Delta_{0i}(k) = |x_0(k) - x_i(k)|$, and $\Gamma$ satisfying the four axioms of grey relation. Then under $(\Delta_{GR}, \Gamma)$ the grey relational coefficient $\gamma(x_0(k), x_i(k))$ is

$$\gamma(x_0(k), x_i(k)) = \frac{\min_i \min_k \Delta_{0i}(k) + \zeta \max_i \max_k \Delta_{0i}(k)}{\Delta_{0i}(k) + \zeta \max_i \max_k \Delta_{0i}(k)}, \quad \zeta \in [0, 1],$$

and the grey relational grade $\gamma(x_0, x_i)$ is

$$\gamma(x_0, x_i) = \frac{1}{n} \sum_{k=1}^{n} \gamma(x_0(k), x_i(k)).$$
The basic tasks of grey relational analysis are the micro- or macro-level approximation of behavior, and the analysis and determination of the degree of impact of each factor and of its contribution to the primary behavior.
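The coefficient and grade in Theorem 2.1 translate directly into code. The following is a minimal NumPy sketch of those two formulas; the distinguishing coefficient zeta defaults to 0.5, a common choice, since the paper does not state the value it uses.

```python
import numpy as np

def grey_relational_grade(x0, X, zeta=0.5):
    """Grey relational grades of comparison series (rows of X) with
    respect to the reference series x0, per Theorem 2.1.

    zeta is the distinguishing coefficient, zeta in [0, 1]."""
    x0 = np.asarray(x0, dtype=float)
    X = np.atleast_2d(np.asarray(X, dtype=float))
    delta = np.abs(X - x0)        # difference information Delta_0i(k)
    d_min = delta.min()           # min over i and k
    d_max = delta.max()           # max over i and k
    gamma = (d_min + zeta * d_max) / (delta + zeta * d_max)  # coefficients
    return gamma.mean(axis=1)     # grade: average coefficient over k
```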
2 Experimental Result
In this paper, we collected data from eight sensor nodes. The wireless sensor network is shown in Fig. 1.
Fig. 1. Eight Nodes
The data from these eight sensor nodes, collected over successive time bins, make up eight time series (panels s1 through s8 in Fig. 2). The time series are arranged in a data matrix, where sensor nodes vary across columns and time varies across rows. The result is shown in Fig. 2.
Fig. 2. Eight Time Series
We apply grey relational difference information space theory; the result is as follows. Take four time points as an example. S1 = (3.94175, 4.10419, 4.02862, 4.10267); S2 = (4.59713, 4.50804, 4.55134, 4.65386); S3 = (6.12177, 6.08516, 6.23075, 6.16251); S4 = (7.30286, 6.9077, 6.9079, 7.28666); S5 = (5.9208, 5.70464, 5.69311, 5.62988);
S6 = (1.14474, 1.15791, 1.12062, 1.17267); S7 = (250526, 150575, 161565, 155327); S8 = (920, 934, 564, 621). After the computation, we obtain the relational grades $\gamma(x_0, x_i)$:

$\gamma(x_{s5}, x_{s1}) = 0.7795$, $\gamma(x_{s5}, x_{s2}) = 0.8811$, $\gamma(x_{s5}, x_{s3}) = 0.8459$, $\gamma(x_{s5}, x_{s4}) = 0.9066$, $\gamma(x_{s5}, x_{s6}) = 0.8536$, $\gamma(x_{s5}, x_{s7}) = 0.5128$, $\gamma(x_{s5}, x_{s8}) = 0.6291$.

Since $\gamma(x_{s5}, x_{s4})$ is the biggest of all the relational grades, s4 impacts s5 the most, with s2 second, s6 third, s3 fourth, s1 fifth, s8 sixth, and s7 seventh.
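As a cross-check, the computation above can be reproduced with the grade function sketched after Theorem 2.1 (restated compactly below). The paper does not report its normalization or its zeta, so each series is scaled by its own mean here, a common GRA preprocessing step; the resulting ranking can be compared with the one reported, though the grade values themselves need not match exactly.

```python
import numpy as np

def grade(x0, X, zeta=0.5):
    # Deng's grey relational grade (same formula as in the earlier sketch).
    delta = np.abs(np.atleast_2d(X) - x0)
    return ((delta.min() + zeta * delta.max())
            / (delta + zeta * delta.max())).mean(axis=1)

S = {"s1": [3.94175, 4.10419, 4.02862, 4.10267],
     "s2": [4.59713, 4.50804, 4.55134, 4.65386],
     "s3": [6.12177, 6.08516, 6.23075, 6.16251],
     "s4": [7.30286, 6.9077, 6.9079, 7.28666],
     "s6": [1.14474, 1.15791, 1.12062, 1.17267],
     "s7": [250526, 150575, 161565, 155327],
     "s8": [920, 934, 564, 621]}
s5 = np.array([5.9208, 5.70464, 5.69311, 5.62988])   # reference series

x0 = s5 / s5.mean()                                   # mean normalization
X = np.array([np.array(v) / np.mean(v) for v in S.values()])
grades = grade(x0, X)
for name, g in sorted(zip(S, grades), key=lambda t: -t[1]):
    print(name, round(float(g), 4))
```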
3 Summary
In this paper, we apply the grey relational difference information space method to multivariate time series to obtain the relations among sensor nodes. From the experimental results, we conclude that the grey relational difference information space method is a good method for poor-information network systems, especially when the sampled wireless sensor network data are few or incomplete, the sample data update quickly, or the whole sample data set is very complex yet, in some spatial or temporal region, the sample data obey a regularity.
Acknowledgement. The authors would like to thank the reviewers for their helpful comments. This research is supported by the Chongqing Education Committee Research Foundation under Grant KJ092503.
References 1. Kumar, V.: Sensor: The Atomic Computing Particle. ACM Sigmod Record (December 2003) 2. Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: Wireless sensor networks: A survey. Computer Networks 38(4), 393–422 (2002) 3. Karlof, C., Wagner, D.: Secure routing in wireless sensor networks: Attacks and countermeasures. In: Proceedings of 1st IEEE International Workshop on Sensor Network Protocols and Applications (May 2003) 4. Newsome, J., Song, D.: GEM: graph embedding for routing and data-centric storage in sensor networks without geographic information. In: Proceedings of the First ACM Conference on Embedded Networked Sensor Systems (SenSys 2003), pp. 76–88 (November 2003)
5. Perrig, A., Szewczyk, R., Wen, V., Culler, D., Tygar, J.D.: SPINS: Security protocols for sensor networks. In: Proceedings of the Seventh Annual International Conference on Mobile Computing and Networks (July 2001) 6. Rajasegarar, S., Leckie, C., Palaniswami, M.: Anomaly detection in wireless sensor networks. IEEE Wireless Communications 15(4), 34–40 (2008) 7. Yu, D.: DiF: A Diagnosis Framework for Wireless Sensor Networks. In: Proceedings of IEEE INFOCOM, pp. 1–5 (2010) 8. Deng, J.-l.: Grey Forecast and Grey Decision. Huazhong University of Science and Technology Press, Wuhan
Mining Wireless Sensor Network Data Based on Vector Space Model Hongmei Xiang and Weisong He Chongqing College of Electronic Engineering, Chongqing, P.R. China
[email protected] Abstract. In this letter, in order to explore wireless sensor network data, a vector space model is applied. With this method, the similarity between query features and wireless sensor network features is calculated, which facilitates detecting anomalies. Experiments conducted on SIGCOMM 2008 trace data show some positive results. Keywords: Vector space model, Wireless sensor network, Feature Analysis.
1 Introduction
The advent of cheap, compact sensor nodes with an on-board central processing unit (CPU), memory, and wireless radio has enabled the development of wireless sensor networks that support in-network processing [1]. For the sake of remote monitoring of a heterogeneous environment or control of actuators in a homogeneous one, sensor networks are often deployed in an unattended area of interest. Examples of applications of wireless sensor networks include home automation, vehicle tracking, target detection, and environmental monitoring [1]. Robust and reliable monitoring, with unusual activities detected in an accurate and timely manner, is necessary in many of these applications. In reality, however, sensor nodes have limited power, bandwidth, memory, and computational capabilities [2]. These inherent limitations can make the network more vulnerable to faults and malicious attacks [3], [4]. Identifying anomalies or misbehavior in the network is therefore important for its reliable and secure functioning. An anomaly or outlier in a set of data is defined as an observation that appears to be inconsistent with the remainder of the data set [5]. By analyzing either sensor data measurements or traffic-related attributes in the network, we can identify abnormal behaviors. Identifying anomalies with acceptable accuracy while minimizing energy consumption in resource-constrained wireless sensor networks is a challenge. The majority of energy in sensor networks is consumed in radio communication; for example, in Sensoria sensors and Berkeley motes, the ratio between communication and computation energy consumption ranges from 10^3 to 10^4 [6]. Thus, how to exploit distributed in-network processing to minimize the communication requirements is a key research challenge for anomaly detection in this context, and efficient distributed algorithms for anomaly detection are required. In contrast, centralized approaches to anomaly detection require large numbers of
raw measurements to be communicated to a selected centralized node for processing, which reduces the lifetime of the network and depletes the energy in the sensor network. The remainder of the paper is organized as follows. To begin with, we introduce our approach. Second, we conduct experiments. In the end, we conclude.
2 Our Approach
A. Vector Space Model
Vector space model (VSM) [7] is an algebraic model for representing text documents (and any objects, in general) as vectors of identifiers, such as, for example, index terms. It is used in information filtering, information retrieval, indexing and relevancy rankings. Its first use was in the SMART Information Retrieval System. A typical three-dimensional index space is shown in Figure 1, where each item is identified by up to three distinct terms. The three-dimensional example may be extended to $t$ dimensions when $t$ different index terms are present. In that case, each document $D_i$ is represented by a $t$-dimensional vector $D_i = (d_{i1}, d_{i2}, \ldots, d_{it})$, with $d_{ij}$ representing the weight of the $j$th term. Given the index vectors for two documents, it is possible to compute a similarity coefficient between them, $sim(D_i, D_j)$, which reflects the degree of similarity in the corresponding terms and term weights. Let $D_i = (w_{i1}, w_{i2}, \ldots, w_{it})$ and $D_j = (w_{j1}, w_{j2}, \ldots, w_{jt})$; then

$$sim(D_i, D_j) = \sum_{k=1}^{t} w_{ik} \cdot w_{jk}. \quad (1)$$
Such a similarity measure might be the inner product of the two vectors, or alternatively an inverse function of the angle between the corresponding vector pairs; when the term assignment for two vectors is identical, the angle will be zero, producing a maximum similarity measure.
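In code, Eq. (1) is simply the dot product of the two weight vectors. A one-function sketch follows; the sample numbers are illustrative.

```python
def sim(d_i, d_j):
    """Inner-product similarity of Eq. (1) between two weight vectors."""
    return sum(w_ik * w_jk for w_ik, w_jk in zip(d_i, d_j))

print(sim([0.0, 0.5, 0.3], [0.2, 0.5, 0.0]))  # 0.25
```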
Fig. 1. Vector Representation of Document Space
B. Mining Wireless Sensor Network Data with VSM
The most common method for calculating weights is Term Frequency-Inverse Document Frequency (TFIDF):

$$(IDF)_k = [\log_2 n] - [\log_2 d_k] + 1, \quad (2)$$

where $n$ is the total number of documents and $d_k$ is the number of documents that contain feature $k$. Let $f_i^k$ denote the number of occurrences of feature $k$ in document $i$; then the feature weight is $f_i^k \cdot (IDF)_k$. The formula shows that the more often a feature occurs in a document, the more it benefits that document; but the more widely the feature appears across the whole document space, the smaller its contribution. We construct the mathematical model as follows. Let every data item be denoted as an $n$-dimensional vector $D_i = (d_{i1}, d_{i2}, \ldots, d_{in})$, $i = 1, 2, \ldots, m$. The query row vector is $Q = (q_1, q_2, \ldots, q_n)$, where $q_j = 1$ if the $j$th feature occurs in the query $Q$, and $q_j = 0$ otherwise. The document feature weight matrix $A$ is constructed as

$$A = \begin{bmatrix} D_1 \\ D_2 \\ \vdots \\ D_m \end{bmatrix}, \quad (3)$$

where $d_{ij}$ denotes the $j$th feature weight of document $D_i$, that is, the number of occurrences of the $j$th feature in document $D_i$. Let $N$ denote the total number of documents and $n_j$ the number of documents that contain the $j$th feature. Then the feature weight is modified as $w_{ij} = d_{ij} \cdot \log(N / n_j)$. Letting $W$ denote the feature weight matrix,

$$W = A \cdot \log(N / n_j). \quad (4)$$
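A minimal sketch of the weighting in Eqs. (3)-(4) and the query scoring follows, assuming documents arrive as lists of feature tokens (as in the worked example in the next section). Base-10 logarithms are used because they match the numbers in that example; this choice, like the function names, is an assumption rather than something the paper specifies.

```python
import math

def weight_matrix(docs, features):
    """Build A (raw term counts, Eq. (3)) and W = A * log10(N / n_j), Eq. (4)."""
    N = len(docs)
    A = [[doc.count(f) for f in features] for doc in docs]
    n = [sum(1 for doc in docs if f in doc) for f in features]  # doc freq n_j
    W = [[a * math.log10(N / n[j]) if n[j] else 0.0
          for j, a in enumerate(row)]
         for row in A]
    return A, W

def score(W, q_w):
    """P = W . Q_W: inner-product score of each document against the query."""
    return [sum(w * q for w, q in zip(row, q_w)) for row in W]
```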
3 Experimental Result
The SIGCOMM traces of wireless traffic belonging to the traced network were gathered at several monitoring nodes distributed across the conference floor from 10:54 to 15:40 on Aug. 21, 2008. In addition, the traces on the wired switch to which the wireless access points connect were gathered. Here is a description of the traces gathered and the anonymization performed. Our description here focuses on tracing on the wireless LAN. A
subset of this (viz., everything above the PHY layer) also applies to the tracing on the wired LAN. Each monitor captures all of the 802.11 frames it sees, including data frames, management frames (e.g., association, authentication), and control frames (e.g., RTS, CTS, ACK). For each wireless frame captured at a monitor, we record up to 250 bytes of the following information: per-frame PHY information (channel frequency, RSSI and modulation rate); the entire MAC header, with only the source and destination MAC addresses; the entire IPv4 and TCP/UDP header, with the source and destination IPv4 addresses anonymized; the entire DHCP payload; and the DNS request/response payload. T-Fi plot visualizations provide a quick understanding of the completeness of an 802.11 packet trace. A T-Fi plot is a heat map. Firstly, the orientation on the y-axis shows completeness: the fraction of transmitted packets caught by the monitor. Secondly, the width of the shaded region on the x-axis shows the range of load. Finally, the intensity of the shaded region shows the frequency of load. We define load (x-axis) as the number of packets sent by the AP and all associated clients between two beacon packets received from the AP. Over the same interval we define score (y-axis) as an approximation of the completeness of that interval. A score of 1 indicates the interval is complete; a score of 0 indicates that none of the packets were captured from that interval. The T-Fi plot is shown in Fig. 2.
Fig. 2. T-Fi plot
As an example, consider three feature documents extracted from the trace and a query:

D1: <ACK, 190.35.226.121, ARP, IPV6, ACK, ARP, ACK>
D2: <ACK, TCP, ARP, NBNS, TCP>
D3: <ACK, IEEE802.11, IPV6, TCP>
Q: <ACK, TCP, 190.35.226.121>
Then, ordering the features as (ACK, 190.35.226.121, ARP, IPV6, TCP, NBNS, IEEE802.11),

$$W = \begin{bmatrix} 0 & 0.477 & 0.352 & 0.176 & 0 & 0 & 0 \\ 0 & 0 & 0.176 & 0 & 0.352 & 0.477 & 0 \\ 0 & 0 & 0 & 0.176 & 0.176 & 0 & 0.477 \end{bmatrix}, \qquad Q_W = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0.125 \\ 0 \\ 0 \end{bmatrix},$$

$$P = W \cdot Q_W = \begin{bmatrix} 0.477 \\ 0.044 \\ 0.022 \end{bmatrix}.$$

Therefore, the ranking of the documents is D1, D2, D3.
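Running the sketch from Section 2 on these three documents, with the features ordered as above, reproduces the rows of W and the scores in P (log base 10); the query weight 0.125 for TCP is taken from the paper as given, since its derivation is not spelled out there.

```python
features = ["ACK", "190.35.226.121", "ARP", "IPV6", "TCP", "NBNS", "IEEE802.11"]
docs = [
    ["ACK", "190.35.226.121", "ARP", "IPV6", "ACK", "ARP", "ACK"],  # D1
    ["ACK", "TCP", "ARP", "NBNS", "TCP"],                           # D2
    ["ACK", "IEEE802.11", "IPV6", "TCP"],                           # D3
]
A, W = weight_matrix(docs, features)       # from the earlier sketch
q_w = [0, 1, 0, 0, 0.125, 0, 0]            # weighted query, as in the paper
print(score(W, q_w))                       # ~[0.477, 0.044, 0.022] -> D1, D2, D3
```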
4 Summary
In this paper, a VSM-based approach is proposed to detect anomalies in wireless sensor networks. According to the experimental results, the VSM-based approach is a promising method for discovering such anomalies.
Acknowledgement. The authors would like to thank the reviewers for their helpful comments. This research is supported by the Chongqing Education Committee Research Foundation under Grant KJ092503.
References 1. Akyildiz, I.F., et al.: Wireless Sensor Networks: A Survey. Computer Networks 38(4), 393–422 (2002) 2. da Silva, A., et al.: Decentralized Intrusion Detection in Wireless Sensor Networks. In: Proc. 1st ACM Int'l. Wksp. on QoS and Security in Wireless and Mobile Networks, pp. 16–23 (2005) 3. Djenouri, D., Khelladi, L., Badache, A.: A Survey of Security Issues in Mobile Ad Hoc and Sensor Networks. IEEE Commun. Surveys and Tutorials 7(4), 2–28 (2005) 4. Shi, E., Perrig, A.: Designing Secure Sensor Networks. IEEE Wireless Communications, 38–43 (2004) 5. Hodge, V., Austin, J.: A Survey of Outlier Detection Methodologies. Artificial Intelligence Rev., 85–126 (2004) 6. Zhao, F., et al.: Collaborative Signal and Information Processing: An Information-Directed Approach. Proc. IEEE 91(8), 1199–1209 (2003) 7. Salton, G., Wong, A., Yang, C.S.: A vector space model for automatic indexing. Communications of the ACM 18(11), 613–620 (1975)
Influencing Factors of Communication in Buyer-Supplier Partnership Xudong Pei School of Economics and Management, Xi’an Shiyou University, Xi’an, 710065, China
[email protected] Abstract. Inter-organizational communication has been documented as a critical factor in promoting collaboration among firms. However, the influencing factors of communication remain unclear. Based on social exchange theory, this paper explores the influencing factors of communication in the context of buyer-supplier partnership. The results show that trust, commitment and dependence are positively associated with communication in buyer-supplier partnership. Keywords: trust, commitment, dependence, communication.
1 Introduction
Over the last several years, there has been a growing interest in inter-organizational relations both in research and in practice. Firms have recognized the need to manage the supply chain as part of broader business strategies, and in particular to build and exploit collaborative relationships with supply chain partners. Managers and researchers, in the interest of determining how to develop more effective inter-firm relationships, have enlarged the focus from formal contracts to more behavioral and relational approaches. Managers believe that these latter approaches can create more flexible, responsive partnerships. Consequently, inter-organizational partnership receives considerable research attention. However, unless two-way communication between buyer and supplier exists, partnerships cannot adequately provide for overall long-term competitiveness. That communication is the essence of organizational life has been well documented by communication and management scholars and practitioners [1]. Similarly, the literature on relationship marketing has recognized that collaborative communication is critical to fostering and maintaining value-enhancing inter-organizational relationships [2][3]. Reflecting its centrality to business performance, one business executive asserted that communication is as fundamental to business as carbon is to physical life [1]. Operations management researchers have also documented how inter-organizational communication enhances buyer–supplier performance [4-6]. In empirical studies, researchers have typically considered communication as a facet of a broader construct, such as supply management [7], or examined the extent to which the use of select communication strategies by buyer firms enhances supplier firm operational performance [5]. Although the importance of communication in buyer-supplier
partnership has been recognized by researchers and managers, the influencing factors of communication remain unclear. Based on social exchange theory, this paper addresses these important gaps and investigates the influencing factors of communication in the context of buyer-supplier partnership. We propose that trust, commitment and dependence are positively associated with communication in buyer-supplier partnership.
2 Theory and Hypotheses
1. Dependence and communication. Dependence can be conceptualized as the economic power one firm has over another, which in turn may result in significant levels of adaptation [8]. As firms join forces to achieve mutually beneficial goals, they acknowledge that each party is dependent on the other. Dependence results from a relationship in which both firms perceive mutual benefits from interacting and in which any loss of autonomy will be equitably compensated through the expected gains [9]. Both parties recognize that the advantages of interdependence provide benefits greater than either could attain singly. Interdependency between partners increases when the size and importance of the exchange are high, when participants consider their partner the best alternative, and when there are few alternatives or potential sources of exchange. Communication can be defined broadly as the formal as well as informal information shared with the partners in a timely manner [10]. This definition centers on the efficacy of information exchange: manufacturers require formal information such as cost, quality, and quantity information with respect to their products, as well as overall organizational aspects, from suppliers at any time and for any reason. Gulati and Sytch noted that a high level of interdependence generates conditions that develop and maintain trust and commitment owing to opportunistic behavioral costs, differential power (dependence), and asymmetric control [11]. The dependent partner may be willing to honor a request made by its partner, and the superior partner may make requests of the dependent partner that solely benefit itself. In a relationship where a supplier firm depends on a larger buyer for a substantial part of its output, the buyer may have a degree of coercive power over the supplier. Communication as a result of dependence is more likely to take place when there is buyer concentration and on account of the buyer's importance to the supplier. Therefore, we propose:
Hypothesis 1. Dependence is positively associated with communication in buyer-supplier partnership.
2. Commitment and communication. Partner commitment refers to an exchange partner believing that a valued relationship with another is considered sufficiently important to warrant making a maximum effort at maintaining it; that is, the committed party believes the relationship is worth maintaining to ensure it endures indefinitely [12]. We clarify the definition of commitment to some degree, holding it to be "the belief of an exchange partner in an ongoing relationship" and that committed behavior ensures "maximum efforts at maintaining" the relationship [12]. Additionally, we conjecture herein that commitment performs a vital function in the partner's exchange
relationship. The nature of commitment in all relationships, including interorganizational, intraorganizational, and interpersonal relationships, stands for stability and sacrifice. In sum, commitment represents the partners' efforts and reflects the belief that a partner is ready to take potentially high-risk actions to further the relationship and will not elect to engage in opportunistic options in alternative situations. In this regard, commitment can be explained as affection, which refers to a sense of belonging and a closeness of attachment to the organization. Commitment between supply chain partners integrates the supply chain business process. For this reason, commitment can be positioned as a key mediating variable between critical antecedents and outcomes [12]. Inter-organizational communication may lead to increased behavioral transparency and reduced information asymmetry, thereby lowering transaction costs and enhancing relationship value. When buyers and suppliers make special efforts to design a relationship with good information exchange between trading partners, they benefit from higher levels of relationship performance [4]. Communication plays an important role in activating and translating relational norms into value-enhancing relational assets. Thus, a long-term commitment between buyer and supplier provides the strategic context necessary for fostering collaborative communication. Therefore, we propose:
Hypothesis 2. Commitment is positively associated with communication in buyer-supplier partnership.
3. Trust and communication. Zaheer et al. define inter-organizational trust as "the extent to which organizational members have a collectively held trust orientation toward the partner firm" [8]. Trust is considered to exist when one party has confidence in an exchange partner's reliability and integrity [12], together with a willingness to rely on an exchange partner in whom one has confidence. Trust may promote collaborative communication and enable supply chain partners to build stronger relational bonds [13]. With relational trust, supply chain partners are able to focus on knowledge development and exchange and to increase investment in relational competencies. Insofar as these relational competencies are "socially created," resulting from ongoing collaborative communication among exchange partners, and not easily tradable in strategic factor markets, they may confer durable strategic advantages on the supply chain partners [14][15]. Thus, trust in buyer-supplier partnerships provides the strategic context necessary for fostering collaborative communication. Such relational trust also enables the exchange parties to cultivate relational norms that promote cooperation for mutual gains [12]. When supply chain partners develop relational trust, they tend to rely on understandings and conventions involving fair play and good faith, such that any agreements between them are enforceable largely through internal processes rather than through external arbitration or the courts [15]. Thus, relational trust enables the communication and exchange of information and knowledge, lowers transaction costs, and enhances transaction value through strategic collaboration. In contrast, lacking relational trust, an adversarial buyer-supplier relationship focused on transaction cost economizing can inhibit the development of relational competencies, frustrate
collaborative communication, and heighten opportunism, which ultimately dissipates relational rents. Inadequate or insufficient two-way communication limits a firm's ability to leverage otherwise supportive relationships. Moreover, rapid advances in technology and the global information infrastructure mean that buyers and suppliers must possess appropriate, competitive two-way communication systems if they are to maintain the ability to respond quickly and effectively to changing customer needs and expectations. Thus, mutual trust between buyer and supplier fosters collaborative communication. Therefore, we propose:
Hypothesis 3. Trust is positively associated with communication in buyer-supplier partnership.
2.1 The Research Framework of This Study
We develop a framework (see Figure 1) to examine the relationships among trust, commitment, dependence and communication in buyer-supplier partnership.
Fig. 1. The conceptual model (dependence, commitment, and trust as antecedents of communication in buyer-supplier partnership)
3 Conclusions
In the new economy, as firms become more dependent on outside partners to meet sophisticated customer needs, managing inter-organizational relationships effectively becomes important to gaining a competitive advantage. Consequently, inter-organizational partnership receives considerable research attention. Unless two-way communication between buyer and supplier exists, partnerships cannot adequately provide for overall long-term competitiveness. Based on social exchange theory, this paper investigates the influencing factors of communication in the context of buyer-supplier partnership. We propose that trust, commitment and dependence are positively associated with communication in buyer-supplier partnership.
Acknowledgment. This work was supported by the Soft Science Project of the Science and Technology Department of Shaanxi Province under Grant 2010KRM38(2) and by the Base Research Project of the Education Department of Shaanxi Province under Grant 2010JZ20.
References 1. Reinsch, N.L.: Business performance: communication is a compound, not a mixture. Vital Speeches of the Day 67, 172–174 (2001) 2. Mohr, J., Fisher, R.J., Nevin, J.R.: Collaborative communication in interfirm relationships: moderating effects of integration and control. Journal of Marketing 60, 103–115 (1996) 3. Schultz, R.J., Evans, K.R.: Strategic collaborative communication by key accounts representatives. Journal of Personal Selling and Sales Management 22, 23–31 (2002) 4. Claycomb, C., Frankwick, G.I.: A contingency perspective of communication, conflict resolution and buyer search effort in buyer–supplier relationships. Journal of Supply Chain Management 40, 18–34 (2004) 5. Prahinski, C., Benton, W.C.: Supplier evaluations: communication strategies to improve supplier performance. Journal of Operations Management 22, 39–62 (2004) 6. Cousins, P.D., Menguc, B.: The implications of socialization and integration in supply chain management. Journal of Operations Management 24, 604–620 (2006) 7. Chen, I.J., Paulraj, A.: Towards a theory of supply chain management: the constructs and measurement. Journal of Operations Management 22, 119–150 (2004) 8. Zaheer, A., Venkatraman, N.: Relational governance as an interorganizational strategy: an empirical test of the role of trust in economic exchange. Strategic Management Journal 16, 373–392 (1995) 9. Mohr, J., Spekman, R.: Characteristics of Partnership Success: Partnership Attributes, Communication Behavior, and Conflict Resolution Techniques. Strategic Management Journal 2, 135–152 (1994) 10. Anderson, J.C., Narus, J.: A model of distributor firm and manufacturer firm working partnerships. Journal of Marketing 54, 42–58 (1990) 11. Gulati, R., Sytch, M.: Dependence asymmetry and joint dependence in inter-organizational relationships: effects of embeddedness on a manufacturer's performance in procurement relationships. Administrative Science Quarterly 52, 32–69 (2007) 12. Morgan, R.M., Hunt, S.D.: The commitment–trust theory of relationship marketing. Journal of Marketing 58, 20–38 (1994) 13. De Toni, A., Nassimbeni, G.: Buyer–supplier operational practices, sourcing policies and plant performance: result of an empirical research. International Journal of Production Research 37, 597–619 (1999) 14. Kale, P., Singh, H., Perlmutter, H.: Learning and protection of proprietary assets in strategic alliances: building relational capital. Strategic Management Journal 21, 217–237 (2000) 15. Dyer, J.H., Singh, H.: The relational view: cooperative strategy and sources of interorganizational competitive advantage. Academy of Management Review 23, 660–679 (1998)
An Expanding Clustering Algorithm Based on Density Searching Liguo Tan, Yang Liu, and Xinglin Chen School of Automation Science and Engineering, Harbin Institute of Technology, Harbin, 150001, China
[email protected] Abstract. Most clustering algorithms need preset initial parameters, which greatly affect clustering performance. To solve this problem, a new method is proposed that determines the clustering center points by density searching, exploiting the universality of the Gaussian distribution. After a center is obtained, the cluster expands based on the correlation coefficient between clusters and the membership of the samples until the terminating condition is met. The experimental results show that this method can accurately classify samples of Gaussian distributions with different degrees of overlap. Compared with the fuzzy c-means algorithm, the proposed method is more accurate and timesaving when applied to the Iris and Fossil data sets. Keywords: clustering, density searching, clustering center, algorithm.
1 Introduction
Clustering algorithms belong to the scope of unsupervised learning. They are widely applied in computer science, life and medical sciences, social science, economics, etc. [1, 8], especially in image processing, data mining, and video. Data clustering is one of the main tasks of data mining [2]. It is used to find unknown object classes in a database and to identify meaningful modes or distributions. Many clustering algorithms have already appeared, roughly divided into partitioning clustering, such as the K-means [3] and K-medoids algorithms [4]; hierarchical clustering, such as the BIRCH [5] and CURE algorithms; and density- and grid-based clustering, such as the DBSCAN and OPTICS algorithms. In addition, there are some special clustering analysis methods, such as clustering fusion algorithms, high-dimensional clustering algorithms [7], and dynamic data clustering algorithms [6]. Although these clustering algorithms have achieved some success in practical applications, they have inherent disadvantages. Most algorithms depend excessively on initial values and require users to input parameters in advance; obtaining high precision leads to complex algorithms with a large amount of computation, and results are influenced by the evaluation function. Aiming at the above problems, this article proposes an expanding clustering algorithm based on density searching. This method does not need any parameters given in advance, and its computation is far less than that of existing density-based clustering methods.
2 Expanding Clustering Algorithm Based on Density Searching
From statistics, we know that most stochastic processes in practice follow a normal distribution, and any distribution function can be composed of a linear combination of normal distribution functions. The object data sets of clustering also accord with this characteristic; thus it is reasonable to hypothesize that the input data set of the clustering algorithm follows a normal distribution. Furthermore, a clustering center lies in the densest area of the input data set. On this basis, the algorithm divides the clustering process into two steps. In the first step, the clustering centers are found by density searching, and the data set is partitioned according to the definition of the correlation coefficient. In the second step, the data set is classified using an incremental circle-expansion strategy and the definition of membership.
3 Algorithm Description
Consider the two-dimensional case, for example. Pseudocode of the algorithm can be formulated as follows.

Step 1. Search for the data set centers.
While (clustering density $l_i > l_0$), where $l_0$ is the clustering density threshold:
Pick a point $x_i$ randomly and compute the clustering level $l_i$ between $x_i$ and the $n$ points closest to it:

$$l_i = \sum_{j=1}^{n} \left\| x_{i+j} - x_i \right\|. \quad (1)$$

Let $l_{i+k} = \min(l_i, l_{i+1}, \ldots, l_{i+n})$, $k \in \{0, 1, \ldots, n\}$. Save the $n$ points around $x_{i+k}$ and calculate the clustering levels centered at each of those $n$ points respectively, by the procedure demonstrated above, then find the minimum according to the definition. If $l_{i+1} > l_i$, the minimum $l_i$ has been reached; mark the $n$ points around $x_i$. The center of these $n$ points is denoted $x_{z1} = \frac{1}{n} \sum_{j=1}^{n} x_j$. Repeat this process recursively until the while condition is broken, then remove all the sample points on the searching orbit.
End while
Now $m$ clustering centers are obtained: $x_{z1}, x_{z2}, \ldots, x_{zm}$.
Step 2. Connect two clustering center points $x_{zi}$, $x_{zj}$ arbitrarily and use the points on the connecting line to fit a normal distribution, obtaining the mean and variance.

Step 3. Use the distance $r$ between the two arbitrarily chosen clustering centers $x_{zi}$, $x_{zj}$ and the $\sigma_i$, $\sigma_j$ from Step 2 to calculate the correlation coefficient $\rho_{ij}$:

$$\rho_{ij} = \begin{cases} \exp\!\left( r^2 - r \, \dfrac{\beta_0 + \beta_1^2 (\sigma_i + \sigma_j)^2}{\beta_1 (\sigma_i + \sigma_j)} \right), & r \le \beta_1 (\sigma_i + \sigma_j) \\ 0, & r > \beta_1 (\sigma_i + \sigma_j) \end{cases} \quad (2)$$

If ($\rho_{ij} < \rho_0$): expand a circle around the clustering center found in Step 1 with incremental radius $\Delta r_i = \eta \sigma_i$, $\eta \in (0, 1)$.
If (the number of newly absorbed samples is less than 2% of the original number || the two circles become tangent to each other): stop expanding; the resulting clustering center is

$$x_z = \frac{\sum_{i=1}^{n} \mu_i x_{zi}}{\sum_{i=1}^{n} \mu_i}. \quad (3)$$

Else: continue expanding with incremental radius.

Step 4. When the samples overlap slightly, the membership function is defined as

$$p(x) = \begin{cases} e^{\frac{\gamma (r_i - x)}{r_i}}, & x > r_i \\ 1, & x \le r_i \end{cases} \quad (4)$$

where $r_i$ is the radius of the $i$th subclass and $\gamma$ is a regulator that adjusts the membership between each subclass and the samples outside the circle. When $p(x) < p_0$, the samples are identified as outliers.
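A minimal sketch of the Step 1 center search and the Step 4 membership test follows, assuming Euclidean distance underlies the clustering level of Eq. (1); the parameter defaults (n, gamma) are illustrative, and the greedy neighbor-descent loop is one plausible reading of the pseudocode above rather than the paper's exact procedure.

```python
import math
import numpy as np

def clustering_level(points, i, n):
    """Eq. (1): sum of distances from points[i] to its n nearest neighbors."""
    d = np.linalg.norm(points - points[i], axis=1)
    return np.sort(d)[1:n + 1].sum()       # skip the zero distance to itself

def find_center(points, n=15, seed=None):
    """Step 1: descend toward the densest region from a random start point."""
    rng = np.random.default_rng(seed)
    i = int(rng.integers(len(points)))
    li = clustering_level(points, i, n)
    while True:
        d = np.linalg.norm(points - points[i], axis=1)
        neighbors = np.argsort(d)[1:n + 1]
        levels = [clustering_level(points, int(j), n) for j in neighbors]
        best = int(np.argmin(levels))
        if levels[best] >= li:              # no denser neighbor: minimum found
            break
        i, li = int(neighbors[best]), levels[best]
    nearest = np.argsort(np.linalg.norm(points - points[i], axis=1))[:n]
    return points[nearest].mean(axis=0)     # x_z: mean of the n marked points

def membership(dist, r_i, gamma=1.0):
    """Eq. (4): membership of a sample at distance dist from a subclass of
    radius r_i; values below a threshold p0 mark the sample as an outlier."""
    return 1.0 if dist <= r_i else math.exp(gamma * (r_i - dist) / r_i)
```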
4 Algorithm Demonstration
In this section, the clustering algorithm via density searching is demonstrated and analyzed step by step in order to formulate the algorithm clearly and verify its reliability. First, a simulation is designed as shown in Fig. 1, where five groups of Gaussian-distributed data sets are given: two are aliased seriously, two are aliased slightly, and the last one is isolated. Then, the center points of the five data sets are searched, with n = 15 in the algorithm. We mark the actual center point of each subclass with a red triangle and the center point of each pseudo-subclass found by the proposed method with a small red circle. From analysis of the fast subclass-center search proposed in this article, two problems may arise during the search. First, the same center point may be found in several different steps of the recursive process. Second, the initial sample point chosen arbitrarily may not have the minimum clustering density in the subclass. However, the center point converges to the optimal solution in the end, even if finitely many repeats occur. That
is because the algorithm removes all the sample points on the searching orbit, keeping only points of high clustering density. Owing to the second problem, some sample points of low clustering density may remain in the original sample set after a clustering center point is chosen, but these points do not seriously influence the calculation of the center point of an isolated distribution. As shown in Fig. 1, the center points obtained by the proposed algorithm are very close to the actual center points of the sample sets.
Fig. 1. Comparison between the center points of the pseudo-subclasses found by the proposed method and the actual center points of the subclasses with different correlation coefficients
Fig. 2(a). The clustering of the pseudo-subclasses with little correlation
Fig. 2. The clustering among the pseudo-subclasses with different correlation coefficients
Fig. 2(b). The clustering of the pseudo-subclasses with low correlation coefficient
Fig. 2(c). The clustering of the pseudo-subclasses with high correlation coefficient
Fig. 2. (continued)
After the centers are obtained, the clusters are classified based on the correlation coefficient between clusters and the membership of the samples. Figure 2 shows the simulation design for three groups. In Fig. 2(a), there are four independent pseudo-subclasses of normal distribution; we use variable-step circles to enclose the sample points of each pseudo-subclass. In Fig. 2(b), five data sets mix with each other, but the five pseudo-subclasses of normal distribution are not seriously aliased; they are divided into two groups, the first with two intersecting subclasses and the second with three intersecting subclasses. The scope of each data set is delineated, and the points outside the circles are assigned their attributions. In Fig. 2(c), the five data sets mix with each other and the aliasing is very serious, divided into two cases: one group in which two pseudo-subclasses mix with each
An Expanding Clustering Algorithm Based on Density Searching
115
other very seriously, and another in which three pseudo-subclasses mix with each other very seriously. Fig. 2(a) shows that the clustering of subclasses with low correlation is very easy to complete; the points outside the circles are outliers. Fig. 2(b) shows that the clustering of highly correlated subclasses is relatively more complicated: each small circle represents the cluster center of the corresponding subclass, and each great circle marks the boundary of the corresponding subclass, obtained by the limited-step method proposed in this paper; points outside a circle bearing the same mark belong to the same subclass, as determined by the weighting-function method, and unmarked points outside the circles are judged to be outliers. Fig. 2(c) shows that the clustering of subclasses with still higher correlation is relatively simple; the points outside the circles are outliers. The above simulation results show that the method can eliminate outliers effectively and complete the clustering of the samples very well.
5 Typical Experimental Data
To verify the practicality of the algorithm, two typical data sets were selected for testing. In Experiment 1, the Iris data set is selected as the test data. The data set is divided into three categories, each of which contains 50 data points. In Experiment 2, the actual Fossil data set is selected as the test sample. This data set consists of 87 six-dimensional samples and is also divided into three categories: Category 1 contains 40 samples (serial numbers 1-40), Category 2 contains 34 samples (serial numbers 41-74), and Category 3 contains 13 samples (serial numbers 75-87). We use the algorithm proposed in this article and the classical fuzzy c-means algorithm (FCM) to cluster the Iris and Fossil data sets respectively. Each method is run 20 times, and the averaged results are used for the final comparison, shown in Table 1, Table 2 and Table 3.

Table 1. The comparison between the actual clustering centers and those found by the two algorithms

                 Setosa                  Versicolour             Virginica
Actual center    (5.1,3.5,1.4,0.2)       (6.5,3.0,5.5,1.8)       (6.0,2.7,5.1,1.6)
Center of ECM    (5.07,3.39,1.48,0.2)    (6.61,3.02,5.63,1.94)   (5.90,2.77,4.85,1.48)
Center of FCM    (5.00,3.40,1.48,0.25)   (6.77,3.05,5.64,2.05)   (5.88,2.76,4.36,1.39)
Table 2. The clustering accuracy resulting from the two algorithms

Algorithm   Setosa    Versicolour   Virginica
ECM         98.00 %   96.30 %       89.47 %
FCM         94.17 %   95.67 %       85.33 %
Table 3. The average clustering time derived from the two algorithms

Test data   ECM      FCM
Iris        0.1267   0.0938
Fossil      0.1534   0.1312
From the data given in Table 1, we can conclude that the clustering centers obtained by the algorithm proposed in this article are much closer to the true clustering centers of the data than those of FCM; in other words, the clustering centers found by our algorithm are more accurate. As can be seen from Table 2, the testing results on the Iris data set show that the clustering accuracy of the algorithm in this article is higher than that of FCM. According to Table 3, the algorithm in this article needs less average clustering time than FCM in clustering the Iris and Fossil data sets. Furthermore, the consistency between the testing results for real data and for synthetic data proves once again that the algorithm proposed in this article is superior to FCM in both classification accuracy and average clustering time.
6 Conclusion
Based on the principle that actual processes follow a normal distribution, or can be approximated arbitrarily well by a combination of normal distributions, this article proposes a clustering algorithm based on density searching. This algorithm can automatically and exactly determine the data set centers and classify data sets with a high degree of aliasing. Moreover, it effectively solves the traditional clustering problem of requiring parameters in advance, and reduces the amount of computation. In tests on the standard Iris data set and the Fossil data set, the algorithm is not only more accurate in classification but also uses less time than FCM.
References 1. Hu, B.-p., He, X.-s.: Novel BSS algorithm for estimating PDF based on SVM. Computer Engineering and Applications 45(17), 142–144 (2009) 2. Berkhin, P.: A Survey of Clustering Data Mining Techniques. In: Kogan, J., Nicholas, C., Teboulle, M. (eds.) Grouping Multidimensional Data: Recent Advances in Clustering, pp. 25–71. Springer, Heidelberg (2006) 3. Yu, F.-x., Su, J.-y., Lu, Z.-m., et al.: Multi-feature based fire detection in video. International Journal of Innovative Computing, Information and Control 4(8), 1987–1993 (2008) 4. Lei, X.-F., Xie, K.-Q., Lin, F.: An Efficient Clustering Algorithm Based on Local Optimality of K-Means. Journal of Software 7(19), 1683–1692 (2008) 5. Jiang, S.-y., Li, X.: Improved BIRCH clustering algorithm. Journal of Computer Applications 29(1), 293–296 (2009) 6. Zhou, X.-Y., Zhang, J., Sun, Z.-H.: An Efficient Clustering Algorithm for High Dimensional Turnstile Data Streams. Computer Science 33(11), 14–17 (2006) 7. Strehl, A., Ghosh, J.: Relationship-based Clustering and Visualization for High-dimensional Data Mining. INFORMS Journal on Computing 15(2), 208–230 (2003) 8. Xu, R., Wunsch, D.: Survey of clustering algorithms. IEEE Transactions on Neural Networks 16(3), 645–678 (2005)
A Ship GPS/DR Navigation Technique Using Neural Network Yuanliang Zhang School of Mechanical Engineering, Huaihai Institute of Technology, Lianyungang, Jiangsu, China
[email protected] Abstract. A stable and accurate ship navigation system is very important for navigating in the ocean. The dead reckoning (DR) system is a frequently used navigation system for ships. It can provide precise short-term navigation data, but the error of a DR system accumulates over time without limitation. GPS can be used for navigation in outdoor environments, but the positioning error of GPS for civilian use is still large. In this paper a cheap single GPS receiver is used for ship navigation. This paper proposes a new Kalman-filter-based GPS/DR data fusion method using a neural network, designed around the characteristics of the GPS receiver. With this data fusion method, the cheap single GPS receiver can cooperate with the DR system to provide precise navigation information. A simulation is conducted to validate the proposed data fusion method. Keywords: Kalman filter, GPS, dead reckoning, data fusion method, BP neural network.
1 Introduction
Along with the rapid development of ocean exploration, the requirement for good ship navigation systems increases quickly. Since electromagnetic waves can be used to communicate information, most of the navigation systems used on land can also be used in the ocean [1]. The DR system is a frequently used technique for ship navigation. The main idea of DR is to use the velocity and acceleration information of the ship to calculate its position. A DR system can provide precise short-term navigation data, but since its errors accumulate over time without limitation, it cannot be used to navigate a ship alone without any validation; it needs an external aid to provide compensation information to improve its long-term navigation precision. GPS is the most widely used global positioning system, with many applications including positioning, locating, navigating, surveying and determining the time. A GPS receiver relies on signals including the system time and ephemeris information received from several non-geostationary satellites. Knowing the absolute satellite positions from the received messages, the antenna position and the GPS system time can be calculated if four or more pseudoranges are available.
GPS can provide positioning information with a bounded error. Since GPS and DR systems have a synergistic relationship, it is better to combine these two kinds of navigation systems to provide navigation information. The Kalman filter is a frequently used method for fusing multi-sensor data to provide more accurate navigation information. Much work has been done on using GPS and GPS/DR navigation systems to provide navigation information [2,3,4,5]. In this paper a new Kalman-filter-based GPS/DR data fusion method, designed around the characteristics of the GPS receiver and a BP neural network, is proposed to fuse the data coming from the GPS and DR systems. By modifying the degrees of belief in the GPS and DR systems, the proposed data fusion method can provide an accurate navigation result. A simulation using real GPS data is conducted to validate the proposed data fusion method.
2 DR and GPS System for Ship Navigation
DR System. The DR system can provide precise short-term navigation data, but its error accumulates without limitation. The calculation formula of the DR system for ship navigation is

$$\varphi_n = \varphi_l + \frac{1}{R} \int_{T_l}^{T_n} V_D \cos K \, dt, \qquad \lambda_n = \lambda_l + \frac{1}{R} \int_{T_l}^{T_n} V_D \sin K \sec\varphi \, dt. \quad (1)$$
Here $V_D$ is the velocity of the ship, $R$ is the radius of the earth, $T_l$ is the initial time, $T_n$ is the calculation time, $K$ is the heading of the ship, $\lambda_l$ and $\varphi_l$ are the longitude and latitude of the initial position, and $\lambda_n$ and $\varphi_n$ are the longitude and latitude of the present position.
GPS System. In GPS applications, the main goal is to obtain the position of a receiver as accurately as possible. Since GPS works in an outdoor environment, many sources of disturbance can affect its precision, including clock offset, atmospheric and ionospheric effects, multipath effects and receiver noise. DGPS can provide a very precise positioning service, but its cost is very high. In this paper we used a cheap single GPS receiver to provide the navigation information. The absolute error of this GPS receiver is about 10-15 meters. We propose a new Kalman-filter-based data fusion method, designed around the characteristics of the GPS receiver and a BP neural network, to fuse the data coming from the GPS and DR systems. Here we are interested in relative positioning accuracy. Fig. 1 shows the single GPS data collected at a fixed point; these data are the latitude and longitude values after subtracting their mean values. Fig. 2 shows the values of the $k+1$ time-step latitude (longitude) minus the $k$ time-step latitude (longitude). From Fig. 1 and Fig. 2 it can be seen that the error range of the single GPS receiver is large, but the data drift between adjacent sampling times is small, no more than three times the GPS resolution. Fig. 2 shows the way the single GPS receiver error changes in time: successive errors are tightly related, and therefore the error is strongly colored.
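A minimal sketch of the dead-reckoning update of Eq. (1), discretized with a simple Euler sum, is shown below; the Earth-radius value, the time step and the sample track are illustrative, not from the paper.

```python
import math

R = 6371000.0  # mean Earth radius in meters (illustrative constant)

def dr_update(phi, lam, speeds, headings, dt):
    """Discretized Eq. (1): integrate speed V_D and heading K over time to
    propagate latitude phi and longitude lam (both in radians)."""
    for v, k in zip(speeds, headings):
        phi += (v * math.cos(k) / R) * dt
        lam += (v * math.sin(k) / R) * dt / math.cos(phi)  # sec(phi) factor
    return phi, lam

# Example: 10 s at 5 m/s on heading K = 45 degrees
phi, lam = dr_update(math.radians(34.6), math.radians(119.2),
                     [5.0] * 10, [math.radians(45.0)] * 10, 1.0)
print(math.degrees(phi), math.degrees(lam))
```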
Fig. 1. Data of the single GPS receiver after subtracting the mean value
Fig. 2. Difference of GPS output between k + 1 time step and k time step
Neural Network. Neural networks have the ability to "learn" system characteristics through nonlinear mapping and provide a strong degree of robustness because of their fault tolerance. By means of both off-line and on-line weight adaptation, neural networks can improve adaptability. From Fig. 2 it can be seen that the current-time
GPS output has some relationship with the previous neighboring GPS data. In this paper the neural network is trained to predict the GPS output from previous GPS outputs; a BP neural network is adopted for this prediction job. At a fixed point 1860 measurements were collected, one per second. The first 998 measurements were used as the training data of the neural network and the other 862 measurements were used as its test data. The test results are shown in Fig. 3, which shows that the neural network can predict the single GPS receiver output effectively.
Fig. 3. Prediction results of the test data
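As an illustration of the prediction setup described above, the following is a minimal sketch using a small multilayer-perceptron regressor; the two-sample input window matches Section 3, while the network size, library choice and variable names are our own assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_windows(gps, lag=2):
    """Inputs are the previous `lag` (lat, lon) samples; target is the current one."""
    X = np.hstack([gps[i:len(gps) - lag + i] for i in range(lag)])
    y = gps[lag:]
    return X, y

# stand-in for real fixed-point GPS data: (n, 2) array of (latitude, longitude)
gps = np.cumsum(np.random.randn(1860, 2) * 0.05, axis=0)
X, y = make_windows(gps)
X_train, y_train, X_test, y_test = X[:998], y[:998], X[998:], y[998:]

net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
net.fit(X_train, y_train)
print("test RMSE:", np.sqrt(((net.predict(X_test) - y_test) ** 2).mean()))
```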
3 Data Fusion Method
The neural network result can be used to design the GPS/DR data fusion method. The DR system provides precise short-term navigation data, but its error accumulates without bound; on the other hand, the error of the single GPS receiver is large but bounded. A data fusion method using the neural network is therefore proposed, in which the DR system provides accurate short-term navigation information and the single GPS receiver corrects the long-term navigation information. The characteristic of the single GPS receiver is that its error drifts little (no more than three times the receiver resolution) between two adjacent sampling times; the resolution is about 0.1856 m for latitude and about 0.1504 m for longitude. A BP neural network is trained to predict the GPS output at the current sampling time from the GPS data of the previous two sampling times. We assume that the covariance of the DR system errors in the latitude and longitude directions is Q and the covariance of the single GPS receiver errors is R.
The details of the GPS/DR data fusion process are as follows (x indicates latitude and y indicates longitude). At the starting point an imaginary single GPS receiver is assumed to stay fixed while the real single GPS receiver moves with the ship. A BP neural network is used to predict the current-time GPS output; the output of the neural network is the prediction of the imaginary GPS receiver's output.
1) At the starting point, before the ship runs, collect the GPS output twice to get GPS_xp1, GPS_xp2, GPS_yp1 and GPS_yp2.
2) Run the ship. At time T (written '1' for simplicity) the DR system gives the current DR-based coordinates x_DR(1|0) and y_DR(1|0); the covariances of these data are Q_x and Q_y.
3) Use the GPS data of step 1 and the neural network to predict the current-time imaginary GPS output: GPS_x and GPS_y.
4) At the same time obtain the real GPS output: GPS_x(1) and GPS_y(1).
5) Predict the current-time GPS-based coordinates from the difference between the real and imaginary GPS outputs: x_GPS(1) = GPS_x(1) − GPS_x and y_GPS(1) = GPS_y(1) − GPS_y.
6) Since the DR system provides precise short-term navigation information, the DR-based coordinate is precise within one sampling interval (1 second), and it can be used to estimate the covariance of the GPS-based coordinate. If |x_GPS(1) − x_DR(1)| ≤ 0.1854 m, the error covariance of x_GPS(1) is 0.1854² m²; else if |x_GPS(1) − x_DR(1)| ≤ 0.1854 × 2 m, the error covariance of x_GPS(1) is (0.1854 × 2)² m²; otherwise the covariance of x_GPS(1) is set to a large value. Thus we obtain R_x, the covariance of x_GPS(1); in the same way we obtain R_y, the covariance of y_GPS(1).
7) The beliefs of the DR-based coordinates are R_x/(Q_x + R_x) and R_y/(Q_y + R_y), and the beliefs of the GPS-based coordinates are Q_x/(Q_x + R_x) and Q_y/(Q_y + R_y).
8) Calculate the fused result: x(1|1) = (R_x/(Q_x + R_x)) x_DR(1|0) + (Q_x/(Q_x + R_x)) x_GPS(1) and y(1|1) = (R_y/(Q_y + R_y)) y_DR(1|0) + (Q_y/(Q_y + R_y)) y_GPS(1). The covariance of x(1|1) is P_x(1|1) = Q_x R_x/(Q_x + R_x) and the covariance of y(1|1) is P_y(1|1) = Q_y R_y/(Q_y + R_y).
9) Form the new input data of the neural network: GPS_xp1 = GPS_xp2, GPS_yp1 = GPS_yp2, GPS_xp2 = GPS_x(1) − x(1|1) and GPS_yp2 = GPS_y(1) − y(1|1). These data are passed through the neural network to predict the imaginary GPS output at the next sampling time. The covariance of these data is a fraction of P_x(1|1) and P_y(1|1) (written P_GPS), depending on the weights of the neural network.
10) Repeat steps 2-9 for each subsequent sampling time.
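The belief weighting of steps 7-8 is a scalar complementary filter per axis. The following is a minimal sketch of one fused update for a single axis, assuming hypothetical variable names; the threshold follows step 6, and the "large" covariance used for an out-of-range GPS fix is an arbitrary illustrative choice.

```python
def fuse_axis(x_dr, q, gps_real, gps_pred, res=0.1854, big_cov=100.0):
    """One fusion step for one axis (steps 5-8 above)."""
    x_gps = gps_real - gps_pred            # step 5: GPS-based coordinate
    err = abs(x_gps - x_dr)                # step 6: grade the GPS fix
    if err <= res:
        r = res ** 2
    elif err <= 2 * res:
        r = (2 * res) ** 2
    else:
        r = big_cov                        # GPS fix judged unreliable
    w_dr = r / (q + r)                     # step 7: belief of DR coordinate
    x_fused = w_dr * x_dr + (1 - w_dr) * x_gps   # step 8: fused result
    p_fused = q * r / (q + r)              # covariance of the fused estimate
    return x_fused, p_fused
```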
4 Simulation
In this paper MATLAB is employed to simulate the data fusion method, using real GPS data. The BP neural network presented in Section 3 is trained to predict the current sampling-time output of the single GPS receiver, and the GPS data are fused with DR system data carrying stochastically accumulating errors. Fig. 4 shows the fusion results obtained with the proposed data fusion method and with the Kalman filter, respectively. From the simulation results we can see that, for this single GPS receiver, the proposed data fusion method performs better than the Kalman filter. In this case the GPS always works in a good status. Sometimes the GPS receiver may work in a bad status, for example when it cannot detect enough satellites for positioning or when the visible satellites change; in that case the GPS error is very large and the Kalman filter cannot provide good performance. For the proposed data fusion method, better performance can be obtained by modifying the belief of the GPS system.
Fig. 4. Data fusion results of the proposed data fusion method
When GPS works in a bad status, the belief of the GPS system can be set very small; in this case the DR system is mainly used to provide the navigation information. When the GPS recovers, the belief of the GPS system is enlarged again.
5 Conclusion
A precise and reliable navigation system is very important for a ship to complete its mission. The DR system can provide precise short-term navigation information, but its errors accumulate over time without bound, so it cannot provide navigation information alone and needs a complement. GPS is synergistic with the DR system: it provides positioning information with a bounded error. The absolute error of the cheap single GPS receiver used in this paper is about 10-15 meters; DGPS can provide very precise positioning information, but its cost is very high. This paper proposed a GPS/DR data fusion method using a BP neural network. The proposed method fuses the navigation information coming from the cheap single GPS receiver and the DR system and provides precise navigation results for ships. Good simulation results verify the effectiveness of the proposed data fusion method.
Acknowledgment. This paper is supported by the Jiangsu Province Marine Resources Development Research Institute Science and Technology Open Fund Project (JSIMR10A05).
References
1. Yang, F., Kang, Z., Du, Z., Zhao, J., Wu, Z.: The Application and Prediction of the Ocean Localization and Navigation Technique. Hydrographic Surveying and Charting 26(1), 71–74 (2006)
2. Wu, F., Kubo, N., Yasuda, A.: Fast Ambiguity Resolution for Marine Navigation. Journal of Japan Institute of Navigation 108, 173–180 (2003)
3. Ding, T.: The Ship Navigation and Supervision System Using DGPS, AIS and GPRS. World Shipping 29(3), 48–49 (2006)
4. Hu, L., Chen, Y., Wang, L.: The Design of the Embedded Ship Navigation System Basing on GPS and Electronic Ocean Chart. Electronic Technique Application 6, 7–9 (2005)
5. Jia, Y., Jia, C., Wei, H., Zhang, B.: Design and Implementation of a Ship Navigation System Based on GPS and Electronic Chart. Computer Engineering 29(1), 194–195 (2003)
Research of Obviating Operation Modeling Based on UML Lu Bangjun, Geng Kewen, Zhang Qiyi, and Dai Xiliang Dept. of Transportation Command, Automobile Management Institute, Bengbu, China
[email protected]
Abstract. The military conceptual model is an abstraction of the "obviating behavior space". The purpose of this model is to provide modules for integrated joint operation simulation. Based on use-case descriptions, it offers a formal representation of elements such as battlefield barriers and obviating unit entities through the establishment of an obviating system package. A dynamic modeling mechanism based on UML for building the cooperating operational model and the obviating operational model is formed to realize the operational use cases of each battle behavior entity. The operation activities during the obviating course and under sneak attack are also analyzed through corresponding obviating activity diagrams. This model implements interaction relations and message delivering, and reflects the inner logic of the obviating operation mission, entities, functions, interactions and activities.
Keywords: operation modeling, obviating units, barriers, unified modeling language, solid model.
1 Introduction
Modern wars show that building and setting barriers, which delay, restrict and destroy the enemy's action and maneuvering, are effective measures to enhance an effective and stable defense, while "obviating barriers, opening up access" keeps the initiative and seizes a favorable trend of mobility [1]. "Obviating barriers, opening up access" is a typical process of tactical operation, of which the operations model is an abstraction and analogy of the obviating operation process [2]; it models how, based on requirements (orders, etc.) and mutual information, one or more battle entities perform one or more actions to achieve the purpose of obviating barriers [3]. The Unified Modeling Language (hereafter referred to as UML) is an important tool for military conceptual modeling [4]; it possesses high accuracy and rationality in the military expression of behavior space and modeling, which has been validated in practice [5-6]. UML can properly describe the internal logic of the "obviating barriers" operational nodes, operational tasks, operational functions and operational activities, and the resulting abstract military conceptual model of the "obviating behavior space" is an important development in operation modeling methods. It can meet the new demands of operation modeling and simulation arising from changing technologies and tactics.
2 Descriptions on Battle Behavior Entity Carrying Out Tasks
Obviating units are mainly responsible for opening up and expanding passageways through the enemy's cutting-edge location barriers during offensive battles [1]. The task description must reflect the duties and functions of the operational behavior entity in the operations, and must reflect the unique organization and implementation procedures of obviating-barrier operations. The participants required for obviating operations include the headquarters, commanders, etc. Using the use case diagram described in Figure 1, UML makes a clear definition of the obviating battle system context, system requirements and scope, with the user model of each participant expressed in the system description when an obviating unit carries out tasks.
Fig. 1. Use cases description of operational behavior entity
3 Entity Descriptions of Barriers and Obviating Group
Entities refer to all the individual subjects and objects that can be identified in the operational simulation system, including simulation entity description and system capture [7]. The barriers on the battlefield and the structure of the obviating operational system are abstracted by capturing the general things in the system into a base class (Barrier); taking the base class as the basis, the other classes are then captured, completing the design of the class view that describes the relevant entity objects.
Entity Description of Barrier System. The entity description of the barrier system is an abstract feature extraction of the battlefield barriers (BattleBarriers) described by their attributes; barriers are the object of the operations. Although the types of barriers differ and their properties vary, they share some common features, such as tag identity, location, area size and harm grade. The abstract properties of barriers and their operations constitute the barrier base class (Barrier), which is extended into explosion barrier, mine barrier and fortification barrier; the class view of the battlefield barrier design [1-2] is shown in Figure 2.
Fig. 2. Model of barriers
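As a reading aid for the class view in Figure 2, the following is a minimal sketch of the base class and its specializations; the attribute and class names are illustrative assumptions based on the common features listed above, not identifiers from the paper's model.

```python
from dataclasses import dataclass

@dataclass
class Barrier:
    """Base class capturing the common barrier features named above."""
    tag: str            # tag identity
    location: tuple     # battlefield position, e.g. (x, y)
    area: float         # area size
    harm_grade: int     # harm grade

class ExplosionBarrier(Barrier): pass
class MineBarrier(Barrier): pass
class FortificationBarrier(Barrier): pass

class BattleBarriers:
    """Aggregation of all barriers in the obviating theatre (cf. Figure 2)."""
    def __init__(self, barriers):
        self.barriers = list(barriers)
```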
The construction of BattleBarriers, as shown in Figure 2, achieves the system description of the barriers within the obviating theatre of operations on the battlefield; it is the aggregation of all the barriers, exhibits their static model and internal logic structure, and is associated with natural barriers and enemy fire attacks.
Entity Description of Obviating Unit. The battle group is a common property of an obviating unit battle formation. Capturing and manipulating such property descriptions constructs the battle group class (Group), which holds the features that battle groups share, such as personnel configuration, communications equipment, battle equipment, location configuration, battle mission, friendly neighbor coordination and battle support, as shown in Figure 3. From this class derive sub-groups such as the reconnaissance group, obviating group, warning and covering group, survival sweeping and marking group, and obviating preparedness group.
Fig. 3. Model of obviating unit
The obviating unit entity is the principal part that completes the operations. Figure 3 shows how the obviating unit is built and describes the whole-part relationships between the obviating unit and each battle group, which are associated with the command post and the commander.
4 Construction of Obviating Operation Model
The Obviating Operation System Package Diagram. In UML, the package is a common mechanism used to group modeling elements [4]. Browsing
all the modeling elements of the obviating operation system structure, the elements close to each other in concept and semantics are put into packages, including the battle system package, obviating program package, operating force calculation package, modeling kits, etc., as shown in Figure 4. Among them, the battle system package includes such entity elements as the obviating unit, headquarters and commanders; the barrier system package includes natural barriers, explosive barriers and fortification barriers.
Fig. 4. Obviating operation system package diagrams
Model of Obviating Operation Entity Collaboration. UML class diagrams, use case diagrams and package diagrams describe the obviating operation system from a static point of view, but obviating operation is dynamic behavior; only by creating a dynamic model of the system can the situation of obviating operations be fully reflected. The UML collaboration diagram models control flow according to organization; it depicts the coordination mechanism of the organized structure and is an important method for building dynamic models. Figure 5 shows the construction model of the main operations of obviating operation entity collaboration.
Fig. 5. Model of battle behavior entity collaboration
As shown in Figure 5, "object, chain and information" are the important elements and graphic features of collaboration diagrams. In the model of battle behavior entity
collaboration, the "object" describes the principal part of the various operations; the "chain" reflects the relevance and the internal interactive mechanism; and the "information" reflects the use case description and function of each object. Taking the obviating preparedness sub-group for example, when the "obviating support order" information is received by the commander, the implementation of "obviating support" begins, the sub-group's role shifts to that of the "obviating group" object, and the implementation of the obviating operation starts.
Model of Obviating Operations. Obviating operation focuses on the barriers of the set objectives; the main objects of each operational behavior interact according to the battle process in a particular battle space. The model not only needs to reflect the mission, the use cases, the operation timing and the life cycle of all the operational subjects, but also the object interaction, internal collaboration and attack-defense behavior, as well as the operational process control. The obviating operations model, built on the established UML dynamic modeling mechanism as shown in Figure 6, is also known as the obviating operations sequence diagram; it describes the visualized trajectory of the operation flow as time goes on. The operational objects and the operation timing are the two major factors in building the model. In Figure 6, each object of the operational process is arranged from left to right (the X-axis direction), and time changes top-down along vertical dashed lines (the Y-axis direction), which describe the life cycle of each object in the operational process. The model is the modeling of the obviating operation control flow by time sequence; it describes the "realization" of each operational behavior entity use case and shows the interactions, relations and information that develop among the grouped objects of the headquarters, the commander and the battle groups over time.
Fig. 6. Model of obviating operations
Obviating Operation Process Analysis and Modeling. When the obviating unit receives the commander's "advancing command", it marches forward along the planned route in organized units, using a flexible approach to ensure that it cannot be found and attacked by the enemy; when it enters the operating area, it quickly occupies the operation starting positions and makes obviating preparations, as shown in Figure 7.
Fig. 7. Obviating operation activity diagram
Fig. 8. Operation activity diagrams when attacked by the enemy
The obviating battle process model is constructed based on the UML activity diagram and is also known as the obviating battle activity diagram, as shown in Figure 7. During the obviating activities, methods such as secret search and removal together with forced obviating operation can be used [1].
Operation Description When Attacked by the Enemy. Obviating operation, as an important operational background, is often launched in hostile areas of great concern; therefore it is vulnerable to enemy fire attack. In view of this particular dynamic situation, Figure 8 shows the entity objects by substituting lane space for combat space. Figure 8 is the dynamic model of obviating operations when attacked by the enemy. When encountering enemy fire attack, the covering unit can take tactical covering measures against the enemy such as fire suppression or laying a smokescreen, while the obviating personnel take on-the-spot concealment and wait for a favorable opportunity to continue their operation, and the commander judges the enemy situation. When the enemy fire is too strong, a higher or friendly adjacent unit is asked for fire support; the wounded are rescued in time and the obviating operation continues [1].
5 Conclusion
"Setting barriers" and "obviating barriers" are two interwoven contradictions on the modern battlefield. All kinds of barriers are widely used in battles, so the tasks of obviating barriers and opening up access become very difficult; in turn, the capability improvement of obviating operations forces further constant improvement and enhancement of barriers to meet the requirements of barrier-setting operations. Applying UML to obviating operations modeling is conducive to realizing the integration of the conceptual model and the computer in construction ideas, and it has broad applicability and excellent prospects. Such a model not only provides an interface for the generation of computer generated forces (CGF), it also provides a reference interface for other types of tactical-level operations modeling.
References
1. Xu, X.: Engineer Tactics. The PLA Press, Beijing (1994) (in Chinese)
2. Cao, Z., Ma, Y.: Research on Composability of Operation Modeling. Journal of System Simulation 19(4), 1421–1424 (2007)
3. Hu, Y.: Military Conceptual Model of Operations by Small Ground Units in a City. Journal of Naval University of Engineering 18(5), 28–32 (2006)
4. Grady, B., James, R., Ivar, J.: UML User Guide. Translated by Shao, W., Zhang, W. Machinery Industry Publishing House, Beijing (2005) (in Chinese)
5. Shen, R., Zhang, Y.: The Description of Requirement of Weapon and Equipment Systems Based on UML. Systems Engineering and Electronics 27(2), 270–274 (2005)
6. Fan, Y., Li, W.: Research on the Formal Description Language for Military Conceptual Modeling. Fire Control & Command Control 31(6), 19–22 (2006)
7. Qi, Z., Wang, Z., Zhang, W., Wu, Q.: Design and Implication of Attack and Defense Simulation System for Ballistic Missile with UML. Journal of System Simulation 18(3), 602–606 (2006)
8. Guo, Q., Yang, L., Yang, R.: Computer Generated Forces. National Defense Industry Publishing House, Beijing (2006) (in Chinese)
9. Duan, C., Yu, B., Yang, X., Liu, J.: Analyzing and Modeling of Operational Activities. Fire Control & Command Control 31(11), 34–38 (2006)
The Study of Distributed Entity Negotiation Language in the Computational Grid Environment Honge Ren, Yi Shi, and Jian Zhang Information and Computer Engineering College, Northeast Forestry University, Harbin, China
[email protected]
Abstract. In the distributed computational grid environment, existing trust negotiation languages (TNLs for short) cannot meet the needs of high negotiation efficiency, defense against malicious attacks and negative expression. We therefore propose a Distributed Entity Negotiation language, which satisfies most of the requirements of a negotiation language, protects the safety of entities, and supports distributed authorization and proof as well as negative expression. To improve the efficiency of negotiation, we add a feedback policy to it.
Keywords: trust negotiation, distributed authorization and proof, release policy, feedback policy.
1 Introduction
With the popularity of the Internet, the network has become an important carrier of communication and interaction; consequently, interaction among entities has gradually become a focus of study. Because of the increasingly complicated network environment, people have begun to pay attention to the safety of messages exchanged through the Internet, especially information referring to persons, organizations and countries. To solve these issues, Winsborough [1] proposed the theory of automated trust negotiation (ATN), a process of setting up trust relationships among strangers by exposing digital credentials and access control strategies step by step. As ATN runs in an open distributed environment, many languages cannot meet the need for crossing platform and operating system heterogeneity. Furthermore, not all negotiation languages are able to serve ATN, which has its own requirements for languages and runtime systems. As a result, we need a kind of TNL that not only satisfies most of the requirements for negotiation languages, but also protects sensitive information, supports distributed authorization and proof, supports negative expression and improves the efficiency of negotiation. Since the appearance of ATN, many research results have arisen (such as PSPL, TPL, X-Sec, RT, KeyNote, DL, X-TNL, TrustBuilder, XACML, etc.), but no result meets all the demands of trust negotiation; each meets only some aspect of the requirements. For example, [2] defines an algorithm for distributed proof, but it does not support complicated resource access control or credential protection. Li puts forward the resource access control policy language RT (role-based trust management)
in [3], which supports parameterized roles and linked roles, but RT does not support sensitive information protection. Li [4] achieves sensitive information protection through the technology of encrypted credentials based on RT; however, neither [3] nor [4] supports distributed proof. Winslett [5] proposes the PeerAccess framework with good extensibility, which supports distributed authorization and proof as well as proof hints, but the release predicate in the framework is limited: it neither supports negative expression nor specifies the exposure policy. In this paper, we propose the Distributed Entity Negotiation (DEN) language knowledge framework. We introduce the usage and semantics of the DEN language in detail, then analyze the application of DEN; lastly, we summarize the shortcomings of our work and look forward to new research opportunities.
2 The Framework and Semantics of DEN
In this paper, we develop DEN by improving and extending PeerAccess and RT, which gives DEN the new features of sensitive information protection, negative expression and information feedback during proof. Beyond good extensibility, it also supports distributed authorization and proof and proof hints. In DEN, we call a person interacting in the network an entity, and all entities make up the entity set N, where each entity can request resources and send and receive messages. Each entity has a local knowledge base (KB), and all the local KBs form the global KB. The KB manages and realizes the actions of the whole entity. Fig. 1 shows the structure of a local KB, which is composed of the advanced behavior management module (ABMM), the release policy (RP), the facts/rules set, the messages send/receive set and the blacklist.
Fig. 1. The KB structure of DEN
Advanced Behavior Management Module (ABMM). The ABMM manages the advanced actions of an entity and includes two parts. One part is the proof hints, which supply an entity with helpful instructions during proof. For example, when Alice cannot prove that she can use the No.1 project document
(pro-docum1) of AMC Company, she asks her workmate Bob for help; however, Bob cannot do it either. Now assume Bob responds to Alice in one of two ways:
a. Bob sends the proof hint "Bob lsigns find(AMC signs auth(pro-docum1, Alice)), Alice, Carla))" to Alice, telling her that she should ask Carla for help. Alice can then make use of the proof hint from Bob and turn to Carla. Readers can refer to [5] for the usage of "find".
b. Being busy with his work, Bob is unwilling to give any hint to Alice. But we stipulate that an entity cannot ignore the request of another entity; accordingly, Bob has to send Alice the feedback information "Bob lsigns feedback (busy, Bob, Alice)", telling her that he cannot help because of his work. Once she receives this information, Alice will not ask Bob for help for a period of time and will turn to other entities.
In the rules above, "feedback" is a feedback predicate whose semantic form is: A signs/lsigns feedback (resn, A, B), which states that A feeds back to B a reason for being unable to help. Here "resn" is a state information item (e.g. busy) giving the reason; to prevent information leakage, "resn" cannot be a fact/rule. In the proving process, we stipulate that B may only ask the entity C named in a proof hint provided by A for help if no other entity provides a proof hint to B, but this situation is rare. PeerAccess stipulates that a peer (the same as an entity) can ignore the requests of others, which causes empty waiting by the requesters; this wastes both the time spent asking others for help and the negotiation efficiency of the requester, so we put forward the feedback policy.
Release Policy. The release policy contains rules on the release predicate, which stipulate which information can be released and to whom. The release policy of PeerAccess defines the "srelease"1 release predicate based on sticky policy with the sticky2 property. We do not deny the desirability of srelease under some conditions, but it cannot express all the needed release conditions. Consequently, we define a new predicate, "disclose". The new release predicate overcomes the shortcomings of "srelease" in many ways. It has the following characteristics:
a. It supports multi-level release by default, meaning B can release the information sent by A to others, and they to others in turn. In the grid there are usually more than two entities taking part in a negotiation, so we consider information to be multi-level releasable. The "srelease" defined in PeerAccess cannot meet this need unless extra conditions are added to "srelease", which is more complicated and harder to express than with "disclose".
b. An entity can only release information it believes to be true, the same as with "srelease".
c. When the released information concerns an entity's privacy ("auth" authorization information, "own" property information, "member" identity information, etc.),
1 Srelease semantics stipulate that Alice can only send out a formula signed by Bob if she is sending it to herself or to Bob, or she can prove that Bob thinks that it is okay for her to send the message out. Further, Alice can only send out facts and rules that she believes to be true.
2 The signer of a particular piece of information retains control over its future dissemination to other peers.
and if the entity doesn’t detail his/her release conditions on releasing to whom or not, the information is also multi-level releasable; if the entity defines conditions, others should multi-level release it according to the condition. We often put the condition defined by the information related entity into “disclose”, called inner condition. For example, we assume ϕ is the information to be released, which is related to C, and C is unwilling to release it to Nancy, then the rule is expressed through DEN: C lsigns disclose( ϕ ,X,Y,Y ≠ Nancy). d. During the multi-level releasing, the entity can increase limited conditions (called outer conditions) in the case where that information has passed through his/her hands, but he/she can’t delete the conditions added by others. Especially, the added outer condition couldn’t contrary to those added by others. For example: A stipulate that ϕ ( ϕ is unrelated to A) can’t be released to David, when
releasing the information ϕ to B (A lsigns disclose(ϕ, X, Y, none) → Y ≠ David, where "none" means that A has added no inner condition); B should then release ϕ to others under this condition. In addition, B can add outer conditions, e.g. B may require that the releasability of ϕ also depend on B's agreement: B lsigns disclose(ϕ, X, Y, none) → Y ≠ David ∧ B condDisclose(ϕ, X, Y, none). But we do not allow B to add the following condition: B lsigns disclose(ϕ, X, Y, none) → Y ≠ David ∧ ¬(Y ≠ David).
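To make the release check just illustrated concrete, the following is a minimal sketch of how a KB might decide whether ϕ can be sent, treating the inner and outer conditions as predicates over (information, sender, receiver); the function and parameter names are our own illustrative assumptions, not part of DEN's defined syntax.

```python
def may_disclose(phi_is_true, inner_cond, outer_conds, phi, sender, receiver):
    """Release phi from sender to receiver only if phi is believed true, the
    inner condition (set by the entity phi relates to) holds, and every
    accumulated outer condition holds."""
    if not phi_is_true:                       # characteristic b: only true info
        return False
    if inner_cond is not None and not inner_cond(phi, sender, receiver):
        return False                          # inner condition, e.g. Y != Nancy
    return all(c(phi, sender, receiver) for c in outer_conds)

# usage: C forbids releasing phi to Nancy; A adds an outer condition of its own
inner = lambda phi, x, y: y != "Nancy"
outer = [lambda phi, x, y: y != "David"]
print(may_disclose(True, inner, outer, "phi", "B", "Carla"))  # True
print(may_disclose(True, inner, outer, "phi", "B", "David"))  # False
```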
In particular, the inner condition may contradict the outer conditions; that means the entity related to the released information has more power than any other entity, which cannot be realized by other release predicates (srelease). Entities can defend against information leakage better through this feature. Next, we introduce the semantics of "disclose": A signs/lsigns disclose(ϕ, B, C, inner-cond) → outer-cond, where A, B and C are arbitrary entity names, ϕ is a fact or rule, inner-cond is the inner condition and outer-cond is the outer condition. The inner-cond may counter the outer-cond. The form of the two conditions is f1 ∧ … ∧ fn or ¬f1 ∧ … ∧ ¬fn (n > 0), a conjunction of facts (or of negated facts). A allows B to release the information ϕ to C under both the inner condition and the outer condition; following [5], we also stipulate that ϕ is true and releasable at A. In addition, if the inner-cond is null, we use "none" to signify this.
Blacklist. The blacklist maintains and manages information about entities whose credibility is lower than 0.5 (we assume the range of credibility is [0,1]; entities can define the threshold themselves, and in this paper we use 0.5), as evaluated by the reputation evaluation institution. It plays an important role during negotiation. The working principle of the blacklist: when two strangers A and B begin to negotiate (A starting first), B searches for information matching A; this yields two results: the negotiation terminates if A matches the blacklist, and continues if not (i.e. A is temporarily trusted).
After that, if A sends malicious information or a lot of duplicated information (to consume B's resources) to B, B immediately stops and reports to the REC (Reputation Evaluation Centre), which evaluates A and feeds the resulting credibility (lower than 0.5) back to B. Meanwhile, B updates the blacklist by adding A, so that B can prevent another attack by A. For reasons of space, we do not detail the working principle of the REC.
The remaining parts of the KB are: the facts/rules set, which retains the facts and rules owned by an entity (the concepts of fact and rule refer to standard Datalog, the same as in [5]; to improve the expressiveness and richness of DEN, we add negative expression to rules); and the messages send/receive set, which retains the facts and rules sent out and those received, separately.
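Returning to the blacklist workflow described above, the following is a minimal sketch with an REC stub standing in for the reputation evaluation centre; all names and the threshold handling are illustrative assumptions.

```python
class Blacklist:
    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.entries = {}                 # entity -> credibility

    def is_blocked(self, entity):
        return entity in self.entries

    def report(self, entity, rec_evaluate):
        """On detecting an attack, ask the REC for a credibility score and
        record the entity if the score falls below the threshold."""
        credibility = rec_evaluate(entity)
        if credibility < self.threshold:
            self.entries[entity] = credibility
        return credibility

bl = Blacklist()
bl.report("A", lambda e: 0.2)   # REC stub judges A malicious
print(bl.is_blocked("A"))       # True: B refuses to negotiate with A again
```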
3 The Application of DEN
We continue to use several predicates defined in PeerAccess, such as sign and lsign; their usage is described in [5]. To improve readability and understandability, we analyze the process of trust negotiation by applying DEN. Assume Alice is a manager of the library Lib, who can lend books to the members of Lib. Bob is a member of Lib, and he wants to borrow book1 from Alice. Both the membership and management credentials are authorized by the curator or by the curator's client (David). When a member wants to borrow a book but has lost or forgotten his credential, the member should show the manager of Lib an identity certificate signed by the curator or the client. To prepare for such cases, all certificates related to members are signed by the curator or client in advance. The process of borrowing a book is then as follows:
Step 1. Bob finds a needed book (book1) and wants to borrow it; Alice asks him whether he is a member of Lib: Bob lsigns disclose(? Alice signs lend(book1, Bob), Bob, Alice), Alice lsigns disclose(? curator signs member(Y, Lib) → David signs member(Y, Lib), Alice, Bob, none).
Step 2. Bob doubts the identity of Alice and wants to see proof of her identity: Bob lsigns disclose(? curator signs manager(Z, Lib) → David signs manager(Z, Lib), Bob, Alice, none).
Step 3. Alice also doubts Bob's intention; she searches for Bob in her blacklist and finds that Bob is temporarily trusted, so she shows him her credential with relief: Alice lsigns disclose(curator signs manager(Alice, Lib), Alice, Bob, curator signs member(Y, Lib) → David signs member(Y, Lib)).
Step 4. Bob is now certain that Alice is a manager of Lib, and he shows his credential to Alice.
There are two cases:
1. Bob shows Alice his credential immediately: Bob lsigns disclose(David signs member(Bob, Lib), Bob, Alice, curator signs manager(Z, Lib) → David signs manager(Z, Lib)); go ahead to Step 7.
2. Bob forgot to bring his membership credential and has no time to fetch it. He wants Alice to certify his identity, but Alice has no power to do so, and she tells Bob that he cannot borrow the book unless he has a certificate signed by the curator or the client: Bob lsigns disclose(Bob signs find(David signs member(Bob, Lib), Bob, Alice), Bob, Alice, none), Alice lsigns disclose(Alice signs find(curator signs certificate(Y, Lib) → David signs certificate(Y, Lib), Bob, curator), Alice, Bob, none) or Alice lsigns disclose(Alice signs find(curator signs certificate(Y, Lib) → David signs certificate(Y, Lib), Bob, David), Alice, Bob, none).
Step 5. Because Bob has no way to contact the curator or the client, Alice asks the curator for help in his stead: Alice lsigns disclose(Alice signs find(curator signs certificate(Bob, Lib), Alice, curator), Alice, curator, none). There are again two cases:
1. The curator is busy now, so Alice turns to the client (David) for help: curator lsigns feedback(busy, curator, Alice), Alice lsigns disclose(Alice signs find(David signs certificate(Bob, Lib), Alice, David), Alice, David, none); go ahead to Step 6.
2. The curator grants the request: curator lsigns disclose(curator signs certificate(Bob, Lib), curator, Alice, none); go ahead to Step 7.
Step 6. When Alice asks David for help there are two cases:
a. David is having a rest and hopes not to be interrupted: David lsigns feedback(rest, David, Alice); Bob cannot prove himself, and he fails to borrow book1.
b. David is willing to help Alice: David lsigns disclose(David signs certificate(Bob, Lib), David, Alice, none).
Step 7. Bob's identity certificate is obtained, and Alice lends book1 to Bob: Alice lsigns lend(book1, Alice, Bob).
The research focus here is the release policy of DEN, so the example above does not reflect negative expression or the function of the blacklist; readers interested in them can write such cases in DEN themselves.
4 Conclusion
In this paper, we propose the DEN language for the distributed computational grid by building on others' work. It supports distributed authorization and proof through "lsign" and negative expression, and it protects the sensitive information of entities through the newly defined "disclose" and the blacklist added to the KB. In addition, we propose the feedback policy to improve the efficiency of negotiation. We introduce the KB structure and semantics of
DEN, then analyze its usage through an example, which shows that DEN achieves the targets put forward. Limitations of space prevent the example from showing that DEN supports parameterized roles and linked roles; users can deploy these themselves. Meanwhile, we have not detailed the exposure policy and the blacklist, and we have not proved how the feedback policy improves efficiency. Consequently, we will continue to study these issues.
References
1. Winsborough, W.H., Seamons, K.E., Jones, V.E.: Automated Trust Negotiation. In: DARPA Information Survivability Conf. and Exposition, pp. 88–102. IEEE Press, New York (2000)
2. Bauer, L., Garriss, S., Reiter, M.K.: Distributed Proving in Access-Control Systems. In: Paxson, V., Waidner, M. (eds.) Proc. of the IEEE Symp. on Security and Privacy, pp. 81–95. IEEE Press, Washington (2005)
3. Li, N.H., Winsborough, W.H., Mitchell, J.C.: Distributed Credential Chain Discovery in Trust Management. In: Herbert, A.S. (ed.) Proc. of the 8th ACM Conf. on Computer and Communications Security, pp. 156–165. ACM Press, New York (2001)
4. Li, J., Li, N., Winsborough, W.H.: Automated Trust Negotiation Using Cryptographic Credentials. In: Atluri, V., Meadows, C., Juels, A. (eds.) Proc. of the ACM Conf. on Computer and Communications Security, pp. 46–57. ACM Press, New York (2005)
5. Winslett, M., Zhang, C., Bonatti, P.A.: PeerAccess: A Logic for Distributed Authorization. In: Atluri, V., Meadows, C., Juels, A. (eds.) Proc. of the ACM Conf. on Computer and Communications Security, pp. 168–179. ACM Press, New York (2005)
Study and Application of the Smart Car Control Algorithm Zhanglong Nie Changzhou College of Information Technology, Changzhou, Jiangsu, China
[email protected]
Abstract. The speed and direction control of a self-designed intelligent car is the core of the intelligent car control system. In order to run fast, stably and reliably on different tracks, the system collects discrete path information through an array of photoelectric sensors, and digital PID and indirect PID algorithms are applied to the drive motor and the steering gear. The system thereby avoids step changes in the intelligent car's direction and speed, eliminates over-control and oscillation, and approximately achieves a continuous control effect. The study shows that the control algorithm has a good self-tracing effect on black-and-white (or large color difference) tracks.
Keywords: Smart Car, Tracing Algorithm, Photoelectric Sensor, PID Algorithm.
1 Introduction
The Freescale smart car competition is a national science and technology competition organized jointly by the Freescale company and Tsinghua University. The organizing committee provides a standard car model, DC motor and rechargeable batteries; each team must design a smart car that independently identifies a specific track, and the winner is the car that completes the whole track fastest with the better technical report. Teams need to learn and use the CodeWarrior IDE and online development methods, and design the scheme for automatically identifying the track, the motor drive circuit, the speed sensing circuit, the steering servo motor drive model and the software for the MC68S912DG128 microcontroller. The expertise involved covers control, pattern recognition, sensor technology, electrical engineering, computer science, mechanics and so on. Students are trained in practical hands-on ability and knowledge integration, so the competition not only helps students improve their capacity for independent innovation, but also promotes the academic level of the related subjects.
2 The Key Control Algorithm Design
The smart car control system is a typical closed-loop system; it mainly controls the front wheel direction and the rear wheel speed. Therefore, the MCU needs to receive the signal from the trace identification circuit and the speed sensor signal, perform
tracing using a path search algorithm, and then control the steering servo motor and the DC drive motor. The control strategy of the smart car can be divided into speed priority and stability priority: the control target of the former is to be as fast as possible, while the latter focuses on the stability of the car. In short, different design goals lead to different control methods, and there is a wide variety of control designs [1]. The smart car tracing algorithm is a key part of the intelligent car design, and the main design work of the smart car is carried out around it. The tracing method: 7 sensors are used to identify the track, arranged in a line with 2 cm spacing; the white reflectance of a sensor is set to the maximum and the black reflectance to the minimum. Four index ranges (0, 1, 2, 3) are divided between the maximum and minimum values, and the location of the smart car is basically determined by combining the index values.
(1) The speed control algorithm design
1) The speed collection design
The key to closed-loop control is obtaining the current value of the controlled object, so a way is needed to measure the actual speed of the rear wheels. There are two options: an optical encoder and Hall sensors. High precision is the advantage of the optical encoder, but its installation is inconvenient and demanding on the environment; the advantages of the Hall sensor are simple installation and high test precision, while its disadvantage is a delay between the measured speed and the actual speed. This design uses the latter. The circumference of the rear wheel is 17 cm, and six magnetic steels are added to the rear wheel, so it produces six pulses per revolution; an input capture interrupt is generated about every 2.83 cm of travel, and the speed of the car is calculated from the time interval between two rising edges [2]. However, the implementation of capturing differs from the principle described above. Because the bus frequency is 32 MHz and the timer prescaler divides by up to 128, the maximum timer overflow time is:

t = 65535 / ((32 × 1000000) / 128) = 262 ms
(1)
The input capture time intervals can be divided into three types. First, the two input captures occur within the same timer period, so no overflow interrupt happens between them; the current interval is simply the difference between the second capture value and the first. Second, one timer overflow interrupt occurs between the two input captures, and the second value is less than the first; the interval is the difference between the two values plus the remaining ticks of the previous cycle. Third, more than one timer overflow interrupt occurs between the two input captures; this situation is more difficult because there may be confusion. In practice, a global variable is incremented by 1 whenever the overflow interrupt happens; from its value we know how many overflow interrupts occurred between the input capture interrupts, and so obtain the actual interval. The speed calculation formula is:

s = 2.83 / ((262/65535) × T) = 2.83 / (0.004 × T) = 707.5 / T
(2)
Here T is the number of timer ticks between the two input captures. In practical application we can calculate: 28.3 / 262 = 0.108 m/s. Obviously, if the interval is greater
than 262 ms, the speed is less than 0.108 m/s. Because such a rate is too low to be meaningful in practice, only the case of count == 1 is considered in this paper; when the count is greater than 1, the speed is below 0.108 m/s and is treated as about 0, with the speed value assigned 0xFFFF. Typical values are put into a corresponding speed table, and the current speed range is determined by roughly comparing the T value against it.
Measuring the actual speed is only the first step; controlling the speed is more important, so that the actual speed shows only small fluctuations around the target speed. The PID algorithm is used to control the speed, and the following describes its design.
2) PID algorithm design
In a PID controller, P is the proportional factor, I the integral factor and D the differential factor; the control quantity is formed by a linear combination of the three factors. Because the PID controller is simple, easy to tune and does not need an exact mathematical model, it is applied in all areas of industry. It first appeared in analog control systems, where the function of the traditional analog PID controller was realized in hardware. With the emergence of the computer it was transferred to computer control systems, with software realizing the original PID function; this is called a digital PID controller, and the corresponding set of algorithms is called the PID algorithm [3]. In analog control systems the usual control law is PID control. To illustrate how the controller works, we look at the example shown in Figure 1.
Fig. 1. Small power DC motor speed control system
Here n0(t) is the given speed and n(t) the actual speed; their difference is e(t) = n0(t) − n(t). After adjustment by the PID controller, e(t) becomes the voltage control signal u(t), which, amplified through the power stage, drives the DC motor to change the car speed.
Fig. 2. Analog PID control system schematic
The principle of the general analog PID control system is shown in Figure 2. The system consists of the analog PID controller and the controlled object. In the figure, r(t) is the given value, y(t) is the actual output value of the system, and e(t) is their difference, that is:

e(t) = r(t) − y(t)
(3)
e(t) is the input of the PID controller and u(t) is its output, which is also the input of the controlled object. The control law of the analog PID controller is:

$$u(t) = K_p\left[e(t) + \frac{1}{T_I}\int_0^t e(t)\,dt + T_D\,\frac{de(t)}{dt}\right] + u_0 \qquad (4)$$
Here Kp is the proportional constant, TI the integral constant, TD the differential constant and u0 the control constant. Computer control is sampled control: it can only calculate the control quantity from the sampled deviation values and cannot output the control quantity continuously as analog control does. Because of this, the integral and derivative terms in formula (4) cannot be used directly and must be discretized. The discretization method is: with T as the sampling period and k as the sample number, the discrete sampling time stands for the continuous time t, a sum replaces the integral and an increment replaces the differential, giving the following formula:
$$u_k = K_p\left[e_k + \frac{T}{T_I}\sum_{j=0}^{k} e_j + \frac{T_D}{T}\,(e_k - e_{k-1})\right] + u_0 \qquad (5)$$
Here k is the sample serial number, k = 0, 1, 2, ...; uk is the computer output at the k-th sampling; ek is the input deviation at the k-th sampling; and ek−1 is the input deviation at the (k−1)-th sampling. From formula (5), the incremental digital PID algorithm is:
$$u_k = u_{k-1} + \Delta u_k \qquad (6)$$
The system uses the incremental digital PID algorithm to achieve speed control [4]; that is, according to the measured speed and the target speed, the system adjusts the output to continuously approach the target speed. The speed control procedure of this design uses the following formula:

get = ((es1 − es0) + (int)(es1 − aim_speed)) × K
(7)
Here es1 is the current speed, es0 is the previous speed, aim_speed is the target speed and K is a constant coefficient. The speed control output is divided into 60 levels: the first 30 levels are forward gears and the remaining 30 are reverse gears, with a higher gear meaning a faster speed. Actual tests show that this method controls the speed well; the measured speed stays around the target value, and the fluctuation range can generally be kept within 500 ticks. A sketch of the measurement and control step is given below.
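The following is a minimal sketch combining the interval-based speed estimate of formula (2) with the control step of formula (7); the gear clamping, function names and the sign of the gear adjustment are our own illustrative assumptions.

```python
def speed_from_interval(ticks, overflows):
    """Formula (2): speed from timer ticks between two rising edges.
    Each timer period is 262 ms; `overflows` counts overflow interrupts."""
    if overflows > 1:
        return 0.0            # slower than 0.108 m/s, treated as standstill
    T = ticks + overflows * 65535
    return 707.5 / T          # same units as in the paper

def control_step(es1, es0, aim_speed, gear, K=0.5):
    """Formula (7) plus clamping to the 60-level gear output."""
    get = ((es1 - es0) + int(es1 - aim_speed)) * K
    gear = gear - get         # drive the output toward the target speed
    return max(-30, min(30, int(gear)))
```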
(2) The tracing algorithm design
1) Tracing algorithm strategy and control processes
The track is marked by a black ribbon on a white background, and the car must automatically follow the black ribbon, so the black-and-white gray level information must be identified effectively. To better distinguish black and white, the tracing tool of the system uses grayscale sensors that are sensitive to the difference between black and white. Depending on the gray level, the output voltage of a grayscale sensor differs, and this voltage is converted to a digital value by the A/D conversion module of the MCU. Two points should be noted when using the A/D converter: first, the A/D sampling time should comply with the real-time requirements of the system; second, the sampling accuracy for black and white should be high enough that the software can process accurate data. With a suitable algorithm the software can then decide whether the car is left of, right of, or off the track. The process control algorithm of the car is given here. The black and white values collected by the 7 grayscale sensors on the front of the car are converted into digital values by the 10-bit, 7-channel A/D module; from these values the control software judges which sensor the black ribbon is under, and the system decides whether to speed up or slow down and whether to turn left or right. The corresponding control algorithm flow is shown in Figure 3. The idea of the control algorithm: first, it determines whether the middle sensor is located over the black ribbon; if it is, the car goes forward fast; otherwise it judges whether the car is currently off to the left, off to the right, or has missed the track, and processes further accordingly. If the car is off to the left, the process is shown in Figure 4 (a code sketch of this decision logic follows Fig. 5), which checks whether one of the left three sensors is located over the black ribbon.
Fig. 3. Control algorithm flow diagram
Fig. 4. Left side process flow diagram
If the left 1 sensor is located over the ribbon, the car turns left 8 degrees and moves forward in gear 15; if the left 2 sensor, it turns left 25 degrees and moves in gear 12; if the left 3 sensor, it turns left 40 degrees and moves forward in gear 10, which completes the left-side cases. If the car is off to the right, the processing is quite similar to the left side. If all current sensors read white, the process is shown in Figure 5: the car has completely left the black ribbon, so the next action is determined by the previous state. If the previous state was left, the car turns left 42 degrees and moves in gear 10; if the previous state was right, it turns right 42 degrees and moves in gear 10.
2) Improved tracing algorithm strategy
The design of an intelligent model car encounters difficulties in path detection and tracing strategy design. According to the car production process and its own characteristics, the following research aims at improving the speed and increasing the stability. The 10 infrared sensors installed at the front of the car form a linear array of photoelectric sensors spaced 20 mm apart; they detect the track information vertically and obtain the gray value of the track in the range 0-1023, where a greater value means more light. In actual testing the infrared sensors are hand-welded, so even for the same black or white surface the sampling values from different sensors differ. This is shown in Figure 6, where the values were obtained with a median filter followed by a 5-point average filter. In the figure, pure white, pure black and the intermediate color are marked by the corresponding red, yellow and blue lines. It is easy to see that the sampling values of the same black or white differ between sensors, that for the same sensor the sampling values of black and white differ markedly, and that the gray sampling values from different sensors are correlated.
Fig. 5. Missed track flow diagram
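As a compact restatement of the flow in Figs. 3-5, the following is a minimal sketch of the 7-sensor decision logic; the sensor indexing, the (steer, gear) encoding and all names are our own illustrative assumptions drawn from the angles and gears quoted above.

```python
def trace_decision(on_ribbon, prev_side):
    """on_ribbon: 7 booleans, index 0..6 from leftmost to rightmost sensor;
    prev_side: 'left' or 'right', the last side the ribbon was seen on.
    Returns (steering angle in degrees, negative = left; gear level)."""
    MID = 3
    if on_ribbon[MID]:
        return 0, 15                          # centered: go forward fast
    for offset, (angle, gear) in enumerate([(8, 15), (25, 12), (40, 10)], 1):
        if on_ribbon[MID - offset]:
            return -angle, gear               # ribbon is off to the left
        if on_ribbon[MID + offset]:
            return angle, gear                # ribbon is off to the right
    # missed track: swing toward the side where the ribbon was last seen
    return (-42, 10) if prev_side == "left" else (42, 10)
```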
Fig. 6. Data comparison 1 of the 10 sensors
Fig. 7. Data comparison 2 of the 10 sensors
The difference between the pure white and intermediate color sampling values, and the difference between the intermediate color and pure black sampling values, can be plotted as in Figure 7: the red line is the difference between pure white and the intermediate color, and the yellow line the difference between the intermediate color and pure black. It is clear that the two differences are linear. Therefore, we can map the 10 sensor sampling values into the same vector space by a linear mapping, with mapped values in the range 0-1000; the 10 sensor readings after mapping are then comparable. Many mathematical methods are then available to find the deviation between the center of the sensor line array and the track. Here the system first selects the 5 sensors near the black ribbon by screening, then performs quadratic curve fitting and determines the offset of the car by seeking the abscissa of the minimum point. The normalized real-time sampling data need brightness compensation: the brightness deviation is mainly caused by uneven ambient light, for example a strong light from one side. To simplify the processing we assume this effect is linear, and the system applies a linear brightness compensation before the screening and fitting.
The following derives the mathematical formulas for obtaining a reliable deviation [5]. The real-time sampling data of the ten sensors are x1, x2, ..., x10. A parameter calibration is needed before tracing; the parameters are obtained through statistics and calculations over large amounts of data, and cluster analysis can be used here. Clustering a large number of hand-collected samples shows that the two largest clusters are the sensor sampling values of pure white and pure black. Therefore we extract field reference values for pure white and pure black by real-time detection before running. The two reference values are denoted $h_i$ and $l_i$, where $i$ is the sensor index. As noted above, a linear mapping standardizes the data to the range 0-1000, that is:
$$\frac{x_i' - 0}{1000 - 0} = \frac{x_i - l_i}{h_i - l_i},$$

so after mapping the data are

$$x_i' = 1000 \times \frac{x_i - l_i}{h_i - l_i}.$$

First set $x_i' = y_i$. The linearly standardized data then need brightness compensation. The linear brightness deviation function is $y^* = ax + b$; using the least-squares principle, the variance sum is
$$M = \sum_{i=1}^{10}(y_i^* - y_i)^2 = \sum_{i=1}^{10}(ax_i + b - y_i)^2 .$$

Setting the derivatives with respect to $a$ and $b$ to zero gives

$$\frac{\partial M}{\partial a} = 2\sum_{i=1}^{10}(ax_i + b - y_i)x_i = 0, \qquad \frac{\partial M}{\partial b} = 2\sum_{i=1}^{10}(ax_i + b - y_i) = 0,$$

which simplifies to

$$a\sum_{i=1}^{10}x_i^2 + b\sum_{i=1}^{10}x_i = \sum_{i=1}^{10}x_i y_i, \qquad a\sum_{i=1}^{10}x_i + Nb = \sum_{i=1}^{10}y_i,$$

where $N = 10$ is the number of sensors.
It is not difficult to solve this pair of linear equations for $a$ and $b$. We are only interested in the value of $a$, because it directly reflects the brightness deviation; each sensor must therefore remove this error term:

$$y_i' = y_i - ax_i,$$

where $y_i'$ is the gray value of each sensor after brightness compensation. Then we select the sensor $k$ whose gray value is minimal and extend two sensors to each side, obtaining the five data points most closely related to the black guide line: $z_i$ $(i = k-2, k-1, k, k+1, k+2)$. A parabola is fitted to them; the fitting function is

$$z^* = ax^2 + bx + c,$$

and the variance sum is

$$N = \sum_{i=k-2}^{k+2}(z_i^* - z_i)^2 = \sum_{i=k-2}^{k+2}(ax_i^2 + bx_i + c - z_i)^2 .$$

Setting the derivatives with respect to $a$, $b$, $c$ to zero gives:
$$\frac{\partial N}{\partial a} = 2\sum_{i=k-2}^{k+2}(ax_i^2 + bx_i + c - z_i)x_i^2 = 0, \quad \frac{\partial N}{\partial b} = 2\sum_{i=k-2}^{k+2}(ax_i^2 + bx_i + c - z_i)x_i = 0, \quad \frac{\partial N}{\partial c} = 2\sum_{i=k-2}^{k+2}(ax_i^2 + bx_i + c - z_i) = 0,$$
which simplifies to

$$a\sum x_i^4 + b\sum x_i^3 + c\sum x_i^2 = \sum x_i^2 z_i, \qquad a\sum x_i^3 + b\sum x_i^2 + c\sum x_i = \sum x_i z_i, \qquad a\sum x_i^2 + b\sum x_i + 5c = \sum z_i,$$

with all sums taken over $i = k-2, \dots, k+2$.
So the abscissa of the minimum point of the fitted curve is

$$x_p = -\frac{b}{2a}.$$
Setting $\Delta L = x_p$, $\Delta L$ is the deviation between the center of the sensor line array and the track. From Figure 4, the error signal $e(t)$ of the PID controller is exactly this $\Delta L$, which is the basis of car tracing with the PID algorithm; the PID weights are then determined through experiments, finding parameters under which the car runs well.
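The complete deviation computation derived in this section can be summarized in a short sketch. NumPy is assumed, sensor coordinates are taken as centred on the array, and all data values are invented for illustration.

```python
import numpy as np

def track_deviation(y, x):
    # 1) Linear brightness compensation: fit y ~ a*x + b, subtract a*x.
    a, b = np.polyfit(x, y, 1)
    y2 = y - a * x
    # 2) Pick the darkest sensor k and take the window k-2 .. k+2.
    k = int(np.argmin(y2))
    k = min(max(k, 2), len(y2) - 3)          # keep the window inside the array
    xs, zs = x[k - 2:k + 3], y2[k - 2:k + 3]
    # 3) Parabola fit z* = a2*x^2 + b2*x + c; minimum abscissa is -b2/(2*a2).
    a2, b2, _ = np.polyfit(xs, zs, 2)
    return -b2 / (2.0 * a2)                  # deviation, used as PID error e(t)

x = (np.arange(10) - 4.5) * 20.0             # ten sensors, 20 mm spacing, centred
y = np.array([900, 880, 700, 300, 120, 180, 650, 870, 905, 890], dtype=float)
print(track_deviation(y, x))                 # offset of array centre vs. track
```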
3 Summary

Speed control combined with the tracing algorithm is an effective method of smart car control. The algorithm not only heightens the dynamic performance and reaction speed of the smart car system, but also improves its adaptability and robustness, so the smart car can run at a higher speed. During the experiments, the first version of the smart car was slow on S-curve tracks and easily lost the trace, mainly because its 7 sensors perceived only a short distance ahead, which hindered the car's prediction. In order to predict the track situation farther ahead, the second version of the smart car installs 10 tilted sensors. In addition, ambient light has a large impact on the running car, so anti-jamming ability was improved by designing a filter circuit and optimizing the control algorithm. The experiments show that this system achieves a good tracing effect.
References 1. Kaisheng, H., et al.: Analysis the technology program of South Korean Smartcar model. Electronic Engineering & Product World (March 2003) 2. Wang, Y., Liu, X.: Embedded Application Technology Tutorial. Tsinghua University Press, Beijing (2005)
3. Yang, X.-l.: Application of PID Arithmetic in Aptitude Vehicle. Experiment Science and Technology (August 2010) 4. Jia, X.: Feedforward-Improved PID Algorithm Applied in the Smartcar. Computer & Information Technology (December 2008) 5. Cai, G., Hong, N.: A Study of Navigation Based on the SmartCar. Electronic Science and Technology (6) (2009)
A Basis Space for Assignment Problem

Shen Maoxing1, Li Jun1, and Xue Xifeng2

1 Department of Management Science, Xijing University, Xi'an, Shaanxi 710123, P.R. China
2 Department of Mathematics, Northwest University, Xi'an, Shaanxi 710069, P.R. China

[email protected]

Abstract. An algebraic structure of capability probability vectors is presented through analysis of the model of the assignment problem, and the concept of a basis space of the mission assignment problem is introduced. Furthermore, some properties are obtained in this space, pointing out a new research direction for the mission assignment problem.

Keywords: Mission assignment, Capability probability vector, Space structure, Mission unit, Task.
1 Introduction

Mission assignment is an important problem in operations research, widely used in various management and control fields such as the design of intelligent transportation systems, intelligent computing, communication systems and industrial control systems. Facing the modern informationization situation, with high technology and command-and-control automation developing rapidly, research on mission assignment is increasingly necessary and inevitable. Actual mission assignment problems have become more complex and diverse due to various differences in deployment and tactics. Therefore, we aim to establish a structural description of mission assignment for the case of independent work, without regard to the optimization of task choice. Based on this thinking and framework, further research will evolve together with other work on the scheme space of mission assignment. This work can be regarded as an attempt in the field of basic theory.
2 Problem Description

Usually the mission assignment problem is as follows: there are m groups (mission units) deployed in a certain area, and they are asked to handle a task stream containing n tasks entering this defense area. Generally speaking, the purpose of mission assignment is to assign the m working units to the n tasks so as to finish them and obtain certain effects or purposes. We denote the m groups as $U_i$ ($i = 1, 2, \dots, m$) and the n tasks as $T_j$ ($j = 1, 2, \dots, n$).
Suppose we know the capability probability of $U_i$ applied to $T_j$: $p_{ij}$ ($i = 1, \dots, m$; $j = 1, \dots, n$), and write these as vectors

$$p_1 = (p_{11}, p_{12}, \dots, p_{1n}) \;\hat{=}\; (p_{1j}), \quad \dots, \quad p_m = (p_{m1}, p_{m2}, \dots, p_{mn}) \;\hat{=}\; (p_{mj}) \qquad (1)$$

called capability probability vectors; the capability probability matrix is denoted $P = (p_{ij})_{m \times n}$. Generally, people hope to accomplish as many of the tasks as possible through one round of mission assignment and work. The problem is how to construct an optimal assignment scheme: in order to maximize the mathematical expectation of accomplished tasks, determine which mission unit does which task, given the m mission units and n tasks. We can construct an LP (linear programming) model for this problem as follows:

$$\max\; Q = \sum_{i=1}^{m}\sum_{j=1}^{n} x_{ij} p_{ij}$$

$$\text{s.t.}\quad x_{ij} = \begin{cases} 0, & U_i \text{ does not work on } T_j \\ 1, & U_i \text{ works on } T_j \end{cases},\qquad 0 \le p_{ij} \le 1,\ \ i = 1,\dots,m;\ j = 1,\dots,n \qquad (2)$$
If we append the condition that each mission unit is assigned to only one task and each task is done by only one mission unit, this means

$$\sum_{j=1}^{n} x_{ij} = 1, \qquad \sum_{i=1}^{m} x_{ij} = 1 \qquad (i = 1,\dots,m;\ j = 1,\dots,n) \qquad (3)$$

Hence the model becomes an extended assignment problem model. Here we concentrate on adopting an algebraic structure to describe mission assignment rather than on the solving method.
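For reference, the one-to-one model (2)-(3) with m = n is the classical assignment problem and can be solved directly, for example with the Hungarian-method routine in SciPy. The sketch below is only illustrative; the probability matrix is invented.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

P = np.array([[0.9, 0.4, 0.3],      # capability probabilities p_ij (made up)
              [0.5, 0.8, 0.6],
              [0.2, 0.7, 0.9]])

rows, cols = linear_sum_assignment(P, maximize=True)  # maximize sum of p_ij
for i, j in zip(rows, cols):
    print(f"U{i+1} -> T{j+1}  (p = {P[i, j]})")
print("Q =", P[rows, cols].sum())   # maximized expectation of accomplished tasks
```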
2.2 Space Structure Constructing
We note a number set $\Lambda = (0, +\infty)$, $\lambda \in \Lambda$, and a vector set $PP = [0,1]^n$, $p \in PP$, where $[0,1]^n$ is $[0,1] \times [0,1] \times \cdots \times [0,1]$. Hence $p_i \in PP$, $i = 1, 2, \dots, m$. Define two operations:

$$P_r \oplus P_s = (p_{rj} + p_{sj} - p_{rj} p_{sj})_{1 \times n}, \qquad \lambda \circ P_s = \left(p_{sj}^{1/\lambda}\right)_{1 \times n} \qquad (4)$$
such that, for $\lambda \in \Lambda$ and $p_r, p_s \in PP$, we have $P_r \oplus P_s \in PP$ and $\lambda \circ P_s \in PP$; that is, the two operations defined above are closed in $PP$. We call the four-element entity $(PP, \Lambda; \oplus, \circ)$ a basis space of the mission assignment problem. It should be stated that $(PP, \Lambda; \oplus, \circ)$ is not a vector space in the pure mathematical sense, but it is at least an algebraic structure, and we may as well call it a basis space of the (mission) assignment problem. The meaning of the operations can be explained as follows: $P_r \oplus P_s$ is the accomplishment vector when mission units $U_r$ and $U_s$ both work and each task is accomplished by at least one of them; $\lambda \circ P_s$ is the ebb-and-flow vector of the capability probability vector of mission unit $U_s$, and we call $\lambda$ the intensifying coefficient of accomplishment power (or work effectiveness).
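The two operations in (4) are easy to realize componentwise. The following sketch (NumPy assumed, vectors invented) also illustrates the closure of $PP$ under both operations:

```python
import numpy as np

def oplus(pr, ps):
    # P_r (+) P_s: accomplishment vector when U_r and U_s both work.
    return pr + ps - pr * ps

def intensify(lam, ps):
    # lambda o P_s = (p_sj^(1/lambda)): strengthened for lambda > 1,
    # weakened for 0 < lambda < 1.
    return ps ** (1.0 / lam)

pr = np.array([0.6, 0.3, 0.8])
ps = np.array([0.5, 0.7, 0.4])
print(oplus(pr, ps))        # componentwise >= both pr and ps (property (1))
print(intensify(2.0, ps))   # >= ps, since lambda >= 1 (property (2))
print(intensify(0.5, ps))   # <  ps, since 0 < lambda < 1
```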
2.3 Some Simple Properties
Some simple properties of the two operations defined on the basis space of the mission assignment problem can be obtained easily.

(1) $P_r \oplus P_s \ge P_r$ and $P_r \oplus P_s \ge P_s$. This hints at "a good cooperative result of missions".

(2) $\lambda \circ P_s \ge p_s$ if $\lambda \ge 1$, and $\lambda \circ P_s < p_s$ if $0 < \lambda < 1$. This hints at "a good potential-exploring result of a mission".

(3) The smaller $\lambda$, the smaller $\lambda \circ P_s$, and $\lim_{\lambda \to 0^+} \lambda \circ p_s = (0)_{1 \times n}$.

(4) The bigger $\lambda$, the bigger $\lambda \circ P_s$, and $\lim_{\lambda \to +\infty} \lambda \circ p_s = (1)_{1 \times n}$.

Proof. (1) Since $p_{rj} + p_{sj} - p_{rj}p_{sj} = p_{rj} + p_{sj}(1 - p_{rj}) \ge p_{rj}$ and $p_{rj} + p_{sj} - p_{rj}p_{sj} = p_{rj}(1 - p_{sj}) + p_{sj} \ge p_{sj}$ ($j = 1, 2, \dots, n$).

(2) Since $p_{sj}^{1/\lambda} \ge p_{sj}$ when $1/\lambda \le 1$, i.e. $\lambda \ge 1$, and $p_{sj}^{1/\lambda} < p_{sj}$ when $1/\lambda > 1$, i.e. $0 < \lambda < 1$ ($j = 1, 2, \dots, n$), we get $\lambda \circ P_s \ge p_s$ for $\lambda \ge 1$ and $\lambda \circ P_s < p_s$ for $0 < \lambda < 1$.

(3) Since $0 \le p_{sj} \le 1$, the smaller $\lambda$, the bigger $1/\lambda$ and the smaller $p_{sj}^{1/\lambda}$; and since $\lim_{\lambda \to 0^+} p_{sj}^{1/\lambda} = 0$ ($j = 1, \dots, n$), we get $\lim_{\lambda \to 0^+} \lambda \circ p_s = (0)_{1 \times n}$.

(4) Since $0 \le p_{sj} \le 1$, the bigger $\lambda$, the smaller $1/\lambda$ and the bigger $p_{sj}^{1/\lambda}$; and since $\lim_{\lambda \to +\infty} p_{sj}^{1/\lambda} = 1$ ($j = 1, \dots, n$), we get $\lim_{\lambda \to +\infty} \lambda \circ p_s = (1)_{1 \times n}$.

3
Summary and a Tag
The research in this paper is meant, as the saying goes, to cast a brick to attract jade: it attempts to work out a systematic, structural and practical approach to research on the mission assignment problem. Such a treatment should also benefit the application of modern optimization algorithms (such as GA, ANN, etc.) and conclusions of modern applied mathematics. This work is only a simple beginning on the basic question; the related next step is to found a scheme space of mission assignment, which is still basic research. We hope to attract more researchers to pay attention to the further work and to offer criticism or improvements.
References 1. Yuanzhen, W., Maoxing, S., Cheng, N.: A space description for task assignment problem. System engineering and electric techniques 23, 19 (2001) 2. Zuiliang, Z., Changsheng, L., Wenzhi, Z., et al.: Military Operations Research. Military Science Press, Beijing (1993) 3. Naikui, L.: Basic Course on Theory of Military Operations Research. National Defense University Press, Beijing (1998) 4. Algebra Group of Geometry and algebra Sector of Peking University: Advanced Algebra. People Education Press, Beijing (1978) 5. Olkin, I., Gleser, L.J., Derman, C.: Probability Models and Applications. Macmillan Publishing Co.,Inc., New York (1980)
The Analysis on the Application of DSRC in the Vehicular Networks

Yan Chen, Zhiyuan Zeng, and Xi Zhu

School of Hydropower and Information Engineering, Huazhong University of Science and Technology, Wuhan, P.R. China
[email protected]

Abstract. The vehicular networks, relying on the intelligent transportation system, are the product of the modern vehicle entering the information era, and they have important applications in reducing traffic congestion and traffic accidents. As the key technology for realizing vehicle-road and vehicle-vehicle communication, Dedicated Short Range Communication (DSRC) is widely used in the vehicular networks. This article analyzes the problems exposed in the practical application of DSRC technology, proposes technical improvements, and confirms that DSRC has broad prospects in vehicular network applications. Finally, it analyzes the prospects of vehicular networks and the matters that deserve attention in their development.

Keywords: Vehicular networks, Intelligent transportation system, Dedicated Short Range Communication, Multi-lane free flow electronic toll collection system.
1 Introduction
At World Expo 2010 Shanghai, China, people foresaw a highly effective, properly ordered vehicular network based on the Intelligent Transportation System (ITS) in the science-fiction movie "2030": vehicles running on the road just like fish swimming in the deep sea, the so-called "fish-school effect" [1]. Through this effect, vehicles communicate with one another freely and build multi-directional relationships. Even if there is danger at the next turn or beyond, the driver can realize it early; in this way traffic safety is improved and the probability of traffic accidents is greatly reduced. Through the interaction between vehicles, intelligent control, accident prevention and other functions are achieved. The vehicular networks (also known as VANETs) are the product of the modern vehicle entering the information era. The On-Board Unit (OBU), through Radio Frequency Identification (RFID) and other wireless technologies, realizes the extraction and effective use of vehicles' attribute information, static information and dynamic information on the information network platform [2].
Meanwhile, the running status of all vehicles is effectively supervised, and comprehensive services are provided according to different functional demands. Generally, a vehicular network is divided into the following three layers [3]:
Fig. 1. The three layers of VANETs (Service / Vehicles / ITS)
The bottom layer is the ITS: a real-time, accurate and efficient integrated traffic management and control system, built so that advanced sensor technology, communication technology, digital processing technology, network technology, automatic control technology, information dissemination technology and so on are organically applied throughout the traffic management system. All the infrastructure that the entire vehicular network needs is provided by the ITS. The intermediate layer is the vehicles with intelligent interconnection functions, which are the core of the whole vehicular network. The top layer is the vehicular network services, such as vehicle self-piloting, intelligent parking and emergency charging. In order to realize VANETs, we need to complete the following three connections. First, establish the connection between people and vehicles through smart phones and other mobile equipment. Second, build the connection between vehicle and vehicle, as well as between vehicles and roadside devices such as traffic lights and chargers. Third, establish the connection between the vehicle and the wireless network, which is the most crucial connection. This paper briefly introduces the Dedicated Short Range Communication (DSRC) technology widely used in VANET connections, then discusses the deficiencies DSRC has shown in practical application and the corresponding technical improvements, hoping to enhance the practical application of VANETs.
2 Problems Exposed in the DSRC
Basic Analysis of the DSRC. DSRC consists of three parts: the On-Board Unit (OBU) carried in the vehicle, the Road Side Unit (RSU), and the protocol for dedicated short-range communication. At present DSRC falls into three main camps (European, American and Japanese), whose cores are CEN TC278, ASTM/IEEE and ARIB T75 respectively; they represent the research and development directions of the key international ITS technologies.

Problems of the DSRC Exposed in Practical Application. The most typical and successful case of DSRC application in VANETs is the Electronic Toll Collection (ETC) system. In 2007 China developed its own DSRC protocol, GB/T 20851, and since then more than 20 cities have deployed ETC applications using DSRC technology, while some potential risks and problems have been exposed in practical application; we analyze them below. On the OBU side, the OBU-ESAM security module is an embedded safety-control module that uses a dedicated smart chip to implement functions such as data encryption and decryption, bidirectional authentication, access control and data file storage. In practical ETC applications, however, its functions are rather limited, its extensibility is poor, and each COS instruction takes comparatively long to execute. These defects show that the existing OBU-ESAM cannot meet the application demands of ETC. On the RSU side, the unit is composed of a high-gain directional read-write antenna and a radio-frequency controller. Usually the RSU works in on-line mode, i.e. under the control of the lane controller: all interactive instructions between RSU and OBU must pass through the lane controller, and instructions of different functions communicate through the same TCP port, so the working efficiency is quite low. In the multi-lane free-flow system there are additional peculiar situations in reality: vehicles may cross from one lane to another, or run side by side while one overtakes another. The RSU therefore must deal with OBUs in several lanes at the same time, which demands strong RSU capability; at present, RSU products cannot satisfy the ETC system.
3 Measures and Solutions to Improve the Practical Application of the DSRC
Improvement of the OBU-ESAM Security Module. After fully considering the characteristics of the ETC system, we established an improved technical program for the OBU-ESAM. (1) We designed special COS instructions, such as READ DYNAMICINF, GET TAC and SET KEYINDEX, to speed up card processing enormously and enhance operating performance.
(2) We redesigned the card file organization so that it supports not only the ETC application but also closed-road electronic toll collection, traffic control and other applications. (3) The new OBU-ESAM supports multi-key storage, which greatly improves its security.

Improvement of the RSU. In order to optimize the RSU's ability to handle simultaneous OBUs, we added a dedicated link-establishment mechanism, namely a private window request and a private window allocation (PrWA); at the same time, the MAC control field adds an assigned time-window mechanism and assigns the dedicated uplink window in the downlink frame sequence control.

Multi-Lane Free-Flow Control System. Unlike traditional single-lane restricted ETC, multi-lane free-flow ETC has many advantages: it needs neither interception facilities nor an artificial auxiliary lane. We have proposed a new multi-lane free-flow control system to enhance the application of the DSRC protocol in VANETs. It suits a multi-lane multiprocessing architecture, has a high communication speed, and can guarantee reliable and accurate toll collection as well as the security, consistency and integrity of the charge data. The overall structure of this control system is shown in Table 1.

Table 1. The structure of the multi-lane free-flow control system

Vehicle examination: coil examination; high-definition video flow examination; traffic statistics.
Communication with RSU: accept the primitive passing records; monitor the RSU working status; send the vehicle information table; set the RSU operational parameters; clock synchronization.
Image snapshot: license plate snapshot.
Traffic lane device control: traffic lane vision device control; traffic lane monitoring device control.
Communication with the station-level system: receive the vehicle information table; receive control commands; receive RSU control commands; receive the synchronized clock; upload the electronic transaction records; upload license plate images; upload the running status of the traffic lane control system; upload the running status of the RSU.
The multi-lane free-flow control system has the following technical advantages and characteristics: (1) it can control multiple lanes, multiple antenna groups and RSUs, and receive transaction records from multiple RSUs in real time; (2) it has a complete redundant transaction-record processing mechanism; (3) it can monitor the condition of the multi-lane control devices in real time and upload it to the toll station system; (4) when the toll-station computers fail or the network breaks down, the collection lane can work independently, with working parameters and data records stored locally; after the lane has worked in this mode for a long time, configuration parameters can be downloaded or charge data uploaded manually. Whether through automatic or manual transmission, the authenticity, reliability, integrity and consistency of the transaction data are fully guaranteed.
4 The Development of the Vehicular Networks
VANETs, as a new application of ITS, have good technical support and broad market prospects, and are of great significance for alleviating transportation pressure, reducing traffic accidents and realizing intelligent guidance. On the technical side, research on vehicular networks has been carried out at home and abroad, with preliminary progress on DSRC and related technologies. On the application side, there are already many successful practical applications at home and abroad; the most typical is the ETC system, which has made very good progress and improved people's travel quality. At the same time, some issues deserve attention. Technically, we should continue research on information security in VANETs: all vehicle-related information is transmitted through the network, and if it is disturbed or intercepted, information leakage and property damage will follow, affecting the application and promotion of the entire vehicular network. Furthermore, countries should exchange ideas to formulate a unified VANET standard architecture, so that regional systems and equipment can be mutually compatible.
References 1. Ping, Y.: Welcome the Vehicular Networks Era. Traffic and the Transportation, 56–57 (June 2010) 2. Fallah, Y.P., Huang, C.-L., Senguta, R., Krishnan, H.: Analysis of Information Dissemination in Vehicular Ad-Hoc Network with Application to Cooperative Vehicle Safety Systems, vol. 60(1) (January 2010) 3. Yanxia, G.: The Design and Implementation of National DSRC Test Suits. East China Normal University (2009)
Disaggregate Logit Model of Public Transportation Share Ratio Prediction in Urban City

Dou Hui Li1 and Wang Guo Hua2

1 Zhejiang Institute of Communications, Hangzhou, Zhejiang 311112, China
2 Department of Traffic Engineering, Zhejiang Provincial Institute of Communications Planning, Design and Research, Hangzhou, Zhejiang 310006, China

[email protected]

Abstract. In order to analyze the distribution of urban passenger flow scientifically and correctly, a disaggregate logit model is presented to predict the public transportation share ratio in a city. The model is built by analyzing the external and internal factors that affect the choice of transportation mode, based on random utility theory. Firstly, the factors with major contributions to mode choice are selected according to the likelihood ratio statistic. Then the parameters are estimated and the model is constructed. Finally, a public transportation share ratio forecast test is carried out with the proposed algorithm using field survey data. The results of an independent sample test indicate that the model has fine precision and stability.

Keywords: Public transportation, Share ratio prediction, Disaggregate, Logit model.
1 Introduction

The public transportation share ratio is the percentage of trips made by public transit out of all trips, and it is a major index for evaluating the progress of transportation and the rationality of the urban traffic structure [1]. Based on analysis of the features of citizens' activities, predicting the public transportation ratio rationally, objectively and scientifically is one of the fundamental tasks of transportation planning [2]. It has important theoretical significance and practical application value for the government and industry departments: learning the service condition of public transportation, making public-transportation-priority policy, adjusting and optimizing the urban traffic structure, and providing macro-guidance for public transit development. Traffic mode share ratio prediction originates from the prediction of traffic mode split, whose methods can be divided into two categories [3]: the aggregate method based on statistics and the disaggregate method based on probability theory. The aggregate method takes the traffic zone as the research unit and statistically processes the survey data of individuals or families, such as averaging the survey data and calculating the
proportion, and then calibrates the parameters of the model using the statistical data. In this process the original information of the individual or family is pooled, so to ensure accuracy a considerable number of samples is necessary, according to the law of large numbers. The disaggregate method is a relatively complete model based on maximum random utility theory, which commits to an objective interpretation of traffic mode choice behavior and is currently the most widely used approach for transportation mode split prediction. The disaggregate approach takes individual behavior as the research target and uses the original survey data directly to construct the model, so it makes full use of the survey data and does not need much sample data, which has earned it wide attention. Common disaggregate models are the probit model, logit model, dogit model, Box-Cox dogit model and so on. Among these, the logit model supposes that the random terms in travelers' utilities obey the Gumbel distribution, which is consistent with trip distribution features in large-scale transport. Moreover, the form of the logit model is simple, its physical meaning is definite and it is easy to compute, so it is widely used in practice [4]. Based on the above analysis, this paper studies traffic mode share ratio prediction using the logit model. Firstly, the factors with important influence on traffic mode choice are selected. Then the parameters of the model are calibrated using field survey data. Finally, we provide numerical examples on the field data to verify the fine precision of the model.
2 Micro-economic Analysis of Traffic Mode Choice

The traffic share ratio at the aggregate level is the overall effect of the traffic mode choices of all travelers, so it is necessary to analyze the traveler's mode-choice behavior. Generally speaking, the traffic modes available to travelers between any O-D pair are of several types, such as private car, public transport, bicycle and walking, which can be called "selection branches". The satisfaction level of a selection branch is its utility. Based on the usual mentality of travelers making choices, we suppose that: (1) every traveler always chooses the selection branch with the maximum utility when choosing a traffic mode; (2) the utility of every selection branch perceived by a traveler is determined by both the traveler's features and the branch's features. The features of the traveler include vehicle ownership, age, income and so on. The features of a selection branch include travel cost, travel time, comfort, reliability, safety and so on. The utility $V_i$ of the i-th selection branch can be expressed as follows:
$$V_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \cdots = \beta_0 + \sum_{k=1}^{K} \beta_k x_{ik} \qquad (1)$$
where $\beta_0, \beta_1, \beta_2, \beta_3, \dots$ denote undetermined parameters and $x_{i1}, x_{i2}, x_{i3}, \dots$ denote the traveler's personal features and the features of the i-th selection branch [5]. In the practical traffic environment we cannot measure all the factors affecting utility. On the other hand, for various reasons, such as the limitations of traffic information and the individual differences between travelers, there is usually a difference between a traveler's estimate of the utility and the real utility of the i-th selection branch, which can be expressed as follows:
$$U_i = V_i + \varepsilon_i \qquad (2)$$
However, the basis on which travelers choose a traffic mode is their subjective estimate of the utility of every mode, not the real utility. If $U_i = \max_j U_j$, the traveler chooses the i-th traffic mode. Because $U_i$ is uncertain, the traveler chooses the i-th traffic mode only with some probability $P_i$, which can be calculated by the following formula:
$$P_i = P(U_i > U_j) = P(\varepsilon_j < V_i - V_j + \varepsilon_i), \qquad \forall j \in C,\ j \ne i \qquad (3)$$
where C is the set of alternative traffic modes. According to the Bernoulli weak law of large numbers [6], the probability $P_i$ can be regarded as the utilization ratio of the i-th traffic mode, which is also the proportion of travelers choosing that mode. So the traffic demand shared by each traffic mode can be calculated by formula (3).
3 Model Construction of Public Transportation Share Ratio Prediction

Logit Model. Suppose that $\varepsilon_i$ in formula (3) obeys the Gumbel distribution; then we obtain the logit model of traffic modal split. The multinomial logit model can be expressed explicitly and its solution method is simple, so it has received much attention; it gives a good explanation of the macroscopic response of the public transportation share ratio to the mode-choice behavior of travelers. Therefore this paper forecasts the share ratio of public transportation using the logit model. Assume the passenger transport modes have J categories, and let $j = 1, 2, \dots, J$ number the response-variable categories; then the multinomial logit can be written as follows:

$$\ln\left[\frac{P(y=j \mid x)}{P(y=J \mid x)}\right] = \alpha_j + \sum_{k=1}^{K} \beta_{jk} x_k \qquad (4)$$
That is to say, if there are J categories of traffic mode, the choice probability of the j-th traffic mode can be calculated by the following expression:

$$P(y=j \mid x) = \frac{e^{\alpha_j + \sum_{k=1}^{K} \beta_{jk} x_k}}{1 + \sum_{j=1}^{J-1} e^{\alpha_j + \sum_{k=1}^{K} \beta_{jk} x_k}} \qquad (5)$$
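A minimal sketch of expression (5), computing the choice probabilities of all J modes with mode J as the reference, is given below; NumPy is assumed and all parameter values are invented.

```python
import numpy as np

def mnl_probs(alpha, beta, x):
    # alpha: (J-1,), beta: (J-1, K), x: (K,) -> probabilities of modes 1..J.
    u = alpha + beta @ x                       # utilities of modes 1..J-1
    e = np.exp(u)
    denom = 1.0 + e.sum()
    return np.append(e / denom, 1.0 / denom)   # last entry = reference mode J

alpha = np.array([0.2, -0.1, 0.4])
beta = np.array([[-1.5, 0.3],
                 [-0.8, 0.1],
                 [-1.1, 0.2]])
x = np.array([1.2, 0.5])                       # e.g. travel cost and travel time
p = mnl_probs(alpha, beta, x)
print(p, p.sum())                              # the shares sum to 1
```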
Once we have the observed values of the independent variables $x_1, x_2, \dots, x_K$ and the outcome events, the share ratio of each traffic mode can be calculated [7].

Choice of Independent Variables. Among the independent variables involved in the model, not all have an important contribution to the probability forecast. To reduce the work of data acquisition and calculation, we should first reject the variables with little contribution and keep those with important significance. This can be realized through a significance test of the independent variables in the logistic model by means of the likelihood ratio statistic, whose calculation formula is:

$$G_s(x_k) = -2\left[\ln L_s(x_k) - \ln L_f\right] \qquad (6)$$
where $\ln L_s(x_k)$ is the natural logarithm of the maximum likelihood excluding the independent variable $x_k$, and $\ln L_f$ is the natural logarithm of the maximum likelihood including all independent variables. It has been proved in statistics that the likelihood ratio statistic $G_s$ obeys a $\chi^2$ distribution whose degrees of freedom equal the number of factors retained for testing. If $G_s(x_k) > 3.841 = \chi^2_{0.05}$, then $x_k$ is retained; otherwise it is discarded [7].
C
n =1 j∈ An
The logarithm likelihood function can be written as follow:
(7)
Disaggregate Logit Model of Public Transportation Share Ratio Prediction
161
N
L = ln( L∗ ) = ∑ ∑ c nj ln( Pnj ) n =1 j∈ An
⎡ K J −1 α j + ∑ β jk xk ⎤ = ∑ ∑ c nj ⎢(α j + ∑ β jk x k ) − ln(1 + ∑ e k =1 )⎥ ⎢ ⎥ n =1 j∈ An k =1 j =1 ⎣ ⎦ K
N
α j , β jk and we can get
Take the partial derivation of expression (8) about J −1
N ∂L = ∑ ∑ c nj (1 − ∂α j n =1 j∈An
∑e
αj+
K
∑ β jk xk k =1
j =1
J −1
(8)
1+ ∑e
αj+
K
)
(9)
)x k
(10)
∑ β jk xk k =1
j =1
J −1
N ∂L = ∑ ∑ c nj (1 − ∂β jk n =1 j∈An
∑e
αj+
K
∑ β jk xk k =1
j =1
J −1
1+ ∑e
αj+
K
∑ β jk xk k =1
j =1
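In practice, instead of solving (9)-(10) by hand, the negative of the log-likelihood (8) is handed to a numerical optimizer. The following is a minimal sketch (NumPy/SciPy assumed; the data are random placeholders, so the estimates themselves are meaningless):

```python
import numpy as np
from scipy.optimize import minimize

J, K, N = 4, 3, 200
rng = np.random.default_rng(0)
X = rng.normal(size=(N, K))             # observed features (placeholder data)
y = rng.integers(0, J, size=N)          # observed mode choices, J-1 = reference

def neg_loglik(theta):
    alpha = theta[:J - 1]
    beta = theta[J - 1:].reshape(J - 1, K)
    u = alpha + X @ beta.T               # (N, J-1) utilities; reference has u = 0
    log_denom = np.log1p(np.exp(u).sum(axis=1))
    ll = sum((u[n, y[n]] if y[n] < J - 1 else 0.0) - log_denom[n]
             for n in range(N))
    return -ll

res = minimize(neg_loglik, np.zeros((J - 1) * (K + 1)), method="BFGS")
print(res.success, res.x[:J - 1])        # estimated alpha_j
```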
Expressions (9) and (10) are nonlinear functions of $\alpha_j, \beta_{jk}$, which can be solved by suitable software [8].

Goodness of Fit Test. After parameter estimation, we should investigate the quality of the model; in other words, we should evaluate whether the constructed model is suitable and gives fine forecast accuracy. If the model fits the observed data well, it can be adopted for prediction; otherwise the model should be specified anew. The Pearson $\chi^2$ statistic is usually used to evaluate the goodness of fit of a logistic regression model:

$$\chi^2 = \sum_{j=1}^{J} \frac{(O_j - E_j)^2}{E_j} \qquad (11)$$

where $j = 1, 2, \dots, J$, J is the number of covariate patterns, and $O_j$ and $E_j$ are the observed and predicted frequencies of the j-th covariate pattern respectively. A smaller value of the $\chi^2$ statistic means the model fits the observed data better; otherwise we should find the reason and modify the model [8].
4 Experiment of Public Transportation Share Ratio Prediction
Data Preparation. In this paper we take Hangzhou as an example to carry out experiments on public transportation share ratio forecasting; the passenger transport modes there mainly include public transportation, private car, motorcycle (bicycle) and walking. In order to obtain data on residents' travel-choice features, used to analyze the influence of each factor on traffic mode choice, we carried out a questionnaire survey in Hangzhou. To guarantee the accuracy of the parameter estimation, 500 samples were extracted altogether. The survey questions cover the features of the traffic mode, and the personal and travel features of the traveler [9]. As an example, three samples are shown in Table 1:

Table 1. Sample Data of Trip Survey

No. | Select result | Inherent dummies Xj1 Xj2 Xj3 | Fee Xj4 | Time Xj5 | Punctuality Xj6 | Gender Xj7 | Age Xj8 | Job Xj9 | Income Xj10 | Car Xj11 | Distance Xj12 | Purpose Xj13
1 | 1 | 1 0 0 | 2 | 35 | 0.7 | 0 | 43 | 1 | 4000 | 0 | 10 | 1
2 | 3 | 0 0 1 | 1 | 20 | 0.9 | 0 | 36 | 2 | 2500 | 0 | 6 | 3
3 | 2 | 0 1 0 | 3.2 | 30 | 0.85 | 1 | 23 | 1 | 8000 | 1 | 20 | 1
Here, select result 1 represents public transportation, 2 private car, 3 motorcycle (bicycle) and 4 walking. Xj1, Xj2, Xj3 are inherent dummies indicating the other influence factors of the j-th traffic mode not given in expression (1). Xj4, Xj5, Xj6 respectively represent the travel fee (yuan), travel time (minutes) and punctuality rate of the j-th traffic mode. Xj7-Xj11 represent the personal features of the surveyed traveler: gender (1: male; 0: female), age, job (1: institution; 2: enterprise; 3: self-employment; 4: teacher; 5: student; 6: unemployed; 7: else), income, and private car ownership (1: owns a private car; 0: does not). Xj12 and Xj13 respectively represent the travel distance (kilometers) and the travel purpose (1: going to work; 2: going to school; 3: shopping and entertainment; 4: visiting relatives and friends; 5: else).
Model Calibration. Among all the influencing factors in Table 1, not all are important, so we carry out a likelihood ratio test to choose the significant factors by means of the likelihood ratio statistic; the results calculated according to expression (6) are shown in Table 2.
Table 2. Calculation results of the likelihood ratio statistic Factors
ln L10
X
Natural logarithm of the likelihood function
ln L9 ( x jk )
Maximum likelihood ratio statistic
[
G s ( x k ) = −2 ln L9 ( x jk ) − ln L10
]
-163.081512
j4
-182.026124
37.889224
X j5
-184.740219
43.317414
X j6
-198.859318
71.555612
X
j7
-164.101811
2.040598
X j8
-169.790223
13.417422
X j9
-164.610131
3.057238
X j10
-181.783833
37.404642
X j11
-177.431286
28.699548
X j12
-179.358576
32.554128
X j13
-172.197559
18.232094
where ln L9(xjk) is the natural logarithm of the maximum likelihood excluding the independent variable xjk, and ln L10 is the natural logarithm of the maximum likelihood including all independent variables. From Table 2 we can see that the maximum likelihood ratio statistics of Xj7 and Xj9 are both less than 3.841, so they are discarded and the other factors retained. Among the retained factors, punctuality rate, travel time, travel fee, income and travel distance are the most important, which is consistent with the actual situation [10]. Using the chosen factors and the survey data, we carry out the logit regression and calibrate the parameters by means of SPSS software, obtaining the model of the public transportation share ratio:
$$\ln\frac{P_1}{P_4} = 0.166 - 1.863X_{j4} - 1.185X_{j5} + 0.262X_{j6} - 0.573X_{j8} - 2.194X_{j10} + 0.851X_{j11} + 0.462X_{j12} + 1.269X_{j13} \qquad (12)$$

$$\ln\frac{P_2}{P_4} = 0.174 - 1.483X_{j4} - 1.712X_{j5} + 0.121X_{j6} + 0.931X_{j8} + 0.897X_{j10} - 1.208X_{j11} - 1.781X_{j12} + 1.019X_{j13} \qquad (13)$$

$$\ln\frac{P_3}{P_4} = 0.886 - 1.315X_{j4} - 0.989X_{j5} + 0.143X_{j6} - 1.145X_{j8} + 1.359X_{j10} - 0.734X_{j11} + 1.513X_{j12} - 0.832X_{j13} \qquad (14)$$

where P1, P2, P3 and P4 are the share ratios of public transportation, private car, motorcycle (bicycle) and walking respectively.
Share Ratio Forecast and Result Analysis. Before carrying out the share ratio forecast tests, a likelihood ratio test is conducted to check whether the constructed model is statistically significant. The result is shown in Table 3.

Table 3. Likelihood Ratio Test of the Model

 | Chi-Square | d.f. | Significance
Model | 156.018 | 11 | 0.001
Improvement | 156.018 | 11 | 0.001
From Table 3 we can see that the likelihood ratio $\chi^2$ statistic of the model is 156.018, which means the model is statistically significant; the proposed model is therefore appropriate for predicting the public transportation share ratio. Using the presented model and the questionnaire survey data, the share ratio of each traffic mode can be calculated. The comparison of the forecast results with the real values is shown in Table 4:

Table 4. Share Ratios of Competing Transportation Modes

Transportation mode | Actual value (%) | Forecast value (%) | Relative error (%)
Public transport | 20.97 | 22.36 | 6.6
Private car | 41.03 | 38.09 | 7.2
Motorcycle (bicycle) | 27.76 | 29.86 | 7.6
Walking | 10.24 | 9.69 | 5.4
As shown in Table 4, the relative error of the forecast is less than 8%, which means the model has high accuracy and is suitable for predicting the share ratios of competing transportation modes. According to the forecast result, the share ratio of public transportation is 22.36%. Although this is 10% higher than the national average level, there is still a large gap compared with the 40%-60% public transportation share ratios in Europe, Japan and South America. In expressions (12), (13) and (14), the coefficients of Xj4 (travel cost) and Xj5 (travel time) are both negative, meaning that travel cost and travel time have negative effects on mode choice; the coefficient of Xj6 (punctuality rate) is positive, meaning punctuality encourages travelers to choose that mode. Therefore we can reduce the travel cost and travel time of public transportation and improve its punctuality to enhance its attraction. Meanwhile we can apply scientific traffic demand management to private cars and motorcycles, that is, restrict the
possession and frequency of use of private cars and motorcycles, so as to efficiently improve the share ratio of public transportation and finally reach the goal of optimizing the traffic structure.
5 Summary
In this paper we have studied public transportation share ratio prediction. Based on the micro-economic analysis of traffic mode choice and on maximum random utility theory, a logit regression algorithm for traffic-state probability forecasting is put forward. The method first chooses the factors with important influence on traffic mode choice, and then estimates the parameters to construct the share ratio forecast function. Experiments on the share ratio forecast of competing transportation modes using the survey data show that the proposed model has fine accuracy and good robustness, giving it high practical application value for policy making on public transportation priority development.
References 1. Niu, X., Wang, W., Yin, Z.: Research on method of urban passenger traffic mode split forecast. Journal of highway and transportation research and development 21(3), 75–78 (2004) 2. Wang, Z., Liu, A., Zheng, P.: Generalized logit method for traffic modal splitting. Journal of Tongji University 27(3), 314–318 (1999) 3. Liu, Z., Deng, W., Guo, T.: Application of disaggregate model based on RP/SP survey to transportation planning. Journal of transportation engineering and information 6(3), 59–64 (2008) 4. Ghareib, A.H.: Evaluation of logit and probit models in mode-choice situation. Journal of transportation engineering 122(4), 282–290 (1996) 5. Liu, C.: Advanced traffic planning. China communications press, Beijing (2001) 6. Math department of Fudan University. Probability and mathematical statistics. People’s education press, Beijing (1979) 7. Wang, J., Guo, Z.: Logisitic regression models, method and application, vol. 9. Higher education press, Beijing (2001) 8. Yu, X., Ren, X.: Multivariable statistics analysis. China statistics press, Beijing (1999) 9. Hu, H., Teng, J., Gao, Y., et al.: Research on travel mode choice behavior under integrated multi-modal transit information service. China Journal of Highway and Transport 22(2), 87–92 (2009) 10. Dou, H., Wu, Z., Liu, H., et al.: Algorithm of Traffic State Probability Forecasting based on K Nearest Neighbor Nonparametric Regression. Journal of Highway and transportation research and development 27(8), 76–80 (2010)
Design of Calibration System for Vehicle Speed Monitoring Device

Junli Gao1, Haitao Song2,*, Qiang Fang3, and Xiaoqing Cai4

1 School of Automation, Guangdong Univ. of Tech., Guangzhou, China
2 School of Business Administration, South China Univ. of Tech., Guangzhou, China
3 Guangdong Institute of Metrology, Guangzhou, China
4 School of Civil Engineering & Transportation, South China Univ. of Tech., Guangzhou, China

[email protected]
Abstract. This paper presents the design of a calibration system for vehicle speed monitoring devices based on ground loop sensors, using direct digital frequency synthesis technology. The calibration system generates sinusoidal signals, with adjustable frequency and time interval, that are attached onto excitation loop sensors. The sinusoidal signal simulates a fast vehicle and couples maximally with the signal from the ground loop sensor to excite the speed monitoring device, whose performance can then be verified by the calibration system.

Keywords: Direct digital frequency synthesis, loop sensor, vehicle, calibration system.
1 Introduction

Along with the substantial increase in highway mileage and vehicle ownership in China, accurate detection of vehicle information is the key to traffic information statistics and intelligent traffic control. Automatic over-speed monitoring systems for vehicles have become important devices for guaranteeing road traffic security; they mainly include speed monitoring devices based on the principle of electromagnetic induction, radar velocimeters using the Doppler principle, and laser velocimeters [1,2]. Among them, thanks to low cost, reliability and maintainability, over-speed monitoring systems based on ground loop sensors are the most widely used. Such a system acquires traffic flow information through ground loop sensors and adjusts the release timing of vehicles at road junctions to achieve intelligent control of traffic signals, playing a key role in alleviating the traffic pressure on large and medium-sized cities. Currently, a large number of speed monitoring devices based on ground loop sensors are deployed on state highways, particularly at crucial junctions in city areas. Existing tests and annual maintenance show that the results drift and even produce false reports due to the influence of the environment, construction quality and other factors [1,3]. According to the stipulations of the Metrology Law of the P.R. China, velocimeters used in vehicle speed monitoring instruments are subject to compulsory calibration [2]. Therefore it is very important to calibrate velocimeters accurately, quickly and conveniently.
* Corresponding author.
2 System Scheme Design

The calibration methods for vehicle velocimeters mainly include real-vehicle test methods and analog-signal test methods [4,5]. The former lets a test vehicle pass through the ground loop sensors at a constant speed; the velocimeter responds and displays the measured speed, and is calibrated by comparing its value with the known constant speed. This method is simple and easy to implement and is commonly used in initial equipment tests by velocimeter manufacturers, but it has drawbacks such as greater error, lower precision, heavier workload, poorer reproducibility, operational risk and a limited speed measuring range. The analog-signal calibration method, by contrast, is the one recommended in National Metrological Verification Regulation JJG 527-2007, implemented in February 2008. It uses external signals attached onto excitation loop sensors to excite the ground loop sensors in a way consistent with real vehicles passing over them. The calibration system receives the two successive excitation signals and precisely measures the interval ΔT. The distance S between adjacent excitation loop sensors is known and equal to the distance between adjacent ground loop sensors, so S/ΔT is the standard speed value given by the calibration system. The vehicle velocimeter is calibrated by comparing its reading with this standard speed value. This method features high detection precision, simple operation, a wide measuring range and good repeatability.
Fig. 1. The calibration system scheme
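The standard-speed computation itself is elementary; a sketch with illustrative numbers (a hypothetical 4 m loop spacing and 120 ms interval, not values from this work):

```python
def standard_speed_kmh(s_m, dt_s):
    # Reference speed S/dT, converted from m/s to km/h.
    return (s_m / dt_s) * 3.6

v_ref = standard_speed_kmh(4.0, 0.120)   # 120.0 km/h reference value
v_meter = 121.0                          # hypothetical velocimeter reading
print(v_ref, f"relative error = {100 * (v_meter - v_ref) / v_ref:+.2f}%")
```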
The calibration system scheme for vehicle velocimeters based on the analog-signal calibration method is shown in Fig. 1. It takes the CPU ATmega64 as the core and controls the DDS chip AD9850 to generate one to four time- and frequency-adjustable sinusoidal signals. After voltage and power amplification, the signals are attached in succession onto the corresponding one to four excitation loop sensors, whose spacing is consistent with the corresponding ground loop sensors. The vehicle velocimeter is calibrated by comparing the value it reports, excited through the electromagnetic mutual inductance between the two types of loop sensors, with the standard speed value preset in the calibration system. Based on the principle of electromagnetic induction, the excitation loop sensors detect the sinusoidal signals actively driven into the ground loop sensors by the velocimeter. These signals pass through the frequency detection module so that their frequency can be acquired; according to this value, the ATmega64 configures the AD9850 automatically to generate excitation signals consistent with the signals in the ground loop sensors. The two kinds of signals can then produce the maximum
electromagnetic mutual inductance, enhancing the sensitivity of the calibration system. The system uses one to four excitation loop sensors to excite the velocimeter and averages several real-time speed values to improve accuracy. The keyboard and LCD module are used for parameter setting and for displaying calibration results, respectively.
3 Design of the Hardware Circuit

The system circuit mainly includes the minimum circuit of the CPU ATmega64, the sinusoidal signal generator based on the DDS chip AD9850, the corresponding signal conditioning circuit, and the frequency detection circuit that measures the frequency of the signal driven into the ground loop sensors by the velocimeter and realizes adaptive control of the excitation loop sensors.
Fig. 2. Signal generator based on AD9850
Signal Generator Based on AD9850. The sinusoidal signal generator based on the DDS chip AD9850 is shown in Fig. 2. Here Y401 is the precision clock source providing the reference clock for the AD9850. The ATmega64 writes the frequency, phase and other control data serially through the data port LOAD01, clock port WLCK01 and frequency-update clock port FQ_UD01 to the AD9850 to realize direct digital frequency synthesis; a high-fidelity sinusoidal signal is then obtained through the subsequent low-pass filter. The AD9850 has a 32-bit frequency control word, so the output frequency resolution is as fine as 0.0291 Hz with a 125 MHz clock.

Signal Conditioning Circuit. The amplitude of the sinusoid from the AD9850 is a millivolt-level signal, which can be imposed onto the excitation loop sensors only after the necessary signal conditioning. The circuit includes a non-inverting voltage amplifier and a power amplifier, as shown in Fig. 3. The voltage amplifier is superimposed with a 12 V DC bias voltage, which provides the static
Fig. 3. Signal conditioning circuit
working point for the subsequent power amplification circuit. The power amplifier, with high gain bandwidth, can amplify the sinusoidal signal up to 2 A to meet the requirements for exciting the ground loop sensors.

Frequency Detection Circuit. The frequency detection circuit is composed of a non-inverting proportional amplifier and a Schmitt trigger, as shown in Fig. 4. The sinusoidal signal in the ground loop sensors driven by the velocimeter enters the frequency detection circuit at pin 3 of the LM358, is conditioned by the proportional amplifier, and is then converted into pulse signals by the Schmitt trigger. A counter of the ATmega64 counts the pulses per unit time to calculate the frequency of the sinusoid.
Fig. 4. Frequency detection circuit
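Two of the calculations implied above can be sketched briefly: the AD9850 output obeys f_out = FTW * f_clk / 2^32, so the step with a 125 MHz clock is 125 MHz / 2^32 ≈ 0.0291 Hz, consistent with the resolution quoted earlier; and the loop-signal frequency follows from the gated pulse count. Gate time and counts below are illustrative.

```python
F_CLK = 125_000_000                    # AD9850 reference clock (Hz)

def tuning_word(f_out_hz):
    # 32-bit frequency control word: f_out = FTW * F_CLK / 2^32.
    return round(f_out_hz * 2**32 / F_CLK) & 0xFFFFFFFF

def freq_from_count(pulse_count, gate_s):
    # Frequency from pulses counted over a gate interval (timer capture).
    return pulse_count / gate_s

print(F_CLK / 2**32)                   # ~0.0291 Hz frequency resolution
print(hex(tuning_word(50_000.0)))      # e.g. a hypothetical 50 kHz loop signal
print(freq_from_count(5000, 0.1))      # 5000 pulses in 0.1 s -> 50000.0 Hz
```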
4 Design of the Application Program

The application program mainly includes the DDS signal control program, the frequency detection program for the ground loop sensors, and the human-machine interface. The DDS signal control program consists of the AD9850 reset program, the initialization program and the load program for the frequency/phase control words. The frequency detection program uses the capture function of the internal timer T/C1 of the ATmega64 to capture the output pulses per unit time shown in Fig. 4, and then calculates the frequency of the signal driven into the ground loop sensors by the velocimeter.
Fig. 5. System program flowchart: a) parameters setting; b) frequency detection; c) simulation excitation
The concrete program code is omitted here. The human-machine interface is divided into system parameter setting, frequency detection of the ground loop sensors, and simulation excitation of the vehicle velocimeter. The flowchart for system parameter setting is shown in Fig. 5a; it must be executed first, and covers the simulated vehicle speed, the driving direction, the channel separation distances corresponding to the one to four excitation loop sensors, and the output signal frequency. The frequency detection flowchart for the ground loop sensors is shown in Fig. 5b: the frequency of the signal driven into the ground loop sensors by the velocimeter is measured and saved automatically as the reference for adjusting the excitation signals of the calibration system in time. The excitation process of the vehicle velocimeter is shown in Fig. 5c: when the "OK" button is pushed, the calibration system begins a 3-second countdown and then excites the velocimeter according to the preset parameters. If the excitation is ineffective, the program returns to the parameter setting interface for re-excitation after the parameters are set again. After excitation, the speed value detected by the velocimeter is compared with the standard value preset in the calibration system to determine whether the precision of the velocimeter meets the standard specification.
5 Application and Results

First, the accuracy of the calibration system developed by our team had to be verified. We employed a time measurement instrument with accuracy better than 0.01%, provided by the Guangdong Institute of Metrology, to capture the interval at the inlets of excitation loop sensors with known separations. The interval is used to calculate the speed value actually simulated by the calibration system, and comparison with the preset speed value gives a simulation speed accuracy as high as 0.1-0.2%. Then, with the assistance of the Enping Traffic Police Detachment of Guangdong Province, we carried out field testing of the calibration system on the Enping Shahuka road section. For example, the calibration accuracy for the vehicle "Guangdong J 41839" was about 0.3-0.5%, which basically meets the design requirements. The accuracy of the calibration system can be further improved by replacing the relays with high-speed ones and optimizing the control program.
Fig. 6. Calibration system testing site
6 Summary

Integrating DDS and single-chip processor technology, we developed a calibration system for vehicle velocimeters based on ground loop sensors. An analog-signal calibration method is adopted to compare the value measured by the vehicle velocimeter with the standard value of the calibration system. The system offers strong applicability, ease of use, a wide testing range and good repeatability, and it does not affect normal traffic during testing. Moreover, with its good value for engineering applications, the system can be used not only as a measurement tool for technology and quality monitoring institutions, but also as a measuring instrument for vehicle velocimeter manufacturers.

Acknowledgments. The authors would like to thank the support of the Guangdong Province "211 Project" of the Guangdong Province Development & Reform Commission under grant [431] and the Special Foundation of the Chinese Ministry of Science and Technology under grant 2007GYJ003.
Dynamic Analysis and Numerical Simulation on the Road Turning with Ultra-High

Liang Yujuan

Department of Physics and Electronic Engineering, Hechi University, Yizhou, Guangxi 546300, China
[email protected]

Abstract. By analyzing the dynamic characteristics of vehicles on a road turning, the range of velocities in which vehicles turn safely is obtained. A single-lane cellular automaton model containing a road turning with ultra-high is proposed to simulate the effect of such a turning on traffic behavior. The results show that, within a certain range, the greater the ultra-high is, the greater the average velocity and the average flow of the system are. Therefore, reasonably setting up ultra-high on a road turning can promote the traffic capacity of the road.

Keywords: ultra-high, road turning, centripetal force, centrifugal force, cellular automaton model.
1 Introduction

With the rapid development of the vehicle industry, many traffic problems such as congestion, accidents and energy shortage have become common issues all over the world. Setting up an unimpeded and developed traffic transportation network has become a committed aim of many countries, and traffic problems have been a hot subject of research in recent years [1-15]. Scholars in different fields have put forward all kinds of models [9-15] to describe the characteristics of traffic flow. Among them, the cellular automaton model is easy to operate on a computer, and its rules can be revised flexibly to suit all kinds of actual traffic conditions; therefore, it has been widely applied and developed [1-9] in traffic flow research. The most famous cellular automaton model is the NaSch model [9], put forward by Nagel and Schreckenberg. This paper is based on the NaSch model, adopts periodic boundary conditions, and sets up a single-lane cellular automaton model containing a road turning with ultra-high to study the effect of ultra-high on the traffic flow.
2 Dynamic Analysis of Vehicles on the Road Turning

Vehicles require a centripetal force when they turn. According to Newton's laws of motion, for a road turning without ultra-high the centripetal force is provided by the normal static friction μmg [1]:
$$m\frac{v^2}{r} = \mu m g \qquad (1)$$

$$v = \sqrt{r\mu g} \qquad (2)$$
where m is the mass of the vehicle, g is the acceleration of gravity, r is the curvature radius of the road turning, μ is the static friction coefficient between the tires and the road surface, and v is the velocity of the vehicle. The left side of Equation (1), mv²/r, is the centripetal force on the vehicle. In Equation (2), v is determined by the three factors r, μ and g (g is the gravitational constant). The greater r and μ are, the greater v is, and √(rμg) is the critical velocity v_c at which vehicles can still turn safely. If and only if v ≤ v_c can the normal static friction provide the centripetal force, so that the vehicle is able to turn safely. However, the friction coefficient μ gradually decreases over the road's service life. When μmg < mv²/r, the normal static friction of the vehicle is not enough to provide the centripetal force; the centrifugal force is equal in magnitude to the required centripetal force but opposite in direction, and under its effect the vehicle will slip towards the outside of the road turning.
Fig. 1. Stress diagram of vehicle
For an ultra-high road turning, the forces on the vehicle are shown in Fig. 1, where G, N and F_μ respectively stand for gravity, the supporting force and the transverse friction force. G is decomposed into two parts, G∥ = mg sinθ and G⊥ = mg cosθ. From the condition of force balance we get G⊥ = N, and G∥ provides the centripetal force F; the centrifugal force F* is equal in magnitude to the centripetal force F but opposite in direction. The transverse friction force F_μ = F_μ1 + F_μ2, whose magnitude and direction are determined by the road conditions and the velocity of the vehicle. When G∥ is enough to provide the centripetal force, F_μ = 0; when G∥ is too small to provide the centripetal force, F_μ must reinforce it and points downwards along the slope; when the ultra-high angle θ is quite large, the vehicle tends to slip towards the inside of the road turning, and F_μ points upwards along the slope. Now, assuming the direction of F_μ is downwards along the slope, the formula of centripetal motion is:
$$m\frac{v^2}{r} = mg\sin\theta + F_\mu = mg\sin\theta + \mu m g\cos\theta \qquad (3)$$
$$v = \sqrt{rg(\sin\theta + \mu\cos\theta)} \qquad (4)$$
When θ is quite small, sinθ≈θ, cosθ≈1, therefore
$$v = \sqrt{rg(\theta + \mu)} \qquad (5)$$
Thus, we can infer that the critical velocity at which a vehicle does not slip towards the outside of an ultra-high road turning is v_c = √(rg(θ + μ)), and v_c is determined by r, μ, g and θ. The greater r, μ and θ are, the greater v_c is. Comparing (2) and (5), we see that under the same r and μ, the safe passing speed on an ultra-high road turning is higher than that on a non-ultra-high one. For example, with r = 100 m, g = 10 m/s², μ = 0.68 and θ = 6°, the road turning without ultra-high gives v_c ≈ 26.08 m/s ≈ 93.88 km/h, while the ultra-high road turning gives v_c ≈ 28.01 m/s ≈ 100.85 km/h.
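As a quick arithmetic check of the example above, the following short Python sketch evaluates Eqs. (2) and (5) with the stated values (the variable names are ours, not the paper's):

```python
import math

r, g, mu = 100.0, 10.0, 0.68
theta = math.radians(6)                    # theta = 6 degrees

v_flat = math.sqrt(r * mu * g)             # Eq. (2): without ultra-high
v_super = math.sqrt(r * g * (theta + mu))  # Eq. (5): with ultra-high

print(f"{v_flat:.2f} m/s = {v_flat * 3.6:.2f} km/h")    # ~26.08 m/s, 93.88 km/h
print(f"{v_super:.2f} m/s = {v_super * 3.6:.2f} km/h")  # ~28.01 m/s, 100.85 km/h
```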
3 Model and Rule

To simplify the problem, it is assumed that the road is divided into L = 1000 cells, each of which is either empty or occupied by one car with a velocity v = 0, 1, ..., vmax; there is only one type of car on the road and only one road turning. The road turning is located in the middle of the road, its outside is higher than its inside by h, and a deceleration section of length l is placed in front of the road turning. According to the critical velocity vc of the ultra-high road turning obtained by the dynamic analysis above, vehicles must decelerate when passing the road turning so that v ≤ vc. Therefore, the maximum velocity on the road is divided into two kinds: on the road turning section vmax = vmax2, corresponding to slow cars, and on the other sections vmax = vmax1, corresponding to fast cars. Vehicles move from left to right with periodic boundary conditions. Considering the effects of the road turning and of the delay probability p on vehicle velocity, the evolution rules of the NaSch model are revised as follows (a sketch implementing these rules is given after the list):

(1) define the maximum velocity vmax: if a vehicle is on the road turning section, take vmax = vmax2; otherwise take vmax = vmax1;
(2) define the delay probability p: if a vehicle is on the deceleration section l before the road turning and its speed v > vmax2, take p = p1; otherwise take p = p2;
(3) acceleration: vn(t) → min(vn(t)+1, vmax);
(4) deterministic deceleration to avoid accidents: vn(t) → min(vn(t), gapn(t));
(5) randomization with probability p: vn(t) → max(vn(t)-1, 0);
(6) position update: xn(t) → xn(t)+vn(t).

Here vmax is the maximum velocity of the vehicle, with vmax1 > vmax2; xn(t) and vn(t) are the position and velocity of vehicle n; gapn(t) = xn+1(t) - xn(t) - 1 denotes the number of empty cells in front of vehicle n, and xn+1(t) is the position of vehicle n+1. p = p1 and p = p2 denote the larger and the smaller delay probability, respectively. Compared with the NaSch model, steps (1) and (2) are added.
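A minimal Python sketch of one parallel update step under these rules follows; representing the turning and deceleration sections as sets of cell indices, and keeping positions sorted, are representation choices of ours rather than details fixed by the paper:

```python
import random

def step(x, v, L, vmax1, vmax2, p1, p2, turn, decel):
    """One parallel update of rules (1)-(6); x holds positions sorted ascending."""
    n = len(x)
    v_new = []
    for i in range(n):
        # (1) maximum velocity depends on whether the car is on the turning
        vmax = vmax2 if x[i] in turn else vmax1
        # (2) larger delay probability on the deceleration section when v > vmax2
        p = p1 if (x[i] in decel and v[i] > vmax2) else p2
        vi = min(v[i] + 1, vmax)                       # (3) acceleration
        gap = (x[(i + 1) % n] - x[i] - 1) % L          # gap with periodic boundary
        vi = min(vi, gap)                              # (4) safe deceleration
        if random.random() < p:
            vi = max(vi - 1, 0)                        # (5) randomization
        v_new.append(vi)
    x_new = [(x[i] + v_new[i]) % L for i in range(n)]  # (6) position update
    return x_new, v_new
```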
4 Numerical Results and Analysis

The model parameters are set as follows: one time step corresponds to 1 s and each cell to 7.5 m, so the road length L corresponds to 7.5 km; the length of the deceleration section before the turning is l = 8 cells = 60 m; the delay probabilities are p1 = 0.8 and p2 = 0.25; r = 100 m, g = 10 m/s², μ = 0.68. We take vmax1 = 5, equal to an actual velocity of 135 km/h; the critical velocity is vc = √(rg(θ + μ)) = 100.85 km/h. If vmax2 = 4 = 108 km/h, then vmax2 > vc, and the component force mg sinθ of gravity plus the maximum normal static friction cannot supply enough centripetal force; under the effect of the centrifugal force, vehicles would slip towards the outside of the road turning, possibly causing serious traffic accidents. Therefore only vmax2 = 3, 2, 1 can be taken, corresponding to actual velocities of 81 km/h, 54 km/h and 27 km/h. Suppose that the dip angle θ of the ultra-high road turning is directly proportional to the maximum velocity of vehicles, namely θ = k·vmax2, where k is a proportionality coefficient. Let k = 1; then θ = vmax2, so θ = 3 is the greatest dip angle. If N is the total number of vehicles distributed on the road of length L, the density of vehicles, the average speed and the average flow are given as follows:
the density of vehicles: $\rho = \dfrac{N}{L}$;

the average speed: $\bar{v}(t) = \dfrac{1}{N}\sum_{n=1}^{N} v_n(t)$ (time-step average), $\bar{v}(T) = \dfrac{1}{T}\sum_{t=t_0}^{t_0+T-1} \bar{v}(t)$ (time average), $\bar{v} = \dfrac{1}{S}\sum_{i=1}^{S} \bar{v}(T)$ (sample average);

the average flow: $J = \rho\bar{v}$.
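Under the same assumptions as the sketch above, a small helper shows how ρ, v̄ and J could be measured from recorded speeds (the sample average over S runs would simply repeat this over independent simulations):

```python
def observables(speed_history, N, L):
    """rho = N/L; v-bar: average over cars per step, then over steps; J = rho * v-bar."""
    rho = N / L
    v_t = [sum(vs) / N for vs in speed_history]  # time-step averages v(t)
    v_bar = sum(v_t) / len(v_t)                  # time average v(T)
    return rho, v_bar, rho * v_bar               # density, average speed, flow J
```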
In the simulation, the first t0 = 2×10⁴ time steps are discarded in order to remove transient effects, and the data are then recorded over the subsequent T = 2×10⁴ time steps. The obtained v̄(t) for each time step is the average value of vn(t); v̄(T) for each run is the average value of v̄(t) over the last T = 2×10⁴ time steps; v̄ and J are obtained by averaging over 10 simulation runs. Fig. 2 describes the dependence of the average speed and the average flow on the density for different values of θ. When the road turning does not exist, the average speed and average flow of the system in the small and middle density regions are the greatest. Fig. 2(a) shows that in the small density region, the average speed and the corresponding critical density of the free-moving state obviously increase with the inclination of the ultra-high; after the critical density is exceeded, the average speed decreases rapidly, until at very high densities the average speeds are equal in all situations. In Fig. 2(b) it can be seen that at small densities the flow of the free-flow phase is directly proportional to the density, increasing linearly; in the middle density region, the maximum of the flow increases obviously with the inclination of the ultra-high; in the high density region, the flow is directly proportional to the speed and both decrease linearly. Fig. 3 shows the space-time pattern of the road turning, with 300 lattice sites before and after it, when the vehicle density ρ is 0.15. The x-axis represents
the position of vehicles and the t-axis represents the evolution time; white dots are empty cells and black dots are cells occupied by cars. The grey areas denote smooth traffic, while the black areas mean that vehicles are jammed, with the congestion spreading backwards. The jam areas shrink as the ultra-high increases. The discontinuous jam areas indicate the stop-and-go traffic phenomenon on the road. The space-time patterns also reflect the trends of Fig. 2, which shows that the ultra-high of a road turning is one of the important factors affecting traffic flow: properly enlarging the ultra-high can reduce the speed-limiting bottleneck effect of the road turning.

Fig. 2. The relationship of (a) average speed and (b) average flow to the density for different values of θ (θ = 1, 2, 3, and straight lane)
Fig. 3. The space-time diagram of the road turning with 300 lattice sites before and after it: (a) θ = 1; (b) θ = 2; (c) θ = 3
5 Summary

Road turning is a common traffic bottleneck where traffic accidents often happen, and it is one of the important factors influencing traffic. The dynamic analysis of vehicles on the road turning shows that the speed must be smaller than the critical speed vc, which is determined by the four factors r, μ, g and θ, and that the critical speed vc of an ultra-high road turning is greater than that of a non-ultra-high one. Based on the NaSch model, the simulation result shows that, within a certain range, the greater the ultra-high is, the greater the average speed and the average flow of the system are. It is concluded that reasonably setting up ultra-high on the road turning can improve the capacity of the road.

Acknowledgement. This work is supported by the National Natural Science Foundation of China (Grant Nos. 10662002 and 10865001), the National Basic Research Program of China (Grant No. 2006CB705500), the Natural Science Foundation of Guangxi (Grant No. 2011GXNSFA018145) and the research of the Guangxi Education Department (Grant Nos. 201012MS206 and 201010LX462).
References

1. Liang, Y., Xue, Y.: Acta Phys. Sin. 59, 5325 (2010)
2. Pan, J., Xue, Y., Liang, Y., Tang, T.: Chinese Physics B 18, 4169 (2009)
3. Jia, B., Li, X., Jiang, R., Gao, Z.: Acta Phys. Sin. 58, 6845 (2009)
4. Liang, Y., Pan, J., Xue, Y.: Guangxi Phys. 30, 8 (2009)
5. Li, S., Kong, L., Liu, M.: Guangxi Sciences 15, 47 (2008)
6. Liang, Y.: Guangxi Sciences 18, 44 (2011)
7. Zhao, X., Gao, Z., Jia, B.: Physica A 385, 645 (2007)
8. Liang, Y.: Journal of Sichuan Normal University (Natural Science) 34 (2011)
9. Nagel, K., Schreckenberg, M.: J. Phys. I (France) 2, 2221 (1992)
10. Bando, M., Hasebe, K., Nakayama, A., Shibata, A., Sugiyama, Y.: Physical Review E 51, 1035 (1995)
11. Zhang, H.M.: Transportation Research B 36, 275 (2002)
12. Helbing, D., Hennecke, A., Shvetsov, V., Treiber, M.: Transportation Research B 35, 183 (2001)
13. Tian, J., Jia, B., Li, X., Gao, Z.: Chinese Physics B 19, 01051 (2010)
14. Tang, T., Huang, H., Xu, X., Xue, Y.: Chinese Physics Letters 24, 1410 (2007)
15. Liang, Y., Liang, G.: Highways & Automotive Applications (2), 36 (2011)
Solving the Aircraft Assigning Problem by the Ant Colony Algorithm

Tao Zhang, Jing Lin, Biao Qiu, and Yizhe Fu

School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai 200433, China
[email protected]

Abstract. This paper formulates the aircraft assigning problem as a vehicle routing problem and constructs a mixed integer programming model. The model considers not only the link time and link airport between two consecutive flight strings, but also the available flying time of each aircraft. To solve the problem, an Ant Colony System (ACS) combining the pheromone updating strategies of ASrank (the rank-based version of the ant system) and MMAS (the MAX-MIN ant system) is proposed. Seven groups of initial flight string sets are used to test the method, and the important parameters of the algorithm are analyzed. The numerical results show that the method can effectively reduce the total link time between consecutive flight strings and obtain satisfactory solutions with a high convergence speed.

Keywords: aircraft assigning, flight string, ant colony optimization (ACO), vehicle routing problem (VRP).
1 Introduction

The aircraft assigning problem (AAP) is an important task in an airline's daily operation, with a decisive impact on the airline's normal operation and overall efficiency. Barnhart [1] and Boland [2] researched scheduling models based on flight strings and put forward a model mainly used to solve the aircraft maintenance routing problem involving only one maintenance type. Boland [2, 3] regarded the aircraft maintenance routing problem as an asymmetric traveling salesman problem with replenishment arcs and added a replenishment arc set to the space-time network model put forward by Clarke [4]. Considering a weekly aircraft assignment model involving A-check and B-check maintenance types, Sriram and Haghani [5] constructed flight strings for one day and then built a multi-commodity network model for these flight strings. In the process of making the aircraft plan, Rexing [6] allowed the aircraft departure time to fluctuate within a time window, thus ensuring that each flight assignment has a greater opportunity of getting available aircraft. Based on these studies, Bélanger et al. [7] studied large-scale periodic airline fleet assignment with time windows and developed a new branch-and-bound strategy embedded in a branch-and-price solution strategy. Sherali [8] combined the flight assignment problem with other flight planning processes and added short-term travel demand forecasts to the traditional aircraft assignment model. Deepening the work of [8], Haouari assigned a minimum-cost path to each aircraft under the constraints and used a heuristic method based on network flow to solve the model [9]. Du Yefu [10] proposed the optimization of flight strings in the process of optimizing airline flight frequency. Li [11] studied the flight string preparation problem occurring in the process of flight assignment and built a flight string preparation model as a VRP (vehicle routing problem); that model considers not only the constraints of the flight schedule, but also the constraints of passenger flow volume. After looking into the actual demand of various domestic airlines, this paper transforms the flight assignment problem into a vehicle routing problem (VRP), studies the weekly periodic flight assignment problem, and presents the concept of the virtual flight string. The AAP is an NP-hard combinatorial optimization problem; exact algorithms can hardly solve such problems in acceptable time, so metaheuristic algorithms have become a research hotspot both at home and abroad. Dorigo and Gambardella [12] presented the Ant Colony System, which is easier to implement but has the same properties as the ant colony algorithm. Owing to the strong global search ability of the ant colony algorithm and its ability to find better solutions, this paper takes the ant colony algorithm as the solution strategy for the VRP.
2 Problem and Model

The aircraft assigning problem can be described as follows. Each flight string prepared by the commerce department is regarded as a client, each aircraft as a vehicle, and the sum of the link times between consecutive flight strings as the travel time between clients. Aircraft (vehicles) start from the warehouse (the virtual flight string) and serve all flight strings (customers); each flight string can be served exactly once by one aircraft; the weekly service time of each aircraft cannot exceed its weekly maximum available time; and the departure airport of the first flight of the first flight string (except the virtual flight string) must have available aircraft. The task is to arrange the order of the clients served by each aircraft, that is, to arrange a combination of several flight strings for each aircraft with the goal of minimizing the total link time. A flight string refers to a connection among flights, made in accordance with the "natural" links among them, and consists of several consecutive flights; flights on different flight dates belong to different flight strings. The specification of the parameters and variables in the model is as follows:
n: number of flight strings; m: number of aircraft;
C: the set of all flight strings, C = {1, 2, 3, ..., n};
V: the set of all aircraft, V = {1, 2, 3, ..., m};
A: the set of all vertexes, A = {0} ∪ C, where 0 represents the virtual flight string;
W: the set of all airports owning aircraft at the beginning of a week, W = {1, 2, 3, ..., w};
O_i: the departure airport of the first flight in flight string i;
D_i: the arrival airport of the last flight in flight string i;
t_di: the time when the first flight in flight string i leaves the airport;
t_ai: the time when the last flight in flight string i arrives at the airport;
GT: the minimum stop time for the connection between flights;
Q_k: the available flight hours of aircraft k in a week;
P_e: the total number of aircraft owned by airport e at the beginning of a week.

The definition of λ_ie is as follows:

$$\lambda_{ie} = \begin{cases} 1, & \text{if the departure airport of the first flight in flight string } i \text{ is airport } e \\ 0, & \text{otherwise} \end{cases}$$

where i ∈ C, e ∈ W. The definition of the decision variable x_ijk is as follows:

$$x_{ijk} = \begin{cases} 1, & \text{if aircraft } k \text{ executes flight string } j \text{ directly after flight string } i \\ 0, & \text{otherwise} \end{cases}$$

where i, j ∈ A, k ∈ V. In particular, x_{0ik} = 1 shows that flight string i is the first one carried out by aircraft k, in other words, that flight string i connects the virtual flight string 0. The set of flight strings connecting the virtual flight string is called the original set of client nodes. According to the description of parameters and variables above, we develop the model of the aircraft assignment VRP as follows:

$$\min \sum_{k\in V}\sum_{i\in C}\sum_{j\in C} (t_{dj} - t_{ai})\, x_{ijk} \qquad (1)$$

s.t.

$$\sum_{k\in V}\sum_{i\in A} x_{ijk} = 1, \quad \forall j \in C, \qquad (2)$$

$$\sum_{i\in A} x_{ijk} - \sum_{i\in A} x_{jik} = 0, \quad \forall k \in V,\ j \in C, \qquad (3)$$

$$O_j \cdot x_{ijk} = D_i \cdot x_{ijk}, \quad \forall i, j \in A,\ \forall k \in V, \qquad (4)$$

$$t_{dj}\, x_{ijk} \ge (t_{ai} + GT)\, x_{ijk}, \quad \forall i, j \in A,\ \forall k \in V, \qquad (5)$$

$$\sum_{i\in A}\sum_{j\in C} (t_{aj} - t_{dj})\, x_{ijk} \le Q_k, \quad \forall k \in V, \qquad (6)$$

$$\sum_{k\in V}\sum_{i\in C} x_{0ik} \cdot \lambda_{ie} \le P_e, \quad \forall e \in W, \qquad (7)$$

$$\sum_{i\in S}\sum_{j\in S} x_{ijk} \le |S| - 1, \quad \forall S \subseteq C,\ \forall k \in V. \qquad (8)$$
Formula (1) is the objective function, which minimizes the sum of link times between consecutive flight strings. Constraint (2) ensures that each flight string is served exactly once; constraint (3) ensures that each aircraft arriving at a client also leaves from that client; constraint (4) ensures that the arrival airport and the departure airport of two consecutive flight strings served by the same aircraft match; constraint (5) is the link time constraint; constraint (6) ensures that the weekly flight hours of each aircraft do not exceed its available hours for the week; constraint (7) ensures that the number of flight strings departing from airport e does not exceed the number of aircraft originally at airport e; constraint (8) eliminates sub-tours.
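For illustration only, the model could be assembled with an off-the-shelf MIP library such as the open-source PuLP package; the sketch below uses assumed container names for the input data, enforces (4) and (5) by fixing incompatible links to zero, and omits the exponentially many sub-tour constraints (8), which are usually generated lazily:

```python
import pulp

def build_model(C, V, W, O, D, t_d, t_a, GT, Q, P, lam):
    A = [0] + list(C)  # vertex 0 is the virtual flight string
    prob = pulp.LpProblem("aircraft_assigning", pulp.LpMinimize)
    x = pulp.LpVariable.dicts("x", (A, A, V), cat="Binary")
    # (1): minimize the total link time between consecutive flight strings
    prob += pulp.lpSum((t_d[j] - t_a[i]) * x[i][j][k]
                       for k in V for i in C for j in C if i != j)
    for j in C:  # (2): each flight string is served exactly once
        prob += pulp.lpSum(x[i][j][k] for k in V for i in A if i != j) == 1
    for k in V:
        for j in C:  # (3): an aircraft arriving at a client also leaves it
            prob += (pulp.lpSum(x[i][j][k] for i in A if i != j) ==
                     pulp.lpSum(x[j][i][k] for i in A if i != j))
        # (6): weekly flying hours of aircraft k
        prob += pulp.lpSum((t_a[j] - t_d[j]) * x[i][j][k]
                           for i in A for j in C if i != j) <= Q[k]
        # (4)/(5): forbid airport- or time-incompatible links outright
        for i in C:
            for j in C:
                if i != j and (D[i] != O[j] or t_d[j] < t_a[i] + GT):
                    prob += x[i][j][k] == 0
    for e in W:  # (7): no more first strings from e than aircraft stationed there
        prob += pulp.lpSum(x[0][i][k] * lam[i, e] for k in V for i in C) <= P[e]
    return prob
```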
3 Algorithm Designing

Pseudo-random Probability Selection Rule. According to the pseudo-random probability selection rule, the way an ant at client node i selects the next client node j is determined by formula (9):
$$j = \begin{cases} \arg\max_{j\in M_i}\{[\tau(i,j)]^{\alpha}\cdot[\eta(i,j)]^{\beta}\}, & \text{if } q < q_0 \\ u, & \text{else} \end{cases} \qquad (9)$$
where q0 ∈ (0, 1) is a constant; q ∈ (0, 1) is a randomly generated probability; τ(i,j) is the pheromone amount between client i and client j; η(i,j) is the heuristic factor between client i and client j; and α and β are the weights of the pheromone and of the heuristic factor in the total information. M_i represents the set of available client nodes when an ant at client node i is selecting the next client (the set of client nodes that have not been visited and that meet the transit time and airport requirements as well as the flight-hour constraints). u is a client determined by formula (10). A random q is generated before selecting the next customer: if q < q0, the ant selects the client with the maximum [τ(i,j)]^α · [η(i,j)]^β among all available clients reachable from client i and makes it the next client to visit; if q ≥ q0, the next customer is chosen according to formula (10):
$$P_k(i,u) = \begin{cases} \dfrac{[\tau(i,u)]^{\alpha}\cdot[\eta(i,u)]^{\beta}}{\sum_{j\in M_i}[\tau(i,j)]^{\alpha}\cdot[\eta(i,j)]^{\beta}}, & \text{if } u \in M_i \\ 0, & \text{else} \end{cases} \qquad (10)$$
where P_k(i,u) is the state transition probability when ant k moves from client node i to client node u.
Initial Pheromone and Local Update Rules. This paper uses the basic idea of the nearest-neighbor method to initialize the pheromone. In the process of constructing solutions, ants prefer to visit the node nearest to the current node when choosing the next node. The pheromone update includes a local update and a global update. The former means that when an ant moves from client i to client j, the pheromone of path (i, j) is locally updated according to formula (11):

$$\tau(i,j) = (1-\rho)\cdot\tau(i,j) + \rho\cdot\tau_0, \qquad (11)$$

where ρ ∈ (0, 1) is an adjustable variable representing the pheromone volatility factor; τ_0 = 1/(n·T_nn), n is the number of flight strings, and T_nn is the total link time of the initial feasible solution constructed by the nearest-neighbor method.

In ACO, when updating the global information, only the pheromone on the paths belonging to the optimal solution is updated. To make more effective use of good solutions, this paper, based on the update mode of ASrank [13], measures solutions by the total link time taken by each ant. It orders all the paths by total link time in ascending order, i.e., gaptime_1 ≤ gaptime_2 ≤ ... ≤ gaptime_nnant (nnant is the number of ants), and gives a different weight to the path of each ant, with a greater weight given to a shorter path. The weight of the best path is w. Formula (12) updates the pheromone of each path:

$$\tau(i,j) = (1-\rho)\tau(i,j) + \sum_{r=1}^{w-1}(w-r)\cdot\Delta\tau_{ij}^{r} + w\cdot\Delta\tau_{ij}^{gb}, \qquad (12)$$

where Δτ_ij^r = 1/gaptime_r, Δτ_ij^gb = 1/gaptime_gb, ρ ∈ (0, 1) is the pheromone volatility coefficient, gaptime_r is the total link time of the r-th shortest path, and gaptime_gb is the total link time of the global optimum solution. Meanwhile, this paper adopts the method of the MMAS algorithm [14] to avoid stagnation during the search: the pheromone on each path is limited to the range [τ_min, τ_max], with τ_max = nτ_0/ρ, so that excessive differences in pheromone intensity among paths do not lead to premature convergence to a local optimum.
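The two update rules, together with the MMAS-style clamping, can be sketched as follows; the edge-keyed dictionary representation and function names are assumptions of ours:

```python
def local_update(tau, i, j, rho, tau0):
    # Eq. (11): local pheromone update after an ant traverses (i, j)
    tau[i, j] = (1.0 - rho) * tau[i, j] + rho * tau0

def global_update(tau, ranked_paths, best_edges, best_time, w, rho, tau_min, tau_max):
    """Rank-based global update, Eq. (12), with clamping to [tau_min, tau_max].
    ranked_paths: [(edges, gaptime), ...] sorted by ascending total link time."""
    deposit = {}
    for r, (edges, gaptime) in enumerate(ranked_paths[: w - 1], start=1):
        for e in edges:
            deposit[e] = deposit.get(e, 0.0) + (w - r) / gaptime
    for e in best_edges:  # extra weight w for the globally best path
        deposit[e] = deposit.get(e, 0.0) + w / best_time
    for e in tau:
        tau[e] = (1.0 - rho) * tau[e] + deposit.get(e, 0.0)
        tau[e] = min(max(tau[e], tau_min), tau_max)  # MMAS bounds
```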
Construction of the Heuristic Function. The construction of the heuristic factor is a core component of the information on which solutions are built:

$$\eta(i,j) = 1/(t_{dj} - t_{ai}), \qquad (13)$$

where t_dj - t_ai is the link time between flight string i and flight string j, that is, the distance between two client nodes in the VRP. Flight string j is one of the available next client nodes of flight string i and is stored in the candidate list G_i corresponding to flight string i. The total information is calculated as:

$$total(i,j) = [\tau(i,j)]^{\alpha}[\eta(i,j)]^{\beta}. \qquad (14)$$

The greater α is, the more likely ants are to choose paths traveled by other ants, and the stronger the cooperation among ants becomes. β reflects the importance of the heuristic information in the search process: the greater β is, the more likely ants are to choose the path with shorter link time.
4 Computational Results and Analysis

In order to test the feasibility of the model and the validity of the algorithm, the model is solved using some airline's actual data as the example. The algorithm is programmed in VC++ 6.0 and runs on a Windows XP PC with a Core(TM)2 Duo CPU (1.80 GHz) and 1 GB of memory. To compare how different initial-node data affect the results, experiments are carried out for seven situations. We choose the parameter combination α = 1, β = 5, q0 = 0.9 as the best parameter combination. Table 1 contains the experimental results for the seven sets of data, averaged over the best solutions of 10 runs. From Table 1, the second group of data (the 124 flight strings scheduled for Monday and Tuesday as the initial customer node collection) converges quickly to a good solution of 101105 min, and its result is also the most stable.

Table 1. Experimental results for the seven groups of data

No. | Flight strings | Best (min) | Average (min) | Worst (min) | Iterations | Time (s)
1 | 61 | 101940 | 102082 | 102190 | 117 | 161
2 | 124 | 101105 | 101108 | 101120 | 142 | 175
3 | 188 | 101105 | 101116 | 101140 | 148 | 180
4 | 251 | 101105 | 101121 | 101145 | 159 | 183
5 | 312 | 101105 | 101124 | 101150 | 170 | 184
6 | 372 | 101120 | 101134 | 101160 | 192 | 188
7 | 432 | 101140 | 101152 | 101180 | 199 | 191
If the initial node collection contains too few scheduled flight strings, the computing time is reduced, but obtaining the optimal solution is hindered; if it contains too many scheduled flight strings, the ants' search region expands, the computing time becomes long, and obtaining the optimal solution is again hindered. Based on the best parameters given above (with the 124 scheduled flight strings as the initial customer node collection), one week of the airline's data is solved. Figure 1 shows the convergence curve of the best solution (total link time versus number of iterations).
Fig. 1. Convergence curve of the best solution
In Figure 1, the convergence is extremely fast within the first 10 iterations. Between iterations 10 and 100 the convergence rate decreases, and the best target value of 101105 is obtained at about 142 iterations. The construction of the heuristic factor, the selection of the initial customer nodes and the pheromone update strategy in this paper effectively improve the ACS algorithm. In order to compare the solutions obtained by our algorithm with those obtained by the manual method, we calculate the aircraft utilization of each, as shown in Table 2. Here,
Aircraft utilization = Total flying time / (Total flying time + Total link time).

Table 2. Comparison of aircraft utilization

Method | Total flying time (min) | Total link time (min) | Aircraft utilization
ACO algorithm | 140815 | 101105 | 58.2%
Manual arrangement | 140815 | 104900 | 57.3%
From Table 2, the aircraft utilization obtained by our algorithm is clearly higher than that of the manual method. Therefore, the model and the algorithm can effectively reduce the link time between flight strings and enhance aircraft efficiency.
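The utilization figures in Table 2 can be reproduced directly from the definition above:

```python
def utilization(flying_min, link_min):
    # Aircraft utilization = total flying time / (flying time + link time)
    return flying_min / (flying_min + link_min)

print(f"ACO:    {utilization(140815, 101105):.1%}")  # -> 58.2%
print(f"Manual: {utilization(140815, 104900):.1%}")  # -> 57.3%
```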
5 Conclusion

This paper studied the aircraft assigning problem taking the week as the unit and transformed the aircraft assigning problem into a vehicle routing problem. We proposed the virtual flight string and established a mixed integer programming model. According to the characteristics of the model, an improved ant colony algorithm was developed to solve the aircraft assigning problem. In the experiments, using actual flight data as instances, we tested our model and algorithm. The numerical results showed that the solutions obtained by our method are better than those obtained by the manual method.

Acknowledgements. This work is partially supported by the Natural Science Fund of Shanghai under Grants No. 09ZR1420400 and 09ZR1403000, the National Natural Science Fund of China under Grants No. 60773124 and 70501018, and the 211 Project for Shanghai University of Finance and Economics of China (the 3rd phase Leading Academic Discipline Program).
References

1. Barnhart, C., Boland, N., Clarke, L., Johnson, E., Nemhauser, G., Shenoi, R.: Flight string models for aircraft fleeting and routing. Transportation Science 32, 208–220 (1998)
2. Boland, N., Clarke, L., Nemhauser, G.: The asymmetric traveling salesman problem with replenishment arcs. European Journal of Operational Research 123, 408–427 (2000)
3. Mak, V., Boland, N.: Heuristic approaches to the asymmetric travelling salesman problem with replenishment arcs. International Transactions in Operational Research 7, 431–447 (2000)
4. Clarke, L., Johnson, E., Nemhauser, G., Zhu, Z.: The aircraft rotation problem. Annals of Operations Research 69, 33–46 (1997)
5. Sriram, C., Haghani, A.: An optimization model for aircraft maintenance scheduling and re-assignment. Transportation Research Part A 37, 29–48 (2003)
6. Rexing, B., Barnhart, C., Kniker, T., Jarrah, A., Krishnamurthy, N.: Airline fleet assignment with time windows. Transportation Science 34, 1–20 (2000)
7. Bélanger, N., Desaulniers, G., Soumis, F., Desrosiers, J.: Periodic airline fleet assignment with time windows, spacing constraints, and time dependent revenues. European Journal of Operational Research 175, 1754–1766 (2006)
8. Sherali, H.D., Bish, E.K., Zhu, X.: Airline fleet assignment concepts, models, and algorithms. European Journal of Operational Research 172, 1–30 (2006)
9. Haouari, M., Aissaoui, N., Mansour, F.Z.: Network flow based approaches for integrated aircraft fleeting and routing. European Journal of Operational Research 193, 591–599 (2009)
10. Du, Y.: An Optimal Method of Scheduled Flights for Civil Aircraft. Systems Engineering - Theory & Practice 8, 75–80 (1995)
11. Li, Y., Tan, N., Hao, G.: Study on flight string model and algorithm in flight scheduling. Journal of System Simulation 20, 612–615 (2008)
12. Dorigo, M., Gambardella, L.M.: Ant colonies for the travelling salesman problem. BioSystems 43, 73–81 (1997)
13. Bullnheimer, B., Hartl, R.F., Strauss, C.: A new rank based version of the ant system: A computational study. Central European Journal for Operations Research and Economics 7, 25–28 (1999)
14. Stützle, T., Hoos, H.: MAX-MIN ant system and local search for the traveling salesman problem. In: Proceedings of the IEEE International Conference on Evolutionary Computation. IEEE Press, New York (1997)
Generalization Bounds of Ranking via Query-Level Stability I

Xiangguang He1, Wei Gao2,3, and Zhiyang Jia4

1 Department of Information Engineering, Binzhou Polytechnic, Binzhou 256200, China
2 Department of Information, Yunnan Normal University, Kunming 650092, China
3 Department of Mathematics, Soochow University, Suzhou 215006, China
4 Department of Information, Tourism and Literature College of Yunnan University, Lijiang 674100, China
[email protected]

Abstract. The quality of ranking determines the success or failure of information retrieval, and the goal of ranking is to learn a real-valued ranking function that induces a ranking or ordering over an instance space. We focus on the generalization ability of learning-to-rank algorithms for information retrieval (IR). The contribution of this paper is to give generalization bounds for such ranking algorithms via uniform (strong and weak) query-level stability, obtained by deleting one element from the sample set or changing one element in the sample set. In this part we only give the corresponding definitions and list the lemmas we need; all results are shown in "Generalization Bounds of Ranking via Query-Level Stability II".

Keywords: ranking, algorithmic stability, generalization bounds, strong stability, weak stability.
1 Introduction

A key issue in information retrieval is to return useful items according to a user's request, and the items are ranked by a certain ranking function. The ranking algorithm is therefore the most important component of a search engine, because it determines the quality of the list presented to the user. The problem of ranking is formulated as learning a scoring function with small ranking error from given labeled samples. Famous ranking algorithms include RankBoost (see [1]), gradient descent ranking (see [2]), margin-based ranking (see [3]), P-Norm Push ranking (see [4]), ranking SVMs (see [5]), MFoM (see [6]), magnitude-preserving ranking (see [7]) and so on. Some theoretical analysis can be found in [8-12]. The generalization properties of ranking algorithms are a central focus of their research. Most generalization bounds for learning algorithms are based on some measure of the complexity of the hypothesis class, such as the VC dimension, covering numbers, or Rademacher complexity. However, the notion of algorithmic stability can be used to derive bounds that are tailored to specific learning algorithms and exploit their particular properties. A ranking algorithm is called stable if, for a mild change of the samples, the ranking function does not change too much.
Generalization bounds for the extension of this ranking algorithm via uniform leave-one-query-out associate-level loss stability can be found in [13]. However, uniform stability is too restrictive for many learning algorithms (see [14]), and in many applications the demand of stability should be relaxed. As the continuation of [13], this paper considers several kinds of "almost-everywhere" stability, namely strong and weak query-level stability, for the extension ranking algorithm raised in [13], and generalization bounds for such ranking algorithms are given as well. The organization of this paper is as follows: we describe the setting of the ranking problem in the next section, and then define the notions of five kinds of stability. Using these notions, we derive the first results for stable ranking algorithms.
2 Setting

Assume that query q is a random sample from the query space Q according to a probability distribution P_Q. For query q, an associate ω^(q) and its ground truth g(ω^(q)) are sampled from the space Ω × G according to a joint probability distribution D_q, where Ω is the space of associates and G is the space of ground truths. Here the associate ω^(q) can be a single document, a pair of documents, or a set of documents, and correspondingly the ground truth g(ω^(q)) can be a relevance score (or class label), an order on a pair of documents, or a permutation (list) of documents. Let l(f; ω^(q), g(ω^(q))) denote a loss (referred to as the associate-level loss) defined on (ω^(q), g(ω^(q))) and a ranking function f. The expected query-level loss is defined as

$$L(f;q) = \int_{\Omega\times G} l(f;\omega^{(q)}, g(\omega^{(q)}))\, D_q(d\omega^{(q)}, dg(\omega^{(q)})).$$

The empirical query-level loss is defined as

$$\hat{L}(f;q) = \frac{1}{n_q}\sum_{j=1}^{n_q} l(f;\omega_j^{(q)}, g(\omega_j^{(q)})),$$

where (ω_j^(q), g(ω_j^(q))), j = 1, ..., n_q, stand for the n_q associates of q, which are sampled i.i.d. according to D_q. The empirical query-level loss is an estimate of the expected query-level loss, and it can be proven that the estimate is consistent. The goal of learning to rank is to select the ranking function f which minimizes the expected query-level risk, defined as

$$R_l(f) = E_Q L(f;q) = \int_Q L(f;q)\, P_Q(dq). \qquad (1)$$

In practice, P_Q is unknown. We have the training samples (q_1, S_1), ..., (q_r, S_r), where S_i = {(ω_1^(i), g(ω_1^(i))), ..., (ω_{n_i}^(i), g(ω_{n_i}^(i)))}, i = 1, ..., r, and n_i is the number of associates for query q_i. Here q_1, ..., q_r can be viewed as data sampled i.i.d. according to P_Q, and (ω_j^(i), g(ω_j^(i))) as data sampled i.i.d. according to D_{q_i}, j = 1, ..., n_i, i = 1, ..., r. The empirical query-level risk is defined as

$$\hat{R}_l(f) = \frac{1}{r}\sum_{i=1}^{r} \hat{L}(f;q_i). \qquad (2)$$

The empirical query-level risk is an estimate of the expected query-level risk, and it can be proven that the estimate is consistent. This probabilistic formulation covers most existing learning-to-rank algorithms. If we let the associate be a single document, a pair of documents, or a set of documents, we can respectively define pointwise, pairwise, or listwise losses, and develop pointwise, pairwise, or listwise approaches to learning to rank.

(a) Pointwise Case

For the document space D, we use a feature mapping function φ: Q × D → X (= R^d) to create a d-dimensional feature vector for each query-document pair. For any query q, suppose that the feature vector of a document is x^(q) and its relevance score (or class label) is y^(q); then (x^(q), y^(q)) can be viewed as a random sample from X × R according to a probability distribution D_q. If l(f; x^(q), y^(q)) is a pointwise loss (square loss, for example), then the expected query-level loss turns out to be

$$L(f;q) = \int_{X\times R} l(f;x^{(q)}, y^{(q)})\, D_q(dx^{(q)}, dy^{(q)}).$$

Given training samples (q_1, S_1), ..., (q_r, S_r), where S_i = {(x_1^(i), y_1^(i)), ..., (x_{n_i}^(i), y_{n_i}^(i))}, i = 1, ..., r, the empirical query-level loss of query q_i (i = 1, ..., r) becomes

$$\hat{L}(f;q_i) = \frac{1}{n_i}\sum_{j=1}^{n_i} l(f;x_j^{(i)}, y_j^{(i)}).$$

(b) Pairwise Case

For any query q, z^(q) = (x_1^(q), x_2^(q)) stands for a document pair associated with it. Moreover,

$$y^{(q)} = \begin{cases} 1, & \text{if } x_1^{(q)} \text{ is ranked above } x_2^{(q)} \\ -1, & \text{otherwise} \end{cases}$$

Let Y = {1, -1}. (x_1^(q), x_2^(q), y^(q)) can be viewed as a random sample from X² × Y according to a probability distribution D_q. If l(f; z^(q), y^(q)) is a pairwise loss (hinge loss, for example), then the expected query-level loss turns out to be

$$L(f;q) = \int_{X^2\times Y} l(f;z^{(q)}, y^{(q)})\, D_q(dz^{(q)}, dy^{(q)}).$$

Given training samples (q_1, S_1), ..., (q_r, S_r), where S_i = {(z_1^(i), y_1^(i)), ..., (z_{n_i}^(i), y_{n_i}^(i))}, i = 1, ..., r, the empirical query-level loss of query q_i (i = 1, ..., r) becomes

$$\hat{L}(f;q_i) = \frac{1}{n_i}\sum_{j=1}^{n_i} l(f;z_j^{(i)}, y_j^{(i)}).$$

(c) Listwise Case

For each query q, let s^(q) denote a set of m documents associated with it, and let π(s^(q)) ∈ Π denote a permutation of the documents in s^(q) according to their relevance degrees to the query, where Π is the space of all permutations on m documents. (s^(q), π(s^(q))) can be viewed as a random sample from X^m × Π according to a probability distribution D_q. If l(f; s^(q), π(s^(q))) is a listwise loss (cross-entropy loss, for example), then the expected query-level loss turns out to be

$$L(f;q) = \int_{X^m\times\Pi} l(f;s^{(q)}, \pi(s^{(q)}))\, D_q(ds^{(q)}, d\pi(s^{(q)})).$$

Given training samples (q_1, S_1), ..., (q_r, S_r), where S_i = {(s_1^(i), π(s_1^(i))), ..., (s_{n_i}^(i), π(s_{n_i}^(i)))}, i = 1, ..., r, the empirical query-level loss of query q_i (i = 1, ..., r) becomes

$$\hat{L}(f;q_i) = \frac{1}{n_i}\sum_{j=1}^{n_i} l(f;s_j^{(i)}, \pi(s_j^{(i)})).$$
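The two-level averaging behind the empirical query-level loss and the empirical query-level risk of formula (2) can be made concrete with a small sketch; the function names and the toy pointwise data are hypothetical:

```python
def empirical_query_loss(f, S_q, loss):
    # L-hat(f; q): average associate-level loss over the n_q associates of q
    return sum(loss(f, w, g) for (w, g) in S_q) / len(S_q)

def empirical_query_risk(f, samples, loss):
    # R-hat_l(f), Eq. (2): average of the per-query empirical losses
    return sum(empirical_query_loss(f, S_i, loss) for (_, S_i) in samples) / len(samples)

# Toy pointwise example with the square loss l(f; x, y) = (f(x) - y)^2:
f = lambda x: 0.5 * x
square = lambda f, x, y: (f(x) - y) ** 2
samples = [("q1", [(1.0, 1.0), (2.0, 0.0)]), ("q2", [(0.0, 0.0)])]
print(empirical_query_risk(f, samples, square))  # 0.3125
```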
3 Definitions

Y. Lan (see [13]) defined uniform leave-one-query-out associate-level loss stability. Using the notions defined above, we define strong and weak leave-one-query-out associate-level loss stability, as well as uniform, strong and weak associate-level loss stability for changing one element in the training sample. These are also good measures of how robust a ranking algorithm is. We assume 0 < δ1, δ2, δ3, δ4
... with the order statistics X_(1), X_(2), ⋯, X_(r). The survivor function is given by

$$S(x;\theta) = e^{-x/\theta} \quad \text{for } x > 0.$$

Consider the problem of testing H_0: θ = θ_0 against H_a: θ ≠ θ_0 on the basis of observing X_(1), X_(2), ⋯, X_(r) and censoring after X_(r). In view of Theorem 8, the total time on test

$$T = \sum_{i=1}^{r} X_{(i)} + (n-r)X_{(r)}$$

is a function of the prediction sufficient (or adequate) statistic $\big(\sum_{i=1}^{r} X_{(i)},\, X_{(r)}\big)$ for predicting $X_{(m)}$, where $1 \le r \le m \le n$.
L(θ ) = ∏ f ( xi ;θ ) ⋅ ∏ S ( xi ;θ ) . i∈U
i∈C
The first product can be written as the joint distribution of X (1) , X ( 2 ) , ⋯, given by
∏ f ( x ;θ ) = (r!) f ( x i
;θ ) f ( x ( 2 ) ;θ )
(1)
f ( x ( r ) ;θ )
i∈U
=
r!
θr
exp(−
1
θ
r
∑x i =1
(i )
)
for x (1) ≤ x ( 2 ) ≤ ⋯ ≤ x ( r ) , and the second product can be approximated by
1
∏ S ( x ;θ ) = exp[− θ (n − r ) x i
(r)
].
i∈C
Then
L(θ ) = = =
r!
θ
r
r!
θ
r
r!
θ
r
exp(−
1
r
∑x θ i =1 r
(i )
1 ) ⋅ exp[− (n − r ) x ( r ) ]
θ
1 exp[− (∑ x(i ) + (n − r ) x( r ) )]
θ
i =1
T exp(− ) .
θ
X (r )
A Hypothesis Testing Using the Total Time on Test from Censored Data
441
The log-likelihood is thus given by
ln L(θ ) = ln(r!) − r ln(θ ) −
T
θ
.
Ω 0 = {θ : θ = θ 0 } and Ω a = {θ : θ ≠ θ 0 } denote two disjoint (or mutually exclusive) sets of values of θ under H 0 and H a , respectively, and let Ω = Ω 0 ∪ Ω a = {θ : θ > 0} . Let max L(θ ) denote the maximum value of L(θ)
Let
θ ∈Ω 0
for θ ∈ Ω 0 ; it is usually the likelihood function with all unknown parameters replaced by their maximum-likelihood estimators, subject to the restriction that
L(θ ) represent the θ ∈ Ω 0 . Similarly, let max θ ∈Ω θ ∈ Ω which can be obtained in a similar fashion. Under Ω 0 (or H 0 : θ = θ 0 ), we see that r!
max L(θ ) =
θ
θ ∈Ω 0
Under
Ω,
r 0
exp(−
maximum value of L(θ) for
T
θ0
).
d r T ln L(θ ) = − + 2 = 0 θ θ dθ yields the maximum-likelihood estimator of θ: ∧
T 1 r θ = = [∑ X ( i ) + ( n − r ) X ( r ) ] . r r i =1 Thus
r!
max L(θ ) =
∧
θ ∈Ω
The likelihood ratio test of region of the form
(θ ) r
T exp(− ∧ ) .
θ
H 0 : θ = θ 0 versus H a : θ ≠ θ 0 has a rejection
λ ≤ c , where
max L(θ )
∧
θ 1 1 λ= = ( ) r exp[−( − ∧ )T ] , θ0 θ max L(θ ) θ 0 θ ∈Ω θ ∈Ω 0
for some appropriate constant c. Clearly,
0 ≤ λ ≤ 1 . Note that λ is an exponential
r
function of the total time on test
T = ∑ X (i ) + (n − r ) X ( r ) . Hence, the rejection i =1
region λ ≤ c has an equivalent form in term of T, that is, appropriate constant k. (See the figure below.)
T ≤ k for some
442
S.-C. Cheng
We have Thus established the following theorem. Theorem 9. Consider an exponential distribution with mean θ. The likelihood ration test for
H 0 : θ = θ 0 against H a : θ ≠ θ 0 on the basis of observing X (1) , X ( 2) ,
X ( r ) and censoring after X ( r ) has a rejection region of the form T ≤ k ,
⋯,
r
where
T = ∑ X (i ) + (n − r ) X ( r ) is the total time on test, for some appropriate i =1
constant k. 5.2
An Illustration
Consider an exponential distribution with mean θ. Now consider a censoring situation in which n units are put on test and the observation continues until r units have failed. Suppose that the numbers of millions of revolutions of thirty ball bearings are censored after the twenty-third failure [6]. The ordered data to failure are 17.88 28.92 33.00 41.52 48.40 51.84 51.96 54.12 55.56 67.80 68.64 68.64 68.88 84.12 93.12 98.64 105.12 105.84 127.92 128.04 173.40
42.12
45.60
Assume that the number of millions of revolutions of the ball bearing has an exponential distribution with mean θ. Calculate
A Hypothesis Testing Using the Total Time on Test from Censored Data
443
23
∑X i =1
(i )
=1,661.06.
23
In accordance with Theorem 8, (
∑X i =1
(i )
),
X ( 23) ) is a prediction sufficient (or
X ( m ) (for m = 24, 25, ⋯, 30) based upon the information of the observed exponentially distributed data ( X (1) , X ( 2 ) , ⋯, X ( 23) ).
adequate) statistic for predicting
Consequently, the total time on test is given by r
T = ∑ X (i ) + ( n − r ) X ( r ) = 1,661.06 + (7)(173.40) = 2,874.86 i =1
and the maximum-likelihood estimator of θ is given by ∧
θ=
2,874.86 = 124.99. 23
: θ = 120 versus H a : θ ≠ 120 , it is usually difficult to obtain the critical number k of the rejection region T ≤ k . In this case, we may apply the In order to test H 0
following theorem.
X 1 , X 2 , ⋯, X n denote a random sample with the likelihood function L(θ). Consider testing H 0 : θ ∈ Ω 0 versus H a : θ ∈ Ω a . Under certain regularity conditions, − 2 ln(λ ) is asymptotically chi-square distributed with ν degrees of freedom under H 0 , where Theorem 10. Let
ν
= (number of independent parameters under
Ω)
-(number of independent parameters under
Ω0 )
Now we calculate ∧
1 1 θ − 2 ln(λ ) = −2r ln( ) + 2T ( − ∧ ) θ0 θ0 θ 124.99 1 1 ) + 2(2,874.86)( = −2(23) ln( − ) 120 120 124.99 = 0.0387998 that is asymptotically chi-square distributed with ν = 1 degrees of freedom under H 0 : θ = 120 . Since
444
S.-C. Cheng
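The whole computation, including the asymptotic p-value, can be checked numerically; the sketch below assumes SciPy for the chi-square tail probability and reproduces the paper's figures up to rounding:

```python
import math
from scipy.stats import chi2  # SciPy assumed available

data = [17.88, 28.92, 33.00, 41.52, 42.12, 45.60, 48.40, 51.84,
        51.96, 54.12, 55.56, 67.80, 68.64, 68.64, 68.88, 84.12,
        93.12, 98.64, 105.12, 105.84, 127.92, 128.04, 173.40]
n, r, theta0 = 30, 23, 120.0

T = sum(data) + (n - r) * data[-1]   # total time on test, ~2874.9
theta_hat = T / r                    # MLE of theta, ~124.99
stat = -2 * r * math.log(theta_hat / theta0) + 2 * T * (1/theta0 - 1/theta_hat)
p_value = chi2.sf(stat, df=1)        # ~0.84, far above 0.10

print(f"T = {T:.2f}, theta_hat = {theta_hat:.2f}")
print(f"-2 ln(lambda) = {stat:.7f}, p-value = {p_value:.3f}")
```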
Since the p-value $= P(\chi^2 > 0.0387998) > 0.10$, we do not reject $H_0: \theta = 120$.

References

1. Basu, D., Cheng, S.-C.: A note on sufficiency in coherent models. Internat. J. Math. & Math. Sci. 4(3), 571–581 (1981)
2. Buckley, J.J.: Fuzzy Statistics. Springer, Heidelberg (2004)
3. Cheng, S.-C.: Crisp Confidence Estimation of the Parameter Involving in the Distribution of the Total Time on Test for Censored Data. In: Proceedings of the Academy of Business and Information Technology Conference, ABIT 2008 (2008)
4. Cheng, S.-C., Mordeson, J.N.: A Note on the Notion of Adequate Statistics. Chinese Journal of Mathematics, ROC 13(1), 23–41 (1985)
5. Cheng, S.-C., Mordeson, J.N.: Fuzzy Confidence Estimation of the Parameter Involving in the Distribution of the Total Time on Test for Censored Data. In: Proceedings of the 2008 North American Fuzzy Information Processing Society Annual Conference, NAFIPS 2008 (2008)
6. Crowder, M.J., Kimber, A.C., Smith, R.L., Sweeting, T.J.: Statistical Analysis of Reliability Data. Chapman & Hall, London (1994)
7. Fisher, R.A.: On the mathematical foundations of theoretical statistics. Phil. Trans. Roy. Soc. London A 222, 309–368 (1922)
8. Halmos, P.R., Savage, L.J.: Applications of the Radon-Nikodym Theorem to the theory of sufficient statistics. Ann. Statist. 3, 1371–1378 (1949)
9. Lauritzen, S.L.: Sufficiency and time series analysis. Inst. Math. Statist. 11, 249–269 (1972)
10. Nair, P.S., Cheng, S.-C.: An Adequate Statistic for the Exponentially Distributed Censored Data. In: Proceedings of INTERFACE 2001, vol. 33, pp. 458–461 (2001)
11. Skibinsky, M.: Adequate subfields and sufficiency. Ann. Math. Statist. 38, 155–161 (1967)
12. Sugiura, M., Morimoto, H.: Factorization theorem for adequate σ-fields (in Japanese). Sūgaku 21, 286–289 (1969)
13. Takeuchi, K., Akahira, M.: Characterizations of prediction sufficiency (adequacy) in terms of risk functions. Ann. Statist. 3, 1018–1024 (1975)
14. Torgersen, E.N.: Prediction sufficiency when the loss function does not depend on the unknown parameter. Ann. Statist. 5, 155–163 (1977)
Collaborative Mechanism Based on Trust Network

Wei Hantian1 and Wang Furong2

1 School of Software, NanChang University, China
2 School of Foreign Language, JiangXi University of Finance and Economics, China
[email protected]

Abstract. The way people interact in collaborative environments and social networks on the Web has evolved at a rapid pace over the last few years. Web-based collaborations and cross-organizational processes typically require dynamic trust between people and services. However, finding the right partner to work on joint tasks or to solve emerging problems in such scenarios is challenging due to the scale and temporary nature of collaborations. By calculating the pessimistic value of a path, the optimistic value of a path, and the comprehensive value over all paths, the results obtained give a new method for estimating the risk probability of collaboration.

Keywords: collaborative mechanism, trust network, social network.
1 Introduction

The ways people interact in collaborative environments and social networks on the Web have evolved over recent years with the development of information technology and the global-scale economy. An effective collaborative mechanism is conducive to increasing the productivity of enterprises and individuals, and promotes social harmony and economic development. The paradigm shift from closed systems to open, loosely coupled Web-services-based systems requires new approaches to support interactions. In distributed, cross-organizational collaboration scenarios, agents register their skills and capabilities as Human-Provided Services, using the very same technology as traditional Web services, to join a professional online help and support community. This approach is inspired by crowdsourcing techniques following the Web 2.0 paradigm. Agents represent the interests of different entities in real life; each agent's goal is to achieve the interests of the entity it represents, and agents wish to achieve maximum benefit from a task. Various kinds of security risks inevitably appear in the process of cooperation, so trust relationships among agents that promote collaboration in such a network environment attract increasing attention. Trust mechanisms play an important role in cooperation in human society, and many scholars believe they are an effective way to support agent cooperation. Most current trust models are evaluation models, such as the trust evaluation models proposed by Hu and by Grandison. In this paper, an interpersonal trust mechanism based on social networks is proposed to build a trust transfer mechanism in a coordination system.
2 Related Works

2.1 Collaborative System

Formation of Collaborative System. The success probability of collaboration between agents is used as the trust index, forming a trust-based cooperative system. One agent can be the collaboration agent of others in the system, in which each entity is assigned a different level of trust. Agents with high trust values are chosen by others as collaborative partners in the future, so after a certain period of time a relatively stable cooperative system is formed.

Evolution of Collaborative System. Agents and their link relationships can be represented as a directed graph in a trust-based collaboration system. Nodes in the topology with high in-degree can provide relatively better service, and nodes can form different self-organized clusters; both the "small world" and "power law" phenomena are reflected in this cooperative structure. Reducing the average distance between agents and increasing the degree of convergence of the agent network is conducive to nodes discovering service resources. The power-law phenomenon arises because agents collaborate with a certain preference: the interconnection probabilities are uneven, and new agents are biased towards partners with higher reliability, so the degree of some nodes keeps increasing.

Social Network. A social network is the collection of social actors and the relationships between them. It can also be said that a social network is composed of multiple points (social actors) and the connections between the points (the relationships between actors). The "points" in a social network are the various social actors, and the "edges" are the various social relations between actors. Relationships can be directed or undirected, and can be expressed in multiple forms; for example, undirected edges may stand for communication relationships between members, or trade relations between countries. Social network analysis (SNA) is the quantitative study of the relationships between actors in a social network. Graph theory is the mathematical basis of social network analysis; the formal description of a social network can be a social network graph or a matrix of social relations, and in graph theory a social network can be either a digraph or an undirected graph. A social network graph is composed of a set of nodes N = {n1, n2, ..., nk} and the set of connections between nodes L = {l1, l2, ..., lm}. In an undirected graph the connections between nodes are plain lines, while in a digraph (Fig. 1) the connections are directional, drawn with arrows. The matrix of social relations (Fig. 2) is converted from the network of social relations; each matrix element represents the relationship between two actors. The matrix form is more standardized, which makes computer processing easier and is the basis for quantitative analysis by computer.

Fig. 1. Digraph of social network

Fig. 2. Matrix of social relations

Out-Degree and In-Degree. The in-degree of a node is the number of other participants who send messages to the node; the out-degree is the number of messages the node sends to other participants.

Pathway, Path, Shortest Distance. A pathway is a sequence of lines from a beginning point to an end point; one pathway can repeat participants or relationships between the beginning and the end. A path is a pathway in which all points and lines are distinct. The shortest distance is the shortest path, the most direct set of lines between two points.

Trust Network. The trust network is a subset of the network in which the participants have closer and warmer relationships with the other participants. It can be divided into four kinds based on different degrees of strictness. The following commonly used measures of network analysis are involved in the follow-up study in this paper. N-cliques: members are trusted if the distance between the participants is only one step. N-clans: as long as a node has a relationship with some member, or is not more than n steps (usually 2) away, it can be defined as a member of the trust network. K-plexes: if a node has direct relationships with k members of the trust network, then the node is a member of a trust network of size n. K-cores: if a participant has relationships with k members, it is allowed access to the trust network no matter how many members it has no connection with.
3 Trust Model Framework

Trust Model. The key to building coordination mechanisms is establishing universal trust among all parties: only when people trust each other will they cooperate. Mutual trust presupposes understanding each other, mastering certain information about the other party, and a feeling of recognition of each other; the main ways to understand each other are direct or indirect contact. Trust-based social networks can be established according to these associations. All traders are encouraged in the absence of prior contacts; as exchanges occur, the two sides establish a trust relationship based on feedback and gradually build their trust-based social network. The review assessments of collaborations represent the quality of the transactions and also reflect the trust status of both sides. Fig. 3 depicts the trust relationships between all the agents as a directed graph.
Fig. 3. Trust network
Trust Calculation over Trust Chains. In the trust network, each node of the graph is an agent. Direct trust between nodes $X_i$ and $X_j$ is described as $X_i \to X_j$; if there is no direct trust relationship between $X_i$ and $X_j$, it can be calculated through the trust transmission chains between them. For example, there may be two trust chains from A to C: $A \to F \to C$ and $A \to B \to C$. If there are many trust chains between $X_1$ and $X_n$, the trust value must be composed from the recommendations of many agents. Let $P(X_1 \to X_n)$ describe the trust probability event between $X_1$ and $X_n$; then $\overline{P(X_1 \to X_n)}$ is the distrust probability event between $X_1$ and $X_n$, and m is the number of trust chains. The probability is computed as eq. (1), and likewise the trust value as eq. (2):

$\overline{P(X_1 \to X_n)} = 1 - P(X_1 \to X_n)$    (1)

$\overline{T(X_1, X_n)} = 1 - T(X_1, X_n)$    (2)

$T_m(X_1, X_n)$ denotes the trust value of the m-th trust chain, and the overall trust value is given by eqs. (3) and (4):

$\overline{T(X_1, X_n)} = (1 - T_1(X_1, X_n)) \times (1 - T_2(X_1, X_n)) \times \cdots \times (1 - T_m(X_1, X_n)) = \prod_{i=1}^{m} \overline{T_i(X_1, X_n)}$    (3)

$T(X_1, X_n) = 1 - \prod_{i=1}^{m} \overline{T_i(X_1, X_n)}$    (4)
Trust Computing over a Path. If there is a trust path $X_1 \to X_2 \to X_3 \to \cdots \to X_n$, then according to the conditions of trust transfer, $T(X_1, X_n)$, the trust of $X_1$ in $X_n$, can be regarded as the co-occurrence probability of the trust events along $X_1, X_2, \ldots, X_{n-1}, X_n$, calculated as eq. (5):
$T(X_1, X_n) = T(X_1, X_2) \times T(X_2, X_3) \times \cdots \times T(X_{n-1}, X_n) = \prod_{i=1}^{n-1} T(X_i, X_{i+1})$    (5)
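Read as code, eq. (5) multiplies the direct trust values along one chain, and eqs. (3)–(4) combine several parallel chains through their complements. A minimal sketch; the function names and sample trust values below are illustrative, not from the paper.

#include <iostream>
#include <vector>
using namespace std;

// Eq. (5): trust along one chain is the product of the direct trust values on its edges.
double chainTrust(const vector<double>& edgeTrust) {
    double t = 1.0;
    for (double e : edgeTrust) t *= e;
    return t;
}

// Eqs. (3)-(4): the overall distrust is the product of the per-chain distrusts,
// and the overall trust is its complement.
double combinedTrust(const vector< vector<double> >& chains) {
    double distrust = 1.0;
    for (const vector<double>& c : chains) distrust *= (1.0 - chainTrust(c));
    return 1.0 - distrust;
}

int main() {
    // Two hypothetical chains A->F->C and A->B->C with illustrative edge trust values
    vector< vector<double> > chains = {{0.8, 0.9}, {0.7, 0.6}};
    cout << combinedTrust(chains) << endl;   // trust of A in C over both chains
    return 0;
}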
Direct Trust Formation. The evolution of direct trust between two nodes depends on their direct interactions or experience, from which they modify their direct trust values. Such direct interaction is shown as the direct link between A and B in the network topology of Fig. 3. This portion of trust is critically important because it does not depend on the recommendation of any trusted neighbor. So when two nodes A and B interact directly, they change their direct trust values based on the satisfaction level of that interaction, as classified in Fig. 4.
(Behavior is rated on a scale from 0 to 1: negative toward 0, neutral in the middle, positive toward 1.)
Fig. 4. Classification of behavior
Optimal Hop Length. In our model, a node can define the maximum length of a recommendation path through a field of the request packet. The upper bound of the length is related to the weighting factor and the disposition value:

$Len \le \theta / \psi$

where $\theta$ is the disposition value and $\psi$ is the decrement rate of the weighting factor per node.
4 Case Analysis

For the paths {A,B,C,D,E,F,G}, {A,H,G}, and {A,I,J,G} in Fig. 5, the algorithm calculates the pessimistic value of each path, the optimistic value of each path, and the comprehensive value over all paths. In our scenario we take θ = 0.5 and ψ = 0.05, so the maximum path length is 10. A trust relationship cannot be built if the trust value is below 0.5. From the trust values of each node in Fig. 6, the minimum value is greater than 0.6, so all the paths are effective. If the value line fluctuates badly, the trust may be more uncertain. As the number of recommendation nodes increases, the pessimistic value of one path, the optimistic value of one path, and the comprehensive value over all paths are obtained as in Fig. 7. They obviously show a big deviation. A big deviation may mean there is some risk in the collaboration, so it can be used to estimate the willingness of partners to collaborate with each other.
Fig. 5. A topology of devices with trust relationship
Fig. 6. Trust values of the nodes (1–6) along paths 1–3
Fig. 7. Pessimistic, optimistic, and comprehensive trust values as the number of recommendation nodes grows
5 Summary

The proposed collaboration solution effectively promotes the efficiency of collaboration between agents and improves the success rate of common tasks; it also improves the interactive performance of the entire network system and ensures the reliability of the selected collaborators. The paper derives collaborative mechanisms from a trust model based on interpersonal social networks. Direct trust and the trust-chain approach are used to assess the credibility of agents, and the calculation methods of trust in the cooperative system, as well as the formation and evolution mechanisms of the collaboration system, are discussed in detail. Finally, by calculating the pessimistic value of one path, the optimistic value of one path, and the comprehensive value over all paths, the results give a new method for estimating the risk probability of collaboration.
Design for PDA in Portable Testing System of UAV’s Engine Based on Wince YongHong Hu, Peng Wu, Wei Wan, and Lu Guo No. 365 Institute Northwestern Polytechnical University Xi’an, China
[email protected]

Abstract. As the core equipment of a UAV, the engine's working condition directly affects whether the UAV can fly safely and reliably. The UAV's engine therefore needs to be tested in real time to reduce safety risks. The portable testing system for the UAV's engine takes a PDA as the host computer and WinCE as the operating system, with testing software developed in EVC. Tests show that the testing software completes the test task.

Keywords: UAV Engine, Testing, Portable, WINCE.
1 Introduction

A UAV (unmanned aerial vehicle) comprises the aircraft system, the avionics system, the engine system, and the data link system. A problem in any subsystem is likely to lead directly to the failure of the UAV's combat mission, which requires comprehensive testing of all systems in advance to ensure the stability and reliability of the UAV. In ground testing, each system can be tested well with an existing testing system. However, since the company's engine test system only has a test-drive bench, the engine system cannot be tested while the UAV is flying. As the core equipment of the UAV, the engine's working condition directly affects whether the UAV can fly safely and reliably [1], so an efficient and accurate testing system is needed to test the engine in real time. While factory testing of the engine on the ground has already been automated and digitized, real-time testing can significantly improve precision and efficiency, reduce safety risks, and markedly improve the flight success rate; in the development of new models, the saved test data also provide a reference for later research and development.
2 Design of Testing System

The portable testing system for the UAV targets the oil–gas mixing ratio of the engine; the system includes the excitation, the measured unit, and the testing unit [2]. The data collection box includes the microcontroller, the data acquisition unit, pressure and flow sensors, and the power supply unit; the handheld computer includes the interface module, display module, memory module, and self-test module (Fig. 1).
Fig. 1. UAV's Engine Testing System
3 Operating System and Development Software for PDA

A PDA was chosen as the host computer because, compared with industrial machines and notebook computers, a PDA is inexpensive, has low power consumption, and is easy to carry in field work, yet is fully able to meet the requirements [3]. Embedded systems typically include both hardware and software. The hardware platform includes the CPU, memory, and I/O ports. The software includes an embedded operating system supporting multi-task real-time operation together with the applications: the applications control the operation and behavior of the system, and the operating system controls the interaction between the applications and the hardware. Since mature CPUs and reliable peripheral hardware connections are now available, software has become the bottleneck restricting the development of embedded systems. For upper-layer application developers, an embedded system needs an operating system with a highly concise and friendly interface that is reliable, easy to develop for, multi-tasking, and low-cost. Therefore, once the embedded processor and the peripheral hardware are selected, most of the work concentrates on the selection and development of the embedded software. Linux development is more difficult because system maintenance is hard and requires a higher level of technical support; WinCE development is comparatively easy, with a short development cycle, an improved kernel, a rich GUI, and powerful development tools, so we chose WinCE as the underlying operating system of the PDA. An embedded operating system is closely tied to its customization and configuration tools and its integrated development environment. For Windows CE, one does not buy the Windows CE operating system itself; one buys the Platform Builder for CE.NET integrated development environment (PB), with which a Windows CE.NET operating system can be tailored and customized to one's needs. On this platform, EVC was used to develop the serial port handling and the MFC user interface software, including several key functional units: the self-test unit, the serial read unit, the data display unit, and the data storage unit. The overall workflow of the handheld computer is as follows: the handheld computer sends a self-test command to the microcontroller to test the subsystems; if the test result is correct, it reads the normal data (an alarm is raised if values are exceeded), displays the real-time data, and stores the data in the form of a txt file.
4 PDA Functions

The handheld computer software comprises self-test, data display, data storage, and the exceeded-data alarm.
Handheld computer self-test: relying mainly on the self-test program, the computer checks the memory, memory card, touch screen, and other hardware functions.
Serial self-test of the data acquisition and processing unit: when the computer's self-test module completes, the system software automatically starts the engine self-test module. The computer sends the self-test data to start the microcontroller's self-test procedure; if the microcontroller returns an acknowledge signal to the computer, the communication function is normal.
Data display: display the real-time data read through the serial port.
Data storage: save the real-time measurements for later data analysis.
Data exceeded alarm: once abnormal real-time data are detected, an alarm is raised, as sketched below.
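A minimal sketch of the exceeded-data alarm check described above; the function name and the limits are assumptions for illustration, not taken from the testing software.

// Raise an alarm when a real-time measurement leaves its allowed range;
// FLOW_MIN and FLOW_MAX below are hypothetical placeholder limits.
bool exceeded(double value, double lo, double hi) {
    return value < lo || value > hi;
}
// usage: if (exceeded(data.flow2, FLOW_MIN, FLOW_MAX)) { /* trigger the alarm */ }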
5 Software Design

The structure of the testing software is shown in Fig. 2.
Fig. 2. Structure of Testing Software
The main window is used to display the real-time data (Fig. 3). The serial port is opened and configured as follows:

// Open COM1 for reading and writing (WinCE serial API)
HANDLE hCom = CreateFile(_T("COM1:"), GENERIC_READ | GENERIC_WRITE, 0, NULL,
                         OPEN_EXISTING, 0, NULL);
DCB dcb;
GetCommState(hCom, &dcb);    // read the current port settings
dcb.BaudRate = 9600;         // set baud rate
dcb.ByteSize = 8;            // set data bits
SetCommState(hCom, &dcb);    // apply the new settings
Fig. 3. Testing Interface Flow Chart
Real-time data are displayed in an editable text box:

SetDlgItemText(IDC_FLOW2_CUR, (CString)data.flow2);   // show the current flow value

and the content of an editable text box is read back for storage:

MyGetDlgItemText(IDC_TEMP2_CUR, data.temp2, sizeof(data.temp2));

Parameter setting is implemented with a modal dialog. A new dialog class such as CStartDlg is added, and the main dialog creates the modal dialog, which pops up with the parameter settings and the necessary dialog window:

CStartDlg dlg;
dlg.DoModal();

The benefit of a modal dialog is that the main window cannot be operated until the operation in the dialog is complete, which avoids misuse of the main window [4]. The self-test sends data through the serial port, reads the reply, and compares them; some key code follows (Fig. 4):

WriteFile(hCom, buf1, sizeof(buf1), &len, NULL);   // send the self-test command
ReadFile(hCom, buf, sizeof(buf), &len, NULL);      // read return data from the serial port
UCHAR key = (UCHAR)0xXX;   // packet head byte (value elided in the source)
// Locate the head byte in the receive buffer; memchr is used instead of strstr,
// since the single-byte key is not a null-terminated string
UCHAR* ptr = (UCHAR*)memchr(buf, key, sizeof(buf));
if (ptr != NULL) {
    ptr++;
    if (*ptr == (UCHAR)0xXX) {       // second header byte (value elided in the source)
        ptr++;
        if (*ptr == (UCHAR)0xXX) {   // third header byte (value elided in the source)
            ptr++;
            int tp[8] = {0};
            for (int i = 0; i < 8; i++)
                tp[i] = *ptr++;      // copy the 8 data bytes (loop body truncated in the source)
        }
    }
}

A pixel satisfies formula (4) of the color-difference model when both of the following conditions hold:

$1.\; (f_c(x, y) - B'_c(x, y))^2 > V'_c(x, y)$
$2.\; (f_c(x, y) - B_c(x, y))^2 > V_c(x, y)$    (4)

where $f_c(x, y)$ is the color-difference signal of the new frame, $B'_c(x, y)$ is the background color model after real-time updating, $V'_c(x, y)$ is the updated color variance model, $B_c(x, y)$ is the initial background color model, and $V_c(x, y)$ is the initial color variance model.
Taking into account that an object that stays in view for a long time should become part of the new fixed background, the color models are updated in real time according to the update formula (5):
$B'_c(x, y) = (1 - \beta) \cdot B_c(x, y) + \beta \cdot f_c(x, y)$
$V'_c(x, y) = (1 - \beta) \cdot V_c(x, y) + \beta \cdot (f_c(x, y) - B'_c(x, y))^2$    (5)
where $B_c(x, y)$ is the initial background color model, $B'_c(x, y)$ is the background color model after real-time updating, $f_c(x, y)$ is the color-difference signal of the new frame, $V_c(x, y)$ is the initial color variance model, and $V'_c(x, y)$ is the color variance model after real-time updating.
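Eq. (5) is a per-pixel exponential running-average update. A minimal sketch for one pixel of one chroma channel, assuming $\beta$ is a fixed learning rate whose value this excerpt does not give:

// Per-pixel update of the color background and variance models, eq. (5);
// beta is the learning rate (an assumed constant).
void updateColorModel(double f, double beta, double& B, double& V) {
    B = (1.0 - beta) * B + beta * f;        // B'_c(x,y): updated background color
    double d = f - B;
    V = (1.0 - beta) * V + beta * d * d;    // V'_c(x,y): updated color variance
}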
Fig. 1. Classical statistical background subtraction
Fig. 2. Improved statistical background subtraction
The results of formula (2) and formula (4) are then fused: only the pixels that satisfy both formula (2) and formula (4) are taken as true moving foreground. Applying morphological operations to the combined binary image fills holes and removes noise.

Experimental Results. In this study, the detection algorithm is applied with vehicles as the target foreground. In the daytime, the shadow cast by a vehicle strongly affects detection, sometimes increasing the computation and degrading the accuracy of classification. Figure 1 shows the foreground extracted with the classical statistical background method. Figure 2 shows the result of further extraction based on the color-difference signals, applied to the moving foreground obtained from the classical statistical background. Verification shows that the method effectively suppresses the influence of shadows on moving objects.
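A sketch of the pixel-level fusion described above. Since formula (2) does not appear in this excerpt, its luminance condition is assumed here to have the same squared-difference form as formula (4); the function name is illustrative.

// A pixel is foreground only if it passes both the luminance test (eq. 2,
// form assumed) and the color-difference test (eq. 4).
bool isForeground(double fl, double Bl, double Vl,    // luminance signal and model
                  double fc, double Bc, double Vc) {  // color-difference signal and model
    bool lumaFg   = (fl - Bl) * (fl - Bl) > Vl;
    bool chromaFg = (fc - Bc) * (fc - Bc) > Vc;
    return lumaFg && chromaFg;
}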
3 Conclusion

Aiming at outdoor environments where the brightness of moving objects changes intensely, an improved statistical background extraction method is proposed: on the basis of the classical statistical background subtraction, the two color-difference signals are used to establish a color model. When an object is illuminated, the most significant change is in brightness, so at that moment the color contrast reflects the characteristics of moving objects better than brightness does. Candidate regions of moving objects are first extracted based on luminance; these regions are then screened further by comparing them with the previously established color models, and only the regions whose difference from the color model exceeds a threshold are taken as foreground. This eliminates shadows and lighting effects. Experiments show that the method suppresses shadows and illumination changes well; its downside is that when the colors of a moving object and the background are very close, the method degrades to brightness-based foreground extraction, with the
same deterioration [4]. Further extraction of the object contours is proposed as future work.
References
1. Minamata, Y.J., Tao, L.d., Xu, G., Peng, Y.n.: Background modeling under free camera movement. Journal of Image and Graphics 13(2) (February 2008)
2. Li, X., Yan-Yan, Zhang, Y.-J.: Analysis and comparison of several background modeling methods. In: Thirteenth National Conference on Image and Graphics
3. Shuai, F., Xue, F., Xu, X.: Dynamic object detection algorithm and simulation based on background modeling. Journal of System Simulation 17(1) (January 2005)
4. Zhou, Z.Y., Hu, F.Q.: Target detection in dynamic scenes based on background modeling. Computer Engineering 34(24) (December 2008)
Application of Clustering Algorithm in Intelligent Transportation Data Analysis Long Qiong, Yu Jie, and Zhang Jinfang School of Civil Engineering, Hunan City College, Yiyang, China
[email protected]

Abstract. With the continuous development of data mining technology, applying data mining techniques to the transportation sector can serve transportation scientifically and reasonably. In intelligent transportation the analysis of traffic flow data is very important, yet analyzing traffic data intelligently is a difficult problem, so replacing the traditional data analysis and interpretation methods with new data mining techniques is both necessary and meaningful. Clustering is the process of grouping a collection of physical or abstract objects into multiple classes of similar objects. This paper describes the main data mining clustering algorithms, proposes a clustering-based method for processing traffic flow data, applies it to actual traffic data processing, and finally applies the clustering algorithm to the analysis of traffic volume data for the various vehicle types at each highway toll station.

Keywords: Clustering algorithm, intelligent transportation, data analysis, application.
1 Introduction

With the increasing popularity of the intelligent transportation system concept and the rapid development of its applications, traffic data collection and transportation system testing have become its most important parts and should be developed with priority. Basic traffic information mainly includes traffic flow, speed, vehicle spacing, vehicle type, lane share, information on illegal vehicles, and traffic accident detection information; traffic flow data and traffic information are commonly collected with induction-coil detectors. Replacing the traditional methods of data analysis and interpretation with new data mining technology is necessary and meaningful. Given the uncertainty of traffic information, a new generation of traffic data analysis system can be built on the basis of the traditional database, knowledge base, and model-base decision support system, using data warehouses, OLAP, data mining, and expert-system theory and technology. Data mining methods (classification algorithms, clustering algorithms, decision tree algorithms, time sequence algorithms, neural network algorithms, etc.) are applied to establish mining models specific to traffic information for processing traffic flow data. The traffic flow information includes the data dynamically collected by various sensors (CO/VI detectors, light intensity detectors, vehicle loop detectors, wind speed and direction
detectors, etc.), and also includes traffic speed, traffic volume, and lane occupancy rate data. However, these vast amounts of data have in the past not been effectively organized or given deep-level processing; at present traffic data are "abundant data, but lacking information." The fast-growing volume of transportation data is generally stored in databases, so how to obtain useful information from these masses of data through data mining, and how to find the interconnections among the data, become essential problems, and applied research on data mining technology in transportation will promote the development of future highways. Studying the application of data mining technology to transportation flow data is therefore meaningful work: with the continuous development of data mining technology, applying it to the transportation industry reasonably and scientifically will serve transportation effectively.
2 The Summary of Clustering Algorithm

Clustering is the process of grouping a collection of physical or abstract objects into multiple classes of similar objects. A cluster is a set of data objects such that the objects in the same cluster are similar to each other and dissimilar to the objects in other clusters. Cluster analysis can be used as a standalone tool to observe the distribution of the data and the characteristics of each cluster, so that specific clusters can be analyzed further. It can also serve as a preprocessing step for other algorithms (such as feature extraction and classification), which then operate on the generated clusters to find potential relationships. Since the quality of the clustering directly influences the analysis, data mining places the following basic requirements on clustering algorithms:
1. Scalability: many clustering algorithms are robust on small data sets, but clustering a large-scale database containing millions of data objects may bias the results. Clustering algorithms therefore need to be highly scalable.
2. Constraint-based clustering: practical applications may require clustering under various constraints. Finding data groupings that satisfy specific constraints while retaining good clustering properties is a challenging task.
3. Discovery of clusters of arbitrary shape: many clustering algorithms that determine clusters with the Euclidean or Manhattan distance tend to find clusters of similar density and size with nearly spherical shape, yet a real cluster may take any shape. Algorithms that can discover clusters of arbitrary shape are therefore very important.
4. Insensitivity to input order: some clustering algorithms are sensitive to the order of the input data; for the same data set presented in different orders, the same algorithm may produce very different clustering results.
5. Handling of high-dimensional data: a database may contain many dimensions or attributes. Many clustering algorithms are good at handling one-dimensional or low-dimensional data, but the quality of clustering is rarely guaranteed when the data are high-dimensional. Clustering algorithms therefore need to be able to handle high-dimensional data.
6. Noise robustness: in practical applications, most data sets contain isolated points and unknown, missing, or erroneous data. A clustering algorithm should be able to resist such noisy data, otherwise the quality of the clustering results cannot be ensured.
3 K-Means Algorithm

K-means is an iterative clustering algorithm: during the iterations, objects are moved between clusters until an ideal clustering is obtained, and each cluster is represented by the mean value of its objects. In the clustering produced by k-means, the objects within one cluster have high similarity, while objects in different clusters are highly dissimilar. The algorithm proceeds as follows:
(1) Randomly select k of the n data objects as the initial cluster centers;
(2) Compute the mean of each cluster and use the mean as the cluster's representative;
(3) Compute the distance between each object and these center objects, and reassign each object according to the minimum distance;
(4) Return to step (2) and recompute the mean of each (changed) cluster. This process is repeated until the criterion function no longer changes significantly or the cluster memberships stop changing.
Generally, the k-means algorithm uses the squared-error criterion, defined as:
$E = \sum_{i=1}^{k} \sum_{p \in C_i} \left\| p - m_i \right\|^2$    (1)
where E is the sum of the squared distances between all objects in the data set and their corresponding cluster centers, p is a given data object, and $m_i$ is the mean of cluster $C_i$ (both p and $m_i$ are multi-dimensional). The k-means algorithm is relatively scalable and efficient for large databases; the time complexity of the algorithm is O(tkn), where t is the number of iterations, k the number of clusters, and n the number of objects. Under normal circumstances it terminates at a local optimum. However, k-means can only be used when the mean is meaningful, so it is not applicable to categorical variables; the number of clusters to generate must be given in advance; it is very sensitive to noise and abnormal data; and it cannot process clusters of non-convex shape.
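A compact sketch of the k-means loop described above, minimizing the squared-error criterion of eq. (1). The initialization scheme, the sample data, and the iteration cap are illustrative choices, not prescribed by the paper.

#include <iostream>
#include <vector>
using namespace std;

typedef vector<double> Point;   // one data object in d dimensions

double sqDist(const Point& a, const Point& b) {
    double s = 0;
    for (size_t i = 0; i < a.size(); i++) s += (a[i] - b[i]) * (a[i] - b[i]);
    return s;
}

void kMeans(const vector<Point>& data, int k, int maxIter,
            vector<int>& label, vector<Point>& mean) {
    size_t d = data[0].size();
    mean.assign(data.begin(), data.begin() + k);   // step (1): first k objects as initial centers
    label.assign(data.size(), 0);
    for (int it = 0; it < maxIter; it++) {
        for (size_t i = 0; i < data.size(); i++) { // step (3): assign each object to the nearest mean
            int best = 0;
            for (int c = 1; c < k; c++)
                if (sqDist(data[i], mean[c]) < sqDist(data[i], mean[best])) best = c;
            label[i] = best;
        }
        for (int c = 0; c < k; c++) {              // steps (2)/(4): recompute each cluster mean
            Point m(d, 0.0); int cnt = 0;
            for (size_t i = 0; i < data.size(); i++)
                if (label[i] == c) {
                    cnt++;
                    for (size_t j = 0; j < d; j++) m[j] += data[i][j];
                }
            if (cnt > 0) {
                for (size_t j = 0; j < d; j++) m[j] /= cnt;
                mean[c] = m;
            }
        }
    }
}

int main() {
    vector<Point> data = {{1,1},{1.2,0.8},{5,5},{5.1,4.9},{9,1}};  // illustrative data
    vector<int> label; vector<Point> mean;
    kMeans(data, 2, 10, label, mean);
    for (size_t i = 0; i < data.size(); i++) cout << label[i] << " ";
    cout << endl;
    return 0;
}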
4 K-Center Algorithm

PAM, also known as the k-center-point (k-medoids) algorithm, represents each cluster by an object located near its center. First, a representative object is randomly selected for each cluster, and each remaining object is assigned to the nearest cluster according to its distance from the representatives; representative objects are then repeatedly replaced with
non-representative objects in order to improve the quality of the clustering. The algorithm proceeds as follows:
(1) Randomly select k of the data objects as the initial cluster representatives (centers);
(2) According to the center representative of each cluster, compute the distance between each object and these centers, and reassign each object according to the minimum distance;
(3) Randomly choose a non-center object $O_{random}$ and compute the total cost variance of exchanging it with a center object $O_j$;
(4) If the cost variance is negative, exchange $O_{random}$ and $O_j$ to form the k center objects of a new clustering;
(5) Return to step (2) and recompute the center point of each (changed) cluster. This process is repeated until the criterion function shows no significant change or the objects no longer move. The criterion function is the same as in the k-means algorithm. When noise and outlier data are present, the k-center algorithm is more robust than k-means, but its computation is costly, and the time complexity of the algorithm does not scale well to large databases.
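Continuing the k-means sketch above (reusing its Point and sqDist definitions), the k-medoids criterion can be evaluated with a small helper: the cost of a medoid set is the sum over objects of the dissimilarity to the nearest medoid, and a swap is accepted only when it lowers this cost. Squared Euclidean distance is used here as the dissimilarity, which is an illustrative choice.

// Total cost of a medoid set: each object contributes its dissimilarity
// to the nearest medoid.
double totalCost(const vector<Point>& data, const vector<int>& medoid) {
    double cost = 0;
    for (size_t i = 0; i < data.size(); i++) {
        double best = sqDist(data[i], data[medoid[0]]);
        for (size_t m = 1; m < medoid.size(); m++) {
            double d = sqDist(data[i], data[medoid[m]]);
            if (d < best) best = d;
        }
        cost += best;
    }
    return cost;
}
// A swap of center O_j for non-center O_random is accepted only when
// totalCost(data, swapped) - totalCost(data, medoid) < 0.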
5 Model-Based Clustering

The model-based approach assumes a model for each cluster and then looks for the data subsets that best fit these models. Such a model may be a density function describing the distribution of the data points in space, or some other function; the underlying assumption is that the data set is generated by a series of probability distributions. There are usually two directions: statistics-based methods and neural-network-based methods. COBWEB is a popular and simple incremental conceptual clustering algorithm. Its input objects are described by categorical attributes, and it creates a hierarchical clustering in the form of a classification tree. A classification tree differs from a decision tree: each node in the tree corresponds to a concept and contains a probabilistic description of that concept, which summarizes the objects classified under the node. The probabilistic description includes the probability of the concept and conditional probabilities of the form $P(A_i = V_{ij} \mid C_k)$, where $A_i = V_{ij}$ is an attribute-value pair and $C_k$ is a concept class (counts are accumulated and stored in each node to compute the probabilities). This is the difference from a decision tree, which labels branches rather than nodes and uses logical rather than probabilistic descriptors. The sibling nodes at a given level of a classification tree form a partition. To classify an object with the classification tree, a partial matching function moves down the tree along the path of "best" matching nodes. COBWEB uses a heuristic evaluation measure, called category utility, to guide the construction of the tree. Category utility (CU) is defined as:
$CU = \frac{\sum_{k=1}^{n} P(C_k) \left[ \sum_i \sum_j P(A_i = V_{ij} \mid C_k)^2 - \sum_i \sum_j P(A_i = V_{ij})^2 \right]}{n}$    (2)
where n is the number of categories $\{C_1, C_2, \ldots, C_n\}$ formed at a given level of the tree. Category utility rewards both intra-class similarity and inter-class dissimilarity:
(1) The conditional probability $P(A_i = V_{ij} \mid C_k)$ expresses intra-class similarity. The larger this value, the greater the proportion of class members that share the attribute-value pair, and the more predictable the pair is of class members.
(2) The probability $P(C_k \mid A_i = V_{ij})$ expresses inter-class dissimilarity. The larger this value, the smaller the proportion of objects in contrasting classes that share the attribute-value pair, and the more predictive the pair is of the class.
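Read as code, eq. (2) averages over the n categories the gain in expected correct attribute-value predictions. A sketch, assuming the conditional and marginal probabilities have already been estimated from the stored counts; the parameter layout is an illustrative choice.

#include <vector>
using namespace std;

// Category utility, eq. (2). Pk[k] = P(C_k); Pcond[k][i][j] = P(A_i = V_ij | C_k);
// Pmarg[i][j] = P(A_i = V_ij); the number of categories n is Pk.size().
double categoryUtility(const vector<double>& Pk,
                       const vector< vector< vector<double> > >& Pcond,
                       const vector< vector<double> >& Pmarg) {
    double base = 0;                         // sum_i sum_j P(A_i = V_ij)^2
    for (size_t i = 0; i < Pmarg.size(); i++)
        for (size_t j = 0; j < Pmarg[i].size(); j++)
            base += Pmarg[i][j] * Pmarg[i][j];
    double cu = 0;
    for (size_t k = 0; k < Pk.size(); k++) {
        double within = 0;                   // sum_i sum_j P(A_i = V_ij | C_k)^2
        for (size_t i = 0; i < Pcond[k].size(); i++)
            for (size_t j = 0; j < Pcond[k][i].size(); j++)
                within += Pcond[k][i][j] * Pcond[k][i][j];
        cu += Pk[k] * (within - base);
    }
    return cu / Pk.size();
}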
6 The Application of Clustering Algorithm in the Transportation

Clustering algorithms are widely applied in transportation. The main application areas include: cluster analysis of traffic flow for urban transportation corridor planning; cluster analysis of urban intersections for traffic management and traffic flow forecasting; extensive use of clustering in highway planning and design; and clustering-based identification of accident-prone locations on highways. Clustering algorithms fall into five categories: partition-based, hierarchical, density-based, grid-based, and model-based. The question now is how to choose an appropriate algorithm for the analysis. Density-based methods regard clusters as high-density regions of objects separated by low-density regions in the data space, and are suited to filtering noise and finding clusters of arbitrary shape. Grid-based clustering methods are suited to handling high-dimensional data sets. Model-based algorithms locate clusters by constructing spatial distribution density functions that reflect the data points. Because our data are the numbers of vehicles that passed the different toll stations, we adopt the hierarchical clustering approach together with the k-means algorithm; both are efficient and fast clustering methods. Hierarchical clustering in particular provides cluster-analysis functionality for both variables and samples across a variety of data types. Several issues should be paid attention to when selecting the clustering factors: if the chosen factors cannot meet the needs of the cluster analysis, or cannot provide good discrimination for it, the cluster analysis will be difficult.
1. The values should not differ by orders of magnitude; this can be solved by standardization.
2. The variables should not have strong linear relationships with one another.
3. The clustering factors should be strongly representative, able to reflect the traffic characteristics of the various toll stations.
Taken together, the clustering factors chosen are nine proportions: the shares of traffic volume contributed by passenger vehicle types 1–4 and by truck types 1–5 at each station. There are two types of hierarchical clustering, namely Q-type and R-type clustering. Q-type clustering clusters the samples, bringing together samples with similar characteristics and separating samples with large differences. R-type clustering clusters the variables, gathering similar variables and separating very different ones, so that a few representative variables can be chosen from each similar group for analysis; this reduces the number of variables and achieves dimension reduction. This study clusters the toll stations themselves, that is, it clusters samples, so it is a Q-type clustering. In the first step of the clustering algorithm, each toll station is regarded as one class, so the initial n toll stations form n classes. The distance between the toll stations is then calculated by some measure, and the two closest toll stations are merged into one class, so that the n classes become n − 1 classes. There are many methods to calculate the distance; here we use the Euclidean distance:
$EUCLID(x, y) = \sqrt{\sum_{i=1}^{k} (x_i - y_i)^2}$    (3)
Substituting the factor vectors into the equation gives the Euclidean distance between each pair of toll stations. Using the distances between classes, the closeness of the remaining individuals and small classes is measured, and the most intimate individual and small class are merged into one class; the between-class distance is the average distance between the individuals of the two groups. In other words, if a class contains more than one toll station, the center of the class is the mean of its elements, that is, the midpoint. This process is repeated, gathering all individuals into ever larger classes until all individuals form one class. First the distances between the toll stations are calculated; the program is as follows (completed from the truncated fragment in the source; the values of the 14 x 9 factor matrix are elided there and remain elided here):

#include <iostream>
#include <cmath>
using namespace std;

double a[14][9];   // 14 toll stations x 9 clustering factors (data values elided in the source)

int main() {
    for (int i = 0; i < 14; i++)
        for (int j = i + 1; j < 14; j++) {
            double s = 0;
            for (int k = 0; k < 9; k++)   // eq. (3): sum of squared differences
                s += (a[i][k] - a[j][k]) * (a[i][k] - a[j][k]);
            cout << "d(" << i << "," << j << ") = " << sqrt(s) << endl;
        }
    return 0;
}