Wireless Ad Hoc and Sensor Networks
Edited by Houda Labiod
First published in France in 2006 by Hermes Science/Lavoisier entitled: “Réseaux mobiles ad hoc et réseaux des capteurs sans fil”
First published in Great Britain and the United States in 2008 by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd, 6 Fitzroy Square, London W1T 5DX, UK
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

www.iste.co.uk
www.wiley.com

© ISTE Ltd, 2008
© LAVOISIER, 2006

The rights of Houda Labiod to be identified as the author of this work have been asserted by her in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Cataloging-in-Publication Data
Reseaux mobiles ad hoc et reseaux des capteurs sans fil. English.
Wireless ad hoc and sensor networks / edited by Houda Labiod.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-84821-003-5
1. Computer networks. 2. Sensor networks. 3. Wireless communication systems--Design and construction. I. Labiod, Houda. II. Title.
TK5103.2.R38713 2008
621.382'1--dc22
2007021544

British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN: 978-1-84821-003-5

Printed and bound in Great Britain by Antony Rowe Ltd, Chippenham, Wiltshire.
Table of Contents
Chapter 1. Introduction
Houda LABIOD
Chapter 2. Ad Hoc Networks: Principles and Routing
Stéphane UBÉDA
  2.1. Introduction
  2.2. Hertzian connection
    2.2.1. Physical layer impact
    2.2.2. Shared access to medium
    2.2.3. Flooding
  2.3. Routing
    2.3.1. Dynamic source routing (DSR)
    2.3.2. Ad hoc on-demand distance vector (AODV)
    2.3.3. Optimized link state routing (OLSR)
    2.3.4. Topology based on reverse-path forwarding (TBRPF)
    2.3.5. Zone-based hierarchical link state routing protocol (ZRP)
    2.3.6. Location-aided routing (LAR)
  2.4. Conclusion
  2.5. Bibliography
Chapter 3. Quality of Service Support in MANETs
Pascale MINET
  3.1. Introduction to QoS
    3.1.1. Different QoS requirements
    3.1.2. Chapter structure
  3.2. Mobile ad hoc networks and QoS objectives
    3.2.1. Characteristics of mobile ad hoc networks and QoS
      3.2.1.1. Radio interference
      3.2.1.2. Limited resources
      3.2.1.3. Large dynamicity of a mobile ad hoc network
      3.2.1.4. Broadcast and multihop transmission
      3.2.1.5. Decentralized control
    3.2.2. Routing in mobile ad hoc networks
      3.2.2.1. AODV: a reactive routing protocol
      3.2.2.2. OLSR: a proactive routing protocol
      3.2.2.3. Comparative OLSR and AODV performance evaluation
    3.2.3. Realistic QoS objectives
  3.3. QoS architecture and relative QoS state of the art
    3.3.1. Different QoS components
    3.3.2. QoS models
      3.3.2.1. INSIGNIA approach
      3.3.2.2. SWAN approach
      3.3.2.3. FQMM approach
      3.3.2.4. Cross-layering approach
    3.3.3. QoS signaling
    3.3.4. QoS routing
      3.3.4.1. Complexity of QoS routing
      3.3.4.2. QoS extension of AODV
      3.3.4.3. QoS extensions of OLSR
  3.4. An example of QoS support: QoS OLSR
    3.4.1. Description of QoS OLSR
    3.4.2. Performance evaluation
  3.5. Conclusion
    3.5.1. Summary
    3.5.2. Perspectives
  3.6. Bibliography
Chapter 4. Multicast Ad Hoc Routing
Houda LABIOD
  4.1. Introduction
  4.2. Multicast routing in MANETs: a brief state of the art
    4.2.1. Classification
    4.2.2. Summary
  4.3. SRMP
    4.3.1. Description
      4.3.1.1. Selection criteria for FG nodes
    4.3.2. Operation
      4.3.2.1. Route request phase
      4.3.2.2. Reply phase and FG node selection
      4.3.2.3. Data forwarding
    4.3.3. Maintenance procedures
      4.3.3.1. Notification of neighbor existence mechanism
      4.3.3.2. Mesh refresh mechanism
      4.3.3.3. Link repair mechanism
      4.3.3.4. Pruning scheme
  4.4. Properties
  4.5. Simulation results and analysis
  4.6. Conclusion
  4.7. Bibliography
Chapter 5. Self-organization of Ad Hoc Networks: Concepts and Impacts
Fabrice THEOLEYRE and Fabrice VALOIS
  5.1. Introduction
  5.2. Self-organization: definition and objectives
    5.2.1. Definition
    5.2.2. Principles and objectives
    5.2.3. Local or distributed decisions?
  5.3. Some key points for self-organization
    5.3.1. Emergence of global behavior from local rules
    5.3.2. Local interactions and node coordination
    5.3.3. Minimizing network state information
    5.3.4. Dynamic environment adaptation
  5.4. Self-organization: a state of the art
    5.4.1. Classification
    5.4.2. Virtual backbone
      5.4.2.1. Notations
      5.4.2.2. Connected dominating set
      5.4.2.3. Maximal independent set
      5.4.2.4. Localized minimum spanning tree
      5.4.2.5. Relative neighborhood graph
    5.4.3. Clusterization techniques
  5.5. Case study and proposition of a solution
    5.5.1. Motivations
    5.5.2. Construction of virtual topology
      5.5.2.1. Neighborhood discovery
      5.5.2.2. Backbone
      5.5.2.3. Service zones
    5.5.3. Maintenance of virtual topology
      5.5.3.1. Backbone
      5.5.3.2. Service zones
    5.5.4. Virtual topology properties
  5.6. Contribution of self-organization
    5.6.1. Energy saving
    5.6.2. Influence of self-organization on routing
      5.6.2.1. Intra-cluster routing
      5.6.2.2. Inter-cluster routing
      5.6.2.3. Performance
  5.7. Conclusion
  5.8. Bibliography
Chapter 6. Approaches to Ubiquitous Computing
Mohamed BAKHOUYA and Jaafar GABER
  6.1. Introduction
  6.2. Structured service discovery systems
    6.2.1. Systems based on an indexing mechanism
      6.2.1.1. Centralized indexing
      6.2.1.2. Decentralized indexing
    6.2.2. Systems based on distributed hash
  6.3. Unstructured service discovery systems
    6.3.1. Flooding-based mechanism
    6.3.2. Random walk-based mechanism
  6.4. Comparison between structured and unstructured systems
  6.5. Self-organizing and self-adaptive approach
    6.5.1. Server community construction approach
      6.5.1.1. SAgent server agent
      6.5.1.2. BAgent resource agent
      6.5.1.3. Mobile aAgent
    6.5.2. Request resolution
      6.5.2.1. Local reinforcement mechanism
      6.5.2.2. Global reinforcement mechanism
      6.5.2.3. Types of agents
  6.6. Simulation results
  6.7. Conclusion
  6.8. Bibliography
Chapter 7. Service Discovery Protocols for MANETs
Abdellatif OBAID and Azzedine KHIR
  7.1. Introduction
  7.2. Service discovery protocols
    7.2.1. Service discovery protocols in wired networks
      7.2.1.1. JINI
      7.2.1.2. UPnP
      7.2.1.3. SLP
    7.2.2. Service discovery in ad hoc networks
      7.2.2.1. Post-Query
      7.2.2.2. KONARK
      7.2.2.3. GSD
      7.2.2.4. Allia
    7.2.3. Service discovery with routing
      7.2.3.1. Koodli and Perkins protocol
      7.2.3.2. SEDIRAN
  7.3. Conclusion
  7.4. Bibliography
Chapter 8. Distributed Clustering in Ad Hoc Networks and Applications
Romain MELLIER and Jean-Frédéric MYOUPO
  8.1. Introduction
  8.2. State of the art
    8.2.1. Clustering in two hop clusters
      8.2.1.1. Gerla and Tsai approach
      8.2.1.2. Distributed clustering for ad hoc networks (DCA): weight notion introduction
      8.2.1.3. Distributed clustering for better mobility support: DMAC (distributed and mobility-adaptive clustering)
      8.2.1.4. Generalization of distributed approach limiting mobility impact: GDMAC
    8.2.2. Clustering at more than two hops
  8.3. Clustering in networks where mobile devices may have the same weight
  8.4. Applications
    8.4.1. Initialization problem in k hop networks
    8.4.2. Mutual exclusion in k hop networks
  8.5. Conclusion
  8.6. Bibliography
Chapter 9. Security for Ad Hoc Routing and Forwarding
Sylvie LANIEPCE
  9.1. Introduction
  9.2. Reminders on routing protocols in ad hoc networks
    9.2.1. Reactive protocols
      9.2.1.1. Dynamic source routing (DSR)
      9.2.1.2. Ad hoc on-demand distance vector (AODV) routing
    9.2.2. Proactive protocol
      9.2.2.1. Destination-sequenced distance vector (DSDV) routing
  9.3. Routing threat model in ad hoc networks
    9.3.1. Ad hoc network characterization for security
    9.3.2. Classification of attack objectives
    9.3.3. Basic attacks and security counter measures
  9.4. Routing security
    9.4.1. SRP: secure routing for mobile ad hoc networks
    9.4.2. Secure ad hoc on-demand distance vector (SAODV) routing
    9.4.3. Ariadne
    9.4.4. ARAN: authenticated routing protocol for ad hoc networks
    9.4.5. Secure dynamic source routing (SDSR)
    9.4.6. EndairA
  9.5. IP datagram forwarding security
    9.5.1. Monitoring-based techniques
      9.5.1.1. Watchdog and pathrater
      9.5.1.2. CORE: collaborative reputation
      9.5.1.3. CONFIDANT: cooperation of nodes – fairness in dynamic ad hoc networks
      9.5.1.4. SAFE: securing packet forwarding in an ad hoc network
      9.5.1.5. Improvement propositions
      9.5.1.6. Summary
    9.5.2. Technique based on packet acknowledgement
    9.5.3. Cooperative incentive techniques based on virtual money
  9.6. Conclusion
  9.7. Acknowledgements
  9.8. Bibliography
Chapter 10. Fault-Tolerant Distributed Algorithms for Scalable Systems
Sébastien TIXEUIL
  10.1. Introduction
  10.2. Distributed algorithms and wireless communications
  10.3. Fault-tolerant distributed algorithms
    10.3.1. Fault taxonomy in distributed systems
    10.3.2. Fault-tolerant algorithm categories
  10.4. The limits and problems caused by a large-scale system
    10.4.1. Hypotheses about the system
    10.4.2. Hypotheses on the applications
  10.5. Solutions for large-scale self-stabilization
    10.5.1. Restricting the nature of the faults
      10.5.1.1. Detecting and correcting errors
      10.5.1.2. Preservation of predicates
    10.5.2. Limiting the geographic extent of faults
      10.5.2.1. k-stabilization
      10.5.2.2. Time-adaptive self-stabilization
    10.5.3. Classification
    10.5.4. Limiting the classes of problems to solve
      10.5.4.1. Localized problems
      10.5.4.2. Tolerating malicious entities
  10.6. Conclusion
  10.7. Bibliography
Chapter 11. Code Mobility in Sensor Networks
Fabrício A. SILVA, Linnyer B. RUIZ, José M. NOGUEIRA, Thais R. BRAGA and Antonio A.F. LOUREIRO
  11.1. Introduction
  11.2. Concepts linked to code mobility
    11.2.1. Process and object migration
    11.2.2. Code mobility
    11.2.3. Wireless sensor networks and code mobility
  11.3. Project paradigms of code mobility systems
    11.3.1. Client/server
    11.3.2. Remote evaluation
    11.3.3. Code on demand
    11.3.4. Mobile agent
  11.4. Mobile agents
    11.4.1. Mobile agent components
    11.4.2. Mobile agent system models
      11.4.2.1. Agent model
      11.4.2.2. Life cycle model
      11.4.2.3. Computing model
      11.4.2.4. Security model
      11.4.2.5. Communication model
      11.4.2.6. Navigation model
  11.5. Modeling mobile agent systems for wireless sensor networks
    11.5.1. Agent model
    11.5.2. Life cycle model
    11.5.3. Computing model
    11.5.4. Security model
    11.5.5. Communication model
    11.5.6. Navigation model
  11.6. State of the art
    11.6.1. Remote and single hop reprogramming
    11.6.2. Multihop reprogramming
    11.6.3. Virtual machine reprogramming
    11.6.4. Mobile target location application
  11.7. Case study: mobile agents in WSN management
    11.7.1. Objectives
    11.7.2. Models
      11.7.2.1. CS model
      11.7.2.2. Mobile agent model
    11.7.3. Evaluation
      11.7.3.1. Results in relation to energy usage
      11.7.3.2. Discussion
  11.8. Conclusion
  11.9. Bibliography
Chapter 12. Vehicle-to-Vehicle Communications: Applications and Perspectives
Rabah MERAIHI, Sidi-Mohammed SENOUCI, Djamal-Eddine MEDDOUR and Moez JERBI
  12.1. Introduction
  12.2. Properties and applications
    12.2.1. Properties of VANETs
    12.2.2. VANET applications
      12.2.2.1. Alert in case of accidents
      12.2.2.2. Alert in case of abnormally slow traffic (traffic jam, roadworks, bad weather, etc.)
      12.2.2.3. Collaborative driving
      12.2.2.4. Highway hot spot
      12.2.2.5. Parking management
  12.3. State of the art and study of the existing situation
    12.3.1. Projects and consortiums
    12.3.2. Study of the existing situation
      12.3.2.1. Routing
      12.3.2.2. Data dissemination and diffusion
      12.3.2.3. Mobility models for vehicular networks
      12.3.2.4. MAC and physical layers
      12.3.2.5. Security in vehicular networks
  12.4. Conclusion
  12.5. Bibliography
List of Authors
Index
Chapter 1
Introduction
Chapter written by Houda LABIOD.

The considerable growth of the mobile services sector around the world was certainly the major phenomenon of the 1990s in the telecommunications field. The concept of ubiquitous communication (anyone, anywhere) has become an essential requirement for Internet users. High demand for mobile communications has led to the development of new multimedia services and to an evolution in user requirements in terms of throughput and universal mobility across different systems. We have witnessed the birth of 2nd generation cellular mobile telephony systems (GSM, CDMA, etc.) and the more difficult emergence of 2.5/3rd generation mobile systems (GPRS, UMTS, CDMA2000, etc.), which offer higher throughputs ranging from a few dozen kbps (GPRS) to hundreds of kbps (UMTS, EDGE).

In addition, we are witnessing a large increase in the number of mobile users in companies whose structure is becoming less rigidly organized. Employees are now more often equipped with laptop computers and spend more time working in multifunctional teams that are trans-organizational and geographically dispersed. Wireless local area networks (WLANs) were designed as data transmission systems that provide a connection independent of the physical location of the computer devices making up the network, using wireless links instead of a wired infrastructure. They are a practical and attractive networking solution providing mobility, flexibility and low deployment and usage costs. After being considered an isolated and immature technology, local wireless access now appears to be a key component in a centralized architecture integrating wireless and mobile technologies (IEEE 802.11b/a/g/n, IEEE 802.15, IEEE 802.16, IEEE 802.20, Ultra Wideband, 2G/3G).

In addition, two other adjacent domains have quickly appeared: first mobile ad hoc networks (MANETs) and, more recently, sensor networks. These networks are infrastructureless and very simple to deploy, paving the way for new applications and offering solutions in multiple environments that have no infrastructure. MANETs are found within several systems: wireless local networks (the IEEE 802.11 family, Hiperlan2), personal area networks (Bluetooth) and other systems such as home networks (HomeRF, etc.). The ease of deployment of an ad hoc network paves the way for applications which have not been able to emerge until now and offers solutions for multiple environments (remote areas, rescue zones, etc.). Network services provided in these configurations are configured and created on the fly. In the ad hoc context, the characteristics of the radio transmission medium, network mobility, hidden/exposed nodes and other factors make traditional protocols defined for wired or cellular networks inadequate. Because of this, a series of mechanisms has been proposed, addressing issues such as routing, security, applications, etc.

Wireless sensor networks, on the other hand, represent a new type of system that has emerged thanks to great technological progress in the fields of intelligent sensors, powerful processors and wireless communication protocols. It is a very promising technology which will offer a wide range of new applications in civil as well as military sectors (environment monitoring, data collection, control, etc.). This type of network, which is made up of hundreds or even thousands of elements, aims at collecting environmental data and at processing and disseminating the collected data. Its inherent characteristics are high density, node unreliability, frequent topology changes and resource limitation constraints (power, processing, memory capacity, communication, etc.). These characteristics raise new challenges for the practical deployment of this type of network, such as energy consumption optimization, self-organization, fault tolerance, security, etc.

In this book, we investigate several relevant and interesting research issues related to both types of network, such as unicast routing, multicast routing, quality of service, security, service discovery, clustering/self-organization, etc. Our main objective is to report contributions from the research communities in the fields concerned. We have decided to follow a descriptive approach in order to draw up complete states of the art of the various fields cited, to present the personal contributions of the authors, to illustrate clearly their advantages and limitations and, finally, to set out milestones for future work. Readers interested in improving their knowledge of the technical concepts will find a list of references and recent publications at the end of each chapter.
The wealth of concepts and the diversity of domains involved have led me to organize this book into 11 chapters, which are described below. After the introduction in Chapter 1, Chapter 2, written by Stéphane Ubéda, presents a detailed description of the concepts and principles used in MANETs. The author focuses on work related to unicast ad hoc routing, a critical basic function that any ad hoc network must support. Routing in ad hoc networks differs from traditional IP routing and is a particularly complex problem because of node mobility, resource limitations and the unreliability of wireless links. A range of better-suited algorithms has been proposed within the MANET workgroup of the IETF (the Internet protocol standardization organization). Three main classes quickly emerged: proactive or table-driven protocols, reactive or on-demand protocols, and hybrid protocols. The author describes these three groups and details some of the most representative protocols, such as AODV, OLSR, TBRPF and DSR. The author also reminds the reader of the major critical problems that remain, some of which are covered in the following chapters: routing (and its subclasses: unicast, multicast and hierarchical routing), mobility management, quality of service support, radio interface problems, energy consumption, scalability and security. Because ad hoc networks are completely infrastructureless, their changing topology and limited resources make quality of service support at the different levels (medium access, signaling, etc.) even more complex. By focusing on routing, Pascale Minet clearly explains these constraints in Chapter 3. A specific solution, QoS OLSR, is described. A performance evaluation carried out by simulation makes it possible to assess the stability of the routes obtained, the bypassing of overloaded network zones, load balancing between routes and the accurate use of resources.
Obviously, ad hoc routing with quality of service remains a new research field where contributions are still few. The necessity for a global hierarchical architecture for quality of service control is obvious; it will be important to study inter-layer interaction in order to optimize global performance. In addition, because of its importance for grouped communications, multicast routing has the capability to optimize resource usage and therefore conserves bandwidth, and thus seems to be an adequate technique for ad hoc networks. Houda Labiod, author of Chapter 4, covers these aspects. She provides a short state of the art on multicast ad hoc routing protocols and presents a new solution, the Source Routing-based Multicast Protocol (SRMP). The main idea is to introduce a new routing concept based on quality of connectivity which can use one or several metrics related to the ad hoc environment and/or to applications. In Chapter 5, written by Fabrice Theoleyre and Fabrice Valois, the authors discuss a critical problem for autonomous networks (ad hoc or sensor networks):
self-organization. Self-organization is a specific and inherent characteristic of this type of network; it results from the natural ability of nodes to collaborate in order to provide network services such as routing, localization, dissemination and security. The goal of most of the cited works is to define an efficient self-organized network structure that makes good use of node properties such as autonomy, dynamicity, improvement of protocols, adaptation to temporal and spatial variations of the environment, robustness and scalability. This chapter covers the notions of this domain at length by describing recent studies as well as a new approach based on the concept of virtual topology. This approach improves two mechanisms: energy conservation and routing. The omnipresence of communications and services opens up the possibility for users to access services and resources wherever and whenever they wish, from whatever terminal they use, whether fixed or mobile. This capability is at the heart of the networks discussed in Chapter 6 by Mohamed Bakhouya and Jaafar Gaber. This chapter is dedicated to the discovery and emergence of services. It presents an original self-adaptive approach based on the creation of server communities to provide the services available from the network. It is implemented with the help of an adaptive middleware inspired by concepts of the human immune system. An alternative to the traditional client/server paradigm is proposed. Chapter 7, written by Abdellatif Obaid and Azzedine Khir, discusses service discovery in a different way. The authors focus on protocol aspects when merging service discovery with routing in mobile ad hoc networks. A new mechanism called SEDIRAN is presented; it introduces improvements to the proposal by Koodli and Perkins, which is positioned above the reactive routing protocol AODV. This solution consists of adapting routing to the needs of the service discovery mechanism.
In Chapter 8, Romain Mellier and Jean-Frédéric Myoupo present a state of the art on the different clustering techniques developed for ad hoc networks, including two (or more) hop clusters. Following a presentation of the improvement of one of the discussed techniques, two applications are detailed: the problem of initialization and mutual exclusion. Sylvie Laniepce discusses routing and data transmission security in ad hoc networks in Chapter 9. Indeed, security in particular is a crucial and very complex problem for these networks. This chapter outlines the major propositions by emphasizing points of vulnerability and describing the different mechanisms and a list of possible threats. The author concludes that the proposed solutions present some limitations in terms of robustness, performance and reliability.
Chapter 10, written by Sébastien Tixeuil, discusses distributed algorithms in large-scale systems. It focuses on fault-tolerant distributed algorithms used in the context of wireless sensor networks. Following a description of related work based on a general taxonomy of faults in distributed systems, the author demonstrates that scaling is compromised by the assumptions generally made in fault-tolerant algorithms. Several solutions are then introduced, proposing techniques derived from self-stabilization in a large-scale context. Fabrício A. Silva, Linnyer B. Ruiz, José M. Nogueira and Thais R. Braga address code mobility in sensor networks in Chapter 11. The goal is to enable sensor nodes to adapt their behavior to application requirements through the use of code mobility. The authors focus on a method based on mobile agents that has been evaluated and compared to a traditional “client-server” approach. The results show that the choice of approach depends on the network’s characteristics and on the complexity of network tasks, which in turn largely depends on the code size. Chapter 12 discusses specific ad hoc networks known as vehicular ad hoc networks (VANETs), which support vehicle-to-vehicle communication and constitute a main component of intelligent transport systems (ITS). The authors study vehicle-to-vehicle communication and the associated services. They outline the state of the art by describing the major existing projects and several problems such as routing, data dissemination, mobility, access and security. We hope this book, which deals with several relevant and interesting research fields, will bring a global, realistic and critical vision of the evolution of spontaneous and autonomous networks (ad hoc and sensor networks), as emphasized by its various authors. Moreover, a large number of research directions remain to be explored.
I personally want to express my gratitude to all the authors for their very interesting contributions and the quality work accomplished, as well as to the proofreaders who had the enormous task of helping me in the final drafting of this book.
Chapter 2
Ad Hoc Networks: Principles and Routing
2.1. Introduction
Ad hoc: adjective made up of the Latin word ad meaning “towards” and of the demonstrative hoc meaning “this”: towards this purpose. Suitable for a specific use.
The study of ad hoc networks, which is the theme of this book, has become increasingly popular since the 1990s, even though we are still waiting for actual applications relying on these notions. This delay has made some in the field question the future of these networks, despite the large scientific community focused on this theme. The paradox is that this subject of study is closely tied to technology, yet it has a hard time breaking through in the lifecycle of “objects” in the telecommunications world. After presenting the major principles behind this mysterious term, ad hoc networks, the second objective of this chapter is to convince the reader that this really is a promising approach and that, even though deployments are still few and far between, the changes it will bring to society are probably as big as those achieved by cellular telephony. Before continuing this chapter, we should start by giving a definition of ad hoc networks. This first definition is not meant to be “universal”, but should enable us to establish a base for our discussions. This definition commits only its author.
Chapter written by Stéphane UBÉDA.
Figure 2.1. Example of an ad hoc network where double arrows show the possibilities of two nodes to establish a bidirectional radio connection
An ad hoc network is a group of mobile terminals independent from any infrastructure, communicating by radio waves, where each terminal offers a relay service: it accepts a message not addressed to it in order to retransmit it to another network terminal that is out of radio reach of the initial transmitter of this message (Figure 2.1). The capacity of terminals to serve as relays is the fundamental element of ad hoc networks and will be the main topic of this chapter. We will call ad hoc routing capability the capability that enables a message leaving a transmitting terminal to cross several relays in order to reach a destination terminal. In Figure 2.1, terminal A wanting to send a message to terminal C will ask either terminal B to serve as a relay, or terminals E and F in succession, for the same purpose. There are thus two possible routes to reach terminal C from terminal A. The ad hoc network has routing capability if node A is able to “know” how to forward a message to C. Although routing capability is important, it is far from being the only property that these networks must possess. In fact, the main characteristic that we will find in implementing ad hoc routing techniques is the capability of these terminals to self-organize into a network. Whether we are talking about an “open” system, where the number of terminals in the network is not known, or a “closed” system where terminals have been “pre-configured”, there is an inherent characteristic preventing the use of traditional network techniques: network dynamics. Radio range limits the terminals that are directly reachable from a given terminal, and terminal mobility and the adaptability of radio communications make the network topology extremely variable. In addition, the potential inability to reach a specific terminal at a given moment in the life of the network prevents the allocation of a specific role to any terminal.
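The notion of routes through relays can be sketched in a few lines of Python. The link list below is an assumption: it contains only the links needed for the two routes named in the text (A to C through B, and A to C through E then F), and the function names are illustrative, not part of any protocol discussed in this chapter.

```python
from collections import deque

# Bidirectional radio links assumed from the description of Figure 2.1.
# Only the two routes A-B-C and A-E-F-C are stated in the text; the exact
# edge set of the figure is an assumption of this sketch.
LINKS = [("A", "B"), ("B", "C"), ("A", "E"), ("E", "F"), ("F", "C")]


def build_neighbors(links):
    """Turn a list of bidirectional links into a neighbor table."""
    neighbors = {}
    for u, v in links:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors


def all_simple_routes(neighbors, source, destination):
    """Enumerate loop-free routes from source to destination, shortest first."""
    routes = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            routes.append(path)
            continue
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in path:  # never relay through the same node twice
                queue.append(path + [nxt])
    return routes


NEIGHBORS = build_neighbors(LINKS)
ROUTES = all_simple_routes(NEIGHBORS, "A", "C")

if __name__ == "__main__":
    for route in ROUTES:
        print(" -> ".join(route))
```

With this assumed topology the enumeration yields the one-relay route through B and the two-relay route through E and F. A real routing protocol has to obtain the same information in a distributed way, without any node holding the full link list.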
The dynamics of the network topology, whether they come from mobility or from the adaptability of radio connections, are without a doubt the element that characterizes ad hoc networks, more than the fact that they federate mobile terminals (Figure 2.2 illustrates the network topology presented in Figure 2.1 following a slight movement of two network nodes). In fact, based on a fixed infrastructure and a partitioning of space into cells, cellular networks are completely capable of managing mobile terminals. On the other hand, there are common elements between ad hoc networks and communication among the nodes of a peer-to-peer network, whose communication support is the Internet. Similarly, for certain authors, common elements are so obvious that they classify sensor networks (where it is clear that all terminals are fixed) in the ad hoc network category. For the rest of this chapter, we will also use the term node to designate a terminal integrated into an ad hoc network.
Figure 2.2. New ad hoc network configuration presented in Figure 2.1 following the relatively limited movement of node A, which loses connectivity with E, and of node E, which gains connectivity with group (BF) by getting closer to it
This enthusiasm for ad hoc networks stems from a combination of factors. On the one hand, technology has made important progress, in radiocommunication as well as in component miniaturization. On the other hand, we are witnessing a “liberalization” of the use of the electromagnetic spectrum, which until recently was the prerogative of governments. The explosion of digital cellular radiocommunications has democratized the use of radio waves and the notion of communicating objects. The decrease in hardware prices has accelerated this revolution. Today, equipment with “wireless” communication capability is commonly available (mobile phone, PDA, walkman, games console, etc.) and the notion of information available “anytime and anywhere” is understood and accepted by the general public. The transition from voice-only communication in cellular telephony to communications mixing voice and data widens the scope of application of communicating objects.
However, before being considered as part of our daily lives, ad hoc networks were, and probably still are, a military application. They were developed in the 1970s in a project called PRNet, launched by DARPA (Defense Advanced Research Projects Agency), an agency of the American Department of Defense [GUE 04, JUB 87]. In fact, the very concept of the “battlefield” prohibits the use of a pre-existing fixed communication infrastructure. The multihop communication concept removes the range problems linked to line of sight (relays are needed to get around obstacles), decreases transmission power (there is no need for a range covering the whole scene of operations), which reduces the risk of detection, and finally introduces robustness into the network: no ad hoc network node is more important than another. In a PRNet network, nodes are made up of a radio transmission system to which terminals are connected by cable. The radio transmission system implements a medium access control protocol specific to these environments and acts as a mobile router for the network layer of the ISO reference model. Project PRNet made it possible to test networks of more than 100 nodes with 400 Kbit/s transmission capabilities and, for the first time, implemented an ad hoc routing protocol. This routing protocol is influenced by those deployed in the Internet and is based on the notion of a distance vector. In 1983, project PRNet was extended by project SURAN (Survivable Radio Network), which widened the application scope beyond the battlefield. This second project attempted to solve the numerous problems raised by the first experiments; some of them still do not have final solutions to this day. The main problems are undoubtedly those linked to radio resource management and to security.
These studies led to the deployment of operational networks in the United States armed forces, such as the Single Channel Ground-Airborne Radio System (SINCGARS), Strategic Command and Control Communications (C3) and IntraTaskforce systems (ITF). In 1994, DARPA created project GloMo (Global Mobile Information System) in order to study the possibility of adapting the concepts of the growing Internet to mobile users. The objective was clearly to define the bases of an Internet for mobile users relying on heterogeneous terminals. These studies in the context of defense systems paved the way for civil applications. In 1995, the Internet Engineering Task Force (IETF) created the Mobile Ad Hoc Networks (MANET) group to study ad hoc networks in the context of the Internet [MAC 99]. The use of IP protocols, the required interoperability with current public networks and the large diversity of usage scenarios make this workgroup’s task more complex. In the context of interaction with the Internet, an ad hoc network as defined by the MANET group, sometimes simply called a MANET by some authors, is seen as a terminal-only network or stub network, i.e. one which does not have the capacity to carry transit flows. On the other hand, the MANET definition clearly makes reference to the
possibility of some nodes offering a gateway function to one or more fixed networks. The MANET group defines a certain number of characteristics inherent to ad hoc networks and sets objectives for the development of routing protocols. Some of them have already been mentioned, but we list them here because they constitute a good summary of ad hoc networks:
– dynamic topologies. The nodes of an ad hoc network move arbitrarily, which leads to a quickly and randomly changing network topology;
– variable throughputs and limited bandwidth. With wireless connections, the actual communication throughput (after deducting the effects of multiple access, fading, noise, interference, etc.) is often lower than the maximum transfer rate of the radio interface, and in all cases much lower than that of wired communications. One effect of these relatively low throughputs is that congestion will generally be the norm and not the exception, since demand will often be close to or even exceed network capacity;
– energy-constrained operation. The optimization of energy conservation is a vital criterion for the solutions proposed for ad hoc network nodes operating on batteries or on limited energy resources;
– limited physical security. Ad hoc networks are made up of mobile terminals that are generally more exposed to physical threats than wired network terminals. They are also more likely to experience eavesdropping, identity spoofing and denial of service attacks. On the other hand, the decentralized nature of their management increases the robustness of MANETs.
The goal of the MANET workgroup (the group is still active) is to develop peer-to-peer routing solutions completely dedicated to wireless mobile terminals using multihop communications. It also studies the different alternatives for addressing, security, and interaction/interfacing with higher and lower layers.
In 2003, only four protocols were retained: Optimized Link-State Routing (OLSR), Ad Hoc On Demand Distance Vector (AODV), Dynamic Source Routing (DSR), and Topology Broadcast Based on Reverse-Path Forwarding (TBRPF). OLSR, AODV and TBRPF are already at the experimental RFC stage (see footnote 1). These protocols will be explained in more detail later in this chapter along with certain other routing protocols not recognized by the IETF.
1. RFCs (Requests For Comments) are the set of texts defining how the Internet works. They describe, specify, help implement, standardize and debate the majority of norms, standards, technologies and protocols related to the Internet. Experimental RFCs do not constitute standards, but are referenced by the IETF.
2.2. Hertzian connection
The communication medium of ad hoc networks is obviously radio waves. While in wired networks the physical layer can be set aside thanks to abstractions and models that are accepted today as relatively reliable, the same is not true of wireless networks [GLI 04]. Compared with propagation in the wired world, Hertzian or wireless propagation can lead to bit error rates (BER) up to 10 times higher. This produces very unpredictable connections, whose characteristics can change over time across a very wide range. This section does not aspire to be a complete overview of radio channel modeling for digital transmission; it aims to describe the different phenomena that impact protocols and to provide a basic vocabulary to describe them.
2.2.1. Physical layer impact
Without going into too much detail on wireless propagation, in theory a signal transmitted at a given power has, at distance d from the source, a power density that is inversely proportional to the surface of the sphere of radius d centered on the source; attenuation with distance is therefore inversely proportional to the square of the distance. The receiver in turn applies an attenuation to the received signal that depends on the performance of the receiving equipment, which we model by a simple multiplying factor. Finally, for the received signal to be decoded, it must be distinguishable from the disruptions it experiences, more precisely from noise and interference. We thus say that the signal to interference plus noise ratio (SINR) must be higher than a specific threshold K. We can therefore define the coverage zone of a transmitter, based on its transmission power Pe and on a level of interference I, as the set of points at distance d from the source satisfying the following formula:
$$\frac{M \times P_e}{I \times d^{2}} \geq K \qquad [2.1]$$
This formula illustrates attenuation of the signal in free space. Parameter M hides the different characteristics of transmitters and receivers. We will now focus on the interference phenomenon where the effects will be hidden in parameter I. In the first place, a certain number of electromagnetic phenomena disrupt the quality of signal received when the transmission environment is no longer free space. The most important is the shadowing effect produced by obstacles located in
the direct trajectory between transmitter and receiver. The Earth’s atmosphere does not have the same properties as free space. The presence of trees, buildings, vehicles and any surface relief modifies the quality of the signal received. Obstacles also disrupt the signal through phenomena such as reflection, diffraction and diffusion (see Figure 2.3).
Figure 2.3. a) A reflection phenomenon, b) a diffraction phenomenon and c) a diffusion phenomenon
These different phenomena also generate effects known as multipath, meaning that the receiver receives waves from the same transmitter which may have followed different paths. These multiple paths have positive effects, such as making it possible to receive the signal where no direct path exists, but also negative effects: they introduce delays which can greatly disrupt reception, and even cancel it entirely. Finally, wireless propagation is also disrupted by the presence of electric or electronic equipment and by meteorological phenomena. All these disruptions are modeled by modifying the attenuation exponent applied to distance, denoted α, which goes from 2 in free space to a value between 2 and 4 depending on the environment:
$$\frac{M \times P_e}{I \times d^{\alpha}} \geq K \qquad [2.2]$$
This approximation is often sufficient for the models needed to develop protocols, except when indoor behavior has to be modeled. When we address applications which must be deployed within buildings, it becomes difficult to consider this approximation valid, and it becomes very difficult to take all phenomena into account in order to calibrate the different ad hoc network protocols. As an example, Figure 2.4 illustrates the result of a simulation of a Wi-Fi transmitter (802.11 technology) obtained inside a building.
Figure 2.4. Example of the result of an indoor propagation simulation, where we notice that the coverage zone is far from being circular as is often assumed for simplification
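As a rough illustration of formulas [2.1] and [2.2], the coverage condition can be solved for d to obtain the radius of the idealized circular coverage zone, d ≤ (M × Pe / (I × K))^(1/α). The sketch below does this in Python; the numerical values are hypothetical and unit-free, chosen only to show how strongly the exponent α affects the result. It describes the simplified model only, not the irregular coverage shown in Figure 2.4.

```python
def max_range(m, pe, interference, k, alpha=2.0):
    """Largest distance d such that (m * pe) / (interference * d**alpha) >= k."""
    return (m * pe / (interference * k)) ** (1.0 / alpha)


def in_coverage(d, m, pe, interference, k, alpha=2.0):
    """True if a receiver at distance d satisfies the condition of formula [2.2]."""
    return m * pe / (interference * d ** alpha) >= k


# Hypothetical, unit-free values (assumptions of this sketch, not measurements).
M, PE, I, K = 1.0, 100.0, 1e-4, 10.0

# alpha = 2 is free space (formula [2.1]); larger values model harsher environments.
RANGES = {alpha: max_range(M, PE, I, K, alpha) for alpha in (2.0, 3.0, 4.0)}

if __name__ == "__main__":
    for alpha, radius in RANGES.items():
        print(f"alpha = {alpha}: coverage radius = {radius:.1f}")
```

With these values, raising α from 2 to 4 shrinks the coverage radius by more than an order of magnitude, which gives an idea of why a model calibrated for free space says little about an indoor deployment.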
The set of all disruptions that we have just described modifies the behavior of the signal’s propagation and justifies the extremely unpredictable wireless propagation character, and thus the difficulty of modeling these phenomena to adapt protocols. Potential node mobility in an ad hoc network further complicates the coverage zone forecast or the point-to-point communication quality forecast. However, the main disruptions in a mobile radiocommunication system, whether for a cellular network or an ad hoc network, come from the system itself. In fact, in an ad hoc network it is obvious that several communications will be simultaneously active and disrupting each other. In cellular radiocommunications, this problem is reduced by achieving a centralized control management of radio resources and by using different channels. In an ad hoc network, the absence of a fixed infrastructure, of centralized control and potential node mobility make any prior management of radio resource impossible. We are then left with anticipating interference and designing protocols capable of adapting to disruptions. Traditionally, ad hoc networks work by using only one communication channel or carrier, which can be reused. We will now describe the main problems related to using the radio medium that ad hoc networks will have to face. First, as we have defined the notion of the coverage zone, we will now define the interference zone. This is the zone in which a transmitted signal is too weak to be decoded, but its strength, considered as noise, may disrupt communication reception. This zone extends beyond the reception zone and prohibits the use of this transmission channel by any node, both for transmitting and receiving. This interference zone makes the problem more complicated as a node can disrupt another network node (in its interference zone) with which it will never be able to exchange a coordinating message (outside its coverage zone).
Figure 2.5. Considering the connections in a traditional graph model, routes [A,B,C] and [D,E,F] are independent (this is the case for wired networks); in a radio network, the coverage zone of node B contains node E, which makes it impossible for these two nodes to transmit simultaneously, contradicting the route-based graph model
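The situation described in the caption of Figure 2.5 can be sketched as follows. The two routes come from the caption; the `HEARS` relation, which lists the pairs of nodes that receive each other's transmissions, is an assumption built from the two routes plus the B and E pair mentioned in the caption. The function names are illustrative.

```python
ROUTE_1 = ["A", "B", "C"]
ROUTE_2 = ["D", "E", "F"]

# Pairs of nodes that hear each other's transmissions. The B-E pair is taken
# from the caption of Figure 2.5; the other pairs simply follow the two routes.
HEARS = {
    frozenset(pair)
    for pair in [("A", "B"), ("B", "C"), ("D", "E"), ("E", "F"), ("B", "E")]
}


def node_disjoint(route_1, route_2):
    """Graph-model view: two routes are independent if they share no node."""
    return not set(route_1) & set(route_2)


def conflicting_pairs(route_1, route_2, hears):
    """Radio view: pairs (u, v), one node per route, that cannot transmit together."""
    return [
        (u, v)
        for u in route_1
        for v in route_2
        if frozenset((u, v)) in hears
    ]


DISJOINT = node_disjoint(ROUTE_1, ROUTE_2)
CONFLICTS = conflicting_pairs(ROUTE_1, ROUTE_2, HEARS)

if __name__ == "__main__":
    print("independent in the graph model:", DISJOINT)
    print("pairs that disturb each other on the radio channel:", CONFLICTS)
```

The two routes pass the graph-model test and still share the radio channel around B and E, so an algorithm that reasons only on route disjointness would overestimate the capacity available to the two flows.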
Finally, we will end this section with a discussion of a problem illustrating the difficulty of forecasting communication performance in multihop radio networks [CHA 04]. Most algorithms on which communication protocols are based use a description of the network in the form of an adjacency graph to make decisions. An edge in this graph indicates that two nodes are within communication range. We have used this definition for Figures 2.1 and 2.2.
2.2.2. Shared access to medium
Radio spectrum is a limited resource that does not scale. This radio resource is organized in physical channels, and each channel is allocated to a communication. The number of physical channels is generally much lower than the number of potential communications. Radiocommunication systems remain effective because communications do not all happen simultaneously and because spatial reuse is possible, in other words reusing a radio resource in a location of the network far enough away that the communications do not disrupt each other. In this chapter, we are only considering one-channel ad hoc networks (the use of multiple channels in ad hoc networks remains a subject of research today). The presence of only one radio resource (outside of spatial reuse) still requires some management. A protocol to resolve access conflicts for a shared resource must be implemented: this is the role of the Medium Access Control (MAC) layer. The problem of resource sharing also exists in fixed networks; the simplest example is shared access to a communication bus. In the world of radiocommunications, this problem is slightly more complex than in the wired world
for several reasons. In particular, a node transmitting over the medium saturates its own antenna with energy and becomes unable to detect an interfering signal on the medium that it is using: it is impossible to detect a collision by sensing the channel during a transmission. This characteristic requires the introduction of an acknowledgement message to secure a transmission. Finally, the distributed and dynamic nature of ad hoc networks rules out complex coordination mechanisms such as those that can be implemented in cellular networks. One of the first MAC protocols for shared radiocommunication resources was Aloha, developed in the 1970s [ABR 70]. With this protocol, each terminal transmits its messages immediately, without any precaution; if a message is not acknowledged, the terminal waits for a random time before retransmitting it. This protocol’s performance quickly deteriorates as the load on the medium increases. To improve the performance of this type of protocol, carrier sensing before transmission can be implemented: this is done by the protocols called CSMA (Carrier Sense Multiple Access). This prior sensing decreases the chances of a collision but reduces throughput because of the wait time before transmission. We will not go into more detail on these different protocols. However, to illustrate the problems generated by the MAC layer, which has an important impact on higher level protocols and specifically on routing, we will present the MAC layer of the 802.11b protocol, currently the most widely used technology for data communications in a mobile radio environment. The IEEE 802.11 [IEE] standard uses a variation called CSMA/CA, for Carrier Sense Multiple Access with Collision Avoidance. The objective of this variation is to minimize the number of collisions, mainly during the medium reservation phase. To decrease collision probability, two carrier sensing mechanisms are implemented: physical carrier sensing and virtual carrier sensing.
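To put a number on the deterioration of Aloha under load mentioned above, the short sketch below evaluates the classical textbook throughput of pure Aloha, S = G × e^(−2G), where G is the offered load in frames per frame time. This formula is not given in this chapter; it assumes Poisson traffic and fixed-length frames and is used here only as an illustration.

```python
import math


def pure_aloha_throughput(offered_load):
    """Textbook pure Aloha throughput S = G * exp(-2G).

    offered_load is G, the mean number of frames offered per frame time,
    under the usual assumptions of Poisson traffic and fixed-length frames.
    """
    return offered_load * math.exp(-2.0 * offered_load)


LOADS = [0.1, 0.25, 0.5, 1.0, 2.0, 4.0]
CURVE = [(g, pure_aloha_throughput(g)) for g in LOADS]
PEAK = max(CURVE, key=lambda point: point[1])

if __name__ == "__main__":
    for load, throughput in CURVE:
        print(f"G = {load:4.2f}  ->  S = {throughput:.4f}")
    print("best of the sampled loads:", PEAK)
```

Under these assumptions the useful throughput never exceeds 1/(2e), about 18% of the channel capacity, and it collapses towards zero once the offered load goes beyond one frame per frame time.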
For the higher 802.11 protocol layers, the channel, or carrier, is considered busy if either of the two mechanisms reports it as busy. Physical sensing is provided by the physical layer and corresponds to the detection of a transmission on the radio channel. There are two modes of managing radio resources in the 802.11 standard: Point Coordination Function (PCF) mode and Distributed Coordination Function (DCF) mode. We will not address PCF mode because it is only used in infrastructure mode (i.e., with the notion of a cell around an access point). DCF mode can be used in infrastructure mode as well as in ad hoc mode, which is our main focus. DCF mode uses a completely distributed algorithm to guarantee fair access to the radio channel. However, it was not designed to be used in a multihop environment, and guaranteeing fair access in an ad hoc context is a complex problem that we will not be able to address in this chapter. We will give an overview of carrier access control in DCF mode; a complete description can be found in [MAL 04].
The 802.11 standard involves two types of packets: unicast packets, addressed to a single node identified by its MAC address (the physical address of its 802.11 interface), and broadcast packets, which correspond to a radio broadcast to all neighbors within range. Unicast packets are explicitly acknowledged by ACK messages. Broadcast packets are not acknowledged, in order to avoid a broadcast storm, i.e. a chain of collisions generated when too many neighbors simultaneously acknowledge a received broadcast packet. When a node wants to transmit a data message, it senses the carrier until it becomes clear. To reduce the collision rate between stations wanting to use the channel at the same time, a random wait time, or backoff, is used. After a wait time called DIFS, a node wanting access to the carrier randomly chooses a backoff delay. If the carrier is still clear after this delay, the node transmits the waiting message. If during this time the node senses that the carrier is busy, it freezes its backoff timer until the carrier becomes clear again. Once the carrier is clear, the node again waits for a delay equal to DIFS before restarting its backoff timer. The backoff delay is counted in time slots, whose duration depends on the transmission speed used on the carrier.
Figure 2.6. Nodes A and B want to transmit a data packet (data) to node C. Node A obtains a backoff value of 2 whereas B gets 4. Node A transmits its message after two backoff slots and node B defers its backoff. Node B backoff will only start after the message and acknowledgement (ACK) have been transmitted
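The scenario of Figure 2.6 can be replayed with a small slot-counting sketch. It is a deliberate simplification, not the 802.11 state machine: DIFS, SIFS, ACK and frame durations are abstracted away, stations whose timers expire in the same slot are simply reported together (in practice they would collide), and the function name `dcf_schedule` is ours.

```python
def dcf_schedule(backoffs):
    """Return [(stations, idle_slots), ...] in transmission order.

    backoffs maps a station name to its initial backoff, in slots.
    Each tuple gives the stations that transmit and the number of idle
    slots counted down just before that transmission.
    """
    remaining = dict(backoffs)
    order = []
    while remaining:
        # The carrier stays idle until the smallest remaining backoff expires.
        idle_slots = min(remaining.values())
        winners = sorted(s for s, b in remaining.items() if b == idle_slots)
        order.append((winners, idle_slots))
        # Stations that did not win froze their timer while the carrier was
        # busy; they resume later with what is left of their backoff.
        remaining = {s: b - idle_slots
                     for s, b in remaining.items() if s not in winners}
    return order


# Values taken from the caption of Figure 2.6: A draws 2 slots, B draws 4.
schedule = dcf_schedule({"A": 2, "B": 4})
```

With these values, A transmits after two idle slots and B only needs to count down its two remaining slots once the carrier is clear again, which is the deferral shown in the figure.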
When a unicast message has to be acknowledged, the destination node of this message waits a SIFS time before sending the ACK. Since the duration of SIFS is shorter than DIFS, nodes waiting for carrier clearing will hear the ACK message
before the end of DIFS countdown and will consider the carrier busy once more. They will then restart a DIFS countdown after the ACK message. Medium access faces two classic problems in wireless communication, known as the hidden node problem and the exposed node problem (see Figure 2.7). To address them, the 802.11 standard uses two additional control messages: Request To Send (RTS) and Clear To Send (CTS). A node that gains access to the carrier after its backoff transmits an RTS control message containing the transmission duration and other information. The destination node immediately replies with a CTS control message carrying this duration. All stations receiving an RTS or CTS message set a virtual carrier sense indicator, called the NAV (Network Allocation Vector), and consider the carrier busy for the duration announced in the RTS or CTS packet received. In this way, in the case of Figure 2.7, node C would not start its transmission towards B before the end of the transmission from A to B.
1) Hidden node. Suppose that node B is within the coverage zone of both nodes A and C, and that node A is transmitting to node B. If node C, which is not within transmission range of node A, decides to send data to node B, it will sense the radio carrier and think it is clear. By launching a transmission towards B, node C will jam the transmission from A to B.
2) Exposed node. Node C is within radio range of both nodes B and D. Suppose that B is currently transmitting data to node A. Node C, sensing the radio carrier, considers it busy and defers its transmission to node D. However, this transmission would not have disrupted reception at node A, which is out of range of C; the needless deferral decreases the performance of the whole system.
Figure 2.7. Illustration of the problems of exposed and hidden nodes
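The two situations can be expressed as simple range tests. The sketch below assumes an idealized model that is not part of the text: every node has the same circular radio range, and the coordinates of A, B, C and D are hypothetical values placing the four nodes on a line.

```python
import math


def in_range(p, q, radio_range):
    """True if two positions can hear each other (idealized disk model)."""
    return math.dist(p, q) <= radio_range


def is_hidden(sender, receiver, other, radio_range):
    """'other' is hidden from 'sender': both reach the receiver,
    but they cannot sense each other's transmissions."""
    return (in_range(sender, receiver, radio_range)
            and in_range(other, receiver, radio_range)
            and not in_range(sender, other, radio_range))


def is_exposed(sender, receiver, other, radio_range):
    """'other' is exposed: it senses the sender and defers, although it
    is out of range of the receiver and could not disturb reception."""
    return (in_range(sender, receiver, radio_range)
            and in_range(other, sender, radio_range)
            and not in_range(other, receiver, radio_range))


# Hypothetical layout: A, B, C, D spaced one unit apart, range 1.2 units.
A, B, C, D = (0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)
RANGE = 1.2

hidden = is_hidden(sender=A, receiver=B, other=C, radio_range=RANGE)
exposed = is_exposed(sender=B, receiver=A, other=C, radio_range=RANGE)
```

In this layout both tests are true: C is hidden from A when A transmits to B, and C is exposed when B transmits to A.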
2.2.3. Flooding
Since ad hoc networks are dynamic by nature (arrival of new nodes, departure of others) and are made up of mobile terminals connected within radio range without the help of any infrastructure, they must be able to discover their environment. Terminals with no previous knowledge of the network’s structure (its topology, or even the presence of a specific terminal in the network) need to acquire this knowledge by exchanging messages within the network. Such mechanisms also exist in wired or fixed networks, but ad hoc networks must cope with much stronger dynamics and therefore work on a much smaller time scale. This environment discovery relies on a network-wide diffusion operation called flooding. It is a communication operation that allows a terminal to send a message to all the terminals in the ad hoc network, whether directly within range or reachable with the help of network nodes acting as relays². This operation enables different control messages to reach the network’s nodes, or to reach any terminal when sufficient network information is not available. A flooding procedure could a priori be sufficient to operate an ad hoc network, since it can also carry data from one terminal to another; however, this would obviously consume too much of the network’s radio resources to be used for data transmission. We will therefore use flooding only to discover the environment, self-organize the network and implement more efficient routing algorithms. Routing protocols for ad hoc networks, whose purpose is to gather enough information for two terminals of the network to exchange messages without flooding data messages, all require a flooding protocol for control. Most of the time, ad hoc routing protocols use very basic protocols to obtain this flooding capability, even though it is a very bandwidth-intensive operation which is liable to greatly decrease a network’s performance.
The version used in most ad hoc routing protocols is blind flooding: the transmitting node sends a message to its neighbors, and each node of the network retransmits every flooding message that it receives. The only optimization of this protocol consists of labeling each flooding message uniquely, so that a network node receiving the same flooding message from several neighbors transmits it only once. Each node must therefore remember the flooding messages it has received for a time interval longer than the duration of the flooding operation, and discard any duplicates.
2 Reaching a specific network node whose location in the network is not known requires that a message be sent to everyone.
When a diffusion message m arrives, the terminal checks whether this message is already present in its table of last diffusions received. If it is, m is simply ignored. If it is not, message m is kept in the table of last diffusions received for a time D set by the algorithm, and it is retransmitted by diffusion to all nodes within range over the terminal’s radio interface.
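The per-node behavior just described can be sketched as follows. The class and attribute names are ours, message identifiers are assumed to be any hashable label (here a pair of originator and sequence number), and time is passed in explicitly so that the example stays deterministic.

```python
class FloodingNode:
    """Duplicate suppression for blind flooding (illustrative sketch)."""

    def __init__(self, hold_time):
        self.hold_time = hold_time  # the time D set by the algorithm
        self.recent = {}            # message id -> time at which it expires

    def on_receive(self, msg_id, now):
        """Return True if the message must be rebroadcast to all neighbors."""
        # Forget entries that have been held for longer than D.
        self.recent = {m: t for m, t in self.recent.items() if t > now}
        if msg_id in self.recent:
            return False            # already seen: m is simply ignored
        self.recent[msg_id] = now + self.hold_time
        return True                 # first copy: store it and retransmit


node = FloodingNode(hold_time=5.0)
first = node.on_receive(("A", 1), now=0.0)      # new message
duplicate = node.on_receive(("A", 1), now=1.0)  # copy from another neighbor
```

Choosing D too short shows up directly in this model: a late copy arriving after the entry has expired would be treated as new and rebroadcast.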
If we presume that the medium is reliable, i.e. that no message is lost, all terminals accessible through consecutive transmissions are reached: we say that the diffusion algorithm is reliable, and all nodes are reached if the graph formed by the neighborhood connections is connected. As we have seen previously, radio diffusion is generally unreliable at the MAC layer; in other words, radio messages sent by diffusion are not acknowledged, in order to avoid a broadcast storm. With unacknowledged broadcast messages (as is the case with 802.11), it is impossible to guarantee a reliable flooding mechanism. The study of flooding mechanisms is still an ongoing research subject, which has become much more important with the introduction of sensor networks. In fact, as different studies show, these flooding mechanisms are very bandwidth-intensive and also drain terminal batteries heavily. In the context of sensor networks, this parameter is extremely important. Flooding algorithms can be classified into four main categories [ING 05]: blind flooding, probabilistic flooding, geographical flooding and network information flooding. In probabilistic methods, each network node retransmits a flooding message only with a probability p set by the algorithm. When networks are very dense, this type of protocol can give good performance. We can also add to each entry of the table of last diffusions received a field counting the number of times a message m was received; the terminal delays the retransmission of m for a time set by the algorithm and cancels this retransmission if m was received more than K times, where K is also set beforehand by the algorithm (we presume that if K neighbors have already retransmitted the message, it is highly probable that new retransmissions will not reach any new node).
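Both retransmission rules fit in a few lines. This is only a sketch of the decision logic: the names are ours, the delay timer itself is not modeled, and `should_retransmit` is assumed to be called when that timer expires.

```python
import random


def probabilistic_decision(p, rng=random):
    """Retransmit with probability p (probabilistic flooding)."""
    return rng.random() < p


class CounterBasedRelay:
    """Cancel a pending retransmission once more than K copies were heard."""

    def __init__(self, threshold_k):
        self.threshold_k = threshold_k
        self.copies = {}  # message id -> number of copies received so far

    def on_receive(self, msg_id):
        self.copies[msg_id] = self.copies.get(msg_id, 0) + 1

    def should_retransmit(self, msg_id):
        # Called when the retransmission delay for msg_id has elapsed.
        return self.copies.get(msg_id, 0) <= self.threshold_k


relay = CounterBasedRelay(threshold_k=2)
relay.on_receive("m")
relay.on_receive("m")
still_useful = relay.should_retransmit("m")  # 2 copies, not more than K
relay.on_receive("m")
cancelled = not relay.should_retransmit("m")  # 3 copies, more than K
```

The threshold trades coverage against overhead: a small K saves retransmissions in dense areas, at the risk of missing nodes that only one neighbor could have reached.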
We can also base the retransmission decision on distance criteria (I have received the message from a very close neighbor, therefore I will not retransmit) or on the absolute positions of the retransmitting terminal and of the target (I am farther from the target than the neighbor which has just sent me the flooding message, therefore I will not retransmit). Finally, we can also improve diffusion protocol performance by building a structure organizing the network’s nodes. This structure must be built locally (only between close neighbors) and must offer global properties that improve diffusion algorithms as well as routing algorithms. This last aspect is part of what we call topology control, which we will not address in this section. However, the OLSR and TBRPF routing protocols presented later in this chapter use optimized flooding mechanisms.
As we can see, all these improvements are intended to limit the number of collisions that flooding protocols tend to generate massively, and also to limit the number of retransmissions. Flooding protocols are evaluated based on the following criteria [WIL 02]:
– accessibility: the percentage of network terminals reached by the diffusion procedure. This percentage must be as close to 100% as possible;
– retransmission rate: the number of retransmissions that the protocol generates, compared with the blind flooding protocol;
– latency: the time between the transmission of the flooding message and the latest time at which it is received by a network node.
In the case of flooding protocols integrated into routing protocols, which is what we are focusing on here, it is hard to consider protocols with network-dependent parameters (node density, topology) without losing genericity. This is not the case with sensor networks, which are deployed in a known context that can be taken into account. That is the reason why the diffusion algorithms proposed in a purely ad hoc network context are not much more sophisticated than blind flooding.
2.3. Routing
Since the arrival of ad hoc network concepts, many proposals have been studied, simulated and evaluated. These proposals have led to variations, specializations for given environments and optimizations. This section on routing is not intended to provide an in-depth analysis. It will not compare the different proposals either, since this topic would require a complete book to do it justice. Our goal is to describe the major classes of routing algorithms as well as successful solutions. However, the reader should remember that current developments (miniaturization of hardware, introduction of terminals with several radiocommunication interfaces, emergence of sensor networks) are already reopening studies on this fundamental concept, which has been much studied but is constantly revised.
The routing problem in an ad hoc network is the same as in a fixed network. A node A of the ad hoc network wants to send a message m to another node B of the same network; how does it determine the network nodes through which m will be relayed, hop by hop over the radio medium, to reach B? Internet routing is also capable of adapting to dynamics (it is a founding principle of the Internet), but topology changes in the Internet occur on a much slower time scale, and are far less frequent, than in ad hoc networks.
In the following, we will go back to the traditional hypotheses which are nearly always taken for granted by ad hoc protocol designers. We will consider homogenous nodes, from the point of view of their communication capacity as well as their computation and storage capacity, using a single radiocommunication technology and a single communication channel for the whole network. A minority of studies, however, do not rely on these hypotheses. The characteristics sought by most ad hoc routing algorithm developers are:
– simplicity: the protocols must generate as little control overhead as possible and must be very simple to develop and deploy;
– self-organization: no central control can be admitted in an ad hoc network, and the structures necessary for routing management must be created in a distributed way and resist topology changes as much as possible;
– scalability: the proposed protocols must adapt to different ad hoc network sizes and support different mobility and traffic models.
It is possible to add many more characteristics to the list, from quality of service requirements to energy conservation for each mobile device. Ad hoc routing proposals can be classified into two main categories: proactive and reactive routing. To these two families we can add other, generally hybrid, proposals which, for example, create an internal network structure or rely on the assumption that each network node knows its position in a plane. Proactive protocols are directly inspired by the routing protocols deployed in the Internet and are thus adaptations of link state routing and distance vector routing. Their common characteristic is that each ad hoc network node locally maintains a routing table for sending data to any node in the network.
With these protocols, terminals periodically exchange information beyond their direct neighborhood in order to permanently maintain “tables” describing the network, totally or partially, from which they decide which routes to take during message transmission; this is sometimes called table-driven ad hoc routing. Depending on how frequently update messages are sent, these tables reflect the state of the network more or less accurately. The higher the update frequency, the better the protocol’s chance of keeping up with the network’s dynamics; on the other hand, the more bandwidth is consumed by these control messages and is therefore unavailable for data transmissions.
Ad hoc reactive routing algorithms keep the use of control messages to a minimum in order to save bandwidth. The information needed to calculate a route between two network nodes is only sought when a request for this route is expressed by the higher protocol layers. The protocols of this class attempt to keep the routes in use, and only those, as up to date as possible. The quantity of bandwidth used for control messages is particularly sensitive to the number of routes (setup and maintenance) and can be much lower than with a proactive protocol when this number is low. The major drawback of this type of protocol is the significant delay between a request for message transmission and the actual transmission when the route has not yet been created.
2.3.1. Dynamic source routing (DSR)
As with all reactive protocols, the DSR protocol uses a process of route discovery between two network nodes when it is necessary for a specific communication. The main characteristic differentiating DSR from the other reactive protocols is the use of source routing: the transmitting node of a data packet must know the list of all intermediate nodes needed to reach its destination. This route is placed in the data packet headers, so that the intermediate nodes do not need a local routing function. This mode exists in IPv4 as an option. It was initially developed to simplify network management but was abandoned for security reasons (network equipment did not accept it, even though it is in the standard). Routes discovered on demand are kept for a certain period by a cache mechanism. As we will see later, different options are available to limit the need for route discovery by using these caches. In the DSR protocol, we find two reactive routing mechanisms: route discovery and route maintenance. When a node wants to send a data message to another network node, it searches for a route in its local cache.
If no route to this terminal is found, a route discovery process is activated in order to complete the route cache. The route discovery phase is based on a blind flooding process. The node initiating route discovery generates a route request (RREQ) control message, which is sent by radio broadcast to all its neighbors. This message contains the identity of the initiating node, the identity of the destination node and a unique sequence number determined by the initiating node.
Figure 2.8. Node A launches flooding with a RREQ packet containing list [A] and target F. After the first local broadcast, neighbors of A have all received the RREQ and have added their own identifier to the node list. At step two, nodes having received RREQ transmit it again. Node B receives it again from node C and so does not even consider it. Node F is reached and locally builds the route [A,D,F]. At step 3, node F does not consider the RREQ received from node E. Node F will build a RREP packet with [A,D,F] which will be transmitted to A by the opposite route or by using flooding a second time which could create a different return route
Finally, the RREQ message also contains the ordered list of intermediate nodes which will relay the data messages between the initiating and destination nodes. When a node receives a RREQ message, it generates a response message, a route reply (RREP), if it is the target of the route request; if not, it adds its identity at the end of the list of intermediate nodes and rebroadcasts this modified message over its radio interface (see Figure 2.8). The DSR protocol works in an environment where radio connections can be unidirectional, since it considers the routes between two network nodes in the “forward” and “reverse” directions as independent; they are discovered independently. However, when using a MAC layer which only considers bidirectional connections, as is the case with 802.11, DSR can take the “return” route to be the reverse of the “outgoing” route, to avoid launching a second discovery step (see Figure 2.8). DSR does not really use the notion of routing table, but each node maintains a local cache containing the routes it knows toward certain network nodes. Cache use can decrease the number of route searches. If a node receives a RREQ message for a target to which it knows a route, it can immediately concatenate the partial route carried by the RREQ message with the one it knows to the target, and thus generate a RREP reply message. Nothing prevents a node from having several possible routes in its cache for a given destination, allowing for example transmission over several simultaneous routes. The presence of several routes can also make it possible to “repair” a route which becomes invalid because of the disappearance of a connection.
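Route discovery by RREQ flooding can be sketched as a breadth-first propagation in which each copy of the request carries the list of nodes traversed so far. The sketch makes several assumptions: links are bidirectional, every local broadcast takes one step, a node only processes the first copy it receives, and the adjacency list below is our reading of the links implied by the caption of Figure 2.8, not a topology given explicitly in the text.

```python
from collections import deque

# Assumed links, inferred from the caption of Figure 2.8.
FIG_2_8 = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B", "G"],
    "D": ["A", "E", "F"],
    "E": ["D", "F"],
    "F": ["D", "E"],
    "G": ["C"],
}


def discover_route(graph, source, target):
    """Flood a RREQ from source; return the node list recorded at target."""
    seen = {source}             # nodes that already handled this RREQ
    queue = deque([[source]])   # each item is the route carried by a RREQ
    while queue:
        route = queue.popleft()
        node = route[-1]
        if node == target:
            return route        # the target answers with a RREP
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:    # later copies are not considered
                seen.add(neighbor)
                queue.append(route + [neighbor])
    return None                 # target unreachable


forward_route = discover_route(FIG_2_8, "A", "F")
# With bidirectional links, the RREP can follow the reversed route.
reply_route = list(reversed(forward_route))
```

On this topology the request reaches F through D, so the recorded route is A, D, F and the reply travels F, D, A.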
If a connection is discovered to be invalid, the node detecting it invalidates all routes in its cache that use it. Whether it has an alternate route or not, it generates a control message called route error (RERR). This RERR message is sent to the sources of the routes which have become invalid. The DSR protocol also has different options modifying the protocol’s behavior. They consist of exploiting information overheard by nodes during route setup, even when these nodes are not the intended recipients of this information. In this way, for each control or data message carrying a route in its header, all nodes overhearing the message can update their local cache, even if the message is not meant for them. These options speed up route search, but can also sometimes contribute to maintaining erroneous network information.
2.3.2. Ad hoc on-demand distance vector (AODV)
AODV is a reactive routing protocol similar to DSR, presented above. It uses the same route discovery mechanism (with the help of RREQ and RREP control messages) and the same route maintenance mechanism (with the help of the RERR control message). It also uses the concept of distance vector routing, based on the distributed Bellman-Ford algorithm for calculating the shortest route. First of all, the AODV protocol uses HELLO control messages between direct neighbors. The objective of these messages is to verify the state of connections, since AODV only manages symmetric connections. Contrary to DSR, it does not use source routing, and each network node maintains a routing table. This routing table is only partial because, following the reactive routing principle, network information is only gathered in response to actual route requests. Each entry of the routing table stored in a node holds the following information: destination node address, next hop node on the route to this destination, sequence number and expiration time of the entry.
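A routing table of this kind can be sketched with a small data structure. The field and function names are ours. The `hop_count` field is an assumption added so that route lengths can be compared; the replacement rule used here keeps the entry with the most recent sequence number and, when sequence numbers are equal, the shortest route.

```python
from dataclasses import dataclass


@dataclass
class RouteEntry:
    destination: str
    next_hop: str
    seq_no: int        # sequence number associated with the destination
    hop_count: int     # assumption: route length, used to break ties
    expires_at: float  # expiration time of this entry


def offer(table, candidate):
    """Install candidate if it is fresher, or as fresh but shorter."""
    current = table.get(candidate.destination)
    fresher = current is None or candidate.seq_no > current.seq_no
    shorter = (current is not None
               and candidate.seq_no == current.seq_no
               and candidate.hop_count < current.hop_count)
    if fresher or shorter:
        table[candidate.destination] = candidate
    return fresher or shorter


def lookup(table, destination, now, lifetime):
    """Return a valid entry, refreshing its expiration time on use."""
    entry = table.get(destination)
    if entry is None:
        return None
    if entry.expires_at <= now:
        del table[destination]   # not refreshed in time: drop the route
        return None
    entry.expires_at = now + lifetime
    return entry


table = {}
offer(table, RouteEntry("F", "D", seq_no=5, hop_count=2, expires_at=10.0))
kept_old = not offer(table, RouteEntry("F", "B", 5, 3, 10.0))  # same age, longer
replaced = offer(table, RouteEntry("F", "B", 6, 4, 10.0))      # more recent
```

A lookup made after `expires_at` returns nothing, which is what would trigger a new route discovery in a reactive protocol.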
The expiration time, updated each time the entry is used, makes it possible to delete routes which have not been refreshed quickly enough. The sequence number is specific to a destination and makes it possible, when receiving an update message, to know whether it corresponds to a more recent route than the one stored in the table. The algorithm retains the most recent route and, in the case of equality, the shortest one. When a node wants to send a data message to another network terminal, it searches for a route in its routing table. If no route to this terminal is known, a route discovery process is activated by blind flooding. The node needing to update its routing table generates a RREQ control message. This control message is broadcast to all neighbors. This message contains the identity of the
initiating node, the identity of the destination node and the last known sequence number associated with this destination. Each node which relays this RREQ message from the initiating node x, received from a neighbor v, places an entry for x in its routing table with v as the next hop node. When the destination node receives the RREQ message from x, it thus has an entry for x in its routing table. It replies with a RREP control message which follows the reverse route of the RREQ message and updates the routing table entries of intermediate nodes, using the initial source of the RREQ message as target. If, by using HELLO messages, a connection that has been active and that is used in the routing table is detected as faulty, a RERR packet is sent to all active neighbors. If receiving a RERR message modifies a node’s routing table, it in turn generates a message to its neighbors, and the RERR message gradually arrives at the source, which can rebuild this route if it is still needed.
2.3.3. Optimized link state routing (OLSR)
The OLSR routing protocol can be considered as an adaptation to the ad hoc network world of the OSPF (open shortest path first) protocol deployed in the wired Internet. Both are link state protocols, i.e. protocols in which nodes periodically broadcast to the whole network the state of the connections perceived in their neighborhood. The adaptation for ad hoc networks mainly consists of optimizing this global broadcast operation, or flooding. As we have seen in the section about flooding, OLSR structures the network to avoid blind flooding: it is a network information flooding method. The OLSR protocol defines the multipoint relay (MPR) concept to limit the number of message retransmissions during the necessary flooding operations.
Figure 2.9. This figure represents all neighbors at two hops from x. Condition 2) of the selection algorithm places nodes a and b in MPR(x). At the first step of 3) node c obtains a score of 3 (compared to a score of 2 for d) and is thus selected. Then it is the turn of d to be selected. We then obtain MPR(x) = {a,b,c,d}
Each network node x maintains a set MPR(x), which is a subset of the direct neighbors of x. This set has the following property: the union of MPR(x) with the neighbors of the nodes belonging to MPR(x) contains all terminals that can be reached from x in direct communication or by using one of its direct neighbors as a relay. We then talk about a distance 2 neighborhood, or two hop neighborhood. An algorithm for selecting the MPRs of a node x was proposed by Laouiti, Qayyum and Viennot [LAO 01]. We call N(x) the set of direct neighbors of x and N2(x) the set of nodes that can be reached from x through one relay and which do not belong to N(x) (the two hop neighbors of x). MPR(x) is generated as follows (see Figure 2.9):
1) MPR(x) = ∅;
2) add to MPR(x) every node y ∈ N(x) for which there is a node u ∈ N2(x) whose only neighbor in N(x) is y;
3) as long as there is a node of N2(x) which is not a neighbor of a node of MPR(x):
a) for each node y of N(x) not belonging to MPR(x), calculate the number of neighbors of y in N2(x) which are not yet neighbors of a node of MPR(x),
b) add to MPR(x) the node y which maximizes the quantity calculated at step (a).
Two types of control messages exist in OLSR: HELLO messages and TC (topology control) messages. HELLO messages allow an ad hoc network node to discover its close environment, which OLSR defines as all its direct neighbors and their direct neighbors (i.e. all the network nodes that can be reached with one retransmission). The HELLO messages transmitted by a node contain the list of its direct neighbors and an indicator specifying whether the connection with each of these neighbors is bidirectional (symmetric) or unidirectional (asymmetric). A link between node n and a neighbor v is called unidirectional if n has received a HELLO message from v, but this message contains no reference to n.
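The MPR selection steps 1) to 3) given earlier in this section translate into a short greedy procedure. In the sketch below, `coverage` maps each node of N(x) to the nodes of N2(x) it can reach; this representation and all names are ours. The example topology is hypothetical: it was chosen so that the selection order matches the one described in the caption of Figure 2.9 (a and b by rule 2, then c with a score of 3 against 2 for d, then d), and it is not claimed to be the topology drawn in that figure.

```python
def select_mprs(coverage):
    """Greedy MPR selection; returns the MPRs in selection order."""
    two_hop = set().union(*coverage.values())
    mprs = []                                   # step 1: MPR(x) is empty
    # Step 2: a neighbor that is the only relay towards some two hop node.
    for u in sorted(two_hop):
        relays = [y for y in sorted(coverage) if u in coverage[y]]
        if len(relays) == 1 and relays[0] not in mprs:
            mprs.append(relays[0])
    covered = set().union(*(coverage[y] for y in mprs))
    # Step 3: repeatedly add the neighbor covering the most uncovered nodes.
    while covered != two_hop:
        candidates = [y for y in sorted(coverage) if y not in mprs]
        best = max(candidates, key=lambda y: len(coverage[y] - covered))
        mprs.append(best)
        covered |= coverage[best]
    return mprs


# Hypothetical neighborhood of x: N(x) = {a, ..., g}, N2(x) = {u1, ..., u7}.
COVERAGE = {
    "a": {"u1"},
    "b": {"u2"},
    "c": {"u3", "u4", "u5"},
    "d": {"u6", "u7"},
    "e": {"u3", "u6"},
    "f": {"u4", "u7"},
    "g": {"u5"},
}

mprs = select_mprs(COVERAGE)
```

Here four relays out of seven neighbors are enough to reach every two hop node, which is the saving that MPR-based flooding exploits.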
HELLO messages periodically transmitted between direct neighbors enable each node to locally rebuild the complete topology of its surrounding two hop network. TC messages transmitted by a node x contain partial information on the state of the connections of x with its neighbors. More precisely, to decrease the size of this information, a node x only indicates the state of the connections that it has with the neighbor nodes which have chosen it as MPR: these are its MPR selectors. These TC messages are periodically transmitted to the whole network through a flooding process. To decrease the number of messages used during this flooding, a node y receiving a TC packet from a neighbor x will process this packet (i.e. add it to the list of TC messages that it has received), but will only retransmit it if y is an MPR of x. The
definition of MPRs guarantees that all network nodes are reached (provided that the list of MPRs of each node is up to date). Each network node retains the TC packets received from each of the other network nodes. In this way, it can locally rebuild a global view of the network topology and calculate the routes to each node with the help of the Dijkstra algorithm to obtain the shortest route.
2.3.4. Topology dissemination based on reverse-path forwarding (TBRPF)
The TBRPF protocol, like OLSR, is a proactive link state routing protocol. Neighborhood knowledge is maintained in each node by exchanges of HELLO control messages. In addition, TBRPF offers an optimization consisting of only sending “differential” information, i.e. notifying neighbors only of the neighborhood changes since the last HELLO packet.
Figure 2.10. Node A has neighbors B, C, D and E. It has locally built a tree of the shortest routes from itself to all the network nodes that it knows. It has also received partial versions of the shortest route trees of its neighbors. In order to build the partial version of its own shortest route tree to transmit to its neighbors, it applies the following rule: it places in the tree all its direct neighbors, and the subtree rooted at C (for example) only if A is the next hop node for one of its other neighbors (B, D or E) to reach C (it applies the same reasoning to the subtree rooted at each neighbor)
In TBRPF, each network node calculates a shortest route tree covering the network. However, to avoid overloading bandwidth, some strategies make it possible for the nodes to propagate only a part of this tree in the network. The tree is locally calculated with the help of a variation of the Dijkstra algorithm (with a conflict resolution based on node identifier), but only the subtree called the
reportable subtree (RT) will be transmitted to the node’s neighbors. Here again, the RT is periodically broadcast in full and, between two complete RT broadcasts, in a differential way. Finally, RT messages can be combined with HELLO messages to decrease control overhead. The RT of node x contains all the neighbors of x, to which we add the subtree of x’s shortest path tree rooted at a neighbor i if x is the next hop of another neighbor j (different from i) to reach i.
2.3.5. Zone routing protocol (ZRP)
There is a debate between proactive and reactive routing protocols, and we can conclude that each approach has a certain number of advantages and drawbacks. It is natural to attempt to develop mixed methods, taking advantage of both techniques. That is what the ZRP protocol attempts, by combining proactive and reactive mechanisms. The scope of the proactive mechanism, called IARP (intrazone routing protocol), is the close neighborhood of a node. Each network terminal chooses a distance d corresponding to the number of hops (relays) authorized to reach the terminals of this neighborhood, called its routing zone. This distance d is not necessarily identical for each network terminal. Within this routing zone, a proactive routing mechanism is used, enabling the node to know precisely the topology of the subnet made up of the nodes of the zone. It has a complete routing table for these terminals. We should note that, with this definition, the zones made up of these neighborhoods at distance d very probably overlap. In these routing zones, the peripheral nodes, located at the edge of the zone, will play a role in the protocol’s reactive procedure. When a terminal wants to transmit a packet to a destination node not in its routing zone, it uses a reactive mechanism called IERP (interzone routing protocol). In this reactive phase of the ZRP protocol, the terminal transmits its route search request to all its peripheral nodes.
If one of them has the destination in its local routing table (according to the routing zone that it has itself chosen), it can reply to the requesting node that it knows a route to the destination node. The nodes which do not have the target terminal in their routing table retransmit the request to their own peripheral nodes, and so on. It is a particular flooding mechanism which uses local information to limit the number of control messages needed during the search phase. As with any flooding mechanism, a node which receives a request that it has already processed (which may very well happen, since requests flood the network over multiple routes) will not consider this request again.
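The routing zone of a node and its peripheral nodes can be computed with a hop-limited breadth-first search. The sketch assumes that peripheral nodes are those located at exactly d hops from the node, which is the usual ZRP definition; the function name and the five-node chain used as an example are ours.

```python
from collections import deque


def routing_zone(graph, node, radius):
    """Return (zone, peripheral) for a zone radius of `radius` hops.

    zone: all nodes at most `radius` hops away (the node itself excluded).
    peripheral: the nodes at exactly `radius` hops.
    """
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        if dist[current] == radius:
            continue                    # do not look beyond the zone
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    zone = set(dist) - {node}
    peripheral = {n for n, d in dist.items() if d == radius}
    return zone, peripheral


# Hypothetical chain S - a - b - c - T.
CHAIN = {
    "S": ["a"],
    "a": ["S", "b"],
    "b": ["a", "c"],
    "c": ["b", "T"],
    "T": ["c"],
}

zone_s, peripheral_s = routing_zone(CHAIN, "S", radius=2)
zone_b, _ = routing_zone(CHAIN, "b", radius=2)
```

With a radius of 2, T is outside the zone of S, so S hands its request to its only peripheral node b; T lies within the zone of b, which can therefore answer from its proactive table.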
2.3.6. Location-aided routing (LAR)
In the literature on this subject, several proposals have been made for ad hoc routing based on the use of network node coordinates. These coordinates are obtained from a system that is external to the routing protocol, for example a GPS (global positioning system). This could be done in vehicle networks, or simply in sensor networks where terminals are not mobile but where the network’s dynamics come from the disappearance of nodes which stop working or from changes in wireless propagation conditions. In this section, we present the LAR (location-aided routing) protocol [KO 98], which uses knowledge of the sender and receiver node coordinates to optimize the flooding procedure. Each time a network node attempts to establish a route to a target terminal, it is supposed to know the target’s coordinates. We can, however, presume that these coordinates are somewhat imprecise, as the network nodes may have moved since the time this information was obtained. We can compare the idea of LAR to paging systems in cellular networks: in cellular radio networks, we try to reach a mobile device on standby in the cell where it was last located, and also in the cells where it may have moved since then. In ad hoc networks, the notion of cell does not exist. If node A wants to create a route to node B, it calculates an expected zone obtained from B’s presumed coordinates and from the distance that B may have covered since the last time this information was collected, estimated from what A knows of B’s speed (see Figure 2.11). This expected zone, along with the target’s coordinates and its identity, is placed in the header of the route discovery message. This enables the optimization of flooding based on this information. The discovery itself and the use of the route are carried out by any reactive routing algorithm, such as DSR or AODV.
From this expected zone, the mobile device launching the search determines the request zone, a geographical zone that limits the range of the request and thus of the associated flooding (Figure 2.11). A node receiving a route creation request from one of its neighbors only retransmits this request if it is located within the request zone. The request zone can be specified implicitly (a predefined geometric zone including the requesting node and the expected zone) or described explicitly in the discovery message. Note that the area of the request zone grows with the speed of the mobile device that we wish to find, as well as with the time elapsed since its coordinates were last known.
Figure 2.11. For a search request issued by A on date t1, if the last known position of mobile device B, on date t0, is (x,y) and its estimated speed is v, then the expected zone is the disk with center (x,y) and radius v(t1-t0); the request zone is the smallest rectangle containing both the expected zone and the node issuing the search request (to simplify, we use a rectangle with sides parallel to the axes of the reference frame)
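The forwarding decision based on the request zone can be illustrated with a minimal sketch. It assumes planar coordinates and a request zone given as a rectangle with sides parallel to the axes; the names Rect, expected_zone_radius, request_zone and should_forward are hypothetical and are not taken from [KO 98].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-parallel rectangle (illustrative type, not part of the protocol)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def expected_zone_radius(speed: float, t_last: float, t_request: float) -> float:
    """Distance the target may have covered between its last known
    position (date t_last) and the search request (date t_request)."""
    return speed * (t_request - t_last)


def request_zone(sender, target_last_pos, radius: float) -> Rect:
    """Smallest axis-parallel rectangle containing the sender and the
    expected zone (disk of the given radius around the last known position)."""
    sx, sy = sender
    bx, by = target_last_pos
    return Rect(min(sx, bx - radius), min(sy, by - radius),
                max(sx, bx + radius), max(sy, by + radius))


def should_forward(node_pos, zone: Rect) -> bool:
    """A node retransmits the request only if it lies in the request zone."""
    return zone.contains(*node_pos)


# Example values (assumed): A at (0, 0), B last seen at (100, 50) ten
# seconds ago, estimated speed 2 units per second.
radius = expected_zone_radius(speed=2.0, t_last=0.0, t_request=10.0)  # 20.0
zone = request_zone((0.0, 0.0), (100.0, 50.0), radius)  # Rect(0, 0, 120, 70)
```

With these example values, a node at (60, 40) would retransmit the request, whereas a node at (-10, 20) would silently drop it.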
In [KO 98], a second search method based on coordinates is proposed. Its objective is also to control the flooding range during a mobile device search, but it does not use the notion of request zone. When node A wants to obtain a route to node B, it adds to its search message its target’s coordinates (XB,YB) as well as an estimate of the distance from A to B, named DISTB (calculated, as previously explained, from the last known position of B and its estimated speed). When a node C receives this search request, it calculates its own distance to the target node B, named DISTC. If inequality [2.3] is satisfied, then node C propagates the request, replacing the distance DISTB in the message with its own distance DISTC; otherwise, the request message is not propagated. Parameters α and β must be specified by the protocol. They absorb the estimation errors that the system may make on coordinates or speeds:
α · DISTB + β ≥ DISTC     [2.3]
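As a hedged illustration of test [2.3], the sketch below shows how a node receiving a request might decide whether to propagate it. The function names and the default values α = 1 and β = 0 are assumptions chosen for the example; the source only states that these parameters must be specified by the protocol.

```python
import math


def distance(p, q) -> float:
    """Euclidean distance between two points given as (x, y) pairs."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def process_request(node_pos, target_pos, dist_in_msg: float,
                    alpha: float = 1.0, beta: float = 0.0):
    """Apply test [2.3] at the receiving node C.

    dist_in_msg is the distance carried by the request (DISTB in the text).
    Returns the node's own distance to the target (DISTC), which replaces
    the value in the message when the request is propagated, or None when
    the request must not be propagated.
    """
    dist_c = distance(node_pos, target_pos)
    if alpha * dist_in_msg + beta >= dist_c:
        return dist_c
    return None
```

For example, with the target at (100, 0) and a message carrying the distance 100, a node at (40, 30) propagates the request with its own, smaller distance, while a node at (-30, 0), which is 130 away from the target, drops it unless β is at least 30.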
This approach using knowledge of the coordinates of different mobile devices participating in ad hoc networks clearly has the advantage of reducing the number of messages necessary for flooding during a route search (reactive protocol) and thus the quantity of radio resources used, but remains specific for certain applications and environments for which this type of knowledge is possible. We should note,
however, that working on a purely geometric criterion prevents a search protocol from bypassing obstacles to wireless propagation (for example, a building may require a route that goes beyond the encompassing rectangle).
2.4. Conclusion
Mobile objects equipped with communication capabilities are widely available and offer new services corresponding to what we call the mobile Internet. With them, we can access networks and obtain information anywhere and at any time. They offer the possibility of developing new services without, however, fundamentally changing the client + server = service equation. Mobile objects are only terminals, albeit mobile, in an essentially traditional network. Direct communication between two communicating objects is arguably a bigger revolution than mobile connection to fixed networks, because it breaks the traditional equation involving the notion of client. In fact, “client” terminals combine to create a service, even if today such services rarely go beyond the exchange of business cards, phone numbers and documents (presentations, music, etc.). An important step will be taken when a large number of mobile terminals combine to provide a service whose added value comes from the number of terminals contributing to the information base required to create the service. We can use the example of automobile traffic control. Moving vehicles that pick up data such as the number of vehicles and their average speed constitute an exceptional road traffic analysis system. Uploading all this information to a central server would be very bandwidth-consuming and storage-intensive, so precise information can only really be evaluated locally. Organized in ad hoc networks, vehicles could propagate local information in the network, enabling precise decision making, such as avoiding a congested street or adapting speed to the flow (we should be so lucky!).
There are many scenarios, from the surveillance of young children in a residential area to the search for the best price for a given product in a shopping area. Ad hoc networks are the support for the advent of such applications. Ad hoc networks constitute a complex problem that requires revisiting many network concepts for which existing solutions were thought to be completely adequate. In this chapter, we have focused on routing, a basic function that any ad hoc network must provide. We may reasonably assume that there is not one kind of ad hoc network but many. Consequently, it would be futile to search, in routing as in other aspects of ad hoc networks, for a single algorithm or protocol that is perfect for every situation.
In the world of radio networks, it is very hard to ignore the “physics” of the phenomena involved when designing protocols. In this context, the necessity for “inter-layer” collaboration quickly became apparent: routing protocols must be designed to use lower level information (bit error ratio, power levels, etc.) and external information (typically speed of movement), or even to “drive” lower level parameters. This approach is an important lead for improving ad hoc environments, and routing specifically, but it disturbs our habits. In routing as in other aspects, ad hoc network solutions must face a number of challenges. I give a short list of challenges, which reflects only the views of the author of this chapter, but which are the subject of intense research. First, scaling: how does my solution behave if the number of mobile devices in my network increases? Second, quality of service: how can I guarantee a level of performance for a given service? Third, power management: in several scenarios, ad hoc network nodes have a low energy capacity, which should not be wasted. Finally, there will be no significant ad hoc network development without the study of the specific security problems linked to these environments. I will close this chapter by reminding the reader that solutions do exist, that networks are deployed today and that ad hoc networks will undoubtedly become more important, even though many studies still remain to be done.
2.5. Bibliography
[ABR 70] ABRAMSON N., “The Aloha System – Another Alternative for Computer Communication”, Proceedings of AFIPS, p. 295-298, 1970.
[CHA 04] CHAUDET C., GUÉRIN LASSOUS I., ZEROVNIK J., “A Distributed Algorithm for Bandwidth Allocation in Stable Ad Hoc Networks”, IFIP International Conference on Wireless On-Demand Network Systems (WONS 2004), LNCS 2928, p. 101-115, January 2004.
[CLA 03] CLAUSEN T., JACQUET P., Optimized Link State Routing Protocol (OLSR), Internet Request for Comments RFC 3626, Internet Engineering Task Force, October 2003.
[GLI 04] GLISIC S.G., Advanced Wireless Communications, 4G Technologies, Wiley, New York, 2004.
[GUE 04] GUÉRIN LASSOUS I., “Réseaux ad hoc”, in K. Al Agha (ed.), Réseaux sans fil et mobiles, Hermes, Paris, 2004.
[HAA 02] HAAS Z.J., PEARLMAN M.R., SAMAR P., The Zone Routing Protocol (ZRP) for Ad Hoc Networks, Internet draft, draft-ietf-manet-zone-zrp-04.txt, July 2002.
[IEE] IEEE Computer Society LAN MAN Standards Committee, Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY), Standard IEEE 802.11-1997.
[ING 05] INGELREST F., SIMPLOT-RYL D., “Localized Broadcast Incremental Power Protocol for Wireless Ad Hoc Networks”, Proc. 10th IEEE Symposium on Computers and Communications (ISCC 2005), Cartagena, Spain, 2005.
[JOH 03] JOHNSON D.B., MALTZ D.A., HU Y.-C., The Dynamic Source Routing Protocol for Mobile Ad Hoc Networks (DSR), Internet draft, draft-ietf-manet-dsr-09.txt, April 2003.
[JUB 87] JUBIN J., TORNOW J.D., “The DARPA Packet Radio Network Protocols”, Proceedings of the IEEE, Special Issue on Packet Radio Networks, vol. 75, no. 1, p. 21-32, January 1987.
[KO 98] KO Y.-B., VAIDYA N.H., “Location-Aided Routing (LAR) in Mobile Ad Hoc Networks”, Proceedings of the 4th ACM/IEEE International Conference on Mobile Computing and Networking, p. 66-75, Dallas, USA, October 1998.
[LAO 01] LAOUITI A., QAYYUM A., VIENNOT L., “Multipoint Relaying: An Efficient Technique for Flooding in Mobile Wireless Networks”, 35th Annual Hawaii International Conference on System Sciences (HICSS’2001), 2001.
[MAC 99] MACKER J., CORSON S., Mobile ad hoc networks (MANET), IETF Working Group Charter, http://www.ietf.org/html.charters/manet-charter.html, 1999.
[MAL 04] MALES D., “IEEE802.11/WiFi”, in K. Al Agha (ed.), Réseaux sans fil et mobiles, Hermes, Paris, 2004.
[OGI 04] OGIER R.G., TEMPLIN F.L., LEWIS M., Topology Dissemination Based on Reverse-Path Forwarding (TBRPF), Internet Request for Comments RFC 3684, Internet Engineering Task Force, February 2004.
[PER 03] PERKINS C.E., BELDING-ROYER E.M., DAS S., Ad hoc On-Demand Distance Vector (AODV) Routing, Internet Request for Comments RFC 3561, Internet Engineering Task Force, July 2003.
[WIL 02] WILLIAMS B., CAMP T., “Comparison of Broadcasting Techniques for Mobile Ad Hoc Networks”, Proceedings of MobiHoc, Lausanne, Switzerland, June 2002.
Chapter 3
Quality of Service Support in MANETs
An increasing number of applications have quality of service (QoS) requirements: video on demand has bandwidth requirements and voice over IP has end-to-end delay and jitter requirements, to name a few examples. How can we offer this quality of service in a mobile ad hoc network? Answering this question is the goal of this chapter.
3.1. Introduction to QoS
We should first define QoS. The most widely accepted definition in the international community is given by CCITT’s E.800 recommendation.
DEFINITION 3.1.– QoS is the collective result of service performance determining the degree of satisfaction of a service user.
We will first examine the QoS requirements of applications supported by mobile ad hoc networks.
Chapter written by Pascale MINET.
3.1.1. Different QoS requirements
As in wired networks, the flows generated by applications supported by mobile ad hoc networks have diverse characteristics, such as:
– type (voice, video, data) and volume of exchanged information (from a few bytes for an alarm to hundreds of megabytes for a video);
– duration of interactions (short to send an email, longer for viewing a movie);
– inter-arrival time of packets sent to the network (from a few milliseconds to several hours);
– grouped or non-grouped arrivals.
These flows also have different QoS requirements:
– some, like alarms, require priority processing;
– others, such as video flows, have bandwidth requirements;
– and still others, such as voice, require a maximum end-to-end delay and a maximum end-to-end jitter (the jitter is the maximum variation between the end-to-end delays experienced by the packets of a flow).
That is why uniform packet processing is not appropriate, and QoS support that takes the different QoS requirements into account is vital.
3.1.2. Chapter structure
This chapter is organized into five sections. In the introduction, we define QoS and demonstrate the necessity of QoS support in mobile ad hoc networks. Next, in section 3.2, we show why traditional solutions are not directly applicable. In section 3.3, we describe the different components necessary for providing QoS. In section 3.4, we present an example of QoS support, QoS OLSR, and provide an evaluation of its performance. Different parameters are evaluated by simulation. They allow us to assess route stability, avoidance of overloaded network zones, load sharing between routes, good resource utilization, limited overhead as well as mobility support. This solution has been implemented. The last section, section 3.5, summarizes this study and presents perspectives that provide new research directions.
3.2. Mobile ad hoc networks and QoS objectives
Why can’t solutions designed for wired networks be adopted as is in mobile ad hoc networks?
3.2.1. Characteristics of mobile ad hoc networks and QoS
QoS support in a mobile ad hoc network is more complex than in a wired network, for the reasons explained below.
3.2.1.1. Radio interference
The presence of radio interference is the main difference from the wired world.
DEFINITION 3.2.– The transmission of a node is said to interfere with that of another node when the signal-to-interference ratio at a receiver is lower than a given threshold.
Figure 3.1. Bandwidth consumed at MAC level by a CBR flow at 1 Mb/s
This interference phenomenon, inherent to radio transmissions, impacts local bandwidth available on a node, as shown in Figure 3.1. This figure represents bandwidth consumed on each node visited by a CBR (constant bit rate) flow at 1 Mb/s visiting nodes 0 to 6. The size of packets sent is 1,500 bytes, the network
used is an IEEE 802.11b network at 11 Mb/s, without RTS/CTS, the radio range is 250 m and the interference radius is 550 m. We notice that the bandwidth consumed at the MAC level varies from 3.5 Mb/s for the destination (node 6) to 8 Mb/s for node 3. It is thus imperative to take interference into account in any solution providing QoS support.
3.2.1.2. Limited resources
Mobile ad hoc networks are characterized by resources whose capacity is limited and varies over time. All flows present in the network must share the medium’s capacity. This capacity is generally smaller than in a wired network (for example: 11 Mb/s for IEEE 802.11b networks [MUH 02], and 54 Mb/s for IEEE 802.11g networks). In addition, the bandwidth granted to a flow can change quickly following the arrival of a new flow, as shown in Figure 3.2. Let us consider two CBR flows at 1,500 kb/s, denoted f1 and f2. Initially, only flow f1 is present. It receives a throughput equal to the requested throughput, 1,500 kb/s. At time t = 100 s, flow f2 is introduced, resulting in a significant deterioration of QoS for f1: both flows then receive a throughput fluctuating around 800 kb/s, which is unacceptable compared to the requested throughput. This example shows that interference must be taken into account in the decision to accept or reject a new flow.
Figure 3.2. Application bandwidth measured by recipient of each flow
3.2.1.3. Large dynamicity of a mobile ad hoc network
The quality of a radio link is highly variable. Numerous factors influence radio propagation, such as attenuation with distance, presence of obstacles, existence of multiple paths, weather conditions, etc. In order to avoid strong instability, only good quality links should be used. A radio link can be asymmetric: one end hears the other, but the opposite is not true. The probability of such links existing in a real network is significant. A protocol must not presume that a link is symmetric, but must verify this before using the link. In a mobile ad hoc network, the topology varies greatly over time, because existing links break and new links are created. New nodes join the network and others leave it. In addition, since a node can move freely, radio links are created and destroyed dynamically and frequently. This can lead to major traffic variations over time and space: load peaks move over different nodes, or appear and disappear over time. The strong dynamicity of mobile ad hoc networks results in frequent route changes. The success of mobile ad hoc networks comes from their ability to support node mobility without interrupting ongoing communications. Adding QoS support must preserve this property.
3.2.1.4. Broadcast and multihop transmission
In a mobile ad hoc network, a node cannot directly reach a node that is out of its radio range. It must use multihop routing. In a traditional broadcast network, a node reaches all the nodes in its network in a single transmission. This is no longer the case in mobile ad hoc networks: the message must be retransmitted to reach all nodes in the network. As we have seen in section 3.2.1.2, resources are limited, so the number of message retransmissions must be minimized to optimize broadcast. We should note that broadcast is widely used in a mobile ad hoc network.
For example, we can mention the service advertisement of an Internet gateway, route discovery in a reactive network and the advertisement of certain links in a proactive network. That is why optimized broadcast in these networks is so important.
3.2.1.5. Decentralized control
Mobile ad hoc networks are by nature decentralized. They do not need an infrastructure to operate. They self-organize and are able to reconfigure themselves without the help of a central entity. Consequently, they provide better performance, avoiding the bottleneck of a centralized solution (for example: congestion of links
close to the central entity). They are also more robust: the failure of a node is tolerated by the network. QoS support must preserve this property as well.
3.2.2. Routing in mobile ad hoc networks
Since mobile ad hoc networks are generally multihop, routing is necessary. We distinguish three routing protocol families:
– reactive protocols establish a route to a destination on demand. A route to a destination is created only when a user wants to transfer messages to this destination. While the route is being created, the messages are buffered at the source. Route creation uses a route discovery procedure based on flooding. Flooding is a broadcast in which each node retransmits a received message once. In this way, in an N node network, flooding of a message generates N transmissions of this message, which is extremely costly. The advantage of reactive protocols is that only the routes in use are maintained. This saves memory (smaller routing table) and message exchanges. AODV [PER 03a] and DSR [JOH 07] are examples of reactive protocols;
– proactive protocols maintain, in each node, a route to every other network node. To do this, they periodically exchange messages about the network topology. These protocols have the advantage of immediate route availability. OLSR [CLA 03] and TBRPF [OGI 04] are examples of proactive protocols;
– hybrid protocols use a proactive protocol to manage routes close to the node and a reactive protocol for the rest of the network. ZRP [HAA 02] is an example of a hybrid protocol.
IETF’s MANET working group [MAN 05] has standardized four of these protocols, which have become experimental RFCs; two are reactive and two are proactive:
– AODV reactive protocol in the form of RFC 3561 [PER 03a];
– DSR reactive protocol in the form of RFC 4728 [JOH 07];
– OLSR proactive protocol in the form of RFC 3626 [CLA 03];
– TBRPF proactive protocol in the form of RFC 3684 [OGI 04].
3.2.2.1. AODV: a reactive routing protocol
AODV [PER 03a] is a reactive protocol based on a distance vector. AODV eliminates the count-to-infinity problem (slowness in detecting that a node has become unreachable) and avoids routing loops by using a sequence number per destination. For a given destination, the information maintained corresponds to the highest sequence number. This helps in maintaining consistency of routing information. In
the AODV protocol, we can distinguish two procedures: route discovery and route break detection.
3.2.2.1.1. Route discovery
With AODV, a node N only establishes a route to a destination D when the user wants to transfer a message to this destination and N does not know any valid route to D. To do this, node N broadcasts an RREQ (Route REQuest) message to discover a route. It buffers messages to destination D until it has obtained a route. A node receiving this message for the first time:
– if it does not know a route to the destination, propagates the RREQ message and memorizes the return route to the RREQ source;
– if, on the other hand, it knows a route to this destination with a sequence number higher than or equal to the one contained in the RREQ message, responds with an RREP (Route REPly) message. The RREP message is transmitted point-to-point and follows the reverse path of the RREQ message to reach the source.
When the source receives the RREP message, it sends the waiting messages on the route obtained. The latency of this route discovery can prove to be unacceptable for certain applications.
3.2.2.1.2. Detection of route break
When a node belongs to a route in use, it monitors the link status of the next hops in active routes. When a link break is detected on a route in use, an RERR (Route ERRor) message is generated to signal this break to the other nodes, and the route becomes invalid. To do this, each node maintains a list of precursors containing the neighbors that use it as their next hop to the destination which has become unreachable. When this message is received, the source relaunches the route discovery procedure. Before sending an RERR message, the node upstream of the break can try to repair the route locally if the destination is not farther than a given number of hops. If the repair is successful, it is not visible from the source.
3.2.2.2. OLSR: a proactive routing protocol
OLSR [CLA 03] is a proactive routing protocol based on link state and optimized for mobile ad hoc networks.
It inherits properties from the OSPF protocol [MOY 98] widely used in the Internet. Optimization is obtained with the help of the multipoint relay (MPR) concept. A multipoint relay is a one-hop neighbor that has been selected as described in section 3.2.2.2.1. The use of multipoint relays will make it possible to:
1) decrease the size of control messages: instead of advertising all its one-hop neighbors, a node only advertises the neighbors that have selected it as multipoint relay;
2) decrease the number of retransmissions of a broadcast message: only the transmitter’s multipoint relays, and not all its one-hop neighbors, retransmit the broadcast message.
R0 forwarding rule of a broadcast message: a node retransmits a broadcast message if and only if it has received it for the first time from a node that has selected it as multipoint relay and the time-to-live of the message has not expired.
We now present the two main functions of OLSR: neighbor discovery and topology dissemination.
3.2.2.2.1. Neighbor discovery
In order to detect its neighbors, a node N periodically broadcasts (one-hop broadcast) HELLO messages containing the list of its known neighbors with their link status. The link with a neighbor N’ is:
– symmetric if both nodes N and N’ hear each other;
– multipoint relay if N and N’ hear each other and node N has chosen node N’ as multipoint relay;
– asymmetric if node N hears N’, but N’ does not hear N;
– lost if N no longer hears N’.
HELLO messages are received by all one-hop neighbors but are not propagated. The HELLO sending period is 2 s by default. HELLO messages enable each node to discover its one-hop neighbors as well as its two-hop neighbors. In this way, a node is able to select its multipoint relays among its symmetric one-hop neighbors in order to reach all its two-hop neighbors. More precisely, each node N independently executes the following multipoint relay selection procedure:
1) N selects as multipoint relay any node N’ that is the only neighbor of N able to reach some node at two hops from N;
2) N then selects as multipoint relay a node N’’ that reaches the largest number of nodes at two hops from N not yet covered. This step is repeated until all nodes at two hops from N are covered;
3) finally, N excludes from the set of multipoint relays any node N’’ such that the set without N’’ still covers all nodes at two hops from N.
Multipoint relay selection is performed at each change in the one-hop or two-hop neighborhood. OLSR only uses symmetric links whose quality has been judged satisfactory (for example: good level of signal reception, acceptable error rate, etc.).
3.2.2.2.2. Topology dissemination
Each node maintains network topology information with the help of TC (topology control) messages. Each node N selected as multipoint relay periodically broadcasts a TC message throughout the network. This message advertises all nodes that have selected N as multipoint relay. It is broadcast over the whole network and takes advantage of the multipoint relay optimization (see the forwarding rule R0). A node N’ will reach another node N’’ either directly or through one of its multipoint relays. All the information maintained in OLSR has a validity period, beyond which it is destroyed because it is outdated. Topology and neighborhood tables are used to calculate the routes of each node with Dijkstra’s shortest path algorithm. The routing table is recalculated whenever a topology or neighborhood change occurs.
3.2.2.3. Comparative OLSR and AODV performance evaluation
The relative performance of the OLSR and AODV protocols has been evaluated and compared in a large number of simulations. In [CLA 04], simulations have been carried out for an 802.11b network at 2 Mb/s with 50 nodes randomly spread over an area of 1,000 m × 1,000 m. The simulated period is 250 s. Each point on a curve in Figures 3.3 to 3.6 corresponds to the average value obtained over 30 simulations. The mobility model is the random model in which pause time varies from 0 to 5 s. In the traffic model, 25 CBR (constant bit rate) flows are launched. Each CBR flow generates a packet every 0.1 s. The packet size is 64 bytes.
These simulations show that even when the density of nodes or the traffic increases, or when nodes are mobile, OLSR provides:
– optimal routes in number of hops: see Figure 3.3;
– a better message delivery rate: see Figure 3.4;
– shorter transfer delays: see Figure 3.5;
– lower overhead: see Figure 3.6.
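The lower overhead of OLSR relies on the multipoint relay selection procedure of section 3.2.2.2.1. The following minimal sketch illustrates its three steps. It assumes that the input maps each symmetric one-hop neighbor of N to the set of two-hop neighbors it reaches; the function name, the input format and the tie-breaking by sorted node identifier are assumptions made for this example, not part of the OLSR specification.

```python
def select_mprs(two_hop_coverage):
    """Greedy multipoint relay (MPR) selection for one node N.

    two_hop_coverage: dict mapping each symmetric one-hop neighbor of N
    to the set of two-hop neighbors of N that this neighbor reaches.
    Returns the set of one-hop neighbors selected as multipoint relays.
    """
    all_two_hop = set().union(*two_hop_coverage.values())
    uncovered = set(all_two_hop)
    mprs = set()

    # Step 1: select every neighbor that is the only one able to reach
    # some two-hop node.
    for target in sorted(all_two_hop):
        reachers = [n for n, cov in two_hop_coverage.items() if target in cov]
        if len(reachers) == 1:
            mprs.add(reachers[0])
    for m in mprs:
        uncovered -= two_hop_coverage[m]

    # Step 2: repeatedly select the neighbor covering the largest number
    # of two-hop nodes that are not yet covered.
    while uncovered:
        candidates = sorted(set(two_hop_coverage) - mprs)
        best = max(candidates, key=lambda n: len(two_hop_coverage[n] & uncovered))
        mprs.add(best)
        uncovered -= two_hop_coverage[best]

    # Step 3: remove any relay whose removal leaves all two-hop nodes covered.
    for m in sorted(mprs):
        others = mprs - {m}
        covered = set().union(*(two_hop_coverage[o] for o in others))
        if covered >= all_two_hop:
            mprs = others
    return mprs


# Example topology (assumed): neighbors a, b, c; two-hop nodes w, x, y, z.
example = {"a": {"x", "y"}, "b": {"y", "z"}, "c": {"z", "w"}}
relays = select_mprs(example)  # {"a", "c"}
```

In this example, only a and c retransmit a broadcast message sent by N, instead of all three neighbors as with plain flooding, while every two-hop node is still reached.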
Figure 3.3. Route optimality a) according to mobility, b) according to the number of flows
Figure 3.4. Better delivery rate a) according to mobility, b) according to the number of flows
Figure 3.5. Better delays a) according to mobility, b) according to the number of flows
Figure 3.6. Lower overhead a) according to mobility, b) according to the number of flows
Similar results were found in [YUE 04]. These results are explained by the fact that route discovery in AODV is very costly in terms of network resources. Flooding is responsible for many collisions, which explains the message losses, the longer delays and the routes that are not optimal in number of hops. We should mention other OLSR advantages. OLSR complies with Internet principles. It inherits properties from OSPF, a protocol widely used in the Internet. The implementation of OLSR, entirely in user space, guarantees better security. OLSR is an optimized protocol. For a unicast transmission, it guarantees the shortest route to the destination. For a broadcast, using MPRs optimizes the number of message retransmissions. In their native version, the one corresponding to their RFCs, neither OLSR nor AODV manages QoS. How can QoS be managed in a mobile ad hoc network?
3.2.3. Realistic QoS objectives
We should first note that no IEEE 802 radio MAC protocol currently available guarantees that, as soon as the medium becomes free, it is allocated to the highest priority message waiting for transmission. In addition, the non-determinism of the IEEE 802.11 MAC protocol [MUH 02] makes strict QoS guarantees impossible. In order to satisfy QoS requirements in terms of bandwidth, a QoS support must know the bandwidth consumed, at the MAC level, by a flow at each node located in its interference area. This problem has been addressed by several authors for IEEE 802.11 networks. A compromise is sought between calculation complexity and the accuracy of the approximations obtained. These approximations of the bandwidth consumed at each node in the interference area are generally obtained with the help of analytical models validated by simulations. The interested reader can refer to [NGU 05a]. The presence of interference makes the bandwidth reservation problem NP-hard [ALL 05], whereas it is polynomial in a wired network.
Since this problem must be resolved whenever a new flow with QoS requirements is introduced, heuristics are generally used. Because of mobile ad hoc network characteristics presented in the previous section, the objective of QoS support in these networks is not to provide strict QoS guarantees, but to attempt to provide QoS that is close to the requested QoS, while ensuring good resource utilization and accepting a large number of flows.
3.3. QoS architecture and related QoS state of the art
Studies related to QoS can be structured with reference to the generic QoS architecture illustrated in Figure 3.7. This architecture contains five components that we will briefly describe.
Figure 3.7. QoS architecture (components: QoS model, admission control, QoS routing, QoS signaling, QoS MAC)
3.3.1. Different QoS components
The five components of the QoS architecture are:
1) QoS model, which is the first component to develop in QoS support. The development of all other components depends on this model. This model describes the QoS classes offered and their properties. Any QoS model includes at least two traffic classes: the one associated with QoS flows and the one associated with BE flows. We call:
– QoS flow, a flow with QoS requirements (for example: bandwidth, delay, jitter);
– BE (best effort) flow, a flow with no specific QoS requirement. This type of flow is processed as well as possible; this is the default processing.
2) admission control, which determines whether a new QoS flow is accepted: the new flow must not degrade the QoS provided to the QoS flows already accepted to an unacceptable level (see the example shown in Figure 3.2 in section 3.2.1.2). The decision depends on the QoS requested by this flow, the available resources and the QoS requested by the QoS flows already accepted. Different policies can be applied, such as:
– FIFO: flows are served according to their order of arrival in the network. If there are not enough resources for the new flow, it is rejected;
– fair sharing: each QoS flow receives a share of bandwidth proportional to its request;
– sharing based on price paid: each QoS flow receives a share of bandwidth proportional to the price paid.
The more accurate the evaluation of the bandwidth used by a flow, the better the admission control: it accepts the maximum number of flows that the network capacity allows. However, in mobile ad hoc networks, the presence of interference makes it difficult to evaluate the bandwidth consumed. Nevertheless, interference should not be ignored because it could cause unacceptable performance degradation. Admission control must therefore take interference into account; it is then said to be interference-aware. Due to the high dynamicity of mobile ad hoc networks, applying admission control only when a new QoS flow is introduced is not sufficient to maintain the QoS required by accepted QoS flows. A periodic QoS control is needed.
3) QoS routing, which is responsible for finding a route between a source and a destination that satisfies the QoS flow requirements. To do this, it uses QoS signaling to select a satisfactory route. Once found, this route remains fixed as long as the topology remains unchanged. In order to ensure this, different solutions can be used:
– source routing: the source includes in each packet of the flow the route that it must follow;
– soft state: each node visited by the flow maintains the path in a soft state which is periodically refreshed;
– label switching according to the MPLS (multiprotocol label switching) technique [LEF 00].
We should note that since QoS routing can be invoked frequently, for instance at each topology change, the algorithm used must be simple: as an example, its complexity must remain close to that of the Dijkstra algorithm. In addition, the shortest routes (smallest number of hops) tend to minimize the network resources used for transmitting each packet from source to destination. That is why considering the “number of hops” metric in route calculation is recommended.
Since some flows have bandwidth requirements, the bandwidth metric must also be considered. As with admission control, QoS routing must be interference-aware.
4) QoS signaling locally monitors and evaluates QoS parameters (the local available bandwidth, for example) and propagates them to the nodes involved. This dissemination of information must spare network resources. In the case of a broadcast, it must be optimized.
5) A MAC (medium access control) protocol with QoS is required for the best QoS support performance. An ideal QoS MAC protocol:
– would be deterministic;
– would grant medium access to the highest priority waiting packet;
– would provide MAC level QoS information. This information could be the number of slots occupied over a total number of consecutive slots, the waiting time before transmission of each packet, the number of packets sent/received successfully in a given time interval, etc.
The IEEE 802.11b protocol is not deterministic and does not manage several traffic categories. The IEEE 802.11e protocol manages four traffic categories but is still not deterministic; it only guarantees that, on average, the medium access time of a higher priority message is lower. Since IEEE 802.11b is commercially available, it is the protocol currently used in mobile ad hoc networks. We should mention, however, that QoS support with such a MAC layer will not produce the best performance.
3.3.2. QoS models
The first two QoS models for mobile ad hoc networks are INSIGNIA and SWAN. They are adaptations of the IntServ (integrated services) [BRA 94] and DiffServ (differentiated services) [CAR 98] models for mobile ad hoc networks. The third model, FQMM, is a hybrid model. The most recent trend is toward adaptiveness and optimization across the different protocol layers by adopting a cross-layering approach.
3.3.2.1. INSIGNIA approach
INSIGNIA [LEE 00], an adaptation of the IntServ model for mobile ad hoc networks, provides a guarantee for each accepted QoS flow. To do this, INSIGNIA maintains a state for each flow on each visited node. This state is created and maintained by the signaling protocol responsible for resource reservations. QoS is monitored periodically and in a distributed way. This model has two traffic categories: QoS traffic, called real time traffic, and BE traffic.
This QoS traffic corresponds to adaptive flows expressing their QoS requirements by two levels of bandwidth: a minimum and a maximum level. The MAC protocol is assumed to manage QoS. The reservation protocol of INSIGNIA, an adaptation of the RSVP protocol for ad hoc networks, verifies that the local bandwidth available on each node visited
by the flow will provide the bandwidth requested by this flow. If so, this flow is accepted either as a QoS flow with maximum bandwidth required, or as a QoS flow with minimum bandwidth required, or as a BE flow. Otherwise, this flow is rejected. Upon receiving packets from a QoS flow, the destination measures QoS parameters (for example: packet loss, delay, throughput achieved, etc.) and periodically sends a report to the source. The source can then adapt QoS flows to avoid congestion. We should note that bandwidth reservation does not take interference into consideration; only the nodes on the flow route reserve bandwidth for this flow. In addition, they only reserve the bandwidth required by the flow, which is inadequate, as shown in Figure 3.1.
3.3.2.2. SWAN approach
SWAN [AHN 02], an adaptation of the DiffServ model for mobile ad hoc networks, provides a guarantee per class instead of per flow, unlike INSIGNIA. In this way, nodes do not maintain a state per visiting flow. This model also has two traffic categories: QoS traffic, called real time traffic, and BE traffic. This QoS traffic corresponds to flows with bandwidth requirements. Contrary to INSIGNIA, the MAC protocol is not assumed to manage QoS. When a new QoS flow is introduced, the source takes over admission control: it sends a probe which, when it comes back, contains the minimum bandwidth available to the flow on the chosen route. According to the value obtained, the source decides to accept or reject the flow. In order to maintain acceptable transmission delays for QoS flows, each node must keep the bandwidth granted to QoS flows below a specific threshold. We can also mention that admission control with SWAN does not consider interference; only the nodes on the flow route reserve bandwidth for this flow. In addition, they only reserve the amount of bandwidth requested by this flow.
3.3.2.3. FQMM approach
The FQMM approach [XIA 00] is a hybrid method.
It provides a better QoS guarantee to a limited number of flows, and it provides a guarantee per class to other flows. In order to do this, it combines:
– the per-flow granularity of IntServ for high priority classes;
– the per-class granularity of DiffServ for low priority classes.
According to the authors of this model, it does not have a problem with scalability as long as the number of flows benefiting from a per-flow guarantee is small. It would seem that mobile ad hoc network characteristics are not taken into account either.
3.3.2.4. Cross-layering approach
The cross-layering approach is the most recent one. The different protocol layers interact in order to adapt and optimize resources to provide the required QoS. The authors of [NAH 04] have noticed that all mobile ad hoc IEEE 802.11 networks managing QoS expressed in terms of bandwidth use, at each layer, mechanisms to control bandwidth utilization. These mechanisms cooperate to different degrees by exchanging and sharing state information. These networks are said to use a cross-layering approach. As an example, the architecture recommended in [NIK 04] defines QoS metrics associated with each layer as follows:
– at application level, they define three QoS classes. Class I corresponds to applications with strong delay constraints such as voice. Class II is recommended for applications requiring high throughput, such as video. Class III corresponds to the best effort service;
– at network level, they recommend using the residual energy level, buffer occupancy level and node stability of each node to characterize the quality of a link going through this node;
– at MAC level, link quality can be expressed by the signal to interference plus noise ratio (SINR) on the link. Different coding algorithms can be used in the class considered.
3.3.3. QoS signaling
In order to optimize QoS management, QoS parameters, measured at each node, must be known to other nodes. A QoS signaling protocol is used in this case. Measured QoS parameters can be:
– local bandwidth available in the node [ALL 05, NGU 06];
– medium access delay [NAI 05], etc.
The first signaling protocol used in mobile ad hoc networks is INSIGNIA. It is an adaptation of RSVP [BRA 97], the resource reservation protocol for wired networks.
This protocol is not scalable.
BRuIT [CHA 02] is the first protocol to consider interference in bandwidth reservation in a mobile ad hoc network. BRuIT reserves a quantity of bandwidth equal to the quantity requested on the transmitting node, on the receiving node, and in the interference area.
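The reservation rule just described can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not BRuIT's actual implementation: the names `affected_nodes` and `try_reserve`, the per-node `available` table and the `interferers` map (the nodes assumed to lie in the interference area of a given node) are hypothetical and introduced for the example.

```python
def affected_nodes(tx, rx, interferers):
    """Nodes charged for one transmission tx -> rx.

    Assumption: the interference area is the set of nodes listed as
    interferers of either the transmitter or the receiver.
    """
    return {tx, rx} | interferers.get(tx, set()) | interferers.get(rx, set())


def try_reserve(available, interferers, tx, rx, requested):
    """Interference-aware reservation for a single hop (hypothetical helper).

    The request is accepted only if the transmitter, the receiver and every
    node of the interference area can each spare `requested` units of
    bandwidth; the same amount is then reserved on all of them.
    """
    nodes = affected_nodes(tx, rx, interferers)
    if any(available[node] < requested for node in nodes):
        return False  # at least one affected node lacks bandwidth: reject
    for node in nodes:
        available[node] -= requested
    return True


# Illustrative values only (arbitrary bandwidth units).
available = {"S": 1000, "R": 1000, "X": 300, "Y": 1000}
interferers = {"S": {"X"}, "R": {"X"}}
first = try_reserve(available, interferers, "S", "R", 200)   # accepted
second = try_reserve(available, interferers, "S", "R", 200)  # rejected: X is short
```

In this example node X carries no traffic of the flow, yet it is the one that causes the second request to be refused, which is exactly what a reservation limited to the nodes on the route would miss.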
Figure 3.8. Selection of multipoint relays by node A: a) native selection, MPR(A) = {C}; b) selection modified for bandwidth, MPR(A) = {B, C}
In OLSR, signaling is performed by means of multipoint relays. In [NGU 05b], we show the efficiency of native multipoint relays (i.e. selected according to the procedure presented in section 3.2.2.2.1) in optimizing information broadcasts throughout the network. This efficiency decreases as soon as multipoint relay selection is modified to take QoS parameters such as local available bandwidth and medium access delay into consideration. As an example, Figure 3.8 represents the selection of multipoint relays performed by node A:
– with the native procedure, A only selects node C as multipoint relay: see Figure 3.8a;
– with the procedure modified to reach each two-hop node through the neighbor with the highest bandwidth, and assuming that the bandwidth available at node B is 5 Mb/s and that at node C is 1 Mb/s (see Figure 3.8b), A selects as multipoint relays the nodes:
  - B to reach nodes D and E,
  - C to reach node F.
This modification of the selection procedure for multipoint relays leads to the selection of a larger number of multipoint relays. This number can reach 160% in a 200-node network with a density of 30 neighbors per node, and 140% in a 400-node network with a density of 20 neighbors per node. This is verified by Figure 3.9, which illustrates the number of nodes selected as multipoint relays in a 200-node network, with native selection represented as darker and the selection based on bandwidth as lighter.
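The two selections of Figure 3.8 can be reproduced with the following Python sketch. It is a simplified illustration, not the full OLSR procedure: the native selection is reduced to a greedy cover of the two-hop neighborhood, and the coverage assumed here (B reaches D and E, C reaches D, E and F) is inferred from the example above. The function names and data layout are assumptions made for the sketch.

```python
def native_mprs(coverage):
    """Simplified native selection: greedy cover of the two-hop neighbors.

    `coverage` maps each one-hop neighbor to the set of two-hop nodes it
    reaches. At each step the neighbor covering the most still-uncovered
    two-hop nodes is selected (ties broken by name).
    """
    uncovered = set().union(*coverage.values())
    mprs = set()
    while uncovered:
        best = max(sorted(coverage), key=lambda n: len(coverage[n] & uncovered))
        mprs.add(best)
        uncovered -= coverage[best]
    return mprs


def bandwidth_mprs(coverage, bandwidth):
    """Modified selection: each two-hop node is reached through the
    neighbor offering the highest available bandwidth."""
    mprs = set()
    for two_hop_node in set().union(*coverage.values()):
        candidates = [n for n in coverage if two_hop_node in coverage[n]]
        mprs.add(max(sorted(candidates), key=lambda n: bandwidth[n]))
    return mprs


# Neighborhood of node A, as assumed from the example of Figure 3.8.
coverage = {"B": {"D", "E"}, "C": {"D", "E", "F"}}
bandwidth = {"B": 5.0, "C": 1.0}  # Mb/s

print(native_mprs(coverage))                # {'C'}
print(bandwidth_mprs(coverage, bandwidth))  # B and C are both selected
```

The sketch makes the cost of the modification visible: the bandwidth criterion forces B into the relay set although C alone already covers every two-hop node.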
Figure 3.9. Number of multipoint relays in a 200-node network (number of MPRs versus network density, for native OLSR and MPRB selection)
It therefore follows that the number of retransmissions of a broadcast message through the network is larger, as Figure 3.10 shows, as soon as the native MPR selection is not used.
Figure 3.10. Number of retransmissions of a message with 200 nodes (number of retransmissions versus network density, for native OLSR and MPRB selection)
It is therefore important to retain the optimized broadcast provided by OLSR in any solution enhancing OLSR with QoS support, as we will see in section 3.4. We should mention that optimized broadcast is also useful in reactive routing protocols during the route discovery phase. The interested reader will find in [ALL 05] the performance gains obtained with AODV when optimized broadcast by multipoint relays is used instead of flooding during route discovery.
3.3.4. QoS routing
3.3.4.1. Complexity of QoS routing
Different metrics can be used for QoS routing. These metrics are classified into three types:
– additive metrics such as delay or number of hops. The delay on a route is equal to the sum of the delays on the links making up the route;
– concave metrics such as bandwidth. The bandwidth obtained on a route is equal to the minimum bandwidth obtained on each of the links making up the route;
– multiplicative metrics such as error rate. The error-free rate on a route is the product of the bit error-free rates of the links making up the route.
3.3.4.1.1. Routing based on a single criterion
If QoS routing considers only one criterion, the Dijkstra algorithm can be applied in a wired network to find the minimum cost route to any destination:
– if the problem is to find the minimum delay route, the cost of each link is equal to its delay;
– if the problem is to find the maximum bandwidth route, the cost of each link is equal to the inverse of its available bandwidth.
The problem to solve in mobile ad hoc networks is complicated by the presence of radio interference, which makes it NP-hard [ALL 05]. The aim is to evaluate how the acceptance of a new flow modifies the link delays or the bandwidths available in the interference area of the new flow. The interested reader can refer to [ALL 05] and [NGU 05b] for bandwidth and [NAI 05] for delays.
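The three metric types and the single-criterion case can be sketched in Python as follows. This is a minimal illustration for the wired case only (interference is ignored). The function names and the dictionary-based graph representation are assumptions introduced for the example.

```python
import heapq
from math import inf, prod


def route_delay(link_delays):
    """Additive metric: the route delay is the sum of the link delays."""
    return sum(link_delays)


def route_bandwidth(link_bandwidths):
    """Concave metric: the route bandwidth is the minimum link bandwidth."""
    return min(link_bandwidths)


def route_error_free_rate(link_error_free_rates):
    """Multiplicative metric: product of the link error-free rates."""
    return prod(link_error_free_rates)


def min_delay_route(graph, src, dst):
    """Dijkstra with link cost = link delay.

    `graph` maps each node to a dict {neighbor: delay}.
    Returns (total delay, list of nodes), or (inf, []) if dst is unreachable.
    """
    best = {src: 0.0}
    previous = {}
    done = set()
    heap = [(0.0, src)]
    while heap:
        delay, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        if node == dst:
            break
        for neighbor, link_delay in graph.get(node, {}).items():
            candidate = delay + link_delay
            if candidate < best.get(neighbor, inf):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if dst not in best:
        return inf, []
    path = [dst]
    while path[-1] != src:
        path.append(previous[path[-1]])
    return best[dst], path[::-1]


# Illustrative link delays (arbitrary units).
graph = {"A": {"B": 2, "C": 1}, "B": {"D": 2}, "C": {"D": 5}, "D": {}}
delay, path = min_delay_route(graph, "A", "D")  # route A, B, D with delay 4
```

Only the additive case is run through Dijkstra here; the composition functions show why a concave or multiplicative metric cannot simply be summed along a route.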
3.3.4.1.2. Routing based on several criteria
Let us examine the case where QoS routing considers several criteria. In the case of wired networks, it has been shown in [WAN 96] that if more than one additive and/or multiplicative metric is used for routing, then the problem of finding a route meeting these constraints is NP-hard. A simple way to proceed is to establish a hierarchy between the metrics. In [WAN 96], the authors consider that bandwidth is more important than delay and propose an algorithm providing a route with the largest bandwidth. If there are several such routes, then the one offering the shortest delay is retained. In [MUN 04], the authors favor the number of hops over bandwidth. Their algorithm provides the shortest route; if there is more than one, the one with the largest bandwidth is retained.
3.3.4.2. QoS extension of AODV
The authors of [PER 03b] propose to enhance the AODV routing protocol in order to manage QoS. They show how to extend the RREQ and RREP messages used during route discovery in order to select a route fulfilling the required QoS. Similarly, the ICMP QoS-LOST message is used to signal inability to meet the required QoS. Bandwidth reservation is only carried out on the route and not in the flow interference area.
3.3.4.3. QoS extensions of OLSR
Different solutions are proposed to enhance OLSR with QoS; see, for example, [GE 03, MOR 05, NAI 05, NGU 05b]. Some favor the bandwidth constraint, such as [GE 03, NGU 05b]; others favor the delay constraint, such as [NAI 05]. Others still, such as [BAD 04] and [MOR 05], focus on one of these constraints but not on both simultaneously. Certain solutions retain optimized broadcast, such as [NGU 05b] and [MOR 05].
3.4. An example of QoS support: QoS OLSR
As we have seen in section 3.2.2.2, OLSR in its native version (version of RFC 3626) does not manage QoS. In this section, we show how to enhance OLSR with QoS support. We briefly describe a solution and evaluate its performance.
For more detail, the reader can refer to [NGU 06]. This solution is implemented on the French Ministry of Defense/CELAR platform made up of an IEEE 802.11b network with 18 nodes.
3.4.1. Description of QoS OLSR
This solution extends OLSR to integrate QoS support. It is compatible with OLSRv2, the new OLSR version. It has four flow classes, in decreasing priority:
– control flows: OLSR flows belong to that class. Control flows have the highest priority;
– delay requirement flows;
– bandwidth requirement flows;
– BE flows without QoS constraints.
The last three classes are the only ones available to user flows. This solution reconciles QoS support and broadcast optimization. To do this, two types of MPRs are used. MPRs selected according to the native procedure, called MPRFs, are used to retransmit broadcast messages. MPRs selected according to the procedure modified for bandwidth, called MPRBs, are used for routing. Each node measures its available local bandwidth and includes this information in its HELLO message. In this way, each node knows the available local bandwidth of its one-hop and two-hop neighbors. A node selects its MPRFs according to the native selection and its MPRBs according to available local bandwidth. It reports these two selections in its HELLO. A node selected as MPRB announces it in a TC message containing:
– the minimum bandwidth in its interference area, which is limited to two hops according to our assumption;
– all nodes that have selected it as MPRB, with their available bandwidth.
The information contained in the TC messages enables us to build the topology table. This table, together with the one-hop and two-hop neighborhood tables, enables us to calculate the routing table. The routing table provides the shortest route for each destination. If there is more than one, the route with the largest bandwidth is chosen. This table is used for BE flows, which are routed hop by hop. QoS flows use source routing. When a new QoS flow arrives, the source performs admission control and finds, if it exists, a route satisfying the requested QoS.
In order to do this, a heuristic is used to estimate the bandwidth used by the flow at each node located in the flow’s interference area. The route obtained is the shortest route fulfilling the requested QoS. This route will be used by the flow as long as it is operational and there is no shorter route providing the requested bandwidth. The bandwidth used by BE flows is limited by a leaky bucket. This protects QoS flows from interference caused by BE flows.
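The route selection rules described above (shortest route, ties broken by the largest bandwidth, and, for a QoS flow, the shortest route able to provide the requested bandwidth) can be sketched in Python as follows. This is a simplified illustration, not the algorithm of [NGU 06]: it deliberately ignores the interference heuristic and treats each link as having an independent available bandwidth. The names `undirected` and `select_route` and the graph layout are assumptions made for the example.

```python
def undirected(links):
    """Build {node: {neighbor: bandwidth}} from (u, v, bandwidth) triples."""
    graph = {}
    for u, v, bandwidth in links:
        graph.setdefault(u, {})[v] = bandwidth
        graph.setdefault(v, {})[u] = bandwidth
    return graph


def select_route(graph, src, dst, min_bandwidth=0.0):
    """Shortest route in hops; among equal-length routes, the one with the
    largest bottleneck bandwidth. Links below `min_bandwidth` are ignored,
    which mimics the admission of a flow requesting that bandwidth.

    Returns (hops, bottleneck bandwidth, path) or None if no route exists.
    """
    hops = {src: 0}
    width = {src: float("inf")}  # best bottleneck among shortest routes
    previous = {}
    frontier = [src]
    while frontier and dst not in hops:
        next_frontier = []
        for node in frontier:  # the whole layer is processed before moving on
            for neighbor, bandwidth in graph.get(node, {}).items():
                if bandwidth < min_bandwidth:
                    continue
                candidate = min(width[node], bandwidth)
                if neighbor not in hops:
                    hops[neighbor] = hops[node] + 1
                    width[neighbor] = candidate
                    previous[neighbor] = node
                    next_frontier.append(neighbor)
                elif hops[neighbor] == hops[node] + 1 and candidate > width[neighbor]:
                    width[neighbor] = candidate  # same length, wider route
                    previous[neighbor] = node
        frontier = next_frontier
    if dst not in hops:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(previous[path[-1]])
    return hops[dst], width[dst], path[::-1]


# Illustrative topology (bandwidths in arbitrary units).
graph = undirected([
    ("S", "A", 2), ("A", "D", 2),
    ("S", "B", 5), ("B", "D", 1),
    ("S", "C", 3), ("C", "D", 3),
    ("S", "X", 10), ("X", "Y", 10), ("Y", "D", 10),
])
best_effort = select_route(graph, "S", "D")                  # two hops, via C
qos_flow = select_route(graph, "S", "D", min_bandwidth=4)    # three hops, via X and Y
```

The second call shows why a QoS flow may follow a longer route than the one in the routing table: no two-hop route can provide the requested bandwidth, so the shortest feasible route has three hops.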
3.4.2. Performance evaluation
Performance of this solution has been evaluated by simulation with NS2. It can also be evaluated by measures from the completed implementation. As an example, we have taken a 300-node IEEE 802.11b network with an average number of 14 neighbors per node. Radio range is 250 m, while interference radius is 550 m. For more detail, see [NGU 06].
type  flow  rate  src  dest  start  end  loss rate
QoS   f1    128   54   215   60     195  0.2%
QoS   f2    128   287  277   75     195  0.3%
QoS   f3    128   252  210   90     195  0.2%
BE    f4    300   28   118   40     100  0.6%
BE    f5    300   281  29    45     100  0.1%
Table 3.1. Three QoS and two BE flows
Figure 3.11. Overloaded zones bypass
We have first considered a scenario illustrated in Table 3.1, in which BE flows saturate the center of the network. QoS flows that are then introduced bypass the
overloaded zone (see Figure 3.11). When BE flows stop, QoS flows go through the center of the network to benefit from the shortest routes (see Figure 3.12). Throughput measured at the destination of each QoS flow shows that each one receives a throughput close to the requested throughput and that fluctuations are low (see Figure 3.13). Similarly, the loss rate measured at the destination is very low. This results in good route stability.
Figure 3.12. Choice of shortest routes after end of BE flows
Figure 3.13. Throughput obtained at each QoS flow destination
We have also shown that this solution provides good resource utilization, due to:
– a higher number of accepted QoS flows;
– optimized broadcasts;
– load sharing between several routes simultaneously active between two given nodes.
In a mobile ad hoc network, mobility support is important. The final scenario tested involves 6 flows at 128 kb/s, made up of 3 QoS flows and 3 BE flows. The network contains 100 nodes, each node having on average 20 neighbors. Each node randomly chooses a destination and travels at a speed of 3 m/s, 5 m/s or 10 m/s according to the simulation. The results are illustrated in Figure 3.14. QoS flows obtain a higher packet delivery rate than BE flows. Similarly, the throughput measured at the destination is more stable for QoS flows.
Figure 3.14. Average flow delivery rate as a function of node speed
3.5. Conclusion
3.5.1. Summary
In this chapter, we have shown why QoS support is much more complex in mobile ad hoc networks than in wired networks. We have laid out a state of the art on studies in this field. This state of the art was structured in reference to a generic
QoS architecture. We have then presented an example of a solution based on the OLSR routing protocol. The performance of this solution was evaluated in different scenarios and under different mobility conditions.
3.5.2. Perspectives
The performance evaluation presented was done by simulation. It would be interesting to use analytical models [JAC 03] to evaluate, for example, the overhead introduced by QoS support. Another perspective involves scaling for QoS support. In OLSR, the Fish Eye extension [PEI 00] turned out to be quite useful. Would we find the same result with QoS support? How does QoS support behave in a very dense network? Does overhead remain acceptable? If we consider ultra-mobile networks, is the overhead introduced by QoS support justified by a significant improvement of the QoS perceived by the user?
3.6. Bibliography
[AHN 02] AHN G.S., CAMPBELL A., VERES A., SUN L.H., “SWAN: Service Differentiation in Stateless Wireless Ad Hoc Networks”, INFOCOM’2002, New York, June 2002.
[ALL 05] ALLARD G., GEORGIADIS L., JACQUET P., MANS B., “Bandwidth Reservation in Multihop Wireless Networks: Complexity, Heuristics and Mechanisms”, International Journal of Wireless and Mobile Computing, Interscience, 2005.
[ALL 05] ALLARD G., Distribution des ressources et routage dans les réseaux mobiles ad hoc, University of Paris VI, thesis, Paris, September 2005.
[BAD 04] BADIS H., MUNARETTO A., AL AGHA K., PUJOLLE G., “Optimal path selection on a link state QoS routing”, VTC’2004 Spring, Milan, Italy, May 2004.
[BRA 94] BRADEN R., CLARK D., SHENKER S., “Integrated Services in the Internet Architecture”, IETF RFC, no. 1633, 1994.
[BRA 97] BRADEN R., ZHANG L., BERSON S., HERZOG S., JAMIN S., “Resource Reservation Protocol (RSVP) – Version 1 Functional Specification”, IETF RFC, no. 2205, September 1997.
[CAR 98] CARLSON M., WEISS W., BLAKE S., WANG Z., BLACK D., DAVIES E., “An Architecture for Differentiated Services”, IETF RFC, no. 2475, December 1998.
[CHA 02] CHAUDET C., GUERIN-LASSOUS I., “BRuIT: Bandwidth Reservation under Interference Influence”, European Wireless, 2002.
[CHE 04] CHEN Y., LEE C.W., MA SU MYAT M.S., Performance Analysis of Ad Hoc Networks Protocols Through Simulation, http://www.comp.nus.edu.sg/~cs4274/termpapers/0405-1/paper-1.pdf, 2004.
[CLA 03] CLAUSEN T., JACQUET P., “Optimized Link State Routing Protocol”, IETF RFC, no. 3626, October 2003.
[CLA 04] CLAUSEN T., Comparative Study of Routing Protocols for Mobile Ad Hoc Networks, Research report INRIA, no. 5135, March 2004.
[GE 03] GE Y., KUNZ T., LAMONT L., “Quality of Service Routing in Ad Hoc Networks Using OLSR”, HICSS’03, Big Island, Hawaii, January 2003.
[HAA 02] HAAS Z.J., PEARLMAN M.R., SAMAR P., The Zone Routing Protocol (ZRP) for Ad Hoc Networks, IETF Internet draft, July 2002.
[JAC 03] JACQUET P., LAOUITI A., MINET P., VIENNOT L., “Performance Analysis of OLSR Multipoint Relay Flooding in Two Ad Hoc Wireless Network Models”, RSRCP Journal, Mobility and Internet special edition, December 2001.
[JOH 07] JOHNSON D., HU Y., MALTZ D., The Dynamic Source Routing Protocol (DSR) for Mobile Ad Hoc Networks for IPv4, IETF RFC 4728, February 2007.
[LEE 00] LEE S.B., AHN G.S., ZHANG X., CAMPBELL A.T., “INSIGNIA: an IP-Based Quality of Service Framework for Mobile Ad Hoc Networks”, Journal of Parallel and Distributed Computing, no. 60, 2000.
[LEF 00] LE FAUCHEUR F., WU L., DAVARI S. et al., MPLS Support of Differentiated Services, Internet draft, work in progress, August 2000.
[MAN 05] MANET working group, http://www.ietf.org/html.charters/manet-charter.html, November 2005.
[MOR 05] MORARU L., SIMPLOT-RYL D., QoS preserving topology advertising reduction for OLSR routing protocol for mobile ad hoc networks, Research report INRIA, no. 0312, September 2005.
[MOY 98] MOY J., “OSPF Version 2”, IETF RFC, no. 2328, April 1998.
[MUH 02] MUHLETHALER P., 802.11 et les réseaux sans fil, Eyrolles, Paris, 2002.
[MUN 04] MUNARETTO A., GUERIN-LASSOUS I., CHAUDET C., MALES D., BADIS H., NAIMI A., ALLARD G., NGUYEN D.Q., GAWEDZKI I., Spécification des algorithmes de quality of service, de routage, de contrôle d’admission et de contrôle de ressources, SAFARI RNRT Project, 2004.
[NAH 04] NAHRSTEDT K., SHAH S., CHEN K., “Cross-Layer Architectures for Bandwidth Management in Wireless Networks”, in M. Cardei, I. Cardei, D.Z. Zhu (ed.), Resource Management in Wireless Networking, Kluwer Academic Publishing, Boston, 2004.
[NAI 05] NAIMI A., Délai et routage dans les réseaux ad hoc 802.11, University of Versailles Saint-Quentin, thesis, Versailles, September 2005.
[NGU 05a] NGUYEN D.Q., MINET P., “Evaluation de la bande passante nécessaire au niveau MAC 802.11b”, Proceedings from CFIP’05, Bordeaux, March 2005.
[NGU 05b] NGUYEN D.Q., MINET P., “XX”, MedHocNet’05, Porquerolles, June 2005.
[NGU 06] NGUYEN D.Q., MINET P., “QoS Support and OLSR Routing in a Mobile Ad Hoc Network”, ICN06, Mauritius, April 2006.
[NIK 04] NIKAEIN N., BONNET C., A Glance at Quality of Service Models for Mobile Ad Hoc Networks, http://www.eurecom.fr/~nikaeinn/qos.pdf, 2004.
[OGI 04] OGIER R., TEMPLIN F., LEWIS M., “Topology Dissemination Based on Reverse Path Forwarding (TBRPF)”, IETF RFC 3684, February 2004.
[PEI 00] PEI G., GERLA M., “Fisheye State Routing: A Routing Scheme for Ad Hoc Wireless Networks”, IEEE ICC’00, New Orleans, 2000.
[PER 03a] PERKINS C., BELDING-ROYER E., DAS S., “Ad Hoc on Demand Distance Vector (AODV) Routing”, IETF RFC, no. 3561, July 2003.
[PER 03b] PERKINS C., BELDING-ROYER E., Quality of Service for Ad Hoc on Demand Distance Routing, draft-perkins-manet-aodvqos-02.txt, October 2003.
[WAN 96] WANG Z., CROWCROFT J., “Quality of Service Routing for Supporting Multimedia Applications”, IEEE Journal of Selected Areas in Communications, vol. 14, no. 7, 1996.
[XIA 00] XIAO H., SEAH W.K., LO A., CHUA K.C., “A Flexible Quality of Service Model for Mobile Ad Hoc Networks”, IEEE VTC’2000 Spring, Tokyo, Japan, May 2000.
Chapter 4
Multicast Ad Hoc Routing
Chapter written by Houda LABIOD.
4.1. Introduction
This chapter discusses multicast routing in MANETs. It is divided into two parts: the first part is dedicated to a state of the art on multicast routing protocols, while the second part explains our new approach, which consists of integrating the concept of “quality of connectivity” and criteria relative to an ad hoc physical environment in the context of multicast ad hoc routing. As shown in the previous two chapters, routing has quickly become a major problem that must be addressed in ad hoc networks. The challenge is therefore to develop new protocols that meet a set of criteria: robustness, simplicity, scalability and optimization of resources (energy conservation, etc.). The concept of group communication in packet transmission networks has been intensively studied these last few years because of the emergence of a considerable number of multicast applications (bandwidth-intensive multicast-based applications) [PAD 01], together with the rapid proliferation of Internet applications. The advantage of group communication is its capacity to efficiently conserve bandwidth and network resources by allowing the transmitter to send data in a single transmission to a group of receivers [NIK 00]. Ad hoc environments provide an important field of deployment for this type of application because the nodes often follow group behavior to meet the expected objectives. By extending multicast connections to the ad hoc field,
applications such as videoconferencing, distributed games or CSCW (Computer Supported Collaborative Work) can be offered with increased performance thanks to network resource optimization. Nevertheless, this type of communication is not yet supported, despite the fact that wireless links are broadcast by nature and are thus perfectly suited to it. Multicast will certainly play an important role in future MANET deployments, but critical problems still remain to be resolved. We focus on routing in this chapter. The expected advantages are efficiently reducing bandwidth usage, decreasing communication cost, efficiently transmitting data by taking into account high unpredictability in node mobility and supporting a dynamic topology with unreliable and often bad quality wireless links. Most studies on multicast are based on the fixed Internet architecture. The mechanisms developed are not adapted to multihop wireless communications because of the use of multicast trees, which are very difficult to maintain when connectivity changes frequently. Moreover, multicast trees require a heavy global routing structure such as link state or distance vector. This leads to a frequent exchange of control data causing high overheads. In addition, storage capacity and power consumption of nodes are severely limited. Also, a wireless medium significantly deteriorates the quality of transmissions (near-far effect, multipath propagation, fading, hidden and exposed station problems).
4.2. Multicast routing in MANETs: a brief state of the art
In this section, we present the advantages of multicast routing in the context of multihop wireless communications and we list the most recent protocols.
4.2.1. Classification
The main advantage comes from the broadcast capability of the radio interface to transmit multicast traffic.
Nevertheless, the multicast technique requires that more capacity be added to the network and new functions be supported (multicast addressing, specific signaling protocols, etc.). Until now, only a few protocols have been proposed for ad hoc networks. Multicast protocols used for static networks such as DVMRP, CBT, PIM and MOSPF are not adapted to the context of ad hoc networks. This is caused by the fragile multicast tree structure that needs to be rebuilt every time connectivity changes. Besides, multicast trees usually require a global routing sub-structure such as link state or distance vector. This requires the frequent exchange of routing vectors.
Other traditional multicast protocols for downstream and upstream links are not adapted because creating and maintaining this type of link is not efficient in a wireless network. A different type of multicast routing protocol is therefore necessary. These new protocols must modify the widely used traditional tree structure or deploy a different topology connecting group members [LEE 00a]. Developing this type of protocol is understandably complex. The structure of groups can change and network topology can also evolve (links and/or nodes can dynamically and randomly appear/disappear). [OBR 98] discusses requirements which must be addressed with multicast routing: minimizing network load, providing reliable transmission, building optimum routes, providing robustness, efficiency, active adaptability and unlimited mobility. Overall, we notice the existence of two classes: protocols handling a tree structure, called tree-based protocols (for example, MAODV, ABAM, ADMR), and those based on a mesh structure, called mesh-based protocols (for example, ODMRP, PatchODMRP). Because we have chosen to base our contributions on reactive routing, this section will focus on on-demand protocols. The MAODV (multicast ad hoc on-demand distance vector) routing mechanism extends AODV to support multicast capabilities [ROY 99]. It builds shared multicast trees when necessary to connect members of a multicast group. A group leader is a node maintaining a multicast group sequence number. MAODV is able to support unicast, broadcast and multicast simultaneously. The ABAM (associativity-based multicast) protocol [TOH 00] establishes multicast sessions by using a criterion of association stability, a concept introduced in the ABR unicast protocol [TOH 99], and does not need an underlying unicast routing protocol. Multicast trees are built using this criterion, making it possible to reduce overheads and improve end-to-end delay.
The ODMRP (on-demand multicast routing protocol) [LEE 00b] is based on a mesh structure that connects group members by using the concept of forwarding group nodes. Mesh flooding is applied. ODMRP periodically floods control packets to create and maintain the routing structure used to transmit multicast traffic. In particular, when a source is active, it periodically broadcasts Join-Data control packets. A node receiving this packet stores the identity of the upstream node (backward learning) and rebroadcasts the packet. When the Join-Data packet reaches a multicast receiver, this receiver creates a Join-Table and transmits it to its neighbors. A node receiving this table becomes a member of the forwarding group if the identity of the next node in one of the table’s entries corresponds to its own identity. It then broadcasts its own Join-Table. Each member of the forwarding group propagates the Join-Table until it reaches the source by following the shortest route. This procedure builds and updates the routes from sources to receivers, forming a mesh. The source refreshes information relative to the group structure and updates routes by periodically sending Join-Data packets. Group maintenance takes place
with the help of a soft state approach. PatchODMRP [LEE 01] extends ODMRP by offering a better way to support a reduced number of sources during high mobility. To guarantee high throughput, the Join-Query interval should be initialized to lower values with high mobility. It will, nevertheless, still use a shortest route criterion. The adaptive demand-driven multicast routing (ADMR) protocol [JET 01] creates source-based trees connecting each source with the multicast group receivers. The forwarding state for a given group and source is represented by a structure of source-routed tree relay. The forwarding mechanism is based on the shortest delay route through the tree connecting destination members of a multicast group. A sequence number is included in the packet header. This number uniquely identifies the packet. Packet forwarding is based on two types of flooding techniques: tree flood and network flood. The first one involves multicast tree nodes; this is indicated in the packet for source destination addresses (multicast group address). The second type of flooding involves all nodes of the network. ADMR sends keep-alive messages to maintain the forwarding tree’s current state. The absence of data packets and keep-alive messages during a period of time indicates a disconnection in the forwarding tree. A local recovery procedure is executed to reconnect the tree. If it fails, a global procedure is launched. In addition, ADMR defines a pruning mechanism that is applied when no passive acknowledgments are received from the downstream nodes. Other protocols such as MZR (multicast routing protocol based on zone routing) [MOU 01a], the simple multicast and broadcast protocol and MOLSR (multicast OLSR) [WIK 05] can be mentioned.
4.2.2. Summary
We have pointed out that most existing protocols are faced with different problems and limitations regarding tree maintenance and frequent reconfiguration during a disconnection in the network.
Some of these protocols depend on upstream and downstream nodes, which requires considerable storage and control. Others use the shortest route criterion to select routes, which is not always well adapted to the quick and unpredictable variation of ad hoc network topology. As with unicast routing, multicast routing is a young research field, with far fewer proposals than in the unicast case. There is no standard, and critical issues remain unsolved, requiring more research work from the scientific community. Throughout our investigations, we have observed the following:
– performance studies are not completely finalized;
– validation of proposals is most often accomplished by simulation with very little theoretical analysis;
– complex analytical studies are very far from being completed;
– the lack of a standard in this field has enabled several proposals to emerge;
– various metrics are proposed to ensure that network-related constraints are respected.

In this context, we have proposed a new multicast routing approach based on a different routing strategy, modifying the traditional tree structure or deploying a different topology to connect group members. Consequently, we have proposed a new on-demand multicast routing protocol called SRMP (source routing-based multicast protocol). Our mechanism is loop-free and is meant to decrease routing overhead while efficiently providing robustness to node mobility, adaptation to wireless channel fluctuations and optimized use of network resources.

4.3. SRMP

SRMP is a mesh-based protocol. The mesh structure is established on an on-demand basis to connect group members, providing enhanced connectivity between multicast members. We define a multicast mesh as a subset of the network topology delivering at least one route from each source to each receiver belonging to a given group. During the mesh creation phase, we apply the concept of relay nodes called forwarding group (FG) nodes [CHI 98]. The source routing mechanism proposed in the unicast DSR protocol [JOH 96] is applied according to a different methodology. Multiple routes are available based on a prediction of the future link state. In the following sections, we give a general description of SRMP. Then, we explain how the selection metrics used in the mesh creation phase to choose FG nodes are calculated, which is a central element of our proposal. We also give the defined data structures and explain the principle of the different procedures.

4.3.1. Description

SRMP is an on-demand multicast routing protocol. Route selection is defined following the mesh creation phase, launched by multicast receivers, for each multicast session.
We introduce two major concepts to solve routing problems: 1) route availability concept; 2) concept of route longevity depending on an energy-conserving mechanism.
The first criterion enables the protocol to distinguish between available and unavailable routes. A route is considered available or unavailable based on the radio quality of each of its links and on the stability of the nodes at the ends of each link. The second criterion involves choosing wireless links so as to optimize the battery life of the nodes involved in the route. Combining these two parameters has a considerable impact on routing efficiency and reflects our new concept of quality of connectivity. Compared to tree-based approaches, this type of approach presents several advantages, among which we can cite the following:
– providing multiple redundant routes between members to deal efficiently with quick and unpredictable changes in topology liable to disrupt multicast data flow or require rebuilding of the routing structure. This is accomplished by using the FG concept and implementing a mesh structure for each multicast group;
– applying a selective, limited-scope flooding technique to avoid global network flooding;
– offsetting multicast tree limitations (intermittent connectivity, traffic concentration, reconfiguration/reconstruction of trees);
– applying the quality of connectivity concept, represented by a set of efficient criteria for FG node selection, so that route availability and connectivity between nodes are as efficient as possible.

In conclusion, the main asset of SRMP resides in its capacity to provide multiple stable routes based on a state prediction function. These routes also guarantee node stability in relation to their neighbors, high connectivity between nodes and a longer battery life. Clearly, efficient multicast transmission depends greatly on the choice of FG nodes and on the maintenance of the core routing structure represented by the FGM (forwarding group mesh).
SRMP must strike a compromise between mesh size (number of chosen nodes) and the availability and stability of selected routes.

4.3.1.1. Selection criteria for FG nodes

We define four selection metrics: association stability, signal strength, link availability and battery energy level.

4.3.1.1.1. Association stability

This metric represents the stability of a node in relation to its neighborhood. It is equivalent to the one introduced in the ABR protocol [MOU 03a], known as the degree of link stability. Association stability is calculated by each node.
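As a rough illustration of how each node might compute association stability, the sketch below counts the beacons received from each neighbor, in the spirit of a link stability degree. The class name, the threshold value and the reset-on-missed-beacon rule are assumptions made for illustration only; the text does not specify them.

```python
from collections import defaultdict


class AssociationTracker:
    """Hypothetical per-node tracker; not taken from the SRMP specification."""

    def __init__(self, stable_threshold=5):
        # Assumed threshold: beacons needed before a neighbor counts as stable.
        self.stable_threshold = stable_threshold
        self.ticks = defaultdict(int)

    def on_beacon(self, neighbor_id):
        # Each beacon heard from a neighbor raises its association count.
        self.ticks[neighbor_id] += 1

    def on_beacon_missed(self, neighbor_id):
        # Assumption: a missed beacon suggests the neighbor moved away, so reset.
        self.ticks[neighbor_id] = 0

    def is_stable(self, neighbor_id):
        return self.ticks[neighbor_id] >= self.stable_threshold


tracker = AssociationTracker(stable_threshold=3)
for _ in range(3):
    tracker.on_beacon("B")
tracker.on_beacon("C")
print(tracker.is_stable("B"), tracker.is_stable("C"))  # True False
```

A neighbor heard only once (here "C") is not yet considered stable, which matches the intuition that long-lived neighbors make better relay candidates.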
4.3.1.1.2. Signal strength

This metric measures the strength of the received signal, indicating the strength of the connectivity between each pair of nodes. SRMP uses this metric to choose the strongest links; it is calculated as the signal level of the received beacon. We classify links as strong or weak by comparing the signal level to a predefined threshold.

4.3.1.1.3. Battery life

This metric gives the current battery power of each node and is calculated periodically. Its value is a decreasing function of time based on the number of packets processed. $B_p(0)$ is the initial energy, which is the same for every node. The node has an acceptable level of energy (see formula [4.1]) if it satisfies a predefined threshold:

$$B_p(t) = B_p(\mathrm{current}) - \left[ PC_{gp} + PC_{rp} + PC_{fp} + K \right] \qquad [4.1]$$

with:
– $B_p(t)$: battery power level at time $t$;
– $B_p(\mathrm{current})$: current power level (initially, $B_p(\mathrm{current}) = B_p(0)$);
– $PC_{gp}$: total power consumed for each generated packet (including processing and transmission time);
– $PC_{rp}$: total power consumed for each packet received (including receiving and processing time);
– $PC_{fp}$: total power consumed for each packet relayed (including receiving, processing and transmission time);
– $K$: energy consumed by the node’s physical resources (equipment).

4.3.1.1.4. Link availability estimation

Route reliability is an important characteristic that helps to eliminate rerouting and to select the best routes. SRMP decreases the frequency of route breaks by using a metric that measures link availability during the route selection procedure. Calculation of this metric is based on a probabilistic model [MOU 02a] which predicts future link availability, and thus route availability.
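Formula [4.1] can be read as a running update of a node's remaining energy. The following minimal sketch applies it once; the function names, the numeric values (in arbitrary energy units) and the reading of "acceptable" as "at or above the threshold" are assumptions made for illustration.

```python
def update_battery(bp_current, pc_gp, pc_rp, pc_fp, k):
    """Formula [4.1]: Bp(t) = Bp(current) - [PCgp + PCrp + PCfp + K]."""
    return bp_current - (pc_gp + pc_rp + pc_fp + k)


def has_acceptable_energy(bp, threshold):
    # Assumption: "acceptable" means at or above the predefined threshold.
    return bp >= threshold


BP0 = 100.0  # hypothetical initial energy, identical for every node
bp = update_battery(BP0, pc_gp=2.0, pc_rp=1.0, pc_fp=3.0, k=0.5)
print(bp)                               # 93.5
print(has_acceptable_energy(bp, 20.0))  # True
```

Calling `update_battery` again with the returned value as `bp_current` reproduces the decreasing behavior over time described for this metric.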
4.3.2. Operation

Similarly to reactive routing protocols, SRMP comprises two phases: a request phase followed by a reply phase. The first phase uses a route discovery procedure to find optimum routes for SRMP to reach the multicast group. Multiple routes are built during the reply phase, where the selection of FG nodes and mesh construction are completed. The following sections describe the request phase, reply phase, FG node selection, data transmission and packet forwarding in a mesh structure.

4.3.2.1. Route request phase

This phase starts when a source node that is not a member of the multicast group wants to join the group. The source starts by broadcasting a Join-request packet to its neighbors using a route discovery procedure for the multicast group. The Join-request packet contains a source identifier, a multicast group identifier and a sequence number updated by the source node. To eliminate multiple copies of Join-request packets, each node receiving one compares the identifier of each packet received with those saved in its Multicast_Message_Duplication_Table. The SRMP route request phase differs from that of DSR. In fact, the source routing mechanism is not the same, since the source route is accumulated in the Join-reply packet during the reply phase instead of during the route request phase. In this way, we eliminate overheads caused by routing and transmission.

4.3.2.2. Reply phase and FG node selection

Initially, a multicast receiver starts a reply phase. When it receives a Join-request packet, it verifies its stability in relation to its neighbors, the strength of the received signal and the availability of its links with its neighbors. Battery life is also verified by considering the energy needed for a transmission to each neighbor. A neighbor is chosen as FG node if the quality metrics satisfy predefined threshold conditions.
If that is the case, the receiver sends a Join-reply message to the FG node and initializes that node’s member type in its Neighbor_Stability_Table. If no neighbor node satisfies the quality conditions, the one with the best metrics is chosen as FG node. The Join-reply packet records the multicast group identifier. The route from the receiver (Join-reply source) to the node is built during Join-reply packet propagation. Upon receiving a Join-reply packet, an FG node starts by creating an entry for the multicast group in the Multicast_Routing_Cache table. The node sets its state to FG node and copies the reverse route accumulated in the received Join-reply packet. It also stores the source given in the Join-reply packet and the time of packet reception. The node in turn repeats the process to choose FG nodes among its neighbors. This process continues until the source is reached, building the mesh structure connecting group members. Once the Join-reply packet is received, the source becomes a multicast source and creates an entry for the multicast group in its Multicast_Routing_Cache table. Depending on the mesh structure, more than one Join-reply packet can be received by the source for the same multicast group. In this way, multiple routes are retained for the same multicast group.

Once the mesh is created, new Join-request packets can be sent for any multicast group; replies can be transmitted from any FG member that still has valid routes in its cache to reach the multicast group. In this case, the FG node sends its Join-reply packet, with the corresponding route stored in its cache, to the source which transmitted the request. The procedure follows the same neighbor selection mechanism described previously, until it reaches the requesting node. Loops are avoided during Join-reply packet propagation thanks to the source routing mechanism.

4.3.2.3. Data forwarding

This section describes how data is transmitted through the mesh. A source starts transmission by selecting one of the routes saved in its cache. The criterion chosen is the shortest route in terms of number of hops, and the most recent route if more than one route exists. Each FG node receiving the data packet forwards it if it has at least one valid route to reach the multicast group in its cache, as long as the packet is not a duplicate. This is a very interesting characteristic of SRMP, as it avoids transmitting data packets on invalid routes and minimizes traffic overhead. This process continues until all multicast receivers are reached.
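The route choice at the source and the forwarding test at an FG node can be sketched as follows. This is only an illustrative reading of the rules above: the `CachedRoute` structure, its field names and the use of "most recent" as a tie-break among equally short routes are assumptions, not definitions taken from SRMP.

```python
from dataclasses import dataclass


@dataclass
class CachedRoute:
    hops: list          # node identifiers along the route (hypothetical layout)
    updated_at: float   # time of the last refresh of this cache entry
    valid: bool = True


def select_route(routes):
    """Pick the shortest valid route; ties go to the most recently updated."""
    candidates = [r for r in routes if r.valid]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (len(r.hops), -r.updated_at))


def should_forward(packet_id, seen_ids, routes):
    """FG-node rule: forward only non-duplicate packets when a valid route exists."""
    if packet_id in seen_ids:
        return False
    seen_ids.add(packet_id)
    return any(r.valid for r in routes)


cache = [
    CachedRoute(["S", "A", "B", "R"], updated_at=10.0),
    CachedRoute(["S", "C", "R"], updated_at=5.0),
    CachedRoute(["S", "D", "R"], updated_at=8.0),
]
print(select_route(cache).hops)  # ['S', 'D', 'R']
```

In this example the two three-node routes tie on length, so the one refreshed at time 8.0 is preferred over the one refreshed at time 5.0.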
A multicast receiver receiving a data packet for the first time creates an entry in its Receiver_Multicast_Routing_Table. To guarantee data transmission to all multicast receivers, the nodes duplicate the transmission if the chosen route goes directly to the multicast group.

4.3.3. Maintenance procedures

SRMP introduces several mechanisms to adapt the mesh structure to the variations that can occur in an ad hoc network. In this way, the mesh is updated, link failures are detected and repaired, neighborhood information is kept up to date and a pruning scheme enables any network node to leave the group. Our objective is to maintain route longevity for as long as possible. We rely on MAC layer beacons and introduce two new messages: the Multicast-RERR message and the Leave Group message.

4.3.3.1. Notification of neighbor existence mechanism

SRMP uses MAC layer beacons to deliver current information on the existence of neighbors. When a node receives a beacon from its neighbor, it updates its Neighbor_Stability_Table. This entry is updated by refreshing the connection’s quality parameters: association stability, received signal strength, link availability prediction and current energy level.

4.3.3.2. Mesh refresh mechanism

We have developed a simple mechanism that uses data packet propagation and requires no additional overhead. During data packet transmission, the mesh routes that are used are refreshed. Each time the source transmits a data packet, it updates the timer of the route used in its cache. Typically, an FG node forwarding this packet scans the packet header and refreshes the corresponding entry’s timer in its cache. In addition, a multicast receiver analyzes the header of all received data packets, refreshing the table entry’s timer for the source.

4.3.3.3. Link repair mechanism

SRMP reacts to node mobility on demand: it detects a link failure during data transmission thanks to MAC layer support. We propose two mechanisms, which maintain routes: 1) when a link failure occurs between two FG nodes; 2) when a link is broken between a multicast receiver and an FG node. We should note that reconfiguration of the mesh is not necessary if stability characteristics combined with route battery life remain valid throughout multicast communications. SRMP follows the same idea as proposed in DSR for a link failure between two FG nodes. In this case, the node detecting the failure must inform the originating source. It begins by generating a Multicast-RERR packet indicating the broken link, then it erases from its cache the routes containing this link.
Nodes on the route to the source that receive this packet also erase from their cache all routes containing the broken link. In the case of a connection failure between an FG node and a multicast receiver, the FG node detecting the break simply erases the receiver from its Neighbor_Stability_Table. Periodically, each FG node checks its neighbor table and erases from its cache the routes to multicast groups that have no members. In this case, a Multicast-RERR message is sent to all member neighbors informing them of the failure.

4.3.3.4. Pruning scheme

SRMP provides an efficient pruning mechanism making it possible for any member node to leave a multicast session. We observe two cases: 1) when an FG node wants to leave the network; 2) when a multicast receiver wants to leave the network. If a multicast source wants to leave a multicast group, it simply stops transmitting data to this group and erases the group’s entries from its cache. This leads to the expiration of all routes connecting this source to the multicast group because no refreshing is done. Similarly, entries in the Receiver_Multicast_Routing_Table will expire and consequently be deleted.

Typically, a multicast receiver wanting to leave the group sends a Leave Group message to its member neighbors and erases all entries corresponding to this group from its table. A member neighbor receiving this message erases the associated receiver information from its Neighbor_Stability_Table. Periodically, if no member is found in a multicast group, all routes to this group are erased from the cache. The node then sends a Multicast-RERR message to all member neighbors by following the procedure described above. The multicast session ID is carried in the Multicast Group ID field of the Leave Group message, and the Neighbor ID field carries the ID of the member neighbor to which the message is addressed.

If an FG node wants to leave a group, it first sends a Leave Group message to its neighbors and erases from its cache all entries associated with this group. Each node receiving this message erases this FG node from its Neighbor_Stability_Table, erases the routes containing this node from its cache and sends a Multicast-RERR message to its member neighbors, also following the previous connection break recovery procedure. The broken link field in this message stores the FG node that has left the network.

4.4. Properties

Our protocol has the following properties [MOU 02b, MOU 01b]:
– efficient use of network resources because of the on-demand approach;
– use of source routing, which allows routing without loops;
– avoidance of the disadvantages of multicast trees;
– redundant routes are available thanks to the mesh structure, thus decreasing rerouting of data packets, which can be frequent due to connection breaks;
– duplicate data packets are detected, which reduces overhead;
– the concept of FG nodes is used in mesh creation, limiting forwarding range and reducing overhead;
– selection criteria for FG nodes provide routes with good quality links in terms of link stability, in agreement with the prediction of their future states. This gives SRMP its major advantage: providing optimal routes in terms of communication, overhead, reliable data delivery and short end-to-end delay;
– routes with a high energy level are also chosen to optimize energy consumption, a predominant concern in ad hoc networks;
– an efficient refresh mechanism is applied to maintain the mesh structure during data transmission. This mechanism does not add any control overhead, since it uses data propagation to update the state of the routes in use;
– timers are used to avoid invalid routes in node tables;
– an efficient pruning mechanism prevents forwarding nodes from finding invalid routes toward any non-member node.

4.5. Simulation results and analysis

In order to validate and justify the different aspects of our proposal, we have carried out a performance evaluation of SRMP, taking several mobility and communication scenarios into consideration [MOU 04, MOU 03b]. The results were compared to two of the main multicast ad hoc routing protocols, ODMRP and ADMR, which are reactive mechanisms like SRMP. We have chosen ODMRP specifically because it uses a mesh structure to forward data packets, and ADMR because it is considered a more traditional protocol based on the use of multicast forwarding trees. We have analyzed the behavior of these three protocols with the ns2 simulator by considering various mobility scenarios.
We have considered several performance metrics: end-to-end delay, delivery ratio (or throughput), control overhead (number of control bytes and packets) and average link failure rate. We have also analyzed SRMP with two mobility models, random waypoint (RWP) and reference point group mobility model (RPGM). We have in particular analyzed energy consumption and robustness of the protocol. SRMP has shown two main and interesting characteristics: 1) it provides a better delivery ratio during very high mobility; 2) it has a low impact on control overhead optimizing network resources and bandwidth.
These two advantages were observed independently of the mobility model and the structure of multicast groups. In addition, we have clearly confirmed the expected result that RPGM is better adapted to multicast communications. It would be interesting to extend this evaluation study to include other multicast routing protocols such as MAODV and MDSR.

4.6. Conclusion

In this chapter, we have addressed a recent research problem: multicast ad hoc routing. We began by establishing the state of the art, presenting the main proposals with their advantages and limitations. A summary of current mechanisms highlighted the necessity of specifying a new class of more powerful protocols. The key characteristics needed mainly include flexibility, adaptability, energy conservation, scalability and robustness. From there, we presented the motivation behind our SRMP protocol. Our contribution constitutes an alternative to current approaches. Contrary to most existing protocols, our protocol follows a mesh-based approach to establish and maintain routes. Because of the application of the FG node concept, flooding is minimized, and robustness and enhanced connectivity are provided. In addition, a reactive approach is applied to make data transfer more reliable while optimizing the use of network resources and decreasing routing overhead with active adaptability. In order to build a powerful, stable and reliable central routing structure, we considered the integration of the “quality of connectivity” concept in selecting FG nodes. We considered criteria related to the nature of ad hoc network environments such as node stability, link availability, quality of received signal and level of energy consumption. By considering connectivity in the mesh creation phase, multiple stable routes with high link availability probability and optimized energy consumption are found. We also developed an SRMP analytical study using random graphs.
This part, because of the originality of the concepts used, revealed very interesting results in ad hoc network routing as well as in random graph theory. In the near future, we will continue to explore this modeling approach.

4.7. Bibliography

[CHI 98] CHIANG C., GERLA M., ZHANG L., “Forwarding Group Multicast Protocol (FGMP) for Multihop, Mobile Wireless Networks”, Cluster Computing, vol. 1, no. 2, p. 187-196, 1998.
[JET 01] JETCHEVA J.G., JOHNSON D.B., “Adaptive Demand-Driven Multicast Routing in Multi-hop Wireless Ad Hoc Networks”, ACM MobiHoc ’01, Long Beach, CA, USA, 2001.
[JOH 96] JOHNSON D., MALTZ D., “Dynamic source routing in ad hoc wireless networks”, in T. Imielinski and H. Korth (eds.), Mobile Computing, Kluwer, Norwell, MA, USA, 1996.
[LEE 00a] LEE S.-J., Routing and Multicast Strategies in Wireless Mobile Ad Hoc Networks, PhD thesis, University of California, Berkeley, 2000.
[LEE 00b] LEE S., SU W., GERLA M., “On-Demand Multicast Routing Protocol in Multihop Wireless Mobile Networks”, ACM/Baltzer Mobile Networks and Applications, 2000.
[LEE 01] LEE M., KIM Y.K., “PatchODMRP: An Ad-hoc Multicast Routing Protocol”, Proceedings of the 15th International Conference on Information Networking, p. 537-543, 2001.
[MOU 01a] MOUSTAFA H., Multicast Routing in Mobile Ad Hoc Networks, Master’s thesis, 2001.
[MOU 01b] MOUSTAFA H., LABIOD H., The Source Routing-based Multicast Protocol for Mobile Ad Hoc Networks (SRMP), IETF Internet draft, November, 2001.
[MOU 02a] MOUSTAFA H., LABIOD H., “SRMP: a mesh-based protocol for multicast communication in ad hoc networks”, International Conference on 3rd Generation Wireless and Beyond (3Gwireless’2002), San Francisco, CA, USA, May 28-31, 2002.
[MOU 02b] MOUSTAFA H., LABIOD H., “Source Routing-based Multicast Protocol for Mobile Ad hoc Networks”, 10th International Conference on Telecommunication Systems Modeling and Analysis (ICTSM-10), Monterey, CA, USA, October 3-6, 2002.
[MOU 03a] MOUSTAFA H., LABIOD H., “A Multicast On-demand Mesh-based Routing Protocol in Multihop Mobile Wireless Networks”, IEEE 58th Vehicular Technology Conference VTC2003-fall, Orlando, FL, USA, October 6-9, 2003.
[MOU 03b] MOUSTAFA H., LABIOD H., “A Performance Comparison of Multicast Routing Protocols In Ad Hoc Networks”, 14th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC 03), Peking, China, September 7-10, 2003.
[MOU 04] MOUSTAFA H., LABIOD H., “Multicast Routing in Mobile Ad Hoc Networks”, Kluwer Telecommunication Systems Journal, vol. 25, no. 1 and 2, January-February 2004.
[NIK 00] NIKAEIN N., LABIOD H., BONNET C., “Error Recovery Scheme for Multicast Transmission Over Wireless Networks”, IEEE ICC 2000, New-Orleans, LA, USA, 2000.
[OBR 98] OBRACZKA K., TSUDIK G., “Multicast Routing Issues in Ad Hoc Networks”, IEEE International Conference on Universal Personal Communication (ICUPC’98), 1998.
[PAD 01] PAPADIMITRIAN et al., Optical Multicast in Wavelength Switched Networks, Internet draft, IPO working group, 2001.
[ROY 99] ROYER E.M., PERKINS C.E., “Multicast Operation of the Ad-hoc On-Demand Distance Vector Routing Protocol”, Proceeding of the 5th Annual ACM/IEEE International Conference on Mobile Computing And Networking, p. 207-218, 1999.
[TOH 00] TOH C-K., GUICHAL G., BUNCHUA, “ABAM: On-Demand Associativity-Based Multicast Routing for Ad Hoc Mobile Networks”, 52nd IEEE Vehicular Technology Conference 2000, vol. 3, p. 987-993, 2000.
[TOH 99] TOH C., ROYER E., “A Review of Current Routing Protocols for Ad Hoc Mobile Wireless Networks”, IEEE Personal Communications, vol. 6, no. 2, p. 46-55, 1999.
[WIK 05] Wikipedia, an exhaustive list of protocols is available at URL http://en.wikipedia.org/wiki/Ad_hoc_protocol_list, 2005.
Chapter 5
Self-organization of Ad Hoc Networks: Concepts and Impacts
5.1. Introduction

Mobile ad hoc networks (MANET) are based on node collaboration to provide network services such as unicast and multicast routing, localization, flooding, etc. Because of this, ad hoc networks appear as self-organized systems: no central entity is required for the network to operate, and global behavior emerges from local interactions. In this chapter we discuss self-organization from the viewpoint of a self-organized structure with the purpose of supporting and improving the behavior of protocols and services for ad hoc networks.

The underlying objective of ad hoc networks is for each node to collaborate with its neighborhood to implement networking protocols such as routing or flooding. Nevertheless, this collaboration is often done independently of intrinsic node properties. In this way, a high mobility node can participate in the development of a route and, because of its mobility, can quickly cause a break in the route. Similarly, low energy level nodes can also participate in the networking protocols instead of preserving their resources as they should. In most studies, a MANET does not consider node heterogeneity and does not attempt to take advantage of node properties. In addition, MANETs are mainly considered to be flat and are neither structured nor hierarchized.
Chapter written by Fabrice THEOLEYRE and Fabrice VALOIS.
Self-organization, as discussed here, is designed to take advantage of node properties in order to bring out a structure for network organization. This structure must be autonomous and dynamic in such a way that a local change will only lightly affect the global structure. Using this self-organization structure must facilitate the deployment of networking protocols. Local decisions and interactions are the fundamental principles of this self-organization. Consequently, we expect properties of adaptation to a dynamic environment and robustness, and this localized concept should lead to better scalability than centralized or distributed solutions.

First, we define the concepts inherent to self-organization in a more precise manner. Next, we look at more traditional self-organization approaches and explain in more detail a solution based on the notion of virtual topology. From this self-organization solution, we study the effects that self-organization can have on network behavior. More precisely, we look at how basic services such as routing can be influenced by a self-organization scheme. Finally, we propose extended works to apply the notion of self-organization.

5.2. Self-organization: definition and objectives

5.2.1. Definition

A system is said to be organized if it has a structure and a set of associated functions. The goal of the structure is to organize all entities and enable their interactions. The objective of the associated functions is to maintain the structure and, according to its use, respond to determined needs. The notion of self-organization refers to the organization of a system without interaction with any external entity and with no centralized control. Because of this, self-organization must be based on local interactions, in a completely distributed manner.

5.2.2. Principles and objectives

Still, we expect more from a self-organized system than simple local interactions with no external control. We particularly wish to bring out a global behavior from different local interactions. Typically, in the case of ad hoc networks, we want to bring out a virtual structure to organize the network from the exchange of localized information. Two types of virtual structures may interest us: clusters and backbones (including mesh networks and trees) (see Figure 5.1). We can then build two network views: a local microscopic view, representing node dynamics, and a global view, with properties that we will discuss in more detail later in this chapter.
We must obviously ensure that the global view is constructed in a limited time and is coherent. This emerging structure must also adapt to the environment and react to local changes. More precisely, a local change must only result in a local modification of the structure and must not affect it in its entirety. This dynamic reaction to local changes in a sufficiently short time leads to the property of adaptability. The system then appears to be robust: the new structure (macroscopic view) is stable because the self-organization mechanism makes it possible to react to a link or node failure (due to mobility, insufficient energy, etc.) by adapting and rebuilding the structure locally. Since there is no central entity, there are no critical nodes in a self-organized system and the system can repair itself without outside help. We group the adaptability and robustness properties under the single term of self-stabilization [SCH 93], where failures are associated with links and nodes.
Figure 5.1. Different types of self-organizing structures
If we consider self-organization from the point of view of a stable topology to federate the ad hoc network, this topology emerging from local interactions must be shared by all nodes. In other words, we must prevent any source-oriented construction, such as the use of OLSR’s MPRs (multipoint relays) for example, and we should propose a self-organization structure common to all network nodes. Since this logical structure is based on physical nodes, the more important nodes will be more heavily involved in implementing network services. In this type of self-organization, it is necessary to make sure these nodes take turns in this role. Finally, the last important property that a self-organized structure must possess is scalability: even with a large number of nodes, the system must continue to function efficiently. This property comes from the lack of centralized control and the use of local interactions only. An excessively large number of network nodes must not cause congestion.
To summarize, the properties that a self-organization scheme must have are:
– only local interactions;
– emergence of a global structure from local information;
– responsiveness to local changes and robustness;
– a self-organizing structure that is not source-oriented;
– scalability.

The notion of self-organization takes on its full meaning with ad hoc and sensor networks. It provides an environment enabling node configuration and the implementation of communications protocols. In a self-organized network, human intervention is reduced to a minimum, facilitating its deployment. Self-organization must therefore bring out global behavior. In essence, an ad hoc network can be considered as a self-organized system because all protocols used are based on local interactions between the nodes and are distributed over network members. From our point of view, the idea is to provide a dynamic and unique virtual structure to facilitate the deployment of any network service, instead of being limited only to address configuration or a routing protocol.

5.2.3. Local or distributed decisions?

Our notion of local refers to the neighborhood of a node and to the partial, local view that it has of the network. In a self-organized structure, it is preferable that decisions be made with the help of a strictly local process based on local information and interactions instead of in a distributed manner. Let us review the difference:
– a process is localized if each node makes a decision based on the (local) information that it has, or on the (partial) view that it has of the network;
– from a distributed point of view, several nodes can interact and converge towards a decision. The decision taken no longer comes only from local information but also from distributed information.

Choosing a local strategy means choosing to minimize the information known in the network, to avoid the cost of synchronizing decisions and to react more quickly to local changes.
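To make the idea of a localized process concrete, the sketch below lets each node decide on its own whether it is a local leader, using only the weights of its one-hop neighbors. The rule, the function name and the weights (which could stand for residual energy or node degree) are hypothetical illustrations; the chapter does not prescribe this particular election rule.

```python
def is_local_leader(node, neighbors_of, weight):
    """Localized rule: a node decides alone, from its one-hop view only.

    Ties on weight are broken by comparing node identifiers (an assumption).
    """
    own = (weight[node], node)
    return all(own > (weight[n], n) for n in neighbors_of[node])


# Hypothetical line topology a - b - c - d with arbitrary weights.
neighbors_of = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
weight = {"a": 3, "b": 5, "c": 2, "d": 4}

leaders = [n for n in neighbors_of if is_local_leader(n, neighbors_of, weight)]
print(leaders)  # ['b', 'd']
```

No node needs to exchange messages beyond its neighborhood or wait for a network-wide agreement, which is what distinguishes this from a distributed decision in the sense used above.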
Now that we have defined self-organization and its associated properties, we will look at rules for developing a self-organized system.
5.3. Some key points for self-organization
[PRE 05] studies the key principles of self-organization. We retain the main properties needed to build a self-organizing structure. Rules and protocols must be defined to take advantage of the interactions between network nodes. Four paradigms will be studied:
– emergence of global behavior from local rules;
– local interactions and node coordination;
– minimizing network state information;
– dynamically adapting to the environment.
5.3.1. Emergence of global behavior from local rules
We focus on the development of a network protocol providing global properties such as connectivity, robustness, etc. The centralized approach dedicates one node to this protocol or service. With self-organization, this function must be shared by all network nodes. No single entity provides the desired function; instead, each node contributes to the creation of a global behavior. It is therefore necessary to develop local rules leading to a global behavior. Routing protocols in ad hoc networks illustrate this paradigm: the global behavior is forwarding a data packet from a source toward a destination, whereas the local rule is to transfer the packet to a neighbor.
What local behaviors can lead to a global behavior? Locally, each node collects information, which is aggregated and broadcast to its neighborhood. A global view is not required: local knowledge is enough to create a global behavior. This is what happens with distance vector routing protocols such as DSDV [PER 94]. The construction of a local view and the implementation of local interactions are the first two main points of self-organization. We should note that when a global view emerges, it is not necessarily exact but tends to be close to the actual state.
From our point of view, not having a perfect solution in this very dynamic context is not a major drawback: since the network topology changes frequently, the ideal solution also keeps changing, and tracking it would come at a significant cost. Basing global protocols and services on local interactions and information makes these protocols much more robust and stable: there are no critical nodes, the network does not change state abruptly, and local state changes only have a local reach. Nevertheless, since each node only has a local view of the network, inconsistencies can arise. This is the subject of the second paradigm.
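The emergence of a global behavior from a local rule, as described in section 5.3.1 for distance vector routing, can be illustrated with a minimal sketch. It is a simplified, synchronous hop-count exchange and not DSDV itself (there are no sequence numbers and no link breaks); the names `exchange_round` and `converge` and the example graph are assumptions made for illustration.

```python
from typing import Dict, Hashable, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]
Table = Dict[Hashable, int]


def exchange_round(
    graph: Graph, tables: Dict[Hashable, Table]
) -> Tuple[Dict[Hashable, Table], bool]:
    """One synchronous round: every node merges the tables of its 1-hop neighbors."""
    updated = {}
    changed = False
    for node, neighbors in graph.items():
        table = dict(tables[node])
        for neighbor in neighbors:
            for dest, hops in tables[neighbor].items():
                # Local rule: a destination known to a neighbor is one hop further away.
                if hops + 1 < table.get(dest, float("inf")):
                    table[dest] = hops + 1
                    changed = True
        updated[node] = table
    return updated, changed


def converge(graph: Graph) -> Dict[Hashable, Table]:
    """Repeat local exchanges until no table changes any more."""
    tables = {node: {node: 0} for node in graph}  # each node initially knows only itself
    changed = True
    while changed:
        tables, changed = exchange_round(graph, tables)
    return tables


# Hypothetical 4-node chain a - b - c - d.
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
tables = converge(chain)
```

No node ever sees the whole graph, yet after a few rounds each table holds the hop count to every destination, which is the global view emerging from purely local exchanges.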
5.3.2. Local interactions and node coordination
Local decisions can lead to inconsistent situations. In the case of address auto-configuration, this inconsistency leads to conflicts: two nodes can end up with the same address following local decisions. To avoid this phenomenon, the first solution is to use explicit coordination. However, this type of solution incurs a large network overhead because it may require, for example, an address conflict detection mechanism. This type of coordination is well suited to centralized systems. In ad hoc networks, explicit coordination is impractical because radio resources are limited and the network is dynamic, so information would have to be updated constantly.
Tolerating conflicts may therefore be necessary. Conflicts are acceptable if they are local, temporary, easily detectable and solvable. The notion of implicit coordination must also be used. The idea is to use all the information circulating in the local neighborhood to detect and correct inconsistencies. Conflict detection also helps to refine the local view of the network and to resolve inconsistent situations.
This second paradigm is obviously connected to the first one. Local rules enable the emergence of a global behavior. Based on these local interactions, a global service with some inconsistencies is implemented. Through these interactions, neighborhood information is exchanged in order to update topology or address information, correcting partial but erroneous views of the network and its associated service.
5.3.3. Minimizing network state information
In conventional networks, each node maintains a list of information on the state of the network, for example gateways, servers (DNS, security), etc. This information, known by all nodes, must remain consistent, which requires a synchronization mechanism.
We could say that the more network state information there is to maintain, the harder it becomes to apply the concept of self-organization (see section 5.3.2). In the case of ad hoc networks, adaptive mechanisms are needed to discover and maintain information on the state of the network. As with routing protocols, [PRE 05] proposes two methodologies: a reactive and a proactive approach. In the first case, a node issues an explicit request relayed in multicast or broadcast, whereas in the second case, periodic announcements are made to disseminate this information. Once more, the self-organization mechanism makes the network more reactive and more robust to configuration changes and to critical situations. The dependence on central information is also avoided.
5.3.4. Dynamic environment adaptation
The last rule that self-organized networks must follow is the capacity to react to network topology changes and to information changes resulting from mobility and/or node failure. Since no centralized entity can broadcast a change of state, each node must monitor its local neighborhood and react accordingly. Monitoring the local neighborhood is linked to paradigm 1, and reacting to new information is linked to paradigm 2. [PRE 05] identifies 3 adaptation levels:
– 1st level: adaptation to topology changes resulting from mobility or from the appearance/disappearance of a node;
– 2nd level: adaptation of conventional parameters (timers, knowledge of the k-neighborhood, etc.) based on state changes, in order to improve network behavior;
– 3rd level: a self-organized system must be able to realize that the environment in which it operates will not lead to a stable system and must then replace one mechanism with another.
These adaptations can be combined with adaptive control. We must weigh these adaptation mechanisms carefully: a local state change must only lead to a local change and must not affect the complete network topology.
We will describe below the most relevant structures from a self-organization viewpoint, considering self-organization as the emergence of a topology for structuring the network. Next, we will focus on a solution that implements self-organization principles and study what self-organization brings to network behavior.
5.4. Self-organization: a state of the art
5.4.1. Classification
Two types of virtual structures can be considered: (virtual) backbone structures (tree, trellis) and clusters (see Figure 5.1). Structuring an ad hoc network in the form of a tree leads to a strong hierarchy of the topology and of the nodes, which can be used by network services. On the other hand, the associated robustness is weak. A trellis topology (also known as a mesh network) has the advantage of link redundancy, which ensures its robustness.
On the other hand, with the goal of using organization to improve network behavior, the use of a trellis-based structure results in the existence of multiple loops that must be managed in the protocol. Finally, clusters enable network partitioning and can
possibly structure the network according to location, a density metric, a service, etc. We will now present tree and trellis topologies leading to the development of a virtual backbone and we will explain cluster-based solutions.
5.4.2. Virtual backbone
The idea of organizing an ad hoc network around a backbone comes from an analogy with wired networks. This type of structure federates nodes around a backbone, provides support for deploying network services and can serve as a natural link for interconnection with a fixed network. We present here the main techniques for building a virtual backbone in ad hoc networks by successively studying the following structures: CDS (connected dominating set), RNG (relative neighborhood graph), LMST (localized minimum spanning tree), etc. Note that these topologies are also used in wireless sensor networks in the context of topology control. Before going any further, we will introduce a few notations from graph theory.
Figure 5.2. Modeling of ad hoc network as graphs
5.4.2.1. Notations
Graph theory is a tool that is well adapted to modeling ad hoc networks. A network is modeled by a graph, and an edge exists between two vertices if the two corresponding terminals are neighbors in the physical topology (see Figure 5.2). We will use the following notations:
– G(V,E): graph associated with the ad hoc network, with vertex set V and edge set E;
– $N_k(u)$: k-neighborhood of vertex u, i.e. all nodes at most k hops from u. By convention, $N_1(u) = N(u)$;
– $route_{u_1 \to u_k}$: the sequence of vertices $\{u_1, ..., u_k\}$ such that the edges $(u_i, u_{i+1})$, $i \in [1..k-1]$, all exist. k is called the route length.
5.4.2.2. Connected dominating set
In a connected dominating set (CDS), each node is either a dominator or a dominatee: a dominatee is a neighbor of at least one dominator, and the dominators are connected to each other (see Figure 5.3). Formally, a CDS is a set V' of vertices of G(V,E) such that:
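\[ \forall u \in V,\ \exists v \in V' \ / \ v \in N(u) \qquad [5.1] \]
\[ \forall (u,v) \in V'^2,\ \exists c = route_{u \to v} \ / \ \forall w \in c,\ w \in V' \]
The notion of CDS can be extended to the notion of k-CDS, by modifying proposition [5.1]:
\[ \forall u \in V,\ \exists v \in V' \ / \ v \in N_k(u) \]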
Figure 5.3. Topology – connected dominating set
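The two conditions defining a k-CDS (domination within k hops and connectivity of the dominators) can be checked with a short sketch. The function names `k_neighborhood` and `is_k_cds` are hypothetical, the graph is assumed to be given as an adjacency dictionary, and two assumptions are made: a backbone member is considered to dominate itself, and the k-neighborhood is taken as "at most k hops".

```python
from collections import deque


def k_neighborhood(graph, u, k):
    """Nodes at most k hops from u (u itself excluded), found by breadth-first search."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if dist[x] == k:
            continue  # do not explore beyond k hops
        for y in graph[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    del dist[u]
    return set(dist)


def is_k_cds(graph, backbone, k=1):
    """True if `backbone` is dominating within k hops and connected on its own."""
    backbone = set(backbone)
    if not backbone:
        return False
    # Domination: every node outside the backbone has a dominator within k hops.
    for u in graph:
        if u not in backbone and not (k_neighborhood(graph, u, k) & backbone):
            return False
    # Connectivity: a search restricted to backbone nodes must reach all of them.
    start = next(iter(backbone))
    seen = {start}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for y in graph[x]:
            if y in backbone and y not in seen:
                seen.add(y)
                queue.append(y)
    return seen == backbone


# Hypothetical 5-node path a - b - c - d - e.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
```

On this path, {b, c, d} is a CDS, {b, d} dominates every node but is not connected, and {c} alone is a 2-CDS but not a 1-CDS.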
We call MCDS a CDS of minimum cardinality. A small size is an important asset for ad hoc networks, without being critical. However, since constructing an MCDS is an NP-hard problem, even in a centralized way, numerous algorithms use heuristics to approximate an MCDS, in particular in a distributed context. Many studies propose building an MCDS in two steps. The first step elects dominator nodes in such a way that every dominatee is the neighbor of a dominator. The set thus formed is a dominating set. Subsequently, the dominating set is connected while minimizing the cardinality of the final set. A node can be in any of the following states: dominator, dominatee, active (in the election process) or isolated (in initialization). In [BUT 03, CAR 02, CHE 02, LIA 00], a leader is declared dominator. Its neighbors become dominatees, and the neighbors of dominatees become active. The active node with the highest weight at the end of the round becomes dominator. In this topology, two dominators are separated by 2 hops. Different weights are used as an
election metric: the highest degree [BUT 03, LIA 00], the lowest address [CAR 02] or a generic weight [CHE 02]. [ADJ 05] proposes the construction of a CDS based on OLSR's multipoint relays. [DAI 03, WU 99] propose a CDS construction algorithm that uses only local information (see Figure 5.4). A node is a dominator if it has at least two neighbors that are not directly connected to each other; otherwise the node is a dominatee (algorithm known as rule 1). As described in [WU 99], the shortest route from a node u to a node v then goes exclusively through dominator vertices, the endpoints of the route being excluded.
Figure 5.4. Examples of CDS built with Wu and Li algorithm
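The local marking rule described above (a node is a dominator if two of its neighbors are not directly connected) can be sketched as follows. The function name `mark_dominators` and the adjacency-dictionary representation are assumptions made for illustration; only the marking step is shown, without any later pruning of redundant dominators.

```python
from itertools import combinations


def mark_dominators(graph):
    """Return the nodes that have at least two neighbors not linked to each other."""
    dominators = set()
    for u, neighbors in graph.items():
        for v, w in combinations(neighbors, 2):
            if w not in graph[v]:
                # v and w can only reach each other through u (or another relay).
                dominators.add(u)
                break
    return dominators


# Hypothetical 4-node path a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
marked = mark_dominators(path)
```

Each node only needs the neighbor lists of its own neighbors, i.e. 2-hop information, which is what makes the rule strictly local. Note that in a complete graph no node is marked, since every pair of neighbors is directly connected.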
5.4.2.3. Maximal independent set
A set IS is said to be independent if it is made up of nodes that are not neighbors of each other in G(V,E), i.e.:
\[ \forall u \in IS,\ \neg\exists v \in IS \ / \ u \in N(v) \]
A maximal independent set (MIS) of a graph G(V,E) is an independent set to which no further vertex can be added without breaking independence. This type of topology is easier to maintain, since each node of the independent set only has to verify that no other node of the set exists in its neighborhood. In ad hoc networks, MIS are used to build CDS. More precisely, MIS members are chosen in a first step [WAN 04] and are then connected to form a CDS.
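As an illustration of the first step, the following sketch selects an independent set greedily by decreasing weight. It is a centralized simplification and not the algorithm of [WAN 04]; the function names and the weights are hypothetical. The result is an independent set to which no further node can be added, but it is not guaranteed to be the largest possible one.

```python
def greedy_independent_set(graph, weight):
    """Pick nodes by decreasing weight, skipping any node adjacent to a chosen one."""
    chosen = set()
    covered = set()  # chosen nodes and their neighbors
    for u in sorted(graph, key=lambda n: weight[n], reverse=True):
        if u not in covered:
            chosen.add(u)
            covered.add(u)
            covered.update(graph[u])
    return chosen


def is_independent(graph, nodes):
    """True if no two nodes of the set are neighbors."""
    return all(graph[u].isdisjoint(nodes) for u in nodes)


# Hypothetical 5-node path a - b - c - d - e with arbitrary weights.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
weight = {"a": 1, "b": 5, "c": 2, "d": 4, "e": 3}
mis = greedy_independent_set(path, weight)
```

Because every node ends up either chosen or adjacent to a chosen node, the resulting set is also a dominating set, which is why it is a convenient starting point for building a CDS.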
5.4.2.4. Localized minimum spanning tree
Minimum spanning trees are widely used in fixed networks, particularly for routing problems. A spanning tree is a topology containing all the nodes of a graph and only a subset of its edges; an MST is a spanning tree of minimum weight. Although such a topology is interesting for low energy broadcast protocols [CAR 05], particularly in wireless sensor networks, involving all network nodes in a self-organization environment can seem resource-intensive.
Figure 5.5. Topology – localized minimum spanning tree
Calculating an MST requires a central entity. In the context of ad hoc networks, and in agreement with what we have discussed previously, the notion of MST must be considered from a localized standpoint (see Figure 5.5). [LI 03] proposes that each node locally calculate an MST from its neighborhood information. An edge between two nodes u and v belongs to the LMST if and only if u is a neighbor of v in the MST(N(v)) graph and v is a neighbor of u in the MST(N(u)) graph.
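The LMST rule can be sketched as follows: each node runs Prim's algorithm on the subgraph formed by itself and its 1-neighbors, and an edge is kept only if both endpoints select it. The sketch assumes that node positions are known and that Euclidean distance is used as edge weight, with node identifiers breaking ties; the function names and coordinates are hypothetical.

```python
import math


def local_mst_neighbors(graph, pos, u):
    """Neighbors of u in the MST of the subgraph induced by u and its 1-neighbors."""
    nodes = {u} | graph[u]
    in_tree = {u}
    tree_edges = []
    while in_tree != nodes:
        best = None
        for x in in_tree:
            for y in graph[x]:
                if y in nodes and y not in in_tree:
                    # Node identifiers break ties so that all nodes agree on the order.
                    cand = (math.dist(pos[x], pos[y]), min(x, y), max(x, y), x, y)
                    if best is None or cand < best:
                        best = cand
        if best is None:
            break  # defensive; every local node is adjacent to u
        x, y = best[3], best[4]
        in_tree.add(y)
        tree_edges.append((x, y))
    return {y if x == u else x for x, y in tree_edges if u in (x, y)}


def lmst_edges(graph, pos):
    """Keep edge (u, v) only if u and v select each other in their local MSTs."""
    local = {u: local_mst_neighbors(graph, pos, u) for u in graph}
    return {frozenset((u, v)) for u in graph for v in local[u] if u in local[v]}


# Hypothetical triangle in which every node hears the two others.
pos = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.4, 0.3)}
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
edges = lmst_edges(graph, pos)
```

In this example the long edge between a and b is dropped, because both nodes reach each other more cheaply through c.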
5.4.2.5. Relative neighborhood graph
RNG graphs [TOU 80] (see Figure 5.6) are also used, as are LMST, to design efficient and low energy broadcast protocols and to enable topology control [CAR 05]. An edge (u,v) belongs to the RNG if there is no node w that is a radio neighbor of both u and v and is closer to each of them than they are to each other. Formally, we define the RNG of G as RNG(G) = (V, E_RNG), with:
\[ E_{RNG} = \{ (u,v) \in E \ / \ \neg\exists w \in V,\ (u,w) \in E \wedge (w,v) \in E \wedge d(u,w) < d(u,v) \wedge d(v,w) < d(u,v) \} \]
However, to calculate an RNG graph, localization information such as GPS is required.
Figure 5.6. Topology – relative neighborhood graph
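A minimal sketch of the RNG construction is given below. It assumes that node positions are known and that two nodes are radio neighbors when their Euclidean distance does not exceed a common range (a unit disk model); under that assumption, any witness w closer to both u and v than d(u,v) is automatically a neighbor of both. The function name `rng_edges` and the coordinates are hypothetical.

```python
import math


def rng_edges(pos, radio_range):
    """Edges (u, v) within range for which no node w is closer to both u and v."""
    nodes = list(pos)

    def d(a, b):
        return math.dist(pos[a], pos[b])

    edges = set()
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            duv = d(u, v)
            if duv > radio_range:
                continue  # u and v are not radio neighbors
            blocked = any(
                w not in (u, v) and d(u, w) < duv and d(v, w) < duv
                for w in nodes
            )
            if not blocked:
                edges.add(frozenset((u, v)))
    return edges


# Hypothetical positions; all three nodes are within range of each other.
pos = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.4, 0.3)}
edges = rng_edges(pos, radio_range=1.5)
```

Here the edge between a and b is removed because c is closer to both of them than they are to each other.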
5.4.3. Clustering techniques
Partitioning a network into homogenous zones is called clustering. A cluster can be linked to a service zone, a geographical zone or a given function. There are numerous proposals for building non-overlapping clusters in ad hoc networks. The traditional algorithm builds clusters in such a way that the distance (in number of hops) between cluster members and their clusterhead is at most k [LIN 97]. The clusterhead is the node with the highest weight in the neighborhood. Formally, we define:
\[ \forall u \in V,\ \exists c \in C \ / \ c \in N_k(u), \qquad C = \bigcup \text{clusterheads} \]
[ALZ 02a, LOU 03, THE 04c] propose an ad hoc network architecture that mixes clusters and a virtual backbone to accomplish specific broadcast and routing functions.
After reviewing the different topologies for organizing an ad hoc network from the angle of the emergence of a self-organizing structure, we now study the implementation of a virtual topology based on the rules expressed previously. We will then study its properties and how such a self-organization can improve network behavior.
5.5. Case study and proposition of a solution
5.5.1. Motivations
The structures presented above make it possible to organize an ad hoc network at two levels: at node level, where local interactions occur, and at the level of the organizing structure. Here, we propose to structure the network around a virtual topology based on clusters and a virtual backbone [THE 04c] (see Figure 5.6). This three-level network hierarchy must be able to implement and support functions such as routing, addressing, power management, etc. more efficiently. With this goal in mind, we first propose the construction of a backbone based on a CDS. It is made up of stronger nodes and is responsible for collecting control traffic in order to optimize its dissemination and avoid the broadcast storm problem [NI 99]. Backbone nodes are chosen according to criteria reflecting their ability to form a backbone that is stable over time. The backbone cardinality should be limited. In parallel, the ad hoc network will be partitioned into logical entities, introducing a new hierarchy level in the network with the help of clusters. A clusterhead controls a service
zone, organizes routing, assigns addresses, etc. The algorithms for backbone construction and maintenance and for service zones are integrated, combining functions while limiting the control traffic required for construction and maintenance. We propose to optimize the stability of these structures instead of focusing only on minimizing their cardinality.
Figure 5.7. Virtual topology construction procedure
5.5.2. Construction of virtual topology
We have chosen to build the backbone before service zones (see Figure 5.6) for the following reasons:
– optimizing the number of nodes participating in the election of zone leaders (clusterheads);
– forcing a zone leader to be a backbone member;
– taking advantage of the backbone to carry zone construction control traffic;
– setting a limited distance between a node and its zone leader using the backbone.
5.5.2.1. Neighborhood discovery
Construction of the backbone is based on knowledge of the kcds-neighborhood, kcds being a self-organization parameter. In addition, to detect unidirectional links, each node must include its 1-neighbors in its topology packets. To this end, a node periodically broadcasts a hello packet locally, over kcds-1 hops. Each node locally rebuilds its kcds-neighborhood, separating bidirectional from unidirectional links.
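The separation of bidirectional from unidirectional links can be sketched as follows: a link towards a sender is bidirectional if the receiving node finds its own identifier in the list of 1-neighbors carried by the sender's hello. The function name `classify_links` and the data layout (a mapping from sender to the 1-neighbor set of its latest hello) are assumptions made for illustration, not the packet format of the proposal.

```python
def classify_links(my_id, hellos):
    """Split the senders heard by `my_id` into bidirectional and unidirectional links.

    `hellos` maps each sender heard directly to the set of 1-neighbors
    listed in its most recent hello packet.
    """
    bidirectional = {sender for sender, heard in hellos.items() if my_id in heard}
    unidirectional = set(hellos) - bidirectional
    return bidirectional, unidirectional


# Node "a" hears "b" and "c"; only "b" reports hearing "a" in return.
bidirectional, unidirectional = classify_links(
    "a", {"b": {"a", "c"}, "c": {"b"}}
)
```

In this example the link with b can be used in both directions, whereas c is heard by a without hearing it in return.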
5.5.2.2. Backbone
The backbone results from the construction of a kcds-CDS: the distance between a node and the backbone is a parameter of this proposition. In a static network, kcds can be raised in order to limit backbone cardinality. In a highly mobile network, kcds needs to be low to limit disconnections. To initiate construction, we use a leader: it may be a natural one, as is the case with an access point serving as an Internet gateway, or it may result from an election. A node can take one of the following states:
– isolated: in the initialization state, the node waits for a trigger signal to determine its state;
– active: in the election process to become dominator;
– dominator: backbone member;
– dominatee: backbone client, with one dominator within kcds hops.
First, a dominating set is built. The leader becomes the first dominator and starts the construction process, which triggers a series of distributed decisions based on the following rules:
– an isolated or active node receiving a message from a dominator within kcds hops becomes dominatee and sets the source as father;
– an isolated node receiving a message from a dominatee within kcds hops becomes active and sets a timer for the election;
– an active node whose timer has elapsed and which has the highest weight among all its active kcds-neighbors becomes dominator.
Schematically, the construction algorithm proceeds in waves and is not blocked by local decisions that are still being resolved. During each wave, one or more active nodes are elected as dominators, the neighbors of these dominators become their dominatees, then the neighbors of the dominatees become active.
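The three rules above can be simulated with a simplified sketch. It is centralized and synchronous: election timers are replaced by waves, weights are assumed to be unique, the graph is assumed to be connected, and the hop limit is read as "at most k hops". The function names, the weights and the example graph are hypothetical.

```python
from collections import deque


def within_k(graph, u, k):
    """Nodes at most k hops from u, u excluded (breadth-first search)."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if dist[x] < k:
            for y in graph[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
    return set(dist) - {u}


def elect_dominating_set(graph, weight, leader, k=1):
    """Wave-by-wave election of dominators, starting from the leader."""
    state = {u: "isolated" for u in graph}
    father = {}
    state[leader] = "dominator"
    new_dominators = [leader]
    while new_dominators:
        # Rule 1: isolated or active nodes near a new dominator become its dominatees.
        new_dominatees = []
        for dom in new_dominators:
            for u in within_k(graph, dom, k):
                if state[u] in ("isolated", "active"):
                    state[u] = "dominatee"
                    father[u] = dom
                    new_dominatees.append(u)
        # Rule 2: isolated nodes near a new dominatee become active.
        for u in new_dominatees:
            for v in within_k(graph, u, k):
                if state[v] == "isolated":
                    state[v] = "active"
        # Rule 3: an active node heavier than all its active k-neighbors is elected.
        active = {u for u in graph if state[u] == "active"}
        new_dominators = sorted(
            (u for u in active
             if all(weight[u] > weight[v] for v in within_k(graph, u, k) if v in active)),
            key=lambda n: weight[n],
            reverse=True,
        )
        for u in new_dominators:
            state[u] = "dominator"
    return state, father


# Hypothetical 5-node path a - b - c - d - e, with a as leader.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
weight = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
state, father = elect_dominating_set(path, weight, leader="a", k=1)
```

With k = 1 on this path, the dominators a, c and e end up two hops apart, each separated by one dominatee. The sketch only builds the dominating set; connecting the dominators is a separate step.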
Figure 5.8. Example of a 2-CDS
The dominating set is then connected using an algorithm inspired by [ALZ 02b]. Initially, the leader is the only node considered as connected. Each connected dominator relays a cds-invite over 2*kcds+1 hops. A dominator that is not yet connected and receives a cds-invite becomes connected and takes the dominatee that forwarded the packet as father. All dominatees relaying the reply of this dominator also become dominators. In this way, the backbone is built as a tree whose root is the leader, whose branches are dominators and whose leaves are dominatees (see Figure 5.8). Time complexity is in O(n), and message complexity is in O(n) (each node relays or sends one state message and at most x cds-invite).
5.5.2.3. Service zones
Service zones are not built on the complete topology but on the sub-graph corresponding to the virtual backbone previously built: only dominators participate in the construction of zones; a dominatee automatically joins the same cluster as its father.
The algorithm used here follows traditional cluster construction: the elected clusterhead is the virtual topology member with the highest weight in its kcluster-neighborhood in the virtual backbone. The construction of service zones is initiated once the backbone is built in the leader's neighborhood, and the process then proceeds as soon as the backbone is locally available.
5.5.3. Maintenance of virtual topology
Due to node mobility, the previous topology will inevitably be broken. It is thus vital to provide a set of event-driven functions for maintaining virtual topology connectivity. The construction process is performed at network initialization, but maintenance must be constantly active while keeping control traffic low. In order to maintain its kcds-neighborhood, each node periodically sends its weight, state, father, distance to father, zone leader, neighbors and their weights in a hello. In this way, each dominator can maintain a list of its dominatees, i.e. the dominatees of which it is the father, and of its sons, i.e. the dominators of which it is the father in the virtual structure.
5.5.3.1. Backbone
The backbone must remain connected and dominatees must be in the neighborhood of the backbone. We propose a maintenance procedure which is described in the following sections.
5.5.3.1.1. Dominance property
A dominatee only verifies its father's validity. A father P is valid as long as it is a dominator, is within kcds hops, and there is a neighbor that is announced within kcds-1 hops of P and has P as its father. If its father is invalid, the dominatee looks for a new father in its neighborhood table. Persistence of the topology is thus maximized: a dominatee retains the same father as long as it can. The dominatee informs its new father by sending a free (gratuitous) hello. If a dominatee finds no valid father, it becomes active.
At least one active node will be elected as dominator and will execute the maintenance procedure reserved for dominators.
5.5.3.1.2. Connectivity property
To guarantee backbone connectivity, the leader periodically sends ap-hellos with an increasing sequence number. These ap-hellos are relayed in multicast by dominators only, using the backbone, to limit the number of transmissions.
When a dominator receives an ap-hello from its father, it relays it. Otherwise, it adds the source as secondary father if the sequence number is higher than that of the last ap-hello sent by its father. In this way, a dominator cannot choose one of its descendants in the backbone as secondary father. A dominator D is considered disconnected in one of the following cases:
– the father of D is no longer a dominator, or no longer a neighbor;
– D has not received any of the last maxap-hello ap-hellos from its father.
When a dominator is disconnected, it chooses the secondary father with the highest weight as its new main father. It announces its decision by sending a free hello to its neighborhood. If the list of secondary fathers is empty, the following mechanism is applied:
– D generates a cds-reconnect with the sequence number of the last ap-hello heard. It sends it as a broadcast packet with a TTL set to 2*kcds+1;
– the dominatees of D relay the broadcast packet;
– the other dominatees relay it in unicast towards their dominator in order to optimize control traffic;
– if a dominator receives the request and has received an ap-hello from its own father with a higher sequence number than the one requested, it responds with a cds-invite, relayed in unicast along the reverse route.
Finally, each dominator that hears a cds-invite can use the source as secondary father. If a dominator chooses to reconnect to a secondary father found through an explicit discovery, it sends a cds-accept, which acts as it does during construction. The proactive maintenance of secondary fathers creates backup connections in the backbone, so that the backbone can be reconnected without latency and without control traffic.
5.5.3.1.3. Branch break
If the radio medium is busy, numerous packets can be lost. Reconnection requests will then accumulate, making network overload worse.
For this reason, a dominator for which maxreconnect reconnection attempts have failed orders the break of its branch by sending a multicast cds-break to its sons and dominatees. A node that receives a cds-break from its father relays the message in multicast to the other nodes involved, goes to the isolated state, and waits for an external request initiating the reconstruction process.
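The secondary-father bookkeeping of the connectivity property (section 5.5.3.1.2) can be sketched as a small state holder for one dominator. The class and method names, the default threshold value and the way missed ap-hellos are counted are assumptions made for illustration; the explicit cds-reconnect discovery and the branch break are not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DominatorState:
    father: str
    max_ap_hello: int = 3        # hypothetical value for the maxap-hello threshold
    last_father_seq: int = -1    # sequence number of the last ap-hello from the father
    missed: int = 0              # consecutive periods without an ap-hello from the father
    secondary_fathers: Dict[str, int] = field(default_factory=dict)  # candidate -> weight

    def on_ap_hello(self, source: str, seq: int, source_weight: int) -> bool:
        """Process an ap-hello; return True if it must be relayed."""
        if source == self.father:
            self.last_father_seq = seq
            self.missed = 0
            return True
        # A newer sequence number than the father's cannot come from a descendant.
        if seq > self.last_father_seq:
            self.secondary_fathers[source] = source_weight
        return False

    def on_period_without_father_hello(self) -> None:
        self.missed += 1

    def is_disconnected(self, father_is_dominator: bool = True,
                        father_is_neighbor: bool = True) -> bool:
        return (not father_is_dominator or not father_is_neighbor
                or self.missed >= self.max_ap_hello)

    def reconnect(self) -> Optional[str]:
        """Adopt the heaviest secondary father; None means explicit discovery is needed."""
        if not self.secondary_fathers:
            return None
        best = max(self.secondary_fathers, key=self.secondary_fathers.get)
        self.father = best
        self.secondary_fathers.clear()
        self.missed = 0
        return best
```

A possible use: after hearing ap-hello 5 from its father, a dominator ignores an ap-hello 5 from another dominator (it may be a descendant) but records the senders of ap-hello 6 as secondary fathers, and falls back on the heaviest of them once the father has been silent for maxap-hello periods.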
A dominator noticing that it has a kcds-neighbor in the isolated state in its neighborhood table sends a cds-invite with a TTL of kcds+1. Similarly, a dominatee that is a neighbor of its dominator and has an isolated neighbor at exactly kcds hops asks its father to send a cds-invite. A dominatee receiving a cds-invite becomes active and stores the source as secondary father so that it can connect later if it is elected as dominator.
5.5.3.1.4. Cardinality
In order to keep backbone cardinality relatively low, a mechanism to remove useless dominators is proposed. A dominator is said to be useless if it has no dominator son and only has dominatees at most kcds-1 hops away. Such a dominator broadcasts a cds-useless to its dominatees, forcing them to choose its own father as their new dominator. It then takes the dominatee state, keeping the same father.
In addition, to minimize the distance between a dominator and the leader, a dominator always reconnects to the dominator broadcasting the most recent ap-hello, based on the sequence number. The height of the CDS is decreased, which yields a CDS with more branches, so that more dominators can potentially be declared useless. Moreover, since the CDS diameter is also decreased, the collision rate of a broadcast using the backbone is decreased.
5.5.3.2. Service zones
Only dominators participate in cluster maintenance. When a dominatee receives a hello from the node chosen as relay to its dominator, it can update the identifier of its clusterhead. A dominator, on the other hand, must send additional information in its hellos, such as the distance to its leader and the next hop via the backbone to reach it. A dominator D considers its leader as lost if the relay to its leader is no longer a neighbor, announces a different leader, or announces a distance to the leader higher than (kcluster-kcds-1) hops.
If its relay announces a new leader C at most (kcluster-kcds-1) hops away, then D takes this new leader and updates its distance to C. D sends a free hello to force the dominators that had previously chosen it as relay towards their leader to update their information and change their decision. A dominator D whose leader C1 is no longer valid will attempt to reconnect. D looks among its virtual neighbors for a candidate satisfying one of these conditions:
– a dominator D' is a neighbor, has a leader C2 different from C1 and announces a distance to C2 via the backbone of at most (kcluster-kcds-1) hops;
– a dominator D' is a neighbor, has C1 as leader, announces a distance to C1 via the backbone of at most (kcluster-kcds-1) hops, and D is not a relay to C1 for D'.
D will choose this new leader and update its distance through the backbone in its next hellos. If a dominator cannot join any existing cluster, it becomes leader. It immediately announces its decision with a free hello. This type of maintenance is possible because it relies on the backbone's tree structure.
A clusterhead is useless if no neighboring dominator has chosen it as leader. Since the clusters are connected, no other dominator in the network will have chosen it as leader. Such a useless clusterhead looks for an existing cluster to join. If such a cluster exists, it connects to it by becoming a client node.
5.5.4. Virtual topology properties
We discuss here the properties of the self-organization structure, evaluated with a network simulation tool¹. It appears that the virtual topology is connected over 95% of the time, independently of network cardinality, density or node mobility. This high stability also relies on a relevant choice of dominator nodes. In fact, when a node is involved in the virtual topology, it usually remains in it for 2 minutes. Topology stability and dominator stability represent temporal and spatial stability, respectively. On topologies of up to 80 nodes and for an average degree of 10, approximately 30% of nodes are dominators and are thus more involved in self-organization; clearly, the proportion of nodes involved is inversely proportional to the degree. The choice of nodes is based on a metric, whether it is degree, node identity, energy reserve, etc. Note that the results are not very sensitive to the metric used (approximately 5% performance variation).
5.6. Contribution of self-organization
The question raised is how to take advantage of the self-organization offered.
The properties of robustness, stability and temporal persistence that we have demonstrated in the previous case study are all assets when studying the influence of self-organization on ad hoc network behavior. In particular, does a routing protocol based on a self-organization scheme offer better performance than one based on a flat, unstructured network? In addition, since self-organization reveals a node hierarchy, why not use it to save energy for nodes that are less involved in the virtual topology?
1 Detailed results are provided in [THE 04b, THE 04c].
5.6.1. Energy saving
Network nodes rely on their own energy reserves, so it is important to provide solutions to increase their lifespan. The use of a self-organizing structure involves strong nodes in network behavior, which provides the opportunity to use simple solutions to save the energy of weaker nodes. The only solution to significantly reduce a node's energy consumption is to put it in sleep mode [FEE 01]. No longer transmitting packets is not enough to conserve the batteries of mobile nodes, because listening to the radio carrier also consumes energy. It is possible to put dominatees to sleep for a period of time with low impact on the network's behavior and on the virtual topology created. On the other hand, we must take neighborhood density into account. Putting a node with few neighbors in sleep mode risks disconnecting the network. For this reason, we require a minimum number of active neighbors before a dominatee can sleep. The probability Psleep that a node will go to sleep can simply be expressed in terms of a penalty function (Pe) representing the number of 1-neighbors with the lightest weight:
\[ P_{sleep} = \frac{1}{Pe} \]
Sleeping nodes are thus the least important dominatees in the network. This metric can obviously be modified according to the node consumption model, the objective in terms of network lifespan, etc. The principle of using self-organization as a means to deploy energy conservation functions efficiently remains the same. In [THE 04b], the impact of this energy conservation solution is explained in more detail. This report highlights a slight loss of connectivity of the self-organized structure, of less than 4%. On average, however, a node will go to sleep 11% of the time. We observe here a strong influence of the metric used to elect dominator nodes: a metric that takes an energy factor into consideration will have greater influence on a network's lifespan.
5.6.2. Influence of self-organization on routing

Still using the self-organization structure introduced earlier, we now study how self-organization can influence routing solutions. In particular, since the virtual topology structures the ad hoc network at two levels, a routing solution using this structure can offer a hierarchical routing methodology that separates intra-cluster routing from inter-cluster routing. The virtual topology naturally offers proactive knowledge of dominating zones in a kcds-neighborhood. Proactive knowledge of a cluster does not introduce significant overhead, although this overhead should be characterized. A proactive routing solution within clusters thus becomes vital. Since the virtual topology is more stable than the physical neighborhood, inter-cluster routing based on this cluster topology should be more robust while requiring less control-packet overhead. By adjusting the parameters kcds and kcluster, it is also possible to limit the overhead of this self-organization-based routing solution.

5.6.2.1. Intra-cluster routing

The maintenance protocol of the virtual topology already creates a neighborhood table for the kcds-neighborhood. A backbone radius kcds = 2 yields a stable backbone with a reduced number of clusterheads (~20% [THE 04a]). To obtain proactive knowledge of the cluster, hello packets are relayed over 2*kcds+1 hops instead of kcds+1 hops. A node relays a hello if it arrives over a bidirectional link, comes from a node belonging to the same cluster, and has a TTL higher than 1. This neighborhood discovery provides knowledge of the cluster’s internal topology. On this topology, a node can run a shortest path algorithm (Dijkstra, for example) to calculate optimal routes.

5.6.2.2. Inter-cluster routing

A reactive routing solution between clusters is proposed. A route is characterized by a series of cluster identities instead of a series of node identities (addresses).
Once more, this type of routing takes advantage of the virtual topology’s stability.

5.6.2.2.1. Discovery of cluster topology

Since a route is defined as a series of cluster identifiers to follow, a node must know its adjacent clusters and a route to reach them. This knowledge is integrated into hello packets in addition to the regular information: identifier, weight, state and identifier of the clusterhead. A node neighboring several clusters can thus serve as a gateway to the announced clusters. A route to a gateway can also be calculated with the help of the proactive intra-cluster routing protocol.
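As an illustration of how adjacent clusters and gateways could be derived from hello packets, here is a minimal sketch. The `Hello` record carries the fields listed above plus an `adjacent_clusters` field; that field name, and the two helper functions, are assumptions made for this example rather than the packet format of the protocol.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hello:
    """Hello content: regular fields plus the announced adjacent clusters."""
    node_id: str
    weight: int
    state: str                                   # "dominator" or "dominatee"
    clusterhead: str                             # identifies the sender's cluster
    adjacent_clusters: frozenset = frozenset()   # hypothetical field name

def adjacent_clusters(my_cluster, one_hop_hellos):
    """Clusters this node borders: clusterheads of 1-neighbors in other clusters."""
    return frozenset(h.clusterhead for h in one_hop_hellos
                     if h.clusterhead != my_cluster)

def gateways(my_cluster, cluster_hellos):
    """Map each foreign cluster to the nodes of my cluster that announce it."""
    table = {}
    for h in cluster_hellos:
        if h.clusterhead != my_cluster:
            continue  # only nodes of our own cluster can be our gateways
        for cluster in h.adjacent_clusters:
            table.setdefault(cluster, set()).add(h.node_id)
    return table
```

A node would call `adjacent_clusters` on the hellos of its 1-neighbors to fill its own announcement, and `gateways` on the hellos relayed inside its cluster to learn which nodes lead to which neighboring cluster.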
5.6.2.2.2. Route discovery

When a node S wants to send a data packet to D, one of the following cases applies:
– D is at most kcds hops from S, or S and D have the same clusterhead. D is thus in S’s neighborhood table, and S directly executes the proactive intra-cluster routing algorithm introduced previously to reach D;
– D is in S’s routing table. S thus has a cluster route to reach D and executes the inter-cluster routing algorithm;
– otherwise, S initiates a route discovery procedure.

S lets a dominator play the role of proxy for route discovery. It generates a Route Request in which it enters the address of its clusterhead in the packet’s list of clusterheads. It then multicasts the request to the other backbone members. Dominatees do not participate in route discovery. When a dominator receives a Route Request, it relays the packet if it does not know D and has not seen the packet before. Before relaying it, the dominator appends its cluster identifier if it differs from the last cluster identifier of the route contained in the packet. If D is in the dominator’s neighborhood table, the dominator instead generates a Route Reply containing the cluster route carried in the Route Request. It also adds its own cluster identifier and D’s cluster identifier to this route if they are not already present. The Route Reply thus contains the cluster route to follow from D to S. The dominator sends the Route Reply to S by executing the inter-cluster routing algorithm. Since Route Requests are relayed only by backbone members, the overhead incurred by a route discovery is reduced.

5.6.2.2.3. Routing protocol

The inter-cluster routing protocol is used for data packets and Route Replies. The route of clusters to follow is contained in the packet header. Before relaying a Route Reply, a node can cache the acquired route to S and to D, in order to reduce the number of route discoveries generated later.
If the final destination is in its neighborhood table, node N1 sends the packet directly using intra-cluster routing. If not, it looks for the first known cluster G of the route that is closest to the destination, and selects the next hop as follows:
– a 1-neighbor N2 has clusterhead G. N2 is the next hop;
– a 1-neighbor N2 is a gateway for G. N2 is the next hop;
– a node N2 is the closest gateway to G in N1’s cluster. N1 executes intra-cluster routing to reach N2.

N1 and N2 are in the same cluster and have the same local view. Because of this, they will make consistent routing decisions.
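The next-hop selection above can be sketched as follows. This is an illustrative reading, not the authors’ implementation: the table layouts and parameter names are hypothetical, and scanning the cluster route from the destination end is one interpretation of “first known cluster closest to the destination”.

```python
def next_hop(dest, cluster_route, intra_next_hop, one_hop, cluster_gateways):
    """Return the next hop chosen by node N1, or None if no case applies.

    dest             -- final destination address
    cluster_route    -- cluster ids from the packet header, ordered towards dest
    intra_next_hop   -- dict: node in N1's neighborhood table -> next hop given
                        by intra-cluster routing (hypothetical table)
    one_hop          -- dict: 1-neighbor -> (its clusterhead, set of clusters
                        it is a gateway for)
    cluster_gateways -- dict: cluster id -> gateways in N1's cluster, closest first
    """
    # Destination known locally: plain intra-cluster routing.
    if dest in intra_next_hop:
        return intra_next_hop[dest]
    # Assumption: try the clusters of the route starting nearest the destination.
    for g in reversed(cluster_route):
        # Case 1: a 1-neighbor belongs to cluster G.
        for n2, (head, _) in one_hop.items():
            if head == g:
                return n2
        # Case 2: a 1-neighbor is a gateway for G.
        for n2, (_, gateway_for) in one_hop.items():
            if g in gateway_for:
                return n2
        # Case 3: route inside the cluster towards the closest gateway to G.
        for n2 in cluster_gateways.get(g, []):
            if n2 in intra_next_hop:
                return intra_next_hop[n2]
    return None
```

Returning `None` leaves the failure policy (drop, buffer, or trigger a new discovery) to the caller, since the text does not prescribe one at this step.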
This type of solution does not create a routing loop: with each relay, the packet gets one hop closer to the destination. However, neighborhood tables built in a distributed way may contain inconsistencies [WU 04]. To avoid loops, a previously seen packet is simply dropped silently. A node considers that it has already processed a packet if it has already relayed a packet with the same source address and identifier. Route calculation through a cluster is dynamic. A node always chooses to relay the packet to the first known cluster, and it must therefore update the route in a Route Reply whenever that route is modified, so that the source benefits from this dynamic route calculation. This update is not required for a data packet, since the source will not receive the modification. Finally, route length is not optimal, since the packet is always relayed to the closest cluster towards the destination. This type of dynamic route calculation is very robust: packets arrive at the destination even if numerous individual nodes move. Only the cluster route needs to remain valid for the packet to be delivered.

5.6.2.2.4. Route repair

Since delivery rate is a major performance criterion for a routing protocol, it should be maximized, for example by using a simple packet acknowledgement mechanism. If a node sends a data packet and does not receive an acknowledgement within a given period of time, it retransmits the packet. A passive acknowledgement obtained through cooperation with the IEEE 802.11 MAC layer, as in DSR [JOH 03], is also possible; in that case, there is no overhead. If a packet cannot be transmitted on a path, a local route repair mechanism can be used: the node re-executes the routing algorithm while simply preventing the faulty node from becoming the next hop.
This type of reconstruction limits the impact of the convergence delay of neighborhood tables, improving delivery rate at the cost of end-to-end delay.

5.6.2.3. Performance

We now summarize the results detailed in [THE 05]. The important point is the spatial and temporal stability that a self-organized structure provides, a property that is retained when the structure is used for routing. Routes appear more stable than those built by traditional routing protocols based on a flat approach (AODV, OLSR), and also more robust than those of a hierarchical solution such as CBRP. In a self-organized system, a node must react and adapt to local changes. As a result, a route is maintained locally and therefore becomes more robust. Delivery rate increases by nearly 8% compared to flat approaches. Self-organization properties allow the routing protocol to better support mobility or a harsh environment: we observe that performance is relatively insensitive to these events. Route construction as proposed leads to a longer average route length than that of proactive and reactive protocols. The advantage of self-organization-based routing over a hierarchical protocol such as CBRP resides in its low latency because of previously
known self-organization topology information which is reused for routing. On the other hand, since the routing proposed here uses the backbone to forward Route Request control traffic, we observe lower network capacity under dense traffic.

5.7. Conclusion

Self-organization is a key point for spontaneous networks, whether of the ad hoc or sensor type, because in our view the self-organization paradigm is at the heart of all their problems. It brings elements of solutions to the autonomous behavior of this type of network and should prompt a reconsideration of protocol deployment. Mobile ad hoc networks are self-organized networks by nature. The network protocols (routing, localization, diffusion, etc.) are based on local node collaboration. These local interactions enable the emergence of a group of services. In this chapter, we have presented another way of considering self-organization, which consists of studying the dynamic topologies that we can build to structure the network. The objective is obviously for this structure to support more efficient network protocols. Self-organization solutions must not rely on a centralized coordination system. The key points to remember when proposing self-organized architectures are:
– only local interactions;
– emergence of a global structure from local information;
– responsiveness to local changes and robustness;
– self-organizing structure not source-oriented;
– scalability.

We have emphasized two topologies for self-organization: virtual backbones and clusters, or service zones. A few algorithms allowing their construction to be adapted to a dynamic environment were presented. Construction of a virtual topology brings robustness and stability. The temporal persistence observed provides a more stable environment for network protocol development, which led us to focus on the contribution of self-organization to the behavior of an ad hoc network.
We studied how a simple energy-saving solution could benefit from such a structure, and we focused on a routing solution benefiting from a self-organized structure. The performance results show much better behavior than traditional approaches.
5.8. Bibliography

[ADJ 05] ADJIH C., JACQUET P., VIENNOT L., “Computing Connected Dominating Sets with Multipoint Relays”, Ad Hoc and Sensor Wireless Networks, vol. 1, no. 1-2, January 2005.
[ALZ 02a] ALZOUBI K., WAN P.-J., FRIEDER O., “Message-Optimal Connected Dominating Sets in Mobile Ad Hoc Networks”, 3rd ACM International Symposium on Mobile Ad Hoc Networking and Computing, Lausanne, Switzerland, p. 157-164, June 2002.
[ALZ 02b] ALZOUBI K.M., WAN P.-J., FRIEDER O., “Distributed Heuristics for Connected Dominating Set in Wireless Ad Hoc Networks”, IEEE ComSoc/KICS Journal of Communications and Networks, Special edition, Innovations in Ad Hoc Mobile Pervasive Networks, vol. 4, no. 1, p. 22-29, March 2002.
[BUT 03] BUTENKO S., CHENG X., DU D.-Z., PARDALOS P.M., “On the Construction of Virtual Backbone for Ad Hoc Wireless Networks”, in S. Butenko, R. Murphey, P.M. Pardalos (eds.), Cooperative Control: Models, Applications and Algorithms, p. 43-54, Kluwer Academic Publishing, Boston, January 2003.
[CAR 02] CARDEI M., CHENG X., CHENG X., DU D.-Z., “Connected domination in ad hoc wireless networks”, International Conference on Computer Science and Informatics (CSI), North Carolina, USA, March 2002.
[CAR 05] CARTIGNY J., INGELREST F., SIMPLOT-RYL D., STOJMENOVIĆ I., “Localized LMST and RNG Based Minimum-Energy Broadcast Protocols in Ad Hoc Networks”, Ad Hoc Networks, vol. 3, no. 1, p. 1-16, January 2005.
[CHE 02] CHENG X., DU D.-Z., Virtual Backbone-Based Routing in Multihop Ad Hoc Wireless Networks, Report no. 02-002, University of Minnesota, Minnesota, USA, January 2002.
[DAI 03] DAI F., WU J., “Distributed Dominant Pruning in Ad Hoc Networks”, International Conference on Communications (ICC), vol. 1, p. 353-357, IEEE, Anchorage, USA, May 2003.
[FEE 01] FEENEY L., NILSSON M., “Investigating the Energy Consumption of a Wireless Network Interface in an Ad Hoc Networking Environment”, INFOCOM, p. 1548-1557, IEEE, Anchorage, USA, April 2001.
[JOH 03] JOHNSON D.B., MALTZ D.A., HU Y.-C., The Dynamic Source Routing Protocol for Mobile Ad Hoc Networks (DSR), Internet draft version no. 09, IETF, April 2003.
[LI 03] LI N., HOU J., SHA L., “Design and Analysis of an MST-Based Topology Control Algorithm”, INFOCOM, IEEE, San Francisco, USA, April 2003.
[LIA 00] LIANG B., HAAS Z.J., “Virtual Backbone Generation and Maintenance in Ad Hoc Network Mobility Management”, INFOCOM, p. 1293-1302, IEEE, Tel Aviv, Israel, March 2000.
[LIN 97] LIN C.R., GERLA M., “Adaptive clustering for Mobile Wireless Networks”, IEEE Journal on Selected Areas in Communications, vol. 15, no. 7, p. 1265-1275, 1997.
[LOU 03] LOU W., WU J., “A Cluster-Based Backbone Infrastructure for Broadcasting in MANET”, Workshop on Wireless, Mobile and Ad Hoc Networks, in conjunction with IPDPS, Nice, France, April 2003.
[NI 99] NI S., TSENG Y., CHEN Y., SHEU J., “The Broadcast Storm Problem in a Mobile Ad Hoc Network”, International Conference on Mobile Computing and Networking (MOBICOM), p. 151-162, ACM, Seattle, USA, August 1999.
[PER 94] PERKINS C.E., “Highly Dynamic Destination-Sequenced Distance-Vector Routing (DSDV) for Mobile Computers”, SIGCOMM, p. 234-244, ACM, London, UK, August 1994.
[PRE 05] PREHOFER C., BETTSTETTER C., “Self-Organization in Communication Networks: Principles and Design Paradigm”, IEEE Communications Magazine, vol. 43, no. 7, p. 78-85, July 2005.
[SCH 93] SCHNEIDER M., “Self-Stabilization”, ACM Computing Surveys, vol. 25, no. 1, p. 45-67, March 1993.
[THE 04a] THEOLEYRE F., VALOIS F., “Robustness and Reliability for Virtual Topologies in Wireless Multihop Access Networks”, Mediterranean Ad Hoc Networking Workshop (MedHocNet), p. 81-92, Bodrum, Turkey, June 2004.
[THE 04b] THEOLEYRE F., VALOIS F., “Topologie Virtuelle pour une Organisation des Réseaux Hybrides Multisauts”, Journées Doctorales Informatique et Réseaux (JDIR), Lannion, France, November 2004.
[THE 04c] THEOLEYRE F., VALOIS F., “A Virtual Structure for Mobility Management in Hybrid Networks”, Wireless Communications and Networking Conference (WCNC), vol. 5, p. 1035-1040, Atlanta, USA, March 2004.
[THE 05] THEOLEYRE F., VALOIS F., “Routage hybride sur structure virtuelle dans les réseaux mobiles ad hoc”, Colloque francophone sur Ingénierie des Protocoles (CFIP), Bordeaux, France, March 2005.
[TOU 80] TOUSSAINT G., “The Relative Neighborhood Graph of a Finite Planar Set”, Pattern Recognition, vol. 12, p. 261-268, 1980.
[WAN 04] WAN P.-J., ALZOUBI K.M., FRIEDER O., “Distributed Construction of Connected Dominating Set in Wireless Ad Hoc Networks”, Mobile Networks and Applications, vol. 9, no. 2, p. 141-149, April 2004.
[WU 99] WU J., LI H., “On Calculating Connected Dominating Set for Efficient Routing in Ad Hoc Wireless Networks”, International Workshop on Discrete Algorithms and Methods for Mobile Computing and Communications (DIALM), p. 7-14, ACM, Seattle, USA, August 1999.
[WU 04] WU J., LOU W., “Extended Multipoint Relays to Determine Connected Dominating Sets in MANETs”, Conference on Sensor and Ad Hoc Communications and Networks (SECON), Santa Clara, USA, October 2004.
Chapter 6
Approaches to Ubiquitous Computing
6.1. Introduction

Ubiquity of communications and services represents the possibility for a user to access services and resources anywhere, at any time, from either a fixed or a mobile terminal [WEI 93]. It is the development of wireless telecommunications and mobile networks, and their convergence with fixed communication networks (i.e. proximity and remote networks), which enables us to consider this ubiquity of services and resources today. The major challenges for implementing ubiquity are, on the one hand, enabling processes and communications to adapt automatically to dynamic changes in the network, in the number and availability of resources, and in user behavior and, on the other hand, implementing mechanisms for the development of services [GAB 00, GAB 06, BAK 07]. In other words, the implementation of new applications in the context of ubiquitous computing requires the development of new approaches with self-organization, self-adaptation and emergence capabilities for mobile ad hoc and sensor network environments. According to Gaber’s classification, interaction paradigms fall into three categories: the traditional client to server paradigm (CSP) and two alternative paradigms, the adaptive services to client paradigm (SCP) and the spontaneous service emergence paradigm (SEP). These alternative paradigms are more suitable for ubiquitous and pervasive computing respectively, and require self-organizing, self-adaptive and emergence capabilities to cope with dynamically
Chapter written by Mohamed BAKHOUYA and Jaafar GABER.
changing context environments such as computing contexts and user contexts [GAB 00, GAB 06, BAK 07]. The traditional client/server paradigm is inadequate for ubiquitous computing. More precisely, in this paradigm, the client takes the initiative of requesting an existing service and has the means to locate it. In this chapter, we focus on the implementation of an alternative paradigm in which, contrary to the traditional client/server paradigm, the service goes to the client instead of the client taking the initiative and requesting a service or a resource whose existence and location it must first know. In addition, by integrating self-adaptation and self-organization capabilities, this paradigm enables the development of services with the help of a learning mechanism based on resource availability and user behavior [GAB 00, GAB 06, BAK 03, BAK 05, BAK 07].

This chapter is dedicated to the presentation of this approach and of the methodologies found in the literature for the implementation of ubiquitous computing. More precisely, we discuss service discovery and development systems in mobile ad hoc networks (MANETs) and wireless sensor networks. A mobile ad hoc network is a system made up of mobile sites. A site is any mobile object capable of communicating over a wireless network. These objects can be personal assistants, laptop computers, cell phones or sensors. There is no centralized management in these networks: mobile hosts form a network infrastructure in an ad hoc manner. In addition, no assumption or limitation is made on the ad hoc network’s size; the network may contain hundreds or thousands of mobile units [FOU 04, PAR 05]. This network lets its users access services whatever their geographical location [FOU 04]. In order for these users to access distributed services in a mobile environment, a service discovery system is required.
A service discovery system is based on the definition of an interaction protocol or mechanism between users and available services. In other words, it allows users to locate a service in order to access and use it [BAK 05]. A service is an application or a software component with a specifically defined function. It is directly accessible by users through requests. It can also interact with other services throughout the network in order to create combined services. More precisely, a service is an application equipped with a well-defined interface for accessing and using one or more resources. When there are multiple resources, they can be located in a single node or geographically distributed across several network nodes. We call a service based on distributed network resources a combined service. For example, current peer-to-peer (P2P) systems are systems in which the service provided is the sharing of disk space and file transfers distributed over several network servers.
Service discovery systems which exist in the literature can be classified into three families according to their architectures or their operation modes (see Figure 6.1). The first family groups systems which implement a structural organization for the location of network services. In other words, they are systems using indexing or hash mechanisms. The second family groups systems which are not based on any specific structure or organization. These systems are commonly called unstructured service discovery systems.
Figure 6.1. Classification of service discovery systems from the point of view of their architectures or their operation modes
The third family groups self-organizing and self-adaptive systems. In this category, we present a new approach based on the mobile agent paradigm and inspired by the human immune system for service discovery in MANETs. This approach integrates reduction/expansion capabilities that allow it to adapt automatically to an environment whose evolution can be random, and to the combined mobility of network nodes and users. The human immune system presents interesting functional properties such as self-organization, self-regulation, emergence and self-adaptation to an environment whose evolution is dynamic and random. The analogy with the immune system is made in the following way [GAB 00, GAB 06, BAK 03, BAK 05, BAK 07]: a user request corresponds to an antigen (or a virus) attacking the network. The network then behaves like the immune system to eliminate the attacking antigen, i.e. to respond to the user’s request. In other words, the requested resource or service corresponds to the immune response. This represents an alternative approach, contrary to the traditional client/server paradigm: it is the service that addresses the client, and not the client that
takes the initiative and requests a service or a resource knowing its existence and location. In this chapter, we present a study of service discovery systems using their architectures and operation modes in ad hoc networks as comparison criteria. We begin by describing the different structured and unstructured service discovery systems proposed in the literature in sections 6.2 and 6.3 respectively. In section 6.4, we present a comparative study of these systems in a dynamic context. In section 6.5, we present a self-adaptive approach based on the creation of communities for service discovery in mobile ad hoc networks.

6.2. Structured service discovery systems

6.2.1. Systems based on an indexing mechanism

These systems are based on the use of directories (or service lookups) and can be categorized into two families. The first groups systems using a centralized indexing mechanism, proposed for locating services in small networks. The second groups systems based on a decentralized indexing mechanism, proposed for locating services in large-scale networks. In order for servers to register their services in these directories and for users to access them, a mechanism for locating the servers that store these directories is needed. There are three types of location [GUR 03]: static location, passive location and active location. With static location, the addresses of the servers storing the directories are provided to users and servers by manual configuration. With passive location, service lookup servers periodically broadcast messages announcing their presence. With active location, users and servers are the ones broadcasting messages to locate service lookup servers. To let users know when a service arrives or leaves, leasing and notification mechanisms are used [GUR 03, PER 98]. With the leasing mechanism, the server must indicate the time period during which its service will be valid.
More precisely, it must renew the advertisement of its service before this period expires; otherwise, the user considers the service unavailable. The notification mechanism informs users, before the expiration of the service validity period, of certain events, for example the departure of the service or a change in its parameters.

6.2.1.1. Centralized indexing

The approach based on a centralized indexing mechanism is the most widely used by service discovery systems in small networks [BET 00,
GUR 03, ROB 98]. As an example, the Jini middleware service discovery system developed by Sun Microsystems [GUR 03] uses a directory which maintains information on available network resources. All three types of service lookup location can be used in Jini [MAT 01]. The standard CORBA middleware from the OMG (Object Management Group) also uses an indexing mechanism to locate services, with static location of the service lookup server [GUR 03]. In these systems, servers communicate information on their services to the directory. A service is located in the following manner: the user sends a location request to the service lookup server to get a list of servers whose available services match the request criteria. The user makes a choice and then connects directly to the server offering the requested service. Location using a single centralized service lookup server is simple and avoids having to broadcast or multicast service information or search requests. However, the fact that all service information is concentrated in one directory is a major drawback of this model in terms of scalability [MIL 02]. Another problem with this approach is its vulnerability to failures and to intermittent connections, since this server is a mobile node. Failure or disconnection of the central server would make it impossible to locate services.

6.2.1.2. Decentralized indexing

This indexing mechanism was mainly proposed to solve the scalability issues and the failure problem of the service lookup server in the centralized indexing mechanism in wired networks. As an example, the SSDS (secure service discovery system) architecture, developed in the context of the Berkeley Ninja Research Project at the University of California [CZE 99], uses service lookup servers structured hierarchically for service discovery [XU 01].
The basic entities of this architecture are the SC (service client), SP (service provider) and DS (discovery service). SCs correspond to users discovering and using services. SPs correspond to service providers. DSs are domain servers acting as service lookups; a DS is associated with each domain. To disseminate information on services, each SP sends its service information to its domain server, which it knows by manual configuration (static location). This domain server logs the information and then sends it to its parent server in the hierarchy. The parent server logs this information and in turn sends it to its own parent server. This procedure is repeated until the service information reaches the central service lookup server (the root of the hierarchy). In this system, information on services is compressed using the Bloom filter technique [CZE 99]. A service is located in the following way: the user sends its request to its DS. The DS checks in its service lookup whether it can respond locally to the request
(i.e., the service lookup knows the location of the requested service). If that is the case, it responds to the user’s request; otherwise it forwards the request to its parent server. This procedure is repeated until the requested service is found.

The SLP (service location protocol) system was proposed by the Internet Engineering Task Force (IETF) [BET 00, GUT 99] for locating services on small networks [PER 98]. The architecture of SLP is made up of three types of entities represented by agents: a service agent (SA), which represents a service; a user agent (UA), acting on behalf of the user; and a directory agent (DA), which centralizes information on services [AZO 02]. UAs discover the services offered by SAs and logged in DAs. However, no detail is given on how to organize DAs. An extension of SLP is mSLP (mesh-enhanced service location protocol), which is more scalable because of collaboration between several DAs [ZHA 02] organized into a mesh network, that is, a topology where each DA is connected to all other DAs in the network. Service information is thus duplicated on the DA service lookup servers. However, if services frequently join or leave the network, the updates to this information overload the network and create a bottleneck at the DAs.

This type of architecture based on hierarchical service lookup servers was proposed to solve the scalability issue in wired networks. It is not suitable for mobile ad hoc networks, because the user’s connection to the service lookup server, and the connections between that server and the other directory servers, are difficult to maintain in the presence of node mobility [CHO 05, HAU 05, LIU 03]. Mobility also increases the cost of periodic updates of information on services arriving in or leaving the network. In addition, the failure or disconnection of a subordinate server or of the central server in the hierarchy would make locating services impossible.
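The hierarchical registration and lookup described above for SSDS can be illustrated with a small sketch. It is a simplified model under stated assumptions: the class and method names are hypothetical, and a plain dictionary stands in for the Bloom-filter summaries used by the real system.

```python
class DomainServer:
    """A discovery server (DS) in a hierarchy; illustrative names only."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.directory = {}   # service name -> provider address

    def register(self, service, provider):
        """Log the service locally, then propagate it up to the root."""
        self.directory[service] = provider
        if self.parent is not None:
            self.parent.register(service, provider)

    def locate(self, service):
        """Answer locally if possible, otherwise forward to the parent.

        Returns (provider, name of the answering DS), or (None, None)
        when even the root does not know the service.
        """
        if service in self.directory:
            return self.directory[service], self.name
        if self.parent is None:
            return None, None
        return self.parent.locate(service)
```

For example, with a root and two child domains, a service registered in one child is answered locally there and by the root for a client of the other child. The sketch also makes the weakness noted in the text visible: every lookup that misses locally depends on the parent chain being reachable.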
To address these issues, several approaches have been proposed for service discovery in mobile ad hoc networks [CHO 05]. These approaches are based on electing nodes that can assume the role of service lookup. As an example, [KOZ 04] proposes logging service information on nodes called dominator nodes. These nodes are considered stable and are selected according to how frequently they are disconnected. More precisely, the service discovery protocol is made up of two phases: a dominator election phase, BBM (backbone management), and a service location phase, DSD (distributed service discovery). The first phase selects a set of dominator nodes in the following way: a node with a low disconnection frequency, measured by its NLFF (normalized link failure frequency), becomes a dominator node. In other words, this phase eliminates nodes whose NLFF is higher than a given threshold nlff_th. Following this selection phase, each dominator node broadcasts a message to the other dominator nodes. Dominator nodes thus form an overlay network in the form of a grid (or a mesh).
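The threshold-based selection of dominator nodes can be sketched in a few lines. This is only an illustration: how NLFF values are measured is not described here, so they are taken as given inputs, and modeling the overlay as a full mesh is an assumption about what “grid (or a mesh)” means.

```python
def elect_dominators(nlff_by_node, nlff_th):
    """Keep the nodes whose NLFF does not exceed the threshold nlff_th."""
    return {node for node, nlff in nlff_by_node.items() if nlff <= nlff_th}

def overlay_links(dominators):
    """List the overlay links, assuming every dominator pairs with every other."""
    ordered = sorted(dominators)
    return [(a, b) for i, a in enumerate(ordered) for b in ordered[i + 1:]]
```

A lower threshold gives a smaller but more stable backbone; a higher one admits more dominators at the price of more overlay links to maintain.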
A node that is not a dominator is connected to a virtual access point (VAP) giving it access to all dominator nodes. It logs information concerning its services with the DA present in its VAP. So that other nodes can judge the validity of the information held in their caches, a leasing mechanism is used. More precisely, each node must indicate the time period during which its service is valid. Before this period expires, it must renew the registration of its service with its DA; otherwise the DA considers the service unavailable and deletes it from its cache [KOZ 04]. When a user wants to search for a service, it sends a location request to its VAP and to all neighbor VAPs until the requested service is located or a specified search limit, the TTL (time to live), is reached.

Another approach proposes organizing services in rings [KLE 03]: nodes which are physically close and offer similar services are grouped into a ring. Rings form a hierarchical structure. Each ring is represented by an elected node that serves as the service access point (SAP) to the services offered by the nodes of this ring. With this structure, locating services becomes simple and efficient: a request is routed through the rings until it reaches the requested service. In this approach, services are described in DAML-S (DARPA Agent Markup Language Service ontology). A similarity function between two services was also proposed; it returns an integer indicating whether or not two services are similar.

In the approach proposed in [SAI 05], the nodes that play the role of service lookup are elected among all network nodes based on their resource capacity and the constraints of their environment. Each service lookup node is responsible for storing information on the services available at neighbor nodes within H hops. It also exchanges its profile with other directory nodes.
This profile consists of the capacity of the node and a condensed summary of the information on services offered by the nodes in its neighborhood, built with the Bloom filter technique [SAI 05]. Services are described with the Web Services Description Language (WSDL). When a user wants to locate a service, it sends a location request to its service lookup node. If this node has neither the service nor information on the requested service, it forwards the request to one or more nodes likely to resolve it, chosen on the basis of their profiles. Content management for service lookup nodes is based on a cooperative cache policy that lets each of these nodes benefit from the information held by the other service lookup nodes. This cache policy is efficient in terms of request resolution time and message complexity. However, it does not take into account node resources in terms of storage capacity [HAU 05]. In addition, cache maintenance can be difficult because of node mobility.

To resolve the cache maintenance issue, another approach was proposed in [HAU 05]. This solution distributes information on available services in the network by decreasing the distance between a node that has kept a trace of
information on a service and a node wanting to locate this service. To do this, a distance factor was introduced. This factor controls the maximum distance separating a node holding a trace of information about a given service from a node without it [HAU 05]. In addition, a node must be able to find the requested service, or information on its location, from a closer node. The smaller the distance factor, the shorter the distance separating a node with information on the service from a node not having it and, consequently, the shorter the time needed to locate a given service.

A method proposed in [LIU 03] is based on dynamically electing the nodes that play the role of service lookup. More precisely, a network node can play one or more of the following three roles: a user (or client) role when it requests a service, a provider role (RP, for resource provider) when it offers a service, or a service lookup role (DA, for discovery agent) if it is elected by the other nodes. In other words, users discover services that are offered by RPs and logged in DAs. Each DA is responsible for a set domain in collaboration with the other network DAs. The nodes that play the role of DA are selected by the following election algorithm. Each node broadcasts its presence to all network nodes, and the node with the smallest identifier is elected as DA. Supposing that M DAs are to be elected, this initial DA, which takes index 1, must choose M−1 other nodes to complete the set of DAs; these are indexed by the set {2, …, M}. Following this phase, DA addresses are periodically broadcast in the network. Each node not playing the role of a DA must choose the closest DA as its hDA (home DA) to log information on its services. Since nodes are mobile, DAs must keep the list of members present in their domain up to date.
Each DA periodically broadcasts to its neighbors a message containing its index, the message expiration date and a variable representing the distance covered by the message. When a non-DA node receives this message, it examines the distance separating it from the sending DA. If this distance is greater than the current distance to its hDA, it does not rebroadcast the message; otherwise it adopts this DA as its hDA and broadcasts the message to its neighbor nodes.

The third phase involves logging service information in DA nodes. Each service is characterized by an attribute known to all network nodes. A node providing a service announces it to its hDA by sending a logging request. This request contains the address of the RP node, the attribute α of its service and the validity date of the announcement. When the DA receives this message from an RP, it calculates an index β in the set of indexes {1, 2, …, M} using a hash function H, i.e. $\beta = H(\alpha)$. It then sends the message to the DAs $DA_{\beta}, DA_{\beta+1}, \ldots, DA_{\beta+k-1}$, which log the information concerning the service with attribute α.
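The mapping from a service attribute to the k DAs that log it can be sketched as follows. The text does not specify the hash function H or what happens when β+k−1 exceeds M, so this sketch assumes SHA-1 reduced modulo M and a wrap-around from index M back to index 1; the function name is hypothetical.

```python
import hashlib


def qualified_das(attribute, m, k):
    """Return the indexes (in 1..m) of the k DAs responsible for an attribute.

    Assumptions not taken from the text: H is SHA-1 reduced modulo m,
    and indexes wrap around from m back to 1.
    """
    digest = hashlib.sha1(attribute.encode("utf-8")).digest()
    beta = int.from_bytes(digest, "big") % m + 1  # beta lies in 1..m
    return [(beta - 1 + i) % m + 1 for i in range(k)]


# Both the logging DA and the querying hDA compute the same list,
# so a request for "printer" reaches the DAs that logged it.
targets = qualified_das("printer", m=5, k=3)
```

Because the computation depends only on the attribute, any DA can recover the same group of indexes without exchanging additional state.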
Service location in DAs is carried out as follows. A user wanting to locate a service sends a request to its hDA. If the hDA has neither the service nor information on locating it, it calculates the index β using the function H(α) to obtain all qualified DAs: $DA_{\beta}, DA_{\beta+1}, \ldots, DA_{\beta+k-1}$. The request is then sent to the qualified DA closest to the hDA. If this DA cannot respond, the second DA of the group is chosen, and the procedure is repeated until the requested service is located. In this way, the user receives a list of RPs and chooses the one providing the service with a given quality of service.

In this approach, DAs are organized into a mesh network, i.e. a topology in which each DA is connected to all other network DAs. This method enables users to locate available services in the network. However, the periodic broadcasting of logging requests between DAs, needed to maintain both the network between DA nodes and the information on services, carries the risk of overloading the network.

6.2.2. Systems based on distributed hash

Hash-based systems are service discovery systems mainly dedicated to locating available network files without going through a service lookup server. These systems rely on a distributed hash table [LI 02, LUA 05, MIL 02, RIS 04, SCH 03]. The approach is implemented in the following manner: a hash function is applied to each node’s IP address in order to generate its identifier, called nodeID. Similarly, the identifier of a file, called fileID, is generated from the file’s name or content by using this same function [FEL 03]. Using a hash function makes it possible to build an overlay network that functions independently of the underlying physical network. In this way, a set of file identifiers corresponds to each network node: for example, a node receives the files whose identifiers are close to its own identifier.
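The correspondence between file identifiers and node identifiers can be sketched as follows. This is an illustration under stated assumptions rather than the scheme of any particular system: identifiers are SHA-1 digests truncated to a 16-bit circular space, closeness is measured as circular distance, and the names `make_id`, `ring_distance` and `responsible_node` are hypothetical.

```python
import hashlib

ID_BITS = 16             # assumed small identifier space, for readability
ID_SPACE = 1 << ID_BITS


def make_id(text):
    """Hash an IP address (nodeID) or a file name (fileID) into the same space."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % ID_SPACE


def ring_distance(a, b):
    """Distance between two identifiers on a circular identifier space."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


def responsible_node(file_name, node_addresses):
    """Pick the node whose nodeID is closest to the file's fileID.

    Ties are broken by address so that every node reaches the same answer.
    """
    file_id = make_id(file_name)
    return min(node_addresses,
               key=lambda addr: (ring_distance(make_id(addr), file_id), addr))


nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
holder = responsible_node("report.pdf", nodes)
```

Since every node applies the same function to the same inputs, a request can be directed toward the responsible node without consulting a central index.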
This matching between nodeID and fileID enables routing of requests toward nodes with a nodeID as close as possible to the requested file’s fileID [JAN 02, LUA 05, MIL 02]. The system proposed by Plaxton et al. [PLA 99] is the first system based on the development of an overlay network in the form of a mesh [LI 02]. File and node identifiers are generated by hash function SHA-1 (secure hash algorithm version 1) [PLA 99]. Other more elaborate systems derived from Plaxton routing and locating mechanisms were proposed for file transfers in a dynamic network such as Pastry [ROW 01] and Tapestry [ZHA 01]. In order to keep root nodes from indicating that an absent node holds a file, a periodic file information dissemination mechanism is offered by the Pastry and Tapestry systems [LI 02]. In fact, a node deletes the
pointer to any node which has not disseminated its files within a set delay. Similarly, during the dissemination of a file, if its root node no longer exists, another node replaces it. For example, in Tapestry, several root nodes are defined as responsible for the same file, so that the failure of one root node does not make it impossible to locate files. However, in a dynamic network in which nodes and connections appear and disappear over time, periodic dissemination of file information can overload the network [GAU 02, JAN 02].

Other file transfer systems based on the construction of an overlay network exist, such as Content-Addressable Networks [GAU 02, RAT 01], Chord [STO 01] and Viceroy [GAU 02, MAL 02]. This node structure in overlay networks enables files to be located efficiently, but in return requires an algorithm to build and maintain the network as nodes frequently arrive or depart [LIB 02]. In addition, the overlay network between peers does not reflect the physical network topology: two nodes that are distant in the overlay network may in reality be very close geographically, and vice versa. To avoid this loss of efficiency for service discovery in mobile ad hoc networks, where the notion of proximity is very important, geographic location must be taken into account during the creation and update of the overlay network. More precisely, the allocation of identifiers must be based on the network’s topology. In addition, these systems do not consider node resources in terms of storage capacity [ROB 04]. An approach using the Chord system [STO 01] was proposed for service discovery in mobile ad hoc networks with a cache adjustment mechanism [ROB 04]. In other words, nodes with a higher storage capacity can cache more information about services than other nodes.

6.3. Unstructured service discovery systems

The second family of service discovery systems groups all systems said to be unstructured. In this type of system, no service lookup centralizes information on services. To locate a service, two mechanisms can be used: flooding or random walk [GUR 03, SHU 02].

6.3.1. Flooding-based mechanism

Two techniques based on the flooding mechanism can be used for the discovery of services present in the network: push and pull techniques. In the push technique, the user periodically broadcasts his discovery request for services available in the network. In the pull technique, it is the servers who periodically broadcast information on their services, making users aware of available services in the network.
Several service publishing strategies were proposed for service discovery in a mobile ad hoc network using the pull technique: a greedy strategy, incremental, uniform memoryless, with memory and a conservative strategy [HAU 05, LUO 03, LUO 04]. In the first strategy, a node publishes its service to all network nodes. A user wishing to locate a service broadcasts his request to all network nodes. In fact, if all the nodes know all available network services, users will not transmit their requests. In an ad hoc network, the nodes are limited in storage capacity. It is impossible for all nodes to store information on all services available in the network. In addition, nodes can leave or join the network dynamically. Consequently, for nodes to know about the arrival or departure of a node, leasing and notification mechanisms will be used [GUR 03]. However, periodic broadcasting of messages can overload the network [LIU 03]. With the incremental strategy, each node publishes its service to nodes in a set zone at each step and can increment this zone. A user also wanting to locate a service sends a request to nodes present in this zone. He can also extend this zone if the request is not satisfied. In the uniform memoryless strategy, a node publishes its service to a group of nodes chosen randomly. A user wishing to locate a service sends his request to a group of randomly chosen nodes. With the memory strategy, the user selects a series of nodes, at each step, among nodes not reached in the previous step. DEAPspace [NID 01] is a system dedicated to services discovery in mobile ad hoc networks. It is based on the pull technique for service information dissemination. More precisely, each node maintains a service lookup (or a cache) to store information on services. To disseminate service information, each node periodically sends information on its services to all network nodes. Each node must indicate the period of time during which the service is valid. 
Before the expiration of this delay, it must renew dissemination of service information, otherwise it is considered by the other nodes as unavailable and will be deleted from their cache. When a node receives a service dissemination message, it logs information concerning this service in its directory and then broadcasts it to other nodes. This procedure is repeated until the information about services reaches all network nodes. When a node receives its service information dissemination message, it increases its time of validity in order to increase the chance of being received by all network nodes. This technique of dissemination based on service lookup broadcast decreases the time of service location. However, periodic broadcasting of messages increases network resource usage. Konark [HEL 03] is another system dedicated to service discovery in mobile ad hoc networks by using the push and pull techniques. In other words, each node maintains a service lookup (or a cache) to store information on services. This cache
is organized in the form of a tree to categorize services and to facilitate the search for, and dissemination of, information on these services. Services are described in an XML-based description language similar to WSDL. When a user wishes to locate a service, he creates a location message and sends it by multicast. When a node receives this message, it checks its service lookup to see whether it can respond to the request locally (i.e. whether the service lookup holds the location of the requested service). If so, it responds to the request and sends a dissemination message to inform other users requesting the same service of its location. When the user receives the dissemination message, he stores it in his service lookup.

HAID (hybrid adaptive protocol for integrated discovery) is a system proposed in [CHA 05] based on the push and pull techniques, in which information on services is disseminated to the nodes in a given zone. This zone is determined by the node’s one-hop neighbors and can vary dynamically according to the requests of users requiring a given service. All the nodes of each region store information on the services offered by the nodes of this region. Each node periodically sends a service information dissemination message to its one-hop neighbors. This message contains the following information: type of service, service provider ID, identifier of the node receiving this message and region of dissemination. This region corresponds to the lifespan of message propagation, initially set at 1. When a node receives this message, it searches its directory: if the disseminated service is already there, it updates the information concerning this service; otherwise it adds the service to its service lookup. A user wanting to locate a service sends a location message with a time to live (TTL). When a node receives this message, if it has neither the requested service nor information about it, it decrements the TTL value and rebroadcasts the message.
Otherwise, it notifies the requesting node by sending a response containing provider identifier and the number of hops to reach it. The message follows the opposite route until it reaches the user. If the TTL value is equal to zero, the request is not rebroadcast in the network. Another system using the push and pull techniques for service discovery in a mobile ad hoc network was proposed in [MOH 04]. Each node periodically broadcasts information on its services to other network nodes. Each user wanting to search for a service which has not yet received a dissemination message concerning this service broadcasts his request in the network. To limit periodic broadcasting of dissemination messages, each server listens to its carrier to determine the number of nodes having disseminated their services or request a service location. If this number is lower than a given threshold, the server sends a service dissemination message, otherwise it waits for a given period of time. After the time has expired, it listens to its carrier once more and repeats the same procedure.
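The listen-then-send rule used by a server in [MOH 04] can be sketched as a simple loop. The list of observed counts, the threshold and the waiting period are hypothetical inputs chosen for illustration; the text does not give concrete values or say how the carrier is sampled.

```python
def dissemination_time(observed_counts, threshold, wait_period):
    """Return the time at which the server sends its dissemination message.

    observed_counts: hypothetical number of disseminating or requesting nodes
    heard on the carrier at each successive listening attempt.
    Returns None if the activity never drops below the threshold.
    """
    elapsed = 0
    for count in observed_counts:
        if count < threshold:
            return elapsed       # channel quiet enough: send now
        elapsed += wait_period   # too busy: wait, then listen again
    return None


# The server hears 5 and then 4 active nodes, waits twice, and sends at t = 20.
send_at = dissemination_time([5, 4, 1], threshold=3, wait_period=10)
```

The same decision rule applies to a user who wants to send a location request, since the text states that users proceed in the same manner.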
When a node receives a request, if it has information concerning this service, it responds to the user by broadcasting the response. The response broadcast enables other users requiring the same service to know of its location. When a user wants to locate a service, he listens to his carrier to determine the number of nodes having been disseminated or to search for a service and proceeds in the same manner as a node wanting to disseminate its service. Gnutella [AND 01, JAN 02] is an unstructured system made up of a group of nodes, also called peers, for file transfer between users. The Gnutella system eliminates the need for a service lookup server to index user files [FEL 03]. A node joins the network by connecting to another node already connected. It then starts getting to know other network nodes by receiving requests or responses from them. The request resolution principle in the Gnutella system is accomplished in the following way: a user wanting to locate a file sends a request to neighbors. If the neighbors cannot respond to the request, they send the request to their own neighbors. If a node has the file involved, it notifies the user by sending a response. Otherwise, the search is stopped when a given search threshold (a given depth) has been reached. More precisely, requests have a limited hops to live (HTL) time [LUA 05]. This value is expressed in terms of maximum number of nodes to reach so as to avoid circulating in the network indefinitely. It is often set to 7 to start, and is decremented by each neighbor [FEL 03]. If the HTL value is equal to zero, the request is no longer rebroadcast in the network. The main drawback of the push technique is the significant number of messages transmitted for locating a file which overloads the network [FRA 02, LI 02]. In addition, requests can fail simply by expiration of their HTL before covering the whole network [DEV 03]. In other words, using HTL does not guarantee locating a service available in the network. 
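The HTL-limited flooding used by Gnutella, and the way a request can expire before reaching a node that holds the file, can be illustrated with a small sketch. The graph representation and the function name are hypothetical, and the message count is a simplification: it counts only the first delivery to each node, whereas a real network also carries duplicate copies.

```python
from collections import deque


def flood_search(graph, start, holders, htl):
    """Flood a request from `start` with a hop budget of `htl`.

    graph: dict mapping each node to the list of its neighbors.
    holders: set of nodes that have the requested file.
    Returns (holder or None, number of first deliveries).
    """
    visited = {start}
    frontier = deque([(start, htl)])
    messages = 0
    while frontier:
        node, remaining = frontier.popleft()
        if node in holders:
            return node, messages
        if remaining == 0:
            continue  # hop budget exhausted: do not forward further
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                messages += 1
                frontier.append((neighbor, remaining - 1))
    return None, messages


# A chain a - b - c - d where only d has the file.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
too_short = flood_search(chain, "a", {"d"}, htl=2)
long_enough = flood_search(chain, "a", {"d"}, htl=3)
```

With a budget of 2 hops the request stops at node c and fails even though the file exists, which is the failure mode described above; with 3 hops it reaches d.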
To solve this problem, a location mechanism based on a dynamic HTL (expanding ring, or dynamic TTL setting) was proposed [LI 02, RIS 04]. More precisely, a request is first transmitted with a small HTL. If no service is found, a new request is transmitted with a larger HTL, and so on until the requested service is found or until the HTL is large enough for the request to flood the whole network. This approach guarantees the location of an available network service but does not scale, because of the high number of messages used for locating a service [LI 02].

6.3.2. Random walk-based mechanism

To avoid the network overload caused by the flooding mechanism with HTL, a method proposed in [LV 02] uses a location algorithm based on parallel random walks, or k-random walk, in an unstructured system. The objective of this approach is to locate a file without using HTL. The principle of a single random
walk is that a node wanting to locate a file sends a message that performs a random walk, in other words a message whose route is not predetermined. When the requested service is located, the message returns to the initiating node. However, this approach increases service location time [RIS 04]. The k-random walk can decrease file location time: in this case, the node wishing to locate a file sends k messages which walk in parallel. In [GKA 04, LV 02], the authors show that location based on parallel random walks in an unstructured network scales better than a flooding mechanism with HTL, but the time needed to resolve requests can be high [LI 02, LV 02]. The solution consisting of duplicating network files, as proposed in [COH 02], is not appropriate since it raises the issue of updating the copies in a dynamic network.

6.4. Comparison between structured and unstructured systems

In structured systems based on the centralized indexing mechanism, request resolution relies on the use of a directory (or service lookup) that groups information on services available in a small-sized network. However, the fact that all information concerning services is centralized in a single place is a major flaw in this model in terms of scalability. In addition, failure of the server storing service information would make it impossible to locate these services. The use of a decentralized indexing mechanism can solve the scalability problem. However, the cost of information updates for services frequently entering or leaving the network becomes high [CHO 05].

Structured systems based on the use of the hash mechanism were initially proposed for file transfer and access and for disk space sharing. They require the development of an overlay network for locating available files in the network [FRA 02, MIL 02, SCH 03].
However, when one or more nodes simultaneously enter or leave an overlay network, a maintenance algorithm becomes necessary [LIB 02]. In addition, these systems use a mechanism of regular dissemination of file information. Consequently, the high number of messages generated for overlay network maintenance and periodic service information dissemination can overload the network. Unstructured systems do not use an approach based on using service lookups. Two techniques based on a flooding mechanism can be used for service discovery in the network: push and pull techniques. In the push technique, the user periodically broadcasts its discovery request for services available in the network. With the pull technique, it is the servers who periodically broadcast information on their services enabling users to know what services are available in the network. The advantage of this type of system is that they require no structure between network nodes. Therefore, the nodes can freely join or leave the network without any coordination
with other nodes. Service location is carried out through a flooding technique with TTL or HTL. In other words, requests have a limited lifespan in terms of hops so that they do not circulate in the network indefinitely. However, requests can fail simply because their lifespan expires before they have covered the whole network. To solve this problem, an approach based on the use of a dynamic HTL was proposed [LI 02, RIS 04]. This approach guarantees the location of a service available in the network but does not scale, because of the high number of messages used for locating a service. To avoid overloading the network with messages, a method based on parallel random walks for request resolution was proposed [LI 02, LV 02]. However, the time required for request resolution can be high. The solution consisting of duplicating information concerning services in the network, as proposed in [COH 02], is not appropriate since it raises the problem of updating the copies in a dynamic network.

A self-organizing and self-adaptive method based on affinity networks was proposed in [BAK 03, BAK 05, BAK 07] for service discovery in dynamic networks. Self-organization is implemented by the creation of server communities (i.e. nodes able to offer services). A community is an affinity network connecting nodes that are capable of providing a service together. Self-adaptation is implemented by the automatic adjustment of affinities in response to changes in the number and availability of resources and services, and in user requests. More precisely, affinity networks self-organize and dynamically adapt to the type and frequency of user requests through a mechanism of learning and reinforcement of node affinity connections. In other words, the mechanism makes it possible to determine the corrections required for adapting to new network conditions and future user requests.

6.5. Self-organizing and self-adaptive approach

Contrary to the service discovery methods proposed in the literature, this approach is implemented through an adaptive middleware inspired by the human immune system [GAB 00, GAB 06, BAK 03, BAK 05, BAK 07]. The analogy with the immune system is made in the following way. A user request is considered as an antigen attacking the network. The middleware reacts by developing an immune response to eliminate the attacking antigen, i.e. to satisfy the user’s request [GAB 00]. The immune system presents a set of operational principles, such as self-regulation, self-organization, cooperation and adaptive memory [SOM 97, WAT 99]. These characteristics enable the middleware to develop decentralized and adaptive solutions in a dynamic and uncertain environment.
We should remind the reader that the operation principle of the immune system is the principle of selection by cloning [GAB 00]. This is put in place by idiotypic networks and by adaptive memory. In this approach, the analogy is the following. The principle of selection by cloning is developed by server communities representing idiotypic networks. Adaptive memory is represented by the request response reinforcement mechanism. The service discovery approach proposed is made up of two processes which can work in parallel. The first one concerns the creation of server communities (or affinity networks) and the second one involves user request resolution. We begin this section by presenting the server community development mechanism. Then, we will present request resolution mechanisms.

6.5.1. Server community construction approach

Jerne [STE 94, WAT 99] introduced the idea that the immune system’s B-cells communicate together and create affinity networks (or idiotypic networks) even in the absence of antigens. More precisely, B-cells are not isolated cells, but are connected by chains of stimulation/suppression in which a B-cell is considered an antigen for another B-cell. Therefore, we can consider that an idiotypic network can develop itself in two ways: a proactive way in the absence of an antigen and a reactive way by the introduction and stimulation of this antigen.

By analogy with the immune system, the creation of a community can be performed in a proactive or reactive way. The proactive creation is initiated by servers without the presence of user transmitted requests. Creation can be carried out in a reactive way during a user request resolution. The process of proactive creation is a community detection process enabling servers (in the sense of nodes able to offer resources) to mutually discover each other with the goal of creating affinity relationships.
To implement this detection process, we adopt a technique based on mobile agents executing parallel random walks [BAA 03, BRO 89] for discovery of services present in the network. Note that a service is made up of a group of resources. A node having a part of these resources cannot provide this service individually, but, because of the process of dissemination through mobile agents, nodes with resources necessary to implement the service, will mutually discover each other and will be able to provide this service together. More precisely, a community of these nodes is created to represent the service. More generally, a community emerges from the cooperation between network nodes through mobile agents to represent a service available in the network by
creating affinity connections. Three types of agents participate in the community development process:
– mobile agents, called aAgents, representing immune system antibodies. These agents implement the community detection process;
– resource agents, called BAgents, representing resources available in the network. These agents correspond to B-cells in the immune system;
– server agents (or node agents), called SAgents, representing network nodes. These agents correspond to T-cells in the immune system.

6.5.1.1. SAgent server agent

An SAgent is associated with each network node. It can belong to one or more communities and has two important roles. The first role concerns node resource management. In particular, when a node enters the network, the SAgent creates the BAgents associated with its resources. When a node resource must be deleted, the SAgent informs the BAgent representing this resource. The second role of the SAgent concerns the reception of the mobile antibody aAgents produced by proactive creation. The SAgent activates the appropriate BAgents for the implementation of affinity connections. More precisely, when the SAgent receives the activation message from an aAgent to develop a service, it sends back the list of the resources required to build the service, if they exist. It also sends back the list of its immediate neighbors so that the aAgent can select the next node to visit.

6.5.1.2. BAgent resource agent

Proactive creation is initiated by nodes without the presence of requests transmitted by users. BAgents, created by SAgents to represent resources, are responsible for this process through two roles. Their first role involves the creation of mobile antibody agents to stimulate the immune responses that lead to the creation of node communities. These communities represent initial services predefined for proactive creation.
More precisely, when a BAgent is created and initiated by the SAgent, it in turn creates a group of antibody aAgents for the proactive creation of services belonging to the group of initial services. The second role of the BAgent concerns the development of affinity connections between its resource and other resources discovered by aAgents. In other words, when the BAgent receives a message from an aAgent to create an affinity connection with another BAgent, it creates this affinity connection in its local table of connections with the affinity value, which is a real and positive number initially chosen in a random way. This value will then be determined based on the number of requests that covered this connection in search of the service, as we will see in section 6.5.2. We should note that an affinity connection is deleted when its affinity value becomes lower than or equal to 0. The value corresponding to an affinity connection
can decrease and fall to 0 or below when the community to which it belongs is no longer used by user requests.

6.5.1.3. Mobile aAgent

An aAgent is created by a BAgent, and its role is to create a community of nodes representing a given service. This agent corresponds to an immune system antibody. It moves randomly between network servers and activates the SAgents it encounters in order to stimulate the creation of affinity connections between the appropriate BAgents. More precisely, when this agent is created, it sends an activation message to the SAgent, requesting in return a message containing the list of resources corresponding to the service and the list of the node’s direct neighbors. When it receives this reply, if the server does not have the resources required by the service, it moves to a randomly chosen neighboring server. Otherwise, it activates each BAgent corresponding to a resource required for the service by sending it an activation message, requesting in return the affinity connections that the activated BAgents hold for the service to be created. If the activated BAgents do not yet have affinity connections involved in the creation of the service community, the agent randomly chooses one of them and sends it the message to create an affinity connection. The agent then moves to a neighbor node randomly chosen from the list of direct neighbors. If connections corresponding to the service to be created exist but do not yet include a link between BAgents of the current server and the last server visited, the agent sends a development message to the last BAgent that participated in creating the community representing the service. It then chooses the connection with the highest affinity value and moves to the corresponding server. The aAgent is eliminated when its lifespan has expired.
In the natural immune system, an antibody generated by a B-cell is considered an antigen for other B-cells. In other words, an antibody is an antigen which does not correspond to a request coming from a user but is proactively initiated by a BAgent. It is important to note that the mobile aAgent can clone itself during its random walk. The self regulation approach proposed in [AMI 02, BAK 05, BAK 06] could be used to regulate aAgent population size in the network. By analogy with the immune system, a community of servers represents an idiotypic network, server resources correspond to B-cells and a user request for a service corresponds to an antigen. In the natural immune system, when an antigen attacks the human body, B-cells activated by this antigen are cloned and produce antibodies until the antigen is eliminated; this is the “primary” response. Thus, the presence of these B-cells is reinforced by a cloning mechanism and when the same
antigen attacks the body once again, the immune system reacts more quickly to this new exposure; this is the “secondary” response. On the other hand, B-cells whose presence is not reinforced by antigens end up being eliminated [BAL 00]; this is apoptosis. In other words, B-cells not reinforced by exposure to the same antigen are gradually eliminated and the corresponding idiotypic networks disappear. Similarly, in this approach, when a community exists in the network but is not used, it must be dissolved (i.e. its affinity connections must be deleted). In addition, the mobile aAgents creating this community must also be eliminated from the network. Otherwise, these agents would continue to roam the network and build communities to represent their service. As we will see in the following section concerning request resolution, the resolution approach is based on the reinforcement of the affinity connections of a community. This reinforcement is performed by mobile agents representing requests when this community corresponds to the response requested. Otherwise, affinity connections are weakened. More precisely, the values of the affinity connections of a community of servers progressively decrease according to an adjustment equation that will be described in the following section.
6.5.2. Request resolution
Request resolution follows a greedy strategy based on the selection of the best neighbor within a community, combined with affinity connection reinforcement. More precisely, a mobile agent representing a request (or the antigen), once created, initiates a random walk in the network until it reaches a node that is able to provide one or more of the resources requested to make up the service. If the node encountered is the entry point to a community corresponding to the service requested, the agent executes a greedy run within this community according to affinity connection values: at each step, it follows the connection with the highest affinity value.
In addition, during its route, it reinforces the value of the selected affinity connections and decreases the value of the non-selected connections [BAK 03, BAK 05]. We will call this reinforcement process local reinforcement of connections. When the agent ends its route within the community and has built the list of servers providing the service, it proceeds to the reinforcement of the selected route. More precisely, it follows the route in the opposite direction back to the entry point, once more reinforcing the chosen links. We will call this second reinforcement process global reinforcement. The objective of the local reinforcement process is to allow servers to adapt their affinity connections in pairs according to the services available in the network and based on user requests. This step is vital in this self-organizing approach since it makes it possible for the system to adapt to dynamic network modifications, to the number and availability of resources and to user requests.
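As an illustration of the greedy run described above, the following Python sketch selects, at each node, the neighbor with the highest affinity value among those offering a resource that is still missing. The directed graph encoding, the names RESOURCES, AFFINITY and greedy_route, and the affinity values (borrowed from the example of Figure 6.2) are assumptions made for this illustration; it is not the authors' implementation, and reinforcement is deliberately left out.

```python
# Hypothetical encoding of the example graph of Figure 6.2.
# RESOURCES maps each node to the single resource it offers.
RESOURCES = {"s1": "r1", "s2": "r2", "s3": "r3",
             "s4": "r4", "s5": "r5", "s6": "r2"}

# AFFINITY maps a directed connection (i, j) to its affinity value m_ij.
AFFINITY = {("s1", "s2"): 0.2, ("s1", "s3"): 0.3, ("s1", "s4"): 0.4,
            ("s2", "s6"): 0.3, ("s3", "s6"): 0.7,
            ("s4", "s5"): 0.5, ("s5", "s6"): 0.6}


def greedy_route(entry, service):
    """Return the list of nodes visited, or None if the run gets stuck."""
    missing = set(service) - {RESOURCES[entry]}
    route = [entry]
    current = entry
    while missing:
        # Neighbors of the current node that offer a still-missing resource.
        candidates = [(weight, j) for (i, j), weight in AFFINITY.items()
                      if i == current and j not in route
                      and RESOURCES[j] in missing]
        if not candidates:
            return None
        # Greedy choice: follow the connection with the highest affinity.
        _, best = max(candidates)
        missing.discard(RESOURCES[best])
        route.append(best)
        current = best
    return route


print(greedy_route("s1", ("r1", "r2", "r3")))  # ['s1', 's3', 's6']
```

With these values the run starting at s1 goes through s3 and then s6, which is the route followed in the worked example of section 6.5.2.1.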
The objective of the global reinforcement process is to enable the selection, throughout the community, of an appropriate route for the user request. Remember that a route within a community corresponds to a list of nodes that can provide the service together. Since there are several possible routes, the global reinforcement process makes it possible for one to emerge based on user requests. This step is also vital in this self-organizing approach since it enables the emergence of adaptive solutions to requests transmitted by users. In a more formal way, request resolution is implemented by using the affinity connections created during the community development step. A community is an affinity network which can be represented by a directed weighted graph G(S, L), where S is the set of nodes and L is the set of logical connections representing the affinity relations between these nodes. A connection between node i and node j is characterized by a weight $m_{ij}^{(s)}$ representing the affinity between these two nodes within the community representing service s.
6.5.2.1. Local reinforcement mechanism
The affinity $m_{ij}^{(s)}$ between a resource of server i (i.e. a BAgent of i) and a resource of server j (i.e. a BAgent of j), in relation to a service s, is adjusted at step k+1 in the following way:
$$m_{ij}^{(s)}(k+1) = m_{ij}^{(s)}(k) + \Delta m_{ij}^{(s)}(k) = m_{ij}^{(s)}(k) + \mu \left( \mathrm{satLocal}_{ij}^{(s)} - f\left(m_{ij}^{(s)}(k)\right) \right) \qquad [6.1]$$
f being the logistic function $f(m_{ij}) = \dfrac{1}{1 + e^{-m_{ij}}}$, and µ being a proportionality constant between 0 and 1. $\mathrm{satLocal}_{ij}^{(s)}$ is the local satisfaction between BAgent i and BAgent j in relation to request s. This value is 1 for positive satisfaction, in other words when the request agent arriving at server i selects server j among the neighbors of i providing the next requested resource. $\mathrm{satLocal}_{ij}^{(s)}$ is 0 for negative satisfaction. The use of the logistic function enables the affinity value $m_{ij}^{(s)}$, when close to 0, to quickly increase when $\mathrm{satLocal}_{ij}^{(s)}$ is equal to 1 and to quickly decrease when $\mathrm{satLocal}_{ij}^{(s)}$ is equal to 0. To illustrate this reinforcement mechanism, we consider the following example in which a request agent searches for a service s made up of three resources r1, r2
and r3 (i.e., s = (r1, r2, r3)). Let us presume that the request agent has arrived at node s1, which provides resource r1 (see Figure 6.2). The request agent thus considers that s1 is the entry point to a community providing the requested service s; s1 has affinity relations with its neighbors s2, s3 and s4. These neighbors provide resources r2, r3 and r4 respectively. Let us further presume that initially m14 is 0.4, m13 is 0.3 and m12 is 0.2. Node s4 does not belong to the community providing the requested service. The request agent cannot choose s4 and consequently $\mathrm{satLocal}_{14}^{(s)}$ is equal to 0. On the other hand, both nodes s2 and s3 can satisfy the request. Consequently, $\mathrm{satLocal}_{12}^{(s)}$ and $\mathrm{satLocal}_{13}^{(s)}$ are equal to 1. Between these last two, the request agent selects the affinity connection with the highest value, i.e. connection m13. The values of these connections are then adjusted based on equation [6.1]: m14 decreases and becomes -0.1987, m12 increases and becomes 0.6502 and m13 increases and becomes 0.7256. The request agent, having chosen node s3 as the next step in its route for request resolution, moves to this node and continues the resolution process in the same way. The request agent then uses the connection between s3 and s6 for the selection of the last resource r2 and locally reinforces the value of affinity m36, which becomes equal to 1.0318.
[Figure 6.2: affinity graph. Nodes and resources: s1 (r1), s2 (r2), s3 (r3), s4 (r4), s5 (r5), s6 (r2). Initial affinity values: m12 = 0.2, m13 = 0.3, m14 = 0.4, m26 = 0.3, m36 = 0.7, m45 = 0.5, m56 = 0.6]
Figure 6.2. In this example, the request agent searches for a service made up of three resources r1, r2 and r3. Node s1 has affinity connections with its neighbors s2, s3 and s4. Only s2 and s3 belong to the requested service community s = (r1, r2, r3). When the request agent chooses connections m13 and m36, affinity values become m12 = 0.6502, m13 = 0.7256, m14 = -0.1987 and m36 = 1.0318
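The figures quoted in this example can be checked with a few lines of Python. The sketch below applies equation [6.1] under two assumptions that are ours: the logistic function is taken as f(m) = 1/(1 + e^(-m)) and µ is set to 1, since these are the choices under which the values given in the text are recovered. The names logistic and reinforce are hypothetical.

```python
import math


def logistic(m):
    """Logistic function f(m) = 1 / (1 + e^(-m)) (assumed form)."""
    return 1.0 / (1.0 + math.exp(-m))


def reinforce(m, satisfaction, mu=1.0):
    """One application of equation [6.1]: m + mu * (sat - f(m)).

    mu = 1 is an assumption chosen to match the worked example.
    """
    return m + mu * (satisfaction - logistic(m))


# Affinity values at node s1 before the request agent's visit (Figure 6.2).
affinity = {"m12": 0.2, "m13": 0.3, "m14": 0.4}
# In the example, s2 and s3 can satisfy the request (satLocal = 1),
# whereas s4 cannot (satLocal = 0).
sat_local = {"m12": 1, "m13": 1, "m14": 0}

updated = {name: round(reinforce(value, sat_local[name]), 4)
           for name, value in affinity.items()}
print(updated)  # {'m12': 0.6502, 'm13': 0.7256, 'm14': -0.1987}

# Second hop of the route: connection between s3 and s6, initially 0.7.
print(round(reinforce(0.7, 1), 4))  # 1.0318
```

Note that a smaller µ would produce the same direction of change but smaller steps.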
We should note that, despite the fact that it has resource r1, if server s1 was not already part of a community providing service s = (r1, r2, r3), the proactive development process is launched as described in section 6.5.1.
6.5.2.2. Global reinforcement mechanism
When the agent ends its route within the community and has built the list of nodes providing the requested service, it then proceeds to the global reinforcement of the selected route ϕ (i.e., of the affinity connections between the selected nodes). More precisely, it follows route ϕ in the opposite direction until it reaches the entry point, once more reinforcing the chosen connections. The variation of the affinity connection values is calculated as follows:
$$\Delta m_{ij}^{(s)}(k) = \mu \left( \mathrm{satGlobal}_{\varphi}^{(s)} - f\left(m_{ij}^{(s)}(k)\right) \right) \qquad [6.2]$$
$\mathrm{satGlobal}_{\varphi}^{(s)}$ being the global satisfaction in relation to the resolution of request s. This value is 1 for positive satisfaction, in other words when the request agent has found all requested resources, and 0 for negative satisfaction. The latter is the case when the time to live (TTL) granted to the request agent expires before it has been able to find all requested resources.
To illustrate this global reinforcement mechanism, we take the previous example in which a request agent searches for a service s made up of three resources r1, r2 and r3. Once resource r1 is found in node s1, resource r3 in node s3, and the remaining resource r2 in node s6, the request agent follows the route made up of the selected links in the opposite direction, once more reinforcing their values based on equation [6.2]. Values m36 and m13 are incremented and become equal to 1.2945 and 1.0518 respectively (see Figure 6.3).
[Figure 6.3: affinity graph after global reinforcement. Nodes and resources: s1 (r1), s2 (r2), s3 (r3), s4 (r4), s5 (r5), s6 (r2). Affinity values: m12 = 0.6502, m13 = 1.0518, m14 = -0.1987, m26 = 0.3, m36 = 1.2945, m45 = 0.5, m56 = 0.6]
Figure 6.3. In this example, when the request agent finds all requested resources r1, r2 and r3, it uses the opposite route and reinforces the values of affinity connections a second time. In this way, the values of selected links become m13 = 1.0518 and m36 = 1.2945
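The global step can be checked in the same way. The following Python sketch walks the selected route backwards and applies equation [6.2] to each connection, again assuming f(m) = 1/(1 + e^(-m)) and µ = 1, which are our assumptions and not values stated in the text. The function name global_reinforcement and the dictionary layout are hypothetical.

```python
import math


def logistic(m):
    """Logistic function f(m) = 1 / (1 + e^(-m)) (assumed form)."""
    return 1.0 / (1.0 + math.exp(-m))


def global_reinforcement(route_affinities, sat_global, mu=1.0):
    """Apply equation [6.2] to every connection of the selected route.

    route_affinities: dict of connection name -> current affinity value,
    listed in the order in which the agent travels back to the entry point.
    sat_global: 1 if all requested resources were found, 0 otherwise.
    """
    return {name: m + mu * (sat_global - logistic(m))
            for name, m in route_affinities.items()}


# Values after local reinforcement, in reverse order of the route s1, s3, s6.
route = {"m36": 1.0318, "m13": 0.7256}

success = global_reinforcement(route, sat_global=1)
print({k: round(v, 4) for k, v in success.items()})
# {'m36': 1.2945, 'm13': 1.0518}

# If the TTL had expired before all resources were found (satGlobal = 0),
# the same connections would be weakened instead.
failure = global_reinforcement(route, sat_global=0)
print(all(failure[k] < route[k] for k in route))  # True
```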
6.5.2.3. Types of agents
In the process of creating affinity networks between nodes, we have used three types of agents: aAgent, BAgent and SAgent. For request resolution, the following two types of agents are added:
– user agents, called UAgent, representing users and initiating the request resolution process;
– mobile agents, called AAgent, for the implementation of the resolution by triggering an immune response. These agents correspond to the immune system’s antigens.
6.5.2.3.1. User agent (UAgent)
A UAgent is associated with each user connected to the network. The UAgent has two roles. The first one involves receiving user requests: for each request transmitted by the user, the UAgent creates an AAgent associated with it to initiate the resolution process. The second role involves receiving responses from these AAgents. More precisely, when the UAgent receives a request from the user, it creates an AAgent for its resolution and waits for the result containing the list of nodes providing some or all of the resources requested. Once it receives this message, it forwards the result to the user. Then, the UAgent sends a termination message to the AAgent.
6.5.2.3.2. Request agent (AAgent)
This mobile agent is created by a UAgent to resolve a user’s request. Once created, it initiates a random walk in the network in search of nodes able to provide one or more of the resources making up the service. More precisely, when this agent reaches a node, it transmits an activation message to the SAgent, requesting a message containing the node’s and its neighbors’ list of resources. In the case where the node encountered has no resource to satisfy the request, the AAgent continues its random walk by randomly choosing the next node to visit among the current node’s neighbors.
In the case where the node encountered is the entry point of a community corresponding to the requested service, the AAgent executes a greedy route within this community based on affinity connection values. In other words, the AAgent follows the connection with the highest affinity value to continue its route. In addition, during this route, the AAgent reinforces the value of the selected connections and decreases the value of the non-selected connections. When the AAgent ends its route within the community and has built the list of nodes providing the service, it sends to the UAgent
the response containing the list of resources found. Then, it proceeds to a second reinforcement of the selected route. More precisely, the AAgent follows the route in the opposite direction until it reaches the entry point, once more reinforcing the chosen links. In the case where the node encountered has one or more resources able to satisfy the request but is not the entry point of a community representing the requested service (i.e., none of its BAgents with matching resources has affinity connections), the AAgent stimulates reactive creation by sending a stimulation message to a BAgent randomly chosen among the selected BAgents. In response, the BAgent creates a mobile aAgent as presented in section 6.5.1. The AAgent then sends a response to the UAgent to indicate that the service is not represented.
6.5.2.3.3. Server agent (SAgent)
Recall that an SAgent represents a network node. The roles of this agent in the community development process were described in section 6.5.1. It also has a role in the resolution process, which consists of communicating to the AAgent the list of its resources (or BAgents). More precisely, when it receives an activation message from an AAgent for a request resolution, it responds by sending a message containing the following information: a list of its resources, its identifier and the list of its direct neighbors.
6.5.2.3.4. Resource agent (BAgent)
Recall also that BAgents represent resources offered by network nodes. In the resolution process, BAgents play two roles. The first role involves adjusting the values of affinity connections with other BAgents within a community. The second role, activated by an AAgent, involves reactively creating node communities. In other words, BAgents can initiate mobile antibody aAgents for the creation of a community as presented in section 6.5.1.
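The three cases that an AAgent distinguishes when it reaches a node can be summarized by a small decision function. The Python sketch below is only an illustration: the class SAgentReply, the field in_community (the set of local resources whose BAgent already holds affinity connections for the service) and the function next_action are hypothetical names, and how community membership is actually reported to the AAgent is an assumption on our part.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Action(Enum):
    RANDOM_WALK = auto()         # node offers nothing useful
    GREEDY_ROUTE = auto()        # node is an entry point of a community
    STIMULATE_CREATION = auto()  # useful resource, but no community yet


@dataclass
class SAgentReply:
    """Reply of an SAgent to an activation message (hypothetical layout)."""
    node_id: str
    resources: list[str]
    neighbors: list[str]
    # Assumption: resources whose BAgent has affinity connections
    # for the requested service.
    in_community: set[str] = field(default_factory=set)


def next_action(reply, requested):
    """Choose what the AAgent does next at the node described by reply."""
    offered = set(reply.resources) & set(requested)
    if not offered:
        return Action.RANDOM_WALK
    if offered & reply.in_community:
        return Action.GREEDY_ROUTE
    return Action.STIMULATE_CREATION


service = ("r1", "r2", "r3")
entry = SAgentReply("s1", ["r1"], ["s2", "s3", "s4"], in_community={"r1"})
print(next_action(entry, service))  # Action.GREEDY_ROUTE
```

In the third case, the AAgent would also report to the UAgent that the service is not yet represented, as described above.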
It should be noted that the reinforcement or attenuation of affinity connections between nodes is based on the modification of the values of these connections using equations [6.1] and [6.2]. This reinforcement of affinity connections is at the root of the emergence of the most widely used routes and of their quick selection during future request resolutions. In fact, it is important to note that, because of this learning mechanism, when a service is often requested, the reactive development of new communities and the presence of those already existing are reinforced. Consequently, a request searching for a popular service will encounter the communities representing it much faster.
It is also important to note that this affinity connection reinforcement constitutes the self-organizing memory of this self-adaptive approach. This also corresponds, by analogy, to the adaptive memory of the immune system, which enables the development of faster secondary immune responses to antigens previously encountered.
6.6. Simulation results
The simulations presented below were carried out with the Network Simulator (NS2) [NET]. NS2 is a simulator for generating a network and simulating communications between its nodes. For our simulations, we generated a network of 100 nodes. Each node has one resource. There are 10 types of resources randomly distributed among the nodes. The first simulation compares request resolution time using two strategies. In the first strategy, there are no affinity connections between servers, and agents execute random walks in the network for request resolution. In the second strategy, communities are created but the reinforcement mechanism of affinity connection values is not applied. In this case, request agents execute a random route within these communities. Requests are randomly generated from all 10 resources. The simulation result presented in Figure 6.4 shows that the creation of communities, with a request resolution based on a random route within these communities, improves request resolution time compared to a resolution strategy based only on a random walk in the network with no community creation. The objective of the second simulation is to compare request resolution time when the reinforcement mechanism of affinity connection values is applied and when it is not. When this mechanism is not applied, the request agent executes a random route within the community instead of a greedy route with reinforcement. In addition, affinity connections are not modified during either route.
More precisely, affinity connections between servers are not weighted by adjustable values. The simulation result presented in Figure 6.5 shows that the use of the reinforcement mechanism improves request resolution time compared to a resolution strategy based only on a random walk within the created communities.
Figure 6.4. Comparison in terms of resolution time (r), with or without creation of server communities
Figure 6.5. Comparison in terms of resolution time (r), with and without reinforcement of affinity connections
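The simulations above rely on NS2. Purely as an illustration of the setup and of the first strategy (random walk with no affinity connections), the following Python sketch builds a network of 100 nodes with one of 10 resource types per node and counts the hops of a random walk until a wanted resource is found. The topology (a ring plus random extra links), the seed, the TTL and all function names are assumptions, since the chapter does not specify them; the sketch does not reproduce the results of Figures 6.4 and 6.5.

```python
import random


def build_network(n_nodes=100, n_resource_types=10, extra_links=100, seed=0):
    """Return (resources, neighbors, rng) for a hypothetical topology."""
    rng = random.Random(seed)
    # One resource per node, drawn at random among the resource types.
    resources = {n: f"r{rng.randrange(n_resource_types)}"
                 for n in range(n_nodes)}
    neighbors = {n: set() for n in range(n_nodes)}
    # Assumption: a ring guarantees that the network is connected.
    for n in range(n_nodes):
        m = (n + 1) % n_nodes
        neighbors[n].add(m)
        neighbors[m].add(n)
    # Assumption: a few random extra links on top of the ring.
    for _ in range(extra_links):
        a, b = rng.sample(range(n_nodes), 2)
        neighbors[a].add(b)
        neighbors[b].add(a)
    return resources, neighbors, rng


def random_walk_hops(start, wanted, resources, neighbors, rng, ttl=1000):
    """Hops needed by a random walk to reach a node offering `wanted`.

    Returns None if the TTL expires first.
    """
    current, hops = start, 0
    while resources[current] != wanted:
        if hops >= ttl:
            return None
        current = rng.choice(sorted(neighbors[current]))
        hops += 1
    return hops


resources, neighbors, rng = build_network()
print(random_walk_hops(0, "r3", resources, neighbors, rng))
```

Averaging the hop count over many randomly generated requests would give a rough counterpart of the resolution time measured for the random walk strategy.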
6.7. Conclusion
In this chapter, we presented the different service discovery systems proposed in the literature for implementing ubiquitous computing. They are classified into three families. The first family groups systems based on a structural organization such as centralized or decentralized indexing or hashing. The second family groups systems not based on a structural organization. They use push and pull mechanisms or random walks. The third family groups self-organizing and self-adapting systems for the discovery of services available in a dynamically and randomly evolving network. More precisely, the approach presented in this chapter is a self-adaptive methodology for a network in which the addition or removal of resources and servers, and the variation of the type or frequency of requests, occur randomly and unpredictably. Self-organization is implemented by the autonomous creation of server communities to represent services available in the network. These communities can be created in a proactive or reactive manner. Servers belonging to one community are linked by affinity connections. Self-adaptation is implemented on the one hand by the maintenance of these links in relation to the evolution and dynamic modifications of the number and availability of resources and services, and on the other hand by the adjustment of these links’ values in relation to user requests. This adjustment by reinforcement of affinity connections constitutes the self-organizing memory of this self-adaptive approach, made up of the experience acquired during past request resolutions, enabling the middleware to quickly solve future user requests.
6.8. Bibliography
[AMI 02] AMIN K.A., MIKLER A.R., “Dynamic Agent Population in Agent-based Distance Vector Routing”, Second International Workshop on Intelligent Systems Design and Applications, Atlanta, GA, USA, August 2002.
[AND 01] ANDERSON K., Analysis of the Traffic on the Gnutella Network, CSE222 Final Project, http://www.cs.ucsd.edu/classes/wi01/cse222/projects/reports/p2p-2.pdf, 2001.
[AZO 02] AZONDEKON V., Service Selection and Association Facilitation in a Network, M.Sc.A. Thesis, Electrical and Computer Engineering Department, University of Sherbrooke, Sherbrooke, Quebec, January 2002.
[BAA 03] BAALA H., FLAUZAC O., GABER J., BUI M., EL-GHAZAWI T., “A Self-Stabilizing Distributed Algorithm for Spanning Tree Construction in Wireless Ad Hoc Networks”, Journal of Parallel and Distributed Computing, vol. 63, no. 1, p. 97-104, January 2003.
[BAK 03] BAKHOUYA M., GABER J., Self-Adaptive and Self-Organising System Inspired by the Immune System for Ubiquitous Computing, Internal research report, University of Technologies of Belfort-Montbéliard (UTBM), France, October 2003.
[BAK 05] BAKHOUYA M., Approche auto-adaptative à base d’agents mobiles et inspirée du système immunitaire de l’Homme pour la découverte de services dans les réseaux à grande échelle, PhD thesis, University of Technologies of Belfort-Montbéliard, France, 2005.
[BAK 06] BAKHOUYA M., GABER J., “Adaptive Approach for the Regulation of a Mobile Agent Population in a Distributed Network”, 5th International Symposium on Parallel and Distributed Computing (ISPDC’06), IEEE Press, Timisoara, Romania, p. 360-366, July 2006.
[BAK 07] BAKHOUYA M., GABER J., “Ubiquitous and Pervasive Application Design”, in D. TANIAR (ed.), Encyclopedia of Mobile Computing & Commerce, Idea Group Pub, February 2007.
[BAL 00] BALLET P., Intérêts mutuels des systèmes multi-agents et de l’immunologie, applications à l’immunologie, l’hématologie et au traitement d’images, PhD thesis, University of Brest, France, January 28 2000.
[BET 00] BETTSTETTER C., RENNER C., “A Comparison of Service Discovery Protocols and Implementation of the Service Location Protocol”, Proceedings EUNICE 2000, 6th EUNICE Open European Summer School, September 2000.
[BRO 89] BRODER A.Z., KARLIN A.R., RAGHAVAN P., UPFAL E., “Trading Space for Time in Undirected s-t Connectivity”, ACM Symposium on Theory of Computing, p. 543-549, Seattle, WA, 1989.
[CHA 05] CHANG-SEOK OH Y.-B.K., ROH Y.-S., “An Integrated Approach for Efficient Routing and Service Discovery in Mobile Ad Hoc Networks”, IEEE Consumer Communications and Networking Conference (CCNC 2005), Las Vegas, USA, 2005.
[CHO 05] CHO C., LEE D., Survey of Service Discovery Architectures for Mobile Ad Hoc Networks, http://www.cise.ufl.edu/class/cen5531fa05/files/ccho-dlee.pdf, 2005.
[COH 02] COHEN E., SHENKER S., “Replication Strategies in Unstructured Peer-to-Peer Networks”, ACM SIGCOMM’02 Conference, vol. 32, no. 4, p. 177-190, 2002.
[CZE 99] CZERWINSKI S., ZHAO B., HODES T., JOSEPH A., KATZ R., “An Architecture for a Secure Service Discovery Service”, Proceedings of ACM MobiCom’99, Seattle, USA, 1999.
[DEV 03] DEVERGE J.-F., Systèmes distribués de partage de données, thesis, IFSIC, University of Rennes, France, 2003.
[FEL 03] FELTIN G., DOYEN G., FESTOR O., “Les protocoles Peer-to-Peer, leur utilisation et leur détection”, Cinquième Journées Réseaux JRES’2003, Lille, http://2003.jres.org/actes/paper.70.pdf, 2003.
[FOU 04] FOUIAL O., Découverte et fourniture de services adaptatifs dans les environnements mobiles, PhD thesis, Ecole Nationale Supérieure des Télécommunications, 2004.
[FRA 02] FRANÇOIS P., “Vers de nouveaux modèles peer-to-peer”, Mini-Workshop Systèmes Coopératifs, http://www.info.fundp.ac.be/ven/CISma/FILES/2003-pierreFRANCOIS.pdf, 2002.
[GAB 00] GABER J., New Paradigms for Ubiquitous and Pervasive Computing, white paper, University of Technologies of Belfort-Montbéliard (UTBM), France, September 2000.
[GAB 06] GABER J., New Paradigms for Ubiquitous and Pervasive Applications, Proceeding of 1st Workshop on Software Engineering Challenges for Ubiquitous Computing, Lancaster, UK, 2006.
[GAU 02] GAURON P., Topologies dynamiques pour les systèmes pair-à-pair, Training report from DEA Informatique distribuée, Paris-Sud-Orsay University, France, 2002.
[GKA 04] GKANTSIDIS C., MIHAIL M., SABERI A., “Random Walks in Peer-to-Peer Networks”, INFOCOM, http://www.ieee-infocom.org/2004/Papers, 2004.
[GUR 03] GURGEN L., Découverte de données dans les réseaux mobiles, DEA training, ENSIMAG, Institut National Polytechnique de Grenoble, Grenoble, France, June 13 2003.
[GUT 99] GUTTMAN E., PERKINS C., VEIZADES J., DAY M., “Service Location Protocol, Version 2”, RFC, no. 2608, June 1999.
[HAU 05] HAUSPIE M., Contributions à l’étude des gestionnaires de services distribués dans les réseaux ad hoc, PhD thesis, University of Lille, Lille, France, 2005.
[HEL 03] HELAL S., DESAI N., VERMA V., LEE C., “Konark – A Service Discovery and Delivery Protocol for Ad-hoc Networks”, Proceedings of the 3rd IEEE Conference on Wireless Communication Networks (WCNC), New Orleans, USA, 2003.
[JAN 02] JAN M., Systèmes pair-à-pair de gestion de données à grande échelle: localisation et routage, DEA d’Informatique de l’IFSIC training report, University of Rennes I, Rennes, France, 2002.
[KLE 03] KLEIN M., KONIG-RIES B., OBREITER P., “Service Rings – A Semantic Overlay for Service Discovery in Ad Hoc Networks”, 14th International Workshop on Database and Expert Systems Applications (DEXA’03), p. 180, Prague, Czech Republic, 2003.
[KOZ 04] KOZAT U.C., TASSIULAS L., “Service Discovery in Mobile Ad Hoc Networks: An Overall Perspective on Architectural Choices and Network Layer Support Issues”, Ad Hoc Networks, no. 2, p. 23-44, Elsevier, Paris, 2004.
[LI 02] LI C.W.B., Peer-to-Peer Overlay Networks: A Survey, Department of Computer Science, http://comp.uark.edu/cgwang/Papers/TR-P2P.pdf, 2002.
[LIB 02] LIBEN-NOWELL D., BALAKRISHNAN H., KARGER D., “Analysis of the Evolution of Peer-to-Peer Systems”, Proceedings of the 21st Annual Symposium on Principles of Distributed Computing, p. 233-242, ACM Press, Monterey, USA, 2002.
[LIU 03] LIU J.C., SOHRABY K., ZHANG Q., LI B., ZHU W., “Resource Discovery in Mobile Ad Hoc Networks”, The Handbook of Ad Hoc Wireless Networks, p. 431-441, 2003.
[LUA 05] LUA E.K., CROWCROFT J., PIAS M., SHARMA R., LIM S., “A Survey and Comparison of Peer-to-Peer Overlay Network Schemes”, IEEE Communications Survey and Tutorial, vol. 7, no. 2, 2005.
[LUO 03] LUO H., “Performance Evaluation of Service Discovery Strategies in Ad Hoc Networks”, Master of Science, School of Computer Science, Carleton University, Ottawa, Canada, 2003.
[LUO 04] LUO H., BARBEAU M., “Performance Evaluation of Service Discovery Strategies in Ad Hoc Networks”, 2nd Annual Conference on Communication Networks and Services Research (CNSR’04), p. 61-68, Fredericton, 2004.
[LV 02] LV Q., CAO P., COHEN E., LI K., SHENKER S., “Search and Replication in Unstructured Peer-to-Peer Networks”, ACM Sigmetrics 2002, http://citeseer.ist.psu.edu/lv02search.html.
[MAL 02] MALKHI D., NAOR M., RATAJCZAK D., “Viceroy: A Scalable and Dynamic Emulation of the Butterfly”, Proceedings of the 21st ACM Symposium on Principles of Distributed Computing, Monterey, USA, 2002.
[MAT 01] MATA J.G., “Comparison of Bandwidth Usage: Service Location Protocol and Jini”, Master of Computer Science, School of Computer Science, Carleton University, Ottawa, Canada, February 2001.
[MIL 02] MILOJICIC D.S., KALOGERAKI V., LUKOSE R., NAGARAJA K., PRUYNE J., RICHARD B., ROLLINS S., XU Z., Peer-to-Peer Computing, Research report no. HPL-2002-57, HP Labs, March 2002.
[MOH 04] MOHAN U., ALMEROTH K.C., BELDING-ROYER E.M., “Scalable Service Discovery in Mobile Ad Hoc Networks”, NETWORKING 2004, p. 137-149, 2004.
[NID 01] NIDD M., “Service Discovery in DEAPspace”, IEEE Personal Communications, p. 39-45, 2001.
[NET] Network simulator NS-2, available on Information Sciences Institute’s website: http://www.isi.edu/nsnam/ns.
[PAR 05] PAROUX G., “Une plate-forme pour les échanges P2P sur les réseaux mobiles ad hoc”, Manifestation des Jeunes Chercheurs STIC, Rennes, France, 2005.
[PER 98] PERKINS C., “Service Location Protocol”, ACTS Mobile Networking Summit/MMITS Software Radio Workshop, Rhodes, Greece, June 1998.
[PLA 99] PLAXTON C.G., RAJARAMAN R., RICHA A.W., “Accessing Nearby Copies of Replicated Objects in a Distributed Environment”, Theory of Computing Systems, no. 32, p. 241-280, 1999.
[RAT 01] RATNASAMY S., FRANCIS P., HANDLEY M., KARP R., SHENKER S., “A Scalable Content-Addressable Network”, Proc. ACM SIGCOMM’01, San Diego, USA, 2001.
[RIS 04] RISSON J., MOORS T., Survey of Research towards Robust Peer-to-Peer Networks: Search Methods, UNSW-EE-P2P-1-1 Technical Report, University of New South Wales, Sydney, Australia, September 2004.
[ROB 00] ROBERT M., Discovery and Its Discontents: Discovery Protocols for Ubiquitous Computing, Urbana UIUCDCS-R-99-2132, Department of Computer Science, University of Illinois Urbana-Champaign, citeseer.nj.nec.com/mcgrath00discovery.html, March 25 2000.
[ROB 04] ROBINSON R., INDULSKA J., “A Complex Systems Approach to Service Discovery”, DEXA Workshops 2004, p. 657-661, 2004.
[ROW 01] ROWSTRON A., DRUSCHEL P., Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems, Lecture Notes in Computer Science, no. 2218, 2001.
[SAI 05] SAILHAN F., ISSARNY V., “Scalable Service Discovery for MANET”, Proceedings of the 3rd IEEE Int’l Conf. on Pervasive Computing and Communications, Kauai Island, HI, 2005.
[SCH 03] SCHÜRMANS S., Distributed Hashtables, reference no. 227747, 2003.
[SHU 02] SHUKLA A., “Issues in Auto Service Discovery”, A Study for the Course in Advanced Computer Networks (CS625), Doctor Dheeraj Sanghi, Instructor, 2002.
[SOM 97] SOMAYAJI A., HOFMEYR S., FORREST S., “Principles of a Computer Immune System”, Proceedings of the Second New Security Paradigms Workshop, p. 75-82, Little Compton, USA, 1997.
[STE 94] STEWART J., “Un système cognitif sans neurones: les capacités d’adaptation, d’apprentissage et de mémoire du système immunitaire”, Intellectica, vol. 1, no. 18, p. 15-43, 1994.
[STO 01] STOICA I., MORRIS R., LIBEN-NOWELL D., KARGER D.R., KAASHOEK M.F., DABEK F., BALAKRISHNAN H., Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications, http://www.pdos.csail.mit.edu/papers/ton:chord/paper-ton.pdf, 2001.
[WAT 99] WATANABE Y., ISHIGURO A., UCHIKAWA Y., “Decentralized Behavior Arbitration Mechanism for Autonomous Mobile Robot Using Immune System”, in D. Dasgupta (ed.), Artificial Immune Systems and Their Applications, p. 186-208, Springer-Verlag, New York, 1999.
[WEI 93] WEISER M., Ubiquitous Computing, www.ubiquitous.com/hypertext/weiser, March 1993.
[XU 01] XU D., NAHRSTEDT K., WICHADAKUL D., “QoS-Aware Discovery of Wide-Area Distributed Services”, First IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid), Brisbane, Australia, May 2001.
[ZHA 01] ZHAO B.Y., KUBIATOWICZ J.D., JOSEPH A.D., Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and Routing, UC Berkeley, UCB/CSD-01-1141, 2001.
[ZHA 02] ZHAO W., SCHULZRINNE H., GUTTMAN E., “mSLP-Mesh-enhanced Service Location Protocol”, ICCCN 2000, Internet draft draft-zhao-slp-da-interaction-07.txt, 2002.
Chapter 7
Service Discovery Protocols for MANETs
This chapter discusses the issues of service discovery in ad hoc networks. In section 7.1, we begin by introducing concerns relative to service discovery in general and to ad hoc networks in particular. In section 7.2, we present the most widely known service discovery protocols. We begin with those introduced for wired networks (JINI, UPnP and SLP) (section 7.2.1). Then, in section 7.2.2, we describe some of the protocols introduced for ad hoc networks (Post-Query, Konark, GSD and Allia). In section 7.2.3, we finish with discovery protocols paired with routing protocols. We describe two such protocols. The first one was introduced by Koodli and Perkins. The second one, called SEDIRAN, is our own contribution. We then end this chapter with a conclusion.
7.1. Introduction
Wireless technologies offer new telecommunications opportunities. The recent growth of wireless communication and the continuous improvement of mobile terminal performance have enabled us to use these technologies in different fields and to consider new applications. There are two categories of wireless mobile networks:
– networks with infrastructure, which use access points to connect to a wide area network (the Internet, for example);
– networks without infrastructure (or ad hoc networks), which do not presuppose the presence of a wired infrastructure. Network nodes are connected by radio links.
Chapter written by Abdellatif OBAID and Azzedine KHIR.
An ad hoc network (also called MANET, for Mobile Ad hoc NETwork) is a self-configuring wireless network made up of freely moving mobile terminals, interconnected by wireless connections. Each terminal participates in the network’s structure and acts as a router (Figure 7.1). Network topology depends on the position of terminals, their coverage zones and their transmission power. Ad hoc networks provide a means of communication which can be quickly and easily deployed. This makes ad hoc networks a good choice for applications in various civil (rescue, emergency, police, transportation, etc.) and military fields. Since the transmission range of a wireless terminal is limited, a terminal must go through intermediate nodes in order to communicate with a terminal outside its range. This requires routing protocols that are able to establish and maintain routes between terminals. Since terminals are mobile, an ad hoc network is subject to topology changes (which are sometimes frequent). Because of this, routing protocols in ad hoc networks must take these topology changes into account.
Figure 7.1. An ad hoc network
The goal of a computer network is to offer a way for users with different terminals to communicate in order to share data and access services provided by different servers. In order for an automatic and dynamic integration of services in these networks to happen, several service discovery protocols dedicated to wired networks have been proposed. Generally, these protocols are based on a centralized architecture where each service is registered to a directory server, making it possible to group information on network services in a central location and to organize it according to type. This approach makes it easier for users to choose and access the different services.
Service Discovery Protocols for MANETs
Service discovery protocols are vital in an ad hoc environment. These networks are self-configuring and made up of autonomous mobile terminals that are free to move and to join or leave the network at any time. Service discovery in such an environment requires a decentralized approach in which a terminal does not depend on a centralized directory to announce its services. The main elements of a service discovery protocol are the description of services, the notification of services in the network, and the search for and use of services.

Service discovery in ad hoc networks is an active research topic, and there are already a few interesting studies on the subject. Two basic mechanisms are used to discover services in ad hoc networks:
– the first is based on request/response message exchanges: a user interested in discovering a service broadcasts a discovery request containing information on the requested service, and the terminal offering the service sends a response;
– in the second, the service provider periodically sends announcement messages (called beacons) to announce the availability of its services.

A cache memory is used to temporarily store information on available services. This information is collected from beacons and response messages. A terminal searching for a service first looks in its cache before broadcasting a discovery message. Some protocols also use this cache to respond to discovery requests.

The centralized approach to service discovery is not suited to ad hoc environments. In a decentralized approach, since there is no central service repository, servers announce their existence by broadcasting announcement messages in the network, and users wanting to discover a service transmit discovery requests. Request and response messages are forwarded by the routing protocol. With reactive routing, sending a response message can trigger a route search if the terminal does not have a route to the service requester. A service discovery protocol for ad hoc networks must therefore minimize the number of messages transmitted, so as to avoid triggering the routing protocol as much as possible.

An ad hoc network can be made up of terminals with different capabilities (for example, a laptop computer, a PDA or a cell phone). A service invocation system must take this variety into account and enable interoperability between different platforms. In addition, the limited bandwidth of certain wireless networks (such as GSM) requires the traffic exchanged between requester and provider to be optimized. Similarly, the limited resources (memory, processor, battery) of some types of mobile terminals require a service invocation system to minimize its use of these resources.
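The cache-first lookup described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the mechanism of any particular protocol: the class ServiceCache, the function discover, the 30-second validity period and the broadcast_request callback (which stands in for the network) are all hypothetical names chosen for the example.

```python
import time


class ServiceCache:
    """Per-terminal cache of known services, each with an expiry time."""

    def __init__(self):
        self._entries = {}  # service type -> (provider address, expiry time)

    def store(self, service_type, provider, ttl, now=None):
        now = time.time() if now is None else now
        self._entries[service_type] = (provider, now + ttl)

    def lookup(self, service_type, now=None):
        now = time.time() if now is None else now
        entry = self._entries.get(service_type)
        if entry is None:
            return None
        provider, expiry = entry
        if expiry <= now:
            # Stale entry: drop it so that the caller falls back to the network.
            del self._entries[service_type]
            return None
        return provider


def discover(cache, service_type, broadcast_request, now=None):
    """Look in the cache first; broadcast a request only on a cache miss."""
    provider = cache.lookup(service_type, now)
    if provider is not None:
        return provider, "cache"
    # Hypothetical network call: returns a provider address or None.
    provider = broadcast_request(service_type)
    if provider is not None:
        cache.store(service_type, provider, ttl=30, now=now)  # assumed validity
        return provider, "network"
    return None, "not found"
```

A second request for the same service type within the validity period is answered from the cache, so no discovery message (and no route search) is generated.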
7.2. Service discovery protocols

Service discovery is based on the notions of description, publication, discovery and invocation:
– description: a service discovery mechanism must have a clear syntax and well-defined semantics to facilitate service search and matching at a semantic level. Some service discovery systems use simple data structures to describe service profiles, while others use semantic description languages such as RDF (resource description framework) [RDF];
– publication: the publication of services announces service descriptions to a lookup server or directly to other network nodes. The usefulness of a publication depends on the information it provides about a service, which helps a user determine whether he wants to use this service. Publication makes service profiles accessible to users so that they are aware of the existence of services and can use them;
– discovery: this is the process by which a user finds a provider for a service. It allows a discovery request to be formulated and offers a matching function to determine which services correspond to the request. It also provides a communication mechanism between users and service providers;
– invocation: service invocation is responsible for facilitating the use of a service. It is also responsible for maintaining the connection between user and provider during their interaction. A good invocation mechanism hides the details of the communication and, if the network connection breaks, redirects requests to another provider.

7.2.1. Service discovery protocols in wired networks

In this section, we present a few service discovery protocols for wired networks because they have served as the basis for protocols used in ad hoc networks.

7.2.1.1. JINI

JINI is a service discovery system based on the Java language, which offers code mobility and security when executing code downloaded from other machines [JINI].
It uses the characteristics of the Java application environment to simplify the development of distributed systems. The core of the system is based on the discovery, join and lookup protocols. The discovery protocol is launched when a service wants to locate a lookup server. Once the server is located, the service uses the join protocol to join it. The lookup protocol is launched by a client to locate a service and to retrieve a copy of the object enabling its invocation. A service is described by its interface and attributes.

Service providers publish their services in the network’s lookup servers, and clients interested in finding a service consult these servers directly. A lookup server provides a contact point between the system and users. Services are added to the lookup server through the discovery and join protocols. Access to a service is based on a lease period which is negotiated between the user and provider. If the lease is not renewed before it expires (either because the client no longer needs the resource or because the lease is not renewable), the resource becomes free.

A series of groups is associated with each lookup server. Services are associated with groups or belong to the public group. In order for a service to register in a lookup server, it must belong to a group managed by the lookup server, or to the public group.

The discovery protocol enables service providers and clients to discover lookup servers and retrieve proxies from them in order to interact with them. Three mechanisms are used for lookup server discovery:
– a multicast request protocol used to discover lookup servers;
– a multicast announcement protocol enabling lookup servers to announce their existence;
– a unicast discovery protocol used to contact a specific lookup server.

The join protocol enables a service provider to register a service in a lookup server by using the proxy received during discovery. When a service starts, it tries to join the lookup servers in its list of known lookup servers by using a unicast discovery request. If the group list is empty, the service multicasts a discovery request and registers with every lookup server that responds to the request or announces its existence.
If the group list is not empty, the service registers with the lookup servers belonging to one or more of its groups. The lookup server maintains a collection of service items. Each item represents an instance of an available service. When a new service is created, it registers with the lookup server by providing an initial collection of attributes. For example, a printer could include attributes for speed (in pages per minute), resolution, the capability for double-sided printing, etc.
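The idea of service items described by attributes and kept alive by leases can be illustrated with a short sketch. It is written in Python for brevity and does not reproduce the actual JINI Java API; the class LookupServer and its methods are hypothetical, and applying the lease to the registration itself is an assumption made to keep the example small.

```python
class LookupServer:
    """Toy lookup server: service items with attributes and a lease expiry."""

    def __init__(self):
        self._items = {}  # service id -> (attributes, lease expiry time)

    def register(self, service_id, attributes, lease, now):
        """Store a service item; it stays valid until now + lease."""
        self._items[service_id] = (dict(attributes), now + lease)

    def renew(self, service_id, lease, now):
        """Extend a lease; return False if it has already expired."""
        self._purge(now)
        if service_id not in self._items:
            return False
        attributes, _ = self._items[service_id]
        self._items[service_id] = (attributes, now + lease)
        return True

    def lookup(self, template, now):
        """Return the ids of items whose attributes match every template entry."""
        self._purge(now)
        return sorted(
            service_id
            for service_id, (attributes, _) in self._items.items()
            if all(attributes.get(key) == value for key, value in template.items())
        )

    def _purge(self, now):
        # Items whose lease was not renewed in time are freed.
        expired = [sid for sid, (_, expiry) in self._items.items() if expiry <= now]
        for sid in expired:
            del self._items[sid]
```

For example, a printer registered with the attributes {"type": "printer", "duplex": True} is returned by lookup({"type": "printer", "duplex": True}, now) only while its lease is still valid.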
7.2.1.2. UPnP

The PnP (Plug and Play) mechanism is used for the installation, configuration and addition of peripherals to a computer. UPnP (Universal Plug and Play) [UPNP] extends this functionality to network equipment and enables the discovery and control of equipment and services. The basic functional blocks of a UPnP network are devices, services and control points. A control point is a controller able to discover and control devices. Exchanges in UPnP are based on HTTP. Two protocols, HTTPMU and HTTPU (HTTP over multicast and unicast UDP), have been defined to transmit messages.

SSDP (simple service discovery protocol) is used to discover devices and their services in a network. SSDP defines methods that let a control point discover the resources it needs and let devices publish their availability in the network. A control point can send an SSDP search request in order to discover the devices and services offered in the network. It can refine its search by looking for a specific type of device, a specific type of service or a specific device. UPnP devices listen on a multicast port. When a discovery request is received, the device checks whether it is concerned by the request and, if so, sends an SSDP response in unicast to the control point.

Similarly, when a device joins the network, it announces the services it offers by transmitting SSDP presence announcement (publication) messages to the multicast address 239.255.255.250:1900. Control points listen on this port in order to retrieve publications. To announce all its services, the device sends a group of publication messages, each one containing information on a device or service. When a control point joins the network, the discovery protocol allows it to search for the services it needs.
In order to do this, it sends a discovery message containing the type or identifier of the device or service to the multicast address and port. To learn more about a discovered device and its capabilities, or to interact with it, the control point must retrieve the device description from the URL provided during discovery. The description is divided into two sections: device and service descriptions. The device description contains manufacturer information such as model name and number, serial number, etc. The service description is written in XML. Once the control point has retrieved the description of a service, it can call the actions offered by sending a control message to the service control URL. The service returns the result, or an error message in case of failure. The invocation can cause a change of state variables, and in this case the service sends notifications to the control points involved. To call an action, a SOAP (simple object access protocol) message is created and sent.

7.2.1.3. SLP

SLP (service location protocol) [VEI 97] is a standard proposed by the IETF for service discovery in IP networks. It provides access to information on the existence, location and configuration of services. SLP defines three types of agents:
– UA (user agent), which represents the service user;
– SA (service agent), which represents the service provider;
– DA (directory agent), which offers service registry functions.

The UA broadcasts a request (SrvRqst) indicating the characteristics of the service needed by the client. It receives a response (SrvRply) containing the location of the services in the network that fulfill these characteristics. The UA can also send the SrvRqst request directly to DAs. DAs work as cache servers for the services provided in the network. To register services, SAs send DAs SrvReg messages describing the services announced and receive acknowledgements (SrvAck). UAs and SAs discover DAs in two ways: actively, by sending a SrvRqst request in multicast, or passively, by listening on the SLP port for the publications that DAs send. In both cases, agents receive publication messages from the DA (DAAdvert or DA advertisement) (Figure 7.2).
Figure 7.2. DA discovery
Services are organized in groups. A group may correspond to a site, an administrative grouping, proximity in the network topology, etc. SAs and DAs always have groups associated with them. If a group is assigned to a UA, this UA will only be able to discover services belonging to this group. The UA can be configured without a group and in this case, it will discover all available groups and will be able to request any available service on the network.
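The effect of groups on what a UA can discover through a DA can be sketched as follows. This is an illustrative model only: the class DirectoryAgent and its methods srv_reg and srv_rqst are hypothetical names that mirror the SrvReg and SrvRqst messages, no SLP wire format is implemented, and the group names and service URLs are invented for the example.

```python
class DirectoryAgent:
    """Toy DA: registers services from SAs and answers requests from UAs."""

    def __init__(self, groups):
        self.groups = set(groups)
        self._registry = []  # list of (service type, service URL, groups)

    def srv_reg(self, service_type, url, groups):
        """Handle a registration; the boolean stands in for SrvAck."""
        groups = set(groups)
        if not groups & self.groups:
            return False  # the DA does not serve any of the SA's groups
        self._registry.append((service_type, url, groups))
        return True

    def srv_rqst(self, service_type, ua_groups=None):
        """Return matching URLs (a stand-in for SrvRply).

        A UA configured without a group (ua_groups=None) sees every group.
        """
        result = []
        for registered_type, url, groups in self._registry:
            if registered_type != service_type:
                continue
            if ua_groups is None or groups & set(ua_groups):
                result.append(url)
        return result
```

With two printers registered in the groups "lab" and "office", a UA assigned to "lab" obtains only the first printer, whereas a UA without a group obtains both.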
7.2.2. Service discovery in ad hoc networks

Service discovery in ad hoc networks has been the subject of several research projects in the last few years [VER 05]. The nature of the ad hoc environment requires a service announcement and discovery strategy that minimizes the traffic generated by discovery protocols. In addition, several protocols take the power constraints of nodes into account. Because of frequent changes in the topology of these networks, these protocols must also adapt and continue to operate. In the following sections, we present some of these protocols.

7.2.2.1. Post-Query

This protocol [BAR 03, LUO 04] uses several strategies to adapt to the constraints of the network environment. It is made up of two protocols: the Posting protocol, which is responsible for the publication of services, and the Querying protocol, which conducts the discovery. These two protocols are represented by two functions, P(s) and Q(c) respectively. Each server s broadcasts a service to a group of nodes Ns by using algorithm P(s). Each client c requests a service of a given type from a group of nodes by using algorithm Q(c). A Post-Query strategy is a sequence (P1,Q1), (P2,Q2), …, (PR,QR) of Post-Query protocols executed in R rounds. These strategies include:
– greedy strategy: terminals announce their services and broadcast discovery requests to everyone. Message forwarding is based on message broadcasting;
– incremental strategy: each terminal broadcasts and requests services from a subset of terminals in a first round and gradually increases the size of this subset;
– uniform memoryless strategy: Post-Query is executed at each round and identically for all network terminals. The server broadcasts all its services to a random group of terminals. The client sends its request to a random group of servers. The group size affects the response time, the network load and the probability of finding a service;
– strategy with memory: each client uses a cache memory to store the identifiers of nodes previously visited. Only the nodes that have not been previously contacted are involved in new rounds;
– conservative strategy: at each round, servers broadcast services to neighboring nodes by using the one-hop broadcast mechanism (i.e. only direct neighbors receive the messages). Clients request these services by using the same mechanism.
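To make the round structure concrete, the following sketch combines two of the strategies listed above on the client side: the queried subset grows from round to round (incremental) and nodes already contacted are never queried again (memory). It is a simplified illustration, not the algorithm of [BAR 03, LUO 04]; the function query_rounds, its parameters and the doubling of the subset size are assumptions.

```python
import random


def query_rounds(nodes, posted, rng, first_size=1, growth=2, max_rounds=10):
    """Query growing random subsets of nodes that were not contacted before.

    nodes:  list of node identifiers the client may contact
    posted: set of nodes that hold a posting for the requested service
    Returns (round number, nodes contacted) on success,
    or (None, nodes contacted) if the service was not found.
    """
    contacted = set()  # memory of the nodes already visited
    size = first_size
    for round_no in range(1, max_rounds + 1):
        candidates = [n for n in nodes if n not in contacted]
        if not candidates:
            break  # every node has been asked
        chosen = rng.sample(candidates, min(size, len(candidates)))
        contacted.update(chosen)
        if posted & set(chosen):
            return round_no, len(contacted)
        size *= growth  # incremental: enlarge the subset for the next round
    return None, len(contacted)
```

The number of nodes contacted gives a rough measure of the network load, which illustrates the trade-off between group size, response time and success probability mentioned above.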
7.2.2.2. KONARK

In [HEL 03], the authors propose a protocol for the discovery and invocation of services specifically dedicated to ad hoc networks. This protocol is particularly intended for mobile commerce applications. It uses a peer-to-peer mechanism for service discovery. Services are described in a simplified WSDL and are invoked by using the SOAP protocol. The protocol handles publication, discovery and the forwarding of publication messages, either periodically or based on geographical or temporal events. The client can use a cache memory to save information on announced services. Discovery messages are sent to a fixed multicast group. Each node uses a service registry organized in the form of a tree. The classification of services is generic at the top level (the root of the tree) and becomes increasingly specific deeper in the structure. Discovery and publication can be done at any level of the hierarchy. When it receives a service publication message, a node updates its registry at the tree leaf level.
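A tree-shaped registry of this kind can be sketched as follows. The sketch is only an illustration of the idea of discovery at any level of the hierarchy; the class ServiceTree, the slash-separated category paths and the service names are hypothetical and do not come from [HEL 03].

```python
class ServiceTree:
    """Registry node: generic categories near the root, specific ones deeper."""

    def __init__(self):
        self.children = {}  # category name -> ServiceTree
        self.services = []  # services published exactly at this level

    def publish(self, path, service):
        """Store a service under a category path such as 'a/b/c'."""
        node = self
        for part in path.split("/"):
            node = node.children.setdefault(part, ServiceTree())
        node.services.append(service)

    def discover(self, path):
        """Return every service registered at or below the given level."""
        node = self
        for part in path.split("/"):
            node = node.children.get(part)
            if node is None:
                return []
        return node._collect()

    def _collect(self):
        found = list(self.services)
        for name in sorted(self.children):
            found.extend(self.children[name]._collect())
        return found
```

A request for a generic category such as "entertainment" then returns the services of all its sub-categories, while a request for "entertainment/games/chess" returns only the services registered at that leaf.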
KONARK uses multicast for service requests and unicast for responses, making it more efficient than if it were using multicast in both cases. However, its use of WSDL descriptions allows a certain degree of semantic matching, which also requires additional processing.

7.2.2.3. GSD

GSD (group-based service discovery) [CHA 02, CHA 04] is a protocol based on the concepts of peer-to-peer caching and service groups. Service discovery is based on the forwarding of publication messages and discovery requests. The services are described by using OWL (web ontology language) [OWL] and are categorized into several groups based on OWL’s class/sub-class hierarchy. Each service provider periodically announces the list of its services through a publication message such as:

<Packet-type, Source-Address, Service-Description, Service-Groups, Other-Groups, Hop-Count, Lifetime, ADV-DIAMETER>

The Service-Description and Service-Groups fields contain information on the local services of the transmitting node and their groups. The Other-Groups field contains the list of non-local service groups. This list is built from the data in the transmitting node’s cache. The Other-Groups field makes it possible for group information to propagate from one node to another. When a node receives a publication message, it logs the service in its cache and checks whether or not to retransmit the message: the ADV-DIAMETER field gives the maximum number of hops the message may travel and Hop-Count gives the number of hops already made.
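The handling of a GSD publication message by a receiving node can be sketched as follows. The field names follow the message format given above (Packet-type is omitted); the classes Advertisement and Node, the default values and the exact forwarding test are assumptions made for the illustration and are not taken from [CHA 02, CHA 04].

```python
from dataclasses import dataclass, field, replace


@dataclass
class Advertisement:
    source: str
    service_description: str
    service_groups: tuple
    other_groups: tuple = ()
    hop_count: int = 0      # hops already made when the message was sent
    lifetime: float = 60.0  # assumed validity of the cached entry, in seconds
    adv_diameter: int = 2   # maximum number of hops the message may travel


@dataclass
class Node:
    address: str
    cache: dict = field(default_factory=dict)

    def on_advertisement(self, adv, now):
        """Cache the advertisement; return a copy to forward, or None."""
        self.cache[adv.source] = (adv, now + adv.lifetime)
        hops_made = adv.hop_count + 1  # the hop that brought the message here
        if hops_made >= adv.adv_diameter:
            return None  # diameter reached: do not retransmit
        return replace(adv, hop_count=hops_made)
```

With ADV-DIAMETER set to 2, a direct neighbor of the provider caches the advertisement and forwards it once; a node two hops away caches it but does not forward it any further.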
A service discovery request is made up of a description of the requested service based on the ontology and optionally includes the groups to which the service belongs. The request is first matched against the services logged in the requester’s cache; if no entry corresponds to the request, the requester broadcasts the request in the network. A service request is structured as follows:

<Packet-type, BroadcastId, Service-Description, Request-Groups, Source-Address, Last-Address, Hop-Count>

The Other-Groups information held in the cache enables nodes processing the request to retransmit it to specific nodes instead of broadcasting it. Response messages are forwarded by a reverse routing mechanism based on the routing table: when it receives a request, a node updates its routing table for a given time interval based on the request’s BroadcastId, Source-Address and Last-Address fields. The response to a service discovery request (service reply) is generated by a node if it finds entries in its cache matching the discovery request. Compared to other protocols, GSD generates fewer request and response messages because service requests are not broadcast but forwarded to nodes that have cached information on similar services. However, since it uses an ontology language, it requires the additional processing associated with such languages.

7.2.2.4. Allia

Allia [RAT 02] is an agent-based protocol. Each node uses a peer-to-peer cache to store information on services. Each node periodically broadcasts information about its services. Nodes with similar services will form an alliance. When a node receives a request for a service that it cannot fulfill itself, it looks in its cache: if information on a matching service exists, it sends it to the client; otherwise, it broadcasts the request to the members of its alliance. Allia takes into consideration the capacity limits of the devices used and the preferences of users and applications in the system’s intended field, in this case mobile commerce.
One of this protocol’s drawbacks is that the agent execution environment is resource-intensive.

7.2.3. Service discovery with routing

The service discovery or invocation process is generally executed in the application layer. Yet, in ad hoc networks, it is the routing protocols that adapt to the mobile environment by taking topology changes into account. Placing discovery at the application layer will trigger the routing process (in a reactive environment), which could delay the discovery or invocation of services and consume the network’s bandwidth. With this observation in mind, several proposals combining discovery with routing protocols such as DSR and AODV [JOH 02, PER 03] have emerged.

7.2.3.1. Koodli and Perkins protocol

This service discovery protocol based on routing messages was proposed as an extension to reactive routing protocols [PER 02]. These protocols provide an RREQ message to discover a route and a corresponding RREP response message. The service discovery process uses these two messages with an extension to their formats. An extended RREQ message called SREQ (Service REQuest) was proposed, with the following as one of its formats:
ServiceType is a field of size ServiceTypeLength representing the type of service requested, and ServiceRequestPredicate indicates the search condition. When a node receives an SREQ message, it checks in its cache to see whether it has a service corresponding to the request. If that is the case, it checks for a valid route to the service provider. If both conditions are met, the node generates an RREP message with an extension containing the discovery result. This type of message is called SREP (Service REPly). If the service cache of a node receiving an SREQ contains a match for the discovery request but the node has no valid route to the provider, the node copies the service provider’s address into the destination address of the RREQ message and rebroadcasts the SREQ message. A node that receives an SREQ message with a non-zero destination address and has a route to that address responds with an SREP message. Other capabilities were added, such as an SANM (Service ANnouncement Message) type of message for broadcasting publication messages in the network.

7.2.3.2. SEDIRAN

We have introduced improvements to the proposal of Koodli and Perkins [OBA 05]. We propose a protocol which we have called SEDIRAN (service discovery and interaction with routing protocols in ad hoc network), located above the AODV reactive routing protocol.
Our approach consists of adapting routing to the requirements of service discovery. To accomplish this adaptation, we have made the following extensions:
– we have extended the protocols with additional functions specific to service discovery support;
– we have added cache memory management in order to store already discovered services for future use. This improves the performance of the discovery process;
– we have introduced an incremental discovery process;
– we have categorized service types in order to separate the need to launch the discovery process from the need to announce services;
– we have chosen an architecture that can use Web services, so that our protocol can be applied to services offered in mobile networks, particularly for mobile commerce, while still taking into account the diversity of the mobile devices used.

We base our service discovery and publication process on the notions of special services and ordinary services. Ordinary services are services known by network nodes. Each ordinary service belongs to a type of service identified by a UUID (universally unique identifier, a standard 128-bit identifier). The search for this type of service is based on this identifier. An example of such a service could be a game service (a chess game, for example). Ordinary service types are stored in a database structured as a service type tree. Potential users can browse this structure (through a Web browser-type graphical interface, for example). For now, the matching process is based on the service type’s UUID.
Special services are services that are not known to users in advance but are represented by a description that is understandable to them. These services are not stored in any database. Announcement is performed for special services: since these services are not known in advance, potential users must learn of their existence. Discovery, on the other hand, is performed on ordinary services. Each node can offer several ordinary services; since nodes never announce these services, interested users must launch a discovery request. A cache manager for ordinary and special services is used to store information on the services offered in the network.

The service discovery protocol provides users with the location of available services. To facilitate contact between service providers and users, we have proposed an interaction between SEDIRAN and the routing protocol during discovery. This solution has the advantage of keeping calls to route discovery procedures to a minimum in the case of reactive routing such as AODV. SEDIRAN enriches routing tables (and route caches) during service discovery.
A server providing a special service broadcasts a publication message called ADVM (ADVertisement Message) to network nodes (Figure 7.3). This message contains the address of the server offering the service, the service description, etc. Each node that the message passes through stores this information in its cache memory and can use it at a later time. The validity period of the announced service is controlled by a leasing system, which removes the service from the cache when its lease expires.
Figure 7.3. Broadcasting of a publication message
To search for an ordinary service, the client broadcasts his request to network nodes in a message called DREQ (Discovery REQuest). This message contains a value representing the maximum number of hops (hop limit) that the message is authorized to make. It is equal to the number of nodes that the message can pass through. Note that this mechanism is already available in DSR and AODV type routing protocols. Nodes offering the type of service requested respond to the request by sending a DREP (Discovery REPly) message. Figure 7.4 shows an example of a node (node A) requesting the discovery of an ordinary service and receiving responses from the two nodes that offer this type of service.
Figure 7.4. Discovery with hop limit = 1
The discovery procedure works incrementally. A node wishing to execute a discovery broadcasts a DREQ discovery message with a hop limit of 1. It receives responses from the nodes offering this service. The user can then view the discovery result. If he is not satisfied, or if he wants to discover other servers offering the same type of service, another DREQ discovery message with a hop limit of 2 is broadcast. Intermediate nodes update their routing tables with a route to the requesting node and rebroadcast the request message. They also store the information contained in the DREP response messages: they update their routing tables with a route to the service provider and add the services discovered to their ordinary services cache. In this way, they will not have to execute service discovery for these services in their own subsequent discovery operations. The process continues until the user decides to stop.

A client wanting to discover all the services offered in the network proceeds in a similar fashion. He sends a discovery request for all services to the nodes of the first step (i.e. hop limit = 1) and logs all the services found in his cache, as well as the routes to their providers. He then sends a request for the services offered by nodes in the second step. This time, only servers offering services different from those discovered in the previous steps respond, and they provide only these services. The process continues in this way until the last step. This strategy enables the client to store the closest server providing each service, since the closest server is the only one responding. Figure 7.5 illustrates this process. In step b), nodes E, F and H respond by giving services different from those discovered in step a).
Figure 7.5. Discovery of all available services: (a) hop limit = 1; (b) hop limit = 2
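The incremental discovery of all services can be simulated with a short sketch. It models only the outcome of each step (which providers answer and what the client caches), not the DREQ and DREP message exchange itself. The functions hop_distances and discover_all are hypothetical names, and the topology and service assignment are assumptions loosely modeled on Figure 7.5.

```python
from collections import deque


def hop_distances(graph, start):
    """Breadth-first search: number of hops from start to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def discover_all(graph, offered, client, max_hops):
    """Return {service: (closest provider, hops)} after incremental discovery."""
    dist = hop_distances(graph, client)
    cache = {}
    for hop_limit in range(1, max_hops + 1):
        # The request carries the list of services discovered in earlier steps.
        already_known = set(cache)
        for node in sorted(graph):
            if node == client or node not in dist or dist[node] > hop_limit:
                continue
            for service in offered.get(node, ()):
                if service not in already_known:
                    # Only services that are new to the client are reported.
                    cache.setdefault(service, (node, dist[node]))
    return cache


# Assumed example topology: A is the client, B, C and D are its neighbors.
graph = {
    "A": ["B", "C", "D"], "B": ["A", "E", "G"], "C": ["A", "F"],
    "D": ["A", "H"], "E": ["B"], "F": ["C"], "G": ["B"], "H": ["D"],
}
offered = {
    "B": ["S1", "S3"], "C": ["S2", "S5"], "D": ["S4"], "E": ["S6", "S7"],
    "F": ["S2", "S9"], "G": ["S1", "S2"], "H": ["S8"],
}
```

In this example, node G offers only services already found at one hop, so it never appears in the client’s cache, and service S2 stays associated with the closer provider C rather than with F.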
Since SEDIRAN is located above the routing protocol, discovery messages are encapsulated in routing messages. We have chosen the AODV reactive routing protocol for our case study and analyzed its operation to determine an adequate way of interacting with SEDIRAN, in order to fulfill our objective of enriching the routing table during service discovery. The forwarding of response messages does not trigger route discovery. Before a service is logged in the cache, a route to the service provider is logged, in order to avoid launching route discovery during service invocation. There are three types of SEDIRAN packets:
– DREQ (Discovery REQuest): this contains an ordinary service discovery request. It specifies the identifier of the requested service type and the list of services previously discovered;
– DREP (Discovery REPly): this contains the response to a discovery request. It contains the list of identifiers of the services belonging to the type of service requested and the provider’s address;
– ADVM (ADVertisement Message): this contains the announcement message of a special service offered by a provider. It contains the service description, its identifier and the provider’s address.

We use the AODV RREP message to encapsulate all types of SEDIRAN messages (DREQ, DREP and ADVM). In order to do this, we have made changes to the AODV RREP packet. Figure 7.6 illustrates SEDIRAN’s modified RREP packet format.
Bits 1 to 32: Type (8 bits) | RAPSD flags (5 bits) | Prefix (5 bits) | Hop count (8 bits)
Destination IP Address
Destination Sequence Number
Source IP Address
RREP ID
Lifetime
SEDIRAN message
Figure 7.6. SEDIRAN RREP packet
The P (Publish) field indicates that the message is an ADVM announcement message. The S (Search) field indicates that the message is a DREQ discovery request. The D (Discovery) field indicates that it is a DREP response to a discovery request. The size of the message, including the SEDIRAN message, is indicated in the size field, and the hop count field is used to specify the discovery range. The RREP ID field contains a broadcast identifier for the RREP message. The AODV protocol does not normally broadcast RREP messages; this field is used to make broadcasting possible, because it enables nodes to find out whether they have already received the message. If they have, they ignore it; otherwise, they rebroadcast it.

A provider of special services periodically broadcasts ADVM messages to network nodes to inform them of the existence of its services. Each ADVM message is passed to the routing protocol to be encapsulated in an RREP packet and broadcast in the network. When a node receives this message, the AODV protocol uses the RREP ID field to check whether the message has already been received. If not, it updates the route to the provider and rebroadcasts the message. It then passes the ADVM message to the SEDIRAN protocol, which updates the special services cache. In this way, the node has both the service and a route to the provider, and a subsequent invocation of this service will not cause the transmission of a route discovery message. Figure 7.7 illustrates this exchange.
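The modified RREP header and the duplicate check based on the RREP ID can be sketched as follows. The exact bit layout is an assumption: the figure lists only 26 bits in the first 32-bit word, so the sketch leaves 6 bits unused between the flags and the Prefix field. The function names, the flag ordering and the use of the source address together with the RREP ID as the duplicate key are also assumptions. The message type value 2 is the standard AODV RREP type.

```python
import socket
import struct

# Assumed position of each flag inside the 5-bit flag field.
FLAG_BITS = {"R": 4, "A": 3, "P": 2, "S": 1, "D": 0}
# First word, destination IP, destination sequence number, source IP,
# RREP ID, lifetime: 24 bytes in network byte order.
HEADER = struct.Struct("!I4sI4sII")


def pack_rrep(msg_type, flags, prefix, hop_count, dest_ip, dest_seq,
              src_ip, rrep_id, lifetime, payload=b""):
    """Build a modified RREP carrying a SEDIRAN message as payload."""
    flag_value = 0
    for name in flags:
        flag_value |= 1 << FLAG_BITS[name]
    word = ((msg_type & 0xFF) << 24) | (flag_value << 19) \
        | ((prefix & 0x1F) << 8) | (hop_count & 0xFF)
    return HEADER.pack(word, socket.inet_aton(dest_ip), dest_seq,
                       socket.inet_aton(src_ip), rrep_id, lifetime) + payload


def unpack_rrep(data):
    """Decode the header fields and return them with the SEDIRAN payload."""
    word, dest, dest_seq, src, rrep_id, lifetime = HEADER.unpack_from(data)
    flag_value = (word >> 19) & 0x1F
    return {
        "type": word >> 24,
        "flags": {n for n, bit in FLAG_BITS.items() if flag_value & (1 << bit)},
        "prefix": (word >> 8) & 0x1F,
        "hop_count": word & 0xFF,
        "dest_ip": socket.inet_ntoa(dest),
        "dest_seq": dest_seq,
        "src_ip": socket.inet_ntoa(src),
        "rrep_id": rrep_id,
        "lifetime": lifetime,
        "payload": data[HEADER.size:],
    }


def should_rebroadcast(seen, packet):
    """Rebroadcast only messages whose (source, RREP ID) pair is new."""
    key = (packet["src_ip"], packet["rrep_id"])
    if key in seen:
        return False
    seen.add(key)
    return True
```

A node would keep one seen set and call should_rebroadcast for each received packet: the first copy of an ADVM or DREQ message is processed and rebroadcast, and later copies with the same identifier are ignored.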
[Diagram: protocol stacks (SEDIRAN over AODV over Transport) of the service provider SP and of nodes N1 and N2. The ADVM message is passed down to AODV and travels as RREP/ADVM; each receiving node performs a routing update and a special services cache update.]
Figure 7.7. ADVM message propagation
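The duplicate suppression and cache update performed on reception of an ADVM message can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the SEDIRAN implementation: the class names `Rrep` and `Node`, the function `flood`, the use of the pair (source, RREP ID) as duplicate key, and the three-node chain topology are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rrep:
    """Simplified stand-in for the modified RREP packet (hypothetical fields)."""
    kind: str        # "ADVM", "DREQ" or "DREP", i.e. which of P, S, D is set
    source: str      # address of the node that originated the packet
    rrep_id: int     # broadcast identifier chosen by the originator
    hop_count: int   # hops travelled so far
    payload: str     # encapsulated SEDIRAN message (here: a service name)


@dataclass
class Node:
    address: str
    seen: set = field(default_factory=set)             # (source, rrep_id) pairs
    routes: dict = field(default_factory=dict)         # destination -> (next hop, hops)
    special_cache: dict = field(default_factory=dict)  # service -> provider

    def on_rrep(self, packet, previous_hop):
        """Return the packet to rebroadcast, or None if it must be ignored."""
        key = (packet.source, packet.rrep_id)
        if key in self.seen or packet.source == self.address:
            return None                                 # duplicate or own packet
        self.seen.add(key)
        hops = packet.hop_count + 1
        self.routes[packet.source] = (previous_hop, hops)        # routing update
        if packet.kind == "ADVM":
            self.special_cache[packet.payload] = packet.source   # cache update
        return Rrep(packet.kind, packet.source, packet.rrep_id, hops, packet.payload)


def flood(nodes, links, origin, packet):
    """Broadcast `packet` hop by hop over `links` (address -> neighbour list)."""
    frontier = [(origin, packet)]
    while frontier:
        next_frontier = []
        for sender, pkt in frontier:
            for neighbour in links[sender]:
                forwarded = nodes[neighbour].on_rrep(pkt, previous_hop=sender)
                if forwarded is not None:
                    next_frontier.append((neighbour, forwarded))
        frontier = next_frontier


# Hypothetical chain SP - N1 - N2, as in the propagation figure.
links = {"SP": ["N1"], "N1": ["SP", "N2"], "N2": ["N1"]}
nodes = {name: Node(name) for name in links}
flood(nodes, links, "SP", Rrep("ADVM", "SP", rrep_id=1, hop_count=0, payload="printing"))
print(nodes["N2"].routes, nodes["N2"].special_cache)
```

After the flood, N2 holds both the service entry and a two-hop route to SP through N1, so invoking the service would not require a separate route discovery.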
[Diagram: protocol stacks (SEDIRAN over AODV over Transport) of nodes N1, N2 and N3. The DREQ message issued by SEDIRAN is passed down to AODV and travels as RREP/DREQ; the receiving nodes perform a routing update.]
Figure 7.8. DREQ message propagation
To discover a service in the network, SEDIRAN generates a DREQ request. The request is encapsulated by the routing protocol in an RREP message and broadcast in the network. The DREQ request targets services offered by nodes located within a given diameter. When the request is received by an intermediate node, the routing protocol updates the route to the requester in order to prepare the return route for the DREP response. When the request is received by a node within the chosen diameter, the routing protocol updates the route to the requester and passes the DREQ message to the SEDIRAN protocol (Figure 7.8). SEDIRAN searches its local services table for services corresponding to the request. If it finds any, it generates a DREP response message.
Once a DREP message has been generated by SEDIRAN, it is passed to the routing protocol to be encapsulated in an RREP message and sent to the requester. When an intermediate node receives the message, the AODV protocol updates the route to the provider and passes the DREP message to the SEDIRAN module in order to update the ordinary service cache. In this way, the intermediate node holds both the service and the route to the provider, and a subsequent service invocation will not result in a route discovery. When the requester receives the message, the AODV protocol likewise updates the route to the provider and passes the DREP message to the SEDIRAN protocol in order to update the ordinary service cache.
[Diagram: protocol stacks (SEDIRAN over AODV over Transport) of the nodes on the return path. The DREP message travels as RREP/DREP; each receiving node performs a routing update and an ordinary service cache update.]
Figure 7.9. DREP message propagation
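The request and response handling described above can be sketched as follows. This is a minimal model under stated assumptions rather than the protocol's actual code: the names `Dreq`, `Drep`, `answer_dreq` and `on_drep` are hypothetical, the local services table is assumed to map a service identifier to its type, and the idea that services already listed in the DREQ are left out of the reply is an assumption about how the "previously discovered" list is used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dreq:
    requester: str
    service_type: str
    already_known: frozenset   # identifiers of services previously discovered
    max_hops: int              # discovery range (diameter), in hops


@dataclass(frozen=True)
class Drep:
    provider: str
    service_ids: tuple         # identifiers of the matching services


def answer_dreq(address, local_services, dreq, hops_travelled) -> Optional[Drep]:
    """Build a DREP if this node is in range and offers a matching service.

    local_services maps a service identifier to its service type.
    """
    if hops_travelled > dreq.max_hops:
        return None                                   # outside the diameter
    matches = tuple(sorted(
        sid for sid, stype in local_services.items()
        if stype == dreq.service_type and sid not in dreq.already_known  # assumption
    ))
    return Drep(address, matches) if matches else None


def on_drep(ordinary_cache, routes, drep, previous_hop, hops):
    """Update performed by intermediate nodes and by the requester."""
    routes[drep.provider] = (previous_hop, hops)      # routing update
    for sid in drep.service_ids:
        ordinary_cache[sid] = drep.provider           # ordinary service cache update


# Hypothetical provider N3 answering a request issued by N1 two hops away.
request = Dreq("N1", "print", frozenset({"s1"}), max_hops=2)
reply = answer_dreq("N3", {"s1": "print", "s2": "print", "s3": "scan"}, request, 2)
cache, routes = {}, {}
on_drep(cache, routes, reply, previous_hop="N2", hops=2)
print(reply, cache, routes)
```

Because the route to the provider is stored together with the cache entry, the later invocation of service `s2` from N1 would find a route already in place.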
Figure 7.10. SEDIRAN software architecture
SEDIRAN is a protocol designed to be used in the context of a mobile commerce application. A prototype of the protocol and of a mobile commerce application was developed in Linux and Windows environments. We used the Jadhoc library [JAD], written in Java for ad hoc networks. The system implementation is modular in the sense that we can use the AODV routing protocol or any other reactive routing protocol. We also plan to implement an API to integrate a UDDI directory service for storing service descriptions in WSDL. Figure 7.10 illustrates our system’s software architecture. Although this study is still in its infancy, we have been able to produce this implementation with the goal of integrating it into a ubiquitous computing environment in a subsequent step. We are also committed, in the short term, to studying the performance of this protocol. We plan to do this by simulation as well as in a real environment.
7.3. Conclusion
In this chapter, we have described the most widely known service discovery protocols. The issues discussed concern discovery as well as publication aspects. In general, these protocols do not address all of these aspects, and in some cases cover only a few. We have attempted to describe the strong points of each. In addition, recent studies on service discovery and ubiquitous applications have facilitated the emergence of discovery protocols for ad hoc networks. Among these protocols, we have highlighted those which are coupled with ad hoc routing procedures. One of them is the SEDIRAN protocol, which we have developed in our own laboratory. One of the aspects that we have considered for our own protocol is power saving during service search. We plan to integrate it by using ad hoc routing protocols that take this factor into account. We are also working on service-oriented ad hoc routing protocols. Routing in this case will consider not only destination addresses but also the services offered at those addresses.
7.4. Bibliography
[BAR 03] BARBEAU M., KRANAKIS E., “Modeling and Performance Analysis of Service Discovery Strategies in Ad Hoc Networks”, Proceedings of International Conference on Wireless Networks (ICWN), Las Vegas, USA, 2003.
[CHA 02] CHAKRABORTY D., JOSHI A., YESHA Y., FININ T., “GSD: A Novel Group-based Service Discovery Protocol for MANETS”, 4th IEEE Conference on Mobile and Wireless Communications Networks (MWCN), September 9 2002.
[CHA 04] CHAKRABORTY D., JOSHI A., YESHA Y., FININ T., “Towards Distributed Service Discovery in Pervasive Computing Environments”, IEEE Transactions on Mobile Computing, July 15 2004.
[HEL 03] HELAL S., DESAI N., VERMA V.N., LEE C., “Konark – A Service Discovery and Delivery Protocol for Ad Hoc Networks”, Proceedings of the Third IEEE Conference on Wireless Communication Networks (WCNC), New Orleans, USA, March 2003.
[JAD] Jadhoc System Design Manual, University of Bremen, http://www.comnets.unibremen.edu.
[JINI] Jini Architecture Specification, http://wwws.sun.com/software/jini/specs/jini2_0.pdf.
[JOH 02] JOHNSON D.B., MALTZ D.A., BROCH J., “DSR: The Dynamic Source Routing Protocol for Multi-Hop Wireless Ad Hoc Networks”, Internet Engineering Task Force, 2002.
[LUO 04] LUO H., BARBEAU M., “Performance Evaluation of Service Discovery Strategies in Ad Hoc Networks”, 2nd Annual Communication Networks and Services Research (CNSR) Conference, Fredericton, NB, Canada, May 2004.
[OBA 06] OBAID A., KHIR A., MILI H., LAFOREST L., “A Routing Based Service Discovery Protocol for Ad Hoc Networks”, submitted to IEEE Globecom’2006 Symposium on Wireless Ad Hoc and Sensor Networks, 2006.
[OWL] OWL: Web Ontology Language, http://www.w3.org/TR/owl-features.
[PER 02] PERKINS C., KOODLI R., Service discovery in on-demand ad hoc networks, IETF Internet draft, draft-koodli-manet-servicediscovery-00, October 2002.
[PER 03] PERKINS C., BELDING-ROYER E., DAS S., “Ad Hoc On-Demand Distance Vector (AODV) Routing”, RFC 3561 (experimental), July 2003.
[RAT 02] RATSIMOR O., CHAKRABORTY D., JOSHI A., FININ T., “Allia: Alliance-Based Service Discovery for Ad Hoc Environments”, International Workshop on Mobile Commerce, Proceedings of the 2nd International Workshop on Mobile Commerce 2002, Atlanta, Georgia, USA, September 28 2002.
[RDF] RDF: Resource Description Framework, http://www.w3.org/RDF.
[UPNP] Understanding Universal Plug and Play, White Paper, http://www.upnp.org/resources/whitepapers.asp, June 2000.
[VEI 97] VEIZADES J., GUTTMAN E., PERKINS C., KAPLAN S., RFC 2608: SLP Service Location Protocol, version 2, http://www.openslp.org/doc/rfc/rfc2608.txt, June 1997.
[VER 05] VERVERIDIS C.N., POLYZOS G.C., “Routing Layer Support for Service Discovery in Mobile Ad Hoc Networks”, Proceedings of the 3rd IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOMW’05), Kauai Island, Hawaii, USA, 2005.
Chapter 8
Distributed Clustering in Ad Hoc Networks and Applications
Chapter written by Romain MELLIER and Jean-Frédéric MYOUPO.
8.1. Introduction
Recent natural catastrophes, such as the Pakistan earthquakes and the tsunami in Southeast Asia, have clearly demonstrated the importance of rapid deployment of a reliable and powerful communication service for optimal rescue organization. In such completely devastated regions, “traditional” cell phone use is impossible because it requires the installation of a large infrastructure: one base station per sector, connected to other base stations by wired connections. The installation of each base station is too expensive because, in addition to the material cost, it requires a precise study of the terrain in order to ensure that a large enough area is covered. Some will argue that satellite telephony is a suitable solution. It certainly has the advantage of covering the globe, but its cost is prohibitive and it is not especially appropriate for a highly localized, dense communication network.
Alongside these “traditional” technologies, another path has been explored since the 1970s: wireless ad hoc networks. This comes from a simple observation. In the future, personal communications and mobile computing will require a wireless network infrastructure that is easy to deploy, multihop if possible, and capable of supporting multimedia services. Traditional telephony and current mobile communication networks rely on a wired network base that we generally refer to as a backbone. The objective of wireless ad hoc networks is to make reliable wireless communications possible without the use of an installation-intensive wired backbone.
A wireless ad hoc network is a group of mobile stations which can move without restrictions. All nodes of a single network can communicate either by direct wireless communication (if they are close enough to each other), or through the cooperation of one or several other members of the network. The problem with this type of network is that each node simultaneously acts as a client and a server, depending on the situation. Consequently, most tasks said to be “easy” in more traditional networks clearly become more complicated (mutual exclusion, broadcasting, etc.). Node mobility and the fact that all nodes only have limited energy autonomy must also be taken into consideration. The effort required to implement this type of network may seem somewhat overwhelming, but its practical applications are such that we must try our utmost. These include military applications (special operations, battlefield scenarios, etc.), response to natural catastrophes (fires, earthquakes, flooding, etc.) and domestic applications (installation of a temporary communication network during a conference or a large assembly, etc.).
A practical way of countering the previously mentioned problem is to build a hierarchical structure above the network in order to simulate a sort of backbone made up of nodes which are better “suited” than others. This is precisely the goal of clustering. This methodology has already proven its efficiency in the past and is a well known and widely studied problem in traditional distributed systems. It has, among other things, helped to solve several problems, such as minimizing the storage space needed for communication information (routing tables, for example). In this way, bandwidth improvement and resource distribution become realities.
First, we will present the state of the art of the different clustering techniques developed for wireless ad hoc networks, whether for clusters of two hops or more. After presenting an improvement of one of the techniques addressed, we will illustrate the various applications in this field.
8.2. State of the art
The notion of organizing by clusters has been used for ad hoc networks since their conception. In [BAK 81, BAK 84, BAK 87, EPH 83], a fully distributed cluster-based architecture is proposed, mainly to establish and demonstrate the ability of these networks to adapt to connectivity changes in hierarchical routing. We will study more recent works, starting with [GER 97], which are committed to using wireless ad hoc networks to support multimedia traffic.
Generally, clustering in a network is based on the following principle: nodes are divided into clusters, and some nodes, called clusterheads, are responsible for the formation and maintenance of clusters. The group of clusterheads is called the dominating group. This is the backbone of the network. In existing solutions, clustering is done in two distinct steps: the cluster initialization phase and the cluster maintenance phase. The first step is accomplished by choosing specific nodes to act as the dominating group in the clustering process. Clusters are then formed “around” clusterheads. Nodes that are not clusterheads are described as “ordinary nodes”. Current clustering algorithms mainly differ in the heuristic used to choose which node should attain the status of clusterhead.
A problem that arises during the study of cluster development in a wireless ad hoc network is defining the term “around”. Most approaches on the subject define this zone as the set of direct neighbors of the clusterhead [BAS 99, BAS 05, GER 97], whereas others have a much more flexible definition, with clusters at k hops, k ≥ 2 [GAR 03]. Certain references describe clustering with at most two hops as “node-centric” and clustering with more than two hops as “cluster-centric” [BAN 01, MCD 99].
In order to formalize our presentation, we should explain our notations. The network may be represented by a graph G = (V, E), with V the set of vertices (wireless stations) and E the set of edges (by this we mean communication links). We restrict ourselves to a special case: communication links are symmetric (if node p is a neighbor of node q, then q is also a neighbor of p). We now start this state of the art by studying the main references on clustering in two hop clusters.
8.2.1. Clustering in two hop clusters
Several heuristics have been proposed to choose the clusterhead in ad hoc networks. In “smallest identifier” algorithms, a unique identifier is assigned to each node and the node with the smallest identifier is elected as the clusterhead. A node listening to two or more clusterheads is a “gateway”, generally used for information routing between clusters. Otherwise, a node is an ordinary node. [GER 97] is one of the most important contributions in this field because it forms the basis of numerous studies. Gerla and Tsai describe a clustering algorithm with no overlap. Clusters are controlled independently from each other and automatically reconfigured when nodes move.
8.2.1.1. Gerla and Tsai approach
The objective of the proposed clustering algorithm [GER 97] is to find a series of interconnected clusters covering the whole node population. More precisely, the topology of the system is divided into disjoint clusters. A good clustering scheme will tend to preserve its structure when a few nodes move and the topology changes slowly. Otherwise, in the case of a frequently changing topology, the network will be overloaded by the continuous rebuilding of the structure. The problem of optimal cluster size must also be addressed. The authors favor the following option: optimal cluster size is the result of a compromise between spatial reuse of the communication channel (which favors small clusters) and minimization of communication delays between clusters (which favors larger clusters). Other constraints also apply, such as energy consumption and geographical node distribution. In fact, cluster size is controlled by the radio transmission power of the nodes. For the clustering algorithm, the authors assume that transmission power is fixed and uniform throughout the network. Within each cluster, nodes can communicate with each other in two hops at most. Clusters can be developed according to node identifiers. We consider the following hypotheses:
1) each node has a unique identifier (ID) and knows the identifiers of all its one hop neighbors;
2) a message sent by a node is correctly received by all its direct neighbors in a finite period of time;
3) network topology does not change during algorithm execution.
The algorithm accomplishing this task is relatively simple. It is executed by every node; consider a node with identifier my_id. The identifiers of its neighbors, together with its own, are stored in its local set Γ. Each node’s objective is to initialize its my_cid variable, which contains its cluster number. This decision is based on node identifiers: the node with the smallest identifier is considered the best suited to become clusterhead.
If the identifier my_id is the smallest of all those stored in Γ, then node my_id sets my_cid to my_id (i.e. it is its own clusterhead). It then broadcasts a cluster(my_id, my_cid) message to all its neighbors to inform them of its decision, and it removes my_id from Γ. Once this is done, node my_id executes the following block of instructions until its set Γ is empty:
– when node my_id receives a message of type cluster(id, cid), it starts by taking note of the fact that its neighbor id belongs to the cluster of clusterhead cid. Then, if node id is itself this clusterhead (id = cid), and if node my_id does not know its own clusterhead or if its clusterhead has a higher identifier than cid, then the node joins cluster cid, since cid is actually its neighbor. It removes id from its set Γ;
– if its identifier is now the smallest of all those stored in Γ, it names itself as its own clusterhead if it does not yet know its clusterhead, and broadcasts its decision to its neighbors. It removes itself from its set Γ;
– if Γ is empty, my_id has finished the algorithm execution and knows its status for future applications (if my_id = my_cid, then my_id is a clusterhead, otherwise it is an ordinary node) and to which cluster it belongs.
An interesting point in this algorithm is that ordinary nodes and clusterheads all have the same workload during cluster development. In this way, they spend the same amount of energy, and the activity of clusterheads does not suffer from a surplus of responsibility.
Consider the topology example illustrated in Figure 8.1 to gain a better understanding of the development methodology. After clustering, we obtain Figure 8.2, where we find six clusters, which are {1, 2}, {3, 4, 11}, {5, 6, 7, 8, 9}, {10, 12, 13}, {14, 15, 16, 17} and {18, 19, 20}.
Figure 8.1. System topology
Figure 8.2. Clustering of system according to node identifiers
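The smallest-identifier election described above can be sketched as a small simulation. This is an illustrative sketch, not the authors' code: the function name `lowest_id_clustering`, the message queue that replaces real radio broadcasts, and the five-node path used as input are assumptions, and the topology is taken to be static and symmetric, as in the hypotheses of the algorithm.

```python
from collections import deque


def lowest_id_clustering(adjacency):
    """Return the cluster identifier (my_cid) chosen by every node.

    adjacency maps a node identifier to the set of its one-hop neighbours.
    """
    gamma = {v: set(nbrs) | {v} for v, nbrs in adjacency.items()}  # local sets
    cid = {v: None for v in adjacency}       # None means "clusterhead unknown"
    queue = deque()                          # pending cluster(id, cid) broadcasts

    def maybe_announce(v):
        # A node speaks once it holds the smallest identifier left in its set.
        if v in gamma[v] and v == min(gamma[v]):
            if cid[v] is None:
                cid[v] = v                   # it becomes its own clusterhead
            queue.append((v, cid[v]))
            gamma[v].discard(v)

    for v in sorted(adjacency):
        maybe_announce(v)

    while queue:
        sender, sender_cid = queue.popleft()
        for v in adjacency[sender]:
            # Join only if the sender is itself a clusterhead with a smaller id.
            if sender == sender_cid and (cid[v] is None or cid[v] > sender_cid):
                cid[v] = sender_cid
            gamma[v].discard(sender)
            maybe_announce(v)
    return cid


# Hypothetical path 1 - 2 - 3 - 4 - 5 (not the topology of the figures).
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
clusters = lowest_id_clustering(path)
print(clusters)   # clusters {1, 2}, {3, 4} and {5}
# Every node is its own clusterhead or a direct neighbour of it,
# so two members of a cluster are at most two hops apart.
assert all(h == v or h in path[v] for v, h in clusters.items())
```

On this path, node 3 becomes a clusterhead even though node 2 has a smaller identifier, because node 2 has already joined the cluster of node 1.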
From here, it is easy to conclude that at the end of the algorithm, every node will have made its decision and will belong to a single cluster. In fact, the identifier of the cluster to which a node is linked is either the node’s own identifier or the smallest of its neighbors’ identifiers. We can also show that in a single cluster, two nodes are at most 2 hops from each other. To see this, consider any two nodes of the same cluster. Each node can directly reach the node whose identifier is equal to the cluster identifier. Therefore, two nodes from one cluster are at a distance of at most 2 from each other. The complexity of this algorithm can be shown to be O(|V|), where V is the set of nodes in the graph representing the network.
At this point in the algorithm, we have a completely clustered network. However, in a dynamic radio network, nodes may move or leave the network permanently. Other nodes may also wish to join the network. We say that a topology change has occurred when a node disconnects from or connects to all or some of its neighbors, thereby altering the cluster structure. Obviously, system performance will be affected by large cluster changes. So, in addition to an efficient formation mechanism, it is also vital to implement a cluster maintenance scheme that keeps the cluster infrastructure as stable as possible. The authors of [GER 97] have therefore developed a maintenance mechanism that limits as much as possible the number of transitions of a node from one cluster to another. Consider the example presented in Figure 8.3a.
Figure 8.3. Cluster maintenance example
There are five nodes in the cluster, and the distance inside the cluster in terms of number of hops is at most 2. Because of mobility, the topology has changed until it reached the configuration in Figure 8.3b. At this moment, d(1, 5) = d(2, 5) = 3 > 2, where d(i, j) is the distance in number of hops between nodes i and j. Consequently, the cluster needs to be reconfigured. In other words, we should decide which node(s) must be taken out of the current cluster. The criterion adopted is to find out which node has the highest connectivity. We choose to leave the node with the highest connectivity and its neighbors in the original cluster and to take out all others. We should remind our readers that at this moment in the algorithm, nodes only have local information with which to make their decision. So, once a node, say y, has discovered that a member, say x, of its cluster is no longer in its vicinity, it must verify whether the node with the highest connectivity is one of its neighbors. If that is the case, y takes x out of its cluster. Otherwise, y must change clusters. Two steps are therefore necessary to maintain cluster structure:
– step 1: verify whether one of the members of the cluster is gone;
– step 2: if one of the members of the cluster has actually left this cluster, the node decides whether it should change clusters itself or remove the nodes that are no longer in its vicinity.
Consider the example from Figure 8.3b. Node 4 is the node with the highest connectivity. Thus, node 4 and its neighbors {1, 2, 3} do not change clusters. However, node 5 must either join another cluster or start its own cluster. If a node wishes to join a cluster, it must first verify that all members of that cluster are actually within its vicinity. Only on this condition will a node be able to join a cluster. From here, the authors describe numerous primitives for using this type of network which, according to them, can support full multimedia traffic.
In reality, several elements clearly demonstrate that this contribution, major as it is, was only a first step in the development of powerful clustering algorithms.
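The maintenance rule of [GER 97] recalled above (keep the node with the highest connectivity and its neighbors, take out the others) can be illustrated with a short sketch. It takes a global view of the cluster for readability, whereas the real nodes decide from local information only. The function names, the choice of counting connectivity inside the cluster, the tie-break on the smallest identifier and the edge list are all assumptions; the edges were chosen so that d(1, 5) = d(2, 5) = 3, as in the example discussed above.

```python
from collections import deque


def hop_distance(adjacency, source, target):
    """Breadth-first search; returns None if target cannot be reached."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for w in adjacency[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return None


def repair_cluster(members, adjacency):
    """Return (kept, removed) once some members are more than 2 hops apart."""
    def connectivity(v):
        return len(adjacency[v] & members)   # neighbours inside the cluster

    # Highest connectivity wins; ties go to the smallest identifier (assumption).
    anchor = max(members, key=lambda v: (connectivity(v), -v))
    kept = {anchor} | (adjacency[anchor] & members)
    return kept, members - kept


# Hypothetical links after the move: 1-4, 2-4, 3-4 and 3-5.
adjacency = {1: {4}, 2: {4}, 3: {4, 5}, 4: {1, 2, 3}, 5: {3}}
cluster = {1, 2, 3, 4, 5}
print(hop_distance(adjacency, 1, 5))       # 3, so the 2-hop bound is violated
print(repair_cluster(cluster, adjacency))  # node 4 and {1, 2, 3} stay, node 5 leaves
```

Node 5 would then either join a neighboring cluster or start its own, which this sketch does not model.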
Another way of choosing the clusterhead is the “highest degree” based algorithm [GER 95]. In this algorithm, the degree of a node is computed on the basis of its distance to other nodes: the nodes within its transmission range are its neighbors. Each node broadcasts this information to its neighbors, and the node with the largest number of neighbors (i.e. the maximum degree) is chosen as the clusterhead, and its neighbors join it as members of its cluster. In the case where two neighboring nodes have the same degree, their unique identifiers are used to break the tie. Then, the process continues for the remaining nodes. The highest degree heuristic establishes that the node with the highest number of neighbors should be elected clusterhead, without considering the very obvious fact that a clusterhead may not be able to support a large number of neighbors because of certain resource limitations, even if those nodes are direct neighbors and are located well within range. Evidently, none of these models leads us to a satisfactory result in the construction of a dominating group. In 1999, Basagni [BAS 99] proposed an evolution of the model in order to get closer to the reality of wireless ad hoc networks.
8.2.1.2. Distributed clustering for ad hoc networks (DCA): weight notion introduction
Basagni [BAS 99] starts with a simple statement. In existing solutions, clustering is done in two distinct steps: the cluster initialization phase and the cluster maintenance phase. The first step is accomplished by choosing specific nodes to act as the dominating group in the clustering process. Clusters are then formed “around” clusterheads. Current clustering algorithms mainly differ in the heuristic used to choose which node to raise to the status of clusterhead. The same procedure is then repeated on all remaining nodes until a cluster is assigned to each node. A hypothesis made in all studies on the subject is that during cluster initialization, nodes do not move as long as cluster formation is still active.
This is a major drawback, because in a real ad hoc situation, no hypothesis of this type is acceptable. Once all clusters are formed, the non-mobility assumption is abandoned and numerous techniques describe how to maintain cluster organization. For example, in [BAK 81] and [BAK 84], cluster reorganization is executed periodically, by calling the clustering algorithm again at regular intervals to account for node mobility. As explained previously, in [GER 97], cluster maintenance in the presence of mobility is described in this way: each node v locally decides whether or not to refresh its data concerning the cluster (whether it still belongs to its cluster, and whether the other members do). The decision is based on knowledge of the one-hop and two-hop neighbors of v and on knowledge of the local topology of the neighbor of v with the highest degree. The resulting algorithm is thus different from the one dedicated to clustering (based on node identifiers and knowledge of direct neighbors), and the clustering obtained has different characteristics from the original one.
The idea of Basagni [BAS 99] is very different. He focused on the two clustering phases (initialization and maintenance) with the perspective of obtaining certain specific properties. More precisely, because of the dynamic nature of ad hoc networks, nodes need:
1) to have at least one clusterhead neighbor (for quicker communication between any node pair);
2) to be affiliated to the “best” clusterhead neighbor;
3) in addition, two clusterheads must never be direct neighbors, which guarantees a good spatial distribution of clusterheads throughout the network.
Basagni [BAS 99] begins by presenting the algorithm known as DCA (for Distributed Clustering Algorithm), which is a generalization of the clustering algorithms presented in [BAK 81] and [GER 97]. These previous approaches are generalized by allowing the choice of clusterheads to be based on a generic weight (a strictly positive integer) associated with each node: the higher the weight of a node, the better suited this node is to the important role of clusterhead. The main advantage of this approach is that by expressing weights in terms of parameters related to node mobility, we can choose the best possible nodes for the role of clusterhead, or in other words the nodes that can handle this task best. For example, if the weight of a node is inversely proportional to its speed, then the less mobile nodes will be selected as clusterheads. Since these nodes either do not move or move very slowly compared to other nodes, the clustering is guaranteed to have a longer lifetime, and consequently the overhead associated with cluster maintenance in wireless environments is minimized. To simplify matters, Basagni assumes that weights are unique in the network. This innocent looking hypothesis will be the subject of a controversy that we will address later in this chapter.
The main hypothesis made in developing the DCA algorithm is that during the algorithm execution, the network topology will not change. In fact, in its first version, this algorithm is only a generalization, through the concept of generic weights, of the algorithms presented in [BAK 81] and [GER 97]. The author also makes two more hypotheses that are common in this context. He assumes that a message sent by a node is correctly received in a finite time period (a step) by all its neighbors, and that each node knows its identifier, its weight, and the identifiers and weights of all its neighbors.
The algorithm is executed individually by each node in such a way that a node v decides its own role (clusterhead or ordinary node) based only on the decisions of its neighbors with higher weights. So, initially, only nodes with the highest weight in their neighborhood broadcast a message to their direct neighbors, announcing that they are clusterheads. If no neighbor of v with a higher weight has sent this type of message (each having indicated instead that it will join another cluster as an ordinary node), then v sends a message establishing its status as a clusterhead. Beyond the initial procedure, the algorithm is completely driven by message reception: a specific procedure is executed by a node according to the nature of the received message. The algorithm uses two types of messages: ch(v), used by a node v to inform its neighbors that it will be a clusterhead, and join(v, u), with which v informs its neighbors that it will be part of the cluster whose clusterhead is u. In the explanation of the algorithm, we use the following notations:
– v, the generic node executing the algorithm (from now on, we will say that v is the identifier of this node and wv is its weight);
– cluster(v), the set of nodes in the cluster of v. This variable is initialized at Ø and is only updated if v is a clusterhead;
– clusterhead, the variable in which each node logs the identifier of the clusterhead that it has joined. Its initial value is zero;
– ch(−) and join(−,−), two Boolean variables. Node v sets ch(u) to “true”, for u a neighbor of v or v itself, either when it sends a ch(v) message (case u = v) or when it receives a ch(u) message from a neighbor u other than v. The Boolean variable join(u, t), with u a neighbor of v and t any node, is set to “true” by v when v receives a join(u, t) message from u. These two variables are initialized at “false”.
Each node simultaneously launches the algorithm execution using the same initialization procedure. Only those nodes with the highest weight compared to their direct neighbors send a ch message. Because of the nature of the weights (integers), there is always at least one node v transmitting a ch(v) message. All other nodes simply wait to receive such a message.
From there, we have two specific procedures whose execution is triggered by message arrivals:
– when a ch message is received from a neighbor u, node v first verifies whether it has received a join(z, x) message, with x being any node of the network (this is in fact logged in the corresponding Boolean variable), from all its neighbors z such that wz > wu. In this case, v will not receive a ch message from any of these nodes z, and u is the node with the highest weight in the neighborhood of v which has sent a ch message. Therefore, v joins u and leaves the execution of the algorithm, because it now knows to which cluster it is linked, or in other words, which node is its clusterhead. If at least one node z remains such that wz > wu and which has not yet sent a message, then node v simply logs in the ch(u) variable that u sent it a ch message, and waits for a message from z;
– when it receives a join(u, t) message, node v verifies whether it has previously sent a ch message (i.e. whether it has already decided to become a clusterhead: when this happens, ch(v) is set to “true”). If that is the case, v verifies whether node u wants to join its own cluster (i.e. v = t) and updates its cluster(v) set if needed. Then, if all neighbors z of v such that wz < wv have communicated their intention to join a cluster, v leaves the DCA execution. Please note that, in this case, node v pays no attention to its neighbors y (if there are any) such that wy > wv, because these nodes have necessarily joined a node x with a higher weight than that of v. This enables v to be a clusterhead despite the fact that some of its neighbors have a higher weight. If node v has not sent a ch message, then before deciding what its role will be, it needs to know what all nodes z such that wz > wv have decided for themselves. If v has received a message from all nodes of this type, then it examines the nature of the messages received. If they are all join messages, this means that all these neighbors z have decided to join a cluster as ordinary nodes. This implies that v is now the node with the highest weight among the nodes (if they exist) which have not yet decided what to do. In this case, v will be a clusterhead and it executes the necessary operations (i.e. it sends a ch message, updates its cluster(v) set, and sets variable ch(v) to “true” and variable clusterhead to v). At that time, v also verifies whether each of its neighbors y such that wy < wv has already joined another cluster. If that is the case, v leaves the algorithm execution: it will be the clusterhead of a cluster made up only of itself. Otherwise, if v has received at least one ch message from such a node z, then it joins the cluster of the neighbor with the highest weight which has previously sent a ch message. Then it leaves the DCA execution.
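The outcome of the two procedures above can be reproduced by a compact, centralized emulation. This sketch does not implement the message-driven protocol itself: it relies on the observation, stated in the text, that a node decides only after all its neighbors of higher weight have decided, so visiting nodes by decreasing weight yields the same roles. The function name `dca_outcome`, the eight-node topology and the weights are hypothetical, and weights are assumed unique with a static topology.

```python
def dca_outcome(adjacency, weight):
    """Return the clusterhead chosen by every node.

    adjacency maps a node to the set of its neighbours;
    weight maps a node to its (unique) weight.
    """
    head = {}
    for v in sorted(adjacency, key=lambda n: weight[n], reverse=True):
        # Every neighbour of higher weight has already decided at this point;
        # neighbours of lower weight are still undecided and are ignored.
        heads_nearby = [u for u in adjacency[v] if head.get(u) == u]
        if heads_nearby:
            # Equivalent of join(v, u): pick the clusterhead of highest weight.
            head[v] = max(heads_nearby, key=lambda u: weight[u])
        else:
            # Equivalent of ch(v): no higher-weight neighbour is a clusterhead.
            head[v] = v
    return head


# Hypothetical topology and weights (assumptions made for this example).
adjacency = {
    1: {2, 3, 4}, 2: {1, 7}, 3: {1, 8}, 4: {1, 5},
    5: {4}, 6: {7}, 7: {2, 6}, 8: {3},
}
weight = {4: 9, 1: 8, 7: 7, 3: 5, 2: 4, 5: 3, 6: 2, 8: 1}

heads = dca_outcome(adjacency, weight)
print(heads)   # clusters {4, 1, 5}, {7, 2, 6} and {3, 8}

# No two clusterheads are direct neighbours.
chosen = {v for v, h in heads.items() if v == h}
assert all(not (adjacency[c] & chosen) for c in chosen)
```

Here node 3 becomes a clusterhead although its neighbor 1 has a higher weight, because node 1 has joined node 4 as an ordinary node.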
Note that a node always leaves the execution of the algorithm once it has sent a join message. In order to clarify the previous explanations, we will detail the procedure of a simple example. Consider the simple ad hoc network illustrated in Figure 8.4a.
Figure 8.4. a) An ad hoc network G with nodes v and their weights wv, 1 ≤ v ≤ 8; b) an appropriate clustering for G
During step 1 (the beginning of the clustering), all nodes execute the DCA initialization procedure. Since nodes 4 and 7 have the highest weight in their respective neighborhoods, they each send a ch message declaring themselves clusterheads. At the end of this step, nodes 1 and 5 have received a ch(4) message, nodes 2 and 6 have received a ch(7) message, and nodes 3 and 8 have neither received nor sent any message.

During step 2, by executing the procedure corresponding to the arrival of a ch(4) message, nodes 1 and 5 send join(1, 4) and join(5, 4) respectively and leave the DCA execution (nothing prevents them from doing so, since none of their neighbors has a higher weight than node 4). At the end of this step, node 4 has received the join messages from nodes 1 and 5; since it has now received a join message from every neighbor with a lower weight than its own, it quits the DCA execution: cluster {4, 1, 5} is formed.

At the end of step 1, by executing the procedure associated with the arrival of the ch(7) message, nodes 2 and 6 learn that their neighbor 7 is a clusterhead. Since no node in the neighborhood of 6 has a higher weight than 7, node 6 sends the join(6, 7) message and quits the DCA execution. Node 2 cannot do the same, because it must wait for a message from node 1, whose weight is higher than that of 7. The join(1, 4) message is received by node 2 at the end of the second step: since node 1 joins the cluster of node 4, node 2 joins node 7. By executing the procedure associated with the arrival of the join(1, 4) message, node 2 sends the join(2, 7) message. At the end of the third step, node 7 has received a message from all neighbors with a lower weight than its own and leaves the DCA execution: cluster {7, 2, 6} is formed.

Now consider node 3. At the end of step 2, it has received the join(1, 4) message.
At that moment, however, node 3 could not decide which role to assume, since there was still a node in its neighborhood (node 2) with a higher weight which had not yet made a decision. At the end of the third step, node 3 has finally received the join(2, 7) message and, by executing the procedure associated with this message arrival, it realizes that all its neighbors with higher weights have decided to join another cluster. It then sends a ch(3) message declaring its new status to its direct neighbors. At the end of step 4, this message is received by node 8 which, since node 3 has a higher weight and no other node heavier than 3 is in its direct neighborhood, joins the cluster of node 3 by sending a join(8, 3) message and leaves the DCA execution. When node 3 receives the join message from node 8, it also leaves the DCA execution. Therefore, at the end of the fifth step, cluster {3, 8} is formed and the DCA algorithm ends. The resulting clustering is presented in Figure 8.4b.

We can easily observe that the message complexity of DCA is n, which also bounds its running time, since we presume that the time needed to send a message is bounded. Determining the time complexity of this algorithm more precisely requires the introduction of several additional notions. We denote by I the set of nodes i whose weight is higher than that of every neighbor u of i. These are the nodes that become clusterheads as soon as they execute the initialization procedure, at the very beginning of the clustering of the wireless ad hoc network. For each node v not belonging to I, we define the
blocking distance as the longest of the shortest routes between v and any i in I. The blocking diameter Bd is then the largest of all blocking distances. It can be shown that each network node sends exactly one message within Bd+1 steps of the algorithm execution. This algorithm is suitable for “almost static” networks, in other words for installing a cluster architecture in a network where nodes move slowly. Following the model presented in [BAK 81], DCA can be used under mobility as a series of periodic clusterings (i.e. clusters are continuously rebuilt in order to adapt to unpredictable variations in network topology).

8.2.1.3. Distributed clustering for better mobility support: DMAC (distributed and mobility-adaptive clustering)

We now describe a clustering algorithm, DMAC (distributed and mobility-adaptive clustering) [BAS 99], which adapts to network mobility and which creates and maintains the same cluster architecture as that developed by DCA. The first fundamental difference between DMAC and DCA is that DMAC manages both the initialization and the maintenance of the cluster organization despite node mobility, while ensuring the three properties previously mentioned. The other difference is that we no longer need to presume that the network topology is static during cluster construction. Adaptation to network topology changes is made possible by letting each node react not only to messages received from other nodes, but also to the failure of a communication link to another node (probably caused by node failure or by the movement of wireless devices) and to the appearance of a new link. In the DMAC algorithm description, we still presume that a message sent by a node is correctly received by all its neighbors in a finite period of time.
It is also presumed that each node knows its own identifier, weight and role (whether it has already decided to be a clusterhead or an ordinary node), as well as the identifiers, weights and roles of all its neighbors. As long as a node has not yet decided on its role, it is considered an ordinary node. Apart from the procedure that each node executes when it launches the clustering, and as in the case of DCA, the algorithm is driven by message arrivals and deliveries. We use the same types of message as in the DCA description. We consider the following additional hypotheses:
– a node is informed of a link failure or of the presence of a new communication link by a low-level network service, which launches the appropriate procedure;
– DMAC procedures are “atomic” (they cannot be interrupted);
– during clustering initialization or when a node is added to the network, its clusterhead, ch(-), and cluster(-) variables are initialized to zero, “false” and Ø respectively.

What follows is the complete description of the procedures comprising the DMAC algorithm, as executed by any node v:
– initialization: during the installation of the cluster structure, or when node v is added to the network, it executes the initialization procedure in order to (re)determine its own role. If, among its neighbors, there is at least one clusterhead with a higher weight, then v joins it; otherwise, it becomes a clusterhead. Note that a neighbor with a higher weight which has not yet decided on its own role (this can happen during the first network initialization, or when two or more nodes are simultaneously added to the network) can send a message later (every node is bound to execute this initialization procedure). If this message is a ch message, then v gives up its clusterhead role and submits to this new clusterhead;
– link failure: whenever node v is informed of the failure of the link to one of its direct neighbors u, it starts by checking whether its own role is clusterhead and whether u belonged to its cluster. If so, v removes u from cluster(v). If v is an ordinary node and u was its clusterhead, v becomes an orphan and must determine its new role. To do so, v checks whether there is at least one clusterhead z in its neighborhood with a higher weight than its own. If so, v joins the clusterhead with the highest weight. Otherwise, it becomes its own clusterhead;
– emergence of a new link: when v is informed of the presence of a new neighbor u, it verifies if u is a clusterhead.
If that is the case and if u has a higher weight than the current clusterhead of v, then, independently of its own current role, v submits to u;
– arrival of a ch(u) message: when a neighbor u of v becomes a clusterhead, it sends a ch(u) message to all its direct neighbors, including v. When this message is received, node v checks whether it has to submit to u (in other words, whether the weight of u is higher than that of the clusterhead to which v currently belongs). If so, v joins the cluster of u, independently of its own current role;
– arrival of a join(u, z) message: the behavior of node v depends on whether or not it is currently a clusterhead. If it is, v must check whether u is joining its cluster (z = v: in this case, u is added to cluster(v)) or whether u belonged to its cluster and is now joining another one (z ≠ v: in this case, u is removed from cluster(v)). If v is not a clusterhead, it must check whether u was its clusterhead. If that is the case, then, and only then, v must decide what its role will be: it will join clusterhead x with the highest weight in the sense that the weight of x is strictly
higher than its own weight, if such a node exists. Otherwise, it will become its own clusterhead.

We can easily conclude that the previous procedures obtain and maintain, for any network, a clustering which fulfills the three properties discussed previously, with the same complexities as the DCA algorithm.

Basagni [BAS 99] presented two node clustering algorithms, DCA and DMAC, which partition any wireless ad hoc network into a set of clusters, each made up of a clusterhead and possibly several ordinary nodes. A new clusterhead selection criterion based on weights was introduced, making it possible to base the choice on mobility-related parameters, which was not the case with previous clustering algorithms. The proposed algorithms only require each node to have local topology knowledge (i.e. knowledge of its direct neighbors), and they enable each ordinary node to directly access at least one clusterhead, thus guaranteeing quick inter-cluster and intra-cluster communication between any pair of nodes. The DCA algorithm is easy to implement, and it has been proven that its time complexity is bounded by a network parameter that depends on the potentially changing network topology, rather than on n, the invariant network size. DMAC is intended to combine the ease of implementation of DCA with a complete adaptation to node mobility, even during cluster initialization.

8.2.1.4. Generalization of distributed approach limiting mobility impact: GDMAC

[BAS 05] shows how to improve the DMAC approach; the result is the GDMAC (generalized DMAC) algorithm. Specifically, the objective is to limit the impact of mobility. Basagni’s observation is as follows: in accordance with the DMAC algorithm specifications, an ordinary node will always submit to its neighborhood’s clusterhead with the highest weight.
If another clusterhead with a higher weight emerges later in its neighborhood, the ordinary node will change its affiliation in order to always submit to the node most able to be a good clusterhead. Another form of reaffiliation appears when two clusterheads find themselves in each other’s range of influence because of their mobility. At that moment, the heaviest one is the only one that will keep its clusterhead status, and the other one will become an ordinary node and find itself a suitable clusterhead. If no clusterhead is within transmission range, the node will become its own clusterhead once more (election). In this perspective, DMAC is actually able to support the dynamic nature of this type of network, in particular with a permanent distributed monitoring of all possible events (connection formation/failure, etc.). When a new node is detected, necessary information for maintaining the structure (identifiers, roles, etc.) is exchanged by
nodes and adapted procedures are launched to reorganize clusters in order to be able to include the new ones. From this standpoint, it quickly becomes clear that, depending on the network’s degree of mobility and on how the nodes move, there can be frequent cluster reorganizations, whether by election or by reaffiliation. There are certain cases where a single topological event can be enough to launch a chain reaction of role changes for several nodes. Such undesirable phenomena sometimes occur to the detriment of the performance of the whole network, because a large part of the node energy is consumed by all the necessary reprocessing (while a cluster is being reorganized, no “useful” data can be sent through it).

A possible solution to decrease the number of changes caused by mobility (elections and reaffiliations) involves somewhat relaxing the DMAC hypotheses which strictly prohibit having two neighboring clusterheads and which force any ordinary node to affiliate with its highest-weight clusterhead neighbor. The GDMAC protocol was defined with this idea. GDMAC operations are based on two parameters, H and K, which help to limit the “chain effect” of some DMAC rules. Up to K clusterheads are allowed to be neighbors. In addition, an ordinary node prefers a newly arrived clusterhead to its current clusterhead only when the weight of the new one is H units higher than that of the current one. GDMAC is a pure generalization of DMAC: we obtain the DMAC protocol by setting both H and K to 0. The higher the value of H, the less likely a node is to change clusterheads when new clusterheads arrive in its neighborhood. Parameter H implements the idea that cluster reorganization is only necessary when the new potential clusterhead is actually much better than the current clusterhead. Parameter K controls the spatial density of clusterheads. When K = 0, there cannot be two neighboring clusterheads (as with the DMAC algorithm).
Giving parameter K a value strictly higher than 0 helps to delay cluster reorganizations, because a clusterhead is no longer forced to give up its responsibilities when up to K-1 clusterheads with higher weights are present in its direct neighborhood. The author supports this claim with numerous simulations, which show that GDMAC offers a significant energy saving compared with the original DMAC solution. However, this gain is not without drawbacks. Our final objective is to use the structure offered by the clustering process to run other algorithms more efficiently, but, generally, to offset the overhead caused by clustering, these algorithms need to be able to count on some reliability from the cluster structure. Without this reliability, there will be no performance gain compared with more traditional solutions.
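The role of the two GDMAC parameters can be made concrete with a small sketch. The two Python predicates below are hypothetical helpers, not taken from [BAS 05]: they assume that an ordinary node switches when the newcomer's weight exceeds its current clusterhead's weight by more than H, and that a clusterhead resigns when more than K heavier clusterheads are among its neighbors. This threshold is chosen so that H = K = 0 gives back the DMAC behavior; the exact boundary used by the protocol may differ by one from this reading.

```python
def prefers_new_head(w_new, w_current, H):
    """Ordinary node reaffiliates only if the new clusterhead is more than
    H units heavier than the current one (assumed strict inequality)."""
    return w_new > w_current + H


def must_resign(w_self, neighbor_head_weights, K):
    """Clusterhead gives up its role when more than K heavier clusterheads
    are in its direct neighborhood (assumed threshold)."""
    heavier = sum(1 for w in neighbor_head_weights if w > w_self)
    return heavier > K


# With H = K = 0 the rules behave like DMAC.
print(prefers_new_head(10, 8, H=0))      # any heavier clusterhead wins
print(must_resign(5, [7, 9], K=0))       # any heavier neighbor clusterhead forces resignation

# Larger H and K damp reorganizations.
print(prefers_new_head(10, 8, H=3))      # the newcomer is not "much better"
print(must_resign(5, [7, 9], K=2))       # two heavier clusterheads are tolerated
```

Separating the two tests keeps reaffiliation (controlled by H) independent of election (controlled by K), which is how the text presents the two parameters.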
Another major problem is still not resolved by this improvement: node weights are presumed unique for simplicity. But what happens if two or more neighboring nodes have identical weights? A solution to this problem will be addressed later in this chapter.

Before turning to clustering algorithms that yield clusters in which two nodes may be more than two hops apart, we briefly cite a few other contributions in this field. The algorithms mentioned up to now have the advantage of only requiring local network topology information. In contrast, the protocol proposed in [CHA 02] requires global network topology information and therefore cannot reasonably be considered suitable for wireless ad hoc networks. This protocol is enhanced in [LU 05], but the requirement to broadcast information to the whole network before making a decision remains, which makes it unsuitable in our case. In other solutions such as [AN 01, BAS 01] and [ER 04], the pattern formed by node mobility is taken into account for cluster formation, but again, this takes us further from our fundamental requirement of using only clustering algorithms that need local network knowledge. [BAS 01] also presents an algorithm based on weights, where a local mobility criterion is used to select the clusterhead. The ratio between the power levels of consecutive transmissions between two nodes is used to determine the relative mobility between two neighboring mobile devices. The problem is that this process is computationally intensive and requires stations with significant calculation capability. In [AN 01], wireless nodes are organized into variable-size clusters with no overlap, whereas in [ER 04], clusters overlap. The algorithm proposed in [AN 01] uses a combination of physical and logical network partitions, whereas nodes moving in the same “manner” are grouped in [ER 04].
These last two studies raise a very legitimate question: why not create clusters with more than two hops in diameter? This is exactly the subject of the next section of this chapter.

8.2.2. Clustering at more than two hops

In contrast to the clustering solutions said to be node-centric, which build clusters with a limit of two hops, there are solutions described as cluster-centric, which build k-hop clusters. A k-cluster is made up of a group of nodes mutually accessible by a route of length at most k, with k ≥ 1. In this approach, the network is broken down into clusters of nodes without one node being specifically designated as a clusterhead. From this premise, several methods have been developed. The drawback of most of these methods is that they only target specific wireless ad hoc networks, characterized either by their topology or by their mobility characteristics, etc. For example,
[CHA 97] describes an algorithm for clustering networks that have the form of a unit disk graph. Besides this restriction, the method is not suitable because it presumes that all nodes are stationary during cluster development. There are also other algorithms which implicitly impose certain constraints on the cluster diameter [BAN 01, MCD 99, RAM 98]. In [BAN 01], a distributed implementation of a centralized clustering algorithm is proposed. It requires the construction of a global spanning tree to generate clusters which fulfill several constraints on the number of nodes per cluster and on the number of hierarchical levels. The priority is to create clusters of size between k and 2k-1 for a given k. The algorithm starts by creating a rooted spanning tree covering the complete network. Cluster development is then carried out from the leaves upwards, with sub-trees transformed into clusters satisfying the predetermined size criteria.

An interesting reference in this field of clustering at more than two hops is [GAR 03], which retains the notion of clusterhead in algorithms building clusters of more than two hops. We will study it in more detail. This report describes several new clustering algorithms for nodes in a mobile ad hoc network. The authors of [GAR 03] propose to combine two well-known approaches into a single clustering algorithm which considers connectivity as the primary criterion and the smallest identifier as the secondary criterion for clusterhead selection. Their goal is to decrease the number of clusters, which has to be traded off against the desire to have smaller groups. The authors refer to the algorithm from Lin and Gerla [GER 97] as the algorithm based on the “1-smallest identifier”. First, they generalize the same distributed algorithm to define k-clusters and call the resulting algorithm the “k-smallest identifier” algorithm. One of the nodes initiates the clustering process by flooding the network with a clustering request.
It is presumed that stations know all their neighbors at up to k hops. All nodes with the smallest identifier among their neighbors at k hops maximum broadcast their decision to create a cluster (with themselves as clusterhead) to their neighborhood (in the sense here of nodes at k hops at the most). Their decision (and in the same way all the future decisions of other nodes) is then transmitted to all other nodes up to distance k. If all neighbors of a node at k hops at the most which have smaller identifiers have broadcast their decision and none of them has declared itself a clusterhead, then the node decides to create its own cluster and broadcasts its decision to the others. Otherwise, it chooses the clusterhead with the smallest identifier among its neighbors at k hops maximum and broadcasts its decision to its neighborhood. Each node therefore broadcasts its decision once its neighbors at k hops with smaller identifiers have broadcast theirs.
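The decision rule described above can be sketched sequentially. The Python below is a centralized illustration, not the distributed protocol of [GAR 03]: it assumes unique, comparable identifiers and visits nodes in increasing identifier order, which stands in for each node waiting until all smaller-identifier nodes within k hops have announced their decision. The function names `k_hop_neighbors` and `k_lowest_id_clustering` and the sample path network are hypothetical.

```python
from collections import deque


def k_hop_neighbors(adj, src, k):
    """Return the set of nodes at distance 1..k from src (breadth-first search)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not explore beyond k hops
        for n in adj[u]:
            if n not in dist:
                dist[n] = dist[u] + 1
                queue.append(n)
    del dist[src]
    return set(dist)


def k_lowest_id_clustering(adj, k):
    """Return {node: clusterhead} under the k-smallest-identifier rule."""
    head = {}
    for v in sorted(adj):  # smaller identifiers decide first
        # Clusterheads within k hops that have already decided (smaller ids).
        heads = [u for u in k_hop_neighbors(adj, v, k) if head.get(u) == u]
        head[v] = min(heads) if heads else v
    return head


# Hypothetical 5-node path network: 1 - 2 - 3 - 4 - 5.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(k_lowest_id_clustering(adj, 1))
print(k_lowest_id_clustering(adj, 2))
```

On this path, raising k from 1 to 2 merges clusters, which illustrates the trade-off between the number of clusters and their size.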
The algorithm based on the smallest identifier does not take node connectivity into account and consequently produces more clusters than necessary. An algorithm based only on node degrees is not suitable either, because of the high number of ties between nodes. The authors therefore propose to use the node degree as the primary key and the identifier as the secondary key in clustering decisions. The authors then introduce a technique to maintain the cluster structure, while indicating that increasingly specialized techniques must be deployed to maintain a certain clustering quality.

This brings us to question the legitimacy of wanting to obtain clustering at more than two hops. To our knowledge, it is not really motivated by any use. The idea of clustering ad hoc networks is to try to recover the advantages offered by mobile cellular networks. Furthermore, given the difficulty of maintaining a good clustering, we can be skeptical of the chances of achieving quality clustering with clusters at more than two hops. Finally, mobile ad hoc networks are often characterized as being very dynamic in nature. It therefore appears illusory to think that nodes will remain correctly physically grouped in order to maintain the spatial coherence of the cluster.

With this point made, we can return to the study of wireless ad hoc network clustering in clusters of at most two hops. While the algorithm introduced in [BAS 99] is satisfactory in terms of its low requirements for topology information at each node, it unfortunately has a shortcoming which prohibits its use in a real network environment. For the choice of clusterheads, the author relies on a system of weights. To simplify, he presumes that weights are unique, but he does not mention what would happen if two nodes had the same weight. This is the subject of the next section.

8.3. Clustering in networks where mobile devices may have the same weight

In order to achieve a complete clustering algorithm, the main difficulty is to deal with non-unique weights. Whatever the quantity used as a weight (inverse of moving speed, initial battery level, etc.), it is always possible that two or more network nodes have the same weight, which makes it impossible to use the DMAC algorithm directly. Several solutions may be considered: first, we may consider adopting the idea of [GAR 03] and say that the generic weight is only the primary key in the decision criteria. However, we must then find a suitable secondary key so that the combination of these criteria guarantees uniqueness throughout the network, but this would
bring us back to the initial problem. This problem is currently unresolved, but we will describe a first solution here. We presume that an arbitrary number of nodes individually launch the clustering process. Nodes not volunteering at first will be drawn into the process by the messages sent by the others. When we study the protocol discussed in [BAS 99], we notice that we do not really need strict weight uniqueness in the whole network. We only need uniqueness at two hops (i.e. if a mobile device v has weight wv, all of its neighbors at 2 hops maximum must each have a weight that is different from wv) in order to resolve the case of a mobile device which has two or more clusterhead neighbors and must choose one to affiliate itself to. Any volunteer node first checks whether it can become a clusterhead. If it has the same weight as its heaviest neighbor (if it exists), then it randomly chooses a bit. If it picks 0, it submits to this neighbor; otherwise, it becomes a clusterhead and modifies its weight so that it immediately becomes heavier than this neighbor. This requires the weight to be implemented as a list: the original node weight is stored in the first element of the list. When a node “artificially” becomes heavier than one of its neighbors, it adds an element with value 1 at the end of its list (2 if it was tied with 2 of its highest clusterhead neighbors, etc.); otherwise, it adds a 0 and informs the victorious neighbor that it must add a 1 (if there are two victorious nodes, the higher one takes 2 and the other 1). If the node receives a notification from one of its neighbors to modify its weight before reaching the end of the procedure, then it must obey this notification and restart the procedure from the beginning (i.e. the principle of “first come, first served”). Another unresolved tie arises when a node cannot determine which node will be its clusterhead.
In this case, the node will arbitrate between the different candidate nodes by using the same type of procedure. This simple modification of the DMAC algorithm has the advantage of not changing its complexity and of providing a usable system of weights for clustering. In the future, simulations will be carried out to validate this first approach.

8.4. Applications

As mentioned in this chapter’s introduction, the use of clustering in mobile ad hoc networks greatly facilitates the implementation of numerous protocols. The idea is to recreate a structure with a certain stability, in order to have an infrastructure similar to the one in cellular mobile networks. In this section, we will present two possible applications: the initialization problem and mutual exclusion in k hop networks.
8.4.1. Initialization problem in k hop networks

The initialization problem consists of providing a unique identifier to each mobile device. A survey of the literature on mobile ad hoc networks quickly convinces the reader that this is an absolutely fundamental problem: even for the simplest task, most authors systematically presume that all nodes have unique identifiers. As mentioned in the previous section, the installation of a cluster structure does not necessarily require identifiers, but only unique weights. Since we have unique weights, we can now presume that the network is completely clustered and that we are able to launch an initialization. More precisely, the initialization can be carried out as follows: each clusterhead executes a local initialization of its own cluster by using an initialization algorithm for one hop mobile ad hoc networks [MYO 03]. Then the clusterhead of the highest cluster transmits to the next cluster the highest identifier used in its cluster, by itself or by one of its affiliates. The next clusterhead only needs to add this maximum identifier to each identifier in its cluster. Then it transmits the maximum identifier used in its cluster to the next clusterhead, and so on. This is the prefix sum technique. For example, if the first cluster uses identifiers 1 to 4 and the second cluster has locally assigned identifiers 1 to 3, the nodes of the second cluster end up with identifiers 5 to 7, and 7 is passed on to the third clusterhead.

8.4.2. Mutual exclusion in k hop networks

One of the most traditional paradigms in the study of distributed systems is mutual exclusion. It consists of defining a protocol executed by a group of processes wishing to coordinate with each other for exclusive access to a unique shared resource (or a piece of code), which we call the critical section. There has been intensive research in this field for static distributed systems, which is summarized in [AND 03] and [RAY 86]. However, the problem is somewhat different with mobile ad hoc networks because of permanent network topology changes and the problem of energy consumption.
Since the stations are autonomous and therefore have a very limited energy supply, algorithms for wireless ad hoc networks must avoid consuming too much energy, in order to leave as much as possible for the applications subsequently run by the stations. Since the beginning of the 1990s, research on mobile ad hoc networks has increased greatly, due in large part to their obvious economic advantage. Most contributions are based on the circulation of a token: the node holding the token may enter the critical section. One of the first protocols based on the use of tokens is the RL (reverse link) algorithm presented in [VAI 98] and enhanced in [VAI 01]. Their approach is to adapt static network algorithms to mobile ad hoc networks by
introducing mobility. The algorithm in [VAI 98] lets each station obtain the critical section by maintaining an acyclic directed graph in the network, based on a mechanism introduced by [BER 81]. The fundamental property of this acyclic directed graph is that the token holder is its sink, i.e. the node toward which all links lead. This is ensured by using a height mechanism to label nodes, which enables unlimited reorientation of the communication links along which the token circulates. Each node locally maintains a request queue containing the identifiers of the neighbors from which it has received a request for the token.

However, this protocol has several drawbacks. First, it presumes that the token is reliable (i.e. it will neither disappear nor be duplicated) and that no node failure will occur. It also presumes that a low-level protocol informs each node of the set of nodes with which it can directly communicate, by notifying it of link formations and failures. The authors also presume that communication links are FIFO. A vital problem with this algorithm is that a certain failure pattern of the communication links can cause starvation in the network. Suppose that two links lead to the node requesting the token. If these links alternately fail and come back each time the token is about to go through one of them, the requesting node will inevitably starve. That is why the authors have to presume that communication link failures eventually stop, in order to ensure a finite access time to the critical section for all nodes.

Numerous studies on the subject have been published since, but they all retain this drawback: the significant problem of reliably supporting mobility in these networks. The problem in implementing reliable mobile ad hoc network algorithms is the lack of a reliable infrastructure, resulting from high node mobility.
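The height mechanism mentioned above for the RL algorithm can be illustrated with a small sketch. The Python below makes several assumptions that are not stated in the text: heights are totally ordered triples ending with the node identifier, each link is directed from the higher to the lower endpoint, and a node receiving the token lowers its height just below that of the sender. The helper names `sinks` and `pass_token` are hypothetical, and request queues, failures and link changes are left out.

```python
def sinks(adj, height):
    """Nodes with no outgoing link, when every link points from the
    higher-height endpoint to the lower-height endpoint."""
    return [v for v in adj if all(height[v] < height[u] for u in adj[v])]


def pass_token(height, sender, receiver):
    """Assumed rule: the receiver drops just below the sender, so that the
    links between them (and all others at the receiver) now point to it."""
    h1, h2, _ = height[sender]
    height[receiver] = (h1, h2 - 1, receiver)


# Hypothetical 3-node path 0 - 1 - 2; node 0 holds the token initially.
adj = {0: [1], 1: [0, 2], 2: [1]}
height = {0: (0, 0, 0), 1: (0, 1, 1), 2: (0, 2, 2)}

print(sinks(adj, height))   # the current token holder is the only sink
pass_token(height, 0, 1)
print(sinks(adj, height))   # after the transfer, the new holder is the only sink
```

Because heights are totally ordered, the orientation they induce can never contain a cycle, which is why reorienting links by changing a single height keeps the graph acyclic.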
In 2005, [MEL 05a] proposed the installation of a type of “reliable” node backbone above the network by using one of the clustering techniques discussed previously. This reliable backbone would be made up of clusterheads, which would be responsible for routing the token between clusters and for distributing it inside clusters. This work is only a very preliminary version (with very restrictive mobility hypotheses) of a project by the authors, but it clearly lays the foundations. The idea of this mutual exclusion algorithm is to presume that a slightly modified version of the DMAC algorithm runs in the background in the network, and to use the properties of the structure thus generated. Each node has a weight that defines its priority in the network for obtaining the critical section. In order to reach this goal, the authors of [MEL 05] have to make some hypotheses on the network. They presume that the weight of a node is inversely proportional to its moving speed, which is presumed to be constant, or to its remaining initial battery level. From there, weights are used as identifiers of sorts. Naturally, we presume that weights are unique, hence the need for mechanisms such as those already discussed to resolve ties. We presume that a message sent by a node is received within a finite time (a slot) by all its neighbors and that each node knows its weight,
role (and therefore its cluster), and the weights and roles of all its direct neighbors. In addition, we presume that the clusterhead gathers in its own memory all the knowledge of all its affiliates. In this way, it knows the identity of the clusters to which it can send the token directly or forward it with the help of one of its affiliated nodes. We presume that all this information is transmitted to the clusterhead by the slightly modified version of the DMAC algorithm. The implementation of this protocol is left to the reader.
Once a mobile device has received the token, it must decide whether it is authorized to enter the critical section or whether it must transmit the token to a node or a cluster with higher priority. To do so, it locally maintains a list of nodes to satisfy, which includes itself like any other node. So, when a node obtains the token and is at the head of its priority list, it sends itself the token through the loop-back interface, much in the same way as when we send a packet to address 127.0.0.1 in an IP-based system. Nodes with the lowest weights have the highest priority because of their shorter expected lifetime in this neighborhood.
In the algorithm description, we will use the following notations:
– v: the current node, from whose point of view the description is given;
– Γ(v): the set of neighbors of v, including v;
– Γc(v): the neighbors of v belonging to the same cluster as v, including v;
– c: all neighbor clusters of the cluster of v, including that cluster itself. This variable is obviously available only to clusterheads;
– token(sender, destination, flag, clusterID): the token that is transmitted from one node to another. It has the following attributes:
- sender: the weight of the mobile device that sent the token to its neighbors;
- destination: the weight of the mobile device to which the sender wanted to send the token. When a mobile device whose weight is not destination receives this token, it must destroy the token;
- flag: this tells the receiving node in what way it must or must not use the token. The value of the flag evolves depending on the situations summarized in Figure 8.5;
- clusterID: the identifier of the cluster to which the token is being transmitted. This attribute is only initialized when the flag has certain specific values;
– my_cluster: the identifier of the cluster to which v belongs;
– my_ch: the clusterhead of the cluster to which v belongs. If v is its own clusterhead, then my_ch = v;
– nodeTosatisfy: the last node to which v, as clusterhead, has transmitted the token. If the value of nodeTosatisfy is “nothing”, this means that v’s goal is now to determine to which neighbor cluster to transmit the token, and no longer to find a neighbor node to satisfy;
– clusterTosatisfy: the last cluster to which v, as clusterhead, has transmitted the token.
We can now present the mutual exclusion algorithm. We presume that the DMAC protocol has installed a cluster structure over the network and that the network is now mobile, with the restriction that the cluster structure must not change:
– the role of a node cannot change: an ordinary node must always remain an ordinary node, and a clusterhead permanently retains its status;
– nodes must not move from one cluster to another. They belong to a single cluster forever.
The mutual exclusion protocol relies on the domination property of the cluster structure (in other words, each ordinary node has at least one clusterhead in its direct neighborhood). It establishes a simple communication system. Here is its principle: once the clusterhead has obtained the token, it transmits it in turn to each of its affiliated nodes with the flag set to “use”. Each of them enters the critical section (if it so wishes) and then sends the token back to the clusterhead with the flag set to “back”. Once all affiliated nodes, including the clusterhead, have been satisfied, the clusterhead transmits the token with the “forward” flag to the neighbor cluster with the lowest identifier (i.e. the one with the least reliable clusterhead). Unless it is in direct contact with this cluster, it uses its most reliable affiliate as a gateway to the target cluster. This approach is summarized in Figure 8.5.
Figure 8.5. Token circulation in the network and flag value evolution
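The circulation principle can be pictured with the minimal sketch below. It is only an illustration under stated assumptions, not the authors' implementation: the names Token and serve_cluster are hypothetical, nodes are identified by their weight, each cluster is assumed to be identified by the weight of its clusterhead, every affiliate is assumed to be reachable in one hop, the target clusterhead is assumed to be in direct contact (so no gateway is needed), and neighbor_clusters is assumed to exclude the clusterhead's own cluster.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Token:
    sender: int                        # weight of the node sending the token
    destination: int                   # weight of the intended receiver
    flag: str                          # "use", "back", "forward" or "cancel"
    cluster_id: Optional[int] = None   # only set for inter-cluster transfers


def serve_cluster(clusterhead: int, affiliates: Set[int],
                  neighbor_clusters: Set[int]) -> List[Token]:
    """Return, in order, the tokens exchanged while one cluster is served."""
    sent = []
    # Lowest weights are served first, the clusterhead being a node like any other.
    for node in sorted(affiliates | {clusterhead}):
        sent.append(Token(clusterhead, node, "use"))
        if node != clusterhead:        # the clusterhead serves itself via loop-back
            sent.append(Token(node, clusterhead, "back"))
    # Everyone is satisfied: hand over to the neighbor cluster with the lowest identifier.
    target = min(neighbor_clusters)
    sent.append(Token(clusterhead, target, "forward", target))
    return sent


sent = serve_cluster(clusterhead=9, affiliates={4, 7}, neighbor_clusters={12, 15})
```

Here the clusterhead of weight 9 serves nodes 4 and 7, then itself, and finally forwards the token to cluster 12.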
With only this description, we could argue that two selfish “neighbor” clusterheads (or clusterheads that each have at least one gateway node neighboring a gateway node of the other) could monopolize the token by only ever sending it to each other, and thus starve all other clusters. To counter this phenomenon, the authors of [MEL 05] presume that when a clusterhead receives the token from another cluster, it does not immediately use it for its own purposes. If a neighbor cluster that is not yet satisfied has higher priority than its own cluster, the clusterhead sends the token to that cluster instead of using it itself. The token will come back to it after a finite time. Once DMAC has installed the clusters, the protocol is first initialized by giving token(v, v, forward, my_cluster(v)) to the clusterhead v of cluster my_cluster(v).
This mutual exclusion algorithm was also designed to tolerate the consequences of the following phenomena:
– intra-cluster mobility: affiliated nodes are allowed to move around the clusterhead and to have highly variable communication links with each other. The only restriction is that all ordinary nodes must remain in direct contact with the clusterhead;
– a certain inter-cluster mobility: if the clusterhead moves in a certain direction and all its affiliates follow, the cluster structure is not altered.
A certain form of inter-cluster mobility also results from intra-cluster mobility. Because nodes move around their clusterhead, gateway nodes will not always have the same neighbors over time. So, when a clusterhead uses one of its affiliated nodes as a gateway to a neighbor cluster, this node may no longer be in contact with the target cluster when the token arrives. The intermediate node then sends the token back to its clusterhead with the “cancel” flag to announce that the clusterhead must find another route to the target cluster (if this cluster is still in contact).
This type of situation is illustrated in Figure 8.6.
Figure 8.6. Situation in which the “cancel” flag is used
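The gateway's decision in this situation can be sketched as follows. This is a hedged illustration only: relay_as_gateway and its parameters are hypothetical names, nodes are identified by their weight, and the choice of the highest-weight neighbor in the target cluster is an assumption, since the text does not say which neighbor the gateway prefers.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Token:
    sender: int
    destination: int
    flag: str
    cluster_id: Optional[int] = None


def relay_as_gateway(token: Token, gateway: int, clusterhead: int,
                     neighbor_clusters: Dict[int, int]) -> Token:
    """Forward the token toward token.cluster_id, or return it with "cancel".

    neighbor_clusters maps the weight of each current neighbor of the
    gateway to the identifier of the cluster that neighbor belongs to.
    """
    reachable = [weight for weight, cluster in neighbor_clusters.items()
                 if cluster == token.cluster_id]
    if reachable:
        # Assumption: prefer the most reliable (highest-weight) neighbor.
        return Token(gateway, max(reachable), "forward", token.cluster_id)
    # Contact with the target cluster was lost while the token was in transit.
    return Token(gateway, clusterhead, "cancel", token.cluster_id)


incoming = Token(sender=9, destination=7, flag="forward", cluster_id=12)
still_in_contact = relay_as_gateway(incoming, gateway=7, clusterhead=9,
                                    neighbor_clusters={9: 9, 5: 12})
contact_lost = relay_as_gateway(incoming, gateway=7, clusterhead=9,
                                neighbor_clusters={9: 9, 4: 9})
```

In the second call, gateway 7 no longer has any neighbor in cluster 12, so the token goes back to clusterhead 9 with the “cancel” flag.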
The safety and liveness of the algorithm are not endangered by node failures under the following conditions:
– the network must remain connected (i.e. form a connected graph);
– clusterheads must never fail, as this could destroy the cluster structure;
– the direct sender and receiver of the token must never disappear during a transmission (in other words, one-hop transmissions are reliable); otherwise, the token would be lost. This problem could be avoided by using a PIF (propagation of information with feedback) technique: a dedicated node could periodically broadcast a PIF to verify that exactly one token is currently circulating in the network [CHA 82, SEG 83].
Under these restrictions, the algorithm can be considered fault tolerant and energy efficient. We can easily establish that the number of broadcasts is proportional to the number n of nodes, i.e. in O(n). Numerous elements still need improvement, in particular the clustering algorithm used. The authors of [MEL 06b] and [MYO 07] modify the DMAC algorithm so that it can cope with several nodes having the same weight. We could also argue that presuming that a message is always received within a given period of time is not realistic. This can easily be improved by using a mutual exclusion algorithm for single-hop ad hoc networks [MEL 06a].
8.5. Conclusion
In this chapter, we have presented different clustering techniques. The main point of divergence between them is the heuristic used to determine which nodes become clusterheads. Some are based on node identifiers, others on node connectivity, and others still use a generic weight system that depends on a parameter related to mobility. Very clearly, the weight-based approach has the advantage of being closer to network reality than the other solutions. The problem is now to formulate a good weight system for building a cluster structure that is as stable as possible.
Another problem is obtaining a weight system in which uniqueness is guaranteed, because current clustering proposals simply presume this uniqueness. However, whatever the heuristic chosen (inverse of moving speed, etc.), we quickly notice that uniqueness is not absolutely guaranteed. By using the methods explained in this chapter, we can obtain a system with unique weights from any network. Using a reliable cluster structure clearly improves the implementation of numerous protocols dedicated to mobile ad hoc networks (routing, initialization, etc.). In particular, we have illustrated a first version of a mutual exclusion algorithm that benefits from the capabilities of this structure. However, in order to be able to
develop efficient algorithms over clusters, it remains fundamental to improve clustering techniques, which still present some drawbacks, in particular the need to compromise between reliability, reactivity and energy conservation. Interested readers will find performance studies for all the clustering techniques presented in the corresponding references.
8.6. Bibliography
[AN 01] AN B., PAPAVASSILIOU S., “A Mobility-Based Clustering Approach to Support Mobility Management and Multicast Routing in Mobile Ad Hoc Wireless Networks”, International Journal of Network Management, vol. 11, no. 6, p. 387-395, 2001.
[AND 03] ANDERSON J., HERMANI T., KIM Y., “Shared Memory Mutual Exclusion: Major Research since 1986”, Distributed Computing, no. 16, p. 75-110, 2003.
[BAK 81] BAKER D.J., EPHREMIDES A., “The Architectural Organization of a Mobile Radio Network via a Distributed Algorithm”, IEEE Transactions on Communications COM-29, no. 11, p. 1694-1701, November 1981.
[BAK 84] BAKER D.J., EPHREMIDES A., FLYNN J.A., “The Design and Simulation of a Mobile Radio Network with Distributed Control”, IEEE Journal on Selected Areas in Communications SAC-2, no. 1, p. 226-237, January 1984.
[BAK 87] BAKER D.J., EPHREMIDES A., WIESELTHIER J.E., “A Design Concept for Reliable Mobile Radio Networks with Frequency Hopping Signaling”, Proceedings of the IEEE 75, vol. 1, p. 56-73, January 1987.
[BAN 01] BANERJEE S., KHULLER S., “A Clustering Scheme for Hierarchical Control in Multi-Hop Wireless Networks”, Proceedings of the 20th IEEE Infocom 2001, vol. 2, p. 1028-1037, 2001.
[BAS 01] BASU P., KHAN N., LITTLE T., “A Mobility-Based Metric for Clustering in Mobile Ad Hoc Networks”, Proceedings of the 21st International Conference on Distributed Computing Systems, p. 413-432, 2001.
[BAS 05] BASAGNI S., GHOSH R., “Limiting the Impact of Mobility on Ad Hoc Clustering”, Proceedings of the Second ACM Inter. Workshop on Performance Evaluation of Wireless Ad Hoc, Sensor, & Ubiquitous Networks (PE-WASUN ’05), IEEE Computer Society, p. 197-204, October 2005.
[BAS 99] BASAGNI S., “Distributed Clustering for Ad Hoc Networks”, Proceedings of the 1999 International Symposium on Parallel Architectures, Algorithms, and Networks (ISPAN ’99), IEEE Computer Society, p. 310-315, June 1999.
[BER 81] BERTSEKAS D., GAFNI E., “Distributed algorithms for generating loop-free routes in networks with frequently changing topology”, IEEE Transactions on Communications, C-29, no. 1, p. 11-18, August 1981.
[CHA 82] CHANG M., “Echo Algorithms: Depth Parallel Operations on General Graphs”, IEEE Transactions on Software Engineering, SE-8, p. 391-401, 1982.
[CHA 02] CHATTERJEE M., DAS S.K., TURGUT D., “WCA: a Weighted Clustering Algorithm for Mobile Ad Hoc Networks”, Journal of Cluster Computing, special issue on Mobile Ad Hoc Networks, vol. 5, no. 2, p. 193-204, April 2002.
[CHA 97] CHATTERJEE M., KRISHNA P., PRADHAN D.K., VAIDYA N., “A Cluster-Based Approach for Routing in Dynamic Networks”, ACM SIGCOMM Computer Communications Review, vol. 27, no. 2, p. 49-64, April 1997.
[EPH 83] EPHREMIDES A., “Design Concepts for a Mobile-User Radio Network”, Computers & Electrical Engineering, vol. 10, no. 3, p. 127-135, 1983.
[ER 04] ER I.I., SEAH W.K.G., “Mobility-Based d-Hop Clustering Algorithm for Mobile Ad Hoc Networks”, Proceedings of IEEE Wireless Communications and Networking Conference, Atlanta, USA, March 21-25 2004.
[GAR 03] GARCIA NOCETTI F., SOLANO GONZALEZ J., STOJMENOVIC I., “Connectivity Based k-Hop Clustering in Wireless Networks”, Telecommunication Systems, vol. 22, no. 1-4, p. 205-220, 2003.
[GER 95] GERLA M., TSAÏ J.T.C., “Multicluster, Mobile, Multimedia Radio Network”, Wireless Networks, vol. 1, no. 3, p. 255-265, 1995.
[GER 97] GERLA M., LIN C.R., “Adaptive Clustering for Mobile Wireless Networks”, Journal on Selected Areas in Communications, vol. 15, no. 7, p. 1265-1275, September 1997.
[LU 05] LU H., REN C.M., SUN X.M., “EWCA: An Enhanced Weighted Clustering Algorithm for Mobile Ad Hoc Networks”, Proceedings of the 2005 International Conference on Wireless Networks, p. 489-494, June 2005.
[MCD 99] MCDONALD A.B., ZNATI T.A., “A Mobility-Based Framework for Adaptive Clustering in Wireless Ad Hoc Networks”, IEEE Journal on Selected Areas in Communications, special issue on Wireless Ad Hoc Networks, vol. 17, no. 8, p. 1466-1487, 1999.
[MEL 05] MELLIER R., MYOUPO J.F., “A Clustering Mutual Exclusion Protocol for Multi-Hop Mobile Ad Hoc Networks”, Proceedings of the IEEE International Conference on Networks (ICON), Malaysia, November 2005.
[MEL 06a] MELLIER R., MYOUPO J.F., “Fault Tolerant Mutual and k-Mutual Exclusion Algorithms for Single-Hop Mobile Ad Hoc Networks”, International Journal of Ad Hoc and Ubiquitous Computing (IJAHUC), vol. 1(3), p. 156-166, 2006.
[MEL 06b] MELLIER R., MYOUPO J.F., “A Weighted Clustering Algorithm for Mobile Ad Hoc Networks with Non-Unique Weights”, 2nd International Conference on Wireless and Mobile Communications (ICWMC 2006), p. 39, Bucharest, Romania, 29-31 July, 2006.
[MYO 03] MYOUPO J.F., RAVELOMANANA V., THIMONIER L., “Average Case Analysis Based Protocols to Initialize Packet Radio Networks”, Journal of Wireless Communication Mobile Computing, vol. 3, p. 539-548, 2004.
[MYO 07] MYOUPO J.F., SOW I., “A Randomized Clustering Algorithm for Unknown Wireless Ad Hoc Networks with Multiple Same-Weighted Nodes”, forthcoming.
[RAM 98] RAMANATHAN R., STEENSTRUP M., “Hierarchically-Organized, Multi-Hop Mobile Wireless Networks for Quality-of-Service Support”, Mobile Networks & Applications, vol. 3, no. 1, p. 101-119, June 1998.
[RAY 86] RAYNAL M., Algorithms for Mutual Exclusion, MIT Press, Cambridge, USA, 1986.
[SEG 83] SEGALL A., “Distributed Network Protocols”, IEEE Transactions on Information Theory, IT-29, p. 23-35, 1983.
[VAI 01] VAIDYA N., WALTER J., WELCH J., “A Mutual Exclusion Algorithm for Ad Hoc Mobile Networks”, Wireless Networks, vol. 7, no. 6, p. 585-600, 2001.
[VAI 98] VAIDYA N., WALTER J., WELCH J., “A Mutual Exclusion Algorithm for Mobile Ad Hoc Networks”, Dial M for Mobility Workshop, Dallas, USA, October 1998.
Chapter 9
Security for Ad Hoc Routing and Forwarding
9.1. Introduction
Wireless ad hoc networks present significant problems of routing security and IP datagram forwarding security because of their special characteristics: peer-to-peer open architecture, shared “wireless” medium, highly dynamic topology, node mobility and severe resource constraints. For example, some nodes in these networks may belong to malicious people who inject erroneous routing information in order:
– to insert into routing tables routes liable to cause denials of service by forwarding traffic to a black hole;
– to divert traffic so that a malicious node can illegally access its content and maybe even modify it;
– to cause network congestion, through routing loops or the use of non-optimal routes.
Maintaining IP connectivity in these types of networks thus requires the use of secure routing protocols. This chapter discusses the state of knowledge and research in the field. It begins with a review of routing protocols in ad hoc networks (section 9.2). The body of the chapter addresses IP security in ad hoc networks, first focusing on the threat model of this specific security problem (section 9.3). The last two
Chapter written by Sylvie LANIEPCE.
sections present the state of the art in routing security (section 9.4) and in the security of IP datagram forwarding (section 9.5).
9.2. Reminders on routing protocols in ad hoc networks
This section is not meant to discuss routing in ad hoc networks in its entirety, but to review the main elements needed to understand the security mechanisms presented in this chapter [BEL 04]. The first section focuses on protocols said to be reactive, in other words protocols that initiate route discovery on demand, when a packet needs to be forwarded to a destination. The second section discusses protocols known as proactive which, by contrast, build routing tables in advance, so that routes are already available when data needs to be transmitted and no route discovery has to be initiated at that time.
9.2.1. Reactive protocols
9.2.1.1. Dynamic source routing (DSR)
The DSR protocol [JOH 04] is a source routing protocol which includes a route discovery mechanism enabling a node to find, on demand, a route to a remote node, i.e. to identify a sequence of intermediate nodes through which data packets can travel to reach their destination node. The source node then includes the route’s description in the header of each data packet, which the network forwards according to the source routing principle. The discovery phase between a source node S and a destination node D proceeds as follows. Node S broadcasts an RREQ packet in its radio transmission zone; the packet identifies the request initiator (S) and the final destination (D), and contains a request identifier. Each RREQ message also includes a field containing the description of the route being created.
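As a rough picture of the information just described, the sketch below models the RREQ fields and the use of a source route carried in a data packet header. The class and function names are hypothetical and the layout is an assumption for illustration; it does not reproduce the DSR packet format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RouteRequest:
    initiator: str                    # node that starts the discovery (S)
    destination: str                  # node being searched for (D)
    request_id: int                   # chosen by the initiator
    route: List[str] = field(default_factory=list)  # intermediate nodes recorded so far


def next_hop(source_route: List[str], current: str) -> Optional[str]:
    """Return the node that follows `current` in the route carried by a data packet."""
    position = source_route.index(current)
    if position + 1 < len(source_route):
        return source_route[position + 1]
    return None                       # `current` is the destination


rreq = RouteRequest(initiator="S", destination="D", request_id=1)
data_route = ["S", "A", "B", "D"]     # full route placed in each data packet header
```

With source routing, an intermediate node such as A needs no routing table entry for D: it simply reads the next node from the packet header.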
When an intermediate node receives an RREQ message, it rebroadcasts it in its radio coverage zone after adding its own identifier to the field that is populated hop by hop and that will eventually contain the sequence of identifiers describing the route created, except in these cases:
– the sequence of identifiers already contains the identifier of the intermediate node, which would imply the creation of a loop. The node then discards the RREQ packet received;
– the intermediate node has already received an RREQ from the same initiating node with the same request identifier and destination node, which means that copies of a single request have arrived via several nodes. In this case, the intermediate node discards the RREQ message received since it has already processed this request;
– the intermediate node is the destination node. In this case, it sends back to the node that initiated the RREQ an RREP packet containing a copy of the route description field built during RREQ propagation.
This principle does not exclude the discovery of several possible routes to a given destination, which can be an advantage for distributing load or for keeping an alternate route in case a link breaks (thus avoiding an additional route discovery phase). DSR also contains a second mechanism, called route maintenance. This mechanism enables a node to detect a topology change that breaks an active link. Each node is responsible for detecting possible link breaks between itself and the next node through acknowledgement mechanisms, operating either at the link layer (as is the case natively with 802.11, for example), in a “passive” mode (through promiscuous mode listening) or through a dedicated DSR acknowledgement. Each node is also responsible for informing the node that initiated the data packet of the detected link break, with the help of an RERR error message.
9.2.1.2. Ad hoc on-demand distance vector (AODV) routing
AODV [PER 03] is a distance vector routing protocol also based on the discovery of new routes on demand (in other words, executed when a node wants to communicate with another node and does not know a valid route to reach it), using an RREQ message flooding technique. Each intermediate node broadcasting an RREQ message keeps track of the originating node’s identity so that a corresponding RREP can later be sent back to it. Similarly, each intermediate node transmitting an RREP on the return route creates an entry in its routing table.
It records the identifier of the node from which it received the RREP message, because this is the next node through which a data packet bound for the node that initiated the RREP will have to pass (where appropriate). In addition, each node manages a number identifying its route requests (RREQ ID), which it increments each time it generates an RREQ message. This identifier, combined with the IP address of the message initiator, makes each RREQ unique and lets each node avoid processing the same RREQ several times: it processes only the first copy received and discards the following ones.
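The bookkeeping described in the last two paragraphs can be sketched as follows. This is a deliberately reduced model, not the AODV specification: the class AodvNode and its method names are hypothetical, and sequence numbers, hop counts and entry lifetimes are left out. Recording the neighbor from which the first RREQ copy arrived as the next hop toward the originator is the usual way of being able to send an RREP back, and is stated here as an assumption.

```python
class AodvNode:
    def __init__(self, address):
        self.address = address
        self.seen_requests = set()   # (originator address, RREQ ID) pairs already processed
        self.next_hop = {}           # destination address -> neighbor to forward to

    def on_rreq(self, originator, rreq_id, received_from):
        """Return True if the RREQ must be rebroadcast, False if it is a duplicate."""
        key = (originator, rreq_id)
        if key in self.seen_requests:
            return False             # only the first copy is processed
        self.seen_requests.add(key)
        # Assumption: remember how to get back to the originator for a later RREP.
        self.next_hop.setdefault(originator, received_from)
        return True

    def on_rrep(self, rrep_initiator, received_from):
        """Record the forward route toward the node that generated the RREP."""
        self.next_hop[rrep_initiator] = received_from


node = AodvNode("B")
first = node.on_rreq("S", 1, received_from="A")
duplicate = node.on_rreq("S", 1, received_from="C")
node.on_rrep("D", received_from="E")
```

After these three events, node B forwards packets for S via A and packets for D via E, and it has ignored the second copy of the request.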
Each node also maintains a sequence number, which increases monotonically, in order to inform the network of the freshness of the routing information contained in RREQ, RREP or RERR messages. The network can then update the corresponding routing tables knowingly and avoid loops. Each node increments its sequence number in two circumstances, just before transmitting an RREQ message and just before transmitting an RREP message, so as to mark the freshness of the information transmitted (the sequence number is carried in RREQs and RREPs). In addition, for each entry created in its routing table, each node keeps the sequence number of the destination node. It can thus judge, for each destination node, whether its routing information is still valid or possibly obsolete, and consequently whether it is relevant to communicate this routing information to other nodes or whether it needs to update its tables. Finally, a hop counter is incremented each time an intermediate node retransmits a message; it gives the length of the route created, in number of hops, so that the best route in terms of nodes traversed can be selected.
9.2.2. Proactive protocol
9.2.2.1. Destination-sequenced distance vector (DSDV) routing
DSDV [PER 94] is a distance vector routing protocol with specific adaptations, particularly the use of sequence numbers, to respond to the specific characteristics of ad hoc networks, in particular node mobility and the resulting rapid topology changes.
Each node in the network maintains a routing table containing as many entries as there are possible destination nodes in the network. Each table entry includes the address of a destination node, the address of the next node, which is the first hop on the shortest route to the destination node, the distance in number of hops to the destination node, and a sequence number assigned by the destination node according to the principle explained below. Each node retransmits data packets using its routing table to determine the next node on the route to the destination. Each node regularly broadcasts an update message built from its own routing table, and updates its own table from the update messages of other nodes if their information is more recent than the information it currently holds. To indicate how fresh or obsolete the routing information related to a given destination node is, this information is associated with a sequence number assigned by the destination node, which increments it at each topology change (in particular when a link breaks or appears as nodes move). A node always chooses the route with the highest sequence number and, if sequence numbers are equal, the one with the lowest number of hops.
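The selection rule stated at the end of this paragraph can be written down in a few lines. The sketch below is a simplified illustration: the names Route and consider are hypothetical, and details of the real protocol such as the handling of broken links or the timing of update broadcasts are ignored.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Route:
    next_hop: str   # neighbor that advertised the route
    hops: int       # distance to the destination through that neighbor
    seq: int        # sequence number assigned by the destination


def consider(table: Dict[str, Route], destination: str,
             advertised_by: str, hops: int, seq: int) -> bool:
    """Apply one advertised entry; return True if the routing table changed."""
    candidate = Route(advertised_by, hops + 1, seq)   # one more hop via the neighbor
    current = table.get(destination)
    if (current is None
            or candidate.seq > current.seq
            or (candidate.seq == current.seq and candidate.hops < current.hops)):
        table[destination] = candidate
        return True
    return False


table: Dict[str, Route] = {}
r1 = consider(table, "D", advertised_by="A", hops=2, seq=10)  # first route learned
r2 = consider(table, "D", advertised_by="B", hops=1, seq=10)  # same freshness, shorter
r3 = consider(table, "D", advertised_by="C", hops=0, seq=8)   # older information
r4 = consider(table, "D", advertised_by="C", hops=4, seq=12)  # fresher, although longer
```

The last call shows that freshness takes precedence over length: a longer route replaces a shorter one as soon as it carries a higher sequence number.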
9.3. Routing threat model in ad hoc networks
9.3.1. Ad hoc network characterization for security
Because of their particular characteristics, ad hoc networks have their share of routing security problems, among which is the ease with which a malicious node can capture traffic and compromise the consistency of network connectivity. In the absence of security on the physical layer, the “wireless” transmission mode of ad hoc networks makes it extremely easy to capture exchanged traffic by simply listening to the communication medium in promiscuous mode. Because of this, the exchanged traffic has no confidentiality: it can be heard, analyzed and reused to forge and inject packets that compromise network operations, particularly the exchange of routing control messages. In addition, a real difference between ad hoc networks and infrastructure networks comes from the fact that network nodes are terminals and routers simultaneously. In the absence of security mechanisms, a malicious node can, in its role as router, freely perform operations that do not comply with the specification of the routing protocol used, with the goal of compromising traffic transmission. The following elements can be added to these two major security problems. In their strictest definition, ad hoc networks are by nature totally spontaneous and are liable to contain nodes that do not know each other. They offer no guarantee concerning the nodes present and do not rely on any pre-existing infrastructure. This means in particular that the key management system implied by cryptographic security mechanisms should ideally be established without hindrance, which is not possible in reality.
In addition, the open architecture and highly dynamic nature of ad hoc networks, in particular the possibility for malicious nodes to join or leave the radio coverage zone at any time, lead to a node topology that can continuously reconfigure itself. Finally, resource constraints are potentially severe, whether in the processing capacity of some terminals (such as PDAs), energy autonomy (portable terminals) or bandwidth (available throughput over radio links). The rest of this section explains a few types of possible attacks on ad hoc routing protocols, using the classification proposed by Ning and Sun [NIN 05] and highlighting a few security risks that are specific to the ad hoc character of the networks studied here.
9.3.2. Classification of attack objectives
According to [NIN 05], the objective of an attack on ad hoc networks falls into one of the following four generic categories:
– route break: the attack aims either to interrupt an existing route or to prevent the creation of a new route. The particularly critical and hard to prevent “wormhole attack” belongs to the latter kind. Two colluding nodes that are not neighbors (they cannot reach each other directly because their radio transmission zones do not overlap) establish a private network connection between each other, which enables them to exchange packets “directly” by short-circuiting the multihop routing resources of the ad hoc network. Typically, these two malicious nodes relay RREQ route discovery messages and the corresponding RREP reply messages to each other “directly” through their private network connection, called a transmission tunnel, so that the resulting route is adopted even though it cannot be used to forward data packets. This discovered route includes a hop (presumed to be a single hop) between the two malicious nodes which, by definition, is impossible since they are not neighbors. Injecting routing control packets that lead to the creation of a loop is another form of this type of attack;
– route invasion: the attacking node intends to illegitimately insert itself as an intermediate (router) node on the packet transmission route between a source and a destination node. Once it has achieved its objective, the attacking node can listen to traffic at will, and even eliminate data packets;
– node isolation: the goal of this attack is to prevent any communication between a node and the rest of the network, whether sent or received. As an example, some cooperation incentive schemes establish a reputation value for each node, based in particular on recommendations received from network nodes; malicious nodes can spread false accusations of a presumed lack of cooperation in order to isolate a node;
– resource consumption: the attacker attempts to exhaust a resource in the broad sense (bandwidth, node processing capacity (CPU), energy, etc.), often in order to cause a denial of service.
9.3.3. Basic attacks and security counter measures
According to Ning and Sun [NIN 05], a malicious node builds its attacks on ad hoc routing protocols from four basic actions on routing control messages. These actions can be executed in isolation, repeatedly or in combination:
1) eliminating a routing control packet: the malicious node deletes a routing control message it has received when it is expected to retransmit it to its neighbors.
Cryptographic mechanisms are not intended to handle this type of problem, and new approaches have been developed to protect the network against this type of vulnerability in the context of data packet forwarding (see section 9.5). Some of these approaches analyze node behavior by listening in promiscuous mode in order to evaluate whether nodes actually cooperate in retransmitting packets;
2) modifying a routing control packet and forwarding it: the attacking node modifies a routing control packet in a way that does not comply with the routing protocol specification, and retransmits it to its neighbors. Modifying several types of routing control packet fields can affect routing security, including:
– the request identifier: associated with the IP address of the request initiator, it identifies the request uniquely and shows how recent the message is. If an attacking node increases it, the request appears more recent than the one actually transmitted by the real request initiator, and it can take priority by avoiding being discarded (a node only accepts the first copy of a request received several times and deletes the following ones),
– the number of hops: modifying it makes the route appear more or less optimal than it really is, which increases or decreases the chances of this route being adopted to forward data. The attacker thus either attracts traffic or, on the contrary, avoids having to carry traffic in the future,
– the sequence number: in AODV, increasing it triggers routing table updates so that (presumably) more recent information is adopted, because its sequence number is higher than the one already known to the network,
– the IP addresses of the request initiating and destination nodes. Modifying the originating address can, for example, prevent an RREP message from reaching the legitimate node,
– the description of the route under construction (deleting or adding one or more intermediate nodes).
To counter these illegal modifications, the security mechanisms normally used depend on whether or not the data varies as it is transmitted from node to node. Fields that do not vary during retransmission from intermediate node to intermediate node (request identifier, IP addresses of the end nodes, etc.) are generally authenticated by the message initiator with the help of traditional primitives such as a digital signature or an HMAC (keyed-hash message authentication code) calculation [KRA 97], enabling the message receiver to verify that no modification that violates the protocol has been made.
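As an illustration of how the non-variable fields can be protected, the sketch below computes and checks an HMAC over the initiator address, destination address and request identifier using Python's standard hmac and hashlib modules. The function names, the field encoding and the existence of a key shared beforehand between the initiator and the verifying node are assumptions made for the example; they do not describe any particular secure routing protocol.

```python
import hashlib
import hmac


def protect(key: bytes, initiator: str, destination: str, request_id: int) -> bytes:
    """Compute an authentication tag over the fields that must not change in transit."""
    message = f"{initiator}|{destination}|{request_id}".encode()
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, initiator: str, destination: str, request_id: int,
           tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = protect(key, initiator, destination, request_id)
    return hmac.compare_digest(expected, tag)


shared_key = b"key shared by S and D (assumed to be distributed beforehand)"
tag = protect(shared_key, "10.0.0.1", "10.0.0.9", 42)

untouched = verify(shared_key, "10.0.0.1", "10.0.0.9", 42, tag)
tampered = verify(shared_key, "10.0.0.1", "10.0.0.9", 43, tag)   # request identifier raised
```

An intermediate node that raises the request identifier cannot produce a matching tag without the key, so the receiver detects the modification.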
The protection of variable fields against deletion or fraudulent modification is the subject of more original propositions, detailed below. Some are based on recursive calculations in which each intermediate node takes as input data the calculation result (hashing or encryption) of the previous node, without being able to reverse that operation, and in which there is a way to verify the expected final value;
3) forging a response to a routing control packet: the node forges a counterfeit routing control packet in response to a received routing control packet;
4) spontaneously forging a routing control packet: in the absence of any request, the node forges a counterfeit routing control packet and sends it to the network. Applied to DSR or AODV, a simple example consists of spontaneously forging RERR packets to inform the network of a (presumed) broken link, preferably by spoofing the identity of a node adjacent to the supposedly broken link. Authentication of routing control message initiators is normally a first countermeasure to this type of action.
9.4. Routing security
This section reviews the main security propositions for routing protocols by describing their security objectives, the detail of the exchanges, and the purpose and principle of any specific mechanisms implemented. A few descriptions of attacks on some of these protocols illustrate the practical feasibility of the attack principles described earlier, so that the presentation is not limited to theoretical security considerations. With very few exceptions, the presentation follows the chronological order of publication.
9.4.1. SRP: secure routing for mobile ad hoc networks
The SRP (secure routing protocol) [PAP 02] offers a security extension which applies to the DSR protocol.
The security objectives considered by the authors are:
– protecting against attacks meant to disrupt the route discovery phase;
– guaranteeing the acquisition of accurate routing information with the help of mechanisms for avoiding, detecting and eliminating tampered reply messages;
– limiting vulnerability to denial of service attacks.
To this end, RREQ route request messages are subject to the following specific security mechanisms:
– the source maintains a request sequence number, incremented monotonically at each new route discovery phase, for each destination node with which it communicates. Included in the RREQ message, this sequence number enables the destination (and only the destination) to detect obsolete requests: those whose sequence number is lower than that of other requests issued by the same source;
– the source also includes in its RREQ messages a unique request identifier enabling intermediate nodes to detect multiple arrivals of a single request message (copies of a previously received request are eliminated). Since this identifier is random, a malicious node cannot predict request identifiers in order to forge and inject RREQs into the network that would make intermediate nodes delete a legitimate RREQ because it carries the same identifier as a (counterfeit) RREQ already received. Contrary to the basic DSR protocol, the mechanism for detecting obsolete requests (sequence number, checked by the destination node) is thus dissociated from the mechanism for verifying message uniqueness (request identifier, checked by intermediate nodes);
– finally, the source includes a MAC (message authentication code) computed over all non-variable RREQ fields with the help of a secret KS,T shared between source and destination. This MAC enables the destination to verify message integrity (absence of modification) and to authenticate its origin. However, the field listing the identifiers of the nodes on the route under construction, which is filled in throughout RREQ propagation, cannot be covered by this MAC because it is modified from hop to hop, hence the possible attack described below in this section.
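The role of the MAC in the last mechanism can be sketched as follows. This is a minimal illustration and not SRP's actual packet format: the field encoding, the function names `srp_request_mac` and `destination_accepts`, the use of HMAC-SHA256 and the rule that an equal sequence number counts as obsolete are all assumptions made for the example. It shows that the destination detects a change to a non-variable field, but not a change to the route list, which the MAC does not cover.

```python
import hashlib
import hmac


def srp_request_mac(key: bytes, source: str, target: str,
                    seq_no: int, request_id: bytes) -> bytes:
    """MAC over the non-variable RREQ fields only (hypothetical encoding)."""
    fixed = b"|".join([b"RREQ", source.encode(), target.encode(),
                       str(seq_no).encode(), request_id.hex().encode()])
    return hmac.new(key, fixed, hashlib.sha256).digest()


def destination_accepts(key: bytes, rreq: dict, last_seq_no: int) -> bool:
    """Destination-side check: fresh sequence number and valid MAC."""
    if rreq["seq_no"] <= last_seq_no:
        return False  # obsolete request (assumption: equal also counts as obsolete)
    expected = srp_request_mac(key, rreq["source"], rreq["target"],
                               rreq["seq_no"], rreq["request_id"])
    return hmac.compare_digest(expected, rreq["mac"])


k_st = b"secret shared by S and T"
request_id = bytes.fromhex("a1b2c3d4")  # would be random in practice
rreq = {"source": "S", "target": "T", "seq_no": 7,
        "request_id": request_id, "route": [],
        "mac": srp_request_mac(k_st, "S", "T", 7, request_id)}

# The route list grows hop by hop and is outside the MAC; "X" is a bogus entry.
rreq["route"] += ["A", "X", "B"]
route_tamper_ok = destination_accepts(k_st, rreq, last_seq_no=6)

# Changing a field covered by the MAC is detected.
tampered = dict(rreq, source="M")
field_tamper_ok = destination_accepts(k_st, tampered, last_seq_no=6)

# A request whose sequence number is not newer is rejected as obsolete.
replay_ok = destination_accepts(k_st, rreq, last_seq_no=7)
```

In this sketch the request carrying the bogus route entry is still accepted, which is the gap exploited by the attack on the route description discussed later in this section.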
The security principles implemented by the destination node are similar: sequence number and request identifier, as well as a MAC (verifiable by the source) computed over the reply message, which includes the description of the route discovered through RREQ propagation. Finally, to limit the risk of denial of service attacks by network flooding and to prevent nodes from monopolizing the communication medium, the authors of [PAP 02] propose that intermediate nodes measure the frequency at which their neighbor nodes emit or relay requests and grant each of them a processing priority that is inversely proportional to its message transmission rate. The drawback of this mechanism is that it encourages selfish nodes not to spend resources relaying requests on behalf of others, so as to display a low message transmission rate and thus acquire a high priority for their own traffic [HU 04]. In addition, this priority principle can be attacked by forging RREQ messages and injecting them into the network in the name of an intermediate node, usurping its identity, in order to make it appear to transmit many messages [HU 04].
SRP assumes the existence of a security association (SA) between the end nodes, source (S) and destination (T: target), between which the route discovery phase is executed, enabling the prior establishment of the shared secret KS,T used for MAC calculation.
ATTACK ILLUSTRATION 9.1.– [BUT 04] describes a possible attack against SRP, demonstrating that a malicious node can insert an arbitrary sequence of intermediate nodes in the description of a route being created, thus making the discovered route unusable. To do this, the malicious node intercepts route discovery packets (and, on the return trip, the corresponding reply packets) and injects erroneous information, namely a counterfeit sequence of identifiers, in the field describing the route under construction. The mechanisms described above cannot detect this type of compromise of an RREQ message as long as a matching compromise is applied to the corresponding RREP message.
9.4.2. Secure ad hoc on-demand distance vector (SAODV) routing
SAODV [ZAP 02, ZAP 05] is an extension to the AODV routing protocol offering protection against the following security threats during the route discovery phase:
– identity theft of message initiators (the source for RREQ messages and the destination for RREP messages);
– modification of the information content of routing control packets transmitted by initiators and from node to node;
– tampering by a malicious intermediate node that decreases the hop count metric in order to attract traffic to itself.
In addition, SAODV adopts the following authorization principle: the ultimate authority over route indications to reach a given destination node belongs to that node alone. This means that only route indications communicated by the destination node are taken into account. In this way, a malicious node can only lie about itself.
To do this, the mechanisms implemented are:
– authentication of all non-variable fields of routing control packets with the help of digital signatures calculated by packet initiators;
– authentication of the hop counter based on a hash chain. The message initiator generates and provides a starting value (a seed) as well as a final value. The latter is the result of applying a hash function to the seed a number of times equal to the time to live (TTL) of the IP packet (i.e. the total number of hops indicated in the TTL field of the IP packet header). Each intermediate node hashes the hash value transmitted by the previous node (the first intermediate node hashes the seed transmitted by the source) and transmits the result to the following node. To verify the hop counter, each intermediate node applies this same hash function to the hash value provided by the previous node a number of times equal to the number of remaining hops (total number of hops, TTL, minus the hop counter provided by the previous node) and checks that the result is equal to the final value calculated, authenticated and provided by the message initiator. If it is not, the hop counter is not in line with the number of times the seed has been hashed (i.e. the number of intermediate nodes crossed, according to the protocol). This happens when a malicious node decreases the hop counter in order to attract traffic, which is therefore detected by this counter authentication mechanism.
This information is added to the packet in the form of an extension called SAODV-SSE (single signature extension). SAODV also defines another extension called SAODV-DES (double signature extension), which allows intermediate nodes to reply to the source with a previously discovered route that is at least as recent as the request requires (in terms of sequence number), while respecting the authorization principle previously described.
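The hop counter authentication described above can be sketched as a hash chain. This is a simplified illustration under stated assumptions: SHA-256 stands in for the hash function, the message is a plain dictionary, and the names `initiator_fields`, `forward` and `verify` are hypothetical. The signature that protects the final value and the maximum hop count is not modeled.

```python
import hashlib
import os


def hash_n(value: bytes, times: int) -> bytes:
    """Apply the hash function `times` times in a row."""
    for _ in range(times):
        value = hashlib.sha256(value).digest()
    return value


def initiator_fields(max_hops: int) -> dict:
    """Fields set by the message initiator (top_hash would also be signed)."""
    seed = os.urandom(16)
    return {"hop_count": 0, "max_hops": max_hops,
            "hash": seed, "top_hash": hash_n(seed, max_hops)}


def verify(msg: dict) -> bool:
    """Hash the received value once per remaining hop and compare."""
    remaining = msg["max_hops"] - msg["hop_count"]
    return remaining >= 0 and hash_n(msg["hash"], remaining) == msg["top_hash"]


def forward(msg: dict) -> dict:
    """An honest intermediate node increments the counter and hashes once."""
    return dict(msg, hop_count=msg["hop_count"] + 1,
                hash=hash_n(msg["hash"], 1))


msg = initiator_fields(max_hops=8)
honest = forward(forward(forward(msg)))        # three honest hops
# A malicious node lowers the hop counter but cannot invert the hash.
cheated = dict(honest, hop_count=honest["hop_count"] - 1)

honest_ok = verify(honest)
cheated_ok = verify(cheated)
```

Because the hash cannot be reversed, the value carried by the cheated message has been hashed once too often for the hop counter it announces, and the comparison with the final value fails.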
In its reply, the intermediate node includes the signature that the destination node produced over the routing information during the previous discovery, which guarantees that the destination node is actually the one providing the information.
The assumptions on the key management system supporting SAODV are as follows. Each node has a private key/public key pair and there is a way for the public key to be known by all other nodes. In addition, there is a verifiable binding between a node's identity and its public key.
9.4.3. Ariadne
Ariadne [HU 02b] is a secure reactive source routing protocol whose principles are based on those of DSR.
Ariadne implements three main security mechanisms:
– a MAC computed with a key shared with the destination (and distributed beforehand) enables the destination node to authenticate the RREQ packet transmitted by the initiator;
– an authentication mechanism enables the destination node to authenticate each intermediate node whose identifier appears in the field describing the route being created. In this way, the destination node returns to the RREQ initiator, in the form of an RREP, a route containing only legitimate nodes. For this purpose, Ariadne offers a choice of three authentication techniques:
  - a MAC computed with a shared secret key. This mode is reserved for networks allowing prior distribution of a secret key between each pair of nodes;
  - a MAC based on the TESLA (timed efficient stream loss-tolerant authentication) protocol [PE 00] (see footnote 1). TESLA only uses symmetric cryptographic primitives;
  - a digital signature, reserved for networks whose nodes have high computing power;
– a method based on recursive hashing prevents any intermediate node from deleting the address of a previous intermediate node from the field listing the identifiers of the nodes on the route under construction. To do this, each node i creates and communicates a digest $h_i$, calculated by hashing the concatenation of its node identifier $\eta_i$ and the digest $h_{i-1}$ provided by the previous node i-1: $h_i = H[\eta_i, h_{i-1}]$. In the end, the destination recalculates the digest provided by the last intermediate node $\eta_n$ and must verify that it is equal to:
$$H[\eta_n, H[\eta_{n-1}, H[\ldots, H[\eta_1, \mathrm{MAC}_{K_{SD}}(S, D, id, \text{time interval})]\ldots]]]$$
where $\eta_n, \eta_{n-1}, \ldots, \eta_1$ is given by the field listing the identifiers of the route created, and the MAC is the one created by the request initiator, which the destination can recompute (the time interval corresponds to an estimated forwarding delay of the packet in the network).
1 TESLA is a broadcast authentication scheme based on MAC calculation with a group key that is communicated to the destinations only after the authenticated message has been sent, with a delay long enough to ensure that the key is not available in the network before the destinations have received the authenticated message. In this way, the (single use) key needed to verify the MAC cannot be intercepted and used to forge a MAC in the name of the node transmitting the TESLA key, since the destinations have already received the message when the key becomes available in the network.
In total, the route discovery exchanges between source node (S) and destination node (D) in Ariadne's TESLA version proceed as illustrated in Figure 9.1. The fields that change throughout packet transmission are the digest hi, the node list and the MAC list in requests, and the list of disclosed keys in replies. Mi is a MAC and K_i,ti is the TESLA key of node i, disclosed after transmission of the corresponding MAC.
S:       h0 = MAC_{K_SD}(REQUEST, S, D, id, ti)
S → *:   〈REQUEST, S, D, id, ti, h0, ( ), ( )〉
A:       h1 = H[A, h0]
         MA = MAC_{K_A,ti}(REQUEST, S, D, id, ti, h1, (A), ( ))
A → *:   〈REQUEST, S, D, id, ti, h1, (A), (MA)〉
B:       h2 = H[B, h1]
         MB = MAC_{K_B,ti}(REQUEST, S, D, id, ti, h2, (A, B), (MA))
B → *:   〈REQUEST, S, D, id, ti, h2, (A, B), (MA, MB)〉
C:       h3 = H[C, h2]
         MC = MAC_{K_C,ti}(REQUEST, S, D, id, ti, h3, (A, B, C), (MA, MB))
C → *:   〈REQUEST, S, D, id, ti, h3, (A, B, C), (MA, MB, MC)〉
D:       MD = MAC_{K_DS}(REPLY, D, S, ti, (A, B, C), (MA, MB, MC))
D → C:   〈REPLY, D, S, ti, (A, B, C), (MA, MB, MC), MD, ( )〉
C → B:   〈REPLY, D, S, ti, (A, B, C), (MA, MB, MC), MD, (K_C,ti)〉
B → A:   〈REPLY, D, S, ti, (A, B, C), (MA, MB, MC), MD, (K_C,ti, K_B,ti)〉
A → S:   〈REPLY, D, S, ti, (A, B, C), (MA, MB, MC), MD, (K_C,ti, K_B,ti, K_A,ti)〉
Figure 9.1. Route discovery with Ariadne, TESLA version [HU 02b]
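The recursive digest used by Ariadne to protect the node list can be sketched as follows. This is only an illustration of the hashing step: the per-node MACs and the TESLA key disclosure of Figure 9.1 are left out, HMAC-SHA256 and SHA-256 are assumed for the MAC and for H, and the function names and the "|" field separator are hypothetical.

```python
import hashlib
import hmac


def H(node_id: str, prev_digest: bytes) -> bytes:
    """Per-hop digest: hash of the node identifier and the previous digest."""
    return hashlib.sha256(node_id.encode() + b"|" + prev_digest).digest()


def initial_digest(k_sd: bytes, source: str, dest: str,
                   request_id: str, interval: int) -> bytes:
    """h0: MAC computed by the initiator with the key shared with D."""
    data = "|".join(["REQUEST", source, dest, request_id, str(interval)])
    return hmac.new(k_sd, data.encode(), hashlib.sha256).digest()


def destination_check(k_sd: bytes, source: str, dest: str, request_id: str,
                      interval: int, node_list: list, received: bytes) -> bool:
    """Recompute the chain from the node list and compare with the digest."""
    h = initial_digest(k_sd, source, dest, request_id, interval)
    for node in node_list:
        h = H(node, h)
    return hmac.compare_digest(h, received)


k_sd = b"key shared by S and D"
h = initial_digest(k_sd, "S", "D", "id-42", 3)
for node in ["A", "B", "C"]:      # each intermediate node extends the chain
    h = H(node, h)

intact_ok = destination_check(k_sd, "S", "D", "id-42", 3, ["A", "B", "C"], h)
# Removing B from the list without being able to recompute the chain fails.
shortened_ok = destination_check(k_sd, "S", "D", "id-42", 3, ["A", "C"], h)
```

The check fails for the shortened list because a node that only sees the latest digest cannot recompute the chain for a list from which an earlier node has been removed.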
ATTACK ILLUSTRATION 9.2.– [BUT 04] describes a possible attack against the version of Ariadne that uses digital signatures as its message authentication technique. [BUT 04] demonstrates that a malicious node N_M can add itself as an intermediate node on a route under construction, immediately after a legitimate intermediate node N_L of which it is not a neighbor (N_M pretends to be N_following(L) on a route to a destination node). The route thus defined is unusable because, by definition, a packet cannot cross the hop between the legitimate node N_L and the malicious node N_M, since they are not neighbors. To carry out this attack, the malicious node must be a neighbor of two specific nodes, from which it obtains the information needed to prepare the attack when these nodes broadcast the route request packet: N_M must be a neighbor of nodes N_preceding(L) and N_following(L), in other words the nodes located on the route under construction that respectively precede and follow node N_L. From N_preceding(L), it obtains the recursive digest produced by the series of nodes N_1 to N_preceding(L) (see footnote 2). From N_following(L), it obtains the appropriate digital signature calculated by N_L. From these two pieces of information, the malicious node can forge and forward a route request packet which will reach the destination node; the reply will be correctly returned to the initiator, which will believe it has received a legitimate route even though it contains the impossible hop N_L/N_M. This is a good illustration of the vulnerability of ad hoc networks: even when they are authenticated by each intermediate node, routing control packets can be diverted from their purpose because they are broadcast over a wireless medium, which allows malicious nodes to intercept and reuse them.
ATTACK ILLUSTRATION 9.3.– [ACS 05] describes a possible attack against the version of Ariadne that authenticates messages with MACs. This attack is also possible against the TESLA and digital signature versions of Ariadne. Two corrupted nodes collaborate to delete a sequence of intermediate nodes from the field describing the route under construction.
From then on, the route thus discovered is unusable by the initiating node since, by definition, a packet cannot cross a hop between two non-neighboring nodes (separated from each other by the deleted sequence of intermediate nodes). To get around the mechanism by which Ariadne prevents this type of deletion of intermediate node identifiers, the nodes proceed as follows. The first corrupted node places the digest of its own identifier, concatenated with the digest produced by the previous intermediate node, in the packet that it retransmits, instead of the MAC that it is supposed to create. The second corrupted node can then delete from the list of intermediate nodes the sequence separating it from the first corrupted node and recalculate the digest from the one provided by the first corrupted node and its own identifier, thus making the deletion of the intermediate node sequence undetectable. The return packet, which is not protected by the digest mechanism, is even simpler to forge since the first corrupted node only needs to delete the sequence of intermediate nodes. The MAC calculated by the destination and returned to the initiator is consistent with the description of the “amputated” route, since it has been created on the basis of a route description omitting the intermediate node sequence deleted by the second corrupted node. The route discovered is not only unusable by the initiator to reach the destination, but also forwards all traffic from the initiator to the corrupted node, which then obtains control over this data traffic.
2 This vulnerability, which makes it possible to change or remove a node identifier in the list describing the route under construction once the malicious node can capture an RREQ message that does not include the digest produced by the node that it wants to delete, is explained in [HU 02b].
9.4.4. ARAN: authenticated routing protocol for ad hoc networks
[SAN 02] proposes ARAN, which is based on AODV. ARAN offers a global security solution protecting against identity theft, illicit modification, replay and repudiation of control messages in the route discovery phase, through multiple and costly asymmetric cryptography operations (production and verification of digital signatures). Each message is signed by its initiator (the source for RDPs, route discovery packets, and the destination for REPs, reply packets) and each intermediate node signs the initial message created and signed by the initiator, after verifying (and removing) the signature created by the previous intermediate node. In short, the initiating nodes and each node involved in forwarding RDPs and REPs are authenticated and the integrity of the messages transmitted is guaranteed. Contrary to AODV, the RDP message does not contain any “hop count” field incremented at each hop. In fact, ARAN enables the discovery of the fastest route, the one that leads to a REP first, and not the shortest (in number of hops). RDP and REP routing control messages therefore contain no field that varies during their transmission.
The integrity guarantee offered by a digital signature can therefore apply to the whole message. ARAN assumes the existence of a certification authority, from which nodes obtain their certificate (see footnote 3) before taking part in the ad hoc network. ARAN is particularly vulnerable to denial of service attacks when the network is flooded with routing control packets (RDP and REP), because of the large amount of resources consumed by digital signature verification.
3 “Traditional” public key cryptography mechanisms are applied: certificates are signed by the authority and contain the node’s IP address as well as its public key, enabling other nodes to verify signatures created with the corresponding private key.
9.4.5. Secure dynamic source routing (SDSR)
[KAR 05] proposes a secure source routing protocol that meets the security objectives defined by its authors:
– route integrity is guaranteed. [KAR 05] explains that this integrity must be understood as a guarantee against any form of alteration or modification of the routing process that could lead to a denial of service, a black hole absorbing traffic or any other destructive attack;
– an anti-replay mechanism prevents delayed reuse of routing control packets;
– the intermediate nodes making up the route are authenticated;
– a session key is established between the source and destination to enable encryption of the data subsequently exchanged (data security).
To establish a route between source S and destination D through the intermediate node K1, route request (RREQ) and reply (RREP) messages are exchanged as illustrated in Figure 9.2.
1. Transmitted by source S:
   RREQ | S | D | ID | DHPKS | N1 | sigSKS | SR {S}
2. Transmitted by intermediate node K1:
   RREQ | S | D | ID | DHPKS | N2 | sigSKS | SR {S, K1}
3. Transmitted by destination D:
   RREP | S | D | ID | DHPKS | N2 | sigSKS | SR {S, K1, D} | sigSKD | DHPKD | ESKD(h(kSD))
4. Transmitted by intermediate node K1:
   RREP | S | D | ID | DHPKS | N2 | sigSKS | SR {S, K1, D} | sigSKD | DHPKD | ESKD(h(kSD)) | DHPK1 | ESK1(h(kS1))
Where:
– ID is a random request identifier;
– DHPKi (Diffie-Hellman public key) is the Diffie-Hellman public key of node i;
– N1 is a unique number created by S. N2 is created by K1, by encrypting N1;
– SR (source route) describes the route under construction;
– sigSKS and sigSKD are digital signatures created by source S and destination D respectively (SK: secret key);
– ESKi(h(kSi)) is the digest, encrypted by i, of the session key kSi shared by S and i.
Figure 9.2. Route discovery with SDSR
The source transmits a signed route request including the following two specific mechanisms:
– DHPKS is a public key transmitted by the source. Combined with DHPKD, a public key transmitted by the destination, it enables the destination and the source to jointly establish a session key according to the Diffie-Hellman key agreement method [RES 99]. This key is used to encrypt the data exchanged subsequently;
– Ni is the result of the successive encryption, by each node, of a unique random number N1 created by the source (symmetric encryption performed with a key known only to that node). During the reply phase, Ni is, conversely, successively decrypted hop by hop. If source S receives an N1 identical to the one it sent in its request message, the route followed during the request phase is identical to the one followed during the reply phase. Since the reply is signed (it cannot have been modified), the description of the route built during the request phase is then declared to be unaltered.
The RREP reply signed by the destination is transmitted hop by hop by the intermediate nodes, which add two specific fields to the reply:
– DHPKi is a public key provided by node Ki, used to create a common session key for encrypting the data subsequently exchanged between the source and node Ki;
– ESKi(h(kSi)) enables the source to authenticate each intermediate node.
ATTACK ILLUSTRATION 9.4.– This protocol is vulnerable to the following attack, which leads to an unusable route because it includes a hop between non-neighbor nodes. Let malicious node Km be a neighbor of two other nodes Kk and Kp, where Kk and Kp are not neighbors. Km receives an RREQ from Kk and retransmits it unchanged to Kp in the name of Kk, without adding its identifier to the route description (SR field) and without encrypting the unique random number provided by Kk. Km proceeds in the same way for the reply, sending an unchanged message to Kk in the name of Kp. In the end, the source receives a reply including a unique random number N1 identical to the one that it sent and concludes that the route received is valid, although it includes the impossible hop Kk/Kp. This amounts to a “wormhole” attack, without either of the tunnel’s end nodes being involved in the attack.
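The nonce mechanism and the attack above can be sketched together. This is a toy model under loud assumptions: `toy_cipher` is an XOR pad derived from the node key, used only as a stand-in for the symmetric cipher (it is not secure), signatures and Diffie-Hellman fields are omitted, and the `discover` function and its `silent` parameter are hypothetical names introduced for the illustration.

```python
import hashlib
import os


def toy_cipher(key: bytes, block: bytes) -> bytes:
    """XOR with a key-derived pad: a stand-in for a real symmetric cipher.

    XOR is its own inverse, so the same call encrypts and decrypts.
    """
    pad = hashlib.sha256(key).digest()[:len(block)]
    return bytes(a ^ b for a, b in zip(block, pad))


def discover(path, keys, silent=()):
    """Run the request and reply phases along `path` (intermediate nodes).

    Nodes listed in `silent` relay messages without touching the nonce
    or the SR field, as the malicious node Km does in the attack.
    Returns (nonce check result at the source, route description).
    """
    n1 = os.urandom(16)
    nonce, source_route = n1, ["S"]
    for node in path:                      # request phase: encrypt hop by hop
        if node in silent:
            continue
        nonce = toy_cipher(keys[node], nonce)
        source_route.append(node)
    source_route.append("D")
    for node in reversed(path):            # reply phase: decrypt hop by hop
        if node in silent:
            continue
        nonce = toy_cipher(keys[node], nonce)
    return nonce == n1, source_route


keys = {name: os.urandom(16) for name in ("Kk", "Km", "Kp")}
honest_ok, honest_route = discover(["Kk", "Km", "Kp"], keys)
attack_ok, attack_route = discover(["Kk", "Km", "Kp"], keys, silent={"Km"})
```

In the second run the source recovers its nonce and therefore accepts the route S, Kk, Kp, D, although packets physically had to pass through Km to get from Kk to Kp.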
9.4.6. EndairA
G. Acs et al. [ACS 05] propose a new reactive source routing protocol called endairA (Ariadne “in reverse”): endairA is inspired by the Ariadne protocol (see section 9.4.3) but differs in that it authenticates the RREP message, whereas Ariadne authenticates the RREQ message. According to [ACS 05], a secure routing protocol is a routing service which must never lead to the discovery of a non-existent route, in other words a route which does not make it possible to reach the intended destination node. For example, a route discovery phase that returns a route whose description includes two intermediate nodes presumed to be consecutive but which in reality are not neighbors is the result of a non-secure routing protocol. Such a route meets the definition of a non-existent route since, by definition, a packet cannot cross a hop between two nodes that are not neighbors of each other, i.e. not mutually reachable by radio transmission in one hop.
Figure 9.3 illustrates the RREQ and RREP exchanges of a route discovery phase, where S (source) is the initiator of the exchange, T the target, A and B intermediate nodes, id a request identifier, and sigA, sigB and sigT the signatures created by A, B and T. Each signature covers the fields preceding it (including the signatures of previous nodes).
S → *: (rreq; S; T; id; ())
A → *: (rreq; S; T; id; (A))
B → *: (rreq; S; T; id; (A;B))
T → B: (rrep; S; T; (A;B); (sigT))
B → A: (rrep; S; T; (A;B); (sigT; sigB))
A → S: (rrep; S; T; (A;B); (sigT; sigB; sigA))
Figure 9.3. Route discovery with endairA [ACS 05] (*: broadcast message)
To ensure protocol security, the RREP message is checked at each node on the route under construction. Each node verifies the signatures, the identifiers (its own identifier must be present in the route description) and the neighborhood (the identifiers preceding and following its own in the route description must correspond to neighbor nodes).
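The checks made on the reply at each node can be sketched as follows. This is an illustration under explicit assumptions, not the endairA specification: HMAC with a per-node key from a shared table stands in for digital signatures (so any holder of the table can "verify"), the topology in `NEIGHBORS` is invented for the example, and `sign` and `node_accepts` are hypothetical names.

```python
import hashlib
import hmac

# Assumed example topology S - A - B - T and stand-in signing keys.
KEYS = {name: f"key-of-{name}".encode() for name in "SABT"}
NEIGHBORS = {"S": {"A"}, "A": {"S", "B"}, "B": {"A", "T"}, "T": {"B"}}


def sign(node, source, target, route, prior_sigs):
    """Stand-in signature over the preceding fields and earlier signatures."""
    data = "|".join(["rrep", source, target, ";".join(route)]).encode()
    return hmac.new(KEYS[node], data + b"".join(prior_sigs),
                    hashlib.sha256).digest()


def node_accepts(node, source, target, route, sigs):
    """Checks made by `node` (an intermediate node or the source) on a reply."""
    path = [source] + list(route) + [target]
    if node not in path[:-1]:
        return False                       # own identifier must be present
    i = path.index(node)
    if i > 0 and path[i - 1] not in NEIGHBORS[node]:
        return False                       # previous identifier: a neighbor
    if path[i + 1] not in NEIGHBORS[node]:
        return False                       # next identifier: a neighbor
    signers = path[:i:-1]                  # nodes after `node`, target first
    if len(sigs) != len(signers):
        return False
    return all(hmac.compare_digest(sig, sign(s, source, target, route, sigs[:j]))
               for j, (s, sig) in enumerate(zip(signers, sigs)))


route = ["A", "B"]
sigs = [sign("T", "S", "T", route, [])]
b_ok = node_accepts("B", "S", "T", route, sigs)
sigs = sigs + [sign("B", "S", "T", route, sigs)]
a_ok = node_accepts("A", "S", "T", route, sigs)
sigs = sigs + [sign("A", "S", "T", route, sigs)]
s_ok = node_accepts("S", "S", "T", route, sigs)

# A reply claiming the route (B) reaches S: B is not a neighbor of S.
bad_route = ["B"]
bad_sigs = [sign("T", "S", "T", bad_route, [])]
bad_sigs = bad_sigs + [sign("B", "S", "T", bad_route, bad_sigs)]
bad_ok = node_accepts("S", "S", "T", bad_route, bad_sigs)
```

The last case is rejected by the neighborhood check alone, even though every signature it carries is valid for the claimed route.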
EndairA proposes additional extensions:
– an anti-replay mechanism consisting of copying the request identifier into the reply message;
– a countermeasure to request flooding, in which the initiator signs the initial route request message;
– a lighter version of the protocol limiting the number of digital signature verifications with the help of an identifier and an associated timer mechanism.
9.5. IP datagram forwarding security
Due to the possible presence of selfish and/or malicious nodes, it is necessary to ensure, on the one hand, that nodes correctly execute the IP datagram forwarding function and, on the other hand, that the effort devoted to this forwarding function is evenly distributed among the nodes. The rest of this section reports on propositions addressing this problem.
9.5.1. Monitoring-based techniques
9.5.1.1. Watchdog and pathrater
Marti et al. [MAR 00] propose mechanisms for detecting and avoiding nodes that fail to route packets, which apply to source routing protocols. The paper discusses the case of the DSR routing protocol and proposes two extensions to it: watchdog and pathrater. A watchdog runs on each node and checks, by promiscuous listening, whether the next node on the route actually forwards the packet. Each node on the route verifies that the next node retransmits its packets to the right destination, in compliance with the route description included in the packet header (because it is a source routing protocol). Pathrater uses this knowledge of nodes failing to route packets to choose the best route and avoid routes with faulty nodes. A pathrater runs on each node. It assigns a rating to each node that it knows and a metric to each route by averaging the ratings of the nodes it contains. When a pathrater learns of a node during a route discovery, it assigns it a neutral rating.
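The route metric computed by pathrater can be sketched as follows. The class name, the neutral rating of 0.5 and the penalty of -100 are assumptions chosen for the illustration; only the averaging of node ratings and the neutral rating given to newly discovered nodes come from the description above.

```python
NEUTRAL_RATING = 0.5        # assumed neutral value for newly learned nodes
MISBEHAVIOR_RATING = -100   # assumed penalty set after a watchdog report


class Pathrater:
    """Keeps one rating per known node and rates routes by averaging."""

    def __init__(self):
        self.ratings = {}

    def learn(self, node):
        """A node seen for the first time gets the neutral rating."""
        self.ratings.setdefault(node, NEUTRAL_RATING)

    def route_metric(self, route):
        """Metric of a route: average rating of the nodes it contains."""
        for node in route:
            self.learn(node)
        return sum(self.ratings[node] for node in route) / len(route)

    def best_route(self, routes):
        """Choose the route with the highest metric."""
        return max(routes, key=self.route_metric)


rater = Pathrater()
rater.ratings["B"] = MISBEHAVIOR_RATING     # B was reported by the watchdog
routes = [["A", "B"], ["C", "D", "E"]]
chosen = rater.best_route(routes)
```

With these values the shorter route through B has a strongly negative average, so the longer route through C, D and E is chosen.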
The rating of active nodes (those asked to forward packets) is increased at regular intervals if they forward packets correctly. A node’s rating is decreased if it is located on a route found to be unreachable. When a watchdog detects a node failing to route packets, the rating of that node is heavily penalized. Each node can recover a better rating over time. Inactive nodes (those not receiving packets to forward) see no change in their rating. A third mechanism triggers the transmission of additional RREQ packets when all known routes are suspected of containing a node that fails to route packets. This proposition has the following limitations:
– a selfish node can deliberately fail to route packets so as to be detected by these mechanisms and thus be classified as faulty. This takes it off the routes and enables it to save its resources;
– these mechanisms do not exclude nodes classified as faulty. These nodes can still send and receive packets. No punishment is associated with the detection of a faulty node, nor is there any incentive to cooperate.
Later propositions complete this first approach.
9.5.1.2. CORE: COllaborative REputation
Michiardi et al. [MIC 02] propose a generic cooperation enforcement mechanism called CORE (COllaborative REputation), based on a collaborative monitoring technique and applicable to different network functions, such as IP packet forwarding, route discovery and network management. This type of evaluation, carried out collectively by a group of nodes, is more reliable than the monitoring principle of [MAR 00], which is limited to the conclusions of an observation made only by the node located upstream from the monitored node. The parameter adopted by [MIC 02] to rate a node’s behavior is reputation, which reflects the level of trust that can be placed in a node to forward packets on behalf of other network nodes. Reputation is calculated from observations made in promiscuous mode by a watchdog, as in the proposition of [MAR 00]. Its value varies between –1 and +1. A good reputation (positive value) gives a node (said to be trustworthy) the privilege of using network resources for its own purposes, whereas a bad reputation (negative value) eventually tends to exclude the accused node (said to be dysfunctional).
Reputation as defined by [MIC 02] covers the following three components:
– subjective reputation, which a node calculates from its own observation of its neighbor nodes. Its calculation takes the history of events into account. Observations are weighted so that past observations have greater influence, in order not to penalize a node for recent sporadic misbehavior caused, for example, by a link break or lack of energy. Later, the opposite reasoning was defended by [REB 05], which gives more weight to recent observations to prevent a node that acquired a good reputation in the past from turning into a bottleneck with impunity because its recent (malicious) misbehavior is not taken into account;
– indirect reputation, which takes into account recommendations from neighbor nodes. Only positive recommendations are considered; accusations are ignored to avoid denials of service resulting from false accusations of honest nodes by malicious nodes;
– functional reputation, which is the result of a weighted calculation of a global value covering, for example, both the routing function and IP datagram forwarding.
In addition, reputation calculation includes two specific mechanisms. First, the positive reputation acquired by a node decreases over time. As a result, a node that cooperates correctly only when it needs network functions sees its reputation decrease when it sets its network connection to inactive (idle status) because it no longer needs them. Second, contrary to [MAR 00], there is an exclusion mechanism for dysfunctional nodes. A node asked to provide a service by a dysfunctional node (one with a negative reputation value according to the knowledge of the requested node) does not fulfill its IP transport requests. This isolates the dysfunctional node since the network no longer forwards packets on its behalf.
9.5.1.3. CONFIDANT: cooperation of nodes – fairness in dynamic ad hoc networks
CONFIDANT, presented by Buchegger et al. [BUC 02], is another cooperative protocol, similar to CORE, for detecting and isolating nodes that fail to forward packets (routing control and data). It applies to reactive source routing protocols and works in the same way, relying on listening to the medium to check that nodes forward packets correctly. The simulation tests presented use DSR.
The four components of CONFIDANT are:
– a neighbor node monitoring component;
– a trust manager responsible for transmitting and receiving alarms warning of the detection of a malicious node. Alarms are only transmitted to “friendly” nodes, which presumes mechanisms for creating and maintaining lists of “friendly” nodes; these mechanisms are not defined in [BUC 02];
– a reputation system that records the negative experience attributable to each node. The effect of “false” bad reputations (spread by malicious nodes) is reduced by allowing an excluded node to rejoin the network after a certain period of time if its behavior is judged to be satisfactory;
– a route manager responsible for all measures taken to avoid routes that include a malicious node, and for ignoring route requests that include or emanate from a malicious node. As with CORE, this last element makes it possible to exclude malicious nodes by not serving them.
9.5.1.4. SAFE: Securing pAcket Forwarding in an ad hoc nEtwork
Rebahi et al. [REB 05] propose mechanisms for the detection and avoidance of faulty nodes, called SAFE (Securing pAcket Forwarding in an ad hoc nEtwork), which apply to reactive routing protocols and combine incentives for cooperation with deterrence of fraud. As with CORE and CONFIDANT, the reputation of a node is evaluated globally and cooperatively by the nodes located in its direct neighborhood (only one hop away), which monitor in promiscuous mode that the supervised node correctly forwards the packets it receives. The calculation of reputation is a direct function of the percentage of retransmitted packets.
As indicated before, indirect evaluation has the drawback of allowing malicious nodes to falsely accuse honest nodes, and therefore potentially contribute to excluding them, by spreading false information about supposedly selfish behavior. To address this, SAFE provides a verification mechanism for when a node receives an abuse accusation from another node. The node queries the neighbors of the accused node to find out their opinion on the veracity of the accusation, and takes it into account when updating the reputation.
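The two SAFE ideas just described, a reputation derived from the percentage of retransmitted packets and the verification of an accusation with the neighbors of the accused node, can be sketched as follows. This is a minimal illustration and not the procedure of [REB 05]: the names `ForwardingRecord`, `accusation_confirmed` and `apply_accusation`, the majority rule, the `quorum` and `penalty` parameters and the default reputation of 1.0 are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ForwardingRecord:
    """Packets handed to a neighbor and packets overheard being retransmitted."""
    received: int = 0
    forwarded: int = 0

    def reputation(self) -> float:
        """Reputation as the fraction of packets that were retransmitted."""
        if self.received == 0:
            return 1.0  # assumption: no history yet, benefit of the doubt
        return self.forwarded / self.received


def accusation_confirmed(neighbor_opinions: list[bool], quorum: float = 0.5) -> bool:
    """True when more than `quorum` of the polled neighbors support the accusation.

    The majority rule is an assumption; the source only says neighbors are queried.
    """
    if not neighbor_opinions:
        return False  # nobody to ask: do not act on an unverified accusation
    support = sum(neighbor_opinions) / len(neighbor_opinions)
    return support > quorum


def apply_accusation(reputation: float, neighbor_opinions: list[bool],
                     penalty: float = 0.2) -> float:
    """Lower the reputation only if the accusation survives verification."""
    if accusation_confirmed(neighbor_opinions):
        return max(0.0, reputation - penalty)
    return reputation
```

With this sketch, a single malicious accuser cannot lower the reputation of an honest node unless most of the polled neighbors agree with it.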
Each node contains a SAFE agent which includes:
– a node monitoring component;
– a filter to extract the reputation recommendations transmitted by neighbor nodes in the SAFE packet headers;
– a reputation management component calculating the reputation level of each neighbor node;
– a safe holding reputation values, each associated with a TTL (Time To Live) that limits its validity over time.
Reputation calculation considers the history of node behavior in order to smooth out localized dysfunctions, which can be legitimate, by giving an increased weight to more recent observations, contrary to the CORE mechanism which favors the oldest events. Contrary to CORE and CONFIDANT, [REB 05] does not allow the exclusion of a malicious node from the network, but only avoids it during route choice.
9.5.1.5. Improvement propositions
The detection of selfish nodes by monitoring in promiscuous mode has limitations, since there are cases where the relay node correctly forwards packets without the monitoring node being able to hear it, which leads to a node being classified as selfish when it actually is not. As illustrated in Figure 9.4, node A may for example be unable to hear node B forward packet P1 to node C because of a collision with packet P2 coming from node D.
Kargl et al. [KAR 04] propose a mechanism to decrease the number of such false detections. A node that is silent at the moment when it is listened to in order to judge its action is not classified as selfish if it has already been silent for a while.
Figure 9.4. Collision
In this way, a node that moves to a zone out of radio range of a given node is no longer “audible” to the latter. Having been silent for a while (since it is absent from the radio transmission zone) from the point of view of its ex-neighbor, this node is not classified as selfish when the ex-neighbor, unaware of the topology change, sends it a packet and does not hear it being forwarded. This mechanism thus prevents the wrongful accusation of a node that has moved away and is no longer able to forward packets on behalf of its ex-neighbors.
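The silence rule attributed to [KAR 04] can be illustrated with a small sketch. It is only one possible reading of the rule: the `NeighborState` record, the `judge_forwarding` function and the `silence_threshold` value are hypothetical, and the collision case of Figure 9.4 is not modeled.

```python
from dataclasses import dataclass


@dataclass
class NeighborState:
    last_heard: float         # time at which the neighbor was last overheard
    selfish_reports: int = 0  # number of times the neighbor was judged selfish


def judge_forwarding(state: NeighborState, now: float, heard_forward: bool,
                     silence_threshold: float = 5.0) -> str:
    """Classify one forwarding observation of a neighbor.

    `silence_threshold` (in the same unit as `now`) is an assumed parameter:
    a neighbor unheard for at least that long is treated as gone, not selfish.
    """
    if heard_forward:
        state.last_heard = now
        return "forwarded"
    if now - state.last_heard >= silence_threshold:
        return "silent"  # probably out of range: no accusation is recorded
    state.selfish_reports += 1
    return "selfish"
```

A neighbor that was overheard recently and then fails to forward is counted as selfish, whereas one that has been quiet for longer than the threshold is simply considered to have left.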
In addition, to better avoid false accusations against an honest node, whether because it was not heard forwarding packets or because malicious nodes spread false accusations, Venkatesan et al. [VEN 05] propose, in a general way, a new element for judging a given node’s behavior: the observed reputation (Figure 9.5c). It evaluates a node (node B) by observing its packet exchanges with other nodes (nodes A and C), for exchanges that do not involve the observer node (node D) itself. Observed reputation complements the following two modes already found in the literature:
– direct reputation (Figure 9.5a), which derives from the conclusions of the observer node (node A) when it monitors the node with which it is exchanging packets (node B);
– recommended reputation (Figure 9.5b), derived from information from neighbor nodes (node D) monitoring the node (node B) with which the node receiving the recommendation (node A) is exchanging packets.
Figure 9.5. a) Direct reputation; b) recommended reputation; c) observed reputation
9.5.1.6. Summary
The common trait of these studies is the aim of improving the monitoring principle introduced by [MAR 00], in response to the impossibility of reliably judging a node’s behavior by promiscuous listening alone. We should not take for granted that approaches based on cooperative evaluation and on a node’s behavior history are sufficient to recognize legitimate unavailability, such as mobility or lack of terminal energy, or to prevent fraudulent exploitation of the reputation measurement.
Finally, comparing the performance of the different approaches remains difficult because there is no common evaluation environment, which should specify the total number of nodes, the percentage of selfish nodes, the behavior of these selfish nodes (frequency of packet drops, occasional or continuous), the mobility mode, etc. As [CON 04] highlights, is there not a risk of overloading cooperative nodes by totally avoiding nodes which are only selfish from time to time? Another detection approach for malicious nodes, based on packet acknowledgement, is presented in the following section.
9.5.2. Technique based on packet acknowledgement
These techniques presume that the destination node sends an acknowledgement (ACK) packet to the source for each data packet successfully received. If a valid ACK is not received within a given time, the algorithm presumes that the packet is lost; the protocol records the existence of a failure between source and destination and launches a search along the route in order to identify the faulty link.
The method proposed by [AWE 02] consists of a dichotomous search based on packet acknowledgement requests addressed to the middle node of the part suspected of being faulty. If the middle node correctly returns the expected acknowledgement, the part presumed faulty is the one located downstream of the middle node, i.e. between the middle node and the destination. Otherwise, the faulty part is the one located upstream of the middle node, i.e. between the source and the middle node. The search continues until the faulty part is reduced to only two nodes.
According to [KAR 04], when the presence of a selfish node is suspected, the source node repeatedly sends a probe command to the nodes located on the route leading to the destination, from the farthest from the source to the closest. The probe is encrypted in the packet header, is identifiable only by the node for which it is intended, and must be acknowledged by that node. When the source receives an acknowledgement of its probe packet, this means that it has detected either the first node located upstream of the selfish node or the selfish node itself because, by hypothesis, the source node does not receive an acknowledgement when it addresses nodes located downstream of the selfish node (which is supposed to destroy probe packets).
These detection methods based on packet acknowledgement seem to be ineffective when the malicious node acts intermittently, in other words when it does not systematically reject all acknowledgement request packets addressed to it during the search phase, which limits their benefit.
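The dichotomous search of [AWE 02] described in this section is essentially a binary search over the route. The sketch below is an illustration under simplifying assumptions: there is a single, persistent faulty link, the source always "acknowledges" and the destination does not, and `acks` is a hypothetical callback that reports whether a probed node returned a valid acknowledgement. The function name `find_faulty_link` is also an assumption.

```python
from typing import Callable, Sequence, Tuple


def find_faulty_link(route: Sequence[str],
                     acks: Callable[[str], bool]) -> Tuple[str, str]:
    """Narrow the suspected part of `route` down to two adjacent nodes.

    route: nodes from source (index 0) to destination (last index).
    acks:  hypothetical probe, True if the node returned the expected ACK.
    """
    if len(route) < 2:
        raise ValueError("a route needs at least a source and a destination")
    lo, hi = 0, len(route) - 1   # the fault lies somewhere between lo and hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if acks(route[mid]):
            lo = mid             # middle node answered: fault is downstream
        else:
            hi = mid             # no answer: fault is upstream of the middle
    return route[lo], route[hi]
```

On a route of n nodes this needs about log2(n) probes, but, as noted above, a node that drops packets only intermittently can mislead the search.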
9.5.3. Cooperative incentive techniques based on virtual money
[BUT 01] proposes a methodology that encourages ad hoc network nodes to participate in packet forwarding on the one hand, and avoids network overload on the other, by implementing a “trade” routing principle in which the nuglet is a virtual currency used to buy or pay for the IP transport service. In other words, forwarding packets on behalf of others enables a node to earn nuglets, which in turn allow it to pay the network to forward its own traffic. Two models coexist: the Packet Purse and Packet Trade models [PLA].
In the Packet Purse model, the originating node loads the packet to be forwarded with nuglets, and each intermediate node takes its due. Because the originating node pays, it is deterred from overloading the network.
In the Packet Trade model, each intermediate node buys the packet to transmit and resells it to the following node. The destination node is the one which finally pays for the arrival of packets. Contrary to the previous model, the source therefore does not need to know in advance the number of nuglets required to forward its packets. However, deterrence against network overload is no longer ensured.
This mechanism requires a secure model controlling the exchanges of nuglets between the nodes participating in packet transmission, which is a relatively strong constraint.
9.6. Conclusion
This chapter outlines the main ad hoc network security proposals by highlighting vulnerable elements and explaining the different mechanisms designed to cope with them.
The security of the route discovery phase is based on “traditional” cryptographic primitives and mechanisms, mainly intended to ensure the integrity and origin of routing control messages and thereby guarantee that operation complies with the routing protocol specifications.
IP datagram forwarding security requires more innovative approaches. Its main objective is to maximize and spread the effort made by each node to enable packet forwarding from the originating node to the destination node along the previously discovered route. In other words, the object is to detect malicious nodes which fail to transmit the packets addressed to them, with the purpose of at least avoiding them and ideally deterring and excluding them from the network. This aspect of security raises new problems that cryptographic mechanisms are not meant to handle, dealing with the measurement, implementation and incentive of trust between nodes. As illustrated in this chapter, the proposed solutions have a number of limitations in terms of robustness, performance or reliability, and the subject remains an active field of research to this day.
Among the new directions taken in ad hoc network research that are liable to become pertinent and profitable in the specific field of security, we can name the consideration of application contexts and their limited associated hypotheses, such as inter-vehicle communications. The integration of ad hoc networks into the Internet also creates new considerations, in which infrastructure networks would offer services, particularly security services, to the ad hoc part of the network. Considering potential complementarities between the layers of the TCP/IP model in order to offer more robust security solutions could also be a possible axis for progress in the field.
9.7. Acknowledgements
The author wishes to thank Jacques DEMERJIAN for his collaboration in identifying the works relevant to this field and in analyzing its state of the art. The author also thanks Didier GUERIN, Mohamed KASSI-LAHLOU, Mohamed BOUCADAIR and Mohammed ACHEMLAL for their perceptive comments and their contributions to the production of this chapter.
9.8. Bibliography
[ACS 05] ACS G., BUTTYAN L., VAJDA I., Provably Secure On-demand Source Routing in Mobile Ad Hoc Networks, technical report, March 2005.
[AWE 02] AWERBUCH B., HOLMER D., NITA-ROTARU C., RUBENS H., “An On-Demand Secure Routing Protocol Resilient to Byzantine Failures”, ACM Workshop on Wireless Security (WiSe), Atlanta, USA, September 2002.
[BEL 04] BELDING-ROYER E.M., “Routing Approaches in Mobile Ad Hoc Networks”, Mobile Ad Hoc Networking, IEEE, New Jersey, 2004.
[BUC 02] BUCHEGGER S., LE BOUDEC J.Y., “Performance Analysis of the CONFIDANT Protocol”, ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc), Lausanne, June 2002.
[BUT 01] BUTTYAN L., HUBAUX J., Nuglets: A Virtual Currency to Stimulate Cooperation in Self-Organized Ad Hoc Networks, technical report, EPFL, 2001.
[BUT 04] BUTTYAN L., VAJDA I., “Towards Provable Security for Ad Hoc Routing Protocols”, ACM Workshop on Security in Ad Hoc and Sensor Networks (SASN), October 2004.
[CLA 03] CLAUSEN T., JACQUET P., “Optimized Link State Routing Protocol (OLSR)”, IETF standard, RFC 3626, October 2003.
[CON 04] CONTI M., GREGORI E., MASELLI G., “Cooperation Issues in Mobile Ad Hoc Networks”, 24th International Conference on Distributed Computing Systems Workshops (ICDCSW’04), vol. 06, no. 6, p. 803-808, March 2004.
[HU 02a] HU Y., JOHNSON D.B., PERRIG A., “SEAD: Secure Efficient Distance Vector Routing for Mobile Wireless Ad Hoc Networks”, 4th IEEE Workshop on Mobile Computing Systems and Application, 2002.
[HU 02b] HU Y., PERRIG A., JOHNSON D., “Ariadne: A Secure On-Demand Routing Protocol for Ad Hoc Networks”, 8th ACM International Conference on Mobile Computing and Networking, 2002.
[HU 04] HU Y., PERRIG A., “A Survey of Secure Wireless Ad Hoc Routing”, IEEE Security & Privacy, 2004.
[JOH 04] JOHNSON D.B., MALTZ D.A., HU Y., “The Dynamic Source Routing Protocol for Mobile Ad Hoc Networks (DSR)”, Draft IETF, draft-ietf-manet-dsr-10.txt, July 2004.
[KAR 04] KARGL F., KLENK A., WEBER M., SCHLOTT S., “Advanced Detection of Selfish or Malicious Nodes in Ad Hoc Networks”, 1st European Workshop on Security in Ad-Hoc and Sensor Networks (ESAS 2004), August 2004.
[KAR 05] KARGL F., GEISS A., SCHLOTT S., WEBER M., “Secure Dynamic Source Routing”, HICSS-38, January 2005.
[KRA 97] KRAWCZYK H., BELLARE M., CANETTI R., “HMAC: Keyed-Hashing for Message Authentication”, RFC 2104, February 1997.
[MAR 00] MARTI S., GIULI T., LAI K., BAKER M., “Mitigating Routing Misbehavior in Mobile Ad Hoc Networks”, 6th annual ACM/IEEE International Conference on Mobile Computing and Networking, p. 255-265, 2000.
[MIC 02] MICHIARDI P., MOLVA R., “CORE: A COllaborative Reputation Mechanism to enforce node cooperation in Mobile Ad Hoc Networks”, Proceedings of Communication and Multimedia Security 2002 Conference, September 2002.
[NIN 05] NING P., SUN K., “How to Misuse AODV: a Case Study of Insider Attacks against Mobile Ad-Hoc Routing Protocols”, Journal of Ad Hoc Networks, Elsevier, Paris, November 2005.
[PAP 02] PAPADIMITRATOS P., HAAS Z.J., “Secure Routing for Mobile Ad hoc Networks”, Proceedings of the SCS Communication Networks and Distributed Systems Modeling and Simulation Conference (CNDS 2002), p. 27-31, San Antonio, TX, January 2002.
[PER 00] PERRIG A., CANETTI R., TYGAR J.D., SONG D., “Efficient Authentication and Signing of Multicast Streams over Lossy Channels”, IEEE Symposium on Security and Privacy, p. 56-57, May 2000.
[PER 03] PERKINS C., BELDING-ROYER E., DAS S., “Ad Hoc On-Demand Distance Vector (AODV) Routing”, IETF standard, RFC 3561, July 2003.
[PER 94] PERKINS C.E., BHAGWAT P., “Highly Dynamic Destination-Sequenced Distance-Vector Routing (DSDV) for Mobile Computers”, SIGCOMM’94: Computer Communications Review, vol. 24, no. 4, p. 234-244, October 1994.
[PLA] PLANETE Project, INRIA Sophia Antipolis, www.inrialpes.fr/planete/splash/splash_fiche.html.
[REB 05] REBAHI Y., MUJICA V., SIMONS C., SISALEM D., “SAFE: Securing pAcket Forwarding in ad hoc nEtworks”, 5th Workshop on Applications and Services in Wireless Networks, Paris, France, June-July 2005.
[RES 99] RESCORLA E., “Diffie-Hellman Key Agreement Method”, RFC 2631, IETF, June 1999.
[SAN 02] SANZGIRI K., LEVINE B.N., SHIELDS C., DAHILL B., BELDING-ROYER E.M., “A Secure Routing Protocol for Ad Hoc Networks”, Proceedings 10th IEEE Int’l Conf. Network Protocols (ICNP’02), p. 78-87, IEEE Press, November 2002.
[VEN 05] VENKATESAN B., VIJAY V., “Designing Secure Wireless Mobile Ad Hoc Networks”, 19th International Conference on Advanced Information Networking and Applications (AINA 2005), March 2005.
[ZAP 02] ZAPATA M.G., ASOKAN N., “Securing Ad Hoc Routing Protocols”, Proceedings ACM Workshop on Wireless Security (WiSe), p. 1-10, ACM Press, September 2002.
[ZAP 05] ZAPATA M.G., “Secure Ad Hoc On-Demand Distance Vector (SAODV) Routing”, Draft IETF, draft-guerrero-manet-saodv-04.txt, September 2005.
Chapter 10
Fault-Tolerant Distributed Algorithms for Scalable Systems
Chapter written by Sébastien TIXEUIL.
10.1. Introduction
Historically, the development of distributed systems was for the most part tied to several needs, such as communication between geographically remote entities, faster calculations made possible by increased resources, and better system reliability obtained through the redundancy of core components. Several economic factors also had some influence, such as the fact that, for a given number of users, several simple machines cost less than a single complex machine.
The study of distributed systems and of the algorithms specific to them, distributed algorithms, helps in understanding what distinguishes these systems from classic centralized systems: information is local (each element of the system only holds a fraction of the information and must obtain more by communicating with other elements), and time is local (the elements of the system can run their instructions at different speeds). These two factors result in a non-deterministic system: two consecutive executions of the same distributed system are likely to be different. The fact that certain elements of the system can become faulty increases this non-determinism and the difficulty of predicting the overall system’s behavior.
When the number of components in a distributed system increases, the possibility for one or several of these components to become faulty also increases. When the production costs of these components are reduced to achieve economies of scale, the rate of potential defects increases again. Finally, when the system’s components are deployed in an environment that is not necessarily controlled, the risk of faults occurring becomes impossible to overlook. In sensor networks, these three major factors are combined, meaning that faults will inevitably occur over the system’s lifetime. This makes it important to maximize the system’s useful lifetime, that is, the time during which the system provides useful results, since faults can have significant repercussions for the application being run on the system. These repercussions depend on the extent of the fault.
In this chapter, we discuss distributed algorithms in large-scale systems, in other words distributed systems that can comprise tens (or hundreds) of thousands of elementary machines (computers, sensors, etc.). To be more precise, we only deal with fault-tolerant distributed algorithms. Section 10.2 describes how distributed algorithms affect the design of ad hoc and sensor networks. In section 10.3, we present the taxonomy of faults that may occur and the traditional methods for solving the related problems in distributed systems (leaving aside the large-scale aspect). Section 10.4 shows that scalability in fault-tolerance is compromised by the hypotheses that fault-tolerant algorithms usually rely on. Section 10.5 then presents several leads for the use of certain fault-tolerant techniques (derived from self-stabilization) in a large-scale context.
10.2. Distributed algorithms and wireless communications
Distributed algorithms constitute the core of the protocols used in actual networks. They are therefore involved in the process long before the implementation phase, and can be used to determine, among several methods, those that are impossible to implement, those that will be costly, and those that may be efficient. For a general overview of the relations between distributed algorithms and “classical” network protocols, see [JOH 04]; the rest of this section focuses on the specific features of wireless communications.
In the layered network model commonly accepted for wireless networks [AKY 02], distributed algorithms are involved in the four highest layers: data link, network, transport and application.
For the data link layer, and more specifically media access control (MAC), several types of protocols can be used. The most widespread in the context of wireless networks are CSMA (Carrier Sense Multiple Access) [WOO 01], TDMA (Time Division Multiple Access) and FDMA (Frequency Division Multiple Access) [SHI 01]. In all cases, the main objective is to allow access to the medium despite problems that compromise the network’s performance (latency, throughput) or the energy used for communicating (crucial in sensor networks). Collisions are the major problem. They occur when neighboring nodes simultaneously use the radio medium to transmit information; receiving nodes may then receive a jammed or unusable signal. In sensor networks, other problems may arise, for instance when receiving a signal is almost as costly (in terms of the energy consumed) as waiting for the reception of a signal; on the algorithmic level, this practical limitation leads to techniques that offer a compromise between latency and electrical consumption for communicating within the network [SOH 00].
By their nature, techniques related to TDMA and FDMA involve coloring the nodes or the edges of a graph. Because the wireless networks we are considering have to be self-organized, this coloring cannot be predefined before deploying the system, and has to be the result of an algorithm run by the system itself: a distributed algorithm. Current distributed solutions to graph coloring problems show that there are theoretical bounds on the quality of the coloring obtained, depending on the locality of the distributed algorithm (this locality is directly related to the energy consumed). Furthermore, TDMA requires the clocks of the system’s nodes to be synchronized, which may require the use of distributed clock synchronization algorithms.
Other distributed algorithms based on solutions to graph problems may be useful in the network layer: for example, it is possible to construct an energy-efficient infrastructure by self-organizing the network in a hierarchical manner or by determining a sub-network with specific features. In those cases, the graph problems considered are most often related to the concept of dominating sets (sets of nodes that can communicate with all of the other nodes in the graph), the objective being to make this set as small and/or efficient as possible [KUH 03, MOS 05].
The two upper layers (transport and application) take advantage of distributed algorithms indirectly.
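To make the link between TDMA and graph coloring concrete, the sketch below assigns a time slot (a color) to each node so that no two conflicting nodes share a slot. It is deliberately a centralized, sequential greedy coloring used only for illustration, not one of the distributed algorithms the text refers to; the function name `greedy_slot_assignment` and the representation of the conflict graph as a dictionary of neighbor sets are assumptions.

```python
from typing import Dict, Set


def greedy_slot_assignment(conflicts: Dict[str, Set[str]]) -> Dict[str, int]:
    """Give each node the smallest slot not used by an already-colored neighbor.

    conflicts: symmetric conflict graph (node -> nodes it must not share a
    slot with). Nodes are processed by decreasing degree, a common heuristic.
    """
    slots: Dict[str, int] = {}
    order = sorted(conflicts, key=lambda n: (-len(conflicts[n]), n))
    for node in order:
        taken = {slots[nb] for nb in conflicts[node] if nb in slots}
        slot = 0
        while slot in taken:
            slot += 1
        slots[node] = slot
    return slots
```

Greedy coloring never uses more than one slot more than the maximum number of conflicts of any node, but it offers no guarantee of using the minimum number of slots.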
For the transport layer, it seems that the UDP and TCP protocols currently used on the Internet, and based on a best effort principle, are satisfactory for wireless communications as well. For the application layer, many new protocols that are specific to sensor networks have appeared, including [AKY 02]:
– sensor management: this layer makes the lower layers transparent, and offers for example services for initialization, deployment, localization, synchronization, etc;
– assigning tasks and gathering data: this layer allows the user to indicate which data has to be collected by which sensor, and has an influence on the lower layers, in particular the network layer (routing and self-organization);
– data requests and dissemination: this layer allows the user to specify how the data is aggregated by means of requests written in a language similar to SQL (Structured Query Language), but which also includes events. These events, in [SHE 01], are receive (an action performed upon the reception of a message by the node), every (an action regularly performed following the expiry of a timer) and expire (a particular action performed following the expiry of a timer).
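The three event types listed above can be pictured with a toy event loop. This sketch does not reproduce the query language of [SHE 01]; the `SensorNode` class, its method names and the simulated clock are hypothetical and serve only to show how receive, every and expire differ.

```python
import heapq
from typing import Callable, List, Tuple


class SensorNode:
    """Toy node with a simulated clock and three kinds of event handlers."""

    def __init__(self) -> None:
        self.now = 0.0
        # Heap entries: (due time, sequence number, period, action).
        self._timers: List[Tuple[float, int, float, Callable[[], None]]] = []
        self._seq = 0  # tie-breaker so callables are never compared
        self.on_receive: Callable[[str], None] = lambda msg: None

    def _schedule(self, delay: float, period: float,
                  action: Callable[[], None]) -> None:
        heapq.heappush(self._timers, (self.now + delay, self._seq, period, action))
        self._seq += 1

    def every(self, period: float, action: Callable[[], None]) -> None:
        """Run `action` each time a periodic timer expires."""
        self._schedule(period, period, action)

    def expire(self, delay: float, action: Callable[[], None]) -> None:
        """Run `action` once, when a one-shot timer expires."""
        self._schedule(delay, 0.0, action)

    def receive(self, msg: str) -> None:
        """Run the receive handler for an incoming message."""
        self.on_receive(msg)

    def run_until(self, deadline: float) -> None:
        """Fire all timers due up to `deadline`, in time order."""
        while self._timers and self._timers[0][0] <= deadline:
            due, _, period, action = heapq.heappop(self._timers)
            self.now = due
            action()
            if period > 0:
                self._schedule(period, period, action)
        self.now = deadline
```

A node could, for instance, register a periodic sampling action with `every`, a one-shot reporting deadline with `expire`, and an aggregation step in `on_receive`.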
These new application layers, which are specific to these new types of networks, can, however, rely on the basic building blocks provided by the distributed algorithms that already exist: leader election, mutual exclusion, resource management, information dissemination, etc.
10.3. Fault-tolerant distributed algorithms
Traditionally, a distributed system is represented by a graph, in which the nodes are the system’s machines (or sensors) and the edges represent the ability of two machines to communicate. Thus, two machines are connected if they are capable of communicating information to one another (using a network connection, for example). In some cases, the edges of the graph are oriented so as to represent the fact that communication can only take place one way (for example, wireless communication from a satellite to an antenna on the ground). From now on, we will use the words machine, sensor, node or process interchangeably, depending on the context.
10.3.1. Fault taxonomy in distributed systems
A first criterion for classifying faults in distributed systems is localization in time. Usually, three types of possible faults are distinguished:
– transient faults: faults that are arbitrary in nature can strike the system, but there is a point in the execution beyond which these faults no longer occur;
– permanent faults: faults that are arbitrary in nature can strike the system, but there is a point in the execution beyond which these faults always occur;
– intermittent faults: faults that are arbitrary in nature can strike the system at any moment in the execution.
Transient faults and permanent faults are, of course, specific cases of intermittent faults. However, when intermittent faults rarely occur, a system that tolerates only transient faults can still be useful, because its useful lifespan can be long enough. A second criterion is the nature of the faults.
An element of the distributed system can be represented by an automaton, whose states represent the possible values of the element’s variables, and whose transitions represent the code run by the element. We can then distinguish the following faults depending on whether they involve the state or the code of the element:
– state-related faults: changes in an element’s variables may be caused by disturbances in the environment (electromagnetic waves, for example), attacks (buffer overflow, for example) or simply failures of the equipment used. For example, some variables may take values that they are not supposed to have if the system is running normally;
– code-related faults: an arbitrary change in an element’s code is most often the result of an attack (for example, the replacement of an element by a malicious opponent), but certain, less serious types correspond to bugs or to a difficulty in handling the load. There are several sub-categories of code-related faults:
- crashes: at a given moment during the execution, an element stops its execution permanently and no longer performs any action;
- omissions: at different moments during the execution, an element may omit to communicate with the other elements of the system, either in transmission or in reception;
- duplications: at different moments during the execution, an element may perform an action several times, even though its code states that this action must be performed once;
- desequencing: at different moments during the execution, an element may perform the right actions, but in the wrong order;
- Byzantine faults: these simply correspond to an arbitrary type of fault, and are therefore the faults that cause the most harm.
Crashes are included in omissions (an element that no longer communicates is perceived by the rest of the system as an element that has ended its execution). Omissions are trivially included in Byzantine faults. Duplications and desequencing are also included in Byzantine faults, but are generally regarded as behaviors strictly related to communication capabilities. Figure 10.1 sums up the inclusion relations between the different types of faults.
Figure 10.1. Fault taxonomy in distributed systems (labels: intermittent faults, transient faults, permanent faults, Byzantine faults, omissions, crash faults, duplications, desequencing)
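The inclusion relations stated in this section can be written down as a small directed graph and queried. The sketch below only encodes the relations given in the text; the dictionary `INCLUDED_IN` and the function `is_special_case_of` are hypothetical names chosen for the illustration.

```python
from typing import Dict, Set

# Each fault type maps to the more general types that directly include it,
# as stated in the text (time criterion and nature criterion kept separate).
INCLUDED_IN: Dict[str, Set[str]] = {
    "crash": {"omission"},
    "omission": {"byzantine"},
    "duplication": {"byzantine"},
    "desequencing": {"byzantine"},
    "byzantine": set(),
    "transient": {"intermittent"},
    "permanent": {"intermittent"},
    "intermittent": set(),
}


def is_special_case_of(fault: str, general: str) -> bool:
    """True if `fault` equals `general` or is included in it, even indirectly."""
    if fault == general:
        return True
    return any(is_special_case_of(parent, general)
               for parent in INCLUDED_IN[fault])
```

Read this way, an algorithm designed to tolerate a general class of faults (Byzantine faults, for example) also tolerates every class included in it, such as crashes.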
10.3.2. Fault-tolerant algorithm categories
When faults occur on one or several of the elements that make up a distributed system, it is essential to be able to deal with them. If a system tolerates no fault whatsoever, the failure of a single element can compromise the execution of the entire system: this is the case for a system in which one entity has a central role (such as the DNS). In order to preserve the system’s useful lifespan, several ad hoc methods have been developed, which are usually specific to a particular type of fault likely to occur in the system in question. However, these solutions can be categorized depending on whether or not the effect of faults is visible to an observer (a user, for example). A blocking solution hides the occurrence of faults from the observer, whereas a non-blocking solution does not: the effect of faults is visible over a certain period of time, after which the system resumes behaving properly.
A blocking approach may seem preferable at first, since it applies to a greater number of applications. Using a non-blocking approach to regulate air traffic would make collisions possible following the occurrence of faults. However, a blocking solution is usually more costly (in resources and in time) than a non-blocking solution, and can only tolerate faults that have been anticipated. For problems such as routing, where being unable to transport information for a few moments does not have catastrophic consequences, a non-blocking approach is perfectly well suited.
Two major categories of fault-tolerant algorithms can be distinguished:
– robust algorithms: these use redundancy at several levels (information, communications or the system’s nodes) in order to cover faults to the extent that the rest of the code can safely be executed. They usually rely on the hypothesis that a limited number of faults will strike the system, so as to preserve at least a majority of correct elements (sometimes more if the faults are more severe). Typically, these are blocking algorithms;
– self-stabilizing algorithms: these rely on the hypothesis that the faults are transient (in other words, limited in time), but do not set constraints regarding the extent of the faults (which may involve all of the system’s elements). An algorithm is self-stabilizing [DIJ 74] if it manages, in a finite time, to exhibit an appropriate behavior independently of the initial state of its elements, meaning that the variables of the elements may start from a state that is arbitrary (and impossible to reach by running the application normally). Typically, self-stabilizing algorithms are non-blocking, because between the moment when the faults cease and the moment when the system has stabilized to an appropriate behavior, the execution may turn out to be somewhat erratic.
Figure 10.2. The different classes of fault-tolerant distributed algorithms (labels: extent in space, extent in time, self-stabilizing algorithms, robust algorithms, non-blocking, blocking)
Robust algorithms are quite close to the intuitive idea of fault-tolerance. If an element is susceptible to faults, then each element is replaced with three identical elements; each time an action is undertaken, it is performed three times, once by each of the elements, and the action actually undertaken is the one that corresponds to the majority of the three individual actions.
Self-stabilization would seem to be related more to the concept of convergence in mathematics or control theory, where the objective is to reach a fixed point regardless of the initial position; the fixed point corresponds here to an appropriate execution. Being capable of starting from an arbitrary state may seem odd (since it would seem that the initial states of the elements are always well known), but studies [VAR 00] have shown that if a distributed system is subjected to stop-and-restart node failures (a definite failure followed by a reinitialization), and communications cannot be totally reliable (some communications may be lost, duplicated or desequenced), then an arbitrary state of the system can actually be reached. Even if the probability of the execution that leads to this arbitrary state is negligible in normal conditions, it is not impossible for an attack on the system to attempt to reproduce such an execution. In any case, and regardless of what led the system to this arbitrary state, a self-stabilizing algorithm is capable of providing an appropriate behavior in a finite amount of time. In fact, self-stabilizing distributed algorithms are found in a number of protocols used in computer networks [JOH 04].
Figure 10.2 sums up the relative capabilities of robust algorithms and self-stabilizing algorithms. Bear in mind that none of these classes can be developed further using the basic hypotheses: a self-stabilizing algorithm cannot tolerate failures that occur continuously, and a robust algorithm generally cannot tolerate highly extended failures. As a consequence, no general solution to continuous or extended failures can exist.
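The triple replication with majority voting described above for robust algorithms can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names `vote` and `replicated_call`, and the two toy elements `correct` and `faulty`, are hypothetical and do not come from the chapter, and a faulty element is modeled simply as one that returns a wrong value.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence


def vote(results: Sequence[Hashable]) -> Hashable:
    """Return the value produced by a strict majority of the replicas."""
    value, count = Counter(results).most_common(1)[0]
    if 2 * count <= len(results):
        # Assumption: without a strict majority, the fault cannot be hidden.
        raise RuntimeError("no majority: too many replicas are faulty")
    return value


def replicated_call(replicas: Sequence[Callable[[int], int]], x: int) -> int:
    """Perform the same action on every replica and keep the majority result."""
    return vote([replica(x) for replica in replicas])


def correct(x: int) -> int:
    """Hypothetical fault-free element."""
    return x + 1


def faulty(x: int) -> int:
    """Hypothetical element struck by a fault: it returns a wrong value."""
    return -1


# One faulty element out of three is hidden from the observer.
print(replicated_call([correct, faulty, correct], 41))  # prints 42
```

With three replicas, a single faulty element is outvoted; two faulty elements that disagree with each other and with the correct one make `vote` raise, which reflects the hypothesis that a majority of correct elements must be preserved.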
10.4. The limits and problems caused by a large-scale system
10.4.1. Hypotheses about the system
Robust algorithms generally rely on hypotheses that lose their relevance in large-scale systems, among which we have:
– complete communications: in many robust algorithms, a node is capable of talking to any other node, despite faults in other nodes. When modeling the communication capabilities as a graph, this is equivalent to considering that the graph is complete. In a local network where all of the machines are directly connected, this hypothesis is valid, but in a system comprised of tens (or even hundreds) of thousands of machines, it becomes inefficient at best (the latency for traveling through the system increases), and unrealistic at worst (communication that passes through defective nodes can no longer take place);
– global communications: most of the existing solutions require, for each phase (the total number of phases depending on the number of faults that we wish to be able to tolerate), a number of communications that is quadratic in the size of the system, which compromises their scalability. This is because, over the course of a phase, a node typically sends a message to each of the other nodes of the system;
– synchronous communications: a fundamental result in the literature on robust distributed algorithms [FIS 85] states that, even when considering a problem that seems simple (consensus, where all of the nodes of the system must agree on a value suggested by at least one of them), and considering a very simple model for describing faults (a single crash fault may occur), it is impossible to solve the problem in an asynchronous environment (where the relative speeds of the system’s nodes are not bounded).
This result stems from the fact, first, that the decision procedure may, in certain executions, depend on the decision communicated by a single node in the system and, second, that in a completely asynchronous system, it is impossible to tell the difference between a node that has failed and a very slow one. A node that has suffered a crash fault will never communicate again, whereas a very slow node will eventually send its message. If the other nodes make a decision on the assumption that this node is faulty while it is in fact merely very slow, then its own decision (made before its message is sent) may be the opposite of the one made by the other nodes of the system. This impossibility result has led research toward synchronous or partially synchronous environments (where there are bounds, which may or may not be known by the nodes themselves, on the relative speeds of the system’s nodes). One way of formalizing the synchronicity hypotheses that are assumed is to use the concept of the failure detector. Such a detector is a distributed oracle that the system’s nodes can query for information about the faulty nodes. The more synchronous the system, the more reliable the detector, and the easier it is to find a fault-tolerant solution to a problem. Conversely, the stronger the synchronicity hypotheses, the more difficult it is to justify an increase in the number of nodes in the system. A classification of the different failure detectors can be found in [GÄR 01].
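To make the notion of a failure detector more concrete, here is a minimal sketch of one common way to approximate such an oracle in a partially synchronous system: heartbeats with timeouts. The chapter does not prescribe this construction; the class `HeartbeatDetector`, its methods, and the policy of doubling the timeout after a wrong suspicion are assumptions made for illustration, and time is passed in explicitly so that the example stays deterministic.

```python
from typing import Dict, Set


class HeartbeatDetector:
    """Timeout-based failure detector kept by one node (illustrative sketch)."""

    def __init__(self, peers: Set[str], timeout: float) -> None:
        self.timeout: Dict[str, float] = {p: timeout for p in peers}
        self.last_seen: Dict[str, float] = {p: 0.0 for p in peers}
        self.suspected: Set[str] = set()

    def on_heartbeat(self, peer: str, now: float) -> None:
        """Record a heartbeat; a wrongly suspected peer gets a longer timeout."""
        self.last_seen[peer] = now
        if peer in self.suspected:
            self.suspected.discard(peer)
            self.timeout[peer] *= 2  # the peer was slow, not crashed

    def suspects(self, now: float) -> Set[str]:
        """Return the peers whose last heartbeat is older than their timeout."""
        for peer, seen in self.last_seen.items():
            if now - seen > self.timeout[peer]:
                self.suspected.add(peer)
        return set(self.suspected)


fd = HeartbeatDetector({"a", "b"}, timeout=1.0)
fd.on_heartbeat("b", now=0.5)
fd.on_heartbeat("a", now=1.4)
print(fd.suspects(now=2.0))    # {'b'}: silent for 1.5 time units
fd.on_heartbeat("b", now=2.1)  # "b" was only slow
print(fd.suspects(now=2.2))    # set()
```

The wrong suspicion of "b" shows the difficulty stated above: from the outside, a slow node and a crashed node look the same until a message arrives. If the speed bounds of the system hold, enlarging the timeout after each mistake eventually stops such false suspicions.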
For the most part, real systems are at least partially synchronous (which is equivalent to saying that there is a bound, known or unknown, on the ratios between the relative speeds of the system’s nodes). On the other hand, the complete and global communications hypotheses are too strong to be applied to large-scale systems.
In the context of self-stabilization, the hypotheses made for the system generally do not include conditions on the completeness or the globality of the communications, unlike those made for robust algorithms. Many algorithms run on systems with nodes that only communicate locally. However, several hypotheses may be crucial for the algorithm to run properly; they concern the atomicity of the communications and the scheduling of the system:
– atomicity of the communications: most of the self-stabilizing algorithms discussed in the literature use communication primitives with a high level of atomicity. At least three historic models are found in the literature:
- the state model (or shared memory model): in one atomic step, a node can read the state of each of the neighboring nodes, and update its own state;
- the shared register model: in one atomic step, a node can read the state of one of its neighboring nodes, or update its own state, but not both simultaneously;
- the message passing model: this is the classic model for distributed algorithms, for which in one atomic step, a node sends a message to one of the neighboring nodes, or receives a message from one of the neighboring nodes, but not both simultaneously.
With the recent study of the self-stabilization property in wireless ad hoc and sensor networks, several models for local diffusion with potential collisions have appeared. In the model that presents the highest degree of atomicity [KUL 03], a node can, in one atomic step, read its own state and partially write the state of each of the neighboring nodes.
If two nodes simultaneously write the state of a common neighbor, a collision occurs and none of the information is written. A more realistic model [HER 04] consists of defining two distinct and atomic actions for local diffusion on one hand and the reception of a locally diffused message on the other. In the case of bidirectional communications, it is possible to simulate a model using another model. For example, [DOL 00] shows how to transform the shared memory model into a message passing model. In the models that are specific to wireless networks, [KUL 04] shows how to transform the local diffusion model with collisions into a shared memory model; in a similar fashion [HER 03] shows that the model in [HER 04] can be transformed into a shared memory model. There are two problems with these transformations: – the transformation uses up resources (time, memory, energy in the case of sensors), which could be avoided using a direct solution in the model closest to the considered system;
Figure 10.3. Self-stabilizing node coloring (executions A and B)
– the transformation is only possible in systems with bidirectional communications: this is due to the fact that acknowledgments have to be sent regularly to ensure that the highest level model is properly simulated. – spatial scheduling: historically, self-stabilizing algorithms relied on the hypothesis that two neighboring nodes cannot execute their codes simultaneously. This makes it possible to break problems of symmetry in certain configurations. Usually, three main possibilities are distinguished for scheduling, depending on which constraints are wanted: - central scheduling: at a given moment, only one of the system’s nodes can run its code; - global (or synchronous) scheduling: at a given moment, all of the system’s nodes run their codes; - distributed scheduling: at a given moment, an arbitrary subset of the system’s nodes runs its code. This type of spatial scheduling is the most realistic. Other variations are also possible (for example, a locally central scheduling: at a given moment, in each neighborhood, only one of the nodes executes its code), but in practice, they are equivalent to one of the three models above (see [TIX 00]). The more constrained the spatial scheduling model is, the easier it is to solve problems. For example, [ANG 80] shows that it is impossible to color an arbitrary graph in a distributed and deterministic fashion (see execution A in Figure 10.3). On the other
hand, [GRA 00] shows that if the spatial scheduling is locally central, then such a solution is possible (see execution B in Figure 10.3). Some algorithms, which rely on the hypothesis of one of these models, can be run in another model, at the price of a greater consumption of resources, as before. Because the most general model is the distributed model, it may be transformed into a more constrained model using a mutual exclusion algorithm [DIJ 74] (for the central model), or a synchronization algorithm [ALI 98] (for the global model);
– temporal scheduling: the first self-stabilizing algorithms [DIJ 74] were independent of the concept of time, that is, they were written in a purely asynchronous model, where no hypothesis is stated regarding the relative speeds of the system’s nodes. Later on, scheduling models with heavier constraints began to appear, particularly for the description of real systems. Schedulings are usually divided into three main types:
- arbitrary scheduling: no hypothesis is made regarding the respective execution properties of the system’s nodes, other than simple progress (at each moment, at least one node executes some actions);
- fair scheduling: each node runs local actions infinitely often;
- bounded scheduling: between two actions executed by the same node, each of the other nodes executes a bounded number of actions.
Bounded scheduling can be constrained further in order to obtain synchronous (or global) scheduling. As with the variations on the previous models, there are algorithms for transforming the execution from one model to another. For example, alternators [GOU 97, JOH 99] can be used to construct a bounded model based on a fair or arbitrary model. On the other hand, because of its unbounded nature, the strict fair model cannot be constructed from the arbitrary model.
10.4.2. Hypotheses on the applications
In the context of self-stabilization, depending on the problem that we wish to solve, the minimum time required for going back to a correct configuration varies significantly. Problems are generally divided into two categories:
– static problems: we wish to perform a task that consists of calculating a function that depends on the system in which it is assessed. For example, it can consist of coloring the nodes of a network so as to never have two adjacent nodes with the same color; another example is the calculation of the shortest paths to a destination;
– dynamic problems: we wish to perform a task that provides a service to other algorithms. Model transformation protocols such as token passing fall into this category.
From the scalability perspective, the crucial issue rests in the locality of the problem’s definition. For example, coloring is a local problem: if each node is locally assured that each of its neighbors has a color different from its own, then all of the system’s nodes are also assured of this property. On the other hand, the problem of
Figure 10.4. Self-stabilizing tree construction
finding a tree leading to a destination is not a local problem: each node (except for the destination) simply has a pointer towards one of its neighbors (its parent in the tree), but has no way of knowing whether the general structure induced by the neighbors chosen in this fashion is, indeed, a tree towards the destination (it may just as well be an isolated root and a ring oriented between the other nodes). For dynamic tasks, the token passing problem between the nodes of a network is also a global problem: if a node does not detect a token in its immediate neighborhood, it cannot draw the conclusion that there are none in the network; if it owns one, it has no way of knowing whether there is another one. Global problems lead to performance problems when we start focusing on scalability. For example, Figure 10.4 shows two configurations of a self-stabilizing algorithm for constructing trees of the shortest paths to a destination. Between the two configurations, only the weight of one edge (shown in bold) has changed. However, this single modification led to a change of parents for half the nodes in the network. In a dual fashion, Figure 10.5 shows an initial configuration of a self-stabilizing algorithm of mutual exclusion by token passing in an anonymous ring (the identifiers are only placed so that the nodes can be identified in the text). When a process owns the token (something determined solely by its neighborhood), it is shown in black. The goal of this algorithm is to guarantee that, in a finite time, a single token navigates through the network. For reasons of system symmetry, it is not possible to conceive, in this context, an algorithm that would eliminate the token placed in n/2 + 1 but not the one placed in 1. The only way to decrease the number of tokens consists of moving them along the ring so as to regroup them. 
Because the distance that initially separates two tokens is proportional to n, the network’s size, such a movement requires a number of nodes of the same order to perform at least one action. For wireless networks of modest sizes (such as ad hoc networks comprised of a few dozen nodes), it remains possible to consider solving global problems. For both static and dynamic problems, it is possible to come up with optimal solutions:
– Many static global problems are equivalent to constructing a tree (or a forest), according to a particular metric. Depending on the case, the result is a breadth-first or depth-first spanning tree, or one that minimizes (or maximizes) one or several particular criteria. The distributed algorithm can then be reduced to the execution, on each
Figure 10.5. Mutual exclusion by self-stabilizing token passing
node, of an operator specific to the problem at hand [DUC 98]. When this operator satisfies certain properties [DUC 01, DUC 03, DEL 05], the derived algorithm is self-stabilizing. This method then makes it much easier to produce self-stabilizing solutions: it is generic (depending on the operator, the communication graph, the scheduling) and makes it possible, while writing the proof, to only verify that the expected properties are satisfied by the operator. Furthermore, this solution can be used in networks where communications are not reliable (losses, duplications, desequencing) and therefore solves in a uniform way certain problems that are specific to wireless networks (collisions, etc.).
– The “standard” self-stabilizing algorithm for dynamic problems (and the first published algorithm) is the mutual exclusion algorithm on a unidirectional ring. In practice, several criteria need to be taken into account: the stabilization time, the service time (maximum time between two token passings on a given node), the memory used (in bits) and the transparency with respect to underlying communication algorithms. A probabilistic algorithm that is optimal [DUC 04] for all of these criteria can be simply expressed as follows:
- each node has a state with two possible values, normal and speed reducer;
- a normal node that receives a token passes it on to its successor in the ring;
- a speed reducer node that receives a token keeps the token;
- at each unit of time (the system is assumed to be synchronous), each normal node becomes a speed reducer with a probability of $\frac{1}{n(n-1)}$; otherwise (if it is a speed reducer) it becomes normal with a probability of $\frac{1}{n}$.
On average, over n synchronous steps, there is only one speed reducer, and it remains a speed reducer for n steps. Then, all of the tokens end up on the same speed reducer node, and are fused together over the n steps. After that, a token travels around the ring in 2n steps on average.
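The four rules of the speed reducer algorithm can be simulated with a few lines of code. The sketch below is an illustration only, not the algorithm of [DUC 04] as published: the function names `step` and `simulate` are hypothetical, tokens are represented by the set of node indices that hold one (so that tokens meeting on a node are fused), token moves are applied before state changes within a step, and all nodes start in the normal state. The last three points are assumptions, since the text does not fix them.

```python
import random
from typing import List, Set, Tuple


def step(n: int, reducer: List[bool], tokens: Set[int],
         rng: random.Random) -> Tuple[List[bool], Set[int]]:
    """Run one synchronous step on a unidirectional ring of n >= 2 nodes."""
    # A normal node forwards its token to its successor; a speed reducer keeps it.
    # Storing positions in a set fuses tokens that meet on the same node.
    moved = {i if reducer[i] else (i + 1) % n for i in tokens}
    # normal -> speed reducer with probability 1/(n(n-1));
    # speed reducer -> normal with probability 1/n.
    flipped = [
        (rng.random() >= 1 / n) if r else (rng.random() < 1 / (n * (n - 1)))
        for r in reducer
    ]
    return flipped, moved


def simulate(n: int, tokens: Set[int], steps: int, seed: int = 0) -> List[int]:
    """Return the number of tokens after each step (all nodes start normal)."""
    rng = random.Random(seed)
    reducer = [False] * n
    history = []
    for _ in range(steps):
        reducer, tokens = step(n, reducer, tokens, rng)
        history.append(len(tokens))
    return history


# Two tokens on opposite sides of a ring of 8 nodes.
history = simulate(n=8, tokens={0, 4}, steps=2000)
print(history[0], history[-1])
```

Since every token either stays or moves by one position, the number of tokens can never increase. It decreases only when a speed reducer holds one token long enough for another to catch up, which is the regrouping mechanism described above.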
10.5. Solutions for large-scale self-stabilization
Self-stabilization, in its original form, is not suitable for large-scale systems. However, by restricting certain aspects of the basic definition, it is possible to preserve fault-tolerant properties that are useful in large-scale systems. These restrictions consist of limiting the extent or the nature of the faults that are considered (in order to allow a rapid return to a normal state), or the type of problems that are to be solved. For certain methods, the self-stabilization property is maintained, and additional properties that are useful in large-scale systems are added.
10.5.1. Restricting the nature of the faults
10.5.1.1. Detecting and correcting errors
The simplest way of adding the self-stabilization property to a system is to use a mechanism for detecting and correcting errors. The historical approaches [KAT 93, AWE 94] are global (at least one of the nodes receives information from each of the others, or sends information to each of the others) and are not suitable for large-scale systems. However, several recent approaches can be considered for such systems:
– localized detectors and correctors: not all tasks are equivalent when trying to detect whether memory corruption has occurred. For example, in order to realize that a coloring of the nodes is incorrect, a node simply has to take a look at the colors of its neighbors and compare them with its own. Each node that detects a conflict can request a correction action. On the other hand, in order to ensure that a network’s orientation is without cycles, it may be necessary to look at a distance proportional to the size of the system (see [BEA 98]). Of course, a particular algorithm can include additional variables in order to allow a quicker detection (for example, the distance to the root when constructing a tree: each node is then able to verify that its parent in the tree is indeed at a distance smaller than its own). The local stabilizer in [AFE 02] is based on this principle. Parallel to the normal execution of the system, the nodes watch the state of their neighborhoods, at a distance that depends on the problem and the algorithm used. This surveillance makes it possible to detect certain memory corruptions, and to trigger adequate correction operations. The correction phase uses the redundancy of the information used for detection: each node has a copy of the states of each of its neighbors up to a certain distance k.
This redundancy restricts the technique to algorithms in which this distance is small, since the memory and the associated exchanges increase exponentially with this distance;
– probably correct detectors and correctors: this approach consists of considering that truly arbitrary memory corruptions are highly unlikely. Probabilistic arguments are used to establish that, in general, memory corruptions that result from faults can be detected using traditional techniques from information theory, such as data redundancy or error detection and correction codes. In particular, in [HER 00], error detection codes are used to determine, with a high probability, that memory corruption has occurred. Although the article only considers the case where a single corruption arises (in other words, only one node in the system is affected by this corruption), it makes it possible to return to a normal behavior in a single correction step. For a system, even a large-scale one, where memory corruptions are localized in each neighborhood and are not malicious (that is, they can be detected using techniques such as cyclic redundancy checks), this approach is well suited. A similar approach was adopted recently in [HUA 05]. Here is the basic principle: each variable of the original algorithm is associated with k variables (3 in the article mentioned). Then, for each access to a particular variable (meaning, in fact, a set of k variables representing a logical variable), a parity function is used to distinguish visible corruptions from absences of corruption (or invisible corruptions). A visible corruption corresponds to a detection by the parity code. Each time a visible corruption is detected, a majority function is used bit by bit to restore the value of the variable before the corruption occurred. The flaw in this approach is that it increases the memory and processing for a particular node by a factor k. On the other hand, the simplicity of its implementation makes it easy to apply to nodes with little power (such as the nodes of sensor networks).
10.5.1.2. Preservation of predicates
A self-stabilizing algorithm does not have to be initialized.
Also, when the parameters or the environment change, it is capable of adapting to the new context without the need for writing specific code for dealing with cases that were not expected when designing the system. This general approach to fault-tolerance and adaptivity is undoubtedly a strong point of self-stabilization, but in a large-scale system, the dynamic aspects and the tendencies to unexpected changes in the environment are more likely to occur than arbitrary corruptions of the memories of the system’s nodes. Several recent studies in the field of self-stabilization focus on more robust solutions than the merely stabilizing solutions in strongly dynamic contexts. At their core, these algorithms are self-stabilizing and therefore enjoy the resulting properties. Furthermore, they preserve a local predicate when certain particular changes occur. In a way, they guarantee certain properties when restricted (but potentially frequent) failures occur, and simply guarantee self-stabilization when arbitrary failures occur. Several complementary approaches can be listed: – super-stabilization: this property (defined in [DOL 97]) states that a superstabilizing algorithm is self-stabilizing on one hand and, on the other, preserves a predicate (typically a safety predicate) when changes in topology occur in a legitimate
configuration. Thus, changes in topology are limited: if these changes occur during the stabilization phase, the system may never stabilize. On the other hand, if they occur only after a correct global state is achieved, the system remains stable. This property is strictly stronger than the self-stabilization property: in a self-stabilizing system, changes in topology would be interpreted as transient failures (the nodes do not have in memory an accurate vision of their neighborhood), and no safety guarantee could be made if, starting with a proper global state, such a change in topology were to occur;
– self-stabilization and unreliable communications: this concerns algorithms that are both self-stabilizing and tolerant to link faults (losses, duplications, desequencing; see section 10.3.1). If link faults are transient (or intermittent but occurring rarely), “simple” self-stabilization makes it possible to return to a normal state. On the other hand, if these faults occur intermittently but regularly, a proper behavior for the system is no longer guaranteed. Note first of all that for this problem to have a solution, these link faults cannot be completely arbitrary:
- losses: if a channel can lose all of the messages that travel through it, no non-trivial problem can be solved. We therefore state as a hypothesis that the losses are fair, meaning that if a node sends infinitely many messages on the adjacent link, the link delivers infinitely many messages to the node located on the other end of the link. Of course, the link can still lose infinitely many messages this way;
- duplications: if a channel can infinitely duplicate a message that travels through this channel, then no non-trivial problem can be solved in a self-stabilizing way.
Indeed, if we consider that, due to a transient failure, the communication links contain erroneous messages, these messages can be duplicated and delivered in infinite amounts to adjacent nodes, and indefinitely compromise the system’s integrity. Thus, we also state as a hypothesis that a same message can only be duplicated a finite (but not necessarily bounded) number of times.
With these hypotheses (fair losses, finite duplication, arbitrary desequencing), there are several solutions that remain self-stabilizing. In particular, this means that the link failures can occur during the stabilization phase, but also during the stabilized phase. Thus, starting from a legitimate configuration, the losses, duplications and desequencing that may occur have no impact on the correctness of the system (they are therefore hidden from the user of the system). In [DEL 02], a solution to the census problem (finding all of the nodes of a system and their respective positions) presents the characteristics mentioned before. In [DEL 05], a generic solution (that is, one that can be parameterized for different metrics) is given for the problem of forest construction. These two solutions share some characteristics regarding resistance to different types of failures in communication links. In order to cope with losses, each node regularly retransmits its last message. In order to deal with duplications, the algorithm used satisfies the idempotence property (that is, receiving a message several times in a row leaves the receiving node in the same state as receiving it once). Finally, for desequencing, the algorithm deals with each message independently (which leads [DEL 02] to use large-sized messages);
Figure 10.6. Loop-free routing in a dynamic environment (six successive configurations, A to F, of a message m routed towards the destination d)
– loop-free self-stabilizing routing: the predicates we wish to preserve here are specific to routing protocols. The major advantage of constructing routing tables using a distributed algorithm is to be able to update them dynamically, in particular while the network is in use. Now, when a routing table is modified locally on a node, the path of a particular message traveling through this node is also modified. If no particular precautions are taken, logical loops can occur at a given moment during an update when following the path set by the routing tables. These logical loops increase the number of hops that a particular message has to perform and, if this message has a limited lifetime, can lead to the message being deleted. Loop-free routing guarantees that, at any moment during the modification of the routing tables, there is no logical loop in the system. In [COB 02], a self-stabilizing loop-free routing algorithm is presented. However, in a highly dynamic system, even a loop-free routing algorithm may be insufficient. In Figure 10.6, a possible execution of a loop-free routing algorithm is presented, with a message that has to be sent to the destination. The changes in costs on the communication links lead to several updates of the routing tables. At each moment,
a particular node uses its local routing table, and at each moment, no logical loop exists until the destination. However, the dynamic nature of the network causes a logical loop to be constructed over time, preventing the message from reaching the destination. In [JOH 03], a loop-free self-stabilizing algorithm that preserves routes is presented. The route preservation property means that if a tree is initially constructed towards a destination, any message transmitted towards that destination reaches it in a finite time. The general technique for achieving this result is comprised of two phases: – before changing its parent in the tree leading it to the destination, a node checks all of its children to ensure that this change will not cause a loop; – modifications of the routing table have a lower priority than the transmission of messages, meaning that a message traveling towards the root will always have its distance from the root decrease (for a particular metric). In this way, if there are changes in the weights of the links after a tree has been constructed, the messages will always reach the destination. If, moreover, there are no changes in the weights of the links over a certain period of time, then the system converges to a tree of shortest paths (always according to a particular metric) to the destination. 10.5.2. Limiting the geographic extent of faults In its traditional definition, self-stabilization does not set any constraints on the number of faults that can strike a system. This general description is not always verified for large-scale systems, such as a sensor network or a large ad hoc network (several hundreds of nodes). This is a result of the large number of nodes, which makes it very likely that a large majority of them will function properly. On the other hand, it is just as likely that over the entire course of the execution, several of them may go through intermittent faults. 
If we assume that the faults that can occur only ever concern a very small part of the network, it is possible to design algorithms that converge more quickly than traditional self-stabilizing algorithms. In order to have a formal framework, we consider that the distance to a legitimate configuration is equal to the number of nodes whose memories have to be changed in order to achieve a legitimate configuration (as with a Hamming distance). Of course, it is possible that even if we are at a distance k from a legitimate configuration, more than k nodes have, in fact, corrupted memories. From the perspective of returning to a normal state, only the closest legitimate configuration is considered. Studies that attempt to minimize the stabilization time in the context where few faults occur usually divide stabilization into two levels: – “visible” stabilization: here, only the output variables of the algorithm are involved. The output variables are typically used by the system’s user. For example,
if we consider a tree construction algorithm, only the pointer oriented toward the parent node is included in the output variables;
– “internal” stabilization: here, all of the algorithm’s variables are involved. This type of stabilization corresponds to the traditional concept of self-stabilization.
In many studies, only the “visible” stabilization is performed quickly (that is, in a time that depends on the number of faults that strike the system, rather than on the size of said system), while the “internal” stabilization most of the time remains proportional to the network’s size. Algorithms that present this constraint are not capable of tolerating a high frequency of faults. Consider an algorithm whose time for visible stabilization depends on k (the number of faults) and whose internal stabilization depends on n (the size of the system). Now, if a new fault occurs once the visible stabilization has been performed, but before the internal stabilization has completed, this can lead to a global state containing a number of faults greater than k, and there is no longer any guarantee on the new time for visible stabilization.
10.5.2.1. k-stabilization
k-stabilization is defined as self-stabilization, when restricting the starting configurations to those configurations that are at a distance of at most k from a legitimate configuration. Because of the less hostile environment, it is possible to solve problems that are impossible in the case of general self-stabilization, and to offer reduced visible stabilization times, even for global tasks such as those described in section 10.4.2. For example, the token passing problem is known [ISR 90] to be impossible to solve in a way that is self-stabilizing, anonymous (the nodes cannot be distinguished), and deterministic when the communication graph is a unidirectional ring (a node can only receive information from the neighbor to its left, and can only send information to the neighbor on its right).
The main argument in the proof of this impossibility is as follows: consider a configuration where the system of n nodes contains a single token and is stabilized. In Figure 10.7, the token is located on the node whose state is e2. We then construct a new system of size 2n that reproduces the states of the processors in a symmetric way (meaning that node i is in the same local state as node i + n). There are two tokens in this new system, and if we execute the codes of the nodes using asynchronous scheduling, the two tokens last forever (just as the first system of size n makes its single token last forever). In [GEN 01], the assumption is that no more than k < nc faults strike the system (where c is a small constant); in that case, the token passing problem can be solved in a deterministic and k-stabilizing fashion. The basic idea is to give each token a speed. This speed is proportional to the number of correct nodes that precede the token (this number is calculated by way of a variable that estimates the distance to the next token). That way, a token whose k predecessors are correct will have the maximum speed, whereas a token whose k predecessors are not all correct will have
Figure 10.7. Impossibility of mutual exclusion in a uniform ring
a reduced speed, and the correct token will catch up with it. The visible stabilization time of this algorithm has complexity O(k).

10.5.2.2. Time-adaptive self-stabilization

For certain problems, a memory corruption can cause a cascade of corrections in the entire system [AWE 94], yet it would be natural for the stabilization to be quicker when the number of failures that strike the system is smaller. This is the principle behind time-adaptive self-stabilization, also known as scalable stabilization and fault-local stabilization.

Time-adaptive self-stabilization was first studied for static tasks. One task in particular, the persistent bit, is the subject of [KUT 99a, KUT 99b]: the idea is to tolerate the corruption of a bit on k nodes, when k is not known to any of the nodes. Both methods gather information from the nodes and then perform a majority vote to establish the actual value. In [KUT 99b], the visible stabilization time has complexity O(k log(n)), for k ≤ O(n/log(n)). In [KUT 99a], if the number of faults is smaller than n/2, then the visible stabilization time has complexity O(k). These two algorithms assume asynchronous scheduling.

Still in regards to static tasks, [GHO 96] suggests an algorithm that transforms a self-stabilizing algorithm A for a static task into a new algorithm A′, which is also self-stabilizing, but with a constant visible stabilization time if k is equal to 1. Its internal stabilization has complexity O(T × D), where T is the stabilization time of A and D is the network’s diameter. The algorithm in [GHO 96] runs with asynchronous scheduling. Another transformation is presented in [GHO 02]: it adds stabilization properties to a non-stabilizing algorithm for a static problem, in the case where the number of faults is significantly smaller than the size of the network.
Figure 10.8. Time adaptivity for dynamic problems
However, the resulting complexities strongly depend on the distribution of the faults that strike the system: the best results are obtained when the k faults are contiguous (the stabilization time then has complexity O(k^3)), but the performance degrades (exponentially in k) when the faults are arbitrarily located. The case of dynamic tasks is more difficult to deal with in the context of time-adaptivity [GEN 02]. Consider for example the case of a single fault appearing in the token passing problem in a network. In a proper run (without faults), the token propagates through the network, and because of the local nature of information exchanges, the propagation of the token can only occur from one neighborhood to the next (see execution A in Figure 10.8, where the node indicated by J carries the token, and where the gray nodes are the only ones capable of taking action). However, when a fault occurs at the other end of the network (see execution B in Figure 10.8, where the node indicated by F is faulty), it can only be corrected in its neighborhood (because of the local nature of information exchanges). Even with a bounded scheduling hypothesis, it may be that the actions correcting the fault only take place after a time proportional to the size of the network (and are not dependent on the number of faults that strike the
Figure 10.9. Classification of problems for k-stabilization and time-adaptivity
network). This case does not occur with static tasks: if a fault occurs in a legitimate configuration, only the neighborhood of the fault is able to take action to correct it. This result implies that, to obtain time-adaptive algorithms for dynamic tasks, it is necessary to assume synchronous scheduling (or to assume that, in a set time interval referred to as a round, all of the nodes that are able to act do so). Several solutions to the token passing problem in a ring have been suggested in this context [BEA 99, GEN 00, GEN 01]. They either rely on synchronous scheduling ([GEN 00, GEN 01]) or measure the visible stabilization time in rounds ([BEA 99]). The algorithm in [BEA 99] has stabilization time complexity O(k^2), that in [GEN 00] O(k), and that in [GEN 01] O(f), where f is the effective number of faults that strike the network (as opposed to k, the bound on the maximum number of faults that can be tolerated).

10.5.3. Classification

It is possible to arrange self-stabilizing, k-stabilizing and time-adaptive algorithms into classes, according to which problems can be solved in each case. For example, if it is possible to solve a problem in a self-stabilizing way, it is also possible to solve it in a k-stabilizing way (if you can do more, you can do less). Likewise, if it is possible to solve a problem in a time-adaptive way, it is also possible to solve it without trying to constrain the visible stabilization time. Thus, the class of problems that can be solved in a time-adaptive way is a subset of the class of problems that can be solved in a self-stabilizing way, which is itself a subset of the class of problems that have k-stabilizing solutions (see Figure 10.9). These inclusions are strict: some problems can be solved in a k-stabilizing way, but not in a self-stabilizing way (see section 10.5.2.1); others can be solved in a self-stabilizing way, but not in a time-adaptive way (see section 10.5.2.2).
10.5.4. Limiting the classes of problems to solve

As seen in section 10.4.2, several problems specific to distributed systems are global, meaning that a modification of an element of the system can have repercussions on the entire system. In this section, the objective is to achieve good performance (i.e. performance that does not depend on the size of the system) in large-scale systems. Unlike in the previous sections, we will assume that memory corruptions can be completely arbitrary, and that their extent is also arbitrary. The idea is to study algorithms referred to as localized, that is, algorithms in which the correction in one part of the system does not depend on the corrections in the other parts of the system.

10.5.4.1. Localized problems

Resource allocation problems (usually derived from graph coloring) often present local constraints. As a result, in most cases, they are localizable, meaning they can be solved with localized algorithms.

10.5.4.1.1. Allocation of TDMA time slots

Collision avoidance and resolution are fundamental aspects of the protocols used for wireless networks. Indirectly, a communication protocol that prevents collisions saves energy, since the need for retransmitting a message is reduced. Time division multiple access (TDMA) is a reasonable technique for avoiding collisions. The algorithmic problem of allocating time slots in TDMA is linked to the traditional problem of allocating frequencies in FDMA. With FDMA, each color represents a frequency, and to avoid collisions, we have to ensure that all the nodes separated by a distance of two (or less) have different colors. An additional constraint is that the colors chosen by neighboring nodes must be far enough from each other to prevent interference. If the set of colors used is the integer interval [0, λ], then the colors (fv, fw) of neighboring nodes (v, w) have to satisfy |fv − fw| > 1 to prevent interference.
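The two FDMA conditions just stated can be checked mechanically. The following is a minimal sketch, assuming the network is given as an adjacency dictionary of sets and the frequencies as integers; the function name `fdma_violations` and the example path are hypothetical and only serve the illustration.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def fdma_violations(adj: Dict[str, Set[str]],
                    color: Dict[str, int]) -> List[Tuple[str, str, str]]:
    """List the pairs of nodes that break the FDMA coloring constraints.

    - neighbors (distance 1) must satisfy |f_v - f_w| > 1;
    - nodes at distance 2 (a common neighbor, but no direct link) must differ.
    """
    bad = []
    for v, w in combinations(sorted(adj), 2):
        if w in adj[v]:
            if abs(color[v] - color[w]) <= 1:
                bad.append((v, w, "adjacent"))
        elif adj[v] & adj[w]:
            if color[v] == color[w]:
                bad.append((v, w, "distance 2"))
    return bad


# Hypothetical 4-node path a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

good = {"a": 0, "b": 2, "c": 4, "d": 1}
bad = {"a": 0, "b": 1, "c": 3, "d": 1}

print(fdma_violations(path, good))  # no violation expected
print(fdma_violations(path, bad))   # a-b too close, b-d identical at distance 2
```

In the second assignment, a and b are neighbors whose frequencies differ by only 1, and b and d share the neighbor c while using the same frequency, so both pairs are reported.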
The standard notation for such distance constraints is L(ℓ1, ℓ2): for any pair of nodes separated by a distance of i ∈ {1, 2}, the colors differ by at least ℓi. Therefore, graph coloring for FDMA should satisfy the constraint L(2, 1). Furthermore, a solution that optimizes the number of colors used is preferable, since it reduces the number of frequencies required. The graph coloring problem in TDMA is slightly different. Let L′(ℓ1, ℓ2) be the constraint that, for any pair of nodes separated by a distance i ∈ {1, 2}, the colors differ by at least ℓi mod (λ + 1). This constraint reflects the fact that time slots wrap around from one period to the next. The usual coloring constraint for TDMA is L′(1, 1). If the time slots lack precision (for example, because the time synchronization is not perfect), it is possible to require a stricter color separation, such as L′(2, 2). Minimizing the number of colors for TDMA is advisable, because if a period of time corresponding to the color sequence 0..λ is fit to the unit interval [0, 1], then each color represents a fraction 1/(λ + 1) of the bandwidth. Therefore, the smaller λ is, the larger the share of the bandwidth that each color receives. The first TDMA-type algorithm for sensor networks is presented in [KUL 03]. The authors start off with a grid topology (that may be generalized to any topology by grid
embedding) and assume that each node knows its position in the grid (this position is used for calculating the time slot allocation). Using their approach in general graphs requires the grid embedding of the topology to be the same at all of the nodes and to be known before the algorithm is run. As a result, this algorithm cannot be used in networks that evolve dynamically. In [HER 04], a time slot allocation algorithm is suggested that handles dynamic network evolutions, transient failures and scalability. The approach for dealing with both the dynamic nature of the network and transient faults is self-stabilization, which ensures that the system converges to a valid TDMA allocation after a transient failure or a change in topology. The problem of scalability is handled using the fact that the mean stabilization time of the probabilistic time slot allocation algorithm has complexity O(1). The basic idea underlying this algorithm is a rapid probabilistic coloring technique, which could be used for solving other problems in sensor networks, or in certain ad hoc networks. This technique consists of rapidly coloring the graph in order to perform neighborhood unique naming, and is detailed below.

10.5.4.1.2. Unique neighborhood naming

An algorithm that performs unique neighborhood naming gives each node a name that is distinct from those of its neighbors in N^k, where k is a fixed constant and N^k refers to the neighborhood at distance k. This may seem odd, since it is generally assumed that the nodes already have a unique identifier (for example, the MAC address of the wireless network device), but if we try to use these identifiers for coloring, the potentially large set of identifiers may be a problem when scaling up. This makes it a good idea to assign the nodes names taken from a constant-size space, ensuring that they are locally unique. This problem can be seen as a coloring of N^k.
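To make the problem statement concrete, the sketch below computes the neighborhood at distance k by breadth-first search and then repeatedly re-draws the names of the nodes that clash with a node of their neighborhood. It is a simplified, centrally simulated illustration and not the algorithm of [HER 04]: the synchronous rounds, the parameters `space` (size of the name space) and `max_rounds`, and all function names are assumptions made for the example.

```python
import random
from collections import deque
from typing import Dict, Hashable, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]


def neighborhood(adj: Graph, p: Hashable, k: int) -> Set[Hashable]:
    """Nodes at distance 1..k from p (breadth-first search), p itself excluded."""
    dist = {p: 0}
    queue = deque([p])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    del dist[p]
    return set(dist)


def is_unique(adj_k: Graph, name: Dict[Hashable, int], p: Hashable) -> bool:
    """True if no node in the distance-k neighborhood of p carries p's name."""
    return all(name[q] != name[p] for q in adj_k[p])


def unique_naming(adj: Graph, k: int, space: int, rng: random.Random,
                  max_rounds: int = 10_000) -> Tuple[Dict[Hashable, int], int]:
    """Start from arbitrary names and re-draw clashing names round by round.

    `space` must exceed the size of the largest distance-k neighborhood,
    so that a clashing node always has a free name to choose from.
    """
    adj_k = {p: neighborhood(adj, p, k) for p in adj}
    name = {p: rng.randrange(space) for p in adj}  # arbitrary initial state
    for rounds in range(max_rounds):
        clashing = [p for p in adj if not is_unique(adj_k, name, p)]
        if not clashing:
            return name, rounds
        snapshot = dict(name)  # every node reads the names from the start of the round
        for p in clashing:
            taken = {snapshot[q] for q in adj_k[p]}
            free = [c for c in range(space) if c not in taken]
            name[p] = rng.choice(free)
    raise RuntimeError("no locally unique naming found within max_rounds")


# Hypothetical ring of 8 nodes, names unique at distance 2, drawn from 8 values.
ring = {i: {(i - 1) % 8, (i + 1) % 8} for i in range(8)}
names, rounds = unique_naming(ring, k=2, space=8, rng=random.Random(0))
print(names, rounds)
```

A node whose name is already unique never changes it, and a clashing node only picks names that were unused in its neighborhood at the start of the round, so a unique node stays unique; two clashing nodes may still pick the same new name, which is why several rounds can be needed.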
The basic idea of the coloring algorithm is as follows: let γ = ∆^t for t > k (see footnote 1). If a node p does not have a unique color (chosen between 0 and γ) in its cache of N_p^k (assumed to be retransmitted by each node to its entire neighborhood, using techniques of the same type as CSMA/CA), it randomly chooses a new color among those that are available. Here is a key property: the uniqueness property of the color of a node p is stable. Likewise, the uniqueness property of any node subset is also stable. In other words, once a node is considered to be unique in all of the neighborhoods it belongs to, it is stable. It then becomes possible to work with a Markov model on the executions, and to show that the probability of a sequence of actions that leads from a given stable set to a larger stable set is positive. Furthermore, neighborhood unique naming in [HER 04] has a property that the system’s global identifiers do not: because the identifiers have a constant size, the longest chain of increasing identifiers in the graph
1. A compromise must be found when choosing t in γ = ∆^t. First, t should be high enough for the choice of a new identifier to be unique with a high probability. Generally, high values of t reduce the mean convergence time of the neighborhood unique naming algorithm, and low values of t reduce the constant d, which in turn reduces the mean convergence times of algorithms that use this unique naming.
also has a constant size. This constant size makes it possible to construct other self-stabilizing algorithms using this basic building block (for example, [MIT 05] uses this neighborhood unique naming technique at a distance of 2 for constructing a network hierarchy, and [HER 04] uses it at a distance of 3 for a TDMA time slot allocation), while maintaining a constant stabilization time that is therefore independent of the network’s size.

10.5.4.2. Tolerating malicious entities

As indicated in section 10.3.1, the Byzantine failure model is the strongest: a node of the system can simply exhibit an arbitrary behavior. Of course, in order for it to cause damage to the system, this arbitrary behavior has to go unnoticed by the correct nodes, meaning that the values that are communicated and exchanged have to stay within the ranges that the other nodes expect. Most traditional solutions use one or more hypotheses that are not realistic in large-scale systems, such as large sensor networks and ad hoc networks:
– they assume total connectivity (see section 10.4.1);
– they assume that a wide majority of nodes are correct (usually more than two-thirds of the nodes);
– they assume that the nodes have access to reliable cryptographic primitives (something a sensor with limited processing capabilities cannot provide).

Because some problems may be localizable, it is interesting to study their ability to tolerate faults that are more severe than simple memory corruptions, such as Byzantine faults. More specifically, there is a need for designing algorithms that:
– are self-stabilizing;
– can run on any topology (including dynamic topologies);
– do not use cryptographic primitives;
– do not state any hypothesis regarding the number of Byzantine nodes;
– tolerate Byzantine nodes in the sense that they have no effect on correct nodes.
A first method for obtaining algorithms with such properties is presented in [NES 02]. The Byzantine containment radius is defined as the maximum distance at which the effect of Byzantine nodes can be felt. This containment radius must obviously be as small as possible. A problem is r-restrictive if its specification prohibits combinations of states in a configuration for nodes at a distance of at most r. For example, the problem of coloring nodes in a network is 1-restrictive, since two neighboring nodes cannot have the same color. On the other hand, the tree construction problem is r-restrictive (for any r between 1 and n − 1) because correctness requires that all of the chosen parents form a tree. The main theorem in [NES 02] states that if a problem is r-restrictive, the best containment
radius that can be obtained is r. It is easy to see that the neighborhood unique naming problem mentioned in section 10.5.4.1 is 1-restrictive for a neighborhood at a distance of 1, and that the corresponding algorithm in fact has a containment radius of 1: to have an effect, a Byzantine node can only take the same color as one of its neighbors; this neighbor then takes an action but, if it is correct, it picks a color that is not used by any of its neighbors, meaning that the reaction to the Byzantine behavior goes no further than that node.

In [SAK 05], the authors consider the problem of coloring links in networks whose topologies are trees, in a way that is self-stabilizing and that tolerates Byzantine nodes. Link coloring consists of assigning a color to each link in such a way that two links adjacent to the same node do not have the same color. This coloring also has applications in the field of frequency allocation in wireless networks. The fact that the network is a tree (an oriented tree) simplifies the problem, because the network is not symmetric, and the coloring decision can be made by one of the two adjacent nodes (the parent node in [SAK 05]). Despite this simplified model, the authors show that:
– it is necessary to use at least d + 1 colors, where d is the maximum degree of the communication graph, to allow a constant containment radius (whereas d colors would otherwise be enough);
– it is necessary to have a central spatial scheduling (see section 10.4.1) if there is a need to tolerate both memory corruptions and Byzantine nodes;
– there is an algorithm for oriented trees that uses d + 1 colors and has a containment radius of 2.

When the network is uniform (all of the nodes run the same code) and anonymous (the nodes have no way of telling each other apart), a self-stabilizing link coloring algorithm cannot rely on the hypothesis that the color of a link is determined by a single node.
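The link coloring constraint, and the remark that d colors are enough on an oriented tree when no fault tolerance is required, can be illustrated as follows. This is a fault-free, centrally executed sketch and not the protocol of [SAK 05]: each parent simply colors the links toward its children while avoiding the color of the link to its own parent. The tree representation and all function names are hypothetical.

```python
from typing import Dict, Hashable, List, Optional, Tuple

Link = Tuple[Hashable, Hashable]  # (parent, child)


def max_degree(children: Dict[Hashable, List[Hashable]], root: Hashable) -> int:
    """Maximum number of links incident to a node of the oriented tree."""
    deg = 1
    for node, kids in children.items():
        deg = max(deg, len(kids) + (0 if node == root else 1))
    return deg


def color_tree_links(children: Dict[Hashable, List[Hashable]], root: Hashable,
                     palette_size: int) -> Dict[Link, int]:
    """Top-down coloring: every parent decides the colors of its child links."""
    color: Dict[Link, int] = {}
    stack: List[Tuple[Hashable, Optional[int]]] = [(root, None)]
    while stack:
        node, up = stack.pop()  # `up` is the color of the link to the parent
        free = [c for c in range(palette_size) if c != up]
        kids = children.get(node, [])
        if len(kids) > len(free):
            raise ValueError("palette too small for this tree")
        for child, c in zip(kids, free):
            color[(node, child)] = c
            stack.append((child, c))
    return color


def is_proper(color: Dict[Link, int]) -> bool:
    """True if no two links sharing an endpoint have the same color."""
    seen = set()
    for (u, v), c in color.items():
        for end in (u, v):
            if (end, c) in seen:
                return False
            seen.add((end, c))
    return True


# Hypothetical oriented tree rooted at "r"; its maximum degree d is 3.
tree = {"r": ["a", "b", "c"], "a": ["x", "y"], "b": ["z"]}
d = max_degree(tree, "r")
coloring = color_tree_links(tree, "r", palette_size=d)
print(d, coloring, is_proper(coloring))
```

A non-root node has at most d − 1 children and must avoid one color, so a palette of d colors always suffices in this fault-free setting; the extra color required by [SAK 05] is the price of containing Byzantine behavior.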
In a uniform network, no node has priority over another, so the coloring of the edge that connects two nodes must result from a local agreement between them. In [MAS 05], a self-stabilizing edge coloring algorithm that tolerates Byzantine nodes is presented. Unlike the one in [SAK 05], the algorithm in [MAS 05] considers uniform and arbitrary networks (and not oriented trees), and uses 2d − 1 colors (instead of d + 1). As for the Byzantine containment radius, the protocol in [MAS 05] is optimal, since the influence of Byzantine nodes is limited to themselves; this means that the sub-system comprised only of correct processes is always correct. The idea behind the algorithm in [MAS 05] is as follows: each node maintains a list of colors assigned to its incoming links, and periodically exchanges this list with its neighbors. Based on the list received from its neighbor v, a node u can suggest a color for the link (u, v). This suggested color must not appear in the set of incoming colors of u and v. Because spatial scheduling is central (this is due to the necessity shown in [SAK 05]), it is impossible for two neighbors to suggest a color at the same time. Because the set of available colors has size 2d − 1, u is always able to suggest a color
that is not used either by u or by v. If both u and v are correct, the color c of the link (u, v) is never changed again. A Byzantine node may, however, constantly suggest colors that conflict with those of the other neighbors. If such a suggestion conflicts with a color that u and v have already agreed upon, it is ignored. The remaining case occurs when u has two neighbors v and w (where u and v are correct and w is Byzantine) and u has not yet agreed with either v or w. The Byzantine node w could continuously suggest colors that conflict with v, and u could constantly accept the color suggested by w. To ensure that this behavior cannot occur infinitely often, [MAS 05] uses a list of priorities such that the neighbors of u alternately obtain the priority in suggesting the color of the link. Then, once u and v have agreed on the color of the link (u, v), this color can no longer be modified by w, because the suggestions of w are systematically rejected.

10.6. Conclusion

Traditional techniques used in fault-tolerant distributed algorithms are for the most part ill-suited for scalability. Using them would lead to mechanisms that are either too costly in terms of resources (memory, calculation time), or out of proportion with the problem to solve. Several leads have been investigated for circumventing the impossibility results in the framework of self-stabilization: restricting the hypotheses regarding the faults that may occur (whether about their nature or their geographic location), or restricting the type of problems that we set out to solve. In the specific case of wireless communications, several resource allocation problems (frequencies, time slots) can be solved in a highly fault-tolerant fashion: arbitrary memory corruption, extensive malicious behaviors, etc.
The line between problems that are impossible to solve because they are too costly and those that can be handled with reasonable constraints remains, nevertheless, quite blurred. Several recent results show that there is probably a compromise between the resources used and the ability to tolerate failures, but a significant amount of additional work remains to be done in order to get an accurate idea of this compromise.

10.7. Bibliography

[AFE 02] AFEK Y., DOLEV S., “Local Stabilizer”, J. Parallel Distrib. Comput., vol. 62, no. 5, p. 745-765, 2002.
[AKY 02] AKYILDIZ I., SU W., SANKARASUBRAMANIAM Y., CAYIRCI E., “A survey on sensor networks”, IEEE Commun. Mag., vol. 40, no. 8, p. 102-114, 2002.
[ALI 98] ALIMA L. O., BEAUQUIER J., DATTA A. K., TIXEUIL S., “Self-Stabilization with Global Rooted Synchronizers”, Proceedings of the 18th International Conference on Distributed Computing Systems, 26-29 May, Amsterdam, The Netherlands, IEEE Press, p. 102-109, 1998.
[ANG 80] ANGLUIN D., “Local and global properties in networks of processors (Extended Abstract)”, STOC ’80: Proceedings of the Twelfth Annual ACM Symposium on Theory of Computing, New York, NY, USA, ACM Press, p. 82-93, 1980.
[AWE 94] AWERBUCH B., PATT-SHAMIR B., VARGHESE G., DOLEV S., “Self-Stabilization by Local Checking and Global Reset (Extended Abstract)”, TEL G., VITÁNYI P. M. B., Eds., Distributed Algorithms, 8th International Workshop, WDAG ’94, vol. 857 of Lecture Notes in Computer Science, Springer, p. 326-339, 1994.
[BEA 98] BEAUQUIER J., DELAËT S., DOLEV S., TIXEUIL S., “Transient Fault Detectors”, KUTTEN S., Ed., Distributed Computing, 12th International Symposium, DISC ’98, Andros, Greece, September 24-26, 1998, Proceedings, Springer, p. 62-74, 1998.
[BEA 99] BEAUQUIER J., GENOLINI C., KUTTEN S., “Optimal Reactive k-Stabilization: The Case of Mutual Exclusion”, Proceedings of the Eighteenth Annual ACM Symposium on Principles of Distributed Computing, p. 209-218, 1999.
[COB 02] COBB J. A., GOUDA M. G., “Stabilization of General Loop-Free Routing”, J. Parallel Distrib. Comput., vol. 62, no. 5, p. 922-944, 2002.
[DEL 02] DELAËT S., TIXEUIL S., “Tolerating Transient and Intermittent Failures”, Journal of Parallel and Distributed Computing, vol. 62, no. 5, p. 961-981, May 2002.
[DEL 05] DELAËT S., DUCOURTHIAL B., TIXEUIL S., “Self-stabilization with r-operators revisited”, Proceedings of the Seventh Symposium on Self-stabilizing Systems (SSS’05), vol. 3764 of Lecture Notes in Computer Science, Barcelona, Spain, Springer Verlag, p. 68-80, October 2005.
[DIJ 74] DIJKSTRA E. W., “Self-stabilizing Systems in Spite of Distributed Control”, Commun. ACM, vol. 17, no. 11, p. 643-644, 1974.
[DOL 97] DOLEV S., HERMAN T., “Superstabilizing Protocols for Dynamic Distributed Systems”, Chicago J. Theor. Comput. Sci., vol. 1997, 1997.
[DOL 00] DOLEV S., Self-stabilization, MIT Press, March 2000.
[DUC 98] DUCOURTHIAL B., “New operators for computing with associative nets”, GARGANO L., PELEG D., Eds., SIROCCO’98, 5th International Colloquium on Structural Information & Communication Complexity, Carleton Scientific, p. 51-65, 1998.
[DUC 01] DUCOURTHIAL B., TIXEUIL S., “Self-stabilization with r-operators”, Distributed Computing, vol. 14, no. 3, p. 147-162, July 2001.
[DUC 03] DUCOURTHIAL B., TIXEUIL S., “Self-stabilization with Path Algebra”, Theoretical Computer Science, vol. 293, no. 1, p. 219-236, 2003, extended abstract in SIROCCO 2000.
[DUC 04] DUCHON P., HANUSSE N., TIXEUIL S., “Optimal Randomized Self-stabilizing Mutual Exclusion in Synchronous Rings”, Proceedings of the 18th Symposium on Distributed Computing (DISC 2004), no. 3274, Lecture Notes in Computer Science, Amsterdam, The Netherlands, Springer Verlag, p. 216-229, October 2004.
[FIS 85] FISCHER M. J., LYNCH N. A., PATERSON M., “Impossibility of Distributed Consensus with One Faulty Process”, J. ACM, vol. 32, no. 2, p. 374-382, 1985.
[GÄR 01] GÄRTNER F. C., A gentle introduction to failure detectors and related problems, Report no. TUD-BS-2001-01, Darmstadt University of Technology, Department of Computer Science, April 2001.
[GEN 00] GENOLINI C., “Optimal k-Stabilization: the case of Synchronous Mutual Exclusion”, Proceedings of Parallel and Distributed Computing Systems (PDCS’2000), p. 371-376, November 2000.
[GEN 01] GENOLINI C., TIXEUIL S., Reactive k-stabilization and time adaptivity: possibility and impossibility results, Report no. 1276, Laboratoire de Recherche en Informatique, University of Paris Sud XI, 2001.
[GEN 02] GENOLINI C., TIXEUIL S., “A Lower Bound on k-stabilization in Asynchronous Systems”, Proceedings of IEEE 21st Symposium on Reliable Distributed Systems (SRDS’2002), Osaka, Japan, October 2002.
[GHO 96] GHOSH S., GUPTA A., HERMAN T., PEMMARAJU S. V., “Fault-Containing Self-Stabilizing Algorithms”, Proceedings of the Fifteenth Annual ACM Symposium on Principles of Distributed Computing, p. 45-54, 1996.
[GHO 02] GHOSH S., HE X., “Scalable Self-Stabilization”, J. Parallel Distrib. Comput., vol. 62, no. 5, p. 945-960, 2002.
[GOU 97] GOUDA M. G., HADDIX F. F., “The linear alternator”, GHOSH S., HERMAN T., Eds., 3rd Workshop on Self-stabilizing Systems, Carleton University Press, p. 31-47, 1997.
[GRA 00] GRADINARIU M., TIXEUIL S., “Self-stabilizing Vertex Coloring of Arbitrary Graphs”, International Conference on Principles of Distributed Systems (OPODIS’2000), Paris, France, p. 55-70, December 2000.
[HER 00] HERMAN T., PEMMARAJU S. V., “Error-detecting codes and fault-containing self-stabilization”, Inf. Process. Lett., vol. 73, no. 1-2, p. 41-46, 2000.
[HER 03] HERMAN T., “Models of Self-Stabilization and Sensor Networks”, DAS S. R., DAS S. K., Eds., Distributed Computing – IWDC 2003, 5th International Workshop, vol. 2918 of Lecture Notes in Computer Science, Springer, p. 205-214, 2003.
[HER 04] HERMAN T., TIXEUIL S., “A Distributed TDMA Slot Assignment Algorithm for Wireless Sensor Networks”, Proceedings of the First Workshop on Algorithmic Aspects of Wireless Sensor Networks (AlgoSensors’2004), no. 3121, Lecture Notes in Computer Science, Turku, Finland, Springer-Verlag, p. 45-58, July 2004.
[HUA 05] HUANG C.-T., GOUDA M. G., “State Checksum and Its Role in System Stabilization”, 25th International Conference on Distributed Computing Systems Workshops (ICDCS 2005 Workshops), IEEE Computer Society, p. 29-34, 2005.
[ISR 90] ISRAELI A., JALFON M., “Token Management Schemes and Random Walks Yield Self-Stabilizing Mutual Exclusion”, Proceedings of the Ninth Annual ACM Symposium on Principles of Distributed Computing, p. 119-131, 1990.
[JOH 99] JOHNEN C., ALIMA L. O., DATTA A. K., TIXEUIL S., “Self-stabilizing Neighborhood Synchronizer in Tree Networks”, Proceedings of the 19th International Conference on Distributed Computing Systems, Austin, TX, USA, IEEE Computer Society, p. 487-494, May-June 1999.
[JOH 03] JOHNEN C., TIXEUIL S., “Route Preserving Stabilization”, Proceedings of the Sixth Symposium on Self-stabilizing Systems (SSS’03), Lecture Notes in Computer Science, San Francisco, USA, Springer Verlag, June 2003, also in the Proceedings of DSN’03 as a one page abstract.
[JOH 04] JOHNEN C., PETIT F., TIXEUIL S., “Auto-stabilisation et Protocoles Réseaux”, Technique et Science Informatiques, vol. 23, no. 8, p. 1027-1056, 2004.
[KAT 93] KATZ S., PERRY K. J., “Self-Stabilizing Extensions for Message-Passing Systems”, Distributed Computing, vol. 7, no. 1, p. 17-26, 1993.
[KUH 03] KUHN F., WATTENHOFER R., “Constant-time distributed dominating set approximation”, Proceedings of the Twenty-Second ACM Symposium on Principles of Distributed Computing (PODC 2003), Boston, Massachusetts, USA, ACM, p. 25-32, 2003.
[KUL 03] KULKARNI S. S., ARUMUGAM U., “Collision-Free Communication in Sensor Networks”, HUANG S.-T., HERMAN T., Eds., Self-Stabilizing Systems, 6th International Symposium, SSS 2003, vol. 2704 of Lecture Notes in Computer Science, Springer, p. 17-31, 2003.
[KUL 04] KULKARNI S. S., ARUMUGAM U., “Transformations for Write-All-with-Collision Model”, PAPATRIANTAFILOU M., HUNEL P., Eds., Principles of Distributed Systems, 7th International Conference, OPODIS 2003, vol. 3144 of Lecture Notes in Computer Science, Springer, p. 184-197, 2004.
[KUT 99a] KUTTEN S., PATT-SHAMIR B., “Stabilizing Time-Adaptive Protocols”, Theor. Comput. Sci., vol. 220, no. 1, p. 93-111, 1999.
[KUT 99b] KUTTEN S., PELEG D., “Fault-Local Distributed Mending”, J. Algorithms, vol. 30, no. 1, p. 144-165, 1999.
[MAS 05] MASUZAWA T., TIXEUIL S., “A Self-stabilizing Link Coloring Algorithm Resilient to Unbounded Byzantine Faults in Arbitrary Networks”, Proceedings of OPODIS 2005, Lecture Notes in Computer Science, Pisa, Italy, Springer-Verlag, December 2005.
[MIT 05] MITTON N., FLEURY E., GUÉRIN-LASSOUS I., TIXEUIL S., “Self-stabilization in Self-organized Wireless Multihop Networks”, Proceedings of the 25th IEEE International Conference on Distributed Computing Systems Workshops (WWAN’05), Columbus, Ohio, USA, IEEE Press, p. 909-915, June 2005.
[MOS 05] MOSCIBRODA T., WATTENHOFER R., “Maximal independent sets in radio networks”, AGUILERA M. K., ASPNES J., Eds., Proceedings of the Twenty-Fourth Annual ACM Symposium on Principles of Distributed Computing, PODC 2005, Las Vegas, NV, USA, ACM, p. 148-157, 2005.
[NES 02] NESTERENKO M., ARORA A., “Tolerance to Unbounded Byzantine Faults”, 21st Symposium on Reliable Distributed Systems (SRDS 2002), IEEE Computer Society, 2002.
[SAK 05] SAKURAI Y., OOSHITA F., MASUZAWA T., “A Self-stabilizing Link-Coloring Protocol Resilient to Byzantine Faults in Tree Networks”, Principles of Distributed Systems, 8th International Conference, OPODIS 2004, vol. 3544 of Lecture Notes in Computer Science, Springer, p. 283-298, 2005.
[SHE 01] SHEN C.-C., SRISATHAPORNPHAT C., JAIKAEO C., “Sensor Information Networking Architecture and Applications”, IEEE Pers. Commun., vol. 8, no. 4, p. 52-59, August 2001.
[SHI 01] SHIH E., CHO S.-H., ICKES N., MIN R., SINHA A., WANG A., CHANDRAKASAN A., “Physical layer driven protocol and algorithm design for energy-efficient wireless sensor networks”, MOBICOM 2001, Proceedings of the Seventh Annual International Conference on Mobile Computing and Networking, Rome, Italy, ACM, p. 272-287, 2001.
[SOH 00] SOHRABI K., GAO J., AILAWADHI V., POTTIE G., “Protocols for Self-Organization of a Wireless Sensor Network”, IEEE Personal Communications, vol. 7, October 2000.
[TIX 00] TIXEUIL S., Auto-stabilisation Efficace, PhD thesis, University of Paris Sud XI, January 2000.
[VAR 00] VARGHESE G., JAYARAM M., “The fault span of crash failures”, J. ACM, vol. 47, no. 2, p. 244-293, 2000.
[WOO 01] WOO A., CULLER D. E., “A transmission control scheme for media access in sensor networks”, MOBICOM 2001, Proceedings of the Seventh Annual International Conference on Mobile Computing and Networking, Rome, Italy, ACM, p. 221-235, 2001.
Chapter 11
Code Mobility in Sensor Networks
Chapter written by Fabrício A. SILVA, Linnyer B. RUIZ, José M. NOGUEIRA, Thais R. BRAGA and Antonio A.F. LOUREIRO.

11.1. Introduction

Wireless sensor networks open a new field of scientific and technological research with great potential for helping people interact with natural and artificial environments. In large-scale networks, changing the behavior of the elements after deployment and during operation is very difficult, if not impossible. Code mobility is the best available alternative for changing that behavior or modifying the original programming of the network’s elements. This chapter discusses code mobility techniques in the context of wireless sensor networks and presents a comparative study of different code mobility approaches, with an emphasis on the mobile agent paradigm.

Wireless ad hoc networks are unstructured networks in which any element (node) can act as a router, relaying communication between two network entities. If the node originating the data cannot reach the destination node directly, because the destination is outside its coverage radius, it can use intermediate nodes as routers until the destination is reached. This type of network has no fixed infrastructure, and its topology is dynamic because of element mobility and frequent communication failures.

Wireless sensor networks (WSNs) are a special type of wireless ad hoc network made up of large numbers of devices called sensor nodes, which have severe energy, processing and communication restrictions. Generally, nodes operate in areas that are difficult to access and without the possibility of
energy recharge. WSNs are meant to gather monitoring data and to act on their environment. Made up of hundreds, or even thousands, of sensor nodes, a network can generate large volumes of data. To increase network lifetime without compromising the network’s objectives, WSNs need efficient communication mechanisms that optimize resource usage.

A technology with great potential for WSNs is code mobility. Its basic premise is to transport code from developers to where the data is located, instead of transporting data to the nodes where the code is. This allows only the important results to be transmitted, thus conserving network resources. This chapter discusses code mobility in WSNs, presenting its characteristics, conditions and procedures, together with a case study.

This chapter is organized as follows: section 11.2 introduces code mobility and the motivations for its use, followed by basic concepts and a discussion of code mobility in wireless sensor networks. Section 11.3 presents project paradigms of code mobility systems. A well-known approach called “mobile agents” is explained in section 11.4, which also describes the models that must be defined before mobile agents can be used. Section 11.5 discusses the use of mobile agents in WSNs and describes mobile agent models for WSNs. Section 11.6 presents a few examples of code mobility applied to sensor networks and section 11.7 presents a comparative case study of different code mobility methodologies. Section 11.8 provides a conclusion.

11.2. Concepts linked to code mobility

In the last few years, computer networks have evolved quickly in several respects. First, networks keep growing. The causes of this growth are the Internet and the need to exchange information within and between organizations, combined with the falling cost of equipment, all of which makes even corporate networks grow very rapidly.
Secondly, networks are becoming pervasive and omnipresent. Connectivity between elements and systems increases every day and is no longer limited to computers: it is now present in common appliances such as refrigerators and microwave ovens. Omnipresence, or non-locality, comes from the fact that the user remains connected to a network regardless of physical location, which advances in wireless communication have made possible.

Finally, the growing availability of technology to ordinary users, for example Web browsing applications, makes this the right time for new markets and application fields to grow. With them emerges the need for personalized services.
At the same time, though, new problems arise. Network growth brings scaling problems: solutions that work well in networks with few elements may not be efficient in large-scale networks. In omnipresent networks, a user can move from one location to another, which makes the network topology dynamic; solutions must take this dynamism into account. New services and markets require services tailored to clients. Because of new market conditions and the dynamism of network infrastructure, more flexible and extensible solutions are vital.

11.2.1. Process and object migration

One way of handling these problems is to move processes, objects and code between processing entities. In the world of distributed systems, a few mechanisms have already emerged. Certain operating systems, such as Solaris and Linux, enable the implementation of a mechanism called “process migration”. A process is an operating system abstraction containing code, data and execution state. Process migration is the transfer of a process executing on one machine to a remote machine, where the transferred process resumes execution. Migration mechanisms must manage the connections between the process and its execution environment so that the state of the original environment is re-established in the remote environment, allowing the process to continue from the point where it stopped. Since the main goal is load balancing between machines, this mechanism is transparent to the user most of the time. Another mechanism is “object migration”, in which objects are transferred between different address spaces. This mechanism has a finer granularity than process migration, because objects can range from simple data structures to complex operations.
Process and object migration techniques are viable for small-scale systems with high communication throughput, low latency, high reliability and largely homogeneous elements. For large-scale heterogeneous distributed systems with varied characteristics and a dynamic topology, new migration or mobility techniques must be developed.

11.2.2. Code mobility

One approach that meets the conditions mentioned above is called “code mobility”. Code mobility (CM) is informally defined as the “capacity to dynamically modify connections between code fragments and the place where they are executed” [FUG 98]. The idea of relocating code is very powerful and opens up a large number of applications. However, the concept is not new: submitting a process for batch processing can be considered a simple CM
mechanism; this concept appeared in the 1960s, when processing power was concentrated in powerful machines (mainframes).

The infrastructure of a CM system differs from that of traditional distributed systems. In the latter, a software layer provides transparency to users, so that they do not know which machines execute their processes. In a CM system, it is necessary to specify the location, called a computational environment, where each execution unit will be located. Because these locations are made known to the user, network topology is no longer transparent.

By execution unit we mean any sequential processing flow. Execution units use shared resources present in the computational environment, such as file systems. An execution unit contains a data space and an execution state. The data space holds references to the resources that can be used. The execution state contains data that cannot be shared between execution units, such as control information (instruction pointer, procedure return stack, etc.).

11.2.3. Wireless sensor networks and code mobility

WSNs are distributed systems with specific characteristics that benefit from code mobility techniques. Some of these benefits are:

– keeping up with environment evolution: it is impossible to predict all events in WSNs deployed in remote and hard-to-access environments. With CM, new functions can be inserted into nodes on demand, without having to retrieve sensor nodes to reprogram them and restart the network. Even the network’s objective can be modified, by sending the nodes code with new assignments;
– local decision making: in real-time systems monitored and controlled by a WSN, a delay in decision making may be unacceptable; control code executed locally in the nodes can drastically decrease the delay.
Even if the nodes are programmed with the control code before they are deployed in the environment, the requirement that the WSN can evolve, allowing new capabilities to be inserted dynamically, must still be respected;
– operation with a large number of elements: scale makes problems in distributed systems harder to solve. Since WSNs are expected to be made up of hundreds and perhaps thousands of sensor nodes, CM could help in the operation of this type of network;
– processing large volumes of data: a very large number of sensing elements can produce a large volume of data. For example, in the application presented by Werner-Allen et al. [WER 05], five sensor nodes produced 1.7 GB of data by collecting infrasonic signals for 54 hours straight. Code
mobility enables this data to be processed locally so that only the most important data is transmitted, thus conserving network resources. Note that this assertion holds only if the volume of transported code is smaller than the volume of the resulting data.

11.3. Project paradigms of code mobility systems

The project paradigms of CM systems define the elements that make up the system and the interactions between them. There are three CM paradigms: code on demand (CoD), remote evaluation (ReV) and mobile agents (MA) [FUG 98]. In distributed system software development, the traditional client/server (CS) paradigm should also be mentioned: although it is not directly related to code mobility, it is widely used in distributed system projects and must be compared with the other approaches. These paradigms are briefly described below. Throughout, we consider a user located on a host machine or host node A who wishes to execute a service and, in order to do so, needs processing resources, data and code (a software component). A host node B (or simply host B) plays the role of server.

11.3.1. Client/server

The user requesting the service has neither the code, the data nor the computing resources; host A therefore asks another host (B) for the service by sending a request message. Host B executes the service and, if needed, sends a message back to host A. The server, host B, provides services to clients, in this case host A, and holds the data and resources needed to execute them. This paradigm is illustrated in Figure 11.1.
Figure 11.1. Client/server
11.3.2. Remote evaluation

In this paradigm, host A, which wants to execute a service, has the code but not the resources or the appropriate data. It therefore has to interact with another host to complete the service. Host A sends the code to B, which holds the resources and data; B then executes the received code using its local resources and data, and sends a response to host A. This paradigm is illustrated in Figure 11.2.
Figure 11.2. Remote evaluation
11.3.3. Code on demand

In this paradigm, host A, which wants to execute a service, has the resources and data but not the code. It requests the code from another host, B, and executes it using the resources and data already available locally. A widely used example is the Java applet mechanism, in which the user requests compiled Java code from a server and runs it locally with the help of a Web browser. This paradigm is illustrated in Figure 11.3.
Figure 11.3. Code on demand
11.3.4. Mobile agent

In this paradigm, host A has the code and part of the resources or data, but cannot completely execute its task on its own. The code, together with any data that exists, migrates to where the missing resources and data are located. The data held by host A may be an intermediate result: execution may be interrupted and continued on the other host. In other words, the execution state, including for example the return stack and the instruction pointer, can migrate too. This paradigm is illustrated in Figure 11.4.
Figure 11.4. Mobile agents
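The four paradigms of sections 11.3.1 to 11.3.4 differ mainly in where the code and the resources sit before the service runs, and in what travels between hosts A and B. The following Python sketch tabulates those differences as a reading aid. The `Paradigm` record, its field names and the `uses_code_mobility` helper are illustrative assumptions and are not taken from [FUG 98].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paradigm:
    """One row of the comparison (hypothetical record, for illustration)."""
    name: str
    code_at: str        # host holding the code before the service runs
    resources_at: str   # host(s) holding the resources and data
    moves: tuple        # what is transferred between the hosts
    runs_at: str        # host(s) on which the service executes


PARADIGMS = (
    Paradigm("client/server", "B", "B", ("request", "response"), "B"),
    Paradigm("remote evaluation", "A", "B", ("code", "response"), "B"),
    Paradigm("code on demand", "B", "A", ("request", "code"), "A"),
    Paradigm("mobile agent", "A", "A (part) and B",
             ("code", "data", "execution state"), "A, then B"),
)


def uses_code_mobility(paradigm):
    """A paradigm involves code mobility when code itself is transferred."""
    return "code" in paradigm.moves


for p in PARADIGMS:
    print(f"{p.name:18} code at {p.code_at}, runs at {p.runs_at}, "
          f"moves: {', '.join(p.moves)}")
```

Read this way, client/server is the only row in which no code crosses the network, which is why the text treats it as a baseline rather than as a code mobility paradigm.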
Unlike remote evaluation, the mobile agent paradigm enables the code to migrate from one node to another, executing parts of the code in each node until results are obtained. The next section explains this paradigm, the most widely cited and adopted in studies worldwide, in more detail.

11.4. Mobile agents

The term “software agents” can be defined as “programs helping users to execute tasks, acting on their behalf”; the concept was introduced by artificial intelligence researchers [HEW 77]. Other definitions that also consider specific agent characteristics have emerged more recently. A software agent that can be transported to a remote node for execution is called a mobile agent (MA). The main difference between a mobile agent and a non-mobile (static) agent is the mobile agent’s capacity to transfer its functions to remote elements and execute there. The main desirable characteristics of MAs, described by Wooldridge and Jennings [WOO 95], are presented below:

– autonomy: the agent must be able to operate without external intervention and control its actions and its state;
– objective setting: it must have an objective and organize its actions to produce beneficial changes in the environment;
– adaptability: it must be able to perceive and adapt to the environment, making decisions and establishing strategies to reach its objectives in accordance with the information obtained;
– proactivity: the agent must be able to anticipate future situations and react appropriately in accordance with established rules and policies;
– flexibility: the agent must be able to choose dynamically which actions to execute, without following a pre-established scenario;
– communication/collaboration: a mobile agent does not have a global view of the system, so several agents must communicate and collaborate to reach their objectives;
– intelligence: it must be able to execute activities that until now have only been accomplished by humans and that involve reasoning (planning, strategy, etc.) and perception. It is also important that the agent learns from past perceptions to make the correct decision in the future;
– mobility: a mobile agent must be able to transfer itself to remote entities and continue its execution from the point where it stopped.

We must emphasize that these are desirable characteristics; a single implementation is not required to have all of them. The potential and current advantages of using mobile agents are:

– dynamic adaptation: mobile agents can monitor the environment and react to change autonomously, adapting their future behavior to the information retrieved;
– operation in heterogeneous environments: mobile agents depend only on their execution environment. Since networks are increasingly heterogeneous, in hardware as well as software, mobile agents can facilitate interaction between different devices;
– fault tolerance: mobile agent operation must continue even when a communication break makes remote communication impossible.
The agent can wait until the connection is restored before migrating, if necessary;
– decrease of network load: communication protocol interactions generally produce a high traffic rate in the network. With mobile agents, these interactions can be executed locally. In addition, migrating code to the location where data is kept can benefit networks handling large volumes of data;
– decrease of delay: in real-time systems, where a delay in decision making is often unacceptable, mobile agents can be sent to act locally and execute control actions quickly;
– asynchronous and independent execution: mobile devices using wireless communication have fragile connections, which makes tasks requiring continuous connectivity difficult, if not impossible, to execute. Once sent to the remote host, an agent can operate asynchronously and independently; the connection can be re-established later to retrieve the agent;
– autonomous process: mobile agent characteristics can facilitate the implementation of autonomous systems [KEP 03], whose capabilities adapt to needs without direct human intervention.

11.4.1. Mobile agent components

In general, the elements making up a mobile agent system are agent code, servers, hosts, agent attributes, environments or execution locations, execution regions, interactions or meetings, authorities and authorizations. These elements are briefly described below:

– code: the program defining agent behavior. Note that mobile agents are implemented in programming languages that are fully interpreted or compiled to an intermediate code, which enables the agent to be executed on different platforms and architectures;
– server: a program, executing on the machines, that can create, activate and transfer agents, as well as execute authentication and authorization mechanisms and manage resources;
– host: the machine on which the agent is executed. All hosts must execute the server program;
– past state: this represents past actions, making it possible for the agent to resume its activities after being transported from one host to another;
– attributes: these describe the agent, containing for example the agent and user identification.
The attributes must also impose domain limits and indicate which host resources the agent can use;
– location: also called context, this is the logical environment for agent execution, which makes all resources available;
– region: this is the administrative domain associated with a location;
– movement: this is the transfer of an agent from one location to another. The transfer is made only if the agent is authorized to access the destination;
– meetings or interactions: direct interaction between two agents, generally located in the same area;
– authority: the identity of the person or company represented by the agent;
– authorization: this indicates the operations that the agent can execute.

11.4.2. Mobile agent system models

There are several types of models to consider in the development of mobile agent systems. The models are defined according to the requirements and functions desired for the system. Some of them are described below, as presented by Green et al. [GRE 97].

11.4.2.1. Agent model

The agent model defines the internal structure of the agent, considering aspects of its intelligence such as autonomy, learning and cooperation, and specifying its proactive and reactive nature.

11.4.2.2. Life cycle model

The life cycle model defines the different execution states and the events leading to a change of state. There are two main models for representing an agent’s life cycle: one based on processes and one based on tasks. In the first, which is more powerful and flexible, the agent starts in an initial state, moves to an execution state, and then to a death state when execution ends. For migration to happen, the agent must enter a suspended state; when it reaches its destination, it returns to the execution state. In the task-based model, each task has its own state, and a set of pre-established conditions governs changes of state. In this model, when a mobile agent migrates, the execution context of the current task is lost.
During a mobile agent’s life cycle, the following actions are possible:

– creation: a new agent is generated and its state is initialized;
– dispatching: an agent is sent to be executed on another machine;
– restitution: the agent returns to its point of origin;
– deactivation: the agent stops execution and stores its current state;
– activation: agent execution is initiated, or resumed if the agent was deactivated;
– cloning: an identical copy of an agent is made, including the execution state. This clone can be dispatched to a remote node;
– deallocation: the agent terminates and its resources are freed.
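The process-based life cycle and the actions listed above can be read as a small state machine. The Python sketch below is one possible encoding, not a prescribed implementation: the state names, the `Agent` class and the transition table are assumptions made for illustration. It follows the rule stated in section 11.4.2.2 that an agent must be suspended before it can migrate.

```python
import copy
from enum import Enum, auto


class State(Enum):
    INITIAL = auto()
    EXECUTING = auto()
    SUSPENDED = auto()
    DEAD = auto()


# (action, current state) -> next state; any other combination is rejected.
TRANSITIONS = {
    ("activate", State.INITIAL): State.EXECUTING,
    ("activate", State.SUSPENDED): State.EXECUTING,
    ("deactivate", State.EXECUTING): State.SUSPENDED,
    ("dispatch", State.SUSPENDED): State.SUSPENDED,
    ("deallocate", State.EXECUTING): State.DEAD,
}


class Agent:
    def __init__(self, origin):
        # creation: a new agent is generated and its state is initialized
        self.origin = origin
        self.host = origin
        self.state = State.INITIAL
        self.saved = {}  # data stored across deactivation

    def _step(self, action):
        key = (action, self.state)
        if key not in TRANSITIONS:
            raise ValueError(f"{action} not allowed in state {self.state.name}")
        self.state = TRANSITIONS[key]

    def activate(self):
        self._step("activate")

    def deactivate(self):
        self._step("deactivate")

    def deallocate(self):
        self._step("deallocate")

    def dispatch(self, host):
        # migration is only allowed from the suspended state
        self._step("dispatch")
        self.host = host

    def restitute(self):
        # restitution: the agent returns to its point of origin
        self.dispatch(self.origin)

    def clone(self):
        # identical copy, including state and saved data
        return copy.deepcopy(self)
```

A typical sequence is `activate`, `deactivate`, `dispatch("B")`, `activate`; calling `dispatch` on an agent that is still executing raises `ValueError`, which makes the suspension requirement explicit.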
11.4.2.3. Computing model

The computing model defines the way in which an agent is executed, identifying the successive steps of its execution. The agent’s computing capabilities must be defined in terms of all the primitive instructions that it can execute. The computing model can be described by algorithms or sequence diagrams. This model touches almost all the others.

11.4.2.4. Security model

Security in a mobile agent system can be divided into two main areas: protecting hosts against malicious mobile agents and protecting mobile agents against malicious hosts. In the first case, since the mobile agent has access to host resources, it can use this access to attack. In the second case, since the mobile agent needs to expose its code and data, the host can try to retrieve confidential information, modify the agent’s behavior or data, or even stop the agent’s operation. To establish an agent’s legitimacy, a host can use the following procedures:

– authentication: verify whether the agent originated from a trustworthy host;
– code verification: verify that the code does not execute a prohibited action. Where verification before execution is not possible, this mechanism can be used to investigate whether the agent is attempting to corrupt the execution environment;
– authorization: this indicates which resources, and how much of each, the agent will be able to use, as well as the actions it may execute.

Existing techniques, such as encryption and digital signatures, can be applied to protect the host. However, protecting agents against malicious hosts is more difficult. Several other elements exist and must be studied.

11.4.2.5. Communication model

To accomplish certain tasks, mobile agents need to cooperate with each other, which requires communication. A protocol defines a communication model; depending on their type and complexity, agents may need to understand more than one protocol.
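Returning to the three host-side procedures of section 11.4.2.4 (authentication, code verification and authorization), the Python sketch below chains them into a single admission check. It is a minimal illustration under stated assumptions: a shared-key HMAC from the Python standard library stands in for the digital signature mentioned in the text, and `SHARED_KEY`, `ALLOWED_ACTIONS`, `BUDGET` and every function name are hypothetical.

```python
import hashlib
import hmac

SHARED_KEY = b"example-key"  # assumption: secret shared with trusted hosts
ALLOWED_ACTIONS = frozenset({"read_sensor", "send_result", "migrate"})
BUDGET = {"memory_bytes": 2048, "messages": 10}  # assumed per-agent limits


def make_tag(code, key=SHARED_KEY):
    """Tag attached by the originating host (HMAC, not a true signature)."""
    return hmac.new(key, code, hashlib.sha256).digest()


def authenticate(code, tag, key=SHARED_KEY):
    """Authentication: does the agent come from a host holding the key?"""
    return hmac.compare_digest(make_tag(code, key), tag)


def verify_code(actions):
    """Code verification: every declared action must be on the allow list."""
    return set(actions) <= ALLOWED_ACTIONS


def authorize(requested, budget=BUDGET):
    """Authorization: requested amounts must fit within the budget."""
    return all(amount <= budget.get(name, 0)
               for name, amount in requested.items())


def admit(code, tag, actions, requested):
    """Admit the agent only if all three checks pass."""
    return (authenticate(code, tag)
            and verify_code(actions)
            and authorize(requested))
```

The sketch covers only the protection of hosts against agents. As the text notes, protecting an agent against a malicious host is harder and is not addressed here.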
11.4.2.6. Navigation model

This model defines the aspects related to agent mobility, such as identification of the destination host and the form of migration. It is desirable for the mobile agent to decide when and where it will migrate without external intervention. Mobility is strong when both the code and the execution state migrate; it is weak when only the code migrates. In multihop wireless networks, it is important to try to
identify the best route to follow in order to optimize resource usage and migration time.

11.5. Modeling mobile agent systems for wireless sensor networks

A contribution of this work is to discuss the application of mobile agent concepts to wireless sensor networks. The modeling of a mobile agent system for WSNs must take the network’s specific characteristics into consideration. Sensor nodes, the WSN’s basic elements, have limited hardware capabilities and limited energy. In the main WSN applications, it is difficult, if not impossible, to replace or recharge the batteries of deployed sensor nodes. System architects should aim to prolong the life of the network as much as possible, which requires networks to be energy efficient. What follows is a discussion of the application of mobile agent models in WSN projects.

11.5.1. Agent model

The software executed by sensor nodes must be efficient and robust in order to help extend network life. Techniques such as artificial intelligence (AI) can help in building decision-making agents. However, some of these techniques, such as neural networks, can consume excessive resources (processing and memory), which is undesirable in WSNs. Agents must therefore have limited intelligence, or even none at all if the application allows it. An alternative is to give AI mechanisms to only some of the agents, which then collaborate with the others in decision making. This alternative must be evaluated to verify that the resource usage caused by communication for collaboration does not exceed the usage that the processing required for higher intelligence would cause. Policies or rules for mobile agent proactivity must be well defined, simple and optimized.
It is important that mobile agents behave proactively, especially in networks with many elements; however, proactive behavior must not disrupt other network activities.

11.5.2. Life cycle model

Of the actions that mobile agents can execute, the only one that can be eliminated in certain cases is cloning; depending on the application, cloning can be unnecessary or unviable. For example, an information retrieval agent may not need clones, whereas a fire detection agent may need clones for better coverage of a region at risk. The other actions, such as creation and dispatching of mobile agents, cannot be eliminated and must be executed by servers.
A WSN node can use up all its energy and stop its activity; an agent executing on this node will then stop operating. To ensure the continuity of its tasks, an agent must be able to identify when the node hosting it is in a critical energy situation. The agent must therefore monitor the host’s residual energy. When the critical situation nears, the agent can migrate to another node or inform its triggering host of its situation. In the latter case, the triggering host can create another agent to replace the old one and send it to another node.

Another problem that may arise is node isolation caused by a loss of communication; this happens when the node’s neighbors use up their energy or are put out of service. In the first case, there is nothing to be done, and the isolated node must consider itself lost or dead, since it has lost communication with the network. In the second case, the agent must wait for a new route before continuing on its way. If the agent keeps trying unsuccessfully to migrate, the host node can use up its energy. It is therefore important that the agent uses efficient techniques to determine the correct moment to migrate.

11.5.3. Computing model

The computing model describes the agent’s execution flow, which depends on its objectives. Any computing model for a WSN agent must lead to efficient execution (processing) and use of memory; the implementation must be simple, robust and extensible. The idea of a mobile agent system for WSNs is to have simple agents solving complex problems in collaboration, each agent accomplishing part of a larger task.

11.5.4. Security model

To protect mobile agents against malicious host nodes, we can use mechanisms that identify compromised nodes, as proposed by Peysakhov et al. [PEY 04] for wireless ad hoc networks, so that agents avoid going to compromised hosts. To protect hosts against malicious mobile agents, we must use encryption and authentication mechanisms.
TinyOS, an operating system for wireless sensor nodes, has a simple encryption model called TinySec [LEV 04]. There are also published studies that attempt to guarantee security in WSNs. The choice of mechanisms depends on the application’s characteristics. For example, in military applications, it is important that information be confidential and totally trustworthy; in this case, more complex security mechanisms are necessary. In other applications, such as environmental weather monitoring, it is not necessary to devote large resources to security. In WSNs, it may be worthwhile to adopt a budget system, in which a predetermined quantity of
resources is reserved specifically for security. In this way, each application would have a resource limit based on its requirements.

11.5.5. Communication model

For agents to cooperate, a single communication protocol must be implemented. In WSNs, this protocol must be simple and use little memory and few communication resources. It is important that agents analyze information and exchange messages only when necessary. Ideally, they will pre-process data or even compress it before transmission. Compression must be studied in its own right, to verify whether the additional processing load of the compression algorithm is offset by the savings in transmission. Another idea that can be adopted is to elect the nodes that receive mobile agents; these must be nodes that are strategically located or have high processing power. The other nodes remain simpler.

11.5.6. Navigation model

A well-defined and efficient navigation model is fundamental to the success of a mobile agent system in WSNs. When agents migrate without a management strategy, they can saturate the network and waste resources. Before designing a mobile agent system for WSNs, a few questions must be answered:

– will migration be single hop or multihop? Single hop migration is more suitable because it causes a shorter delay and lower resource usage. However, in certain cases it may be necessary to migrate over multiple hops, depending on network topology and the agent’s current requirements; for example, the agent may need resources not available in the neighbors of its current host, in which case it must migrate to hosts beyond its neighborhood. It is desirable always to prefer single hop migration, but not to be restricted to it;
– will migration be strong or weak? In strong migration, the execution state, including the return and execution stacks, migrates along with the code.
This produces an additional processing load and increases the agent’s size. In WSNs, weak migration is therefore generally preferable, although strong migration can be used in specific cases;
– will the route be static or dynamic? With a dynamic route, the next host is identified before each migration. This approach is flexible and adapts to changes in network topology, but it consumes more host resources. In the static approach, the route is calculated once by the server before the agent is dispatched. Although this uses fewer resources, topology changes may have a negative impact on the fulfillment of tasks. The decision on which strategy to
adopt depends on the application, on whether nodes are mobile, and on whether the loss of a route would have a high impact on results, among other factors. The best solution may be a hybrid one, in which the route is calculated statically and the agent is able to modify it dynamically.

11.6. State of the art

This section briefly describes published studies on the use of code mobility mechanisms in wireless sensor networks. Until now, few studies have applied code mobility to WSNs. Most of them involve reprogramming, that is, remote modification over the radio of the application executing in sensor nodes, without having to connect the nodes directly to a computer. Network reprogramming works in two steps. First the code, encapsulated in communication packets, is sent to the sensor node and stored outside the program memory. The received program is then transferred to the program memory and the sensor node starts executing it. Some studies address reprogramming using single hop communication and others using multihop communication; there are also studies using virtual machines for reprogramming. We discuss published studies on this subject below.

11.6.1. Remote and single hop reprogramming

The TinyOS operating system [LEV 04], adopted by commercial sensor node platforms such as Mica Motes (www.xbow.com) and Telos (www.moteiv.com), provides a simple reprogramming mechanism called XNP (Crossbow In-Network Programming) [CRO 03]. Reprogramming is done in three phases. In the first, called the program download phase, the code is divided into sequential capsules and sent to sensor nodes using single hop communication. The capsules are stored in the nodes’ EEPROM memory. In the second phase, called the request phase, nodes check whether capsules are missing and, if so, send a request to the server, which returns the requested capsules. This phase ends when the server has received no request for a given period of time.
In the last phase, called the reprogramming phase, the code received is transferred to the program memory and nodes initiate the execution of the new application. Experiments have been done to evaluate reprogramming time and success rate. A 37,000 byte (841 capsules) program was used; the number of nodes varied between 4, 8 and 16. Downloading time was approximately 100 seconds. As the number of nodes increases, losses increase and time used for retransmissions also increases. This experiment only enables single hop reprogramming, which cannot be used for very large WSNs. Other studies presented below propose reprogramming mechanisms enabling multihop communication.
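The three XNP phases described above can be sketched as a small simulation. This is only an illustrative model, not Crossbow's implementation: the `Node` class, the `xnp_update` function, the independent per-node loss model and the 44-byte capsule size (inferred from 37,000 bytes over 841 capsules) are all assumptions, and the request phase is simplified to "repeat until no node reports a missing capsule" instead of a server-side timeout.

```python
import random

CAPSULE_BYTES = 44  # assumption: 37,000 bytes / 841 capsules is roughly 44 bytes each


def split_into_capsules(image: bytes, size: int = CAPSULE_BYTES) -> list:
    """Divide the program image into sequential capsules."""
    return [image[i:i + size] for i in range(0, len(image), size)]


class Node:
    """Hypothetical sensor node; `eeprom` stands in for storage outside program memory."""

    def __init__(self, capsule_count: int):
        self.capsule_count = capsule_count
        self.eeprom = {}            # sequence number -> capsule
        self.program_memory = None  # filled in by the reprogramming phase

    def missing(self) -> list:
        return [s for s in range(self.capsule_count) if s not in self.eeprom]

    def reprogram(self) -> None:
        # Phase 3: transfer the complete image to program memory.
        self.program_memory = b"".join(self.eeprom[s] for s in range(self.capsule_count))


def xnp_update(image: bytes, node_count: int, loss_rate: float, rng: random.Random):
    """Run the three phases; loss_rate must be below 1 for the loop to terminate."""
    capsules = split_into_capsules(image)
    nodes = [Node(len(capsules)) for _ in range(node_count)]

    def broadcast(seq: int) -> None:
        # Single hop broadcast: each node independently loses the packet.
        for node in nodes:
            if rng.random() >= loss_rate:
                node.eeprom[seq] = capsules[seq]

    # Phase 1 (download): every capsule is sent once, in order.
    for seq in range(len(capsules)):
        broadcast(seq)

    # Phase 2 (request): nodes report missing capsules, the server resends them.
    retransmissions = 0
    while True:
        requested = sorted({seq for node in nodes for seq in node.missing()})
        if not requested:
            break
        for seq in requested:
            broadcast(seq)
            retransmissions += 1

    # Phase 3 (reprogramming): all capsules are present on every node.
    for node in nodes:
        node.reprogram()
    return nodes, retransmissions
```

Calling `xnp_update(image, 16, 0.2, random.Random(0))` with a larger node count tends to require more retransmissions than with 4 nodes, which is consistent with the trend reported in the experiment, although the numbers produced by this toy model are not comparable with the measured ones.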
Jong and Culler [JON 04] discuss reprogramming when the change is an increment in code version. One way to identify the difference between two programs or versions is to divide the program image into fixed size blocks and to compare the blocks of the two programs. This technique is called “fixed block comparison”. However, this approach does not find blocks that the two versions share at different positions, for example when code has shifted. The authors use an adapted version of the “Rsync” algorithm [TRI 99], in which the server holds information on the program version executing in the nodes and compares it with the more recent version. In this way, it sends the nodes only the differences between the two. The difference between code images is identified by comparing blocks of the two versions without having to know the program’s structure. This solution can thus be used on any hardware platform. Five cases are identified in their paper, based on the quantity of modifications between versions, i.e. how many blocks need to be updated. To evaluate transmission and decoding time for the different cases, analytical assessments and measurements were made. The proposed solution, which uses the Rsync algorithm, obtained results 9.1 times better than fixed block comparison when the code requires few modifications, and 2.1 to 2.5 times better when it requires many.

11.6.2. Multihop reprogramming

The previous studies do not support multihop reprogramming, which is not ideal for WSNs with a large number of nodes. Some of the studies presented below propose this mechanism. Deluge is a reliable dissemination protocol for the propagation of large volumes of data in WSNs; it uses multihop communication [HUI 04]. The authors report that large volumes of data, such as programs, can be sent to network nodes for reprogramming. Taking WSN characteristics into account, the proposed algorithm is intended to be energy efficient while also minimizing code propagation time.
Another characteristic of the Deluge protocol is that nodes are not required to maintain neighbor tables, as decisions are made locally. The program image is divided into fixed size packets of size n, chosen so that each fits inside a single TinyOS packet. At a higher level, the image is divided into contiguous pages, each containing N packets. For each page transferred, a node is assigned the role of either sender or receiver of that page. Nodes periodically disseminate the program version that they hold in an announcement message. Receiving nodes that need an update send a request to a provider, which may be chosen by any heuristic. The protocol adopted by Deluge uses an implicit NACK mechanism, since receivers resend requests for packets that were not received. A simple incremental update model is possible, in which the announcement message indicates which pages have been modified. Protocol optimizations are proposed, such as forwarding rate modification and a
more efficient provider selection. The optimizations were evaluated using the TOSSIM simulator [LEV 03], varying the distance between nodes (i.e., varying their density). A provider and 24 receivers were used. Comparing the different strategies, the authors concluded that it is more efficient to use a high transmission rate, a spatial pipeline and a simple provider election mechanism, and to forgo error correction mechanisms.

The study of Stathopoulos et al. [STA 03] proposes MOAP (Multihop Over-the-Air Programming), a code distribution mechanism for the Mica platform. The strategy used to accomplish multihop communication is neighborhood-by-neighborhood, where single hop communication is extended recursively into multihop communication. A Ripple mechanism limits each neighborhood to one originating node. An origin is defined as the server which sends the new code to the network. Ripple guarantees that transmissions are done at one hop distance, thus reducing the energy consumed by transmissions. Nevertheless, since a node must receive the complete code image before becoming an originating node itself, the delay to reprogram the whole network is longer. From a reliability standpoint, the receiver is responsible for detecting losses and sends NACK packets if necessary. A sliding window mechanism is used to identify which code segments have been received. Despite the restriction that packets be received in sequence, this mechanism has proven to be efficient compared to other, more complex solutions. In MOAP, the code is divided into segments with 2 address bytes and 16 bytes of data. Each communication packet contains one segment. Simulations were carried out with the EmStar tool for 30 Mica-1 Mote nodes and for 15 Mica-2 nodes. The size of the image used was 100 segments. The metrics evaluated were energy usage and latency.
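The segment format and receiver-driven loss detection described for MOAP can be sketched as follows. This is a minimal sketch under several assumptions that are not taken from [STA 03]: the 2 address bytes are treated as a big-endian segment index, the window size of 8 is arbitrary, and `make_segments`, `Receiver` and `transfer` are hypothetical names. The driver loses each listed segment once and relies on a later segment to trigger the NACK, so it would need a timeout to recover a lost final segment.

```python
import struct

DATA_BYTES = 16  # data bytes per segment, as stated in the text


def make_segments(image: bytes) -> list:
    """Build packets of 2 address bytes (assumed: segment index) plus up to 16 data bytes."""
    return [
        struct.pack(">H", offset // DATA_BYTES) + image[offset:offset + DATA_BYTES]
        for offset in range(0, len(image), DATA_BYTES)
    ]


class Receiver:
    """Receiver-side sliding window; detects gaps and reports them as NACKs."""

    def __init__(self, total_segments: int, window: int = 8):
        self.total = total_segments
        self.window = window      # hypothetical window size
        self.base = 0             # lowest segment index not yet received
        self.received = {}        # segment index -> data bytes

    def on_segment(self, packet: bytes) -> list:
        """Store one segment and return the indices that should be NACKed."""
        (index,) = struct.unpack(">H", packet[:2])
        if index < self.base or index >= self.base + self.window:
            return []             # outside the window: ignore
        self.received[index] = packet[2:]
        # Any hole between the window base and this segment is a detected loss.
        nacks = [i for i in range(self.base, index) if i not in self.received]
        while self.base in self.received:
            self.base += 1        # slide the window past contiguous segments
        return nacks

    def complete(self) -> bool:
        return self.base >= self.total

    def image(self) -> bytes:
        return b"".join(self.received[i] for i in range(self.total))


def transfer(image: bytes, lost_once: set):
    """Send all segments in order, dropping each index in lost_once on its first attempt."""
    segments = make_segments(image)
    receiver = Receiver(len(segments))
    pending_loss = set(lost_once)
    queue = list(range(len(segments)))
    nack_count = 0
    while queue:
        index = queue.pop(0)
        if index in pending_loss:
            pending_loss.discard(index)   # simulate a single loss
            continue
        for missing in receiver.on_segment(segments[index]):
            nack_count += 1
            if missing not in queue:
                queue.insert(0, missing)  # sender retransmits NACKed segment next
    return receiver, nack_count
```

Keeping the loss detection on the receiver means the sender holds no per-receiver state, which matches the division of responsibility described above.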
The Ripple strategy is more efficient for energy usage (60-90% less data transmitted), but it presents a higher delay to program the network compared with the traditional flooding mechanism. Nevertheless, as the network density increases, latency decreases with Ripple. Another evaluation was made on dissemination type; the authors observed that unicast dissemination can decrease transmissions compared to broadcast. One of the drawbacks of MOAP is that it does not use an intelligent election mechanism for the originating node, whereas a careful choice of code provider node makes it possible to optimize network quality.

The main objective of the MNP study is to propose a code provider election algorithm with the goal of decreasing collisions and data loss [WAN 04]. The idea is to maintain one provider per neighborhood as much as possible. In addition, the algorithm tries to select the provider which will cause the greatest positive impact. When a node cannot use a transmitted code segment, it turns off its radio to save energy. Sensor nodes do not need to maintain neighborhood information; decisions are made locally, enabling the algorithm to be scalable. Two models of provider election are evaluated: without and with pipelining. In the first case, the node becomes
a potential provider after receiving the complete image (hop to hop propagation). In the second case, a node can become a provider if it has already received part of the image. A potential provider periodically sends an announcement message to its neighbors, announcing which code it has. To prevent collisions, messages are sent at random intervals. Neighbors in turn send a request to all providers having the image they require. All providers are able to calculate how many requests their competitors (other providers) have received. In this way, the one with the most requests becomes the current provider, because a large number of nodes are interested in its data. The other providers then enter a sleep state to conserve energy. This solution has been implemented in Mica-2 sensor nodes. First, a few nodes were used to verify the algorithm’s operation. The nodes were placed on a grid 2.5 meters apart from one another; the base station was located in the upper left corner of the grid. The authors carried out experiments within a laboratory (25 nodes) and in an open field (49 nodes); transmission power was varied so that communication over multiple hops was possible. It was observed that only one provider per neighborhood was transmitting at any given time and that the providers that would cause the greatest impact were selected, illustrating the efficiency of the provider election algorithm. To evaluate scaling, the TOSSIM simulation tool [LEV 03] was used, since the authors did not have a large enough quantity of sensor nodes.

11.6.3. Virtual machine reprogramming

Maté [LEV 02] is a virtual machine (VM) for WSNs or, more specifically, for the TinyOS operating system. One of the advantages of using virtual programs (interpreted only by their corresponding VM) is the capacity to restrict the commands that can be executed with the help of a user/kernel interface; the user can only do what the VM authorizes.
Maté is a byte-code interpreter for sensor nodes making it possible to execute remote scripts. The code is divided into capsules of 24 instructions of 1 byte each. Programs can contain more than one capsule. Each capsule fits into a TinyOS packet and contains an identifier and the code version. Maté has an operand stack and a return address stack. Most of the instructions operate only on the operand stack. The code is distributed by flooding to all network nodes. Maté was evaluated by using an ad hoc routing algorithm. Since Maté transmits virtual instructions, each of which may stand for more than one binary instruction, it is efficient with regard to communication. Nevertheless, an additional load is introduced during execution/interpretation of virtual instructions. To verify whether this additional processing load is offset by the communication energy saved, the number of instructions executed per second and the sizes of different programs were measured. Results have shown that for a small number of executions,
Maté is preferable to binary reprogramming. In other words, the load of the central processing unit remains low if it is compared to economy of communication. Nevertheless, for codes remaining longer in sensor nodes, binary reprogramming is more efficient. To evaluate the rate of code propagation, experiments with 42 sensor nodes placed on a grid were carried out. Approximately 80 seconds were necessary for all sensor nodes to receive the new code, which is quite high. The ASVM (application specific virtual machine) is an evolution of Maté and was developed by the same authors [LEV 05]. ASVM presents three abstractions: handlers, which are code routines to process certain events; operations, which are execution units of the functions, and capsules, which are the code propagation units. This new version is meant to improve Maté with regard to flexibility, parallelism and propagation. Maté adopts an epidemic propagation mechanism, in which all network elements receive new versions of scripts. An extension to Maté was made to enable the use of mobile agents [SZU 05]. This extension makes it possible for the code to be selectively propagated, i.e. to specific elements and not to the entire network. Because of this, autonomous mobile agents can be launched in the network to make migration decisions in accordance with environment characteristics. Simulations were carried out with TOSSIM and experiments in real environments for three different applications: event identification, local and global data retrieval. Results show that these applications can adopt mobile agent mechanisms in large WSNs, because the cost of code distribution is proportional to the number of agent transfers. 11.6.4. Mobile target location application Mobile target location is an interesting application for the use of code mobility. A study presented by Tseng et al. [TSE 04] proposes a mobile target location model in WSNs using mobile agents. 
The idea is to migrate the agent’s code according to the mobile target position, in order for the agent to always execute in a sensor node close to the target. When a target is detected, an election process chooses a master node, which will be the first node to execute the agent. The master then invites two other neighbor nodes, called slaves, and sends them a copy of the agent. These three nodes cooperate together to execute a triangulation algorithm and identify the target’s position. As the target moves, the process is repeated. Network topology is triangular, which means that the position of nodes forms different triangles; it is therefore possible to identify with transmission signal strength when the target moves out of a region and enters another one. Simple experiments were carried out with the intention of verifying whether the solution really identifies the target, without evaluating performance and resource usage. 4 and 12 laptop computers were used as sensor nodes and one laptop computer was used as a target. The target’s
signal strength was used to identify it. Results illustrate that the solution was successful in identifying target movement with good precision. To evaluate network load, simulations were done using a simulator developed by the authors [TSE 04].

11.7. Case study: mobile agents in WSN management

WSN management is a research field that has only recently started to be studied. The Manna architecture [RUI 03] proposes integrated solutions for the management of different WSN applications. It separates the application from the management functions, making the integration of organization, management and maintenance activities possible. The approach used by the Manna architecture considers a third dimension, that of WSN functionality, in addition to the two already known dimensions, which are the functional and management levels. The Manna architecture project was based on the self-organizing paradigm, so that management tasks are executed autonomously. The evaluation of the Manna architecture demonstrated that it is desirable to provide self-organizing solutions for promoting resources and quality of service. The Manna architecture makes it possible for different self-organization strategies to be implemented, such as hierarchical and distributed strategies; different approaches for each strategy can be used. It is important to perform evaluations before choosing an approach, in order to identify which is the most appropriate for a specific situation.

11.7.1. Objectives

The goal of this case study is to compare two WSN management paradigms. One of the paradigms evaluated is the well known client/server (CS) paradigm, in which managers request management data from current nodes and then execute management services to achieve determined goals. The other paradigm, based on code mobility, uses mobile agents which migrate from one current node to another, providing management services locally.
We have studied the case of a hierarchical (organized in groups), heterogeneous (presenting nodes with different hardware capabilities) WSN, with a high density of nodes (a large quantity of nodes per unit area). Manager functions are included in group leader nodes, which are the nodes with the most resources. The other nodes, called current nodes, are controlled by the leader of their group. Self-organization solutions are distributed and each is responsible for a different domain, since there are not many managers deployed in the network. Two models were proposed, one for each approach.
of the mobile agent. The code must thus be implemented carefully to obtain good robustness and optimization characteristics. In the mobile agent model illustrated in Figure 11.5b, each manager periodically sends a mobile agent to execute in each current node of its domain. In the present case, the mobile agent moves from one node to another, until it reaches the group’s leader. MA activities are divided into two distinct phases, initialization and operation. The initialization phase corresponds to the first two MA visits in the domain where it will execute its functions. During the first visit, the mobile agent retrieves energy and topology information on the node that it visits and examines whether it is a redundant node. During the second visit, the main task is adjustment of radio transmission strength. Topology information is used for calculating the ideal strength for each node. In the operation phase, the MA goes through its domain of operation and retrieves values of administered objects by systematically verifying whether they are redundant nodes needing to be activated.
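The two phases of the mobile agent described above can be sketched as a small state machine. This is a hedged illustration only: the source does not say how redundancy is detected, how the ideal transmission strength is computed, or when a redundant node is reactivated, so the `redundant` flag, the "range of the farthest neighbor" rule and the `LOW_ENERGY_J` threshold below are all assumptions, as are the class and field names.

```python
from dataclasses import dataclass, field

LOW_ENERGY_J = 1.0  # hypothetical threshold below which a neighbor counts as depleted


@dataclass
class SensorNode:
    node_id: int
    energy_j: float
    neighbors_m: dict                 # neighbor id -> distance in meters
    redundant: bool = False           # assumed to be known to the node
    active: bool = True
    tx_range_m: float = 0.0
    objects: dict = field(default_factory=dict)  # managed objects


@dataclass
class MobileAgent:
    tours: int = 0
    energy: dict = field(default_factory=dict)
    topology: dict = field(default_factory=dict)
    readings: dict = field(default_factory=dict)

    def tour(self, domain: list) -> None:
        """Visit every node of the domain once; behavior depends on the visit number."""
        self.tours += 1
        for node in domain:
            if self.tours == 1:
                self._first_visit(node)
            elif self.tours == 2:
                self._second_visit(node)
            else:
                self._operate(node)

    def _first_visit(self, node: SensorNode) -> None:
        # Initialization, visit 1: collect energy and topology, check redundancy.
        self.energy[node.node_id] = node.energy_j
        self.topology[node.node_id] = dict(node.neighbors_m)
        if node.redundant:
            node.active = False       # assumption: redundant nodes are put to sleep

    def _second_visit(self, node: SensorNode) -> None:
        # Initialization, visit 2: adjust transmission range from the topology.
        distances = self.topology[node.node_id].values()
        node.tx_range_m = max(distances, default=0.0)  # assumed rule

    def _operate(self, node: SensorNode) -> None:
        # Operation: read managed objects, wake redundant nodes when needed.
        self.energy[node.node_id] = node.energy_j
        if node.active:
            self.readings[node.node_id] = dict(node.objects)
        elif node.redundant and any(
            self.energy.get(n, float("inf")) < LOW_ENERGY_J for n in node.neighbors_m
        ):
            node.active = True
```

Because the agent carries the collected energy and topology data with it, the decisions in the operation phase are taken locally at each node without a round trip to the manager, which is the property the MA model relies on.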
Figure 11.5. Self-organization models
11.7.3. Evaluation The two models were evaluated by simulations with the help of the Network Simulator 2 (NS-2). Some simulation aspects were configured the same way for both models (see Table 11.1). Each manager asks for management data (CS model) and sends mobile agents (MA model) at 50 second intervals. The number of managers was set at 20. With the goal of evaluating the scaling capacity of each approach, simulations were done with 100, 150, 200, 250 and 300 current nodes. The quantity of managed objects was 10, 30 and 60.
Network configuration
  Number of group leaders: 20
  Number of sensor nodes: 100, 150, 200, 250, 300
  MAC protocol: IEEE 802.15.4
  Routing protocol: Minimal Spanning Tree
  Node deployment: random

Simulation configuration
  Simulation time: 4,000 seconds
  Number of simulations: 33
  Surface size: 250 m x 250 m
  NS-2 confidence interval: 95%

Node configuration
  Parameter                Group leaders     Current nodes
  Transmission range       70 meters         Adjustable
  Processing power         0.360 W           0.024 W
  Transmission strength    0.6 W             0.052 W
  Reception power          0.3 W             0.059 W
  Perception power         not applicable    0.015 W
  Type of dissemination    Programmed        Continuous
  Battery capacity         100 J             10 J
  Throughput               100 kbps          100 kbps
  Sensor devices           not applicable    Temperature
  Perception range         not applicable    10 meters
  Type of perception       not applicable    Programmed
  Perception interval      not applicable    30 seconds

Table 11.1. Simulated network and node characteristics
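To give a feel for the orders of magnitude implied by Table 11.1, the following back-of-the-envelope sketch computes the radio energy for one message. It is not the energy model used in the NS-2 simulations: it assumes that "transmission strength" and "reception power" are the radio's power draw while sending and receiving, that 100 kbps means 100,000 bits per second, and that there is no header, MAC or retransmission overhead. All names are hypothetical.

```python
THROUGHPUT_BPS = 100_000  # 100 kbps for both node types (Table 11.1)

TX_POWER_W = {"group_leader": 0.6, "current_node": 0.052}
RX_POWER_W = {"group_leader": 0.3, "current_node": 0.059}

SIMULATION_TIME_S = 4_000  # Table 11.1
MANAGEMENT_INTERVAL_S = 50  # interval between manager requests / agent dispatches


def airtime_s(payload_bytes: int) -> float:
    """Time on air for a payload, ignoring all protocol overhead (assumption)."""
    return payload_bytes * 8 / THROUGHPUT_BPS


def tx_energy_j(role: str, payload_bytes: int) -> float:
    """Energy = power x time for one transmission."""
    return TX_POWER_W[role] * airtime_s(payload_bytes)


def rx_energy_j(role: str, payload_bytes: int) -> float:
    """Energy = power x time for one reception."""
    return RX_POWER_W[role] * airtime_s(payload_bytes)


def management_rounds() -> int:
    """Number of management rounds each manager starts during one simulation."""
    return SIMULATION_TIME_S // MANAGEMENT_INTERVAL_S
```

Under these assumptions, forwarding a 100-byte agent once costs a current node `tx_energy_j("current_node", 100)` joules, a small fraction of its 10 J battery, so differences between the two models come mainly from how many such transfers each node performs over the rounds of a simulation.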
11.7.3.1. Results in relation to energy usage Energy is an important resource for WSNs since each sensor node component relies on it to function properly. Besides, because of WSN characteristics, such as remote monitoring and the large number of elements, it will often be impossible to carry out maintenance operations on site such as energy recharge. Consequently, energy usage must be efficient in order to prolong the network’s life. Figure 11.6 presents energy usage due to transmission and enables the evaluation of scaling and mobile agent code size impact. Simulation results in this case were obtained by considering 10 controlled objects. We note that the CS model has an increasing linear rate as network grows, whereas energy usage values remain almost constant in the case of the MA model. In a 300 node network, the CS model consumed more energy than the MA model for all sizes of agents.
Figure 11.6. Evaluating the mobile agent’s code size
Figure 11.7. Variation of the number of objects and relative energy usage
Figure 11.7 illustrates the impact of increasing the number of controlled objects. In this case, we set the size of the agent code to 100 bytes and simulated both models with 10, 30 and 60 controlled objects. The graph presents the relative value between the CS and MA models; in other words, results were obtained by dividing the average energy usage of current nodes in the CS model (CS Energy) by the average consumption in the MA model (MA Energy). The results show how many times higher energy consumption is in the CS model than in the MA model. We note that even with 10 controlled objects, the CS model consumes more energy than the MA model. When the number of objects or the number of nodes increases, the relative value becomes higher, which means that these parameters have a greater impact on the CS model.

Concerning energy usage in relation to processing, Table 11.2 presents average values obtained separately for current nodes (CN) and for the group leader (GL). In this case, we used 10 controlled objects in the simulations. As illustrated, the MA model uses considerably more energy for processing than the CS model. Maté is a byte-code interpreter requiring more processing time to execute instructions. Whereas the MicaZ processor can execute approximately 8 × 10^6 instructions per second, the Maté VM, presented in section 11.6.3, executes only 1 × 10^4 instructions per second. Since the MA model uses the Maté mechanism, it presents higher energy consumption for processing.

Number of nodes   Type of node   CS         MA - 100 bytes   MA - 150 bytes   MA - 200 bytes
100               CN             0.000046   0.022250         0.03282          0.03925
100               GL             0.000220   0.644820         0.92260          1.13206
150               CN             0.000040   0.021545         0.02875          0.03758
150               GL             0.000260   0.664280         0.90082          1.16884
200               CN             0.000037   0.020810         0.02547          0.03249
200               GL             0.000300   0.659630         0.88912          1.11467
250               CN             0.000032   0.017700         0.02653          0.03289
250               GL             0.000310   0.630510         0.90420          1.12756
300               CN             0.000030   0.017990         0.02412          0.02990
300               GL             0.000340   0.038250         0.03394          1.08500

Table 11.2. Processing energy consumption in Joules for current nodes (CN) and group leader (GL)
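The gap between the two models in Table 11.2 can be put next to the instruction rates quoted in the text with a few lines of arithmetic. The sketch below uses only the two rates given above and the CN rows for the CS and 100-byte MA columns; the variable and function names are hypothetical, and no claim is made that the interpreter slowdown alone explains the measured ratios.

```python
NATIVE_IPS = 8e6  # approximate MicaZ native instruction rate quoted in the text
MATE_IPS = 1e4    # approximate Maté VM instruction rate quoted in the text

# Current node (CN) processing energy in joules, copied from Table 11.2.
CN_PROCESSING_J = {
    100: {"CS": 0.000046, "MA-100": 0.022250},
    150: {"CS": 0.000040, "MA-100": 0.021545},
    200: {"CS": 0.000037, "MA-100": 0.020810},
    250: {"CS": 0.000032, "MA-100": 0.017700},
    300: {"CS": 0.000030, "MA-100": 0.017990},
}


def interpreter_slowdown() -> float:
    """How many times fewer instructions per second the VM executes."""
    return NATIVE_IPS / MATE_IPS


def ma_to_cs_ratio(node_count: int) -> float:
    """Processing energy of the MA model relative to the CS model for current nodes."""
    row = CN_PROCESSING_J[node_count]
    return row["MA-100"] / row["CS"]


ratios = {n: ma_to_cs_ratio(n) for n in CN_PROCESSING_J}
```

For current nodes, the ratios in `ratios` are all of the order of several hundred, which is the same order of magnitude as the interpreter slowdown returned by `interpreter_slowdown()`.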
11.7.3.2. Discussion

The results presented enable us to reach important conclusions on the use of the two self-organization approaches evaluated. First, we can conclude that when the size of the network or the number of controlled objects increases, the MA model behaves better than the CS model if we consider energy usage for transmission as a metric. Considering energy consumption for data processing as a metric, the CS model gives better results. In fact, the MA model generates data processing costs, whereas the CS model is costly in communication. Mobile agent size is an important parameter with a direct impact on the results of the MA model. The larger the mobile agent, the higher the resource consumption will be. Consequently, the choice of approach depends on network characteristics and on the complexity of management tasks, which directly determines the size of the mobile agent code.

11.8. Conclusion

Code mobility is a promising option for the development of wireless sensor networks, which work in dynamic conditions but whose code cannot easily be replaced or reprogrammed in place during operation. The idea is to make it possible for sensor nodes to change their behavior depending on application requests; retrieving nodes in order to modify them is not always feasible in large-scale networks. Consequently, code mobility technology helps application designers by offering the possibility of making modifications after network deployment. Among the existing code mobility approaches, we have studied the one based on mobile agents. We are convinced that this technique presents great possibilities for this new technological field: wireless sensor networks.

11.9. Bibliography

[CRO 03] CROSSBOW TECHNOLOGY INC., Mote In-Network Programming, User Reference Version 20030315, 2003.
[FUG 98] FUGGETTA A., PICCO G.P., VIGNA G., “Understanding Code Mobility”, IEEE Transactions on Software Engineering, vol. 24, no. 5, p. 342-361, 1998.
[GRE 97] GREEN S., HURST L., NANGLE B., CUNNINGHAM P., SOMERS F., EVANS R., Software Agents: A Review, report no. TCS-CS-1997-06, Dublin, Ireland, 1997.
[HEW 77] HEWITT C., “Viewing Control Structures as Patterns of Passing Messages”, Journal of Artificial Intelligence, vol. 8, no. 3, p. 323-364, 1977.
[HUI 04] HUI J., CULLER D., “The Dynamic Behavior of a Data Dissemination Protocol for Network Programming at Scale”, 2nd International Conference on Embedded Networked Sensor Systems, p. 81-94, ACM Press, 2004.
[JON 04] JONG J., CULLER D., “Incremental Network Programming for Wireless Sensors”, 1st IEEE Communications Society Conference on Sensor and Ad Hoc Communications and Networks IEEE SECON, 2004.
[KEP 03] KEPHART J.O., CHESS D.M., “The Vision of Autonomic Computing”, IEEE Computer, vol. 36, no. 1, p. 41-50, 2003.
[LEV 02] LEVIS P., CULLER D., “Maté: A Tiny Virtual Machine for Sensor Networks”, 10th International Conference on Architectural Support for Programming Languages and Operating Systems, p. 85-95, ACM, San Jose, USA, 2002.
[LEV 03] LEVIS P., LEE N., WELSH M., CULLER D., “TOSSIM: Accurate and Scalable Simulation of Entire TinyOS Applications”, Proceedings of the First ACM International Conference on Embedded Networked Sensor Systems, p. 126-137, Los Angeles, USA, 2003.
[LEV 04] LEVIS P., MADDEN S., POLASTRE J., SZEWCZYK R., WHITEHOUSE K., WOO A., GAY D., HILL J., WELSH M., BREWER E., CULLER D., “TinyOS: An Operating System for Wireless Sensor Networks”, in W. Weber, J. Rabaey, E. Aarts (ed.), Ambient Intelligence, Springer-Verlag, New York, USA, 2004.
[LEV 05] LEVIS P., GAY D., CULLER D., “Active Sensor Networks”, Second USENIX/ACM Symposium on Networked System Design and Implementation, 2005.
[MAR 04] MARSH D., TYNAN R., O’KANE D., O’HARE G.M.P., “Autonomic Wireless Sensor Networks”, Engineering Applications of Artificial Intelligence Journal, p. 741-748, 2004.
[PEY 04] PEYSAKHOV M., ARTZ D., SULTANIK E., REGLI W., “Network Awareness for Mobile Agents on Ad Hoc Networks”, International Joint Conference on Autonomous Agents and Multiagent Systems, p. 368-376, New York, USA, 2004.
[RUI 03] RUIZ L.B., NOGUEIRA J.M.S., LOUREIRO A.A.F., “MANNA: A Management Architecture for Wireless Sensor Networks”, IEEE Communications Magazine, vol. 41, no. 2, p. 116-125, 2003.
[SIL 05] SILVA F.A., RUIZ L.B., BRAGA T.R.M., NOGUEIRA J.M.S., LOUREIRO A.A.F., “Defining a Wireless Sensor Network Management Protocol”, Latin American Network Operations and Management Symposium, p. 39-50, Porto Alegre, Brazil, 2005.
[STA 03] STATHOPOULOS T., HEIDEMANN J., ESTRIN D., A Remote Code Update Mechanism for Wireless Sensor Networks, CENS-TR-30 report, Department of Computer Science, UCLA, Los Angeles, USA, 2003.
[SZU 05] SZUMEL L., LEBRUN J., OWENS J.D., “Towards a Mobile Agent Framework for Sensor Networks”, Second IEEE Workshop on Embedded Networked Sensors, p. 79-88, Sydney, Australia, 2005.
[TRI 99] TRIDGELL A., Efficient Algorithms for Sorting and Synchronization, PhD thesis, Australian National University, Canberra, Australia, 1999.
[TSE 04] TSENG Y.C., KUO S.P., LEE H.W., HUANG C.F., “Location Tracking in a Wireless Sensor Network by Mobile Agents”, The Computer Journal, vol. 47, p. 448-460, 2004.
[WAN 04] WANG L., KULKARNI S.S., “MNP: Multihop Network Reprogramming Service for Sensor Networks”, Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems, p. 285-286, ACM Press, New York, USA, 2004.
[WER 05] WERNER-ALLEN G., WELSH M., RUIZ M., JOHNSON J., LEES J., “Monitoring Volcanic Eruptions with a Wireless Sensor Network”, 2nd European Workshop on Wireless Sensor Networks, Istanbul, Turkey, 2005.
[WOO 95] WOOLDRIDGE M., JENNINGS N.R., “Intelligent Agents: Theory and Practice”, The Knowledge Engineering Review, vol. 10, no. 2, p. 115-152, 1995.
Chapter 12
Vehicle-to-Vehicle Communications: Applications and Perspectives
Chapter written by Rabah MERAIHI, Sidi-Mohammed SENOUCI, Djamal-Eddine MEDDOUR and Moez JERBI.

12.1. Introduction

The objective of ambient intelligence is to create an intelligent daily space, which is immediately usable and integrated into our homes, our offices, our roads, our cars and everywhere. This new concept must be invisible; it must blend in with our normal environment and must be present when we need it. One of the applications of this concept consists of providing our cars and roads with capabilities to make the road safer (information about traffic, accidents, dangers, possible detours, weather, etc.) and to make our time on the road more enjoyable (Internet access, network games, helping two people follow each other on the road, chat, etc.). These applications are typical examples of what we call an intelligent transportation system (ITS), whose goal is to improve safety, efficiency and enjoyment in road transport through the use of new technologies for information and communication (NTIC). Traditional traffic management systems are based on centralized infrastructures where cameras and sensors installed along the road collect information on traffic density and state and transmit this data to a central unit, which processes it and makes appropriate decisions. This type of system is very costly in terms of deployment and is characterized by a long reaction time for processing and information transfer, in a context where information transmission delay is vital and extremely important in
this type of system. In addition, these devices placed on roads require periodic and expensive maintenance. Consequently, for large-scale deployment of this type of system, important investment is required in the communication and sensor infrastructure. However, with the rapid development of wireless communication technologies, location and sensors, a new decentralized (or semi-centralized) architecture based on vehicle-to-vehicle communications (V2V) has created a very real interest these last few years for car manufacturers, the R&D community and telecom operators. This type of architecture relies on a distributed and autonomous system and is made up of the vehicles themselves without the support of a fixed infrastructure for data routing. In this case, we are talking about a vehicular ad hoc network (VANET), which is no more than a specific application of traditional mobile ad hoc networks (MANET)1. An example of an urban VANET network is illustrated in Figure 12.1.
Figure 12.1. Example of VANET network [KOS 05]
1 A mobile ad hoc network (MANET) is an autonomous system made up of mobile stations interconnected by wireless connections without the management of a centralized infrastructure. Following existing communications in the network, the mobile stations (or nodes) can also assume the role of router to relay data.
In this chapter, we focus on the study of the main component of ITS systems, namely inter-vehicular communication (IVC), and its related services. For road safety services, the information on potential dangers (weather conditions, state of the road, operational state of a vehicle, etc.) can be exchanged in real time between vehicles to inform the drivers. Examples of services are not limited to road safety applications but exist for other types of applications as well, specifically comfort applications (mobile Internet access, convoy of cars, games, etc.), offering interesting perspectives for telecom operators looking for new service niches. The rest of this chapter is organized as follows: in section 12.2 we present a detailed description of the different vehicular communication characteristics, features and applications. As well as a presentation of existing projects in this field, section 12.3 will discuss the state of the art and the related work proposed in the literature to face the dynamics and constraints related to vehicular networks. In particular, the following problems will be addressed: routing, data dissemination, mobility models, medium control and security. Section 12.4 will provide the conclusion.

12.2. Properties and applications

12.2.1. Properties of VANETs

As an integral part of an ITS system, IVC combines the following technologies and disciplines, as represented in Figure 12.2:
– sensing and close environment perception: by using different sensors (weather conditions, state of the road, state of the vehicle, pollution and others) and cameras, the driver obtains a certain amount of information and a better visibility inside his vehicle, enabling him to react appropriately to changes in his immediate environment;
– processing: with a large processing capacity on board, vehicles nowadays are intelligent and are able to interpret the collected information with the purpose of helping the driver to make a decision (particularly in driver assistance systems);
– storage: a large storage space is required in this context in order to store different classes and types of information. These data structures are updated via events and decisions from the communication system. We should note that in a network of vehicles, energy and storage space are sufficiently available;
– routing and communication: for information exchange and diffusion in the vehicular network itself or with other networks (IP or cellular for example). This makes it possible to increase the precaution perimeter with the help of an extended perception of the environment and thus give a more accurate prediction of driving problems.
Figure 12.2. Intelligent vehicle [HUB 05]
These different technologies are present in all environments where IVC technology can be applied. Nevertheless, depending on the application sectors, the environment and its characteristics can differ: free, rural, semi-urban, urban space, tunnels; wireless communication properties, radio range and capacity can also be contrasted. From an architectural point of view, inter-vehicle communication system can be either pure ad hoc vehicle-to-vehicle, or hybrid using gateways to other networks and services. As previously mentioned, a VANET network represents a specific aspect of MANET networks. Nevertheless, research works studied and carried out in the field of MANETs cannot be applied directly in the context of vehicular networks because of the characteristics of VANET networks making the application of ad hoc network protocols and architectures inappropriate. In the following, we present a few properties and constraints related to the environment of vehicular networks which sets them apart from ad hoc networks: – processing, energy and communication capacity: contrary to the context of mobile ad hoc networks where energy constraint for example represents one of the problems discussed in the literature, vehicles in a VANET have no limit in terms of energy, have large processing capability and can have several communication interfaces (Wi-Fi, Bluetooth [BLU 01] and others); – environment and mobility model: environments considered in ad hoc networks are often limited to open spaces or indoors (as in the case of a conference or within a building). Vehicle movements are connected to road infrastructures, on highways or within a metropolitan zone. The constraints imposed by this type of environment, such as radio obstacles (for example, because of buildings) and multipath and fading effects, considerably affect the mobility model and radio transmission quality to consider in proposed protocols and solutions. In addition, mobility is directly connected to driver behavior;
– type of information transported and diffusion: since one of the key vehicle network applications is prevention and road safety, communications will mostly consist of messages broadcast from a source (or a point) to several recipients. Nevertheless, the vehicles concerned by such diffusion depend on their geographical location and their degree of involvement in the triggering event. In such situations, communications are mainly unidirectional;
– network topology and connectivity: contrary to ad hoc networks, VANET networks are characterized by high mobility linked to car speed, which is much higher on highways. Consequently, a node can join or leave the network in a very short period of time, which makes topology changes frequent. In addition, problems such as partitioning of the network into isolated clusters can appear frequently, mainly when the IVC system is not yet installed in the majority of vehicles. Proposed solutions must then consider this spatio-temporal constraint, where connectivity is one of the key parameters. The heterogeneity of nodes in terms of speed (cars and buses: buses have a regular, slower speed) offers additional information to consider in the development of solutions and architectures for vehicle networks. One of the constraints to be closely studied is the VANET network fragmentation problem caused by spatio-temporal conditions, specifically when the market penetration rate of these networks is low. This implies weak connectivity and very short route lifetimes. In addition, properties inherent to VANET networks, especially in terms of size, raise scaling problems and require a complete revision of existing solutions;
– from the point of view of sensor networks, a node (vehicle) in the network can be considered a high capacity sensor, equipped with various functions, or a local network made up of existing terminals in the vehicle. 
In addition the information collected by sensors in a vehicle can be combined to eliminate redundancies and decrease the number of transmissions. The energy constraint and mobility factor clearly differentiate sensor networks from vehicle networks, designed for different sectors of application. Moreover, information collected by sensors in vehicles is used in the operation of protocols and can generally affect network behavior.
12.2.2. VANET applications
The main IVC network applications can be classified into three categories: 1) road safety applications, 2) driver assistance applications, and 3) comfort applications. In what follows, we explain these categories in more detail and then give examples of applications:
– road safety applications: road safety has become a priority in most developed countries. This priority is motivated by the increasing number of accidents on roads due to the increasing number of vehicles. In order to improve safety in travel and
cope with road accidents, IVCs make it possible to warn of collisions and roadworks, to warn of obstacles (fixed or mobile) and to distribute weather information;
– applications to driver assistance systems and cooperative vehicles: to facilitate autonomous driving and support the driver in specific situations, such as assistance when overtaking, prevention of lane departure on straight or curved roads, etc. We can also mention the case of trucking companies using IVC to improve productivity by decreasing fuel consumption;
– comfort applications for the driver and passengers: user information and communication services in particular, such as mobile access to the Internet, electronic messaging, inter-vehicle chat, network games, etc.
In what follows, we limit ourselves to the description of a few services and examples of vehicle-to-vehicle communication system applications.
12.2.2.1. Alert in case of accidents
This service alerts vehicles driving towards the scene of an accident that traffic conditions have been modified and that it may be necessary to be more vigilant. When vehicle density is low, the message must also be retained so that it can be retransmitted if another vehicle enters the retransmission zone. Safety messages have to be transmitted periodically: the node(s) designated to retransmit messages send alert messages at regular intervals. Messages have to be short so that they can be transmitted quickly. They also need to carry the accident scene coordinates and the retransmission zone parameters.
12.2.2.2. Alert in case of abnormally slow traffic (traffic jam, roadworks, bad weather, etc.)
This service alerts car drivers to particular traffic situations. The driver is informed that it is necessary to slow down regardless of the nature of the traffic problem. The alert message is transmitted by a vehicle detecting traffic problems. An official vehicle carrying out roadworks can also trigger an alert message (see Figure 12.3). 
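As an illustration of the alert services described above, the following minimal Python sketch shows one possible representation of a short alert message carrying event coordinates and retransmission zone parameters, together with the regular instants at which a designated relay would resend it. The field names, the planar coordinates, the circular zone and the `retransmission_times` helper are all assumptions made for this example; the chapter does not prescribe a message format.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertMessage:
    event: str            # hypothetical label, e.g. "accident" or "slow_traffic"
    x_m: float            # event position (assumed planar coordinates, metres)
    y_m: float
    zone_radius_m: float  # retransmission zone, assumed circular here
    period_s: float       # interval between two retransmissions

    def in_zone(self, vx_m, vy_m):
        """True if a vehicle at (vx_m, vy_m) is inside the retransmission zone."""
        return math.hypot(vx_m - self.x_m, vy_m - self.y_m) <= self.zone_radius_m


def retransmission_times(msg, start_s, end_s):
    """Regular instants in [start_s, end_s) at which a relay resends the alert."""
    times = []
    k = 0
    while start_s + k * msg.period_s < end_s:
        times.append(start_s + k * msg.period_s)
        k += 1
    return times
```

A vehicle entering the zone would receive the alert at the next scheduled instant, which is why the message has to be retained by the relay when traffic is sparse.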
As with the alert message informing of an accident, the alert message informing of a slowdown must be transmitted to other vehicles efficiently and quickly.
12.2.2.3. Collaborative driving
Collaborative driving is a concept that considerably improves road transport safety and decreases the number of victims in accidents involving motor vehicles. This innovation is based on information exchanged between vehicles equipped with instruments (for example, sensors) enabling them to perceive what surrounds them and to collaborate in dynamically formed groups. These groups
of vehicles, or localized networks, can develop a collective driving strategy which would require little or no intervention from drivers. In the last few years, different automated vehicle architectures have been proposed, but most of them have barely tackled the inter-vehicular communication problem, if at all.
12.2.2.4. Highway hot spot
Today people can access websites from public areas (stations, airports, etc.), for example to download movies. In the car, we can imagine buying Internet content from a station or even from the highway (relaying from one car to another up to the closest access point in order to reach the wired network). Car passengers will be able to play network games, download MP3 files, send cards to friends, etc.
12.2.2.5. Parking management
This service gathers information on space availability in parking lots and coordinates between cars in order to guide them to free spaces (SmartPark project [SMA 05]).
Figure 12.3. VANET network application scenario examples: (a) car in trouble; (b) roadworks; (c) intelligent parking lot; (d) collision risk; (e) Internet access
12.3. State of the art and study of the existing situation 12.3.1. Projects and consortiums The first IVC studies emerged at the beginning of the 1980s in Japan (for example: Association of Electronic Technology for Automobile Traffic and Driving) with the increase of people or merchandise traveling, thus stimulating the exploration of new solutions such as automatic driving, intelligent road planning, etc. Several government institutions throughout the world have led an exploratory phase from different worldwide projects, involving a large number of research units. These projects have led to the definition of several possible prototypes and solutions, based on different approaches. In this way, traffic management systems were installed in large Japanese cities and on most urban and intercity highways. The Japanese have made large investments in the development of driver information systems. In the case of a highway, the system electronically monitors the speed and volume of traffic and gives drivers instant warnings on accidents and delays. Warnings and other information for drivers are displayed on different variable message signs. In the Japanese AHS (automated highway system) project, the goal was to design an automated highway system for autonomous driving: control of the vehicle is assumed by a computer on board. In the USA, there is the Intelligent Transportation Society of America (ITS America), which is a group of manufacturers, government agencies, universities and other enterprises. This group focuses on research, promotion and development and deployment coordination of ITS applications throughout the USA. As in Japan, the American government also implemented the NAHSC (National Automated Highway System Consortium) in 1995. In Europe, the PROMETHEUS (PROgraM for European Traffic with Highest Efficiency and Unprecedented Safety) project began in 1986 and included over 13 vehicle manufacturers and several universities from 19 European countries. 
In this context, several approaches and solutions concerning ITSs have been developed, implemented and demonstrated. The results of this first step were a detailed analysis of the problem and the development of a feasibility study to achieve a better understanding of the conditions and possible effects of applying the technology. Later, and with the technological advancement of communication, calculation and location equipment, other projects were carried out and have paved the way for some IVC applications. Due to the importance of this field, new projects were initiated throughout the world. In Europe, a certain number of large-scale projects have recently emerged focusing on problems related to IVC systems. Most of these projects were introduced in the context of research programs from the European Community (5th
and 6th Framework Programmes). However, a large majority of these projects focus on the exclusive use of existing infrastructure for implementing the IVC system, which can be extremely expensive. The DRiVE [DRI 99] and GST [GST 05] projects are good examples. DRiVE (Dynamic Radio for IP services in Vehicular Environments) aims at the convergence of different cellular technologies and high throughput networks (GSM, UMTS, DAB and DVB-T) in order to lay the necessary foundation for the development of innovative IP services for vehicles. The GST (Global Systems for Telematics) [GST 05] project is also intended for applications related to road safety. However, this project focuses on the use of the GSM network and on problems related to securing the network and service infrastructure, operation security and billing. In what follows, we review a few consortiums and projects undertaken in the last few years on V2V communications:
– the FleetNet (Internet on the road) project is a German project introduced by a consortium of six manufacturers and three universities [FLE 00]. FleetNet's objective was to develop a communication platform for vehicle networks, to implement a demonstrator, and to standardize the proposed solutions in order to ensure better safety and comfort for driver and passengers. The FleetNet architecture relies on a routing mechanism that uses a location and navigation system, and also considers vehicle to infrastructure communications in order to provide Internet access service;
– the Car2Car [CAR 05] communication consortium was launched by six European automobile manufacturers, and was open to providers, research organizations and other partners. The Car2Car consortium set itself the objective of improving road safety and efficiently managing traffic using IVCs. 
Its main missions were as follows: 1) creating an open European standard for V2V communications based on wireless LAN components, 2) developing V2V system prototype demonstrators for road safety applications, 3) promoting the allocation of a free exclusive frequency band for Car2Car applications in Europe, and 4) developing deployment strategies and economic models for market penetration;
– the European IST project CarTalk2000 [CAR 01] (coordinated by DaimlerChrysler between 2001 and 2004) was intended to develop cooperative driver assistance systems and implement a pure, self-organized wireless ad hoc network. UMTS radio access technology was adopted, in combination with a multi-hop routing protocol based on location. Besides technological aspects, the project studied factors related to market introduction strategies, including cost analyses and legal aspects;
– the NOW (Network-on-Wheels) project [NOW 04] (2004-2008) is a German project supported by the Federal Ministry of Education and Research and founded by automobile
manufacturers, telecommunications operators and academia. NOW supports and strongly cooperates with the Car2Car consortium. The communication protocols developed for the project are dedicated for security applications as well as for entertainment applications and provide an open communication platform for a large range of applications. One of NOW’s main objectives is the implementation of communication protocols and data security algorithms in vehicle networks. Considering the wireless IEEE 802.11 technology and location-based routing in a V2V or vehicle to infrastructure communication context, the goal is to implement a system of reference and to contribute to the standardization of such a solution in Europe in collaboration with the Car2Car consortium. Aspects concerning vehicle antennae are also addressed; – the integrated European PReVENT project [PRE 04], co-financed by the European Commission, was introduced to improve road safety by developing and demonstrating preventive road safety technologies and applications. Its objective is to reduce the number of accidents by 50% by 2010, as indicated in the eSafety action [ESA 05] for European Union road transport. PReVENT makes it possible to: 1) study and evaluate preventive safety applications using sensors, positioning and communication technologies integrated in embedded systems for driving assistance; 2) contribute in the technological development and integration, and 3) contribute to a quick market introduction; – on the French side, the MobiVip project [MOB 05] from the PRéDIT3 program is one of the most recent in the field. It focuses on research and experimentation of key technological bricks for the integrated deployment of mobility services in urban areas based on a “public individual vehicle” transport system and an information system which integrates into the policy of global traffic management on the city scale. 
The goal is the creation of new tools for multimodality (hardware, software and model building blocks) based on integration, at the junction of assisted and automatic driving, telecommunications, transport modeling and service evaluation. However, studies carried out in MobiVip do not focus on the network aspect.
12.3.2. Study of the existing situation
In this section, we present a few proposals relating to VANET networks. They remain proposals only, and no standard has yet been developed.
12.3.2.1. Routing
Before addressing routing issues in vehicle networks, we briefly recall a few principles and studies surrounding routing in mobile ad hoc networks (MANET).
12.3.2.1.1. Routing protocols in MANET
Figure 12.4. Classification of ad hoc routing algorithms
As illustrated in Figure 12.4, there are two classes of MANET routing protocols, the flat and the large-scale routing protocols:
– the first class, flat routing, can be divided into two subsets: proactive protocols (FSR [GER 01], OLSR [CLA 03], TBRPF [OGI 04]) on one side and reactive protocols (DSR [JOH 04], AODV [PER 03]) on the other. A proactive protocol maintains routes to every destination in the network, so that a route is instantly available when needed. Conversely, reactive protocols determine a route only when requested. A period of time is therefore necessary for a route search;
– the large-scale class includes geographical routing protocols and hierarchical routing protocols:
- geographical routing: a routing technique based on location information. All protocols in this group make extensive use of geographical position information, implicitly raising the need for a global location service providing this information such as GPS, Galileo, ZigBee [ZIG 05]. Also, a source node needs to know the current position of a desired communication partner. Generally, this information is assumed to be provided by a location service. Geographical protocols can be further subdivided into two categories 1) enhanced-topological and 2)
position-based. The first uses location information to enhance existing topology-based protocols to make them more suitable. These improvements mainly aim to reduce the number of route discovery messages sent. The use of route discovery algorithms is more relevant in certain areas of the network than in others, and position determination technology makes it possible to delimit a search perimeter within which the route discovery protocol is more efficient. An example of this strategy is the location-aided routing (LAR) protocol [KOY 98], a protocol based on DSR. The strength of the second category of geographical algorithms, such as GGAR [NAV 97], GPSR [KAR 00] and GRP [JAI 01], resides in their capacity to find the best possible geographical route for each transmitted packet while having only a restricted view of the network or only partial location information;
- hierarchical routing: finally, the large-scale class includes the subclass of protocols that rely on hierarchical approaches. The main objective is to partition the network into clusters to provide better routing information dissemination. Clustering consists of classifying network nodes in a hierarchical way following specific parameters: address, geographical zone, capabilities, etc. A subset of nodes is elected in a completely distributed way to assume the role of local coordinator. This type of hierarchical routing approach (for example, CBRP [MIN 99]) is intended to reduce the size of the routing table, which depends on the clustering structure used. A clustering algorithm is based on the following steps: clusterhead formation (election), communication between clusterheads, and their maintenance.
Routing based on location is known for its robustness in terms of network size scalability. It is a good candidate for VANET networks. A few studies, such as [SEN 05], have clearly demonstrated this.
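To make the position-based category more concrete, the sketch below shows greedy geographic forwarding, the basic per-packet decision used by position-based protocols such as GPSR: a node hands the packet to the neighbor that is geographically closest to the destination, using only its own position, its neighbors' positions and the destination position. The function name and data layout are assumptions for illustration, and the recovery step needed when no neighbor is closer than the current node is deliberately left out.

```python
import math


def greedy_next_hop(current, neighbors, destination):
    """Pick the neighbor closest to the destination.

    current, destination: (x, y) positions.
    neighbors: dict mapping a neighbor id to its (x, y) position.
    Returns the chosen neighbor id, or None when no neighbor is closer
    to the destination than the current node (a local maximum).
    """
    best_id = None
    best_dist = math.dist(current, destination)
    for node_id, position in neighbors.items():
        d = math.dist(position, destination)
        if d < best_dist:
            best_id, best_dist = node_id, d
    return best_id
```

Only one-hop neighbor positions are needed, which is what allows this family of protocols to work with a restricted view of the network.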
The authors of [SEN 05] evaluated the performance of three ad hoc routing protocols (AODV, DSR, and LAR). Simulation results showed that LAR performs better in terms of end-to-end delay and network overhead in an IVC environment.
12.3.2.1.2. Routing protocols for VANET
Different solutions for routing in IVCs have been proposed. We describe them below.
A-STAR (Anchor-based Street and Traffic Aware Routing) [LIM 05] is a position-based routing protocol for a metropolitan IVC environment. It mostly uses information on city bus routes to identify an anchor route with high connectivity for packet transmission. A-STAR is similar to the GSR protocol since it adopts an anchor-based routing approach that takes road topology into account. However, contrary to GSR, it calculates anchor paths according to traffic (bus traffic, vehicle traffic, etc.). A weight is assigned to each street according to its capacity (a wide road or a small road served by a different number of buses). Route
information provided by buses gives an idea of the traffic load in each street. This provides an image of the city vehicular traffic at different times. One possible extension of this study is to assign dynamic weights that change according to this retrieved information and to the traffic at a given time, in order to improve the quality of anchor calculation. For performance studies, the M-Grid mobility model was used to describe vehicle movement within a city.
The authors of [BLU 03] propose a clustering algorithm (Clustering for Open IVC Networks – COIN) adapted to IVC networks which improves cluster stability. The clustering mechanism is designed to enhance scalability. Cluster selection is based on node mobility, driver behavior and distance between cars. With a reduced additional control load, this protocol gives clusters a lifetime that is approximately twice as long and decreases changes in cluster affiliation by at least 46%.
The reactive routing protocol proposed in [LOC 03] is based on location information. It uses the city map to facilitate the routing function. The authors use a reactive location service to obtain the position of another vehicle. It is the equivalent of a route discovery procedure in topology-based routing protocols. The authors are currently working on extending the solution to minimize the overhead generated by diffusions.
12.3.2.2. Data dissemination and diffusion
Dissemination of information consists of forwarding data from a source to one or more destinations while ensuring a reduced transmission delay, strong reliability and better resource usage. Destination nodes concerned by the dissemination mechanism can be characterized by their position, IP address, geographical region, etc. In a MANET network, broadcasting protocols use flooding for route construction and maintenance. Flooding is the most naive approach: every node rebroadcasts each packet it receives. 
The problem (referred to in [TSE 02] as the broadcast storm problem) is that this systematic rebroadcast wastes bandwidth, since each node receives the same information many times. In addition, in dense ad hoc networks, systematic rebroadcasting by every node generates a large number of collisions that are not corrected by the MAC layer (there is no ACK for broadcast frames). This decreases the efficiency and reliability of diffusion. However, other types of diffusion better adapted to IVC environments are now possible, notably multicast and geocast.
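The redundancy behind the broadcast storm problem can be illustrated with a small simulation. The sketch below is a simplified model, not a protocol implementation: it assumes an ideal channel without collisions and a connectivity graph given as adjacency lists, and it lets every node rebroadcast the packet exactly once, on first reception. The `flood` function and its return values are names chosen for this example.

```python
from collections import deque


def flood(adjacency, source):
    """Blind flooding: each node rebroadcasts once, on first reception.

    adjacency: dict mapping each node to the list of nodes in radio range.
    Returns (number of transmissions, dict of receptions per node).
    """
    seen = {source}
    receptions = {node: 0 for node in adjacency}
    pending = deque([source])
    transmissions = 0
    while pending:
        node = pending.popleft()
        transmissions += 1              # one broadcast by this node
        for neighbor in adjacency[node]:
            receptions[neighbor] += 1   # every neighbor hears the broadcast
            if neighbor not in seen:
                seen.add(neighbor)
                pending.append(neighbor)
    return transmissions, receptions
```

In a dense cluster of four vehicles that are all within range of one another, this model produces four transmissions and three receptions per vehicle, although a single broadcast by the source would have reached everyone.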
Multicast is used by applications wishing to transmit information to more than one destination. A node that wants to receive data must first join a multicast group. Messages sent are then received by all members of the group. Geocast adopts the same operating principle, with the difference that instead of explicitly joining a multicast group, nodes are implicit members of the same group if they are in the same geographical zone. In this case, the group becomes a geocast group. In this type of protocol, the following terminology is used: 1) geocast group: members of a group are defined by their geographical location; 2) geocast zone: the geographical area where all mobile node members of a geocast group are located. Entering the zone amounts to joining the group, and leaving it amounts to leaving the group; and 3) forwarding zone: the zone where data packets are forwarded. Each geocast group has a forwarding zone, and only nodes that are inside can forward packets. A geocast zone is included in a forwarding zone.
Below, we briefly present a few data dissemination solutions for vehicular networks. Because of the challenges of road safety applications, vehicular networks must integrate data dissemination mechanisms which are efficient and reliable.
MDDV (mobility-centric data dissemination algorithm for vehicular networks) [WU 04] is a diffusion algorithm which, contrary to other geographical algorithms, assumes that vehicles do not know the positions of their surrounding vehicles. The road system is modeled as a directed graph where nodes represent intersections and edges represent road segments. A weight is associated with each edge to reflect the corresponding traffic density and distance. MDDV uses a forwarding path specified as the route with the smallest sum of weights from a source to a "destination region" in the directed graph. 
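The forwarding path selection described for MDDV, the route with the smallest sum of weights in a directed graph, is a shortest-path computation. The following sketch uses Dijkstra's algorithm as one standard way to obtain such a path; it is not taken from [WU 04]. The graph encoding, the representation of the destination region as a set of intersections and the function name are assumptions, and the edge weights are taken as given rather than derived from traffic density and distance.

```python
import heapq


def min_weight_path(graph, source, region):
    """Cheapest path from source to any intersection in region.

    graph: dict mapping an intersection to {successor: non-negative weight}.
    region: set of intersections forming the destination region.
    Returns (total weight, list of intersections), or (inf, []) if
    the region cannot be reached.
    """
    best = {source: 0.0}
    previous = {}
    heap = [(0.0, source)]
    settled = set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node in settled:
            continue
        settled.add(node)
        if node in region:
            path = [node]
            while path[-1] != source:
                path.append(previous[path[-1]])
            return cost, path[::-1]
        for successor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(successor, float("inf")):
                best[successor] = new_cost
                previous[successor] = node
                heapq.heappush(heap, (new_cost, successor))
    return float("inf"), []
```

Because road segments are directed edges, a one-way street simply appears in one direction only, and the same routine applies unchanged.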
The urban multi-hop broadcast protocol [KOR 04] is a broadcast algorithm modifying the 802.11 access layer to adapt it to the IVC context with the goal of reducing collisions and efficiently using bandwidth. It includes two phases: the first one is called directional broadcast where the source selects the furthest node in the diffusion direction to perform data forwarding with no topology information, and the second is the intersection broadcast, which disseminates packets in all directions using repeaters installed at intersections. RBM (role-based multicast): in [BRI 00], the authors propose a multicast protocol where each node maintains two lists: a list of neighbors and a list of transmitting nodes. Depending on the contents of these two lists, a node decides whether or not to rebroadcast the message after a certain period of time. In other words, this approach allows nodes to wait with rebroadcasts until new neighbors move into their vicinity.
In [SUN 00], dissemination protocols called TRADE (TRack DEtection) and DDT (distance defer time) were proposed. For TRADE, the objective is to guarantee better reliability with a limited number of rebroadcasts. A vehicle must designate, among its neighbors, those that will retransmit messages, based on their movements. With DDT, a vehicle waits for a defer time before rebroadcasting a received message; if it receives the same message from another vehicle during this time, it does not rebroadcast it.
IVG (inter-vehicle geocast) [BEN 04] is a dissemination mechanism which generalizes the previous methods (TRADE and DDT) in order to address the problems of network fragmentation, reliability and neighbor determination. Dynamic relays are introduced to periodically rebroadcast alert messages. These relays are selected according to their relative distance to the source vehicle. A simulation study showed that IVG outperforms TRADE and DDT in terms of reliability. Other geocast solutions can also be found in [SUN 05] or [MAI 05].
12.3.2.3. Mobility models for vehicular networks
Simulations play an important role in the validation of new protocols, as they allow hypotheses to be tested in a relatively short time. An important aspect of simulations in MANETs in general and VANETs in particular is the definition of the movement of nodes in the simulated area. It has been recognized that movement patterns greatly influence simulation results. Consequently, particular attention must be given to the development and definition of a mobility model, considering the characteristics and constraints of the modeled environment. For mobile ad hoc networks, the random waypoint (RWP) model [JOH 96] is one of the most widely used mobility models in simulations. In this model, each node individually chooses a random destination within the network's geographical limits and also chooses a random movement speed (between a minimum and a maximum). 
Once the node reaches its destination, it makes a pause during a time period. The node then repeats the process by choosing a different destination and random speed. In this case, mobile nodes move randomly and independently from road topology. There are other mobility models [CAM 02] called “group mobility models” which represent mobile nodes where movements are mutually dependent and which are adapted to applications involving group communications. These mobility models for MANET suffer from a few limits such as convergence of nodes to the center of the network’s surface over time, which gives inadequate node distribution [BET 04]. These models cannot be directly used in vehicle networks where movements and speeds are delimited and predefined by routes and driver behavior. In order to design an appropriate realistic mobility model for IVCs, we have to take into account the environmental characteristics:
– highway: open environment characterized by a fast movement speed (with limits: minimum and maximum speed), car acceleration and deceleration and node density based on the time of day; – city: moderate speed with a higher intersection probability. There are stop lights, a large vehicle density and the existence of roads that are busier than others (main roads, commercial or tourist areas, for example); – open countryside: characterized by slower speeds with lower car density. Road traffic modeling is a research subject where several studies have been carried out in the field of intelligent transport systems. A variety of simulation tools such as PARAMICS [CAM 96], CORSIM [COR 96] and VISSIM [VIS 05] were developed to analyze micro and macro mobility features. For example, CORSIM [COR 96] is a microscopic traffic simulator developed by the Federal Highway Administration in the USA and widely used in ITS. In CORSIM, node mobility is determined considering driver behavior, vehicle characteristics and restrictions from the road and surrounding vehicles. However, few efforts are spent in the integration of techniques and communication scenarios. A few studies on mobility models have been proposed recently. For example, [SAH 04] proposed a mobility model based on vehicle movements following real and specific road plans. A comparison with the RWP model was conducted. Developed for the NS-2 simulator [FAL 05], this model is available and can be freely used for vehicular network simulations. However, aspects such as wait time at intersections or the existence of busier streets have not been considered in the model. In [LEB 05], a generic mobility model based on the Random Trip Mobility model family was studied. Simulations using city maps were performed. In [KAR 05], a tool called MOVE based on SUMO [SUM 05] was developed to generate mobility trace files usable with NS-2 or Qualnet [QUA 05]. 
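For reference when comparing with the road-constrained models discussed in this section, the random waypoint model described earlier can be sketched in a few lines. This is a minimal single-node version under stated assumptions: a rectangular area, uniformly drawn destinations and speeds, a fixed pause time and a strictly positive minimum speed. The function name, parameters and trace format are chosen for illustration and do not correspond to any particular simulator.

```python
import math
import random


def random_waypoint(width, height, v_min, v_max, pause, duration, rng):
    """Generate a random waypoint trace for one node.

    Returns a list of (time, x, y) tuples: the start position, then for
    each leg the arrival at the destination and the end of the pause.
    v_min must be strictly positive so that every leg has finite duration.
    """
    t = 0.0
    x, y = rng.uniform(0, width), rng.uniform(0, height)
    trace = [(t, x, y)]
    while t < duration:
        # Choose a random destination and a random speed for this leg.
        next_x, next_y = rng.uniform(0, width), rng.uniform(0, height)
        speed = rng.uniform(v_min, v_max)
        t += math.hypot(next_x - x, next_y - y) / speed
        x, y = next_x, next_y
        trace.append((t, x, y))   # arrival at the waypoint
        t += pause
        trace.append((t, x, y))   # end of the pause
    return trace
```

The node moves in straight lines regardless of any road layout, which is precisely the limitation that street-based models address.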
In STRAW (STreet RAndom Waypoint) [CHO 05], the movement of nodes is restricted to streets defined by data from real city maps and their mobility is limited according to road congestion. The model is divided into components: intra-segment mobility, inter-segment mobility and a road management and execution component. In [MAN 05], a simulation tool called GrooveSim was developed. It integrates a set of mobility, communication, traffic and path models. It is one of the few tools to offer a mobility model that can be tested with a geographical dissemination protocol for vehicular networks. Developed in C++ for Linux platforms, it offers four basic mobility models: a uniform speed model with a minimum and maximum speed; a 4-state Markov-based probabilistic model; a load-based model; and a street map-based maximum speed model. GrooveSim generates street maps from anywhere in the USA by importing TIGER/Line-type files [TIG 04] that are freely available in the USA. Nevertheless, TIGER files do not offer specific information on the number of lanes per road or one-way systems, and
Vehicle-to-Vehicle Communications
301
contain no road signs. Because GrooveSim offers basic models, it should be expanded to support more realistic communication models and more appropriate traffic models. Mobility models generally proposed are quite recent and mainly tested with ad hoc routing protocols (DSR, AODV, etc.). It would be more suitable to use them with (routing or other) protocols specific to IVC constraints. It would also be appropriate to combine the approaches used in road traffic simulators with communication simulators (CORSIM and NS-2, for example). 12.3.2.4. MAC and physical layers Currently, there are two main approaches for the design of MAC protocols specific to IVC networks. They differ depending on the radio interface used. The first approach is based on WLAN physical layers, such as IEEE 802.11 and Bluetooth. The alternative approach consists of enhancing 3G cellular telephony layers such as CDMA with decentralized access. On one hand, the advantage with the first approach is the distributed coordination support in ad hoc mode, but the flexibility of resource allocation and transmission throughput is low. On the other hand, 3G extensions provide higher throughput and more flexible resource allocation, but are badly affected by a complexity linked to the coordination function developed in distributed mode. We will now discuss a few propositions which attempt to improve certain aspects of existing norms. The authors of [KAT 03] proposed a distributed access protocol called LCA (location-based channel access). Based on location information, the LCA protocol divides a geographical surface into cells with one channel per cell. In each cell, any multiple access protocol such as CSMA or CDMA can be used. We nevertheless think that a set of simulations in the vehicular context is necessary to evaluate the validity of this type of solution in a network as dynamic as IVC networks. 
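The location-to-channel idea behind LCA can be sketched as follows. This is only an illustration under stated assumptions, not the protocol of [KAT 03]: the square cell shape, the cell size, and the spatial reuse pattern in `channel_of` are hypothetical choices, and the function names are invented for this example.

```python
import math


def cell_of(x_m, y_m, cell_size_m):
    """Return the (column, row) index of the square cell containing a position."""
    return (math.floor(x_m / cell_size_m), math.floor(y_m / cell_size_m))


def channel_of(cell, reuse=3):
    """Map a cell to one of reuse * reuse channels (assumed reuse pattern)."""
    col, row = cell
    # Python's % is non-negative for a positive modulus, so negative
    # cell indices (positions west or south of the origin) also work.
    return (row % reuse) * reuse + (col % reuse)


def channel_for_position(x_m, y_m, cell_size_m=100.0, reuse=3):
    """Channel a vehicle would use at position (x_m, y_m), in meters."""
    return channel_of(cell_of(x_m, y_m, cell_size_m), reuse)
```

With these assumptions, two vehicles located in the same cell obtain the same channel and then contend for it with whatever multiple access protocol is used inside the cell, while vehicles in adjacent cells obtain different channels. A vehicle only needs its own position (for example from GPS) to compute its channel, which is what makes the scheme decentralized.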
There are several proposals for the use of R-ALOHA² (reservation ALOHA) for distributed channel allocation [BOR 02, BOR 03, HAL 01, RUD 03]. For example, the authors of [BOR 02] proposed the RR-ALOHA (reliable R-ALOHA) protocol, based on UTRA TDD. In this protocol, additional information concerning the status of each slot is transmitted to all nodes, so that the same slot is not reserved twice.
There are also a few MAC protocols for ad hoc networks combining CDMA with random access to the channel. As an example, we can cite RA-CDMA (random access CDMA) [SOU 88], where transmitting stations begin their transmission immediately and independently of the state of the channel. One of the improvements of RA-CDMA is CA-CDMA (channel-adapted CDMA) [MUQ 03]. It uses a modified RTS/CTS reservation mechanism where the channel is divided into control and data channels. RTS/CTS messages are sent on the control channel in order to inform interfering nodes of the state of the channel (unlike in IEEE 802.11, however, interfering nodes can transmit simultaneously, although only under certain conditions). Simulation results (in particular the comparison between CA-CDMA and IEEE 802.11) showed that this protocol is promising for ad hoc networks. However, other dedicated studies involving an IVC environment are necessary.
Even though some MAC protocols have been proposed, more work is needed to put them into practice. Currently, IEEE 802.11b is the most widely used for demonstrations [FUB 03], and IEEE 802.11a was chosen by ASTM (American Society for Testing and Materials) to serve as the foundation for the DSRC³ (dedicated short range communication) standard [DSR 03].
2 R-ALOHA is a fully distributed random access protocol with reservation. It is based on slotted ALOHA with regular slot allocation. If a station succeeds in transmitting in a given slot, it reserves the same slot in the following frames. This slot reservation reduces contention.
3 DSRC is a communication standard using the 5.850–5.925 GHz band in the USA. It is a variation of the IEEE 802.11a technology for V2V and V2I communications.
12.3.2.5. Security in vehicular networks
Communications in a vehicular network, and information about vehicles and their drivers, must be protected and secured to ensure the proper operation of an intelligent transportation system. The sensitivity of the data transmitted over a VANET creates a strong need for security. The importance of security in this context is vital because of the critical consequences of a violation or attack. In addition, in a highly dynamic environment characterized by almost instant arrivals and departures of cars and by short connection times, the deployment of a security solution must cope with specific configurations and constraints. The need for secure data transmission in VANETs was known from the beginning.
Only recently, however, has this problem aroused interest, and a few solutions have been developed. A brief description of some of these proposals follows. The authors of [GOL 04] propose a model to evaluate the validity of data circulating in the VANET. In [RAY 05], the authors provide a detailed analysis of the different attacks in vehicular networks and propose a security architecture whose protocols are described and evaluated. In addition, they show that public key cryptography is suitable for VANETs. In [BLU 04], the authors presented the SecCar architecture, which relies on a public key infrastructure to offer security solutions in IVCs. The use of digital signatures is studied and discussed in [LUT 02]; it is also supported by El Zarki et al. [ZAR 02], who discuss security requirements for a system using a public key infrastructure in a VANET environment.
The approach implemented for the NOW project [NOW 04] consists of the construction of attack trees. From a generic model, an attack tree is built based on the security requirements of the system and the different vulnerabilities related to the proposed services. The focus is first on general attacks such as the insertion of false messages, denial of service (DoS), and privacy violation. The authors of [DUR 02] focus on privacy guarantees and data integrity in telematics applications. They present a data protection solution built on standards such as SSL or IPsec. Other proposals exist, but they address only specific aspects of vehicular networks and do not offer a global solution for the vehicular network context.
12.4. Conclusion
In the last few years, the development of new technologies has sparked an incredible evolution of the transportation system. This evolution is intended to make networks more secure, efficient, reliable and ecological without necessarily having to modify the hardware of the existing infrastructure. The range of technologies involved includes information and sensor technologies, and control and communication systems; it touches disciplines such as transportation, engineering, telecommunications, computing, finance, electronic commerce and automobile manufacturing.
The main objectives of an intelligent transportation system include: 1) the improvement of road safety, 2) the improvement of the global efficiency of the transportation system by reducing travel time and traffic jams, 3) the integration of transportation in a sustainable development policy, particularly by reducing gas emissions from vehicles and heavy trucks and by optimizing maintenance of the infrastructure, and 4) the improvement of user comfort by providing users with a selection of information, decision support, guidance and Internet access services.
The main goal of this chapter is to provide an insight into, and a better understanding of, one of the main components of these ITS systems, that is, IVC networks, or what we call vehicular ad hoc networks (VANETs), which are a particular class of MANETs. The characteristics and applications of these systems, as well as a group of projects and research studies relating to this field, were presented. Even though vehicular networks are similar to the mobile ad hoc network environment, their inherent problems must be closely studied, and some existing ad hoc network solutions must be revised and adapted. In this chapter, we also presented some recent proposals concerning routing and data dissemination, mobility models, the channel access layer, and aspects linked to security. These studies, although few in number, attempt to respond to the characteristics and constraints of the environment. We think that particular attention must be given to mobility models for a better representation of the real context (parameters such as lane changes, traffic lights, high-traffic areas, and the use of topographical information provided by maps). These models are necessary for testing these large-scale communication systems by simulation. In addition, traffic models and the interconnection with other networks must be taken into account in studies carried out for vehicular networks.
12.5. Bibliography
[BEN 04] BENSLIMANE A., BACHIR A., “Réseaux Ad Hoc Mobiles: Géodiffusion Inter-Véhicules”, in G. Pujolle (ed.), L’Internet ambiant, p. 215-236, Hermes, Paris, 2004.
[BET 04] BETTSTETTER C., HARTENSTEIN H., PÉREZ COSTA X., “Stochastic Properties of the Random Waypoint Mobility Model”, ACM/Kluwer Wireless Networks: Special Issue on Modeling and Analysis of Mobile Networks, vol. 10, no. 5, September 2004.
[BLU 01] BLUETOOTH SIG, Bluetooth Specification Version 1.1, https://www.bluetooth.org/spec, 2001.
[BLU 03] BLUM J., ESKANDARIAN A., HOFFMAN L., “Mobility Management for IVC Networks”, Proceedings of IEEE Intelligent Vehicles Symposium, p. 150-155, Columbus, OH, June 2003.
[BLU 04] BLUM J., ESKANDARIAN A., “The Threat of Intelligent Collisions”, IT Professional, vol. 6, no. 1, p. 24-29, January-February 2004.
[BOR 02] BORGONOVO F., CAPONE A., CESANA M., FRATTA L., “RR-ALOHA, a Reliable R-ALOHA Broadcast Channel for Ad-Hoc Inter-Vehicle Communication Networks”, Proceedings of Med-Hoc-Net 2002, Baia Chia, Italy, 2002.
[BOR 03] BORGONOVO F., CAPONE A., CESANA M., FRATTA L., “ADHOC MAC: A New, Flexible and Reliable MAC Architecture for Ad-Hoc Networks”, Proceedings of IEEE Wireless Communications and Networking Conference (WCNC’03), New Orleans, USA, March 2003.
[BRI 00] BRIESEMEISTER L., HOMMEL G., “Role-Based Multicast in Highly Mobile but Sparsely Connected Ad Hoc Networks”, Proceedings of the 1st ACM International Symposium on Mobile Ad Hoc Networking & Computing, Boston, Massachusetts, USA, 2000.
[CAM 96] CAMERON G., DUNCAN G., “PARAMICS-Distributed Microscopic Simulation of Road Traffic”, Journal of Supercomputing, vol. 10, p. 25-53, 1996.
[CAM 02] CAMP T., BOLENG J., DAVIES V., “A Survey of Mobility Models for Ad Hoc Network Research”, Wireless Communications & Mobile Computing (WCMC): Special Issue on Mobile Ad Hoc Networking: Research, Trends and Applications, p. 483-502, 2002.
[CAR 01] Safe and Comfortable Driving Based Upon Inter-Vehicle Communication, http://www.cartalk2000.net, 2001.
[CAR 05] Car2Car Communication Consortium, www.car-to-car.org, 2005.
[CHO 05] CHOFFNES D.R., BUSTAMANTE F.E., “An Integrated Mobility and Traffic Model for Vehicular Wireless Networks”, Proceedings of the 2nd ACM International Workshop on Vehicular Ad Hoc Networks (VANET), Cologne, Germany, September 2005.
[CLA 03] CLAUSEN T., JACQUET P., “Optimized Link State Routing Protocol (OLSR)”, IETF Request for Comments: RFC 3626, October 2003.
[COR 96] CORSIM User Manual, Version 1.01, Federal Highway Administration, US Department of Transportation, 1996.
[DRI 99] Project DRiVE, http://www.ist-drive.org/index2.html, 1999.
[DSR 03] Standard Specification for Telecommunications and Information Exchange Between Roadside and Vehicle Systems – 5 GHz Band Dedicated Short Range Communications (DSRC) Medium Access Control (MAC) and Physical Layer (PHY) Specifications, ASTM E2213-03, September 2003.
[DUR 02] DURI S., GRUTESER M., LIU X., MOSKOWITZ P., PEREZ R., SINGH M., TANG J., “Framework for Security and Privacy in Automotive Telematics”, Proceedings of the 2nd International Workshop on Mobile Commerce, p. 25-32, Atlanta, USA, 2002.
[ERM 04] ERMEL E., Localisation et Routage géographique dans les réseaux sans fil hétérogènes, PhD thesis, Laboratoire d’Informatique de Paris VI, June 2004.
[ESA 05] e-Safety, http://europa.eu.int/information_society/activities/esafety/index_en.htm, 2005.
[FAL 05] FALL K., VARADHAN K., “The ns Manual”, http://www.isi.edu/nsnam/ns/doc/index.html, 2005.
[FLE 00] FleetNet project – Internet on the road, http://www.et2.tu-harburg.de/fleetnet, 2000.
[FUB 03] FÜßLER H., HARTENSTEIN H., FRANZ W., ENKELMANN W., MOSKE M., WAGNER C., “The Fleetnet Demonstrator”, Demos of the 9th ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom’03), San Diego, California, USA, September 2003.
[GER 01] GERLA M., HONG X., PEI G., “Fisheye State Routing Protocol (FSR) for Ad Hoc Networks”, IETF Internet Draft, draft-ietf-manet-fsr-02.txt, December 2001.
[GOL 04] GOLLE P., GREENE D.H., STADDON J., “Detecting and Correcting Malicious Data in VANET”, Proceedings of the First ACM Workshop on Vehicular Ad Hoc Networks, p. 29-37, Philadelphia, USA, October 2004.
[GST 05] Project GST, http://www.gstproject.org, 2005.
[HAA 02] HAAS Z., PEARLMAN M., SAMAR P., “The Zone Routing Protocol (ZRP) for Ad Hoc Networks”, IETF Internet draft, draft-ietf-manet-zone-zrp-02.txt, July 2002.
[HAL 01] LOTT M., HALFMANN R., SCHULZ E., RADIMIRSCH M., “Medium access and radio resource management for ad hoc networks based on UTRA TDD”, Proceedings of the 2nd ACM/SIGMOBILE Symposium on Mobile Ad Hoc Networking & Computing (MobiHoc’01), Long Beach, CA, USA, October 2001.
[HUB 05] HUBAUX J.-P., “Vehicular Networks: How to Secure Them”, MiNeMa Summer School, Klagenfurt, July 2005.
[JAI 01] JAIN R., PURI A., SENGUPTA R., “Geographical Routing Using Partial Information for Wireless Ad Hoc Networks”, IEEE Personal Communications, vol. 8, p. 48-57, February 2001.
[JOH 96] JOHNSON D.B., MALTZ D.A., “Dynamic Source Routing in Ad Hoc Wireless Networks”, in T. Imielinski, H. Korth (ed.), Mobile Computing, vol. 353, p. 153-181, Kluwer Academic Publishing, Boston, 1996.
[JOH 04] JOHNSON D.B., MALTZ D.A., HU Y.-C., The Dynamic Source Routing Protocol for Mobile Ad Hoc Networks (DSR), Internet draft, , July 19 2004.
[KAR 00] KARP B., KUNG H.T., “GPSR: Greedy Perimeter Stateless Routing for Wireless Networks”, Proceedings of ACM/IEEE MOBICOM’00, Boston, USA, August 2000.
[KAR 05] KARNADI F.K., MO Z.H., LAN K.-C., “Rapid Generation of Realistic Mobility Models for VANET”, Proceedings of the Eleventh Annual International Conference on Mobile Computing and Networking (MobiCom 2005), Cologne, Germany, August-September 2005.
[KAT 03] KATRAGADDA S., MURTHY G., RAO R., KUMAR M., SACHIN R., “A Decentralized Location Based Channel Access Protocol for Inter-Vehicle Communication”, Proceedings of the 57th IEEE Semiannual Vehicular Technology Conference (VTC’03 Spring), Jeju, Korea, April 2003.
[KOR 04] KORKMAZ G., EKICI E., OZGUNER F., OZGUNER U., “Urban Multi-Hop Broadcast Protocol for Inter-Vehicle Communication Systems”, Proceedings of First ACM Workshop on Vehicular Ad Hoc Networks (VANET 2004), p. 76-85, Philadelphia, PA, USA, October 2004.
[KOS 05] KOSCH T., “Ad-Hoc Connected Vehicles”, MiNeMa Summer School, Klagenfurt, July 2005.
[KOY 98] KO Y., VAIDYA N., “Location-Aided Routing (LAR) in Mobile Ad Hoc Networks”, Proceedings of ACM/IEEE MOBICOM’98, p. 66-75, Dallas, USA, August 1998.
[LEB 05] LE BOUDEC J.-Y., VOJNOVIC M., “Perfect Simulation and Stationarity of a Class of Mobility Models”, Proceedings of IEEE Infocom 2005, Miami, USA, 2005.
[LIM 05] LIM T.M., SEET B.C., LEE B.S., YEO C.K., KASSLER A., “Pervasive Communication for Commuters in Public Buses”, Proceedings of IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOMW’05), p. 75-79, Washington, USA, 8-12 March 2005.
[LOC 03] LOCHERT C., HARTENSTEIN H., TIAN J., FÜßLER H., HERRMANN D., MAUVE M., “A Routing Strategy for Vehicular Ad Hoc Networks in City Environments”, Proceedings of IEEE Intelligent Vehicles Symposium (IV2003), Ohio, USA, June 2003.
[LUT 02] GOLLAN L., MEINEL C., “Digital signatures for automobiles”, in Systemics, Cybernetics and Informatics (SCI), 2002.
[MAI 05] MAIHÖFER C., LEINMÜLLER T., SCHOCH E., “Abiding Geocast: Time-Stable Geocast for Ad Hoc Networks”, Proceedings of the Second ACM International Workshop on Vehicular Ad Hoc Networks (VANET 2005), p. 30-39, Cologne, Germany, September 2005.
[MAN 05] IETF Mobile Ad-hoc Networks (MANET) charter, http://www.ietf.org/html.charters/manet-charter.html.
[MAN 05] MANGHARAM R., WELLER D.S., STANCIL D.D., RAJKUMAR R., PARIKH J.S., “GrooveSim: A Topography-Accurate Simulator for Geographic Routing in Vehicular Networks”, Proceedings of the Second ACM International Workshop on Vehicular Ad hoc Networks (Mobicom/VANET 2005), Cologne, Germany, September 2005.
[MIN 99] MINGLIANG JIANG Y.T., JINYANG LI, “Cluster Based Routing Protocol”, IETF Internet Draft, draft-ietf-manet-cbrp-spec-01.txt, July 1999.
[MOB 05] Project MobiVip, http://www-sop.inria.fr/mobivip, 2005.
[MUQ 03] MUQATTASH A., KRUNZ M., “CDMA-based MAC Protocol for Wireless Ad Hoc Networks”, Proceedings of the 4th ACM/SIGMOBILE Symposium on Mobile Ad Hoc Networking & Computing (MobiHoc’03), Annapolis, USA, June 2003.
[NAV 97] NAVAS J., IMIELINSKI T., “Geocast – Geographic Addressing and Routing”, Proceedings of ACM/IEEE MOBICOM’97, vol. 3, p. 66-76, Budapest, Hungary, September 1997.
[NOW 04] NOW (Network-On-Wheels), www.network-on-wheels.de, 2004.
[OGI 04] OGIER R., TEMPLIN F., LEWIS M., “Topology Dissemination Based on Reverse-Path Forwarding (TBRPF)”, IETF Request For Comments (RFC) 3684, February 2004.
[PER 03] PERKINS C.E., BELDING-ROYER E.M., DAS S., “Ad Hoc On Demand Distance Vector (AODV) Routing”, IETF RFC 3561, 2003.
[PRE 04] Integrated project PReVENT, www.prevent-ip.org, 2004.
[QUA 05] Qualnet, “Qualnet user manual”, http://www.scalable-networks.com/products/qualnet.php, 2005.
[RAY 05] RAYA M., HUBAUX J-P., “The Security of Vehicular Ad Hoc Networks”, Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN’05), Alexandria, USA, November 2005.
[RUD 03] RUDACK M., MEINCKE M., JOBMANN K., LOTT M., “On Traffic Dynamical Aspects Intervehicle Communication (IVC)”, Proceedings of the 57th IEEE Semiannual Vehicular Technology Conference (VTC’03 Spring), Jeju, Korea, April 2003.
[SAH 04] SAHA A.K., JOHNSON D.B., “Modeling Mobility for Vehicular Ad Hoc Networks”, Poster in the First ACM Workshop on Vehicular Ad Hoc Networks (VANET 2004), Philadelphia, USA, October 2004.
[SEN 05] SENOUCI S.-M., MOHAMED-RASHEED T., “Modified Location-Aided Routing Protocols for Control Overhead Reduction in Mobile Ad Hoc Networks”, Network Control and Engineering for QoS, Security and Mobility (NetCon 2005), Lannion, France, November 2005.
[SMA 05] Project SmartPark: Parking Made Easy, http://smartpark.epfl.ch, 2005.
[SOU 88] SOUSA E., SILVESTER J.A., “Spreading Code Protocols for Distributed Spread-Spectrum Packet Radio Networks”, IEEE Transactions on Communications, vol. 36, no. 3, p. 272-281, 1988.
[SUM 05] SUMO Simulation of Urban Mobility, http://sumo.sourceforge.net, 2005.
[SUN 00] SUN M.T., FENG W.C., LAI T.H., YAMADA K., OKADA H., “GPS-based Message Broadcast for Adaptive Inter-vehicle Communications”, Proceedings of IEEE VTC Fall 2000, no. 6, p. 2685-2692, Boston, MA, USA, September 2000.
[SUN 05] SUN Q., GARCIA-MOLINA H., Using Ad-Hoc Inter-Vehicle Network for Regional Alerts, technical report, 2005.
[TIA 02] TIAN J., STEPANOV I., ROTHERMEL K., “Spatial Aware Geographic Forwarding for Mobile Ad Hoc Networks”, Proceedings of MobiHoc, Lausanne, Switzerland, June 2002.
[TIG 04] TIGER/Line 2004. US Geological Survey (USGS) topographic maps, http://www.census.gov/geo/www/tiger, 2004.
[TSE 02] TSENG Y.C., NI S.Y., CHEN Y.S., SHEU J.P., “The Broadcast Storm Problem in a Mobile Ad Hoc Network”, Wireless Networks, vol. 8, no. 2-3, p. 153-167, 2002.
[VIS 05] PTV simulation VISSIM, http://www.english.ptv.de/cgi-bin/traffic/traf_vissim.pl, 2005.
[WU 04] WU H., FUJIMOTO R., GUENSLER R., HUNTER M., “MDDV: a Mobility-Centric Data Dissemination Algorithm for Vehicular Networks”, Proceedings of the 1st ACM International Workshop on Vehicular Ad Hoc Networks, p. 47-56, Philadelphia, PA, USA, 2004.
[ZAR 02] EL ZARKI M., MEHROTRA S., TSUDIK G., VENKATASUBRAMANIAN N., “Security Issues in a Future Vehicular Network”, European Wireless, Florence, Italy, February 2002.
[ZIG 05] The ZigBee Alliance, http://www.zigbee.org, 2005.
List of Authors
Mohamed BAKHOUYA, SeT, University of Technology of Belfort-Montbéliard, Belfort, France
Thais R. BRAGA, Federal University of Minas Gerais, Belo Horizonte, Brazil
Jaafar GABER, SeT, University of Technology of Belfort-Montbéliard, Belfort, France
Moez JERBI, France Télécom R&D, Lannion, France
Azzedine KHIR, University of Quebec, Montreal, Canada
Houda LABIOD, ENST, Paris, France
Sylvie LANIEPCE, France Télécom R&D, Caen, France
Antonio A.F. LOUREIRO, Federal University of Minas Gerais, Belo Horizonte, Brazil
Djamal-Eddine MEDDOUR, France Télécom R&D, Lannion, France
Romain MELLIER, LaRIA, University of Picardy Jules Verne, Amiens, France
Rabah MERAIHI, France Télécom R&D, Lannion, France
Pascale MINET, INRIA Rocquencourt, Le Chesnay, France
Jean-Frédéric MYOUPO, LaRIA, University of Picardy Jules Verne, Amiens, France
José M. NOGUEIRA, Federal University of Minas Gerais, Belo Horizonte, Brazil
Abdellatif OBAID, University of Quebec, Montreal, Canada
Linnyer B. RUIZ, Federal University of Minas Gerais, Belo Horizonte, Brazil
Sidi-Mohammed SENOUCI, France Télécom R&D, Lannion, France
Fabrício A. SILVA, Federal University of Minas Gerais, Belo Horizonte, Brazil
Fabrice THEOLEYRE, INSA, Lyon, France
Sébastien TIXEUIL, LRI, University of Paris 11, Paris, France
Stéphane UBÉDA, INSA, Lyon, France
Fabrice VALOIS, INSA, Lyon, France
Index
1-9 1-smallest identifier 182 802.11 13, 16, 18, 20, 24
A ACK 17 Acknowledgement 197, 219 Ad Hoc On-Demand Distance Vector (AODV) 11, 25-26, 30, 40, 57, 153, 40, 197, 204 Adaptability 83 Adaptation 87 Admission control 49 Advertisement message (ADVM) 155, 159 Affinity connections 125, 127, 128, 129, 130, 131, 132, 134, 135 Allia 152 Aloha 16 Antibodies 126-129, 134 Antigen 113, 125, 128-129, 133 Application 285-289, 292-294, 298, 299, 303-304 Ariadne 205-209, 212 Atomic procedure 177 Attack 200-202, 203, 204, 207, 208, 209, 211
B Backbone 82, 88, 96, 165, 186 Backoff 17 B-cell 126-127, 128 Blocking diameter 177 BRuIT 54
C Carrier sense multiple access (CSMA) 16 Client/server 261, 276 Cloning 126, 128, 266 Cluster 82, 94, 97, 100 Cluster-centric 167 Clusterhead 167-172 Code mobility 257-282 Code on demand 261, 262 Collaboration 81 Collaborative reputation (CORE) 214, 215 Collision 16, 17, 21 Coloring (graph/node) 227, 234, 235, 238, 247-250 Comfort 287, 290, 293, Communication 286-304 Communication model 267, 270 Computing model 267, 269
CONFIDANT 215-216 Connected dominating set (CDS) 88, 89 Critical section 185, 186 Cross-layering 53
D Denial of service 195, 200, 202, 203, 234, 210, 215 Destination-sequenced distance vector (DSDV) 198 Diffusion 287, 289, 297 Directories 114-117, 124 Discovery reply (DREP) 155 Discovery request (DREQ) 155 Dissemination 114, 117, 119, 120, 121, 124, 227, 287, 297-299 Distributed algorithms 225, 226, 228, 251 Distributed and mobility-adaptive clustering (DMAC) 177, 179 Distributed clustering for ad hoc networks (DCA) 172-177 Distributed system 228 Dominatee 97 Dominator 97 Dynamic source routing (DSR) 11, 23, 196, 210
E Emergence 111, 113, 130, 134 Emerging behavior 81, 82 EndairA 212-213 Energy saving 102 Energy usage 273, 279-281 Environment 287, 288, 293, 296, 297
F Fading 66 Fault-tolerant 225, 226, 228, 230 Flooding 19-21, 23, 24, 25, 26, 27, 29, 30, 120-123, 203, 209, 213 Forwarding group (FG) nodes 67, 69, 76 FQMM 52
G Gateway 168 GDMA 179-181 Global reinforcement mechanism 130 Global satisfaction 132 Group-based service discovery (GSD) 151
H Hash mechanism 113, 118, 119-120, 124 Hello 25, 27, 28, 98, 100, 101 Hidden station 66 Hierarchy 57, 94, 102, 115, 116 Hybrid proposals 22 Hybrid protocol 40
I ID identifier 168 IETF 3 Immune system 113, 125, 126, 127, 128, 129, 133, 135 Inconsistencies 105 Inconsistent situations 86 Indexing mechanism 113, 114-119, 124 Initialization problem 185 INSIGNA 51 Intelligent 285, 288, 292, 300, 302 Intelligent transportation system (ITS) 285, 287, 292, 300, 303 Inter-cluster mobility 189 Interference 37-38 Internet 285, 287, 290, 293,
Inter-vehicular communication (IVC) 287, 288, 289, 290, 292, 293, 296, 297, 298, 299, 301, 302 Intra-cluster mobility 189 Invocation 146
J, K Jerne 126 JINI 146 K-cluster 181 Konark 151 Koodli and Perkins 153
L Learning mechanism 112, 125, 134 Leasing 114, 117, 121, 155 Life cycle model 266, 268 Local decision 82, 86, 96 Local interactions 81, 86 Local minimum spanning tree (LMST) 88, 93 Local reinforcement mechanism 130 Local satisfaction 130 Location-aided routing (LAR) 30 Loop-back 187
M MAC protocol with QoS 51 Maintenance 73-75, 98-101 Manager 276-278 Manna architecture 276 Maté 274-275, 281 Maximal independent set (MIS) 91 Medium access control (MAC) 15, 16, 17, 20, 24, 301-302 Memory 121, 125, 126, 135, 145 Mesh-based protocols 67 Mica 271, 273, 274, 281, Middleware 115, 125, 137
Migration 259, 264, 266, 267, 268, 270, 275 Mobile ad hoc networks (MANET) 2, 3, 10, 11, 35, 66, 143, 144, 286, 295 Mobile agent 128-129, 133-134, 257-258, 261, 263-268, 270, 276-280, 282 Mobility model 299 Model 51, 266, 267, 270, 277, 299 Monitoring 213-219 More than two hops 181 Multicast 65, 2, 3 Multicast ad hoc on-demand distance vector (MAODV) 67, 77 Multicast tree 66 Multipoint relay (MPR) 26, 27, 41 Mutual exclusion 185
N Navigation model 267, 270 Neighbor 17, 19, 20, 23, 25, 26, 27, 28, 29 Neighborhood 20, 22, 26, 27, 28 Node-centric 167 Node connectivity 183 Node degree 183 Non-locality 258 Notification 74 Notification mechanism 114, 121 Nuglets 220
O Object migration 259 On-demand multicast routing protocol (ODMRP) 67, 68, 76 Open shortest path first (OSPF) 26 Optimized Link State Routing (OLSR) 26, 41, 43, 57 Ordinary node 167 Ordinary services 154 Overlay network 116, 119, 120, 124
P Peer-to-peer (P2P) 112 Post-query 150 Prefix sum 185 Primary response 128 PRnet 10 Proactive 41, 198 Proactive protocol 40 Process migration 259 Project 261, 292 Project paradigms of code mobility systems 261 Propagation 12, 13, 14, 30 Pruning 75 Publication 146 Pull technique 120, 121, 122, 124 Push technique 120, 123, 124
Q QoS model 51 QoS OLSR 57-61 QoS routing 50 QoS signaling 50, 53 Quality of connectivity 3, 55, 70, 77 Quality of Service (QoS) 35-62
R Random walk-based mechanism 123-124 Random waypoint (RWP) 76 Reactive protocol 40 Reference point group 76 Relative neighborhood graph (RNG) 88, 93 Remote evaluation (ReV) 261, 262 Reply phase 72 Reportable subtree (RT) 29 Reprogramming 271, 272 Request phase 72
Resource Description Framework (RDF) 146 Reverse Link (RL) algorithm 185 Route availability 69 Route discovery 72 Route error (RERR) 41 Route reply (RREP) 24, 41 Route request (RREQ) 24, 41
S Scalability 82, 83, 232, 251, 277, 296, 297 Secondary response 129 Secure ad hoc on-demand distance vector (SAODV) 204-205 Secure dynamic source routing (SDSR) 210-213 Secure routing protocol (SRP) 202-204 Security model 267, 269 Self-adaptation 111, 113, 125 Selfish 203, 213, 214, 217, 219 Self-organization 81, 82, 85, 112-113, 125-126, 276-278 Self-stabilization 83, 226, 230-248 Sensor 2, 5, 285-287, 289-290, 294 Sensor node 258, 260, 268, 271, 274, 275 Server community 125, 126, 136 Service discovery 112-115, 119, 124, 125, 143, 144 Service Discovery and Interaction with Routing protocols in Ad hoc Network (SEDIRAN) 153-161 Service discovery protocols 146 Service location protocol (SLP) 116, 149 Service lookup 121, 124 Software agent 263 Source routing-based multicast protocol (SRMP) 3, 69 Spatial coherence 183
Spatial density 180 Special services 154 Survivable radio network (SURAN) 10 SWAN 52
T Threat model 199 TinyOs 269, 271, 272, 274 Token passing 236, 237, 243, 245, 246 Topology based on reverse-path forwarding (TBRPF) 11, 28 Topology control (TC) 27 TOSSIM 273, 274, 275 Transport 285, 290, 294, 300, 302 Tree construction 236, 242, 249 Tree-based protocols 67 Two hop cluster 167
U, V Ubiquitous 111, 137 Universal Plug and Play (UPnP) 143, 148-149 Vehicle 285-304 Vehicular ad hoc network (VANET) 5, 286, 287-291, 294, 296, 299, 302 Virtual machine (VM) 274 Virtual structures 82, 84, 87, 98 Virtual topology 94, 95, 98, 101 Virus 113
W-Z Web service description language (WSDL) 117 Weight 90, 91, 92, 94, 96, 98, 102, 103, 130, 172, 183 Wireless communication 226-228, 251
Wireless local area network (WLAN) 1 Wireless sensor network (WSN) management 276 Zone-based hierarchical link state routing protocol (ZRP) 29