Stochastic
Modeling in Broadband Communications
Systems
SIAM Monographs on Mathematical Modeling and Computation
E...

Author:
Ingemar Kaj

This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!

Stochastic

Modeling in Broadband Communications

Systems

SIAM Monographs on Mathematical Modeling and Computation

Editor-in-Chief Joseph E. Flaherty Rensselaer Polytechnic Institute

About the Series

Editorial Board

In 1997, SIAM began a new series on mathematical modeling and computation. Books in the series develop a focused topic from its genesis to the current state of the art; these books

Ivo Babuska University of Texas at Austin

present modern mathematical developments with direct applications in science and engineering;

H. Thomas Banks North Carolina State University

describe mathematical issues arising in modern applications;

Margaret Cheney Rensselaer Polytechnic Institute

develop mathematical models of topical physical, chemical, or biological systems; present new and efficient computational tools and techniques that have direct applications in science and engineering; and illustrate the continuing, integrated roles of mathematical, scientific, and computational investigation. Although sophisticated ideas are presented, the writing style is popular rather than formal. Texts are intended to be read by audiences with little more than a bachelor's degree in mathematics or engineering. Thus, they are suitable for use in graduate mathematics, science, and engineering courses. By design, the material is multidisciplinary. As such, we hope to foster cooperation and collaboration between mathematicians, computer scientists, engineers, and scientists. This is a difficult task because different terminology is used for the same concept in different disciplines. Nevertheless, we believe we have been successful and hope that you enjoy the texts in the series. Joseph E. Flaherty Ingemar Kaj, Stochastic Modeling in Broadband Communications Systems Peter Salamon, Paolo Sibani, and Richard Frost, Facts, Conjectures, and Improvements for Simulated Annealing Lyn C. Thomas, David 8. Edelman, and Jonathan N. Crook, Credit Scoring and Its Applications Frank Natterer and Frank Wiibbeling, Mathematical Methods in Image Reconstruction Per Christian Hansen, Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion Michael Criebel, Thomas Dornseifer, and Tilman Neunhoeffer, Numerical Simulation in Fluid Dynamics: A Practical Introduction Khosrow Chadan, David Colton, Lassi Paivarinta, and William Rundell, An Introduction to Inverse Scattering and Inverse Spectral Problems Charles K. Chui, Wavelets: A Mathematical Tool for Signal Analysis

Paul Davis Worcester Polytechnic Institute Stephen H. Davis Northwestern University Jack). Dongarra University of Tennessee at Knoxville and Oak Ridge National Laboratory Christoph Hoffmann Purdue University George M. Homsy Stanford University Joseph B. Keller Stanford University J. Tinsley Oden University of Texas at Austin James Sethian University of California at Berkeley Barna A. Szabo Washington University

Stochastic Modeling in Broadband Communications Systems Ingemar Kaj Uppsala University Uppsala, Sweden

SIHJIL Society for Industrial and Applied Mathematics Philadelphia

Copyright © 2002 by the Society for Industrial and Applied Mathematics. 10987654321 All rights reserved. Printed in the United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688. The PING Utility Library is a public domain software package developed by Mark Lindner and is distributed under the terms of the GNU Lesser General Public License. The package can be freely downloaded from http://www.dystance.net/ping Academy Award is a registered trademark of the Academy of Motion Picture Arts and Sciences. Library of Congress Cataloging-in-Publication Data Kaj, Ingemar. Stochastic modeling in broadband communications systems / Ingemar Kaj. p. cm. — (SIAM monographs on mathematical modeling and computation) Includes bibliographical references and index. ISBN 0-89871-519-9 1. Broadband communication systems—Mathematical models. 2. Stochastic analysis. I. Title. II. Series. TK5103.4 .K35 2002 621.382-dc21

SlflJTL. is a registered trademark.

2002029186

Contents Preface

ix

Notation and Notions from Probability Theory

xiii

1

Introduction 1.1 A brief introduction to networking concepts 1.2 Modeling aspects of general networks 1.3 Broadband traffic characteristics 1.4 Three introductory examples 1.4.1 Simple collision model 1.4.2 Basic arrivals process 1.4.3 Periodic streams 1.5 Exercises

1 1 2 4 9 9 13 17 18

2

Markov Service Systems 2.1 Discrete-time service systems 2.2 Arrival and service rates, continuous time 2.3 Ideas of stationarity and equilibrium states 2.4 Balance equations, slotted time 2.5 Balance equations, continuous time 2.6 Jackson networks 2.7 Markov loss systems 2.8 Delay analysis in Markov systems 2.8.1 Delay in M/M/1 2.8.2 A client-server Jackson network 2.9 Exercises

21 21 24 28 30 32 35 38 39 39 41 43

3

Non-Markov Systems 3.1 Performance measures 3.2 Integrated processes and time averages 3.3 Some ideas from renewal theory 3.3.1 Renewal reward processes 3.3.2 Renewal rate and on-off processes 3.3.3 Hand-off termination probability

45 45 47 50 51 51 53

V

vi

Contents

3.4 3.5

3.6 3.7

3.3.4 Reliable data transfer The loss and delay time balance 3.4.1 Little's formula The M/G/1 system 3.5.1 Simple examples leading to non-Markovity 3.5.2 Pollaczek-Khinchin formulas 3.5.3 Lindley recursion for M/G/1 3.5.4 The M/G/1 virtual waiting time distribution 3.5.5 Heavy traffic limit in M/G/1 3.5.6 Deterministic service times, M/D/1 The M/G/oo model Exercises

54 57 58 60 60 62 65 66 68 70 72 73

4

Cell-Switching Models 4.1 m x m crossbar 4.1.1 Output loss crossbar 4.1.2 Output queuing with a shared buffer 4.1.3 Input buffer blocking 4.1.4 Input blocking, loss system 4.2 Exercises

5

Cell and Burst Scale Traffic Models 5.1 Cell-level traffic 5.1.1 Isochron multiplexing 5.1.2 Voice packet streams in Internet telephony 5.1.3 Round-trip time distribution, PING data 5.1.4 Packet fragmentation in video communications 5.2 Burst-level rate models 5.2.1 Anick-Mitra-Sondhi model 5.2.2 Markov modulated Poisson process 5.3 Long-range dependence traffic models 5.3.1 Self-similarity 5.3.2 Heavy-tailed rate models 5.3.3 Fractional Brownian motion 5.3.4 Statistical methods 5.4 Exercises

93 93 94 96 100 103 106 106 108 109 Ill 113 114 118 121

6

Traffic Control 6.1 Admission control 6.1.1 Effective bandwidth 6.1.2 Statistical multiplexing gain 6.2 Access control 6.2.1 Leaky bucket systems 6.2.2 The M/M/1 leaky bucket 6.2.3 The generic cell rate algorithm 6.2.4 A slotted version of the leaky bucket filter

123 123 124 125 127 127 128 131 131

77 77 78 80 82 86 90

Contents 6.3

6.4

6.5

vii Multiaccess modeling 6.3.1 The slotted Aloha Markov chain 6.3.2 Diffusion approximation approach 6.3.3 Remark on stochastic differential equation approximation 6.3.4 CSMA and CSMA/CD 6.3.5 A collision resolution algorithm Congestion control 6.4.1 A controlled Aloha network 6.4.2 Window control 6.4.3 Modeling TCP window size 6.4.4 TCP window dynamics 6.4.5 Meanfield approximation of interacting TCP sources . . . Exercises

134 136 137 140 141 147 149 149 150 152 153 159 162

Bibliography

167

Index

173

This page intentionally left blank

Preface This text is intended for students in mathematics, applied mathematics, and stochastics who have an interest in network modeling and for students in computer science and related areas with an open view toward mathematical models. The material also will be useful for many practitioners in the computer communications or telecommunications industry who use probabilistic models and methods. Mathematical methods based on the theory of stochastic processes have long been used effectively in telephone traffic modeling. The original telephone traffic models developed and published (1909-1927) by the Danish mathematician A. K. Erlang formed the theoretical framework for planning and dimensioning the growing telephone networks for decades to come. The work of Erlang at a telephone company in Copenhagen must in fact be considered among the single most successful theories in the history of applied mathematics. Not until the development of the emerging techniques in high capacity communication systems has it become clear that Erlang's legacy has reached its limits. Even basic mathematical traffic modeling requires a wider ranging selection of ideas and techniques. The inherent structure of modern network traffic, which is distinctly different from traditional voice traffic, generates challenging mathematical and statistical problems. Industry acknowledges the need for mathematical competence in this area, judging from the growth of conferences, academia-industry cooperative projects, and recruitment to industry-based research departments. This book covers material suitable for final-year undergraduate students to the Ph.D. level in mathematics, probability and statistics, computer science, and computer engineering. The selection of topics depends on the reader's background and interest. The reader is expected to have basic knowledge of calculus and probability, including random variables, probability distributions, and expected values. A brief introduction to networking concepts is included. The presentation of the main material covers a variety of models and situations ranging over different time scales of calls, bursts, and cells and over different protocol layers for transport, control, and applications. The mechanisms of queuing, collisions, delay, and loss appear in various forms, and the effects of buffering, retransmission, multiplexing, and traffic control are studied. Typically, the end result is some form of load-throughput analysis. The common theme is that all models are formulated in terms of appropriate stochastic quantities and the main mathematical tools are those of equilibrium Markov chain theory, renewal theory, and asymptotic limit results. The classical Markov queuing systems and more general single-server systems are covered as starting points and reference systems. The reader will find relatively simple stochastic models for more realistic networking problems such as reliable data transfer ix

x

Preface

protocols, the forced termination problem in cellular networks, space division switching, Internet telephony traffic, leaky bucket filters, the ethemet local area network protocols carrier-sensing multiple access (CSMA) and CSMA with collision detection (CSMA/CD), collision resolution protocols, and the window dynamics in transmission control protocol. Moreover, the reader will find material related to arrival process modeling, which is designed for network traffic exhibiting long-range dependence and self-similarity. This includes statistical methods and approximation by means of fractional Brownian motion. The selection of topics should serve as a background from which those who have an interest in the area will be able to continue in one or another direction. As several topics touch on or intersect with current research, the text could serve as a basis for independent investigations. It is my hope that the chosen level of mathematical rigor is acceptable for most readers. The purpose of this text is to give an overview of stochastic models and mathematical techniques based on stochastic processes for application in the fields of telecommunications and computer communication networks. The presentation introduces readers with various backgrounds and training in mathematics to a number of useful techniques in traffic modeling. The intended audience consists, on one hand, of students and professionals in telecommunications and computer engineering with an interest in using applied mathematics, in particular stochastics, to improve their understanding of communications systems. On the other hand, it is written for the purpose of introducing students and professionals in probability and applied mathematics to a huge area of interesting problems and models arising from today's accelerating developments in broadband channel transmission systems. Given this twofold purpose, the text should be concise and based on sound mathematical reasoning yet be accessible for an audience unwilling to spend more than a fair share of time on mathematical detail and generality. The approach suggested here to serve this objective is to rely on the language and concepts of random variables and stochastics and the strength in intuitive reasoning they provide. Main ideas and notions are introduced and discussed along with specific network applications, and most calculations are motivated and carried out in detail. Probabilistic arguments are preferred to analytical ones—for example, we have chosen not to use moment-generating functions. We state and use general results from the theory of Markov chains and renewal processes, but for detailed proofs and systematic treatment, readers should consult existing textbooks on mathematical queuing theory, such as Kleinrock [28], Asmussen [3J, Wolff [69], and Bremaud [7]. Basic calculus and probability as prerequisites should be enough as a starting point. Some of the models we discuss require mathematically more advanced material, such as diffusion approximations and nonlinear differential equations. However, we provide introductions and emphasize intuitive probabilistic arguments. A number of illustrations with graphs of real data or model simulations are given for clarity. Exercises are provided at the end of each chapter partly to promote the idea of using the book within a course. Some exercises support training in probabilistic calculus, some study variations of models discussed in the main text, and additional comprehensive exercises could be used as course assignments. Chapter 1 contains a brief introduction to basic concepts in networking and communication systems and also to the nature of real traffic data. For readers with limited background in probability theory we provide a summary of notions and distributions in elementary probability and discuss introductory examples, including a short presentation of

Preface

xi

the Poisson process. Chapter 2 gives a summary and introduction to Markov chain theory in discrete and continuous time, focusing on equilibrium properties. Elements of queuing, loss, and delay are covered as well as the Jackson network of Markov service systems. In Chapter 3 we begin with performance measures and study load-versus-throughput relationships. The main objective is to discuss non-Markov dynamics and study various modeling techniques, including renewal processes, renewal rate processes, and on-off processes, and to cover standard material in queuing theory, such as Little's formula and the M/G/1 model. Specific applications include forced termination of a mobile phone and reliable data transfer protocols. Chapter 4 is devoted to the study of the simplest loss and contention protocols in packet switching. The basic example is an m x n crossbar switch, where packets arriving on m input lines are randomly switched onto n outputs. This results in either loss or buffer delay due to contention for output lines. A particular artifact in some of these models is the so-called head-of-line blocking phenomenon. Chapter 5 addresses traffic modeling relevant for cell and burst time scales. We begin with specific models for isochronous cell streams, Internet telephony, and a fragmentation procedure for video communications. Then we turn to the multiplexing of independent sources over a joint transmission channel—for example, the Anick-Mitra-Sondhi model of supeipositioned on-off sources. An important finding from recent research in this area shows that the addition of distributions with heavy tails to this class of models will result in long-range dependence. We discuss the related idea of self-similarity and give an account on the topic of approximating, in a certain sense, network traffic using the continuous selfsimilar process known as fractional Brownian motion. A section on data analysis includes statistical methods for identifying heavy tails. In Chapter 6 we apply a number of techniques and models to the study of various aspects of traffic control. As part of admission control we study the topics of effective bandwidth and statistical multiplexing gain. Access control includes several versions of the leaky bucket mechanism. Multiaccess control from a mathematical perspective involves modeling the retransmission mechanisms in contention protocols. We introduce the ideas using principles of the simplest Aloha-net and generalize to the ethernet protocols CSMA and CSMA/CD. Contention protocols based on collision-resolution algorithms are modeled using rather different methods. Finally, we treat congestion control of Internet traffic in a detailed study of the transmission control protocol (TCP) and Internet protocol (IP) window dynamics scheme. Obviously many topics of interest have been omitted from our presentation or are touched on only briefly. Some areas would have required a background of rather sophisticated mathematics, such as the theory of large deviations, which has found key applications in, for example, large-scale asymptotics of buffer overflow probabilities [57], and matrix-analytic methods leading, for example, to numerical schemes for calculating loss probabilities.

Acknowledgments This book was developed partly on the basis of lecture notes for courses given over several years. A main impulse was the opportunity to spend one term at Carleton University, Ottawa, and give a joint graduate course for students at the De-

xii

Preface

partment of Systems and Computer Engineering and the Department of Mathematics and Statistics. I am most grateful to Amit Bose at the Laboratory for Research in Statistics and Probability for taking the initiative for this project and for his continued support and cooperation. Thanks to further support from loannis Lambadaris and Michael Devetsikiotis at the Broadband Networks Laboratory, Carleton University, I was provided with excellent working conditions in an inspiring research environment. It is my pleasure to thank Gunnar Karlsson, Department of Microelectronics and Information Technology, Royal Institute of Technology, Kista, and Mats Rudemo, Chalmers University of Technology, Gothenburg, for similar opportunities to lecture graduate courses for other groups of advanced students. Many students in these and other courses influenced the content and style of the book and gave valuable input and inspiration for selecting topics and problems. Among these are Tarkan Taralp, Matthias Falkner, Mattias Ostergren, and Anders Andersson. Some have become coworkers and are directly involved in research reported in the book: Raimundas Gaigalas, Jb'rgen Olsen, and Ian Marsh. I am grateful to Evsey Morozov, Petrozavodsk University, for carefully reading the manuscript and providing many useful comments. Special thanks are due to Ian Marsh, Swedish Institute of Computer Science, for reading the manuscript in great detail and for numerous comments that improved both language and content. During the writing of this book I met my wife, Olga. Her support and love have been invaluable.

I.K.

Notation and Notions from Probability Theory It is assumed that the reader is familiar with the basic ideas of elementary probability theory, in particular the concepts of stochastic variables, probability distributions, and expected values. For the reader's convenience and to introduce notation used throughout the text, some of these notions are recapitulated here, including a listing of standard distributions. In addition, this section contains some terminology and lists a few properties related to conditional expectations, Markov chains, and convergence of random variables. For detailed accounts the reader should consult such textbooks as Gut [16]. Distribution function: F(x) — P(X < x'). Quantile: The number xa that satisfies F(xa) = 1 - a. Discrete random variable: X is discrete if it assumes a finite or countable number of values x\, X2, • . • with probabilities p(x\), p(xz),..., where p(x) is the probability function for X. Continuous random variable: X is continuous if it assumes all values in an interval according to a density function f ( x ) , 1.

F(x) is continuous for all x,

2. F'(x) = f ( x ) for all x where the derivative exists, and h 3. P(a < X < ab) f(x)dx. =f

Joint distributions: F(xl,...,xr) = P(Xi <x},...,Xr <xr) p(xi, ...,xr) = P(Xi = xi,...,Xr = xr) /(*,,...,*,) = £...£F(*,,...,*,). Expected value:

xiii

xiv

Notation

Variance: V(X) = a2 = E((X - ,u,)2). Standard deviation: D(X) = a = ^V(X). Covariance: Cov(X, Y) = E((X - ^)(Y - ny)). Correlation coefficient: Independence: Two random variables X and Y are said to be independent if

This implies E(XY) = E ( X ) E ( Y ) , hence Cov(X, F) = 0, in which case X and Y are said to be uncorrelated. The following are the most common discrete distributions. Binomial distribution: X e Bin(«, p) if p(k) = (£) pk (1 - p)"~k, 00,

Notation

xv

Exponential distribution: X e Exp(a) if X e T(l, a), f ( x ) = ae~"\ x > 0,

Normal distribution: X e N(m,a) if f(jc) = expected value and a2 the variance.

where u — m is the

ForN(0, 1) the distribution function is written $(*), the density (p(x), and the quantiles za. Slightly more advanced material includes conditional probabilities and conditional expectations; we list some relations particularly useful for calculations: Conditional mean: E(X) = E(E(X\Y)). Conditional variance: V(X) = E(V(X\Y)) + V(E(X\Y)). Conditional covariance: Cov(X, Y) — E(Cov(X, Y\Z)) + Cov(£(X|Z), E ( Y \ Z ) ) . Stochastic sums: Let {Xf} be independent identically distributed random variables, let N be an integer valued random variable independent of (or a stopping time for) {X,-}, and put Y = ^'i=l X,. Then E(Y) = V(Y) = E(N)V(X)

E(N)E(X), +

E(X)2V(N).

Markov chains are sequences of random variables X\, Xi, • • . , in which the future outcome Xn+i depends on the present variable Xn but is independent of the way in which the present state arose from its predecessors X\,..., X,,-\. Markov chain property: A sequence (Xn)n>\ of random variables is a Markov chain if for all « and x\. ..., xn+\,

Markov process: A process { X , , t > 0} in continuous time with discrete states is a Markov process if for any given trajectory {xr, 0 < r < t} of states and arbitrary .v < t,

Several convergence concepts are used in probability theory. We state four of them, of which the most important for the applications in this book are convergence almost surely and convergence in distribution. Let (Xn)n>i be a sequence of random variables. Convergence almost surely: (Xn)n>\ converges almost surely (a.s.) to a random variable X, Xn U4' X if

xvi

Notation

Convergence in probability:

n)n>\ converges in probability to X, Xn -> X if

foral

Convergence in L2: n)n>i converges in L2 to X, Xn —>• X if

Convergence in distribution: (X n )«>i converges in distribution to X, Xn ->• X if

where C(F) is the set of continuity points of the distribution function F of X .

Chapter 1

Introduction

1.1

A brief introduction to networking concepts

Many excellent sources are available for those without professional training in computer science who would like to understand the basic principles of communication networks. Computer Networks by Tannenbaum [61] is a classic text. TCP/IP Illustrated [60] has practical focus, and Computer Networking: A Top-Down Approach Featuring the Internet by Kurose and Ross [30] is a source covering the most recent developments. Also useful are introductory chapters of books in the category of more mathematical presentations, the classic example being Kleinrock's [28]. More recent books include An Introduction to Broadband Networks by Acampora [1], Fundamentals of Telecommunication Networks by Saadawi, Ammar, and El Hakeem [53], and High-Performance Communication Networks by Walrand and Varaiya [65]. From such sources one can get an idea of the evolution of communication networks; the period up to 1950 is dominated by the wide spread of telephony, after which it is natural to distinguish four phases. The first phase, 1950-1975, still represents voice traffic, now transmitted over digital channels. This era was triggered by the pulse code modulation (PCM) technique, in whic a voice signal in the frequency band 300-3400 Hz is sampled 8000 times per second and each time is coded into one of 28 = 256 levels. The coded signal therefore requires 8 x 8000 = 64,000 binary digits every second, which equals a transmission capacity of 64 Kbiiys. Various standards were developed for multiplexing signals from many sources over the same transmission medium. In North America and Japan a time division multiplexing (TDM) method called the Tl digital system was introduced. In this system 24 PCM voice signals are transmitted using the basic slot time of 0.125 ms but are assembled into a frame consisting of 24 x 8 voice bits plus 1 control bit per slot. This equals 8000 frames of 193 bits per second giving 1.544 Mbit/s, which is referred to as Tl or 1.5 Mbit/s traffic. In Europe, the CCITT standard, now known as the ITU-T (Telecommunication Standardization Sector of the International Telecommunication Union), defined 30 PCM voices requiring 240 bits, plus two extra channels of 16 bits used for signalling and synchronization, turned into a frame. Consequently this requires 2.048 Mbit/s and we obtain 2Mbit/s traffic or simply a 1

2

Chapter 1. Introduction

"2M voice channel." The second phase marks the entrance of computer networks and packet switching in contrast to the circuit-switched voice traffic. The Arpanet and other early systems were able to link host computers and dial-up terminal users. In 1976 the X.25 protocol for packet switching was established, and in 1978 the International Organization for Standardization (ISO) reference model with a seven-layer framework of protocols was defined. Other key technologies were the local area network (LAN) protocol ethernet and the first packetswitched radio network Aloha-net. The goal of integrating digital traffic sources, for example, voice, data, video, and images, in common transmission mediums such as optical fiber, characterizes the third phase in the 1980s. The acronym ISDN (integrated services digital network) is sometimes prefixed by N for narrow band and B for broadband, where the latter typically refers to asynchronous transfer mode (ATM) based cell-switching techniques. Some of the TDM systems developed for broadband traffic are, in North America, the synchronous optical network SONET based on capacity 51.84 Mbit/s or multiples, and in Europe, synchronous digital hierarchy (SDH), working with 155.52 Mbit/s or multiples, up to and over speeds of 48 x 51.84 = 16 x 155.52 = 2.4 Gbit/s. The fourth phase saw the growth in the 1990s of the World Wide Web, mainly due to the HTTP protocol and the Mosaic browser from National Center for Supercomputing Applications (NCSA), and the commercialization of the Internet. Virtually all computer network traffic is carried on the Internet through TCP/IP, and, quoting Kurose and Ross [30], "Although today the majority of voice traffic is carried over the telephone networks, networking equipment manufacturers and telephone company operators are currently preparing for a major migration to Internet technology." Adjunct to this fourth phase is the accelerated development of wireless networks and the preparations for mobile networks, in particular for the mobile Internet network based on radio access and optical fibers.

1.2

Modeling aspects of general networks

We begin by attempting to distinguish some general features of communication networks and to identify where stochastic modeling issues may be used. Figure 1.1 shows schematically a network surrounded by imaginary terminals, which could be telephones, home computers, or subscriber clients. They request various services from the network via an access net. Once the terminal's requests for service are admitted to the network, access transport follows, which includes multiplexing or concentration of data streams and connection to the trunk net. The trunk net consists of multiplexed channels of varying capacity based on, say, the SDH protocol or plesiochron digital hierarchy (PDH), for transport from one node to another. From the point of view of the network server, network management is crucial to guarantee reliability, transparency, and quality of service (QoS) in terms of sufficient bandwidth, control on bit errors, delay time, etc. Within the network, transportation occurs in different transfer modes; the traditional circuit mode of voice signals remains the major part of many networks. In packet mode, or frame mode, data packets of variable size containing the actual information bits along with addresses and other signaling information pass various stages of the transmission. Yet more specific is cell mode transfer, where all information is stored in equal-size units called cells, the basic example being ATM traffic.

1.2. Modeling aspects of general networks

3

Figure 1.1. Simple network model.

To be more concrete we can think of a public switched telephone network (PSTN) with a T2 system, which means that four Tl channels are multiplexed into a capacity of 6.312 Mbit/s, serving telephone terminals requesting access on available voice links, each requiring 64 Kbit/s. Transport between nodes (telephone stations) is based on circuit switching so that all connections are open throughout the call whether or not any data is to be sent. For comparison, consider the network model in Figure 1.1 as a host server computer accessed by users for transmission of computer data sets. An X.25 system, or the more modern frame relay protocol, both using packet switching, handles the transport phase. Arriving data are disassembled, packed into smaller units with address labels, transported over a communications system, and reassembled at the destination node. A third variant is that the network is a B-ISDN with a hybrid system of circuit and packet switching trunk nets, where a multitude of users is constantly challenging the access net with a mixture of requests. The user is the provider of telephone services rather than an individual subscriber, and a call is a bandwidth request, varying with the current demand on that provider. The PSTN system carries, in a sense, continuous traffic, and the packet switched system carries discrete traffic. In the first case bit errors are acceptable because they only add noise, whereas delays are much more disturbing and should be avoided. In the second case we have the reverse situation in which a single error may destroy a computer file being downloaded, whereas some time delay until the completion of a service is acceptable. We now take a closer look at what human speech might look like in digital communications. Based on the PCM picture it seems reasonable to think of a voice signal built of talk periods filled with binary digits equal to 1 and silence periods during which only digits equal to 0 are transmitted. A typical signal is shown in Figure 1.2, where the vertical lines during talk periods indicate the discrete time slots. Since the length of a slot is only 125 microseconds, however, the signal may be approximated by a continuous time curve, Z,, t > 0, which is 0 during the silent periods and 1 otherwise. Indeed, it has been suggested based on empirical measurements that a typical human voice over a phone is composed of such alternative speech periods of length 0.6 to 1.8 seconds and silent periods of length 0.4 to 1.2 seconds. A basic technique for constructing a mathematical model for a situation like this is to let a sequence of random variables S\, $2, . . . with a given distribution describe the successive silence periods and to let a sequence T\,Ti,..., characterized by another distribution describe the speech periods. The corresponding curve Z, is a randomly varying stochastic

Chapter 1. Introduction

4

Figure 1.2. PCM voice. process in continuous time, which from time to time jumps from 0 to 1 or vice versa. The next step is to impose sufficiently strong assumptions on the model to enable the derivation of useful information, while under the same assumptions the mathematical model still retains some of the nature of the considered situation. A typical assumption, which will be used many times elsewhere, is that the sequences of random variables {Si} and {Tf} are statistically independent and have finite means. Then [Zt, t > 0} is called an alternating renewal process for which a rich theory is available. It is not difficult, however, to point out problems with such assumptions even in this simple example. It is rather likely, for example, that periods of speech or silence nearby in time could influence one another and thus violate their independence. Such objections can be raised for any of the situations and techniques considered here. We will sometimes ignore them and where appropriate discuss possible alternatives. To continue the example, let us estimate how much of the total capacity is used for the transmission of speech. We assume that the expected lengths of talk and silence periods are £(7") = 0.8 seconds and E(S) = 1.2 seconds, respectively. During each silence-talk cycle a proportion E(T)/(E(T) + E(S)) = 0.4 of time is spent in state 1, i.e., a talk state. Over a long period this is the fraction of the total capacity used for the speech periods. If we multiplex 24 such sources in TDM fashion on a Tl link, then with 24 ongoing calls, approximately 1.5 Mbit x 0.6 = 900,000 bits every second are used to transmit nothing!

1.3

Broadband traffic characteristics

The objective of this section is to gain some insight into the qualitative nature of typical traffic streams in a communication network. It is sometimes natural from the point of view of mathematical modeling to consider a hierarchy of time scales. It has been proposed that three such levels, or scales, are enough for most purposes, and we refer to them as call scale, burst scale, and cell scale in order of shorter time units. We discuss this topic with reference to Figure 1.3 and by using an example. Suppose we wish to transmit a sequence of 100 X-ray images on a fast ethernet link that has a capacity of 100 Mbit/s, and each image is approximately 10 Mbits large. Let us assume each photo is divided into 10,000 packets, each 1000 bits in size, thus requiring a slot time of 10 £is. On the cell scale level consider a synchronous stream of packets sent one every 10th slot, so the transmission rate is one photo per second. However, it is unlikely that this degree of link efficiency can be maintained over an extended period. On the next scale, the burst scale, which is typically on the order of seconds rather than milli- or microseconds,

1.3. Broadband traffic characteristics

5

Figure 1.3. Time scale hierarchy.

we may assume that the photos (bursts) arrive on average one every 10 seconds, resulting in a burst intensity of 10 Mbit/s. A possible model for the arrival times of each photo would be the Poisson process with an intensity of 0.1 images per second. Note that within bursts the cell scale model is still appropriate. Similarly, suppose such sequences of images are sent regularly with interarrival times of 10,000 seconds. One sequence can be thought of as a call of length 1000 seconds, and hence on the last level the process of calls has a rate of 1 Mbit/s. This example illustrates the important concepts of peak rate and mean rate. On the call level the traffic peak rate is 1 Mbit/s, whereas the mean rate is 100 Kbit/s. On the burst scale the peak rate is 10 Mbit/s and the mean rate is 1 Mbit/s. On the cell scale the peak rate coincides with the maximal capacity of 100 Mbit/s and the mean rate is 10 Mbit/s. It is clear from this that the appropriate time scales need to be identified for stating system performance criteria. Consider, for example, blocking probabilities at a specific node. It may be acceptable that on average one of 100 calls is lost due to congestion at the node. Perhaps acceptable quality demands would limit the risk to 10~4 of losing a burst, whereas maximum blocking probability for cells should be maintained at, for example, 10~6. From the viewpoint of the broadband network server, the total traffic load is the superposition, on each time scale, of traffic streams from many sources. We bring in some terminology from the ATM technique to discuss this further. See Onvural [44] and Saito [54] for detailed accounts. An ATM cell carries 48 bytes of data and 5 additional octets for labels and control; hence its size is 53 x 8 = 424 bits. The transmission time per cell in an ATM switch equipped with links of capacity C = 155 Mbit/s is 424/C % 2.74 x 10~6 « 3/us. On the cell level all traffic in ATM is broken down into such regular streams. To get an idea

6

Chapter 1. Introduction

Figure 1.4. ATM time scale hierarchy.

of bursts in the ATM situation one can think of a large number of on-off sources (such as the PCM voice model in section 1.2) that are added to each other and sorted according to various priority classes and traffic classes, such as audio, video of constant bit rate, and video of variable bit rate. The proposed ATM standard supports five traffic classes. Finally, the analog of calls in ATM is the set-up of virtual connections, or virtual paths; see Figure 1.4. The character of network traffic is highly unpredictable and changes quickly with the emergence of new applications, and with no certainty can we describe the nature of the dominating volumes of future network traffic. Despite this, we now consider a few empirical examples of traffic streams. The first data set is a trace of approximately 20 minutes of low-intensity ethemet traffic in the LAN of the MIC campus, Uppsala University (123,902 packets). Figure 1.5 shows the distribution of packet sizes in the data trace. Clearly there is a random variation in the data set, but there is a distinct character with three main peaks visible. The lower and upper peaks correspond to the minimal and maximal packet sizes allowed under the ethernet protocol, and the middle peak, around 576 bytes, shows the fraction of packets that were formatted as TCP segments. The interarrival times in the same data turn out to be much more dispersed. These are the successive time gaps between the arrivals of two packets. To obtain a readable graphics output, it is appropriate to compress the data on the x-axis. A histogram of the logarithms of the interarrival times is given in Figure 1.6. Some apparently machine-generated features in the data are visible along with the random character emphasized by the fact that several users contribute to the same trace. Figure 1.7 shows a plot of the counting process for the sequence of arriving packets, i.e., the process in continuous time with a jump of size one at each arrival time point. The large variation in the data set is manifest in the deviations from a linear increase clearly visible even on the time scale of the order of minutes. Finally, in Figure 1.8 the same data set has been turned into an arrival rate sequence by plotting the number of arrivals falling in consecutive time intervals of given length A. In this graph A = 1 sec. Statistical analysis of extensive ethernet data performed at Bellcore and first published in 1994 lead to the important finding that such data show characteristics of long-range dependence and

1.3. Broadband traffic characteristics

7

Figure 1.5. Ethernet packet size distribution.

self-similarity. These concepts are discussed in subsection 5.3. For an introduction, see, e.g., Willinger and Paxson [67]. The next example of empirical data refers to encoded video motion pictures. A widely used coding algorithm is the Moving Picture Experts Group (MPEG) standard encoding scheme. Variable bit rate (VBR) video coding tries to provide constant viewing quality. Basically, the video data stream is sorted into frames at a rate of 25 frames per second. There are three types of frames—I, P, and B. An I-frame is a full coded frame, and P- and B-frames are updates designed to reduce both the spatial and the temporal redundancy in the signal. The frames are arranged in periodic sequences, e.g., IBBPBBPBBPBB, forming a group of pictures (GoP). Figure 1.9 shows a graph of the sizes measured in bits of the successively arriving I-frames representing the GoPs in a trace of 17 minutes of a BBC newscast. The size distribution in this case varies modestly. The corresponding histogram in Figure 1.10 suggests that even a normal distribution could be used for crude modeling. As a further example of similar data, Figure 1.11 shows a graph of the VBR in the coded video-trace of the motion picture Last Action Hero (Columbia Tristar, 1993). The trace is 2.6 hours long and the graph shows the size of I-frames measured in bytes for approximately 234,000 frames. The third example is a trace of a voice call over an Internet telephony system from Argentina to Sweden. The voice signal is transmitted in 160 bytes packets with one packet sent every 20 ms. En route to their destination the packets go through stages of buffering and interaction with Internet cross traffic, causing random delay variations. In addition, packets may catch up with strongly delayed packets ahead in the packet train and have to adapt to a

8

Chapter 1. Introduction

Figure 1.6. Ethernet interarrival time, logarithmic scale.

slower pace. As a result, the time intervals between arriving packets are sometimes shorter than 20 ms and sometimes longer. Figure 1.12 shows a histogram for the interarrival time data of a call where the quiet periods during which the transmitting caller is silent listening to the other party have been suppressed from the data. The transmitting caller speaks for about 110 seconds, which corresponds to 5500 packets. The peak close to zero represents the number of overpassing packets that arrive together with the packet ahead in line. The final example illustrated in Figure 1.13 shows round-trip times (RTT) for a se quence of 2000 test packets sent one per second during business hours return-trip from the server www.math.uu.se to www.ericsson.se (11 packets were lost). The data were obtained using the software PING. The sample mean of the measurements is 154.82 ms with substantial fluctuations in the range 46.7 ms to 679.6 ms. The median is 115.5 ms. RTT measurements show huge variation depending on network loads, but the data in this example seems to be typical. Even if a normal RTT level is established over a period of transmission, there will be spikes corresponding to large delays. In Figure 1.13 the typical level is around 100 ms and the intensity of spikes is high. Despite the random variations in RTT data it is common practice in much modeling work to assign a fixed value to the RTT and consider it to be a constant model parameter. At several occasions we follow this practice of ignoring the RTT fluctuations, except in section 5.1.3, where we discuss data such as in Figure 1.13 and introduce an explanatory model.

1.4. Three introductory examples

9

Figure 1.7. Ethernet arrival count process.

1.4

Three introductory examples

The following examples introduce basic techniques and notation from probability theory and illustrate performance evaluation in simple cases.

1.4.1

Simple collision model

In a multiuser system it is important to assign to each user reasonable bandwidth to avoid extensive losses and delays. One source of losses in a packet-based system arises from attempts to send several messages in a given time slot during which only a single message can be accommodated and transmitted. Typically each user involved in such a collision is affected and faces a delay due to the retransmission of the message at a later time. Such situations are studied in detail in section 6.3. As a first simple illustration we consider a system of two users attempting to transmit messages in a slotted time fashion over a common transmission channel. Suppose the two users attempt to send messages with probabilities p\ and p2, independently in each time slot and independently of each other. As long as at most one user is sending in a given time slot the transmission is considered successful. If both users attempt to send in the same slot a collision occurs and both messages are lost. For simplicity we ignore retransmissions and just count the number of losses to get a measure on the performance of a system like this.

10

Chapter 1. Introduction

Figure 1.8. Ethernet arrival rate process. Introduce for k > I

Uk =

1 0

if user 1 attempts to send in slot nr fc, else,

Vk =

1 0

if user 2 attempts to send in slot nr k, else

with P(Uk = 1) = pi and P(Vk = 1) = P2- Moreover, put Nn = number of successful transmissions during n slots

and Kn = number of lost messages during n slots The expected values are found to be

1.4. Three introductory examples

11

Figure 1.9. BBC news, I-frame arrival rate.

and we also note that Nn + Kn = total number of attempted messages = with E(Nn + Kn} = n(p\ + p2}. Furthermore, the law of large numbers for sums of independent identically distributed (i.i.d.) random variables applies (in its strong form), resulting in the asymptotic (almost sure) limits valid as n —> oo,

These limits have natural interpretations in terms of some basic performance measures. They will be studied later, but to complete the example we illustrate them briefly as follows. The offered load to the system, or the total load to which the service system is exposed, is in our case the limiting number of requests the transmission channel is receiving per time unit regarding the transmittal of a message over the link, whether successful or not. Hence offered load:

number of messages attempted up to time n n

Chapter 1. Introduction

12

Figure 1.10. BBC news, frame size distribution.

The throughput depends on the rate at which the system can cope with the offered load, and it is measured as the average work completed by the server per time unit. Hence we count the number of successful messages over n time units, divide by n, and let n ->• oo to get throughput:

number of successfully transmitted messages up to slot n n

in units of messages per slot. The related term utilization is normally used as a measure of the rate of activity of the server. In this case the natural measure is the average time during which the link is busy. What is the loss probability of messages in this system? In the long run we have number of lost messages number of attempted messages which again is a consequence of the law of large numbers. On the other hand, if we consider a single slot (say, k = 1) we may argue that the loss probability is the conditional probability of both users attempting to send packets given that at least one of them attempted to transmit, in other words, the probability

1.4. Three introductory examples

13

Figure 1.11. Frame sizes, MPEG video encoding. However, this is the probability of an event causing the loss of two messages. It should therefore be multiplied by a factor of two, yielding a measure of losses in agreement to the one previously obtained. 1.4.2

Basic arrivals process

In just about any situation of setting up a mathematical model for studying network traffic, it is crucial that external arrivals of packets, frames, calls, etc. are subject to relevant and appropriate mathematical assumptions. It seems inevitable, however, that such assumptions sometimes pertain to strongly idealized conditions, hence models that account for only the most basic features of the arrival streams. At several places in this book we discuss arrival stream modeling designed to cover dependence structures and correlation. We also study periodicities and the effect of superpositioning in multiuser systems, aiming to achieve better agreement with empirical data. Still it is fair to say that the most important model for arrival traffic is the Poisson process with its fundamental assumption of constant traffic intensity. Next we give an introduction to the Poisson process, mainly for the reader with limited experience with basic probability models. We start with the intention of modeling a continuous time process N,, t > 0, with A/o = 0 such that N, = number of arrivals in the interval [0, /]

Chapter 1. Introduction

14

Figure 1.12. Histogram for voice over IP interarrival times. with the reasonable assumption that arrivals should occur (a) randomly in time, and (b) independently of each other. Simplest model. For fixed / suppose there is either one or no arrival in [0, t] and that pure chance decides which of these two alternatives occurs. LetX = number of arrivals in [0, t], and put For (a) to occur in this trivial model, it is reasonable to make the assumption

whereas (b) is not applicable. Binary model. Now assume that an arrival is possible in [0, t/2], as well as in [t/2, /], so that if we put X] = number of arrivals in [0, t/2] and X2 = number of arrivals in [t/2, t],

then the assumption P(Xi = 1) = P(X2 = 1) = A.f/2, where 0 < A. < 2/f is a constant, includes (a), whereas (b) is included by assuming that X[ and X2 are independent random variables. The number S2 = Xi + X2 = total number of arrivals in [0, t] is distributed as the number of successful attempts of two trials performed independently with a probability of success X t / 2 each time; hence according to the binomial distribution

S2 eBin(2, Xt/2).

1.4. Three introductory examples

15

Figure 1.13. RTT measurements. n-stepmodel. TheextensiontothecaseofnindependentrandomvariablesXi,..., Xn, each Bin(l, Xt/ri) distributed, is straightforward. Then

suggesting that the arrival count N, at time t would be given by a limiting quantity Sx of S,, as n -> oo. Since

this approach, known as the Poisson approximation of the binomial distribution, leads to

or, in short, N, e Po(A.f) for each t > 0. Furthermore, by a refined but similar argument, it follows for any t > s > 0 that the successive arrival increments Ns and Nt — Ns are independent random variables with stationary Poisson distributions

16

Chapter 1. Introduction

Figure 1.14. Twenty sample paths of Poisson process, A, = 1. These properties are known to characterize the Poisson process with intensity X. Alternative view. Here we indicate some aspects of the more dynamical approach to counting processes, which distinguishes the Poisson process as one of the basic continuous time, discrete state, Markov jump processes. This approach highlights the notion of the parameter A as being the infinitesimal intensity of Nt in the sense of the relation P(arrival in (t, t + h\) = P(Nt+h -Nt = \} = Xh + o(h),

h -> 0,

where o(h) denotes a remainder term, varying from one instance of occurrence to another, with the property that lim^o o(h)/h = 0. Together with the assumption P(at least two arrivals in (t, t + h]) = P(Nt+h - Nt > 2) = o(h), and using the simpler notation pk(t) = P(Nt = &), this leads, for small h, to pk(t + h) = pk-i(t)(M + o(h)) + pk(t}(\ - A/i - o(h}} + o(h), hence and therefore

This system of differential equations can be solved directly in a recursive manner applying an integrating factor to each equation. An alternative method is to turn the system into an equivalent equation for the generating functions gt(u) = X^/tlo ukPk(t}> \u\ < I , and identify its solution as the generating function E(uN') = g-^1-") of the Poisson distribution.

1.4. Three introductory examples

17

Of course a third method is to verify directly that pk(t) = e~^'(kt)k/k\,k > 0, is the unique solution to the given system of equations. Note that

—- = number of arrivals per time unit with E ( N t / t ) = A, and thus

In this sense (L2-convergence), N,/t converges toward A. as t —> oo. Convergence in P probability, N,/t —> A, follows from the Chebyshev inequality

valid for each e > 0. Even the strongest mode of almost sure convergence, N,/t -—* A, is true in this case as a result of the Poisson process strong law of large numbers. In any case, we have justified the interpretation of the intensity A as the long time arrival rate. Let T0 = 0 and, for k > 1, UK — time between arrivals k — I and k, In particular

and therefore that is, C/i e Exp(A). More generally, the interarrival times ((4)i>i are i.i.d., each having the exponential distribution with mean E(Ti) = I/A. Consequently the arrival epochs 7i have the Gamma distribution

This property, which is discussed in Exercise 1.5, makes it simple to simulate trajectories of the Poisson process. See Figure 1.14.

1.4.3

Periodic streams

Consider periodic streams of packets, each packet of length t Kbit arriving every r ms to a buffered node for access onto a link of capacity c Mbit/s. A total of m such streams are multiplexed into the same node, with the characteristic feature of the system being that the phases of the separate streams are unknown. A packet always uses the full link capacity during transmission, forcing packets from other streams to wait in the buffer while the link is busy.

18

Chapter 1. Introduction

Lacking evidence of any other arrival pattern, we naturally assume that the m arrival times from separate streams during the periodic cycle are uniformly distributed. More explicitly, since each packet requires t/c ms for its transmission, occupying the fraction l/cr < 1 of the total time, we can model the arrival times of the streams within a cycle by means of m independent Re(0,1) distributed random variables U\,..., Um, and define for cycle time 0 < t < r N(t) = number of packets being transmitted or waiting in buffer at time t

Clearly a buffer size of (m — 1) is enough to avoid losses, and the maximal delay is restricted to (m - \)tjc ms, which occurs in the worst-case situation that all streams are synchronized. Moreover

which gives

and where we assume the system has settled in a steady state. To see the limitations of the analysis of this example we only have to replace the constant size packets by variable size packet streams, which may still be periodic if we make appropriate assumptions on the packet size distribution. In principle we may now need infinite buffers (if a packet has to wait for another packet in its own stream) and the delay is also potentially large due to arrivals during the service of very large packets.

1.5

Exercises

1.1 Suppose a random trial either succeeds, with probability p, or fails, with probability 1 — p. After independent repeats of such a trial, put N = number of successes from n trials, M = number of attempts until first successful trial. What are the possible outcomes of N and Ml Write down the probability functions P(N = k) and P (M = k), and verify their normalization by summing over the relevant ^'s. Calculate the expected values and the variances of the random variables N and M. 1.2 Let M and N be independent random variables that are both geometrically distributed with the same parameter p. Find the probability P(M < N).

1.5. Exercises

19

1.3 Using the simple collision model of section 1.4.1, consider two users sending equalsize 1024 octet messages over a 2Mbit/s transmission channel, with the natural slot length being given by the message transmission time. The first user attempts to send with probability 0.05 and the second with probability 0.10 in each slot. Find the offered load, the throughput, the utilization, and the loss probability. 1.4 Consider again the collision model of section 1.4.1 with two users attempting to transmit messages over a common channel. They send in each time slot with probabilities p\ and p-2, respectively, with both messages being lost in the case of a collision. The number of successfully transmitted messages in n slots is denoted by Nn, the number of lost messages by Kn. Find the variances of Nn and Kn and also find the covariance Cov(Nn, Kn). 1.5 LetN,,t > 0, denote a Poisson process with intensity A, > 0 and associated interarrival times [Uk}k>\, and put Tn — Y."k=l Uk. Check that the events [Tn < t} and {Nt > n} are identical and hence that P(Tn < t) = P(N, > n). By differentiation find the probability densities of the arrival times T,,, n > 1. Verify that the relations Nt — maxjn : T,, < t) and N, — min{« : Tn+\ > t} also are valid. 1.6 Suppose m independent Poisson processes with intensities A,, (' = 1, m, start at time t = 0. Let 7"_ denote the time of occurrence of the first event in any of the m processes and 7_ denote the first time when at least one event has occurred in each process. Find the density functions of 7~_ and T+. 1.7 A much simplified model of an ATM switch has two input and two output ports. Arrivals occur independently in any given time slot and in each of the input ports. With probability p an ATM cell arrives at the input, and with probability 1 — p no cell arrives. Each arriving cell is transferred during the same slot to one of the outputs, chosen randomly with equal probabilities. The capacity of the output ports is limited to at most one cell during the slot. Should two cells be switched to the same output, only one exits, whereas the other is delayed in a buffer. Let X be the number of arriving cells and Y the number of exiting cells in one slot. Find the correlation coefficient px. Y between X and Y . 1.8 During transmission of a binary signal it is known that bit-errors occur independently with probability 5 x 1 0 I0 per digit. The available bandwidth is 1 Mbit/s. Compute or estimate (a) the probability of managing a one-hour-long transmission without errors, (b) the probability that at least 15 of 80 one-hour transmissions are performed errorfree. 1.9 A sending unit transmits a binary signal Xi, X ? , . . . of zeros and ones, where the sequence is independent and P(X, = 1) = 1 — P(X-, = 0) = p for each /. After digital-to-analog conversion, and because of disturbances during the transmission, the sequence Y], K 2 , . . . is received, where Yj = X; + Z; and Z, € N(0, a). The noise variables Z\, ZT. . . . are independent of each other and of the original signal. The receiver interprets the signal F, as binary digit 1 if Y, > 1/2 and digit 0 otherwise.

20

Chapter 1. Introduction (a) Find p\ = P(Yi > 1/2) as a function of p for the cases a — 1/5 and a — 1/3. (b) Estimate a value of p if 162 of 424 observations at the receiver are digit 1 and the rest digit 0 and the noise parameter is set to a — 1/3.

Chapter 2

Markov Service Systems

The Markov property is a restrictive assumption. Still, many Markovian models are highly relevant for traffic modeling. Referring to the general discussion on time scale analysis in section 1.2, a Markov model could well capture random variations occurring in a burst or call scale modeling situation. As we will see, such methods should not be excluded even on a cell level, such as in ATM switches. Of particular importance is the class of continuous-time, time-homogeneous, Markov birth-and-death processes, again despite the strong assumptions put on such processes by the Markov property. They clearly provide useful insights into modeling of service systems although their relative simplicity and mathematical tractability may not warrant applicability without caution. On the other hand, indisputable facts are that they belong to, or even form, the historical core of the subject of traffic modeling and that progress on more general models often presupposes a fair understanding of this class. In this text, Markov models in discrete time are of almost the same importance as those in continuous time. This may be in contrast to other presentations of traffic modeling where slotted, or discrete, time models arise mostly in the form of embedded Markov chains of more general non-Markov continuous time processes as they are sampled at certain random times. This is a central topic that appears in section 3.5.3, but we have chosen to also present several models of switching, multiaccess methods, and so forth, specifically in slotted time mode since this seems to be more natural and convenient. The next sections describe some basic Markovian models, starting with selected aspects of slotted time systems and turning to continuous-time classical queuing service systems. We emphasize the ideas of equilibrium steady states and calculating performance measures. The reader should be alert to the severe restrictions we work under and keep in mind, for example, that such a case as the example in section 1.4.3 is not covered by the techniques given in this section.

2.1

Discrete-time service systems

Imagine cells of size t bits arriving at a buffered nodal point for transmission on a link of capacity c bits per second, so that the time slot required for transmission of one cell is t/c 21

22

Chapter 2. Markov Service Systems

seconds. Suppose that the time line is divided into half-open intervals (l(n - \)/c, in/c], n > 1. If there is at least one cell present in the system at the beginning of a slot, then a single cell is transmitted during the same slot, leaving excess cells stored in a buffer awaiting transmission in the slots that follow. To formalize, let Xn = number of cells arriving at the node during slot number «, An = X^=i %k — total number of arrivals in n slots, Qn = number of cells in the buffer (queue) at end of slot n, and Nn = number of cells in the system (in buffer or being transmitted), slot n. Here all quantities are indexed for n > 1. In addition, appropriate initial conditions should be specified, typically Q0 = 0 or NO = 0 if the system starts from an empty state. Example 1. Suppose m users are connected to a transmission node. Each user generates independently in every slot a cell with probability p and remains silent with probability \ — p. Then for each n, Xn e Bin(m, p) and An e Bin(nm, p). The scaling p = y/m yields An e Bin(nm, y/m) ~ Po(yn) for large m; compare this with section 1.4.2.

Now,

which summarizes to

or, equivalently, Other useful relations are

On the other hand, if we choose the system size as a state variable, then, analogous to (2.2),

Assuming, for example, that

then the Markov property

2.1. Discrete-time service systems

23

of the sequence Nn, n > 1, follows directly; similarly for Qn, n > 1. A basic recursion like (2.1) for the buffer size in this simple model is called a Lindley equation. It turns out that equations of this form play a prominent role in much of classical queuing theory, a major reason being the link they provide between the distribution of the sequence in question, in our case {(?„), and the distribution of the maximum for a related random walk. We consider this next. Put £„ — Xn — 1 and

The mentioned link between [Qn] and the sequence {5n}n>o, which is a random walk with integer jump sizes in [—1,0, 1,...], requires the independence assumption of (2.6) and is given by

To prove the distributional identity (2.7) we start by demonstrating the identity

Indeed, note that (2.8) is true for n = 1 and suppose for the sake of a proof by induction that it holds for a given integer n. Then

which establishes the next instance of the identity and hence the validity of (2.8). It is noteworthy that until now assumption (2.6) was not used. Invoking it at this point yields

which concludes the derivation of (2.7). Finally in this subsection we mention the well-founded objection against the present model in that it covers only constant service times. It is the varying number of equally sized cells per slot that generates random fluctuations in the buffer whereas the service procedure simply consists of transmitting one cell per slot as long as there are cells present. To a certain degree the restriction can be circumvented, allowing for a wider interpretation. Suppose that cells have random sizes L\ and replace Xn above by J^i=\ ^i, the total amount of work arriving during slot n. For comparison suppose that the L;'s are multiples of i and that transmission occurs at the constant rate i bits per second. Then the same model applies with Qn being an integer-valued sequence representing current workload, the remaining work left to do for the transmission node if no further cells arrived.

24

2.2

Chapter 2. Markov Service Systems

Arrival and service rates, continuous time

To formally study service systems in continuous time, we introduce the following notation, relevant not only for Markovian situations but generally as well: A, = arrivals process (number of jobs, calls, requests, cells, packets,...), B, = departure process, T\, TI, . • . = interarrival times, Si, 82, • • • = successive service times, Nt = number of jobs in the system = At — B,, Q, — number of jobs in buffer (queue), Mt = number of jobs being served, Nt = Qt + Mt. A further general notion is that of average arrival rate, which, if this limit exists, is defined as We have in mind single-server systems, m-server systems with m parallel service stations, and infinite-server systems, where any job is immediately assigned a server on arrival. In the situation with a finite number of servers and in the case where no server is available, the arrivals are stored in a buffer. The queue is emptied at the rate at which service capacity is again freed up due to departures from the system and according to a given set of rules. The simplest such rule is the first-in-first-out (FIFO) principle, also called first-come-first-served (FCFS), which ranks priority of service higher the earlier the arrival time. The service times Sk normally refer to the times needed for complete service of a given arrival before exit from the system. A classical Markovian service model is given by a set of birth parameters {A.n}n>o and a set of death parameters {//,„ }„>]. Here \k = intensity of an arrival at time t if N, — k and fik = intensity of end-of-service, hence system departure, at time t if N, = k. These interpretations are consistent with the infinitesimal notions

Since the intensities A.^ and ju^ are independent of t, the exponential distribution being the only continuous distribution lacking memory is bound to appear. The basic expression of this property for the exponential distribution is that the family of remaining waiting times until the next jump has the same distributions as the waiting time itself. In fact, if S e Exp(a), then for eacht

2.2. Arrival and service rates, continuous time

25

is independent of s. It turns out that to construct the process Nt one can proceed as follows. Given NQ = k, let Nt remain on the level k for a random time which is E\p(Xk+ /i*). Then j ump to k +1 or k -1 with probabilities A.* / (A.* + /A* ) and /A* / (X* + A 4 *), respectively. Repeat these steps by selecting at each new level independent waiting times that are exponentially distributed with the appropriate intensities. Equivalently, if N, — k, the remaining time until the next arrival is Exp(Ai) and the remaining time until the next completion of a service interval is Exp(/i^). Observe that these two random times are independent; hence their minimum is exponentially distributed with intensity A.J. + /^. Compare Exercise 1.6. This minimum time is the remaining time until the next jump of Nt, in agreement with the construction described above. Of central importance is the special case when A.^ = A. for all k > 0. Considering the embedded counting process, At, of upward jumps only, we see that P(A,+h - A, = 1) = P(Nt+h - N, = l\N, = k, for some ft) = Xh + o(h), and similarly P(At+h — A, > 2) = o(h); in other words, we see that A, is the Poisson process. Other examples are finite buffer, at most K jobs in system. discouraged arrivals; arrival rate slows down with system size. Turning to the service rates /j,^ we have the following simple examples: the single server model with service time distribution Exp(^t) which gives at any time t such that A', > 0 constant intensity IJL for a downward jump in N,. the m -server model where each server operates . , , , . , , ,, , • independently of the others for a random time with distribution Exp(At). By combining arrival and service rates appropriately we obtain a list of some of the classical queuing models, shown in Table 2.1. The crucial tool that allows for the derivation of performance measures and comparative studies of Markov birth-and-death processes such as the models given in Table 2.1 is the equilibrium steady state analysis that we discuss in the next section. This is no reason, however, to ignore completely the exact analysis of the distribution of the process Nt, and as a bare minimum we provide a short discussion of Kolmogorov's backward equations on which the transient, or time-dependent, analysis of such processes is based. In analogy with the Poisson process studied in section 1.4.2, and again using the notation p/t(0 = P(N, = k ) , we have, for small h,

hence

26

Chapter 2. Markov Service Systems Table 2.1. Markov queuing systems in Kendall notation. M/M/1: M/M/m: M/M/oo: M/M/l/K: M/M/m/m:

Figure 2.1. Ten sample paths of "M/M/oo, A./M = 20. and therefore

The existence of a unique solution {pk(t), t > 0}^>0 to this system corresponds to a system size process Nt,t > 0, being well defined. In general this requires certain restrictions on the parameters kk and 11%, excluding such cases where, for example, Nt would tend to infinity at a finite (random) time. No simple necessary and sufficient conditions are known but it is well known that if there are constants a and b such that Xk + MA < a + bk for all k > 0, then a unique solution exists. Clearly this criteria suffices for the models in Table 2.1. It is the exception rather than the rule that the system of equations in (2.11) can be solved explicitly, or that a representation of the solution exists which is of practical use.

2.2. Arrival and service rates, continuous time

27

Figure 2.2. System size o/M/M/1, initially empty, X/n = 0.9. One such exception is the M/M/oo model, for which the probabilities pk(t) can be found in a tractable form; see Figure 2.1. Indeed, let X* = A. and ^ — ilk for all k > 0. It is rather straightforward to verify that the resulting system of equations

equipped with the initial conditions pn(0) = 1, /?t(0) = 0, k > 1, has the solution

As a consequence, for each fixed t we have

It is perhaps surprising that the simple choice of parameters At = 1 and ^ = /x for the M/M/1 model yield as a result for the state probabilities the unwieldy expression

where lk are the modified Bessel functions

28

Chapter 2. Markov Service Systems

Figure 2.3. Sample paths of critical M/M/1, A. = /j,. A further, reassuring, fact is that despite its formidable character the formula for pk(t) greatly simplifies in the asymptotic limit as time tends to infinity. As a matter of fact, if A.///, < l,then/7t(f) —> (\—X/^)(\/n,)k,t ->• oo, a key result for the steady state analysis to follow. Some simulated trajectories of the M/M/1 system starting from idle are shown in Figure 2.2 for the subcritical case A.//Z < 1 and in Figure 2.3 for the case A. = ju.. In each case five independent realizations are indicated.

2.3

Ideas of stationarity and equilibrium states

A part of a network traffic system such as an access control point or a local transmission node, even with a random character of the traffic streams involved, should ideally work in a relatively stable manner, at least over relevant time scales. By this we mean that apart from more predictable changes in intensity, say, daily or seasonal, the variations and fluctuations inherent in packet delays, buffer occupation levels, etc., occur in such a way that dimension and design of the system could be chosen to guarantee, typically, acceptable quality of the service tasks the system is set to fulfill. Observed by a system operator, the network node would appear to be functioning in a randomly varying but still steady state. Eventually, in a greater perspective, the wider area network of which our system was a part would then also work in an equilibrium manner. If we accept that features of real systems can be captured by such stochastic models we have begun to study, the natural interpretation for a real system in equilibrium is that the distributions of the random quantities used in the model do not change over time. There are essentially two methods for implementing this reasoning into the stochastic modeling. To illustrate them we recall our main examples so far: the buffer sizes Qn in section 2.1 and the system size process Nt in section 2.2.

2.3. Ideas of stationarity and equilibrium states

29

First, it is natural to expect that for appropriate values of any parameters that are part of the model, and after an initial stage of adaptation, the distributions would settle down over time, for instance, P(Qn = k) » P(Qn+\ = k) = • • • or even P(Qn — k) — P(Qn+\ = k) = • • -, for n sufficiently large independent of the initial state QQ. In principle the settling down into a stage of stationary evolution over time could run indefinitely. Hence, mathematically the central notions are those of limit distributions and of limits in distribution of random sequences and processes as the time parameter tends to infinity. Supposing that regardless of the initial distribution of QQ the limiting probabilities

exist, a random variable Qx with P(QX = k) = n^ can be associated with the sequence { Q n } . Then the sequence is said to converge in distribution to Qx and {n^} is said to be the limit distribution of the system. Similarly, if

then {jik} is the limit distribution of system size in the continuous-time model. The random variables Qx and Nx represent buffer size and system size in the limiting sense, which in the philosophy of equilibrium states means that they are chosen to approximate at any fixed given time the true distributions of these quantities. The second, in a sense stronger, method for encompassing a state of equilibrium into a stochastic model of the kind we are concerned with is to choose the initial distribution so that it is actually preserved under the evolution of the system. Using again the buffer size example, this approach amounts to finding a distribution {%.} with £^0 % = 1, such that

For obvious reasons a distribution that satisfies (2.12) is called stationary. Regarding the relation of the two notions of equilibrium to each other, we note that a limit distribution is always stationary. If we have already found a limit distribution, then we pick Qo according to that particular distribution and start off the Markov chain. It is heuristically obvious, and not difficult to prove, that the distribution is preserved over time. The converse is not necessarily true, however; Markov chains with a periodic behavior must be excluded. Theorem 2.1 (limit distributions for Markov chains). Suppose a Markov chain {7,,}n>o on the nonnegative integers is irreducible: P(Yn = j\Yq = i) > 0 for some n > Q, for all i, j, and aperiodic: greatest common divisor of the set {n : P(Yn = i\Y0 = i) > 0} is equal to 1 (it suffices, for example, to find one state i with P(Yi = i\Yo = i) > 0), and suppose that a stationary distribution {nj}, rtj > 0, exists. Then the stationary distribution is unique, Jij > Qfor all j, and it is the limit distribution of{Yn}. Summing up, when we say that a system works in a steady state or operates in equilibrium, we refer to, unless otherwise specified, the situation in (2.12), which is typically

30

Chapter 2. Markov Service Systems

obtained by starting the evolution of the system from a stationary state that is also the limit distribution. Even if the initial distribution is not stationary, it often seems to be a mild approximation to ignore the difference between the steady state and the actual distribution at a fixed time. Some authors prefer to work with sequences like {2n}-t» 0; hence the buffer size Markov chain Qn is irreducible and aperiodic. We thus conclude from Theorem 2.1 that to find its operational characteristics in equilibrium, it is enough to compute a stationary distribution. By (2.2) such a stationary distribution must satisfy

and thus, since we already noted that P(X = 0) = 0 would violate (2.13),

The above is a first example of how to derive information about an unknown distribution, in our case that of Qx, in terms of a known, that of X, under suitable assumptions on the model, namely, assumption (2.13), which says that on average the input is smaller than the output of the service node. The argument is based on a balance equation (2.2), where Xn represents the input and l(e n +x,, +1 >i} the output, pursuing the consequences of equating terms. We can now proceed and express the state probabilities JT* = P(QX = k), k > 1, in terms of the expression for TTQ = P(Qoc = 0) just obtained. In general there are no simple expressions for the state probabilities, but since they solve a relatively simple system of equations it is in principle straightforward to calculate them recursively. The equations are obvious from (2.3) when considered in equilibrium, namely,

2.4. Balance equations, slotted time

31

It should be observed that the previous result (2.15) is not redundant since the first of these equations determines P(Qca = 1) in terms of P( I* ~ ^ ~ ^ ^o = 0. The maximum sequence maxo — oc (in the sense of almost sure convergence) as n —> oo. Indeed, negative drift suffices for the maximum to stay finite, in agreement with the equilibrium version of (2.7), which is the representation

Example 2. Suppose that

where we assume that p < 1/2 so E(X) = 1p < 1 in accordance with (2.13). The equations for the equilibrium probabilities simplify and it is seen that

32

Chapter 2. Markov Service Systems

hence

in other words, 2TO e Ge(l - j^)Via its random walk representation, the simplicity of this example is easily understood. In fact, we have £(&) = 2p — 1 < 0, and Sn is the simple random walk with 5o = 0 and jumps of size one, upward with probability p < 1/2 and downward with probability I — p. We have established a well-known property of the simple random walk with negative drift, namely, that its maximum is geometrically distributed, yielding, for example,

2.5

Balance equations, continuous time

The notions of stationary distributions and limit distributions briefly discussed in section 2.3 apply to discrete-time models as well as to continuous-time models. In general the theoretical basis for studying the asymptotic behavior of continuous-time Markov processes is more sophisticated than that for Markov chain models. On the other hand, for Markov traffic models there often is no particular need to bring in theory that goes beyond birthand-death processes. Recall from section 2.2 that a birth-and-death process is characterized by a set of birth rates {/„} and death rates {/zn}. Restricting considerations to this class, a result can be stated that gives complete knowledge of the asymptotic properties. Theorem 2.2. Assume that a birth-and-death process N,, t > 0, on the nonnegative Integers is governed by parameters {Xn}n>0 and {jtn}n>i such that

Then a unique stationary distribution

exists, which is also the limit distribution

Before discussing the proof on an informal level, let us look at two examples.

2.5. Balance equations, continuous time

33

Figure 2.4. Sample paths o/M/M/1, Q = 0.9, 1.0, 1.1. Example 3. Consider the M/M/1 model again with parameters A and /LI. Condition (2.18) takes the form

It is customary to denote the relevant parameter ratio by

and conclude, in the subcritical regime g < 1,

thus NX, e Ge(l — Q), which was mentioned at the end of section 2.2. We illustrate this fundamental example in Figure 2.4, which shows simulated traces of the process for the three cases with intensity ratios Q equal to 0.9, 1.0, and 1.1, respectively. Example 4. The model M/M/1/K is a variant of the previous model such that an arrival at time t is accepted into the service system only if N, < K; otherwise it is lost. Hence the total size of the system is always limited to size K. Recall the intensities

The divergence of the first sum in criterion (2.18) (consistent with A./C = 0 and fig = /LI) is a recurrence condition that is automatically satisfied as there is only a finite number of states.

34

Chapter 2. Markov Service Systems

Similarly, the convergence of the second sum in (2.18) is satisfied. The corresponding distributions in (2.19) simplify to

where the required normalization ^,Q nk = 1 implies

This is a truncated geometric distribution. There is no restriction on the parameter Q in this example. The larger the value of Q, however, the more arrivals will be lost. It is not difficult to comprehend the content of Theorem 2.2. Indeed, the probabilities Pk(t) = P(N, = k) are governed by the system of equations (2.11). If the process N, evolves in a steady state fashion it means that an initial distribution for Wo has been chosen such that pk (t) is actually independent of t. But then the derivatives must vanish, p'k(t) = 0. Writing in this case 7tk = pt(t), then by (2.11)

By iterating the second relation and as the final step using the first, we obtain

hence, by induction, the balance equations for birth-and-death processes in convenient form,

Solving in terms of JTQ, this gives

after which it only remains to find JTQ such that these relations are consistent with the required normalization,

It is now obvious that convergence of the second sum in condition (2.18) is necessary. To conclude the discussion about the validity of Theorem 2.2, consider for comparison the case of discrete-time Markov chains in Theorem 2.1. The notion of periodicity is completely tied to the discrete structure and is not relevant to the case of continuous time. What is needed, however, is an analog of the irreducibility criterion. The intuitive content of irreducibility is that any state can be visited from any other state, and clearly birth-and-death processes work

2.6. Jackson networks

35

as something of a prototype for such behavior. More formally, all transition probabilities satisfy are continuous functions of t. It turns out that this property is a sufficient condition for the stationary distributions obtained as solutions to the balance equations (2.20) to be also the unique limit distributions. For complete proofs, see a textbook on stochastic processes, such as Grimmet and Stirzaker [14]orResnick[51]. The next example continues the discussion of the M/M/1 model. Example 5. Consider a transmission link of capacity c bits per second, and suppose messages arrive at the transmission node according to a Poisson process of intensity A, messages per second. We assume that the lengths L\, L 2 , ..., measured in bits, of arriving messages are randomly varying, independent from one arrival to the next, and are exponentially distributed, Lj e Exp(l/l) with mean t bits. To transform these assumptions into an M/M/1 model we put bits/msg 5" = L/C = L/C seconds per message; bits/sec hence, observe that S e Exp(c/£) since

and interpret S as the service time in an M/M/1 model with arrival intensity A, and service intensity ^ = c/t. The traffic intensity is therefore Q = Xi/c, and the system will operate in a steady state as long as X.i < c, under which the system size Nt has a geometric stationary distribution with mean

2.6

Jackson networks

It is certainly necessary for achieving even modest goals of realism in traffic models to include several service systems and alternative routes. In a LAN, for example, a variety of incoming traffic streams are switched from one node to another for continued service or for direction onto outgoing links. In the Markovian framework there is a celebrated result that extends the use of the classical queuing models quite remarkably. If each node in a network of service stations is modeled by means of a classical queuing service system with infinite buffer, and traffic is allowed to be routed from the exit of one node to the entrance of another node, then Jackson's theorem gives a recipe for writing the stationary distribution for the number of cells present at each node of the network. Despite its simplicity and tractability, however, the theory we study in this section hardly resolves the issue of modeling complicated networks. The main drawback is that the Markov structure imposed on the system requires that routing of cells between nodes be completely state independent.

36

Chapter 2. Markov Service Systems

Jackson's network model can be described as follows. We work in continuous time mode and consider m unbounded service nodes of the type studied in section 2.2 (M/M/1, M/M/oo, or more general M/M/s systems), characterized by sequences fji1 = {/^} of service rates, fji'n = rate of service in node / if n jobs are present, n > 1, i = 1 , . . . , m. Suppose that each node is supplied with an external input line, and assume X1 = intensity of external arrivals at node /,A' > 0. Thus the arrival streams are purely Poisson and not allowed to depend on the present status at the node. Finally, suppose an m x m routing matrix R = (r,y) is given such that r{j = probability of entering node j on departure from node i, prohibiting, as mentioned above, the routing mechanism from depending on the state of the system. It is obvious that = probability of leaving the network from node i, and that the parameters {X1}, {/z!}, and R determine a system that works as shown schematically in Figure 2.5. Under suitable assumptions on the parameters the resulting state variables Nf = system size, buffer plus server, at time t in nodal point /, / = 1 , . . . , m, will be well defined and in the long run enter steady state behavior. We start with the trivial case R = 0 (the matrix with only zero entries). Then the Jackson model consists of m independent Poisson arrival Markov service systems, such as M/M/1, working in parallel without affecting each other. If the separate nodes settle into equilibrium distributions [nlk}k>o, i = 1 , . . . , m, then the network steady state is simply given by

What is remarkable is that the same structure prevails for the general Jackson network. In fact, such networks behave in steady state as if the nodes were independent, presupposing we calculate the actual arrival intensities to each node, consisting not only of external arrivals but also of internal traffic from within the network. Theorem 2.3. Suppose a network is defined by parameters [kl}, {/x1}, and R as above. If YI > Qfor at least one i, then the system of traffic equations

has a unique solution (yl, ..., y m ).

2.6. Jackson networks

37

Figure 2.5. Jackson network. If for each node i the sequence {n'n} is such that the service system in that node exposed to a Poisson arrival stream of intensity yl exhibits a steady state n'(y'\ then the network stationary distribution is given by

Example 6 (M/M/1 in series). A set of m serially connected M/M/1 service stations is a simple example of a Jackson network. We assume

Now Jackson's traffic equations take the form

Hence, if A < ^ for each j — 1 , . . . , m, then in steady state

38

Chapter 2. Markov Service Systems

Figure 2.6. M/M/1 with feedback mechanism. Example 7. A feedback M/M/1 system is given schematically in Figure 2.6; each job that departs from the server is independently switched into one of two routes. With probability r the job is allowed to exit the system and with probability 1 — r it returns to the entrance queue for repeated service, in the latter case as surplus to the external Poisson stream of intensity A, consequently adding to the workload of the constant intensity JJL server. The traffic equations simplify into

and thus, if A. < r/z, the steady state system size distribution is given by

2.7

Markov loss systems

There are two basic loss mechanisms. We discuss them briefly using two of the models in Table 2.1. The single-server finite-size model M/M/l/K is the basic example of a model where jobs are lost when the buffer has filled to its maximal size, K — 1, restricting the system size to Nt < K at all times. The m-server loss system M/M/m/m, on the other hand, blocks out arrivals if the maximal number m of available servers has been drained, hence avoiding the need for a buffer at all. Of course, an analysis of the combined model M/M/m/K would involve elements from both loss mechanisms. Loss analysis of M/M/l/K. Recall from Example 4 that the steady state distribution is given by the truncated geometric distribution

One can check that the average buffer occupancy E(N00) is given by

A bound on this quantity might be an appropriate measure of service quality, hence leading to a rule for selecting K. Perhaps it is more natural to consider the probability that the

2.8. Delay analysis in Markov systems

39

system is blocking arrivals, namely,

Anticipating a discussion in section 3.2, this can be interpreted as the proportion of time during which the system is full, and hence as the probability that any given arrival is lost. Given an acceptable value of the loss probability, it is a simple task to find the required buffer size; see Exercise 2.9. The m-server pure loss system. This is the model M/M/m/m: Again it is straightforward to solve the balance equations, obtaining

and, in particular, Erlang's 1st loss formula: The amount of lost work per time unit in the M/M/m/m model is given by ITTOT, where jrm is the loss probability

For numerical purposes it is often convenient to introduce a dummy random variable X e Po(^) and note that

2.8

Delay analysis in Markov systems

Clearly the above analysis of simple loss models raises further important issues, such as providing answers to questions like, How much time does a typical job spend in the system? and, How much longer does a job have to wait if further buffer space is added? As an introduction to understanding the balance between loss and delay we consider at this point some standard calculations regarding the simplest Markov systems. For convenience, we drop indices when we discuss steady state quantities and write, for example, N instead of NCC for system size.

2.8.1

Delay in M/M/1

The typical quantities studied are system time W, the total time from arrival to departure of a typical job, and waiting time Wq, the time spent in queue waiting for the service of

40

Chapter 2. Markov Service Systems

other jobs to be completed. For the M/M/1 model under FCFS scheduling, the steady state distributions of system time and waiting time can be found explicitly. The key idea is to condition on the system size N = n and represent the corresponding waiting time in a convenient form. More precisely, suppose a job arrives in the system and finds either N = Q = 0 or N = n > 1, and hence Q = n — 1 > 0 other buffered jobs waiting for processing. The total time the newly arriving job will have to wait in the system is

where S is the service time for the arriving job itself, U is the remaining service time for the job occupying the server at arrival, and S\,..., Sn-\ are service times for the n — 1 jobs waiting in the buffer. Now S as well as Si,..., Sn-i are independent and all are exponentially distributed with parameter JJL. Moreover, service times and remaining service times are identically distributed in the exponential case (see (2.10)), which means that U has the same distribution as the other summands in the representation for W. Hence for both cases n = 0 and n > 1 we obtain

in view of Exercise 2.5. Thus, considering the conditioning, we may indicate the distribution of W via the suggestive notation

which in explicit terms means that W has the density function

In other words, the law of the M/M/1 system time W in equilibrium is remarkably simple! The exponential distribution with expected value I/(/A — A.) appears. Similarly, restricting the analysis to queuing time only, we have

hence if n > 1 and Wq = 0 otherwise. It follows that Wq is a mixed distribution that assigns mass P(N = 0) = 1 — Q to an atom at zero (no waiting in line is necessary), and the remaining mass is distributed according to the reweighted exponential density Q fw(x). Equivalently, P(Wq > jc) = Qe-^-V*, x >0.

2.8. Delay analysis in Markov systems

41

Figure 2.7. Web client-server model. First look at Little's formula. Recall that in steady state M/M/1, E(N) = g>/(\ Q) = A,/(// — A) and hence E(Q) — Q2/(l — Q). The mean sizes and corresponding mean delay times are therefore related by

These are simple instances of the celebrated Little's formula, which states that in a multitude of models the mean size is proportional to the mean delay with the average arrival rate being the proportionality constant. 2.8.2 A client-server Jackson network The World Wide Web is growing at an unsurpassed speed. Over a period of several years the number of web servers has increased exponentially. The driving force of the growth dynamics might be, on one hand, to meet the request for service from the growing collective of clients having access to the server via the Internet, and on the other hand, to generate new groups of clients of financial or other potential interest to the server. Modeling of such a large-scale supply and demand system, and of the evolutionary dynamics of the web, is in its infancy. Interesting starting points can be found in [20]; see also [58]. To set up a simple model for the performance of a website exposed to Internet community load we follow Slothouber [59]. The website is assumed to act as a file server only responding to download requests arriving from Internet clients. Each request consists of retrieving a file of exponential size F with mean E ( F ) . The server will transmit only a chunk of the file at a time, of exponential size B with mean E(B}. Hence the client may have to repeat its request a random number of times until file transmission is complete. The system is modeled as a Jackson network with five M/M/1 nodes; see Figure 2.7 (slightly modified compared to [59]). Three nodes model the web server and two nodes the Internet communication network. Requests arrive with intensity A, hits per second at the entry point of node 1, where one-time processing is performed at service rate ^JL\. The service time at node 2 with rate yU2 represents server time, which is independent of file size. At node 3 the server reads B bits of data, which are processed at the rate Cr. This block of data is transmitted by node 4 to the Internet at the server's transfer rate Cs bits per second and received by the client's browser, node 5, at client network bandwidth Cc. Now, with probability r = E ( B } / E ( F ) the file transfer is complete, and with probability 1 — r the job is retransmitted to the input of node 2. This is repeated until the job exits from node 5. One should observe before continuing that the choice of the retransmission probability r is consistent with the assumption of F and B both having exponential distributions. In fact,

42

Chapter 2. Markov Service Systems

in complete analogy to (2.22) we have

where M e Ge+(l — r) is the number of rounds needed to download the complete file of size F. We pause to recall the general Jackson net in Theorem 2.3 and to consider delay times in such networks. If we accept the basic relationship of Little's formula, then we can immediately write down expressions for delay in a Jackson network. As before consider m Markov service nodes with external Poisson arrival intensities A ( , internal traffic intensities yl obtained from Jackson's traffic equations, and steady state distributions represented by random variables Nl for the number of messages at node i, i = 1 , . . . , m. The average system time delay in node / is given by ENl / y l , where the mean is computed with respect to yl. Similarly, the average network system time for any message arriving at the network is given by

In the client-server model it is simpler to compute the total mean delay E(W} for a client as follows. First, it follows from the traffic equations for this model that

The delay times in nodes 1 and 2 are therefore

Similarly, since it takes on average E(B)/C seconds to process a file of size B under capacity C,

The number of round-trips, M, for which E(M) = 1/r = E ( F ) / E ( B ) , must be included, and the final result becomes

Clearly, for this to hold X has to be small enough so that all terms exist finitely. The final result differs somewhat from [59], mostly because we preferred five nodes rather than four to keep the exponential file size distributions exact. Slothouber suggests typical parameter values C s = 1.5 Mbit/s, Cc « 700 Kbit/s, E(F) « 5 Mbyte, E(B)^2 Mbyte and investigates the influence on delay as the remaining model parameters are varied.

2.9. Exercises

2.9

43

Exercises

2.1 An indicator for traffic load in a router is measured once a minute, at which times the load is classified as either normal or high. In the case of high load a leaky-bucket system starts, which reduces incoming traffic. The sequence of classifications can be described as a discrete-time Markov chain. The probability that a normal reading is followed by a high reading is 0.10 and the probability that a high measurement is followed by a normal one is 0.95. Find the probability that a high load is registered during an arbitrary minute interval. What is the expected number of minutes during an hour at which a registered high load is followed by a normal load, that is, at which the control system is effective? 2.2 A binary source in a communication system generates a sequence X\, Xi, ... of zeros and ones according to a Markov chain with transition probability matrix

where a and b are the probabilities for a change from one symbol to the other. The sequence is transmitted over a binary symmetric channel, which means that the fcth symbol received, Yk, is corrupted by errors with probability e and error free with probability 1 — £, independent of previous errors. When the system is in equilibrium find the conditional probability P(Xk = l\Yk = 1) that a received digit one is error free, as a function of a, b, and £. 2.3 Consider the service system M/M/oo, where the arrival intensity \n = A. > 0 is constant regardless of the number of jobs n in the system but the service rate equals j.in — \in,n > 1, where JJL > 0 is a given constant. Write the balance equations. Solve to obtain a stationary solution. Is it necessary to impose restrictions on the parameters to find a solution that is a probability distribution? What is the steady state distribution of this service system? What is the utilization? 2.4 Think of the service time required for each job arriving at a service system as an amount of work that each arrival is carrying with it (for example, link processing time). We associate with an M/M/1 queue the discrete time workload process Yn, n > 1, by letting Yn be the accumulated amount of work brought into the system by arrivals occurring in the time interval [n — 1, n). We obtain in this way an i.i.d. sequence. Why are they independent? What can be said about the distribution of the Yn 's? The mean? (A reader with some background in probability may want to find the variance and also the moment-generating function.) 2.5 Based on the previous exercise consider the M/M/1 system from the viewpoint of the server. By recording at each time t the amount of residual work that remains to be done to finish what is in the system at time t, we obtain the continuous time virtual workload process. Draw a graph of what such a process would typically look like. Observe that a simple change of units yields the equivalent virtual waiting time process.

44

Chapter 2. Markov Service Systems

2.6 For a small circuit-switched public network find the number of circuits necessary to keep the probability of blocked calls less than 0.2, assuming Poisson arrivals with intensity 120 calls per hour and exponentially distributed call durations with a mean of 2 minutes each. 2.7 Suppose an access control mechanism of a single-server system has the effect of changing a constant arrival rate a. into the rates an = « / ( « + 1), where n is the number of jobs in the system. Assume exponential transmission times with rate \JL. Find the equilibrium distribution and the expected size of the system and queue, respectively. Find the average arrival rate. Then, from Little's formula, determine the average waiting times in the system and in the queue. 2.8 By expanding the cube of Qn+\ in (2.2), express the equilibrium variance of 7 = 0 for all other pairs i, j. Describe the steady state of the network. In particular, what is the distribution of the total size of the network in steady state?

(Ampler 3

Non-Markov Systems

This chapter introduces techniques based on renewal theory and techniques related to general service-time distributions in queuing and loss systems. Renewal theoretic methods are particularly useful for performance analysis in many service systems. The basic properties of renewal processes and renewal-reward processes are covered and the application of renewal models in two specific areas of networking are discussed at some length. The objective of the first example is to express in model parameters the probability that an ongoing mobile phone call is terminated. The mobile unit moves from one cell of the service area into another and the call is terminated because no traffic channel is available in the new cell. The second set of examples deals with reliable data transfer protocols for modeling the transport-layer communication protocols on the Internet. Service systems with Poisson arrivals and general service-time distributions belong to the core of traditional queuing theory. A selection of material is presented starting with the Pollaczek-Khinchin formulas. As an introduction to more advanced topics we discuss two different approximations of the queuing delay time in M/G/1 systems: one approach is based on the heavy traffic approximation and the other uses recursive representations. Some general references for this chapter are Bertsekas and Gallager [5] and Harrison andPatel[17].

3.1

Performance measures

We begin by presenting a list of notions that are useful for measuring performance in many systems, such as the classical Markov models, another non-Markov service system that will be discussed in this chapter, and also, for example, switches. Traffic intensity:

Q=

average service time required per server per unit time,

Off Hi rt • average amount of traffic presented to system per unit of time,

45

46

Chapter 3. Non-Markov Systems Utilization:

fraction of used system capacity,

Throughput: e r

average amount of work completed by F J the system J Fper . ° . unit of time,

Loss probability:

average fraction of lost traffic, and

Blocking probability:

probability the system is blocked at a random time.

Normally one can think of utilization as the fraction of busy servers. The following relation between some of the listed quantities holds in great generality: loss probability = 1

utilization traffic intensity

We demonstrate the validity of (3.1) for a number of models in examples and exercises below. At this point we offer only a heuristic argument as follows. First, we may think of the loss probability as the limiting ratio of the number of lost messages to the number of attempted messages considered over a long time interval, as indicated in loss probability:

number of lost messages total number of messages total number of messages — number of transmitted messages total number of messages number of transmitted messages total number of messages

By dividing with the length of the time interval, the same relation considered per unit of time is obtained as number of transmitted messages per unit of time loss probability: 1 — total number of messages attempted per unit of time The system service rate is the number of messages that the system can accommodate per unit of time. Hence the ratio of the amount of transmitted messages to the system service rate measures the fraction of system capacity actually used. In other words, it measures the utilization of the system utilization:

number of transmitted messages per unit of time system service rate

Similarly, number of attempted messages per unit of time system service rate

system arrival rate system service rate

The service rate per server is inversely proportional to the expected service time. Hence the above ratio, considered on a per-server basis, equals the fraction of time during which a given server is requested to provide service, system arrival rate system service rate

server arrival rate = traffic intensity, service rate per server

3.2. Integrated processes and time averages

47

Figure 3.1. Load versus throughput M/M/l/K, K = 1,2,5, 10, 100, /x = 1. which combined with the earlier relations make (3.1) plausible. Example 8 (continuation of Example 4). We look at the specific case M/M/l/K:

It is clear that (3.1) holds. Observe that this model is defined for any Q > 0. A load-versusthroughput graph is shown in Figure 3.1. Further examples are deferred to the exercises.

3.2

Integrated processes and time averages

In various situations integrals of stochastic processes arise naturally. For example, if Nt is system size in a single-server system, then f^ 1 {Ns >0} ds is the occupation time for the server during [0, ?]. Integrals of the form fa} Xs ds are well-defined random variables as long as the sample functions of Xt are not too irregular. Sufficient regularity conditions are known and cover virtually all stochastic processes arising in practice, including those in this book. To present the rules for calculating moments we assume that { X t , t € T}, T a fixed interval,

48

Chapter 3. Non-Markov Systems

has finite mean and variance such that E(X t) is a continuous function oft and Cov(Xs, Xt) is a jointly continuous function in s and t. By using results from measure theory it can be verified that the order of integration and expectation can be interchanged, so that

Moreover, applying (3.7) to the double integral,

it follows that

The process Xt is weakly stationary if the mean function E(Xt) is constant independent of t and the autocovariance function c(t} = Cov(Xs, Xs+t) is independent of s. Under this further restriction £(/Q Xsds) = E(Xo)t and (3.8) simplifies. In particular, change of variables in the integrals shows that

Time averages is a useful technique for studying the asymptotic behavior of service systems. Recall that in the steady state analysis of Markovian service systems, a limit distribution {nk} represents the typical probabilities that after a long time interval, a trajectory of the system is observed in state k > 0. If such a process Nt with steady state A^ has been observed over a long interval of time [0, t] and a time instant u G [0, t] is selected randomly, then the distribution of Nu should be close to the distribution of A^,, which is {nk}. It is natural to expect that this would hold with or without Markovity. To follow this line of thought we represent the knowledge of the evolution of a general process Nt by writing heuristically J-t = {all events known from knowing NS,Q < s < t},

t > 0.

Indicated here is the mathematical formalism of a filtration (F^t^o associated with the stochastic process Nt, a notion that in full requires the theory of measures and measurable spaces. On a nonformal level it is sufficient to think of Ft as the information about the prehistory [Ns, 0 < s < t} of the process up to time t. Let U(t) e Re(0, t) denote a random variable independent of (A7,) that is uniformly distributed on [0, t]. Based on the above heuristics we expect that Nu(t) represents an average state, that is,

The effect of performing the conditional expectation E(Nu(t)\Jrt} is to average over U(t) only, keeping the random variation in (Ns) intact. In such a way the time average process of

3.2. Integrated processes and time averages

49

(Ns) appears. Since the density fu(t)(s) of U(t) is constant and equal to \/t for s e [0, t], it follows that

Going one step further, averaging over the sample functions of (JV y ) shows that

Similarly,

and we expect

for large t. This turns out to be true in great generality. In particular

identifying the long-run proportion of time that Nt spends in state k with the steady state probability xk for NOQ. Moreover,

and so

Such results belong to ergodic theory. As an application we return to the example of occupation times in single server systems. Here

which verifies that the fraction of time the server is occupied indeed equals the traffic intensity. Although the conceptual reference in this brief introduction is to Markov processes, typically the Markov property is not essential. Results of the above form are common throughout probability theory. We move onto renewal theory in the next section and continue the discussion of these topics and related ideas. Wolff [69] systematically exploited time averaging methods, gave detailed presentations of Markov processes and renewal theory, and studied many interesting applications of queuing theory.

50

3.3

Chapter 3. Non-Markov Systems

Some ideas from renewal theory

The standard (delayed) renewal model is the following. Consider a sequence of independent, nonnegative random variables U\, Ui, • • • , where U\ has distribution function F\(t) = P(U\ < t), t > 0, and t/2, t/s,..., are identically distributed with common distribution function F(t) = P(U{ 0, i > 2. We assume

Let Tn = Y^=i Ui, TQ = 0, denote the partial sums, and suppose that renewal events occur on the real line at times T\, 7 2 , . . . . The renewal process is the counting process

associated with the i.i.d. sequence (Ui). The special case F\(t) = F(t) = 1 — e~xt returns the Poisson process with intensity A. Two simple but useful observations are the relationships

(see Exercise 3.5) and The latter ordering property shows that

An argument based on the strong law of large numbers now completes a proof of the renewal theorem in its simplest form

More advanced methods [14], [69] lead to the corresponding property for the mean number of renewals, the elementary renewal theorem,

The stationary renewal process is obtained by choosing for F\ the equilibrium distribution associated with F(t), namely,

It can be shown that with this choice E(Nt) = t/v, so that the asymptotic relation in the elementary renewal theorem is in fact an identity for any fixed t > 0.

3.3. Some ideas from renewal theory

51

3.3.1Renewal reward processes The renewal reward theorem is an extension of the renewal theorem, which is often useful and convenient for asymptotic throughput analysis in various traffic models. The idea is to associate to the renewal cycles not only their lengths Ut,i > 1, but also a further sequence of random variables, /?,, i > 1, where 7?( represents the reward accumulated during cycle i. The total reward up to time t is given by partial reward from interval ( T ^ t , t ] . The aim is to find the asymptotic mean reward in the sense of a time average, i.e., the limit as t —» oo of fraction of partial reward. It is clear from this relation what to expect. The renewal theorem shows that Nt/t —»• 1/v, and so it should follow from the strong law of large numbers that/?,/? -> E ( R ) / v & s t —»• oo. The typical assumptions imposed on the rewards to guarantee the expected behavior are that for each _/', Rj may depend on Uj but is independent of all £/,, / ^ j, and that (/?,-) is an independent sequence, identically distributed except possibly for R I , which is allowed to have a different distribution. Moreover, it is assumed that E\R{\ < oo for any /. It can be shown that under these assumptions the details of assigning rewards to renewal events does not affect the end result. It does not matter whether a reward is counted at the beginning or at the end of a renewal interval or if it is gradually allocated continuously over time. In either case the partial rewards vanish asymptotically, and the renewal reward theorem states that the time-averaged total reward converges to the cycle averaged reward. Indeed

where E(R) is the common expected value of the rewards /?,-, i > 2. For proofs and more general versions related to regenerative processes, see, e.g., Wolff [69].

3.3.2

Renewal rate and on-off processes

Now let X i , Xi, • • • denote a sequence of i.i.d. random variables with finite mean E(X) and variance V(X), which is independent of the interrenewal times U\, f/2, • • •• Define the corresponding renewal rate process to be

where the notation TO = 0 is added. A useful property of the renewal rate process is that its mean and covariance functions are easily calculated. Indeed,

52

Chapter 3. Non-Markov Systems

and

since if U\ < t, then A, = Y^=2 ^» ^{Tn-\ 0. The class of alternating renewal processes or on-off processes is another interesting starting point for modeling general arrivals. Here the renewal times Tn = £)"=1(£/i + Vj) are generated by two i.i.d. finite mean sequences (t/;) and (V,), and the on-off process

equals one during the successive on-periods of length U\, f / 2 , . . . and equals zero during the off-periods of length V\, Vi, The initial condition is ZQ!) = 1; the analogous process Zf( ) 0) withZQ = 0 is obtained by switching the roles ofU and V. More generally we can construct a stationary on-off process Zt. To find the corresponding steady state probabilities we apply the renewal reward theorem. The mean cycle lengths are v = E(T\) = E(U\) + E(V\) and the mean rewards £(/?,•) = E(U\). Over an interval of length t the integral J0' Zs ds is the length of time the process spends in the on-state. Hence partial reward. In the limit t —» oo this yields the on-state equilibrium probability

It can be verified that the on-off process indeed has an asymptotic distribution (P(on), 1 — jP(on)}, given as above by the limiting ratios of the mean values. We compute the covariance function for the stationary process Zt in the simplest case of the Markovian alternating renewal process. In this case the sojourn times are exponential with constant intensities a > 0 for jumps from off to on, ft > 0 for jumps from on to off, and P(on) = a/(a + /?). Now

3.3. Some ideas from renewal theory

53

Referring to the Kolmogorov equations (2.11), the function p\\(t} — P(Zt = l|Zo = 1) satisfies Hence and thus

Moreover, V(Z,) = y(l — y} and therefore

3.3.3

Hand-off termination probability

The stochastic model considered here was introduced by Lin, Mohan, and Noerpel [32]. The service area of a personal communication service network is partitioned into several fixed cells. Subscribers of the service are mobile and carry portable phone units. From time to time they move between adjacent cells, and it is during these moments that an ongoing call is under risk of early termination. Whenever a portable enters a new cell with a call in progress it requires a new channel; this procedure of changing channels is called hand-off. We are interested in the number of cell transitions during an ongoing call and ultimately in the forced termination probability for the call. Assume that the intercell times, i.e., the successive time intervals that a user spends in consecutive cells over the service area, are independent and with lengths U determined by a given distribution function F(t) = P(U < t). Assume also that the call holding times, 5, are exponentially distributed with parameter /i — l/E(S} and independent of cell duration times. We pick the starting time and location for the call randomly. It is natural therefore to model a single call using the stationary renewal process. Let N, be the renewal process with interrenewal times such that [/,, i > 2, are i.i.d. with distribution function F having finite mean v = E(U) < oo and U\ has the equilibrium distribution function Feq in (3.13). This means that a user call is initiated at time t = 0 and that Nt gives the number of cell transitions for that user up to time t. Hence K = NS = number of hand-off transitions during a single call. The final parameter in the model is the forced termination parameter pf. At each hand-off transition event the call is terminated in advance with probability pf independent of the number of previous hand-offs. Therefore the random variable M = number of hand-off transitions until first forced termination is geometrically distributed, M e Ge+ (/?/), and is independent of K. The forced termination probability is the probability that the first forced termination event occurs while the call is still in progress,

54

Chapter 3. Non-Markov Systems

Now, to have K > k for some k > 1, it must be true that the duration of the call exceeds the time U\ of the first hand-off, the probability of which is P(S > U\). Given S > U\, the remaining duration of the call must exceed f/2, and so on until all k hand-offs have occurred. But since S is exponential these remaining life-lengths are also exponential, whence

and so

Now we use the calculation

to conclude

It is an integration exercise to rewrite

The final representation of the forced termination probability is therefore

3.3.4

Reliable data transfer

IP provides the basic delivery service between communicating end systems on the Internet's network layer. Although every host has a unique IP address and the protocol is responsible for the logical communication between any two hosts, the IP service makes no guarantees that data segments are delivered correctly or in order to the receiving end. Because of this, the protocol is of best-effort nature and IP is said to be an unreliable transfer protocol. The fundamental task of the transport-layer TCP, except for extending delivery service to processes running on the end hosts, is to provide reliable data transfer. Sequence numbers, acknowledgments, and timers are used to ensure that data segments are delivered orderly and uncorrupted. In section 6.4.3 we make a detailed study of the congestion control mechanisms of TCP. In this section the mechanisms of three simpler protocols for reliable data transfer are investigated: the stop-and-wait protocol, the go-back-N protocol (GBN), and the selective repeat protocol (SR). The aim is to compare these schemes with respect to effective throughput as a function of the packet loss probability. To solve this task we apply the renewal reward theorem.

3.3. Some ideas from renewal theory

55

For the exact principles of reliable data transfer see the computer networks literature, e.g., Schwartz [55], Stevens [60], and Kurose and Ross [30, Chapter 3.4]. Here we restrict ourselves to a simplified model for the transmission of packets (or frames) from a sender to a receiver, where it is assumed that packets of fixed and equal size, numbered in sequence, are transmitted successively subject to constant delay times. Every packet delivered at the receiver's end is acknowledged by the return of an ACK packet in the opposite direction, and the arrival of the ACK at the sender marks the end of a successful transmission round. For simplicity we select a time unit by letting tR = 1 be the round-trip time. In addition, a time-out clock with expiry time TO is used to handle the loss of an ACK; expiry of the clock at the sender triggers the retransmission of one or several unacknowledged packets. To continue putting this in the framework of a mathematical model, we associate with each data packet a loss probability /?, the probability that either the packet or its ACK is lost during transmission, and hence the packet unacknowledged at the sender. We start by analyzing the go-back-1 protocol. The sender transmits a single packet on the channel then waits a round-trip time for the corresponding ACK to arrive. If the packet is successfully delivered, then the return of the ACK packet marks the start of the next round, where a single packet can be transmitted again. If the packet is lost, and consequently no ACK packet arrives in the expected time period, then the time-out clock is activated and the packet is retransmitted with an additional delay of TO, the time-out span measured in number of round-trip times. The periods between consecutive packet loss events can be thought of as cycles, where each cycle consists of a random number of rounds. Let K denote the number of such rounds in a cycle, K = number of rounds until a packet is lost. Then K e Ge + (p) since the probability that a loss occurs in round k is P(K = k} = (1 — p)k~l p, k > 1. The loss is discovered only at the end of the Kth round and TQ roundtrip times elapse before retransmission of the lost packet. This means that effectively per cycle K — 1 packets are delivered over a time interval of length K + TQ. The stop-andwait protocol is very similar conceptually. The difference can be captured in the model by saying that now K — 1 packets are delivered over a time interval of length (K + \}TQ round-trip times. See Schwartz [55] for a detailed discussion of the differences between the two protocols. Although the set-up in [55] is slightly different from ours, the resulting throughput formulas will be the same. To rephrase the above in terms of the renewal reward theory briefly introduced in section 3.3.1, let N, denote the number of cycles up to time t. Then N, is the renewal process associated with a sequence of interrenewal times ([//), such that for go-back-1 Ui = Kf + T0 and for stop-and-wait (// = (Kf + l)7o, where Kt is the number of rounds in cycle /, i > 1. The mean interrenewal times are therefore

with each cycle i is associated the reward Rf = Kt — 1 with E(Ri) = (1 — p)/p. The throughput over time [0, t] can be written Throughput (t)

56

Chapter 3. Non-Markov Systems

and so, by the renewal reward theorem, ThroughputGB1 (formulas 4-6 and 4-1, respectively, in Schwartz [55]). The extension to GBN is presented in two steps. In the main body of the text the throughput is derived for a simplified version of the protocol. The more realistic modification is discussed in Exercise 3.4. In these models it is assumed that the packet size is small compared to the round-trip time and that a parameter TV acting as a windowsize is introduced giving the maximum number of packets that the sender is allowed to send without waiting for acknowledgment. A successful round consists of transmitting N packets, each packet subject to loss independently and with the same probability p. If one or several of the N packets are lost, then we assume in the simplified model that all the TV packets in the same round have to be retransmitted. (In the model of Exercise 3.4 only those packets following after the first loss in the window must be retransmitted.) The probability that a round of packets is delivered loss-free is (1 — p)N and the number of rounds until the first loss occurs is a geometrically distributed random variable K e Ge+(l — (1 — p)N). Just as for the go-back-1 protocol, we let Nt be the number of cycles to time t, and we note that

is the expected time between any two loss events. The reward associated with cycle i is a total of Ri = N(K{ — 1) packets and the expected reward therefore is

Again the resulting throughput derives from the renewal reward theorem, ThroughputGBN that is, ThroughputGBN The SR protocol was devised to avoid the drawback with GBN that all the N packets in a round must be retransmitted if a loss occurs. The SR protocol can be modeled in close analogy to GBN under the simplifying assumption that the possibility of several packets being lost in the same round can be ignored, which is to say that since p is sufficiently small, probabilities of order pk, k > 2, can be ignored. Then SR is obtained from GBN with the modification that another N — I packets are added to the reward of cycle /,

3.4. The loss and delay time balance

57

Figure 3.2. Throughput in GBN (filled lines) and SR (dashed lines). since in this case only one packet of those in the final round before time-out must be retransmitted later. This gives ThroughputSR A comparison of throughput for GBN and SR for /?-values ranging from p = 0 to p = 0.1 and N from N = 1 to TV = 40 is shown in Figure 3.2. The time-out interval was set to TQ = 4 round-trip times.

3.4

The loss and delay time balance

We have seen that the obvious remedy for avoiding congestion in nodes and transmission links is to add buffer space. It is equally obvious that if a packet is allowed into a heavily buffered system, then there is a definite risk that the packet will spend an excessive amount of time in various buffer queues, possibly generating unacceptable delays in end-to-end delivery time. Traffic modeling can give some insight into this fundamental balance between buffer space and delays. We follow the usual approach to time delay modeling, namely, we consider a singleserver system with delay consisting of buffer time and service time. Certainly the total delay of a packet can originate in many other ways. First, each contribution to the delay the packet suffers within a given subnet in the network should be summed to obtain the total. On the level of simpler units such as transmission links, the delay might be broken down into several sources. Normally it is fair to assume that processing delay and propagation delay are independent of traffic load and hence do not directly affect the loss-versus-delay trade-off. On the other hand, transmission delay depends on the packet size distribution, and queuing delay on buffer sizes and retransmission delay can be very sensitive to traffic load. Moreover, what is considered acceptable delay in one traffic class may not be acceptable in another, which points to further difficulties in systems designed for a variety of traffic types.

58

Chapter 3. Non-Markov Systems

The system is represented by a long time arrival rate A = A.,*, and a system size process Nt, typically assumed to be in steady state and referred to by N only. Our discussion still includes, but is not restricted to, the classical queuing Markov single-server systems. Consider W = time span from the arrival of a typical message until it departs from the server, = Wq + S = time waiting in buffer + service time, as in section 2.8.1 where it is called the system time. Since it is not clear what a "typical message" is, for now we leave it as a textual description. What can be said about the expected value E(W}1 Following the heuristic we argue that if we pick an arbitrary arrival "painted pink" and trace it through the system, then w = E(system time for pink message) = E(W). By definition of A, during the time span w on average Xw messages arrive at the system. Obviously they all arrive later than the pink message and so, assuming the FCFS policy, none of them are able to depart before the pink message. Consider now the situation in the system at the particular time when the pink message departs. All messages that were already in the system when the pink message arrived have, again by the FCFS assumption, now departed, leaving us with the conclusion that there are approximately Aw remaining messages in the system at that time. However, seen from the perspective of the system, the departure time of the pink message is just any time and hence there should be approximately E(N) messages in the system at that time. Therefore, to keep the balance straight, E(W) = E(N)/k. As already noted in the Markov case, this rather innocent relation between the time a typical message spends in the system and the typical system size, proportionally determined by the arrival rate, is known as Little's formula. It turns out to be one of the few results for service systems that is true in great generality without strong assumptions on Markovity or independence. 3.4.1

Little's formula

We continue the discussion of this fundamental relation more stringently but without attempting to be fully rigorous. Suppose the service system is initially empty at time t = 0 and recall that system size is always the excess of arrivals to departures, Nt = At — Bt. Figure 3.3 illustrates this relation using a simulated trace of the M/M/1 process represented via its arrival and departure processes At and Bt; the difference between the two increasing curves is therefore the system size Nt. Several busy periods are visible interjected by idle periods during which At and Bt coincide. We compute the limit of

as t -> oo in two different ways. The ergodic limit result for time averages in (3.10) gives

3.4. The loss and delay time balance

59

Figure 3.3. Arrivals and departures in M/M/1.

To compute the same limit by different means, let Wi, W2, . . . denote the successive times spent in the system for each message in order of arrival. The key observation is that for each time point t such that Nt = 0 we have

This can be understood as follows, referring to Figure 3.4. The left-hand side of the above equation is the area between the curves At and Bt. The right-hand side of the equation is the same area built up of horizontal sections one for each jump of At of height 1 and length given by the horizontal distance to the corresponding piece of the curve B,. Considering one such jump at time t, the horizontal distance is the time it takes for the server to handle At clients, hence the delay of that particular arrival. If tk is a sequence such that Ntk = 0 and tk —»• oo, k —»• oo, it follows that

as k —>• oo; compare this with the renewal reward theorem in section 3.3.1. The first factor on the right side appears by definition of the long time arrival rate X. The second factor E(W), the mean in the steady state delay time distribution, results from the strong law of large numbers n~l Y^j=\ Wj —> E(W) since A(^) passes through all integers and increases indefinitely as k grows. The last step toward Little's formula is to verify that the

60

Chapter 3. Non-Markov Systems

Figure 3.4. Blow-up, section of previous graph. same method works in general for t > 0, in the sense that error term, where the error term can be shown to vanish in the limit t -* oo [69, Chapter 5.15]. Thus we are able to identify the two limits of f^ 1 /0' Ns ds and conclude that E(N) = XE(W). A similar analysis of t ~' J0 Qs ds leads to the various instances of Little's formula:

£(busy servers) = XE(S).

3.5 The M/C/1 system In the remainder of this chapter we turn to the family of service systems known as M/G/m. Here M stands for Markov in the sense that arrival epochs are given by the Poisson process, G stands for a general service time distribution, and m represents the number of servers. We restrict the discussion to the single-server system M/G/1 and the infinite-server system M/G/oo. These systems are not directly covered by the mathematical techniques already discussed in the direction of exploiting renewal theory methods or ergodic theory but require other ideas.

3.5.1

Simple examples leading to non-Markovity

We give some further examples not covered by the models introduced so far.

3.5. The M/G/1 system

61

Figure 3.5. Read-write disk access.

Example 9. Deterministic service times. Consider a single transmission link of capacity c bits per second equipped with a buffer designed to hold arriving packets until the link is free. Packets arrive at the link according to a Poisson stream with intensity X packets per second, hence the arrival process A(t) is the ordinary Poisson process. All packets are of fixed equal length L bits and thus each requires a transmission time, or service time, of S = L/C seconds on the link. The natural measure of traffic intensity seems to be Q = AL/c and we can assume Q < 1 in order to expect an equilibrium situation. Quantities such as system size N or delay W or performance measures such as utilization are still apt for study. But how? There are no obvious balance equations. For example, even if L/C is an average service time, its inverse fj, — c/L can no longer be interpreted as a Markovian jump rate. Example 10. Tandem link, fixed packet size. Connect two such links as in the previous example in a series and let each packet that departs from the first link immediately enter the second. It is a rather striking effect that at the second node there will never be a single packet in line! In fact, the interdeparture times from the first node are all greater than or equal to the fixed number S, and thus so are the interarrival times at the second node. But that is enough to avoid any risk of collision on the second transmission link. Example 11. Rotating disk. Suppose a storage disk, sketched in Figure 3.5, rotates r times every second and is partitioned into s distinct sectors. It is connected to a read-write device that is able to retrieve and store data on a specific sector. Suppose furthermore that read-write requests, each requiring a block of b consecutive sectors, arrive at the device according to a Poisson process with given intensity. If necessary an arriving request is buffered until previously arrived requests are completed. Thinking of S = time to find the required data block as a service time, it becomes clear that this is an example of the M/G/J model. The service time distribution is quite arbitrary depending on a number of parameters. Given the lack of exponential service times it is difficult even trying to mimic the Markovian set-up from M/M/1, but it turns out that in M/G/1 much useful information already is contained in the first and second moments of S. For later reference we compute them now. First, to model the apparatus we assume that the beginning of the desired block is found at sector k from the read-write head with probability l/s,k = Q,... ,s — 1. Thus the random variable number of Z — sectors until beginning of desired block is uniformly

62

Chapter 3. Non-Markov Systems

distributed on the integers ( 0 , . . . , s - I}. The service time in seconds is given by number of sectors to read requested block number of sectors traced per second Hence we compute

where

Thus and consequently

3.5.2

Pollaczek-Khinchin formulas

For models like M/G/1, two approaches are available: to find an embedded Markov process that describes the system sufficiently well and to add more information to the state of the process under study so that the extended process is in fact Markov. In either approach Markovity is enforced back into the system via auxiliary processes that are accessible for analysis using the standard tools of Markov process theory. To exemplify, let Nt be the system size of the M/G/1 single-server system. It turns out that (Ntt)k>\, where tk are the successive times of departure from the system, is a Markov chain whose stationary distribution can be found. Similarly, as an example of how to extend the state space, let R, denote the remaining service time for the job being processed at time t. Then the two-dimensional process {(Nt, Rt), t > 0} is a Markov process although {Nt, t > 0) is not. Both ideas are demonstrated below, primarily for the purpose of deriving average size and delay in M/G/1. We begin the detailed study of the M/G/1 model noting that most concepts and quantities already introduced for single-server systems are unchanged. The parameters of the model are the arrival rate A, > 0 and a general service-time distribution represented by a random variable S. We assume that the service time has finite mean E(S) < oo and finite variance V(5) < oo and that the buffer management policy is the FIFO queue. We wish to solve the following problem: In steady state find the average size E(N), queue length E(Q), system time E(W), and waiting time E(Wq'). From Little's formula we already know the relations E(N) = XE(W) and E(Q) = XE(Wq). Furthermore, we know that E(N) = E(Q) + P(serverbusy) and E(W) = E(Wq) + E(S).

63

3.5. The M/G/1 system

Figure 3.6. Trajectory of Rt in M/G/1. Consider again the remaining, or residual, service time R, = remaining service time at time t. A typical trajectory of this process is shown in Figure 3.6. The vertical jumps are the successive service times of size S\, 82,... and occur at time points when jobs are transferred from the buffer to the processing unit or, if the buffer is empty, arrive from the external source. It is to be expected that if the M/G/1 system is in equilibrium, then the process R, should possess a stationary limiting distribution Rx so that we can speak of a steady state residual service time R (dropping the subscript) just as we have introduced N, Q, and so on. This can be verified and we will see later how to express the distribution of R in terms of that of 5. For now we rely on the mere concept of steady states and on the following interpretation. Suppose an arriving job finds at its time of arrival that exactly n other jobs are buffered waiting to be serviced. Put differently, at a typical arrival time the event {Q — n} is observed. Then the waiting time Wq in the buffer until service of the arriving job begins must satisfy where for notational convenience we put S^ — 0. Markovity is at the core of this relation. Indeed, the extended state (Q,, R,) is rich enough to enable the above representation for waiting time, without reference to any other information hidden in the history of the system. It is now simple to find how the mean values of the quantities involved relate to each other. Namely, conditionally on {Q = n},

But the time that any job spends in the service unit is never influenced by the present buffer status, and so Therefore

64

Chapter 3. Non-Markov Systems

In combination with the instance of Little's formula E(Q) = KE(Wq) we obtain

thus, under the crucial necessary condition

we obtain the relation

From now on we impose the restriction in (3.19) and observe that with Q = XE(S} we may introduce the concept of traffic intensity in M/G/1, plus we have found again the familiar basic criterion Q < 1. To go further it is necessary to express E(R) in terms of the parameters A, and S. With reference to Figure 3.6 this means expressing the average level of the peaked curve /?(/) in terms of X and S. Equivalently, referring to section 3.2, we need to compute the limit

It is not difficult to understand the typical magnitude of the integral J0 Rs ds. Over a long interval of length t the Poisson process produces approximately Kt arrivals. Thus approximately the same number Xt of service completions is produced, taking into account that the process R, is supposed to evolve in equilibrium. But the number of service completions is the same as the number of triangle-shaped paths building up the trajectory of R(t) from 0 to / (Figure 3.6). The areas of these triangles are S^/2, S%/2, etc. Hence on average E(S2)/2, which is finite since we have assumed that S has finite variance. The total area J0' Rs ds under the curve [Rs, 0 < s < t] is therefore approximately \tE(S2)/2, with better accuracy for larger t. Thus we find E(R) in the limit t -> oo of

From this, (3.20), and Little's formula we now have the following. Pollaczek-Khinchin's mean value formulas. Suppose the service time distribution has finite first and second moments, £'(5') < oo and E(S1) < oo. Then

3.5. The M/G/1 system

3.5.3

65

Lindley recursion for M/G/1

We demonstrate the alternative method of artificially imposing the Markov property in the non-Markov model M/G/1, namely, to construct an embedded Markov chain and use Lindley recursion. Apart from being an excellent method for simulation of non-Markov systems, the technique is rather general. In later sections we use it to analyze a switch output queue and the ATM model called Geo/D/1. The purpose is still to find the mean, variance, etc., of buffer size, delays, or similar quantities. Denote Tn = departure time in M/G/1 for job nr n,

n = 1, 2 , . . . ,

and put Nn =• Nfn = number of units left in system at departure of job nr n

and An — number of arrivals during transmission of job nr n,

n > 1.

By inspecting the relationship between these quantities it follows that

which is the same as (2.5) for the slotted time buffer model studied in section 2.1. Moreover, since in an M/G/1 model the sequence An+\ is obviously independent of N\,..., Nn, we have the analog of relation (2.6) and hence the conclusion that (Nn)n>\ is a Markov process! Similarly, put Qn = Qrn = number of jobs in queue at departure nr n for which to obtain again (2.1). In section 2.4 we analyzed such Lindley-type recursion equations. In the present example the number of arrivals in a time interval of length S is Poisson distributed with a mean given by kS, hence

Furthermore,

and by applying the previous result (2.16) we obtain

This result is in agreement with the Pollaczek-Khinchin formula derived above, providing a second method for such results.

66

Chapter 3. Non-Markov Systems

Figure 3.7. Virtual waiting time in M/G/1.

3.5.4 The M/G/1 virtual waiting time distribution The goal in this section is to enhance geometrical understanding of a celebrated representation formula for the virtual waiting time in M/G/1 (3.23). Normally this kind of result is derived using generating functions. We chose to avoid these techniques here and instead present descriptive arguments and reasonings. Consider a single-server system of type M/G/1, which we assume has settled into equilibrium. Denote by Wq(t) the time a packet will have to wait in the queue if it arrives at the entry point of the system at the particular time t. Let Wq denote a random variable that has the corresponding queuing time delay stationary distribution. Often Wq (t) is called the virtual waiting time. Figure 3.7 shows a sketch of what the typical paths of the random function Wq (t), t > 0, look like. Pick a point randomly on the time line (two points t\ and t^ are indicated in Figure 3.7). The probability of choosing a point t, where Wq (t) = 0, equals the proportion of time that the server is idle, that is, no = 1 — Q (compare Exercise 3.7). For the case Wq(t) > 0, it is clear from the graph that the decomposition

holds. In fact, at such time points Wq (t) consists of one piece with distribution R related to the job being processed plus a number of service times one for each job in the queue. We thus have a convincing argument for the representation as the stochastic sum

where Sj are independent service-time random variables, R denotes remaining service time, and N and Q are steady state system size and size of the system queue.

3.5. The M/G/1 system

67

Figure 3.8. Virtual waiting time decomposed in remaining service times. Now we decompose Wq(t) using horizontal auxiliary lines, as shown in Figure 3.8 by the dotted lines at time points t\ and t$. It follows that at each point /, V/q(t) consists of a random number of terms each of which is a certain fraction of a service-time variable. For a given t we denote by M the required number of terms and by R\...., RM the corresponding summands. Since the service times are independent it follows that R\,..., RM are also independent. It also is clear from this construction that M is independent of the particular values of the service times. We note that at a given time, say, ti, the service times underlying / ? ! , . . . , RM correspond to jobs that may have already left the system at 73. Next, recall that the upward jumps occur according to a Poisson process. In particular if we fix /3, the locations of the jumps in |0, t$] are uniformly distributed. Since the dotted lines are reflections in the curve Wq (t), it follows that the points at which the dotted lines hit the service-time intervals are also uniform! But this is the definition of remaining service time, namely, R is whatever time remains if we pick a point at random during the processing of a service interval S. We have motivated the representation

where the Rj's are independent random variables with the remaining service-time distribution and M is an integer-valued random variable independent of the summands. (If M = 0 above, then the corresponding sum vanishes.) The final claim regarding this queue size representation formula is that M is geometrically distributed with parameter 1 — Q. For proof of the last claim we see that P(M = 0) = 1 - Q by the earlier argument of the server being idle. To obtain the full distribution of M we return to Figure 3.8. The number M (corresponding to £3) is the same as the number of steps in the dotted ladder leading from Wq(t^} leftward down to the zero level. In turn this number is the same as the number of times the dotted line intersects a service time until it intersects for the first time a service time that initiates a busy period. By the uniformity referred to above, the situation is the same at every step of the ladder, and thus at each step we perform independently a random trial that succeeds with probability p and fails with probability 1 — p. The success

68

Chapter 3. Non-Markov Systems

probability p must be the fraction of all jumps that initiates a busy period, which we calculate next. If we call the sum of an idle period plus a busy period a cycle, then length of idle period = proportion of time the server is idle = TTQ = 1 — Q. length of cycle Hence

and thus in a long interval [0, t] there will be approximately A.f (1 — Q) cycles. At the same time we know that there are approximately Xt arrivals in this interval, hence the proportion of arrivals that initiates a busy period must be the ratio 1 — Q of these quantities. What we have obtained is the conditional probability of M given that M > 1, namely, that M has, in this case, the first success time distribution (positive geometric) with success But this implies probability 1 - Q, P(M = k\M >!) = (!- Q)gk~l,k =1,2

for each k > 1, which, together with our knowledge of P(M = 0), finally shows that M is geometrically distributed with parameter 1 — Q.

3.5.5

Heavy traffic limit in M/G/1

A recurring theme in the analysis of M/G/1 and similar systems is the crucial stability requirement Q < 1. The nature of these systems as the load increases toward the critical value Q = 1 can be understood to a certain degree from the Pollaczek-Khinchin formulas (3.22), where average buffer size and average waiting times grow inversely proportional to 1 — Q. Despite the singularity at Q = 1 that appears in the mathematical abstraction of the systems, it seems that service systems in practice often do operate under high loads, at least during limited periods. It is therefore of interest to understand the behavior of systems operating with traffic intensity close to maximum. A natural approach is to rescale interesting quantities such as loss probabilities and delays and study asymptotic properties as Q -> 1. Such results belong to the regime of heavy traffic scaling. As an example we present a classical result for the M/G/1 model with an FIFO queue and service-time distribution of finite variance, V(S) < oo. The queuing delay time Wq normalized by 1 — Q converges in distribution to an exponential random variable as Q —> 1,

The proof of this result is often presented as an exercise in using moment-generating functions. As an alternative we use the elementary renewal theorem. It was shown in (3.23) that the waiting time Wq can be represented by a random variable with the distribution

3.5. The M/G/1 system

69

where and /?], / ? 2 , . . . are i.i.d. random variables with distribution derived from the given servicetime distribution P(S < s) as

Let N ( t ) , t > 0, denote the renewal process associated with the sequence R\, RT, .... Since E(S2) < oo we have (Exercise 3.8)

and hence by the renewal theorem (3.12)

The basic relation (3.11) for renewal processes shows that

thus

Since M is independent of N ( t ) and P(M > m) = Qm+l,

The asymptotic behavior of the scaled renewal process that appears in the exponent is determined by (3.25),

and since the exponential function is bounded on the positive half-line, the same asymptotic property carries over to e-('[-e)H(xi(\-e)) as wen as to me expected value. This shows

which concludes the proof of (3.24).

70

Chapter 3. Non-Markov Systems

Figure 3.9. Short simulation o/M/D/1, Q = 0.95. 3.5.6

Deterministic service times, M/D/1

Example 11 gives an example of the M/D/1 model, where the general service-time distribution G is replaced by deterministic service times S = constant. Keeping this example in mind we can visualize the M/D/1 model by thinking of packets delivered to a link at the arrival times Tk, k > 1, of a Poisson process with intensity A. < 1, buffered and transmitted at maximum capacity effectively over S = 1 time units each. Figure 3.9 illustrates the situation on the link at high load, Q = 0.95; up to four packets are buffered while the link is cleared. Put V/c = departure time of packet no A:, k > 1, V0 = 0. Some reflection shows that

which may be expanded further to give

The waiting time in the buffer, which we call Wq (k) for packet k, is given by

Either (3.26) or (3.27) can be used to analyze the sequence (Wq(k))k. By (3.26),

3.5. The M/G/1 system

/H

where [4 is the interarrival time 7^ — Tj_i. Iteration down to k — 1 shows

Thus we have

which shows that there exists a steady state representation of the buffer waiting-time distribution of the form representations that were derived earlier as (2.7) and (2.17) in the context of slotted time Markov models. We conclude this section by estimating the distribution of the buffer delay in M/D/1. The technique is general and applies in modified form to M/G/1. It will be shown that a constant 90 > 0 can be found and numerically estimated, such that

The proof is by induction. First W ? (l) = 0, so (3.29) is trivially true for k — 1. We assume as induction hypothesis that (3.29) is true for all indices up to k — 1. It simplifies the derivation of the induction step somewhat to keep a separate notation for the random variable X^ = 1 — L4, with distribution function FX(X) = P(U^ > 1 — x) and density f x ( x ) = Ae~ 1(l -'*>, x < 1. By (3.28), for y > 0,

Partition on the events [Xk > y] and its complement to get

Next we rewrite the last term above by conditioning on the outcome x of X*. Since X^ is independent of Wq(k - I), this yields

By hypothesis

72

Chapter 3. Non-Markov Systems

The integral in the last term is

Now choose 60 = sup{6 > 0 : X(e8l> — 1) < 90}. For fixed A. this gives a specific number 00 such that

and so the upper bound is true for index k. This completes the proof of (3.29). For a more general case see Exercise 3.10.

3.6 The M/G/oo model The M/G/oo model, or Poisson burst process, is the infinite-server model with Poisson arrival process A, of intensity A,, general service-time distribution G(t) — P(S < t), and usual independence assumptions. This process has non-Markovian system-size trajectories Nt but is nevertheless accessible for explicit calculations. We indicate a primary application of this model by referring to arrivals as calls, service times as call holding times, and the system size as the number of calls in progress. We have

(call i in progress at time t]»

so Nt arises by counting those arrivals that are still occupying a server at time t, and by discarding the others. Given the number A,, however, the arrival times are uniformly distributed over the interval [0,;]. The decision whether to count a call at t is determined by its uniform time of arrival U, say, and independently its length S. If U + S > t, the call is in progress and counts for incrementing N,\ if U + S < t, it does not count. The corresponding probability is

The number Nt is thus obtained from A, by independent thinning with the probability /?,, and therefore Af, itself is also Poisson distributed. Since E(At) = A,?, wehaveE(W f ) = \tpt and so for each fixed t

where we recognize the equilibrium distribution introduced in (3.13). Asymptotically, for large t, the Poisson distribution Po(g) appears and this explains the fact that a stationary version of the M/G/oo model can be constructed with steady state distribution A^ e PO(Q). This property was established for the special case M/M/oo in Exercise 2.3.

3.7. Exercises

73

In the stationary case it is possible to compute the correlation function of N,. Introduce for fixed s, t > 0, Z] = number of calls in progress both at s and s + t, Z2 = number of calls in progress at s but not s + t, Zs — number of calls in progress at s +1 but not s. Then Z\, T.I, Z^, are independent, Poisson-distributed random variables such that

Hence

With reference to the above thinning arguments it is clear that E(Z\) is a certain proportion of Q. Since calls present at time s count for Z\, should their remaining call holding time be greater than t, it follows that

where R, is the residual service-time process described in section 3.5.2. It is a standard result in renewal theory (see Exercise 3.8) that R, converges in distribution to a steady state /?oc with a mixed law given by a point mass at 0, P(R<x, = 0) — 1 — Q, and the equilibrium distribution introduced in (3.13) as continuous part,

Summing up, in steady state the covariance is stationary with

and the autocorrelation function is given by

It is noteworthy that the same expression was found for the renewal rate process autocorrelation function (3.16). Figure 3.10 shows a simulated trace of the M/G/oo model with Pareto-distributed service times. The parameters are chosen such that the first moment of S exists finitely but not the second.

3.7 Exercises 3.1 Consider the m-server loss system M/M/m/m with parameters A. and fj,.

Chapter 3. Non-Markov Systems

74

Figure 3.10. Simulation o/M/G/oo, infinite variance service times. (a) Identify the measures of performance traffic intensity (per server), utilization, throughput, and loss probability and relate them to the offered load. (b) Verify the relationship (3.1). (c) Find the limiting throughput as the load tends to infinity. (d) So that Little's formula is valid for this system, what should we mean by "average arrival rate"? (e) Put n, = 1 and produce a load-versus-throughput graph for m = 2,5, 10, 100, 500, i.e., the analog of Figure 3.1 but for M/M/m/m. (It is recommended to consider load per server versus throughput per server.) (f) Generate simulated traces of the M/M/m/m process that demonstrate its nature both under light and heavy loads and for a small and large number of servers. 3.2 Calls are made to a specific phone number according to a Poisson process with intensity A calls per hour. As soon as a call is accepted a conversation begins of length represented by a random variable S with mean E(S) minutes, during which additional calls get a busy tone and are rejected. The lengths of successive conversations are independent and independent of the arrivals. What is the equilibrium probability that the phone line is busy at an arbitrary time? 3.3 In the model for hand-off termination probability in section 3.5.1, add the parameter p0 = P(a new call attempt is blocked at start-up). Let /Yerm denote the corresponding forced termination probability. Express P{eim using the representation derived for Pterm and the new parameter PQ.

3.7. Exercises

75

Assume that the cell occupancy times Ut, i > 1, are exponentially distributed. For this case find the termination probability P,'erm. Consider the particular case p0 = Pf, which may be called nonprioritized call handling, and generate graphs of the termination probability as a function of p f for fixed values of the other parameters v and IJL. In fact, the parameter space can be reduced. How? 3.4

(a) In the model for the simplified GBN protocol the parameter N can be thought of as the offered load; the system is saturated in the sense that when each new round begins there are always N packets ready to be transmitted. Draw the loadthroughput graphs for GBN for a few choices of the parameter p (and fixed TO). Interpret the resulting curves. Why is there a maximum throughput? Let Afmax denote the corresponding load. What can be said about Nmm as a function of p and r0? (Find the equation that governs this relationship, find upper and lower bounds of /Vmax, or make a numerical study.) (b) What should we mean by traffic intensity and utilization in GBN? With reference to relation (3.1), what does the loss probability measure in this case? Which packets are "lost"? (c) Consider a more realistic version of the GBN protocol, where after a packet loss only those packets scheduled for transmission in the same round but after the lost one must be retransmitted. Hence if the first packet that is lost in a round has ordering number k, only the packets numbered k, k+l,..., N in the same round must be retransmitted. Find the expected number E(R) of packets transmitted in a given renewal cycle and the corresponding throughput. Illustrate the results with appropriate graphs and show that we have resolved the problems with simplified GBN observed in 3.4(a).

3.5 In section 3.5.1 we modeled the retrieval of data stored on a rotating disk partitioned into a total of s equal size sectors by applying the M/G/1 queuing system with processing time

where r is the number of disk rotations per second, b is the number of sectors required at each successive request, and Z is a random variable uniformly distributed on the integers 0 , . . . , s — 1. Compute the total request delay time in terms of the parameters s, b, and r in addition to the arrival rate A. Then consider the limiting case when both s and b are large but the fraction ft = b/s of sectors required at each request remains bounded. By taking limits to infinity we obtain a reduction to three parameters r, A, and ft. What is now the maximal input A that the system can allocate in steady state? What is the total request delay? 3.6 Compare the systems M/M/1 and M/D/1 with regard to queue size and delay. Conclude that queues are shorter and delay is less in M/D/1 compared to M/M/1 under the same load.

76

Chapter 3. Non-Markov Systems

3.7 By exploiting the relation

which is valid for the embedded Markov chain Nn of M/G/1 system size at successive departure times with An denoting the number of arrivals during successive service times, prove that in steady state

where Q is traffic intensity. Discuss whether this result would hold in the general system G/G/1. 3.8 Find the moments of the stationary distribution ,Feq in (3.30) in terms of the moments of the service-time distribution. Relate the expected value to the expression for M/G/1 mean residual time in (3.21), E(Rao) = XE(S2)/2. 3.9 A simple method to simulate trajectories of the M/D/1 model such as those in Figure 3.9 is based on (3.27). To simulate the sequence (Wq(k)) it is enough to generate arrival times fe}, compute departure times {i^} using (3.27), and form {w^}, where wk = tk — Vk — 1. Apply this technique to investigate the accuracy in the upper bound (3.29). 3.10 Generalize the results in section 3.5.6 to M/G/1 with general service-time distribution S. In particular show that (3.29) holds with B0 = sup{6> > 0 : E(edx) < 1}, where X = S -U,U € Exp(A.).

O&pter 4

Cell-Switching Models

The processing of ATM cells in a network node consists of several phases. Concentration, merging, and access control of incoming traffic are followed (where access is granted) by procedures such as admission control, filtering, and regularization to meet performance criteria, loss and delay control, and so on. Yet the dominating aspect is simply the routing and circuit switching of cells to guarantee that they are directed correctly according to the addressing information they carry, whether that means local processing along virtual paths or outbound traffic feed. In fact, in the broadband context the enormous amounts of data and the extreme speed under which the data have to be processed may render most attempts of error and congestion control futile. For example, control messages sent from an output node with the purpose of warning incoming traffic for congestion ahead may be obsolete by the time they arrive! Hence it is not always an oversimplification to model a nodal point simply as a set of switches. With this motivation we now include a chapter on modeling of the simplest switching units. In practice, cells are transmitted into and out of switching cards, each with a certain capacity to buffer cells in case of multiple arrivals during cell transmission intervals. We study some of the basic approaches, called space division switches, and their inherent properties. Switching from a mathematical standpoint is discussed by Schwartz [56] and Hayes [18].

4.1

m x m crossbar

Consider a switch with m input lines and m output lines, whose sole purpose it is to direct each incoming cell onto one of the output lines. Time is discretized in slots of length determined by the cell transmission time. During a slot each output line is capable of accepting and transmitting exactly one cell. The switch can be thought of as a device for input cells searching along a set of bars for their correct destination output bar; see Figure 4.1. Several cells may be destined for the same output, causing losses or buffering of cells or a combination of the two. To model the function of the switch, the arrival patterns of cells on the input lines and the rules for allocating cells on the output lines must be specified. It seems natural to apply random mechanisms for both of these. At this point choices based 77

Chapter 4. Cell-Switching Models

78

Figure 4.1. m x m crossbar model. on mathematical tractability come into play. We apply the same arrival pattern throughout this section but vary the storage and loss structure. 4.1.1 Output loss crossbar The simplest assumptions are the following: Arrivals: During each slot and at each input line a cell appears with probability p and the line remains empty with probability I — p. Each arrival is independent of what happened in earlier slots and of arrivals at other input lines. Switching: Each incoming cell is switched during the next slot to one of the m output links uniformly with probability l/m and independently of other cells switched during the same slot. At each requested output port one cell is immediately transmitted, whereas additional cells arriving at the same port are lost. In Figure 4.1 one can think of cells arriving on each horizontal bar from the left, selecting uniformly one of the crossings of a vertical bar along which the cell exits. More general models are obtained by letting the number of input lines be different from that of the output lines, by replacing the arrival probabilities p by parameters PJ, j = 1 , . . . , m, varying from one input to another, or by applying, instead of the uniform switching, other routing probabilities r,j with J^ • r/j = 1 such that r^ is the probability that a cell arriving on input i is directed to output j. We analyze the simplest case by introducing for 1 < j < m An — number of cells arriving in slot n, KJn — number of cells directed to output j at start of slot«; the initial condition could be K[ = 0, for example. For any n > 1, An e Bin(m, p).

4.1. m x m crossbar

79

since there is a success probability p of arrival at each of m input nodes. Moreover, given A,, the K^+}'s are multinomially distributed as

consistent with the obvious relation A,, — £!T=i ^+|. In particular, since the marginals in the multinomial distribution are binomial and the arrivals and switching operations are independent, Moreover, in view of the properties of the multinomial distribution

It can be seen that the sequences {A,,},,>] and, for each j, {A';]'},,>i are i.i.d., which implies that the crossbar model under the present assumptions actually operates in a steady state mode at least from slot 2 onward. This simplifies the further analysis and the evaluation of the performance of the switch, for the purpose of which we define Tn = number of transmitted cells in slot n

In fact, Tn, n > 1, also is an i.i.d. sequence and it is the expected value in the corresponding invariant distribution that is normally interpreted as the throughput of the crossbar: Throughput =

Hence, normalized per single output line, Utilization = ! - ( ] - p/m)'n =» 1 - e~p

for large m.

Furthermore Expected number of lost cells per slot

and therefore Loss probability = average fraction of lost cells per slot large m.m. = (d - p/m)m - 1 + p ) / p * (e~p - 1 + p ) / for p for large

80

Chapter 4. Cell-Switching Models

Figure 4.2. Output queue crossbar. Our knowledge about the underlying multinomial distribution also allows for the calculation of the variance

which proves that asymptotically as m tends to infinity,

By a reference to the strong law of large numbers this result can be improved to hold in the sense of convergence almost surely.

4.1.2

Output queuing with a shared buffer

The distinguishing feature with this model compared to the previous one is that each output line, or switch card, is equipped with a buffer, as shown in Figure 4.2. The typical size of a switch could be m — 128, or larger. The same arrival structure as before is maintained, adding only that arrivals are independent of the present state of the crossbar switch buffers. The assumption regarding uniform switching is also preserved. It is the switching of multiple cells to the same output that will cause the buffers to fill up. If in fact more than one cell has arrived at a given output at the beginning of a slot, all cells except one are stored in the corresponding buffer and transmitted one at a time during sequential slots. In this model the number of buffered cells

4.1. m x m crossbar

81

at one output node could grow over time without bound. More exactly, after n slots have passed there may be as many as (m — 1)« cells in a single buffer, even if this occurs only with probability l/m". It would be desirable, in order to make the model more realistic, to work with finite output buffers assuming that overflow cells are lost. The loss model studied in section 4.1.1 is of course a special case of this. However, we motivate the current assumption of infinite buffers by referring to the switch design for packet networks known as a shared buffer. In this scheme there is a common pool of buffers, where free buffer space is allocated from slot to slot to the output port in need of extra storage at this particular time. In practice this means that only the total number of buffered packets destined for any output port is subject to a finite buffer restriction. We make the approximation that the shared buffer switch is well described by the infinite output buffer model in this section. In addition to the notation introduced earlier, consider QnJ = number of cells in buffer j at end of slot n. The goal is to find out as much as possible about the distribution of the Q''s. For this study of the output buffers we continue in direct analogy with the analysis in section 2.1. By separating into three cases it is seen that

which in compact form yields the Lindley-type equation

The recursive relation (4.1) shows that the simplistic i.i.d. structure in the loss model of section 4.1.1 is replaced in this case by the Markov property. The sequences {QJn}n>\ are all (dependent) Markov chains with marginal distributions of the same type as the buffer size sequence in section 2.1, with the particular choice of Bin(m, p/m)-distributed arrivals. As a consequence there is no equilibrium situation automatically. The theory of simple Markov chains, however, is applicable in order to find a limit distribution. It is known from (2.13) that E(K'n) = p < I is a sufficient condition for the existence of a steady state. The condition says that the average number of cells arriving at a given output j in a given slot n is less than the maximum capacity of the crossbar outlet, excluding only the extreme case p — 1, which is of little interest. As a first application of (4.1) in steady state,

A consequence we observe in passing is that since in this model the number of cells exiting the switch in a single slot n + I equals

82

Chapter 4. Cell-Switching Models

the equilibrium throughput is given by

/\ \- / which is an obvious no-losses balance relation. Next, repeating the arguments leading to (2.15),

for each j, where the time index n in K> n is dropped. Expressions for the steady state probabilities P(QJ00 = k), the same for each marginal QJX, are then obtained as explained in section 2.4. In particular, as (2.16) shows,

It follows that on average the total number of cells stored in any of the output buffers is given by

This gives a first indication of what buffer sizes are required in shared buffer packet switches. Variance calculations should yield further insights into the buffer dimensioning problem. This and similar models were studied by Hluchyj and Karol [19]. Another natural direction for detailed study at this point would be to cover so-called knock-out switches, where packets contend for output buffer space in a manner that can be compared to a knock-out tournament. See, e.g., Yeh, Hluchyj, and Acampora [71] and Kim and Lee [27].

4.1.3 Input buffer blocking The next variation of the m x m crossbar switch is the case in which each input line is equipped with a buffer in which cells are forced to wait in case of contention at a particular output node; see Figure 4.3. The basic assumptions are the same as before: Slotted time. A cell appears at each input line with probability p in each slot. Cells are equally likely to be destined to any output line. Usual independence assumptions. If two or more cells are contending for the same output, then one randomly chosen cell is allocated.

4.1. m x in crossbar

83

Figure 4.3. Output contention causing input buffering. The new feature is that any cell not able to leave the switch during a particular slot must wait at its input node in what we will call a backlogged state. During the next slot it will contend once again with other input head of line (HOL) cells directed toward the same particular outlet port, and this is repeated in subsequential slots until the cell is granted transmission. Hence we add Backlogged HOL cells remain at input lines for new attempts in the next slot. New arrivals at backlogged input lines are buffered, then fed in FIFO fashion until becoming HOL cells and thus viable for transmission. This system is difficult to analyze and no powerful method seems to be known. In any case we set up a model describing the switch and give some preliminary derivations. Assign N'n — number of cells including HOL in input buffer i at beginning of slot n, I'n = number of new cells arriving at input i in slot n (0 or 1),

"

, _ J 0

if input j empty, slot«,

1 ./' if HOL cell at input / is directed toward output j.

The quantities K^ from the previous section are respecified as

= number of HOL cells at beginning of slot n destined for output j. For a given j, if K/t > 1 the algorithm emits one cell chosen with equal probabilities among the K' HOL cells addressed to output j. Define also if HOL-cell at input nr i exits at output nr j.

84

Chapter 4. Cell-Switching Models

We have and

where we also introduced the notation {Jl}n>\ for i.i.d. sequences with the uniform distribution over the integers 1 , . . . , m. The total number of cells at the end of slot n that are buffered on the input side of this m x m crossbar is

and the total number of arriving cells in slot n is

A recursion for Nn is obtained by summing over / the relations (4.2) for individual buffers,

By inspecting the sum over index / in the last expression we see that at most one summand is nonzero and thus signifies a request for output at node j at the beginning of slot n + 1. Since the crossbar throughput in slot n + 1 consists of one cell for each output line with at least one request, that is, Tn+\ = 5^7=i ^-(K' >n» w e have

This relation could have been stated directly as a Lindley-type equation for Nn, balancing the independent arrivals to match the switch throughput. To give some indication of the behavior for this type of switch, Figure 4.4 shows a simulated trace of 4000 slots for m = 300 and p = 0.55, plotting throughput Tn and the number of nonempty input buffers Yn, given by

against slot numbers. The upper curve Tn varies around the mean throughput mp = 165. To get an idea of how the throughput sequence Tn is related to the number of nonempty buffers Yn, we consider a heuristic argument. Suppose there are y nonempty buffers in the

4.1. m x m crossbar

85

Figure 4.4. Throughput and number of nonempty buffers. switch at a certain time; the corresponding y HOL cells will typically request their output nodes evenly distributed over the m available lines. As a result the throughput in the next slot will be the number of output nodes with at least one request. It is natural to rephrase this situation as a classical occupancy problem, namely, y balls are distributed randomly over m boxes and each of the my possible outcomes are equally likely. With T denoting the number of nonempty boxes and U/ an indicator function of the event that box i is empty, we have

Solving for y with t — E(T),

and thus, inserting t — mp = 165 and m — 300, we have y » 239 in reasonable agreement with the simulated result in Figure 4.4. To further enhance understanding of how the switch behaves, the next simulated trace shows the evolution of the total buffer size Nn over a range of 40,000 slots for two values of the traffic intensity per node p = 0.55 and p = 0.58. The simulation started from an empty system. Figure 4.5 shows that there is a drastic and interesting change in the required buffer size as the traffic intensity increases in the range above p = 0.5. For p = 0.55 the buffer size seems to settle quickly in a steady state with moderate fluctuations. However, at p = 0.58 the trace of Qn is quite different both in magnitude and fluctuations even if it seems to stabilize after an initial transient period. By increasing the intensity just a bit further to, say, p = 0.59, the resulting graph of total buffer size will show no tendency of stabilizing. This phenomenon is partly explained in Theorem 4.1. A somewhat related situation where a sudden collapse of effective throughput results when the input rate increases was studied by Kaniyil et al. [25]. They consider a network node operating under the input buffer limiting scheme. The input to the node consists of

Chapter4. Cell Switching Models

T

Figure 4.5. Total buffer size for m = 300, p = 0.55, and p = 0.58. transit messages from other nodes in the network plus the locally generated input messages. The node is equipped with a finite buffer subject to the crucial restriction that only a specified fraction of buffer space may be occupied by locally generated input messages. Kaniyil et al. [25] studied the performance parameters of the node just before onset of congestion using gradient dynamics and dynamic flow conservation. The methodology suggested to investigate these stability aspects might be useful also for the input buffer case considered here.

4.1.4

Input blocking, loss system

To simplify the system and to be able to study the input blocking phenomenon, we now remove the input buffers and assume that any cell arriving at an input port where another cell is already contending for transmission is lost. Slightly rephrasing the interpretation of the key quantities from the earlier analyses, we have K'n = number of backlogged input lines in slot n with cells destined for output j, 11 = number of new cells arriving in slot n destined for output line j. The throughput of the switch, i.e., the fraction of successfully transmitted cells per slot, is given by

It is clear that Kj can be thought of as the content of a virtual buffer set up to control transmissions through output port j. It is therefore natural that the key relation for buffer size variables appears in the form

4.1. m x m crossbar

87

Figure 4.6. Throughput in crossbar input loss model, m = 500, p = 0.3 (lower), and p = 0.7 (upper). in analogy to relations (2.1) and (2.5). In this case, however, the key observation is Given the ATn''s, the family (IJn+\} has the multinomial distribution, Multnom(Z n , ^ , . . . , ^), where Z,, = ^J=i ^ ^s binomial, Bin(m - ^"'=1 tf>, p) distributed. It follows that { ( K t [ , . . . , K ' " ) , n > 0 } i s a M a r k o v c h a i n w i t h f i n i t e s t a t e s p a c e w h i c h is irreducible and nonperiodic. Existence of a stationary distribution is a consequence of Theorem 2.1 in section 2.3, adapted to the case of vector-valued Markov chains. We refer to the corresponding steady state by writing {K^}. Similarly, T^} denotes equilibrium throughput. The simulation in Figure 4.6 of the level and variation of T,,, n = 1, 1800, for switch size m = 500 and two /^-values 0.3 and 0.7, gives some intuitive understanding of the model. Theorem 4.1. For the m x m zero buffer input loss crossbar switch, HOL blocking results in the throughput reduction

in particular, for the saturated case p = 1,

Chapter 4. Cell-Switching Models

88

Figure 4.7. Input loss model, throughput lower bounds. We conjecture that in the limit m —> oo

in L 2 (and almost surely), but we have not been able to show rigorously that indeed V(T^) ->• 0 as m ->• oo. The standard practice in the literature for obtaining the throughput 2 — -s/2 is to approximate the multinomial distribution of the variables 7; by independent Poisson random variables; see Walrand and Varaiya [65, Chapter 10.8]. Simulations indicate clearly that, in fact, (4.5) is the correct asymptotic throughput of the input loss switch

89

4.1. m x m crossbar

as the size of the switch grows. A further remark is that the given lower bound is relatively accurate for small m. As an example we outline in Exercise 4.3 steps for deriving by direct calculations that the throughput of the 2 x 2 input loss switch is given by

and thus under saturated load is equal to 0.75. The corresponding lower bound in Theorem 4.1, withm = 2and/? = l,is(7-vT7)/4 % 0.7192. Graphs of the load-versus-throughput lower bounds are shown in Figure 4.7 for m = 2 and m -> oo. To prove the inequalities in the theorem we begin by assuming that an initial configuration { KQ } has been given and by introducing the averaged quantity fraction of busy input lines. By conditioning on {K^} we obtain expressions for the mean and the second-order moment of /j+]. First, since each component I}n+} is binomially distributed with parameters m — E7=] Ki and p/m.

Second, an application of the conditional variance formula V(X) = E(V(X\Y))+V(E(X\Y~)) implies that

hence The idea is now to consider the recursive relation (4.4) in equilibrium, i.e., to suppose specifically that the distribution of {K^} equals the steady state distribution, hence that the distributions of K^+] = K/t coincide for all n > 0, and compute the expected values. This gives hence

In addition, by expanding the square of relation (4.4),

90

Chapter 4. Cell-Switching Models

therefore

It remains to sum this identity over j from 1 to m, divide by m, and simplify to obtain a relation, which if we write temporarily,

takes the form

It follows that the expected value x = £(V (m) ), say, of the random variable V(m) satisfies the inequality The stated inequality for E(T^) = p E(V<m^) = px is now obtained by solving the corresponding second-order equation.

4.2

Exercises

4.1 Consider an m x n crossbar with m input and n output ports operating in a slotted time fashion. During each slot, cells arrive at the input ports independently of each other and with probability p for each input. Arriving cells are immediately assigned a particular output selected randomly with equal probabilities amongst the n available output lines. During the same slot each output port has the capacity to process exactly one cell. (a) Assume first we have the output loss case of section 4.1.1. Hence if several cells are assigned to the same output destination, all cells except one are lost. Compute the switch throughput (in cells per slot), i.e., the expected number of busy output ports at the end of a given slot. What is the natural definition of utilization for this model? Find the expected number of lost cells per time slot and from this the loss probability. Verify that again formula (3.1) holds. Suppose the switch is a concentrator in the sense that n < m with concentration rate ft given by the limit n/m -> /J as m, n -> oo. Express utilization and loss probability in terms of p and /?. (b) Now consider the output buffer case (section 4.1.2). For which p values does the system settle into equilibrium? Find the expected buffer size. Answer the same questions for the asymptotic case described by the concentration parameter ft above.

4.2. Exercises

91

4.2 Consider the m x m input buffer ATM switch symbolized in Figure 4.3 and its simpler variation, the input blocking loss system, studied in section 4.1.4. One method with which to reduce the effect of HOL blocking in the buffered system is to maintain for each input port m separate buffers, each containing cells destined for one of the m output ports. We wish to study the input blocking loss version of such a switch. Thus, suppose each input port is equipped with one storage location for each output that enables it to store up to m cells as long as they are addressed to different outputs. Also assume that a cell appears at each location with probability p/m in any given time slot. The actual switching operation is then described as follows. In each slot, each output port performs a search of its m input locations and picks one cell randomly if at least one is found. Introduce for 1 < j < m, KJn = number of backlogged cells in slot n destined for output j, Nft = number of new cells arriving in slot n destined for output j. This could be called a shortbus model; the terminology refers to a design where each output is able to send its bus one round trip every slot to update the Kn value and eventually pick a cell for transmission.

(a) If for some n we are given the tKj's, then in the next slot n +1, what is the distribution of the family of random variables {A^+|}? Fix one output port, drop the superscripts, and write Kn and Nn for the number of backlogged and new cells in slot n. Show that the resulting Markov chain (Kn)n>\ satisfies

(b) Calculate the conditional expectations

(be careful with the case k = 0).

(c) Denote by jt, = ^-,K [ml] the proportion of backlogged cells on the new time scale

obtained by using a speed-up factor m. One can show (see Chapter 6) that the dynamics of X, are described by the drift and quadratic variation functions defined as

where x is of the form k/m for some k, 0 < k < m. Find the limiting functions

(d) Make a simulation study of the shortbus model. As the load tends to saturation, i.e., as p —>• 1, does the system stay stable or are there indications of congestion, severe losses, etc.? This exercise is continued in Chapter 6.

92

Chapter 4. Cell-Switching Models

4.3 To calculate exactly the throughput in the 2 x 2 input loss switch we consider the bivariate Markov chain (K^, K2) on the states (0, 0), (0,1), and (1,0), with equilibrium probabilities TTQQ, jr0i, and JTIO in steady state. Given the state of the Markov chain at time n, what is the conditional distribution for Zn ? For given values of Z n , what is the throughput at output 1? Sum over possible values of Zn and sum over states to obtain in steady state the single port throughput 1 /2+p/4+TTOO (3p/4—p2/4-l /2). Argue that the same throughput is given on the other hand by p P(K^ = 0) = p(\ +jroo)/2. Identify the expressions to get the result p(\ — p2/2(2 — p + p2)) stated in (4.6).

Chapters

Cell and Burst Scale Traffic Models

The next objective is to focus on the hierarchies of time scales that are often used to better understand the traffic structure in modern communication networks, and to study specific features of them from the point of view of mathematical modeling. Recall the three-level partitioning in call, burst, and cell time scales briefly discussed in sections 1.2 and 1.3 and shown in Figures 1.3 and 1.4. The classical Markov theory, Chapter 2, covers the call level category. The traditional non-Markov extensions, discussed in Chapter 3, apply to call level and are to a certain degree suitable also for models on the packet or cell level. As an example the M/D/1 model on one hand applies to circuit-switched voice call transmission as a generalization of a pure Markov model and on the other hand to the queuing analysis of fixed-size packets. In Chapter 4 a number of cell scale models were introduced. To further distinguish cell and burst scales, we note that if a short time interval is considered, then the cells encountered by the network during that period typically originate from different users, which motivates over such lengths of time independence between cells. Seen over a longer time scale, however, it is likely that many cells are sent from the same source, leading to correlations in the data streams. This chapter continues with further examples of cell level dynamics, with a presentation of burst level rate models and with material related to long memories in network traffic. In particular, we encounter methods designed for what has been called "the failure of Poisson modeling"; see Paxson and Floyd [48].

5.1

Cell-level traffic

We present four examples of models for stochastic dynamics arising on the level of individual packets on a microscopic time scale. In the first example ATM cells merge on a common link. The second example deals with the delay structure in a stream of IP packets of PCM encoded voice. The subject of the third example is the random variations in round-trip times of Internet traffic. Finally, the fourth example starts with datagrams at an interface of a video application and derives the distribution of packet sizes at a transport layer interface. 93

94

Chapter 5. Cell and Burst Scale Traffic Models

Figure 5.1. ATM output buffer.

5.1.1

Isochron multiplexing

We consider an ATM multiplexer such as the system in Figure 5.1, obtained by removing one of the output buffers in the crossbar model Figure 4.2. A number of incoming streams are all directed toward this particular output pipe, toward the same outgoing multiplex. In section 4.1.2 we studied the corresponding multiplexer queue under the assumptions of slotted time and the arrival stream consisting of Bin(m, p/m) cells per slot and with independent arrivals at each slot. This particular model is sometimes called the Geom/D/l queuing system. We now consider other models, as the arrival stream may have a different structure than in the case of the crossbar model. In continuous time, if we can assume that the cells arrive as a Poisson process, the M/D/1 model is applicable. The basic quantities of interest such as expected buffer size are then obtained immediately from the PollaczekKhinchin mean value formulas for M/G/1. The main objection to Geom/D/l or M/D/1 in ATM cell transfer mode traffic models is the evident periodic nature of cell emissions within bursts. Cell emission streams that are consistent with the two models mentioned above on the contrary possess independence between slots (Geom/D/l) and independence between increments (M/D/1). The basic model suggested for ATM which specifically addresses the periodic arrival nature is the m*d/D/l system. The input process is the accumulated arrivals from m independent users each transmitting a periodic stream of cells with period d, i.e., one cell every d time units. The m sources are asynchronized with each other and as a result the phases of the periodic emissions are randomly mixed. As one example, Figure 5.2 shows an arrival stream over five periods generated by m = 30 users emitting cells periodically. To provide an equilibrium state buffer the traffic intensity must obey Q = m/dC < 1 if the capacity of the link is C cells per time unit. Some theoretical results are known for the m*d/D/l model; in principle the distributions of system size and waiting time can be found—none of them in a particularly enlightening form, however. Approximative results that are useful for dimensioning buffers are known in the heavy traffic regime, i.e., traffic intensity Q close to one. Numerical calculations show that the required buffer size necessary to keep losses below a given level is significantly lower than the corresponding buffer in M/D/1 exposed to the same load. This should be expected in view of the apparent isochronous smoothing in the input data. See [8] and [64] for discussion of such results and further references.

95

5.1. Cell-level traffic

Figure 5.2. 30* l/D/l input process. It is probably misleading to use m*d/D/l as a reliable model for ATM. It is a severe restriction that all users are supposed to generate cell streams characterized by the same period d. The system J2T=\ di/D/1 is a generalization, where m users emit cells periodically with deterministic holding times but now allowing individual periods dt, i = 1 , . . . , m. This more general model has the potential to generate a traffic structure that is drastically different from m*d/D/1. For this discussion we assume that the periods d\,..., dm are obtained, in advance of the transmission period, by each user making an independent observation of a random variable A with a given period distribution. Hence a sample A ' , . . . , A m of size m represents the variability in the user stream periods. To obtain stationarity one should suppose that at time t = 0 each user starts from a random location within its period. The first cell emitted from user i after time t = 0 arrives at a time that is a fraction of A', after which cells arrive periodically with the same interarrival time A'. For simplicity we ignore this finer point; all streams begin just after the completion of a period, and therefore N't = number of cells from user / before time t = f f / A ' l ([x] denotes largest integer < x). Let A, denote the empirical mean number of cells per user at time t, i.e.,

Clearly

and so

Here we take note that £(1/A) can be unpredictable and also state that if we approximate

96

Chapter 5. Cell and Burst Scale Traffic Models

Figure 5.3. Y^Ti^i/D/l, exponential periods. N't with t/A', then

If the input lines N't are Poisson processes, then the corresponding result would be V(A,) ~ t/m, whereas in this case the input variation in J^Li difD/l is on the scale of magnitude ~ t2/m. Figure 5.3 shows 20 realizations of arrival streams A, each generated by m = 100 users. Also shown is a blow-up of the initial period of time of the same simulation. For the simulation we used an exponential distribution for A. This is an extreme case where E(\/ A) does not even exist finitely. In practice there is a lower bound to the period, corresponding to the peak rate transmission throughout the observed interval. Nevertheless, this example has some relevance for understanding more realistic cases, such as a slotted time model with periodicities mimicking a mixture of ATM users.

5.1.2

Voice packet streams in Internet telephony

We present a Markov model for the time delay variation on cell scale of packetized voice traffic typical for Internet telephony applications. The model is introduced and studied in detail by Kaj and Marsh [22]. We start by looking at the traffic pattern produced by sending a PCM-encoded speech file from Argentina to Sweden over the Internet. In a first packetization step the file is segmented into packets of 160 8-bit PCM samples each. This gives a periodic sequence of 160-byte voice packets with 20 ms interpacket times. As the audio stream passes through buffers and interacts with cross traffic en route to its destination, packets may experience random delays. The typical effect of a single packet being delayed is that the distance in time to the next packet ahead increases and the distance in time to the packet immediately behind decreases. Eventually a fast packet even overtakes a slow packet ahead and from then on must adapt to the speed of the slower packet. As a result the observed interpacket arrival times at the receiver are sometimes shorter than 20 ms and sometimes longer.

97

5.1. Cell-level traffic

Figure 5.4. Voice over IP data.

Figure 5.4 illustrates observed interarrival times for a short segment of duration 5 seconds taken from a longer trace of voice data sent from Buenos Aires to Stockholm. The upper graph in the figure shows observed interpacket arrival times on the y-axis versus packet sequence number on the X-axis. The corresponding histogram of interarrival times is shown in the lower part of the same figure. Attempting to understand the nature of the interarrival time data we make the following model assumptions. First, let the 20 ms packetization interval be the time unit in this section. When we say that packet nr k is sent at time k — I it means the packet is sent at clock time 20(fc — 1) ms into the data stream. We assume that the packets are subject independently of each other to transmission delays YI , ¥2, . . . given by i.i.d. random variables with distribution function

and with finite mean transmission time [a = f£°(l — F(x}) dx < oo. For the data we are concerned with here, typical values of fj. could be 2CM-0, i.e., 400-800 ms. If the random fluctuations of the transmission delay variables Y^ were the only source of jitter in the model, then packet k would arrive at time k — 1 + Yk. But since the packets must arrive in the same order as they are sent, the actual arrival times at the receiver occur

98

Chapter 5. Cell and Burst Scale Traffic Models

at times T\, T2,..., where

Therefore the observed packet interarrival times are given by

We also introduce the observed delay of the packets as

The representation for Tk can be written

Hence

and therefore

Moreover, for x > 0,

From these relations we conclude that (7^) and (V^) are Markov chains such that the latter and the difference sequence (t4) of the first have asymptotic distributions as follows. Theorem 5.1. The sequence (VJt) is a Markov chain with asymptotic distribution

The sequence ((4) has the asymptotic distribution

hence in particular a point mass in zero of size

5.1. Cell-level traffic

99

Moreover, E(UX) = 1 The last statement, which verifies that in this model the mean interpacket arrival time is preserved at 20 ms, requires a proof. One technique is indicated in Exercise 5.1; see also Kaj and Marsh [22]. In principle an estimate of the unknown distribution function F can be obtained from a given data set (?*)*=! of arrival times by forming the observed delays (ut)|t = i, Vk = t^—k + l, and the corresponding empirical distribution function

Then simply apply the estimate

We now return to the basic assumptions in the proposed model for voice over IP (VoIP) and discuss a more general version. In this context we use trace data to illustrate typical statistics of recorded VoIP telephone calls. The model discussed so far covers the transmission of a continuous audio stream where packets are generated at a steady pace over time. In reality VoIP systems are equipped with silence suppression. This mechanism is typically attached at the analog input port; if the sound level falls below a given decibel threshold the packetizer stops recording packets and assumes the person on the call is quiet. To include the effect of the silencer in the model we introduce a further sequence X], X^, . . . of i.i.d. random variables and assign Xk — length of quiet period between packets k — 1 and k. It is natural to specify the distribution of X letting, for some (small) a > 0, P(Xk = 0) = 1 — a. This is the (large) probability that silence suppression is not effectuated just after packet fe — 1 is played out. With probability a, on the other hand, Xk is positive and the audio stream is stalled during a time interval of length Xk time units. The reception times T\, ? 2 , . . . of the sequence of successive packets are modified accordingly and are now given by

where

is the total duration of quiet periods preceding packet k. For this model it can be shown [22] that the interarrival times C/* = Tj — 7i-i are such that

The sequence (Vk), which we have seen is a Markov chain in the case of X = 0, generalizes naturally to the case of silence suppression by putting VJt = 71- - k + 1 - S*. For the sak

100

Chapter 5. Cell and Burst Scale Traffic Models

Figure 5.5. VoIP data with silence suppression, sequence (Vk). of visualizing a trace data file recording of a VoIP call consider the related sequence

Given a trace file of arrival times {^}, hence of the interarrival times {Uk}, estimate 1 + E (X \) by the sample mean u and form the sequence v* = & — (k — 1)«, k > 1. Figure 5.5 shows such a sequence (iik) for a VoIP call lasting approximately 95 seconds, of which the recorded person speaks for 40 seconds (2000 packets). The remaining silent period consists of six longer intervals of length exceeding 3 seconds plus many shorter intervals of varying lengths. These correspond to the distribution of the sequence of quiet periods, (Xk). The typical properties of the distribution of (Xk) are further illustrated in Figure 5.6, which shows histograms of the durations of talk spurts and of silent periods in a similar data set of a recorded call. An approximation of the parameter a for this data set gives a & 3%.

5.1.3

Round-trip time distribution, PING data

In packet-switched networks the packet round-trip time is an important characteristic. The round-trip time is the measured overall delay of a packet successfully transmitted from sender to destination and returned to sender. It is sometimes reasonable to assume that the round-trip time stays constant over time. Communication satellite traffic is such a scenario where often the main part of the round-trip time is the propagation delay of 270 ms, which is the time required by a packet traveling at the speed of light to the satellite band back and forth. In performance calculations with discrete time models the round-trip time often forms a natural time slot. This is the case, for example, using reliable data transfer protocols. Recall section 3.3.4, where the round-trip time consisting of a single packet transmission time plus the ACK return time is used as a fixed model parameter. Similarly, in the study

5.1. Cell-level traffic

101

Figure 5.6. VoIP data, nature of recorded voice signal sequence (V*).

of the TCP protocol in section 6.4.3 we apply the same simplifying model assumption that the round-trip times are constant. Meanwhile, in this section we study the random fluctuations that are necessarily inherent in round-trip time measurements. The trace data set introduced in section 1.3 (see Figure 1.13) indicated the character of such variations in Internet regional traffic data (approx 100 km distance). We introduce now a round-trip time model that refers to packet data of this sort and that is based on the packet delay mechanism discussed in section 5.1.2, dealing with VoIP data. However, the general aspects of the model are independent of the particular application. Hence the model should be applicable in various situations where round-trip time variations are an issue. We want to provide an explanatory model for data such as in Figure 1.13. Figure 5.7 shows the corresponding histogram for the RTT trace file. PING packets are sent once every second. The maximum round-trip time in the data set is less than 700 ms and so each packet has been returned before the next is transmitted. In this sense the packets in sequence can be considered independent of each other. The packets do interact, however, with other traffic to and from the servers involved and with cross traffic of the Internet service provider. As a PING packet enters the transmission link we can think of this packet as being placed in line with other packets, much like the VoIP packet trains discussed in the previous section.

102

Chapter 5. Cell and Burst Scale Traffic Models

Figure 5.7. Histogram of round-trip time data. Hence the one-way delay of the PING should be well captured by the random variables V)t defined in (5.2). The interpacket distance of 20 ms from the VoIP application has no particular significance here but should be replaced by the typical interpacket distance on the transmission link in question. With this motivation we propose the following model for round-trip time data from PING. Assume that the random variable Y with distribution function F(y) = P(Y < y) corresponds to the typical delay of a packet, ignoring the possibility of being stalled by other packets ahead in line. Thus Y reflects the general transparency of the network, variation in routes, number of hops, etc. The typical choice is a black-box model such as

where /u, is a propagation delay parameter and a 2 and A represent different types of jitter. Let y(1) and V^ be two independent random variables with the asymptotic distribution of the Markov chain (Vj) in Theorem 5.1 and define as round-trip time distribution

This is the required time for a packet to travel both ways to the receiving end and back. The packet suffers from delays represented by the random variable Y and it is under the influence of other packets, potentially causing slow-down effects. The interfering packets are likewise delayed according to Y with the same distribution function F and it is assumed that delays occurring in one direction are independent of delays during transmission in the other direction. Figure 5.8 shows numerically calculated densities of the delay distribution V and the round-trip time distribution R for the case where Y is an exponential random variable. The scales have no significance; the graphs merely indicate the character of the

103

5.1. Cell-level traffic

Figure 5.8. Density profiles ofY (left), V (middle), and R (right). transformation leading from Y to R. Appropriate densities for R can possibly be fitted to empirical densities such as in Figure 5.7. Statistical analysis like goodness-of-fit should be applied to trace data of round-trip time measurements as a basis for validating or rejecting the proposed model.

5.1.4

Packet fragmentation in video communications

In this example we consider data from a variable bit rate video application. The video signal is encoded according to the MPEG standard and transmitted over the Internet via the user datagram protocol (UDP). As a result we may assume that the only variable quantity in the sequence of datagrams is the frame length. The data shown in Figure 1.11 give an example of such traffic streams. Wolfinger et al. [70] considered the transformation of an encoded video signal as the data are passed from the interface of the video application with UDP transport onto a network interface that belongs to a lower protocol layer. By modeling the load transformation effective between the two interfaces analytically, they obtained an arrival stream at the network interface based on the nature of the arrival stream at the application interface. In many situations this should be an advantage as measurements typically are simpler on an application level. Our example follows Wolfinger et al. [7()|, who analyzed the effects of header-generation and fragmentation. To model the primary load and discuss the secondary load at the network interface they assume that the frame length distributions are normally distributed. We extend their results by providing the exact distribution of the induced secondary load packet size. Consider encoded video data and put Xi = length of 1-frame no k (byte),

k= 1

n.

We apply the simple model that the sequence is a sample of i.i.d. random variables from the

104

Chapter 5. Cell and Burst Scale Traffic Models

normal distribution In this approximation it is assumed that IJL is sufficiently large and a sufficiently small to guarantee that in practice X > 0. Underpinning this model is also the assumption that I-frames are sufficiently separated in time so that the correlations in the sequence can be ignored. More diligent modeling could be based on a multivariate normal distribution with three components, one each for I-, P-, and B-frames, and covariance parameters measuring their dependence [8]. Next we consider the change in data unit lengths within the UDP and IP layers. The UDP adds headers with specific control information to the original MPEG packets and IP will cut packets into smaller fragments to conform with a maximum transmission unit (MTU) on the network layer. We use UDP headers of h = 40 bytes and the MTU = 1500 bytes for ethernet. Thus if an original data frame is larger than 1460 bytes the transformation leads to a number of IP packets of size 1500 bytes plus a fragment of smaller size. Letting Y represent the distribution of the frame lengths now measured in units of MTUs we have

The resulting secondary load arrival process is specified by assigning Z — [Y] — number of max size IP segments per MPEG I-frame, R = Y - [Y] = length of IP packet fragment. The distribution of the integer valued random variable Z is given by

where <J> is the distribution function for the standard normal. The distribution function for the fragment size 7? can be written

hence

It was observed by Wolfinger et al. [70] both numerically from the model and by investigating empirical data traces that the pure fragments were distributed close to the uniform

5.1. Cell-level traffic

105

Figure 5.9. Histogram ofl-frame lengths in encoded video of the motion picture Ghost. distribution on the interval (0, MTV), so that P(R < jc) ~ x for 0 < x < 1. To verify this it is convenient to turn to the corresponding density function

Using numerical software this function can be computed with great accuracy. One finds that as soon as the segment size is not too large in comparison with the variation in the data, the fragment size distribution is very close to uniform. For example, if a > 1500, we have

and if a > 2- MTU, the upper bound of the deviation can be reduced to 10 15 As a numerical example we take a sequence of n = 18,1691-frames representing the motion picture Ghost (Paramount Pictures, 1990). Figure 5.9 shows a histogram for the data and numerical estimates of the parameters /j. and a (bytes) are given by

The distribution of Z is given by

which gives

106

Chapter 5. Cell and Burst Scale Traffic Models

The empirical fragment distribution fits a uniform distribution on [0, 1] (mean 0.5, standard deviation l/-«/12 = 0.2887) with great accuracy. The sample mean of calculated packet fragments in the trace is f = 0.4943 and the sample standard deviation sr — 0.2896.

5.2

Burst-level rate models

Next we turn to models relevant for such examples as the ethernet data in Figure 1.8. One natural approach to modeling such arrival processes is to introduce an arrival rate process A, related to the arrival process by

With this approach network traffic is approximated by continuous flows. If A, is Markov, then A, is a Markov-modulated fluid model. Examples of form (5.3) were discussed in section 3.3, namely, the renewal rate process and the on-off processes with A, alternating between two states. A further interesting example is the Kosten model, in which it is assumed that A r = hZt, where h is a given peak rate parameter and Zt denotes the system size process of the infinite server Markov model M/M/oo. In particular, in steady state, Z, is Poisson distributed for each fixed t. A straightforward generalization of the Kosten model is the Poisson burst model, where Z, is the system size process in M/G/oo, allowing a general service time to enter as a parameter. Figure 5.10 shows trajectories of such arrival flows for the M/G/oo model in Figure 3.10. The special case in which the tail of the service-time distribution is assumed to decay so slowly that the variance is infinite is of particular interest. In this case A, exhibits long-range dependence. The integrated process in (5.3) is only one option for modeling arrival traffic. Other choices include compound Poisson processes of the form

where Nt is a Poisson process with jumps {f;} and (X,) has the role of a rate process, and time-changed Poisson processes with an intensity process that varies over time.

5.2.1

Anick-Mitra-Sondhi model

Consider ra independent, alternating renewal process on-off sources Z',,i = 1 , . . . , m, as in section 3.3, which during on-periods transmit cells at peak rate h over a link of capacity c bits per second. Each source works under a fixed mean rate X and it is assumed that the sources have attained stationarity. Then

y = P(a given source is in the on-state) = and

= number of sources in on-state at time t e Bin(/n, y).

5.2. Burst-level rate models

107

Figure 5.10. Arrival processes in the generalized Kosten model. The Anick-Mitra-Sondhi (AMS) model is the continuous, multiplexed, work-load arrival process denned as in (5.3) by letting

We compute first- and second-order moments for the special case where each source is given by the two-state Markov process with jump rates a and ft. Then

The superposition Z, of all the m sources also is a Markov process in this case, namely, a birth-and-death process on {0, . . . . m\ with intensity )^ — a(m - k) for upward jumps and /j,k = ftk for downward jumps. By (3.7),

and by (3.9),

Since the autocovariance function of the Markov on-off process is known from (3.18),

108

Chapter 5. Cell and Burst Scale Traffic Models

can be found explicitly. Other properties of the AMS model are discussed later. Figure 5.15 illustrates heavytailed on-off times and Example 12 in section 5.3.4 deals with statistical measures of burstiness. Section 6.1.2 in Chapter 6 discusses an example of bandwidth allocation in admission control using this model. 5.2.2

Markov modulated Poisson process

The term Markov modulated Poisson process (MMPP) refers to a situation where a modulating process J,, typically an irreducible Markov chain in continuous time with a finite number of states 0 , . . . , m, governs the rate of Poisson arrivals. During periods such that /, = j incoming traffic arrives according to a Poisson process of intensity Ay, 1 < j < m. As J, changes state so does the intensity of the Poisson source. It is intuitively clear that if the modulating process has a steady state distribution {JT,-}, then the effective arrival rate is given by A.^ = ^7=o ^i^-i- Service systems with MMPP arrivals can be analyzed; see, e.g., Mitrani [38]. We study the particular case A.J• = h • j and the modulating process given by the superposition of independent on-off sources Z, in (5.5),

restricting to the case where Z, is Markov. Consider a family of independent Poisson processes with intensity h, i.e.,

where each process generates Poisson events at time points Tln = £)t=i U'k. All the U'k 's are independent and exponentially distributed random variables with the same expected value l/h. We associate one Poisson process with each on-off source ZJ in (5.7) and define th arrival process by

which means that A, is formed by going over all Poisson events in [0, t] and counting only those that occur as the corresponding source is on. This can be seen as a Poisson process modulated by the number of sources Jt which are on, in the sense that the actual intensity driving the counting process A, is random and at any given time t equal to h J,:

A simulation for the case m = 3 is shown in Figure 5.11. The modulating Markov chain is plotted in the lower part of the figure and the resulting arrival process in the upper. In analogy to (3.7)

5.3. Long-range dependence traffic models

109

Figure 5.11. Markov modulated Poisson process. The relation V(X) variance

= E(V(X\Y)) + V ( E ( X \ Y ) ) suggests a method for computing the

Conditionally, given a sample function of the process Zt,

and

Hence

The first term was calculated in (5.6) and the second term equals h J0' E(Z2S) ds = hyt.

5.3

Long-range dependence traffic models

Let us return to the question of whether it is reasonable to assume Markov interarrival or service times in real networks. In recent years large sets of traffic data have become available. Statistical analysis has shown strong evidence that certain categories of data are not covered by such assumptions. We mention two examples.

110

Chapter 5. Cell and Burst Scale Traffic Models

Duffy et al. [10] reported on a statistical study of telephone call holding times (CHT), where during 8 hours a total of 302,225 calls started and the measured durations varied from 0.001 seconds to 29.5 hours! Given such data and writing S as a generic notation for CHTs, one can perform a statistical test on the hypothesis that S e Exp(/Li) based on the fact that the tail of the distribution,

decays as x -» oo at exponential rate to zero if the hypothesis is true. Such investigations in fact showed that the tail distribution for CHT is much better described by a function of the form

i.e., the distribution possesses heavy tails. The second example is the experiment performed at Bellcore from 1989 onward (see also section 1.3). The Bellcore data of ethernet LAN traffic has triggered over several years much interest in traffic modeling based on the concepts of self-similarity and long-range dependence, mathematical notions that are related to heavy-tailed distributions. See Leland et al. [31] and Willinger et al. [68]. The collected data sets were monitored in traces of 40 hours containing approximately 27 x 106 packets. The data can be viewed either as traces of the number of packets arriving per time unit or, if the varying packet sizes are taken into consideration, as traces of the number of bytes arriving per time unit. The second version is a kind of workload process. To understand the specific character found in the ethernet data we divide time into 10 ms intervals and count Xk = number of ethernet packets arriving at interval nr k, k > 1. Figure 1.8 shows such a sequence (X^) but with a time slot of 1 second. Next, form a sequence of 2-block averages

and, more generally, the aggregated sequence

For a fixed m we can now plot the new sequence (-X^ )*>i and compare with the original sequence. Indeed, if we take the same data as in Figure 1.8 and form a new trace with m = 5 (i.e., choosing the time bin 5 sec), then the result is as shown in Figure 5.12. The basic statistical character looks pretty much the same. The original experimenters at Bellcore were able to handle such large amounts of data that the same rescaling could be performed for the five time scales m = 1, 10, 100, 1000, 10,000 on a single trace. Remarkably, the resulting sequences ranging over time scales from milliseconds to minutes were similar and in this sense the process X (m) is similar to itself, regardless of m. This has been taken as an indication of self-similarity and the related property of long-range dependence.

5.3. Long-range dependence traffic models

111

Figure 5.12. Ethernet arrival rate process. 5.3.1 Self-similarity We state some definitions. The formal analog of the idea of considering aggregate blocks is the following. Consider a stationary sequence of identically distributed random variables (Xn) with mean E(X) = 0 and variance V(X) < oo. The sequence is said to be self-similar with self-similarity parameter H if for any m = 1,2, ...,

have the same finite-dimensional distributions. In the general case fj, = E(X) < oo, it is required that for each m > 1,

have the same finite-dimensional distributions. The parameter H ranges from 0 < H < 1. For applications to traffic data it seems most suitable to consider, instead of packet counts x['"\ the corresponding centered measurements and simply apply (5.9) where all random variables have zero mean. Hence in the following we restrict ourselves to /i = 0. If the weaker notion applies that (mX(^n)) and (mu'X/ 1 (so that r ( 1 > = r). It is left as an exercise to verify that for a second-order self-similar sequence, for any k

112

Chapter 5. Cell and Burst Scale Traffic Models

Figure 5.13. Autocorrelation function, ethemet data. A deeper result is now that a function r(k) which satisfies such scaling relations must have a specific form. It can be shown that the second-order self-similar sequence autocorrelation function is forced to be

Observe that the right side is a difference approximation of the second-order derivative of the function f ( x ) = x2H, hence

In practice one should resort to the less restrictive assumptions of asymptotic self-similarity or second-order self-similarity in the sense that the above properties hold asymptotically for large m. The strict properties quoted here, however, give the basic intuition. Figure 5.13 gives an example of an estimated autocorrelation function. It is based on the BBC news video data shown in Figure 1.9 and shows surprisingly large values over long distances, probably due to some machine-generated periodicities in the material. The noisy curve gives an idea of the estimation error. It shows the autocorrelation function based on the same trace but where all correlations have been destroyed by shuffling the video frame size data in random order. We turn to the closely related definitions designed to capture long memories. The sequence (Xn) is called

5.3. I ong-range dependence traffic models

113

Clearly a second-order self-similar sequence is long-range dependent if 1/2 < H < 1, since then

In continuous time we have the corresponding definitions that a stationary process At, t > 0, is self-similar if the distributions of AT, and TH A, are equal for every T > 0 (5.14) and possesses long-range dependence if and only if

5.3.2

Heavy-tailed rate models

In this section we investigate long-range dependence in the renewal rate process and the M/G/OG version of Kosten's model, the Poisson burst process. First let A ; be the stationary renewal rate process in (3.14) with associated interrenewal time distribution F ( t ) = P(U < t) of finite mean v = E(UT) and rate sequence (X/,) with mean E(X\). The corresponding arrival fluid model is A, = J0' Asds, as in (5.3). The autocorrelation function /-(/) of A, is found in (3.16). Since

it follows from the criteria (5.15) that the renewal rate process is long-range dependent if and only if U has infinite variance. To obtain the variance of the fluid arrival process we apply (3.9) in the form

which together with (3.16) yields

The Poisson burst rate process obtained by replacing the renewal rate process above with M/G/oo leads to very similar expressions. For this case we let At denote a stationary version of the M/G/oo process based on a given service-time distribution S and repeat the previous calculations. It follows that again the model is long-range dependent if and only if S has infinite variance. The arrival process variance is calculated from (3.9) and (3.31) as

114

Chapter 5. Cell and Burst Scale Traffic Models

Figure 5.14. Brownian motion approximation, W(n\ n = 1000.

5.3.3

Fractional Brownian motion

A common technique in stochastic process theory is to approximate jump processes that take their values in a discrete set with continuous state processes. Visualizing jump processes like those in Figures 1.7 and 2.1 brings to mind continuously varying random processes. To establish the basic idea, consider two independent Poisson processes A,, B,, t > 0, both having intensity A > 0. Put

For large n this means that the Poisson processes have evolved over a long time interval; we haveE(Ant~Bnt) = OandV(A nr -5 n( ) = 1\nt. Hence E(W/ n) ) = OandV(W/ n) ) = 2\t. Figure 5.14 shows 10 simulated trajectories of such processes W t ,0 oo, it follows from the central limit theorem that W, converges in distribution to some random variable W, € N(0, cr 2 1), a2 — 2X. In this perspective we can now quote nonrigorously one of the most fundamental results in modern probability theory. Theorem 5.2. There exists a Markov process W,, t > 0, WQ = 0, called Brownian motion with variance a2, which is characterized by the following properties: All nonoverlapping increments Ws, Ws+t — Ws are independent; the increments Ws+, — Ws are stationary for all s > 0; the increments Ws+t — Ws are N(0, a2t) distributed, all s > 0; the trajectories Wt, t > 0, are continuous junctions. Brownian motion arises, for example, as the limit of W(n) as n weak convergence of stochastic processes).

oo (in the sense of

5.3. Long-range dependence traffic models

115

In telecommunications modeling Brownian motion appears in various, more advanced approximative schemes (one instance is mentioned in section 6.3.3). The topic is included here as we wish to discuss a non-Markov relative to Brownian motion, so-called fractional Brownian motion (FBM) [36], which recently has attracted a great deal of attention in traffic modeling. An FBM process BH(t), t > 0, shares the properties with Brownian motion that the paths are continuous and that the increments are stationary and distributed according to a Gaussian (normal) distribution. On the other hand, nonoverlapping increments of BH(t) are not independent and one has

The parameter H (Hurst index) typically belongs to the interval 1/2 < H < 1 (the case H = 1/2 gives Brownian motion) and is a self-similarity parameter as discussed in (5.14), namely, the distributions of BH(Tt) and TH BH(t) are equal for every T > 0. An early reference to the use of FBM in telecommunications is Norros [42]. To understand the potential applicability of FBM in traffic streams we consider a rescaling scheme for a superposition of renewal streams with heavy-tailed holding times. The purpose of such a scheme is to find an appropriate time scale and a scaling of the load per user, such that the natural fluctuations in the model can be captured asymptotically as the number of users increases. Taqqu, Willinger, and Sherman [62] established limit results for the AMS-type multiplex of on-off sources where either the on periods, the off periods, or both, possess heavy tails. Figure 5.15 shows simulations of the process (Z,) in (5.5) with m = 50 users and equally distributed on and off period distributions with heavy tails of order ~ x~(l+^\ for ft — 1, ft = 0.6, and ft = 0.2. The simulation indicates that with heavier tails the trajectories of the superposition process tend to be more regular and show less variation on small time scales. A closely related and somewhat simpler set-up is as follows. Let

where Z't, i = 1 , . . . , m, are independent renewal rate processes with mean reward y — E(X) and interarrival distribution F(t') = P(U < t) as introduced in (3.14). Assume that the distribution of F has heavy tails of order ~ jf~ ( 1 + ^\ 0 < ft < \. The superposed renewal rate process A, represents the total amount of data generated by m independent users during a time interval of length t. The results of Taqqu, Willinger, and Sherman [62] suggest that for this comparable model the fluctuations of A,, appropriately scaled, converge to FBM with parameter H = 1 - ft/2. For simplicity it is assumed that F(t) is an exact Pareto distribution with a tail parameter ft as above. In general a weaker assumption on the asymptotic behavior of F for large / suffices. Moreover, the initial distribution is selected so that the renewal rate process is stationary. For each T > 0 define the rescaled process T~lATt. Since the mean equals

116

Chapter 5. Cell and Burst Scale Traffic Models

Figure 5.15. Multiplexed on-off sources, light (top), intermediate (middle), and heavy (bottom) tails. E(T ' ATt) = ymt, the process

describes fluctuations around the average. As an example one may think of the arrival counts in Figure 1.7 as A,(m) and subtract the linear slope. The significance of the parameter T is to look for a particular time scale on which the fluctuations exist macroscopically and perhaps behave as FBM. When rearranging integration and summation two alternative limit schemes arise:

and

Using the same proofs as for on-off processes in Taqqu, Willinger, and Sherman [62], it follows that if firstm tends to infinity and then T tends to infinity, the right-hand side of (5.17) converges in distribution to the FBM process BH(t), t > 0. The covariance calculation in (3.15) is essential to verify that the scaling order is correct. The scheme in (5.18) is relevant if the limit operations are taken in reverse order. It is shown in the referenced work that if first T —> oo and then m —>• oo, the right-hand side of (5.18) converges weakly to a so-called stable process. This discussion suggests the further alternative of letting T — Tm tend to infinity jointly with m. It is still necessary to distinguish different regimes of convergence. First

5.3. Long-range dependence traffic models

117

suppose that T —>- oc slower than m]^ in the sense that T^/in —> 0 as m —> oo. It can be shown in this case that the sequence in (5.17) satisfies

in the sense of weak convergence of processes. On the other hand, if T^ /m —> oc, then the sequence (5.18) converges to a stable Levy process of index 1 + ft. Relevant references for these results are Mikosch et al. [37], Pipiras, Taqqu, and Levy 149], and Gaigalas and Kaj [12]. In the intermediate regime T& ~ m we are left with the unnormalized fluctuations process on the left-hand side of both (5.17) and (5.18). The convergence of the corresponding sequence (A^/,,( — ymt)/ml/fi is investigated in [12]. The setting for the results in [12] is somewhat different from (5.16) and refers to the superposition of ordinary renewal processes with heavy-tailed interarrival time distributions, which we mention next. Let W,(l) be independent stationary renewal processes with Pareto distributed holding times of index ft. Let T be some time scaling such that T^/m vanishes as m —>• oo. Define

Then Y(m} converges as m -> oo to a multiple crpBH of FBM with Hurst parameter H = 1 ~ ft/2. The primary application of this result is that A, — X^li ^} counts the accumulated number of packets generated by m independent users sharing a LAN. Each user works under the assumption of Pareto interpackct distributions. It follows from the convergence result that for large m

But according to self-similarity with Hurst index H = 1 — ft/2, for each fixed t,

so that

In other words, the mean traffic behaves as

This provides a verification of the model for ethernet traffic proposed by Norros [42]. It is simple to generate m independent sequences of stationary Pareto-type arrival time random numbers and sort the merged data set in increasing order to obtain arrival times for the superposition process. Figure 5.16 shows three realizations each for H — 0.6, H = 0.7, and H = 0.8 with 25 renewal processes each time (m = 25). Consult Paxson [47] for alternative simulation techniques.

Chapter 5. Cell and Burst Scale Traffic Models

118

Figure 5.16. Approximation ofFBM, H = 0.6,0.7, 0.8, m = 25.

5.3.4

Statistical methods

Not much consensus has yet emerged regarding the statistical characterization of "burstiness" in traffic streams. This includes how to test statistically whether a given set of network data complies with a particular model for self-similar or long-memory traffic. The monograph by Beran [4] provides the mathematical details. The current growth of interest in these topics and the wide range of methods applied can be seen in [2]. In this section we only touch on the area and discuss some of the techniques. A crude measure of burstiness sometimes applied to cases where the mean rate and the peak rate have been identified is simply burstiness =

peak rate mean rate

Another statistical quantity intended to measure the degree of bursts in a packet stream is the index of dispersion

for which the Poisson process serves as a reference case, with I, = 1. Example 12. For the AMS model introduced in section 5.2.1 and specializing to alternating Markov renewal processes, we have

5.3. Long-range dependence traffic models

119

Furthermore, it follows from (5.6) that if we select a time scale by setting a + ft =• 1, say, so that a = y = A./ h, then

Hence whereas burstiness — h/X is a rather different function. Various graphical methods have been suggested for investigating if a set of arrival count data exhibits self-similarity or if a trace of CHT are heavy tailed. If the result of such an investigation is the application of a specific model with long-range dependence, then the next step is often to estimate numerically a corresponding self-similarity parameter, or Hurst parameter, H. To list some examples, suppose X i = arrivals during the time interval [j — 1, /). The index of dispersion for counts is given by

and linear regression can be applied to test whether ICDL ~ L1H ' for some parameter H in accordance with self-similar scaling behavior. A time-variance plot has a similar purpose and is obtained by plotting estimated variances of the aggregated sums X(m) against m in log-log scale (logm, log V(X ( m ) )). Estimation of the autocorrelation function and other time series analytic methods also can be used in this context. For an introduction to such methods, see, e.g., Molnar, Dang, and Vidacs [39]. Let us turn to the statistical analysis of the tail behavior of a holding time distribution P(S > t). To check for heavy tails such as the power law decay in (5.8), the obvious first step is to estimate F(t) = P(S < t) by the empirical distribution function

and then to investigate linear regression of y on x in

So-called quantile-quantile plots (Q-Q plots) are graphical methods based on related ideas. A more sophisticated technique, which has its origin in extremal value theory, is to use the so-called Hill's estimator. Resnick [52] contains a detailed account. The mean excess function, defined by

is a further tool for explorative data analysis of distribution tails. Clearly g(y) = E(S), y > 0, is constant if S is exponentially distributed. The idea is to distinguish heavy tails

120

Chapter 5. Cell and Burst Scale Traffic Models

from light tails based on the asymptotics of g(y) for large y. In fact, the property g(y) ->• oo may even serve as a formal definition: S possesses heavy tails if,

g(y) —>• oo,

y —> oo.

Intuitively, if g(y) is an increasing function in y, then the longer the call has lasted the longer its remaining length will be, which fits with our understanding of long memory. Greiner, Jobmann, and Kliippelberg [15] recommended the mean excess function for analysis of telecommunications data and applied the technique on a trace of 1,690,730 ATM cells (approximately 2 minutes of IP traffic). The cells are supposed to arrive in bursts of (i.i.d.) lengths characterized by an on-period distribution, separated by silent periods whose lengths follow an off-period distribution. The empirical mean excess function is calculated for the on-period data and the off-period data separately and plotted against cell burst levels v. The on-period data, in particular, generates a plot that is remarkably close to a straight line over the full range of levels y. This suggests that the Pareto distribution fits the data quite well (see Exercise 5.6). The off-period measurements suggest a lighter tail for that case. We finish the section on holding time tail behavior by discussing a further graphical tool of potential interest for analyzing network data. The starting point is to relate the mean excess function in (5.19) to notions from life length testing. Let S with distribution function F denote a generic nonnegative random variable that represents life length, which in our context can be CHTs, off-periods, etc. Note that

The distribution of S is said to be of type new worse than used in expectation (NWUE) if

It follows that S is of type NWUE if and only if £(5 - y\S > >•) > E(S) for all y > 0 (Exercise 5.7). Also, S is said to be of type decreasing failure rate (DFR) if for each fixed / the function P(S > t + u)/P(S > u) is increasing in u. This property implies increasing mean excess and hence implies NWUE. Both classes of distribution seem relevant for heavy tails, but we need to be more specific. Let F~'(JC) denote the inverse function of F(t) so that F(F~l(x)) = x. Define the TTT transform of S to be the function

Then 0 < (f>(x) < 0(1) = 1 and

Consequently, since F~l(x) tends to oo if and only if x tends to 1 from below, S exhibits heavy tails iff

lim <j>'(x) = oo.

X—>]

5.4. Exercises

121

Figure 5.17. TTT plot analysis ofethernet interarrival times. Some further insight into the shape of the total time on test (TTT) transform is offered in Exercises 5.8 and 5.9. Finally, the graphical technique based on these observations consists of sorting the available data in increasing order t\,..., tn and drawing in the unit square the TTT plot

where

is the so-called total time on test sequence. The resulting plot is the empirical TTT transform, which is used for visual evaluation of the data set. In Figure 5.17 this technique is applied to the etheraet data in (1.6) (neglecting the dependence structure). The nonsmooth curve is the TTT plot which falls under the diagonal (NWUE property), is basically convex (DFR property), and seems to have a steep derivative in x = 1 (heavy tail). Also shown in the graph are the TTT transforms of two Pareto distributions, with ft — 0.3 and 0.4 indicating Hurst parameter values H = 1 - ft/2 in the range 0.80-0.85.

5.4

Exercises

5.1 Show that the sequence of arrival times for VoIP packets, (7j) defined in (5.1), has the representation

122

Chapter 5. Cell and Burst Scale Traffic Models where TA'_i has the same marginal distribution as Tk-\ and is independent of Y\. Use this representation to verify the relation

Prove that the interarrival times U^ have the property E(I4) —> 1 as k —>• oo. 5.2 With reference to the fragmentizer model in section 5.1.4, assume that the frame lengths of encoded video can be modeled on the application level by the exponential distribution with mean p.. Find the distributions of the resulting number of maximal size IP packets (Z) and the size of fragments (R) at the network layer. 5.3 Verify the calculations in the displayed formulas (5.11). 5.4 Verify (5.13) from (5.12) by deriving the limit

5.5 Show that for the accumulated traffic arrival process A, = J0 As ds of a renewal rate process with Pareto distributed interrenewal times having tail decay x~^[email protected]\ 0 < /} < 1, the variance V(At) grows with t at the same rate as the power function t2~P. What is the natural choice of self-similarity parameter H in this case? 5.6 Show that if the Pareto distribution with survival function R(t) = 1/(1 + ?/ 0, is assumed to model CHT S, then, for any /3 > 0, the mean excess function g(y) is linearly increasing in y, y > 0, and hence is heavy tailed. 5.7 Verify that the NWUE condition is equivalent to the property that the mean excess function is bounded below by E(S). 5.8 Show that the DFR property is equivalent to the TTT transform being a convex function and show that the NWUE property is equivalent to 4>(x) < x, 0 < x < 1. 5.9 Show that the Pareto distribution in Exercise 5.6 has TTT transform eiven bv 4>(x) =

Chapter 6

Traffic Control

With reference to the general net model in section 1.2, we distinguish admission control as a term for the procedure of deciding whether a request for network service should be admitted; access control, referring to various tasks of the access net; and congestion control, for providing effective transmission along transport routes. In this chapter we discuss a selection of mathematical models and problems related to these issues.

6.1

Admission control

Admission control occurs on the time scale of calls. The decision to accept a request for service should be based on conditions that are likely to prevail for the duration of the call and not on instantaneously changing quantities. Long-term traffic statistics such as peak rate and mean rate are essential. During the call other mechanisms for access and flow control may be active, on much finer time scales. In best-effort networks without criteria for delay and loss, such as the Internet, there is no admission control. Consider user requests on a network for setting up calls—virtual connections in an ATM node, for example. Typically the requests are accompanied by a QoS specification, e.g., a listing of estimated volumes in various traffic classes and corresponding performance parameters, such as maximum loss or cell delay variation. The specification describes the desired quality of the requested service. It is a decision of the network to accept or to reject the caller and to formalize the agreement in terms of a traffic contract. A traffic contract, in its turn, must be monitored during the time it is valid and service is in progress—network policing. The worst-case approach (from a networking perspective) to solve the admission problem is to accept the call if bandwidth and buffers can be allocated based on its peak-rate requirements. For example, if the call consists of the superposition of m on-off sources where P(Zlt = 1) = /?/, and the source requires bandwidth C/ cells/sec while on, then the actual demand is bounded from above by the peak rate,

123

124

Chapter 6. Traffic Control

In principle all sources could be in the on-state throughout the call; the only chance for the server to accept the reservation is therefore to reserve bandwidth C — Y^T=i Ci f°r ^s ca^> a safe but probably very ineffective policy. We also realize that to expect stability in the network, allocation based on the mean rate is a bare minimum. The capacity C which is to be made available must exceed the mean value,

In reality an intermediate case is to be expected. This raises an interesting question, discussed next.

6.1.1

Effective bandwidth

How do we define and measure the effective bandwidth, i.e., the effectively required bandwidth that the given QoS specification is likely to produce? Clearly this is a question of statistical character. It suggests that the network must ask which risk it is willing to take of not being able to fulfill the contract, then estimate the effective bandwidth, and base its decision on that. This is an area currently under research. One direction is known as measurement based admission control and exploits recent advances in applications of the theory of large deviations; see, e.g., O'Connell [43]. For a detailed study of the large deviations methodology focusing on Internet congestion we refer to Wischik [66]. We address some of these ideas, beginning with a simple example. Example 13. Suppose a user makes the request to transmit packets of exponentially distributed size with mean size 1 that enter the network according to a Poisson process of intensity X. The server knows immediately that it has to allocate a bandwidth C > A. to guarantee that the traffic intensity in the resulting M/M/1 model is kept in the range of a steady state solution, Q — \j C < 1. This represents mean-rate bandwidth allocation. In this model there is formally no upper bound on the number of arrivals, hence there is no direct analog of peak-rate allocation. On the other hand, we obtain possible notions for the effective bandwidth as follows. As usual let N denote steady state system size and W system time per packet. Thus N has a geometric distribution with parameter Q and W is exponential with parameter C — A.. To have for any given e > 0

the requirement of the server is Alternatively, to be sure that

it is necessary that C satisfies the inequality

6.1. Admission control

125

It is now clear that a QoS contract can be formulated in terms of the parameters K,n,w, and e and that the network can compute the minimal C which satisfies both inequalities above and guarantees, with a certainty of probability s, that the requested service can be carried out. Returning to a general framework of an arrival stream process A, with stationary increments (As+t — As the same distribution for all s > 0) we discuss the approach to effective bandwidth based on entropy, or moment generating functions; see Kelly [26] and Gibbens [13]. As in the M/M/1 example, mean rate allocation of resources amounts to choosing capacity C > E(A,)/t & X which is greater than the average arrival rate. It has been suggested that the effective bandwidth surface

should be used for the purpose of measuring and characterizing traffic as well as allocating capacity. Here 9 is an additional scaling parameter, which in the limit $ ^ 0 returns the mean rate function and in the limit 0 ->• oo the peak rate. For example, if A, is the multiplex of n independent on-off sources of mean rate mt and peak rate hi bits per second, then over an interval of length t

In principle the effective bandwidth surface can be computed numerically and graphically for a given model and its shape and properties used for description and characterization of various traffic classes. For such examples see the referenced papers.

6.1.2

Statistical multiplexing gain

In this section we consider in more detail the problem of assigning effective bandwidth to incoming requests. The arrival process is assumed to be given by the AMS model of section 5.2.1 with m multiplexed on-off sources each of mean rate A and peak rate h. Our main reference is Schwartz [56, Chapter 4.1]. Recall that Z, e Bin(m, y) is the current number of active sources, (5.5), and A, = hZt the current load. Possible performance measures are the steady state quantities Loss probability: Proportion of expected loss:

P(A > c), £((A — c) + )/£(A).

We express the second option in terms of Z r . Introduce

the number of sources that can be transmitted simultaneously at peak rate (assuming no buffer). The mean rate allocation lower bound guarantees that c > 1m, hence «o > [yw] %

126

Chapter 6. Traffic Control

Figure 6.1. Effective bandwidth, AMS model. E(Z,). If the trajectory Zs,s > 0, crosses the level n0 at some time during the interval of observation, then the maximal capacity is exceeded, resulting in cell loss. Assign

In particular, FIOSS(W) approximates the proportion of expected loss. The function P\oss can be used for bandwidth allocation, as described next. For a given e > 0, find (integers) N — N(s, m) > «0 such that P\OSS(N) < s. Then ^ hm is the capacity required for the m sources and

These sequences are schematically depicted in Figure 6.1. The quantity noh/Ns is the effective bandwidth per source and the corresponding losses among the m users are controlled by the choice of e. The ratio NE/HQ satisfies

and represents the statistical multiplexing gain. This can be viewed as a measure of the degree to which the fluctuations in the input allow for capacity reduction compared to peak rate allocation. Given the traffic parameters X and h and a QoS specification in terms of e, we now have a method for controlled bandwidth assignment. For a given capacity c, find the maximum number of calls m that can be admitted. Conversely, given m calls, find the required capacity.

6.2. Access control

127

An alternative method is based on the central limit theorem. For large m, the distribution of the number of active sources is approximately normal in the sense that

Thus, for £ > 0

where ze is the upper e-quantile in the normal distribution. Tf we take, for examp] e, £ = 10 5 so ze — 4.26 this approach results in the allocation rule between mean and peak rate given by

where A + 4.26^/A.(h — A.)/m is the effective bandwidth per source. The multiplexing gain is given by peak rate 1 effective rate

y + 4.26^7(1 - y)/m'

For a similar derivation, see Schwartz [56, Formula 4-14a].

6.2 Access control Access control involves procedures for smoothing traffic flows and preventing access points of a network from becoming overly congested. Access control decisions act on the cell scale and may be based on instantaneously varying traffic conditions.

6.2.1

Leaky bucket systems

The term leaky bucket is normally used for a filtering device attached to a network entry point. The puipose of the technique is to shape or regularize variation in traffic streams, typically packet arrival processes entering an access node. The goal is to reduce burstiness prior to admittance into the network. This may cause additional delay or losses in the system but with the gain of reducing large bursts of arrivals in short intervals of time. The name leaky bucket originates from an analogy of regulating flow variations in a fluid. Arrivals are thought of as forming an irregular fluid flow piped into an access valve. To avoid extensive variations in pressure and fluid velocity at the access point, the stream is passed into an open container, the bucket, of fixed size, from which the fluid has to drain out through a pipe of fixed capacity, the hole in the bucket, into the network. The effect is that during periods of high arrival intensity, excess fluid is temporarily stored in the container until it gains access through the regulating pipe, possibly at the price of losses, i.e., bucket overflow. To begin modeling such devices, let A, denote an arrival stream considered on the burst scale level. Often a leaky bucket filter can be thought of as a fictitious single-server queuing system mounted before the access point; see Figure 6.2. This service system produces a

128

Chapter 6. Traffic Control

Figure 6.2. Fictitious queue in leaky bucket.

(departure) stream A, which is more regular than A, and will become the newly shaped arrival stream fed into the network. It is not obvious how to measure regularity but the basic goal is that A, shows less bit rate variation over time than A,. To understand how to produce such processes A, we make another analogy. Example 14. Celebrities and limousines. On Academy Award® night, celebrities arrive at random times to the venue (producing a process A,). However, they are asked to gather at a cocktail lounge nearby from which one of d limousines will take them to the theater main entrance photo opportunity. If there is a limo waiting by the time a celebrity arrives at the lounge, it departs immediately with the guest. If not, celebrities are asked to wait for the next available limo returning from the theater. Clearly there will still be some variation in the theater arrival process A, but it is likely that burst arrivals will have been smoothed somewhat. We will see next in what sense we can think of this as a fictitious queuing system.

6.2.2

The M/M/1 leaky bucket

We illustrate the ideas using once again Poisson arrivals. A stream of fixed-size packets A, is supposed to arrive according to the Poisson process with intensity A to an entrance point of a node equipped with a leaky bucket. The mechanism for achieving the desired regularized arrival stream is to require each arriving packet to carry an access bit that grants it passage into the node. The leaky bucket must be designed so that it distributes these passage tokens restrictively during periods of frequent arrivals (clusters) of the Poisson process and makes up for the delay during quieter periods. Specifically, fix a parameter r > A. and let the leaky bucket simulate and keep records of an M/M71 process, which we denote by Xt, that would be obtained if A, was fed as arrival process into a single-server queuing system with service intensity r. The distribution of X, will settle in an equilibrium state X^ with average E(Xaa) — Q/(! — Q), p = \jr. Now, introduce an integer-valued parameter d > 1. Think of the service completion times of the M/M/1 process, i.e., the downward jumps of X,, as arrival times of tokens, the signaling units needed by the packets for access, and assume the leaky bucket has the capacity to load a storage of up to d tokens. The device then works as follows. If a packet arrives and tokens are available, the packet immediately seizes a token and departs for the access node. If a packet arrives and no token is available, it will have to wait in the bucket until a sufficient number of tokens again has arrived to clear out the backlog. The picture is clarified by noting that we have the following interpretation of Xt, the

129

6.2. Access control

Figure 6.3. Leaky bucket shaping. leaky bucket level process: 0, d tokens available, 1, d — I tokens available, n, no packet waiting, k packets waiting for access, k > 1. The parameters are the leak rate r and bucket depth d. It remains to identify the shaped arrival process At. As long as X f _ < d, a further arrival at time / consumes a token and therefore produces not only the jump A, — A r _ = 1 but also At — A,_ = 1. If X,_ > d, however, an arrival at t corresponding to the upward jump A, — A,_ = 1 is delayed in the leaky bucket and does not count for A,, so A, — A,_ = 0. Instead it is during excursions of X, above the depth level d that jumps of A, occur. Each time the bucket level process drops from a state above d, a token is thought to return to the bucket. Hence if X, — X,- = -1 and X,- > d, then also A, — A,_ = 1. The effect is that during excursions of X, above d, the arrivals at the access node are pushed forward in time compared to those of the original arrival sequence. The long time arrival rate is preserved, A(t)/t » A(t)/t for large t, but the interarrival time distribution changes. Figure 6.3 shows a simulation for the parameter values p — 0.90, d = 9. The graphs of At, A,, and X, are shown with A, (dot-bar line) always below the curve of A, and X< varying around its (asymptotic) mean E(XX) = d, also indicated in the figure. If the leak rate is small (close to X), then A, is more or less the

130

Chapter 6. Traffic Control

Figure 6.4. Leaky bucket policing.

same as the departure stream from the virtual queue X:, i.e., the Poisson process. Similarly, if the depth is too large, the leaky bucket in the long run has no effect since then A, £» A,. For dimensioning purposes one can therefore introduce, for example, the ratio r = d/r as a measure (in units of time) of burst tolerance. The interpretation is that this ratio is the length of time during which the access node tolerates arrivals at a steady pace without the need for enforced delays by the leaky bucket filter. This model in principle requires infinite storage of packets in the bucket. A natural variation of the scheme is therefore to replace the M/M/1 process by an M/M/l/K system for some K = d + b so that the new parameter b corresponds to the maximal buffer space available in the leaky bucket. This has the effect that the shaping mechanism is replaced by policing, as losses are inflicted on the arrival stream in order to conform with access regulation. Packets allowed to enter the leaky bucket are said to be conforming; packets discarded from the system because of over-full buffers are nonconforming. A simulation for the case b = 3 is given in Figure 6.4; again plots of A,, A,, and X, are shown. Some packet losses early in the simulated trajectory are seen to reduce the overall input rate in this case. The M/M/1 model is used here mostly as a convenient framework for discussing the leaky bucket mechanism. In a practical situation arrivals modeled by the Poisson process would probably represent a regular input stream with limited need for shaping. In fact, the effect of the leaky bucket observed in M/M/1 simulations such as in Figure 6.3 is modest. The empirical interarrival time distribution in A is close to the exponential but with slightly smaller variance.

6.2. Access control

6.2.3

131

The generic cell rate algorithm

In broadband traffic systems based on ATM a number of variations of the leaky bucket technique have been implemented in different forms, such as the virtual scheduling algorithm, the generic cell rate algorithm (GCRA), or the token bucket filter. The intention is to enable traffic control of an ATM connection and the set-up of corresponding traffic contracts using rule-based parameters. This term refers to parameters that are understandable by the user, are significant for resource allocation, and are verifiable by the network [8, Chapter 2]. The leaky bucket rate and depth parameters are considered useful in this respect. The GCRA is equivalent to a G/D/1/K version of the model studied in the previous section. Here G stands for a general arrival stream A, given by a sequence a\, ai,..., where a^ is the time at which cell k is observed at the interface. The symbol D stands for deterministic service time, and K represents a finite system. One can imagine a token bucket filling up with tokens regularly at rate r, i.e., one every T = l/r seconds up to a maximum of d tokens. Arriving cells consume one token each and conform with the node as long as there are tokens available or the cell is allowed to queue. More exactly, a characteristic feature of the GCRA is that the decision of whether a cell is conforming is based on its waiting time and not on the available buffer space. In fact, any cell that has waited more than T time units is judged nonconforming and hence discarded. The GCRA(T, T) algorithm in this way relates a time interval T = l/r to a tolerance r which is typically chosen to be T = (d — \)T — (d — l)/r. Hence the maximal buffer space needed is r/T = d — 1 and so the total size of the corresponding G/D/1/K model is K — 2d.

6.2.4

A slotted version of the leaky bucket filter

One discrete-time variation of the leaky bucket policing scheme is the following. Cells that arrive at a node are first collected in a bucket in which there is room for at most d cells. The bucket is emptied one cell per time slot at a rate of r cells every second. Arriving cells that would cause overflow in the bucket (level > d + 1) are lost. Cells admitted through the bucket form a modified input stream, which is directed into the nodal access point. This model is similar to but slightly different from those studied by Schwartz [56, Chapter 4.2]. We analyze the system by introducing Qn = the level in the bucket just before the end of time slot nr «, A,, — number of arrivals in time slot nr «, where we assume that A\, A.I, • • • are independent and also that the arrival probabilities ait = P(An = k), k > 1, are identical for all n. The usual type of argument for deriving buffer dynamics in discrete time models is applicable. The level at the end of slot n equals the level at the end of slot n - 1 minus 1 plus the arrivals during slot n. The boundary points 0 and d must be treated separately. We obtain

132

Chapter 6. Traffic Control

Analysis of the system proceeds by assuming that the system has settled into an equilibrium state, so that if we put nt = P(Qn = k), 0 < k < d, these probabilities are the same for all n. Now take expected values of the above relations to obtain a system of linear equations for the unknown TT^'S in terms of the distribution {a^}, which we consider known. Then solve for :TI in terms of no (and a0), solve for Tti in terms of JTO and it\ (and a$, a\), and so on, using recursion to get TTJ. In principle the solution can be found in this recursive manner and then normalized to obtain ^;=o ni' — 1- One should not expect to find a solution in a closed form. Suppose we have derived a solution {n^}. Then we are in the position to find the throughput in cells per second and the average amount of lost cells per slot. In simple cases this yields explicit expressions for throughput and loss. As an example the specific case where d = 2 and An e Po(A.) is discussed at the end of the section and a further study is left as Exercise 6.3. To begin this program, we observe that by taking expected values on both sides of the identity (6.1) we get

Since, for each n, Qn-\. depends on A\,..., A n _i but not on An and since An is independent of the previous A/s, j < n — 1, we also have that Qn-\ and An are independent; the righthand side thus simplifies into

and moreover, by the equilibrium assumption, into

From this it is seen that the system of (6.1)-(6.3) in a first step gives us

and hence, in equilibrium,

6.2. Access control

133

Since, in addition, we require ^0 TT, = 1, the result is an overdetermined system of equations for {JT,}. By rearranging terms, the first d equations (corresponding to (6.1)-(6.2)) can be written

and the last one (from (6.3))

Assuming we have found an equilibrium solution {TTJ•}, the throughput of the system is simply one cell every slot as long as the system is not empty, which is the case with probability 1 - TTO. Hence, Throughput = r(l — no) cells per second. To understand how many cells are lost in each slot, consider the case Qn~\ — 0. During slot n no cells are lost in this case if An < d, whereas with probability a^+; a total of j cells are nonconforming and lost, j > 1. For 1 < k < d, if Qn-\ — k, then cell loss occurs only if •An > d — k + 2. More precisely, j cells are lost with probability aj-k+j+i, J > 1- Hence

We solve the case where d = 2. The linear equations available for this case,

give the normalized solution

134

Chapter 6. Traffic Control

For example, if A € PO(A), this is

with the properties Throughput = r(l - e~2kf(l - le~ x )) cells per second and

For this model, utilization is the normalized throughput 1 — TTQ cells per slot and traffic intensity is the load per slot E(A) = A. To get the loss probability, divide the expected number of losses by the expected number of arrivals per slot, E(A), yielding

a now familiar formula. Utilization and loss probability for the case d = 2 are shown in the graphs of Figure 6.5.

6.3

Multiaccess modeling

A multiple access scheme is used to increase efficiency of multiaccess media, shared media such as broadcast radio and satellite links, multiuser systems based on ring or bus architectures, etc. Conceptually we have a situation as is shown in Figure 6.6, where a large number of users, with or without buffering equipment, are contending for access to a single transmission channel. Neither the server nor the nodes have complete knowledge of the buffer status. One particular case is the subclass of scheduled access systems based on conflict-free protocols. Any TDM system can be considered conflict free even if there are many users. Most of the models we have encountered so far belong to this category. fn contrast there are random access systems, where a large number of users are expected to generate low-average traffic loads while having no means of coordinating their transmission attempts directly with each other. Traffic under such a scheme is subject to not only delay due to propagation, transmission, and buffering but also delay due to contention. In fact, as soon as two or more users happen to transmit in the same time slot or

6.3. Multiaccess modeling

135

Figure 6.5. Utilization and loss for slotted-time leaky bucket.

Figure 6.6. Multiaccess contention. in a scenario of continuous time, in overlapping time intervals, a collision occurs. This has to be resolved. A worst possible case of random access systems occurs if there is complete lack of knowledge and users attempt to send cells blindly. Collisions are managed either via a retransmission protocol or a collision resolution protocol. The classical example for the first case is that packets collide on a satellite link. The corresponding bandwidth is wasted, the packets involved in the collision are left corrupted, and transmission is rescheduled for a later time. This is called the Aloha model. There are several alternatives for selecting the retransmission time. Using slotted time this could be done either uniformly over the next K time slots or at the next slot with a given retransmission probability q. Standard ethernet LANs use versions of these protocols. In 802.3 ethernet the retransmission time is uniform over K slots, where the parameter K is dynamic. For each collision experienced by the station it doubles the value of K, an algorithm known as binary exponential back-off. Another important contention protocol is carrier sensing multiple access with collision detection, known as CSMA/CD in LAN ethernet. The basic

Chapter 6. Traffic Control

136

idea for the other category of protocols based on collision resolution is to let the stations directly involved in the collision engage in a further contention procedure, after which their packets, now suffering some delay, have all been transmitted. We begin the detailed study of collision models by providing an analysis of the Aloha model based on the idea of diffusion approximations. In a later section the same approach is extended to cover slotted CSMA and CSMA/CD protocols. Most theoretical studies of these models apply approximations to the effect of infinitely many users and an averaged offered load at an early stage of the analysis; see, e.g., the presentations of Kleinrock [28], Bertsekas and Gallager [5], or Saadawi, Ammar, and El Hakeem [53]. Our study is closer to that of Nelson [40], [41]. The features of CSMA and CSMA/CD on which we base our analysis of these models are as described by Meditch and Lea [35]. See also Polydoros and Silvester [50]. Some aspects of the analysis suggested in this text are new; in particular, we obtain throughput as functions of the actual load rather than some fictitious offered load including retransmissions. Subsequently we analyze the simplest prototype for collision resolution algorithms— binary splitting with blocked access. This means that when a collision occurs, the link stays blocked from access for other users until the stations involved have resolved the collision and sent their packets. Conceptually this situation is closer to the reliable data transfer models of section 3.3.4; consequently our model is again based on the renewal reward framework. Mathy s and Flaj olet [34] provided a detailed study of the distributions of the random lengths of collision resolution intervals under various protocol assumptions. However, we have not found closed expressions for the throughput in these models in the literature. 6.3.1

The slotted Aloha Markov chain

Slotted Aloha is one of the simplest retransmission models, described as follows: Slotted time. m users contend for a single transmission channel. A packet appears at each user input with probability p, each slot. If two or more users attempt to send packets in the same slot, the channel remains idle, the packets are returned for retransmission, and the users become backlogged. Backlogged users attempt to retransmit with probability q, each slot. Usual independence assumptions. No new arrivals from backlogged users. A user is thus called backlogged from the time slot at which it suffered a collision until the time slot at which the delayed packet is successfully transmitted. During any other time slot the user is free to contend. We introduce Kn = number of backlogged users in slot n, Zn = number of attempts to retransmit packets in slot n, N n = number of new packets arriving to the link from free users in

137

6.3. Multiaccess modeling which are seen to satisfy the relation

The problem is to find Throughput = expected fraction of successfully transmitted packets per slot, which we interpret as the limit

The key observation is that, given Kn, we have

and, still conditional on Kn, these two random variables are independent. Now we study the slotted Aloha Markov chain. Figure 6.7 shows a simulation of 10 independent realizations of n) for fixed parameter(K values m, p, and q over 100,000 time slots (m = 300, pm — 0.35, qm — 10). Initially all users were free. It is seen that after a period of time, varying considerably from one station to another, during which typically only a small proportion of users are retransmitting, th transmission link fills up abruptly with backlogged users. At the end of the simulation run nine Aloha systems have stopped working properly and one is still functioning.

6.3.2

Diffusion approximation approach

The simulation suggests that the discrete Markov chain n could be approximated by a continuous-time process; this leads to the idea of considering the diffusion approximation scaling

Here 0 < Kn/m < 1 is the proportion of backlogged nodes, i.e., a space rescaling of the original sequence, and x,(m) is the result after an additional time rescaling. This approach attempts to capture the typical behavior of Kn but represented by a possibly simpler process. Thus we must investigate if the sequence of scaled random processes x(m) has a limit process, or limit function, as the number of users grows to infinity,

We note that if such a limit can be found, then, in equilibrium, Throughput ~ P(Z + N = 1) ~ E(N~) ~ p(m - E(K)) ~ X(l - x x).

138

Chapter 6. Traffic Control

Figure 6.7. Simulation of the Aloha Markov chain.

To investigate [x, , t > 0} we compute its drift and quadratic variation functions. Put

and

By

hence

6.3.

Multiaccess modeling

139

Therefore, changing to scaled parameters A = mp, a = mq, and state variable x = k/rn for some integer 0 < k < m.

and we obtain the limiting drift function n(x) denned on 0 < x < 1 by taking the limit m —> oo,

Similarly, a quadratic variation function a2(x) defined on the unit interval [0, 1] is obtained bv taking the limit

Theorem 6.1. The sequence {x(/n\t > 0) converges as m -> oo to a (deterministic) function x,, namely, the solution of the ordinary differential equation given initial condition XQ, where /x(-) is the limit drift function in (6.6) and XQ is an initial value of the asymptotic fraction ofbacklogged users. Figure 6.8 visualizes the function fj.(x), 0 < x < 1, for some parameter values X and a. We return to the original question: What is the throughput for a slotted Aloha system? Consider the curve jc,, t > 0. What happens as t —> oo? For some values of or, e.g., a = 4, there is for any A a unique limit x^ regardless of XQ, namely, the unique stationary point such that /xOcnc) = 0. This value can be interpreted as the equilibrium backlog and we obtain from earlier considerations that the equilibrium throughput of cells is 1(1 — .TOO). Now we can understand how backlog and throughput vary with the offered load A for the model. Observe that x^ is some (complicated) function of A. Hence backlog and throughput are implicitly given functions of A, which can be computed numerically. For a — 4, plots are shown in Figure 6.9. It is interesting to see that the throughput increases more or less linearly with load until a certain regime, where the backlog quickly increases and the performance of the link drastically deteriorates. For larger values of a, such as in Figure 6.8, where a — 5, the situation is more complicated. There are three stationary points x^. < x^ < JT.+,, such that x, —>• jt.J, if JG > JCcnt and xt —> x^ if XQ < xcr\t, where xcr|t is a critical initial value. Moreover, a hysteresis effect can be observed; see Figure 6.10. For an interpretation of these observations suppose a random access system of this sort that normally operates under an offered load of

Chapter 6. Traffic Control

140

Figure 6.8. Limit drift functions for Aloha Markov chain. A. = 0.3 was subject to an increase in load to, say, k = 0.45. The backlog would increase and the throughput would decline, and to restore the system it would not be enough to bring the load to previous values. In fact, the backlog would follow the upper curve and thus clog the system until its load had been forced well below A = 0.2, perhaps corresponding to the need for a complete restart of the system.

6.3.3

Remark on stochastic differential equation approximation

We have seen that as m -> oo, the limit function xt is deterministic. However, for a

but finite m one can argue that the behavior of x, is given approximately by the stoch differential equation (SDE)

Here {W t , t > 0} represents Brownian motion discussed in section 5.3.3. The interpretation is that the process jct(m) varies around the function x, due to random noise generated by W, weighted with respect to the state dependent variance function CT^(JC) & a2(x)/m. Such a process is in fact a continuous-time Markov process and can be shown to possess a well-defined stationary distribution with density function

Figure 6.11 shows three such functions for a = 6 as A. passes the critical congestion region.

6.3. Multiaccess modeling

141

Figure 6.9. Aloha model equilibrium backlog (upper graph) and throughput (lower graph).

6.3.4

CSMA and CSMA/CD

The next subject is the performance analysis of slotted, nonpersistent CSMA and CSMA/CD channels. As in the previous subsection, m independent users share a common transmission link. Following Meditch and Lea [35], time is measured in minislots and data packets are of fixed length T + 1 > 1 minislots. This is because for transmission the packets require T minislots to be placed onto the channel and one minislot to clear the channel due to propagation delay. Each user is able to hold only one such packet in its buffer. As before, a user is either free or backlogged (blocked). In a given minislot each free user receives and transmits a packet with probability p and each backlogged user attempts to retransmit with probability q. Free users involved in a collision become blocked; backlogged users remain blocked. The purpose of the carrier sensing mechanism is to reduce the number of collisions due to simultaneous transmission attempts from two or more users in the same minislot. Any such attempt, either from an active free user or a backlogged one, starts with a sensing phase. If the channel is sensed idle, each user attempting to transmit initiates placing its packet onto the link. Such a minislot where a packet transmission starts is called a firing minislot. If only a single user is involved, the transmission is successful. If not, the transmission is a failure, due to collision. In both cases the channel will be busy during the next T minislots. Free users sensing the channel during a busy minislot change state from free to backlogged. The transmission attempt is put on hold until the next minislot, where again each user makes a decision according to the CSMA protocol.

Chapter 6. Traffic Control

142

Figure 6.10. Hysteresis effect for large a. Backlog (upper) and throughput (lower). The additional collision detection feature in CSMA/CD aims at reducing the bandwidth wasted by the channel over the T minislots following a firing minislot while packets are being cleared after a collision. Collision detection is included in the model by introducing a further integer parameter R, 0 < R < T, and assuming that if a collision occurs in a firing minislot, then the link can detect the collision and abort transmission after only R minislots rather than T, as in CSMA. Obviously the number of blocked users recorded in sequence from one minislot to the next does not form a Markov chain. The remedy is either to extend the state space with information on the remaining length of the transmission period or, more conveniently in this case, to identify an embedded Markov chain. Define Kn = number of backlogged users at the end of transmission period n, n>\. In addition, define sequences Nn = number of free users transmitting in firing minislot n, Zn = number of retransmission attempts in firing minislot n, Mn = number of free users sensing the link during transmission period n. The resulting Lindley equation takes the form

where again, given Kn,

6.3. Multiaccess modeling

143

Figure 6.11. Equilibrium backlog densities. Moreover, Kn determines the distribution of the vector (7V,Hi, Mn+!), which establ that (Kn) is indeed a Markov chain. We remark at this point that by allowing T = R = 0 our set-up also covers the slotted Aloha model. For the case T = R = 0, transmissions are handled in the same firing minislot as they are initiated, a minislot is the same as an Aloha slot, the sensing has no effect on performance, the quantity n vanishes identically, and Kn is the slotted AlohaM Markov chain. As a consequence the description of what occurs during a minislot differs slightly from the model in Meditch and Lea [35]; these modifications seem to be harmless. To continue analyzing the CSMA and CSMA/CD models the following alternative representation for Kn is used. Let Xn = number of free users not sensing the channel during transmission period n. Then To identify the distribution of n+\ we begin with pure CSMA, i.e., T = R > 0. First, theX total number of blocked users at the end of firing minislot n is n + Nn+i. Second, we noteK that a free user remains free at the end of the transmission period only by not sensing the channel during T successive minislots. Hence the conditional distribution of Xn+\ given Knis Now include CSMA/CD with parameter R < T. For a free user to remain free during a transmission period, it suffices to avoid sensing the channel during T successive minislots if firing was successful and to avoid sensing the channel during R minislots if a collision

144

Chapter 6. Traffic Control

occurred. Hence putting

the more general relation

follows. In particular,

To apply the diffusion approximation and evaluate performance of the carrier sensing and collision detection protocols compared to Aloha, it still remains to synchronize the time scales. The channel capacity in CSMA/CD is 1 packet per T +1 minislots, whereas in Aloha capacity is measured in packets per slot. Hence to preserve the Aloha traffic intensity \p per user, we replace the probabilities p and q by p/(l + T) and q/(l + T) and investigate the approximating scheme

The strategy is now parallel to the case of Aloha. The limit function x, = limm^oo jc}m) satisfies a nonlinear ODE and the limiting equilibrium backlog Xoo is found by analyzing stationary solutions. As stated earlier the equilibrium backlog yields the equilibrium throughput A(l — xx), since all packets allowed to enter the system must also pass through the system. To find the drift function /u-(x), note that

and

The equilibrium probability r(x) = limmH>00 rm(xm) is therefore

Moreover,

6.3. Multiaccess modeling

145

Figure 6.12. Throughput CSMA, T = 0, 1, 3, 5, 10 (left to right). Hence,

Now we are prepared to calculate the throughput. Fix a. For a range of A. values (normally less than a) solve /J.(x) = 0 to obtain equilibrium backlog x and throug Ml — *«>)• Figure 6.12 shows the result for CSMA with a — 4 and packet length varying from T = 0 (Aloha case) to T = 10. It is seen that the sudden drop in throughput at a critical load, which is characteristic for Aloha, disappears quickly with increasing T. However, the maximum throughput that can be reached in CSMA declines with increasing T. This is in contrast to previous studies of CSMA, where it is argued that carrier sensing can improve performance drastically; see, e.g., Kleinrock [28, Chapter 5.12]. In those studies throughput is typically obtained as a function of a fictitious internal load consisting of both arrivals and retransmissions, an approach that seems to be misleading. To explain further the inconsistency we note that the present study takes into consideration not only the improvements of CSMA compared to Aloha but also a drawback of the earner sensing mechanism. The main advantage is that the vulnerable period is smaller in CSMA in the sense that the relative length of the firing minislot to packet length is smaller. The disadvantage is that backlog builds up faster in CSMA. Even if retransmission attempts during the transmission period of length T minislots do not generate additional collisions, they do generate additional backlog.

146

Chapter 6. Traffic Control

Figure 6.13. CSMA/CD equilibrium throughput and backlog, R = 0, T = 0, 3, 5.

Figure 6.14. CSMA/CD equilibrium throughput and backlog, R = 0,1,3,5, T = 5.

The effect of collision detection is shown in Figure 6.13. Again, throughput and backlog as functions of A. are shown for a = 4, taking R = 0 and T = 0, 1, 5. Finally, to see the variation in R for fixed T, Figure 6.14 shows load-throughput for T = 5 and R = 0,1, 3, 5, again with Aloha as a reference.

6.3. Multiaccess modeling

6.3.5

147

A collision resolution algorithm

Collision resolution algorithms were designed to avoid the unstable character in contention protocols that often results from retransmission strategies, of which we saw examples above. We consider the so-called binary splitting algorithm with blocked access and binary feedback information. Under this scheme, if a collision occurs the link stays blocked until the stations involved have resolved the collision and successfully transmitted the colliding packets. We start from the same basic assumptions as in the Aloha model. There are m independent users contending for a single channel, time is discretized in slots, and each user attempts to reserve the link bandwidth during a slot with probability p. The number of active users per slot is therefore Bin(m, p) distributed. If at the beginning of a given slot only one user makes a transmission request, this single packet also is delivered during the same slot. If no request is made, the corresponding time slot is wasted. If two or more users request to transmit in the same slot, a collision occurs. In this case the remaining users not involved in the collision become blocked, unable to transmit arriving packets, while the collision is resolved. The procedure is repeated as soon as each colliding packet has been delivered. It is clear from this description that the delay has two components: the collision resolution interval (CRI), which consists of the time slots it takes for the colliding stations to agree on a transmission schedule, and one slot transmission time for each packet that collides. The total delay is the time it takes to actually deliver the collided packets onto the link. We say that the link is idle at time k if at the end of slot k — 1 there are no previously collided and undelivered packets waiting for transmission. Thus in the beginning of slot k all m users are free to contend for the available bandwidth during slot k. Let each time point when the link is idle mark the beginning of a new cycle. Put n = number of packets transmitted during cycle n,A Yn = the length of cycle «, in number of slots. The system starts afresh each time the link is idle. Because of this the sequence An e Bin(m, p) is independent. We have

where Ln is the length of the CRI, which eventually, if An > 2, takes place during cycle n. The CRI is determined in a binary (or more generally

Modeling in Broadband Communications

Systems

SIAM Monographs on Mathematical Modeling and Computation

Editor-in-Chief Joseph E. Flaherty Rensselaer Polytechnic Institute

About the Series

Editorial Board

In 1997, SIAM began a new series on mathematical modeling and computation. Books in the series develop a focused topic from its genesis to the current state of the art; these books

Ivo Babuska University of Texas at Austin

present modern mathematical developments with direct applications in science and engineering;

H. Thomas Banks North Carolina State University

describe mathematical issues arising in modern applications;

Margaret Cheney Rensselaer Polytechnic Institute

develop mathematical models of topical physical, chemical, or biological systems; present new and efficient computational tools and techniques that have direct applications in science and engineering; and illustrate the continuing, integrated roles of mathematical, scientific, and computational investigation. Although sophisticated ideas are presented, the writing style is popular rather than formal. Texts are intended to be read by audiences with little more than a bachelor's degree in mathematics or engineering. Thus, they are suitable for use in graduate mathematics, science, and engineering courses. By design, the material is multidisciplinary. As such, we hope to foster cooperation and collaboration between mathematicians, computer scientists, engineers, and scientists. This is a difficult task because different terminology is used for the same concept in different disciplines. Nevertheless, we believe we have been successful and hope that you enjoy the texts in the series. Joseph E. Flaherty Ingemar Kaj, Stochastic Modeling in Broadband Communications Systems Peter Salamon, Paolo Sibani, and Richard Frost, Facts, Conjectures, and Improvements for Simulated Annealing Lyn C. Thomas, David 8. Edelman, and Jonathan N. Crook, Credit Scoring and Its Applications Frank Natterer and Frank Wiibbeling, Mathematical Methods in Image Reconstruction Per Christian Hansen, Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion Michael Criebel, Thomas Dornseifer, and Tilman Neunhoeffer, Numerical Simulation in Fluid Dynamics: A Practical Introduction Khosrow Chadan, David Colton, Lassi Paivarinta, and William Rundell, An Introduction to Inverse Scattering and Inverse Spectral Problems Charles K. Chui, Wavelets: A Mathematical Tool for Signal Analysis

Paul Davis Worcester Polytechnic Institute Stephen H. Davis Northwestern University Jack). Dongarra University of Tennessee at Knoxville and Oak Ridge National Laboratory Christoph Hoffmann Purdue University George M. Homsy Stanford University Joseph B. Keller Stanford University J. Tinsley Oden University of Texas at Austin James Sethian University of California at Berkeley Barna A. Szabo Washington University

Stochastic Modeling in Broadband Communications Systems Ingemar Kaj Uppsala University Uppsala, Sweden

SIHJIL Society for Industrial and Applied Mathematics Philadelphia

Copyright © 2002 by the Society for Industrial and Applied Mathematics. 10987654321 All rights reserved. Printed in the United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688. The PING Utility Library is a public domain software package developed by Mark Lindner and is distributed under the terms of the GNU Lesser General Public License. The package can be freely downloaded from http://www.dystance.net/ping Academy Award is a registered trademark of the Academy of Motion Picture Arts and Sciences. Library of Congress Cataloging-in-Publication Data Kaj, Ingemar. Stochastic modeling in broadband communications systems / Ingemar Kaj. p. cm. — (SIAM monographs on mathematical modeling and computation) Includes bibliographical references and index. ISBN 0-89871-519-9 1. Broadband communication systems—Mathematical models. 2. Stochastic analysis. I. Title. II. Series. TK5103.4 .K35 2002 621.382-dc21

SlflJTL. is a registered trademark.

2002029186

Contents Preface

ix

Notation and Notions from Probability Theory

xiii

1

Introduction 1.1 A brief introduction to networking concepts 1.2 Modeling aspects of general networks 1.3 Broadband traffic characteristics 1.4 Three introductory examples 1.4.1 Simple collision model 1.4.2 Basic arrivals process 1.4.3 Periodic streams 1.5 Exercises

1 1 2 4 9 9 13 17 18

2

Markov Service Systems 2.1 Discrete-time service systems 2.2 Arrival and service rates, continuous time 2.3 Ideas of stationarity and equilibrium states 2.4 Balance equations, slotted time 2.5 Balance equations, continuous time 2.6 Jackson networks 2.7 Markov loss systems 2.8 Delay analysis in Markov systems 2.8.1 Delay in M/M/1 2.8.2 A client-server Jackson network 2.9 Exercises

21 21 24 28 30 32 35 38 39 39 41 43

3

Non-Markov Systems 3.1 Performance measures 3.2 Integrated processes and time averages 3.3 Some ideas from renewal theory 3.3.1 Renewal reward processes 3.3.2 Renewal rate and on-off processes 3.3.3 Hand-off termination probability

45 45 47 50 51 51 53

V

vi

Contents

3.4 3.5

3.6 3.7

3.3.4 Reliable data transfer The loss and delay time balance 3.4.1 Little's formula The M/G/1 system 3.5.1 Simple examples leading to non-Markovity 3.5.2 Pollaczek-Khinchin formulas 3.5.3 Lindley recursion for M/G/1 3.5.4 The M/G/1 virtual waiting time distribution 3.5.5 Heavy traffic limit in M/G/1 3.5.6 Deterministic service times, M/D/1 The M/G/oo model Exercises

54 57 58 60 60 62 65 66 68 70 72 73

4

Cell-Switching Models 4.1 m x m crossbar 4.1.1 Output loss crossbar 4.1.2 Output queuing with a shared buffer 4.1.3 Input buffer blocking 4.1.4 Input blocking, loss system 4.2 Exercises

5

Cell and Burst Scale Traffic Models 5.1 Cell-level traffic 5.1.1 Isochron multiplexing 5.1.2 Voice packet streams in Internet telephony 5.1.3 Round-trip time distribution, PING data 5.1.4 Packet fragmentation in video communications 5.2 Burst-level rate models 5.2.1 Anick-Mitra-Sondhi model 5.2.2 Markov modulated Poisson process 5.3 Long-range dependence traffic models 5.3.1 Self-similarity 5.3.2 Heavy-tailed rate models 5.3.3 Fractional Brownian motion 5.3.4 Statistical methods 5.4 Exercises

93 93 94 96 100 103 106 106 108 109 Ill 113 114 118 121

6

Traffic Control 6.1 Admission control 6.1.1 Effective bandwidth 6.1.2 Statistical multiplexing gain 6.2 Access control 6.2.1 Leaky bucket systems 6.2.2 The M/M/1 leaky bucket 6.2.3 The generic cell rate algorithm 6.2.4 A slotted version of the leaky bucket filter

123 123 124 125 127 127 128 131 131

77 77 78 80 82 86 90

Contents 6.3

6.4

6.5

vii Multiaccess modeling 6.3.1 The slotted Aloha Markov chain 6.3.2 Diffusion approximation approach 6.3.3 Remark on stochastic differential equation approximation 6.3.4 CSMA and CSMA/CD 6.3.5 A collision resolution algorithm Congestion control 6.4.1 A controlled Aloha network 6.4.2 Window control 6.4.3 Modeling TCP window size 6.4.4 TCP window dynamics 6.4.5 Meanfield approximation of interacting TCP sources . . . Exercises

134 136 137 140 141 147 149 149 150 152 153 159 162

Bibliography

167

Index

173

This page intentionally left blank

Preface This text is intended for students in mathematics, applied mathematics, and stochastics who have an interest in network modeling and for students in computer science and related areas with an open view toward mathematical models. The material also will be useful for many practitioners in the computer communications or telecommunications industry who use probabilistic models and methods. Mathematical methods based on the theory of stochastic processes have long been used effectively in telephone traffic modeling. The original telephone traffic models developed and published (1909-1927) by the Danish mathematician A. K. Erlang formed the theoretical framework for planning and dimensioning the growing telephone networks for decades to come. The work of Erlang at a telephone company in Copenhagen must in fact be considered among the single most successful theories in the history of applied mathematics. Not until the development of the emerging techniques in high capacity communication systems has it become clear that Erlang's legacy has reached its limits. Even basic mathematical traffic modeling requires a wider ranging selection of ideas and techniques. The inherent structure of modern network traffic, which is distinctly different from traditional voice traffic, generates challenging mathematical and statistical problems. Industry acknowledges the need for mathematical competence in this area, judging from the growth of conferences, academia-industry cooperative projects, and recruitment to industry-based research departments. This book covers material suitable for final-year undergraduate students to the Ph.D. level in mathematics, probability and statistics, computer science, and computer engineering. The selection of topics depends on the reader's background and interest. The reader is expected to have basic knowledge of calculus and probability, including random variables, probability distributions, and expected values. A brief introduction to networking concepts is included. The presentation of the main material covers a variety of models and situations ranging over different time scales of calls, bursts, and cells and over different protocol layers for transport, control, and applications. The mechanisms of queuing, collisions, delay, and loss appear in various forms, and the effects of buffering, retransmission, multiplexing, and traffic control are studied. Typically, the end result is some form of load-throughput analysis. The common theme is that all models are formulated in terms of appropriate stochastic quantities and the main mathematical tools are those of equilibrium Markov chain theory, renewal theory, and asymptotic limit results. The classical Markov queuing systems and more general single-server systems are covered as starting points and reference systems. The reader will find relatively simple stochastic models for more realistic networking problems such as reliable data transfer ix

x

Preface

protocols, the forced termination problem in cellular networks, space division switching, Internet telephony traffic, leaky bucket filters, the ethemet local area network protocols carrier-sensing multiple access (CSMA) and CSMA with collision detection (CSMA/CD), collision resolution protocols, and the window dynamics in transmission control protocol. Moreover, the reader will find material related to arrival process modeling, which is designed for network traffic exhibiting long-range dependence and self-similarity. This includes statistical methods and approximation by means of fractional Brownian motion. The selection of topics should serve as a background from which those who have an interest in the area will be able to continue in one or another direction. As several topics touch on or intersect with current research, the text could serve as a basis for independent investigations. It is my hope that the chosen level of mathematical rigor is acceptable for most readers. The purpose of this text is to give an overview of stochastic models and mathematical techniques based on stochastic processes for application in the fields of telecommunications and computer communication networks. The presentation introduces readers with various backgrounds and training in mathematics to a number of useful techniques in traffic modeling. The intended audience consists, on one hand, of students and professionals in telecommunications and computer engineering with an interest in using applied mathematics, in particular stochastics, to improve their understanding of communications systems. On the other hand, it is written for the purpose of introducing students and professionals in probability and applied mathematics to a huge area of interesting problems and models arising from today's accelerating developments in broadband channel transmission systems. Given this twofold purpose, the text should be concise and based on sound mathematical reasoning yet be accessible for an audience unwilling to spend more than a fair share of time on mathematical detail and generality. The approach suggested here to serve this objective is to rely on the language and concepts of random variables and stochastics and the strength in intuitive reasoning they provide. Main ideas and notions are introduced and discussed along with specific network applications, and most calculations are motivated and carried out in detail. Probabilistic arguments are preferred to analytical ones—for example, we have chosen not to use moment-generating functions. We state and use general results from the theory of Markov chains and renewal processes, but for detailed proofs and systematic treatment, readers should consult existing textbooks on mathematical queuing theory, such as Kleinrock [28], Asmussen [3J, Wolff [69], and Bremaud [7]. Basic calculus and probability as prerequisites should be enough as a starting point. Some of the models we discuss require mathematically more advanced material, such as diffusion approximations and nonlinear differential equations. However, we provide introductions and emphasize intuitive probabilistic arguments. A number of illustrations with graphs of real data or model simulations are given for clarity. Exercises are provided at the end of each chapter partly to promote the idea of using the book within a course. Some exercises support training in probabilistic calculus, some study variations of models discussed in the main text, and additional comprehensive exercises could be used as course assignments. Chapter 1 contains a brief introduction to basic concepts in networking and communication systems and also to the nature of real traffic data. For readers with limited background in probability theory we provide a summary of notions and distributions in elementary probability and discuss introductory examples, including a short presentation of

Preface

xi

the Poisson process. Chapter 2 gives a summary and introduction to Markov chain theory in discrete and continuous time, focusing on equilibrium properties. Elements of queuing, loss, and delay are covered as well as the Jackson network of Markov service systems. In Chapter 3 we begin with performance measures and study load-versus-throughput relationships. The main objective is to discuss non-Markov dynamics and study various modeling techniques, including renewal processes, renewal rate processes, and on-off processes, and to cover standard material in queuing theory, such as Little's formula and the M/G/1 model. Specific applications include forced termination of a mobile phone and reliable data transfer protocols. Chapter 4 is devoted to the study of the simplest loss and contention protocols in packet switching. The basic example is an m x n crossbar switch, where packets arriving on m input lines are randomly switched onto n outputs. This results in either loss or buffer delay due to contention for output lines. A particular artifact in some of these models is the so-called head-of-line blocking phenomenon. Chapter 5 addresses traffic modeling relevant for cell and burst time scales. We begin with specific models for isochronous cell streams, Internet telephony, and a fragmentation procedure for video communications. Then we turn to the multiplexing of independent sources over a joint transmission channel—for example, the Anick-Mitra-Sondhi model of supeipositioned on-off sources. An important finding from recent research in this area shows that the addition of distributions with heavy tails to this class of models will result in long-range dependence. We discuss the related idea of self-similarity and give an account on the topic of approximating, in a certain sense, network traffic using the continuous selfsimilar process known as fractional Brownian motion. A section on data analysis includes statistical methods for identifying heavy tails. In Chapter 6 we apply a number of techniques and models to the study of various aspects of traffic control. As part of admission control we study the topics of effective bandwidth and statistical multiplexing gain. Access control includes several versions of the leaky bucket mechanism. Multiaccess control from a mathematical perspective involves modeling the retransmission mechanisms in contention protocols. We introduce the ideas using principles of the simplest Aloha-net and generalize to the ethernet protocols CSMA and CSMA/CD. Contention protocols based on collision-resolution algorithms are modeled using rather different methods. Finally, we treat congestion control of Internet traffic in a detailed study of the transmission control protocol (TCP) and Internet protocol (IP) window dynamics scheme. Obviously many topics of interest have been omitted from our presentation or are touched on only briefly. Some areas would have required a background of rather sophisticated mathematics, such as the theory of large deviations, which has found key applications in, for example, large-scale asymptotics of buffer overflow probabilities [57], and matrix-analytic methods leading, for example, to numerical schemes for calculating loss probabilities.

Acknowledgments This book was developed partly on the basis of lecture notes for courses given over several years. A main impulse was the opportunity to spend one term at Carleton University, Ottawa, and give a joint graduate course for students at the De-

xii

Preface

partment of Systems and Computer Engineering and the Department of Mathematics and Statistics. I am most grateful to Amit Bose at the Laboratory for Research in Statistics and Probability for taking the initiative for this project and for his continued support and cooperation. Thanks to further support from loannis Lambadaris and Michael Devetsikiotis at the Broadband Networks Laboratory, Carleton University, I was provided with excellent working conditions in an inspiring research environment. It is my pleasure to thank Gunnar Karlsson, Department of Microelectronics and Information Technology, Royal Institute of Technology, Kista, and Mats Rudemo, Chalmers University of Technology, Gothenburg, for similar opportunities to lecture graduate courses for other groups of advanced students. Many students in these and other courses influenced the content and style of the book and gave valuable input and inspiration for selecting topics and problems. Among these are Tarkan Taralp, Matthias Falkner, Mattias Ostergren, and Anders Andersson. Some have become coworkers and are directly involved in research reported in the book: Raimundas Gaigalas, Jb'rgen Olsen, and Ian Marsh. I am grateful to Evsey Morozov, Petrozavodsk University, for carefully reading the manuscript and providing many useful comments. Special thanks are due to Ian Marsh, Swedish Institute of Computer Science, for reading the manuscript in great detail and for numerous comments that improved both language and content. During the writing of this book I met my wife, Olga. Her support and love have been invaluable.

I.K.

Notation and Notions from Probability Theory It is assumed that the reader is familiar with the basic ideas of elementary probability theory, in particular the concepts of stochastic variables, probability distributions, and expected values. For the reader's convenience and to introduce notation used throughout the text, some of these notions are recapitulated here, including a listing of standard distributions. In addition, this section contains some terminology and lists a few properties related to conditional expectations, Markov chains, and convergence of random variables. For detailed accounts the reader should consult such textbooks as Gut [16]. Distribution function: F(x) — P(X < x'). Quantile: The number xa that satisfies F(xa) = 1 - a. Discrete random variable: X is discrete if it assumes a finite or countable number of values x\, X2, • . • with probabilities p(x\), p(xz),..., where p(x) is the probability function for X. Continuous random variable: X is continuous if it assumes all values in an interval according to a density function f ( x ) , 1.

F(x) is continuous for all x,

2. F'(x) = f ( x ) for all x where the derivative exists, and h 3. P(a < X < ab) f(x)dx. =f

Joint distributions: F(xl,...,xr) = P(Xi <x},...,Xr <xr) p(xi, ...,xr) = P(Xi = xi,...,Xr = xr) /(*,,...,*,) = £...£F(*,,...,*,). Expected value:

xiii

xiv

Notation

Variance: V(X) = a2 = E((X - ,u,)2). Standard deviation: D(X) = a = ^V(X). Covariance: Cov(X, Y) = E((X - ^)(Y - ny)). Correlation coefficient: Independence: Two random variables X and Y are said to be independent if

This implies E(XY) = E ( X ) E ( Y ) , hence Cov(X, F) = 0, in which case X and Y are said to be uncorrelated. The following are the most common discrete distributions. Binomial distribution: X e Bin(«, p) if p(k) = (£) pk (1 - p)"~k, 00,

Notation

xv

Exponential distribution: X e Exp(a) if X e T(l, a), f ( x ) = ae~"\ x > 0,

Normal distribution: X e N(m,a) if f(jc) = expected value and a2 the variance.

where u — m is the

ForN(0, 1) the distribution function is written $(*), the density (p(x), and the quantiles za. Slightly more advanced material includes conditional probabilities and conditional expectations; we list some relations particularly useful for calculations: Conditional mean: E(X) = E(E(X\Y)). Conditional variance: V(X) = E(V(X\Y)) + V(E(X\Y)). Conditional covariance: Cov(X, Y) — E(Cov(X, Y\Z)) + Cov(£(X|Z), E ( Y \ Z ) ) . Stochastic sums: Let {Xf} be independent identically distributed random variables, let N be an integer valued random variable independent of (or a stopping time for) {X,-}, and put Y = ^'i=l X,. Then E(Y) = V(Y) = E(N)V(X)

E(N)E(X), +

E(X)2V(N).

Markov chains are sequences of random variables X\, Xi, • • . , in which the future outcome Xn+i depends on the present variable Xn but is independent of the way in which the present state arose from its predecessors X\,..., X,,-\. Markov chain property: A sequence (Xn)n>\ of random variables is a Markov chain if for all « and x\. ..., xn+\,

Markov process: A process { X , , t > 0} in continuous time with discrete states is a Markov process if for any given trajectory {xr, 0 < r < t} of states and arbitrary .v < t,

Several convergence concepts are used in probability theory. We state four of them, of which the most important for the applications in this book are convergence almost surely and convergence in distribution. Let (Xn)n>i be a sequence of random variables. Convergence almost surely: (Xn)n>\ converges almost surely (a.s.) to a random variable X, Xn U4' X if

xvi

Notation

Convergence in probability:

n)n>\ converges in probability to X, Xn -> X if

foral

Convergence in L2: n)n>i converges in L2 to X, Xn —>• X if

Convergence in distribution: (X n )«>i converges in distribution to X, Xn ->• X if

where C(F) is the set of continuity points of the distribution function F of X .

Chapter 1

Introduction

1.1

A brief introduction to networking concepts

Many excellent sources are available for those without professional training in computer science who would like to understand the basic principles of communication networks. Computer Networks by Tannenbaum [61] is a classic text. TCP/IP Illustrated [60] has practical focus, and Computer Networking: A Top-Down Approach Featuring the Internet by Kurose and Ross [30] is a source covering the most recent developments. Also useful are introductory chapters of books in the category of more mathematical presentations, the classic example being Kleinrock's [28]. More recent books include An Introduction to Broadband Networks by Acampora [1], Fundamentals of Telecommunication Networks by Saadawi, Ammar, and El Hakeem [53], and High-Performance Communication Networks by Walrand and Varaiya [65]. From such sources one can get an idea of the evolution of communication networks; the period up to 1950 is dominated by the wide spread of telephony, after which it is natural to distinguish four phases. The first phase, 1950-1975, still represents voice traffic, now transmitted over digital channels. This era was triggered by the pulse code modulation (PCM) technique, in whic a voice signal in the frequency band 300-3400 Hz is sampled 8000 times per second and each time is coded into one of 28 = 256 levels. The coded signal therefore requires 8 x 8000 = 64,000 binary digits every second, which equals a transmission capacity of 64 Kbiiys. Various standards were developed for multiplexing signals from many sources over the same transmission medium. In North America and Japan a time division multiplexing (TDM) method called the Tl digital system was introduced. In this system 24 PCM voice signals are transmitted using the basic slot time of 0.125 ms but are assembled into a frame consisting of 24 x 8 voice bits plus 1 control bit per slot. This equals 8000 frames of 193 bits per second giving 1.544 Mbit/s, which is referred to as Tl or 1.5 Mbit/s traffic. In Europe, the CCITT standard, now known as the ITU-T (Telecommunication Standardization Sector of the International Telecommunication Union), defined 30 PCM voices requiring 240 bits, plus two extra channels of 16 bits used for signalling and synchronization, turned into a frame. Consequently this requires 2.048 Mbit/s and we obtain 2Mbit/s traffic or simply a 1

2

Chapter 1. Introduction

"2M voice channel." The second phase marks the entrance of computer networks and packet switching in contrast to the circuit-switched voice traffic. The Arpanet and other early systems were able to link host computers and dial-up terminal users. In 1976 the X.25 protocol for packet switching was established, and in 1978 the International Organization for Standardization (ISO) reference model with a seven-layer framework of protocols was defined. Other key technologies were the local area network (LAN) protocol ethernet and the first packetswitched radio network Aloha-net. The goal of integrating digital traffic sources, for example, voice, data, video, and images, in common transmission mediums such as optical fiber, characterizes the third phase in the 1980s. The acronym ISDN (integrated services digital network) is sometimes prefixed by N for narrow band and B for broadband, where the latter typically refers to asynchronous transfer mode (ATM) based cell-switching techniques. Some of the TDM systems developed for broadband traffic are, in North America, the synchronous optical network SONET based on capacity 51.84 Mbit/s or multiples, and in Europe, synchronous digital hierarchy (SDH), working with 155.52 Mbit/s or multiples, up to and over speeds of 48 x 51.84 = 16 x 155.52 = 2.4 Gbit/s. The fourth phase saw the growth in the 1990s of the World Wide Web, mainly due to the HTTP protocol and the Mosaic browser from National Center for Supercomputing Applications (NCSA), and the commercialization of the Internet. Virtually all computer network traffic is carried on the Internet through TCP/IP, and, quoting Kurose and Ross [30], "Although today the majority of voice traffic is carried over the telephone networks, networking equipment manufacturers and telephone company operators are currently preparing for a major migration to Internet technology." Adjunct to this fourth phase is the accelerated development of wireless networks and the preparations for mobile networks, in particular for the mobile Internet network based on radio access and optical fibers.

1.2

Modeling aspects of general networks

We begin by attempting to distinguish some general features of communication networks and to identify where stochastic modeling issues may be used. Figure 1.1 shows schematically a network surrounded by imaginary terminals, which could be telephones, home computers, or subscriber clients. They request various services from the network via an access net. Once the terminal's requests for service are admitted to the network, access transport follows, which includes multiplexing or concentration of data streams and connection to the trunk net. The trunk net consists of multiplexed channels of varying capacity based on, say, the SDH protocol or plesiochron digital hierarchy (PDH), for transport from one node to another. From the point of view of the network server, network management is crucial to guarantee reliability, transparency, and quality of service (QoS) in terms of sufficient bandwidth, control on bit errors, delay time, etc. Within the network, transportation occurs in different transfer modes; the traditional circuit mode of voice signals remains the major part of many networks. In packet mode, or frame mode, data packets of variable size containing the actual information bits along with addresses and other signaling information pass various stages of the transmission. Yet more specific is cell mode transfer, where all information is stored in equal-size units called cells, the basic example being ATM traffic.

1.2. Modeling aspects of general networks

3

Figure 1.1. Simple network model.

To be more concrete we can think of a public switched telephone network (PSTN) with a T2 system, which means that four Tl channels are multiplexed into a capacity of 6.312 Mbit/s, serving telephone terminals requesting access on available voice links, each requiring 64 Kbit/s. Transport between nodes (telephone stations) is based on circuit switching so that all connections are open throughout the call whether or not any data is to be sent. For comparison, consider the network model in Figure 1.1 as a host server computer accessed by users for transmission of computer data sets. An X.25 system, or the more modern frame relay protocol, both using packet switching, handles the transport phase. Arriving data are disassembled, packed into smaller units with address labels, transported over a communications system, and reassembled at the destination node. A third variant is that the network is a B-ISDN with a hybrid system of circuit and packet switching trunk nets, where a multitude of users is constantly challenging the access net with a mixture of requests. The user is the provider of telephone services rather than an individual subscriber, and a call is a bandwidth request, varying with the current demand on that provider. The PSTN system carries, in a sense, continuous traffic, and the packet switched system carries discrete traffic. In the first case bit errors are acceptable because they only add noise, whereas delays are much more disturbing and should be avoided. In the second case we have the reverse situation in which a single error may destroy a computer file being downloaded, whereas some time delay until the completion of a service is acceptable. We now take a closer look at what human speech might look like in digital communications. Based on the PCM picture it seems reasonable to think of a voice signal built of talk periods filled with binary digits equal to 1 and silence periods during which only digits equal to 0 are transmitted. A typical signal is shown in Figure 1.2, where the vertical lines during talk periods indicate the discrete time slots. Since the length of a slot is only 125 microseconds, however, the signal may be approximated by a continuous time curve, Z,, t > 0, which is 0 during the silent periods and 1 otherwise. Indeed, it has been suggested based on empirical measurements that a typical human voice over a phone is composed of such alternative speech periods of length 0.6 to 1.8 seconds and silent periods of length 0.4 to 1.2 seconds. A basic technique for constructing a mathematical model for a situation like this is to let a sequence of random variables S\, $2, . . . with a given distribution describe the successive silence periods and to let a sequence T\,Ti,..., characterized by another distribution describe the speech periods. The corresponding curve Z, is a randomly varying stochastic

Chapter 1. Introduction

4

Figure 1.2. PCM voice. process in continuous time, which from time to time jumps from 0 to 1 or vice versa. The next step is to impose sufficiently strong assumptions on the model to enable the derivation of useful information, while under the same assumptions the mathematical model still retains some of the nature of the considered situation. A typical assumption, which will be used many times elsewhere, is that the sequences of random variables {Si} and {Tf} are statistically independent and have finite means. Then [Zt, t > 0} is called an alternating renewal process for which a rich theory is available. It is not difficult, however, to point out problems with such assumptions even in this simple example. It is rather likely, for example, that periods of speech or silence nearby in time could influence one another and thus violate their independence. Such objections can be raised for any of the situations and techniques considered here. We will sometimes ignore them and where appropriate discuss possible alternatives. To continue the example, let us estimate how much of the total capacity is used for the transmission of speech. We assume that the expected lengths of talk and silence periods are £(7") = 0.8 seconds and E(S) = 1.2 seconds, respectively. During each silence-talk cycle a proportion E(T)/(E(T) + E(S)) = 0.4 of time is spent in state 1, i.e., a talk state. Over a long period this is the fraction of the total capacity used for the speech periods. If we multiplex 24 such sources in TDM fashion on a Tl link, then with 24 ongoing calls, approximately 1.5 Mbit x 0.6 = 900,000 bits every second are used to transmit nothing!

1.3

Broadband traffic characteristics

The objective of this section is to gain some insight into the qualitative nature of typical traffic streams in a communication network. It is sometimes natural from the point of view of mathematical modeling to consider a hierarchy of time scales. It has been proposed that three such levels, or scales, are enough for most purposes, and we refer to them as call scale, burst scale, and cell scale in order of shorter time units. We discuss this topic with reference to Figure 1.3 and by using an example. Suppose we wish to transmit a sequence of 100 X-ray images on a fast ethernet link that has a capacity of 100 Mbit/s, and each image is approximately 10 Mbits large. Let us assume each photo is divided into 10,000 packets, each 1000 bits in size, thus requiring a slot time of 10 £is. On the cell scale level consider a synchronous stream of packets sent one every 10th slot, so the transmission rate is one photo per second. However, it is unlikely that this degree of link efficiency can be maintained over an extended period. On the next scale, the burst scale, which is typically on the order of seconds rather than milli- or microseconds,

1.3. Broadband traffic characteristics

5

Figure 1.3. Time scale hierarchy.

we may assume that the photos (bursts) arrive on average one every 10 seconds, resulting in a burst intensity of 10 Mbit/s. A possible model for the arrival times of each photo would be the Poisson process with an intensity of 0.1 images per second. Note that within bursts the cell scale model is still appropriate. Similarly, suppose such sequences of images are sent regularly with interarrival times of 10,000 seconds. One sequence can be thought of as a call of length 1000 seconds, and hence on the last level the process of calls has a rate of 1 Mbit/s. This example illustrates the important concepts of peak rate and mean rate. On the call level the traffic peak rate is 1 Mbit/s, whereas the mean rate is 100 Kbit/s. On the burst scale the peak rate is 10 Mbit/s and the mean rate is 1 Mbit/s. On the cell scale the peak rate coincides with the maximal capacity of 100 Mbit/s and the mean rate is 10 Mbit/s. It is clear from this that the appropriate time scales need to be identified for stating system performance criteria. Consider, for example, blocking probabilities at a specific node. It may be acceptable that on average one of 100 calls is lost due to congestion at the node. Perhaps acceptable quality demands would limit the risk to 10~4 of losing a burst, whereas maximum blocking probability for cells should be maintained at, for example, 10~6. From the viewpoint of the broadband network server, the total traffic load is the superposition, on each time scale, of traffic streams from many sources. We bring in some terminology from the ATM technique to discuss this further. See Onvural [44] and Saito [54] for detailed accounts. An ATM cell carries 48 bytes of data and 5 additional octets for labels and control; hence its size is 53 x 8 = 424 bits. The transmission time per cell in an ATM switch equipped with links of capacity C = 155 Mbit/s is 424/C % 2.74 x 10~6 « 3/us. On the cell level all traffic in ATM is broken down into such regular streams. To get an idea

6

Chapter 1. Introduction

Figure 1.4. ATM time scale hierarchy.

of bursts in the ATM situation one can think of a large number of on-off sources (such as the PCM voice model in section 1.2) that are added to each other and sorted according to various priority classes and traffic classes, such as audio, video of constant bit rate, and video of variable bit rate. The proposed ATM standard supports five traffic classes. Finally, the analog of calls in ATM is the set-up of virtual connections, or virtual paths; see Figure 1.4. The character of network traffic is highly unpredictable and changes quickly with the emergence of new applications, and with no certainty can we describe the nature of the dominating volumes of future network traffic. Despite this, we now consider a few empirical examples of traffic streams. The first data set is a trace of approximately 20 minutes of low-intensity ethemet traffic in the LAN of the MIC campus, Uppsala University (123,902 packets). Figure 1.5 shows the distribution of packet sizes in the data trace. Clearly there is a random variation in the data set, but there is a distinct character with three main peaks visible. The lower and upper peaks correspond to the minimal and maximal packet sizes allowed under the ethernet protocol, and the middle peak, around 576 bytes, shows the fraction of packets that were formatted as TCP segments. The interarrival times in the same data turn out to be much more dispersed. These are the successive time gaps between the arrivals of two packets. To obtain a readable graphics output, it is appropriate to compress the data on the x-axis. A histogram of the logarithms of the interarrival times is given in Figure 1.6. Some apparently machine-generated features in the data are visible along with the random character emphasized by the fact that several users contribute to the same trace. Figure 1.7 shows a plot of the counting process for the sequence of arriving packets, i.e., the process in continuous time with a jump of size one at each arrival time point. The large variation in the data set is manifest in the deviations from a linear increase clearly visible even on the time scale of the order of minutes. Finally, in Figure 1.8 the same data set has been turned into an arrival rate sequence by plotting the number of arrivals falling in consecutive time intervals of given length A. In this graph A = 1 sec. Statistical analysis of extensive ethernet data performed at Bellcore and first published in 1994 lead to the important finding that such data show characteristics of long-range dependence and

1.3. Broadband traffic characteristics

7

Figure 1.5. Ethernet packet size distribution.

self-similarity. These concepts are discussed in subsection 5.3. For an introduction, see, e.g., Willinger and Paxson [67]. The next example of empirical data refers to encoded video motion pictures. A widely used coding algorithm is the Moving Picture Experts Group (MPEG) standard encoding scheme. Variable bit rate (VBR) video coding tries to provide constant viewing quality. Basically, the video data stream is sorted into frames at a rate of 25 frames per second. There are three types of frames—I, P, and B. An I-frame is a full coded frame, and P- and B-frames are updates designed to reduce both the spatial and the temporal redundancy in the signal. The frames are arranged in periodic sequences, e.g., IBBPBBPBBPBB, forming a group of pictures (GoP). Figure 1.9 shows a graph of the sizes measured in bits of the successively arriving I-frames representing the GoPs in a trace of 17 minutes of a BBC newscast. The size distribution in this case varies modestly. The corresponding histogram in Figure 1.10 suggests that even a normal distribution could be used for crude modeling. As a further example of similar data, Figure 1.11 shows a graph of the VBR in the coded video-trace of the motion picture Last Action Hero (Columbia Tristar, 1993). The trace is 2.6 hours long and the graph shows the size of I-frames measured in bytes for approximately 234,000 frames. The third example is a trace of a voice call over an Internet telephony system from Argentina to Sweden. The voice signal is transmitted in 160 bytes packets with one packet sent every 20 ms. En route to their destination the packets go through stages of buffering and interaction with Internet cross traffic, causing random delay variations. In addition, packets may catch up with strongly delayed packets ahead in the packet train and have to adapt to a

8

Chapter 1. Introduction

Figure 1.6. Ethernet interarrival time, logarithmic scale.

slower pace. As a result, the time intervals between arriving packets are sometimes shorter than 20 ms and sometimes longer. Figure 1.12 shows a histogram for the interarrival time data of a call where the quiet periods during which the transmitting caller is silent listening to the other party have been suppressed from the data. The transmitting caller speaks for about 110 seconds, which corresponds to 5500 packets. The peak close to zero represents the number of overpassing packets that arrive together with the packet ahead in line. The final example illustrated in Figure 1.13 shows round-trip times (RTT) for a se quence of 2000 test packets sent one per second during business hours return-trip from the server www.math.uu.se to www.ericsson.se (11 packets were lost). The data were obtained using the software PING. The sample mean of the measurements is 154.82 ms with substantial fluctuations in the range 46.7 ms to 679.6 ms. The median is 115.5 ms. RTT measurements show huge variation depending on network loads, but the data in this example seems to be typical. Even if a normal RTT level is established over a period of transmission, there will be spikes corresponding to large delays. In Figure 1.13 the typical level is around 100 ms and the intensity of spikes is high. Despite the random variations in RTT data it is common practice in much modeling work to assign a fixed value to the RTT and consider it to be a constant model parameter. At several occasions we follow this practice of ignoring the RTT fluctuations, except in section 5.1.3, where we discuss data such as in Figure 1.13 and introduce an explanatory model.

1.4. Three introductory examples

9

Figure 1.7. Ethernet arrival count process.

1.4

Three introductory examples

The following examples introduce basic techniques and notation from probability theory and illustrate performance evaluation in simple cases.

1.4.1

Simple collision model

In a multiuser system it is important to assign to each user reasonable bandwidth to avoid extensive losses and delays. One source of losses in a packet-based system arises from attempts to send several messages in a given time slot during which only a single message can be accommodated and transmitted. Typically each user involved in such a collision is affected and faces a delay due to the retransmission of the message at a later time. Such situations are studied in detail in section 6.3. As a first simple illustration we consider a system of two users attempting to transmit messages in a slotted time fashion over a common transmission channel. Suppose the two users attempt to send messages with probabilities p\ and p2, independently in each time slot and independently of each other. As long as at most one user is sending in a given time slot the transmission is considered successful. If both users attempt to send in the same slot a collision occurs and both messages are lost. For simplicity we ignore retransmissions and just count the number of losses to get a measure on the performance of a system like this.

10

Chapter 1. Introduction

Figure 1.8. Ethernet arrival rate process. Introduce for k > I

Uk =

1 0

if user 1 attempts to send in slot nr fc, else,

Vk =

1 0

if user 2 attempts to send in slot nr k, else

with P(Uk = 1) = pi and P(Vk = 1) = P2- Moreover, put Nn = number of successful transmissions during n slots

and Kn = number of lost messages during n slots The expected values are found to be

1.4. Three introductory examples

11

Figure 1.9. BBC news, I-frame arrival rate.

and we also note that Nn + Kn = total number of attempted messages = with E(Nn + Kn} = n(p\ + p2}. Furthermore, the law of large numbers for sums of independent identically distributed (i.i.d.) random variables applies (in its strong form), resulting in the asymptotic (almost sure) limits valid as n —> oo,

These limits have natural interpretations in terms of some basic performance measures. They will be studied later, but to complete the example we illustrate them briefly as follows. The offered load to the system, or the total load to which the service system is exposed, is in our case the limiting number of requests the transmission channel is receiving per time unit regarding the transmittal of a message over the link, whether successful or not. Hence offered load:

number of messages attempted up to time n n

Chapter 1. Introduction

12

Figure 1.10. BBC news, frame size distribution.

The throughput depends on the rate at which the system can cope with the offered load, and it is measured as the average work completed by the server per time unit. Hence we count the number of successful messages over n time units, divide by n, and let n ->• oo to get throughput:

number of successfully transmitted messages up to slot n n

in units of messages per slot. The related term utilization is normally used as a measure of the rate of activity of the server. In this case the natural measure is the average time during which the link is busy. What is the loss probability of messages in this system? In the long run we have number of lost messages number of attempted messages which again is a consequence of the law of large numbers. On the other hand, if we consider a single slot (say, k = 1) we may argue that the loss probability is the conditional probability of both users attempting to send packets given that at least one of them attempted to transmit, in other words, the probability

1.4. Three introductory examples

13

Figure 1.11. Frame sizes, MPEG video encoding. However, this is the probability of an event causing the loss of two messages. It should therefore be multiplied by a factor of two, yielding a measure of losses in agreement to the one previously obtained. 1.4.2

Basic arrivals process

In just about any situation of setting up a mathematical model for studying network traffic, it is crucial that external arrivals of packets, frames, calls, etc. are subject to relevant and appropriate mathematical assumptions. It seems inevitable, however, that such assumptions sometimes pertain to strongly idealized conditions, hence models that account for only the most basic features of the arrival streams. At several places in this book we discuss arrival stream modeling designed to cover dependence structures and correlation. We also study periodicities and the effect of superpositioning in multiuser systems, aiming to achieve better agreement with empirical data. Still it is fair to say that the most important model for arrival traffic is the Poisson process with its fundamental assumption of constant traffic intensity. Next we give an introduction to the Poisson process, mainly for the reader with limited experience with basic probability models. We start with the intention of modeling a continuous time process N,, t > 0, with A/o = 0 such that N, = number of arrivals in the interval [0, /]

Chapter 1. Introduction

14

Figure 1.12. Histogram for voice over IP interarrival times. with the reasonable assumption that arrivals should occur (a) randomly in time, and (b) independently of each other. Simplest model. For fixed / suppose there is either one or no arrival in [0, t] and that pure chance decides which of these two alternatives occurs. LetX = number of arrivals in [0, t], and put For (a) to occur in this trivial model, it is reasonable to make the assumption

whereas (b) is not applicable. Binary model. Now assume that an arrival is possible in [0, t/2], as well as in [t/2, /], so that if we put X] = number of arrivals in [0, t/2] and X2 = number of arrivals in [t/2, t],

then the assumption P(Xi = 1) = P(X2 = 1) = A.f/2, where 0 < A. < 2/f is a constant, includes (a), whereas (b) is included by assuming that X[ and X2 are independent random variables. The number S2 = Xi + X2 = total number of arrivals in [0, t] is distributed as the number of successful attempts of two trials performed independently with a probability of success X t / 2 each time; hence according to the binomial distribution

S2 eBin(2, Xt/2).

1.4. Three introductory examples

15

Figure 1.13. RTT measurements. n-stepmodel. TheextensiontothecaseofnindependentrandomvariablesXi,..., Xn, each Bin(l, Xt/ri) distributed, is straightforward. Then

suggesting that the arrival count N, at time t would be given by a limiting quantity Sx of S,, as n -> oo. Since

this approach, known as the Poisson approximation of the binomial distribution, leads to

or, in short, N, e Po(A.f) for each t > 0. Furthermore, by a refined but similar argument, it follows for any t > s > 0 that the successive arrival increments Ns and Nt — Ns are independent random variables with stationary Poisson distributions

16

Chapter 1. Introduction

Figure 1.14. Twenty sample paths of Poisson process, A, = 1. These properties are known to characterize the Poisson process with intensity X. Alternative view. Here we indicate some aspects of the more dynamical approach to counting processes, which distinguishes the Poisson process as one of the basic continuous time, discrete state, Markov jump processes. This approach highlights the notion of the parameter A as being the infinitesimal intensity of Nt in the sense of the relation P(arrival in (t, t + h\) = P(Nt+h -Nt = \} = Xh + o(h),

h -> 0,

where o(h) denotes a remainder term, varying from one instance of occurrence to another, with the property that lim^o o(h)/h = 0. Together with the assumption P(at least two arrivals in (t, t + h]) = P(Nt+h - Nt > 2) = o(h), and using the simpler notation pk(t) = P(Nt = &), this leads, for small h, to pk(t + h) = pk-i(t)(M + o(h)) + pk(t}(\ - A/i - o(h}} + o(h), hence and therefore

This system of differential equations can be solved directly in a recursive manner applying an integrating factor to each equation. An alternative method is to turn the system into an equivalent equation for the generating functions gt(u) = X^/tlo ukPk(t}> \u\ < I , and identify its solution as the generating function E(uN') = g-^1-") of the Poisson distribution.

1.4. Three introductory examples

17

Of course a third method is to verify directly that pk(t) = e~^'(kt)k/k\,k > 0, is the unique solution to the given system of equations. Note that

—- = number of arrivals per time unit with E ( N t / t ) = A, and thus

In this sense (L2-convergence), N,/t converges toward A. as t —> oo. Convergence in P probability, N,/t —> A, follows from the Chebyshev inequality

valid for each e > 0. Even the strongest mode of almost sure convergence, N,/t -—* A, is true in this case as a result of the Poisson process strong law of large numbers. In any case, we have justified the interpretation of the intensity A as the long time arrival rate. Let T0 = 0 and, for k > 1, UK — time between arrivals k — I and k, In particular

and therefore that is, C/i e Exp(A). More generally, the interarrival times ((4)i>i are i.i.d., each having the exponential distribution with mean E(Ti) = I/A. Consequently the arrival epochs 7i have the Gamma distribution

This property, which is discussed in Exercise 1.5, makes it simple to simulate trajectories of the Poisson process. See Figure 1.14.

1.4.3

Periodic streams

Consider periodic streams of packets, each packet of length t Kbit arriving every r ms to a buffered node for access onto a link of capacity c Mbit/s. A total of m such streams are multiplexed into the same node, with the characteristic feature of the system being that the phases of the separate streams are unknown. A packet always uses the full link capacity during transmission, forcing packets from other streams to wait in the buffer while the link is busy.

18

Chapter 1. Introduction

Lacking evidence of any other arrival pattern, we naturally assume that the m arrival times from separate streams during the periodic cycle are uniformly distributed. More explicitly, since each packet requires t/c ms for its transmission, occupying the fraction l/cr < 1 of the total time, we can model the arrival times of the streams within a cycle by means of m independent Re(0,1) distributed random variables U\,..., Um, and define for cycle time 0 < t < r N(t) = number of packets being transmitted or waiting in buffer at time t

Clearly a buffer size of (m — 1) is enough to avoid losses, and the maximal delay is restricted to (m - \)tjc ms, which occurs in the worst-case situation that all streams are synchronized. Moreover

which gives

and where we assume the system has settled in a steady state. To see the limitations of the analysis of this example we only have to replace the constant size packets by variable size packet streams, which may still be periodic if we make appropriate assumptions on the packet size distribution. In principle we may now need infinite buffers (if a packet has to wait for another packet in its own stream) and the delay is also potentially large due to arrivals during the service of very large packets.

1.5

Exercises

1.1 Suppose a random trial either succeeds, with probability p, or fails, with probability 1 — p. After independent repeats of such a trial, put N = number of successes from n trials, M = number of attempts until first successful trial. What are the possible outcomes of N and Ml Write down the probability functions P(N = k) and P (M = k), and verify their normalization by summing over the relevant ^'s. Calculate the expected values and the variances of the random variables N and M. 1.2 Let M and N be independent random variables that are both geometrically distributed with the same parameter p. Find the probability P(M < N).

1.5. Exercises

19

1.3 Using the simple collision model of section 1.4.1, consider two users sending equalsize 1024 octet messages over a 2Mbit/s transmission channel, with the natural slot length being given by the message transmission time. The first user attempts to send with probability 0.05 and the second with probability 0.10 in each slot. Find the offered load, the throughput, the utilization, and the loss probability. 1.4 Consider again the collision model of section 1.4.1 with two users attempting to transmit messages over a common channel. They send in each time slot with probabilities p\ and p-2, respectively, with both messages being lost in the case of a collision. The number of successfully transmitted messages in n slots is denoted by Nn, the number of lost messages by Kn. Find the variances of Nn and Kn and also find the covariance Cov(Nn, Kn). 1.5 LetN,,t > 0, denote a Poisson process with intensity A, > 0 and associated interarrival times [Uk}k>\, and put Tn — Y."k=l Uk. Check that the events [Tn < t} and {Nt > n} are identical and hence that P(Tn < t) = P(N, > n). By differentiation find the probability densities of the arrival times T,,, n > 1. Verify that the relations Nt — maxjn : T,, < t) and N, — min{« : Tn+\ > t} also are valid. 1.6 Suppose m independent Poisson processes with intensities A,, (' = 1, m, start at time t = 0. Let 7"_ denote the time of occurrence of the first event in any of the m processes and 7_ denote the first time when at least one event has occurred in each process. Find the density functions of 7~_ and T+. 1.7 A much simplified model of an ATM switch has two input and two output ports. Arrivals occur independently in any given time slot and in each of the input ports. With probability p an ATM cell arrives at the input, and with probability 1 — p no cell arrives. Each arriving cell is transferred during the same slot to one of the outputs, chosen randomly with equal probabilities. The capacity of the output ports is limited to at most one cell during the slot. Should two cells be switched to the same output, only one exits, whereas the other is delayed in a buffer. Let X be the number of arriving cells and Y the number of exiting cells in one slot. Find the correlation coefficient px. Y between X and Y . 1.8 During transmission of a binary signal it is known that bit-errors occur independently with probability 5 x 1 0 I0 per digit. The available bandwidth is 1 Mbit/s. Compute or estimate (a) the probability of managing a one-hour-long transmission without errors, (b) the probability that at least 15 of 80 one-hour transmissions are performed errorfree. 1.9 A sending unit transmits a binary signal Xi, X ? , . . . of zeros and ones, where the sequence is independent and P(X, = 1) = 1 — P(X-, = 0) = p for each /. After digital-to-analog conversion, and because of disturbances during the transmission, the sequence Y], K 2 , . . . is received, where Yj = X; + Z; and Z, € N(0, a). The noise variables Z\, ZT. . . . are independent of each other and of the original signal. The receiver interprets the signal F, as binary digit 1 if Y, > 1/2 and digit 0 otherwise.

20

Chapter 1. Introduction (a) Find p\ = P(Yi > 1/2) as a function of p for the cases a — 1/5 and a — 1/3. (b) Estimate a value of p if 162 of 424 observations at the receiver are digit 1 and the rest digit 0 and the noise parameter is set to a — 1/3.

Chapter 2

Markov Service Systems

The Markov property is a restrictive assumption. Still, many Markovian models are highly relevant for traffic modeling. Referring to the general discussion on time scale analysis in section 1.2, a Markov model could well capture random variations occurring in a burst or call scale modeling situation. As we will see, such methods should not be excluded even on a cell level, such as in ATM switches. Of particular importance is the class of continuous-time, time-homogeneous, Markov birth-and-death processes, again despite the strong assumptions put on such processes by the Markov property. They clearly provide useful insights into modeling of service systems although their relative simplicity and mathematical tractability may not warrant applicability without caution. On the other hand, indisputable facts are that they belong to, or even form, the historical core of the subject of traffic modeling and that progress on more general models often presupposes a fair understanding of this class. In this text, Markov models in discrete time are of almost the same importance as those in continuous time. This may be in contrast to other presentations of traffic modeling where slotted, or discrete, time models arise mostly in the form of embedded Markov chains of more general non-Markov continuous time processes as they are sampled at certain random times. This is a central topic that appears in section 3.5.3, but we have chosen to also present several models of switching, multiaccess methods, and so forth, specifically in slotted time mode since this seems to be more natural and convenient. The next sections describe some basic Markovian models, starting with selected aspects of slotted time systems and turning to continuous-time classical queuing service systems. We emphasize the ideas of equilibrium steady states and calculating performance measures. The reader should be alert to the severe restrictions we work under and keep in mind, for example, that such a case as the example in section 1.4.3 is not covered by the techniques given in this section.

2.1

Discrete-time service systems

Imagine cells of size t bits arriving at a buffered nodal point for transmission on a link of capacity c bits per second, so that the time slot required for transmission of one cell is t/c 21

22

Chapter 2. Markov Service Systems

seconds. Suppose that the time line is divided into half-open intervals (l(n - \)/c, in/c], n > 1. If there is at least one cell present in the system at the beginning of a slot, then a single cell is transmitted during the same slot, leaving excess cells stored in a buffer awaiting transmission in the slots that follow. To formalize, let Xn = number of cells arriving at the node during slot number «, An = X^=i %k — total number of arrivals in n slots, Qn = number of cells in the buffer (queue) at end of slot n, and Nn = number of cells in the system (in buffer or being transmitted), slot n. Here all quantities are indexed for n > 1. In addition, appropriate initial conditions should be specified, typically Q0 = 0 or NO = 0 if the system starts from an empty state. Example 1. Suppose m users are connected to a transmission node. Each user generates independently in every slot a cell with probability p and remains silent with probability \ — p. Then for each n, Xn e Bin(m, p) and An e Bin(nm, p). The scaling p = y/m yields An e Bin(nm, y/m) ~ Po(yn) for large m; compare this with section 1.4.2.

Now,

which summarizes to

or, equivalently, Other useful relations are

On the other hand, if we choose the system size as a state variable, then, analogous to (2.2),

Assuming, for example, that

then the Markov property

2.1. Discrete-time service systems

23

of the sequence Nn, n > 1, follows directly; similarly for Qn, n > 1. A basic recursion like (2.1) for the buffer size in this simple model is called a Lindley equation. It turns out that equations of this form play a prominent role in much of classical queuing theory, a major reason being the link they provide between the distribution of the sequence in question, in our case {(?„), and the distribution of the maximum for a related random walk. We consider this next. Put £„ — Xn — 1 and

The mentioned link between [Qn] and the sequence {5n}n>o, which is a random walk with integer jump sizes in [—1,0, 1,...], requires the independence assumption of (2.6) and is given by

To prove the distributional identity (2.7) we start by demonstrating the identity

Indeed, note that (2.8) is true for n = 1 and suppose for the sake of a proof by induction that it holds for a given integer n. Then

which establishes the next instance of the identity and hence the validity of (2.8). It is noteworthy that until now assumption (2.6) was not used. Invoking it at this point yields

which concludes the derivation of (2.7). Finally in this subsection we mention the well-founded objection against the present model in that it covers only constant service times. It is the varying number of equally sized cells per slot that generates random fluctuations in the buffer whereas the service procedure simply consists of transmitting one cell per slot as long as there are cells present. To a certain degree the restriction can be circumvented, allowing for a wider interpretation. Suppose that cells have random sizes L\ and replace Xn above by J^i=\ ^i, the total amount of work arriving during slot n. For comparison suppose that the L;'s are multiples of i and that transmission occurs at the constant rate i bits per second. Then the same model applies with Qn being an integer-valued sequence representing current workload, the remaining work left to do for the transmission node if no further cells arrived.

24

2.2

Chapter 2. Markov Service Systems

Arrival and service rates, continuous time

To formally study service systems in continuous time, we introduce the following notation, relevant not only for Markovian situations but generally as well: A, = arrivals process (number of jobs, calls, requests, cells, packets,...), B, = departure process, T\, TI, . • . = interarrival times, Si, 82, • • • = successive service times, Nt = number of jobs in the system = At — B,, Q, — number of jobs in buffer (queue), Mt = number of jobs being served, Nt = Qt + Mt. A further general notion is that of average arrival rate, which, if this limit exists, is defined as We have in mind single-server systems, m-server systems with m parallel service stations, and infinite-server systems, where any job is immediately assigned a server on arrival. In the situation with a finite number of servers and in the case where no server is available, the arrivals are stored in a buffer. The queue is emptied at the rate at which service capacity is again freed up due to departures from the system and according to a given set of rules. The simplest such rule is the first-in-first-out (FIFO) principle, also called first-come-first-served (FCFS), which ranks priority of service higher the earlier the arrival time. The service times Sk normally refer to the times needed for complete service of a given arrival before exit from the system. A classical Markovian service model is given by a set of birth parameters {A.n}n>o and a set of death parameters {//,„ }„>]. Here \k = intensity of an arrival at time t if N, — k and fik = intensity of end-of-service, hence system departure, at time t if N, = k. These interpretations are consistent with the infinitesimal notions

Since the intensities A.^ and ju^ are independent of t, the exponential distribution being the only continuous distribution lacking memory is bound to appear. The basic expression of this property for the exponential distribution is that the family of remaining waiting times until the next jump has the same distributions as the waiting time itself. In fact, if S e Exp(a), then for eacht

2.2. Arrival and service rates, continuous time

25

is independent of s. It turns out that to construct the process Nt one can proceed as follows. Given NQ = k, let Nt remain on the level k for a random time which is E\p(Xk+ /i*). Then j ump to k +1 or k -1 with probabilities A.* / (A.* + /A* ) and /A* / (X* + A 4 *), respectively. Repeat these steps by selecting at each new level independent waiting times that are exponentially distributed with the appropriate intensities. Equivalently, if N, — k, the remaining time until the next arrival is Exp(Ai) and the remaining time until the next completion of a service interval is Exp(/i^). Observe that these two random times are independent; hence their minimum is exponentially distributed with intensity A.J. + /^. Compare Exercise 1.6. This minimum time is the remaining time until the next jump of Nt, in agreement with the construction described above. Of central importance is the special case when A.^ = A. for all k > 0. Considering the embedded counting process, At, of upward jumps only, we see that P(A,+h - A, = 1) = P(Nt+h - N, = l\N, = k, for some ft) = Xh + o(h), and similarly P(At+h — A, > 2) = o(h); in other words, we see that A, is the Poisson process. Other examples are finite buffer, at most K jobs in system. discouraged arrivals; arrival rate slows down with system size. Turning to the service rates /j,^ we have the following simple examples: the single server model with service time distribution Exp(^t) which gives at any time t such that A', > 0 constant intensity IJL for a downward jump in N,. the m -server model where each server operates . , , , . , , ,, , • independently of the others for a random time with distribution Exp(At). By combining arrival and service rates appropriately we obtain a list of some of the classical queuing models, shown in Table 2.1. The crucial tool that allows for the derivation of performance measures and comparative studies of Markov birth-and-death processes such as the models given in Table 2.1 is the equilibrium steady state analysis that we discuss in the next section. This is no reason, however, to ignore completely the exact analysis of the distribution of the process Nt, and as a bare minimum we provide a short discussion of Kolmogorov's backward equations on which the transient, or time-dependent, analysis of such processes is based. In analogy with the Poisson process studied in section 1.4.2, and again using the notation p/t(0 = P(N, = k ) , we have, for small h,

hence

26

Chapter 2. Markov Service Systems Table 2.1. Markov queuing systems in Kendall notation. M/M/1: M/M/m: M/M/oo: M/M/l/K: M/M/m/m:

Figure 2.1. Ten sample paths of "M/M/oo, A./M = 20. and therefore

The existence of a unique solution {pk(t), t > 0}^>0 to this system corresponds to a system size process Nt,t > 0, being well defined. In general this requires certain restrictions on the parameters kk and 11%, excluding such cases where, for example, Nt would tend to infinity at a finite (random) time. No simple necessary and sufficient conditions are known but it is well known that if there are constants a and b such that Xk + MA < a + bk for all k > 0, then a unique solution exists. Clearly this criteria suffices for the models in Table 2.1. It is the exception rather than the rule that the system of equations in (2.11) can be solved explicitly, or that a representation of the solution exists which is of practical use.

2.2. Arrival and service rates, continuous time

27

Figure 2.2. System size o/M/M/1, initially empty, X/n = 0.9. One such exception is the M/M/oo model, for which the probabilities pk(t) can be found in a tractable form; see Figure 2.1. Indeed, let X* = A. and ^ — ilk for all k > 0. It is rather straightforward to verify that the resulting system of equations

equipped with the initial conditions pn(0) = 1, /?t(0) = 0, k > 1, has the solution

As a consequence, for each fixed t we have

It is perhaps surprising that the simple choice of parameters At = 1 and ^ = /x for the M/M/1 model yield as a result for the state probabilities the unwieldy expression

where lk are the modified Bessel functions

28

Chapter 2. Markov Service Systems

Figure 2.3. Sample paths of critical M/M/1, A. = /j,. A further, reassuring, fact is that despite its formidable character the formula for pk(t) greatly simplifies in the asymptotic limit as time tends to infinity. As a matter of fact, if A.///, < l,then/7t(f) —> (\—X/^)(\/n,)k,t ->• oo, a key result for the steady state analysis to follow. Some simulated trajectories of the M/M/1 system starting from idle are shown in Figure 2.2 for the subcritical case A.//Z < 1 and in Figure 2.3 for the case A. = ju.. In each case five independent realizations are indicated.

2.3

Ideas of stationarity and equilibrium states

A part of a network traffic system such as an access control point or a local transmission node, even with a random character of the traffic streams involved, should ideally work in a relatively stable manner, at least over relevant time scales. By this we mean that apart from more predictable changes in intensity, say, daily or seasonal, the variations and fluctuations inherent in packet delays, buffer occupation levels, etc., occur in such a way that dimension and design of the system could be chosen to guarantee, typically, acceptable quality of the service tasks the system is set to fulfill. Observed by a system operator, the network node would appear to be functioning in a randomly varying but still steady state. Eventually, in a greater perspective, the wider area network of which our system was a part would then also work in an equilibrium manner. If we accept that features of real systems can be captured by such stochastic models we have begun to study, the natural interpretation for a real system in equilibrium is that the distributions of the random quantities used in the model do not change over time. There are essentially two methods for implementing this reasoning into the stochastic modeling. To illustrate them we recall our main examples so far: the buffer sizes Qn in section 2.1 and the system size process Nt in section 2.2.

2.3. Ideas of stationarity and equilibrium states

29

First, it is natural to expect that for appropriate values of any parameters that are part of the model, and after an initial stage of adaptation, the distributions would settle down over time, for instance, P(Qn = k) » P(Qn+\ = k) = • • • or even P(Qn — k) — P(Qn+\ = k) = • • -, for n sufficiently large independent of the initial state QQ. In principle the settling down into a stage of stationary evolution over time could run indefinitely. Hence, mathematically the central notions are those of limit distributions and of limits in distribution of random sequences and processes as the time parameter tends to infinity. Supposing that regardless of the initial distribution of QQ the limiting probabilities

exist, a random variable Qx with P(QX = k) = n^ can be associated with the sequence { Q n } . Then the sequence is said to converge in distribution to Qx and {n^} is said to be the limit distribution of the system. Similarly, if

then {jik} is the limit distribution of system size in the continuous-time model. The random variables Qx and Nx represent buffer size and system size in the limiting sense, which in the philosophy of equilibrium states means that they are chosen to approximate at any fixed given time the true distributions of these quantities. The second, in a sense stronger, method for encompassing a state of equilibrium into a stochastic model of the kind we are concerned with is to choose the initial distribution so that it is actually preserved under the evolution of the system. Using again the buffer size example, this approach amounts to finding a distribution {%.} with £^0 % = 1, such that

For obvious reasons a distribution that satisfies (2.12) is called stationary. Regarding the relation of the two notions of equilibrium to each other, we note that a limit distribution is always stationary. If we have already found a limit distribution, then we pick Qo according to that particular distribution and start off the Markov chain. It is heuristically obvious, and not difficult to prove, that the distribution is preserved over time. The converse is not necessarily true, however; Markov chains with a periodic behavior must be excluded. Theorem 2.1 (limit distributions for Markov chains). Suppose a Markov chain {7,,}n>o on the nonnegative integers is irreducible: P(Yn = j\Yq = i) > 0 for some n > Q, for all i, j, and aperiodic: greatest common divisor of the set {n : P(Yn = i\Y0 = i) > 0} is equal to 1 (it suffices, for example, to find one state i with P(Yi = i\Yo = i) > 0), and suppose that a stationary distribution {nj}, rtj > 0, exists. Then the stationary distribution is unique, Jij > Qfor all j, and it is the limit distribution of{Yn}. Summing up, when we say that a system works in a steady state or operates in equilibrium, we refer to, unless otherwise specified, the situation in (2.12), which is typically

30

Chapter 2. Markov Service Systems

obtained by starting the evolution of the system from a stationary state that is also the limit distribution. Even if the initial distribution is not stationary, it often seems to be a mild approximation to ignore the difference between the steady state and the actual distribution at a fixed time. Some authors prefer to work with sequences like {2n}-t» 0; hence the buffer size Markov chain Qn is irreducible and aperiodic. We thus conclude from Theorem 2.1 that to find its operational characteristics in equilibrium, it is enough to compute a stationary distribution. By (2.2) such a stationary distribution must satisfy

and thus, since we already noted that P(X = 0) = 0 would violate (2.13),

The above is a first example of how to derive information about an unknown distribution, in our case that of Qx, in terms of a known, that of X, under suitable assumptions on the model, namely, assumption (2.13), which says that on average the input is smaller than the output of the service node. The argument is based on a balance equation (2.2), where Xn represents the input and l(e n +x,, +1 >i} the output, pursuing the consequences of equating terms. We can now proceed and express the state probabilities JT* = P(QX = k), k > 1, in terms of the expression for TTQ = P(Qoc = 0) just obtained. In general there are no simple expressions for the state probabilities, but since they solve a relatively simple system of equations it is in principle straightforward to calculate them recursively. The equations are obvious from (2.3) when considered in equilibrium, namely,

2.4. Balance equations, slotted time

31

It should be observed that the previous result (2.15) is not redundant since the first of these equations determines P(Qca = 1) in terms of P( I* ~ ^ ~ ^ ^o = 0. The maximum sequence maxo — oc (in the sense of almost sure convergence) as n —> oo. Indeed, negative drift suffices for the maximum to stay finite, in agreement with the equilibrium version of (2.7), which is the representation

Example 2. Suppose that

where we assume that p < 1/2 so E(X) = 1p < 1 in accordance with (2.13). The equations for the equilibrium probabilities simplify and it is seen that

32

Chapter 2. Markov Service Systems

hence

in other words, 2TO e Ge(l - j^)Via its random walk representation, the simplicity of this example is easily understood. In fact, we have £(&) = 2p — 1 < 0, and Sn is the simple random walk with 5o = 0 and jumps of size one, upward with probability p < 1/2 and downward with probability I — p. We have established a well-known property of the simple random walk with negative drift, namely, that its maximum is geometrically distributed, yielding, for example,

2.5

Balance equations, continuous time

The notions of stationary distributions and limit distributions briefly discussed in section 2.3 apply to discrete-time models as well as to continuous-time models. In general the theoretical basis for studying the asymptotic behavior of continuous-time Markov processes is more sophisticated than that for Markov chain models. On the other hand, for Markov traffic models there often is no particular need to bring in theory that goes beyond birthand-death processes. Recall from section 2.2 that a birth-and-death process is characterized by a set of birth rates {/„} and death rates {/zn}. Restricting considerations to this class, a result can be stated that gives complete knowledge of the asymptotic properties. Theorem 2.2. Assume that a birth-and-death process N,, t > 0, on the nonnegative Integers is governed by parameters {Xn}n>0 and {jtn}n>i such that

Then a unique stationary distribution

exists, which is also the limit distribution

Before discussing the proof on an informal level, let us look at two examples.

2.5. Balance equations, continuous time

33

Figure 2.4. Sample paths o/M/M/1, Q = 0.9, 1.0, 1.1. Example 3. Consider the M/M/1 model again with parameters A and /LI. Condition (2.18) takes the form

It is customary to denote the relevant parameter ratio by

and conclude, in the subcritical regime g < 1,

thus NX, e Ge(l — Q), which was mentioned at the end of section 2.2. We illustrate this fundamental example in Figure 2.4, which shows simulated traces of the process for the three cases with intensity ratios Q equal to 0.9, 1.0, and 1.1, respectively. Example 4. The model M/M/1/K is a variant of the previous model such that an arrival at time t is accepted into the service system only if N, < K; otherwise it is lost. Hence the total size of the system is always limited to size K. Recall the intensities

The divergence of the first sum in criterion (2.18) (consistent with A./C = 0 and fig = /LI) is a recurrence condition that is automatically satisfied as there is only a finite number of states.

34

Chapter 2. Markov Service Systems

Similarly, the convergence of the second sum in (2.18) is satisfied. The corresponding distributions in (2.19) simplify to

where the required normalization ^,Q nk = 1 implies

This is a truncated geometric distribution. There is no restriction on the parameter Q in this example. The larger the value of Q, however, the more arrivals will be lost. It is not difficult to comprehend the content of Theorem 2.2. Indeed, the probabilities Pk(t) = P(N, = k) are governed by the system of equations (2.11). If the process N, evolves in a steady state fashion it means that an initial distribution for Wo has been chosen such that pk (t) is actually independent of t. But then the derivatives must vanish, p'k(t) = 0. Writing in this case 7tk = pt(t), then by (2.11)

By iterating the second relation and as the final step using the first, we obtain

hence, by induction, the balance equations for birth-and-death processes in convenient form,

Solving in terms of JTQ, this gives

after which it only remains to find JTQ such that these relations are consistent with the required normalization,

It is now obvious that convergence of the second sum in condition (2.18) is necessary. To conclude the discussion about the validity of Theorem 2.2, consider for comparison the case of discrete-time Markov chains in Theorem 2.1. The notion of periodicity is completely tied to the discrete structure and is not relevant to the case of continuous time. What is needed, however, is an analog of the irreducibility criterion. The intuitive content of irreducibility is that any state can be visited from any other state, and clearly birth-and-death processes work

2.6. Jackson networks

35

as something of a prototype for such behavior. More formally, all transition probabilities satisfy are continuous functions of t. It turns out that this property is a sufficient condition for the stationary distributions obtained as solutions to the balance equations (2.20) to be also the unique limit distributions. For complete proofs, see a textbook on stochastic processes, such as Grimmet and Stirzaker [14]orResnick[51]. The next example continues the discussion of the M/M/1 model. Example 5. Consider a transmission link of capacity c bits per second, and suppose messages arrive at the transmission node according to a Poisson process of intensity A, messages per second. We assume that the lengths L\, L 2 , ..., measured in bits, of arriving messages are randomly varying, independent from one arrival to the next, and are exponentially distributed, Lj e Exp(l/l) with mean t bits. To transform these assumptions into an M/M/1 model we put bits/msg 5" = L/C = L/C seconds per message; bits/sec hence, observe that S e Exp(c/£) since

and interpret S as the service time in an M/M/1 model with arrival intensity A, and service intensity ^ = c/t. The traffic intensity is therefore Q = Xi/c, and the system will operate in a steady state as long as X.i < c, under which the system size Nt has a geometric stationary distribution with mean

2.6

Jackson networks

It is certainly necessary for achieving even modest goals of realism in traffic models to include several service systems and alternative routes. In a LAN, for example, a variety of incoming traffic streams are switched from one node to another for continued service or for direction onto outgoing links. In the Markovian framework there is a celebrated result that extends the use of the classical queuing models quite remarkably. If each node in a network of service stations is modeled by means of a classical queuing service system with infinite buffer, and traffic is allowed to be routed from the exit of one node to the entrance of another node, then Jackson's theorem gives a recipe for writing the stationary distribution for the number of cells present at each node of the network. Despite its simplicity and tractability, however, the theory we study in this section hardly resolves the issue of modeling complicated networks. The main drawback is that the Markov structure imposed on the system requires that routing of cells between nodes be completely state independent.

36

Chapter 2. Markov Service Systems

Jackson's network model can be described as follows. We work in continuous time mode and consider m unbounded service nodes of the type studied in section 2.2 (M/M/1, M/M/oo, or more general M/M/s systems), characterized by sequences fji1 = {/^} of service rates, fji'n = rate of service in node / if n jobs are present, n > 1, i = 1 , . . . , m. Suppose that each node is supplied with an external input line, and assume X1 = intensity of external arrivals at node /,A' > 0. Thus the arrival streams are purely Poisson and not allowed to depend on the present status at the node. Finally, suppose an m x m routing matrix R = (r,y) is given such that r{j = probability of entering node j on departure from node i, prohibiting, as mentioned above, the routing mechanism from depending on the state of the system. It is obvious that = probability of leaving the network from node i, and that the parameters {X1}, {/z!}, and R determine a system that works as shown schematically in Figure 2.5. Under suitable assumptions on the parameters the resulting state variables Nf = system size, buffer plus server, at time t in nodal point /, / = 1 , . . . , m, will be well defined and in the long run enter steady state behavior. We start with the trivial case R = 0 (the matrix with only zero entries). Then the Jackson model consists of m independent Poisson arrival Markov service systems, such as M/M/1, working in parallel without affecting each other. If the separate nodes settle into equilibrium distributions [nlk}k>o, i = 1 , . . . , m, then the network steady state is simply given by

What is remarkable is that the same structure prevails for the general Jackson network. In fact, such networks behave in steady state as if the nodes were independent, presupposing we calculate the actual arrival intensities to each node, consisting not only of external arrivals but also of internal traffic from within the network. Theorem 2.3. Suppose a network is defined by parameters [kl}, {/x1}, and R as above. If YI > Qfor at least one i, then the system of traffic equations

has a unique solution (yl, ..., y m ).

2.6. Jackson networks

37

Figure 2.5. Jackson network. If for each node i the sequence {n'n} is such that the service system in that node exposed to a Poisson arrival stream of intensity yl exhibits a steady state n'(y'\ then the network stationary distribution is given by

Example 6 (M/M/1 in series). A set of m serially connected M/M/1 service stations is a simple example of a Jackson network. We assume

Now Jackson's traffic equations take the form

Hence, if A < ^ for each j — 1 , . . . , m, then in steady state

38

Chapter 2. Markov Service Systems

Figure 2.6. M/M/1 with feedback mechanism. Example 7. A feedback M/M/1 system is given schematically in Figure 2.6; each job that departs from the server is independently switched into one of two routes. With probability r the job is allowed to exit the system and with probability 1 — r it returns to the entrance queue for repeated service, in the latter case as surplus to the external Poisson stream of intensity A, consequently adding to the workload of the constant intensity JJL server. The traffic equations simplify into

and thus, if A. < r/z, the steady state system size distribution is given by

2.7

Markov loss systems

There are two basic loss mechanisms. We discuss them briefly using two of the models in Table 2.1. The single-server finite-size model M/M/l/K is the basic example of a model where jobs are lost when the buffer has filled to its maximal size, K — 1, restricting the system size to Nt < K at all times. The m-server loss system M/M/m/m, on the other hand, blocks out arrivals if the maximal number m of available servers has been drained, hence avoiding the need for a buffer at all. Of course, an analysis of the combined model M/M/m/K would involve elements from both loss mechanisms. Loss analysis of M/M/l/K. Recall from Example 4 that the steady state distribution is given by the truncated geometric distribution

One can check that the average buffer occupancy E(N00) is given by

A bound on this quantity might be an appropriate measure of service quality, hence leading to a rule for selecting K. Perhaps it is more natural to consider the probability that the

2.8. Delay analysis in Markov systems

39

system is blocking arrivals, namely,

Anticipating a discussion in section 3.2, this can be interpreted as the proportion of time during which the system is full, and hence as the probability that any given arrival is lost. Given an acceptable value of the loss probability, it is a simple task to find the required buffer size; see Exercise 2.9. The m-server pure loss system. This is the model M/M/m/m: Again it is straightforward to solve the balance equations, obtaining

and, in particular, Erlang's 1st loss formula: The amount of lost work per time unit in the M/M/m/m model is given by ITTOT, where jrm is the loss probability

For numerical purposes it is often convenient to introduce a dummy random variable X e Po(^) and note that

2.8

Delay analysis in Markov systems

Clearly the above analysis of simple loss models raises further important issues, such as providing answers to questions like, How much time does a typical job spend in the system? and, How much longer does a job have to wait if further buffer space is added? As an introduction to understanding the balance between loss and delay we consider at this point some standard calculations regarding the simplest Markov systems. For convenience, we drop indices when we discuss steady state quantities and write, for example, N instead of NCC for system size.

2.8.1

Delay in M/M/1

The typical quantities studied are system time W, the total time from arrival to departure of a typical job, and waiting time Wq, the time spent in queue waiting for the service of

40

Chapter 2. Markov Service Systems

other jobs to be completed. For the M/M/1 model under FCFS scheduling, the steady state distributions of system time and waiting time can be found explicitly. The key idea is to condition on the system size N = n and represent the corresponding waiting time in a convenient form. More precisely, suppose a job arrives in the system and finds either N = Q = 0 or N = n > 1, and hence Q = n — 1 > 0 other buffered jobs waiting for processing. The total time the newly arriving job will have to wait in the system is

where S is the service time for the arriving job itself, U is the remaining service time for the job occupying the server at arrival, and S\,..., Sn-\ are service times for the n — 1 jobs waiting in the buffer. Now S as well as Si,..., Sn-i are independent and all are exponentially distributed with parameter JJL. Moreover, service times and remaining service times are identically distributed in the exponential case (see (2.10)), which means that U has the same distribution as the other summands in the representation for W. Hence for both cases n = 0 and n > 1 we obtain

in view of Exercise 2.5. Thus, considering the conditioning, we may indicate the distribution of W via the suggestive notation

which in explicit terms means that W has the density function

In other words, the law of the M/M/1 system time W in equilibrium is remarkably simple! The exponential distribution with expected value I/(/A — A.) appears. Similarly, restricting the analysis to queuing time only, we have

hence if n > 1 and Wq = 0 otherwise. It follows that Wq is a mixed distribution that assigns mass P(N = 0) = 1 — Q to an atom at zero (no waiting in line is necessary), and the remaining mass is distributed according to the reweighted exponential density Q fw(x). Equivalently, P(Wq > jc) = Qe-^-V*, x >0.

2.8. Delay analysis in Markov systems

41

Figure 2.7. Web client-server model. First look at Little's formula. Recall that in steady state M/M/1, E(N) = g>/(\ Q) = A,/(// — A) and hence E(Q) — Q2/(l — Q). The mean sizes and corresponding mean delay times are therefore related by

These are simple instances of the celebrated Little's formula, which states that in a multitude of models the mean size is proportional to the mean delay with the average arrival rate being the proportionality constant. 2.8.2 A client-server Jackson network The World Wide Web is growing at an unsurpassed speed. Over a period of several years the number of web servers has increased exponentially. The driving force of the growth dynamics might be, on one hand, to meet the request for service from the growing collective of clients having access to the server via the Internet, and on the other hand, to generate new groups of clients of financial or other potential interest to the server. Modeling of such a large-scale supply and demand system, and of the evolutionary dynamics of the web, is in its infancy. Interesting starting points can be found in [20]; see also [58]. To set up a simple model for the performance of a website exposed to Internet community load we follow Slothouber [59]. The website is assumed to act as a file server only responding to download requests arriving from Internet clients. Each request consists of retrieving a file of exponential size F with mean E ( F ) . The server will transmit only a chunk of the file at a time, of exponential size B with mean E(B}. Hence the client may have to repeat its request a random number of times until file transmission is complete. The system is modeled as a Jackson network with five M/M/1 nodes; see Figure 2.7 (slightly modified compared to [59]). Three nodes model the web server and two nodes the Internet communication network. Requests arrive with intensity A, hits per second at the entry point of node 1, where one-time processing is performed at service rate ^JL\. The service time at node 2 with rate yU2 represents server time, which is independent of file size. At node 3 the server reads B bits of data, which are processed at the rate Cr. This block of data is transmitted by node 4 to the Internet at the server's transfer rate Cs bits per second and received by the client's browser, node 5, at client network bandwidth Cc. Now, with probability r = E ( B } / E ( F ) the file transfer is complete, and with probability 1 — r the job is retransmitted to the input of node 2. This is repeated until the job exits from node 5. One should observe before continuing that the choice of the retransmission probability r is consistent with the assumption of F and B both having exponential distributions. In fact,

42

Chapter 2. Markov Service Systems

in complete analogy to (2.22) we have

where M e Ge+(l — r) is the number of rounds needed to download the complete file of size F. We pause to recall the general Jackson net in Theorem 2.3 and to consider delay times in such networks. If we accept the basic relationship of Little's formula, then we can immediately write down expressions for delay in a Jackson network. As before consider m Markov service nodes with external Poisson arrival intensities A ( , internal traffic intensities yl obtained from Jackson's traffic equations, and steady state distributions represented by random variables Nl for the number of messages at node i, i = 1 , . . . , m. The average system time delay in node / is given by ENl / y l , where the mean is computed with respect to yl. Similarly, the average network system time for any message arriving at the network is given by

In the client-server model it is simpler to compute the total mean delay E(W} for a client as follows. First, it follows from the traffic equations for this model that

The delay times in nodes 1 and 2 are therefore

Similarly, since it takes on average E(B)/C seconds to process a file of size B under capacity C,

The number of round-trips, M, for which E(M) = 1/r = E ( F ) / E ( B ) , must be included, and the final result becomes

Clearly, for this to hold X has to be small enough so that all terms exist finitely. The final result differs somewhat from [59], mostly because we preferred five nodes rather than four to keep the exponential file size distributions exact. Slothouber suggests typical parameter values C s = 1.5 Mbit/s, Cc « 700 Kbit/s, E(F) « 5 Mbyte, E(B)^2 Mbyte and investigates the influence on delay as the remaining model parameters are varied.

2.9. Exercises

2.9

43

Exercises

2.1 An indicator for traffic load in a router is measured once a minute, at which times the load is classified as either normal or high. In the case of high load a leaky-bucket system starts, which reduces incoming traffic. The sequence of classifications can be described as a discrete-time Markov chain. The probability that a normal reading is followed by a high reading is 0.10 and the probability that a high measurement is followed by a normal one is 0.95. Find the probability that a high load is registered during an arbitrary minute interval. What is the expected number of minutes during an hour at which a registered high load is followed by a normal load, that is, at which the control system is effective? 2.2 A binary source in a communication system generates a sequence X\, Xi, ... of zeros and ones according to a Markov chain with transition probability matrix

where a and b are the probabilities for a change from one symbol to the other. The sequence is transmitted over a binary symmetric channel, which means that the fcth symbol received, Yk, is corrupted by errors with probability e and error free with probability 1 — £, independent of previous errors. When the system is in equilibrium find the conditional probability P(Xk = l\Yk = 1) that a received digit one is error free, as a function of a, b, and £. 2.3 Consider the service system M/M/oo, where the arrival intensity \n = A. > 0 is constant regardless of the number of jobs n in the system but the service rate equals j.in — \in,n > 1, where JJL > 0 is a given constant. Write the balance equations. Solve to obtain a stationary solution. Is it necessary to impose restrictions on the parameters to find a solution that is a probability distribution? What is the steady state distribution of this service system? What is the utilization? 2.4 Think of the service time required for each job arriving at a service system as an amount of work that each arrival is carrying with it (for example, link processing time). We associate with an M/M/1 queue the discrete time workload process Yn, n > 1, by letting Yn be the accumulated amount of work brought into the system by arrivals occurring in the time interval [n — 1, n). We obtain in this way an i.i.d. sequence. Why are they independent? What can be said about the distribution of the Yn 's? The mean? (A reader with some background in probability may want to find the variance and also the moment-generating function.) 2.5 Based on the previous exercise consider the M/M/1 system from the viewpoint of the server. By recording at each time t the amount of residual work that remains to be done to finish what is in the system at time t, we obtain the continuous time virtual workload process. Draw a graph of what such a process would typically look like. Observe that a simple change of units yields the equivalent virtual waiting time process.

44

Chapter 2. Markov Service Systems

2.6 For a small circuit-switched public network find the number of circuits necessary to keep the probability of blocked calls less than 0.2, assuming Poisson arrivals with intensity 120 calls per hour and exponentially distributed call durations with a mean of 2 minutes each. 2.7 Suppose an access control mechanism of a single-server system has the effect of changing a constant arrival rate a. into the rates an = « / ( « + 1), where n is the number of jobs in the system. Assume exponential transmission times with rate \JL. Find the equilibrium distribution and the expected size of the system and queue, respectively. Find the average arrival rate. Then, from Little's formula, determine the average waiting times in the system and in the queue. 2.8 By expanding the cube of Qn+\ in (2.2), express the equilibrium variance of 7 = 0 for all other pairs i, j. Describe the steady state of the network. In particular, what is the distribution of the total size of the network in steady state?

(Ampler 3

Non-Markov Systems

This chapter introduces techniques based on renewal theory and techniques related to general service-time distributions in queuing and loss systems. Renewal theoretic methods are particularly useful for performance analysis in many service systems. The basic properties of renewal processes and renewal-reward processes are covered and the application of renewal models in two specific areas of networking are discussed at some length. The objective of the first example is to express in model parameters the probability that an ongoing mobile phone call is terminated. The mobile unit moves from one cell of the service area into another and the call is terminated because no traffic channel is available in the new cell. The second set of examples deals with reliable data transfer protocols for modeling the transport-layer communication protocols on the Internet. Service systems with Poisson arrivals and general service-time distributions belong to the core of traditional queuing theory. A selection of material is presented starting with the Pollaczek-Khinchin formulas. As an introduction to more advanced topics we discuss two different approximations of the queuing delay time in M/G/1 systems: one approach is based on the heavy traffic approximation and the other uses recursive representations. Some general references for this chapter are Bertsekas and Gallager [5] and Harrison andPatel[17].

3.1

Performance measures

We begin by presenting a list of notions that are useful for measuring performance in many systems, such as the classical Markov models, another non-Markov service system that will be discussed in this chapter, and also, for example, switches. Traffic intensity:

Q=

average service time required per server per unit time,

Off Hi rt • average amount of traffic presented to system per unit of time,

45

46

Chapter 3. Non-Markov Systems Utilization:

fraction of used system capacity,

Throughput: e r

average amount of work completed by F J the system J Fper . ° . unit of time,

Loss probability:

average fraction of lost traffic, and

Blocking probability:

probability the system is blocked at a random time.

Normally one can think of utilization as the fraction of busy servers. The following relation between some of the listed quantities holds in great generality: loss probability = 1

utilization traffic intensity

We demonstrate the validity of (3.1) for a number of models in examples and exercises below. At this point we offer only a heuristic argument as follows. First, we may think of the loss probability as the limiting ratio of the number of lost messages to the number of attempted messages considered over a long time interval, as indicated in loss probability:

number of lost messages total number of messages total number of messages — number of transmitted messages total number of messages number of transmitted messages total number of messages

By dividing with the length of the time interval, the same relation considered per unit of time is obtained as number of transmitted messages per unit of time loss probability: 1 — total number of messages attempted per unit of time The system service rate is the number of messages that the system can accommodate per unit of time. Hence the ratio of the amount of transmitted messages to the system service rate measures the fraction of system capacity actually used. In other words, it measures the utilization of the system utilization:

number of transmitted messages per unit of time system service rate

Similarly, number of attempted messages per unit of time system service rate

system arrival rate system service rate

The service rate per server is inversely proportional to the expected service time. Hence the above ratio, considered on a per-server basis, equals the fraction of time during which a given server is requested to provide service, system arrival rate system service rate

server arrival rate = traffic intensity, service rate per server

3.2. Integrated processes and time averages

47

Figure 3.1. Load versus throughput M/M/l/K, K = 1,2,5, 10, 100, /x = 1. which combined with the earlier relations make (3.1) plausible. Example 8 (continuation of Example 4). We look at the specific case M/M/l/K:

It is clear that (3.1) holds. Observe that this model is defined for any Q > 0. A load-versusthroughput graph is shown in Figure 3.1. Further examples are deferred to the exercises.

3.2

Integrated processes and time averages

In various situations integrals of stochastic processes arise naturally. For example, if Nt is system size in a single-server system, then f^ 1 {Ns >0} ds is the occupation time for the server during [0, ?]. Integrals of the form fa} Xs ds are well-defined random variables as long as the sample functions of Xt are not too irregular. Sufficient regularity conditions are known and cover virtually all stochastic processes arising in practice, including those in this book. To present the rules for calculating moments we assume that { X t , t € T}, T a fixed interval,

48

Chapter 3. Non-Markov Systems

has finite mean and variance such that E(X t) is a continuous function oft and Cov(Xs, Xt) is a jointly continuous function in s and t. By using results from measure theory it can be verified that the order of integration and expectation can be interchanged, so that

Moreover, applying (3.7) to the double integral,

it follows that

The process Xt is weakly stationary if the mean function E(Xt) is constant independent of t and the autocovariance function c(t} = Cov(Xs, Xs+t) is independent of s. Under this further restriction £(/Q Xsds) = E(Xo)t and (3.8) simplifies. In particular, change of variables in the integrals shows that

Time averages is a useful technique for studying the asymptotic behavior of service systems. Recall that in the steady state analysis of Markovian service systems, a limit distribution {nk} represents the typical probabilities that after a long time interval, a trajectory of the system is observed in state k > 0. If such a process Nt with steady state A^ has been observed over a long interval of time [0, t] and a time instant u G [0, t] is selected randomly, then the distribution of Nu should be close to the distribution of A^,, which is {nk}. It is natural to expect that this would hold with or without Markovity. To follow this line of thought we represent the knowledge of the evolution of a general process Nt by writing heuristically J-t = {all events known from knowing NS,Q < s < t},

t > 0.

Indicated here is the mathematical formalism of a filtration (F^t^o associated with the stochastic process Nt, a notion that in full requires the theory of measures and measurable spaces. On a nonformal level it is sufficient to think of Ft as the information about the prehistory [Ns, 0 < s < t} of the process up to time t. Let U(t) e Re(0, t) denote a random variable independent of (A7,) that is uniformly distributed on [0, t]. Based on the above heuristics we expect that Nu(t) represents an average state, that is,

The effect of performing the conditional expectation E(Nu(t)\Jrt} is to average over U(t) only, keeping the random variation in (Ns) intact. In such a way the time average process of

3.2. Integrated processes and time averages

49

(Ns) appears. Since the density fu(t)(s) of U(t) is constant and equal to \/t for s e [0, t], it follows that

Going one step further, averaging over the sample functions of (JV y ) shows that

Similarly,

and we expect

for large t. This turns out to be true in great generality. In particular

identifying the long-run proportion of time that Nt spends in state k with the steady state probability xk for NOQ. Moreover,

and so

Such results belong to ergodic theory. As an application we return to the example of occupation times in single server systems. Here

which verifies that the fraction of time the server is occupied indeed equals the traffic intensity. Although the conceptual reference in this brief introduction is to Markov processes, typically the Markov property is not essential. Results of the above form are common throughout probability theory. We move onto renewal theory in the next section and continue the discussion of these topics and related ideas. Wolff [69] systematically exploited time averaging methods, gave detailed presentations of Markov processes and renewal theory, and studied many interesting applications of queuing theory.

50

3.3

Chapter 3. Non-Markov Systems

Some ideas from renewal theory

The standard (delayed) renewal model is the following. Consider a sequence of independent, nonnegative random variables U\, Ui, • • • , where U\ has distribution function F\(t) = P(U\ < t), t > 0, and t/2, t/s,..., are identically distributed with common distribution function F(t) = P(U{ 0, i > 2. We assume

Let Tn = Y^=i Ui, TQ = 0, denote the partial sums, and suppose that renewal events occur on the real line at times T\, 7 2 , . . . . The renewal process is the counting process

associated with the i.i.d. sequence (Ui). The special case F\(t) = F(t) = 1 — e~xt returns the Poisson process with intensity A. Two simple but useful observations are the relationships

(see Exercise 3.5) and The latter ordering property shows that

An argument based on the strong law of large numbers now completes a proof of the renewal theorem in its simplest form

More advanced methods [14], [69] lead to the corresponding property for the mean number of renewals, the elementary renewal theorem,

The stationary renewal process is obtained by choosing for F\ the equilibrium distribution associated with F(t), namely,

It can be shown that with this choice E(Nt) = t/v, so that the asymptotic relation in the elementary renewal theorem is in fact an identity for any fixed t > 0.

3.3. Some ideas from renewal theory

51

3.3.1Renewal reward processes The renewal reward theorem is an extension of the renewal theorem, which is often useful and convenient for asymptotic throughput analysis in various traffic models. The idea is to associate to the renewal cycles not only their lengths Ut,i > 1, but also a further sequence of random variables, /?,, i > 1, where 7?( represents the reward accumulated during cycle i. The total reward up to time t is given by partial reward from interval ( T ^ t , t ] . The aim is to find the asymptotic mean reward in the sense of a time average, i.e., the limit as t —» oo of fraction of partial reward. It is clear from this relation what to expect. The renewal theorem shows that Nt/t —»• 1/v, and so it should follow from the strong law of large numbers that/?,/? -> E ( R ) / v & s t —»• oo. The typical assumptions imposed on the rewards to guarantee the expected behavior are that for each _/', Rj may depend on Uj but is independent of all £/,, / ^ j, and that (/?,-) is an independent sequence, identically distributed except possibly for R I , which is allowed to have a different distribution. Moreover, it is assumed that E\R{\ < oo for any /. It can be shown that under these assumptions the details of assigning rewards to renewal events does not affect the end result. It does not matter whether a reward is counted at the beginning or at the end of a renewal interval or if it is gradually allocated continuously over time. In either case the partial rewards vanish asymptotically, and the renewal reward theorem states that the time-averaged total reward converges to the cycle averaged reward. Indeed

where E(R) is the common expected value of the rewards /?,-, i > 2. For proofs and more general versions related to regenerative processes, see, e.g., Wolff [69].

3.3.2

Renewal rate and on-off processes

Now let X i , Xi, • • • denote a sequence of i.i.d. random variables with finite mean E(X) and variance V(X), which is independent of the interrenewal times U\, f/2, • • •• Define the corresponding renewal rate process to be

where the notation TO = 0 is added. A useful property of the renewal rate process is that its mean and covariance functions are easily calculated. Indeed,

52

Chapter 3. Non-Markov Systems

and

since if U\ < t, then A, = Y^=2 ^» ^{Tn-\ 0. The class of alternating renewal processes or on-off processes is another interesting starting point for modeling general arrivals. Here the renewal times Tn = £)"=1(£/i + Vj) are generated by two i.i.d. finite mean sequences (t/;) and (V,), and the on-off process

equals one during the successive on-periods of length U\, f / 2 , . . . and equals zero during the off-periods of length V\, Vi, The initial condition is ZQ!) = 1; the analogous process Zf( ) 0) withZQ = 0 is obtained by switching the roles ofU and V. More generally we can construct a stationary on-off process Zt. To find the corresponding steady state probabilities we apply the renewal reward theorem. The mean cycle lengths are v = E(T\) = E(U\) + E(V\) and the mean rewards £(/?,•) = E(U\). Over an interval of length t the integral J0' Zs ds is the length of time the process spends in the on-state. Hence partial reward. In the limit t —» oo this yields the on-state equilibrium probability

It can be verified that the on-off process indeed has an asymptotic distribution (P(on), 1 — jP(on)}, given as above by the limiting ratios of the mean values. We compute the covariance function for the stationary process Zt in the simplest case of the Markovian alternating renewal process. In this case the sojourn times are exponential with constant intensities a > 0 for jumps from off to on, ft > 0 for jumps from on to off, and P(on) = a/(a + /?). Now

3.3. Some ideas from renewal theory

53

Referring to the Kolmogorov equations (2.11), the function p\\(t} — P(Zt = l|Zo = 1) satisfies Hence and thus

Moreover, V(Z,) = y(l — y} and therefore

3.3.3

Hand-off termination probability

The stochastic model considered here was introduced by Lin, Mohan, and Noerpel [32]. The service area of a personal communication service network is partitioned into several fixed cells. Subscribers of the service are mobile and carry portable phone units. From time to time they move between adjacent cells, and it is during these moments that an ongoing call is under risk of early termination. Whenever a portable enters a new cell with a call in progress it requires a new channel; this procedure of changing channels is called hand-off. We are interested in the number of cell transitions during an ongoing call and ultimately in the forced termination probability for the call. Assume that the intercell times, i.e., the successive time intervals that a user spends in consecutive cells over the service area, are independent and with lengths U determined by a given distribution function F(t) = P(U < t). Assume also that the call holding times, 5, are exponentially distributed with parameter /i — l/E(S} and independent of cell duration times. We pick the starting time and location for the call randomly. It is natural therefore to model a single call using the stationary renewal process. Let N, be the renewal process with interrenewal times such that [/,, i > 2, are i.i.d. with distribution function F having finite mean v = E(U) < oo and U\ has the equilibrium distribution function Feq in (3.13). This means that a user call is initiated at time t = 0 and that Nt gives the number of cell transitions for that user up to time t. Hence K = NS = number of hand-off transitions during a single call. The final parameter in the model is the forced termination parameter pf. At each hand-off transition event the call is terminated in advance with probability pf independent of the number of previous hand-offs. Therefore the random variable M = number of hand-off transitions until first forced termination is geometrically distributed, M e Ge+ (/?/), and is independent of K. The forced termination probability is the probability that the first forced termination event occurs while the call is still in progress,

54

Chapter 3. Non-Markov Systems

Now, to have K > k for some k > 1, it must be true that the duration of the call exceeds the time U\ of the first hand-off, the probability of which is P(S > U\). Given S > U\, the remaining duration of the call must exceed f/2, and so on until all k hand-offs have occurred. But since S is exponential these remaining life-lengths are also exponential, whence

and so

Now we use the calculation

to conclude

It is an integration exercise to rewrite

The final representation of the forced termination probability is therefore

3.3.4

Reliable data transfer

IP provides the basic delivery service between communicating end systems on the Internet's network layer. Although every host has a unique IP address and the protocol is responsible for the logical communication between any two hosts, the IP service makes no guarantees that data segments are delivered correctly or in order to the receiving end. Because of this, the protocol is of best-effort nature and IP is said to be an unreliable transfer protocol. The fundamental task of the transport-layer TCP, except for extending delivery service to processes running on the end hosts, is to provide reliable data transfer. Sequence numbers, acknowledgments, and timers are used to ensure that data segments are delivered orderly and uncorrupted. In section 6.4.3 we make a detailed study of the congestion control mechanisms of TCP. In this section the mechanisms of three simpler protocols for reliable data transfer are investigated: the stop-and-wait protocol, the go-back-N protocol (GBN), and the selective repeat protocol (SR). The aim is to compare these schemes with respect to effective throughput as a function of the packet loss probability. To solve this task we apply the renewal reward theorem.

3.3. Some ideas from renewal theory

55

For the exact principles of reliable data transfer see the computer networks literature, e.g., Schwartz [55], Stevens [60], and Kurose and Ross [30, Chapter 3.4]. Here we restrict ourselves to a simplified model for the transmission of packets (or frames) from a sender to a receiver, where it is assumed that packets of fixed and equal size, numbered in sequence, are transmitted successively subject to constant delay times. Every packet delivered at the receiver's end is acknowledged by the return of an ACK packet in the opposite direction, and the arrival of the ACK at the sender marks the end of a successful transmission round. For simplicity we select a time unit by letting tR = 1 be the round-trip time. In addition, a time-out clock with expiry time TO is used to handle the loss of an ACK; expiry of the clock at the sender triggers the retransmission of one or several unacknowledged packets. To continue putting this in the framework of a mathematical model, we associate with each data packet a loss probability /?, the probability that either the packet or its ACK is lost during transmission, and hence the packet unacknowledged at the sender. We start by analyzing the go-back-1 protocol. The sender transmits a single packet on the channel then waits a round-trip time for the corresponding ACK to arrive. If the packet is successfully delivered, then the return of the ACK packet marks the start of the next round, where a single packet can be transmitted again. If the packet is lost, and consequently no ACK packet arrives in the expected time period, then the time-out clock is activated and the packet is retransmitted with an additional delay of TO, the time-out span measured in number of round-trip times. The periods between consecutive packet loss events can be thought of as cycles, where each cycle consists of a random number of rounds. Let K denote the number of such rounds in a cycle, K = number of rounds until a packet is lost. Then K e Ge + (p) since the probability that a loss occurs in round k is P(K = k} = (1 — p)k~l p, k > 1. The loss is discovered only at the end of the Kth round and TQ roundtrip times elapse before retransmission of the lost packet. This means that effectively per cycle K — 1 packets are delivered over a time interval of length K + TQ. The stop-andwait protocol is very similar conceptually. The difference can be captured in the model by saying that now K — 1 packets are delivered over a time interval of length (K + \}TQ round-trip times. See Schwartz [55] for a detailed discussion of the differences between the two protocols. Although the set-up in [55] is slightly different from ours, the resulting throughput formulas will be the same. To rephrase the above in terms of the renewal reward theory briefly introduced in section 3.3.1, let N, denote the number of cycles up to time t. Then N, is the renewal process associated with a sequence of interrenewal times ([//), such that for go-back-1 Ui = Kf + T0 and for stop-and-wait (// = (Kf + l)7o, where Kt is the number of rounds in cycle /, i > 1. The mean interrenewal times are therefore

with each cycle i is associated the reward Rf = Kt — 1 with E(Ri) = (1 — p)/p. The throughput over time [0, t] can be written Throughput (t)

56

Chapter 3. Non-Markov Systems

and so, by the renewal reward theorem, ThroughputGB1 (formulas 4-6 and 4-1, respectively, in Schwartz [55]). The extension to GBN is presented in two steps. In the main body of the text the throughput is derived for a simplified version of the protocol. The more realistic modification is discussed in Exercise 3.4. In these models it is assumed that the packet size is small compared to the round-trip time and that a parameter TV acting as a windowsize is introduced giving the maximum number of packets that the sender is allowed to send without waiting for acknowledgment. A successful round consists of transmitting N packets, each packet subject to loss independently and with the same probability p. If one or several of the N packets are lost, then we assume in the simplified model that all the TV packets in the same round have to be retransmitted. (In the model of Exercise 3.4 only those packets following after the first loss in the window must be retransmitted.) The probability that a round of packets is delivered loss-free is (1 — p)N and the number of rounds until the first loss occurs is a geometrically distributed random variable K e Ge+(l — (1 — p)N). Just as for the go-back-1 protocol, we let Nt be the number of cycles to time t, and we note that

is the expected time between any two loss events. The reward associated with cycle i is a total of Ri = N(K{ — 1) packets and the expected reward therefore is

Again the resulting throughput derives from the renewal reward theorem, ThroughputGBN that is, ThroughputGBN The SR protocol was devised to avoid the drawback with GBN that all the N packets in a round must be retransmitted if a loss occurs. The SR protocol can be modeled in close analogy to GBN under the simplifying assumption that the possibility of several packets being lost in the same round can be ignored, which is to say that since p is sufficiently small, probabilities of order pk, k > 2, can be ignored. Then SR is obtained from GBN with the modification that another N — I packets are added to the reward of cycle /,

3.4. The loss and delay time balance

57

Figure 3.2. Throughput in GBN (filled lines) and SR (dashed lines). since in this case only one packet of those in the final round before time-out must be retransmitted later. This gives ThroughputSR A comparison of throughput for GBN and SR for /?-values ranging from p = 0 to p = 0.1 and N from N = 1 to TV = 40 is shown in Figure 3.2. The time-out interval was set to TQ = 4 round-trip times.

3.4

The loss and delay time balance

We have seen that the obvious remedy for avoiding congestion in nodes and transmission links is to add buffer space. It is equally obvious that if a packet is allowed into a heavily buffered system, then there is a definite risk that the packet will spend an excessive amount of time in various buffer queues, possibly generating unacceptable delays in end-to-end delivery time. Traffic modeling can give some insight into this fundamental balance between buffer space and delays. We follow the usual approach to time delay modeling, namely, we consider a singleserver system with delay consisting of buffer time and service time. Certainly the total delay of a packet can originate in many other ways. First, each contribution to the delay the packet suffers within a given subnet in the network should be summed to obtain the total. On the level of simpler units such as transmission links, the delay might be broken down into several sources. Normally it is fair to assume that processing delay and propagation delay are independent of traffic load and hence do not directly affect the loss-versus-delay trade-off. On the other hand, transmission delay depends on the packet size distribution, and queuing delay on buffer sizes and retransmission delay can be very sensitive to traffic load. Moreover, what is considered acceptable delay in one traffic class may not be acceptable in another, which points to further difficulties in systems designed for a variety of traffic types.

58

Chapter 3. Non-Markov Systems

The system is represented by a long time arrival rate A = A.,*, and a system size process Nt, typically assumed to be in steady state and referred to by N only. Our discussion still includes, but is not restricted to, the classical queuing Markov single-server systems. Consider W = time span from the arrival of a typical message until it departs from the server, = Wq + S = time waiting in buffer + service time, as in section 2.8.1 where it is called the system time. Since it is not clear what a "typical message" is, for now we leave it as a textual description. What can be said about the expected value E(W}1 Following the heuristic we argue that if we pick an arbitrary arrival "painted pink" and trace it through the system, then w = E(system time for pink message) = E(W). By definition of A, during the time span w on average Xw messages arrive at the system. Obviously they all arrive later than the pink message and so, assuming the FCFS policy, none of them are able to depart before the pink message. Consider now the situation in the system at the particular time when the pink message departs. All messages that were already in the system when the pink message arrived have, again by the FCFS assumption, now departed, leaving us with the conclusion that there are approximately Aw remaining messages in the system at that time. However, seen from the perspective of the system, the departure time of the pink message is just any time and hence there should be approximately E(N) messages in the system at that time. Therefore, to keep the balance straight, E(W) = E(N)/k. As already noted in the Markov case, this rather innocent relation between the time a typical message spends in the system and the typical system size, proportionally determined by the arrival rate, is known as Little's formula. It turns out to be one of the few results for service systems that is true in great generality without strong assumptions on Markovity or independence. 3.4.1

Little's formula

We continue the discussion of this fundamental relation more stringently but without attempting to be fully rigorous. Suppose the service system is initially empty at time t = 0 and recall that system size is always the excess of arrivals to departures, Nt = At — Bt. Figure 3.3 illustrates this relation using a simulated trace of the M/M/1 process represented via its arrival and departure processes At and Bt; the difference between the two increasing curves is therefore the system size Nt. Several busy periods are visible interjected by idle periods during which At and Bt coincide. We compute the limit of

as t -> oo in two different ways. The ergodic limit result for time averages in (3.10) gives

3.4. The loss and delay time balance

59

Figure 3.3. Arrivals and departures in M/M/1.

To compute the same limit by different means, let Wi, W2, . . . denote the successive times spent in the system for each message in order of arrival. The key observation is that for each time point t such that Nt = 0 we have

This can be understood as follows, referring to Figure 3.4. The left-hand side of the above equation is the area between the curves At and Bt. The right-hand side of the equation is the same area built up of horizontal sections one for each jump of At of height 1 and length given by the horizontal distance to the corresponding piece of the curve B,. Considering one such jump at time t, the horizontal distance is the time it takes for the server to handle At clients, hence the delay of that particular arrival. If tk is a sequence such that Ntk = 0 and tk —»• oo, k —»• oo, it follows that

as k —>• oo; compare this with the renewal reward theorem in section 3.3.1. The first factor on the right side appears by definition of the long time arrival rate X. The second factor E(W), the mean in the steady state delay time distribution, results from the strong law of large numbers n~l Y^j=\ Wj —> E(W) since A(^) passes through all integers and increases indefinitely as k grows. The last step toward Little's formula is to verify that the

60

Chapter 3. Non-Markov Systems

Figure 3.4. Blow-up, section of previous graph. same method works in general for t > 0, in the sense that error term, where the error term can be shown to vanish in the limit t -* oo [69, Chapter 5.15]. Thus we are able to identify the two limits of f^ 1 /0' Ns ds and conclude that E(N) = XE(W). A similar analysis of t ~' J0 Qs ds leads to the various instances of Little's formula:

£(busy servers) = XE(S).

3.5 The M/C/1 system In the remainder of this chapter we turn to the family of service systems known as M/G/m. Here M stands for Markov in the sense that arrival epochs are given by the Poisson process, G stands for a general service time distribution, and m represents the number of servers. We restrict the discussion to the single-server system M/G/1 and the infinite-server system M/G/oo. These systems are not directly covered by the mathematical techniques already discussed in the direction of exploiting renewal theory methods or ergodic theory but require other ideas.

3.5.1

Simple examples leading to non-Markovity

We give some further examples not covered by the models introduced so far.

3.5. The M/G/1 system

61

Figure 3.5. Read-write disk access.

Example 9. Deterministic service times. Consider a single transmission link of capacity c bits per second equipped with a buffer designed to hold arriving packets until the link is free. Packets arrive at the link according to a Poisson stream with intensity X packets per second, hence the arrival process A(t) is the ordinary Poisson process. All packets are of fixed equal length L bits and thus each requires a transmission time, or service time, of S = L/C seconds on the link. The natural measure of traffic intensity seems to be Q = AL/c and we can assume Q < 1 in order to expect an equilibrium situation. Quantities such as system size N or delay W or performance measures such as utilization are still apt for study. But how? There are no obvious balance equations. For example, even if L/C is an average service time, its inverse fj, — c/L can no longer be interpreted as a Markovian jump rate. Example 10. Tandem link, fixed packet size. Connect two such links as in the previous example in a series and let each packet that departs from the first link immediately enter the second. It is a rather striking effect that at the second node there will never be a single packet in line! In fact, the interdeparture times from the first node are all greater than or equal to the fixed number S, and thus so are the interarrival times at the second node. But that is enough to avoid any risk of collision on the second transmission link. Example 11. Rotating disk. Suppose a storage disk, sketched in Figure 3.5, rotates r times every second and is partitioned into s distinct sectors. It is connected to a read-write device that is able to retrieve and store data on a specific sector. Suppose furthermore that read-write requests, each requiring a block of b consecutive sectors, arrive at the device according to a Poisson process with given intensity. If necessary an arriving request is buffered until previously arrived requests are completed. Thinking of S = time to find the required data block as a service time, it becomes clear that this is an example of the M/G/J model. The service time distribution is quite arbitrary depending on a number of parameters. Given the lack of exponential service times it is difficult even trying to mimic the Markovian set-up from M/M/1, but it turns out that in M/G/1 much useful information already is contained in the first and second moments of S. For later reference we compute them now. First, to model the apparatus we assume that the beginning of the desired block is found at sector k from the read-write head with probability l/s,k = Q,... ,s — 1. Thus the random variable number of Z — sectors until beginning of desired block is uniformly

62

Chapter 3. Non-Markov Systems

distributed on the integers ( 0 , . . . , s - I}. The service time in seconds is given by number of sectors to read requested block number of sectors traced per second Hence we compute

where

Thus and consequently

3.5.2

Pollaczek-Khinchin formulas

For models like M/G/1, two approaches are available: to find an embedded Markov process that describes the system sufficiently well and to add more information to the state of the process under study so that the extended process is in fact Markov. In either approach Markovity is enforced back into the system via auxiliary processes that are accessible for analysis using the standard tools of Markov process theory. To exemplify, let Nt be the system size of the M/G/1 single-server system. It turns out that (Ntt)k>\, where tk are the successive times of departure from the system, is a Markov chain whose stationary distribution can be found. Similarly, as an example of how to extend the state space, let R, denote the remaining service time for the job being processed at time t. Then the two-dimensional process {(Nt, Rt), t > 0} is a Markov process although {Nt, t > 0) is not. Both ideas are demonstrated below, primarily for the purpose of deriving average size and delay in M/G/1. We begin the detailed study of the M/G/1 model noting that most concepts and quantities already introduced for single-server systems are unchanged. The parameters of the model are the arrival rate A, > 0 and a general service-time distribution represented by a random variable S. We assume that the service time has finite mean E(S) < oo and finite variance V(5) < oo and that the buffer management policy is the FIFO queue. We wish to solve the following problem: In steady state find the average size E(N), queue length E(Q), system time E(W), and waiting time E(Wq'). From Little's formula we already know the relations E(N) = XE(W) and E(Q) = XE(Wq). Furthermore, we know that E(N) = E(Q) + P(serverbusy) and E(W) = E(Wq) + E(S).

63

3.5. The M/G/1 system

Figure 3.6. Trajectory of Rt in M/G/1. Consider again the remaining, or residual, service time R, = remaining service time at time t. A typical trajectory of this process is shown in Figure 3.6. The vertical jumps are the successive service times of size S\, 82,... and occur at time points when jobs are transferred from the buffer to the processing unit or, if the buffer is empty, arrive from the external source. It is to be expected that if the M/G/1 system is in equilibrium, then the process R, should possess a stationary limiting distribution Rx so that we can speak of a steady state residual service time R (dropping the subscript) just as we have introduced N, Q, and so on. This can be verified and we will see later how to express the distribution of R in terms of that of 5. For now we rely on the mere concept of steady states and on the following interpretation. Suppose an arriving job finds at its time of arrival that exactly n other jobs are buffered waiting to be serviced. Put differently, at a typical arrival time the event {Q — n} is observed. Then the waiting time Wq in the buffer until service of the arriving job begins must satisfy where for notational convenience we put S^ — 0. Markovity is at the core of this relation. Indeed, the extended state (Q,, R,) is rich enough to enable the above representation for waiting time, without reference to any other information hidden in the history of the system. It is now simple to find how the mean values of the quantities involved relate to each other. Namely, conditionally on {Q = n},

But the time that any job spends in the service unit is never influenced by the present buffer status, and so Therefore

64

Chapter 3. Non-Markov Systems

In combination with the instance of Little's formula E(Q) = KE(Wq) we obtain

thus, under the crucial necessary condition

we obtain the relation

From now on we impose the restriction in (3.19) and observe that with Q = XE(S} we may introduce the concept of traffic intensity in M/G/1, plus we have found again the familiar basic criterion Q < 1. To go further it is necessary to express E(R) in terms of the parameters A, and S. With reference to Figure 3.6 this means expressing the average level of the peaked curve /?(/) in terms of X and S. Equivalently, referring to section 3.2, we need to compute the limit

It is not difficult to understand the typical magnitude of the integral J0 Rs ds. Over a long interval of length t the Poisson process produces approximately Kt arrivals. Thus approximately the same number Xt of service completions is produced, taking into account that the process R, is supposed to evolve in equilibrium. But the number of service completions is the same as the number of triangle-shaped paths building up the trajectory of R(t) from 0 to / (Figure 3.6). The areas of these triangles are S^/2, S%/2, etc. Hence on average E(S2)/2, which is finite since we have assumed that S has finite variance. The total area J0' Rs ds under the curve [Rs, 0 < s < t] is therefore approximately \tE(S2)/2, with better accuracy for larger t. Thus we find E(R) in the limit t -> oo of

From this, (3.20), and Little's formula we now have the following. Pollaczek-Khinchin's mean value formulas. Suppose the service time distribution has finite first and second moments, £'(5') < oo and E(S1) < oo. Then

3.5. The M/G/1 system

3.5.3

65

Lindley recursion for M/G/1

We demonstrate the alternative method of artificially imposing the Markov property in the non-Markov model M/G/1, namely, to construct an embedded Markov chain and use Lindley recursion. Apart from being an excellent method for simulation of non-Markov systems, the technique is rather general. In later sections we use it to analyze a switch output queue and the ATM model called Geo/D/1. The purpose is still to find the mean, variance, etc., of buffer size, delays, or similar quantities. Denote Tn = departure time in M/G/1 for job nr n,

n = 1, 2 , . . . ,

and put Nn =• Nfn = number of units left in system at departure of job nr n

and An — number of arrivals during transmission of job nr n,

n > 1.

By inspecting the relationship between these quantities it follows that

which is the same as (2.5) for the slotted time buffer model studied in section 2.1. Moreover, since in an M/G/1 model the sequence An+\ is obviously independent of N\,..., Nn, we have the analog of relation (2.6) and hence the conclusion that (Nn)n>\ is a Markov process! Similarly, put Qn = Qrn = number of jobs in queue at departure nr n for which to obtain again (2.1). In section 2.4 we analyzed such Lindley-type recursion equations. In the present example the number of arrivals in a time interval of length S is Poisson distributed with a mean given by kS, hence

Furthermore,

and by applying the previous result (2.16) we obtain

This result is in agreement with the Pollaczek-Khinchin formula derived above, providing a second method for such results.

66

Chapter 3. Non-Markov Systems

Figure 3.7. Virtual waiting time in M/G/1.

3.5.4 The M/G/1 virtual waiting time distribution The goal in this section is to enhance geometrical understanding of a celebrated representation formula for the virtual waiting time in M/G/1 (3.23). Normally this kind of result is derived using generating functions. We chose to avoid these techniques here and instead present descriptive arguments and reasonings. Consider a single-server system of type M/G/1, which we assume has settled into equilibrium. Denote by Wq(t) the time a packet will have to wait in the queue if it arrives at the entry point of the system at the particular time t. Let Wq denote a random variable that has the corresponding queuing time delay stationary distribution. Often Wq (t) is called the virtual waiting time. Figure 3.7 shows a sketch of what the typical paths of the random function Wq (t), t > 0, look like. Pick a point randomly on the time line (two points t\ and t^ are indicated in Figure 3.7). The probability of choosing a point t, where Wq (t) = 0, equals the proportion of time that the server is idle, that is, no = 1 — Q (compare Exercise 3.7). For the case Wq(t) > 0, it is clear from the graph that the decomposition

holds. In fact, at such time points Wq (t) consists of one piece with distribution R related to the job being processed plus a number of service times one for each job in the queue. We thus have a convincing argument for the representation as the stochastic sum

where Sj are independent service-time random variables, R denotes remaining service time, and N and Q are steady state system size and size of the system queue.

3.5. The M/G/1 system

67

Figure 3.8. Virtual waiting time decomposed in remaining service times. Now we decompose Wq(t) using horizontal auxiliary lines, as shown in Figure 3.8 by the dotted lines at time points t\ and t$. It follows that at each point /, V/q(t) consists of a random number of terms each of which is a certain fraction of a service-time variable. For a given t we denote by M the required number of terms and by R\...., RM the corresponding summands. Since the service times are independent it follows that R\,..., RM are also independent. It also is clear from this construction that M is independent of the particular values of the service times. We note that at a given time, say, ti, the service times underlying / ? ! , . . . , RM correspond to jobs that may have already left the system at 73. Next, recall that the upward jumps occur according to a Poisson process. In particular if we fix /3, the locations of the jumps in |0, t$] are uniformly distributed. Since the dotted lines are reflections in the curve Wq (t), it follows that the points at which the dotted lines hit the service-time intervals are also uniform! But this is the definition of remaining service time, namely, R is whatever time remains if we pick a point at random during the processing of a service interval S. We have motivated the representation

where the Rj's are independent random variables with the remaining service-time distribution and M is an integer-valued random variable independent of the summands. (If M = 0 above, then the corresponding sum vanishes.) The final claim regarding this queue size representation formula is that M is geometrically distributed with parameter 1 — Q. For proof of the last claim we see that P(M = 0) = 1 - Q by the earlier argument of the server being idle. To obtain the full distribution of M we return to Figure 3.8. The number M (corresponding to £3) is the same as the number of steps in the dotted ladder leading from Wq(t^} leftward down to the zero level. In turn this number is the same as the number of times the dotted line intersects a service time until it intersects for the first time a service time that initiates a busy period. By the uniformity referred to above, the situation is the same at every step of the ladder, and thus at each step we perform independently a random trial that succeeds with probability p and fails with probability 1 — p. The success

68

Chapter 3. Non-Markov Systems

probability p must be the fraction of all jumps that initiates a busy period, which we calculate next. If we call the sum of an idle period plus a busy period a cycle, then length of idle period = proportion of time the server is idle = TTQ = 1 — Q. length of cycle Hence

and thus in a long interval [0, t] there will be approximately A.f (1 — Q) cycles. At the same time we know that there are approximately Xt arrivals in this interval, hence the proportion of arrivals that initiates a busy period must be the ratio 1 — Q of these quantities. What we have obtained is the conditional probability of M given that M > 1, namely, that M has, in this case, the first success time distribution (positive geometric) with success But this implies probability 1 - Q, P(M = k\M >!) = (!- Q)gk~l,k =1,2

for each k > 1, which, together with our knowledge of P(M = 0), finally shows that M is geometrically distributed with parameter 1 — Q.

3.5.5

Heavy traffic limit in M/G/1

A recurring theme in the analysis of M/G/1 and similar systems is the crucial stability requirement Q < 1. The nature of these systems as the load increases toward the critical value Q = 1 can be understood to a certain degree from the Pollaczek-Khinchin formulas (3.22), where average buffer size and average waiting times grow inversely proportional to 1 — Q. Despite the singularity at Q = 1 that appears in the mathematical abstraction of the systems, it seems that service systems in practice often do operate under high loads, at least during limited periods. It is therefore of interest to understand the behavior of systems operating with traffic intensity close to maximum. A natural approach is to rescale interesting quantities such as loss probabilities and delays and study asymptotic properties as Q -> 1. Such results belong to the regime of heavy traffic scaling. As an example we present a classical result for the M/G/1 model with an FIFO queue and service-time distribution of finite variance, V(S) < oo. The queuing delay time Wq normalized by 1 — Q converges in distribution to an exponential random variable as Q —> 1,

The proof of this result is often presented as an exercise in using moment-generating functions. As an alternative we use the elementary renewal theorem. It was shown in (3.23) that the waiting time Wq can be represented by a random variable with the distribution

3.5. The M/G/1 system

69

where and /?], / ? 2 , . . . are i.i.d. random variables with distribution derived from the given servicetime distribution P(S < s) as

Let N ( t ) , t > 0, denote the renewal process associated with the sequence R\, RT, .... Since E(S2) < oo we have (Exercise 3.8)

and hence by the renewal theorem (3.12)

The basic relation (3.11) for renewal processes shows that

thus

Since M is independent of N ( t ) and P(M > m) = Qm+l,

The asymptotic behavior of the scaled renewal process that appears in the exponent is determined by (3.25),

and since the exponential function is bounded on the positive half-line, the same asymptotic property carries over to e-('[-e)H(xi(\-e)) as wen as to me expected value. This shows

which concludes the proof of (3.24).

70

Chapter 3. Non-Markov Systems

Figure 3.9. Short simulation o/M/D/1, Q = 0.95. 3.5.6

Deterministic service times, M/D/1

Example 11 gives an example of the M/D/1 model, where the general service-time distribution G is replaced by deterministic service times S = constant. Keeping this example in mind we can visualize the M/D/1 model by thinking of packets delivered to a link at the arrival times Tk, k > 1, of a Poisson process with intensity A. < 1, buffered and transmitted at maximum capacity effectively over S = 1 time units each. Figure 3.9 illustrates the situation on the link at high load, Q = 0.95; up to four packets are buffered while the link is cleared. Put V/c = departure time of packet no A:, k > 1, V0 = 0. Some reflection shows that

which may be expanded further to give

The waiting time in the buffer, which we call Wq (k) for packet k, is given by

Either (3.26) or (3.27) can be used to analyze the sequence (Wq(k))k. By (3.26),

3.5. The M/G/1 system

/H

where [4 is the interarrival time 7^ — Tj_i. Iteration down to k — 1 shows

Thus we have

which shows that there exists a steady state representation of the buffer waiting-time distribution of the form representations that were derived earlier as (2.7) and (2.17) in the context of slotted time Markov models. We conclude this section by estimating the distribution of the buffer delay in M/D/1. The technique is general and applies in modified form to M/G/1. It will be shown that a constant 90 > 0 can be found and numerically estimated, such that

The proof is by induction. First W ? (l) = 0, so (3.29) is trivially true for k — 1. We assume as induction hypothesis that (3.29) is true for all indices up to k — 1. It simplifies the derivation of the induction step somewhat to keep a separate notation for the random variable X^ = 1 — L4, with distribution function FX(X) = P(U^ > 1 — x) and density f x ( x ) = Ae~ 1(l -'*>, x < 1. By (3.28), for y > 0,

Partition on the events [Xk > y] and its complement to get

Next we rewrite the last term above by conditioning on the outcome x of X*. Since X^ is independent of Wq(k - I), this yields

By hypothesis

72

Chapter 3. Non-Markov Systems

The integral in the last term is

Now choose 60 = sup{6 > 0 : X(e8l> — 1) < 90}. For fixed A. this gives a specific number 00 such that

and so the upper bound is true for index k. This completes the proof of (3.29). For a more general case see Exercise 3.10.

3.6 The M/G/oo model The M/G/oo model, or Poisson burst process, is the infinite-server model with Poisson arrival process A, of intensity A,, general service-time distribution G(t) — P(S < t), and usual independence assumptions. This process has non-Markovian system-size trajectories Nt but is nevertheless accessible for explicit calculations. We indicate a primary application of this model by referring to arrivals as calls, service times as call holding times, and the system size as the number of calls in progress. We have

(call i in progress at time t]»

so Nt arises by counting those arrivals that are still occupying a server at time t, and by discarding the others. Given the number A,, however, the arrival times are uniformly distributed over the interval [0,;]. The decision whether to count a call at t is determined by its uniform time of arrival U, say, and independently its length S. If U + S > t, the call is in progress and counts for incrementing N,\ if U + S < t, it does not count. The corresponding probability is

The number Nt is thus obtained from A, by independent thinning with the probability /?,, and therefore Af, itself is also Poisson distributed. Since E(At) = A,?, wehaveE(W f ) = \tpt and so for each fixed t

where we recognize the equilibrium distribution introduced in (3.13). Asymptotically, for large t, the Poisson distribution Po(g) appears and this explains the fact that a stationary version of the M/G/oo model can be constructed with steady state distribution A^ e PO(Q). This property was established for the special case M/M/oo in Exercise 2.3.

3.7. Exercises

73

In the stationary case it is possible to compute the correlation function of N,. Introduce for fixed s, t > 0, Z] = number of calls in progress both at s and s + t, Z2 = number of calls in progress at s but not s + t, Zs — number of calls in progress at s +1 but not s. Then Z\, T.I, Z^, are independent, Poisson-distributed random variables such that

Hence

With reference to the above thinning arguments it is clear that E(Z\) is a certain proportion of Q. Since calls present at time s count for Z\, should their remaining call holding time be greater than t, it follows that

where R, is the residual service-time process described in section 3.5.2. It is a standard result in renewal theory (see Exercise 3.8) that R, converges in distribution to a steady state /?oc with a mixed law given by a point mass at 0, P(R<x, = 0) — 1 — Q, and the equilibrium distribution introduced in (3.13) as continuous part,

Summing up, in steady state the covariance is stationary with

and the autocorrelation function is given by

It is noteworthy that the same expression was found for the renewal rate process autocorrelation function (3.16). Figure 3.10 shows a simulated trace of the M/G/oo model with Pareto-distributed service times. The parameters are chosen such that the first moment of S exists finitely but not the second.

3.7 Exercises 3.1 Consider the m-server loss system M/M/m/m with parameters A. and fj,.

Chapter 3. Non-Markov Systems

74

Figure 3.10. Simulation o/M/G/oo, infinite variance service times. (a) Identify the measures of performance traffic intensity (per server), utilization, throughput, and loss probability and relate them to the offered load. (b) Verify the relationship (3.1). (c) Find the limiting throughput as the load tends to infinity. (d) So that Little's formula is valid for this system, what should we mean by "average arrival rate"? (e) Put n, = 1 and produce a load-versus-throughput graph for m = 2,5, 10, 100, 500, i.e., the analog of Figure 3.1 but for M/M/m/m. (It is recommended to consider load per server versus throughput per server.) (f) Generate simulated traces of the M/M/m/m process that demonstrate its nature both under light and heavy loads and for a small and large number of servers. 3.2 Calls are made to a specific phone number according to a Poisson process with intensity A calls per hour. As soon as a call is accepted a conversation begins of length represented by a random variable S with mean E(S) minutes, during which additional calls get a busy tone and are rejected. The lengths of successive conversations are independent and independent of the arrivals. What is the equilibrium probability that the phone line is busy at an arbitrary time? 3.3 In the model for hand-off termination probability in section 3.5.1, add the parameter p0 = P(a new call attempt is blocked at start-up). Let /Yerm denote the corresponding forced termination probability. Express P{eim using the representation derived for Pterm and the new parameter PQ.

3.7. Exercises

75

Assume that the cell occupancy times Ut, i > 1, are exponentially distributed. For this case find the termination probability P,'erm. Consider the particular case p0 = Pf, which may be called nonprioritized call handling, and generate graphs of the termination probability as a function of p f for fixed values of the other parameters v and IJL. In fact, the parameter space can be reduced. How? 3.4

(a) In the model for the simplified GBN protocol the parameter N can be thought of as the offered load; the system is saturated in the sense that when each new round begins there are always N packets ready to be transmitted. Draw the loadthroughput graphs for GBN for a few choices of the parameter p (and fixed TO). Interpret the resulting curves. Why is there a maximum throughput? Let Afmax denote the corresponding load. What can be said about Nmm as a function of p and r0? (Find the equation that governs this relationship, find upper and lower bounds of /Vmax, or make a numerical study.) (b) What should we mean by traffic intensity and utilization in GBN? With reference to relation (3.1), what does the loss probability measure in this case? Which packets are "lost"? (c) Consider a more realistic version of the GBN protocol, where after a packet loss only those packets scheduled for transmission in the same round but after the lost one must be retransmitted. Hence if the first packet that is lost in a round has ordering number k, only the packets numbered k, k+l,..., N in the same round must be retransmitted. Find the expected number E(R) of packets transmitted in a given renewal cycle and the corresponding throughput. Illustrate the results with appropriate graphs and show that we have resolved the problems with simplified GBN observed in 3.4(a).

3.5 In section 3.5.1 we modeled the retrieval of data stored on a rotating disk partitioned into a total of s equal size sectors by applying the M/G/1 queuing system with processing time

where r is the number of disk rotations per second, b is the number of sectors required at each successive request, and Z is a random variable uniformly distributed on the integers 0 , . . . , s — 1. Compute the total request delay time in terms of the parameters s, b, and r in addition to the arrival rate A. Then consider the limiting case when both s and b are large but the fraction ft = b/s of sectors required at each request remains bounded. By taking limits to infinity we obtain a reduction to three parameters r, A, and ft. What is now the maximal input A that the system can allocate in steady state? What is the total request delay? 3.6 Compare the systems M/M/1 and M/D/1 with regard to queue size and delay. Conclude that queues are shorter and delay is less in M/D/1 compared to M/M/1 under the same load.

76

Chapter 3. Non-Markov Systems

3.7 By exploiting the relation

which is valid for the embedded Markov chain Nn of M/G/1 system size at successive departure times with An denoting the number of arrivals during successive service times, prove that in steady state

where Q is traffic intensity. Discuss whether this result would hold in the general system G/G/1. 3.8 Find the moments of the stationary distribution ,Feq in (3.30) in terms of the moments of the service-time distribution. Relate the expected value to the expression for M/G/1 mean residual time in (3.21), E(Rao) = XE(S2)/2. 3.9 A simple method to simulate trajectories of the M/D/1 model such as those in Figure 3.9 is based on (3.27). To simulate the sequence (Wq(k)) it is enough to generate arrival times fe}, compute departure times {i^} using (3.27), and form {w^}, where wk = tk — Vk — 1. Apply this technique to investigate the accuracy in the upper bound (3.29). 3.10 Generalize the results in section 3.5.6 to M/G/1 with general service-time distribution S. In particular show that (3.29) holds with B0 = sup{6> > 0 : E(edx) < 1}, where X = S -U,U € Exp(A.).

O&pter 4

Cell-Switching Models

The processing of ATM cells in a network node consists of several phases. Concentration, merging, and access control of incoming traffic are followed (where access is granted) by procedures such as admission control, filtering, and regularization to meet performance criteria, loss and delay control, and so on. Yet the dominating aspect is simply the routing and circuit switching of cells to guarantee that they are directed correctly according to the addressing information they carry, whether that means local processing along virtual paths or outbound traffic feed. In fact, in the broadband context the enormous amounts of data and the extreme speed under which the data have to be processed may render most attempts of error and congestion control futile. For example, control messages sent from an output node with the purpose of warning incoming traffic for congestion ahead may be obsolete by the time they arrive! Hence it is not always an oversimplification to model a nodal point simply as a set of switches. With this motivation we now include a chapter on modeling of the simplest switching units. In practice, cells are transmitted into and out of switching cards, each with a certain capacity to buffer cells in case of multiple arrivals during cell transmission intervals. We study some of the basic approaches, called space division switches, and their inherent properties. Switching from a mathematical standpoint is discussed by Schwartz [56] and Hayes [18].

4.1

m x m crossbar

Consider a switch with m input lines and m output lines, whose sole purpose it is to direct each incoming cell onto one of the output lines. Time is discretized in slots of length determined by the cell transmission time. During a slot each output line is capable of accepting and transmitting exactly one cell. The switch can be thought of as a device for input cells searching along a set of bars for their correct destination output bar; see Figure 4.1. Several cells may be destined for the same output, causing losses or buffering of cells or a combination of the two. To model the function of the switch, the arrival patterns of cells on the input lines and the rules for allocating cells on the output lines must be specified. It seems natural to apply random mechanisms for both of these. At this point choices based 77

Chapter 4. Cell-Switching Models

78

Figure 4.1. m x m crossbar model. on mathematical tractability come into play. We apply the same arrival pattern throughout this section but vary the storage and loss structure. 4.1.1 Output loss crossbar The simplest assumptions are the following: Arrivals: During each slot and at each input line a cell appears with probability p and the line remains empty with probability I — p. Each arrival is independent of what happened in earlier slots and of arrivals at other input lines. Switching: Each incoming cell is switched during the next slot to one of the m output links uniformly with probability l/m and independently of other cells switched during the same slot. At each requested output port one cell is immediately transmitted, whereas additional cells arriving at the same port are lost. In Figure 4.1 one can think of cells arriving on each horizontal bar from the left, selecting uniformly one of the crossings of a vertical bar along which the cell exits. More general models are obtained by letting the number of input lines be different from that of the output lines, by replacing the arrival probabilities p by parameters PJ, j = 1 , . . . , m, varying from one input to another, or by applying, instead of the uniform switching, other routing probabilities r,j with J^ • r/j = 1 such that r^ is the probability that a cell arriving on input i is directed to output j. We analyze the simplest case by introducing for 1 < j < m An — number of cells arriving in slot n, KJn — number of cells directed to output j at start of slot«; the initial condition could be K[ = 0, for example. For any n > 1, An e Bin(m, p).

4.1. m x m crossbar

79

since there is a success probability p of arrival at each of m input nodes. Moreover, given A,, the K^+}'s are multinomially distributed as

consistent with the obvious relation A,, — £!T=i ^+|. In particular, since the marginals in the multinomial distribution are binomial and the arrivals and switching operations are independent, Moreover, in view of the properties of the multinomial distribution

It can be seen that the sequences {A,,},,>] and, for each j, {A';]'},,>i are i.i.d., which implies that the crossbar model under the present assumptions actually operates in a steady state mode at least from slot 2 onward. This simplifies the further analysis and the evaluation of the performance of the switch, for the purpose of which we define Tn = number of transmitted cells in slot n

In fact, Tn, n > 1, also is an i.i.d. sequence and it is the expected value in the corresponding invariant distribution that is normally interpreted as the throughput of the crossbar: Throughput =

Hence, normalized per single output line, Utilization = ! - ( ] - p/m)'n =» 1 - e~p

for large m.

Furthermore Expected number of lost cells per slot

and therefore Loss probability = average fraction of lost cells per slot large m.m. = (d - p/m)m - 1 + p ) / p * (e~p - 1 + p ) / for p for large

80

Chapter 4. Cell-Switching Models

Figure 4.2. Output queue crossbar. Our knowledge about the underlying multinomial distribution also allows for the calculation of the variance

which proves that asymptotically as m tends to infinity,

By a reference to the strong law of large numbers this result can be improved to hold in the sense of convergence almost surely.

4.1.2

Output queuing with a shared buffer

The distinguishing feature with this model compared to the previous one is that each output line, or switch card, is equipped with a buffer, as shown in Figure 4.2. The typical size of a switch could be m — 128, or larger. The same arrival structure as before is maintained, adding only that arrivals are independent of the present state of the crossbar switch buffers. The assumption regarding uniform switching is also preserved. It is the switching of multiple cells to the same output that will cause the buffers to fill up. If in fact more than one cell has arrived at a given output at the beginning of a slot, all cells except one are stored in the corresponding buffer and transmitted one at a time during sequential slots. In this model the number of buffered cells

4.1. m x m crossbar

81

at one output node could grow over time without bound. More exactly, after n slots have passed there may be as many as (m — 1)« cells in a single buffer, even if this occurs only with probability l/m". It would be desirable, in order to make the model more realistic, to work with finite output buffers assuming that overflow cells are lost. The loss model studied in section 4.1.1 is of course a special case of this. However, we motivate the current assumption of infinite buffers by referring to the switch design for packet networks known as a shared buffer. In this scheme there is a common pool of buffers, where free buffer space is allocated from slot to slot to the output port in need of extra storage at this particular time. In practice this means that only the total number of buffered packets destined for any output port is subject to a finite buffer restriction. We make the approximation that the shared buffer switch is well described by the infinite output buffer model in this section. In addition to the notation introduced earlier, consider QnJ = number of cells in buffer j at end of slot n. The goal is to find out as much as possible about the distribution of the Q''s. For this study of the output buffers we continue in direct analogy with the analysis in section 2.1. By separating into three cases it is seen that

which in compact form yields the Lindley-type equation

The recursive relation (4.1) shows that the simplistic i.i.d. structure in the loss model of section 4.1.1 is replaced in this case by the Markov property. The sequences {QJn}n>\ are all (dependent) Markov chains with marginal distributions of the same type as the buffer size sequence in section 2.1, with the particular choice of Bin(m, p/m)-distributed arrivals. As a consequence there is no equilibrium situation automatically. The theory of simple Markov chains, however, is applicable in order to find a limit distribution. It is known from (2.13) that E(K'n) = p < I is a sufficient condition for the existence of a steady state. The condition says that the average number of cells arriving at a given output j in a given slot n is less than the maximum capacity of the crossbar outlet, excluding only the extreme case p — 1, which is of little interest. As a first application of (4.1) in steady state,

A consequence we observe in passing is that since in this model the number of cells exiting the switch in a single slot n + I equals

82

Chapter 4. Cell-Switching Models

the equilibrium throughput is given by

/\ \- / which is an obvious no-losses balance relation. Next, repeating the arguments leading to (2.15),

for each j, where the time index n in K> n is dropped. Expressions for the steady state probabilities P(QJ00 = k), the same for each marginal QJX, are then obtained as explained in section 2.4. In particular, as (2.16) shows,

It follows that on average the total number of cells stored in any of the output buffers is given by

This gives a first indication of what buffer sizes are required in shared buffer packet switches. Variance calculations should yield further insights into the buffer dimensioning problem. This and similar models were studied by Hluchyj and Karol [19]. Another natural direction for detailed study at this point would be to cover so-called knock-out switches, where packets contend for output buffer space in a manner that can be compared to a knock-out tournament. See, e.g., Yeh, Hluchyj, and Acampora [71] and Kim and Lee [27].

4.1.3 Input buffer blocking The next variation of the m x m crossbar switch is the case in which each input line is equipped with a buffer in which cells are forced to wait in case of contention at a particular output node; see Figure 4.3. The basic assumptions are the same as before: Slotted time. A cell appears at each input line with probability p in each slot. Cells are equally likely to be destined to any output line. Usual independence assumptions. If two or more cells are contending for the same output, then one randomly chosen cell is allocated.

4.1. m x in crossbar

83

Figure 4.3. Output contention causing input buffering. The new feature is that any cell not able to leave the switch during a particular slot must wait at its input node in what we will call a backlogged state. During the next slot it will contend once again with other input head of line (HOL) cells directed toward the same particular outlet port, and this is repeated in subsequential slots until the cell is granted transmission. Hence we add Backlogged HOL cells remain at input lines for new attempts in the next slot. New arrivals at backlogged input lines are buffered, then fed in FIFO fashion until becoming HOL cells and thus viable for transmission. This system is difficult to analyze and no powerful method seems to be known. In any case we set up a model describing the switch and give some preliminary derivations. Assign N'n — number of cells including HOL in input buffer i at beginning of slot n, I'n = number of new cells arriving at input i in slot n (0 or 1),

"

, _ J 0

if input j empty, slot«,

1 ./' if HOL cell at input / is directed toward output j.

The quantities K^ from the previous section are respecified as

= number of HOL cells at beginning of slot n destined for output j. For a given j, if K/t > 1 the algorithm emits one cell chosen with equal probabilities among the K' HOL cells addressed to output j. Define also if HOL-cell at input nr i exits at output nr j.

84

Chapter 4. Cell-Switching Models

We have and

where we also introduced the notation {Jl}n>\ for i.i.d. sequences with the uniform distribution over the integers 1 , . . . , m. The total number of cells at the end of slot n that are buffered on the input side of this m x m crossbar is

and the total number of arriving cells in slot n is

A recursion for Nn is obtained by summing over / the relations (4.2) for individual buffers,

By inspecting the sum over index / in the last expression we see that at most one summand is nonzero and thus signifies a request for output at node j at the beginning of slot n + 1. Since the crossbar throughput in slot n + 1 consists of one cell for each output line with at least one request, that is, Tn+\ = 5^7=i ^-(K' >n» w e have

This relation could have been stated directly as a Lindley-type equation for Nn, balancing the independent arrivals to match the switch throughput. To give some indication of the behavior for this type of switch, Figure 4.4 shows a simulated trace of 4000 slots for m = 300 and p = 0.55, plotting throughput Tn and the number of nonempty input buffers Yn, given by

against slot numbers. The upper curve Tn varies around the mean throughput mp = 165. To get an idea of how the throughput sequence Tn is related to the number of nonempty buffers Yn, we consider a heuristic argument. Suppose there are y nonempty buffers in the

4.1. m x m crossbar

85

Figure 4.4. Throughput and number of nonempty buffers. switch at a certain time; the corresponding y HOL cells will typically request their output nodes evenly distributed over the m available lines. As a result the throughput in the next slot will be the number of output nodes with at least one request. It is natural to rephrase this situation as a classical occupancy problem, namely, y balls are distributed randomly over m boxes and each of the my possible outcomes are equally likely. With T denoting the number of nonempty boxes and U/ an indicator function of the event that box i is empty, we have

Solving for y with t — E(T),

and thus, inserting t — mp = 165 and m — 300, we have y » 239 in reasonable agreement with the simulated result in Figure 4.4. To further enhance understanding of how the switch behaves, the next simulated trace shows the evolution of the total buffer size Nn over a range of 40,000 slots for two values of the traffic intensity per node p = 0.55 and p = 0.58. The simulation started from an empty system. Figure 4.5 shows that there is a drastic and interesting change in the required buffer size as the traffic intensity increases in the range above p = 0.5. For p = 0.55 the buffer size seems to settle quickly in a steady state with moderate fluctuations. However, at p = 0.58 the trace of Qn is quite different both in magnitude and fluctuations even if it seems to stabilize after an initial transient period. By increasing the intensity just a bit further to, say, p = 0.59, the resulting graph of total buffer size will show no tendency of stabilizing. This phenomenon is partly explained in Theorem 4.1. A somewhat related situation where a sudden collapse of effective throughput results when the input rate increases was studied by Kaniyil et al. [25]. They consider a network node operating under the input buffer limiting scheme. The input to the node consists of

Chapter4. Cell Switching Models

T

Figure 4.5. Total buffer size for m = 300, p = 0.55, and p = 0.58. transit messages from other nodes in the network plus the locally generated input messages. The node is equipped with a finite buffer subject to the crucial restriction that only a specified fraction of buffer space may be occupied by locally generated input messages. Kaniyil et al. [25] studied the performance parameters of the node just before onset of congestion using gradient dynamics and dynamic flow conservation. The methodology suggested to investigate these stability aspects might be useful also for the input buffer case considered here.

4.1.4

Input blocking, loss system

To simplify the system and to be able to study the input blocking phenomenon, we now remove the input buffers and assume that any cell arriving at an input port where another cell is already contending for transmission is lost. Slightly rephrasing the interpretation of the key quantities from the earlier analyses, we have K'n = number of backlogged input lines in slot n with cells destined for output j, 11 = number of new cells arriving in slot n destined for output line j. The throughput of the switch, i.e., the fraction of successfully transmitted cells per slot, is given by

It is clear that Kj can be thought of as the content of a virtual buffer set up to control transmissions through output port j. It is therefore natural that the key relation for buffer size variables appears in the form

4.1. m x m crossbar

87

Figure 4.6. Throughput in crossbar input loss model, m = 500, p = 0.3 (lower), and p = 0.7 (upper). in analogy to relations (2.1) and (2.5). In this case, however, the key observation is Given the ATn''s, the family (IJn+\} has the multinomial distribution, Multnom(Z n , ^ , . . . , ^), where Z,, = ^J=i ^ ^s binomial, Bin(m - ^"'=1 tf>, p) distributed. It follows that { ( K t [ , . . . , K ' " ) , n > 0 } i s a M a r k o v c h a i n w i t h f i n i t e s t a t e s p a c e w h i c h is irreducible and nonperiodic. Existence of a stationary distribution is a consequence of Theorem 2.1 in section 2.3, adapted to the case of vector-valued Markov chains. We refer to the corresponding steady state by writing {K^}. Similarly, T^} denotes equilibrium throughput. The simulation in Figure 4.6 of the level and variation of T,,, n = 1, 1800, for switch size m = 500 and two /^-values 0.3 and 0.7, gives some intuitive understanding of the model. Theorem 4.1. For the m x m zero buffer input loss crossbar switch, HOL blocking results in the throughput reduction

in particular, for the saturated case p = 1,

Chapter 4. Cell-Switching Models

88

Figure 4.7. Input loss model, throughput lower bounds. We conjecture that in the limit m —> oo

in L 2 (and almost surely), but we have not been able to show rigorously that indeed V(T^) ->• 0 as m ->• oo. The standard practice in the literature for obtaining the throughput 2 — -s/2 is to approximate the multinomial distribution of the variables 7; by independent Poisson random variables; see Walrand and Varaiya [65, Chapter 10.8]. Simulations indicate clearly that, in fact, (4.5) is the correct asymptotic throughput of the input loss switch

89

4.1. m x m crossbar

as the size of the switch grows. A further remark is that the given lower bound is relatively accurate for small m. As an example we outline in Exercise 4.3 steps for deriving by direct calculations that the throughput of the 2 x 2 input loss switch is given by

and thus under saturated load is equal to 0.75. The corresponding lower bound in Theorem 4.1, withm = 2and/? = l,is(7-vT7)/4 % 0.7192. Graphs of the load-versus-throughput lower bounds are shown in Figure 4.7 for m = 2 and m -> oo. To prove the inequalities in the theorem we begin by assuming that an initial configuration { KQ } has been given and by introducing the averaged quantity fraction of busy input lines. By conditioning on {K^} we obtain expressions for the mean and the second-order moment of /j+]. First, since each component I}n+} is binomially distributed with parameters m — E7=] Ki and p/m.

Second, an application of the conditional variance formula V(X) = E(V(X\Y))+V(E(X\Y~)) implies that

hence The idea is now to consider the recursive relation (4.4) in equilibrium, i.e., to suppose specifically that the distribution of {K^} equals the steady state distribution, hence that the distributions of K^+] = K/t coincide for all n > 0, and compute the expected values. This gives hence

In addition, by expanding the square of relation (4.4),

90

Chapter 4. Cell-Switching Models

therefore

It remains to sum this identity over j from 1 to m, divide by m, and simplify to obtain a relation, which if we write temporarily,

takes the form

It follows that the expected value x = £(V (m) ), say, of the random variable V(m) satisfies the inequality The stated inequality for E(T^) = p E(V<m^) = px is now obtained by solving the corresponding second-order equation.

4.2

Exercises

4.1 Consider an m x n crossbar with m input and n output ports operating in a slotted time fashion. During each slot, cells arrive at the input ports independently of each other and with probability p for each input. Arriving cells are immediately assigned a particular output selected randomly with equal probabilities amongst the n available output lines. During the same slot each output port has the capacity to process exactly one cell. (a) Assume first we have the output loss case of section 4.1.1. Hence if several cells are assigned to the same output destination, all cells except one are lost. Compute the switch throughput (in cells per slot), i.e., the expected number of busy output ports at the end of a given slot. What is the natural definition of utilization for this model? Find the expected number of lost cells per time slot and from this the loss probability. Verify that again formula (3.1) holds. Suppose the switch is a concentrator in the sense that n < m with concentration rate ft given by the limit n/m -> /J as m, n -> oo. Express utilization and loss probability in terms of p and /?. (b) Now consider the output buffer case (section 4.1.2). For which p values does the system settle into equilibrium? Find the expected buffer size. Answer the same questions for the asymptotic case described by the concentration parameter ft above.

4.2. Exercises

91

4.2 Consider the m x m input buffer ATM switch symbolized in Figure 4.3 and its simpler variation, the input blocking loss system, studied in section 4.1.4. One method with which to reduce the effect of HOL blocking in the buffered system is to maintain for each input port m separate buffers, each containing cells destined for one of the m output ports. We wish to study the input blocking loss version of such a switch. Thus, suppose each input port is equipped with one storage location for each output that enables it to store up to m cells as long as they are addressed to different outputs. Also assume that a cell appears at each location with probability p/m in any given time slot. The actual switching operation is then described as follows. In each slot, each output port performs a search of its m input locations and picks one cell randomly if at least one is found. Introduce for 1 < j < m, KJn = number of backlogged cells in slot n destined for output j, Nft = number of new cells arriving in slot n destined for output j. This could be called a shortbus model; the terminology refers to a design where each output is able to send its bus one round trip every slot to update the Kn value and eventually pick a cell for transmission.

(a) If for some n we are given the tKj's, then in the next slot n +1, what is the distribution of the family of random variables {A^+|}? Fix one output port, drop the superscripts, and write Kn and Nn for the number of backlogged and new cells in slot n. Show that the resulting Markov chain (Kn)n>\ satisfies

(b) Calculate the conditional expectations

(be careful with the case k = 0).

(c) Denote by jt, = ^-,K [ml] the proportion of backlogged cells on the new time scale

obtained by using a speed-up factor m. One can show (see Chapter 6) that the dynamics of X, are described by the drift and quadratic variation functions defined as

where x is of the form k/m for some k, 0 < k < m. Find the limiting functions

(d) Make a simulation study of the shortbus model. As the load tends to saturation, i.e., as p —>• 1, does the system stay stable or are there indications of congestion, severe losses, etc.? This exercise is continued in Chapter 6.

92

Chapter 4. Cell-Switching Models

4.3 To calculate exactly the throughput in the 2 x 2 input loss switch we consider the bivariate Markov chain (K^, K2) on the states (0, 0), (0,1), and (1,0), with equilibrium probabilities TTQQ, jr0i, and JTIO in steady state. Given the state of the Markov chain at time n, what is the conditional distribution for Zn ? For given values of Z n , what is the throughput at output 1? Sum over possible values of Zn and sum over states to obtain in steady state the single port throughput 1 /2+p/4+TTOO (3p/4—p2/4-l /2). Argue that the same throughput is given on the other hand by p P(K^ = 0) = p(\ +jroo)/2. Identify the expressions to get the result p(\ — p2/2(2 — p + p2)) stated in (4.6).

Chapters

Cell and Burst Scale Traffic Models

The next objective is to focus on the hierarchies of time scales that are often used to better understand the traffic structure in modern communication networks, and to study specific features of them from the point of view of mathematical modeling. Recall the three-level partitioning in call, burst, and cell time scales briefly discussed in sections 1.2 and 1.3 and shown in Figures 1.3 and 1.4. The classical Markov theory, Chapter 2, covers the call level category. The traditional non-Markov extensions, discussed in Chapter 3, apply to call level and are to a certain degree suitable also for models on the packet or cell level. As an example the M/D/1 model on one hand applies to circuit-switched voice call transmission as a generalization of a pure Markov model and on the other hand to the queuing analysis of fixed-size packets. In Chapter 4 a number of cell scale models were introduced. To further distinguish cell and burst scales, we note that if a short time interval is considered, then the cells encountered by the network during that period typically originate from different users, which motivates over such lengths of time independence between cells. Seen over a longer time scale, however, it is likely that many cells are sent from the same source, leading to correlations in the data streams. This chapter continues with further examples of cell level dynamics, with a presentation of burst level rate models and with material related to long memories in network traffic. In particular, we encounter methods designed for what has been called "the failure of Poisson modeling"; see Paxson and Floyd [48].

5.1

Cell-level traffic

We present four examples of models for stochastic dynamics arising on the level of individual packets on a microscopic time scale. In the first example ATM cells merge on a common link. The second example deals with the delay structure in a stream of IP packets of PCM encoded voice. The subject of the third example is the random variations in round-trip times of Internet traffic. Finally, the fourth example starts with datagrams at an interface of a video application and derives the distribution of packet sizes at a transport layer interface. 93

94

Chapter 5. Cell and Burst Scale Traffic Models

Figure 5.1. ATM output buffer.

5.1.1

Isochron multiplexing

We consider an ATM multiplexer such as the system in Figure 5.1, obtained by removing one of the output buffers in the crossbar model Figure 4.2. A number of incoming streams are all directed toward this particular output pipe, toward the same outgoing multiplex. In section 4.1.2 we studied the corresponding multiplexer queue under the assumptions of slotted time and the arrival stream consisting of Bin(m, p/m) cells per slot and with independent arrivals at each slot. This particular model is sometimes called the Geom/D/l queuing system. We now consider other models, as the arrival stream may have a different structure than in the case of the crossbar model. In continuous time, if we can assume that the cells arrive as a Poisson process, the M/D/1 model is applicable. The basic quantities of interest such as expected buffer size are then obtained immediately from the PollaczekKhinchin mean value formulas for M/G/1. The main objection to Geom/D/l or M/D/1 in ATM cell transfer mode traffic models is the evident periodic nature of cell emissions within bursts. Cell emission streams that are consistent with the two models mentioned above on the contrary possess independence between slots (Geom/D/l) and independence between increments (M/D/1). The basic model suggested for ATM which specifically addresses the periodic arrival nature is the m*d/D/l system. The input process is the accumulated arrivals from m independent users each transmitting a periodic stream of cells with period d, i.e., one cell every d time units. The m sources are asynchronized with each other and as a result the phases of the periodic emissions are randomly mixed. As one example, Figure 5.2 shows an arrival stream over five periods generated by m = 30 users emitting cells periodically. To provide an equilibrium state buffer the traffic intensity must obey Q = m/dC < 1 if the capacity of the link is C cells per time unit. Some theoretical results are known for the m*d/D/l model; in principle the distributions of system size and waiting time can be found—none of them in a particularly enlightening form, however. Approximative results that are useful for dimensioning buffers are known in the heavy traffic regime, i.e., traffic intensity Q close to one. Numerical calculations show that the required buffer size necessary to keep losses below a given level is significantly lower than the corresponding buffer in M/D/1 exposed to the same load. This should be expected in view of the apparent isochronous smoothing in the input data. See [8] and [64] for discussion of such results and further references.

95

5.1. Cell-level traffic

Figure 5.2. 30* l/D/l input process. It is probably misleading to use m*d/D/l as a reliable model for ATM. It is a severe restriction that all users are supposed to generate cell streams characterized by the same period d. The system J2T=\ di/D/1 is a generalization, where m users emit cells periodically with deterministic holding times but now allowing individual periods dt, i = 1 , . . . , m. This more general model has the potential to generate a traffic structure that is drastically different from m*d/D/1. For this discussion we assume that the periods d\,..., dm are obtained, in advance of the transmission period, by each user making an independent observation of a random variable A with a given period distribution. Hence a sample A ' , . . . , A m of size m represents the variability in the user stream periods. To obtain stationarity one should suppose that at time t = 0 each user starts from a random location within its period. The first cell emitted from user i after time t = 0 arrives at a time that is a fraction of A', after which cells arrive periodically with the same interarrival time A'. For simplicity we ignore this finer point; all streams begin just after the completion of a period, and therefore N't = number of cells from user / before time t = f f / A ' l ([x] denotes largest integer < x). Let A, denote the empirical mean number of cells per user at time t, i.e.,

Clearly

and so

Here we take note that £(1/A) can be unpredictable and also state that if we approximate

96

Chapter 5. Cell and Burst Scale Traffic Models

Figure 5.3. Y^Ti^i/D/l, exponential periods. N't with t/A', then

If the input lines N't are Poisson processes, then the corresponding result would be V(A,) ~ t/m, whereas in this case the input variation in J^Li difD/l is on the scale of magnitude ~ t2/m. Figure 5.3 shows 20 realizations of arrival streams A, each generated by m = 100 users. Also shown is a blow-up of the initial period of time of the same simulation. For the simulation we used an exponential distribution for A. This is an extreme case where E(\/ A) does not even exist finitely. In practice there is a lower bound to the period, corresponding to the peak rate transmission throughout the observed interval. Nevertheless, this example has some relevance for understanding more realistic cases, such as a slotted time model with periodicities mimicking a mixture of ATM users.

5.1.2

Voice packet streams in Internet telephony

We present a Markov model for the time delay variation on cell scale of packetized voice traffic typical for Internet telephony applications. The model is introduced and studied in detail by Kaj and Marsh [22]. We start by looking at the traffic pattern produced by sending a PCM-encoded speech file from Argentina to Sweden over the Internet. In a first packetization step the file is segmented into packets of 160 8-bit PCM samples each. This gives a periodic sequence of 160-byte voice packets with 20 ms interpacket times. As the audio stream passes through buffers and interacts with cross traffic en route to its destination, packets may experience random delays. The typical effect of a single packet being delayed is that the distance in time to the next packet ahead increases and the distance in time to the packet immediately behind decreases. Eventually a fast packet even overtakes a slow packet ahead and from then on must adapt to the speed of the slower packet. As a result the observed interpacket arrival times at the receiver are sometimes shorter than 20 ms and sometimes longer.

97

5.1. Cell-level traffic

Figure 5.4. Voice over IP data.

Figure 5.4 illustrates observed interarrival times for a short segment of duration 5 seconds taken from a longer trace of voice data sent from Buenos Aires to Stockholm. The upper graph in the figure shows observed interpacket arrival times on the y-axis versus packet sequence number on the X-axis. The corresponding histogram of interarrival times is shown in the lower part of the same figure. Attempting to understand the nature of the interarrival time data we make the following model assumptions. First, let the 20 ms packetization interval be the time unit in this section. When we say that packet nr k is sent at time k — I it means the packet is sent at clock time 20(fc — 1) ms into the data stream. We assume that the packets are subject independently of each other to transmission delays YI , ¥2, . . . given by i.i.d. random variables with distribution function

and with finite mean transmission time [a = f£°(l — F(x}) dx < oo. For the data we are concerned with here, typical values of fj. could be 2CM-0, i.e., 400-800 ms. If the random fluctuations of the transmission delay variables Y^ were the only source of jitter in the model, then packet k would arrive at time k — 1 + Yk. But since the packets must arrive in the same order as they are sent, the actual arrival times at the receiver occur

98

Chapter 5. Cell and Burst Scale Traffic Models

at times T\, T2,..., where

Therefore the observed packet interarrival times are given by

We also introduce the observed delay of the packets as

The representation for Tk can be written

Hence

and therefore

Moreover, for x > 0,

From these relations we conclude that (7^) and (V^) are Markov chains such that the latter and the difference sequence (t4) of the first have asymptotic distributions as follows. Theorem 5.1. The sequence (VJt) is a Markov chain with asymptotic distribution

The sequence ((4) has the asymptotic distribution

hence in particular a point mass in zero of size

5.1. Cell-level traffic

99

Moreover, E(UX) = 1 The last statement, which verifies that in this model the mean interpacket arrival time is preserved at 20 ms, requires a proof. One technique is indicated in Exercise 5.1; see also Kaj and Marsh [22]. In principle an estimate of the unknown distribution function F can be obtained from a given data set (?*)*=! of arrival times by forming the observed delays (ut)|t = i, Vk = t^—k + l, and the corresponding empirical distribution function

Then simply apply the estimate

We now return to the basic assumptions in the proposed model for voice over IP (VoIP) and discuss a more general version. In this context we use trace data to illustrate typical statistics of recorded VoIP telephone calls. The model discussed so far covers the transmission of a continuous audio stream where packets are generated at a steady pace over time. In reality VoIP systems are equipped with silence suppression. This mechanism is typically attached at the analog input port; if the sound level falls below a given decibel threshold the packetizer stops recording packets and assumes the person on the call is quiet. To include the effect of the silencer in the model we introduce a further sequence X], X^, . . . of i.i.d. random variables and assign Xk — length of quiet period between packets k — 1 and k. It is natural to specify the distribution of X letting, for some (small) a > 0, P(Xk = 0) = 1 — a. This is the (large) probability that silence suppression is not effectuated just after packet fe — 1 is played out. With probability a, on the other hand, Xk is positive and the audio stream is stalled during a time interval of length Xk time units. The reception times T\, ? 2 , . . . of the sequence of successive packets are modified accordingly and are now given by

where

is the total duration of quiet periods preceding packet k. For this model it can be shown [22] that the interarrival times C/* = Tj — 7i-i are such that

The sequence (Vk), which we have seen is a Markov chain in the case of X = 0, generalizes naturally to the case of silence suppression by putting VJt = 71- - k + 1 - S*. For the sak

100

Chapter 5. Cell and Burst Scale Traffic Models

Figure 5.5. VoIP data with silence suppression, sequence (Vk). of visualizing a trace data file recording of a VoIP call consider the related sequence

Given a trace file of arrival times {^}, hence of the interarrival times {Uk}, estimate 1 + E (X \) by the sample mean u and form the sequence v* = & — (k — 1)«, k > 1. Figure 5.5 shows such a sequence (iik) for a VoIP call lasting approximately 95 seconds, of which the recorded person speaks for 40 seconds (2000 packets). The remaining silent period consists of six longer intervals of length exceeding 3 seconds plus many shorter intervals of varying lengths. These correspond to the distribution of the sequence of quiet periods, (Xk). The typical properties of the distribution of (Xk) are further illustrated in Figure 5.6, which shows histograms of the durations of talk spurts and of silent periods in a similar data set of a recorded call. An approximation of the parameter a for this data set gives a & 3%.

5.1.3

Round-trip time distribution, PING data

In packet-switched networks the packet round-trip time is an important characteristic. The round-trip time is the measured overall delay of a packet successfully transmitted from sender to destination and returned to sender. It is sometimes reasonable to assume that the round-trip time stays constant over time. Communication satellite traffic is such a scenario where often the main part of the round-trip time is the propagation delay of 270 ms, which is the time required by a packet traveling at the speed of light to the satellite band back and forth. In performance calculations with discrete time models the round-trip time often forms a natural time slot. This is the case, for example, using reliable data transfer protocols. Recall section 3.3.4, where the round-trip time consisting of a single packet transmission time plus the ACK return time is used as a fixed model parameter. Similarly, in the study

5.1. Cell-level traffic

101

Figure 5.6. VoIP data, nature of recorded voice signal sequence (V*).

of the TCP protocol in section 6.4.3 we apply the same simplifying model assumption that the round-trip times are constant. Meanwhile, in this section we study the random fluctuations that are necessarily inherent in round-trip time measurements. The trace data set introduced in section 1.3 (see Figure 1.13) indicated the character of such variations in Internet regional traffic data (approx 100 km distance). We introduce now a round-trip time model that refers to packet data of this sort and that is based on the packet delay mechanism discussed in section 5.1.2, dealing with VoIP data. However, the general aspects of the model are independent of the particular application. Hence the model should be applicable in various situations where round-trip time variations are an issue. We want to provide an explanatory model for data such as in Figure 1.13. Figure 5.7 shows the corresponding histogram for the RTT trace file. PING packets are sent once every second. The maximum round-trip time in the data set is less than 700 ms and so each packet has been returned before the next is transmitted. In this sense the packets in sequence can be considered independent of each other. The packets do interact, however, with other traffic to and from the servers involved and with cross traffic of the Internet service provider. As a PING packet enters the transmission link we can think of this packet as being placed in line with other packets, much like the VoIP packet trains discussed in the previous section.

102

Chapter 5. Cell and Burst Scale Traffic Models

Figure 5.7. Histogram of round-trip time data. Hence the one-way delay of the PING should be well captured by the random variables V)t defined in (5.2). The interpacket distance of 20 ms from the VoIP application has no particular significance here but should be replaced by the typical interpacket distance on the transmission link in question. With this motivation we propose the following model for round-trip time data from PING. Assume that the random variable Y with distribution function F(y) = P(Y < y) corresponds to the typical delay of a packet, ignoring the possibility of being stalled by other packets ahead in line. Thus Y reflects the general transparency of the network, variation in routes, number of hops, etc. The typical choice is a black-box model such as

where /u, is a propagation delay parameter and a 2 and A represent different types of jitter. Let y(1) and V^ be two independent random variables with the asymptotic distribution of the Markov chain (Vj) in Theorem 5.1 and define as round-trip time distribution

This is the required time for a packet to travel both ways to the receiving end and back. The packet suffers from delays represented by the random variable Y and it is under the influence of other packets, potentially causing slow-down effects. The interfering packets are likewise delayed according to Y with the same distribution function F and it is assumed that delays occurring in one direction are independent of delays during transmission in the other direction. Figure 5.8 shows numerically calculated densities of the delay distribution V and the round-trip time distribution R for the case where Y is an exponential random variable. The scales have no significance; the graphs merely indicate the character of the

103

5.1. Cell-level traffic

Figure 5.8. Density profiles ofY (left), V (middle), and R (right). transformation leading from Y to R. Appropriate densities for R can possibly be fitted to empirical densities such as in Figure 5.7. Statistical analysis like goodness-of-fit should be applied to trace data of round-trip time measurements as a basis for validating or rejecting the proposed model.

5.1.4

Packet fragmentation in video communications

In this example we consider data from a variable bit rate video application. The video signal is encoded according to the MPEG standard and transmitted over the Internet via the user datagram protocol (UDP). As a result we may assume that the only variable quantity in the sequence of datagrams is the frame length. The data shown in Figure 1.11 give an example of such traffic streams. Wolfinger et al. [70] considered the transformation of an encoded video signal as the data are passed from the interface of the video application with UDP transport onto a network interface that belongs to a lower protocol layer. By modeling the load transformation effective between the two interfaces analytically, they obtained an arrival stream at the network interface based on the nature of the arrival stream at the application interface. In many situations this should be an advantage as measurements typically are simpler on an application level. Our example follows Wolfinger et al. [7()|, who analyzed the effects of header-generation and fragmentation. To model the primary load and discuss the secondary load at the network interface they assume that the frame length distributions are normally distributed. We extend their results by providing the exact distribution of the induced secondary load packet size. Consider encoded video data and put Xi = length of 1-frame no k (byte),

k= 1

n.

We apply the simple model that the sequence is a sample of i.i.d. random variables from the

104

Chapter 5. Cell and Burst Scale Traffic Models

normal distribution In this approximation it is assumed that IJL is sufficiently large and a sufficiently small to guarantee that in practice X > 0. Underpinning this model is also the assumption that I-frames are sufficiently separated in time so that the correlations in the sequence can be ignored. More diligent modeling could be based on a multivariate normal distribution with three components, one each for I-, P-, and B-frames, and covariance parameters measuring their dependence [8]. Next we consider the change in data unit lengths within the UDP and IP layers. The UDP adds headers with specific control information to the original MPEG packets and IP will cut packets into smaller fragments to conform with a maximum transmission unit (MTU) on the network layer. We use UDP headers of h = 40 bytes and the MTU = 1500 bytes for ethernet. Thus if an original data frame is larger than 1460 bytes the transformation leads to a number of IP packets of size 1500 bytes plus a fragment of smaller size. Letting Y represent the distribution of the frame lengths now measured in units of MTUs we have

The resulting secondary load arrival process is specified by assigning Z — [Y] — number of max size IP segments per MPEG I-frame, R = Y - [Y] = length of IP packet fragment. The distribution of the integer valued random variable Z is given by

where <J> is the distribution function for the standard normal. The distribution function for the fragment size 7? can be written

hence

It was observed by Wolfinger et al. [70] both numerically from the model and by investigating empirical data traces that the pure fragments were distributed close to the uniform

5.1. Cell-level traffic

105

Figure 5.9. Histogram ofl-frame lengths in encoded video of the motion picture Ghost. distribution on the interval (0, MTV), so that P(R < jc) ~ x for 0 < x < 1. To verify this it is convenient to turn to the corresponding density function

Using numerical software this function can be computed with great accuracy. One finds that as soon as the segment size is not too large in comparison with the variation in the data, the fragment size distribution is very close to uniform. For example, if a > 1500, we have

and if a > 2- MTU, the upper bound of the deviation can be reduced to 10 15 As a numerical example we take a sequence of n = 18,1691-frames representing the motion picture Ghost (Paramount Pictures, 1990). Figure 5.9 shows a histogram for the data and numerical estimates of the parameters /j. and a (bytes) are given by

The distribution of Z is given by

which gives

106

Chapter 5. Cell and Burst Scale Traffic Models

The empirical fragment distribution fits a uniform distribution on [0, 1] (mean 0.5, standard deviation l/-«/12 = 0.2887) with great accuracy. The sample mean of calculated packet fragments in the trace is f = 0.4943 and the sample standard deviation sr — 0.2896.

5.2

Burst-level rate models

Next we turn to models relevant for such examples as the ethernet data in Figure 1.8. One natural approach to modeling such arrival processes is to introduce an arrival rate process A, related to the arrival process by

With this approach network traffic is approximated by continuous flows. If A, is Markov, then A, is a Markov-modulated fluid model. Examples of form (5.3) were discussed in section 3.3, namely, the renewal rate process and the on-off processes with A, alternating between two states. A further interesting example is the Kosten model, in which it is assumed that A r = hZt, where h is a given peak rate parameter and Zt denotes the system size process of the infinite server Markov model M/M/oo. In particular, in steady state, Z, is Poisson distributed for each fixed t. A straightforward generalization of the Kosten model is the Poisson burst model, where Z, is the system size process in M/G/oo, allowing a general service time to enter as a parameter. Figure 5.10 shows trajectories of such arrival flows for the M/G/oo model in Figure 3.10. The special case in which the tail of the service-time distribution is assumed to decay so slowly that the variance is infinite is of particular interest. In this case A, exhibits long-range dependence. The integrated process in (5.3) is only one option for modeling arrival traffic. Other choices include compound Poisson processes of the form

where Nt is a Poisson process with jumps {f;} and (X,) has the role of a rate process, and time-changed Poisson processes with an intensity process that varies over time.

5.2.1

Anick-Mitra-Sondhi model

Consider ra independent, alternating renewal process on-off sources Z',,i = 1 , . . . , m, as in section 3.3, which during on-periods transmit cells at peak rate h over a link of capacity c bits per second. Each source works under a fixed mean rate X and it is assumed that the sources have attained stationarity. Then

y = P(a given source is in the on-state) = and

= number of sources in on-state at time t e Bin(/n, y).

5.2. Burst-level rate models

107

Figure 5.10. Arrival processes in the generalized Kosten model. The Anick-Mitra-Sondhi (AMS) model is the continuous, multiplexed, work-load arrival process denned as in (5.3) by letting

We compute first- and second-order moments for the special case where each source is given by the two-state Markov process with jump rates a and ft. Then

The superposition Z, of all the m sources also is a Markov process in this case, namely, a birth-and-death process on {0, . . . . m\ with intensity )^ — a(m - k) for upward jumps and /j,k = ftk for downward jumps. By (3.7),

and by (3.9),

Since the autocovariance function of the Markov on-off process is known from (3.18),

108

Chapter 5. Cell and Burst Scale Traffic Models

can be found explicitly. Other properties of the AMS model are discussed later. Figure 5.15 illustrates heavytailed on-off times and Example 12 in section 5.3.4 deals with statistical measures of burstiness. Section 6.1.2 in Chapter 6 discusses an example of bandwidth allocation in admission control using this model. 5.2.2

Markov modulated Poisson process

The term Markov modulated Poisson process (MMPP) refers to a situation where a modulating process J,, typically an irreducible Markov chain in continuous time with a finite number of states 0 , . . . , m, governs the rate of Poisson arrivals. During periods such that /, = j incoming traffic arrives according to a Poisson process of intensity Ay, 1 < j < m. As J, changes state so does the intensity of the Poisson source. It is intuitively clear that if the modulating process has a steady state distribution {JT,-}, then the effective arrival rate is given by A.^ = ^7=o ^i^-i- Service systems with MMPP arrivals can be analyzed; see, e.g., Mitrani [38]. We study the particular case A.J• = h • j and the modulating process given by the superposition of independent on-off sources Z, in (5.5),

restricting to the case where Z, is Markov. Consider a family of independent Poisson processes with intensity h, i.e.,

where each process generates Poisson events at time points Tln = £)t=i U'k. All the U'k 's are independent and exponentially distributed random variables with the same expected value l/h. We associate one Poisson process with each on-off source ZJ in (5.7) and define th arrival process by

which means that A, is formed by going over all Poisson events in [0, t] and counting only those that occur as the corresponding source is on. This can be seen as a Poisson process modulated by the number of sources Jt which are on, in the sense that the actual intensity driving the counting process A, is random and at any given time t equal to h J,:

A simulation for the case m = 3 is shown in Figure 5.11. The modulating Markov chain is plotted in the lower part of the figure and the resulting arrival process in the upper. In analogy to (3.7)

5.3. Long-range dependence traffic models

109

Figure 5.11. Markov modulated Poisson process. The relation V(X) variance

= E(V(X\Y)) + V ( E ( X \ Y ) ) suggests a method for computing the

Conditionally, given a sample function of the process Zt,

and

Hence

The first term was calculated in (5.6) and the second term equals h J0' E(Z2S) ds = hyt.

5.3

Long-range dependence traffic models

Let us return to the question of whether it is reasonable to assume Markov interarrival or service times in real networks. In recent years large sets of traffic data have become available. Statistical analysis has shown strong evidence that certain categories of data are not covered by such assumptions. We mention two examples.

110

Chapter 5. Cell and Burst Scale Traffic Models

Duffy et al. [10] reported on a statistical study of telephone call holding times (CHT), where during 8 hours a total of 302,225 calls started and the measured durations varied from 0.001 seconds to 29.5 hours! Given such data and writing S as a generic notation for CHTs, one can perform a statistical test on the hypothesis that S e Exp(/Li) based on the fact that the tail of the distribution,

decays as x -» oo at exponential rate to zero if the hypothesis is true. Such investigations in fact showed that the tail distribution for CHT is much better described by a function of the form

i.e., the distribution possesses heavy tails. The second example is the experiment performed at Bellcore from 1989 onward (see also section 1.3). The Bellcore data of ethernet LAN traffic has triggered over several years much interest in traffic modeling based on the concepts of self-similarity and long-range dependence, mathematical notions that are related to heavy-tailed distributions. See Leland et al. [31] and Willinger et al. [68]. The collected data sets were monitored in traces of 40 hours containing approximately 27 x 106 packets. The data can be viewed either as traces of the number of packets arriving per time unit or, if the varying packet sizes are taken into consideration, as traces of the number of bytes arriving per time unit. The second version is a kind of workload process. To understand the specific character found in the ethernet data we divide time into 10 ms intervals and count Xk = number of ethernet packets arriving at interval nr k, k > 1. Figure 1.8 shows such a sequence (X^) but with a time slot of 1 second. Next, form a sequence of 2-block averages

and, more generally, the aggregated sequence

For a fixed m we can now plot the new sequence (-X^ )*>i and compare with the original sequence. Indeed, if we take the same data as in Figure 1.8 and form a new trace with m = 5 (i.e., choosing the time bin 5 sec), then the result is as shown in Figure 5.12. The basic statistical character looks pretty much the same. The original experimenters at Bellcore were able to handle such large amounts of data that the same rescaling could be performed for the five time scales m = 1, 10, 100, 1000, 10,000 on a single trace. Remarkably, the resulting sequences ranging over time scales from milliseconds to minutes were similar and in this sense the process X (m) is similar to itself, regardless of m. This has been taken as an indication of self-similarity and the related property of long-range dependence.

5.3. Long-range dependence traffic models

111

Figure 5.12. Ethernet arrival rate process. 5.3.1 Self-similarity We state some definitions. The formal analog of the idea of considering aggregate blocks is the following. Consider a stationary sequence of identically distributed random variables (Xn) with mean E(X) = 0 and variance V(X) < oo. The sequence is said to be self-similar with self-similarity parameter H if for any m = 1,2, ...,

have the same finite-dimensional distributions. In the general case fj, = E(X) < oo, it is required that for each m > 1,

have the same finite-dimensional distributions. The parameter H ranges from 0 < H < 1. For applications to traffic data it seems most suitable to consider, instead of packet counts x['"\ the corresponding centered measurements and simply apply (5.9) where all random variables have zero mean. Hence in the following we restrict ourselves to /i = 0. If the weaker notion applies that (mX(^n)) and (mu'X/ 1 (so that r ( 1 > = r). It is left as an exercise to verify that for a second-order self-similar sequence, for any k

112

Chapter 5. Cell and Burst Scale Traffic Models

Figure 5.13. Autocorrelation function, ethemet data. A deeper result is now that a function r(k) which satisfies such scaling relations must have a specific form. It can be shown that the second-order self-similar sequence autocorrelation function is forced to be

Observe that the right side is a difference approximation of the second-order derivative of the function f ( x ) = x2H, hence

In practice one should resort to the less restrictive assumptions of asymptotic self-similarity or second-order self-similarity in the sense that the above properties hold asymptotically for large m. The strict properties quoted here, however, give the basic intuition. Figure 5.13 gives an example of an estimated autocorrelation function. It is based on the BBC news video data shown in Figure 1.9 and shows surprisingly large values over long distances, probably due to some machine-generated periodicities in the material. The noisy curve gives an idea of the estimation error. It shows the autocorrelation function based on the same trace but where all correlations have been destroyed by shuffling the video frame size data in random order. We turn to the closely related definitions designed to capture long memories. The sequence (Xn) is called

5.3. I ong-range dependence traffic models

113

Clearly a second-order self-similar sequence is long-range dependent if 1/2 < H < 1, since then

In continuous time we have the corresponding definitions that a stationary process At, t > 0, is self-similar if the distributions of AT, and TH A, are equal for every T > 0 (5.14) and possesses long-range dependence if and only if

5.3.2

Heavy-tailed rate models

In this section we investigate long-range dependence in the renewal rate process and the M/G/OG version of Kosten's model, the Poisson burst process. First let A ; be the stationary renewal rate process in (3.14) with associated interrenewal time distribution F ( t ) = P(U < t) of finite mean v = E(UT) and rate sequence (X/,) with mean E(X\). The corresponding arrival fluid model is A, = J0' Asds, as in (5.3). The autocorrelation function /-(/) of A, is found in (3.16). Since

it follows from the criteria (5.15) that the renewal rate process is long-range dependent if and only if U has infinite variance. To obtain the variance of the fluid arrival process we apply (3.9) in the form

which together with (3.16) yields

The Poisson burst rate process obtained by replacing the renewal rate process above with M/G/oo leads to very similar expressions. For this case we let At denote a stationary version of the M/G/oo process based on a given service-time distribution S and repeat the previous calculations. It follows that again the model is long-range dependent if and only if S has infinite variance. The arrival process variance is calculated from (3.9) and (3.31) as

114

Chapter 5. Cell and Burst Scale Traffic Models

Figure 5.14. Brownian motion approximation, W(n\ n = 1000.

5.3.3

Fractional Brownian motion

A common technique in stochastic process theory is to approximate jump processes that take their values in a discrete set with continuous state processes. Visualizing jump processes like those in Figures 1.7 and 2.1 brings to mind continuously varying random processes. To establish the basic idea, consider two independent Poisson processes A,, B,, t > 0, both having intensity A > 0. Put

For large n this means that the Poisson processes have evolved over a long time interval; we haveE(Ant~Bnt) = OandV(A nr -5 n( ) = 1\nt. Hence E(W/ n) ) = OandV(W/ n) ) = 2\t. Figure 5.14 shows 10 simulated trajectories of such processes W t ,0 oo, it follows from the central limit theorem that W, converges in distribution to some random variable W, € N(0, cr 2 1), a2 — 2X. In this perspective we can now quote nonrigorously one of the most fundamental results in modern probability theory. Theorem 5.2. There exists a Markov process W,, t > 0, WQ = 0, called Brownian motion with variance a2, which is characterized by the following properties: All nonoverlapping increments Ws, Ws+t — Ws are independent; the increments Ws+, — Ws are stationary for all s > 0; the increments Ws+t — Ws are N(0, a2t) distributed, all s > 0; the trajectories Wt, t > 0, are continuous junctions. Brownian motion arises, for example, as the limit of W(n) as n weak convergence of stochastic processes).

oo (in the sense of

5.3. Long-range dependence traffic models

115

In telecommunications modeling Brownian motion appears in various, more advanced approximative schemes (one instance is mentioned in section 6.3.3). The topic is included here as we wish to discuss a non-Markov relative to Brownian motion, so-called fractional Brownian motion (FBM) [36], which recently has attracted a great deal of attention in traffic modeling. An FBM process BH(t), t > 0, shares the properties with Brownian motion that the paths are continuous and that the increments are stationary and distributed according to a Gaussian (normal) distribution. On the other hand, nonoverlapping increments of BH(t) are not independent and one has

The parameter H (Hurst index) typically belongs to the interval 1/2 < H < 1 (the case H = 1/2 gives Brownian motion) and is a self-similarity parameter as discussed in (5.14), namely, the distributions of BH(Tt) and TH BH(t) are equal for every T > 0. An early reference to the use of FBM in telecommunications is Norros [42]. To understand the potential applicability of FBM in traffic streams we consider a rescaling scheme for a superposition of renewal streams with heavy-tailed holding times. The purpose of such a scheme is to find an appropriate time scale and a scaling of the load per user, such that the natural fluctuations in the model can be captured asymptotically as the number of users increases. Taqqu, Willinger, and Sherman [62] established limit results for the AMS-type multiplex of on-off sources where either the on periods, the off periods, or both, possess heavy tails. Figure 5.15 shows simulations of the process (Z,) in (5.5) with m = 50 users and equally distributed on and off period distributions with heavy tails of order ~ x~(l+^\ for ft — 1, ft = 0.6, and ft = 0.2. The simulation indicates that with heavier tails the trajectories of the superposition process tend to be more regular and show less variation on small time scales. A closely related and somewhat simpler set-up is as follows. Let

where Z't, i = 1 , . . . , m, are independent renewal rate processes with mean reward y — E(X) and interarrival distribution F(t') = P(U < t) as introduced in (3.14). Assume that the distribution of F has heavy tails of order ~ jf~ ( 1 + ^\ 0 < ft < \. The superposed renewal rate process A, represents the total amount of data generated by m independent users during a time interval of length t. The results of Taqqu, Willinger, and Sherman [62] suggest that for this comparable model the fluctuations of A,, appropriately scaled, converge to FBM with parameter H = 1 - ft/2. For simplicity it is assumed that F(t) is an exact Pareto distribution with a tail parameter ft as above. In general a weaker assumption on the asymptotic behavior of F for large / suffices. Moreover, the initial distribution is selected so that the renewal rate process is stationary. For each T > 0 define the rescaled process T~lATt. Since the mean equals

116

Chapter 5. Cell and Burst Scale Traffic Models

Figure 5.15. Multiplexed on-off sources, light (top), intermediate (middle), and heavy (bottom) tails. E(T ' ATt) = ymt, the process

describes fluctuations around the average. As an example one may think of the arrival counts in Figure 1.7 as A,(m) and subtract the linear slope. The significance of the parameter T is to look for a particular time scale on which the fluctuations exist macroscopically and perhaps behave as FBM. When rearranging integration and summation two alternative limit schemes arise:

and

Using the same proofs as for on-off processes in Taqqu, Willinger, and Sherman [62], it follows that if firstm tends to infinity and then T tends to infinity, the right-hand side of (5.17) converges in distribution to the FBM process BH(t), t > 0. The covariance calculation in (3.15) is essential to verify that the scaling order is correct. The scheme in (5.18) is relevant if the limit operations are taken in reverse order. It is shown in the referenced work that if first T —> oo and then m —>• oo, the right-hand side of (5.18) converges weakly to a so-called stable process. This discussion suggests the further alternative of letting T — Tm tend to infinity jointly with m. It is still necessary to distinguish different regimes of convergence. First

5.3. Long-range dependence traffic models

117

suppose that T —>- oc slower than m]^ in the sense that T^/in —> 0 as m —> oo. It can be shown in this case that the sequence in (5.17) satisfies

in the sense of weak convergence of processes. On the other hand, if T^ /m —> oc, then the sequence (5.18) converges to a stable Levy process of index 1 + ft. Relevant references for these results are Mikosch et al. [37], Pipiras, Taqqu, and Levy 149], and Gaigalas and Kaj [12]. In the intermediate regime T& ~ m we are left with the unnormalized fluctuations process on the left-hand side of both (5.17) and (5.18). The convergence of the corresponding sequence (A^/,,( — ymt)/ml/fi is investigated in [12]. The setting for the results in [12] is somewhat different from (5.16) and refers to the superposition of ordinary renewal processes with heavy-tailed interarrival time distributions, which we mention next. Let W,(l) be independent stationary renewal processes with Pareto distributed holding times of index ft. Let T be some time scaling such that T^/m vanishes as m —>• oo. Define

Then Y(m} converges as m -> oo to a multiple crpBH of FBM with Hurst parameter H = 1 ~ ft/2. The primary application of this result is that A, — X^li ^} counts the accumulated number of packets generated by m independent users sharing a LAN. Each user works under the assumption of Pareto interpackct distributions. It follows from the convergence result that for large m

But according to self-similarity with Hurst index H = 1 — ft/2, for each fixed t,

so that

In other words, the mean traffic behaves as

This provides a verification of the model for ethernet traffic proposed by Norros [42]. It is simple to generate m independent sequences of stationary Pareto-type arrival time random numbers and sort the merged data set in increasing order to obtain arrival times for the superposition process. Figure 5.16 shows three realizations each for H — 0.6, H = 0.7, and H = 0.8 with 25 renewal processes each time (m = 25). Consult Paxson [47] for alternative simulation techniques.

Chapter 5. Cell and Burst Scale Traffic Models

118

Figure 5.16. Approximation ofFBM, H = 0.6,0.7, 0.8, m = 25.

5.3.4

Statistical methods

Not much consensus has yet emerged regarding the statistical characterization of "burstiness" in traffic streams. This includes how to test statistically whether a given set of network data complies with a particular model for self-similar or long-memory traffic. The monograph by Beran [4] provides the mathematical details. The current growth of interest in these topics and the wide range of methods applied can be seen in [2]. In this section we only touch on the area and discuss some of the techniques. A crude measure of burstiness sometimes applied to cases where the mean rate and the peak rate have been identified is simply burstiness =

peak rate mean rate

Another statistical quantity intended to measure the degree of bursts in a packet stream is the index of dispersion

for which the Poisson process serves as a reference case, with I, = 1. Example 12. For the AMS model introduced in section 5.2.1 and specializing to alternating Markov renewal processes, we have

5.3. Long-range dependence traffic models

119

Furthermore, it follows from (5.6) that if we select a time scale by setting a + ft =• 1, say, so that a = y = A./ h, then

Hence whereas burstiness — h/X is a rather different function. Various graphical methods have been suggested for investigating if a set of arrival count data exhibits self-similarity or if a trace of CHT are heavy tailed. If the result of such an investigation is the application of a specific model with long-range dependence, then the next step is often to estimate numerically a corresponding self-similarity parameter, or Hurst parameter, H. To list some examples, suppose X i = arrivals during the time interval [j — 1, /). The index of dispersion for counts is given by

and linear regression can be applied to test whether ICDL ~ L1H ' for some parameter H in accordance with self-similar scaling behavior. A time-variance plot has a similar purpose and is obtained by plotting estimated variances of the aggregated sums X(m) against m in log-log scale (logm, log V(X ( m ) )). Estimation of the autocorrelation function and other time series analytic methods also can be used in this context. For an introduction to such methods, see, e.g., Molnar, Dang, and Vidacs [39]. Let us turn to the statistical analysis of the tail behavior of a holding time distribution P(S > t). To check for heavy tails such as the power law decay in (5.8), the obvious first step is to estimate F(t) = P(S < t) by the empirical distribution function

and then to investigate linear regression of y on x in

So-called quantile-quantile plots (Q-Q plots) are graphical methods based on related ideas. A more sophisticated technique, which has its origin in extremal value theory, is to use the so-called Hill's estimator. Resnick [52] contains a detailed account. The mean excess function, defined by

is a further tool for explorative data analysis of distribution tails. Clearly g(y) = E(S), y > 0, is constant if S is exponentially distributed. The idea is to distinguish heavy tails

120

Chapter 5. Cell and Burst Scale Traffic Models

from light tails based on the asymptotics of g(y) for large y. In fact, the property g(y) ->• oo may even serve as a formal definition: S possesses heavy tails if,

g(y) —>• oo,

y —> oo.

Intuitively, if g(y) is an increasing function in y, then the longer the call has lasted the longer its remaining length will be, which fits with our understanding of long memory. Greiner, Jobmann, and Kliippelberg [15] recommended the mean excess function for analysis of telecommunications data and applied the technique on a trace of 1,690,730 ATM cells (approximately 2 minutes of IP traffic). The cells are supposed to arrive in bursts of (i.i.d.) lengths characterized by an on-period distribution, separated by silent periods whose lengths follow an off-period distribution. The empirical mean excess function is calculated for the on-period data and the off-period data separately and plotted against cell burst levels v. The on-period data, in particular, generates a plot that is remarkably close to a straight line over the full range of levels y. This suggests that the Pareto distribution fits the data quite well (see Exercise 5.6). The off-period measurements suggest a lighter tail for that case. We finish the section on holding time tail behavior by discussing a further graphical tool of potential interest for analyzing network data. The starting point is to relate the mean excess function in (5.19) to notions from life length testing. Let S with distribution function F denote a generic nonnegative random variable that represents life length, which in our context can be CHTs, off-periods, etc. Note that

The distribution of S is said to be of type new worse than used in expectation (NWUE) if

It follows that S is of type NWUE if and only if £(5 - y\S > >•) > E(S) for all y > 0 (Exercise 5.7). Also, S is said to be of type decreasing failure rate (DFR) if for each fixed / the function P(S > t + u)/P(S > u) is increasing in u. This property implies increasing mean excess and hence implies NWUE. Both classes of distribution seem relevant for heavy tails, but we need to be more specific. Let F~'(JC) denote the inverse function of F(t) so that F(F~l(x)) = x. Define the TTT transform of S to be the function

Then 0 < (f>(x) < 0(1) = 1 and

Consequently, since F~l(x) tends to oo if and only if x tends to 1 from below, S exhibits heavy tails iff

lim <j>'(x) = oo.

X—>]

5.4. Exercises

121

Figure 5.17. TTT plot analysis ofethernet interarrival times. Some further insight into the shape of the total time on test (TTT) transform is offered in Exercises 5.8 and 5.9. Finally, the graphical technique based on these observations consists of sorting the available data in increasing order t\,..., tn and drawing in the unit square the TTT plot

where

is the so-called total time on test sequence. The resulting plot is the empirical TTT transform, which is used for visual evaluation of the data set. In Figure 5.17 this technique is applied to the etheraet data in (1.6) (neglecting the dependence structure). The nonsmooth curve is the TTT plot which falls under the diagonal (NWUE property), is basically convex (DFR property), and seems to have a steep derivative in x = 1 (heavy tail). Also shown in the graph are the TTT transforms of two Pareto distributions, with ft — 0.3 and 0.4 indicating Hurst parameter values H = 1 - ft/2 in the range 0.80-0.85.

5.4

Exercises

5.1 Show that the sequence of arrival times for VoIP packets, (7j) defined in (5.1), has the representation

122

Chapter 5. Cell and Burst Scale Traffic Models where TA'_i has the same marginal distribution as Tk-\ and is independent of Y\. Use this representation to verify the relation

Prove that the interarrival times U^ have the property E(I4) —> 1 as k —>• oo. 5.2 With reference to the fragmentizer model in section 5.1.4, assume that the frame lengths of encoded video can be modeled on the application level by the exponential distribution with mean p.. Find the distributions of the resulting number of maximal size IP packets (Z) and the size of fragments (R) at the network layer. 5.3 Verify the calculations in the displayed formulas (5.11). 5.4 Verify (5.13) from (5.12) by deriving the limit

5.5 Show that for the accumulated traffic arrival process A, = J0 As ds of a renewal rate process with Pareto distributed interrenewal times having tail decay x~^[email protected]\ 0 < /} < 1, the variance V(At) grows with t at the same rate as the power function t2~P. What is the natural choice of self-similarity parameter H in this case? 5.6 Show that if the Pareto distribution with survival function R(t) = 1/(1 + ?/ 0, is assumed to model CHT S, then, for any /3 > 0, the mean excess function g(y) is linearly increasing in y, y > 0, and hence is heavy tailed. 5.7 Verify that the NWUE condition is equivalent to the property that the mean excess function is bounded below by E(S). 5.8 Show that the DFR property is equivalent to the TTT transform being a convex function and show that the NWUE property is equivalent to 4>(x) < x, 0 < x < 1. 5.9 Show that the Pareto distribution in Exercise 5.6 has TTT transform eiven bv 4>(x) =

Chapter 6

Traffic Control

With reference to the general net model in section 1.2, we distinguish admission control as a term for the procedure of deciding whether a request for network service should be admitted; access control, referring to various tasks of the access net; and congestion control, for providing effective transmission along transport routes. In this chapter we discuss a selection of mathematical models and problems related to these issues.

6.1

Admission control

Admission control occurs on the time scale of calls. The decision to accept a request for service should be based on conditions that are likely to prevail for the duration of the call and not on instantaneously changing quantities. Long-term traffic statistics such as peak rate and mean rate are essential. During the call other mechanisms for access and flow control may be active, on much finer time scales. In best-effort networks without criteria for delay and loss, such as the Internet, there is no admission control. Consider user requests on a network for setting up calls—virtual connections in an ATM node, for example. Typically the requests are accompanied by a QoS specification, e.g., a listing of estimated volumes in various traffic classes and corresponding performance parameters, such as maximum loss or cell delay variation. The specification describes the desired quality of the requested service. It is a decision of the network to accept or to reject the caller and to formalize the agreement in terms of a traffic contract. A traffic contract, in its turn, must be monitored during the time it is valid and service is in progress—network policing. The worst-case approach (from a networking perspective) to solve the admission problem is to accept the call if bandwidth and buffers can be allocated based on its peak-rate requirements. For example, if the call consists of the superposition of m on-off sources where P(Zlt = 1) = /?/, and the source requires bandwidth C/ cells/sec while on, then the actual demand is bounded from above by the peak rate,

123

124

Chapter 6. Traffic Control

In principle all sources could be in the on-state throughout the call; the only chance for the server to accept the reservation is therefore to reserve bandwidth C — Y^T=i Ci f°r ^s ca^> a safe but probably very ineffective policy. We also realize that to expect stability in the network, allocation based on the mean rate is a bare minimum. The capacity C which is to be made available must exceed the mean value,

In reality an intermediate case is to be expected. This raises an interesting question, discussed next.

6.1.1

Effective bandwidth

How do we define and measure the effective bandwidth, i.e., the effectively required bandwidth that the given QoS specification is likely to produce? Clearly this is a question of statistical character. It suggests that the network must ask which risk it is willing to take of not being able to fulfill the contract, then estimate the effective bandwidth, and base its decision on that. This is an area currently under research. One direction is known as measurement based admission control and exploits recent advances in applications of the theory of large deviations; see, e.g., O'Connell [43]. For a detailed study of the large deviations methodology focusing on Internet congestion we refer to Wischik [66]. We address some of these ideas, beginning with a simple example. Example 13. Suppose a user makes the request to transmit packets of exponentially distributed size with mean size 1 that enter the network according to a Poisson process of intensity X. The server knows immediately that it has to allocate a bandwidth C > A. to guarantee that the traffic intensity in the resulting M/M/1 model is kept in the range of a steady state solution, Q — \j C < 1. This represents mean-rate bandwidth allocation. In this model there is formally no upper bound on the number of arrivals, hence there is no direct analog of peak-rate allocation. On the other hand, we obtain possible notions for the effective bandwidth as follows. As usual let N denote steady state system size and W system time per packet. Thus N has a geometric distribution with parameter Q and W is exponential with parameter C — A.. To have for any given e > 0

the requirement of the server is Alternatively, to be sure that

it is necessary that C satisfies the inequality

6.1. Admission control

125

It is now clear that a QoS contract can be formulated in terms of the parameters K,n,w, and e and that the network can compute the minimal C which satisfies both inequalities above and guarantees, with a certainty of probability s, that the requested service can be carried out. Returning to a general framework of an arrival stream process A, with stationary increments (As+t — As the same distribution for all s > 0) we discuss the approach to effective bandwidth based on entropy, or moment generating functions; see Kelly [26] and Gibbens [13]. As in the M/M/1 example, mean rate allocation of resources amounts to choosing capacity C > E(A,)/t & X which is greater than the average arrival rate. It has been suggested that the effective bandwidth surface

should be used for the purpose of measuring and characterizing traffic as well as allocating capacity. Here 9 is an additional scaling parameter, which in the limit $ ^ 0 returns the mean rate function and in the limit 0 ->• oo the peak rate. For example, if A, is the multiplex of n independent on-off sources of mean rate mt and peak rate hi bits per second, then over an interval of length t

In principle the effective bandwidth surface can be computed numerically and graphically for a given model and its shape and properties used for description and characterization of various traffic classes. For such examples see the referenced papers.

6.1.2

Statistical multiplexing gain

In this section we consider in more detail the problem of assigning effective bandwidth to incoming requests. The arrival process is assumed to be given by the AMS model of section 5.2.1 with m multiplexed on-off sources each of mean rate A and peak rate h. Our main reference is Schwartz [56, Chapter 4.1]. Recall that Z, e Bin(m, y) is the current number of active sources, (5.5), and A, = hZt the current load. Possible performance measures are the steady state quantities Loss probability: Proportion of expected loss:

P(A > c), £((A — c) + )/£(A).

We express the second option in terms of Z r . Introduce

the number of sources that can be transmitted simultaneously at peak rate (assuming no buffer). The mean rate allocation lower bound guarantees that c > 1m, hence «o > [yw] %

126

Chapter 6. Traffic Control

Figure 6.1. Effective bandwidth, AMS model. E(Z,). If the trajectory Zs,s > 0, crosses the level n0 at some time during the interval of observation, then the maximal capacity is exceeded, resulting in cell loss. Assign

In particular, FIOSS(W) approximates the proportion of expected loss. The function P\oss can be used for bandwidth allocation, as described next. For a given e > 0, find (integers) N — N(s, m) > «0 such that P\OSS(N) < s. Then ^ hm is the capacity required for the m sources and

These sequences are schematically depicted in Figure 6.1. The quantity noh/Ns is the effective bandwidth per source and the corresponding losses among the m users are controlled by the choice of e. The ratio NE/HQ satisfies

and represents the statistical multiplexing gain. This can be viewed as a measure of the degree to which the fluctuations in the input allow for capacity reduction compared to peak rate allocation. Given the traffic parameters X and h and a QoS specification in terms of e, we now have a method for controlled bandwidth assignment. For a given capacity c, find the maximum number of calls m that can be admitted. Conversely, given m calls, find the required capacity.

6.2. Access control

127

An alternative method is based on the central limit theorem. For large m, the distribution of the number of active sources is approximately normal in the sense that

Thus, for £ > 0

where ze is the upper e-quantile in the normal distribution. Tf we take, for examp] e, £ = 10 5 so ze — 4.26 this approach results in the allocation rule between mean and peak rate given by

where A + 4.26^/A.(h — A.)/m is the effective bandwidth per source. The multiplexing gain is given by peak rate 1 effective rate

y + 4.26^7(1 - y)/m'

For a similar derivation, see Schwartz [56, Formula 4-14a].

6.2 Access control Access control involves procedures for smoothing traffic flows and preventing access points of a network from becoming overly congested. Access control decisions act on the cell scale and may be based on instantaneously varying traffic conditions.

6.2.1

Leaky bucket systems

The term leaky bucket is normally used for a filtering device attached to a network entry point. The puipose of the technique is to shape or regularize variation in traffic streams, typically packet arrival processes entering an access node. The goal is to reduce burstiness prior to admittance into the network. This may cause additional delay or losses in the system but with the gain of reducing large bursts of arrivals in short intervals of time. The name leaky bucket originates from an analogy of regulating flow variations in a fluid. Arrivals are thought of as forming an irregular fluid flow piped into an access valve. To avoid extensive variations in pressure and fluid velocity at the access point, the stream is passed into an open container, the bucket, of fixed size, from which the fluid has to drain out through a pipe of fixed capacity, the hole in the bucket, into the network. The effect is that during periods of high arrival intensity, excess fluid is temporarily stored in the container until it gains access through the regulating pipe, possibly at the price of losses, i.e., bucket overflow. To begin modeling such devices, let A, denote an arrival stream considered on the burst scale level. Often a leaky bucket filter can be thought of as a fictitious single-server queuing system mounted before the access point; see Figure 6.2. This service system produces a

128

Chapter 6. Traffic Control

Figure 6.2. Fictitious queue in leaky bucket.

(departure) stream A, which is more regular than A, and will become the newly shaped arrival stream fed into the network. It is not obvious how to measure regularity but the basic goal is that A, shows less bit rate variation over time than A,. To understand how to produce such processes A, we make another analogy. Example 14. Celebrities and limousines. On Academy Award® night, celebrities arrive at random times to the venue (producing a process A,). However, they are asked to gather at a cocktail lounge nearby from which one of d limousines will take them to the theater main entrance photo opportunity. If there is a limo waiting by the time a celebrity arrives at the lounge, it departs immediately with the guest. If not, celebrities are asked to wait for the next available limo returning from the theater. Clearly there will still be some variation in the theater arrival process A, but it is likely that burst arrivals will have been smoothed somewhat. We will see next in what sense we can think of this as a fictitious queuing system.

6.2.2

The M/M/1 leaky bucket

We illustrate the ideas using once again Poisson arrivals. A stream of fixed-size packets A, is supposed to arrive according to the Poisson process with intensity A to an entrance point of a node equipped with a leaky bucket. The mechanism for achieving the desired regularized arrival stream is to require each arriving packet to carry an access bit that grants it passage into the node. The leaky bucket must be designed so that it distributes these passage tokens restrictively during periods of frequent arrivals (clusters) of the Poisson process and makes up for the delay during quieter periods. Specifically, fix a parameter r > A. and let the leaky bucket simulate and keep records of an M/M71 process, which we denote by Xt, that would be obtained if A, was fed as arrival process into a single-server queuing system with service intensity r. The distribution of X, will settle in an equilibrium state X^ with average E(Xaa) — Q/(! — Q), p = \jr. Now, introduce an integer-valued parameter d > 1. Think of the service completion times of the M/M/1 process, i.e., the downward jumps of X,, as arrival times of tokens, the signaling units needed by the packets for access, and assume the leaky bucket has the capacity to load a storage of up to d tokens. The device then works as follows. If a packet arrives and tokens are available, the packet immediately seizes a token and departs for the access node. If a packet arrives and no token is available, it will have to wait in the bucket until a sufficient number of tokens again has arrived to clear out the backlog. The picture is clarified by noting that we have the following interpretation of Xt, the

129

6.2. Access control

Figure 6.3. Leaky bucket shaping. leaky bucket level process: 0, d tokens available, 1, d — I tokens available, n, no packet waiting, k packets waiting for access, k > 1. The parameters are the leak rate r and bucket depth d. It remains to identify the shaped arrival process At. As long as X f _ < d, a further arrival at time / consumes a token and therefore produces not only the jump A, — A r _ = 1 but also At — A,_ = 1. If X,_ > d, however, an arrival at t corresponding to the upward jump A, — A,_ = 1 is delayed in the leaky bucket and does not count for A,, so A, — A,_ = 0. Instead it is during excursions of X, above the depth level d that jumps of A, occur. Each time the bucket level process drops from a state above d, a token is thought to return to the bucket. Hence if X, — X,- = -1 and X,- > d, then also A, — A,_ = 1. The effect is that during excursions of X, above d, the arrivals at the access node are pushed forward in time compared to those of the original arrival sequence. The long time arrival rate is preserved, A(t)/t » A(t)/t for large t, but the interarrival time distribution changes. Figure 6.3 shows a simulation for the parameter values p — 0.90, d = 9. The graphs of At, A,, and X, are shown with A, (dot-bar line) always below the curve of A, and X< varying around its (asymptotic) mean E(XX) = d, also indicated in the figure. If the leak rate is small (close to X), then A, is more or less the

130

Chapter 6. Traffic Control

Figure 6.4. Leaky bucket policing.

same as the departure stream from the virtual queue X:, i.e., the Poisson process. Similarly, if the depth is too large, the leaky bucket in the long run has no effect since then A, £» A,. For dimensioning purposes one can therefore introduce, for example, the ratio r = d/r as a measure (in units of time) of burst tolerance. The interpretation is that this ratio is the length of time during which the access node tolerates arrivals at a steady pace without the need for enforced delays by the leaky bucket filter. This model in principle requires infinite storage of packets in the bucket. A natural variation of the scheme is therefore to replace the M/M/1 process by an M/M/l/K system for some K = d + b so that the new parameter b corresponds to the maximal buffer space available in the leaky bucket. This has the effect that the shaping mechanism is replaced by policing, as losses are inflicted on the arrival stream in order to conform with access regulation. Packets allowed to enter the leaky bucket are said to be conforming; packets discarded from the system because of over-full buffers are nonconforming. A simulation for the case b = 3 is given in Figure 6.4; again plots of A,, A,, and X, are shown. Some packet losses early in the simulated trajectory are seen to reduce the overall input rate in this case. The M/M/1 model is used here mostly as a convenient framework for discussing the leaky bucket mechanism. In a practical situation arrivals modeled by the Poisson process would probably represent a regular input stream with limited need for shaping. In fact, the effect of the leaky bucket observed in M/M/1 simulations such as in Figure 6.3 is modest. The empirical interarrival time distribution in A is close to the exponential but with slightly smaller variance.

6.2. Access control

6.2.3

131

The generic cell rate algorithm

In broadband traffic systems based on ATM a number of variations of the leaky bucket technique have been implemented in different forms, such as the virtual scheduling algorithm, the generic cell rate algorithm (GCRA), or the token bucket filter. The intention is to enable traffic control of an ATM connection and the set-up of corresponding traffic contracts using rule-based parameters. This term refers to parameters that are understandable by the user, are significant for resource allocation, and are verifiable by the network [8, Chapter 2]. The leaky bucket rate and depth parameters are considered useful in this respect. The GCRA is equivalent to a G/D/1/K version of the model studied in the previous section. Here G stands for a general arrival stream A, given by a sequence a\, ai,..., where a^ is the time at which cell k is observed at the interface. The symbol D stands for deterministic service time, and K represents a finite system. One can imagine a token bucket filling up with tokens regularly at rate r, i.e., one every T = l/r seconds up to a maximum of d tokens. Arriving cells consume one token each and conform with the node as long as there are tokens available or the cell is allowed to queue. More exactly, a characteristic feature of the GCRA is that the decision of whether a cell is conforming is based on its waiting time and not on the available buffer space. In fact, any cell that has waited more than T time units is judged nonconforming and hence discarded. The GCRA(T, T) algorithm in this way relates a time interval T = l/r to a tolerance r which is typically chosen to be T = (d — \)T — (d — l)/r. Hence the maximal buffer space needed is r/T = d — 1 and so the total size of the corresponding G/D/1/K model is K — 2d.

6.2.4

A slotted version of the leaky bucket filter

One discrete-time variation of the leaky bucket policing scheme is the following. Cells that arrive at a node are first collected in a bucket in which there is room for at most d cells. The bucket is emptied one cell per time slot at a rate of r cells every second. Arriving cells that would cause overflow in the bucket (level > d + 1) are lost. Cells admitted through the bucket form a modified input stream, which is directed into the nodal access point. This model is similar to but slightly different from those studied by Schwartz [56, Chapter 4.2]. We analyze the system by introducing Qn = the level in the bucket just before the end of time slot nr «, A,, — number of arrivals in time slot nr «, where we assume that A\, A.I, • • • are independent and also that the arrival probabilities ait = P(An = k), k > 1, are identical for all n. The usual type of argument for deriving buffer dynamics in discrete time models is applicable. The level at the end of slot n equals the level at the end of slot n - 1 minus 1 plus the arrivals during slot n. The boundary points 0 and d must be treated separately. We obtain

132

Chapter 6. Traffic Control

Analysis of the system proceeds by assuming that the system has settled into an equilibrium state, so that if we put nt = P(Qn = k), 0 < k < d, these probabilities are the same for all n. Now take expected values of the above relations to obtain a system of linear equations for the unknown TT^'S in terms of the distribution {a^}, which we consider known. Then solve for :TI in terms of no (and a0), solve for Tti in terms of JTO and it\ (and a$, a\), and so on, using recursion to get TTJ. In principle the solution can be found in this recursive manner and then normalized to obtain ^;=o ni' — 1- One should not expect to find a solution in a closed form. Suppose we have derived a solution {n^}. Then we are in the position to find the throughput in cells per second and the average amount of lost cells per slot. In simple cases this yields explicit expressions for throughput and loss. As an example the specific case where d = 2 and An e Po(A.) is discussed at the end of the section and a further study is left as Exercise 6.3. To begin this program, we observe that by taking expected values on both sides of the identity (6.1) we get

Since, for each n, Qn-\. depends on A\,..., A n _i but not on An and since An is independent of the previous A/s, j < n — 1, we also have that Qn-\ and An are independent; the righthand side thus simplifies into

and moreover, by the equilibrium assumption, into

From this it is seen that the system of (6.1)-(6.3) in a first step gives us

and hence, in equilibrium,

6.2. Access control

133

Since, in addition, we require ^0 TT, = 1, the result is an overdetermined system of equations for {JT,}. By rearranging terms, the first d equations (corresponding to (6.1)-(6.2)) can be written

and the last one (from (6.3))

Assuming we have found an equilibrium solution {TTJ•}, the throughput of the system is simply one cell every slot as long as the system is not empty, which is the case with probability 1 - TTO. Hence, Throughput = r(l — no) cells per second. To understand how many cells are lost in each slot, consider the case Qn~\ — 0. During slot n no cells are lost in this case if An < d, whereas with probability a^+; a total of j cells are nonconforming and lost, j > 1. For 1 < k < d, if Qn-\ — k, then cell loss occurs only if •An > d — k + 2. More precisely, j cells are lost with probability aj-k+j+i, J > 1- Hence

We solve the case where d = 2. The linear equations available for this case,

give the normalized solution

134

Chapter 6. Traffic Control

For example, if A € PO(A), this is

with the properties Throughput = r(l - e~2kf(l - le~ x )) cells per second and

For this model, utilization is the normalized throughput 1 — TTQ cells per slot and traffic intensity is the load per slot E(A) = A. To get the loss probability, divide the expected number of losses by the expected number of arrivals per slot, E(A), yielding

a now familiar formula. Utilization and loss probability for the case d = 2 are shown in the graphs of Figure 6.5.

6.3

Multiaccess modeling

A multiple access scheme is used to increase efficiency of multiaccess media, shared media such as broadcast radio and satellite links, multiuser systems based on ring or bus architectures, etc. Conceptually we have a situation as is shown in Figure 6.6, where a large number of users, with or without buffering equipment, are contending for access to a single transmission channel. Neither the server nor the nodes have complete knowledge of the buffer status. One particular case is the subclass of scheduled access systems based on conflict-free protocols. Any TDM system can be considered conflict free even if there are many users. Most of the models we have encountered so far belong to this category. fn contrast there are random access systems, where a large number of users are expected to generate low-average traffic loads while having no means of coordinating their transmission attempts directly with each other. Traffic under such a scheme is subject to not only delay due to propagation, transmission, and buffering but also delay due to contention. In fact, as soon as two or more users happen to transmit in the same time slot or

6.3. Multiaccess modeling

135

Figure 6.5. Utilization and loss for slotted-time leaky bucket.

Figure 6.6. Multiaccess contention. in a scenario of continuous time, in overlapping time intervals, a collision occurs. This has to be resolved. A worst possible case of random access systems occurs if there is complete lack of knowledge and users attempt to send cells blindly. Collisions are managed either via a retransmission protocol or a collision resolution protocol. The classical example for the first case is that packets collide on a satellite link. The corresponding bandwidth is wasted, the packets involved in the collision are left corrupted, and transmission is rescheduled for a later time. This is called the Aloha model. There are several alternatives for selecting the retransmission time. Using slotted time this could be done either uniformly over the next K time slots or at the next slot with a given retransmission probability q. Standard ethernet LANs use versions of these protocols. In 802.3 ethernet the retransmission time is uniform over K slots, where the parameter K is dynamic. For each collision experienced by the station it doubles the value of K, an algorithm known as binary exponential back-off. Another important contention protocol is carrier sensing multiple access with collision detection, known as CSMA/CD in LAN ethernet. The basic

Chapter 6. Traffic Control

136

idea for the other category of protocols based on collision resolution is to let the stations directly involved in the collision engage in a further contention procedure, after which their packets, now suffering some delay, have all been transmitted. We begin the detailed study of collision models by providing an analysis of the Aloha model based on the idea of diffusion approximations. In a later section the same approach is extended to cover slotted CSMA and CSMA/CD protocols. Most theoretical studies of these models apply approximations to the effect of infinitely many users and an averaged offered load at an early stage of the analysis; see, e.g., the presentations of Kleinrock [28], Bertsekas and Gallager [5], or Saadawi, Ammar, and El Hakeem [53]. Our study is closer to that of Nelson [40], [41]. The features of CSMA and CSMA/CD on which we base our analysis of these models are as described by Meditch and Lea [35]. See also Polydoros and Silvester [50]. Some aspects of the analysis suggested in this text are new; in particular, we obtain throughput as functions of the actual load rather than some fictitious offered load including retransmissions. Subsequently we analyze the simplest prototype for collision resolution algorithms— binary splitting with blocked access. This means that when a collision occurs, the link stays blocked from access for other users until the stations involved have resolved the collision and sent their packets. Conceptually this situation is closer to the reliable data transfer models of section 3.3.4; consequently our model is again based on the renewal reward framework. Mathy s and Flaj olet [34] provided a detailed study of the distributions of the random lengths of collision resolution intervals under various protocol assumptions. However, we have not found closed expressions for the throughput in these models in the literature. 6.3.1

The slotted Aloha Markov chain

Slotted Aloha is one of the simplest retransmission models, described as follows: Slotted time. m users contend for a single transmission channel. A packet appears at each user input with probability p, each slot. If two or more users attempt to send packets in the same slot, the channel remains idle, the packets are returned for retransmission, and the users become backlogged. Backlogged users attempt to retransmit with probability q, each slot. Usual independence assumptions. No new arrivals from backlogged users. A user is thus called backlogged from the time slot at which it suffered a collision until the time slot at which the delayed packet is successfully transmitted. During any other time slot the user is free to contend. We introduce Kn = number of backlogged users in slot n, Zn = number of attempts to retransmit packets in slot n, N n = number of new packets arriving to the link from free users in

137

6.3. Multiaccess modeling which are seen to satisfy the relation

The problem is to find Throughput = expected fraction of successfully transmitted packets per slot, which we interpret as the limit

The key observation is that, given Kn, we have

and, still conditional on Kn, these two random variables are independent. Now we study the slotted Aloha Markov chain. Figure 6.7 shows a simulation of 10 independent realizations of n) for fixed parameter(K values m, p, and q over 100,000 time slots (m = 300, pm — 0.35, qm — 10). Initially all users were free. It is seen that after a period of time, varying considerably from one station to another, during which typically only a small proportion of users are retransmitting, th transmission link fills up abruptly with backlogged users. At the end of the simulation run nine Aloha systems have stopped working properly and one is still functioning.

6.3.2

Diffusion approximation approach

The simulation suggests that the discrete Markov chain n could be approximated by a continuous-time process; this leads to the idea of considering the diffusion approximation scaling

Here 0 < Kn/m < 1 is the proportion of backlogged nodes, i.e., a space rescaling of the original sequence, and x,(m) is the result after an additional time rescaling. This approach attempts to capture the typical behavior of Kn but represented by a possibly simpler process. Thus we must investigate if the sequence of scaled random processes x(m) has a limit process, or limit function, as the number of users grows to infinity,

We note that if such a limit can be found, then, in equilibrium, Throughput ~ P(Z + N = 1) ~ E(N~) ~ p(m - E(K)) ~ X(l - x x).

138

Chapter 6. Traffic Control

Figure 6.7. Simulation of the Aloha Markov chain.

To investigate [x, , t > 0} we compute its drift and quadratic variation functions. Put

and

By

hence

6.3.

Multiaccess modeling

139

Therefore, changing to scaled parameters A = mp, a = mq, and state variable x = k/rn for some integer 0 < k < m.

and we obtain the limiting drift function n(x) denned on 0 < x < 1 by taking the limit m —> oo,

Similarly, a quadratic variation function a2(x) defined on the unit interval [0, 1] is obtained bv taking the limit

Theorem 6.1. The sequence {x(/n\t > 0) converges as m -> oo to a (deterministic) function x,, namely, the solution of the ordinary differential equation given initial condition XQ, where /x(-) is the limit drift function in (6.6) and XQ is an initial value of the asymptotic fraction ofbacklogged users. Figure 6.8 visualizes the function fj.(x), 0 < x < 1, for some parameter values X and a. We return to the original question: What is the throughput for a slotted Aloha system? Consider the curve jc,, t > 0. What happens as t —> oo? For some values of or, e.g., a = 4, there is for any A a unique limit x^ regardless of XQ, namely, the unique stationary point such that /xOcnc) = 0. This value can be interpreted as the equilibrium backlog and we obtain from earlier considerations that the equilibrium throughput of cells is 1(1 — .TOO). Now we can understand how backlog and throughput vary with the offered load A for the model. Observe that x^ is some (complicated) function of A. Hence backlog and throughput are implicitly given functions of A, which can be computed numerically. For a — 4, plots are shown in Figure 6.9. It is interesting to see that the throughput increases more or less linearly with load until a certain regime, where the backlog quickly increases and the performance of the link drastically deteriorates. For larger values of a, such as in Figure 6.8, where a — 5, the situation is more complicated. There are three stationary points x^. < x^ < JT.+,, such that x, —>• jt.J, if JG > JCcnt and xt —> x^ if XQ < xcr\t, where xcr|t is a critical initial value. Moreover, a hysteresis effect can be observed; see Figure 6.10. For an interpretation of these observations suppose a random access system of this sort that normally operates under an offered load of

Chapter 6. Traffic Control

140

Figure 6.8. Limit drift functions for Aloha Markov chain. A. = 0.3 was subject to an increase in load to, say, k = 0.45. The backlog would increase and the throughput would decline, and to restore the system it would not be enough to bring the load to previous values. In fact, the backlog would follow the upper curve and thus clog the system until its load had been forced well below A = 0.2, perhaps corresponding to the need for a complete restart of the system.

6.3.3

Remark on stochastic differential equation approximation

We have seen that as m -> oo, the limit function xt is deterministic. However, for a

but finite m one can argue that the behavior of x, is given approximately by the stoch differential equation (SDE)

Here {W t , t > 0} represents Brownian motion discussed in section 5.3.3. The interpretation is that the process jct(m) varies around the function x, due to random noise generated by W, weighted with respect to the state dependent variance function CT^(JC) & a2(x)/m. Such a process is in fact a continuous-time Markov process and can be shown to possess a well-defined stationary distribution with density function

Figure 6.11 shows three such functions for a = 6 as A. passes the critical congestion region.

6.3. Multiaccess modeling

141

Figure 6.9. Aloha model equilibrium backlog (upper graph) and throughput (lower graph).

6.3.4

CSMA and CSMA/CD

The next subject is the performance analysis of slotted, nonpersistent CSMA and CSMA/CD channels. As in the previous subsection, m independent users share a common transmission link. Following Meditch and Lea [35], time is measured in minislots and data packets are of fixed length T + 1 > 1 minislots. This is because for transmission the packets require T minislots to be placed onto the channel and one minislot to clear the channel due to propagation delay. Each user is able to hold only one such packet in its buffer. As before, a user is either free or backlogged (blocked). In a given minislot each free user receives and transmits a packet with probability p and each backlogged user attempts to retransmit with probability q. Free users involved in a collision become blocked; backlogged users remain blocked. The purpose of the carrier sensing mechanism is to reduce the number of collisions due to simultaneous transmission attempts from two or more users in the same minislot. Any such attempt, either from an active free user or a backlogged one, starts with a sensing phase. If the channel is sensed idle, each user attempting to transmit initiates placing its packet onto the link. Such a minislot where a packet transmission starts is called a firing minislot. If only a single user is involved, the transmission is successful. If not, the transmission is a failure, due to collision. In both cases the channel will be busy during the next T minislots. Free users sensing the channel during a busy minislot change state from free to backlogged. The transmission attempt is put on hold until the next minislot, where again each user makes a decision according to the CSMA protocol.

Chapter 6. Traffic Control

142

Figure 6.10. Hysteresis effect for large a. Backlog (upper) and throughput (lower). The additional collision detection feature in CSMA/CD aims at reducing the bandwidth wasted by the channel over the T minislots following a firing minislot while packets are being cleared after a collision. Collision detection is included in the model by introducing a further integer parameter R, 0 < R < T, and assuming that if a collision occurs in a firing minislot, then the link can detect the collision and abort transmission after only R minislots rather than T, as in CSMA. Obviously the number of blocked users recorded in sequence from one minislot to the next does not form a Markov chain. The remedy is either to extend the state space with information on the remaining length of the transmission period or, more conveniently in this case, to identify an embedded Markov chain. Define Kn = number of backlogged users at the end of transmission period n, n>\. In addition, define sequences Nn = number of free users transmitting in firing minislot n, Zn = number of retransmission attempts in firing minislot n, Mn = number of free users sensing the link during transmission period n. The resulting Lindley equation takes the form

where again, given Kn,

6.3. Multiaccess modeling

143

Figure 6.11. Equilibrium backlog densities. Moreover, Kn determines the distribution of the vector (7V,Hi, Mn+!), which establ that (Kn) is indeed a Markov chain. We remark at this point that by allowing T = R = 0 our set-up also covers the slotted Aloha model. For the case T = R = 0, transmissions are handled in the same firing minislot as they are initiated, a minislot is the same as an Aloha slot, the sensing has no effect on performance, the quantity n vanishes identically, and Kn is the slotted AlohaM Markov chain. As a consequence the description of what occurs during a minislot differs slightly from the model in Meditch and Lea [35]; these modifications seem to be harmless. To continue analyzing the CSMA and CSMA/CD models the following alternative representation for Kn is used. Let Xn = number of free users not sensing the channel during transmission period n. Then To identify the distribution of n+\ we begin with pure CSMA, i.e., T = R > 0. First, theX total number of blocked users at the end of firing minislot n is n + Nn+i. Second, we noteK that a free user remains free at the end of the transmission period only by not sensing the channel during T successive minislots. Hence the conditional distribution of Xn+\ given Knis Now include CSMA/CD with parameter R < T. For a free user to remain free during a transmission period, it suffices to avoid sensing the channel during T successive minislots if firing was successful and to avoid sensing the channel during R minislots if a collision

144

Chapter 6. Traffic Control

occurred. Hence putting

the more general relation

follows. In particular,

To apply the diffusion approximation and evaluate performance of the carrier sensing and collision detection protocols compared to Aloha, it still remains to synchronize the time scales. The channel capacity in CSMA/CD is 1 packet per T +1 minislots, whereas in Aloha capacity is measured in packets per slot. Hence to preserve the Aloha traffic intensity \p per user, we replace the probabilities p and q by p/(l + T) and q/(l + T) and investigate the approximating scheme

The strategy is now parallel to the case of Aloha. The limit function x, = limm^oo jc}m) satisfies a nonlinear ODE and the limiting equilibrium backlog Xoo is found by analyzing stationary solutions. As stated earlier the equilibrium backlog yields the equilibrium throughput A(l — xx), since all packets allowed to enter the system must also pass through the system. To find the drift function /u-(x), note that

and

The equilibrium probability r(x) = limmH>00 rm(xm) is therefore

Moreover,

6.3. Multiaccess modeling

145

Figure 6.12. Throughput CSMA, T = 0, 1, 3, 5, 10 (left to right). Hence,

Now we are prepared to calculate the throughput. Fix a. For a range of A. values (normally less than a) solve /J.(x) = 0 to obtain equilibrium backlog x and throug Ml — *«>)• Figure 6.12 shows the result for CSMA with a — 4 and packet length varying from T = 0 (Aloha case) to T = 10. It is seen that the sudden drop in throughput at a critical load, which is characteristic for Aloha, disappears quickly with increasing T. However, the maximum throughput that can be reached in CSMA declines with increasing T. This is in contrast to previous studies of CSMA, where it is argued that carrier sensing can improve performance drastically; see, e.g., Kleinrock [28, Chapter 5.12]. In those studies throughput is typically obtained as a function of a fictitious internal load consisting of both arrivals and retransmissions, an approach that seems to be misleading. To explain further the inconsistency we note that the present study takes into consideration not only the improvements of CSMA compared to Aloha but also a drawback of the earner sensing mechanism. The main advantage is that the vulnerable period is smaller in CSMA in the sense that the relative length of the firing minislot to packet length is smaller. The disadvantage is that backlog builds up faster in CSMA. Even if retransmission attempts during the transmission period of length T minislots do not generate additional collisions, they do generate additional backlog.

146

Chapter 6. Traffic Control

Figure 6.13. CSMA/CD equilibrium throughput and backlog, R = 0, T = 0, 3, 5.

Figure 6.14. CSMA/CD equilibrium throughput and backlog, R = 0,1,3,5, T = 5.

The effect of collision detection is shown in Figure 6.13. Again, throughput and backlog as functions of A. are shown for a = 4, taking R = 0 and T = 0, 1, 5. Finally, to see the variation in R for fixed T, Figure 6.14 shows load-throughput for T = 5 and R = 0,1, 3, 5, again with Aloha as a reference.

6.3. Multiaccess modeling

6.3.5

147

A collision resolution algorithm

Collision resolution algorithms were designed to avoid the unstable character in contention protocols that often results from retransmission strategies, of which we saw examples above. We consider the so-called binary splitting algorithm with blocked access and binary feedback information. Under this scheme, if a collision occurs the link stays blocked until the stations involved have resolved the collision and successfully transmitted the colliding packets. We start from the same basic assumptions as in the Aloha model. There are m independent users contending for a single channel, time is discretized in slots, and each user attempts to reserve the link bandwidth during a slot with probability p. The number of active users per slot is therefore Bin(m, p) distributed. If at the beginning of a given slot only one user makes a transmission request, this single packet also is delivered during the same slot. If no request is made, the corresponding time slot is wasted. If two or more users request to transmit in the same slot, a collision occurs. In this case the remaining users not involved in the collision become blocked, unable to transmit arriving packets, while the collision is resolved. The procedure is repeated as soon as each colliding packet has been delivered. It is clear from this description that the delay has two components: the collision resolution interval (CRI), which consists of the time slots it takes for the colliding stations to agree on a transmission schedule, and one slot transmission time for each packet that collides. The total delay is the time it takes to actually deliver the collided packets onto the link. We say that the link is idle at time k if at the end of slot k — 1 there are no previously collided and undelivered packets waiting for transmission. Thus in the beginning of slot k all m users are free to contend for the available bandwidth during slot k. Let each time point when the link is idle mark the beginning of a new cycle. Put n = number of packets transmitted during cycle n,A Yn = the length of cycle «, in number of slots. The system starts afresh each time the link is idle. Because of this the sequence An e Bin(m, p) is independent. We have

where Ln is the length of the CRI, which eventually, if An > 2, takes place during cycle n. The CRI is determined in a binary (or more generally

Our partners will collect data and use cookies for ad personalization and measurement. Learn how we and our ad partner Google, collect and use data. Agree & close