Performance Analysis of Transaction Processing Systems
Wilbur H. Highleyman
Prentice Hall, Englewood Cliffs, New Jersey 07632
To my parents-
Peach and Bud
Contents

PREFACE xvii

1 INTRODUCTION 1

PERFORMANCE MODELING, 2
USES OF PERFORMANCE MODELS, 5
THE SOURCE OF PERFORMANCE PROBLEMS, 7
THE PERFORMANCE ANALYST, 7
THE STRUCTURE OF THIS BOOK, 8
SYMBOLOGY, 10

2 TRANSACTION-PROCESSING SYSTEMS 12

COMPONENT SUBSYSTEMS, 14
  Communication Network, 15
  Processors, 15
  Memory, 15
  Application Processes, 16
  Data Base, 16
  Other Peripherals, 17
SYSTEM ARCHITECTURES, 17
  Expandability, 18
  Distributed Systems, 18
  Fault Tolerance, 20
  Summary, 22
TRANSPARENCY, 22
  The Process, 22
  I/O Processes, 24
  Interprocess Communications, 26
  Process Names, 28
  Process Mobility, 29
  Summary, 30
PROCESS STRUCTURE, 30
  Process Functions, 30
  Interprocess Messages, 31
  Addressing Range, 33
  Process Code Area, 33
  Process Data Area, 33
  The Stack, 35
  Summary, 36
PROCESS MANAGEMENT, 36
  Shared Resources, 36
  Process Scheduling, 37
  Mechanisms for Scheduling, 38
  Memory Management, 40
  Paging, 40
  Managing Multiple Users, 43
  Additional Considerations, 44
  Summary, 45
SURVIVABILITY, 45
  Hardware Quality, 46
  Data-Base Integrity, 47
  Software Redundancy, 49
  Transaction Protection, 50
  Synchronization, 53
  Backup Processes, 55
  Message Queuing, 56
  Checkpointing, 59
SOFTWARE ARCHITECTURE, 63
  Bottlenecks, 63
  Requestor-Server, 64
  Dynamic Servers, 64
3 PERFORMANCE MODELING 67

BOTTLENECKS, 67
QUEUES, 68
THE RELATIONSHIP OF QUEUES TO BOTTLENECKS, 69
PERFORMANCE MEASURES, 70
THE ANALYSIS, 73
  Scenario Model, 73
  Traffic Model, 74
  Performance Model, 77
  Model Results, 83
  Analysis Summary, 85
4 BASIC PERFORMANCE CONCEPTS 87

QUEUES-AN INTRODUCTION, 88
  Exponential Service Times, 90
  Constant Service Times, 91
  General Distributions, 91
  Uniform Service Times, 92
  Discrete Service Times, 92
  Summary, 92
CONCEPTS IN PROBABILITY AND OTHER TOOLS, 94
  Random Variables, 94
  Discrete Random Variables, 94
  Continuous Random Variables
    Case 1, 99
    Case 2, 100
  Permutations and Combinations, 104
  Series, 105
  The Poisson Distribution, 106
  The Exponential Distribution, 110
  Random Processes Summarized, 111
CHARACTERIZING QUEUING SYSTEMS, 113
INFINITE POPULATIONS, 113
  Some Properties of Infinite Populations, 114
  Dispersion of Response Time, 114
    General distribution, 115
    Central Limit Theorem, 116
    Variance of response times, 117
  Properties of M/M/1 Queues, 120
  Properties of M/G/1 Queues, 122
  Single-Channel Server with Priorities, 123
    Nonpreemptive server, 124
    Preemptive server, 125
  Multiple-Channel Server (M/M/c), 126
  Multiple-Channel Server with Priorities, 127
    Nonpreemptive server, 127
    Preemptive server, 128
FINITE POPULATIONS, 128
  Single-Server Queues (M/M/1/m/m), 130
  Multiple-Server Queues (M/M/c/m/m), 131
  Computational Considerations for Finite Populations, 132
COMPARISON OF QUEUE TYPES, 132
SUMMARY, 136
5 COMMUNICATIONS 139

PERFORMANCE IMPACT OF COMMUNICATIONS, 140
COMMUNICATION CHANNELS, 141
  Dedicated Lines, 141
  Dialed Lines, 141
  Virtual Circuits, 142
  Satellite Channels, 143
  Local Area Networks, 144
  Multiplexers and Concentrators, 144
  Modems, 147
  Propagation Delay, 148
DATA TRANSMISSION, 149
  Character Codes, 149
  Asynchronous Communication, 150
  Synchronous Communication, 152
  Error Performance, 152
  Error Protection, 154
  Half-Duplex Channels, 155
  Full-Duplex Channels, 156
PROTOCOLS, 157
  Message Identification and Protection, 157
  Message Transfer, 158
    Half-duplex message transfer, 158
    Full-duplex message transfer, 158
  Channel Allocation, 160
  Bit Synchronous Protocols, 160
BITS, BYTES, AND BAUD, 163
LAYERED PROTOCOLS, 164
  ISO/OSI, 165
    Application layer, 165
    Presentation layer, 166
    Session layer, 167
    Transport layer, 167
    Network layer, 167
    Data link layer, 168
    Physical layer, 168
  SNA, 169
  X.25, 170
MESSAGE TRANSFER PERFORMANCE, 172
  Half-Duplex Message Transfer Efficiency, 172
  Full-Duplex Message Transfer Efficiency, 176
  Message Transit Time, 178
  Message Transfer Example, 180
ESTABLISHMENT/TERMINATION PERFORMANCE, 181
  Point-to-Point Contention, 182
  Multipoint Poll/Select, 185
LOCAL AREA NETWORK PERFORMANCE, 189
  Multipoint Contention: CSMA/CD, 189
  Token Rings, 192
SUMMARY, 196
6 PROCESSING ENVIRONMENT 197

PHYSICAL RESOURCES, 198
  Processors, 199
  Cache Memory, 199
  I/O System, 200
  Bus, 201
  Main Memory, 202
  Processor Performance Factor, 203
  Traffic Model, 204
  Performance Tools, 205
  Performance Model, 206
    W-Bus, 206
    Memory, 207
    R-Bus, 207
    Memory queue full, 208
  Model Summary, 208
  Model Evaluation, 208
OPERATING SYSTEM, 213
  Task Dispatching, 214
  Interprocess Messaging, 218
    Global message network, 218
    Directed message paths, 219
    File system, 219
    Mailboxes, 219
  Memory Management, 220
  I/O Transfers, 220
  O/S Initiated Actions, 221
  Locks, 222
  Thrashing, 222
SUMMARY, 224
7 DATA-BASE ENVIRONMENT 226

THE FILE SYSTEM, 227
  Disk Drives, 228
  Disk Controller, 228
  Disk Device Driver, 229
  Cache Memory, 230
  File Manager, 231
  File System Performance, 232
FILE ORGANIZATION, 234
  Unstructured Files, 234
  Sequential Files, 237
  Random Files, 238
  Keyed Files, 238
  Indexed Sequential Files, 243
  Hashed Files, 245
DISK CACHING, 245
OTHER CONSIDERATIONS, 249
  Overlapped Seeks, 250
  Alternate Servicing Order, 250
  Data Locking, 251
  Mirrored Files, 252
  Multiple File Managers, 253
    File manager per disk volume, 253
    Multiple file managers per disk volume, 253
    Multiple file managers per multiple volumes, 255
AN EXAMPLE, 255
8 APPLICATION ENVIRONMENT 259

PROCESS PERFORMANCE, 260
  Overview, 260
  Process Time, 265
  Dispatch Time, 266
  Priority, 266
  Operating System Load, 267
  Messaging, 267
  Queuing, 268
PROCESS STRUCTURES, 269
  Monoliths, 269
  Requestor-Server, 270
  Requestors, 271
  Servers, 272
  File managers, 274
  Multitasking, 274
  Dynamic Servers, 278
  Asynchronous I/O, 280
AN EXAMPLE, 281
SUMMARY, 296
9 FAULT TOLERANCE 298

TRANSACTION PROTECTION, 300
SYNCHRONIZATION, 303
MESSAGE QUEUING
CHECKPOINTING, 307
DATA-BASE INTEGRITY, 308
AN EXAMPLE, 308
10 THE PERFORMANCE MODEL PRODUCT 312

REPORT ORGANIZATION, 313
  Executive Summary, 313
  Table of Contents, 313
  System Description, 314
  Transaction Model, 314
  Traffic Model, 314
  Performance Model, 315
  Model Summary, 315
  Scenario, 316
  Model Computation, 316
  Results, 317
  Conclusions and Recommendations, 317
PROGRAMMING THE PERFORMANCE MODEL, 317
  Input Parameter Entry and Edit, 317
  Input Variable Specification, 318
  Report Specification, 319
  Parameter Storage, 319
  Dictionary, 319
  Help, 320
  Model Calculation, 320
  Report, 320
TUNING, 321
QUICK AND DIRTY, 322
11 A CASE STUDY 323

PERFORMANCE EVALUATION OF THE SYNTREX GEMINI SYSTEM, 325
  Executive Summary, 326
  Table of Contents, 327
  1. Introduction, 328
  2. Applicable Documents, 329
  3. System Description, 329
    3.1 General, 329
    3.2 Aquarius Communication Lines, 330
    3.3 Aquarius Interface (AI), 331
    3.4 Shared Memory, 334
    3.5 File Manager, 335
    3.6 File System, 336
  4. Transaction Model, 338
  5. Traffic Model, 340
  6. Performance Model, 342
    6.1 Notation, 342
    6.2 Average Transaction Time, 343
    6.3 Aquarius Terminal, 344
    6.4 Communication Line, 344
    6.5 Aquarius Interface, 346
    6.6 File Manager, 350
    6.7 Disk Management, 353
    6.8 Buffer Overflow, 355
  7. Scenario, 357
  8. Scenario Time, 359
  9. Model Summary, 360
  10. Results, 365
    10.1 Benchmark Comparison, 365
    10.2 Component Analysis, 368
  11. Recommendations, 371
  References, 373
APPENDIX 1 GENERAL QUEUING PARAMETERS 375

APPENDIX 2 QUEUING MODELS 377

APPENDIX 3 KHINTCHINE-POLLACZEK EQUATION FOR M/G/1 QUEUING SYSTEMS 383

APPENDIX 4 THE POISSON DISTRIBUTION

APPENDIX 5 MIRRORED WRITES 389
  A. Dual Latency Times, 389
  B. Single Disk Seek Time, 391
  C. Dual Disk Seek Time, 392
APPENDIX 6 PROCESS DISPATCH TIME 397
  A. Infinite Population Approximation Error, 397
  B. Dispatching Model, 399
  C. Single Processor System, 402
  D. Multiprocessor System, 405
  E. An Approximate Solution, 409

APPENDIX 7 PRIORITY QUEUES 411

APPENDIX 8 BIBLIOGRAPHY 414
Preface
This book provides the tools necessary for predicting and improving the performance of real-time computing systems, with special attention given to the rapidly growing field of on-line transaction-processing (OLTP) systems. It is aimed at two audiences:
1. The system analyst who thoroughly understands the concepts of modern operating systems and application structures but who feels lacking in the mathematical tools necessary for performance evaluation.

2. The mathematician who has a good grasp of probability and queuing theory but who would like to gain a better understanding of the technology behind today's computing systems so that these tools might be effectively applied.
OLTP systems are rapidly becoming a part of our everyday life. Merchants pass our credit cards through slot readers so that remote systems can check our credit. Every commercial airplane ride and hotel stay is planned and tracked by these systems. When we are critically ill, OLTP systems monitor our vital signs. They control our factories and power plants. We obtain cash from ATMs, play the horses and lotteries, and invest in stocks and bonds thanks to OLTP systems.

No wonder their performance is becoming a growing concern. A poorly performing system may simply frustrate us while we wait for its response. Even worse, it can be life-threatening to a business, or even to our loved ones.

We define the performance of an OLTP system as the time required to receive a response from it once we have sent it a transaction. Our transaction must wait its turn over
and over again as it passes from one service point to another in the OLTP system, since it is just one of many transactions that the system is trying to process simultaneously. These service points may be processors, disks, critical programs, communication lines: any common resource shared among the transactions. As the system gets busier, the delays at each service point get longer, and the system may bog down. The study of the behavior of these delays as a function of transaction volume is the subject of the rapidly expanding field of queuing theory.

A performance analyst is one who has an intimate knowledge of the workings of these systems and who can apply the practical tools available from queuing theory and other mathematical disciplines to make reasonable statements about the expected performance of a system. The system may be one still being conceived, an existing system in trouble, or a system undergoing enhancement.

This book is intended to train performance analysts. For the system analyst who may be a bit intimidated by higher mathematics, it presents mathematical tools that have practical use. Moreover, the derivations of these tools are for the most part explored, perhaps not always rigorously, but in a manner designed to give a full understanding of the meaning of the equations that represent the tool. For the most part, only a knowledge of simple algebra is required.

For the practicing mathematician, there is an in-depth description of each OLTP system component that may have a performance impact. These components include communication lines, processors, memories, buses, operating systems, file systems, and software application architectures. System extensions for fault tolerance, an important attribute of OLTP systems, are also covered.

This book is organized so that the reader may skip easily over material that is already familiar and focus instead on material of direct interest.
The book does not present a "cookbook" approach to performance analysis. Rather, it stresses the understanding of the use of basic tools to solve a variety of problems. To this end, many examples are given during the discussion of each topic to hone the reader's ability to use the appropriate tools.

As of the date of this writing, the title "performance analyst" has not been accepted as a usual job description. This is certainly not due to a perception that performance analysis is unnecessary but perhaps instead to the perception that meaningful performance analysis is somewhat of a mythical art. If this book can advance the acceptance of the practicing performance analyst, it will have achieved its goal.
A work of this magnitude is the result of the efforts of many. I would like to take this opportunity to thank:

• All my anonymous reviewers, whose in-depth criticisms played a major role in the organization of the book.
• My partner, Burt Liebowitz, who often challenged my fuzzy thinking to enforce a clarity and accuracy of presentation.
• My daughter Leslie for the typing and seemingly endless retyping of the manuscript as it progressed through its various stages.
• My wife, Janice, a professional writer in her own right, for turning my writing into real English.
• Charles Reeder, who prepared the illustrations for this book, often with "real-time" responses.
• My many customers, who have provided the real-life testing ground for the methodology presented in the book, and especially to Concurrent Computers Corp. and Syntrex Inc., for their kind permission to use the studies presented in chapters 6 and 11, respectively.
• Last, but not least, my editors, Paul Becker and Karen Winget, for their encouragement and guidance in the mechanics and business issues of bringing a new book to press.

ABOUT THE AUTHOR

Dr. Highleyman has over 30 years' experience in the development of real-time on-line data processing systems, with particular emphasis on high-performance multiprocessor fault-tolerant systems and large communications-oriented systems. Other application areas include intelligent terminals, editorial systems, process control, and business applications. Major accomplishments include the first computerized totalizator system for racetrack wagering installed for the New York Racing Association, the first automation of floor trading for the Chicago Board of Trade, the international telex switching systems utilized by ITT World Communications, the fault-tolerant data-base management system used by the New York Daily News, a 6000-terminal lottery system for state lotteries, an electronic billing data collection system for the telephone operating companies, and message switching systems for international cablegram and private network services.

Dr. Highleyman is founder and Chairman of The Sombers Group, a company which has provided turnkey software packages for such systems since 1968.
He is also founder and chairman of MiniData Services, a company using minicomputer technology to bring data processing services to small businesses. Prior to these activities, he was founder and vice-president of Data Trends, Inc., a turnkey developer of real-time systems since 1962. Before that, he was employed by Bell Telephone Laboratories, where he was responsible for the development of the 103 and 202 data sets, and by Lincoln Laboratories, where he worked on the first transistorized computer.

In addition to his management activities at The Sombers Group, Dr. Highleyman is currently active in:

• performance modeling of multiprocessor systems.
• fault-tolerant considerations of multiprocessor systems.
• architectural design of hardware and software for real-time and multiprocessor systems.
• functional specifications for data-processing systems.

He has performed analytical performance modeling on many systems for several clients, including:

A. C. Nielson
Autotote
Bunker Ramo
Concurrent Computer
Digital Equipment Corp.
First National Bank of Chicago
FTC Communications
G.E. Credit Corp.
Harris
Hewlett-Packard
ITT World Communications
MACOM DCC
PYA/Monarch
Smith Kline
Stratus
Syntrex
Systeme
Tandem
Telesciences
Time
Dr. Highleyman received the D.E.E. degree from Brooklyn Polytechnic Institute in 1961, the S.M.E.E. degree from Massachusetts Institute of Technology in 1957, and the B.E.E. degree from Rensselaer Polytechnic Institute in 1955. He holds four U.S. patents and has published extensively in the fields of data communications, pattern recognition, computer applications, and fault-tolerant systems. He also sits or has sat on several boards, including:

• The Sombers Group (Chairman)
• Science Dynamics, Inc.
• MiniData Services, Inc. (Chairman)
• International Tandem User's Group (Past President)
• Vertex Industries
1 Introduction
Ubiquitous, mysterious, wonderful, and sometimes aggravating: computers. They truly are becoming more and more involved in what we do today. What they do often affects our quality of life, from the success of our businesses to the enjoyment of our free time to our comforts and conveniences.

Businesses enter transactions as they occur and obtain up-to-the-minute status information for decision-making. Banks are extending on-line financial services to their corporate customers for interactive money transfers and account status, giving corporate money managers the ultimate in cash-flow management. Call your telephone company or credit card company about a bill, and your charge and payment history appears on the screen for immediate action. See your travel agent for airline tickets, and the computer presents all options, makes your reservations, and issues your tickets.

Time for fun? Buy tickets through your local ticket outlet from the inventory kept on computer. Play the horses or the state lottery; if you're a lucky winner, the computer will calculate your payoff. Not feeling well? Computers will manage your hospital stay, will order your tests, and, of course, will prepare your bills. Other computers will monitor your specimens as they flow through the clinical laboratory, thus ensuring the quality and accuracy of the test results.

Need to communicate? Computers will carry your voice, your data, and your written words rapidly to distant destinations.
And quietly in the background, computers monitor the material fabric of our daily lives, from power and energy distribution to traffic control and sewage disposal.

All of the preceding examples are types of transaction-processing systems. These systems accept a transaction, process it using a base of data, and return a response. A transaction may be an inquiry, an order, a message, a status. The data base may contain customer information, inventory, orders, or system status. A response may be a status display, a ticket, a message, or a command.

For instance, in a wagering system, the wager information is a transaction that is stored in the data base, and the reply is a ticket. Subsequently, the ticket becomes the transaction as the system compares it to the data base to see if it is a winner. The reply is the payoff amount.

In the control system for an electrical power network, the transactions are status changes such as power loading, transformer temperatures, and circuit failures. The data base is the current network configuration and status. The new status change updates the data base, and a command to alter the configuration may be issued as a response (e.g., reduce the voltage to prevent a power overload, thereby avoiding a brownout).

In a message-switching system, the transaction is an incoming message (text, data, facsimile, or even voice). The data base is the set of messages awaiting delivery and the routing procedures. The response is the receipt and delivery acknowledgments returned to the sender and the message routed to the destination.

In an airline or ticket reservation system, the transaction is first an inquiry. Schedule status is returned from the data base, which also holds an inventory of available seats. A subsequent transaction is the order, which will update the inventory. A ticket will be issued in response.

No wonder we become affected, even aggravated, by the performance of these systems.
Have you ever waited several minutes on the telephone for the clerk on the other end to get your billing status? Have you ever watched the anxiety and the anger of a racehorse player trying to get the bet in before the bell goes off, only to be frustrated by an ever-slowing line at the window? Have you ever watched a merchant get impatient waiting for a credit card validation, give up, and make the sale without authorization, thus risking a possible loss? Have you ever ... ? The list goes on.

And that is what this book is all about: the prediction and control of the performance¹ of these transaction-processing systems, which are weaving their way into our lives.

¹Of course, system availability is an equally important concern. Have you ever been told that you can't get a ticket at this time because "the computer is down"? The reliability analysis of these systems is not a topic for this book. However, performance degradation due to techniques used by the systems to ensure reliable operation is a concern and is covered. Techniques for the reliability analysis of these systems may be found in Liebowitz [17].

PERFORMANCE MODELING

We all know that as a computer system becomes loaded, it "bogs down." Response times to user requests get longer and longer, leading to increased frustration and aggravation of
the user population. A measure of the capacity of the system is the greatest load (in transactions per hour, for instance) at which the response time remains marginally acceptable.

Deterioration of response time is caused by bottlenecks within a system. These bottlenecks are common system resources that are required to process many simultaneous transactions; therefore, transactions must wait in line in order to get access to these resources. As the system load increases, these lines, or queues, become longer, processing delays increase, and responsiveness suffers. Examples of such common system resources are the processor itself, disks, communication lines, and even certain programs within the system.

One can represent the flow of each major transaction through a system by a model that identifies each processing step and highlights the queuing points at which the processing of a transaction may be delayed. This model can then be used to create a mathematical expression for the time that it takes to process each type of transaction, as well as an average time for all transactions, as a function of the load imposed on the system. This processing time is, of course, the response time that a user will see. The load at which response times become unacceptable is the capacity of the system. Performance modeling concerns itself with the prediction of the response time for a system as a function of load and, consequently, of its capacity.

A simple example serves to illustrate these points. Figure 1-1a shows the standard symbol used throughout this book for a resource and the attendant queue of transactions awaiting servicing by that resource. The "service time" for the resource is often shown near it (Ts in this case). The service time is the average time it takes the resource to process a transaction queued to it.

Figure 1-1b is a simple view of a transaction-processing computer system. Transactions arrive from a variety of incoming communication lines and are queued to a program (1) that processes these inbound requests. This program requires an average of 80 milliseconds (msec.) to process a transaction, which is then sent to the disk server (2) to read or write data to a disk. The disk server handles these and other requests and requires an average of 50 msec. per request. Once it has completed all disk work, it forwards a response to an output program (3), which returns these and other responses to the communication lines. The output program requires 20 msec. on the average to process each response.

Since the programs and the disk system are serving multiple sources, queues of transactions awaiting service may build in front of each of these servers. As the servers get busier, the queues will get longer, the time a transaction spends in the system will get longer, and the system's response time will get slower.

One implied queue not shown in Figure 1-1b is the processor queue. We assume in this system that many programs are running, many more than are shown. But there is only one processor. Therefore, when a program has work to do, it must wait in line with other programs before it can be given access to the processor and actually run.

Let us now do a little performance analysis using Figure 1-1b (which, by the way, will later be called a "traffic model"). If the system is idle, no queues will build, and an average transaction will work its way through the system in 80 + 50 + 20 = 150 msec.
[Figure 1-1. (a) Resource model: a transaction enters a queue and is serviced by a resource with service time Ts, producing a response. (b) Simple computer system: input program process (1), 80 msec; disk server (2), 50 msec; output program (3), 20 msec.]
(the sum of the individual service times for each server in its path). Not a bad response time.

Now let us look at the response time in a more normally loaded system in which the queue lengths for all servers, including the processor, average 2. That is, on the average, any request for service will find 2 requests in front of it: one being serviced and one waiting for service. The newly entered request will be the second request in line, not counting the request being serviced at the time. (As will be shown later, resource loads of 2/3 will often result in queue lengths of 2. That is, if a server is busy 67 percent of the time on the average, its average queue length will be 2.)

With these queue lengths, each transaction must wait 2 service times before being serviced, and then each transaction must be serviced. Sounds like the response time should triple from 150 msec. to 450 msec., right? Wrong. The response time degradation is even worse, since each program must wait in a queue for the processor. Let us assume that the average time the processor spends running a program is 50 msec. Let us also
assume that the disk server is an intelligent device that does not use the processor and so is not slowed by having to wait in the processor queue.

Remember that the processor is 67 percent loaded, so its queue length is 2. When the input program wants to run, it must wait in the processor queue for 2 x 50 = 100 msec. and then must spend 80 msec. serving the incoming transaction. So far as the incoming transaction is concerned, the service time of this program is 180 msec., not 80 msec. Likewise, the effective service time of the output program in the loaded processor is 2 x 50 + 20 = 120 msec. rather than 20 msec. The disk server time remains at 50 msec. since it doesn't rely on the processor.

An average queue length of 2 in front of each server now causes the response time to triple relative to the effective service times. Thus, at a load of 2/3, system response time degrades from 150 msec. to 3(180 + 50 + 120) = 1050 msec!

Note that we could plot a curve of response time versus load if we just knew how to relate queue size to load. Then, given the maximum acceptable response time, we could determine that load representing the capacity of the system, as shown in Figure 1-2. This is what performance modeling is all about.
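The arithmetic of this example is easy to mechanize. The short Python sketch below is illustrative only: the 80/50/20 msec service times, the 50 msec processor time, and the average queue length of 2 are the chapter's example values, while the function and variable names are my own.

```python
# Toy version of the Figure 1-1b traffic-model arithmetic (example values
# from the chapter; this sketch is not code from the book).

PROC_TIME = 50  # msec the processor spends running a program, on average
QUEUE_LEN = 2   # average queue length at each server (a 2/3 load)

# (name, service time in msec, needs the processor?)
servers = [
    ("input program",  80, True),
    ("disk server",    50, False),  # intelligent device: never queues for the CPU
    ("output program", 20, True),
]

def effective_service_time(service, uses_cpu, queue_len):
    """Service time as seen by a transaction: a program first waits through
    the processor queue (queue_len * PROC_TIME) before it can run."""
    cpu_wait = queue_len * PROC_TIME if uses_cpu else 0
    return cpu_wait + service

# Idle system: no queues anywhere, so response time is just the sum of
# the raw service times: 80 + 50 + 20 = 150 msec.
idle_response = sum(service for _, service, _ in servers)

# Loaded system: each transaction waits through a queue of 2 at every
# server, then is served itself, so each server costs (2 + 1) effective
# service times: 3 * (180 + 50 + 120) = 1050 msec.
loaded_response = sum(
    (QUEUE_LEN + 1) * effective_service_time(service, uses_cpu, QUEUE_LEN)
    for _, service, uses_cpu in servers
)

print(idle_response, loaded_response)  # 150 1050
```

Setting QUEUE_LEN to 0 reproduces the idle case; the jump to 1050 msec rather than the naive 450 comes entirely from the processor-queue waits folded into the two programs' effective service times.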
USES OF PERFORMANCE MODELS

Ideally, a performance model should be validated and tuned. Its results should be compared to measured results. If they are significantly different, the reasons should be understood and the model corrected. Usually, this results in the inclusion of certain processing steps initially deemed trivial or in the determination of more accurate parameter values.

A performance model, no matter how detailed it may be, is nevertheless a simplification of a very complex process. As such, it is subject to the inaccuracies of simplification. However, experience has shown these models to be surprisingly accurate. Moreover, the trends predicted are even more accurate and can be used as very effective decision tools in a variety of cases, such as
1. Performance Prediction. The performance of a planned system can be predicted before it is built. This is an extremely valuable tool during the design phase, as bottlenecks can be identified and corrected before implementation (often requiring significant architectural changes), and performance goals can be verified.

2. Performance Tuning. Once a system is implemented, it may not perform as expected. The performance model can be used along with actual performance measurements to "look inside" the system and to help locate the problems by comparing actual processing and delay times to those that are expected.

3. Cost/Benefit of Enhancements. When it is planned to modify or enhance the system, the performance model can be used to estimate the performance impact of the proposed change. This information is invaluable to the evaluation of the proposed change. If the change is being made strictly for performance purposes, then the cost/benefit of the change can be accurately determined.
[Figure 1-2. Response time versus load. Response time rises steeply as load increases; the load at which the curve crosses the maximum acceptable response time is the capacity of the system.]
4. System Configuration. As a product is introduced to the marketplace, it often has several options that can be used to tailor the system's performance to the user's needs, for example, the number of disks, the power of the processor, or communication line speeds. The performance model can be packaged as a sales tool to help configure new systems and to give the customer some confidence in the system's capacity and performance.
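To make the capacity idea behind Figure 1-2 concrete, here is a minimal sketch of one way to tie queue size to load. It assumes the simple single-server relation developed later in the book (average queue length ρ/(1 - ρ) at load ρ, so response time is the idle service time divided by 1 - ρ) and, as a crude simplification, treats the whole transaction path as one server. The 150 msec service time and the 1-second tolerance are illustrative values, not the book's.

```python
# Response time versus load for a single server, and the capacity implied
# by a maximum acceptable response time. Assumes the single-server result
# developed later in the book: at load rho, the average queue length is
# rho / (1 - rho), so response = (queue + 1) * T = T / (1 - rho).
# (Check: at rho = 2/3 the queue length is 2 and the response time
# triples, the naive 450 msec figure of the text; the fuller example
# gave 1050 msec because programs also queue for the processor, which
# this one-server view ignores.)

def queue_length(load):
    """Average queue length at a server running at the given load (0 <= load < 1)."""
    return load / (1.0 - load)

def response_time(service_time, load):
    """Average response time: wait through the queue, then be served."""
    return (queue_length(load) + 1.0) * service_time

def capacity(service_time, max_response):
    """Largest load with an acceptable response time:
    T / (1 - rho) <= max_response  =>  rho <= 1 - T / max_response."""
    return max(0.0, 1.0 - service_time / max_response)

T = 150.0  # msec: the idle-system response time from the example
for load in (0.0, 0.5, 2.0 / 3.0, 0.9):
    print(f"load {load:.2f}: response {response_time(T, load):7.1f} msec")

# If users will tolerate at most a 1-second (1000 msec) response:
print("capacity =", capacity(T, 1000.0))
```

Plotting response_time against load traces the knee-shaped curve of Figure 1-2; capacity is simply the point where that curve crosses the tolerance line.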
THE SOURCE OF PERFORMANCE PROBLEMS
It is interesting to speculate about the main cause of the poor performance that we see in practice. Is it the hardware: the processors, memories, disks, and communication systems of which these systems are built? Is it the software: the operating systems and applications that give these systems their intelligence? Is it the users of the systems, through inexperienced or hostile actions? The data base organization? The system managers?
Actually, it is none of these. The second most common cause of poor performance is that the designers of the system did not understand performance issues sufficiently to ensure adequate performance. The first most common cause is that they did not understand that they did not understand.

If this book does nothing more than focus attention on those issues that are important to performance so that designers may seek help when they need it, the book will have achieved a major part of its goal. If it allows the designers to make design decisions that lead to an adequately performing system, it will have achieved its entire goal.

THE PERFORMANCE ANALYST

The effective prediction of performance has been a little-used art, and for a very good reason. It requires the joining of two disciplines that are inherently contradictory. One is that of the mathematical disciplines, from probability and statistical theory to the Laplace-Stieltjes transforms, generating functions, and birth-death processes of queuing theory. The other discipline is that of the system analyst: data-base design, communication protocols, process structure, and software architecture.

Qualified system analysts are certainly adept at algebra. They most likely are a little rusty at elementary calculus, probability theory, and statistics. Do they know how to solve differential-difference equations, apply Bessel functions, or understand the attributes of ergodicity? Probably not. They don't need to know, and they probably don't want to know.

Practicing applied mathematicians, for the most part, have not been exposed to the inner workings of contemporary data-processing systems. Third-normal form, pipelining, and SNA may be not much more than words to them. And when they do understand a system's performance problem, it is difficult for them to make headway on it, because the assumptions required for reasonable calculations often diverge so far from the real world that the mathematicians' old college professor, and certainly the great body of contemporary colleagues, would never approve.

Performance analysts are in an awkward position. They must be reasonably accomplished in system analysis to understand the nature of the system they are analyzing, yet they must be practical enough to make those assumptions and approximations necessary to solve the problem. Likewise, they must understand the application of some fairly straightforward mathematical principles to the solution of the problem without being so embarrassed by the accompanying assumptions as to render themselves ineffective. In short,
it is difficult for them to make progress on it, because the assumptions required for reasonable calculations often diverge so far from the real world that the mathematicians' old college professors, and certainly the great body of contemporary colleagues, would never approve.

Performance analysts are in an awkward position. They must be reasonably accomplished in system analysis to understand the nature of the system they are analyzing, yet they must be practical enough to make those assumptions and approximations necessary to solve the problem. Likewise, they must understand the application of some fairly straightforward mathematical principles to the solution of the problem without being so embarrassed by the accompanying assumptions as to render themselves ineffective. In short,
performance analysts must be devout imperfectionists. The only caveat is that they must set forth clearly and succinctly the imperfections as part of the analysis.

A performance model is a very simple mathematical characterization of a very complex physical process. At best, it is only an approximate statement of the real world (though actual results have been surprisingly useful and accurate). But isn't it better to be able to make some statement about anticipated system performance than none? Isn't it better to be able to say that response time should be about 2 seconds (and have some confidence that you are probably within 20% of the actual time) than to design a system for a 1-second response time and achieve 5? Based on actual experience, the most honest statement that can be made without a performance analysis is of the form "The system will probably have a capacity of 30 transactions per second, but I doubt it." The purpose of this book is to eliminate the phrase "but I doubt it."
THE STRUCTURE OF THIS BOOK

To that end, this book takes on both system analysis and mathematics. It is designed to give applied mathematicians the background they need to understand the structure of contemporary transaction-processing systems so they can bring their expertise to bear on the analysis of their performance. It is also designed to give system designers the mathematical tools required to predict performance. In either case, we have created performance analysts.

It is the author's strong contention that a tool is most useful if the users are so familiar with it that they can make it themselves. This applies to the various relationships and equations which are used in this book. Most will be derived, at least heuristically if not rigorously; in this way, the assumptions that go into the use of the particular tool are made clear. Most derivations are simple, requiring only basic algebra, a little calculus perhaps, and an elementary knowledge of probability and statistics. These derivations are included in the main body of the text. More complex derivations are left for the appendixes. Just occasionally a derivation is so complex that only a reference is given.

The book is structured to support the system analyst seeking better mathematical tools, the mathematician seeking a better system understanding, and either one seeking anything in between. This structure is shown in Figure 1-3.

Chapter 2 is a major review of the contemporary technology involved in transaction-processing systems. It will be most useful to those seeking a better understanding of the workings of these systems, but this chapter could be skimmed or bypassed by system analysts knowledgeable in transaction processing. Chapter 3 gives a simple but in-depth example of performance modeling based on chapter 2 material, extended by some elementary mathematics introduced as the modeling progresses. This chapter is a preview of the rest of the book.
A thorough understanding of chapters 2 and 3 will equip the reader with the tools necessary to perform elementary performance analyses. The rest of the book then hones the tools developed in these two chapters.
[Figure 1-3: The structure of this book. Chapter 2 (system background) and Chapter 4 (mathematical background) lead into Chapter 3 (a look at performance modeling), which in turn leads into Chapters 5-8 (communications; processor and operating system; application programs; data base), Chapter 9 (fault tolerance), Chapter 10 (documentation), and Chapter 11 (case study).]
Chapter 4 presents a variety of mathematical tools that are useful in certain situations. While the bulk of the chapter is a review of queuing theory, which allows us to relate queue lengths to resource loads, many other useful tools are presented, including basic concepts in probability and statistics and expansions of useful series.

Chapters 5 through 8 expand concepts relative to the major building blocks of a transaction-processing system. These building blocks include the communication network (chapter 5), the processor and operating system (chapter 6), the data base (chapter 7), and the application programs (chapter 8). Chapter 9 extends these concepts to fault-tolerant systems. These chapters contain insight for both the analyst and the mathematician. System concepts are explored in more depth, and the application of the mathematical tools to these areas is illustrated.

Chapters 10 and 11 are summary chapters. Chapter 10 discusses the organization and components of the formal performance model so that it will be useful and maintainable as the system grows and is enhanced. Chapter 11 includes a complete example of a performance model. References are given following chapter 11 and are organized alphabetically. Appendix 1 summarizes the notation used, and Appendix 2 summarizes the major queuing relationships. Further appendixes give certain derivations that will be of interest to the more serious student.

The book is not intended to be exhaustive; the rapidly progressing technology in TP systems prevents this anyway. Nor is it intended to provide a cookbook approach to performance modeling. The subject is far too complex for that. Rather, it is intended to provide the tools required to attack the performance problems posed by new system architectures and concepts.

The book is also not intended to be a programming guide for performance model implementation. Though the complexity and interactive nature of many models will require them to be programmed in order to obtain results, it is assumed that the performance analyst is either a qualified programmer or has access to qualified staff. No programming examples are given; however, useful or critical algorithms are occasionally presented.

The author highly recommends that the serious student read references [19] and [24]. James Martin and Thomas Saaty both present highly readable and in-depth presentations of many of the concepts necessary in performance analysis. Also, Lazowska [16] provides an interesting approach by which many performance problems can be attacked using fairly simple and straightforward techniques.
SYMBOLOGY

One final note on symbology before embarking on this subject. The choice of parameter symbols, I find, is one of the most frustrating and mundane parts of performance analysis. Often, symbols for several hundred parameters must be invented; there are just not enough characters to go around without enormous subscripting. Therefore, the choice of symbols is often not reflective of what they represent. This is a problem we live with. The symbols used in the body of this book are summarized in Appendix 1 for easy reference.

Notwithstanding the problems of naming conventions mentioned above, the author does impose certain restrictions:
1. All symbols are at most one character plus subscripts. Thus, TS is never used for service time, but Ts may be. This prevents products of symbols from being ambiguous. The only exception is var(x), used to represent the variance of x.
2. Only characters from the Arabic alphabet are used (uppercase and lowercase A through Z, numerals 0 through 9).

There are two reasons for doing this:
a. Most of the typewriters, word processors, and programming languages I use provide little if any support for the Greek alphabet.
b. More important, performance models are most useful if understood and used by middle and upper technical management. I usually find this level of user to be immediately and terminally intimidated by strings of Greek letters.

This convention can be particularly disturbing to the applied mathematician who is used to ρ, λ, and μ as representing occupancy, arrival rates, and service rates, respectively. Instead, in this text he or she will find L, R, and T as representing load, arrival rate, and service time. Rather than ρ = λ/μ, he will find L = RT. However, the first time he or she tries to program a model in Greek, he or she will quickly learn to translate.

The only exception to this is delta. Δ is used to indicate a small increment, and δ is used to represent the delta (impulse) function in certain derivations. They never appear in resulting expressions.

One other convention used is the floor (⌊ ⌋) and ceiling
Tc = [pi(mii + mio) + pu(mui + muo)]/s    (3-4)
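As a numerical check (my own calculation, not the author's), equation 3-4 can be evaluated with the input values that appear later in Table 3-4:

```python
# Communication-line response-time component (equation 3-4).
# Message lengths and line speed are the Table 3-4 input values.
p_i, p_u = 0.35, 0.65     # inquiry/update transaction probabilities
s = 960                   # communication line speed (bytes per second)
m_ii, m_io = 20, 400      # inquiry request/reply lengths (bytes)
m_ui, m_uo = 200, 15      # update request/reply lengths (bytes)

T_c = (p_i * (m_ii + m_io) + p_u * (m_ui + m_uo)) / s
print(round(T_c, 4))      # seconds of line time per average transaction
```

Nearly 0.3 seconds of the response time is thus fixed line time, which is why the communication component dominates at low loads.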
Request handler. The component of response time introduced by the request handler comprises the following items:
1. The service time for servicing the request.
2. An interprocess message time for sending the request to the server queue. (Where we put this is arbitrary. For this model, we will load interprocess message times onto the sender.)

The request handler service time must include the process dispatch time, since when the requestor is ready to service an item, it must first get in line with the other processes to wait for the processor. Once it has the processor, it completes its service before relinquishing it. Let

td = average process dispatch time (seconds).
Then

Ti = tri + td + tipm    (3-5)

td will be evaluated later.
Server. The server response-time component comprises the following steps:
1. A wait in the server queue.
2. The service time for processing the request and formulating the reply.
3. Interprocess message times to send data-base requests to the data-base manager queue (nd requests per transaction).
4. The data-base manager service time for each disk request.
5. An interprocess message time to send the reply to the outgoing requestor.

Let us define the following terms:
tqi = delay time for inquiry service (queue plus service time, in seconds),
tqu = delay time for update service in seconds.
Since the probability of these two types of transactions is pi and pu, respectively, the server response-time component, Ts, is

Ts = pi tqi + pu tqu    (3-6)
By further defining

Tb = data-base manager delay time (seconds),
Lsi = load on the inquiry server,
Lsu = load on the update server,

we can express the response-time component for an inquiry transaction, tqi, as

tqi = (tsi + 2td + Tb + 2tipm)/(1 - Lsi)    (3-7)

and for an update transaction, tqu, as

tqu = (tsu + 3td + 2Tb + 3tipm)/(1 - Lsu)    (3-8)
Remember that an inquiry requires one disk access and that an update requires two disk accesses. The server process must be dispatched once upon the receipt of the request and once upon the receipt of each disk response. The server loads are, respectively,

Lsi = pi R(tsi + 2td + Tb + 2tipm)    (3-9)

Lsu = pu R(tsu + 3td + 2Tb + 3tipm)    (3-10)
where R = transaction rate (transactions per second).

Data-base manager. Note that the data-base manager time, Tb, becomes a part of the server service time because it is in a closed loop with the server; that is, the server must wait for each disk request to be completed before it can go on. Thus, this time becomes part of its service time. Since the data-base manager has its own queue, the server delay time is a solution to a compound queue problem, but in a straightforward and obvious way. Letting

Lb = data-base manager load,

then the data-base manager delay time, Tb, is

Tb = (tb + ta + tipm)/(1 - Lb)    (3-11)

where

Lb = R(pi + 2pu)(tb + ta + tipm)    (3-12)
Reply handler. Once a reply has been issued by a server, it suffers the following further delays:

1. A wait in the reply handler queue.
2. The service time for processing the reply.
Letting

Lo = reply handler load,

the reply handler response-time component, To, is

To = (tro + td)/(1 - Lo)    (3-13)

where

Lo = R(tro + td)    (3-14)
Dispatch time. We now have expressions for all parameters based on known inputs, except for the dispatch time, td. This is the average time that a process must wait in the processor queue (the ready list, as described in chapter 2) before gaining access to the processor.
For each transaction, the following dispatches take place:

Process               Number of dispatches    Service time
Requestor
  Requests            1                       tri + tipm
  Replies             1                       tro
Server
  Inquiries           2pi                     tsi + 2tipm
  Updates             3pu                     tsu + 3tipm
Data-base manager     pi + 2pu                tb
Total dispatches      2 + 3pi + 5pu

From this list, one can state that the dispatch rate, rd, is

rd = R(2 + 3pi + 5pu)    (3-15)

The average process time per dispatch, tp, is calculated by summing the various process service times weighted by their probabilities of occurrence for a transaction. This probability of occurrence is the ratio of their frequencies of occurrence in a transaction to the total number of dispatches per transaction. Thus, average process time, tp, may be expressed as

tp = [tri + tro + pi tsi + pu tsu + (1 + pu)tb + (3 + pu)tipm]/(2 + 3pi + 5pu)    (3-16)
The processor load, Lp, is

Lp = rd tp    (3-17)

Thus, the average dispatch time, td (the time spent waiting in the processor queue but excluding the service time), is, from equation 3-1,¹

td = [Lp/(1 - Lp)] tp    (3-18)
Model summary. This model is summarized in Tables 3-2 and 3-3. Table 3-2 lists all model parameters, separated into four categories:

• Result parameters. Those calculated parameters that are of most probable interest as a result (only T in this case).
• Input variables. Those parameters which are most likely to be varied to play "what if" games (only R in this case).

¹A more accurate approach would calculate a different dispatch time for each process by excluding the processor load of the process being evaluated. See Appendix 6 and a comparable example in chapter 8.
• Input parameters. Those parameters which characterize the system and for which values are assumed for purposes of the model.
• Intermediate parameters. All calculated parameters except for result parameters.

TABLE 3-2. PERFORMANCE MODEL PARAMETERS

1. Result Parameters
T      Average transaction response time (seconds)

2. Input Variables
R      Average system transaction rate (transactions per second)

3. Input Parameters
mii    Average inquiry request message length (bytes)
mio    Average inquiry reply message length (bytes)
mui    Average update request message length (bytes)
muo    Average update reply message length (bytes)
pi     Probability of an inquiry transaction
pu     Probability of an update transaction
s      Communication line speed (bytes per second)
tb     Average data-base manager processing time per disk request (seconds)
ta     Average disk access time (seconds)
tipm   Interprocess message time (seconds)
tri    Average request handler processing time for a request (seconds)
tro    Average reply handler processing time for a reply (seconds)
tsi    Average inquiry server processing time for an inquiry (seconds)
tsu    Average update server processing time for an update (seconds)

4. Intermediate Parameters
Lb     Data-base manager load
Lo     Reply handler load
Lp     Processor load
Lsi    Inquiry server load
Lsu    Update server load
rd     Process dispatch rate (dispatches per second)
td     Average dispatch time (waiting time for the processor, in seconds)
tp     Average process time per dispatch (seconds)
tqi    Average inquiry server delay time (seconds)
tqu    Average update server delay time (seconds)
Tb     Average data-base manager delay time required to process a disk request (seconds)
Tc     Communication-line response-time component (seconds)
Ti     Request handler response-time component (seconds)
To     Reply handler response-time component (seconds)
Ts     Server response-time component (seconds)

Table 3-3 summarizes the equations, with equation numbers to reference back to the text. Note the top-down presentation of all parameter expressions in Table 3-3. The expression for an intermediate parameter is not presented until it appears in a previous expression. This is an invaluable aid in organizing a model and in organizing your own thought process.
Model Results

The average response time, T, as a function of the transaction load, R, is shown in Figure 3-4, using the values of input parameters largely suggested in previous sections and summarized in Table 3-4. Note the rather precipitous break in the curve, much sharper than we would expect from a T/(1 - L) relationship. This is usually caused by a "sleeper" in the system: a bottleneck that contributes little delay at low loads but becomes the system bottleneck at high loads. In fact, with this response-time characteristic, the system will "break" very suddenly as load is increased, and it is important to stay away from the knee of the curve.

If the system capacity is taken as four transactions per second (67% of full capacity), we are fairly safe, with an average response time of about 0.6 seconds. Note that if we

TABLE 3-3. PERFORMANCE MODEL SUMMARY

1. Response time
T = Tc + Ti + Ts + To    (3-3)

2. Communication line
Tc = [pi(mii + mio) + pu(mui + muo)]/s    (3-4)

3. Request handler
Ti = tri + td + tipm    (3-5)

4. Servers
Ts = pi tqi + pu tqu    (3-6)
tqi = (tsi + 2td + Tb + 2tipm)/(1 - Lsi)    (3-7)
tqu = (tsu + 3td + 2Tb + 3tipm)/(1 - Lsu)    (3-8)
Lsi = pi R(tsi + 2td + Tb + 2tipm)    (3-9)
Lsu = pu R(tsu + 3td + 2Tb + 3tipm)    (3-10)

5. Data-base manager
Tb = (tb + ta + tipm)/(1 - Lb)    (3-11)
Lb = R(pi + 2pu)(tb + ta + tipm)    (3-12)

6. Reply handler
To = (tro + td)/(1 - Lo)    (3-13)
Lo = R(tro + td)    (3-14)

7. Dispatch time
td = [Lp/(1 - Lp)] tp    (3-18)
Lp = rd tp    (3-17)
tp = [tri + tro + pi tsi + pu tsu + (1 + pu)tb + (3 + pu)tipm]/(2 + 3pi + 5pu)    (3-16)
rd = R(2 + 3pi + 5pu)    (3-15)
[Figure 3-4: System response time. The average response time T and the update-server component tqu are plotted against the transaction rate R (0 to 6 transactions/second).]
TABLE 3-4. INPUT PARAMETER VALUES

Parameter   Meaning                                          Value
mii         Inquiry request length (bytes)                   20
mio         Inquiry reply length (bytes)                     400
mui         Update request length (bytes)                    200
muo         Update reply length (bytes)                      15
pi          Probability of an inquiry                        0.35
pu          Probability of an update                         0.65
s           Communication line speed (bytes per second)      960
tb          Data-base manager processing time (seconds)      .010
ta          Disk access time (seconds)                       .035
tipm        Interprocess message time (seconds)              .005
tri         Request handler processing time (seconds)        .005
tro         Reply handler processing time (seconds)          .005
tsi         Inquiry server processing time (seconds)         .010
tsu         Update server processing time (seconds)          .015
should push the capacity up by 25% to 5 transactions per second, we only get a 15% increase in response time (to 0.7 seconds). However, at this point, a mere 10% increase in load gives almost a 50% increase in response time!

This fairly low capacity of four transactions per second is not an uncommon characteristic of TP systems as a per-processor measure. One to ten transactions per second per processor is the common range for contemporary systems. This emphasizes the great importance of performance analysis and its companion, performance planning.

Note that we can use the model to "peek" into the system to analyze its performance more deeply. Not only can we plot the component response times (Tc, Ti, Ts, To, tqi, and tqu), but we also can evaluate component loading (Lsi, Lsu, Lb, Lo, and Lp). This capability is our tool to determine where the system can best be modified to enhance its performance.

Let us apply this tool to our example system. Peeking inside will identify our "sleeper," the update server. Its response-time component, tqu, is also shown in Figure 3-4. It is initially hidden by the communications component, which is independent of load because of the point-to-point lines. However, it becomes the bottleneck at six transactions per second. What can we do to enhance the system? Simply add another update server, and we have doubled the capacity of the system. No extra hardware needed. In fact, the model will show the following component loads for the single-server case at a load of six transactions per second:
Inquiry server (Lsi)       0.27
Update server (Lsu)        0.94
Data-base manager (Lb)     0.49
Reply handler (Lo)         0.06
Processor (Lp)             0.35
By adding a second update server, its load is reduced from a very high load (0.94) to a reasonable load (0.47) at six transactions per second. Since the hardware components of the system, the disk and processor, are carrying only a 35% to 50% load, significant capacity enhancement could be achieved by moving into a multiserver environment without purchasing any new hardware. The model could be changed to reflect this and could be used to predict the optimum number of servers required to balance the software with the hardware. Of course, dynamic servers, as described in chapter 2, would be self-adjusting. The performance analysis would then be directed to predict the performance of that system.
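The arithmetic behind the two-server claim can be sketched directly from equation 3-10. The intermediate values for td and Tb below are my own, taken from a single-server evaluation of the model at six transactions per second, so this is only an approximation (a true multiserver solution would shift them slightly):

```python
# Update-server load at R = 6 when the update work is spread over n
# identical servers (equation 3-10 divided by n). t_d and T_b are
# intermediate values from a single-server run of the model, so this
# is an approximation rather than a full multiserver solution.
def update_server_load(n, R=6, p_u=0.65, t_su=0.015,
                       t_d=0.0049, T_b=0.099, t_ipm=0.005):
    return p_u * R * (t_su + 3 * t_d + 2 * T_b + 3 * t_ipm) / n

print(round(update_server_load(1), 2))   # one server: near saturation
print(round(update_server_load(2), 2))   # two servers: comfortable load
```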
Analysis Summary

The above discussion has presented, to some extent, the content of a proper performance analysis. The system description has spanned chapter 2 and this chapter, though the
description would normally be contained in a more formal and localized section of the performance analysis. A scenario model is developed along with a traffic model; significant explanatory text accompanies these developments. The model is summarized, and results are presented (no program was necessary to allow calculation of this simple model), as well as recommendations for system performance improvement at no additional hardware cost. In short, not only has an analysis been undertaken, but it also has been thoroughly documented.

Not only have we completed a real performance model, but we also have completed our first performance analysis! This analysis was completed using only a simple queuing equation, a knowledge of TP systems as presented in chapter 2, and a little native ingenuity. This is all it takes to be a performance analyst. Why, then, are there so few of us? The remaining chapters add some tricks to our tool kit, give us a better understanding of our tools, and stress the importance of documenting our model and analysis.
4 Basic Performance Concepts
The mathematical foundation required for performance modeling is very much a function of the depth to which the analyst is interested in diving, from skimming the surface to getting in over one's head. As we saw in chapter 3, a great deal of performance modeling can be achieved with one queuing equation (equation 3-1), a lot of simple algebra, and some system sense.
However, there are many unusual and complex TP system architectures that can be more accurately analyzed with some better tools. This chapter can be considered the "tool kit" for the book. Three of the six sections of this chapter provide the basic foundation of queuing theory for the remaining chapters. Below is a brief overview of these three sections.
1. Queues-An Introduction gives a simple, intuitive derivation of perhaps the most important relationship for the performance analyst: the Khintchine-Pollaczek equation, which characterizes a broad range of queue types. Equation 3-1 is a subset of this equation. Nonmathematicians should find the intuitive derivation understandable and illuminating in terms of understanding the behavior of queues. Those knowledgeable in queuing theory will find definitions of terms and notation used throughout the rest of the book.
2. Characterizing Queuing Systems summarizes the Kendall classification system for queues. This classification system is used throughout the book.
3. Comparison of Queue Types summarizes the queuing concepts presented in this chapter by discussing the comparative behavior of queuing systems as a function of service-time distribution, number of servers, and population size.

The other three sections of the chapter deal with a broad range of mathematical tools, including probability concepts, useful series, and queuing theory. Those less inclined to mathematics will be pleased to know that the rest of the book is completely understandable without a detailed knowledge of these tools.
QUEUES-AN INTRODUCTION

As we have seen in the simple performance examples in chapter 3, queues are all-important to the study of system performance. In the simplest of terms, a queue is a line of items awaiting service by a commonly shared resource (which we hereinafter will call a server). Let us first, therefore, take an intuitive look at the phenomenon of queuing. Through just common sense and a little algebra, we can come close to deriving the Khintchine-Pollaczek equation, which is one of the most important tools available for performance analysis. Its formal derivation is presented in Appendix 3, but it is not much more difficult than that presented here.

Let us consider a queue with very simple characteristics. Interestingly enough, these characteristics are applicable as a good approximation to most queues that we will find in real-life systems.
These characteristics are:

1. There is only a single server.
2. No limit is imposed on the length to which the queue might grow.
3. Items to be serviced arrive on a random basis.
4. Service order is first-in, first-out.
5. The queue is in a steady-state condition, i.e., its average length measured over a long enough period is constant.
In Figure 4-1, a server is serving a line of waiting transactions. On the average, it takes the server T seconds to service a transaction. Some transactions take more, some take less, but the average is T. T is called the service time of the server. The average number of transactions in the system, which we will call queue length, is Q transactions. This comprises W transactions waiting in line plus the transaction (if any) currently being serviced. Finally, transactions are arriving at a rate of R transactions per second.

The server is busy, or occupied, RT of the time. (If transactions are arriving at a rate of R = 1 per second, and if the server requires T = 0.5 seconds to service a transaction, then it is busy 50% of the time.) This is called the occupancy of the server, or the load (L) on the server:

L = RT    (4-1)
[Figure 4-1: A queuing system. Transactions arrive at rate R and wait in line for a single server with average service time T. Tq = queue time (time waiting in line); Td = delay time (time in line plus service time).]
L also represents the probability that an arriving transaction will find the server busy. (Obviously, if the server is not busy, there is no waiting line.) When a transaction arrives to be serviced, it will find in front of it, on the average, W transactions waiting for service. With probability L, it will also find a transaction in the process of being serviced. The servicing of the current transaction, if any, will have been partially completed. Let us say that only kT time is left to finish its servicing. The newly arrived transaction will have to wait in line long enough for the current transaction to be completed (kT seconds, L of the time) and then for each transaction in front of it to be serviced (WT seconds). Therefore, it must wait in line a time Tq (the queue time):

Tq = WT + kLT    (4-2)
From the time that it arrived in line to the time that its servicing begins, an average of W other transactions must have arrived to maintain the average line length. Since transactions are arriving at R transactions per second, then

W = Tq R

or

Tq = W/R    (4-3)
Equating 4-2 and 4-3 and solving for the waiting-line length gives (using 4-1)

W = kL²/(1 - L)    (4-4)
The total length of the queue as seen by an arriving transaction is the waiting line, W, plus a transaction being serviced L of the time:

Q = W + L    (4-5)

or

Q = [L/(1 - L)][1 - (1 - k)L]    (4-6)
The delay time, Td, is determined in a similar manner. It is the total amount of time the transaction must wait in order to be serviced: its waiting time in line plus its service time. During the time that the transaction is in the system (Td), Q transactions must arrive to maintain the steady state:

Q = Td R = Td L/T    (4-7)
Setting this equal to equation 4-6 and solving for the delay time, Td, gives

Td = [1/(1 - L)][1 - (1 - k)L]T    (4-8)

From 4-2 and 4-4,

Tq = [kL/(1 - L)]T    (4-9)
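These four relationships are compact enough to wrap in a single function; the packaging below is mine, but the expressions are exactly equations 4-4, 4-6, 4-8, and 4-9:

```python
# Khintchine-Pollaczek relationships for a single-server queue with
# load L, mean service time T, and distribution coefficient k.
def kp(L, T, k):
    W = k * L * L / (1 - L)               # mean number waiting (4-4)
    Q = L * (1 - (1 - k) * L) / (1 - L)   # mean number in system (4-6)
    Td = (1 - (1 - k) * L) * T / (1 - L)  # mean delay time (4-8)
    Tq = k * L * T / (1 - L)              # mean queue time (4-9)
    return W, Q, Td, Tq
```

For example, kp(0.5, 1.0, 1.0) gives a delay time of 2.0 seconds: twice the service time at a 50% load.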
Equations 4-4, 4-6, 4-8, and 4-9 are the basic queuing equations for a single server. As stated previously, the primary assumptions inherent in these expressions are that transactions arrive completely independently of each other and are serviced in order of arrival. The equations are, therefore, very general expressions of the queuing phenomenon. They are quite accurate when the transactions are being generated by a number of independent sources, as long as the number of sources is significantly greater than the average queue lengths.

A common limiting case of accuracy occurs when there is a small number of sources, each of which can have only one outstanding transaction at a time (a common case in computer systems). In this case, the queue length cannot exceed the number of sources, and the arrival rate becomes a function of server load. (Arrivals will slow down as queues build, since sources must await servicing before generating a new transaction.) However, in this case, the above expressions will be conservative, since they will in general predict queue lengths somewhat greater than those that actually will be experienced. (Obviously, the limiting case of only one source will never experience a queue, although equation 4-6 will predict one.)

The parameter k in these equations is a function of the service-time distribution. It is the average amount of service time that is left to be expended on the current transaction being serviced when a new transaction arrives. Let us look at these expressions for certain important cases of service-time distributions.
Random Service Times

For exponential service times, the probability that the remaining service time will be greater than a given amount t is exponential (e^(-t/T)). We discuss this and other probability concepts later in this chapter. An exponential distribution has the characteristic that the remaining service time after any arbitrary delay, assuming that the servicing is still in progress, is still exponential; i.e., it has no memory. Thus, one has the following interesting situation: if the average service time of the server is T, and if the server is currently busy (for no matter how long), one is going to have to wait an average time of T for the service to complete. Since k is the proportion of servicing remaining when a new trans-
action enters the queue, then k = 1 for exponential service times. Thus, equations 4-6, 4-8, and 4-9 become

Q = L/(1 - L)    (4-10)

Td = T/(1 - L)    (4-11)

and

Tq = [L/(1 - L)]T    (4-12)
These are the forms of the queuing expressions often seen in the literature and represent a typical worst case for service-time distributions (though it is possible to construct unusual distributions that are worse than exponential). Equation 4-11 is the form that was used in chapter 3 as equation 3-1.
Constant Service Times

If the service time is a constant, then on the average, half of the service time will remain when a new transaction enters the queue. Therefore k = 1/2, and

Q = [L/(1 - L)](1 - L/2)    (4-13)

Td = [(1 - L/2)/(1 - L)]T    (4-14)

and

Tq = [(L/2)/(1 - L)]T    (4-15)
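A direct simulation is a quick way to convince yourself of both special cases. This sketch (mine, not from the text) drives a single-server FIFO queue with random (Poisson) arrivals using the Lindley recursion w' = max(0, w + s - a), where s is a service time and a the gap to the next arrival, and the measured mean delay can be compared with equations 4-11 and 4-14:

```python
import random

# Simulate a single-server FIFO queue; return the mean delay time
# (wait in line plus service) over n arrivals.
def sim_delay(rate, service, n=200_000, seed=1):
    rng = random.Random(seed)
    wait = total = 0.0
    for _ in range(n):
        s = service(rng)
        total += wait + s                  # this item's delay time
        a = rng.expovariate(rate)          # random interarrival gap
        wait = max(0.0, wait + s - a)      # Lindley recursion
    return total / n

T, L = 1.0, 0.5                            # service time and load; R = L/T
exp_delay = sim_delay(L / T, lambda rng: rng.expovariate(1 / T))
const_delay = sim_delay(L / T, lambda rng: T)
# Predicted: T/(1 - L) = 2.0 for exponential service (4-11),
# (1 - L/2)T/(1 - L) = 1.5 for constant service (4-14).
```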
These expressions show queues and delays that are somewhat less than those predicted for exponential service times. Constant service times generally represent the best case.

In general, the distribution coefficient k depends on the service-time distribution as

k = (1/2)[1 + var(T)/E²(T)] = E(T²)/2E²(T)    (4-16)

where

E(T) = the mean of the service-time distribution (simply represented as T herein),
var(T) = the variance of the service-time distribution,
and

E(T²) = the second moment of the service-time distribution.
This result is derived in Appendix 3. We will hereafter refer to k as the distribution coefficient of the service time for the server. For exponentially distributed service times, as we will see later, the standard deviation equals the mean. Since the variance is the square of the standard deviation, then k = 1, consistent with our argument above. For constant service times, the variance is zero, and k = 1/2, as also just shown.
Uniform Service Times

A service time that can range from zero to s seconds with equal probability is uniformly distributed. It has a mean of s/2 and a second moment of

E(T²) = (1/s) ∫₀ˢ t² dt = s²/3

Therefore,

k = (s²/3) / [2(s/2)²] = 2/3    (4-17)

This is between the cases of constant and random service times, as would be expected.
Discrete Service Times

Often, the service time will be one of a set of constant times, each with a duration of Ti and a probability of pi. In this case,

k = Σi pi Ti² / 2(Σi pi Ti)²    (4-18)
The above equations are the very important queuing equations derived by Khintchine and Pollaczek (actually, equation 4-9 for the queue time, Tq, is formally known as the Khintchine-Pollaczek equation, or the Pollaczek-Khintchine equation, depending upon to whom you wish to give primary credit). These are so important to performance analysis that we summarize them here:

W = kL² / (1 - L)    (4-4)

Q = [L / (1 - L)] [1 - (1 - k)L]    (4-6)

Td = [1 / (1 - L)] [1 - (1 - k)L] T    (4-8)

Tq = [kL / (1 - L)] T    (4-9)

and

k = (1/2) [1 + var(T) / E(T)²]    (4-16)

where

W = average number of items waiting to be serviced,

Q = average length of the queue, including the item currently being serviced, if any,

Td = average delay time for a transaction, including waiting plus service time,

Tq = average time that a transaction waits in line before being serviced,

k = distribution coefficient of the service time,

T = average service time of the server,

L = load on (occupancy of) the server.
Equation 4-8 for the delay time is the one which will be most often used in this book. The queue length, W or Q (equation 4-4 or 4-6), is often necessary when we are concerned with overflowing finite queues. The queue time, Tq (equation 4-9), is useful when the queue contains items from diverse sources, each with a different mean service time. In this case, the queue time is calculated using the weighted mix of all transaction service times. The average delay time for an item is then calculated by adding the average queue time, calculated for the mix of all items in the queue, to the average service time for that item. Thus,

Td = Tq + T    (4-19)
as is supported by equations 4-8 and 4-9. Other useful relations between these parameters, which can be deduced from the above equations and which have already been presented, are

Q = W + L    (4-20)

Tq = (W/L) T    (4-21)

and

Td = (Q/L) T = Tq + T    (4-22)
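The behavior of these equations is easy to check numerically. The sketch below is my own illustration (the function name kp_queue and the sample parameters are not from the text); it evaluates equations 4-4 through 4-9 and confirms the cross-relations among the results.

```python
def kp_queue(L, k=1.0, T=1.0):
    """Khintchine-Pollaczek results for a single-server queue.

    L: server load (occupancy), 0 <= L < 1
    k: distribution coefficient of the service time (equation 4-16)
    T: mean service time
    Returns (W, Q, Td, Tq).
    """
    if not 0 <= L < 1:
        raise ValueError("queue is unstable for L >= 1")
    W = k * L * L / (1 - L)                  # eq. 4-4: mean number waiting
    Q = L * (1 - (1 - k) * L) / (1 - L)      # eq. 4-6: mean queue length
    Td = (1 - (1 - k) * L) * T / (1 - L)     # eq. 4-8: mean delay (wait + service)
    Tq = k * L * T / (1 - L)                 # eq. 4-9: mean wait before service
    return W, Q, Td, Tq

# Exponential service (k = 1) at 90% load: Td = T/(1 - L) = 10 service times.
W, Q, Td, Tq = kp_queue(0.9, k=1.0, T=1.0)
print(round(Td, 6), round(Q, 6))   # 10.0 9.0

assert abs(Td - (Tq + 1.0)) < 1e-9       # Td = Tq + T (eq. 4-19)
assert abs(Q - (W + 0.9)) < 1e-9         # Q = W + L (eq. 4-20)
assert abs(Tq - (W / 0.9) * 1.0) < 1e-9  # Tq = (W/L)T
```

At 90 percent load with exponential service, a transaction's average delay is ten times its service time, exactly as equation 4-11 predicts.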
Note that L is the average load on the server. As we shall see later, in a multiserver system with c servers, the total load on the system is cL. Equations 4-20 through 4-22 still hold after substituting cL for L.

Equation 4-7, derived earlier, provides another important insight into queuing systems:

Q = R Td    (4-7)

This is known as Little's Law (Little [18], Lazowska [16]) and states that the number of items in a system is equal to the product of the throughput, R, and the residence time of an item in the system, Td. These relationships are surprisingly valid across many more queue disciplines than the simple first-in, first-out discipline discussed here. This point is explored further in Appendix 3, where it is shown that the order of servicing is not important so long as an item is not selected for service based on its characteristics. For instance, these relationships would apply to a round-robin (or polling) service algorithm but not to a service procedure that gave priority to short messages over long messages.

CONCEPTS IN PROBABILITY, AND OTHER TOOLS
Performance analysis can at times require some innovative ingenuity. This typically does not require any knowledge of higher mathematics. However, a basic knowledge of probability concepts and other helpful hints can often be useful. The material presented in this section is intended simply to touch on those concepts that have proven useful to the author over the course of several performance analyses. We will launch into some detail, however, concerning randomness as it relates to the Poisson and exponential probability distributions. This is not because we need these so much in our analysis efforts but because an in-depth knowledge of them is imperative in order to clearly understand many of the simplifying assumptions that we will often have to make in order to obtain even approximate solutions to some problems. Excellent coverage of these topics for the practicing analyst may be found in Martin [19] and GolDey [7].
Random Variables

Probability theory concerns itself with the description and manipulation of random variables. These variables describe real-life situations and may be classified into discrete and continuous random variables.
Discrete Random Variables

A discrete variable is one which can take only certain discrete values (even though there may be an infinite set of these values, known as a countably infinite set). The length of a queue is a discrete random variable; it may have no items, one item, two items, and so on without limit. The length of a queue with a fixed maximum length is an example of a discrete variable with a finite number of values.

If we periodically sampled a queue in a real-life process, we would find that at each sample point it would contain some specific number of items, ranging from zero items to the maximum number of items, if any. If we sampled enough times and kept counts relative to the total number of samples (and assumed that the queue is in steady state), we would find that the proportion of time that there were exactly n items in the queue would converge to some number. This would be true for all allowable values of n. For instance, if we made 100,000 samples and found that for 10,000 times there was nothing in the queue, for 20,000 times there was 1 item in the queue, and for 1,000 times there were 10 items in the queue, we would be fairly certain that the queue would normally be idle 10% of the time, have a length of 1 for 20% of the time, and have a length of 10 for 1% of the time.

These values, expressed as proportions rather than as percentages, are, of course, the probabilities of their corresponding events. That is, the probability in this case of a queue length of zero is .1, etc. We will note the probability of a discrete event as pn, where n in some way describes the event. For instance, in the case of a queue, pn is the probability of there being n items in the queue. If we were drawing balls of different colors from an urn, we might choose p1 to be the probability of drawing a red ball and p2 that probability for a green ball. Thus,

pn = probability of the occurrence of event n.
The set of pn that describes a random variable, n, is called the probability density function of n. There are several important properties of discrete variables, but the most obvious are:

1. Each probability must be no greater than 1, since we can never be more certain than certain:

0 ≤ pn ≤ 1    (4-23)

2. The probability density function must sum to 1, since one and only one event on each observation is certain:

Σn pn = 1    (4-24)

where the summation is over all allowed values of n.

3. Assuming that events are independent, the probability of a specific combination of events is the product of their probabilities. Thus, if we were to draw a ball from an urn containing balls of several different colors, put it back, and draw another ball, and if p1 were the probability of drawing a red ball and p2 the probability of drawing a green ball, then the probability of drawing a red ball and a green ball is p1p2. (Note that we have to put the first ball back in order to avoid changing the probabilities; otherwise, the two events would not be independent.)
Probability of occurrence of a set of independent events = Πn pn    (4-25)

where the product, Π, is over the specified events. Thus, "and" implies "product" (the probability of event 1 and event 2 is p1p2).

4. Assuming that events are independent, the probability that an event will be one of several is the sum of those probabilities. In the above example, the probability of drawing either a red ball or a green ball on a single draw is p1 + p2.

Probability of occurrence of one of a set of independent events = Σn pn    (4-26)

where the summation is over the desired events. Thus, "or" implies "sum" (the probability of event 1 or event 2 is p1 + p2).

5. The probability of a sequence of dependent events depends upon the conditional probabilities of those events. If we did not return the ball to the urn, then the probabilities would change for the second draw. The probability of a red and then green draw would be the probability of a red draw times the probability of a green ball given that a red ball has been drawn. Thus, the probability of red, then green = p1p2(1), where pn(m) is the probability of event n occurring (a green ball in this case) given that event m has occurred. In general, letting p(n, m) be the probability of the sequence of events n and m,

p(n, m) = pn pm(n)    (4-27)
6. The average value (or mean) of a random variable that has numeric meaning is the sum of each value of the variable weighted by its probability. Let n̄ be the mean of the variable with values n and probabilities pn. Then

n̄ = Σn n pn    (4-28)

where the sum is taken over all allowed values of n.

7. It is often important to have a feel for the "dispersion" of the random variable. If its mean is n̄, will all observations yield a value close to n̄ (low dispersion), or will they vary widely (high dispersion)? A common measure of dispersion is the average squared deviation of all observations relative to the mean. This is called the variance of the random variable, denoted var(n):

var(n) = Σn (n - n̄)² pn    (4-29)

where the sum is taken over all allowed values of n. The square root of the variance is called its standard deviation and, of course, has the same dimension as the variable (i.e., items, seconds, transactions, etc.).
8. The moments of the variable are also sometimes used. The mth moment of a variable, n, will be represented as E(n^m) and is

E(n^m) = Σn n^m pn    (4-30)

where the summation is over all allowable n. Note that the mean is the first moment (m = 1). There is also an important relation between the variance and the second moment:

var(n) = Σn (n - n̄)² pn = Σn n² pn - 2n̄ Σn n pn + n̄² Σn pn

From equations 4-24 and 4-28,

var(n) = E(n²) - n̄²    (4-31)

That is, the variance of a random variable is the difference between its second moment and the square of its mean. (See equation 4-16 for a use of this relationship.)
9. If x is a random variable that is the sum of other random variables, then the mean of x is the sum of the means of its component variables, and the variance of x is the sum of the variances of its component variables. Thus, if

x = a + b + c + ...

then

x̄ = ā + b̄ + c̄ + ...    (4-32)

and

var(x) = var(a) + var(b) + var(c) + ...    (4-33)

10. If x is a choice of one of a possible set of variables, then its mean is the weighted average of those variables, and its second moment is the weighted average of the second moments of those variables. Thus, if a, b, c, ... are each random variables, and if x may be a with probability pa, b with probability pb, etc., then

x̄ = ā pa + b̄ pb + c̄ pc + ...    (4-34)

E(x²) = E(a²) pa + E(b²) pb + E(c²) pc + ...    (4-35)

Note that weighted second moments are added when x is a choice, whereas variances are added when x is a combination.

11. The set of probabilities pn that describe a random variable may be summed up to, or beyond, some limit. This sum is called the cumulative distribution function of n. If the sum is up to but does not include the limit, then the cumulative distribution function gives the probability that n will be less than the limit. This is denoted by P(n < m), where m is the limit:
P(n < m) = Σ(n<m) pn    (4-36)

where the summation is over all n less than m. Note that as m grows large, P(n < m) tends toward unity. If the sum is beyond the limit (n > m), then P(n > m) is the probability that n will exceed the limit:

P(n > m) = Σ(n>m) pn    (4-37)
where the sum is over all n greater than m. As m grows large, P(n > m) tends to zero.

A simple example will illustrate many of these points. Figure 4-2a gives a probability density function for the size of a message that may be transmitted from a terminal. Its size in bytes is distributed as follows:

Message size (n)    Probability (pn)
       20                 .1
       21                 .2
       22                 .3
       23                 .2
       24                 .2

Note that the probabilities of all messages sum to 1:

Σ(n=20,24) pn = 1

The mean message size is

n̄ = Σ(n=20,24) n pn = 22.2

The variance of the message size is

var(n) = Σ(n=20,24) (n - n̄)² pn = 1.56

Its standard deviation is 1.25. The second moment is

E(n²) = Σ(n=20,24) n² pn = 494.4
Note the relationship between variance and second moment:

var(n) = E(n²) - n̄² = 494.4 - 22.2² = 1.56

This illustrates one potential computational pitfall: the variance calculated in this manner can be a small difference between two relatively large numbers. For that reason, the calculation should be made with sufficient accuracy.

The cumulative distribution functions for this variable are shown in Figure 4-2b. As with the density function, these functions have meaning only at the discrete values of the variable. Thus, the probability that the message length will be greater than 22 is .4 (i.e., p23 + p24 = .4), and the probability that it will be less than 22 is .3 (i.e., p20 + p21 = .3).

Now let us assume that we have a second message type with a mean of 35 bytes and a variance of 3. Denote as m1 the first message, described by the distribution of Figure 4-2, and denote as m2 the new message just defined. Consider the following two cases:
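The arithmetic of this example is easy to mechanize. The following sketch is my own, not the book's; it recomputes the mean, variance, and second moment of the message-size table using equations 4-28 through 4-31.

```python
# Message-size distribution from Figure 4-2a: {size in bytes: probability}.
dist = {20: .1, 21: .2, 22: .3, 23: .2, 24: .2}

mean = sum(n * p for n, p in dist.items())                 # eq. 4-28
var = sum((n - mean) ** 2 * p for n, p in dist.items())    # eq. 4-29
m2 = sum(n * n * p for n, p in dist.items())               # second moment, eq. 4-30

print(round(mean, 2), round(var, 2), round(m2, 1))   # 22.2 1.56 494.4
assert abs(var - (m2 - mean ** 2)) < 1e-9            # eq. 4-31
```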
Case 1. m1 is a request message, and m2 is the response. What is the average communication-line usage (in characters) and its variance for a complete transaction? In this case, the communication-line usage is the sum of the message usages. The mean and variance of this total usage are the sums of the means and variances for the individual messages. Let the total line usage per transaction be m. Then
[Figure 4-2. (a) Probability density function of the message size, n. (b) Cumulative distribution functions of the message size.]
m = m1 + m2

m̄ = m̄1 + m̄2 = 22.2 + 35 = 57.2

and

var(m) = var(m1) + var(m2) = 1.56 + 3 = 4.56
Thus, the average communication traffic per transaction will be 57.2 bytes, with a variance of 4.56 or a standard deviation of 2.14 bytes.

Case 2. Both m1 and m2 are request messages; m will be m1 30 percent of the time and m2 70 percent of the time. What are the mean and variance of m? m is now a choice between messages. Its mean is

m̄ = .3 × 22.2 + .7 × 35 = 31.16
The second moment of m is found by adding the weighted second moments of m1 and m2. The second moment of m2 is the sum of its variance and the square of its mean:

E(m2²) = var(m2) + m̄2² = 3 + 1225 = 1228

Then

E(m²) = .3 × 494.4 + .7 × 1228 = 1007.92

The variance of m, then, is

var(m) = E(m²) - m̄² = 1007.92 - 31.16² = 36.97

Thus, the request messages will average 31.16 bytes in length, with a variance of 36.97 or a standard deviation of 6.08 bytes.
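Both cases can be verified with a few lines of arithmetic. This sketch is my own; m2 is characterized only by its given mean and variance, and the variable names are hypothetical.

```python
m1_mean, m1_var = 22.2, 1.56      # message m1 (from Figure 4-2)
m2_mean, m2_var = 35.0, 3.0       # message m2 (given)

# Case 1: m = m1 + m2 (a sum) -- means add, and variances add.
sum_mean = m1_mean + m2_mean
sum_var = m1_var + m2_var
print(round(sum_mean, 2), round(sum_var, 2))   # 57.2 4.56

# Case 2: m is m1 with probability .3, m2 with probability .7 (a choice)
# -- means are weighted, and SECOND MOMENTS (not variances) are weighted.
m1_m2nd = m1_var + m1_mean ** 2   # second moment from eq. 4-31 rearranged
m2_m2nd = m2_var + m2_mean ** 2   # = 1228.0
mix_mean = .3 * m1_mean + .7 * m2_mean
mix_m2nd = .3 * m1_m2nd + .7 * m2_m2nd
mix_var = mix_m2nd - mix_mean ** 2
print(round(mix_mean, 2), round(mix_var, 2))   # 31.16 36.97
```

Weighting the variances directly in Case 2 would understate the true spread, since it would ignore how far apart the two means are.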
Continuous Random Variables

The previous section described discrete random variables, those that take on only certain discrete (often integer) values, such as the number of items in a queue or the number of bytes in a message. But what about the number of seconds in a service? If a process requires somewhere between 2 and 17 msec to process a transaction, it can vary continuously between these limits. We can say that the probability is .15 that the service time for this process is between 10 and 11 msec, but this probability includes service times of 10.1, 10.2, and 10.25679 msec. The service time variable is not discrete in this case. It can assume an infinite number of values and is therefore called a continuous random variable.

All of the rules we have established for discrete variables have a corollary for continuous variables, often with the summation sign simply replaced with an integral sign. The characteristics and rules with which we will be concerned in performance analyses are as follows:
1. The probability density function is continuous. If x is a random variable, f(x)dx is the probability that x will fall within the infinitesimal range dx. More specifically, the probability that x will be between a and b is

P(a ≤ x ≤ b) = ∫ₐᵇ f(x)dx ≤ 1    (4-38)

The probability density function of x is f(x). Notice that there is no requirement that f(x) itself be no greater than 1; the requirement is that the density function integrate to unity over all allowed values of x:

∫ f(x)dx = 1    (4-39)

[The remaining rules for continuous variables (equations 4-40 through 4-44), along with the setup of the following example, in which two points x1 and x2 are chosen at random on a line of length b, are illegible in this copy.]

If x2 is less than x1, then its value can range from 0 to x1. Therefore, its probability density function for this case is

f(x2) = 1/x1,  x2 < x1

and its average distance from x1 is x1/2. Similarly, if x2 is greater than x1, it is uniformly distributed over (x1, b), and its average distance from x1 is (b - x1)/2. The probability that x2 will be greater than x1 is (b - x1)/b, and the probability that it will be less is x1/b. Thus, the average distance between x1 and x2 for a given value of x1 is

(x1/b)(x1/2) + [(b - x1)/b][(b - x1)/2] = [x1² + (b - x1)²] / 2b
Since x1 can range from 0 to b, its probability density function is

f(x1) = 1/b

The average distance that x2 is from x1 when x1 is varied over its range will be called x̄ and is

x̄ = ∫₀ᵇ [(b² - 2bx1 + 2x1²) / 2b] (1/b) dx1

x̄ = (1/2b²) [b²x1 - bx1² + (2/3)x1³]₀ᵇ

x̄ = b/3

Thus, the average distance between x1 and x2 is (1/3)b. This means that the average seek distance on a disk is 1/3 of all tracks, the average distance between two terminals on a local area network is 1/3 of the bus length, and the average random seek distance for a tape is 1/3 of its length.
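The b/3 result is easy to confirm with a quick Monte Carlo experiment. The sketch below is my own illustration; the helper name mean_distance and the chosen line length are arbitrary.

```python
import random

def mean_distance(b=1.0, trials=200_000, seed=1):
    """Monte Carlo estimate of the average distance between two points
    dropped independently and uniformly on a line of length b."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += abs(rng.uniform(0, b) - rng.uniform(0, b))
    return total / trials

est = mean_distance(b=3.0)
print(round(est, 3))   # should be close to b/3 = 1.0
```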
Permutations and Combinations

It is sometimes useful to be able to calculate the number of ways in which we can select n objects from a group of m objects. Sometimes the order in which we select them is important, and sometimes it is not. If order is important, we are talking about permutations. Let there be m distinct objects in a group, and we desire to choose n of them. How many different ways can we choose these n objects? On the first choice, we will choose one of m objects. Given n choices, the total number of different ways we can choose n objects is m(m - 1)(m - 2) ... (m - n + 1). This can be written as

Pₙᵐ = m! / (m - n)!    (4-45a)

where Pₙᵐ is the number of permutations of m objects taken n at a time.

However, if order is not important, we have counted too many possibilities in the above analysis. We have counted all of the permutations for each set of choices but are only interested in counting that particular combination of choices once. For instance, if one set of choices was ABC, we have counted it as

ABC  ACB  BAC  BCA  CAB  CBA

or six times, whereas we are only interested in counting it once. We are interested in the number of combinations of objects, not in all of their permutations.
The first item chosen could have occurred during any of the n choices. Given that, the second item could have occurred during any one of the remaining (n - 1) choices, and so on. The same combination has been counted n(n - 1)(n - 2) ... (1) times, for a total of n! times. Thus, the total number of combinations is the number of permutations divided by n!:

Cₙᵐ = m! / [n!(m - n)!]    (4-45b)

where Cₙᵐ is the number of combinations of m objects taken n at a time.
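Python's standard library computes these counts directly. A short check (my own) of equations 4-45a and 4-45b, and of the sixfold overcounting in the ABC example above:

```python
import math

# Permutations: ordered selections of n from m (eq. 4-45a).
print(math.perm(5, 3))   # 5!/2! = 60

# Combinations: unordered selections of n from m (eq. 4-45b).
print(math.comb(5, 3))   # 5!/(3!2!) = 10

# Each combination of 3 items (e.g., ABC) appears 3! = 6 times among
# the permutations, as in the ABC/ACB/... listing above.
assert math.perm(5, 3) == math.comb(5, 3) * math.factorial(3)
```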
Series

In many of the cases with which we will work, we will find ourselves with a summation over an infinite (or at least a very large) number of items. Often, these infinite series can be reduced to a very manageable expression. Some of the more useful ones are summarized here.

a. 1 + x + x² + x³ + ...,  0 ≤ x < 1

This can be written in the form

Σ(i=0,∞) xⁱ = 1 / (1 - x)    (4-46)

The similar series, which is truncated on the left, follows directly and is

xⁿ + xⁿ⁺¹ + xⁿ⁺² + ...,  0 ≤ x < 1

This may be written as

Σ(i=n,∞) xⁱ = xⁿ / (1 - x)    (4-47)

Likewise, this series truncated to the right is

1 + x + x² + ... + xⁿ⁻¹,  0 ≤ x < 1

which may be written as

Σ(i=0,n-1) xⁱ = Σ(i=0,∞) xⁱ - Σ(i=n,∞) xⁱ = (1 - xⁿ) Σ(i=0,∞) xⁱ = (1 - xⁿ) / (1 - x)    (4-48)

The doubly truncated series is

xⁿ + xⁿ⁺¹ + ... + xᵐ⁻¹,  0 ≤ x < 1

This may be written as

Σ(i=n,m-1) xⁱ = xⁿ Σ(i=0,m-n-1) xⁱ = (xⁿ - xᵐ) / (1 - x)    (4-49)

b. x + 2x² + 3x³ + ...,  0 ≤ x < 1

This can be expressed as

Σ(i=1,∞) i xⁱ = x / (1 - x)²    (4-50)

c. 1 + x + x²/2! + x³/3! + ...

This can be expressed as

Σ(i=0,∞) xⁱ / i! = eˣ    (4-51)

Conversely, replacing x with -x, e⁻ˣ may be expressed as

Σ(i=0,∞) (-x)ⁱ / i! = e⁻ˣ    (4-52)
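Each of these closed forms can be spot-checked by comparing a partial sum against the formula. The sketch below is my own; x = 0.6 and the truncation points are arbitrary.

```python
import math

x, n, m = 0.6, 3, 7
N = 200   # enough terms that x**N is negligible for 0 <= x < 1

geo = sum(x**i for i in range(N))
assert abs(geo - 1 / (1 - x)) < 1e-12                     # eq. 4-46

left = sum(x**i for i in range(n, N))
assert abs(left - x**n / (1 - x)) < 1e-12                 # eq. 4-47

right = sum(x**i for i in range(n))
assert abs(right - (1 - x**n) / (1 - x)) < 1e-12          # eq. 4-48

both = sum(x**i for i in range(n, m))
assert abs(both - (x**n - x**m) / (1 - x)) < 1e-12        # eq. 4-49

ramp = sum(i * x**i for i in range(1, N))
assert abs(ramp - x / (1 - x)**2) < 1e-12                 # eq. 4-50

expo = sum(x**i / math.factorial(i) for i in range(100))
assert abs(expo - math.exp(x)) < 1e-12                    # eq. 4-51
print("series identities verified for x =", x)
```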
We now discuss the Poisson and exponential distributions in some detail, not because we will use them in our calculations so often (though simulation studies certainly do), but because they represent much of the statistics of queuing theory and form an important underpinning to our understanding of the tools we will bring to bear on the analysis of performance problems.

The Poisson distribution provides the probabilities that exactly n events may happen in a time interval, t, provided that the occurrences of these events are independent. That the independence of events is the only assumption made is the reason that this distribution is so important. Event independence simply says that events occur completely randomly. They do not occur in batches. The occurrence of one event is not at all dependent on what has occurred in the past, nor has it any influence on what will occur in the future. The process has no memory; it is memoryless. We will call a process that creates such random events a random process.

In queuing theory, there are two important cases of a random process:

1. The arrival of an item at a queue is a random event and is independent of the arrival of any other item. Therefore, arrivals to a queue are random.

2. The instant at which the servicing of an item by a server completes is a random event. It is independent of the item being serviced and of any of its past service cycles. Therefore, service completions by a server are random.
Note that randomness has to do with events: the event of an arrival to a queue, the event of a service-time completion.

Let us determine the probability that exactly n random events will occur in time t. We will represent this probability by pn(t):

pn(t) = the probability that n random events will occur in time t

(Remember that n is a discrete random variable. Its values are the result of a random process. These two uses of random are unrelated; random variables can also be the result of nonrandom processes.) Note that pn(t) is a probability function that depends on an additional parameter, t. As t becomes larger, the probability that n events will occur changes. This is unlike our simple probability functions described earlier. Such a process is called a stochastic process.

The average rate of the occurrence of events is a known parameter and is the only one we need to know. We will denote it by r:
r = average event occurrence rate (events per second)

Thus, on the average, rt events will occur in time t. Since events are completely random, we know that we can pick a time interval sufficiently small that the probability of two or more events occurring in that time interval can be ignored. We will note this arbitrarily small time interval as Δt and will assume that the only things that can happen in Δt are that no events will occur or that one event will occur.

Let us now observe a process for a time t. At the end of this observation time, we find that n events have occurred. We then observe it for Δt more time. The probability that one further event will occur in Δt is rΔt. The probability that no further events will occur in Δt is (1 - rΔt). Thus, the probability of observing n events in the time (t + Δt) is

pn(t + Δt) = pn(t)(1 - rΔt) + pn-1(t) rΔt    (4-53)

This equation notes that n events may occur in the interval (t + Δt) in one of two ways. Either n events have occurred in the interval t and no events have occurred in the subsequent interval Δt, or n - 1 events have occurred in the interval t and one more event has occurred in the subsequent interval, Δt. (Note that since the arrival of an event is independent of previous arrivals, all of these probabilities are independent and may be combined as shown, according to rules 3 and 4 in the earlier section entitled "Discrete Random Variables.")

If no events occurred in the interval t + Δt, this expression is written

p0(t + Δt) = p0(t)(1 - rΔt)    (4-54)

since pn-1 does not exist. That is, the probability of no events occurring is the probability that no events occurred in the interval t and that no events occurred in the interval Δt.
Equations 4-53 and 4-54 can be rearranged as

[pn(t + Δt) - pn(t)] / Δt = -r pn(t) + r pn-1(t)    (4-55)

and

[p0(t + Δt) - p0(t)] / Δt = -r p0(t)    (4-56)
As we let Δt become smaller and smaller, this becomes the classical definition of the derivative of pn(t) with respect to t, dpn(t)/dt. Denote the time derivative of pn(t) by p'n(t):

p'n(t) = dpn(t)/dt

We can express equations 4-55 and 4-56 as

p'0(t) = -r p0(t)    (4-57)

and

p'n(t) = -r pn(t) + r pn-1(t)    (4-58)

This is a set of differential-difference equations; their solution is shown in Appendix 4 to be

pn(t) = (rt)ⁿ e⁻ʳᵗ / n!    (4-59)

This is the Poisson distribution. It gives the probability that exactly n events will occur in a time interval t, given only that their arrivals are random with an average rate r. Though the serious student is encouraged to review the solution to these equations in Appendix 4, the main lesson to be learned is the simple underlying fact that the Poisson distribution depends only on the randomness of event occurrence.

All this is summarized by saying that the distribution of the number of random events that will occur in a time interval t is given by the Poisson distribution. In queuing theory, the random events of concern are arrivals to queues and completions of service.

Let us look at some properties of the Poisson distribution. First, the sum of the probabilities over all values of n is
Σ(n=0,∞) (rt)ⁿ e⁻ʳᵗ / n! = e⁻ʳᵗ Σ(n=0,∞) (rt)ⁿ / n! = e⁻ʳᵗ eʳᵗ = 1

as would be expected (the infinite series given by equation 4-51 is used). We now derive the mean value of n for the distribution:

n̄ = Σ(n=0,∞) n (rt)ⁿ e⁻ʳᵗ / n! = rt Σ(n=1,∞) (rt)ⁿ⁻¹ e⁻ʳᵗ / (n - 1)!
and

n̄ = rt e⁻ʳᵗ Σ(i=0,∞) (rt)ⁱ / i! = rt

where i has been substituted for n - 1 in the summation. Thus, the mean number of events that will occur in a time interval t is rt, as we would expect:

n̄ = rt    (4-60)
The second moment of n for the Poisson distribution is derived in a similar manner:

E(n²) = Σ(n=0,∞) n² (rt)ⁿ e⁻ʳᵗ / n! = rt Σ(n=1,∞) n (rt)ⁿ⁻¹ e⁻ʳᵗ / (n - 1)!

Letting i = n - 1,

E(n²) = rt Σ(i=0,∞) (i + 1) (rt)ⁱ e⁻ʳᵗ / i!

E(n²) = rt e⁻ʳᵗ [ rt Σ(i=1,∞) (rt)ⁱ⁻¹ / (i - 1)! + Σ(i=0,∞) (rt)ⁱ / i! ] = rt e⁻ʳᵗ (rt eʳᵗ + eʳᵗ)

and

E(n²) = (rt)² + rt    (4-61)

From equation 4-31, the variance of n is

var(n) = E(n²) - n̄²

Since the mean n̄ is rt, then

var(n) = rt    (4-62)
Thus, both the mean and the variance of n are rt for a Poisson distribution.

Note the memoryless feature of the Poisson distribution. The probability that any number of events will happen in the time interval t is a function only of the arrival rate, r, the number of events, n, and the time interval, t. It is completely independent of what happened in the previous time intervals. Even if no event has occurred over the past several time intervals, there is no increased assurance that one will occur during the next time interval.
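Equations 4-59 through 4-62 can be verified numerically by summing terms of the distribution. The sketch below is my own illustration; the value rt = 4 is an arbitrary choice.

```python
import math

rt = 4.0   # average number of events in the interval

def poisson(n, rt):
    """Equation 4-59: probability of exactly n random events in the interval."""
    return rt**n * math.exp(-rt) / math.factorial(n)

N = 60   # far enough into the tail that the remainder is negligible
total = sum(poisson(n, rt) for n in range(N))
mean = sum(n * poisson(n, rt) for n in range(N))
m2 = sum(n * n * poisson(n, rt) for n in range(N))

assert abs(total - 1.0) < 1e-12           # probabilities sum to 1
assert abs(mean - rt) < 1e-9              # eq. 4-60: mean = rt
assert abs(m2 - (rt**2 + rt)) < 1e-9      # eq. 4-61: second moment
assert abs((m2 - mean**2) - rt) < 1e-9    # eq. 4-62: variance = rt
print("Poisson mean and variance both equal rt =", rt)
```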
The Exponential Distribution

The exponential distribution is very much related to the Poisson distribution and can be derived from it, as will soon be shown. It deals with the probability distribution of the time between events. Note that the Poisson distribution deals with a discrete variable: the number of events occurring in a time interval t. The exponential distribution deals with a continuous variable: the time between event occurrences.

To derive the distribution of interevent times, we assume that events are arriving randomly at a rate of r events per second. Let us consider the probability that, given that an event has just occurred, one or more events will occur in the following time interval, t. This is the probability that the time between events is less than t. If T is the time to the next event, we can denote this probability as P(T < t) and can express it as

P(T < t) = Σ(n=1,∞) (rt)ⁿ e⁻ʳᵗ / n!    (4-63)

That is, the probability that the next event will occur in a time interval less than t is the probability that one event will occur in time t, plus the probability that two events will occur in time t, and so on. Manipulating equation 4-63, we have

P(T < t) = e⁻ʳᵗ [ Σ(n=0,∞) (rt)ⁿ / n! - 1 ]

and

P(T < t) = 1 - e⁻ʳᵗ    (4-64)
This is a cumulative distribution for the interevent time t. Its density function, p(t), is the derivative of its cumulative distribution function. That is, from equation 4-44a,

P(T < t) = ∫₀ᵗ p(t)dt

Differentiating both sides with respect to t gives

p(t) = C (d/dt) P(T < t) = C (d/dt)(1 - e⁻ʳᵗ)

where C must be chosen such that the integral of the density function is unity (see equation 4-39). Since

C ∫₀^∞ r e⁻ʳᵗ dt = C r (1/r) = 1

then C = 1. Thus, the probability density function for the interevent time, t, is

p(t) = r e⁻ʳᵗ    (4-65)
We can also express the alternate cumulative distribution giving the probability that T is greater than t. From equation 4-44b, we have

P(T > t) = ∫ₜ^∞ r e⁻ʳᵀ dT = [-e⁻ʳᵀ]ₜ^∞

and

P(T > t) = e⁻ʳᵗ    (4-66)

as would be expected from equation 4-64, since P(T < t) + P(T > t) = 1. (Since t is a continuous variable, P(T = t) is zero and can be ignored.) The mean, variance, and second moment of the exponential distribution can be shown to be

t̄ = 1/r    (4-67)

var(t) = 1/r²    (4-68)

and

E(t²) = 2/r²    (4-69)
Note once again the memoryless feature of the exponential distribution. No matter when we start waiting for an event (even if one has not occurred for a while), the expected time to the next event is still 1/r. Also note that t has been redefined here relative to the way it is used in the Poisson distribution. In the Poisson distribution, t is a fixed interval over which the probability of occurrence of n events is expressed. In the exponential distribution, t is the random variable expressing the time between events.
To summarize the above, we make three statements about a random process with an average event rate of r events per second.

1. A random process is one in which events are generated randomly and independently. The probability that an event will occur in an arbitrarily small time interval Δt is rΔt, independent of the event history of the process.
2. The probability pn(t) that n events will occur in a time interval t is given by the Poisson distribution:

pn(t) = (rt)ⁿ e⁻ʳᵗ / n!

with

n̄ = rt

var(n) = rt

and

E(n²) = (rt)² + rt

3. The probability density function p(t) for the interevent time t is the exponential function

p(t) = r e⁻ʳᵗ

with

t̄ = 1/r

var(t) = 1/r²

and

E(t²) = 2/r²

Thus, random, Poisson, and exponential all imply the same thing: a random process. This is a process in which events occur randomly, the distribution of their occurrences in a given time interval is Poisson-distributed, and the distribution of the times between events is exponentially distributed.

In queuing theory, there are two random processes with which we frequently deal. One is arrivals to a queue. The arrival of an item to a queue is a random event. Arrivals are said to be Poisson-distributed, and the interarrival time is exponentially distributed. The statements random arrivals and Poisson arrivals are equivalent.

The other process is the service time of a server. Assuming the server is busy servicing items in its queue, the completion of a service is a random event. The distribution of service completions is Poisson-distributed (though we don't normally express this), and service times (which are the times between completion events in this case) are exponentially distributed. The statements random service times and exponential service times are equivalent.

Due to the memoryless nature of random processes, if we begin the observation of a random server with average service time ts as it is in the middle of processing an item, the average time required to complete this service is still ts, no matter how long the item had been in service prior to our observation. This property was used as an argument concerning the evaluation of the service-time distribution coefficient for exponential service times in the derivation of equations 4-10 through 4-12.
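The memoryless property can be checked directly from equation 4-66: having already waited s seconds, the probability of waiting at least t seconds more is still e^(-rt). A small sketch of my own (the rate and times are arbitrary):

```python
import math
import random

r = 2.0   # event rate, events per second

def p_greater(t, r):
    """Equation 4-66: probability that the interevent time exceeds t."""
    return math.exp(-r * t)

s, t = 1.3, 0.7
# Having already waited s seconds with no event, the chance of waiting
# at least t more is the same as the unconditional chance of waiting t:
conditional = p_greater(s + t, r) / p_greater(s, r)   # P(T > s+t | T > s)
unconditional = p_greater(t, r)                       # P(T > t)
assert abs(conditional - unconditional) < 1e-12       # memoryless

# The mean interevent time is 1/r (eq. 4-67), checked here by sampling:
rng = random.Random(7)
est_mean = sum(rng.expovariate(r) for _ in range(100_000)) / 100_000
print(round(est_mean, 2))   # close to 1/r = 0.5
```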
CHARACTERIZING QUEUING SYSTEMS

Kendall [13] has defined a classification scheme for queuing systems that lends order to the various characteristics these can have. A queuing system is categorized as

A/B/c/K/m/Z

where

A = the arrival distribution of items into the queue,
B = the service time distribution of the servers,
c = the number of servers,
K = the maximum queue length,
m = the size of the population which may enter the queue, and
Z = the type of queue discipline (order of service of the items in the queue).

Several representations for the arrival and service time distributions (A and B) have been suggested, but for our purpose we will deal with four. A or B may be

M for a random (memoryless) distribution,
D for a constant distribution (such as a fixed service time),
G for a general distribution, and
U for a uniform distribution (this, admittedly, is added to the list by this author).

Thus, M/D/3/10/40/FIFO represents a queuing system in which arrivals are random, service time is constant, and there are 3 servers serving a queue which can be no longer than 10 items, serving a population of 40 on a first-come, first-served basis.

If the maximum queue length is unlimited (K = ∞), if the population is infinite (m = ∞), and if the queue discipline is FIFO, then the last three terms are dropped. Then, for instance, an M/M/1 system is a system in which random arrivals are served by a single server with random service times. This is the simplest of all queuing systems. An M/G/1 system is one in which random arrivals are serviced by a single server with general service times. This is the case solved by Khintchine and Pollaczek.

INFINITE POPULATIONS

One of the parameters in Kendall's classification scheme is the size of the population m using the queue. This is a particularly important parameter for the following reason. If the size of the population is infinite, then the rate of arrival of users to the queue is independent of queue length and therefore of the load on the system. That is to say, no matter how long the queue, there is still an infinite population of users from which the next arrival to the queue will come.
However, if the user population is finite, then those waiting in the queue are no longer candidates for entering the queue. As the queue grows, the available population dwindles, and the arrival rate falls off. As the load on the system grows, the imposed load decreases. Thus, the load on the system is an inverse function of itself (this is sometimes referred to as the graceful degradation of a system).

The analysis of queues formed from infinite populations is quite different from that of queues formed from finite populations. We will first consider infinite populations, about which a great deal can be said.

Some Properties of Infinite Populations

Regarding infinite populations, there are some general properties that can be useful. These include
1. Queue input from several sources. If several random sources each feed a common queue, each with different average arrival rates, r_i, then the total arrival distribution to the queue is a Poisson distribution with an arrival rate r equal to the sum of the component arrival rates, r_i (Martin [20], 393). (See Figure 4-4a.)

2. Output distribution of M/M/c queues. If one or more identical servers with exponential service times service a common queue with Poisson-type arrivals, then the outputs from that queue are Poisson-distributed, with the departure rate equal to the arrival rate, i.e., the departures have the same distribution as the arrivals (Saaty [24], 12-3). (See Figure 4-4b.)

3. Transaction stream is split. If a randomly distributed transaction stream is split into multiple paths, the transactions in each path are random streams with proportionate arrival rates (IBM [11], 49). (See Figure 4-4c.)

4. Tandem queues. From 2, a randomly distributed transaction stream passing through tandem compound queues will emerge as a randomly distributed stream with the same average rate as when it entered the system (IBM [11], 50). (See Figure 4-4d.)

5. Order of service impact on response time. The mean queue time and mean queue length as predicted by the Khintchine-Pollaczek equation is independent of the order in which the queue is serviced, so long as that order is not dependent upon the service time. This would not be true, for instance, if items requiring less service were serviced in advance of other items (Martin [19], 423). (See Figure 4-4e.)
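Properties 1 and 3 can be illustrated by a small simulation. The rates (2 and 3 per second) and the 40 percent split below are arbitrary choices for illustration, not values from the text:

```python
import random

random.seed(2)

def poisson_times(rate, horizon):
    """Event times of a random (Poisson) stream over [0, horizon)."""
    times, t = [], 0.0
    while True:
        t += random.expovariate(rate)
        if t >= horizon:
            return times
        times.append(t)

horizon = 10000.0
a = poisson_times(2.0, horizon)          # two independent random sources
b = poisson_times(3.0, horizon)
merged = sorted(a + b)                   # property 1: combined stream
print(round(len(merged) / horizon, 2))   # rate near 2 + 3 = 5

# Property 3: split the merged stream at random, 40 percent to one path.
path = [t for t in merged if random.random() < 0.4]
print(round(len(path) / horizon, 2))     # rate near 0.4 * 5 = 2

# Interarrival times in the split path should still look exponential,
# i.e., their squared coefficient of variation should be near 1.
gaps = [y - x for x, y in zip(path, path[1:])]
m = sum(gaps) / len(gaps)
v = sum((g - m) ** 2 for g in gaps) / len(gaps)
print(round(v / m**2, 2))
```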
Dispersion of Response Times

We have already discussed the need to be able to make a statement relative to the dispersion of the response time, something in the form "the probability that response time will be less than two seconds is 99.9%." We will discuss three approaches to this problem.
"~~:D" MULn-C!:IANNEL
SERVER
(b)
TANDEM QUEUING SYSTEMS (d)
--..JIi[]--.C!:J-.=--"~ QUEUE ~Pl.JNE
Gamma distribution. Without going into great detail, the Gamma function is the key to this statement. It is a more general form of a probability function of which the exponential distribution is a special case (see Martin [19], 437-439). It has the property that the sum of a set of variables follows a Gamma function if each of the variables follows a Gamma function.

In TP systems, a transaction usually passes through a series of servers, as we have seen. The response time of the system is the sum of the component delay times of each server. These are often servers with exponentially distributed service times (at least approximately). Though the sums of these delay times may not be exponential, they will be Gamma-distributed, and this distribution can be used to determine the probability that the system response time will be greater than a multiple of its mean.
To use this technique, we need to know the mean system response time and the variance of the system response time. The mean system response time is, of course, the primary focus of the performance model. The variance of this response time is more difficult and often impossible to calculate. However, a reasonable limiting assumption to make is that the response time is random, i.e., it is exponentially distributed. In this case, the variance is the square of the mean. Real systems will usually have a smaller variance than this, i.e., the response time will not be completely random.

The Gamma distribution is used for these purposes as follows. First calculate the Gamma distribution parameter, R, where

R = T̄²/var(T)   (4-70)
T is the response time, and T̄ is its mean. Then use the Gamma cumulative distribution function with parameter R to determine the probability that the response time will not exceed a multiple of the mean time. Note that R = 1 for a randomly distributed response time. Real-life response time variations will probably be less random and thus have a greater value of R. Certain values of these probabilities are listed in the following table for values of R from 1 to 10 (the range in which we would normally be interested).

TABLE 4-1. TABLE OF k = T/T̄

                                R = T̄²/var(T)
Probability of response
time T not exceeding kT̄      1      2      3      5      10

        .95                 3.0    2.4    2.1    1.9    1.6
        .99                 4.7    3.4    2.8    2.4    1.9
        .999                6.9    4.6    3.8    3.0    2.3
        .9999               8.9    5.7    4.5    3.5    2.6
To take the conservative case of R = 1, we can say that 95 percent of all services will finish in less than three mean service times, 99 percent in less than five mean service times, 99.9 percent in less than seven mean service times, and 99.99 percent in less than nine mean service times. This is often sufficient to conservatively validate the performance of a system. (These are the values that were used in the example in chapter 3.)
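Table 4-1 can be reproduced numerically. The sketch below assumes integer values of R, for which the Gamma distribution reduces to the Erlang distribution and its CDF has a closed form; a bisection search then finds the factor k. (Entries may differ from the book's table in the last digit because of rounding.)

```python
import math

def erlang_cdf(x, r, mean=1.0):
    """CDF of a Gamma distribution with integer shape r and the given mean
    (i.e., an Erlang distribution)."""
    lam = r / mean                      # rate chosen so the mean comes out right
    s = sum((lam * x) ** n / math.factorial(n) for n in range(r))
    return 1.0 - math.exp(-lam * x) * s

def k_factor(p, r):
    """Multiple k of the mean below which the response time falls with
    probability p, found by bisection on the Erlang CDF."""
    lo, hi = 0.0, 50.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if erlang_cdf(mid, r) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for r in (1, 2, 3, 5, 10):
    print(r, [round(k_factor(p, r), 1) for p in (0.95, 0.99, 0.999, 0.9999)])
```

For R = 1 this reproduces the exponential quantiles exactly (e.g., k = −ln(0.05) ≈ 3.0 at the 95th percentile).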
Central Limit Theorem. According to the Central Limit Theorem, "the distribution of the sum of several random variables approaches the normal distribution for a wide class of variables as the number of random variables in the sum becomes large." The precise test of how closely a system will approach a normal distribution is quite complex. However, the theorem has been shown to hold well in typical queuing analysis problems.

To use this theorem, the first step is to calculate the mean and variance of the resulting response time by adding the delay times for each of the components:
T̄ = T̄1 + T̄2 + ...

and

var(T) = var(T1) + var(T2) + ...
Then, for a given probability, as given in Table 4-2, simply add the standard deviation, i.e., the square root of the variance, weighted by the factor p, to the mean to obtain the maximum value of response time below which actual response times will fall with the given probability. For instance, if mean response time is 4 seconds, and if its standard deviation is found to be 3 seconds, then with 99.9 percent probability, the response time will be less than 4 + 3.09 × 3 = 13.27 seconds.

TABLE 4-2. NORMAL DISTRIBUTION

Probability    Factor p
   .90           1.28
   .95           1.65
   .99           2.33
   .999          3.09
   .9999         3.71
For random distributions in which the standard deviation is equal to the mean, Table 4-2 indicates that the maximum response time for a given probability is (p + 1) times the mean response time, where p is the table factor. Comparing Tables 4-1 and 4-2, the normal distribution technique equates approximately to R = 2 to 3 when random distributions are assumed.
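The Table 4-2 factors are standard normal quantiles, so this technique can be sketched with Python's statistics module; the 4-second mean and 3-second standard deviation are the example from the text:

```python
from statistics import NormalDist

z = NormalDist().inv_cdf(0.999)        # the Table 4-2 factor for p = .999
print(round(z, 2))                     # 3.09

mean, sd = 4.0, 3.0                    # the example from the text
print(round(mean + z * sd, 2))         # 13.27 seconds
```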
Variance of response times. For a queuing system in which inputs are random with arrival rate, R, and in which the distribution of the service time, T, is arbitrary, i.e., the Khintchine-Pollaczek M/G/1 case, the variance of the delay time, Td, is given by*

var(Td) = R·E[T³]/(3(1−L)) + R²·E[T²]²/(4(1−L)²) + E[T²] − T̄²   (4-71)

where

T̄ is the mean of the service time, T (Note: elsewhere, this is noted simply as T.)

*This relation may be found in Martin [19], Martin [20], and IBM [11], each of which differs from the others and contains minor errors.
E[T²] is its second moment,
E[T³] is its third moment,
R is the arrival rate to the queue, and
L is the load on (occupancy of) the server.

This is solved for the three following cases of interest:

1. Exponential service time. In this case,

   E[T²] = 2T̄²    and    E[T³] = 6T̄³

   Substituting these expressions into equation 4-71 yields the delay time variance for a server with exponentially distributed service time:

   var(Td) = T̄²/(1−L)²   (4-72)

   Note that this is the square of the mean delay time, as is to be expected (see equation 4-11).

2. Uniform service time. If the service time may fall with equal probability between two limits (disk seeking is close to this; the limits are here taken to be 0 and 2T̄), then

   E[T²] = (4/3)T̄²    and    E[T³] = 2T̄³

   and

   var(Td) = (3 + L²)T̄²/(9(1−L)²)   (4-73)

3. Constant service time. If the service time is constant (such as a polled communication line with a fixed-length message), then

   E[T²] = T̄²    and    E[T³] = T̄³

   and

   var(Td) = L(4 − L)T̄²/(12(1−L)²)   (4-74)
A reasonability check can be made on these variances by letting the load, L, approach zero. The delay time variance should then approach the variance of the service time itself.
The results of this exercise are

var(Td) → T̄²      for exponential service
var(Td) → T̄²/3    for uniform service
var(Td) → 0        for constant service

All of these are as to be expected, using var(T) = E[T²] − T̄² in each case.

Using the fact that the variance of a sum of random variables is the sum of the variances (equation 4-33), one can calculate the variance of the delay times, i.e., the variance of the response times, of a tandem queue in which a transaction flows through a series of servers. For instance, assume a transaction is processed by a communication line with constant service time, then by an application process with exponential service time, then by a disk with uniform service time, and finally by a communication line with constant service time. This situation is reflected in the following table with some sample values for service times and server loads. Service time variances are calculated according to the previous expressions.
All of these are as to be expected, using var (T) = T2 - T2 in each case. Using the fact that the variance of a sum of random variables is the sum of the variances (equation 4-33), one can calculate the variance of the delay times, i.e., the variance of the respoase times, of a tandem. queue in which a ttaDsaction flows through a series of servers. For iDstaDce, assume a traDsactiOD is processed by a communication line with constant service time, then by an application process with exponential service time, then by a disk with mDform service time, and finally by a comnumication line with constant service time. This situation is Jeflecteci in the following table with some sample values for service times and server loads. Service time variances are calculated acconting to the previous expressions. TABLEW- EXAMPLE TANDEM QUEUE Step
CoDl",'Oicatioos
P10cess Disk
Service lime (t) diSUibadoa
Mean
Server
ofT
load
CcmsIaDt
.2 .3
.2 .4
.004 .2SO
.4 .1
.6 .2
.373 .001
ExpoNmial
UIIifoaD
Com"''''Iicatioas
CcmsIaDt
r.o
VariaDce ofT" •
:a
We see that the tandem queue provides us with a mean response time of one second and a variance of .628, i.e., a standard deviation of .792 seconds. If we wished to use the Gamma distribution to determine the probability of a long response, we would calculate R as

R = T̄²/var(T) = 1²/.628 = 1.6

Interpolating Table 4-1 for a 99.9 percentile, we find a value for k of 5.5. Multiplying the mean response time by that number allows us to state that 99.9 percent of all transactions will be completed in less than 5.5 seconds.

As an alternative, we could use the Central Limit Theorem. At the 99.9% percentile, we see that we should move 3.09 standard deviations out from the mean. Thus, we can make the statement that 99.9 percent of all transactions will complete in less than (1 + 3.09 × .792) = 3.4 seconds.

The Gamma distribution gave us a more conservative result (5.5 seconds) than the Central Limit Theorem. In general, the more conservative result should be used. This will be given by the Gamma function for large normalized standard deviations, i.e., the ratio of the standard deviation to its mean, and by the Central Limit Theorem for small normalized standard deviations (typically, less than .6).
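The tandem-queue example can be reproduced directly from the delay-time variance expressions for the three distributions (equations 4-72 through 4-74); the function names below are ours, not the author's:

```python
# Delay-time variances for the three service time distributions,
# applied to the tandem-queue example of Table 4-3.
# T is the mean service time, L the server load.

def var_exponential(T, L):
    return T**2 / (1 - L)**2                       # eq. 4-72

def var_uniform(T, L):
    return (3 + L**2) * T**2 / (9 * (1 - L)**2)    # eq. 4-73

def var_constant(T, L):
    return L * (4 - L) * T**2 / (12 * (1 - L)**2)  # eq. 4-74

steps = [                          # (variance function, mean T, load L)
    (var_constant,    0.2, 0.2),   # communication line
    (var_exponential, 0.3, 0.4),   # application process
    (var_uniform,     0.4, 0.6),   # disk
    (var_constant,    0.1, 0.2),   # communication line
]

mean_response = sum(T for _, T, _ in steps)        # 1.0 second
total_var = sum(f(T, L) for f, T, L in steps)      # about .628
R = mean_response**2 / total_var                   # Gamma parameter, about 1.6
print(round(total_var, 3), round(R, 1))
```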
Properties of M/M/1 Queues

Queue lengths. Queues formed by random arrivals at a server with random service times (an M/M/1 system) are the easiest to analyze. For an M/M/1 system, the probability that a queue will be a particular length can be derived through what is called a birth-death process. We used the birth part of this to derive the Poisson distribution.

We consider an M/M/1 queuing system, i.e., a single server with random arrivals and random service times, in which the average arrival rate is r and the average service rate is s. The probability that the queue will have length n is p_n, where the queue includes all items waiting in line plus the item being serviced. If we consider a very short time interval, Δt, then the probability that an item will arrive at the queue is rΔt; this is a birth. Likewise, the probability that an item will leave the queue (assuming there is a queue) is sΔt; this is a death.

We observe the queue at some point in time and note with probability p_(n−1), p_n, or p_(n+1) that there are n−1, n, or n+1 items, respectively, in the queue. If we come back at a time that is Δt later, we will find n items in the queue under the following conditions:
1. If there had been n items on the first observation and if there had been no arrivals or departures in the subsequent interval, Δt. Since the probability of no arrival is (1 − rΔt), and since the probability of no departure is (1 − sΔt), this will occur with a probability p_n(1 − rΔt)(1 − sΔt).

2. If there had been n items on the first observation and if there had been one arrival and one departure in the time interval Δt. This will occur with probability p_n(rΔt)(sΔt).

3. If there had been n−1 items on the first observation and if there had been one arrival during the interval Δt, with no departures. This will occur with probability p_(n−1)rΔt(1 − sΔt).

4. If there had been n+1 items on the first observation and if there had been one departure during the interval Δt, with no arrivals. This occurs with probability p_(n+1)sΔt(1 − rΔt).
Ignoring terms in Δt², since these will disappear as Δt goes to zero, we have

p_n = p_n(1 − sΔt − rΔt) + p_(n−1)rΔt + p_(n+1)sΔt   (4-75)

Accumulating p_n terms, this becomes

(s + r)p_n = r·p_(n−1) + s·p_(n+1)   (4-76)

The load on the system, L, is

L = r/s

Thus, equation 4-76 can be rewritten as

p_(n+1) = (1 + L)p_n − L·p_(n−1)   (4-77)
For n = 0, there is no p_(n−1), and there can be no departure if the initial value of n is zero. Thus, equation 4-75 can be manipulated for the case of n = 0 to give

p_1 = L·p_0   (4-78)

Using equations 4-77 and 4-78 iteratively, we find

p_2 = L²·p_0
p_3 = L³·p_0

and, in general,

p_n = L^n·p_0

Since L is the load on the server, it represents the probability that the server is occupied. Thus, the probability that the server is unoccupied is 1 − L. This is the probability that there are no items in the system (no queue):

p_0 = 1 − L   (4-79)

The probability of the queue length being n is

p_n = L^n(1 − L)   (4-80)
We can perform some checks on this result as follows. First, the sum of these probabilities should be unity:

Σ(n=0 to ∞) p_n = Σ(n=0 to ∞) L^n(1 − L) = (1 − L) Σ(n=0 to ∞) L^n = (1 − L)/(1 − L) = 1

Next, we can calculate the average queue length, Q:

Q = Σ(n=1 to ∞) n·L^n(1 − L) = (1 − L) Σ(n=0 to ∞) n·L^n

Using equation 4-50, this becomes

Q = (1 − L)·L/(1 − L)² = L/(1 − L)   (4-81)
This is just what Khintchine-Pollaczek predicted (see equation 4-10). The other results can be similarly verified. Finally, the variance of the queue length is given by

var(n) = Q + Q² = L/(1 − L)²   (4-82)

(The derivation of this is complex; see Saaty [24], 40.)

The probability that a queue will exceed n items, P(Q > n), is

P(Q > n) = Σ(m=n+1 to ∞) L^m(1 − L) = (1 − L)·L^(n+1)/(1 − L)
from equation 4-47. Thus,

P(Q > n) = L^(n+1)   (4-83)

Summarizing what we have just deduced about the properties of M/M/1 queues, we have

Probability of queue length being n:

p_n = L^n(1 − L)   (4-84)

Average queue length:

Q = L/(1 − L)   (4-85)

Variance of queue length:

var(Q) = L/(1 − L)²   (4-86)

Probability of queue length exceeding n:

P(Q > n) = L^(n+1)   (4-87)

Also, from equations 4-20 through 4-22:

Td = Q/r   (4-88)

and

Td = T̄/(1 − L)   (4-12)
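The recurrence and the closed-form results can be cross-checked numerically; the load L = 0.7 and the truncation point are arbitrary choices:

```python
# Iterate the birth-death recurrence (equations 4-77 and 4-78) and compare
# with the closed-form M/M/1 results (4-84 through 4-87). L is the load r/s;
# the truncation point N is chosen so that L**N is negligible.
L = 0.7
N = 200

p = [1 - L, L * (1 - L)]            # p0 = 1-L (eq. 4-79), p1 = L*p0 (eq. 4-78)
for n in range(1, N):
    p.append((1 + L) * p[n] - L * p[n - 1])    # eq. 4-77

total = sum(p)                      # should be 1
Q = sum(n * pn for n, pn in enumerate(p))      # mean queue length, eq. 4-85
tail = sum(p[6:])                   # P(Q > 5); eq. 4-87 says L**6
print(round(total, 6), round(Q, 3), round(tail, 4))
```

With L = 0.7 this gives Q = 0.7/0.3 ≈ 2.333 and P(Q > 5) = 0.7⁶ ≈ 0.1176, confirming equations 4-85 and 4-87.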
As noted by the equation numbers, the expression ...

The parity bit may be unused (always set to 0 or 1), or it may be set such that the total number of bits in the character is odd (odd parity) or even (even parity). If either an even or odd parity bit is used, then the resulting eight-bit code is error-detecting. This is because the changing of any one bit in the character because of noise will cause the parity check to fail.

A competing code set is IBM's EBCDIC (Extended Binary Coded Decimal Interchange Code). This is also an 8-bit code in which all 256 combinations are used (see Figure 5-6b).

Thus, modern technology has settled on an 8-bit character code. This grouping of 8 bits is called a byte. (Four bits is enough to represent a number and is used in some applications. Four bits is called a nibble.) Note that today's computers typically use word sizes that are multiples of bytes: word sizes of 16 bits or 32 bits.
Communications
Chap. 5

Figure 5-6  Character codes: (a) ASCII code (even, odd, or no parity); (b) EBCDIC code. Note: character codes are in hexadecimal.
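The parity mechanism described above can be sketched in a few lines. These are hypothetical helper functions, not from the text, computing even parity over a 7-bit ASCII code with the parity bit placed in the high-order position:

```python
def with_even_parity(ch):
    """Attach an even-parity bit (bit 7) to a 7-bit ASCII character."""
    code = ord(ch)
    ones = bin(code).count("1")
    return code | ((ones & 1) << 7)    # set bit 7 if the one-count is odd

def parity_ok(byte):
    """A received even-parity byte must contain an even number of one-bits."""
    return bin(byte).count("1") % 2 == 0

print(hex(with_even_parity("A")), hex(with_even_parity("C")))   # 0x41 0xc3
```

Flipping any single bit of a protected byte makes the one-count odd, which is exactly the error-detecting property described above.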
Having defined an eight-bit byte as a basic unit of information in a TP system, we must now be able to send a string of bytes over a communication channel in an intelligible fashion. Simply sending a long string of bits is not satisfactory, since we would never know where the byte boundaries were (see Figure 5-7a). Clearly, additional information must be embedded in the stream of bits so that the receiver can determine where a byte starts.

One technique for doing this is called asynchronous communication. As shown in Figure 5-7b, a steady "1" signal is transmitted between characters (a marking signal). This interval is called the stop interval. When it is desired to send a byte, a start bit comprising a single "0" bit (a spacing signal) is sent, followed by the eight data bits. Then the line returns to marking for the next stop interval. The stop interval is guaranteed to be of a minimum length (typically 1, 1.5, or 2 bits in length). Thus, each byte is framed by a 1-bit start signal at its beginning and by a stop signal at its end (which is at least one bit in length). To recognize byte boundaries, the receiver
Figure 5-7  Byte transmission: (a) a simple bit stream (which 8 bits is a byte?); (b) the asynchronous envelope: a start bit (space), eight data bits, and a stop interval (mark); (c) the synchronous envelope: SYN characters interspersed with data.
simply looks for a stop-start transition (a mark-space transition), discards the first bit as the start bit, and stores the next eight bits. The next bit should be a stop bit, and then the next start bit is awaited.
To achieve byte recognition, the asynchronous communication technique has created a 2-bit envelope around each byte (assuming a 1-bit stop interval). A byte being transmitted over an asynchronous communication channel therefore requires 10 bits to pass 8 bits of information, a 25 percent overhead.

One interesting characteristic of this technique is that the transmitter may transmit a character at any time, since the stop interval between characters can be arbitrarily long. This characteristic is particularly useful for data that is randomly generated (e.g., from a keyboard) and gives rise to the term asynchronous applied to this technique for communications.
Synchronous Communication

Synchronous communication takes advantage of blocks of data that can be transmitted as uninterrupted byte streams to achieve a reduction in enveloping overhead (and also to achieve an improvement in error performance, as described later). Basically, one or more special synchronization characters (SYN) are inserted periodically in the byte stream. The receiver can search for the synchronization sequence and then can count out 8-bit bytes thereafter.

A typical synchronous sequence is shown in Figure 5-7c. The transmission is initiated with 3 SYN characters (the ASCII SYN character is hexadecimal 16). If the receiver is not in synchronization, it will look for a SYN character by continuously evaluating the last 8 bits received. When it finds a SYN character, the receiver then starts accumulating 8 bits at a time as data bytes (the next two should also be SYN characters, a condition that can be used as a sanity check). Periodically, the transmitter will insert additional SYN characters to allow the receiver to ensure itself that it is still in synchronization. When the data has been sent, the transmitter can go idle, or it can send a steady stream of SYN characters to maintain synchronization with the receiver.

A typical interval between SYN characters is 128 data bytes. If 3 SYN characters are sent after every 128 data bytes, then the envelope overhead required for byte recognition is 3/128 = 2.3%. This is an order of magnitude better than asynchronous communication.
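The two framing overheads compare as follows (a trivial calculation, using the costs given above):

```python
# Envelope overhead of the two byte-recognition techniques:
# asynchronous framing adds 2 bits per 8-bit byte; synchronous framing
# adds 3 SYN characters per 128 data bytes (the interval used in the text).

async_overhead = 2 / 8                 # 25 percent
sync_overhead = 3 / 128                # about 2.3 percent
print(round(async_overhead * 100, 1), round(sync_overhead * 100, 1))
```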
Noise on the communication line as well as phase, frequency, and amplitude distortion caused by line characteristics will distort the data signal as it travels over the channel. This is evidenced in the demodulated signal by a phenomenon known as jitter. If successive received bits are viewed overlapped on an oscilloscope, the resulting pattern will appear as shown in Figure 5-8a. Each bit transition will generally not occur exactly at the time that it should. Rather, it will be a little early or a little late, appearing to "jitter" back and forth as successive bits are viewed. Though the relation between jitter, line distortion, and line noise is quite complex, the amount of jitter can be used as a measure of the intensity of the noise and distortion on the line.

Figure 5-8b shows the effect of jitter on an asynchronous signal. Let Tb be the duration of a bit interval. The appropriate strategy for asynchronous reception is as follows:

1. Look for a stop-start transition.
2. Wait for one-half of a bit interval (Tb/2).
3. Sample the received signal. This should be the start bit (otherwise, declare a false character and return to 1).
4. Sample eight more times at the bit interval, Tb, to obtain the eight data bits.
Figure 5-8  Jitter: (a) overlapped received bits, showing jitter about the expected transition; (b) asynchronous tolerance; (c) synchronous tolerance (sample times).
5. Sample one more time after an interval of Tb to ensure that a proper stop interval has been received (otherwise, declare a synchronization error).
6. Repeat 1 through 5 for each successive character.
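The reception strategy above can be sketched as a toy receiver. The signal model is an assumption for illustration only: an idealized, jitter-free stream sampled twice per bit (one sample per Tb/2), with the line idling at mark (1) and data bits sent low-order first:

```python
def send_async(data):
    """Encode bytes with a 1-bit start (0), eight data bits (LSB first),
    and a 1-bit stop interval (1); two samples per bit."""
    sig = [1, 1]                            # idle marking line
    for byte in data:
        sig += [0, 0]                       # start bit
        for n in range(8):
            bit = (byte >> n) & 1
            sig += [bit, bit]
        sig += [1, 1]                       # stop interval
    return sig

def receive_async(samples):
    """Steps 1 through 6: find a mark-space transition, wait Tb/2, then
    sample at Tb intervals; check for a proper stop interval."""
    out = []
    i = 1
    while i < len(samples):
        if samples[i - 1] == 1 and samples[i] == 0:    # stop-start transition
            mid = i + 1                                # middle of the start bit
            if mid >= len(samples) or samples[mid] != 0:
                i += 1                                 # false character
                continue
            bits = [samples[mid + 2 * (n + 1)] for n in range(8)]
            mid += 2 * 9                               # middle of the stop bit
            if samples[mid] != 1:
                raise ValueError("synchronization error")
            out.append(sum(b << n for n, b in enumerate(bits)))
            i = mid
        else:
            i += 1
    return out

print(receive_async(send_async([0x41])))    # [65]
```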
From Figure 5-8b, it is seen that a jitter of magnitude Tb/4 can move the stop-start transition 1/4 of a bit to the right and the transition of any data bit 1/4 of a bit to the left. At this point, the sample of the bit may be in error. Thus, the maximum jitter that can be tolerated by an asynchronous channel is Tb/4.

Figure 5-8c shows the equivalent case with a synchronous stream. For synchronous transmission, the sample time is not determined by a single transition as it is in the
asynchronous case. Rather, the sampling time is based on the long-term averaging of many data transitions (or in some cases is derived from the modulated signal itself) and is therefore quite accurate relative to the expected transition times. Given accurate sampling times, it is evident from Figure 5-8c that it would take jitter of a magnitude equal to Tb/2 to create an error. Therefore, a synchronous channel can tolerate twice the jitter that can be tolerated by an asynchronous channel.

We have now seen that asynchronous techniques incur a byte-identification overhead which is an order of magnitude more than synchronous channels and that they can only tolerate half the noise. The reduced noise tolerance means a higher incidence of retransmissions and a further reduction in efficiency. So why is asynchronous communication even used? The reason is cost. The requirements for more accurate clocking and for a more complex byte boundary recognition algorithm (recognizing a SYN character rather than a simple transition) make synchronous transmission more expensive than asynchronous transmission. Therefore, asynchronous techniques tend to be used for lower-speed applications (up to 2400 bits per second), and synchronous techniques tend to be used for higher-speed applications, where getting as much out of a channel as possible is desirable.
Error Protection

We have discussed the generation of errors because of line noise and distortion that cause jitter in the received signal. We have also seen one example of an error-detecting code to protect against such errors: the parity bit used in the ASCII character set. This is commonly known as a vertical redundancy check (VRC).

Unfortunately, errors on communication lines are not isolated. They tend to occur in short bursts. Therefore, it is quite possible that an even number of errors will occur within a character. In this event, the character parity check, or VRC, will still be satisfied, and the error will go undetected. For this reason, a stronger error-detection scheme is often required. This is typically done by protecting the message (or transmission block) with additional error-detection codes placed at the end of the message. There are two in common use.

1. The longitudinal redundancy check (LRC) adds one byte to the message that is itself a parity byte. Each bit is set so that the sum of all corresponding bits in all bytes of the message is even (or odd, as the case may be).

2. The cyclical redundancy check (CRC) is a much stronger error-detection code. It is typically a 16-bit code added to the end of the message, though longer codes give better protection. Though the theory behind CRC codes is quite extensive and complex (see Hamming [8]), the CRC is essentially that sequence of bits that, if appended to the message, creates a binary number that is exactly divisible by some predetermined number.
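For even parity, the LRC is simply a bytewise exclusive-OR, which keeps the count of one-bits in each bit position even; a sketch (the message is an arbitrary example):

```python
def lrc(block):
    """Longitudinal redundancy check: one byte whose bits make the total
    count of one-bits in each bit position, over the block plus the LRC
    byte itself, even."""
    check = 0
    for byte in block:
        check ^= byte        # XOR accumulates even parity per bit position
    return check

msg = b"HELLO"
c = lrc(msg)
ok = all((sum((b >> k) & 1 for b in msg) + ((c >> k) & 1)) % 2 == 0
         for k in range(8))
print(ok)                    # True
```

A handy consequence: the LRC of a block with its check byte appended is zero, which is how a receiver verifies it.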
CRC codes can be extended to provide forward error-correcting systems. In these systems, there is so much redundancy provided by the error correction code (often up to 50 percent; see Stallings [25]) that not only can an error be detected but also the specific bit in error can be identified. Therefore, that bit can be corrected. In this case, the code is a single-bit error-correcting code and may, in fact, correct many multiple-bit errors. Codes can be defined that will correct up to e errors and detect up to d errors, where d > e (see Gallagher [6]).

However, the price that is paid is efficiency. As error codes get more powerful, they impose a higher overhead on the system. In the current art, error-correcting codes are used only in situations where retransmission is very expensive or impossible. Satellite channels are a good example of the use of these techniques, since retransmission uses expensive channel capacity. Broadcast systems are an example in which retransmission may be impossible, since there may be no return path.
Half-Duplex Channels

So far, we have concerned ourselves with the encoding of data so as to be able to identify it (byte identification) and to protect it against errors. There is one other major performance consideration at the data transmission level, and that is whether the communication channel is simplex, half-duplex, or full-duplex. A simplex channel can transmit information in only one direction. A half-duplex channel can transmit in either direction but only in one direction at a time. A full-duplex channel can transmit in both directions simultaneously. We will ignore simplex channels, since they are not usually of use in TP systems. If they are used, they behave, for performance purposes, as half of a full-duplex channel.

A half-duplex channel creates several performance considerations. Not only is traffic in one direction affected by traffic in the other direction, but there also can be significant delays in turning a channel around. Channel turnaround time comprises two components:

1. Channel settling time. When a channel is relinquished by one transmitter and acquired by another transmitter, there is a period during which the energy imparted to the channel by the first transmitter is decaying, and the energy imparted by the second transmitter is building. Only when the new transmission energy is greater than the old transmission energy by a significant amount can the line be used for reliable communication. This time is typically a few byte intervals.

2. Echo suppressors. Long telephone lines tend to develop echoes because of impedance mismatches along their length. To prevent this from becoming a nuisance to telephone users, echo suppressors have been installed throughout the telephone network. These devices determine the direction of predominant transmitted energy and suppress transmissions in the reverse direction. When the direction of transmission reverses, the echo suppressors reverse direction.
This can take a few tens or even a few hundreds of milliseconds. (Have you ever noticed the first syllable of the conversation from the other end being cut off?) On a half-duplex channel, reliable communication cannot be achieved until the
echo suppressors on the channel are all reversed. Though the telephone companies have undertaken a program to upgrade their equipment in order to eliminate echo suppressors, many remain and probably will remain for the foreseeable future. This is primarily a problem for dialed lines, as dedicated lines are conditioned in many ways which preclude the need for echo suppressors.

The turnaround delays required by channel settling time and by echo suppressors may be established by timers in the terminal or host equipment or may be compensated for in the modem itself. The modem provides two signals for this purpose:

1. A Request To Send (RTS) signal to the modem, requesting permission to send data.
2. A Clear To Send (CTS) signal from the modem, indicating that data may now be sent.
In actual fact, the only logic that links the CTS signal to the RTS signal is a timer in the modem. The time-out is set by the user to the minimum safe time determined for channel turnaround. Typical timer values range from tens of milliseconds to hundreds of milliseconds.

Clearly, channel turnaround delays can have a significant impact on communication channel performance. A 100-msec turnaround penalty for every 200-byte block over a 2400-bit/sec channel requiring (8)(200)/2400 = 667 msec per block is not to be taken lightly. This becomes even worse for a polled channel, wherein a typical poll sequence might be 3 characters and a poll response 1 character (10 msec and 3.3 msec, respectively, at 2400 bits/sec). Adding 100 msec to each of these completely distorts the performance picture.
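To see the size of this penalty, the arithmetic above can be checked with a short sketch. The block size, line rate, and 100-msec turnaround value are the ones used in the text:

```python
# Sketch: effect of half-duplex turnaround delay on effective channel rate.

def block_time_sec(block_bytes, line_bps, bits_per_byte=8):
    """Time to clock one block onto the line."""
    return block_bytes * bits_per_byte / line_bps

transmit = block_time_sec(200, 2400)           # (8)(200)/2400 = 0.667 sec
turnaround = 0.100                             # channel turnaround penalty

effective_rate = 200 / (transmit + turnaround) # bytes/sec actually achieved
penalty = turnaround / (transmit + turnaround) # fraction of channel time lost

print(round(transmit, 3))        # 0.667
print(round(effective_rate, 1))  # 260.9 bytes/sec, vs. 300 with no turnaround
print(round(penalty, 3))         # 0.13
```

Thirteen percent of the channel is lost to turnaround even for these fairly long blocks; for short poll and poll-response sequences the loss dominates completely.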
Full-Duplex Channels
A full-duplex channel is conceptually simple. Both sides of the conversation can transmit simultaneously. Full-duplex channels can be derived in several ways from physical channels. For instance, two separate channels may be provided, one to be used for each direction. This is usually the technique used for higher speed communications. For lower data rates, a single physical channel may be divided into two logical channels using Frequency Division Multiplexing (FDM), as described earlier. The lower half of the frequency spectrum supported by the channel may be used to send data in one direction, and the upper half of the frequency spectrum may be used to send data in the opposite direction. This technique is used by low speed (300 bit per second) modems. One significant advantage of this technique is that a full-duplex connection can be established over a dialed line.

Though full-duplex channels are conceptually simple, the message transfer protocols required to take maximum advantage of the full-duplex capability can be quite a bit more complex than those used with half-duplex channels. These protocols are discussed in detail in the next section.
PROTOCOLS

The previous discussions can be perceived as describing the process of data transmission. The facilities and techniques described allow sequences of bytes to be delivered between
users. Let us now discuss data communications. In order to have a meaningful communication of data between a sender and a receiver, the data must be identifiable and transferred reliably. The procedure for accomplishing this is called a protocol. A protocol is an agreement between the sender and receiver of data as to exactly how that data will be transferred. Protocols provide three primary functions:
1. Message identification. They identify the bounds of messages carrying the transaction and response data.
2. Data protection. They protect data against error, ensuring its reliable delivery.
3. Channel allocation. They provide the mechanism for allocating the channel in an orderly manner to the various competing users of the channel.
Protocols typically have three distinct parts:
1. The establishment procedure, which serves to establish a virtual connection between two users (or many users in the case of a broadcast).
2. The message transfer procedure, which describes the form of message transfer.
3. The termination procedure, which breaks the virtual connection.

The establishment and termination procedures satisfy the channel allocation function. The message-transfer procedure is the message-identification function. Data protection spans and significantly complicates all procedures.

There are many standardized protocols, and their study alone would fill volumes. Our concern is to understand the basics of protocols from a performance viewpoint so that, given a protocol specification, we can evaluate its performance within a given communications environment. Therefore, we will study some classes of protocols and relate them generally to some of the more popular protocols in use today.
" ••8.,. ,.,.,,1icafiWJ .." I'Iotedioft Just as start-stop bias and sy&ebroDizatiOll bytes are USed to identify the boundaries of data bytes in a bit stream, so must there be a u:w:c:hawsm to iden1ify messages in a byte stream. This is typically accomplished with CODttol characters that are chosen to be UDiqae in the bytesttam. Quite simply, a 1IDique start-of-text byte may indicate the start of a message. and an end-of-text byte may iDcficate the end of a message. These are COIDDlODly desiguatwl STX and m;x, and are, for blstanc:e. fOUDd in the ASCII control set.
Error-protection (and in some instances, error-correction) bytes follow the ETX byte. As discussed earlier, these include LRC (longitudinal redundancy check) or CRC (cyclical redundancy check) codes. A typical message protected by a sixteen-bit CRC code would be formatted as follows:

STX (data) ETX CRC CRC
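A minimal sketch of this framing follows. The text specifies only "a sixteen-bit CRC"; the particular CCITT polynomial (0x1021, initial value 0xFFFF) and the choice to cover the data and ETX bytes with the check are our assumptions for illustration:

```python
# Sketch of the framing just described: STX (data) ETX CRC CRC.
# The CRC-16-CCITT parameters here are an assumption, not the book's.

STX, ETX = 0x02, 0x03

def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16 with polynomial x^16 + x^12 + x^5 + 1 (0x1021)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def frame(data: bytes) -> bytes:
    crc = crc16_ccitt(data + bytes([ETX]))
    return bytes([STX]) + data + bytes([ETX, crc >> 8, crc & 0xFF])

def check(msg: bytes) -> bool:
    """Receiver side: verify STX/ETX placement and the CRC bytes."""
    if msg[0] != STX or msg[-3] != ETX:
        return False
    received = (msg[-2] << 8) | msg[-1]
    return crc16_ccitt(msg[1:-2]) == received

good = frame(b"DEPOSIT 100")
bad = bytes([good[0]]) + b"DEPOSIT 900" + good[12:]  # corrupt the data field
print(check(good), check(bad))  # True False
```

Any burst error shorter than the 16-bit check, such as the corrupted byte above, is guaranteed to be detected.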
Let us next look at the major function of any protocol: the reliable transfer of a message from its source to its destination. Just as with communication channels, we may dichotomize protocols into half-duplex and full-duplex protocols. When using a half-duplex protocol, transmission occurs in only one direction at a time. A full-duplex protocol supports simultaneous communication in both directions. Half-duplex protocols may be implemented using either half-duplex or full-duplex communication channels. However, full-duplex protocols require a full-duplex channel.
Half-Duplex Message Transfer. In a typical half-duplex protocol, user A sends a message to user B and then awaits a response from user B, as shown in Figure 5-9a. This response may be a positive acknowledgement that the message was received correctly (ACK) or a negative acknowledgement indicating that the message was received in error (NAK). If user A receives an ACK, the next message is sent. However, if user A receives a NAK, the previous message must be retransmitted and the above process repeated. This protocol causes the transmitter to pause between messages in order to receive an acknowledgement from the receiver. This pause is required because the half-duplex channel supports only one transmission at a time.

Full-Duplex Message Transfer. A full-duplex channel allows the transmitter to send continuously, since acknowledgements can be returned over the reverse channel while message transmission continues. In fact, the other end can be transmitting its own series of messages at the same time. The problem is the coordination of acknowledgements with messages, since the transmitter may be able to send many messages before it gets an acknowledgement to a previous message (especially over channels with long propagation times, such as satellite or packet-switched channels). The solution to this problem is to number messages and then to acknowledge by message number. In fact, to allow both ends to transmit simultaneously, the ACK or NAK can be piggybacked into each message. Figure 5-9b shows this procedure. User A is sending messages A1, A2, etc., while user B is sending messages B1, B2, and so on. Each piggybacks an acknowledgement of the last message received correctly or the first message received incorrectly.

For instance, by the time user B is ready to send its fourth message, B4, it still has only been able to process user A's message A1 and so sends an ACK1 (just as it had with
Figure 5-9 Basic protocols. (a) Half-duplex protocol. (b) Full-duplex protocol.
its previous message, B3).
By the time user B is ready to transmit its next message, B5, it has approved two more messages from user A and so sends an ACK3.
However, user B now finds user A's message A4 in error. Therefore, user B sends a NAK4 with its next message, B6. When user A receives the NAK4, it resets itself to start sending at its message A4, and the process continues.

This is obviously a more complex protocol than that required for half-duplex channels. It also requires storage at the transmitter for several messages to support the potential retransmission of these messages. This can be compared to a storage requirement of only one message for the simpler half-duplex channel. However, in return for this added cost, the full-duplex channel is utilized to a much greater extent.

Note that in this example the transmitter was required to back up to the message in error and to retransmit all data from that point on. We will refer to this as the Go-Back-N protocol. An alternative strategy that is also used is selective retransmission, in which only the message in error is retransmitted. In this case, referring to Figure 5-9b, user A's message sequence would have been ( ... A4, A5, A6, A4, A7, A8 ... ). However, this technique requires storage not only at the transmitter but also at the receiver, since later
messages must be held until the message in error is received properly. Again, higher cost yields higher performance.
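The two retransmission strategies can be contrasted with a small sketch that reproduces the Figure 5-9b scenario: message A4 is received in error, and the NAK arrives only after A5 and A6 have already been sent.

```python
# Sketch contrasting Go-Back-N with selective retransmission.

def go_back_n(sent_before_nak, error_msg, total):
    """Transmitter backs up to the errored message and resends everything."""
    stream = list(range(1, sent_before_nak + 1))   # messages 1..6 sent, 4 bad
    stream += list(range(error_msg, total + 1))    # resend 4, 5, 6, then 7, 8
    return stream

def selective(sent_before_nak, error_msg, total):
    """Only the errored message is repeated; the receiver must buffer 5 and 6."""
    stream = list(range(1, sent_before_nak + 1))
    stream += [error_msg] + list(range(sent_before_nak + 1, total + 1))
    return stream

print(go_back_n(6, 4, 8))  # [1, 2, 3, 4, 5, 6, 4, 5, 6, 7, 8]
print(selective(6, 4, 8))  # [1, 2, 3, 4, 5, 6, 4, 7, 8]
```

The selective sequence matches the ( ... A4, A5, A6, A4, A7, A8 ... ) example in the text; the price of the two saved transmissions is buffering at the receiver as well as the transmitter.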
Channel Allocation

In order to send a message from a source to a destination, the sender must acquire the sole use of the channel and must then be able to address the receiver. This is the establishment procedure.
When the message (or perhaps a block of data representing a partial long message) has been sent, the sender must release the channel. This is the termination procedure.

In order for a channel to be allocated to a sender, one of two procedures may be used:

1. Assignment of the channel by an orderly procedure.
2. Uncontrolled contention for the channel by all users.

Orderly procedures for channel assignment include polling of users by a master station or the passing of a token giving the current token holder the right to use the channel. If a contention protocol is used, then provision must be made to detect collisions and to recover the lost data.

There is one further dichotomy to be recognized when considering establishment/termination procedures, and that is the number of users of the channel. If there are only two users, then the channel is a point-to-point channel. If there are more than two users, then the channel is a multipoint channel.
So far, the protocols we have discussed are byte-oriented. Control characters are taken from a byte set, and error control characters are based on a byte structure. In fact, data is expected to be bytes. There are many applications in which a byte structure is not native to the data. A good example is public networks, which must be transparent to any data structure. Data channels derived from telephone lines meet this criterion, as the particular form of data representation and the protocol are up to the user. The situation becomes more complex for packet-switching networks, since by their very nature they must break the users' data into packets. They use their own data structures and protocols, and these must be transparent to any user data structure.

The protocols used for such applications are bit-oriented, as they must deal with data at the bit rather than the byte level. Since synchronous lines are typically used with these protocols for performance considerations, these protocols provide synchronization procedures. Therefore, we refer to such protocols as bit synchronous protocols. Bit synchronous protocols use full-duplex channels and are full-duplex protocols. Message transfer procedures and establishment/termination procedures are entwined in these protocols.

The two most commonly used bit synchronous protocols are HDLC (a CCITT stan-
dard) and SDLC, IBM's offering. HDLC stands for High-Level Data Link Control, and SDLC stands for Synchronous Data Link Control. The ANSI ADCCP and CCITT LAP-B protocols are other examples. HDLC and SDLC are quite similar, especially concerning our needs for performance consideration. They are described in some detail in Stallings [25], Meijer [21], and Hammond [9]. We will take a high-level look at HDLC below as an example.

Under HDLC, a message is broken up into packets, or frames. Each frame is enveloped with synchronization, control, and error protection information as shown in Figure 5-10a. Frame elements include:

• Leading and trailing flag fields that provide synchronization. Each flag is an eight-bit field containing the bit sequence 01111110.
Figure 5-10 HDLC. (a) HDLC frame format: FLAG (8 bits), ADDRESS (8 bits or more), CONTROL (8 bits or more), DATA (variable), FCS (16 or 32 bits), FLAG (8 bits). (b) Control field: I-frames carry N(S) (send sequence number), the P/F (poll/final) bit, and N(R) (receive sequence number); S-frames carry a supervisory function (RR Receive Ready, RNR Receive Not Ready, REJ Reject, SREJ Selective Reject), the P/F bit, and N(R); U-frames carry an unnumbered function and the P/F bit.
• An address field of eight or more bits to identify the recipient of the frame.
• A control field of eight or more bits that defines the type of frame (information, supervisory, or unnumbered).
• The data field, which may be of any length (in some implementations, it is constrained to be a multiple of eight bits).
• A frame check sequence (FCS) field that contains a 16-bit or 32-bit CRC character for error detection.

The flag fields provide synchronization by including a unique bit sequence in each frame. The uniqueness of this flag sequence must be preserved in that it must not appear in the rest of the frame. Should a sequence of six 1s that could be misinterpreted as a flag field be found in the frame, the sequence is broken up by a technique called bit stuffing. The transmitter simply inserts a 0 after every sequence of five 1s (except, of course, in the flag fields). The receiver, upon receiving five 1s, checks the next bit. If it is a 0, the receiver deletes it. If it is a 1, and the seventh bit is a 0, the receiver interprets the sequence as a flag field (seven 1s signal a special abort condition).

The control field provides for three frame types:
1. Information frames (I-frames), which carry the data.
2. Supervisory frames (S-frames), which provide flow control and error control.
3. Unnumbered frames (U-frames), which are used for a variety of channel control functions.

The information and supervisory frames are used for the establishment and error recovery functions in which we are interested. Many of these functions are implemented in the frame's control field, as shown in Figure 5-10b. The first one or two bits define the type of frame to follow. If this is an information frame, a pair of three-bit sequence numbers is provided. One sequence number, N(S), specifies the sequence number of the current frame being sent. The other, N(R), specifies the sequence number of the next frame anticipated. In other words, N(R) tells the other end that all previous frames have been received properly and that the other end may flush these messages from its buffers. Thus, message acknowledgement is piggybacked onto information packets as they flow through the system.

If an acknowledgement is due to be sent to the other end, but no information frame is available, then a supervisory frame may be sent instead. An RR frame (Receive Ready) is sent with the next expected frame number in N(R) if this end is in a position to receive a frame. Otherwise, an RNR (Receive Not Ready) supervisory frame is sent. This also indicates the next frame to be expected but forces the other end to delay its transmission until a subsequent RR frame is sent. Thus, RNR/RR couples provide flow control over the link.

If a frame is received in error, then a supervisory REJ (Reject) frame is sent, indicating in N(R) the frame from which retransmission is to begin. This is the Go-Back-N
protocol. If selective retransmission is desired, the supervisory SREJ (Selective Reject) frame is sent instead. In this case, only the frame indicated by N(R) is retransmitted.

Note that the sequence number fields are three bits in length and provide sequence numbers 0-7. This means that the window size, W, on the channel is seven messages. That is, the receiver can get up to seven messages behind the sender before the sender must wait for an acknowledgement. A window size of 7 is used instead of 8 to prevent confusion over message numbers. To understand the reason for this, let us consider the following example. Assume that the sender has sent messages 0 through 7 and is waiting for an acknowledgement while it is holding these eight messages. It then receives an acknowledgement with N(R) = 0, indicating the next message the receiver is expecting. Is it the message 0 currently being held by the transmitter, or is it the message 0 which the transmitter is due to send next? This confusion is avoided by always limiting the window size to one less than the sequence number range (the window size may, of course, be further limited by other factors, such as available buffering).

Though the HDLC protocol described here limits the window size to 7, an extension to a window size of 127 is available through HDLC. This can be important for high-speed, long-delay channels such as satellite channels.

Let us now look at the establishment functions built into HDLC. These are controlled via two fields in the frame, the P/F bit in the control field and the address field. The P/F bit is the poll-final bit. The host uses this bit to poll a terminal; the terminal uses this bit to indicate to the host that it has nothing more to send. The specific terminal is addressed by the host via the address field. In order to select a terminal for transmission, the host simply addresses a message to it via the address field.
In order to poll a terminal, the host sends a frame containing that terminal's address with the P/F bit set. If an information frame is due to be sent to the terminal, the poll bit in that frame is set. Otherwise, a supervisory RR frame is sent. If the terminal has no data to send, it will return an RR frame with the P/F bit set. If it has data, it will return information frames to the host. All but the last I-frame will have a zero P/F bit. A one P/F bit in the last frame indicates to the host that the terminal has finished its transmission.
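The bit-stuffing rule described earlier lends itself to a compact sketch. These string-of-bits functions are illustrative only; a real receiver would also have to recognize an unstuffed sixth 1 as a flag or abort indication, which is omitted here:

```python
# HDLC bit stuffing: the transmitter inserts a 0 after any run of five 1s;
# the receiver deletes a 0 that follows five 1s. Six 1s would otherwise
# look like the 01111110 flag.

def stuff(bits: str) -> str:
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:          # five consecutive 1s: insert a 0
            out.append("0")
            run = 0
    return "".join(out)

def destuff(bits: str) -> str:
    out, run, skip = [], 0, False
    for b in bits:
        if skip:              # this bit is the stuffed 0; drop it
            skip = False
            run = 0
            continue
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            skip = True       # flag/abort detection omitted in this sketch
    return "".join(out)

data = "0111111011111100"     # would be mistaken for flags if sent raw
coded = stuff(data)
print(coded)                  # 011111010111110100
print(destuff(coded) == data) # True
```

No run of six 1s can survive stuffing, so the flag sequence remains unique on the line.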
BITS, BYTES, AND BAUD

A passing comment on some communication terminology is appropriate at this point. Let us take a 2400-bit-per-second synchronous communication line. Some refer to this as a 2400-bit-per-second line, others as a 2400-baud line, and still others as a 300-byte-per-second line. Are these all equivalent? I think so, but I doubt it. And if that sounds confusing, it's because we have let ourselves get sloppy with nomenclature.

Let us first take the term baud. Baud is technically a measure of the number of state transitions per second that a
communication line is capable of achieving while still having the receiver accurately detect the state sequences. In many cases, the line shifts between only two states: one and zero. In this case, baud is the measure of bits per second that the line can handle.
However, many transmission schemes use more than 2 values per transition. A common modulation technique is 4-phase modulation. With this technique, each transition is to one of 4 states and thus represents 2 bits. In this case, a 2400-baud line supports 4800 bits per second. In general, if each state can be one of M values, then baud and bits per second are related as follows:

Bits per second = baud x log2 M

The term bits per second is not all that clear either. From a purely information theory viewpoint, a bit is a unit of information. There is no information carried in asynchronous start/stop bits. Since a 1200-baud asynchronous line (in the loose sense) transmits 120 ten-bit characters per second, with each character containing only 8 information bits, is it a 1200-bit-per-second line or only a 960-bit-per-second line?

Well then, we say, let us use bytes as our measure. But again, there is no information carried in synchronous SYN bytes nor in control bytes such as STX, ENQ, or error control characters. Do we eliminate these from our measure of line capacity?

Enough said. Modern usage has become somewhat sloppy, and all forms are accepted. Let's make sure that our use is clear, either by context or by explicit definition.
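As a quick illustration of the relations above, using the line rates discussed in the text:

```python
# M-ary signaling carries log2(M) bits per transition; start/stop framing
# dilutes the raw rate of an asynchronous line.

from math import log2

def bits_per_second(baud, m_states):
    return baud * log2(m_states)

# 4-phase modulation: a 2400-baud line carries 4800 bits/sec.
print(bits_per_second(2400, 4))   # 4800.0

# A 1200-baud (loosely speaking) asynchronous line sending 10-bit
# characters, only 8 of which are information bits:
chars_per_sec = 1200 / 10
print(chars_per_sec * 8)          # 960.0 information bits/sec
```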
LAYERED PROTOCOLS

In the previous sections, as we discussed communications from the physical channel level up to some fairly complex protocols, we described how each lower layer adds its own information to the data in order to communicate. An application might start out with a message to send. This gets passed to a protocol handler, which frames the message with start-of-text and end-of-text identifiers and which adds some error control and channel control information. Finally, this expanded message is handed to a transmitter, which may add start/stop bits or SYN characters to get the data across the line. At the other end, the receiver strips out the synchronization bits and hands the message in its protocol envelope to the protocol handler. The protocol handler extracts the message and hands it to the application program. This procedure is diagrammed in Figure 5-11. In effect, the application program at the source end is acting as if it is communicating directly with the application
(6-25)
and the average dispatch rate for these processes, $n_h$, is

$$n_h = \sum_{q>p} \sum_i n_{iq} \tag{6-26}$$
Consequently, the average service time for those processes executing at a priority higher than $p$, $t_{sh}$, is

$$t_{sh} = L_h / n_h \tag{6-27}$$

The above has calculated average service time for processes at priority $p$ and for processes with a priority greater than $p$. The average service time, $t_s$, for processes at priority $p$ and higher is
$$t_s = \frac{L_p t_{sp} + L_h t_{sh}}{L_p + L_h} \tag{6-28}$$
This is true because, of all process executions at priorities $p$ or higher, $L_p/(L_p + L_h)$ will be processes executing at priority $p$ with an average service time of $t_{sp}$, and $L_h/(L_p + L_h)$ of these processes will be executing at a higher priority with an average service time of $t_{sh}$.

For all processes running at all priorities, the dispatch rate, $n_t$, load, $L_t$, and service time, $t_t$, are:

$$n_t = \sum_q \sum_i n_{iq} \tag{6-29}$$

$$L_t = \sum_q \sum_i n_{iq} t_{siq} \tag{6-30}$$

$$t_t = L_t / n_t \tag{6-31}$$
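Equations (6-29) through (6-31) translate directly into code. The dispatch rates and service times below are illustrative only:

```python
# Total dispatch rate, load, and average service time from per-process
# dispatch rates n_iq (dispatches/sec) and service times t_siq (sec).

# {priority q: [(n_iq, t_siq), ...]}
processes = {
    0: [(20.0, 0.005), (10.0, 0.010)],   # higher priority
    1: [(30.0, 0.008)],                  # lower priority
}

n_t = sum(n for plist in processes.values() for n, _ in plist)        # (6-29)
L_t = sum(n * ts for plist in processes.values() for n, ts in plist)  # (6-30)
t_t = L_t / n_t                                                       # (6-31)

print(n_t)             # 60.0 dispatches/sec
print(round(L_t, 3))   # 0.44 (processor load)
print(round(t_t, 4))   # 0.0073 sec average service time
```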
For a system in which only one processor can execute the above processes (single computer, multicomputer, or certain multiprocessor systems), the dispatch time in general is the queue time, $T_q$, taken from the M/M/1/∞/∞/PP model for a preemptive operating system and from the M/M/1/∞/∞/NP model for a nonpreemptive system. From chapter 4 and Appendix 2:

1. Preemptive single server
$$t_d = \frac{(L_p + L_h) t_s}{(1 - L_p - L_h)(1 - L_h)} \tag{6-32}$$
2. Nonpreemptive single server

$$t_d = \frac{L_t t_s}{(1 - L_p - L_h)(1 - L_h)} \tag{6-33}$$
Processing Environment
218
Chap. 6
For a multiprocessor load sharing system with c processors:
3. Preemptive multiserver

$$t_d = \frac{(L_p + L_h)^c \, p_0 \, t_s}{c (c!) [1 - (L_p + L_h)/c]^2 (1 - L_h/c)} \tag{6-34}$$
4. Nonpreemptive multiserver

$$t_d = \frac{L_t^c \, p_0 \, t_s}{c (c!) (1 - L_t/c) [1 - (L_p + L_h)/c] (1 - L_h/c)} \tag{6-35}$$
In the above equations,

$$p_0^{-1} = \sum_{n=0}^{c-1} \frac{(L')^n}{n!} + \frac{(L')^c}{[1 - L'/c] \, c!} \tag{6-36}$$
and $L' = L_p + L_h$ for the preemptive case and $L_t$ for the nonpreemptive case. Note that $L_p$, $L_h$, and $L_t$ in these equations comprise the total system load rather than the average server load, as used in Appendix 2 and chapter 4. Of course, in all cases of a preemptive system, the actual process service time which is added to the dispatch time to obtain full delay time must be divided by $(1 - L_h)$ to account for preemptive processing by the higher priority processes (see Appendix 2 and chapter 4). If the system is a single priority system, $L_h$ in the above equations becomes zero, with the corresponding simplifications.
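The multiserver equations (6-34) and (6-36) can be sketched as follows; the loads and service time in the example are illustrative only. As a check, with c = 1 equation (6-34) reduces to the single-server result (6-32):

```python
# Dispatch time for a preemptive multiserver system, equations (6-34)
# and (6-36). Loads are total system loads, as the text notes.

from math import factorial

def p0(L, c):
    """Equation (6-36): p0^-1 = sum_{n<c} L^n/n! + L^c / [(1 - L/c) c!]."""
    inv = sum(L**n / factorial(n) for n in range(c))
    inv += L**c / ((1 - L / c) * factorial(c))
    return 1.0 / inv

def td_preemptive(Lp, Lh, ts, c):
    """Equation (6-34): preemptive multiserver dispatch time."""
    L = Lp + Lh
    return (L**c * p0(L, c) * ts) / (
        c * factorial(c) * (1 - L / c) ** 2 * (1 - Lh / c)
    )

# Two processors, priority-p load 0.8, higher-priority load 0.4,
# 10-msec average service time:
print(round(td_preemptive(0.8, 0.4, 0.010, c=2), 5))  # 0.00703
```

For c = 1, p0 reduces to (1 - L), and the expression collapses to (L_p + L_h) t_s / [(1 - L_p - L_h)(1 - L_h)], which is equation (6-32).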
Contemporary TP applications are organized as autonomous processes, each with its own scope of responsibility and all passing data to each other via messages. In some cases, these interprocess messages can represent a significant portion of the load on a TP system.

There are several ways in which the messaging facility may be implemented. All are suitable for distributed systems, but one, the mailbox, is suitable only for single computer or multiprocessor systems. These techniques are described briefly below. However, the only result of practical interest to the performance analyst is the bottom-line time required to pass a message from one process to another.
Global message network. With this implementation, any process can send a message to any other process in the system without any specific effort on the part of one process to establish a path to the other process. This technique is generally applied to multicomputer systems. All the sending process needs to know is the name of the receiving process. The operating system knows the name of all processes in the system and their
whereabouts. It assumes the responsibility for the message, actually moving it into a system-allocated buffer. It then routes it over the bus (or network, if necessary) to the computer in which the receiving process is running and queues it to the message queue for
that process. Even if the receiving process is running in the same computer as the sending process, this full procedure is often followed, except that the bus transfer is null, i.e., shortcuts are not taken.
This type of messaging facility is used by Tandem.

Directed message paths. In other implementations, there are no general messaging facilities provided by the operating system. Rather, it is the responsibility of one process to establish a message path to another process via operating system facilities. Once established, the operating system knows of this path, and message transfer is similar to that used for global messaging. This philosophy is found in the UNIX pipe structure and is used by Syntrex (Eatontown, New Jersey) in its distributed word-processing product.

File system. The TP file system can also be used to pass messages between processes. A message file can be opened by two processes and can be used by one process to write messages to the other. The receiving process is alerted to the receipt of a message via an event flag and can read that message from its file.

On the surface, this can sound very time-consuming: writing to and reading from disk. However, disk transfers are cached in memory (in a cache similar to the memory cache described in the previous section). If messages are read shortly after they are written, they will still be in memory; the message time is equivalent to the above techniques. If they are not read for awhile, they are flushed to disk to free up valuable memory space.

Since the file system allows transparent access to all files across the system, this messaging concept supports distributed systems. This technique is used by Stratus in its multicomputer system.
Mailboxes. Mailboxes are like message files except that they reside in common memory. They are adaptable only to single-computer or multiprocessor systems, since all processes must have direct access to the mailbox memory. Since there need be no physical movement of the message as with the other techniques, message transfer with mailboxes can be much faster.

Message transfer in multicomputer systems tends to be quite time-consuming because of multiple physical transfers of the message from application space to system space to the bus to a different system space and back to application space. Typical transfer times are measured as a few milliseconds to tens of milliseconds. Direct memory transfer of messages in multiprocessor systems can be significantly faster, especially when mailboxes are used. Typical transfer times are measured in tenths of milliseconds.
In any event, the time required to pass messages between processes can usually be bundled in with the process service time for the sending and receiving processes.

Memory Management
Most TP systems today provide a virtual memory facility in which there is little relation between logical memory and physical memory. In principle, many very large programs can execute in a physical memory space much smaller than their total size. This is accomplished by page swapping, as discussed in chapter 2. When a process requires a code or data page that is not physically in memory, the operating system declares a page fault, suspends that program, and schedules the required page to be read into physical memory, overwriting a current page according to some aging algorithm. When the page has been swapped in, the suspended program is allowed to continue.

Page fault loads are very difficult to predict and analyze; but for the performance analyst, there is an easy out. Page faulting is so disastrous to system performance that we typically assume it does not exist. If it becomes significant, the cure is to add more memory (if possible). Though this sounds like a cop-out, it is not without merit. If a system does not have enough memory, it will begin to thrash because of page faulting. This sort of thrashing will rapidly bring a TP system to its knees. Contemporary wisdom and experience indicate that page faulting should not exceed one to two faults per second.

Overlay management is another technique for memory management and is controlled by the application program. It is less flexible than page management but avoids the thrashing problem (assuming that overlaid programs are not also running in a paged virtual memory environment). An application process is considered to have a root segment that is always in memory and one or more overlay areas. It is free to load parts of its program into its overlay area when it deems fit. When the application process makes such a request, it is suspended until the overlay arrives and is then rescheduled. The impact of overlay calls is simply the added overhead of the disk activity and the additional process dispatching, both of which can be accounted for using the normal techniques presented herein.
I/O Transfers

Once an I/O block transfer (as distinguished from a programmed I/O transfer) has been initiated, it continues independently of the application process. Processor cycles are used to transfer data directly to or from memory, following which the operating system responds to a transfer completion interrupt. At this time, it will typically schedule the initiating process so that this process can do whatever it needs with the data transfer completion.

Let

D_io = average I/O transfer rate in both directions (bytes/sec.).
B_io = average block transfer rate in both directions (blocks/sec.).

t_dio = processor time to transfer a byte (often just a portion of a processor cycle, as data may be transferred in multibyte words) (sec.).

t_bio = operating system time required to process a data transfer completion (one per block) (sec.).

Then the processor load imposed by I/O at the data transfer and interrupt level, L_io, is

$$L_{io} = D_{io} t_{dio} + B_{io} t_{bio} \tag{6-37}$$

The application of this overhead value to system performance will now be discussed, along with other operating system functions that have a similar effect.
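As a numeric sketch of equation (6-37), with illustrative rates and times:

```python
# Processor load from I/O transfer and interrupt handling, equation (6-37).
# The rates and times below are illustrative only.

D_io = 200_000   # bytes/sec transferred in both directions
B_io = 100       # blocks/sec
t_dio = 0.2e-6   # processor time per byte (sec)
t_bio = 500e-6   # O/S time per transfer completion (sec)

L_io = D_io * t_dio + B_io * t_bio   # (6-37)
print(round(L_io, 4))   # 0.09 of a processor consumed by I/O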
O/S Initiated Actions

Besides the functions just described, there are other operating system functions that impose an overhead on the system. These are primarily tasks that the operating system itself initiates, such as

• timer list management,
• periodic self-diagnostics,
• monitoring of the health of other processors or computers in the system.

Let
L_os = operating system load imposed on the system by O/S initiated functions.

L_io = I/O load on the system (as defined above).

L_o = total operating system overhead, including I/O transfers and self-initiated functions.

Then

$$L_o = L_{os} + L_{io} \tag{6-38}$$

Since L_o of the processor capacity is being consumed by nonapplication-process oriented activity, (1 - L_o) of the processor is available for application use. This has the effect of increasing all application service times by 1/(1 - L_o):

$$\text{Apparent Service Time} = \frac{\text{Actual Service Time}}{1 - L_o} \tag{6-39}$$
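Equations (6-38) and (6-39) in code, with illustrative loads:

```python
# Total O/S overhead load and its stretching effect on application
# service times. Sample values are illustrative only.

L_os = 0.05          # load from O/S self-initiated functions
L_io = 0.09          # I/O load, from equation (6-37)
L_o = L_os + L_io    # (6-38)

actual_service = 0.020                     # 20-msec application service time
apparent = actual_service / (1 - L_o)      # (6-39)
print(round(apparent * 1000, 2))           # 23.26 msec
```

A 14% overhead load stretches a 20-msec service time to over 23 msec, before any queuing effects are considered.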
That is, it appears that the application process is running on a machine that has only (1 - L_o) of its rated speed or capacity. If there are other higher priority processes running which also rob the application process of processing capacity, then L_o is simply another
222
Processing Environment
Chap. 6
__c;omponent of that bigherpriority processing load. (L" would include Lo ~ODS 6-32 through 6-36, for example.) Note that Lo is not meant to include data-base management overhead. Though the data-base manager is not an application process per se, from a performance viewpoint it is treated as such a process. This topic is discussed in the next chapter. Locks In a multiprocessor system, there will be contention for various operating system resources by the multiple processors in the system. For instance, more than one processor may try to schedule a new process, which means that each such processor will attempt to modify the :ready list. Multiple processors may try to IDOdify a block in disk cache memory as described in the next chapter. To pteVeDt such a resource (ready list, timer list, disk cache, etc.) from becoming CODtamjnated, only one processor at a time must be allowed to use it. Therefore, each common IeSOu:rce is protected by a so-called lock. If a processor wants to use one of these xesoun:es, it must first test to see if this IeSOu:rce is being used. If DOt, the processor must lock the IeSOu:rce UD1il it bas fiDished with it so that another processor cannot use that resource during this time. Actually, this action of testing and locking must be an integrated action so that no other processor can get access to the lock for testing between the
and lock actions. If a processor finds a lock set, it must pause and wait for that lock (i.e., enter a queue of processors waiting for that lock) before it can proceed. This queuing time for locks must be added to the process service time if it is deemed to be significant. And indeed, significant it can be. 'Ibere are examples of contemporaIy systems in which teSOUIQe loclcing is the predomiDaDt operatiDg system boUleneck. . In some systems, if the lock delay is too long, the process will be scheduled fri:later time. Though this fiees up the processor for otb.erworlt, it bas a serious impact on the delayed process because the process must DOW await another dispatch time. Lock delay can also seriously affect processor load because of the extra proc:ess-contex switching time that is iD.curIed. teSt
a
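The integrated test-and-lock action described above is what a non-blocking lock acquisition provides. A minimal sketch in Python (the queue-of-waiting-processors detail is deliberately simplified; the function names are mine):

```python
import threading

ready_list_lock = threading.Lock()   # protects a shared O/S resource

def use_ready_list(work):
    """Attempt to use the protected resource without blocking."""
    # acquire() performs the test and the lock as one atomic action;
    # a separate "if not locked: lock()" sequence would let another
    # processor slip in between the test and the lock.
    if ready_list_lock.acquire(blocking=False):
        try:
            work()               # safely modify the ready list
            return True
        finally:
            ready_list_lock.release()
    return False                 # lock was set: caller must queue and wait

done = []
use_ready_list(lambda: done.append(1))   # lock is free, so this succeeds
```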
There are several possibilities for thrashing in systems of this sort. One common cause is page-faulting. Another cause in multiprocessor systems is long queues for locked resources, which can cause additional context switches. These effects can cause the processing requirements for a process to suddenly increase, with a significant increase in response time.

There are other, more subtle increases in processing requirements for TP systems. Memory and bus contention can cause process service times in multiprocessor systems to increase as load increases. Interprocess message queue lengths will increase as load increases, causing dumping to disk in some systems or rejection of messages in other systems. Either case causes an increase in process service time.
All of these factors cause a process's effective service time to increase as load increases. As service time increases, the capacity of the system decreases. In extreme cases (unfortunately not uncommon in multiprocessor systems), the system capacity can decrease below its current load, causing a "U-turn" in system performance. That is, the system can suddenly start thrashing and have a capacity less than the capacity at which it started thrashing. Response times can dramatically increase by an order of magnitude or more at this point.

Figure 6-8 illustrates this phenomenon. This figure is a little different from the response time curves with which we have previously dealt, as it shows response time as a function of throughput (i.e., transactions per second processed by the system) rather than load. Normally, the throughput of the system is the offered transaction rate, R, and is related to the load, L, by L = KR. However, in a thrashing system the system is 100% loaded (it is continually busy) and may not be able to keep up with the arriving transaction stream. For that reason, we observe response time as a function of the throughput of the system rather than its load.

With reference to Figure 6-8, as long as the system can keep up with the arriving
[Figure 6-8: Response time as a function of throughput. Operating point A delivers throughput X1 at response time T1; past the thrashing threshold B, an increase in offered load drives the system into thrash mode at point C, with response time T2.]
224
Processing Environment
Chap. 6
", • .transactions, it behaves properly.. For example, while operating at point :A:, it can provide a throughput of Xl traDsactions per second with an average response time of TI. However, as the incoming traDsaCtion rate approaches the .'tbrasbing threshold," B, various system resoun::es become seriously overloaded. Memory use is stressed to the point of creating excessive page faults; lock contentions cause processes to time out and be rescheduled; queues grow too long and are dumped from memmy to disk. In short, service time per t:ransaction dramatically increases. As a consequence, the capacity of the system is decJeased (it is, in effect, the inverse of the service time), the IeSpODSe time is increased (it is proportional to the service time), and the system is operating at point C. A further increase in the offered load (transaction arrival rate) to the system will only aggravate the situation, causing more tbrasbing, decreased capacity, and increased response time. This leads to the ·'U-turn" effect of Figure 6-8. What is the practical impact of such a system characteristic? Consider a user interacting with the system while the system is operating at point A. A sudden, brief burst of activity will drive the system into tbrashing mode; the user will suddenly find that the system is DOW opeI8ting at point C. Response time has suddenly increased from n to 7'2. In one typical system displaying this characteristic, the author measured response times which suddenly increased from. I second to 30 seconds! So far as the user is concemed, the system has just died. 'Ibis condition will persist until the offered load decreases long enough for thrashing to cease and for the system to get its house back in order. 'Ibis is the second tbrasbing example that we have discussed. The first example related to local area networks using contenDOD protocols (see FJgme 5-25). 
As in that case, the main lesson is that systems with the poteDtial for such severe tbmshing should be operated well below the thrashing thIeshold. Normal operating loads should allow adequate margin for anticipated brief peak loads to ensure that these loads will not cause thrash mode operation.
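The capacity collapse behind the U-turn can be seen numerically with a crude sketch; both service times below are assumptions for illustration, not measurements from the text:

```python
def capacity(service_time):
    """Maximum throughput (trans/sec) is the inverse of per-transaction
    service time for a single, fully busy server."""
    return 1.0 / service_time

normal_t = 0.10    # 100 msec/transaction below the threshold (assumed)
thrash_t = 0.40    # service time after page faults, lock timeouts, and
                   # queue dumps inflate it fourfold (assumed)

# Below the threshold the system can deliver 10 trans/sec; once thrashing
# starts, capacity falls to 2.5 trans/sec, which is less than the load
# that triggered the thrashing. Hence the U-turn.
print(capacity(normal_t), capacity(thrash_t))
```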
SUMMARY

In this chapter we have looked at the physical hardware and its effect upon performance. The hardware was viewed as a complete analyzable system, and our performance analysis tools were used to make some wide-ranging statements about a typical system.

With respect to today's operating systems, we also reviewed many characteristics that may have a serious impact on performance. It is often true in contemporary systems that interprocess messages in multicomputer systems are the most predominant of all operating system functions. Task dispatching is also often important, especially for those cases in which processors are running heavily loaded. I/O and other O/S activity are usually less important (with the exception of data-base management activities, which are discussed in a later chapter). Memory management (page faulting) is either not a problem or is an insurmountable problem. The rapidity with which a system breaks when page faulting becomes significant is so awesome as to justify remaining well away from page faulting.
In chapter 8, we will look at system performance from the viewpoint of the application processes. This is where we will use some of the operating system concepts developed in this chapter.
7 Data-Base Environment
Most transaction processing systems obtain the information required to formulate a response from a base of data that is so large that it must be maintained on large, bulk-storage devices, typically disk units in today's technology. The data is so massive that it is very important to have efficient access paths to locate a particular data item with the minimal amount of effort. This is especially true when data is stored on disk, for as we shall see, each disk access requires a significant amount of time.

It appears that the future is rapidly bringing high-speed gigabyte RAM (Random Access Memory) technology into the realm of reality (a giga is a billion!). When this happens, many of today's concerns over rapid access will be replaced with equal concern over the logical ease of access and maintainability of the data base, a subject already addressed by today's relational data bases. Though data-base organization is not a topic for this book, we will address it briefly later in this chapter. Of course, coming in parallel with gigabyte RAM is the development of kilogigabyte disks, so performance of these systems will probably always be an issue, just on a larger scale than today.

We consider in this chapter the performance of data bases stored on one or more disk units. Data is typically managed by a data-base manager that provides a "logical view" of the data base. "Logical view" means that the data is seen in the way the user wants to see it, no matter how the data is actually physically organized. For instance, one might want a list of all employee names by department. The data-base manager will provide a view of the data base as if it contains employees organized by department, even though the actual data in the data base might be organized in multiple lists, including a master employee list
containing employee name, number, address, salary, etc., and a second list giving employee numbers for each department.

Data-base managers are large and complex and are usually implemented as a set of cooperating processes. As such, their analyses will follow the techniques described in the next chapter, which covers application processes. However, the performance of the data-base manager is very much a function of the system's ability to efficiently access and manipulate the various files (or tables) that constitute the data base. This is the role of the file system and is the subject of this chapter.

THE FILE SYSTEM

The file system in a contemporary TP system, viewed from a performance standpoint, comprises a hierarchy of components, as shown in Figure 7-1.
[Figure 7-1: The file system hierarchy: data-base manager or application programs at the top, then the file manager, cache memory (with system or application buffers), the disk device driver, the disk controller, and the disk drives.]

Figure 7-1  File system hierarchy.
Data-Base Environment
228
Chap. 7
Disk Drives

At the bottom of the hierarchy are the disk drives themselves. In most systems today, the disk drives use moving read/write heads that must first be positioned to the appropriate track or cylinder if multiple disk platters with typically one head per platter are used. A cylinder comprises all of the tracks on all platters at a particular radial position of the disk unit. Once positioned, the drive must wait for the desired information to rotate under the head before reading or writing can be done. Thus, to access data on a disk drive, two separate but sequential mechanical motions are required:

• Seeking, or the movement of the read/write heads to the appropriate cylinder.

• Rotation, or latency, which is the rotation of the cylinder to the desired position under the now-positioned heads.

This sequence of actions is necessary to position the disk heads and read data. Writing data is a bit more complex. It must first be understood that data is organized into sectors on the cylinder (a sector is typically 256 to 4,096 bytes). Data can be read or written only in multiples of sectors. A sector typically contains many records. Thus, to write a record, the appropriate sector must be read according to the above sequence, the record must be inserted, and the sector must be rewritten. Since the heads are already positioned, this simply requires an additional disk rotation time relative to a read operation.

Typical seek times are 20 to 40 msec.; latency time is, on the average, a half revolution time, or 8.3 msec., for a disk rotating at 3600 rpm (today's norm). Seek plus latency time will be called access time. A good average access time to be used in the following discussions and examples is 35 msec. to read a record and 52 msec. to write a record (a rotational time of 17 msec. is added for a write).
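The access-time arithmetic above can be sketched as follows; the 26.7-msec seek is an assumed value chosen so that the result approximately reproduces the text's 35/52-msec figures:

```python
def access_times_msec(seek_msec, rpm):
    """Average disk access times: latency is half a rotation; a write
    adds a full extra rotation to reread-then-rewrite the sector."""
    rotation = 60000.0 / rpm          # one revolution, in msec
    latency = rotation / 2.0          # average rotational delay
    read_access = seek_msec + latency
    write_access = read_access + rotation
    return read_access, write_access

# 3600 rpm gives a 16.7-msec rotation and 8.3-msec average latency;
# with a 26.7-msec seek this yields roughly 35 msec to read a record
# and 52 msec to write one.
read_t, write_t = access_times_msec(26.7, 3600)
```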
Disk Controller

The disk controller is a hardware device controller that directly controls one or more disk drives. Typical controllers can control up to eight drives. A controller executes three basic classes of commands:

1. Seek, meaning to seek a specified cylinder on a specified drive.
2. Read, meaning to read a given sector or sectors on the currently positioned cylinder on the specified drive.
3. Write, meaning to write a given sector or sectors on the currently positioned cylinder on the specified drive.
Of course, there are other commands for status and diagnostic purposes, but these are the ones important for performance issues.

Most controllers can overlap seeks; that is, they can have multiple seeks outstanding so that several disk drives can be positioning themselves simultaneously to their next desired cylinders. The good news is that since seek time is the predominant factor in access time, this technique can significantly reduce access time and increase disk performance. The bad news is that it takes so much intelligence on the part of the software to be able to look ahead and predict cylinder usage (except in certain unique applications) that this feature is seldom supported by software. More about this later.

Some disk controllers provide buffering for the data to be read from, or written to, disk. That is, for a write operation the software will first transfer the data to be written to the disk controller buffer. The controller will then write the data to disk at its leisure. Likewise, for a read the data will be read from the disk into the controller's buffer, where it will be available to be read by the processor at its leisure.

Controller buffering is a mixed blessing. Without buffering, it becomes a real hardware performance problem to ensure that sufficient I/O capacity and processor memory time exist to guarantee the synchronous transfer of data between the disk and processor without data loss (once started, this data stream cannot be interrupted). On the other hand, with controller buffering, a disk transfer cannot exceed the buffer length (typically, anywhere from 256 to 4,096 bytes). Without controller buffering, data transfers can be as long as desired (at least, they can be as long as the processor's I/O channel will allow).

For our normal performance efforts at the application level, the problem of controller buffering usually is not considered. Disk transfer sizes are given, and we assume that they occur without data loss.
Disk Device Driver

The disk device driver is the lowest level of software associated with the file system. It accepts higher-level commands from the file manager and executes these as a series of primitive commands submitted to the disk controller. It monitors status signals returned by the controller to determine success or failure of operations, takes such steps as it can to retry failed operations, and takes whatever other steps are necessary for guaranteeing the integrity of disk operations (such as replacing bad sectors with spare sectors from an allocated pool).

The most common commands handled by the device driver are read/write commands. The driver will select the appropriate disk drive, will issue an appropriate seek command, will ensure that the heads have positioned properly, and then will issue a data transfer command. The memory location of data to be written to disk or of the destination of data to be read from disk is passed to the device driver along with the command from the file manager. Data may be transferred between the disk and buffers provided by an application program (or provided by the operating system on an application program's behalf), or data may be transferred into and out of disk cache memory.

The device driver, once initiated by a command from the file manager, operates substantially at the interrupt level. When it has successfully completed the transfer or has given up, it will schedule the file manager. The device driver execution time is typically included in the load L_io discussed under O/S-initiated functions in the previous chapter.
Cache Memory
Most systems today provide a disk cache memory capability that functions much like the memory cache described in chapter 6. Basically, the intent is to keep the most recently used disk data in memory in the hope that it will be reaccessed while in cache. Because of the slow speed of disk relative to the system's main memory, disk cache is usually allocated from main memory space (this part of memory is usually not a candidate for page swapping). Because memory sizes in today's TP systems can be quite large (many megabytes), disk cache is often not limited in size but rather is established by the application designers at system generation time.

The management of disk cache is similar in many respects to memory cache. Several factors are taken into consideration, such as

• Transfers likely to make ineffective use of cache are often allowed to bypass cache. Sequential file transfers by sector are a good example of this. If a file is read sequentially or written sequentially by sector, previous sectors will never be reaccessed and so do not need to be cached. However, if records within a sector are being accessed, sequential sectors are cached so that records blocked in that sector may take advantage of cache. In some systems, sequential sectors in cache are overwritten by the next sector, as the old sector will not be needed again.

• The various records in cache are aged and are also flagged if they have been modified (a dirty record). When a new record is to be read in, an appropriate area in cache must be overwritten. The caching algorithm will generally elect to overwrite the oldest area, i.e., an area that has not been used for the longest time. If there is a choice, a clean area will be overwritten, as opposed to a dirty area, since a dirty area must be written to disk first before it can be overwritten (unless cache write-through is used, as discussed next).

• In many TP systems, modified data will reside in disk cache memory until it is forced to disk by being overwritten with new data. However, in the event of a processor failure, that data may be lost, and the data base will have been corrupted. In fault-tolerant systems, cache write-through is used. In this case, all writes to disk cause an update to cache and a physical write to disk (just like our earlier memory cache example) before the write is declared complete. In this way, all completed writes reside on disk in the event of a processor failure.

• The size of cache memory required is a direct function of the transaction rate to be supported by the system. Consider a transaction which reads a record and which may have to update that record at the operator's discretion 30 seconds later. If the system is handling 1 transaction per minute, then a cache size of 1 record is likely to give good performance. However, if the system transaction rate is 10 per second, then a minimum cache size of 300 records will be needed to guarantee any
reasonable cache hit ratio; i.e., 10 records per second (300 records total) will have been read into cache during the 30 seconds it will have taken the operator to update the original record.

Disk cache memory is just another flavor of the concept behind virtual memory (page faulting) and main memory caching. As with these other mechanisms, disk cache hit ratios are very difficult to predict. As mentioned above, they are most effective when files are being accessed randomly (a common characteristic of TP systems) and are least effective when files are accessed sequentially (as with batch systems). In TP systems, disk cache hit ratios of 20 percent to 60 percent are common. This parameter is typically specified as an input to the model or is treated as a computational variant.
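The cache-sizing rule of thumb in the last bullet above can be sketched as follows (the function name and the one-read-per-transaction default are mine):

```python
import math

def min_cache_records(transaction_rate, update_window_sec, reads_per_txn=1):
    """Records read into cache during the operator's think time: the
    cache must hold at least this many records for the original record
    to still be resident when the update finally arrives."""
    return math.ceil(transaction_rate * update_window_sec * reads_per_txn)

# The text's example: 10 transactions/sec and a 30-second operator delay
# require roughly 300 records of cache; 1 transaction/minute needs 1.
print(min_cache_records(10, 30), min_cache_records(1 / 60, 30))
```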
File Manager

The file manager usually runs as a high-priority process in the TP system. In the simplest case, there is typically one file manager associated with each disk controller, although this is not a firm rule. A file manager may control several disk controllers, or as an alternative, there may be several file managers associated with a single disk controller. Multiple file managers are considered later.

Application processes (including a data-base manager, if any) submit requests to the file manager for disk operations, which are stacked up in the file manager input queue. These can be quite complex operations, such as

• Open. Open access to a file on behalf of the requesting process. Typically, a file control block (FCB) is allocated and assigned to the process-file pair. This instantiation of a file open is often given a file number so that later file requests by this process need only give a file number rather than a file name. The FCB keeps such data as current record position, end-of-file marker, and file permissions. A file open may be requested for various permissions, such as read only, modify privileges, or shared or exclusive access.

• Close. Close the access to the file by this process.

• Position. Position the record pointer to a particular record or to the end of the file.

• Read. Read a record.

• Lock. Lock the record being read (or file or record field, depending upon the file management system) so that no other process can lock this record or update it. Locking is used prior to an update to make sure that processes do not step on each other when trying to simultaneously update the same record.

• Write. Write a new record or a modified record.

• Unlock. Unlock the record being written.

This is only a partial list of file management duties. We have yet to talk about file structures, which would expand this list to include operations such as searching for blank
slots in random files, keyed reads and writes, etc. The point is that the file manager is a highly intelligent process, and this intelligence costs time. Typical file manager execution times can run 10 to 50 msec. for 32-bit 1-MIPS machines. Only for special applications today is processing time less than 10 msec. per file access. When compared with the 30-50 msec. disk access time, it can be seen that the file manager time makes a bad situation even worse.

Note that file manager time is substantially additive to disk time. When a request is selected for processing, the file manager must first do all the validation and processing required to submit one or more commands to the disk device driver. It then checks to see if the data item is in cache. If not, the file manager submits the first of a potential series of commands to the disk driver and then goes dormant until the disk driver has completed the operation and has returned the result (data or status). The file manager then must verify the result and, if necessary, submit the next command to the driver. This process continues until the request from the application process has been completely satisfied. Some operations, such as file opens and keyed accesses, can require several disk operations to complete. In all of these operations, the disk is active while the file manager is dormant and vice versa. Thus, the actual time required to complete a disk operation is the sum of the physical disk times and the file manager processing times.

File System Performance

Let

t_a = disk access time (seek plus latency) (sec.).
t_dr = disk rotational time (twice latency) (sec.).
n_dri = number of disk read operations required for file operation i (open, close, read, write, etc.).
n_dwi = number of disk write operations required for file operation i.
t_fmi = file manager time for operation i (sec.).
f_i = proportion of t_fmi required if data is in cache.
p_d = average disk cache miss ratio, i.e., the probability of requiring a physical disk access.
t_fi = file system service time for operation i (sec.).

Note that the disk time required to read a record is t_a and to write a record is t_a + t_dr. There are two cases to consider in terms of file system service time: caching of writes and cache write-through.

If writes are cached, the sector to be updated is searched for in cache. If found, the sector is updated and left in cache. It eventually will be flushed to disk when it hasn't been used for a while but may have had several updates made to it by then. The cache miss ratio, p_d, must take this into account.
If writes are not cached (cache write-through), each write modifies the sector to be updated if it is in cache, but that sector is unconditionally written to disk. The write will take advantage of cache to read the sector but will always physically write it back to disk (on the next disk spin if it had been physically read from disk). Thus, if the sector is found in cache, a disk time of t_a is required to write it out. If it is not found in cache, a disk time of t_a + t_dr is required to read it and then to write it out. A time, t_a, is required every time; a time, t_dr, is required p_d of the time.

For cached writes, file system service time for operation i is

t_fi = a_i t_fmi + p_d [n_dri t_a + n_dwi (t_a + t_dr)]    (7-1)

For cache write-through,

t_fi = a_i t_fmi + p_d (n_dri t_a + n_dwi t_dr) + n_dwi t_a    (7-2)

The parameter a_i takes into account the effect of finding the desired data in cache. If (1 - p_d) of the time data is in cache, and if the file manager time then required is f_i t_fmi, then the average file manager time for operation i is

(1 - p_d) f_i t_fmi + p_d t_fmi

or

(p_d + f_i - p_d f_i) t_fmi

Thus,

a_i = p_d + f_i - p_d f_i    (7-3)
Let

p_i = probability of file operation i.
t_f = average file service time (sec.).

Then the average file system service time is

t_f = Σ_i p_i t_fi    (7-4)

If

R_f = rate of file requests (per second).
L_f = load on file system.
n_f = number of file requests per transaction.
R_t = transaction rate (per second).

then the file system load is

L_f = R_f t_f = n_f R_t t_f    (7-5)
Assuming that file service time, t_f, is random and that the number of processes requesting file service functions is large, the file service delay time, t_fd, including queue delays and servicing, is

t_fd = t_f / (1 - L_f)    (7-6)
Equations 7-1 and 7-2 above ignore the actual transfer time of the data between disk and memory. This is a small portion of a single rotation and is usually small enough that it can be ignored. For instance, if there are 32 sectors per track, then the transfer of one sector will require 1/32 of a rotation time. If rotation time is 16 msec., this amounts to 0.5 msec., which is small compared to average access times in the order of 30 to 50 msec.
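Equations 7-1 through 7-6 can be gathered into one short sketch; the function names are mine, and the sample inputs at the bottom are illustrative assumptions rather than values from the text:

```python
def file_service_time(t_fm, f, p_d, n_dr, n_dw, t_a, t_dr, write_through):
    """File system service time for one file operation,
    per equations 7-1 through 7-3."""
    a = p_d + f - p_d * f            # eq. 7-3: average file manager factor
    if write_through:
        # eq. 7-2: writes always hit disk; the read-before-write costs
        # an extra rotation only on a cache miss
        return a * t_fm + p_d * (n_dr * t_a + n_dw * t_dr) + n_dw * t_a
    # eq. 7-1: reads and writes hit disk only on a cache miss
    return a * t_fm + p_d * (n_dr * t_a + n_dw * (t_a + t_dr))

def file_delay_time(t_f, load):
    """Eq. 7-6: queue delay plus servicing, assuming random service
    times and many requesting processes."""
    return t_f / (1.0 - load)

# Assumed values: 10-msec file manager, 50% of that on a cache hit,
# 40% miss ratio, one read, 35-msec access, 17-msec rotation.
t_f = file_service_time(0.010, 0.5, 0.4, 1, 0, 0.035, 0.017, False)
print(t_f, file_delay_time(t_f, 0.5))   # service time, then delay at 50% load
```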
FILE ORGANIZATION
Formal data-base structures are generally characterized as hierarchical, network, or relational. Though data-base structures are not the topic of this book, suffice it to say that each of these organizations is a further attempt at achieving ultimate flexibility and maintainability of the data base. And as this goal is achieved, it exacts its toll: performance. Relational data bases are recognized today as being the most flexible and maintainable data bases, and often the worst performance hogs (though impressive strides in this area are being made). Many systems are first built as pure "third normal form" relational data bases and are then modified to compromise this structure in order to achieve adequate performance. (For an excellent discourse on data-base structures, see Date [5], a classic in this field.)

One characteristic that all of these data-base structures have in common is the need for keyed files. Thus, almost all of today's file systems support keyed files. They also support sequential files for batch processing, random files for efficiency, and unstructured files as the ultimate programmer's out. These file structures form the basis of TP system performance to a large extent and are discussed next.

An unstructured file is viewed simply as an array of bytes (see Figure 7-2a). The application process can read or write any number of bytes (up to a limit) starting at a particular byte position. Note that in general, a transfer operation will begin within one sector on disk and end within another sector. Let

b_s = number of bytes in a disk sector.
b_u = number of bytes being transferred to or from an unstructured file.
[Figure 7-2: File structures. (a) Unstructured file: a record of b_u bytes within sectors of b_s bytes. (b) Sequential file: variable-length records, an average of r_s records per sector, with separate read and write positions. (c) Random file: fixed-length records within sectors.]
If the transfer size is no greater than a sector, the probability of it falling directly within a sector is (b_s - b_u + 1)/b_s; otherwise, two sectors must be accessed, with probability 1 - (b_s - b_u + 1)/b_s = (b_u - 1)/b_s. Thus, the average number of sectors to be accessed is

(b_s - b_u + 1)/b_s + 2 (b_u - 1)/b_s = (b_s + b_u - 1)/b_s    (7-7)
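The average-sectors expression above is easily checked numerically (the 100-byte record and 2048-byte sector are assumed values for illustration):

```python
def avg_sectors_accessed(b_u, b_s):
    """Average sectors touched by a b_u-byte transfer starting at a
    random byte position, for b_u no greater than the sector size b_s."""
    p_one = (b_s - b_u + 1) / b_s    # transfer falls within one sector
    p_two = (b_u - 1) / b_s          # transfer straddles two sectors
    return 1 * p_one + 2 * p_two     # simplifies to (b_s + b_u - 1)/b_s

# A 100-byte record in 2048-byte sectors almost always needs one sector.
print(avg_sectors_accessed(100, 2048))
```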
Let us plug some typical numbers into this expression. Consider an inventory system dedicated to the inquiry of inventory and the placing of an order against that inventory. As a result of an operator query, a customer master file and a product file are each read according to a key. The operator returns an order, which updates an amount in the product file, updates a status key to the product file, and writes an order detail record to an order file keyed by product code and customer I.D. Thus,

n_dr = 2 keyed reads (customer master and product files).
n_dw = 2 keyed writes (product and order detail files).
n_k = 3 key file updates (product status key on the product file and customer and product keys on the order file).

Three files are involved: the customer master, product, and order files. Let us assume the following conditions:

• The customer master file contains 50,000 records of 300 bytes each.
• The customer I.D. is 10 bytes.
• The disk sector size is 2K bytes.

With this information, we can estimate the size of the customer master file B-tree, which we are assuming is cached. We do this as follows. Each disk sector can hold 2048/14 = 146 keys (assuming a 4-byte pointer and no overhead). Thus, there will be 50,000/146 = 343 sectors occupied by the key records for the customer master file. Assuming a 30 percent slack, then 343/.7 = 490 blocks is a more reasonable number. These blocks are pointed to by the first level of the B-tree (which we assume is cached). The 490 sectors of keys will require 490/(146 × .7) = 5 sectors in the first level of the tree (with 30% slack) and a root sector. Thus, the customer I.D. B-tree for the customer master file will require 6 sectors in cache.
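The B-tree sizing arithmetic above can be reproduced with a short routine (the function name and default fill factor are mine; the arithmetic follows the text):

```python
import math

def btree_cache_sectors(n_records, key_bytes, sector_bytes,
                        pointer_bytes=4, fill=0.7):
    """Leaf key sectors for a file, and the sectors (first index level
    plus root) to hold in cache, following the text's arithmetic."""
    keys_per_sector = sector_bytes // (key_bytes + pointer_bytes)
    leaf = math.ceil(n_records / keys_per_sector / fill)   # leaf key sectors
    level1 = math.ceil(leaf / (keys_per_sector * fill))    # first index level
    return leaf, level1 + 1                                # +1 for the root

# 50,000 customer records, 10-byte key, 2K sectors: 490 leaf sectors,
# and 6 sectors (level 1 plus root) to keep in cache.
leaf, cached = btree_cache_sectors(50_000, 10, 2048)
print(leaf, cached)
```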
Let us assume that an equivalent analysis on the other key files leads to the following B-tree sizes:

TABLE 7-2. EXAMPLE B-TREE SIZES

File               Key              B-tree sectors
Customer master    Customer I.D.          6
Product master     Product code           4
                   Product status         5
Order detail       Customer I.D.          4
                   Product code           4
                                         --
                                         23
Thus,
C_db = 23 cache blocks required for B-trees (46K bytes).

Finally, let us assume that we have a 1-megabyte cache memory (500 blocks). Thus,

C_d = 500 blocks of cache memory.

Then, from equation 7-24:

T_cdw = (500 - 23)/(11 R_t) = 43.4/R_t

where we have assumed in our value for p_dw that the product file record to be updated is not found in cache, i.e., it has been read and then flushed before it could be updated. Obviously, we would like to have enough cache to ensure that, with high probability, this record is not flushed before it is updated. Note that the longevity of a block in cache is a function of the transaction rate, R_t. As the transaction rate increases, disk activity increases, and longevity decreases. The following table gives longevity times for a range of transaction rates for this case:
where we have assumed in our value for Pldw that the product file!ecord to be updated is not found in cache, i.e., it has been read and then Bushed before it could be updated. Obviously, we would like to have enough cache to ensure that, with high probability, this recon1 is not ftusbed before it is updated. Note tbat the longevity of a block in cache is a function of the ttansaction rate, Rr • As the transaction rate increases, disk activity increases, and longevity decreases. The following table gives longevity times for a nmge of ttaDsaction rates for this case: TABLE 7-3. EXAMPLE LONGEVITIES R, (1I'IIDSIsec)
T..... (sec)
1 2 S 10
43A 21.7 8.7 4.3
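The longevity figures in Table 7-3 follow directly from the expression above; a minimal sketch (the function name is mine, the inputs are the text's):

```python
def cache_longevity_sec(cache_blocks, btree_blocks,
                        disk_accesses_per_txn, txn_rate):
    """Time a block survives in cache before being flushed: the blocks
    not pinned by B-trees, divided by the rate at which new blocks
    arrive from disk."""
    free_blocks = cache_blocks - btree_blocks
    return free_blocks / (disk_accesses_per_txn * txn_rate)

# 500-block cache, 23 blocks of B-trees, 11 physical accesses per
# transaction: longevity is 43.4/Rt seconds, as in Table 7-3.
for rate in (1, 2, 5, 10):
    print(rate, round(cache_longevity_sec(500, 23, 11, rate), 1))
```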
If it is anticipated that a user will require about 30 seconds to fill an order, then at one transaction per second, the product file record to be updated will have a good chance of remaining in cache. This would give better performance than anticipated, since it was assumed that this record would not be found in cache when it was to be updated. At
transaction rates greater than 1.5 transactions per second, the probability that this record will remain in cache long enough is substantially reduced.

A feel for cache hit (or miss) ratios can be obtained from this example. The following table lists total disk accesses for each operation, the number of B-tree accesses (root and level 1) assumed to be in cache, the number that are assumed to come from disk (first-time reads or writes), and the number that are candidates for cache hits if the record has not been flushed.
TABLE 7-4. EXAMPLE DISK ACTIVITY

Operation               Total      B-tree accesses   Reads/    Possible
                        accesses   in cache          writes    cache hits
Read cust. mast.           4            2               2          -
Read product               4            2               2          -
Write product              4            2               1          1
Write status key           3            2               1          -
Write order detail         4            2               2          -
Write cust. I.D. key       3            2               1          -
Write prod. code key       3            2               0          1
                          --           --              --         --
                          25           14               9          2
This table reflects the assumption that all key files are supported by a two-level B-tree. Thus, there are 25 accesses to sectors required to process this transaction. Fourteen are assumed to be in cache, and 9 are assumed to necessarily require a physical disk access. The two update accesses (one to read the product code key file and one to write to the product file) may be in cache if the update is fast enough. Thus, the disk cache hit ratio ranges from .56 to .64 (the cache miss ratio ranges from .36 to .44), depending upon the success of finding a record that is to be updated and is still in cache. These results are typical of TP systems.

This "analysis" of disk cache has used a very simple example to illustrate the basic concepts of disk cache and to give a feel for the magnitude of the parameters involved in typical TP systems. In the real world, any realistic analysis of disk caching is usually too complex to be useful (if indeed at all possible), and reasonable assumptions for cache miss ratios are usually used instead. However, the longevity analysis that led to equation 7-24 and to the example of Table 7-3 can be useful in estimating the amount of cache memory that represents the threshold required to achieve high cache hit ratios for update activity. This can be of paramount importance in TP systems.
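The hit-ratio bounds quoted above are simply the cached fraction of the 25 accesses, with and without the two update candidates; a small check using the Table 7-4 totals:

```python
# Cache hit-ratio bounds from the Table 7-4 access counts.
total_accesses = 25    # sector accesses per transaction
btree_hits = 14        # B-tree root/level-1 accesses assumed cached
update_candidates = 2  # accesses that hit only if the record wasn't flushed

worst_hit_ratio = btree_hits / total_accesses                       # record flushed
best_hit_ratio = (btree_hits + update_candidates) / total_accesses  # record retained

print(f"hit ratio: {worst_hit_ratio:.2f} to {best_hit_ratio:.2f}")
print(f"miss ratio: {1 - best_hit_ratio:.2f} to {1 - worst_hit_ratio:.2f}")
```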
OTHER CONSIDERATIONS

There are several other considerations that can affect the performance of the data base in a TP environment. An understanding of these will allow the performance analyst to appropriately modify his attack on the problem of analysis.
Overlapped Seeks

A disk controller can typically drive several (say eight) disk drives. In general, however, it can be transferring data to or from only one disk drive at a time because of hardware and I/O channel limitations. However, if there is a queue of disk requests waiting for the disk controller, many systems will allow the disk device driver (the software driver that controls the disk controller; see Figure 7-1) to "look ahead" through the queue. By doing so, the device driver can initiate seeks on other drives to get their heads properly positioned in anticipation of transferring data. This capability is called overlapped seeking. The seeking of the read/write heads on some disks is overlapped with the seeking of other heads and also with the transfer of data from one of the disk units. Overlapped seeks can drastically reduce disk access time in busy systems (it has little effect on idle systems). Since seek time is a major component of total file management time, this can greatly improve the responsiveness of the data-base system.
There is no straightforward way to analyze the effect of overlapped seeks except to estimate the effective seek time at the anticipated load and to use that time in the analysis rather than the actual seek time. An example will illustrate the sort of estimate that can be made.
Let us assume that a preliminary analysis has shown that the average queue length of requests waiting for a controller (W in terms of chapter 4 notation) is two requests. The controller has four disks connected to it, one of which, of course, is busy. This leaves three disks. The probability that the first request is for one of the free disks is 3/4. Thus, with probability .75, one overlapped seek can be started. With probability .75, the second request can lead to an overlapped seek if the first did not. If the first request did lead to an overlapped seek, then with probability .5, the second request can lead to an overlapped seek. Thus, with probability (.25)(.75) + (.75)(.5) = .56, the second request will result in an overlapped seek. On the average, a request will be given an overlapped seek (.75 + .56)/2 = .66 of the time. If we assume, given an overlapped seek, that the seek is complete when the request is chosen for data transfer, then overlapped seeks are transparent to the requesting process, and overall seek time has been reduced by 66 percent. If seek time is 20 msec., the effective seek time is (1 - .66)(20) = 6.8 msec. (The assumption of 100 percent seek overlap with the processing of the current request is not unreasonable, since disk processing times tend to run in the same order of magnitude as seek times.)
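The estimate above can be written out mechanically; a minimal sketch under the example's assumptions (four drives, one busy, two queued requests, each request equally likely to address any drive):

```python
# Effective seek time under overlapped seeks, following the text's example.
p_first = 3 / 4                       # first queued request finds one of 3 free drives
# Second request overlaps if the first didn't (3 free drives) or if it did (2 free).
p_second = (1 - p_first) * (3 / 4) + p_first * (2 / 4)
p_overlap = (p_first + p_second) / 2  # average over the two queued requests

seek_time = 20.0                      # msec, without overlap
effective_seek = (1 - p_overlap) * seek_time
print(f"P(overlap) = {p_overlap:.2f}, effective seek = {effective_seek:.2f} msec")
```

Carrying full precision gives 6.88 msec.; the text's 6.8 msec. comes from rounding .5625 to .56 before averaging.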
Queues are normally serviced on a FIFO (first-in, first-out) basis. However, there are algorithms that will look ahead through the queue and will decide which request to service next based on maximizing efficiency. An example of this sort of servicing algorithm is the elevator algorithm. This algorithm searches the queue for that request which is closest to the current position of the disk head and chooses that request for service. In this way, seek time is minimized.
Algorithms such as this have a problem in that although the overall efficiency of the disk system is enhanced, some requests may get delayed an inordinate amount of time. In fact, with the elevator algorithm, it is possible to create a scenario in which a request may never be honored in a busy time. This would be a request at one edge of the disk when all activity is at the other edge. Remember our system manager who hears only from irate customers? Let's not do this to him. Of course, the algorithm can be modified to prevent this. One form of a modified elevator algorithm always sweeps in one direction from one edge of the disk to the other, servicing all pending requests in cylinder order. It then reverses and repeats this sequence. In this case, one can make an approximate statement concerning the enhancement provided by this modified elevator algorithm. As pointed out in chapter 4, the average distance between two cylinders chosen randomly on disk is 1/3 of the total number of cylinders. This is the average seek distance of the disk arm as it services a FIFO queue. However, if there are n items on the average in the disk queue, and if these are serviced according to the modified elevator algorithm such that all are serviced in order as the disk arm sweeps across the disk, then the average seek distance is 1/n of the total number of cylinders. If n is 10, then the average access time is based on moving a total of 10 percent of the cylinders versus 33 percent, or a reduction of 3.3 in the access distance. This can be significant. Of course, a ten-item queue is even more significant in terms of delay time. By the time the queue length is long enough to make algorithms like this meaningful, the system has long ceased performing satisfactorily. Algorithms such as these are not typically found in today's TP systems. They are not very effective unless queue lengths are long, and good performance design dictates that queue lengths be short (70 percent resource loads yield queue lengths on the order of one or two, according to Khintchine and Pollaczek). In addition, such algorithms tend to be unpredictable and have not been found to be necessary. However, as with overlapped seeks, one technique to handle service-order algorithms is to estimate in some way an effective seek time (or access time if rotational latency enhancements are involved) and to use this modified value in the analysis.
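The sweep discipline described above is easy to sketch; the function and names below are illustrative, not from the text:

```python
def sweep_order(head, pending):
    """Modified elevator: service all pending cylinders at or beyond the head
    in ascending order, then reverse and sweep back through the remainder."""
    ahead = sorted(c for c in pending if c >= head)
    behind = sorted((c for c in pending if c < head), reverse=True)
    return ahead + behind

# Head at cylinder 50; requests scattered across a 100-cylinder disk.
print(sweep_order(50, [95, 10, 63, 40, 88, 2]))  # [63, 88, 95, 40, 10, 2]
```

Note that the edge requests (cylinders 2 and 10) are still serviced on the return sweep, avoiding the starvation possible with the pure closest-request elevator.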
Data Locking

A typical TP system has many users accessing the same files. As long as a file is simply being read, there is no problem. However, as soon as two users try to update the same data, a problem can arise. Suppose user A reads an inventory record, finds 25 widgets in stock, and decides to sell 10, thereby needing to update the stock quantity to reflect the fact that 15 are left. However, before user A can return the update, user B has read the same record and decides to sell 5 widgets. User A by now has updated the file, unbeknownst to user B, to reflect a new quantity, 15. User B then returns an updated record showing a quantity of 20 widgets.
The data base now reflects 20 widgets in stock, whereas there are really only 10. The two users of the system have stepped on each other's toes. The solution to this dilemma is data locking. When a piece of data is locked, no other entity (person or program) may read that data for purposes of updating (normal reading is often allowed). Depending upon the system, data locks can be applied to an entire file, a record in the file, or a data field in the record. The finer the granularity of the lock, the more efficient the system (except, of course, that the finer the lock, the more overhead is imposed on the system). Thus, user A will read the widget record with lock (let us assume that record locking is available). User B will then try to read this record with lock. However, the attempt will fail, and user B will have to wait (either try again later or be queued by the system to the locked record). When user A updates the record to show 15 widgets left, the lock will be removed. User B's request for a read with lock will now be honored and will show 15 widgets left. Selling 5, user B will return an updated record showing 10 widgets left in stock. Clearly, file locks can bog down a system terribly, especially if the file is kept locked while an operator takes some action on the data (what if the operator takes a coffee break while he has the file locked?). Record locks offer a much smaller chance for conflict, and data item locks are even better. As a general design philosophy, the locking mechanism which freezes the least data should be used (record locking is quite common in contemporary systems). The lock also should be maintained for as short a time as possible (for instance, only during the updating process, after the operator has performed all other functions). Most TP systems are designed so that the chance of data lock conflicts is negligible; lock conflicts can usually be ignored from a performance viewpoint.
If lock conflicts are not negligible, delays from lock conflicts must be estimated and added to the effective file manager service time if locks are queued, or to the total transaction time if the operator must wait and then resubmit the transaction. Such delays are so application-dependent that nothing more in a general sense can be said about them.
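The read-with-lock protocol of the widget example can be sketched with a per-record mutex (a simplified illustration; real file managers queue lock requests and must also handle timeouts and deadlock):

```python
import threading

# One lock per record: the record-level granularity discussed in the text.
# The record and function names here are illustrative only.
stock = {"widgets": 25}
record_locks = {"widgets": threading.Lock()}

def sell(record, quantity):
    """Read with lock, update, unlock, serializing conflicting updates."""
    with record_locks[record]:            # user B blocks here until user A commits
        on_hand = stock[record]           # read under lock
        stock[record] = on_hand - quantity  # return the updated record

# User A sells 10 and user B sells 5, concurrently.
a = threading.Thread(target=sell, args=("widgets", 10))
b = threading.Thread(target=sell, args=("widgets", 5))
a.start(); b.start(); a.join(); b.join()
print(stock["widgets"])  # 10, not the erroneous 20 of the unlocked scenario
```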
Mirrored Files

As discussed in chapter 2, critical files are often mirrored for reliability. When data must be written to two disk units, even though this may be done simultaneously, the average write time is longer than writing to one disk. Conceptually, one may explain this by thinking of one disk as being the "average" disk, completing in an "average" time. The other disk will either complete faster, in which case the mirrored write took an "average" time, or will complete slower, in which case the mirrored write took a longer than "average" time. The net result is necessarily a mirrored write that averages longer than a single write.

An analysis of mirrored writes is given in Appendix 5. The results are quite simple. Mirroring a file adds about 40 percent to the seek time. Thus, if a disk has a 20 msec. seek time and an 8.3 msec. latency time, then single disk write time is (20 + 8.3 + 16.7) = 45 msec. A mirrored write will add .4(20) = 8 msec., yielding a write time of 53 msec.
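The 40 percent figure can be checked by simulation. If each arm's seek distance is taken as the distance between two cylinders chosen at random (averaging 1/3 of the cylinders, per chapter 4), the mirrored write must wait for the longer of two such independent seeks, whose expected value works out to 7/15 of the cylinders, about 1.4 times 1/3. A sketch under that independence assumption:

```python
import random

random.seed(1)

def seek_distance():
    """Normalized distance between two random cylinder positions (0..1)."""
    return abs(random.random() - random.random())

N = 100_000
single = sum(seek_distance() for _ in range(N)) / N
mirrored = sum(max(seek_distance(), seek_distance()) for _ in range(N)) / N

print(f"single seek ~ {single:.3f} of the cylinders (theory 1/3)")
print(f"mirrored seek ~ {mirrored:.3f} (theory 7/15)")
print(f"seek-time penalty ~ {mirrored / single - 1:.0%}")
```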
To achieve even greater reliability, fault-tolerant systems will often write to only one disk at a time. In this case, if a failure occurs during writing, one disk is guaranteed to be readable. However, a mirrored write now requires twice the disk time of a single disk write. One compensating advantage of mirrored drives is that both can be shared for reading. Some systems take advantage of this to some degree or another. If an application is heavily oriented to reading over writing, mirrored disk drives could prove to be a performance advantage.
Multiple File Managers

In many TP systems, the file manager can become the bottleneck for the system. This can be alleviated by partitioning the system so that it can have several file managers, each sharing a portion of the load. Multiple file managers can alleviate another problem in some cases. As we discussed earlier, disk processing (i.e., CPU) time and physical disk access time are substantially serial in nature. While the file manager is processing a request, the disk is idle. Then the file manager waits while the disk transfers the desired data. In typical TP systems, a disk driven by a single file manager can only be kept busy 25 to 60 percent of the time, thus sharply reducing its effective capacity. If multiple file managers could be used to drive a disk, disk utilization could be sharply improved. There are several ways in which multiple file managers can be utilized.

File manager per disk volume. A large TP system will typically have several disk volumes (physical disk drives) that it uses. These systems often provide the capability for (or require) separate file managers for each disk volume, as shown in Figure 7-5a. While this does not solve the problem of low disk utilization described above, it does give a mechanism for alleviating the file manager bottleneck. To analyze the performance of this type of file manager configuration, one simply computes the total file load (file manager processing and disk access) as set forth in equations 7-1 through 7-6 and then allocates this load across the independent file managers. Often, there is not sufficient information to allocate load on any but an evenly distributed basis. In this case, if a file load of L_f is to be distributed across D file managers, each controlling their own disk system, then the load on each file manager is L_f/D.
Multiple file managers per disk volume. Whether a TP system is large or small, it can benefit from having multiple file managers share one disk. This not only relieves the file management bottleneck to some extent but also allows disk utilization to approach 100 percent. This structure is shown in Figure 7-5b. In order for more than one file manager to use a disk volume, requests to that disk must be appropriately partitioned between the file managers. It is not sufficient to allow them to simply work from a single queue, since the order in which certain requests are executed is critical. For instance, if a process submits two requests, one to open a file and one to read it, it is imperative that the requests be executed in that order. We would not
[Figure 7-5  Multiple file managers. (a) File manager per disk volume (D disk volumes). (b) Multiple file managers per volume (m file managers).]
want one file manager to start the opening of a file (a lengthy procedure) and another to immediately try to read that file before the opening procedure was complete. One straightforward way to partition work between different file managers is to have each responsible for a defined subset of files on the disk volume. In this way, all operations on a file will be consistent. Since requests must be partitioned, the file managers do not act as a multiserver servicing a common queue. Rather, just as in the file manager per disk volume case described above, the load is distributed to them as individual and independent servers. If there are m file managers servicing a single disk volume, and if the file system load is L_f, then the load on each file manager is L_f/m. However, the disk now sees multiple users (though nowhere near an infinite number). It will have a queue of work to do and will respond as a single server to the m file managers. The characteristics of the disk queue will be governed by the single server, finite users model M/M/1/m/m discussed in chapter 4.
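Partitioning by file subset is easy to sketch; hashing the file name to pick a manager keeps every operation on a given file at the same manager, preserving per-file ordering. The routing scheme and names below are illustrative, not from the text:

```python
from zlib import crc32

M = 3  # file managers sharing one disk volume

def manager_for(file_name):
    """Route every request on a file to one fixed file manager, so that
    operations on that file (e.g., open, then read) stay in order."""
    return crc32(file_name.encode()) % M

# All requests against ORDERS land in the same manager's queue, in order.
queues = [[] for _ in range(M)]
for op, f in [("open", "ORDERS"), ("read", "ORDERS"), ("open", "PARTS")]:
    queues[manager_for(f)].append((op, f))
print(queues)
```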
[Figure 7-5 (cont.)  (c) Multiple file managers, multiple volumes (D disk volumes, m file managers per volume).]
Multiple file managers per multiple volumes. The above two configurations can, of course, be combined as shown in Figure 7-5c. Multiple disk volumes are each controlled by multiple file managers. Using the above notation, the load on each file manager is L_f/Dm.
AN EXAMPLE

As an example of file manager performance, let us consider the transaction discussed under cache management and apply it to various numbers of file managers in a TP system which has one file manager per disk volume. To summarize that example, we have a standard transaction comprising:

• 2 keyed reads
• 2 keyed writes
• 3 key file updates
Let us designate these as file operation types i = 1, 2, and 3, respectively. As discussed earlier, the following table gives the values for n_dir, n_diw, and p_i for each file operation and also gives typical values for file management processing time, t_fmpi:

TABLE 7-5. FILE OPERATION PARAMETERS

File operation      i    n_dir    n_diw    t_fmpi (msec)    p_i
Keyed read          1      4        0           20          .29
Keyed write         2      2        2           30          .29
Key file update     3      2        1           25          .42
                                                           ____
                                                           1.00
We further assume the following parameters:

t_da = 28 msec.
t_fir = 17 msec.
P_d = .4
f_i = .5 for i = 1, 2, 3

We further assume that writes are cached. Based on these values, the file management times, t_fmi, are given by equation 7-1 and are shown in Table 7-6.

TABLE 7-6. EXAMPLE FILE MANAGEMENT TIMES

i    t_fmi (msec)
1       58.8
2       79.4
3       57.9

The average file system service time, from equation 7-4, is

t_fm = Σ_i p_i t_fmi = 64.4 msec.
The load on the file system, given by equation 7-5, is

L_f = 7(64.4/1000)R_t = .45R_t

and the load on each file manager is

L_f/D = .45R_t/D

where D is the number of disk volumes and consequently the number of file managers. From equation 7-6, the response time of a file manager is

t_df = t_fm/(1 - L_f/D) = 64.4/(1 - .45R_t/D)
This response time (or file manager delay time) is plotted in Figure 7-6 for one to three file managers (disk units D). It is clear from Figure 7-6 that additional file manager paths to different disks proportionately increase the capacity of the system and can have dramatic effects on response time. At two transactions per second, the response time for the three cases is shown in Table 7-7.

[Figure 7-6  Multiple file manager response time: response time (msec) versus transactions/sec (R_t) for D = 1 to 3 disk units.]

TABLE 7-7. EXAMPLE RESPONSE TIMES

D    t_df (msec)
1       644
2       117
3        92
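Table 7-7 is just equation 7-6 evaluated at R_t = 2; a quick check:

```python
# File manager response time, t_df = t_fm / (1 - L_f/D), using the example's
# average service time t_fm = 64.4 msec and file system load L_f = 0.45 * R_t.
t_fm = 64.4   # average file system service time, msec
R_t = 2.0     # transactions per second

def response_time(D):
    """Response time (msec) with the load spread over D file managers."""
    load_per_manager = 0.45 * R_t / D
    return t_fm / (1 - load_per_manager)

for D in (1, 2, 3):
    print(f"D = {D}: {response_time(D):5.0f} msec")
```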
An additional point to note is the average disk utilization. The average disk operation requires 64.4 msec. Of this,

Σ_i a_i p_i t_fmpi = 17.5 msec.

is processing time during which the disk is idle. Thus, during the time that the file management system is busy, the disk is only being used 73 percent of the time. In many systems this disk utilization can be less than 50 percent. It is this situation that is enhanced by utilizing multiple file managers per disk volume.
8

Application Environment
Now that we have a vehicle, let's take a look at the passenger. Our TP vehicle includes a communication network for exchanging messages (requests and replies) with the users, a processing environment (the CPU, memory, and operating system) to act as our TP engine, and a data base that is used to maintain the status of our environment and to answer inquiries relative to that environment from our users. It is this vehicle in which the TP application resides and by which it is hidden from the mundane problems of the real world. The communication manager worries about line protocols, data errors, and network outages; it presents a clean message interface to the application. The operating system and its hardware take away the concerns of memory management, multiple processors, multiple users, interprocess communication, process scheduling, and in some cases even fault detection and recovery. The file manager or data-base manager provides a smooth path for accessing and maintaining our data.

In this environment, the application is simple, at least conceptually. It comprises one or more processes that accept messages and apply them to the data base for inquiry and update purposes. The analysis of its performance, however, is still tied to the performance issues of its vehicle and particularly its engine.

A TP application typically comprises a set of cooperating processes. The requestor-server model described in chapter 3 is a good example of this. Therefore, no matter how efficiently a process is designed, it will still be delayed by other system activities as it competes with them for resources. This chapter deals with the performance of application structures in the TP environment and describes a variety of application process structures currently in use.
PROCESS PERFORMANCE

A portion of the response time calculation is, of course, the time that is consumed by the application process in processing a transaction. However, this is not usually a major factor in response time, and it is often only a negligible factor. A transaction inquiry providing a 2-second response time might only require 50 msec. of processing time, for instance. True, a good bit of the remaining time is used up in communications and data-base management; however, a good bit of time is also committed to process management. These are the times that we deal with here. Virtually all of the process management considerations have been touched on in previous chapters, so most of what we will discuss here will be in the nature of a review and consolidation of this knowledge. The delay time imposed by an application process on a transaction is only partly due to its actual processing time. This delay time must also include

1. The queue delay incurred by the transaction as it waits in line for the process.
2. The dispatch time incurred while the process is waiting for the processor.
3. The processing time of the process itself.
4. The contention for the processor with application activities of higher priority.
5. The contention for the processor with the operating system as it handles interrupts and other system activities.
6. The messaging time required to communicate with other processes.
Overview

From a fundamental conceptual viewpoint, the process environment as described above is shown in Figure 8-1. We view a process here in the simplest of terms to obtain an overview tying in the above concepts as a unified whole. A message enters the process's message queue and waits in that queue for a time t_q until it reaches the head of the queue. The process receives this message, processes it, and passes it on to another process. Figure 8-1 shows a process running in a processor. The process has an input queue that receives messages at a rate of R and processes them with an average service time of t_p. As the process completes a transaction, it passes it on to another process via an interprocess message requiring a system time of t_ipm. Once the process has processed a transaction, it relinquishes control of the processor and waits for the next transaction. It then gets back in line with other processes at its priority and waits for the processor so that it can service this next transaction. This is shown by the process being an item waiting in a greater queue, the processor queue. The amount of time that the process must wait in this queue is its dispatch time, t_d. Note that t_q represents the time spent by a message in a process queue; t_d represents the time spent by a process in the processor queue. As shown in Figure 8-1, the process, once running, does not have the processor all to itself. For one thing, the operating system consumes a portion of the processor capacity
[Figure 8-1  The process environment: messages queue to the process, while the process itself waits in the processor queue for its dispatch time t_d.]
as it handles I/O, communication with other processes, timer-list management, and so on. The load imposed on the processor by the operating system is L_o. Similarly, higher priority processes may be usurping the processor while the process is trying to run (this is the case of preemptive scheduling). These higher priority processes impose a CPU load of L_h. The process requires t_p time to complete its task. But only (1 - L_o - L_h) of the processor is available to the process, so that in a time t'_p, only t'_p(1 - L_o - L_h) time is used on behalf of the process. Therefore, our actual processing time, t'_p, once the process is given the processor, is given by

t_p = t'_p(1 - L_o - L_h)

or

t'_p = t_p/(1 - L_o - L_h)    (8-1)

A message arriving at the head of the process's message queue must wait first for a time, t_d, for the process to be dispatched and then for the processing time, t'_p. Thus, the service time, t_s, so far as a message is concerned, is

t_s = t_d + t'_p    (8-2)
Equation 8-2 represents the effective processing time, or service time, that a message
waiting in the process's message queue will experience. Before being processed, the message must wait in this queue for a time t_q. Since transactions are being received by this process at a rate of R transactions per second, the load on the process is L = Rt_s. Using the M/M/1 model, the waiting time t_q is

t_q = [L/(1 - L)]t_s = [Rt_s/(1 - Rt_s)]t_s    (8-3)

The total delay time through the server, t_ds, is

t_ds = t_q + t_s = t_s/(1 - Rt_s)    (8-4)
The dispatch time, t_d, is the time the process must wait for the processor while processes of equal priority ahead of it in the processor queue are being serviced. In Appendix 6, we point out that the M/M/1 model is inappropriate for the calculation of process dispatch time if any one process accounts for a substantial portion of processor time. The M/M/1 model will lead to arbitrarily large processor queues at high loads, but we know that the processor queue length cannot exceed m - 1 if there are m processes in the system. An approximation which is suggested in Appendix 6 is to simply exclude the effect of the arriving process when calculating the length of the processor queue which it will see, since it will never have to wait for itself. Let the total arrival rate of messages to be serviced by processes at this priority be R_p, and let the average processing time of all messages at this priority be t̄_p. Then the load L_p imposed upon the processor by all processes at priority p except for the process being considered is

L_p = R_p t̄_p - Rt_p    (8-5)

We exclude the load of the process whose dispatch time we are considering, as discussed above. From equation 6-32, the dispatch time, t_d, for our process is

t_d = (L_p + L_o + L_h)t' / [(1 - L_p - L_o - L_h)(1 - L_o - L_h)]    (8-6)

where

R_p = arrival rate of transactions to processes at priority p.
t̄_p = average service time for all processes at priority p.
L_p = load imposed on the processor by processes at the considered priority, except for the considered process.
t' = service time averaged over all priorities, including the considered priority and higher, but exclusive of the considered process.
To complete its function, the process must send an interprocess message forwarding this transaction to another process. This requires a time t_ipm, which is operating system
time and which typically is not affected by other loads. We assume that it does not add to process service time but rather is treated explicitly. Thus, total service time for the message is

t_ds + t_ipm

This simple model has incorporated all of our above points that affect process performance. These six points are the following:

1. Transaction (message) queue delay is t_q.
2. Dispatch time is t_d.
3. Process time is t_p.
4. Higher priority contention is L_h.
5. Operating system contention is L_o.
6. Messaging time is t_ipm.
As an example of the compounding effects of the process environment on process performance, assume the following parameter values (all are reasonable):

Process time (t_p, t̄_p, t')                       10 msec.
Operating system load (L_o)                        .1
Higher priority load (L_h)                         .4
Interprocess message time (t_ipm)                  2 msec.
Process transaction rate (R)                       15 trans./sec.
Total transaction rate at this priority (R_p)      30 trans./sec.
From equation 8-5, the load at this priority, exclusive of the process being considered, is L_p = .15.

From equation 8-6, the process dispatch time t_d is 37.1 msec. That is to say, once a process has work to do, it must wait an average of 37.1 msec. before it can run. From equation 8-2, a message will require a time t_s of 57.1 msec. to be processed once it arrives at the head of the message queue. This time comprises 37.1 msec. of dispatch time waiting for the processor plus 20 msec. of apparent processing time. From equation 8-3, the time, t_q, that a message waits in the message queue is 342.9 msec. From equation 8-4, the total delay time t_ds for a message, from the time it arrives at the process to the time that it is processed, is 400 msec. Adding interprocess message time gives a total processing delay of 402 msec. All this for a processing time of only 10 msec.!
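The chain of results above follows mechanically from equations 8-1 through 8-6; a sketch with the example's parameters:

```python
# Process delay example, following equations 8-1 through 8-6.
# Times are in msec; rates are converted to transactions per msec.
t_p = 10.0              # process time (also t-bar_p and t' in this example)
L_o = 0.1               # operating system load
L_h = 0.4               # higher priority load
R = 15 / 1000.0         # this process's transaction rate
R_p = 30 / 1000.0       # total transaction rate at this priority

L_p = R_p * t_p - R * t_p              # eq. 8-5: priority load excluding this process
t_p_eff = t_p / (1 - L_o - L_h)        # eq. 8-1: apparent processing time, 20 msec
t_d = ((L_p + L_o + L_h) * t_p
       / ((1 - L_p - L_o - L_h) * (1 - L_o - L_h)))   # eq. 8-6: dispatch time
t_s = t_d + t_p_eff                    # eq. 8-2: service time seen by a message
t_ds = t_s / (1 - R * t_s)             # eq. 8-4: total delay through the process

print(f"L_p={L_p:.2f}  t_d={t_d:.1f}  t_s={t_s:.1f}  t_ds={t_ds:.0f} msec")
```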
To obtain a feel for the cause of this apparent disaster, let us expand equation 8-4 for t_ds by substituting equation 8-2 for t_s. Using equations 8-1 and 8-6, we first write t_s as

t_s = Lt/[(1 - L)(1 - L_h')] + t/(1 - L_h') = t/[(1 - L)(1 - L_h')]

where we have substituted

t = t' = t_p = 10 msec.
L = L_p + L_o + L_h = .65
L_h' = L_o + L_h = .5

Then

t_ds = {t/[(1 - L)(1 - L_h')]} / {1 - Rt/[(1 - L)(1 - L_h')]} = [t/(1 - L_h')] / {1 - [L + Rt/(1 - L_h')]}
We have in effect a server with a service time of 20 msec. (OK) which is loaded 95 percent (awful!). Note the effect of priority service. If all activities were at the same priority, then L_h = 0, and t_ds becomes that for a server with a service time of 10 msec. which is loaded 80 percent (as we would intuitively expect). Consequently, t_ds would be 50 msec. instead of 400 msec. Thus, prioritized service can wreak havoc in a heavily loaded system (the results are more reasonable for lightly loaded systems). In effect, we have seen that the service time of a low priority process can be increased significantly by higher priority activity, which results in a commensurate increase in process load and in a possible dramatic increase in the delay time through the process. One should approach the design of a prioritized system with great caution.

Relative to the effect of process environment, consider the following change to our process structure. If the process were allowed to service all messages in its queue, rather than just one message, before relinquishing the processor, several dispatch times would be saved. A dispatch would be required only if the queue became empty, that is, for only (1 - Rt_s) messages (a message will find the queue idle 1 - Rt_s of the time). One might expect this to significantly improve performance. Accounting for this change, equation 8-2 is modified to give a t_s of

t_s = (1 - Rt_s)t_d + t_p/(1 - L_o - L_h)

or

t_s = [t_d + t_p/(1 - L_o - L_h)] / (1 + Rt_d)
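Evaluating the modified t_s with the same parameter values shows the improvement quantified below (a sketch; t_d is unchanged at 37.1 msec from the earlier example):

```python
# Batched message service: the process drains its whole queue per dispatch, so
# a dispatch is needed only when the queue empties.  Equation 8-2 becomes
# t_s = (t_d + t_p/(1 - L_o - L_h)) / (1 + R * t_d).  Times in msec.
t_p, L_o, L_h = 10.0, 0.1, 0.4
R = 15 / 1000.0                  # transactions per msec
t_d = 6.5 / ((1 - 0.65) * 0.5)  # dispatch time from the earlier example, 37.1 msec

t_s = (t_d + t_p / (1 - L_o - L_h)) / (1 + R * t_d)   # 36.7 msec
t_ds = t_s / (1 - R * t_s)                            # 81.6 msec, versus 400 before
print(f"t_s = {t_s:.1f} msec, t_ds = {t_ds:.1f} msec")
```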
This results in a t_s of 36.7 msec. for the above case and in a processing delay t_ds of 81.6 msec. rather than 400 msec. Quite an improvement, and a further demonstration of the importance of the process environment on performance. Of course, nothing comes for free. The time that this process "owns" the processor is now significantly increased each time it is granted the processor. t' in equation 8-6 for other processes at this priority and lower is increased, and delays through these processes will necessarily increase as a result of their extended dispatch times. This effect is elegantly stated by the Work Conservation Law (see Kleinrock [15]). The weighted sum of the queue waiting times is a constant, given by

Σ_{p=1}^{P} L_p T_qp = L_t T_o/(1 - L_t)

where

L_p = server load at priority p.
T_qp = queue waiting time at priority p.
L_t = total load on the server.
T_o = time to complete the service of the current item when a new item arrives at the queue.

This is formally proved for nonpreemptive systems. Thus, if we improve the level of service for one class of items, others will surely suffer.
Process Time

There is not a great deal that can be said analytically about process time. Before a program is written, it is difficult to estimate process time except from general experience with similar systems. After a program is running, average processing time and all manner of dispersion measures can be obtained by the various performance measuring tools often provided with the system or by instrumenting the process itself. Seldom can these values be deduced analytically. However, their accuracy has a strong impact on performance analysis.

So how do we make performance predictions on a machine that hasn't been built yet, much less programmed? Or on an application that is currently being programmed? Do we pack our bags and give up? Or once again, do we invoke our cloak of devout imperfectionism and give it our best shot?

It has been the author's practice, based on experience with several systems, to use the following process times if no better information is available. They are based on a 32-bit 1 MIPS processor and should be adjusted up or down accordingly. They should also be appropriately adjusted or replaced based on the user's own knowledge and experience.

Function                        Process time (msec.)
Communications (per block)               5
Application (per file call)              5
File manager (per operation)            35
To these values should be added the process's context-switching time if significant (the time it takes for the operating system to switch processes). For instance, consider a process which reads a message (1 block), makes three file calls, and returns a response (1 block). Assume context-switching time is 2 msec. The process must be given control of the processor 4 times, once to read the incoming message and once at the completion of each file call. Total processing time is therefore 2 × 5 + 3 × 5 + 4 × 2 = 33 msec., or 33/4 = 8.25 msec. per dispatch.
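This bookkeeping is easy to mechanize. The sketch below reproduces the example's arithmetic; the constants come from the table above, with the 2-msec. context switch assumed as in the example.

```python
# Rule-of-thumb per-dispatch times (msec.) from the table above.
COMM_PER_BLOCK = 5.0
APP_PER_FILE_CALL = 5.0
CONTEXT_SWITCH = 2.0   # assumed, as in the example

def process_time(blocks, file_calls):
    """Total process time and average time per dispatch."""
    dispatches = 1 + file_calls   # one to read the message, one per file call
    total = (blocks * COMM_PER_BLOCK
             + file_calls * APP_PER_FILE_CALL
             + dispatches * CONTEXT_SWITCH)
    return total, total / dispatches

total, per_dispatch = process_time(blocks=2, file_calls=3)
print(total, per_dispatch)  # 33.0 8.25
```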
266
Application Environment
Chap. 8
At least these values provide a reasonable starting point. As real values are obtained, they should replace the suggested values, and the model should then be reevaluated.

Dispatch Time

The dispatch time for a process is the time it must wait in line to obtain access to the processor, i.e., its waiting time on the ready list. Dispatch time has been thoroughly discussed in chapter 6 in the section entitled "Task Dispatching." Dispatch time expressions are given for single processor and multiprocessor environments and for preemptive as well as nonpreemptive schedulers.

As a summary of that section, dispatch time is viewed as the time a process must wait in a queue (the ready list) before it has access to the processor. The service time for items in the queue is the average time per dispatch that processes in that queue will consume, i.e., the average amount of time that these processes will be active once given the processor. This is a real time, calculated from the actual average CPU time consumed by these processes and adjusted for operating system and higher priority process activity.

The queuing models used are M/M/1 for a single processor environment and M/M/c for a multiprocessor environment. This assumes that process times are exponentially distributed and that arrivals to the ready list are random. It also assumes that the number of processes in the system is much greater than the ready list's average length. If the number of processes is not large, then the determination of dispatch time requires an iterative calculation, as described in Appendix 6. To avoid this complex calculation, a useful approximation is to simply ignore the impact of a process on the processor's dispatch queue when calculating its dispatch time. This approximation is evaluated in Appendix 6.

In order to calculate dispatch time, the performance analyst must be able to estimate the dispatch rate and average process time per dispatch for each priority level. He must also have a feel for the overhead imposed by the operating system.
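The two dispatch-time models summarized above can be sketched as follows; the M/M/c wait uses the Erlang C formula. The arrival rate and service time are assumed for illustration.

```python
import math

def mm1_wait(rate, t):
    """Mean ready-list (dispatch) wait for an M/M/1 queue."""
    load = rate * t
    return load * t / (1.0 - load)

def mmc_wait(rate, t, c):
    """Mean queue wait for an M/M/c queue, via the Erlang C formula."""
    a = rate * t    # offered load in erlangs
    rho = a / c     # per-server load
    tail = a**c / (math.factorial(c) * (1.0 - rho))
    erlang_c = tail / (sum(a**k / math.factorial(k) for k in range(c)) + tail)
    return erlang_c * t / (c * (1.0 - rho))

# Assumed numbers: 10-msec. average dispatches at 80 percent per-server load.
print(round(mm1_wait(0.08, 10.0), 2))     # 40.0
print(round(mmc_wait(0.16, 10.0, 2), 2))  # 17.78
```

Note that two processors sharing the combined load wait less than one processor at the same per-server load, as the chapter 6 discussion would predict.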
A process is affected by higher priority processes, since these steal processing capacity from all lower priorities. It is affected by processes at its own priority, since it must compete with these processes for CPU time. It may also be affected by lower priority processes if it is running with a nonpreemptive scheduler, as it may have to wait for a lower priority process to complete before it can be given the CPU.

The effects due to processes at the same priority and at lower priorities are dispatching problems and are covered in the previous section and in chapter 6 under "Task Dispatching." Higher priority tasks not only slow down dispatching, as described previously, but also slow down the process itself if the scheduler is preemptive. This effect is taken into account in the queuing model for preemptive priority systems given in chapter 4 (the M/M/1/∞/∞/PP model) and in the above example. In effect, the task processing time is increased by 1/(1 - Lh), where Lh is the load imposed by higher priority tasks.

Note that this is true only for preemptive schedulers. A nonpreemptive scheduler will cause a delay in the dispatching of a process if a lower priority process is currently running, since higher priority requests will not interrupt a process once it is scheduled. A process waiting in a processor queue must wait for the processing of all processes of equal priority that arrived earlier. It must also wait for all higher priority processes to be processed, regardless of when they come in. This latter delay takes the same amount of time regardless of whether the execution of lower priority processes is or is not interrupted.
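The preemptive stretch factor is a one-liner; the numbers below are illustrative.

```python
def preemptive_process_time(tp, lh):
    """Real time consumed by a process of CPU time tp when a preemptive
    scheduler gives higher-priority load Lh first claim on the processor."""
    return tp / (1.0 - lh)

# A 5-msec. process under 50 percent higher-priority load takes 10 msec.
print(preemptive_process_time(5.0, 0.5))  # 10.0
```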
[Figure: (a) master/slave synchronization block; (b) synchronization sequence: master and slave exchange requests and responses through shared memory, steps (1) through (9).]
When the slave (which is processing responses in a similar manner) indicates that it has the response (8), and if both responses are identical (at least insofar as having equal message lengths, which would distinguish between success and failure), then the master will return the response to the Aquarius (9).
3.4 Shared Memory

All communication traffic between the Aquarius terminals and the file manager is via the shared memory, under control of the AI. Shared memory is organized into four areas:
Sec. 3
System Description
335
• Header section.
• Disk controller buffers.
• Network buffers.
• AI buffers.

The header section contains shared memory control information and one 40-byte short buffer per line, which may hold an acknowledgement message to be returned to the terminal. The disk controller buffers hold responses to be returned to the terminals. As mentioned earlier, an acknowledgement is piggybacked onto a response if a response is available. The network buffers support SynNet, the local area network which can interconnect Syntrex products. SynNet is not a subject of this study. The AI buffers are used to receive requests from the terminals.

All buffers are 560 bytes in length (of which 520 bytes are data). The number of buffers varies with the number of terminals, but a typical configuration for a 14-terminal system would provide 20 disk controller buffers, 49 AI buffers, and 30 network (NI) buffers for a SynNet system. Three of the disk controller buffers are reserved for emergency use to break deadlocks that can occur on long read operations (a read of up to five contiguous disk blocks, used primarily for program download purposes). Such a deadlock can occur if multiple terminals request program downloads simultaneously, and the two sides of Gemini fill these buffers in different order so that no request is completed by both sides when the buffers become full (the order of long reads is not preserved by the file manager).

The number of AI buffers is approximately three per line. This allows some degree of look-ahead for a terminal in that one request can be acknowledged and can be queued to the file manager while the next is being received. Should a message be received when all AI buffers are full, it is discarded and must be retransmitted. Also, should the disk controller buffers become full (not in a deadlock situation), file managers queue up and wait for a free block.
3.5 File Manager

The file manager processes all requests from the terminals, returning the appropriate responses. As the AI detects requests in shared memory that have been completed by both sides of Gemini, those requests are queued to the file manager (as described above, the slave will queue its requests only after being notified that the master has done so).

As shown in Figure 3-2, the file manager is run in a multithreaded configuration in that there are several identical file managers running in the system (currently, five copies run simultaneously). One is designated as the main file manager; it is this process that manages the queue of requests in the shared memory. As it retrieves requests from the queue, it decides whether to process that request directly or route it to one of its subordinate file managers.
336
A Case Study
Chap. 11
The routing algorithm is based on classifying all requests into three classes:

a. Gets. All Gets (requests for data) except for long reads are executed by the main file manager.

b. Synchronized requests. These are requests that must be executed in order, such as opens, closes, and deletes. Each is handed to a free subordinate file manager, and these subordinate threads are queued (if there are more than one) so that only one executes at a time. Executions are in order.

c. All other requests. All other requests are handed to a free subordinate file manager.

If all subordinate file managers are busy, the main file manager is stalled. It cannot access the next request, as it may not be able to deal with it. It cannot even "peek" at the next request to see if it is a Get which it could execute.

In most cases, a subordinate will return its response (usually just a completion status) to the main file manager, which will then return it to the terminal via shared memory. However, in the case of long reads in which significant data is returned, the subordinate file manager processing that long read will return the data directly and will notify the main file manager when it has completed.

Each file manager executes its request by issuing a series of read block and write block commands to disk as necessary. These are independent commands so that no one file manager can seize the disk for more than a block read or write time. All file managers have equal priority for disk accesses.

The disk system includes an 80-block cache memory managed by an LRU (least recently used) algorithm. Certain operations effectively bypass cache, such as long reads, as it is unlikely that these operations would benefit from cache. In these cases, a cache block is marked as a candidate for immediate reuse.
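The three-way routing rule can be sketched as follows. The request classes come from the description above, but the type and function names are invented for illustration; the actual Syntrex interfaces are not shown in the text.

```python
from dataclasses import dataclass

SYNCHRONIZED = {"open", "close", "delete"}  # executed strictly in order

@dataclass
class Request:
    kind: str
    long_read: bool = False

def route(request, free_subordinates):
    """Pick an executor: 'main' for ordinary Gets, a free subordinate
    otherwise, or None if the main file manager must stall."""
    if request.kind == "get" and not request.long_read:
        return "main"
    if not free_subordinates:
        return None   # stalled; cannot even peek at the next request
    return free_subordinates.pop()

workers = ["sub1", "sub2", "sub3", "sub4"]
print(route(Request("get"), workers))             # main
print(route(Request("open"), workers))            # sub4
print(route(Request("get", long_read=True), []))  # None
```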
The Syntrex file system comprises a hierarchical structure of directories which provide unique paths to files. A document is made up of a set of files. All files and directories (which are actually files, as will be seen later) comprise a series of 512-byte sectors (or blocks) on disk.

Files are organized into a document via a directory, as shown in Figure 3-4a. The directory is a named set of sectors, each of which can contain up to 15 file names. Thus, if the directory contains 15 or fewer files, it is made up of one sector; for up to 30 files, the directory requires two sectors. Sectors continue to be added in this way as necessary.

A file name in a directory comprises the name of the file to which it points and a physical disk sector address (a block pointer), which points to the actual file (Figure 3-4b). If the file contains less than 512 characters, then it is contained in this single block. Otherwise, this block contains up to 64 pointers to 64 other blocks. In this case, the block is known as an indirect block.
[Figure 3-4 Document structure: (a) document directory, a named directory pointing to the document's files; (b) file structure, a directory block holding up to 15 file names, indirect blocks holding up to 64 block pointers each, and 512-byte text blocks.]
An indirect block may point to up to 64 text blocks containing the file data or to another 64 indirect blocks. Figure 3-4b shows a directory entry pointing to an indirect block, which points to another set of indirect blocks, which point to the set of text blocks. Thus, a file with no indirect levels can contain 512 bytes. One indirect level supports 64 × 512 = 32K bytes; two indirect levels support 64 × 512², or 16 megabytes (about 5000 to 8000 pages).

The above description of the structure of a text file as a tree structure also applies to directories. A directory is just another file in which the text is a series of file names, up to 15 per block. Thus, if the directory shown in Figure 3-4a contained 100 file name entries, it would actually comprise an indirect block, which would then point to 7 text blocks, each of which could hold 15 file names. Its indirect block would be pointed to by a file name in the next higher directory, and so on.
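The sector arithmetic above is easy to sanity-check. The helper functions below are illustrative; the constants 15, 64, and 512 come from the text.

```python
import math

NAMES_PER_SECTOR = 15        # file names per directory sector
POINTERS_PER_INDIRECT = 64   # block pointers per indirect block
SECTOR_BYTES = 512

def directory_text_sectors(n_entries):
    """Sectors of file names needed by a directory."""
    return math.ceil(n_entries / NAMES_PER_SECTOR)

def max_file_bytes(indirect_levels):
    """File capacity reachable through the given number of indirect levels."""
    return POINTERS_PER_INDIRECT**indirect_levels * SECTOR_BYTES

print(directory_text_sectors(100))  # 7 sectors, as in the text's example
print(max_file_bytes(0))            # 512
print(max_file_bytes(1))            # 32768
```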
The transactions to be considered for this analysis of Gemini are the common set of editing functions, which include

Index Scan
Open/Close Document
Attach Copy
Physical Copy
Go To Page
Scroll
Delete/Insert
Cut
Paste
Insert Footnote
Add Text Attribute
Manual Hyphenation
Paginate
Print

The disk and processor activity for these transactions has been analyzed in reference 2.1, listed earlier in the chapter. They are summarized in Tables 4-1 through 4-3 for four classes of activity:
"Ii
= number of traosactions for edit function j.
ngj
= number of Get CQ1DII18Dds for edit function j.
TI4f
= number of disk accesses required for edit function j.
n;q =number of disk: cache accesses for edit function j. Tbese terms represeat the sigDificant IeSUlts of the traasactioa. model and are those that are IeqUired by the usage scenario developed in Section 9. The terms in Tables 4-1 through 4-3 are defi.neci in Table 8-1.
...,1
TABLE 4-1. EDIT FUNCTION TRANSACTION ACTIVITY

[Expressions for ngj and ntj for edit functions j = 1 through 4: Index Scan, Open/Close Document, Attach Copy, and Physical Copy.]
Sec. 4
Transaction Model
339
TABLE 4-1. EDIT FUNCTION TRANSACTION ACTIVITY (continued)

[Expressions for ngj and ntj for edit functions j = 5 through 14: Go To Page, Scroll, Delete/Insert, Cut, Paste, Insert Footnote, Add Text Attribute, Manual Hyphenation, Paginate, and Print.]
TABLE 4-2. EDIT FUNCTION DISK ACTIVITY

[Expressions for ndj, the disk accesses required for each edit function j.]
[Model relations: the recursion for p(n) with p(1) = 1, and the definition of the maximum request rate R satisfying the system's capacity constraint.]
q' = q - 1 + r   if q > 0   (A3-1)
q' = r           if q = 0   (A3-2)

That is, if the first item left q items behind, then q' is q reduced by the leaving of the next item and increased by the arrival of r items. If the first item left no items behind (q = 0), then q' is equal to the number of newly arrived items, r. Note intuitively that r is indicative of the load on the server. If r = 1 item arrives during each service time t, the load on the server will be 1.
Khintchine-Pollaczek Equation for M/G/' Queuing Systems
384
Appendix 3
Equations A3-1 and A3-2 can be combined as follows:

q' = q - 1 + r + j   (A3-3)

where

j = 0 if q > 0   (A3-4)
j = 1 if q = 0   (A3-5)
Taking the expected values of the variables in equation A3-3, we have

E(q') = E(q) - 1 + E(r) + E(j)   (A3-6)

Since the system is in equilibrium, E(q') = E(q); and therefore, from equation A3-6,

E(j) = 1 - E(r)   (A3-7)
Let us now assume that arrivals to the queue are random, that is, they are generated by a Poisson process and arrive at an average rate of R items per second. From chapter 4, equations 4-60 and 4-61, we know that the mean r̄ and second moment r̄² of r items arriving randomly in a period of t seconds are

r̄ = Rt   (A3-8)

r̄² = (Rt)² + Rt   (A3-9)

Averaging r̄ over time t, we have

E(r) = E(Rt) = RT = L   (A3-10)

where we use T to denote the expected value of t and L to represent the server load, RT. T is the average service time of the server. Using equation 4-31, we also can average r̄² over time:

E(r²) = E[(Rt)²] + E(Rt) = R²var(t) + E²(Rt) + E(Rt)

or

E(r²) = R²var(t) + L² + L   (A3-11)
Let us now square equation A3-3. This gives

q'² = q² - 2q + 2qr + 2qj + 1 - 2r - 2j + r² + 2rj + j²   (A3-12)

We note the following concerning j:

j² = j          from equations A3-4 and A3-5
q(1 - j) = q    from equations A3-4 and A3-5
E(j) = 1 - L    from equations A3-7 and A3-10
We also note that r is independent of q. Also, r is independent of j, since j is a function only of q. Therefore, whenever we take the expected value of rq or rj, the expected value of the product is the product of the expected values.

Taking the expected values of the terms in equation A3-12 and applying the above observations, one obtains

E(q'²) = E(q²) - 2E(q) + 2E(q)E(r) + 1 - 2E(r) - 2(1 - L) + E(r²) + 2(1 - L)E(r) + (1 - L)
Since the queue is in equilibrium, E(q'²) = E(q²). Eliminating these terms and substituting the values for E(r) and E(r²) from equations A3-10 and A3-11 gives

0 = -2E(q)(1 - L) + 2L - L² + R²var(t)

Solving for E(q),

E(q) = [2L - L² + R²var(t)] / [2(1 - L)]
Denoting the expected value E(q) by Q, and noting that R = L/T, this can be rewritten as

Q = [L/(1 - L)] [1 - (L/2)(1 - var(t)/T²)]   (A3-13)

We now define the distribution coefficient, k, as

k = (1/2)(1 + var(t)/T²)   (A3-14)

Equation A3-13 then can be expressed as

Q = [L/(1 - L)] [1 - (1 - k)L]   (A3-15)
Equation A3-15 is the same as the expression for Q given by equation 4-6, which was derived by a less rigorous but more intuitive approach. Equation A3-14 is that reported as equation 4-16. The relations for W, Tq, and Td now can be determined from the general expressions given by equations 4-5, 4-3, and 4-7, respectively.

Note that these equations were derived only under the following assumptions:

• Arrivals to the queue are Poisson-distributed.
• The queue is in equilibrium.
• Service time is independent of arrival time or any other characteristic of the items being serviced.

Therefore, these equations apply for any distribution of service times and for any servicing order of the queue (so long as an item is not selected for service based on one of its characteristics, such as its service time). Thus, the solution is general for the M/G/1/∞/∞ case of queuing systems.
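Equations A3-14 and A3-15 can be evaluated directly. As a check, exponential service (var(t) = T², so k = 1) reduces to the familiar M/M/1 value L/(1 - L), and constant service (var(t) = 0, k = 1/2) gives the smaller M/D/1 queue.

```python
def kp_queue_length(load, var_ratio):
    """Khintchine-Pollaczek mean queue length, equations A3-14 and A3-15.
    load: the server load L; var_ratio: var(t)/T^2."""
    k = 0.5 * (1.0 + var_ratio)
    return (load / (1.0 - load)) * (1.0 - (1.0 - k) * load)

# Exponential service at 80 percent load: the M/M/1 result L/(1 - L).
print(round(kp_queue_length(0.8, 1.0), 6))  # 4.0
# Constant service at the same load: the M/D/1 queue is shorter.
print(round(kp_queue_length(0.8, 0.0), 6))  # 2.4
```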
APPENDIX 4 The Poisson Distribution
In chapter 4 we began the derivation of the Poisson distribution. It was determined that the probability of n items arriving in a time t, p_n(t), was given by the following system of differential-difference equations:

p'_0(t) = -r p_0(t)   (4-57)

p'_n(t) = -r p_n(t) + r p_{n-1}(t)   (4-58)

The following solution to this set of equations is a summary of that found in Saaty [24]. Let us define a generating function P(z,t) such that

P(z,t) = Σ(n=0 to ∞) z^n p_n(t)   (A4-1)
If we differentiate this equation n times with respect to z, we have

∂^n P(z,t)/∂z^n = n! p_n(t) + [(n+1)!/1!] z p_{n+1}(t) + [(n+2)!/2!] z² p_{n+2}(t) + ...

Setting z to zero, we obtain

∂^n P(z,t)/∂z^n = n! p_n(t),  z = 0   (A4-2)
Appendix 4
The Poisson Distribution
387
Thus, by differentiating the generating function P(z,t) n times with respect to z, dividing the result by n!, and setting z = 0, we obtain p_n(t).

Let us now consider a time t as discussed in chapter 4, and assume that i items have arrived in the queue up to time t = 0. That is, by the definition of p_n(t),

p_i(0) = 1
p_n(0) = 0 for n ≠ i

Thus, from equation A4-1, for t = 0,

P(z,0) = z^i p_i(0) = z^i   (A4-3)

Also, if z is set to 1, from equation A4-1,

P(1,t) = Σ(n=0 to ∞) p_n(t) = 1   (A4-4)
Now let us multiply the differential-difference equations 4-57 and 4-58 by z^n, obtaining

z^0 p'_0(t) = -r z^0 p_0(t)
z^n p'_n(t) = -r z^n p_n(t) + r z^n p_{n-1}(t)

If we sum these over all n, we obtain

Σ(n=0 to ∞) z^n p'_n(t) = -r Σ(n=0 to ∞) z^n p_n(t) + r Σ(n=1 to ∞) z^n p_{n-1}(t)   (A4-5)

The left-hand term of this expression is simply ∂P(z,t)/∂t. The first term on the right is -rP(z,t). The second term on the right is

rz p_0(t) + rz² p_1(t) + rz³ p_2(t) + ... = rz[p_0(t) + z p_1(t) + z² p_2(t) + ...] = rzP(z,t)

Thus, equation A4-5 can be written as the linear differential equation

∂P(z,t)/∂t = r(z - 1)P(z,t)   (A4-6)
The solution to this is

P(z,t) = C e^{r(z-1)t}   (A4-7)

which can be verified by substituting P(z,t) from equation A4-7 into both sides of equation A4-6. The value of C depends upon how many items, i, have been received by time t = 0. Let us assume that at t = 0, zero items have been received in the queue (i = 0). In this way, p_n(t) will truly be the probability of receiving n items in the subsequent interval t. From equation A4-3, setting i = 0,

P(z,0) = z^0 = 1
Thus, C = 1 in equation A4-7, and

P(z,t) = e^{r(z-1)t}
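Expanding this generating function in powers of z gives the Poisson probabilities p_n(t) = (rt)^n e^{-rt}/n!. The sketch below checks the expansion numerically against the generating function at an arbitrary point; the parameter values are assumed for illustration.

```python
import math

def poisson_pmf(n, r, t):
    """Coefficient of z^n in P(z,t) = exp(r(z-1)t): the Poisson pmf."""
    return (r * t) ** n * math.exp(-r * t) / math.factorial(n)

def generating_function(z, r, t):
    return math.exp(r * (z - 1.0) * t)

# Arbitrary parameter values; 100 terms is ample for rt = 6.
r, t, z = 3.0, 2.0, 0.7
series = sum(poisson_pmf(n, r, t) * z ** n for n in range(100))
print(abs(series - generating_function(z, r, t)) < 1e-12)  # True
```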