Advances in COMPUTERS, VOLUME 12

Contributors to This Volume

JAMES P. ANDERSON, R. D. BERGERON, G. M. FERRERO diROCCAFERRERA, J. D. GANNON, HARRY B. LINCOLN, JUDITH M. S. PREWITT, DAVID C. ROBERTS, D. P. SHECTER, F. W. TOMPA, A. VAN DAM
Advances in COMPUTERS

EDITED BY
MORRIS RUBINOFF
University of Pennsylvania and Pennsylvania Research Associates, Inc.
Philadelphia, Pennsylvania

VOLUME 12

ACADEMIC PRESS, New York and London, 1972
COPYRIGHT © 1972, BY ACADEMIC PRESS, INC.
ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.

ACADEMIC PRESS, INC.
111 Fifth Avenue, New York, New York 10003

United Kingdom Edition published by
ACADEMIC PRESS, INC. (LONDON) LTD.
24/28 Oval Road, London NW1

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 59-15761

PRINTED IN THE UNITED STATES OF AMERICA
Contents

CONTRIBUTORS ... ix
PREFACE ... xi

Information Security in a Multi-User Computer Environment
James P. Anderson

1. The Computer Security Problem ... 2
2. Techniques of System Access Control ... 10
3. Computer Characteristics Supporting Information Security ... 13
4. Operating System Functions Relating to Information Security ... 18
5. Problems of File Protection ... 24
6. Techniques of File Protection ... 28
7. Techniques of Security Assurance ... 30
8. Communications Problems ... 33
9. Summary ... 35
References ... 35

Managers, Deterministic Models, and Computers
G. M. Ferrero diRoccaferrera

1. Introduction ... 37
2. The System Approach ... 40
3. Management Systems ... 43
4. Management Science ... 46
5. When and How Managers Have to Implement Management Science Models ... 50
6. Will Computers Eliminate Managerial Decision Making? ... 63
References ... 71

Uses of the Computer in Music Composition and Research
Harry B. Lincoln

1. Introduction ... 73
2. Composition of Music by Computer ... 74
3. Music Research Using the Computer ... 88
4. Automated Music Typography for Composition and Research ... 107
References ... 110

File Organization Techniques
David C. Roberts

1. Introduction ... 115
2. Survey of File Organizations ... 116
3. Random File Structures ... 130
4. List File Structures ... 143
5. Tree File Structures ... 151
6. Implementation of File Structures ... 160
References ... 166

Systems Programming Languages
R. D. Bergeron, J. D. Gannon, D. P. Shecter, F. W. Tompa, and A. van Dam

1. Introduction ... 176
2. Criteria for a Systems Programming Language ... 180
3. Specific Constructs ... 192
4. Reviews of Several Systems Programming Languages ... 196
5. Extensibility and Systems Programming ... 235
6. Language for Systems Development ... 239
References ... 283

Parametric and Nonparametric Recognition by Computer: An Application to Leukocyte Image Processing
Judith M. S. Prewitt

1. Introduction ... 285
2. Image Articulation ... 301
3. Image Description ... 321
4. Discrimination: Logic and Rationale ... 327
5. Linear Logic and the Evaluation of Performance: Logic and Rationale ... 338
6. Feature Selection: Logic and Rationale ... 347
7. Experimental Results: Parameter Variability ... 364
8. Experimental Results: Parametric Discrimination Using Decision Theory ... 368
9. Nonparametric Pattern Detection: A Cytotaxonomy for Leukocytes ... 383
10. The Inverse Problem: Human Visual Discrimination Using Computer-Oriented Image Properties ... 393
11. Perspectives on Automatic Microimage Analysis ... 400
12. Summary and Prospectus ... 404
References ... 409

AUTHOR INDEX ... 415
SUBJECT INDEX ... 422
CONTENTS OF PREVIOUS VOLUMES ... 432
Contributors to Volume 12

Numbers in parentheses indicate the pages on which the authors' contributions begin.

JAMES P. ANDERSON, James P. Anderson & Co., Fort Washington, Pennsylvania (1)

R. D. BERGERON, Department of Computer and Information Sciences, Brown University, Providence, Rhode Island (175)

G. M. FERRERO diROCCAFERRERA, Quantitative Methods Department, School of Management, Syracuse University, Syracuse, New York (37)

J. D. GANNON, Department of Computer and Information Sciences, Brown University, Providence, Rhode Island (175)

HARRY B. LINCOLN, Department of Music, State University of New York at Binghamton, Binghamton, New York (73)

JUDITH M. S. PREWITT, Division of Computer Research and Technology, National Institutes of Health, Bethesda, Maryland (286)

DAVID C. ROBERTS, Informatics, Inc., Rockville, Maryland (115)

D. P. SHECTER, Department of Computer and Information Sciences, Brown University, Providence, Rhode Island (175)

F. W. TOMPA, Department of Computer and Information Sciences, Brown University, Providence, Rhode Island (175)

A. VAN DAM, Department of Computer and Information Sciences, Brown University, Providence, Rhode Island (175)
Preface
The past year and a half has witnessed a remarkable revolution in semiconductor electronics, and with it have come commensurately remarkable advances in computers. The success of large-scale semiconductor integration (LSI) has opened the door to the ultimate in low-cost miniaturization, so that all the sequential logic for a desk calculator can be fitted onto a single silicon chip in a conventional transistor case, at tremendous cost reduction. The result has been a rapid growth of the minicomputer market, with complete processing units in the $2000 category, and with corresponding price reductions in the data processing units of larger computer systems. Meanwhile, computer specialists have been busily engaged in producing advances in computers in the systems and applications areas. Computer analysts and programmers are becoming increasingly skilled in prescribing and implementing higher-level languages and complex applications programs, in an ever-widening universe of markets.

The current volume of Advances in Computers reflects these directions of computer growth. Professor van Dam and his coauthors provide a comprehensive analysis of systems programming languages and present criteria for such languages, including PL/I, AED, BLISS, and PL360. They examine the elements of extensibility and describe a language for systems development. A companion article by Dr. David Roberts reviews file organization techniques, covering a broad range of basic file organizations and comparing their relative effectiveness in different data storage application environments.

Professor Harry Lincoln, Professor diRoccaferrera, and Dr. Judith Prewitt relate the computer to the areas of music, management, and image processing, respectively. Professor Lincoln describes how music research is being accomplished with the aid of computers, including music representation, theory, style analysis, and ethnomusicology. And in the same article, he presents the background and future of music composition by computer. Professor diRoccaferrera examines the reasons for the explosive growth of computer application to management problems. The underlying general systems approach is identified, management systems are characterized, and the use of models in management science is presented. The article then addresses itself to the "when and how" of implementing management science models and the persistent question of whether computers will displace or replace human managers. Dr. Prewitt discusses image recognition from her many years of research in the recognition of microscopic components of blood, especially leukocytes. The entire spectrum of the recognition problem is covered, and the progress achieved to date is presented in narrative text, a large number of illustrations, and the underlying mathematics of image recognition.

Dr. James Anderson addresses himself to the important problems of information security in a multi-user environment, a question of growing relevance in time-sharing and remote-terminal computer access. After an enumeration and discussion of the many elements entering into the computer security problem, he delineates characteristics which support systems for rendering data secure. He then presents techniques for file protection in particular and security assurance in general. Complete security requires that the communications links between computers and their remote terminals be protected as well; the vulnerability to wiretapping, and the control measures that may be adopted, are discussed in depth.

It is with deep regret that I bring to your attention the withdrawal of Dr. Franz L. Alt as Coeditor of this series, due to the pressures of his regular duties. Dr. Alt initiated Advances in Computers 13 years ago, and it was only when the third volume was being prepared that I was privileged to join him in this exciting activity. In his characteristically modest and pleasant way, Dr. Alt continually refused to accept the designation of Senior Editor, insisting that we both be designated as editors of equal standing. But he was in fact the primary planner and decision maker in every volume to date, and the Senior Editor in deed, if not in name. I shall miss him.

MORRIS RUBINOFF
Information Security in a Multi-User Computer Environment
JAMES P. ANDERSON
James P. Anderson & Co.
Fort Washington, Pennsylvania
1. The Computer Security Problem ... 2
   1.1 Introduction ... 2
   1.2 Technical Threats to Information ... 2
   1.3 Equipment Protection ... 4
   1.4 Back-up Data ... 5
   1.5 Basic Assumptions of Information Protection ... 5
   1.6 Types of Multi-User Systems ... 7
   1.7 Relation to Privacy Issue ... 9
   1.8 General Requirements for Secure Operation of Multi-User Systems ... 9
2. Techniques of System Access Control ... 10
   2.1 Basic Methods ... 10
   2.2 Considerations in Design of Passwords ... 11
   2.3 Password Distribution ... 12
   2.4 Other Methods of User Authentication ... 13
3. Computer Characteristics Supporting Information Security ... 13
   3.1 Hardware Facilities for Multiprogramming ... 13
   3.2 Program (User) Isolation Mechanisms ... 14
   3.3 Two State Operation ... 15
   3.4 I/O Characteristics ... 16
   3.5 Virtual Machines ... 17
4. Operating System Functions Relating to Information Security ... 18
   4.1 Recognition of Authorized Users ... 18
   4.2 Control of Access to Programs and Data ... 18
   4.3 Common Services ... 19
   4.4 Output Routing ... 20
   4.5 Sources of Security Problems in Contemporary Operating Systems ... 21
   4.6 Security Relationship of Operating System to Hardware System ... 23
5. Problems of File Protection ... 24
   5.1 Basic Problems ... 24
   5.2 Models for Shared Information Processing ... 24
   5.3 Models for Hierarchical Access Control ... 27
6. Techniques of File Protection ... 28
   6.1 OS/360 ... 28
   6.2 File Encryption ... 29
7. Techniques of Security Assurance ... 30
   7.1 Pseudo-User ... 30
   7.2 Audit Trails ... 31
   7.3 Validation of Programs ... 32
8. Communications Problems ... 33
   8.1 Vulnerability to Wiretap ... 33
   8.2 Wiretap Countermeasures ... 33
9. Summary ... 35
References ... 35
1. The Computer Security Problem

1.1 Introduction
The problem of "computer security" ranges from the physical protection of computer equipment to the physical and logical protection of computer-based data. Since the techniques of physical security are reasonably well known and relatively simple to apply, this article stresses the problems and prospects for protecting data or information in multi-user computer environments.

The problem of information protection in computer systems has only recently received significant attention. There are several reasons for this.

(a) Until recently, there were fewer systems and less information in computer-based systems.
(b) Earlier systems were mono-programmed, and therefore simple physical security techniques were (are) sufficient to protect the data.
(c) Growth of on-line storage of large data bases has concentrated significant data resources in one place.
(d) Growth of multiprogrammed operation (time-shared, multi-access, etc.) permits ready access to large sets of data.
(e) Development of on-line and batch data retrieval systems to exploit the collected data has made it easier to access and manipulate, and has increased the number of people who can access the data.

All of these factors have converged in the past few years, making it possible and profitable to penetrate systems for purposes of recovery or manipulation of information.

1.2 Technical Threats to Information
Various writers have categorized the threats to on-line data in time-shared systems. Petersen and Turn [18a] distinguish between accidental and deliberately induced disclosure, with the latter further broken down into passive and active methods. The former is wiretapping; the latter, all of the perversions of normal access procedures that give a user unauthorized access. In another paper, Ware [23] elaborates on accidental disclosures that can occur, citing such events as failure of computer circuits for bounds registers or memory protection, failure of the system/user mode instruction separation, and the like. He further cites the hazard (to information) that exists in the radiation of electromagnetic energy from high-speed electronic circuits (thus facilitating eavesdropping by third parties), and the vulnerability of systems to various forms of manipulation by operators and other operations staff personnel. It is interesting to note that nearly all of the writers on this subject take as an initial condition the assumption of the reliability of the operations staff, a reflection of the seriousness of this threat and of the extreme difficulty (or near impossibility) of providing adequate protection to a system from this source of penetration.

A number of writers discuss the problem of "software failure" as a threat to information security. This unfortunate choice of terms is misleading, since it conveys to the uninformed the incorrect notion that software is something that wears out, and does not properly reflect the very real danger that incorrect or incomplete design of the operating system and its components poses both to information security and to proper operation of a system. The term is often used to convey the fact that programs will not operate properly in the face of hardware malfunction or failure, while ignoring or only weakly implying the effects of incomplete design. While the problem is not limited to the operating system alone, we note that the possibility of incomplete design is one of the major problems of information security in multi-user systems. One of the reasons for this condition is that the designers of operating systems have heretofore been concerned with protecting the operating system from the effects of user programming errors that arise in a benign environment. As we will see, the problem of information security has received only rudimentary and incidental attention.

Finally, a number of writers note the wiretapping vulnerability of the communications between a remote terminal and the resource-sharing computer. Because of the other vulnerabilities of systems, this problem has not been of major importance to date. A related problem, that of unauthorized access to dial-up time-shared systems (because of loss of user credentials), has been noted in the trade press.

The focus of this article is on the threat to information posed by programmers who can gain access to a multi-user system and exploit known or suspected weaknesses in the operating system. This is not to minimize the seriousness of the other threats in specific situations; rather, it directs attention to the major source of security problems of multi-user systems.
The essence of the computer security problem becomes clear when one considers that programs and data of different users share primary storage simultaneously in multi-user systems that rely on programming to maintain their separation. Furthermore, the situation is aggravated by the fact that the user of a resource-sharing system must often program the system to accomplish his work. In this environment, it is necessary to prove that a given system is proof against attack (i.e., hostile penetration), and that it will not commit unanticipated disclosure. Of the two, it is easier to demonstrate that a system is proof against attacks of various kinds than it is to prove that a system will not commit unanticipated disclosure. The reason for this is that the former concerns demonstrating that the design is sufficient, while the latter involves proving that something (e.g., a hardware failure) will not happen.

Because of the wide diversity of possible systems and environments for their use, no single set of measures can be specified that will assure the security of a resource-sharing system. Each system must be examined in its setting and a specific set of security techniques applied. Because of the evolving nature of nearly all resource-sharing systems (new applications, different user services, etc.), the process of security assurance of a system is a continuing one.

1.3 Equipment Protection
Clearly related to the problem of information protection is the protection of the vehicle for processing information: the computer system and its subsystems. Unless one is sure of the physical integrity of the hardware of a system, many of the other measures that could be taken are meaningless. Further, the principal objective of a penetrator may be to deny the legitimate users (owners) of the equipment the use of that equipment. The problem of sabotage is not the only reason for providing equipment protection. A skilled penetrator could induce modifications to the hardware that would circumvent the hardware aids built into the system to maintain program separation, thus exposing the information to unauthorized disclosure.

The major emphasis in the literature is on fire protection, perhaps the largest single hazard to computer systems. To aid in planning for the physical protection of computer equipment against fire hazards, the National Fire Protection Association publishes standards applicable to data processing installations [16]. Along with fire protection, the next major problem is to provide proper controls to limit physical access to a computer operation to the operations staff only.
Computer rooms and data processing centers should be located in interior portions of a building, with single entrances equipped with locked doors. (Multiple emergency exits can be provided with alarms on the doors to signal when the door is opened.) Access to the computer room can be limited by use of magnetic card locks or multiple push-button locks, of which there are a number available commercially [3].

1.4 Back-up Data
The information in files must be recognized as an asset requiring protection. As a consequence, operators of data processing centers must take prudent action to prevent the loss of this data. This form of protection is common practice in most businesses, since the data is often critical to the continued operation of a business.

The frequency of taking back-up copies is a function of the activity of the file and how frequently it changes. If the transaction load is small, it may suffice to take a back-up copy of a file once a week (or month) and save the transactions in the interim. When sufficient change has taken place, a copy of the current file can be taken, and the accumulated transactions discarded. When the transaction rate is high, the "old master" can be saved (with the transactions) until a new run is made. This is practical mostly for tape files. The purpose of saving transactions is to minimize the time needed to reconstruct the file in the event it is destroyed. With files that are updated in place (direct access files), the saving of transactions may be the only feasible way to reconstruct the file.

In order to determine the magnitude of a security program, it is desirable to categorize the data (files) of a company as to the degree of importance it has. As an initial estimate, one can place data in the following categories:

(a) Vital - cannot operate the business without it.
(b) Essential - very difficult to operate without it, or proprietary.
(c) Important - difficult to operate without it, or costs increase without it.
(d) Convenient - simplifies some jobs.

Examples of vital data might include a grocery inventory file for a supermarket chain, the file of sold and available space for a reservation service, the list of stocks held for each customer in a brokerage house, etc.; in fact, any data or information that is the "product" of the business. Essential information includes that required to be reported by regulatory or taxing authorities, payroll files, accounts payable, accounts receivable, and other accounting files. Important data might include files supporting
production scheduling, personnel files, and files containing summary data from other accounting and business systems. Convenient data is what is left after all other data is categorized. In some installations, a large amount of convenient data is not surprising, and it may be a good point at which to review whether continued production of such data is justified.

Factors to be considered in categorizing data include:

(a) What role the data plays in the business.
(b) What its replacement cost is.
(c) What its reconstruction cost is.
(d) What the value of the data is to someone else.
(e) What alternatives there are in an emergency.

For the most part these factors are self-explanatory. The difference between (b) and (c) is that it may be possible to reconstruct lost data from back-up files or current source documents; the replacement cost is the total cost to redevelop a file from scratch if all data (and back-up) is lost. While data back-up is not the primary focus of information protection, it is important to the overall security afforded a system, and it constitutes additional files that require the same degree of protection as the primary files.

The use of "counter service" for programmers and other "local" users of a system cuts down the traffic in a computer room, and further provides a degree of control on the distribution of reports and the acceptance of jobs. With modern operating systems, there is no valid reason for any programmers to be present at the computer during the execution of their jobs. Supplementing a single access point for the computer room proper, it is highly desirable to control access to the data processing center as a whole in order to minimize the risk of unauthorized persons obtaining physical access to the computer. Techniques for such control include the use of guards, picture badges (color coded to indicate type of access: computer room, keypunch areas, job counter, etc.), closed circuit TV, and the like.

While only slightly challenging in its own right, equipment protection is the foundation upon which information protection rests. Without assurance of the integrity of the equipment, the balance of the efforts is wasted. In this, all writers are in agreement.

1.5 Basic Assumptions of Information Protection
There are two basic assumptions universally held in considering computer-based information protection: (a) that the physical environment of the computer system is secure, and (b) that the integrity of the operations personnel is above reproach. The importance of these assumptions is obvious, for if the physical
environment of the computer system is not secure, it is possible to steal or otherwise appropriate any information contained on any transportable medium. While it can be argued that such theft would be detected, the housekeeping practices in many computer installations are sufficiently lax that, as long as a tape or disc pack were replaced with another, the theft would probably be put down as an administrative error.

The operations personnel (including systems programmers) constitute a special hazard because of the nature of the access to files and the system required by their jobs. It is quite obvious that an operator can copy any file, list it, or even manipulate information, all with little or no chance of detection in almost any computer installation. In addition to the operators, systems programming personnel responsible for the maintenance of the operating system and the development of applications programs know about security controls installed in systems, and often can create programs operating with such privilege as to circumvent whatever controls may be present in a system.

Peters [17] notes that not all operations personnel require clearance to the highest level needed by a system, provided there is at least one individual on every shift who is cleared and is fully aware of what is required to operate the system appropriately and securely in any circumstance. This observation is important in that, depending on the installation, the cost of clearing all personnel could be prohibitive. Comber [4], in discussing information privacy (see Section 1.7), states the case thus: "Despite all that has been said heretofore, the 'key' to security information rests with individuals who have access to the data system."

1.6 Types of Multi-User Systems
In considering the problems of information security, the conventional classification of systems into "time-sharing," batch, and remote batch is of little value, because it does not suggest clearly enough the degree or scope of the problem. The distinction between single-user-at-a-time systems and multi-user systems is more meaningful. From a security viewpoint, we wish to classify systems as to the degree of direct user control permitted by the system. Differentiating among systems in this way is merely a recognition of the fact that the security threat to a time-shared system is a function of the direct user control possible in the system. Clearly, if a user (at a terminal) cannot exercise direct control over the program(s) he is executing, he is less likely to be able to cause improper operation of the program than a user who has a high degree of direct control.
One can identify a number of points along a spectrum of direct control.

(a) A system in which only specific "canned" programs may be used from a terminal. Airline reservations systems, or the various query systems, are examples. The user's "control" of the programs is limited to supplying parameters.

(b) Systems providing interpretive computing for terminal users. Systems providing BASIC or JOSS type languages are examples. The principal distinction is that although the user may specify in some detail both the functions to be performed and the sequencing desired, he is barred from direct control of the hardware (i.e., from writing instructions that are directly executed by the machine) by the fact that the operations and/or the sequencing between steps is interpreted by another program standing between the user and the hardware of the central processor. Further, interpretive systems isolate users from awareness of memory allocation functions.

(c) Systems that use only approved compilers to produce running code. An outstanding example of this kind of system is the Burroughs B5500, which presents the machine to the users only in terms of the Algol, Fortran, and Cobol compilers.

(d) Systems which permit users at a terminal to write in the machine language of the system and/or exercise direct debugging control at the machine language level. Examples of this kind of use abound, although the machine language is most frequently assembly language, for the obvious reasons.

In practical use, most time-shared systems offer a range of use embracing nearly all of the cases cited above. Depending on the particular circumstances of the using installation(s), an increasing security problem is generally present as the options permitting more direct user control are selected.

From still another security viewpoint, systems can be classified by the type of information they contain, and whether or not the information requires selective sharing. Examples of the selective sharing problem range from the simple case of payroll data that is available on a per-person basis to a payroll clerk, but only in aggregate to cost analysis clerks in production, to a complete management information system that might contain sales and production status data for a product line. The latter requires both hierarchical access (e.g., a sales manager can access all data concerning his territory, while an individual salesman would be restricted to data concerning his own sales and customers) and disjoint access (e.g., a product manager can access all sales data regarding his product lines, but not data on products that are the responsibility of a different product manager), as sketched below.
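The two sharing patterns can be made concrete with a small illustration. The fragment below was written for this discussion only; the record fields, role names, and data are invented, not drawn from any system described in this article.

records = [
    {"salesman": "adams", "territory": "east", "product": "widgets", "amount": 100},
    {"salesman": "baker", "territory": "east", "product": "gadgets", "amount": 250},
    {"salesman": "clark", "territory": "west", "product": "widgets", "amount": 175},
]

def visible_to(user, records):
    # Hierarchical access: a sales manager sees every record in his territory,
    # a salesman only the records that carry his own name.
    if user["role"] == "sales_manager":
        return [r for r in records if r["territory"] == user["territory"]]
    if user["role"] == "salesman":
        return [r for r in records if r["salesman"] == user["name"]]
    # Disjoint access: a product manager sees his product line and nothing else.
    if user["role"] == "product_manager":
        return [r for r in records if r["product"] == user["product"]]
    return []

east_manager = {"role": "sales_manager", "territory": "east"}
print(len(visible_to(east_manager, records)))    # 2 records, both in the east territory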
It is this problem of information sharing in MIS and data management systems that has given rise to a number of proposals and schemes to control the shareability of information, discussed in Sections 5 and 6.

1.7 Relation to Privacy Issue
Much has been written about privacy in relation to computer systems [9, 10, 19b]. Ware [24] makes the distinction between security and privacy on the basis of whether the information in question arises in the military or defense environment, or in industrial or non-defense governmental environments. Comber [4] takes a more sociological view, relating privacy to the notion of personal information. While such distinctions are important in an overall sense, it is sufficient to note that the issue of privacy relates to disclosure policy, regardless of the kind of data or the environment it arises in. Information security techniques are but one of a number of techniques used to aid in the management of a disclosure policy.

Perhaps the best known, and most formally developed, disclosure policy is the scheme of national classifications applied to defense information. In this light, there is no essential difference in objectives between the government official who classifies a document SECRET and the corporate manager who marks a document COMPANY CONFIDENTIAL. They are both using a technique to maintain the privacy of information by labeling the information with an indicator of a particular disclosure policy. The privacy issue often arises in situations involving computer-based information systems because the disclosure policy is not well defined, and because there are often no information security measures applied (or planned) to the data processing systems involved.

1.8 General Requirements for Secure Operation of Multi-User Systems
The very wide variability in the environment, equipment, user populations, and information contained in multi-user systems precludes a simple specification of the administrative and technical measures necessary to provide an adequate level of protection to information in all systems. In spite of this, we can state broadly the minimum requirements that must be met by such systems [20]. These include:

(a) A physically secure environment for the computer and other physical elements of the system, including terminals where present.
(b) Control of access to the system.
(c) An adequate method of internally isolating individual programs (users) simultaneously resident on the system.
(d) An adequate method of limiting access to programs and data files.
With the exception of the physically secure environment for the physical elements of the system, the balance of this paper will examine these requirements and the techniques used to meet them.

2. Techniques of System Access Control

2.1 Basic Methods
After physical security measures, access control techniques are the first line of defense against unauthorized use of a computer facility and the possibility of misappropriation of information contained in a system. By "access," we mean the ability to gain use of the system; we are not dealing with the question of authority to access file data at this point. For many computer systems, "access control" is exercised by a control clerk who accepts jobs for processing on the system and distributes output to a user community. The control is administrative in nature, and may involve personal recognition of the person submitting the job by the clerk before a job is accepted. Whenever the clerk is in doubt about the authority of an individual to use the system, he may verify this authority with an appropriate level of management in his organization.

Where remote access to the information system is provided, as in various forms of time-sharing systems, other methods of access control must be applied. There are basically two techniques used to control remote access to a system. The first of these, called terminal control, limits remote access to a system by limiting physical access to the terminals connected to that system. This method is appropriate to various forms of systems dedicated to a certain function or to a given user population, as might be found in various in-house systems, or in dedicated systems such as airline reservation or stock quotation systems. Terminal control is often used on systems where the terminal is straight-wired (i.e., it is a point-to-point permanent circuit) to the system. Where the group having legitimate access to a given terminal is homogeneous in function, it is possible from a security viewpoint to dispense with individual identification of users, letting the terminal stand as a surrogate for any user who can gain physical access to that terminal. In dedicated systems, adoption of such a policy can dramatically reduce the administrative burden of security maintenance. Adding or deleting a user is simply a matter of granting or revoking authority to use a terminal. The use of this scheme also simplifies the maintenance of file access authority, since the authority is associated with the terminal, which in turn stands for a group of users. The disadvantage of the scheme is that if a given user has
file access authorities not in common with the group represented by a given terminal, he must use a different terminal to exercise the additional authority. Another disadvantage is that the terminal control scheme by itself does not usually provide a fine enough grain of file access authority: any file access authority granted a terminal can be exercised by all members of the group represented by that terminal. This will be discussed further in Section 5.

The second method, and by far the most common, is to identify a user to the system by placing his identity (user-ID, e.g., social security number, badge number, name) on an access list maintained in the computer, and to require user identification to be supplied as part of a log-on procedure. Since it is necessary to use this identification in such things as accounting reports, which may be accessible to a wide variety of people, nearly all remote access systems provide for a "secret" password to accompany the user identification [b7]. The principal advantage of identifying individual users is that it permits highly selective file access authorities to be applied on a per-user basis. There is no special advantage over terminal control as a mechanism for controlling initial access to a system. The adoption of user identification as the system access control mechanism places all of the burden of properly identifying a prospective user on the central system, in contrast to the distribution of authority found in terminal-controlled access. Of course, for many systems, such as commercial time-sharing offerings, there is no choice but to identify individual users.

2.2 Considerations in Design of Passwords
In order to be effective, passwords should be generated as a random string of letters or numbers in order to minimize the possibility of their being diagnosed. Furthermore, they should be long enough to discourage systematic testing. The determination of the random password length required to provide a given degree of protection against systematic testing is given below. It is assumed that the tests are carried out at the maximum line transmission rate (such as would result from replacing a terminal with another computer). The password size is determined by solving the following inequality for S:

(R/E) × (4.39 × 10^4) × (M/P) ≤ A^S
where R is the transmission rate of the line (characters/min), E is the number of characters exchanged in a log-on attempt, P is the probability that a proper password will be found (as a decimal fraction), M is the period over which systematic testing is to take place (in months of 24-hour/day operation), A is the size of the "alphabet" from which the password is made up (e.g., 26, 36), and S is the length of the password in characters.

As an example, we can determine the password size, drawn from the standard alphabet, that would give a probability of no more than 1/1000 (.001) of recovery after 3 months of systematic testing under the conditions noted above. The line speed is 300 characters/minute, and 100 characters are exchanged in a log-on attempt. Using our expression we get
(300/100) × (4.39 × 10^4) × (3/.001) ≤ 26^S

3.951 × 10^8 ≤ 26^S

26^S = 3.089 × 10^8 for S = 6
26^S = 8.03 × 10^9 for S = 7
Under these circumstances, we might reasonably choose S = 7 and be close enough. If the probability were made 1/10,000 (.0001), the next larger size (S = 8) would have to be chosen. In fact, it is the probability (of getting a "hit") that affects the value of the expression most; the other terms rarely contribute more than a factor of 10.
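The calculation lends itself to a few lines of code. The sketch below is illustrative only (the function name and structure are assumptions made for this presentation); the constant 4.39 × 10^4 approximates the number of minutes in a month of 24-hour operation.

import math

def min_password_length(rate_cpm, chars_per_attempt, months, p_hit, alphabet_size):
    # Smallest S such that (R/E) * 4.39e4 * (M/P) <= A**S.
    required = (rate_cpm / chars_per_attempt) * 4.39e4 * months / p_hit
    return math.ceil(math.log(required, alphabet_size))

# The worked example above: a 300 character/minute line, 100 characters per
# log-on attempt, 3 months of continuous testing, a 1/1000 chance of a hit,
# and the 26-letter alphabet.
print(min_password_length(300, 100, 3, 0.001, 26))    # prints 7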
2.3 Password Distribution

The problem of distributing passwords is one of the most vexing barriers to frequently changing the password. Because of the "secret" nature of the password, users are encouraged to memorize them. If the frequency of password change is sufficiently low, this imposes little burden on a user, while if frequent changes (up to a change for each log-on) are made, the user will become confused or end up writing down the password.

The reason passwords are kept secret is because they are re-used, perhaps for extended periods of time. If, however, the password were changed each time it was used, there would be no more risk in writing down the password than in carrying a key to a locked room. The principal risk would be loss or theft, which, if reported promptly, could be eliminated by locking out that password.

If one accepts the foregoing, then it would be feasible for the system to supply users with the next password to be used as part of the log-on procedure. This password could be carried about by the user as required. To further minimize the risk of carrying a password openly, the system could be constructed to generate, say, 10 passwords each time, only one of which would be the correct one to use for the next log-on. The position in the
list of the correct next password would be fixed for a given user, but varied between users. In keeping with common practice, and to allow for typing errors yet prevent systematic testing of each password on a list, the log-on procedure can be designed to lock out the user after three tries (or as many as the installation desires) if any but the correct password were used.

Key switches can be used to deny casual access to a system from remote terminals fitted with the key switch. The key switch, however, does not provide unique user identification, and devolves into a technique for operating a terminal control system without requiring the terminal to be in a constantly attended office. The magnetic card reader/writer is interesting since it permits the system to transmit back the password to be used at the next log-on in a most convenient fashion. Its disadvantage is the cost of a reader/writer, which effectively precludes its use in systems with a large number of remote-access stations.
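A hedged sketch of the list-of-passwords idea follows; the names, list size, and use of a decoy list are illustrative assumptions rather than the design of any particular system.

import secrets
import string

ALPHABET = string.ascii_uppercase     # the 26-letter "standard alphabet"
LENGTH = 7                            # password size from the calculation above
LIST_SIZE = 10

def random_password():
    return "".join(secrets.choice(ALPHABET) for _ in range(LENGTH))

def issue_next_list(user_position):
    # Issue LIST_SIZE candidates; only the one at the user's fixed (but
    # per-user) position is valid at the next log-on.
    candidates = [random_password() for _ in range(LIST_SIZE)]
    return candidates, candidates[user_position]

candidates, next_valid = issue_next_list(user_position=3)
# The whole list can be carried openly; the system records only next_valid and
# can lock the account after three wrong attempts at the next log-on.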
2.4 Other Methods of User Authentication

Other methods of unique positive identification of users have been available, proposed, or are being actively worked on. These include devices for accurately measuring some physical characteristic (e.g., the geometry of the hand, identification of speech patterns, fingerprint reading). In addition, key switches and magnetic card reader/writers have also been mentioned. For the former group, the problems are not those of measuring physical characteristics or speech patterns, but of abstracting the measurements into a unique representation for an individual. Even if these problems are solved, the cost of the measurement equipment is generally too high to apply to most systems, particularly as the identification is no more precise than that afforded by the password technique.

3. Computer Characteristics Supporting Information Security

3.1 Hardware Facilities for Multiprogramming
The basis of multi-user information systems is found in the hardware of the computer used for the system. It is also the foundation of adequate security controls incorporated in such a system. Multi-user (multi-access, etc.) systems use the techniques of multiprogramming to distribute the real resources of a computer system among a number of simultaneous users. Thus, we find computers with specific hardware features to support multiprogramming. These include base registers to permit relocation of
program and data segments and to facilitate dynamic linking of program segments; a comprehensive interrupt system that provides redirection of control based on external events; the concept of privileged operations (issuing I/O instructions, setting base and time registers, etc.); and memory protection schemes. Not surprisingly, because of the only recent awareness of its importance, there are no special hardware features to support information security per se. However, on inspection, we find that many of the hardware facilities for multiprogramming are useful in providing the foundation for information security as well. The fact that many of the hardware facilities to support multiprogramming were motivated by considering the effects of undebugged programs is fortunate from a security point of view. In the balance of this section, we will consider some security objectives/requirements, and some of the hardware facilities that can be used in meeting them. In the next section, we will present the rest of the framework by considering operating system functions that support information security.

Memory protect
  Use in support of multiprogramming: prevents undebugged programs from damaging other users or the operating system.
  Use in support of multi-user information security: prevents a user from accessing information not his, or from manipulating the operating system to recover system information.

Interrupt system
  Use in support of multiprogramming: redirects control on external events.
  Use in support of multi-user information security: detects attempts to execute "illegal" instructions, violate memory protect, etc.

Privileged instructions
  Use in support of multiprogramming: prevents accidental damage to file data or to the operating system.
  Use in support of multi-user information security: prevents a user from "seizing" the operating system or extending his memory bounds; protects file data from misappropriation.
3.2 Program (User) Isolation Mechanisms
The security purpose of isolating programs, one from the other, is to prevent a hostile program from misappropriating data from main memory or manipulating parts of the operating system to cause it to do the misappropriation on behalf of the hostile program, whereas the emphasis on program isolation in ordinary multiprogramming has been to limit the damage an undebugged program could cause. The net effect is that the memory protect hardware in many contemporary computers provides
write protection (i.e., it will act to constrain all data transfers from registers to memory to fall within the memory assigned to that program), but does not provide read protection. Where the hardware is available for read protection, it is often the case that the manufacturer-supplied operating system does not take advantage of it.

The principal isolation mechanisms are segment length checks, bounds registers, and storage locks. The segment length check operates with a base register and tests the relative address developed (after indexing, if any) against the length of the segment being referenced. If the relative address is less than or equal to the segment length, the base is applied and the reference completed. If the relative address is greater than the segment length, an interrupt is generated. Bounds registers operate on a similar basis, except that the check is made on real addresses rather than relative addresses. Storage locks are bit patterns applied to blocks of real storage allocated to a particular program. References to storage for reading or writing must be accompanied by the correct bit pattern, or a reference-out-of-bounds interrupt is generated. One bit pattern (usually all zeros) is reserved for use by the operating system, and permits free reading or writing of all of main memory. A software sketch of these checks is given below.

From a security viewpoint, any of these specific storage protection mechanisms can provide the necessary information protection, although the specific technique chosen has a profound effect upon other aspects of the operating system design, and can affect the efficiency of the system. For example, use of bounds registers will require a program and its data to occupy contiguous locations, while the length check and storage lock techniques permit the operating system complete flexibility in locating the program and data areas of a single program (although the latter commits a fixed-size unit of storage for each allocation).

Related to the problem of user isolation is the problem of leaving data behind in core storage (or on temporary work files) after a job is completed. When a new job is allocated space previously used, it may be possible to recover information left behind by the previous user unless the operating system clears this space (by overwriting with zeros, for example) upon deallocation. While simple to do for core storage, such a procedure becomes quite time-consuming for external files, and may require a special job continuously resident to overwrite released file space.
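The following fragment, written only to illustrate this discussion (the function names are invented), models the bounds-register and storage-lock checks in software.

def bounds_check(real_addr, lower_bound, upper_bound):
    # Bounds registers: every real address generated by the program must fall
    # between the pair of registers, or an interrupt is generated.
    if not (lower_bound <= real_addr <= upper_bound):
        raise MemoryError("reference out of bounds - interrupt generated")
    return real_addr

def storage_lock_check(block_key, reference_key):
    # Storage locks: the reference must carry the bit pattern (key) assigned to
    # the block it touches; key 0 is reserved for the operating system.
    return reference_key == 0 or reference_key == block_key

bounds_check(5000, lower_bound=4096, upper_bound=8191)
assert storage_lock_check(block_key=5, reference_key=5)
assert storage_lock_check(block_key=5, reference_key=0)    # operating system reference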
3.3 Two State Operation

The concept of two-state computer operation is already present in most contemporary computers and is important in providing information security [17]. The two states are known variously as supervisor-user states, master mode-slave mode, privileged-nonprivileged, etc. The principal
distinction is that the privileged state is permitted to execute the hardware I/O instructions, to set base registers, and to load and/or manipulate memory protection registers. Again, from an information security viewpoint, this arrangement is quite satisfactory and provides the essential safeguards.

It is interesting to note, however, that the design of modern operating systems has introduced the concept of the partially privileged program. These partially privileged programs exist outside of a user's program and often perform an essential service, as in utility programs. The reason such programs are given the status of partially privileged is that they may require access to the user's program data space and, for purposes of efficiency, need to execute some privileged instructions. Present designs rely on programming checks built into the principal operating system to achieve the necessary separation between itself and partially privileged programs. While the concept of partially privileged programs is merely an extension of the privileged part of an operating system for most systems, the methods used to control the partial privilege can be of considerable importance if the basic manufacturer-supplied system is used as the foundation for constructing dedicated applications systems.
3.4 I/O Characteristics

The establishment of independent I/O machines (channels) was a big step forward in the design of efficient computers. Upon instruction from a central processor, the channel will cause data to be read or written from or to any portion of the main (execution) memory, independent of the rest of the system. It has been a characteristic of most systems that the memory protect mechanism is suspended for channels. Since the I/O instructions themselves are privileged in most modern systems, this would appear to be no problem, since the I/O can only be executed by the executive, which can check to make sure the area of memory affected is the proper one. The problem arises because many I/O operations are data sensitive (e.g., the amount of data being read in is determined by the length of the physical record recorded on tape). If a read operation is commanded, and the address to which the data is to be read is within the area set aside for the user program, the operating system may issue the command, while the data being read in can exceed the nominal user's space by an arbitrary amount. While the exploitation of such an arrangement may not be easily accomplished, the danger exists that it is possible to read data in such a way as to replace all or part of the operating system with one's own version, enabling a hit-and-run penetration to take place.
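One way to picture the exposure, and the storage-key remedy discussed next, is the small simulation below; it is an illustration invented for this article, not the behavior of any specific channel hardware.

BLOCK_SIZE = 2048

def channel_read(record, start_addr, command_key, block_keys):
    # The user's storage key accompanies the channel command; every block the
    # incoming record touches is checked, so a physical record longer than
    # expected cannot overwrite storage belonging to another key.
    for offset in range(len(record)):
        block = (start_addr + offset) // BLOCK_SIZE
        if command_key != 0 and block_keys.get(block) != command_key:
            raise MemoryError("storage key mismatch - transfer suppressed")
    return True

block_keys = {8: 5, 9: 5, 10: 7}    # blocks 8 and 9 belong to key 5, block 10 to key 7
channel_read(b"x" * 3000, 8 * BLOCK_SIZE, command_key=5, block_keys=block_keys)   # fits under key 5
# A 5000-byte record starting at the same address would reach block 10 and be suppressed.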
Of the various memory protect schemes in use, the storage lock technique will prevent overwriting of storage not identified with the user in question by including the user's storage key in I/O commands to the channels. It is interesting to note that the IBM System/360 incorporates such checks in the hardware.

3.5 Virtual Machines

A recent development in systems structure has been the concept of a virtual machine [6] (or sometimes virtual computer, or computer with virtual memory). The principal characteristic of these systems is that they generalize the base address register concept found in earlier machines by providing a table of base registers for the various independent segments of programs. Addresses that exist or are developed (using index registers, etc.) in the user's address space are translated into real memory addresses through the use of the table of base registers. Since the table of base registers is held in computer words, often much larger than the address size, there is room for ancillary information about the segment represented by the table entry. Most commonly, the additional information is the length of the segment (used for memory protect purposes) and an indication of the types of references permitted to the segment in question. This latter information is typically:

READ (AS DATA)
READ OR WRITE
WRITE ONLY
READ (FOR EXECUTION ONLY)
Since these (or similar) access authorities can be applied to each independent segment of a program, it is possible to develop systems with very positive and precise control of the types of referencing of programs and data permitted in the system. It also permits the operating system itself to be in the user's address space, since the access restriction of READ (FOR EXECUTION ONLY) can be applied to the code. By incorporating the operating system into the user's address space, it is possible to create systems conceptually equivalent to the single-user-at-a-time systems of the past. Operating systems based on the virtual machine concept isolate the management of the real resources of the system (memory, peripherals, etc.) from the rest of the operating system, which is constrained to operate in the area(s) allocated a given user. Even if the user breaks out of the normal bounds of the code he writes, or spoofs the operating system to read or write into areas of storage
allocated to him (but not generated directly from code he writes), the operating system itself is constrained to operate within his allocated space. While it would still be possible for the user to alter the operating system code (accidentally or deliberately), it is feasible to validate the operating system each time before it assigns itself to the next task. The validation could take the form of hash totals of the code, compared against values maintained in the real resource management code (see Section 7).
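The segment-table idea can be sketched as follows; the fragment is an invented illustration of the mechanism described above, with the entry layout and access names chosen for clarity rather than taken from any particular machine.

from dataclasses import dataclass

READ, WRITE, EXECUTE = "R", "W", "X"

@dataclass
class SegmentEntry:
    base: int
    length: int
    allowed: set      # the reference types permitted to this segment

def translate(table, segment, offset, access):
    # Apply both the segment-length check and the per-segment access authority
    # before forming the real address.
    entry = table[segment]
    if offset > entry.length:
        raise MemoryError("segment length violation - interrupt")
    if access not in entry.allowed:
        raise PermissionError("reference type not permitted for this segment")
    return entry.base + offset

table = {0: SegmentEntry(0, 8192, {EXECUTE}),            # operating system code, execute only
         1: SegmentEntry(16384, 4096, {READ, WRITE})}    # user data
translate(table, 1, 100, READ)        # permitted
# translate(table, 0, 100, WRITE)     # would raise PermissionError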
4. Operating System Functions Relating to Information Security

4.1 Recognition of Authorized Users
This function involves proper identification of the user and, in those systems using dial-up facilities, often includes recognition of terminals as well. In the latter case, this is accomplished by equipping terminals with a "HERE IS" function that responds upon interrogation by the computer. Incorporation of this automatic response upon interrogation provides the system with an additional level of confidence that it is in communication with an authorized user, although its principal value is more in the realm of being able to collect rentals for terminals supplied by the vendor of the multi-user service.

Identification of terminals is more important in defense-related systems, where the terminal (by being located in a relatively open or accessible area) may not be authorized to transmit or receive certain levels of government classified data, while a user who may be authorized to access these levels of information can use that terminal. In order to prevent such a user from activating programs and/or transmitting and receiving data at classified levels higher than the terminal is permitted to handle, it is necessary to apply the concept of information access authority to the terminal population as well as the user population.
Once an operating system accepts a user (and his “terminal” in some cases) as authentic, it is then the operating system responsibility to exercise the controls provided to limit access to programs and data in the system. The controls are part of the administration of the file system and are quite specific t o a given system. As a simple example, many time-shared BASIC systems maintain a catalog of the saved files of a user. When the user requests a file in the OLD-NEW sequence, the name is checked against those in his catalog (or in some cases, the public catalog), and only if it is a file he owns, or can access, is he permitted to continue. In essence,
INFORMATION SECURITY IN A MULTI-USER COMPUTER
19
the controls limit access to data or programs either in the public catalog, or the user’s private catalog. Access to programs and data becomes a problem only when the simple dichotomy of public data files and user private files is left, and one builds hierarchical file sharing arrangements, or where some limits on access within a single file are required. These controls are often complex, and may have to be executed for every reference to data from a file in a given case. Some of the techniques available to exercise this control are discussed in Section 5 . 4.3 Common Services
In a sense, modern operating systems have evolved from the first centralized service, IOCS, by progressively adding more functions to assist users in managing the resources of their systems. There is information TABLE1
SUPERVISORY SERVICECALLSIN GECOS I11 Symbol GEINOS GEROAD GEFADD GERELS GESNAP GELAPS GEFINI GEBORT GEMORE GEFCON GRFILS GESETS GERETS GEENDC GERELC GESPEC GETIME GECALL GESAVE GERSTR GEMREL GESYOT GECHEK GEROUT GEROLL
Meaning Input/output initiation Roadblock Physical file address request Component release Snapshot dump (Elapsed) time request Terminal transfer to monitor Aborting of programs Request for additional memory or peripherals File control block request File switching request Set switch request Reset switch request Terminate courtesy call Relinquish control Special interrupt courtesy call request Date and time-of-day request System loader Write file in system format Read file in system format Release memory Write on SYSOUT Check point Output to remote terminal Reinitiate or rollback program
20
JAMES P. ANDERSON
security benefits as well as hazards from this centralization of services. The benefits are those arising from separation of a user from the real facilities of the system, and providing programs to mediate his actions. The hazards are those associated with incomplete design, and in modern systems, from the fact that the common services are the principal internal interface between the operating system and user program, and are the primary points of attack available to anyone attempting to penetrate the system. Examination of the major operating system service functions made available to user programs through calls on the supervisor indicate the number of potential interface points between user programs and the supervisory services. Typical of these functions are the list in Table I, available under GECOS I11 for the GE-600 series systems. The degree of risk in such services is intimately bound up in the overall design of the operating system, and whether or not the system is able to validate the parameters to the common service calls. It is beyond the scope of this paper to attempt to treat this in detail; however, it is possible to illustrate the point. It is well known that OS/360 uses locations within the users address space to store addresses of privileged operating system routines. At some points in its execution, it uses these addresses to transfer control to the routines in supervisor state. As a consequence, it is possible for a user to replace the addresses with ones pointing to code he has written, and force the sequence of actions that cause the operating system to transfer control to these addresses in supervisor state. This gives him complete control of the system for whatever purpose he wishes. Contrast this with GECOS 111, where all data necessary to support the execution of the users programs is contained in tables outside of the users address space. This prevents the user from directly manipulating this information, although it may still be possible in some operating systems to supply calls on common services with incorrect 'data, and obtain control indirectly. 4.4 Output Routing
4.4 Output Routing

This function is concerned with directing data back to a user at a remote location. For many commercial time-sharing systems, there is little or no problem with this function. It impacts security only if the output is misrouted. While not a frequent problem, it can be serious in some environments. The bulk of the potential errors that could cause misrouting of data lie outside of the operating system itself. The problems arise when electrical transients in the system modify bits of the device (or terminal) address in a channel, or as the channel control word is transmitted from main (computer) memory to the channel itself. Further, failures of common carrier switching networks can cause information to be misrouted or lost.
If the user environment demands protection against this kind of error, the use of straight-wire (point-to-point) connections between the central and remote sites will provide some assurance against the misroutes that occur due to the common carrier. There is still the potential for misroutes from the channels and communications interfaces, although the relative frequency of these errors is low. With the interface to communications lines being handled by dedicated processors in many systems, it is feasible to incorporate "echo" checks between the main processor and the communications interface. This at least assures that the terminal address is transmitted properly between the systems. Within the communications interface processor, it is only possible to check that the current address being transmitted to is correct by copying the address used for the transmission and comparing it with the one used to set the register. Clearly, there is still room for errors; however, experience on a number of large-scale, heavily used systems indicates this is not frequent in the systems themselves.
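A minimal sketch of the echo check just mentioned follows; the function names and the simulated transient are invented for illustration and do not describe any particular communications processor. The main processor sends the terminal address, the communications interface echoes back what it actually latched, and output is suppressed unless the two agree.

```c
#include <stdio.h>
#include <stdbool.h>

/* Stand-in for the communications interface processor.  The 'transient'
 * flag simulates an electrical fault that alters one bit of the device
 * address as it is latched.                                             */
static unsigned latched_address;

static void comm_set_address(unsigned addr, bool transient)
{
    latched_address = transient ? (addr ^ 0x4u) : addr;
}

static unsigned comm_echo_address(void) { return latched_address; }

/* Main-processor side: send the terminal address, read the echo, and
 * transmit only if the echoed value matches what was intended.          */
static bool send_with_echo_check(unsigned terminal, const char *text,
                                 bool transient)
{
    comm_set_address(terminal, transient);
    if (comm_echo_address() != terminal) {
        printf("echo mismatch for terminal %u: output suppressed\n", terminal);
        return false;
    }
    printf("to terminal %u: %s\n", terminal, text);
    return true;
}

int main(void)
{
    send_with_echo_check(42, "report follows", false); /* normal case     */
    send_with_echo_check(42, "report follows", true);  /* misroute caught */
    return 0;
}
```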
4.5 Sources of Security Problems in Contemporary Operating Systems

4.5.1 Design for Benign Environment

The fact that the designers of operating systems are only peripherally aware of potential malevolent penetration of their systems has to be the major source of security problems in contemporary operating systems. The designers do concentrate on providing protection from program bugs of various kinds, but give little thought to the effects of a planned penetration. Thus, we see OS/360 with I/O control blocks located within a programmer's address space, where he can manipulate them (e.g., supply different data set names than those supplied in the DCB or JCL cards). Another symptom of design for benign environments is not checking that parameters passed to system service calls are legitimate for that user, for example, issuing I/O commands with addresses outside of the user's address space. This kind of action is not expected; consequently, it may be overlooked by the operating system designer.

4.5.2 Incomplete Program Design
A simple example of incomplete program design with security consequences is found in one of the contemporary time-sharing systems. It is possible to attempt a log-on with either the user-ID or password incorrect and, by depressing the BREAK key on the teletype, interrupt the error message to the effect "Password Incorrect ... Try Again," resetting the log-on attempts counter to zero. The effect of this design error is to permit
anybody with enough patience to systematically try passwords until one is discovered that works. Since the number of log-on attempts never reaches the threshold for disconnecting the user (usually 3 on many systems), the system is not aware that penetration is being attempted. This particular design omission could be exploited by replacing the user/terminal combination with a minicomputer programmed to try all possible (reasonable) passwords. Incomplete design is generally what is being referred to by the phrase "software failure," and arises from the designers and programmers of the operating system not being able to anticipate extreme or unreasonable situations. Unfortunately from the security viewpoint, modern operating systems are too complex to permit simple a priori determinations that they will operate properly under all conditions.
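The design error can be made concrete with a short sketch, assuming a hypothetical per-line log-on routine (none of the names below come from the system in question). The essential point is that the count of consecutive failures belongs to the line's state, survives an interrupted error message, and is cleared only by a successful log-on.

```c
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

#define MAX_ATTEMPTS 3   /* threshold after which the line is disconnected */

/* Per-line log-on state: the failure count lives here, not in the code
 * that prints the "Password Incorrect ... Try Again" message, so aborting
 * that message with BREAK cannot reset it.                               */
struct line_state {
    int  failed_attempts;
    bool disconnected;
};

static bool credentials_ok(const char *user, const char *password)
{
    /* Hypothetical check standing in for the real password file lookup. */
    return strcmp(user, "smith") == 0 && strcmp(password, "apricot") == 0;
}

static bool try_logon(struct line_state *line, const char *user,
                      const char *password)
{
    if (line->disconnected)
        return false;

    if (credentials_ok(user, password)) {
        line->failed_attempts = 0;       /* only success clears the count */
        return true;
    }

    if (++line->failed_attempts >= MAX_ATTEMPTS) {
        line->disconnected = true;       /* threshold reached: drop line  */
        printf("line disconnected after %d failures\n", line->failed_attempts);
    }
    return false;
}

int main(void)
{
    struct line_state line = { 0, false };
    const char *guesses[] = { "apple", "banana", "cherry", "apricot" };

    for (int i = 0; i < 4; i++)
        printf("guess %-8s -> %s\n", guesses[i],
               try_logon(&line, "smith", guesses[i]) ? "accepted" : "rejected");
    return 0;
}
```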
4.5.3 Problems of Program Identity

Related to the problems of validating parameters in service calls is the problem of identifying the program that made the call as a user or a privileged program. Since service calls are often nested and embedded in the operating system itself, it is difficult for the service call to ascertain whether it is being invoked in a chain that originates in the supervisor or in a user program. It needs this information to validate parameters appropriately, since what is legal for the supervisory code is most often not legal for a user program. Complicating the process is the existence of programs (other service functions, for example) that, while part of the supervisor, have only limited privileges, but many more than an ordinary user's program. In a subsequent section, we will survey a number of protection proposals that attempt to deal with this problem.

4.5.4 Inability to Have Completeness of Design
The final source of a lingering doubt about the security of an operating system design is that even one that appears to be complete cannot be proved to be complete for all of the pathological cases that could occur. This doubt is reinforced by the experience of the computing community with the bulk of programs that are never fully debugged, if only because of the extreme cost of exercising all important cases, and the virtual impossibility of even enumerating all possible pathological cases. This unhappy state of affairs is compounded by the special difficulty of accurately reproducing an exact sequence of events in a real-time system. The only relevant work reported to date is that of Dijkstra [6], who has constructed a multiprogramming environment where each component
of the operating system is constructed to be self-contained and to operate correctly for all cases. The components are then able to communicate freely and without possibility of interference, and it is claimed by the author that since each step of the construction process (mainly components) is proved to operate correctly, this constitutes a "proof" that the system as a whole will operate correctly for all cases. This work is an interesting and important step toward the general ability to prove that complex programs (operating systems) are complete; however, it appears that it will be a number of years before the results are assimilated into the general use environment. It should be noted that it is because of just this point that there is reluctance on the part of the government to permit more than the most highly restricted operation of multi-user systems with classified defense information. While there are a large number of other considerations that apply to this situation as well, it is well beyond the scope of this article to attempt to treat them.

4.6 Security Relationship of Operating System to Hardware System
In general, the operating systems base their security procedures on the two major hardware features affecting security: the memory protection scheme and two-state operation (i.e., privileged instructions) [15]. These particular hardware features establish the foundation upon which the operating system designers erect their controls. The other major hardware system attribute that affects the design of the entire operating system (and the security provisions required as well) is the number of base registers and the form of addressing provided by the system, and whether or not they contain control information recognized and acted on in the hardware itself. To be sure, there are only a few systems with this attribute; however, they are nearly all distinguished by being some form of virtual system. Note that it is specifically intended to include machines such as the Burroughs 5500, 6700, etc., that provide an arbitrary number of base registers as descriptors, as well as the more obvious forms such as the GE-645 used in the MULTICS system. The important characteristic is hardware acting on control information implicitly. The importance of such facilities is that they provide positive control for all user references to data and programs. If the base registers (descriptors, or segment tables in contemporary literature) are maintained outside the user's address space, or he is otherwise effectively prevented from manipulating them, they provide an adequate mechanism for protected information sharing and, in general, expand the modes of use of multi-user systems.
5. Problems of File Protection
5.1 Basic Problems
There are basically two file protection problems (excluding the problems of physical protection). The first arises in connection with computer utilities and is concerned with methods of precisely controlling the sharing of information, and more specifically, programs. The problem is complicated by the notion of implied sharing. As an example, if a user B is sharing some programs owned by user A, and then authorizes user C to share his program that shares some of user A's programs, how is the sharing between B and C controlled such that C does not have access to the programs of A and B, only their results? Basically, the question being addressed is how communication can be established between two users' programs such that only the results of the shared program are available to the sharer. The second problem arises in environments where data is classified according to external criteria (e.g., in files of defense information) and is more concerned with establishing a logically consistent method of determining the security label to be associated with file access requests in order to permit an intelligent determination of the validity of the request. This problem is complicated by the fact that users, programs, terminals, files, and executions all can be considered to have such labels, and that the security label of some objects (executions and some files) can change during the execution of a program, or during the execution of a job. In addition, in the environments where this problem is important there is considerable attention paid to the derivation and proper transfer of security labels to files and printed material.

5.2 Models For Shared Information Processing
The issues involved in this problem are how authorizations to use a file or a program are accomplished and the general framework in which programs are created and executed. Most of the workers involved with this problem have assumed or required the existence of a file system consisting of a collection of files and a directory associating a user with his files, or in exceptional cases, a directory associating a file with its users. Assuming the first form, the authorization mechanism must permit a file owner to designate the users with whom he wishes to share a file and those privileges the sharer is permitted with respect to the file. A commonly used mechanism is to associate with each shared file in a user’s directory a list of other users who may access the file and for what purpose (i.e., READ, WRITE, APPEND, etc.). A sharer, in order to establish a connection to the shared file, creates his name for
the file and equates it to the file being shared. The sharer's reference to the file name he created is interpreted as an indirect reference to the owner's directory, from which the type(s) of access permitted are checked before completing the reference. A number of variants on this scheme can occur to make the process more efficient. For example, the directory search can take place at binding time (assuming pre-execution binding), a name substitution made, and a transfer of access flags made to the sharer's file control block. However, these are implementation and application dependent and will not be discussed further here. In one model [7], actual system commands are provided to permit designating sharers of files. Other authorization models exist; these include use of passwords associated with each file in the (protected part of the) system to act as locks. An owner authorizes sharing of his file(s) by providing the sharer with the password for the file. As Friedman [7] notes, however, this is less than satisfactory because it permits the sharer unrestricted access to the file for any purpose. The method of actually controlling authorized sharing in nearly all utility-oriented systems is based on the use of indirect references to the shared objects through descriptors. It is characteristic of most systems designed for information utilities, or large populations of on-line users, that they provide some form of virtual memory system [5]. The objects (e.g., programs, data, files) occupying the virtual memory are represented by descriptors, collected into one place, managed by the system, and acting to map a virtual address into a real address. The mapping is often aided by hardware in the system, but this is merely a technique for improving execution efficiency and is not fundamental to the concept. Since descriptors are maintained by the system (necessarily, since they deal with real resources) they are in a special segment designated READ-ONLY to a process. Descriptors are used to control sharing in a variety of ways. Basically, each descriptor representing a program, data set, file, etc., contains control information in addition to the address in real memory where the object is located. The basic control information of security interest is the type of access permitted to the object: READ, READ-WRITE, EXECUTE, APPEND, etc. Since the operating system is the only program permitted to create and manipulate these descriptors, the necessary mechanism to provide controlled sharing of other users' programs and files appears to be established. This would be the case if only one user at a time were permitted to gain access to an object. However, in the multiple user environment, a given object could be in use by a large number of users, perhaps with different access privileges.
TABLE II
INTERPRETATION OF ACCESS CONTROLS IN DESCRIPTORS AS A FUNCTION OF THE PROGRAM STATE

                                 Access control
Program state        READ     WRITE or APPEND     EXECUTE     READ-WRITE
User-READ            OK       Error               Error       OK
User-WRITE           Error    OK                  Error       OK
User-EXECUTE         Error    Error               OK          Error
Supervisor (any)     OK       OK                  OK          OK
In general, this case is handled within the same framework as for the single user; since each user's process is represented by a descriptor table (segment) unique to that user, the descriptor referring to such an object can have the access control information set to the appropriate value for that user. The actual checking of access type is accomplished on modern systems in hardware as a descriptor is referenced. Assuming that the processor distinguishes at least four logical states, Table II indicates possible interpretations of the access control data in the descriptor. The ERROR state is assumed to be sufficiently severe to force the abortion of the offending process. Within this general framework, a number of secondary problems emerge. Graham treats protection as a disjoint series of rings, and discusses the problems of changing control from one protection level (viewed as concentric circles or rings) to another in a safe manner [8]. To provide protection in both a downward (from a superior routine to an inferior routine) as well as an upward direction, he proposes a model that augments the descriptor for a segment with ring bounds that permit free access as long as the element being transferred to is within the bounds, but invoke special software whenever the bounds are exceeded in either direction. In general, the special software validates the address being referred to regardless of the direction of the reference. In this way, the mechanism protects a process from the operating system as much as the other way round. Vanderbilt has created a model that extends that of Graham to include cases that arise when a user sharing an object authorizes others to use the process he creates [21]. In his model, he introduces the notion of access privileges as a function of the activation level of the process, and in effect makes copies of the descriptor segment for each activation level encountered in order to provide the precise control needed. He distinguishes
the problems that arise from direct access to a shared procedure and adopts as part of the model the policy that direct sharing of procedures is only permitted for procedures authorized to the borrower by their owner, while only indirect sharing of procedures is permitted for those procedures owned by a third party and authorized and used by an owner in constructing a procedure that is (to be) shared with others. In the latter case, a borrower can only effect indirect access to procedures borrowed by the owner of a shared procedure.
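A compact sketch of the descriptor mechanism described in this section is given below. The structure and names are illustrative rather than those of any particular machine: each descriptor pairs the object's real address with the access bits granted to one user, every reference is checked against those bits, and the supervisor state is exempt from the check.

```c
#include <stdio.h>
#include <stdbool.h>

/* Access-control bits carried in a descriptor, per Section 5.2. */
enum { ACC_READ = 1, ACC_WRITE = 2, ACC_APPEND = 4, ACC_EXECUTE = 8 };

/* One descriptor: where the shared object really lives, and what this
 * particular user may do with it.  Descriptors live in a system-managed
 * segment that the user cannot write.                                   */
struct descriptor {
    const char *object_name;
    unsigned    real_address;
    unsigned    access;         /* OR of ACC_* bits granted to this user */
};

/* Check one reference.  A supervisor-state process is not restricted;
 * a user-state process must hold the matching access bit or the
 * reference is treated as an error severe enough to abort the process. */
static bool reference_ok(const struct descriptor *d, unsigned wanted,
                         bool supervisor_state)
{
    if (supervisor_state)
        return true;
    return (d->access & wanted) == wanted;
}

int main(void)
{
    /* User B's descriptor for a file owned by user A: read-only for B. */
    struct descriptor shared = { "A.PAYROLL", 0x2F000, ACC_READ };

    printf("user READ   : %s\n",
           reference_ok(&shared, ACC_READ, false)  ? "OK" : "ERROR");
    printf("user WRITE  : %s\n",
           reference_ok(&shared, ACC_WRITE, false) ? "OK" : "ERROR");
    printf("supvr WRITE : %s\n",
           reference_ok(&shared, ACC_WRITE, true)  ? "OK" : "ERROR");
    return 0;
}
```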
5.3 Models For Hierarchical Access Control
The only available paper that deals with this subject in a formal manner is that of Weissman [27]. In it, the author defines security objects (files, users, terminals, and jobs) and security properties associated with the objects. The properties are Authority (a hierarchical set of security jurisdictions, i.e., classification), Categories (a mutually exclusive set of security jurisdictions, a formalism of the need-to-know policy), and Franchise (clearance). The balance of the paper is devoted to developing a set-theoretic statement of the policy adopted in the ADEPT-50 system.

(a) A user is granted access to the system only if he is a member of the set of users known to the system.
(b) A user is granted access to a terminal only if he is cleared to do so.
(c) The clearance of a job is determined from the clearance of the terminal and the clearance of the user.
(d) Access is granted to a file only if the clearance and need-to-know properties of the file and the user are authorized (cleared) to the job.

The model treats all file accesses as events and maintains a running determination of the classification and need-to-know level of the job based on events throughout its execution. This information, known as a high water mark, is most useful in determining the derived classification and need-to-know for new files created during job execution and for labeling output. The only drawback with this model is that classification and need-to-know can change in only one direction, upward (to higher levels), depending on the files used in the application. Two relatively infrequent, but nonetheless important, cases are not treated by the model: the case where individual data items are themselves not classified, or carry only a low level of classification, but when aggregated (collected into a file or report) may acquire a higher classification, and the case where a program transforms a classified file into an unclassified file (perhaps by extracting data known to be unclassified for a report). The latter case arises principally because the classification is applied to too large a unit (the file) and would disappear if fields could be individually classified. The former case cannot be handled within the framework of Weissman's model as it stands, since it is a value judgment as to when (or if) a particular aggregation requires a higher classification than the source material. This could be handled by providing the concept of security declarations in programs that would override the running classification and need-to-know property if specific conditions were encountered during execution of the job. The conditions might be of the form, "If the number of records placed in temporary file F1 is greater than 100, advance the classification to the next highest level," or, in general, "IF (condition) THEN (statement of security labeling)."
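The policy and the high-water-mark rule can be sketched in a few lines. The encoding below (small integers for the classification hierarchy, a bit mask for need-to-know categories) is an assumption made for illustration and is not the ADEPT-50 representation; the operations shown are the clearance test applied to each file access, the monotonic raising of the job's running label, and a declaration-style override of the kind suggested at the end of this section.

```c
#include <stdio.h>
#include <stdbool.h>

/* Hierarchical authority levels (illustrative encoding). */
enum level { UNCLASSIFIED = 0, CONFIDENTIAL = 1, SECRET = 2, TOP_SECRET = 3 };

/* A security label: an authority level plus a need-to-know category set,
 * represented here as a bit mask.                                        */
struct label {
    enum level level;
    unsigned   categories;
};

/* Access rule in the spirit of Weissman's model: the job may open a file
 * only if the job's franchise (clearance) dominates the file's label.    */
static bool access_permitted(struct label job_clearance, struct label file)
{
    return job_clearance.level >= file.level &&
           (file.categories & ~job_clearance.categories) == 0;
}

/* High-water mark: every file actually used can only raise, never lower,
 * the running classification and category set of the job.               */
static void raise_high_water(struct label *job, struct label file)
{
    if (file.level > job->level)
        job->level = file.level;
    job->categories |= file.categories;
}

int main(void)
{
    struct label clearance = { SECRET, 0x3 };       /* user/terminal/job  */
    struct label job       = { UNCLASSIFIED, 0 };   /* running label      */
    struct label file_a    = { CONFIDENTIAL, 0x1 };
    struct label file_b    = { SECRET, 0x2 };
    int records_in_temp_file = 150;

    if (access_permitted(clearance, file_a)) raise_high_water(&job, file_a);
    if (access_permitted(clearance, file_b)) raise_high_water(&job, file_b);

    /* Declaration-style override: "if more than 100 records were placed in
     * the temporary file, advance the classification to the next level".  */
    if (records_in_temp_file > 100 && job.level < TOP_SECRET)
        job.level++;

    printf("output label: level %d, categories %#x\n", job.level, job.categories);
    return 0;
}
```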
6. Techniques of File Protection
6.1 OS/360
The previous section has given various models for shared information control proposed or developed in the context of information-utility-oriented systems. We can contrast these with the protection provided on System/360 as typical of the type of protection the manufacturers have considered adequate. A user may create a protected data set and apply a password of up to 8 characters to it. When the data set is opened, the password must be supplied by the operator in order to continue processing. Failure to supply the correct password causes the job referring to that data set to be aborted. On 360/DOS systems, the security parameter merely informs the operator that a "secured" file is being accessed, to which he replies YES (to permit processing to continue) or NO to abort the job. Unfortunately, this scheme is rather gross, since there is no way for the operator to distinguish the owner of the data set from other "casual" users. This means anyone attempting to open the data set will probably succeed, assuming that the operator is the principal custodian of the password. On certain time-shared versions of 360 systems, the application of passwords to files is the principal protection mechanism for those files. As Lampson [12] points out, this is quite unsatisfactory for a variety of reasons including: (a) the large number of passwords that have to be remembered, and (b) the fact that any access is total.
6.2 File Encryption

This method of file protection has been persistently pursued by several writers as a solution to file protection problems. The bulk of the papers have dealt with cryptographic methods ranging from very simple schemes [23] to reasonably sophisticated techniques [2, 19c]. The applicable techniques are in most cases derived from communications practice and transferred bodily to the computer environment, with little or no justification (except for a claimed secrecy) for their use. VanTassel cites some of the properties of files that must be accounted for in the adaptation of a cryptographic transform and further enumerates a list of the properties the transform must have [22]. The principal benefit obtained from this degree of elaboration is protection of files should other safeguards fail. Carroll and McLelland cite the advantage of encipherment in general as a useful countermeasure to passive infiltration (wiretapping), between-lines entry, piggy-back techniques, trap-doors, and theft [2]. However, they point out that information in core is vulnerable to dumps and system entry of various kinds. The principal problems with using file encryption are those associated with the maintenance of the cryptographic "keys" for different files, and the requirement that the user remember the keys. The technique suffers from much the same problem as that cited by Lampson for password-protected files [12]. The difficulty with "key" management is illustrated by considering the problems attending the encipherment of an indexed sequential file. If one assumes that the method of encipherment is known (an assumption that would have to be made in most multi-user environments since the method could be obtained from memory dumps, and in any case might be a common system service available to all users), then only the "key" would be protecting the file. If the file were enciphered using a single key (as might be the case for a sequential file), then it would not be possible to use the file in a "random" manner. If each record is enciphered with a different key, then the problem is to associate the specific key with each record. One method of doing this is to use some form of auto-key (self-keying) for such files. This method (which is not restricted to random access files) eliminates the position sensitivity of most methods derived from communications practice, and does not interfere with common file operations regardless of organization or access methods. It must be concluded that file encipherment is effective only in specific instances, particularly where the integrity of the key(s) can be maintained, and where the cryptographic technique is itself reasonably complex.
For both a complete survey of cryptographic methods and a sobering review of the "invincibility" of even highly sophisticated techniques, the reader is referred to Kahn [11].
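One way to see the per-record keying idea is the toy sketch below. It is deliberately not a secure cipher and is not the transform proposed by any of the cited authors; it only illustrates that deriving each record's keystream from a master key and the record's own number allows records of an indexed sequential file to be enciphered and deciphered in any order, avoiding the position sensitivity of a single running communications-style key.

```c
#include <stdio.h>
#include <string.h>

#define RECLEN 16   /* fixed record length for the sketch */

/* Toy keystream generator: a small linear congruential mix of the master
 * key and the record number.  Deliberately simple; a real system would
 * use a serious transform here.                                          */
static unsigned char keystream_byte(unsigned long master_key,
                                    unsigned long record_no, int i)
{
    unsigned long x = master_key ^ (record_no * 2654435761UL) ^ (unsigned long)i;
    x = x * 1103515245UL + 12345UL;
    return (unsigned char)(x >> 16);
}

/* XOR is its own inverse, so the same routine enciphers and deciphers a
 * record, and any record can be processed independently of the others.  */
static void crypt_record(unsigned char rec[RECLEN], unsigned long master_key,
                         unsigned long record_no)
{
    for (int i = 0; i < RECLEN; i++)
        rec[i] ^= keystream_byte(master_key, record_no, i);
}

int main(void)
{
    unsigned char record[RECLEN] = "PAY 00731 0925";
    unsigned long key = 0x5EC2E7UL;

    crypt_record(record, key, 731);   /* encipher record 731             */
    printf("first byte enciphered: %02x\n", record[0]);

    crypt_record(record, key, 731);   /* decipher it again, out of order */
    printf("recovered record     : %s\n", (char *)record);
    return 0;
}
```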
7. Techniques of Security Assurance

7.1 Pseudo-User

The concept of security assurance on multi-user systems is analogous to a watchman making his rounds to check on the proper functioning of the security measures protecting his domain. In a similar sense, we can increase confidence in the security of a multi-user system by checking the proper functioning of the security measures taken in the system. Since the security measures in multi-user systems are based on hardware features such as those discussed in Section 3, it is here that tests can be applied. Assuming that the rest of the system and additional software security measures are properly designed, this kind of check is to guard against an undetected failure of the memory protection scheme or the two-state operation that is the base of the security mechanisms for multi-user systems. The technique consists of establishing a permanent pseudo-user program that attempts to violate memory bounds and execute instructions reserved for the supervisor state. Since the first of these conditions should be reported to the supervisor by an appropriate interrupt, while the second may NOP or report as the first (depending on the machine involved), it is only necessary to establish a convention by which the fact that the interrupt is being forced by the pseudo-user is conveyed to the operating system, in order not to invoke the abort or dump procedures that would be appropriate to a normal user program. The frequency of running the pseudo-user hardware test can vary depending on external considerations, but it should probably be run every minute or less. Depending on the kind of system, it is also possible to check on the proper operation of various parts of the system software. At a minimum, one can extend the pseudo-user concept to check the log-on sequence for proper rejection of invalid user numbers or an incorrect password. While these checks are not a guarantee that the system is error free, failure of the system to properly reject an invalid log-on (or to accept a valid log-on) is often evidence of a hardware malfunction that may have invalidated all other internal security measures taken. The proper course of action to take in the event of an error detected by the pseudo-user program must be a function of the kind of system involved and the type of material contained in the system.
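On present-day hardware the flavor of the pseudo-user test can be reproduced with a deliberately provoked protection fault, as in the POSIX sketch below (an anonymous page mapped read-only and a SIGSEGV handler, which are of course not the mechanisms of the machines discussed here). The analogy is only that the test expects the violation to be trapped, and treats the absence of a trap as evidence that the protection mechanism has failed.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Pseudo-user check (POSIX sketch): deliberately violate memory protection
 * and confirm that the violation is actually trapped.  If control reaches
 * the line after the store, the protection mechanism did not fire.        */

static sigjmp_buf trap_point;

static void on_fault(int sig)
{
    (void)sig;
    siglongjmp(trap_point, 1);          /* the expected path */
}

int main(void)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    unsigned char *page = mmap(NULL, (size_t)pagesize, PROT_READ,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) {
        perror("mmap");
        return 2;
    }

    signal(SIGSEGV, on_fault);

    if (sigsetjmp(trap_point, 1) == 0) {
        page[0] = 0xAA;                 /* attempt to write a read-only page */
        printf("ALERT: write to protected page was NOT trapped\n");
        return 1;
    }

    printf("protection check passed: violation was trapped as expected\n");
    return 0;
}
```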
7.2 Audit Trails
Audit trails are an important part of a security assurance scheme. While most multi-user systems have provisions for logging important events occurring during the running of a system, these are primarily for the purpose of accounting or occasionally for monitoring system performance. While a security audit trail may make use of the same logging program, the purpose of the security audit trail requires different information to be recorded. The security audit trail is not to be confused with the type of information required by conventional auditing practice. For a discussion of these requirements, see Wasserman's article [26]. Rather, it is concerned with recording who was doing what in a multi-user system. Its purpose is to detect patterns of abnormal activity that may be the first clue of attempted penetration of a system, and to provide a sufficient record to determine the extent of a penetration that may be discovered by external means.

7.2.1 What Should Be Logged
The following list is by no means exhaustive, but is presented as an indication of the kinds of data that should be maintained in a security audit trail. Each entry should include, in addition to the item(s) indicated, the identification of the user involved and the date and time when the event took place.

1. Log-on (successful), terminal ID, user ID.
2. Log-on (unsuccessful), with password/user ID, terminal ID.
3. Programs called (includes utilities, compilers, etc.).
4. All files referred to (includes volume serial numbers, file names, etc.).
5. All apparent program "errors" (out of bounds references, illegal supervisor calls, array bounds errors, etc.).
For multi-user batch systems that permit essentially unrestricted programming, the job control cards that are used to specify temporary files should also be recorded.
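A sketch of one possible audit-record format follows; the field layout and names are invented for illustration. The point is only that every entry carries the date and time, the user and terminal involved, an event type drawn from the list above, and enough detail (program name, file name, error type) to support later reconstruction of activity.

```c
#include <stdio.h>
#include <time.h>

/* Event types corresponding to the list in Section 7.2.1. */
enum audit_event {
    EV_LOGON_OK, EV_LOGON_FAIL, EV_PROGRAM_CALL, EV_FILE_REFERENCE, EV_PROGRAM_ERROR
};

static const char *event_name[] = {
    "LOGON-OK", "LOGON-FAIL", "PROGRAM-CALL", "FILE-REFERENCE", "PROGRAM-ERROR"
};

/* Append one record to the security audit trail.  Every record carries the
 * date and time, the user and terminal involved, and event-specific detail. */
static void audit(FILE *trail, enum audit_event ev, const char *user,
                  const char *terminal, const char *detail)
{
    char stamp[32];
    time_t now = time(NULL);
    strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", localtime(&now));
    fprintf(trail, "%s  %-14s user=%-8s term=%-6s %s\n",
            stamp, event_name[ev], user, terminal, detail);
}

int main(void)
{
    FILE *trail = stdout;   /* a real system would use a protected file */

    audit(trail, EV_LOGON_FAIL,     "jones", "TTY07", "bad password");
    audit(trail, EV_LOGON_OK,       "smith", "TTY03", "");
    audit(trail, EV_PROGRAM_CALL,   "smith", "TTY03", "program=FORTRAN");
    audit(trail, EV_FILE_REFERENCE, "smith", "TTY03", "file=PAYROLL vol=001234");
    audit(trail, EV_PROGRAM_ERROR,  "smith", "TTY03", "out-of-bounds reference");
    return 0;
}
```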
7.2.2 What to Report

Clearly, with conscientious attention to security audit trails, more data will be available than can be reasonably assimilated by simple inspection. It is, therefore, necessary to provide a data reduction program to summarize significant security events. As a minimum, all abnormal events (incorrect log-ons, execution time errors, requests for large amounts of file space,
etc.) should be reported for each user. The other data can be maintained in raw form, or summarized onto files to assist the process of determining the extent of a penetration that may be discovered later. The length of time to maintain such data must be a function of the kind of installation, and whether or not there exist statutory requirements that must also be met.

7.3 Validation of Programs
There is no known mechanical method of certifying that a program will operate correctly under all conditions of data and in all contexts of its execution. This means that the introduction of new systems programs must proceed cautiously and with human checking. Many times this is satisfactory if done; however, as a practical matter, there are few installations that can (or will) take the time to examine manufacturer-supplied changes to an operating system, compilers, or utilities to see that they do not unintentionally (or otherwise) create a gap in the security built into a system. This is perhaps the biggest barrier to the simple creation of secure multi-user systems. The only recourse an installation has is to invest the effort to understand and analyze the operating system it is using for its potential effect on system security. A related problem, not yet prominent because of the severity of other concerns, is that of assuring that a program (or more generally, part of the operating system) has not been changed in an unauthorized fashion. On systems that provide unlimited programming, this may prove to be a particularly sensitive problem because a user who has determined how (if possible) to bypass the memory protection system could have modified the system after initial start-up in an undetected fashion, achieved a penetration, and returned the system to its original state before terminating his job. In a less exotic way, it is often important to validate the current copy of the operating system as it is being loaded, or more importantly, as it is being reinstated as part of an automatic recovery after a crash. Other parts of a multi-user system that might benefit from periodic validation include the file directory(s), user access lists, and the like. The simplest validation technique is the use of hash totals for various segments of the operating system. These can be taken over any span of code or tables used by the system. The major problem with hash totals is the significant overhead incurred to do the entire system frequently. To overcome this, it may be sufficient to take hash totals only on those parts of the system (code and tables) that are particularly sensitive to changes. For example, a user status table that contains pointers to system code, and possibly an indication of the user's privileges, could be hashed as part of every system change, and tested
before the data is used. Sensitive code could be sampled and tested every few minutes or so. In order to prevent manipulation of the hash totals themselves, it would be possible to start hash totals with an initial value (different for each day, supplied by an operator) that would have to be recovered before the hash totals could be made to look correct.
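The hash-total technique can be sketched as follows. The checksum is an ordinary rotating hash chosen only for illustration; what matters is that it is seeded with an operator-supplied daily value, recorded when a sensitive table is legitimately changed, and recomputed and compared before the table is trusted again.

```c
#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* Simple rotating hash total over a span of memory, seeded with a value
 * the operator supplies each day so the total cannot be forged without it. */
static uint32_t hash_total(uint32_t seed, const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = seed;
    for (size_t i = 0; i < len; i++)
        h = (h << 5 | h >> 27) ^ p[i];
    return h;
}

/* A sensitive table of the kind mentioned in the text: per-user privilege
 * flags that the operating system consults when validating service calls. */
static unsigned char user_privileges[8] = { 0, 0, 1, 0, 0, 2, 0, 0 };

int main(void)
{
    uint32_t daily_seed = 19711203u;   /* supplied by the operator          */
    uint32_t recorded = hash_total(daily_seed, user_privileges,
                                   sizeof user_privileges);

    /* Simulate an unauthorized modification made outside the system's
     * normal change procedure (which would have re-recorded the total).   */
    user_privileges[3] = 2;

    uint32_t current = hash_total(daily_seed, user_privileges,
                                  sizeof user_privileges);
    if (current != recorded)
        printf("ALERT: privilege table changed without authorization\n");
    else
        printf("privilege table checks out\n");
    return 0;
}
```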
8. Communications Problems

8.1 Vulnerability to Wiretap

One of the lesser known aspects of the Civil War is that it was the first large scale conflict directed in the main by message communications from the War Department. This was possible because of the rapid spread of the telegraph network after its invention in 1840. As a consequence, both the North and South used wiretapping to gain intelligence on the disposition of forces and planned operations. Thus, we see that wiretapping has been with us ever since the invention of modern communications. In connection with multi-user remote access systems, it has not been a major problem, primarily because it is easier and less risky to obtain information by other means, including use of the penetration techniques outlined above. However, considering the relative ease with which it is possible to use various forms of eavesdropping devices, it can be expected to become more of a problem, particularly as the internal (programming) and external security controls are strengthened in the products provided by the manufacturers. In addition to the recovery of information flowing between a terminal and a computer, Carroll and McLelland [2] cite the problems of "between-lines" entry (using a system on an open communication line that has an inactive terminal) and "piggy-back" entry ("capturing" the communications channel after log-on and feeding the legitimate user false messages) as additional hazards of wiretapping.

8.2 Wiretap Countermeasures
Within a facility, one can rely on physical security techniques such as personnel movement controls and guards to deny a potential penetrator the opportunity to tap lines. A more expensive solution is to use armored cable or various forms of cables with anti-intrusion alarms (often cables pressurized and fitted to alarm if the pressure drops, as it would if someone broke into the cable). Other techniques involve placing the cables between the terminals and the computer in such a way as to be continuously visible and using guards to monitor the cables.
The more common case of terminals located some distance from the computer forces one to rely on cryptographic equipment to protect the terminal-computer communications [1, 19c]. Until recently, all of the commercially available cryptographic equipment was designed for off-line use for message communications or voice scrambling. Currently, there are two products on the market designed expressly for remote terminal to computer communications. They are discussed below.

8.2.1 Datacoder DC-110
This device is located at the terminal site for protecting the transmission and storage of data maintained in files in a remote-access multi-user system [18b]. No provision is made for a matching device at the computer site, with the consequence that the file is handled internally in encrypted form. The data sheets on the device suggest that numeric-only fields in a record be maintained in the clear in order to permit computation on them, while alphanumeric data be encrypted. The device recognizes certain character combinations as control characters that switch the device between plain and encrypting (decrypting) operation. As it is constituted, the device is primarily useful only for securing noncomputational (text-only) files that are merely stored by the computer for later retrieval (on some indicator that is not enciphered). Although it is possible to have numeric data in the record unenciphered for computational purposes, there is not much security in enciphering just the alphanumeric data. The examples of use in the description of the device show a payroll file with the names of the employees encrypted, but with their social security numbers in the clear (along with their pay rates, etc.).

8.2.2 Data Sequestor, JJC-3
This device can be used to provide encrypted communication between the terminal and the computer, as well as in a manner similar to the DC-110 [18c]. In its latter use, it is supported by programs that replace the JJC-3s at the computer end and simulate the data scrambler logic for each user. Each user's key is stored in a secure internal file, enabling the system to handle each encrypted line in a different key. This device appears to have a better concept of use with computers than the DC-110, particularly the provision of a program to interface to many different encrypted lines, although with the program and file of user keys being maintained in the system, the protection afforded an encrypted file at the system is limited to theft of media. In contrast, the DC-110 can be used by any user without pre-arrangement with the system,
and protects the information from all kinds of intrusions as well, since the key is maintained only at the terminal end of the link. Thus the JJC-3 is oriented principally to protecting the transmission and storage of information from outside parties, while the DC-110 is oriented to protecting the information from the operators of the system as well.

9. Summary
This article has presented the problems and issues of information protection in multi-user systems. The emphasis has been on the threats inherent in programming and using a system rather than on the various forms of internal and external penetration of the "smash-and-grab" variety, or the subversion of employees having legitimate access to a system. Not discussed at any length were the topics of administrative controls, physical security techniques, personnel screening, and many other aspects of security that are required in greater or lesser degree to devise a secure system. One is often confronted with the question: Is it possible to build a completely secure system? The answer must be unqualifiedly negative. However, it is possible to build an adequately secure system for a particular operational environment. As we noted above, there is no single set of measures that can be taken to obtain "instant security." Information security is a problem of providing sufficient barriers and controls to force a prospective penetrator into attacks that carry a high risk of detection and/or have a very large work factor. The work factor, if high enough, will be sufficient to deter all but the most dedicated penetrator, and for him it may make any eventual success a hollow victory.

REFERENCES

1. Baran, P., On distributed communications: IX. Security, secrecy and tamper-free considerations. Memo. RM-3765-PR, Rand Corp., Santa Monica, California, August 1964.
2. Carroll, J. M., and McLelland, P. M., Fast "infinite-key" privacy transformation for resource-sharing systems. Proc. AFIPS Fall Jt. Computer Conf., Houston, Texas, 1970, pp. 223-230.
3. Chu, A. L. C., Computer security, the corporate Achilles heel. Business Automation 18(3), 32-38 (1971).
4. Comber, E. V., Management of confidential information. Proc. AFIPS Fall Jt. Computer Conf., Las Vegas, Nevada, 1969, pp. 135-143 (1969).
5. Denning, P. J., Virtual memory. Computing Surveys 2(3), 153-189 (1970).
6. Dijkstra, E. W., The structure of the "THE"-multiprogramming system. Commun. ACM 11(5), 341-346 (1968).
7. Friedman, T. D., The authorization problem in shared files. IBM Syst. J. 9(4), 258-280 (1970).
8. Graham, R. M., Protection in an information processing utility. Commun. ACM 11(5), 365-369 (1968).
9. Harrison, A., The problem of privacy in the computer age: An annotated bibliography. Memo. RM-5495-PR/RC, Rand Corp., Santa Monica, California, December 1967.
10. Hoffman, L. J., Computers and privacy: A survey. Computing Surveys 1(2), 85-103 (1969).
11. Kahn, D., The Codebreakers. Macmillan, New York, 1967.
12. Lampson, B. W., Dynamic protection structures. Proc. AFIPS Fall Jt. Computer Conf., Las Vegas, Nevada, 1969, pp. 27-38 (1969).
13. McKeeman, W. M., Data protection by self-aware computing systems. CEP Rep. Vol. 2, No. 6, Computer Evolution Project, Appl. Sci., Univ. of California, Santa Cruz, California, June 1970.
14. Mintz, H. K., Safeguarding computer information. Software Age 4(5), 23-25 (1970).
15. Molho, L. M., Hardware aspects of secure computing. Proc. AFIPS Spring Jt. Computer Conf., Atlantic City, New Jersey, 1970, 36, 135-141 (1970).
16. NFPA Pamphlet 75. Nat. Fire Prev. Ass., 60 Batterymarch Street, Boston, Massachusetts 02110.
17. Peters, B., Security considerations in a multiprogrammed computer system. Proc. AFIPS Spring Jt. Computer Conf., Atlantic City, New Jersey, 1967, 30, 283-286 (1967).
18a. Petersen, H. E., and Turn, R., Systems implications of information privacy. Proc. AFIPS Spring Jt. Computer Conf., Atlantic City, New Jersey, 1967, 30, 291-300 (1967).
18b. Product Description Sheet, DATACODER Model DC-110. DATOTEK, Inc., 8220 Westchester, Dallas, Texas 75225.
18c. Product Description Sheet, DATA SEQUESTOR Model JJC-3. Ground Data Corp., 4014 N.E. 5th Terrace, Fort Lauderdale, Florida 33308.
19a. Security in Communications, excerpts from 15th annual seminar. Industrial Security 14(2), 38-43 (1970).
19b. Security and Privacy Considerations in Criminal History Information Systems, Project SEARCH Tech. Rep. No. 2. California Crime Technol. Res. Found., Sacramento, California, July 1970.
19c. Skatrud, R. O., A consideration of the application of cryptographic techniques to data processing. Proc. AFIPS Fall Jt. Computer Conf., Atlantic City, New Jersey, 1970, 36, 111-117 (1970).
20. Taylor, R. L., and Feingold, R. S., Computer data protection. Industrial Security 14(4), 20-29 (1970).
21. Vanderbilt, D. H., Controlled information sharing in a computer utility. MAC TR-67, Project MAC, Mass. Inst. of Technol., Cambridge, Massachusetts, October 1969.
22. VanTassel, D., Advanced cryptographic techniques for computers. Commun. ACM 12(12), 664-665 (1969).
23. VanTassel, D., Cryptographic techniques for computers. Proc. AFIPS Spring Jt. Computer Conf., Boston, Massachusetts, 1969, 34, 367-372 (1969).
24. Ware, W. H., Security and privacy: Similarities and differences. Proc. AFIPS Spring Jt. Computer Conf., Atlantic City, New Jersey, 1967, 30, 287-290 (1967).
25. Ware, W. H., Security and privacy in computer systems. Proc. AFIPS Spring Jt. Computer Conf., Atlantic City, New Jersey, 1967, 30, 279-282 (1967).
26. Wasserman, J. J., Plugging the leaks in computer security. Harvard Business Review, Sept./Oct. 1969, pp. 119-129.
27. Weissman, C., Security controls in the ADEPT-50 time-sharing system. Proc. AFIPS Fall Jt. Computer Conf., Las Vegas, Nevada, 1969, 35, 119-133 (1969).
Managers, Deterministic Models, and Computers
G. M. FERRERO diROCCAFERRERA
School of Management
Syracuse University, Syracuse, New York
1. Introduction 37
   1.1 Premises 37
   1.2 Generalities on Management Science 38
   1.3 Manager's Duties 38
   1.4 The Resources Utilized in a Managerial Decision 39
2. The System Approach 40
   2.1 Characteristics 40
   2.2 Components of a System 41
   2.3 Performance of an Operative System 42
3. Management Systems 43
   3.1 Premises 43
   3.2 Information Systems 44
   3.3 Data Needed by Managers 45
4. Management Science 46
   4.1 Development of a Deterministic Model 46
   4.2 Model Building 48
   4.3 Computer Utilization of Models 50
5. When and How Managers Have to Implement Management Science Models 50
   5.1 Problem-Solution Interpretation 50
   5.2 Types of Problems Solved 53
   5.3 Considerations on Priorities in Solving Problems 59
   5.4 Utilization of Computers for Solving Managerial Problems 62
6. Will Computers Eliminate Managerial Decision Making? 63
   6.1 The Present Expansion of Computer Utilization 63
   6.2 The Future Utilization of Computers 67
   6.3 The Interference of Computers in Decision Making 68
References 71
1. Introduction

1.1 Premises
Managers, as is well known, are decision makers responsible for the achievement of some established enterprise's objectives. They have to
allocate the available limited resources for conducting their business in the best possible way. In order to reach the firm's objectives in a financially profitable manner, managers have to perform certain actions which lead to the attainment of the desired goals [6]. Modern times continuously impose the updating of methods used to reach the firm's goals. These methods have been defined, systematized, and disseminated for the purpose of helping executives in their daily decision-making responsibility. Scholars and managers recognize that there are some basic patterns of environmental reaction, and the knowledge of these patterns conditions managerial actions. By examining and discussing these points, scholars and managers defined a series of "principles" and "techniques" which, when properly applied, can substantially help executives in the duty of optimizing their actions.

1.2 Generalities on Management Science
The discipline which provides managers with the needed guidance to fulfill their purposes is known as "Management Science." The exact birth date of this methodology cannot be established with certainty since it is the result of an evolution of thoughts, analysis of facts, comparison of results in various circumstances, synthesis of findings, utilization of feedback knowledge, and deductive reasoning. Analysts can trace its beginning back to the Industrial Revolution. In reality, even if there is logical justification for relating the beginning of the definition of managerial rules to the Industrial Revolution, scholars prefer to indicate that the development of "principles" started at the beginning of the twentieth century [ly]. There is no doubt that Frederick W. Taylor (1856-1915) and Henry Fayol (1841-1925) are considered the two major initiators of and contributors to modern Management Science. Both Taylor and Fayol discussed in their published works many of the fundamental rules and appropriate methods used to increase the effectiveness of the jobs performed by workers (Taylor), as well as by executives (Fayol). The basic concept of separating the "doers" from the "thinkers" emerged at that time. Thinkers, or managers, are those who have the responsibility of instructing others (i.e., subordinates) to perform actions in accordance with sets of well-defined schemes of reference [6].

1.3 Manager's Duties
Managers, when pursuing the task of reaching the enterprise's objectives, have certain responsibilities. They must (1) conceive plans to be implemented for the achievement of the enterprise's goals; (2) organize working structures capable of providing the needed means to materialize the plans;
(3) staff the system by allocating the right men to a given job; (4) schedule all working personnel, enabling them to complete the assigned tasks within a fixed time; and (5) control all the interrelated actions to be performed [2]. The three basic functions considered an integral part of management responsibilities are (1) problem analysis, which helps the executive to plan what must be done with the operative system; (2) communication with others, which is necessary for detecting and understanding the problems to be solved, as well as for the assignment of jobs and tasks to be performed under the designed conditions; and (3) decision-making activity, which encompasses the process followed by managers in defining the course of action which they and their subordinates must take to fulfill the enterprise's goals. Managers, in order to compete in today's highly antagonistic market, have to make decisions quickly and accurately. In order to succeed under these trying conditions, they must be well informed about the real situations upon which decisions depend [15]. They must have a sufficient amount of data to attain a complete understanding of the facts which have determined the problems and the need for their solutions. Digital computers are playing a conspicuous part in the modern trend of greatly reducing the time which elapses between the detection of a problem and the implementation of its solution. (A discussion of this point will be provided later.)

1.4 The Resources Utilized in a Managerial Decision
Managers have to decide how to allocate limited resources in an organized fashion to reach the predefined objectives. This decision depends on the type of problem to be solved [28]. It is recognized by scholars and managers that when the solution to a problem has to be found, there are eight essential resources to be considered. They are referred to as "The 8 M's" since the key words all start with an M. They are as follows.

(1) Men. Managers must appoint or relocate labor forces and executives within the system configuration in accordance with the long-range plans, or with the specific situation under consideration.
(2) Material. Raw and auxiliary materials employed in the production of goods or services have to be allocated as the necessity occurs, i.e., for the normal activity as well as for finding solutions to particular problems.
(3) Machinery. Every enterprise utilizes some sort of mechanical, electric, or electronic device. This utilization may change as the internal or external situation alters. Managers may choose to use more automated machines and less manpower, or vice versa.
(4) Money. Decisions on how to spend money depend on the company's budgetary restrictions. Money is typically a limited resource which determines the selection and utilization of the other seven M's.
(5) Methods. Managers have to decide how to implement plans. The choice of which technique to use, what degree of automation has to be used in the production of goods (or services), and how to implement controls, depends on other factors such as availability of money, personnel skill, and the accomplishment schedule.
(6) Market. This element is the demand for the product (or service) manufactured by the enterprise. Markets can be influenced by advertising campaigns or by an appropriate price policy.
(7) Moments. Managers have to decide when products must be available to the market, how long a production cycle must be, how the distribution of the finished products has to be performed in terms of schedule, and so forth. Moreover, the collection of pertinent data must be promptly timed.
(8) Messages. Information has to be available at the right time and in its best possible form. An appropriate and reliable network of communication channels must be established for the purpose of instructing managers about problems, the course of action needed, and the outcome of their decisions.
Scholars and managers agree that, regardless of the nature of the managerial problems to be solved, these eight resources are always present, although there are cases in which one (or more) assumes the leading role.

2. The System Approach

2.1 Characteristics
The system approach is one of the fundamental methods for the interpretation of facts detected in business. This approach requires that any responsible manager, when solving an enterprise's problems, be concerned not only with the restricted area where the problem may have been detected, but with the entire firm as a unique entity. Managers must consider the direct and indirect effects their decisions may have on the performance of all the activities as planned, organized, directed, and controlled [6]. The notion of optimization and suboptimization of the objectives and subobjectives must be present in any decision-making process [IW]. To fulfill this requirement, management science, and in particular operations research techniques, considers the enterprise as a unitary operative system, even if it can be conceptually divided into subsystems. The idea behind the formulation of quantitatively expressed models is that
solutions to problems depend not only on the restricted area where the problem took place, but on all the adjacent regions of activity as well, up to and including the entire organization and even its external sphere of influence [IS]. This constitutes the so-called "total system approach." System analysis seeks to expand the investigation of facts and data in order to insure a better and more complete understanding of the domain in which problems are detected and solved. By applying the system orientation, the manager is forced to examine the various relationships existing among the variables under scrutiny [23]. In so doing, the behavior of each variable may indicate its interdependence with remote (or not apparent) causes [25].

2.2 Components of a System
It is generally accepted by scholars, analysts, and managers that the following five parts are always traceable in any physical system [16]:

(1) Input. Some, and often all, of the eight significant resources (i.e., the 8 M's) constitute what can be considered the needed raw material which will be utilized in implementing what managers have planned and organized.
(2) Output. Every operative system has to provide an output, outcome, or payoff as a result of its activity.
(3) Processor. The core of a system is the processor, which forms the operative nucleus. In a business system, for example, the management, the production and assembly facilities, the distribution department, the personnel, machinery, methods, and all types of controls can be regarded as the processor. In a computing center the memory or storage, the arithmetic and logic unit, the control unit, the operators, and the computer routines and programs form part of the complex processor of the system. In computer science this term may refer to the hardware (the central processor) or the software (the compilers, assemblers, or the language processor) [11].
(4) Control. Any configuration of significant operative resources must be placed under control to check whether or not the aim of reaching the well-defined objectives is being achieved. The means assigned to perform this control are an integral part of the system.
(5) Feedback. In conjunction with the cybernetic system concept, the objective of a control is to determine when and how a system needs an adjustment in case some of the utilized limited resources are not harmoniously working together. The consequent action performed when the control detects the need for a modification in the organization of the system is called "feedback." This action can be executed by means of an automatic device or by the intervention of a knowledgeable person or group of persons [S].
Fig. 1. Transitions in the levels of stability of a system. [Figure: the degree of stability of the system efficiency, showing an upper limit (UL) and a lower limit (LL) bounding the satisfactory zone, with positive and negative transitions between steady-state levels.]
2.3 Performance of an Operative System
There are many ways to classify systems, but one frequently considered in management science discussions is whether the system is in a “steady state” or stable condition, meaning that the complex is operating without the detection of difficulties. The fact that all the external and internal activities of the system are balanced (i.e., in equilibrium) does not necessarily mean that the system performance is satisfactory. Figure 1 shows two levels of steady states, one higher and one lower. The second situation does not provide the complete fulfillment of the firm’s objectives because it is below the lower limit (LL) of the accepted bound representing the satisfactory zone. It could be the case that the system is slowly deteriorating, passing from one degree of stability to a lower one without clearly showing the transformation. The reliability of the information obtained through the
network system is, in this instance, very important. Managers must have prompt and complete knowledge of the passage from one system performance level to another. This variation is called the "transition period." The change may require a long time (e.g., the decay occurring in a physical system), or a few seconds (e.g., the change resulting from a dynamite explosion or nuclear reaction). Moreover, the transition period is said to be "negative" when the system passes from a higher level of stability to a lower one, and "positive" in the opposite case. The positive transition must be planned and actuated if the system is to resume the satisfactory efficiency level which, customarily, is defined within two limits (i.e., UL and LL of Fig. 1). This positive transition phase is characterized by the emission of orders stemming from the decisions made in view of the modifications to be performed to the system structure in order to solve the problem. The responsible executive must study the conditions of the enterprise through the operative information system, which provides the appropriate data for evaluation. Reliability, completeness, timeliness, and clearness of the received and utilized data are paramount characteristics of the management information system [6]. Managers are eager to minimize the negative transition period and also the steady state period when it is below the satisfactory level. In addition, the time involved in the positive transition must be minimized. In order to achieve these goals, managers have to rely on the soundness of the information which indicates that the system is sliding down in efficiency. Corrective actions can be taken immediately as soon as a deflection in performance is detected. The satisfactory steady state level period must, consequently, be maximized.

3. Management Systems
3. Management Systems

3.1 Premises
Managerial decisions are made through the existing channels of communication and are implemented by orders given to the appropriate subordinates. Control of the executed orders, as well as detection of the feedback reactions, is established by using the firm’s information network [,%?,$]. Managers know about the existence of problems by means of, e.g., a verbal message, a written paper, a formal report, a printout from a tabulator, or from a hard copy obtained at a computer. When an executive is “informed,” he can take the action he thinks appropriate for modifying, if necessary, the flow of activities of one or more of the available factors of production or limited resources (i.e., the 8 M’s).
3.2 Information Systems

Information systems are set up in conjunction with operative networks in order to support managers in their decision-making tasks. Knowledge of facts is the blood of an organization, and the channels of communication are its arteries and veins. Information is generated at a source point and used at a receptive point. In between there are various means utilized to move the message from one person to another, e.g., oral statements, writings, diagrams, pictures, flow charts, and codes of any form [.%‘I]. One of the basic concepts upon which information theory is based is that a message, in order to achieve the goals for which it has been generated, must be understood by the recipient. Hence, common knowledge of the codes (usually written) or languages used by the two persons in contact must exist prior to the transmission of the information. The higher the degree of understanding between them, the better the dispatch is interpreted. The message can assume a very concise configuration, and it can also be stated by the use of special terms or by particular construction of phrases, as in computer languages [9]. Managers must know that every piece of information is influenced by three factors which reduce its quality:

(1) The pureness of any conveyed message is affected by a component of "noise," an extraneous addition to, or subtraction from, the pure information. This noise becomes an integral part of the dispatch.

(2) The completeness of a message could be improved by using a better means, or combination of means, of transmission. Hence, the communication is, at all times and by principle, incomplete; i.e., it carries a gap of intelligence.

(3) The message could be influenced by personal interpretation or motivation, or by external forces which modify the meaning of the information from its original intent.
These three disturbances are schematically represented in Fig. 2. It is a specific task of the manager to reduce to a minimum these three inconveniences. The noise can be lowered by selecting more appropriate means of communication. The diminished intensity of noise in a network may justify the cost involved in improving or changing the media used. The completeness of a message can be improved by training the involved persons on the typical nature of the news which will be exchanged among them. External forces can be reduced by instructing the dispatcher to follow (when possible) a well-defined scheme of reference. The use of pre-established forms, in which cells must be filled and explicit questions answered, is a good way to avoid personal interpretation and biased
[Figure 2 depicts four message forms: the ideal message; the message carrying "noise"; the incomplete message (gap of intelligence); and the message suffering external influences (interpretation, motivation), hence possibly biased or subjective.]

FIG. 2. Factors influencing the quality of a transmitted message.
description of facts and data. The quantification and codification of messages are the usual ways selected to reduce individual modifications in transmitting a report [20]. Managers, knowing that information can be distorted by these three influencing factors, must plan, organize, and control the network system to insure that sound and error-free messages are received at their destination.

3.3 Data Needed by Managers
Managers, being responsible decision makers, must rely on the accuracy, timeliness, and completeness of all data and information received with reference to problems (detected during the negative transition period) and to solutions (during the positive transition period). Managers must consider the presence of the three disturbances cited above when analyzing original data. The highest possible degree of pureness and understanding of the information received is a must for achieving the best fulfillment of the firm's objectives. Managers will transmit orders for implementing the solution by modifying the allocation and content of the eight typical limited available resources. In a society which has a very high standard of living, as does ours, managers have to respond properly to the market demand, which requires that commodities be obtained at an increasingly fast rate. Their tasks are no longer so simple as they were a half century ago. Time runs fast, as do the changes in behavior of the society. In order to properly direct an
enterprise, and to solve its pressing competitive problems of long and short range, managers need to use new and more responsive decision-making techniques [10]. Sound data must be known promptly, decisions implemented on time, and feedback reactions immediately sensed. For these reasons scientists and managers have defined a series of principles described under the label "management science techniques." These rules are capable of helping the responsible executive in the accomplishment of his difficult task as decision maker.

4. Management Science
4.1 Development of a Deterministic Model
The development of a deterministic model can be better described by means of an example. A manager, having access to all the data needed for understanding an industrial problem, selects the experimental way of solving his query. He has the requisite time and money to find the most satisfactory solution to his problem, which is, for instance, an inventory situation to be improved. He wants to define the size of stocks, the minimum cost for maintenance, turnover, space, and facilities, with reference to a given period of time of the plant activity. The best solution is found by utilizing the scientific method (i.e., by experimentation). Later, the same manager has a very similar problem in another plant of his company. He repeats the experimental steps for this new case, which can be summarized as follows. He (1) defines the nature of the problem; (2) establishes the variables involved; (3) collects pertinent data; (4) formulates some hypothetical solutions; (5) tests them by experimentation; (6) finds the solution which provided the most adequate outcome; (7) detects the corresponding payoff; and (8) implements the solution found. Again, the problem is satisfactorily solved. Later the same inventory situation arises in another of the company's warehouses. The manager repeats once more the above-listed steps of the scientific process, finding the optimal solution. He has acquired useful experience in handling all these cases. At this point the manager recognizes that "to similar problems correspond similar optimal solutions." Being quite familiar with this type of inventory query and its solution, the manager tries to describe both problem and solution through a "pattern." The best way of portraying them is by using algebraic symbols for the variables and by indicating the existing relationships among the behavior of these variables by mathematical expressions and equations. Undoubtedly, this is a difficult task to pursue, but the direct relationship between that problem and its best solution is detectable, as well as the sensitivity of the solution to the model specification and the
reliability of the input data. By mathematically comparing the various cases of application, the manager is able to face the next inventory situation with a different set of means. The manager describes the poser in terms of values to be entered in the "mathematical pattern" or "model." The expression carrying all the data referring to the new state of the inventory allows the executive to directly obtain the solution without spending the time and money he had previously invested when the experimental method was applied. Figure 3 schematically shows this concept. In order to solve a problem mathematically the manager can apply different versions of the structured model, the selection of which one to use being solely dependent on the complexity of the query.
[Figure 3 traces the flow: a first problem solved by the experimental approach yields a solution; a second and a third similar problem, solved as above, yield similar solutions; from these a pattern and then a mathematically described model are derived; for the nth similar problem the variables and parameters are defined as in the model, the mathematical expression is solved (e.g., by computer), an optimal solution is obtained in mathematical notation, and managerial interpretation and decision follow.]

FIG. 3. Scheme of the generation of a deterministic model.
For a simple case with a few variables which are not complexly interrelated, the solution could be found by making the computations by hand, i.e., using pencil and paper, a slide rule, or a desk calculator. For more elaborate formulations this approach may take too long. Managers facing problems need to have almost immediate access to an "optimal solution," because the reduction of the time involved in the problem-solving process is mandatory [14]. Now digital computers and mathematical models are providing much of the help that a manager may require for his decisions. The use of deterministic (and/or stochastic) models gives managers a way of obtaining an optimal and objective solution to problems. The resulting outcomes of a model obtained through a computer run are not to be considered substitutions for the managerial decisions, but only objective suggestions [1]. The responsible executive, customarily, must subjectively modify the solution in order to define the most appropriate final courses of action.

4.2 Model Building
In accordance with what has been pointed out above, it can be said that many of the deterministic models used by managers, and collected in management science and operations research texts, have been devised by considering the results obtained by applying the scientific (i.e., experimental) method [27]. Other models have been designed by pure mathematical and statistical approaches, and others by performing appropriate simulations. There are times, however, when managers need to have specific models to solve particular problems. In this event it is necessary to "build a model" for subsequent and iterated usage. Model building is indeed one of the fundamentals of management science and operations research techniques. There is no precisely defined way of constructing a mathematical or normative model, even if the experimental method is the one most often applied. The type of modeling depends on: (1) the nature of the problem, (2) the objectives to be reached, (3) the firm's imposed policies, strategies, and tactics, and (4) the degree of accuracy desired. Usually the various expected outcomes can only be qualitatively expressed. They are ranked, and a score is assigned to the variegated results. A quasiquantitative model can be defined by subjectively assigning values of merit to the payoffs. When the outcomes are quantitatively measurable, but their relationships are not expressible by mathematical terms, the model is again termed "quasiquantitative." Operations research methodology cannot utilize qualitative models. All variable values and their relationships must be quantified. This characteristic allows the utilization of management science models by digital computers. Even if there is no established series of rules to follow for
building deterministic models, here are some suggestions to consider:

(1) Recognition of the nature of the problem.

(2) Description of pertinent facts (diagnosis) and definition of the span of time related to the problem and its solution.

(3) Definition and analysis of the elements involved, namely, (a) the environmental conditions, (b) the external forces influencing the events, (c) the variables, which must belong to one of these two categories: controllable (or decision variables) or uncontrollable (e.g., by chance), (d) the relationships among variables, (e) the restrictions (or constraints) within which the variables behave, and (f) the goal that must be reached.

(4) Determination of the characteristics of the original data (i.e., historical, technological, or estimated), and the degree of accuracy desired in the collection, classification, and manipulation of these input data.

(5) Definition by mathematical expressions of the relationships among variables, as detected in Step 3 above, in accordance with the selected variables and their behavior (which could also be probabilistic).

(6) Definition of the objective function that has to be optimized by maximization or minimization. This expression carries the cost or profit of the real variables entered in the set of equations and/or inequalities, and describes the various restrictions (or constraints) implicated in the problem.

(7) Experimentation and testing for improving the model by using, if possible, special sets of controllable input and known output.

(8) Determination of the sensitivity of the solution. This is another form of examination made by checking to see if the defined model specifications are providing a sound and reliable solution. Before validating the model it is necessary to test whether the expected outcomes (in quality and quantity) are supplied by the built algorithm. If not, the complete structure and procedure used to design the formulations must be reformulated. When the sensitivity of the solution reaches a satisfactory value, the normative model can be fully utilized.

(9) Implementation of the model, which in effect is the utilization of the complete original set of data. During this phase further corrections and improvements (over and above what has been done in Step 7) may have to be visualized and made.

(10) Maintenance and updating of the model. Obviously, the deterministic model as such is not destroyed after its first utilization. It is kept and, if necessary, modified and improved for future usage [?‘I.
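As a purely illustrative instance of Steps (3), (5), and (6), the sketch below casts a tiny two-product mix problem as a linear program. The figures are invented, and the use of the scipy library is an assumption made only for the example.

```python
# Hypothetical product-mix illustration of Steps (3), (5), and (6):
# decision variables, constraints, and an objective function, solved as a
# linear program.  The numbers and the scipy dependency are assumptions.
from scipy.optimize import linprog

profit = [40.0, 30.0]            # profit per unit of products A and B
# Resource usage per unit (rows: machine hours, labor hours).
usage = [[2.0, 1.0],
         [1.0, 3.0]]
capacity = [100.0, 90.0]         # available machine hours, labor hours

# linprog minimizes, so negate the profits in order to maximize them.
result = linprog(c=[-p for p in profit],
                 A_ub=usage, b_ub=capacity,
                 bounds=[(0, None), (0, None)])

if result.success:
    a, b = result.x              # optimal values of the decision variables
    print(f"make {a:.1f} of A and {b:.1f} of B; profit = {-result.fun:.1f}")
```

The vector returned in `result.x` corresponds to the decision variables of Step (3), and the negated `result.fun` is the optimized objective function of Step (6).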
4.3 Computer Utilization of Models
A deterministic model customarily assumes the form of a computer program. All mathematical formulas, restrictions (constraints), variables, parameters, and their relationships are translated into computer languages. The variables, parameters, and the cost (or profit) values pertaining to a given problem are left outside the program since they constitute the input data recognizable by the processor by virtue of "names" or "codes" assigned to each of them. In this way the program can be utilized for any set of original data, provided that the problem to be solved is of the "same type, size, and structure" as the model. If not, the computer routine (and consequently, the model itself) can be modified to provide the optimal solution in the newly detected conditions. When the solution is supplied by the application of a normative model (e.g., by a computer run), the manager has a mathematical expression to interpret. He must understand the meaning of those resulting formulas and values in correspondence to the "nature" of each variable. He must extract from the computer printout the elements which constitute the decision variables, and define their optimal values for insuring the maximization (or minimization) of the objective function (i.e., of the goal to be reached). When this extraction is completed and the manager knows what the deterministic model objectively suggests, he must decide whether to implement the obtained solution or to modify it. Quite often the manager, even if he has a high degree of belief in mathematical models, modifies the decision variables so acquired for the purpose of complying with the enterprise's policies, or for other reasons. This is because, once again, operations research models are tools, not substitutions for managerial decision making.
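The idea of keeping the problem data outside the program can be pictured as follows; the function, the two data sets, and the scipy dependency are hypothetical and serve only as an illustration.

```python
# Hypothetical sketch: the model is a fixed program, while the problem data
# are supplied from outside, so the same routine serves any problem of the
# same type, size, and structure.
from scipy.optimize import linprog

def solve_product_mix(profit, usage, capacity):
    res = linprog(c=[-p for p in profit], A_ub=usage, b_ub=capacity,
                  bounds=[(0, None)] * len(profit))
    return res.x, -res.fun

# Two different data sets (the "names or codes" for the inputs), one model.
plant_a = dict(profit=[40, 30], usage=[[2, 1], [1, 3]], capacity=[100, 90])
plant_b = dict(profit=[25, 55], usage=[[1, 2], [2, 2]], capacity=[80, 120])

for name, data in [("plant A", plant_a), ("plant B", plant_b)]:
    quantities, value = solve_product_mix(**data)
    print(name, quantities, round(value, 1))
```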
5. When and How Managers Have to Implement Management Science Models

5.1 Problem-Solution Interpretation
Managers, being rational thinkers by definition and conscious of their responsibilities as decision makers, are continuously exposed to the duty of properly solving problems. What kinds of problems actually have to be solved by managers? Generally speaking, when an operative system reveals some deflection from what is considered a "normal" flow of activities, the trouble is likely due to a variation in the performance of one (or more) of the working elements (the 8 M's). As systems analysis indicates, the factors of production (i.e., the eight limited resources) have to work as planned in order to provide the expected satisfactory output [do].
Managers become aware of a changed status of the operative elements when the system is in the negative transition period (as previously discussed and depicted in Fig. 1). Essentially, the most common ways by which this knowledge is acquired are:

(1) Managers may be directly notified by colleagues, subordinates, or outsiders when "something goes wrong." This type of message may refer to the case when an anomaly appears and evidence for the reasons of its happening is detectable. The perturbation can have been generated within the enterprise or outside it. An immediate control has to be performed to certify the soundness of the information, consideration being given to the above-mentioned three disturbances (summarized in Fig. 2).

(2) Managers could learn about troubles via the established channels of communication, for example, by reading a written report. This report could be issued at determinate scheduled periods, such as periodical printouts of financial, production, and distribution statements. Obviously, these printed accounts can also be issued any time the need emerges. By examining the data carried by these reports, the responsible executive may recognize the necessity for implementing a corrective action to the system.

(3) Managers can be formally informed by groups of people to whom the task of controlling the performance of specific activities has been assigned. This is the case when the enterprise has established a specialized service to handle these problems, e.g., a value analysis team, a method analysis department, a comptroller's office, or, generically, a "problem-finding" group of specialists who are sensitive to undesirable variations in performance within the firm. These people, holding staff positions, can investigate, examine, analyze, compare, and evaluate output at any level of the organization. Their investigating work can be scheduled, they can search at random, or they can follow an ad hoc sequence.
When managers know about the detection of variations in the operative elements, they must determine whether a problem really exists, or whether what has been reported is merely a personal interpretation of facts or a mistake in detecting a variance in performance. Sometimes what is defined to be a problem by one person is not so defined by another; hence, managers have to recognize the real existence of posers in light of the firm's objectives. The knowledge of a problem may not imply the immediate determination of a solution. Sometimes the source of the mishap has to be found in related activities connected with the elements which suffered the anomaly. It is well known that factors which have a slight or an indirect influence on the fulfillment of the enterprise's plans are difficult to detect, while something which harms the accomplishment of the firm's major objectives is rather evident and, consequently, quickly detected. Managers must be
able to recognize which are the operative functions sensitive to problems, and which are beyond the possibility of being touched (directly or indirectly) by variations in the behavior of the system elements [22]. The size of the enterprise may determine the importance of a problem. When a solution is found, for example, by using an operations research deterministic model and by applying management science principles, the outcomes tend to be in direct proportion to the size of the enterprise, while the cost of the utilization of a computerized model could be (more or less) the same for a firm of any size [28]. For instance, a change in production methods may provide a reduction in the unit cost of the manufactured items. If the plant produces a large amount of that commodity, the "return" from the application of the solution is relatively greater than for a small firm manufacturing a limited amount of such items. But the cost of utilizing a model which, for example, minimizes the cost of production does not follow the same pattern. Hence, the cost of solving a problem appears to be fixed no matter who uses the model [22]. From the economic point of view, managers may believe that finding a solution to a problem and implementing it may cost more than accepting the trouble as it is. If the output obtained from the system (even if it is not in perfect operative equilibrium) is acceptable, the decision might be to "wait, and do not do anything now." In acting in this way, the responsible executive is gambling. In fact, he prefers to wager on two possibilities. First, he conceives that the problem will disappear and the system will adjust itself along the way. In this case he saves the monetary cost of finding a solution and of implementing it. Second, he prefers to pay more later if the problem degenerates into a worse situation. If the executive wins, it is because the first case took place. Gambling on decisions is, by itself, not a good policy, but sometimes managers decide to defer the courses of action to be taken because they prefer to "watch what happens next." They delay the decision even if a probability of being correct in "waiting and seeing" can be assigned, as in a wager evaluation [8]. When a solution has been properly found, a series of investigations, researches, collections of data, analyses, and even experimentations have been made. It could be the case that while searching for a solution the problem itself disappears. For example, observing and interviewing clerks to discover the reasons for their low performance might, by virtue of the inquiry, cause the personnel to improve their work efficiency, and the problem may vanish. It could also be the case that by collecting and analyzing data about a supposed problem, it could be demonstrated that the trouble did not exist, for example, in the case of misinterpretation of financial figures supplied in special reports.
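The "wait and see" wager described above can be written as a small expected-cost comparison; the probability and cost figures below are invented for illustration.

```python
# Illustrative expected-cost sketch of the "wait and see" wager; all figures
# are assumed for the example.
cost_solve_now = 5_000.0        # find and implement a solution immediately
cost_if_worse = 20_000.0        # cost of fixing the degenerated problem later
p_self_resolves = 0.55          # manager's estimate that the trouble vanishes

expected_cost_of_waiting = (p_self_resolves * 0.0
                            + (1 - p_self_resolves) * cost_if_worse)
print("expected cost of waiting:", expected_cost_of_waiting)
print("better to wait" if expected_cost_of_waiting < cost_solve_now
      else "better to act now")
```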
Sometimes it happens that in searching for a solution through the investigation and collection of data, other related problems are detected. These problems then influence the way of solving the initial one, and their solution may even cause the first to disappear [1]. It is evident that understanding a problem is imperative before beginning the search for its solution. Managers must consider all the related disturbances and influences provoked by the problem and even by the solution. Any managerial decision has to keep in view the firm's objectives, stated policies, and expected (evaluated) or estimated outcomes. It is quite advisable, when time is available between the detection of a problem and the implementation of the optimal (or most satisfactory) solution, to perform a feasibility study of the situation.

5.2 Types of Problems Solved
Managers are well aware that business and industrial problems can be solved by the use of deterministic (and/or stochastic) models. Operations research techniques provide the means to reach an objective solution when the query is quantitatively described by mathematical (and/or statistical) expressions. It is interesting to present briefly those problems which can be objectively solved by the deterministic approach. Management science gives a list of typical cases by placing the various models in groups having common characteristics.¹ Customarily, nine formal categories are distinguished [1]:

1. Inventory
2. Transportation (allocation)
3. Sequencing
4. Routing
5. Replacement (maintenance)
6. Search
7. Queuing
8. Competition (game)
9. Mixed problems

¹ The sequence of these nine types of models does not indicate rank of importance or any other significant priority.
An accurate survey of the use of these models in business and industry has not yet been made, but it seems that the utilization of deterministic models exhibits a clear trend to increase rapidly in the years to come. The advent of the computer real-time era has made a major contribution to the expansion of the utilization of mathematical models. Computer manufacturers have organized associations among computer users for the purpose of disseminating knowledge of existing programs, stemming from operations
research models, to solve specific problems. For example, IBM founded in 1955 a society called SHARE (Society to Help to Avoid Redundant Efforts) for the purpose of providing to its members computer programs (i.e., models) designed to solve business and industrial problems. In 1965 GUIDE was formed for IBM 705, 1401, 7000 series, and System 360 or higher computer users. In 1962 COMMON was established for the purpose of allowing the barter of programming knowledge among users of smaller computers, such as the IBM 1130, 1620, 1800, or smaller System 360.

The nine categories indicated above represent the types of models most utilized by managers. These models can be briefly illustrated by indicating their relevant characteristics. In each one of these classes a considerable series of related programs is available. Models are diversified, within each group, by the number of variables they can handle, their particular behavior, their operational boundaries, and the relationships existing among them. In each category problems are similar, while their representation may require an appropriate set of variables defined with reference to the available limited resources related to the problem. In particular:

(1) Inventory. In any inventory problem two typical costs are encountered: one which increases with the growth of the stock size, and one which decreases as it expands. The first (increasing) type includes the carrying expenses, which may cover storage, maintenance, insurance, obsolescence, spoilage, turnover, and tax costs. The second (decreasing) type includes, among the many costs that can be analyzed, the following: (1) setup and takedown costs, which are met each time a new production cycle is implemented for manufacturing, assembling, and closing out an order (the larger the batch process, the smaller the quota to be assigned to each item for these expenses); (2) direct production cost, which may also include the purchasing price of the raw materials used; (3) shortage (or outage) cost, which is calculated when the demand for the manufactured items diminishes or differs in time; and (4) manpower stabilization cost: in order to minimize inventories, it could be necessary to follow the demand closely with production, which may require the hiring and instructing of new personnel. In reality, not all these costs are simultaneously encountered. There are inventory problems which consider only a few of the possible events. In any case, the purpose of this type of model is to reduce to a minimum the total cost of handling stocks of merchandise during given periods of time. The mathematical models for solving this type of problem become quite complex, especially if the number of variables (e.g., the various cost components) entered in the formulations is large. Calculus, probability theory, queuing theory, and matrix algebra can be profitably used here.
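The trade-off between the rising and falling cost families can be illustrated with the classical economic-order-quantity formulation, one simple member of this inventory class; the quantities below are invented for the example.

```python
# Illustrative economic-order-quantity (EOQ) sketch with assumed figures:
# carrying cost grows with the lot size, setup/ordering cost shrinks with it.
import math

D = 12_000      # annual demand, units
S = 150.0       # setup (ordering) cost per production run or order
H = 2.4         # carrying cost per unit per year

def total_cost(q):
    return (D / q) * S + (q / 2) * H    # ordering cost + carrying cost

q_star = math.sqrt(2 * D * S / H)       # lot size balancing the two costs
print(f"optimal lot size ~ {q_star:.0f} units, "
      f"annual cost ~ {total_cost(q_star):.0f}")
```

The square-root expression is simply the lot size at which the two opposing cost curves balance; richer inventory models add further cost components and constraints.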
These problems, if static in nature, are solved by using computer routines dealing with linear programming techniques; if time variations are considered, dynamic programming methods have to be applied [6].

(2) Transportation (or allocation). This class of problems (and their related models) is subdivided into three subgroups, namely: (a) transportation, or distribution of items; (b) allocation of limited resources to demanding points; and (c) allocation of controlled limited resources. The first type is concerned with cases in which there are two or more supplying sources (providing the same kind of commodity) and two or more points demanding quantities of that specific item. The problem consists of defining the number of products to be shipped from one delivery point (e.g., a factory) to a receiving one (e.g., a warehouse) in such a way that the total cost of the distribution (in a given period of time) is minimized. The transportation model requires, for its utilization, the observance of a few restrictions, such as the certainty (i.e., no probabilities involved) of the quantities available at each one of the sources and requested at each one of the receiving places, as well as all the elementary costs of transportation per unit of measurement from each point to every other. The second type concerns the allocation of jobs to men, men to jobs, groups of students to classes, drivers to trucks, buses to routes, and so forth. The problem may assume a complex configuration, but the objective is to optimize the efficiency of the entire operative system, e.g., to maximize profits or minimize the total cost. A variation of this type of model is met in the so-called "product-mix problem." For example, in the gasoline industry there is a large demand for diversified petroleum products to be sold at various prices. The objective is to satisfy the requirements in accordance with some well-established specifications at the minimum total cost of production (or maximum profit). The third kind of problem is characterized by the need for optimizing the allocation of supply and demand when the manager has the possibility of controlling the available limited resources. This is the case when a decision has to be made about where to build a new plant, where to place a new commercial agency, the site for a warehouse, how many salesmen to send into a territory, or how many trucks to use; or, in the opposite case, which shop to close, which plant to shut down, which salesmen to withdraw from the field, and so forth. The allocation of budgetary monetary values to the various demanding departments is also a query of this type.

(3) Sequencing. This type of problem is encountered when a set of operations can be performed in a variable sequence under certain well-defined constraints connected with the goals to be achieved. Usually, the objective is to minimize the time involved in the performance of all the operations required, i.e., from the first one to the last or finished product.
[Figure 4 plots the degree of efficiency against time for the two typical items: those which wear out and degrade until no longer usable, and those which die, each shown up to the point at which it is replaced.]

FIG. 4. Efficiency of the two typical items used in machinery.
It is possible that a series of priorities in activities must be observed, or some penalties on cost or time might be imposed. Generally, this type of problem can be solved by applying models called CPM (Critical Path Method) or PERT (Program Evaluation and Review Technique). Both are methods designed to find the shortest way of sequencing activities in accordance with some established conditions of priority. For example, these models are extensively used in (1) the construction of buildings, bridges, and similar works, (2) production scheduling, (3) assembly line operations, (4) military problems where strategic and tactical plans have to be implemented, (5) budgetary allocation of funds, and (6) missile countdown operative procedures [19].

(4) Routing. A manager may have the problem of instructing salesmen on the sequence they should follow when visiting customers. He wants to minimize the time that each agent spends at the various addresses. This case is also called the "traveling salesman problem." It is considered immaterial which person is visited first as well as which will be next. If the cost (in time or money) of visiting A and then B is equal to the cost of visiting B and then A, the problem is defined as "symmetric"; otherwise, if A → B is not the same as B → A, it is called "asymmetric." If the salesman has three customers to visit, i.e., A, B, and C, he has six routes to choose from, namely: A-B-C; A-C-B; B-A-C; B-C-A; C-A-B; and C-B-A. In the case of four customers, there are 24 possible solutions (i.e., 4! = 24).
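The growth in the number of possible routes can be checked by brute force for small cases; the customer names and distances below are invented, and the exhaustive enumeration shown is only one naive way of attacking the problem.

```python
# Illustrative brute-force routing sketch: enumerate every visiting order
# for a handful of customers and keep the cheapest.  Distances are invented.
from itertools import permutations

customers = ["A", "B", "C", "D"]
dist = {("A", "B"): 4, ("A", "C"): 7, ("A", "D"): 3,
        ("B", "C"): 2, ("B", "D"): 5, ("C", "D"): 6}

def cost(x, y):
    # Symmetric case: the cost of x -> y equals the cost of y -> x.
    return dist[(x, y)] if (x, y) in dist else dist[(y, x)]

best_route, best_cost = None, float("inf")
for route in permutations(customers):                  # 4! = 24 orderings
    c = sum(cost(a, b) for a, b in zip(route, route[1:]))
    if c < best_cost:
        best_route, best_cost = route, c

print(len(list(permutations(customers))), "orderings examined;",
      "best:", " -> ".join(best_route), "with cost", best_cost)
```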
With ten customers to visit in the symmetric assumption, the salesman has 362,880 ways to select; with twenty customers he has 243,290 × 10¹³ possibilities, and so forth.

(5) Replacement. This type of problem is especially encountered in machinery maintenance. The manager of a production plant needs to be sure that machines are working properly and that breakdowns are minimized. In any object composed of parts (e.g., an assembly or machine) two types of items are recognizable: those which degenerate or wear out, and those which after a certain period of time are simply no longer usable, i.e., they die. Figure 4 shows how the efficiency typically behaves in both cases. Items which wear out need maintenance in order to delay their degenerating process. To keep these items within the range of efficiency costs money and time. There is, consequently, a point when it is economically better to replace these pieces rather than spend on their adjustment. Figure 5 shows the effects of an appropriate maintenance program with reference to a hypothetical item. Elements which simply die can also be susceptible to adjustments for extending their life. Usually, appropriate environmental working conditions insure this effect. Machines are composed of both types of these items; hence, the problem is to repair or maintain the parts to assure the needed level of efficiency (between the two limits of a range of acceptance). The life of those parts (of both types) can be statistically determined by testing or by researching past records.
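A minimal numerical version of the wear-out case is the choice of a replacement age that minimizes the average cost per period; the purchase and maintenance figures below are invented for illustration.

```python
# Illustrative replacement sketch with invented figures: choose the age at
# which replacing the item minimizes its average cost per year.
purchase_cost = 1_000.0
maintenance = [50, 90, 160, 260, 400, 590]   # rising yearly maintenance cost

def average_annual_cost(keep_years):
    return (purchase_cost + sum(maintenance[:keep_years])) / keep_years

best_age = min(range(1, len(maintenance) + 1), key=average_annual_cost)
for n in range(1, len(maintenance) + 1):
    print(f"replace after {n} year(s): {average_annual_cost(n):7.1f} per year")
print("cheapest policy: replace after", best_age, "year(s)")
```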
[Figure 5 plots the degree of efficiency of an item against time over its normal life.]

FIG. 5. Effect of proper maintenance in a degenerating item. Dashed line, without maintenance; solid line, with maintenance; double vertical, adjustments and repairs.
Computerized mathematical models capable of handling this complex problem are available, even when very many variables are considered.

(6) Search. This type of model helps the analyst (in this case, the searcher) to find what he is looking for, e.g., items in a warehouse, submarines in the sea, water or crude oil reservoirs in the ground, coal or gold veins in mines, special figures in accounting ledgers, or particular information to be retrieved from a backlog of stored data, and so on. The objective of this deterministic model is to achieve the goal at the minimum cost in time or money, under the maximum possible level of accuracy. Accuracy in searching augments the cost of the process. It is a managerial decision to balance these two basic elements of the problem. Statistical sampling and estimation theory are the two fundamental sources from which these mathematical models are derived.

(7) Queuing. Any time that a group of persons (or items) are standing in line waiting to receive a service (or to perform an action), a queue is formed. Typical cases include people waiting at a teller window in a bank, at the checkout counter of a supermarket, at the door of a theater, or at the entrance of a bridge when driving. Queuing theory is a well-established discipline which, statistically, mathematically, and deterministically defines the problem by investigating (1) the rate of arrival of the customers at the station(s), which is the point where the applicants (or, e.g., objects in the case of an assembly line process) receive the requested service, and (2) the rate of service performed at the station(s). The objective to be reached in solving this type of problem is to minimize the total monetary costs encountered in the operative system, i.e., the evaluated cost of the time spent in waiting for service (when a queue is formed) and the cost of the idle time of the servicing personnel when there is no demand for action. In the case of bottlenecks arising at production or assembly lines, the rates of arrival and of service are well defined since they are detectable data. Queuing models provide the desired solution to the most complex problems of this type.
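For the single-server case with random arrivals and exponential service times, the standard M/M/1 formulas make the trade-off between waiting cost and idle-time cost explicit; the rates and cost figures below are assumptions for the example.

```python
# Illustrative single-server (M/M/1) queuing sketch with assumed rates:
# balance the cost of customers waiting against the cost of server idle time.
def mm1_hourly_cost(arrival_rate, service_rate, wait_cost, idle_cost):
    rho = arrival_rate / service_rate          # server utilization (< 1)
    lq = rho ** 2 / (1 - rho)                  # average number waiting in line
    return wait_cost * lq + idle_cost * (1 - rho)

for mu in (10, 12, 15):                        # candidate service rates per hour
    c = mm1_hourly_cost(arrival_rate=8, service_rate=mu,
                        wait_cost=20.0, idle_cost=12.0)
    print(f"service rate {mu}/hour -> expected cost {c:.2f} per hour")
```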
(8) Competition. When two individuals are playing cards, the move made by one may influence the decision of the opponent. Each tries to maximize his total outcome. The rules of the game are conceivably well known by both players, who are in competition for winning the pot. The sequence of decisions is not known with certainty because human judgment is the basic element involved in the game. Hence, the estimation of what the competitor's next step will be precedes any game decision. There is always the risk of being wrong in this evaluation. A similar situation can be visualized in business. The players are the managers competing for customers in a free market. The game is played by making decisions in many fields, such as the definition of selling prices, the selection of the types of raw material to be used, the organization of the distribution system, the presentation of the products (e.g., wrapping), the implementation of advertising campaigns, the establishment of customers' credit policies, and the definition of the auxiliary benefits to grant to customers. When more than two competitors are "playing" this sort of business game, the formulation of the "rules of the game" becomes extremely difficult and cumbersome. In this case, the selection of the variables to be considered (controllable and uncontrollable) is a very arduous task, especially if the investigation is to be accurate. The theory of games gives good insights and suggestions for the description of problems of this type and for the preparation of the related models. Algorithms for business games are extensively utilized in business for training purposes. Managers, or potential managers, are grouped in teams playing, competitively, series of decisions, e.g., allocation of money for production, research and development, plant improvement, definition of selling prices in various marketable areas, and so on, simulating in this way what managers would have done in conducting that business. Each set of decisions could represent, for example, a quarter of a year in reality. The computer, utilizing a programmed model, determines the results obtained by the competitive actions of the players. Each printout issued by the computer serves as a base for the next decision since it carries the resulting reactions of the market (quantities of items purchased, inventory levels, etc.) as a consequence of the team's competitive decisions. By using this business simulation game approach, potential managers can become familiar with business problems and their solutions.

(9) Mixed Problems. In the real world it is quite rare that a complex problem can be solved by utilizing just one of the above-cited deterministic models. A distribution problem may be connected with scheduling the production in order to avoid queues and the formation of inventories. A transportation problem could be related to an assignment, a routing, or a sequencing situation. A maintenance problem could be linked with inventory or search needs. This class of mathematical models includes combinations of the above mentioned "pure" types.

5.3 Considerations on Priorities in Solving Problems
Managers know that problems appear in bunches. During the process of solving one, others emerge. While a solution to one trouble is being prepared, another demands to be solved. Large and small problems are continuously under consideration by the responsible decision maker. Managers must have a priority list, assuming that the appearance of problems is predictable. Executives not only must define a criterion for
choosing which problem-solving process should be handled first when several of them are detected and some are already under study, but they have to establish a general norm to observe. Is the rule "solve first the problems which were detected first" a good one? Apparently not, since there are cases in which the solution of a "second" problem also resolves the "first." Is the idea of simultaneously implementing two or more solutions a valid one? Probably, but it is technically impossible to do this in all instances. Is the suggestion to implement first those solutions that require more time, and then the others, sound? The answers to these questions lie in the state of need, which in its turn depends on the urgency of solving the key problem first. It is a matter of evaluating the foreseen outcomes of each solution in order to visualize the payoffs that can be expected. Which one will improve the general situation to a greater extent? The definition of the "key problem" depends on which will be the "key solution." This last is the leading concept for the establishment of priorities. Usually problems which involve the entire enterprise are considered to deserve precedence over those which are concerned with one sector or function. Also, problems affecting the main objective or the corporate policies must have the "right of way" over the others. In order to better visualize which problem-solving process has to assume the front position, it is appropriate to briefly discuss the "content" of the most commonly encountered queries in business and industry which can be solved by the use of deterministic models. Since problems having an influence on the enterprise as a whole have priority over those affecting only a section of it (such as a division, a department, an office), the following list of examples starts with the first type and concludes with the others [1].

Overall Planning. Operations research techniques can be utilized to solve problems concerned with the planning of the enterprise activity as a unique entity. General long-range plans developed in accordance with the firm's objectives can be programmed, as can plans for the establishment of criteria to be used for the best allocation of the available limited resources (i.e., the 8 M's). Also, the determination of the optimal policy and organization to be actuated in the long run can be defined by the use of operations research normative (or deterministic) models and by simulation.

Financial Plans. Budgetary restrictions can be defined by applying operations research models. The best and most appropriate accounting system can be designed and tested by using management science techniques. The same can be said for the implementation of control and feedback systems within the communication network. Financial and credit policies can be established through the application of algorithms or deterministic models and the use of simulation techniques.

Personnel. To assign the right man to the right work, and vice versa, is
a problem that must be solved for all jobs in the enterprise, at any level of the organization. Recruiting personnel to maximize the efficiency of all the firm's services is a complex problem that operations research techniques can solve. Incentive plans to minimize absenteeism and turnover can be designed with the help of mathematical or normative models and by simulation.

Research and Development. A complex problem may arise when a long-range plan has to be established concerning the organization of a research and development department. Questions such as how large it should be, how much its services must cost, which projects it has to take care of, and in which areas of knowledge and investigation it has to work can be answered via the utilization of algorithms or deterministic models.

Marketing. Problems concerned with, e.g., the shipment of products (or services), the appropriate geographical location of the receiving warehouses and their size and capacity, the consequent area of distribution, as well as the determination of the best position of retail outlets in accordance with the density of potential customers, can be solved by the use of management science models. Price policies can be tested, as well as the efficacy of advertising campaigns for given areas, periods of time, and types of media utilized. Promotion strategies can be defined and studied by using algorithms. Many other marketing problems can be solved by making use of deterministic operations research and simulation models.

Purchasing. The best type of raw materials to be used in planned production can be established mathematically when the technical characteristics, specifications, prices, lead times, and availabilities are given as original data, as well as the needed quantities and the manufacturing plans. Warehouses, and the definition of the conditions of supply, storage, and turnover, can be determined by means of operations research algorithms. The same can be said for establishing the most appropriate usage of auxiliary materials and their contribution (technical and/or in terms of value) to the final products. All the financial and bookkeeping problems, with reference to the purchasing of any type of material, can be handled by using computerized deterministic models or simulation.

Production. Mathematical models are extensively used to determine the best location, size, organization, capacity, and degree of automation of plants, departments, shops, offices, services, and so forth. Also, problems in scheduling, dispatching, and routing production and assembly operations are solved by the use of normative models. The indication of the best sequence to follow in performing the various manufacturing and assembly activities, as well as the handling of materials and of finished or semifinished products, can be optimized by the use of operations research models or by simulation.
There are almost no business or industrial problems that cannot be solved by the application of deterministic (and/or stochastic) models utilizing digital computers [1].

5.4 Utilization of Computers for Solving Managerial Problems
A common remark made by some managers who are not too familiar with deterministic models and management science principles is that a computer is indispensable for finding and implementing objective solutions to problems. The truth is that electronic computers can greatly help to solve complex quantified problems and/or those with voluminous input data. There are cases in which a normative model, having few variables, a restricted matrix, or a simple formulation, can be solved using a desk calculator.* For difficult cases, where the time and effort to resolve them by "paper and pencil" would become prohibitive, the use of a computer is welcome [23]. If the enterprise has a computer with a real-time operating system, managers and management science teams may have a console terminal in their offices with which they can interrogate the processor for computing formulas or partial calculations of complex mathematical expressions included in deterministic or normative operations research models [11]. Timesharing also allows managers to perform sensitivity analysis online, giving an immediate response to proposed solutions. The possibility of executing many thousands of calculations in a few seconds is a great help to the analysts seeking optimal objective solutions to problems. Managerial decisions can be reached in a very short time, giving the executive the time to ponder, without pressure, the courses of action to be taken. Consequently, managers can operate in a state of confidence, having the opportunity to consider choices in solutions, especially when computer simulation techniques are applied [28]. The advent of computers indeed had an effect on the use of mathematical models. The possibility of working out complex problems in a very short time brought about a considerable improvement in the quantitative formulation of queries as well as of their solutions [26]. Computers are tremendously useful tools in the design and development of operating systems for decision-making problems where limited resources (i.e., the 8 M's) have to be allocated in order to optimize (maximize or minimize) the objective function within a well-established set of restrictions.

* Students in management science classes learn to apply mathematical models to solve simple problems utilizing slide rules or desk calculators. This allows the student first to grasp the content of the model, and then to handle more complex problems using computer programs.
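The online sensitivity analysis mentioned above can be sketched by re-solving a small model while one input is varied; the product-mix data repeat the hypothetical example used earlier, and the scipy dependency is again an assumption.

```python
# Illustrative sensitivity-analysis sketch (assumed data): re-solve a small
# product-mix model while one resource capacity is varied.
from scipy.optimize import linprog

def best_profit(machine_hours):
    res = linprog(c=[-40, -30],                       # maximize 40a + 30b
                  A_ub=[[2, 1], [1, 3]],
                  b_ub=[machine_hours, 90],
                  bounds=[(0, None), (0, None)])
    return -res.fun

for hours in (90, 100, 110, 120):
    print(f"machine hours = {hours}: best profit = {best_profit(hours):.1f}")
```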
The rate of acceptance by managers of the use of operations research algorithms is also in direct proportion to the availability of real-time, time-sharing computer equipment, and to the accuracy that software and computer languages can provide with respect to the managerial need for decision making. It seems appropriate to reemphasize now the three major characteristics of the operations research techniques as generators of the analysis and systematization of business and industrial problems and of the formation of deterministic (and/or stochastic) models. These essentials are summarized as follows [1]:

(1) The system orientation concept must always be present in any problem description as well as in the finding of its solution. Managers have the responsibility of conducting business using the enterprise's organization as an operative system. All troubles, deflections from the norm, and organizational slips needing improvement have to be treated as "unbalanced states of the system" (i.e., as a negative transition period). Solutions consider the factors of production (the 8 M's) as the elements to be used for putting the system back into the desired conditions of productivity (i.e., through the positive transition period).

(2) The mixed knowledge team concept. It is necessary to have the combined contribution of qualified experts to properly define problems, collect and study pertinent data, design normative models, and experiment with or simulate cases to construct optimal solutions to quantified problems. The same type of mixed team can help the manager in implementing his decisions as final sets of actions.

(3) The quantitative expressions used to describe the query and the methods to be used to solve problems. Mathematical formulas are the best way to describe problems and to design operations research models. It is by virtue of this last characteristic that digital computers are so extensively utilized.

6. Will Computers Eliminate Managerial Decision Making?
6.1 The Present Expansion of Computer Utilization
Modern electronic technological advances have allowed the construction of faster and less expensive digital computers. These "human energy saving" devices, capable of dealing with enormous quantities of data and of making very complex computations, have brought about a progressively increasing expansion of their utilization. This trend will undoubtedly continue. Computer efficiency in performance and usage will improve, and the cost of operating such systems will decrease.
Moreover, the techniques developed to communicate with computers will advance so rapidly that soon managers will "talk" to an input apparatus, inquiring about data or computed results, and will receive an immediate audible output response. With "real-time, time-sharing" computer systems, executives can already interrogate and work on quantitatively expressed problems by transmitting the original data via a keyboard console. The computed answer is immediately returned to the inquirer, usually by means of an automatically typed message at the same device. Many of these secluded terminals (consoles), each working independently, can be connected to a computer by utilizing the existing telephone cable network. The immediate response obtained from the computer gives the operator the impression that the electronic machine is running exclusively for him, even if several hundred terminals are simultaneously connected with the central processor. This extremely fast way of obtaining returns from the machine is only possible because of the very short time required for the computation of data. The large capacity of a computer memory allows the writing in, the retaining, and the retrieval of very large amounts of data, as well as the utilization of computational subprograms standing by in the processor's storage. There are series of operations or calculations which can be designed and recorded in fixed patterns and entered in the computer memory. Any time a manager needs to use these "subroutines" he can insert the appropriate original data pertinent to his specific problem into the model. The processor and the computational portion of the computer utilize the input information within the frame of prefixed mathematical formulas. The calculations needed, for example, to issue a payroll can be prepared in advance, stored in the computer memory, and utilized, with the suitable set of original input data, any time they are desired. A bank can have subroutines available to make the various computations of financial interests, and/or for continuously updating the accounts of customers borrowing or depositing money. An enterprise may need to have an instantaneous inventory situation any time movements of materials are made. A subprogram, taking care of the supply and withdrawals of goods in all the warehouses, can furnish to any inquiring remote terminal the quantity level of the moved items and related information, such as the reordering dates, the cost involved, the value of the stored commodities, and many other statistical data useful to managers for their everyday decision-making responsibility. As already indicated, the advent of more sophisticated technology in computing machines, not only that concerned with the hardware and the memory size, but with the software as well and the concept of utilizing stored programs in a real-time mode, has made it possible to greatly extend the utilization of computers. Consequently, very many related
applications and concatenated works can be performed practically simultaneously. More and more automated procedures are now handled by computers. It is possible today to register a few original input data into the processor memory and to have as output a large series of objectively computed results. For example, an industrial firm can enter as input into a computer the demanded quantity of some well-specified items to be manufactured for a certain date. The printouts that could be obtained as output include bookkeeping analyses, billing registrations, printed invoices, the values of the changed levels of inventories from which components and parts for the final product have to be withdrawn, the updated reordering timetable for the warehouses, the appropriate set of orders to the production and assembly lines, the sequence of the mechanical operations to be followed, the indication of the tools needed at the various stages of the working procedure combined with the current series of manufacturing activities, the schedule for the distribution of the finished goods, the pertinent notification to the salesmen about their commissions, and all the numerous related financial and technical statistics which may be needed. Today any bank acts as a "clearing house" when dealing with checks to be paid and amounts to be acquired by any and all of its customers. Computers are performing this heavy task by calculating the various amounts in fractions of seconds. The result is that each account is instantaneously updated and kept in memory. At the same time, interest is computed and properly assigned to each account. Totals and statistics are also immediately available. There are many banks which have provided the headquarters and the branch offices with separate independent terminals, each directly connected with a central computer. This equipment allows tellers to promptly verify the consistency of any account while the customer is standing at the counter. When the transaction is accomplished, the computer is automatically informed of it, and all the accounting and financial ledgers are at once revised and the current values made available. Education has also greatly changed. The traditional way of teaching is being replaced by a computerized process. Each student can learn, sitting at a computer console, by answering questions which appear on the screen (TV type) of the terminal. The pertinent instructions are stored in the computer memory and retrieved by the student in accordance with his grade of study and the progress he has made in the subject. If the answer given by the inquiring person to the machine is appropriate, the next question appears at the console and the process continues. If the answer is incorrect, a set of explanations comes out at the terminal in visual form (on the screen) as well as on a typed sheet of paper. The interrogation of the computer can be performed by a large number of students simultaneously. At the end of the learning session exams can
also be taken by computer. Answers to the examination questions are immediately evaluated by the machine and the final grade calculated and issued. The advantage of this mechanized method of learning lies in the fact that each student can define his own path in assimilating new concepts. It is not a question of memorizing formulas or sentences; rather, it is a matter of understanding. Students can retrieve previous topics and review learned applications without being required to ask questions of the teacher in front of all their classmates. Students are in this way self-governing and self-controlling. A similar computerized setting can be established in any enterprise, where the responsible executives, foremen, warehouse directors, inventory and storage keepers, and financial, accounting, and distribution department managers can interrogate the continuously updated computer memory on the various statistics and facts pertaining to a particular department activity or to the entire operative system. The terminals located in the various firm offices can be used for communication and instantaneous transmittal of data. If copies are needed, temporary storage in memory can be utilized to allow the retrieval of the information by anyone interested at any time it is needed. The computerized airline reservation system is today a reality. Passengers, either by phone or by going to the appropriate office, can make reservations for certain flights. The airline ticket clerk sends (by a console) this information to a central computer (sometimes located thousands of miles away) which searches in its memory for an available seat on the requested flight. If it is not possible to accommodate the traveler's desires, the computer immediately provides alternate ways for the trip. When agreement is reached with the passenger, the computer books that seat for that flight in its memory. This type of investigation and transaction can be performed by a large number of simultaneously inquiring terminals. The computer provides an instantaneous response to each one of them. The widespread use of computers for supplementing human activities can be hypothetically extended. One can visualize a society where not only money is unnecessary, but almost any traditional form of personal communication for business transactions is unnecessary as well. Data stored in the computer memory could be for recording of happenings, for supplying elements for statistics or for other conceivable needs, and for collecting any type of information stored for future utilization. Each manager will also have at home a computer terminal which will become as indispensable a part of his life as the telephone is today. A person will utilize computers any time he feels the necessity to know something, to transmit or record facts and values, or to make any type of computation. Undoubtedly, electronic machines are penetrating deeper and deeper
into the business world as a technical necessity. No one denies that all the scientific achievements, all the successful space flights, and all the improved methodologies have been possible only by virtue of the availability of the appropriate electronic machines. Even the high standard of living that nations are enjoying today is indirectly due to the increasing utilization of digital computers. The recognized trend today, in business and industry as well as in social life, is toward ever greater usage of automated computing machinery.

6.2 The Future Utilization of Computers
Data can be retrieved from any computer memory when the relevant searching code is provided. Just as items are found by their qualifying numbers, so a person can be recognized by a digital code (e.g., his Social Security number). Consequently, it is possible to provide the holder of a bank account with a punched plastic card carrying his code and the other data technically needed to identify where (i.e., in which bank) his accounts are kept. With this document anyone can pass money from his account to any other one, as well as add capital to his own funds. The owner of this card can receive his salary without seeing it. The amount can be passed from the account kept in the computer memory under the firm's credit card code to the employee's personal account. He, the worker, will receive a slip of paper issued by a computer terminal, notifying him of the transferred money. The worker can now spend what he wants without having seen the currency. He can go to the supermarket, buy, and present his punched plastic card to the cashier. By inserting the customer's credit card into the appropriate slot and recording the total figure of the bill with the keyboard of the computer terminal located at the checkout counter, the corresponding quantity of dollars and cents is withdrawn from the purchaser's account and charged to the supermarket's account. It is possible to reserve seats at the football field or at a theater by informing the computer of the need for tickets. The game or show is indicated by the code number of the stadium or theater, the day and hour of the performance, and the type of desired seats. The computer will search its memory for the availability of the request. If it is not possible to satisfy the demand exactly, the processor will display alternative choices at the console screen, or type the information at the terminal. The inquirer can then make his decision. Payment for the tickets could also be made automatically. The computer will transfer the appropriate amount of money from the customer's bank funds to the stadium or theater account. Housewives will use the punched plastic card when ordering groceries
or merchandise by phone. Upon receiving the goods, payments will be made by withdrawing the appropriate amount from one fund and by electronically sending it to the bank carrying the store account. The "home type" of computer console will be greatly simplified in its usage and the terminal will be inexpensive. Computers will take care of almost all paper work by automatically utilizing models for the computation of data and the printout of results. A manager can retrieve from the computer memory information and data referring to a complex business or industrial problem that he has to solve. By collecting new facts or new insights, he can increase in number, or improve in quality, at any time, the existing backlog of data already stored in memory. The computer can be instructed to systematize those elements, summarize them, make statistical or mathematical computations, and so on, in order to provide the decision maker with a revised and always current set of pertinent original data and/or semifinished computational findings. When the manager considers that the time for decision has come, he can retrieve from memory all the computerized results and enter them in an "operations research decision-making model," obtaining an objective final solution to his problem. This approach can be applied to any type of managerial quantifiable problem. As is generally recognized, portions of business and industrial activities are, little by little, being continuously automated.

6.3 The Interference of Computers in Decision Making
In the near future the total utilization of computers in a firm could be a reality, but not at the Utopian level of a "push button factory" which produces commodities in such a highly automatized way that it does not need any human intervention or control. The increasing dependence of business on electronic calculating machinery is already greatly sensed. There are no concerns, no industrial activities, which are not already directly or indirectly related to operations performed by means of digital computers. Is the foreseen massive interference of electronic computers in business, and consequently in human life, an absolutely unrealistic assumption? Is it actually improbable that in the future social behavior will heavily depend on computers? The development of automation, as an ever more extensive substitution for human activities, is already a vivid reality. If the answer to the question "will computers become an indispensable element of human life?" is "yes," to what extent will electronic machinery participate in business and social activities? The example of the instantaneous updating of corporate and personal banking accounts by means of automatically adding and subtracting figures
to and from a backlog of money, which exists simply as a numerical series of digits, indicates only one of the many possible cases of computer utilization. A firm, a concern, as well as an individual, by technical necessity, are treated as items sitting in a storage room. An item can be moved, modified, and changed in its characteristics, one of which could be its monetary value, corresponding to the available money recorded in its or his banking account(s). The "depersonalization" resulting from this type of automatized and instantaneous administration is keenly sensed by individuals working in enterprises where computers are the logical substitute for human work. Financial intelligence is only a small part of the material stored in a computer memory. Data with reference to an individual's health, employment, cultural, political, and social activities can be recorded along with the well-known "census" type of data. With reference to an enterprise, a very great variety of information and statistics can be sent and kept available by the computer. Everything can be coded and stored in a very small portion of a gigantic computer memory. When needed, data can be extracted and utilized by managers or by other interested people. For a firm there are facts which are delicate, confidential, or reserved. Not all data can be obtained without formal permission of the owner. Who then can retrieve these data? Who has the right to investigate the private life of an enterprise, or of a person, by inquiring what is stored about it or him in a computer memory? For example, can the bank, the insurance company, or the car dealer, when a person is buying a new vehicle, read the existing financial records in the computer memory with reference to that customer? None of them can legally do so. By extending this concept, when something is bought the transaction must be preceded by a credit examination of the parties involved, since the payment is made by a computerized transfer of capital. If it is permissible to know everything about a customer before signing a contract (of whatever nature) by interrogating the computer (assuming that both parties know the other's code number), the ethics and the manner of conducting business will assume a new form. Any transaction would be preceded by this computer practice. Society will certainly resent this manner of conducting business. It would probably only be justified in the case of important and complex transactions. It would probably become a normal procedure to investigate all facts and data about a firm and its manager before having any meeting with him as representative of his enterprise. An attitude of reciprocal mistrust will be generally enforced. No one will have confidence in anyone without computerized proof. This could happen if, and only if, every individual, every enterprise, firm, shop, or institution of any size, is also
known by his or its computer numerical code. This knowledge will be part of the name. Consoles capable of being connected with the computers of a unique national (and one day international) operative system must always be available at the places where transactions are performed. The selection of the right computer memory holding the sought data or the required information will be made automatically by reading the various codes entered at the inquiring computer terminal. Computers will not only be used to keep track of business happenings from the statistical point of view, but, as they are already extensively utilized today, will solve various kinds of quantified problems. The managerial decision-making process is already heavily oriented toward the objective solution of problems using mathematical or deterministic models, or by performing simulations of any possible business situation. A great variety of standing-by computer subroutines and normative models will be available to every manager who needs to find an optimal solution to a problem of whatever character it may have, business, industrial, or social. Consequently, the skill of reasoning and of utilizing personal judgment, the interpretation of facts, and the consequent responsibility of deciding will pass from the subjective to the highly objective (i.e., computerized or scientific) approach. No matter what the problem is, the first attempt made by a responsible executive to solve it will be to utilize an existing computer program. Since the stored and generally available subprograms (e.g., the operations research deterministic or stochastic models) can be used by an immense number of operators in a large variety of instances, an increasing uniformity of solutions can be expected. This fact will also tend to diminish the human personality of a manager. He will rely heavily on computerized solutions since they are easily obtainable. It will suffice to insert the original data at the keyboard console, indicate the type of model to be used, the code of which is found in an ad hoc list of models, and obtain the objectively found optimal solution. What about the possible case of a computer failure? In a society mainly dependent on stored records and programs, what will the consequences be of computer calculation mistakes or of technical dysfunctions of the system? Such a disaster can have a tremendous impact on business and industry in general as well as on society. This happening will be similar to a natural cataclysm. A small power failure (even one not as severe as the blackout in New York State in 1965) may cause the destruction forever of precious factual records. True, auxiliary memories can be provided and technical methodology can be devised to avoid losses of data in these unfortunate cases, but these preventive policies will impose a highly complex computer system design, one which will be expensive and difficult to construct, organize, and maintain. The vulnerability of this gigantic system will be
tremendously increased by the size of its utilization. Sabotage of computers and other sources of misfortune can also be expected. Even industrial espionage can be visualized, since the records stored in the computer memory could refer, for instance, to governmental, state, and local activities, including, probably, some data on military achievements. Big corporations, powerful enterprises, research companies, universities, and key industrial firms may have data stored without knowing what has been sent to the memory by other related users of the computer system. Unauthorized withdrawals of data and information can be avoided by assigning specific reserved codes known only to the owner of the records. This procedure, still, does not ensure the absolute secrecy of the stored information. Computers are ever more invading the privacy of this increasingly mechanized society by performing many human activities (i.e., computing and memorizing). Society itself will determine the boundaries of this invasion. At this point it is appropriate to cite Norbert Wiener's warning that because computers work so fast men cannot be sure of being able to "pull the plug" in time. It will be the silent struggle between the technology applied to perfecting digital computers and the managerial utilization that will deviate or arrest the above-mentioned impressive trend. Hence, computers will never eliminate human personality or managerial decision-making responsibility, neither now, nor in the near or distant future. Computers will always be tools in the service of man, who is indeed a clever creature (. . . in fact, he developed the computer . . .).
Uses of the Computer in Music Composition and Research

HARRY B. LINCOLN
Department of Music, State University of New York at Binghamton, Binghamton, New York
1. Introduction . 73
2. Composition of Music by Computer . 74
   2.1 Music Composed for Performance by Traditional Means . 75
   2.2 Music Composed for Performance by Electronic Synthesizer . 78
   2.3 Real-Time Composition by Digital-Analog Computer . 85
3. Music Research Using the Computer . 88
   3.1 Problems of Music Representation . 88
   3.2 Music Theory and Style Analysis . 92
   3.3 Ethnomusicology . 104
   3.4 Thematic Indexing . 105
   3.5 Bibliographies and Information Retrieval Systems . 106
4. Automated Music Typography for Composition and Research . 107
Discography . 109
References . 110
1. Introduction
It is now some twenty-two years since the first use of automated procedures in music research. Although limited to sorting routines, Bronson's work in folk music [11] is nonetheless considered the earliest computer-oriented music research project. The use of the computer in music composition has almost as long a history. The earliest efforts, dating from the early 1950s, are summarized in historical surveys by Hiller [34], Kostka [61], and Bowles [9]. The piece Illiac Suite, published by Hiller and Isaacson in 1957 [37], was the first computer-generated music to be performed, recorded, and given wide publicity. Simon and Newell [74] were moved to predict in 1957 that "within ten years a digital computer will write music that will be accepted by critics as possessing considerable aesthetic value." In the years since then there has been less accomplished in music research than early optimists had predicted, and computer-generated music
compositions most certainly cannot be said to have been widely accepted by critics. But although the accomplishments to date may seem modest to both musicians and computer experts, it must be recognized that the problems are formidable. In fact, it can be argued that the slow progress has been due in part to computing problems more complex than appreciated by the musician and to musical problems more complex than appreciated by the computer expert. But it is not reasonable to be pessimistic, since that which has been accomplished to date will provide a base for increasingly significant research and interesting compositions in the future. Much has been written by both composers and researchers, but publications of research results are rather limited to date, as are recordings and scores of computer-generated music. Although there is undoubtedly work in progress which has not come to the author's attention, this article will attempt to indicate the wide range of effort in both composition and research by describing a number of representative projects.

2. Composition of Music by Computer

There have been two basic approaches to the composition of music by computer. In one the computer is used as a tool to provide output in the form of an alphanumeric representation of standard music notation or the actual printing of that notation. Then the notated music is performed by musicians using standard instrumental and vocal ensembles. In other words, the sound of the music itself would not inform the listener that a computer had played a role in its composition. In the second approach, which has become increasingly prominent in the past few years, the composer programs instructions for a digital computer whose output in turn serves as input to an analog computer and this, in turn, produces electronic sounds via loudspeakers or transmits the electronic signals to recording tape for later performance. The sounds produced may, in some cases, reflect an attempt to imitate existing standard instruments, or, more likely, they will be newly created sounds. The taped results may provide the total composition or may be used in conjunction with standard instruments and/or voices. Again, as with notated music, the sound of the electronically synthesized music would not inform the average listener as to whether the source was the more traditional electronic studio (such as the Moog synthesizer) or whether the electronic signals had been programmed via digital-analog computers. It is expected, however, that increased sophistication in programming will make possible greater flexibility and speed in electronic synthesis than is possible with the patch cords and keyboard of the present electronic studio.
2.1 Music Composed for Performance by Traditional Means
Although most of our attention in this article will be directed to composition by digital-analog devices, i.e., various types of sound synthesis, a brief summary is given of music composed by computer for eventual performance by traditional instruments. The most complete survey of computer composition to 1967 is found in Hiller [34]. The very earliest efforts at using the computer for composition were based on analysis of simple folklike melodies followed by varied syntheses of these melodies, developed from probability tables. Pinkerton [67] wrote a "banal tunemaker" for the generation of new nursery tunes based on his previous analysis of 39 nursery tunes. According to J. Cohen [18], J. Sowa developed "A Machine to Compose Music" that was based on Pinkerton's ideas. A brief summary of work by Olson and Belar is given by Hiller [34] and in more detail by the two researchers in earlier articles in acoustical journals [65, 66]. Their "composing machine" is described by Hiller as a prototype of the famous RCA Electronic Music Synthesizers. Olson and Belar obtained various frequency counts of pitches in eleven Stephen Foster tunes. From these melodies various transition probabilities were developed, and synthesized Stephen Foster tunes were generated. Further work in analysis coupled with stochastic procedures to invent or "compose" new melodies was carried out by Brooks, Hopkins, Neumann, and Wright [13]. Their results confirmed three predictions which are quoted by Hiller
[34]: (a) If the order of synthesis is too low, it results in note sequences not contained in and not typical of the sample analyzed. (b) If the order of synthesis is too high, it results in excessive duplication, in whole or in part, of the original sample. (c) Between these two extremes, synthesis results in tunes which are recognizable members of the class from which the sample was drawn. The most widely discussed early attempt at computer composition was the Illiac Suite for String Quartet by Hiller and Isaacson. The score of the work was published by Presser [37] and the composers give a detailed description of the preparation of the work in the book Experimental Music [38]. The piece is based on computer composition of music according to the "rules" of simple "academic" counterpoint. While most listeners found very limited inherent musical interest in Illiac Suite, the idea of machine-composed music attracted wide attention. The output of the piece was in the form of a music representation which had to be converted (by hand) into a musical score for performance by string quartet. For a recording of this piece see the Discography at the end of this chapter.
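The analysis-followed-by-synthesis procedure behind these early melody experiments amounts to tabulating how often each pitch follows each other pitch in a sample and then drawing new notes from those tables. The sketch below illustrates the idea at first order only; the toy sample melodies and all function names are invented for illustration and do not reproduce the actual procedures of Olson and Belar or of Brooks et al.

```python
import random
from collections import defaultdict

def build_transition_table(melodies):
    """Count how often each pitch follows each other pitch in the analyzed sample."""
    counts = defaultdict(lambda: defaultdict(int))
    for melody in melodies:
        for a, b in zip(melody, melody[1:]):
            counts[a][b] += 1
    return counts

def synthesize(table, start, length, rng=random):
    """Generate a new tune by choosing each successor in proportion to its observed count."""
    tune = [start]
    while len(tune) < length:
        successors = table[tune[-1]]
        if not successors:
            break
        pitches, weights = zip(*successors.items())
        tune.append(rng.choices(pitches, weights=weights)[0])
    return tune

sample = [["C", "D", "E", "C", "G", "E", "D", "C"],
          ["C", "E", "G", "E", "D", "C", "D", "E"]]
table = build_transition_table(sample)
print(synthesize(table, "C", 8))
```

A higher-order table (conditioning on two or three preceding notes) moves the output toward prediction (b) above, excessive duplication of the sample; a zeroth-order table moves it toward prediction (a).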
FIG. 1. Use of ML.ROW as an example of actual programming (the MUSICOMP calling sequence with parameters i, f, and j; e.g., with n = 12, i = 11, f = 2, j = 7 one obtains the 11th note of the retrograde version of a 12-tone row transposed upward a perfect fifth). From Hiller [34].
In 1958, Hiller began his directorship of the "Experimental Music Studio" at the University of Illinois, from which emanated several computer compositions by various persons for a period of some ten years. This work is described in some detail in Hiller's survey [34] and includes such well-known pieces as Hiller's Computer Cantata [40], his Avalanche for Pitchman, Prima Donna, Player Piano, Percussionist, and Prerecorded Tape, and HPSCHD, for one to seven harpsichords and one to 51 tapes, by Cage and Hiller [35]. Many of the compositions of this period, and especially those from the Illinois school, feature stochastic processes, controls on degree of probability and indeterminacy, etc. Some details of these stochastic procedures are given by Hiller in [36]. Figure 1 illustrates a subroutine in his compositional programming language known as MUSICOMP. Regarding this example he writes ([34] pp. 74-75): ML.ROW, for example, permits us to extract any pitch from any transposition of any of the four forms (original, inversion, retrograde, or retrograde inversion) of a row such as a 12-tone row. By setting up appropriate logical loops and entries into this subroutine, we are able to handle one important component process of serial composition directly. The three parameters I, F, and J must, in the actual calling sequence, be replaced by specific integers that represent the element of the row, the transposition of the row, and the form of the row, respectively. The calling sequence shown provides an explicit example of how this is done.
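The operation ML.ROW performs (selecting the ith note of a chosen form and transposition of a stored row) is compact enough to restate. The following is a hypothetical reimplementation in Python, not MUSICOMP itself; pitches are numbered 0-11, and form 2 is taken to be the retrograde so that the example in Fig. 1 can be reproduced.

```python
def row_note(row, i, form, transpose):
    """Return the i-th note (1-based) of a form of a 12-tone row, transposed by `transpose` semitones.
    Forms: 0 = original, 1 = inversion, 2 = retrograde, 3 = retrograde inversion."""
    if form in (1, 3):                                  # invert intervals around the row's first note
        row = [(2 * row[0] - p) % 12 for p in row]
    if form in (2, 3):                                  # retrograde: reverse the order of the notes
        row = row[::-1]
    return (row[i - 1] + transpose) % 12

row = [0, 11, 3, 4, 8, 7, 9, 5, 6, 1, 2, 10]            # an arbitrary 12-tone row for illustration
print(row_note(row, i=11, form=2, transpose=7))          # 11th note of the retrograde, up a perfect fifth
```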
Two of the works from the Illinois studio make use of the CALCOMP plotter as a means of communication to the performers or the performance
media. For example, in Three Pieces for Solo Percussionist by Herbert Brün, "the plotter draws standard symbols (triangles, circles, lines, and so forth) from the CALCOMP library. The language of the score is given by the distribution, size, and position of symbols on each page of the score; in effect, each page is a plot of dynamic level versus time. Once the performer is provided with a short introductory explanation of the notation, he is able to perform directly from the score ([34] p. 59)." In the Avalanche piece mentioned above, one of the instruments is a player piano whose roll has perforations cut along markings inscribed by the CALCOMP plotter. In the early 1960s important efforts in computer composition were undertaken in France by the composers Barbaud and Blanchard, working at the "Centre de Calcul Electronique de la Compagnie des Machines Bull," and by Xenakis under arrangements with IBM in Paris. Barbaud and Blanchard produced a composition in 1961 which utilizes the twelve-tone technique. They first randomly generated a tone row and then developed structures from the row using various combinatorial operations. The musical procedures parallel mathematical ones; for example, transposition by addition and subtraction, interval expansion by multiplication, inversion by sign change around a chosen point, and so forth. The results, expressed in a music representation, were then scored for performance; Barbaud's book describing this work [2] includes sample programs. Xenakis described his work in a series of articles in Gravesaner Blätter [83]. His earliest composition, Achorripsis, is available in score [82]. The role of the computer in some of his other compositions was not clear in some of the publications about them until clarified in discussions with Hiller ([34] p. 78). Several examples of computer music using various stochastic processes were heard at the Cybernetic Serendipity show in London in 1968, concurrent with the IFIPS meeting in Edinburgh. Kassler, in reviewing some of this music for Current Musicology, wrote ([48] p. 50): The penchant for utilizing rules from certain non-musical information theories seems to avoid the essential requirement of discovering a musical information theory. We seriously question that a theory derived from another universe can produce anything more than minimal compositional requirements. There may indeed be some convenient analogous properties between music and games, or between music and language (the 18th century noticed them), but it must not be forgotten that the data are different, that music will remain a universe in itself with its own inherent problems, even though the computer will play a part in deciding what these problems are to be.
Hiller's survey [34] includes descriptions of work in several other European countries. Especially interesting among these is the work of Papworth to solve some of the problems of "change ringing," that is, the various permitted permutations of ringing a set of bells that have been developed
over several hundred years of this activity in England. The basic problem, as described by Hiller ([34] pp. 84-85), follows: Given the numbers, 1, 2, . . . , n, representing church bells of different pitches in descending order, find rules for generating in some order all n! permutations or "changes" or subsets thereof. However, the following restrictions must be observed: (1) the first and last permutations of any sequence or subsequence must be the original row, 1, 2, . . . , n, which is known as a "round"; otherwise, no two rows may be the same; (2) in any row, no number may occupy a position more than one place removed from its position in the preceding row; (3) no number may remain in the same position in more than two consecutive rows.
Papworth was able to solve the system of change ringing known as "plain bob major." He was concerned, first, with proving that "plain bob major" is sufficient to generate all 8!, or 40,320, possible permutations of eight numbers, and second, with generating sample compositions starting with random numbers. Specifically, each successive "lead-end" (the first row of each treble lead) was tested against all previous lead-ends and stored if new. Alternately, it was rejected and a new lead-end generated. Papworth says that his greatest difficulty involved making a composition end with rounds at the correct point. To achieve this, he found it necessary to write "alternation routines" to have the composition come out right. (Hiller [34] p. 85)
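The restrictions quoted above are easy to state in code. The sketch below assumes nothing about Papworth's actual program: it checks the adjacency rule (restriction 2) and generates the simplest change-ringing pattern, the plain hunt, by swapping alternate pairs of bells.

```python
def legal_successor(prev, cur):
    """A row may follow another only if every bell stays put or moves one place (restriction 2)."""
    return sorted(cur) == sorted(prev) and all(
        abs(cur.index(b) - prev.index(b)) <= 1 for b in prev)

def plain_hunt(n):
    """Plain hunt on n bells: swap adjacent pairs, alternating the starting position each change."""
    row = list(range(1, n + 1))
    rows = [row[:]]
    for change in range(2 * n):
        start = change % 2                 # alternate between swapping (1,2)(3,4)... and (2,3)(4,5)...
        for i in range(start, n - 1, 2):
            row[i], row[i + 1] = row[i + 1], row[i]
        rows.append(row[:])
    return rows

rows = plain_hunt(4)
assert all(legal_successor(a, b) for a, b in zip(rows, rows[1:]))
print(rows[0], rows[-1])                   # the sequence begins and ends with rounds: [1, 2, 3, 4]
```

Generating all n! changes, as "plain bob major" does for eight bells, requires interleaving such hunting with occasional altered changes ("bobs"), which is where Papworth's lead-end testing comes in.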
2.2 Music Composed For Performance by Electronic Synthesizer
Probably the most striking development of the past decade in the field of music composition has been the wide acceptance of the electronic synthesizer. Almost every campus with a composition program has a Moog, Buchla, or other synthesizer as part of its equipment, and many composers who, until a few years ago, were content to work in other idioms have turned to the synthesizer as an important tool for composition. It is in no sense replacing the performer, but rather complementing him. The total range of musical sounds and means of expression has been expanded by the synthesizer. Indeed, "Moog," for example, is almost a household word, thanks to a famous record by Carlos, Switched on Bach [15], whatever opinion serious composers may have of that effort. It was inevitable that two streams of effort, computer composition and the synthesis of music, should merge. The merger has taken the form of computer control of sound synthesis by means of the digital-analog computer. The lay reader, if unfamiliar with this technology, would do well to begin with Mathews' book [60], which is a textbook and a manual for use with the MUSIC V programming language. Sample problems and tutorial examples are included. Figure 2 illustrates concisely the basic procedure for converting digital information to sound. A sound can be considered as a time-varying pressure
FIG. 2. Conversion of digital information to sound: a sequence of numbers from the computer becomes a sequence of pulses at the digital-to-analog converter, which a loudspeaker filter (0 to 5 kHz) smooths into the sound pressure wave (plotted as amplitude versus time in msec). From Mathews [60].
in the air, and these pressure functions which we hear as sound are generated by applying the corresponding voltage functions to a loudspeaker. The illustration shows how numbers stored in the computer memory are successively transferred to a digital-to-analog converter. For each number the converter generates a pulse of voltage whose amplitude is proportional to the number. These pulses are shown on the graph in the lower part of the illustration. The square corners of the pulses are smoothed with a filter (low-pass filter) to produce the smooth voltage function drawn through the pulses. This voltage, supplied to the loudspeaker, produces the desired pressure wave. Mathews continues this exposition of fundamentals by showing that it is necessary to have a sampling rate of twice the desired bandwidth. For example, to achieve the high-fidelity sound of a bandwidth of 15,000 Hz, it is necessary to have 30,000 samples per second. The meaning of this last figure for the computer is summarized by Mathews ([60] p. 6): We can now begin to appreciate the huge task facing the computer. For each second of high-fidelity sound, it must supply 30,000 numbers to the digital-to-analog converter. Indeed, it must put out numbers steadily at a rate of 30,000 per second. Modern computers are capable of this performance, but only if they are expertly used. We can also begin to appreciate the inherent complexity of pressure functions producing sound. We said such a pressure could not be described by one number; now it is clear that a few minutes of sound require millions of numbers.
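The arithmetic behind these figures is worth making explicit. A minimal illustration (not Mathews' code): the sample budget for a piece at a 30,000-per-second rate, and the computation of the raw samples of a single sine tone that would be handed to the converter.

```python
import math

SAMPLE_RATE = 30_000          # samples per second, i.e. twice a 15,000-Hz bandwidth
seconds = 180                 # a three-minute piece
print(SAMPLE_RATE * seconds)  # 5,400,000 numbers must reach the digital-to-analog converter

def sine_tone(freq_hz, duration_s, amplitude=0.5):
    """Compute the sample values of a pure tone at the chosen sampling rate."""
    n = int(SAMPLE_RATE * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]

samples = sine_tone(440.0, 1.0)   # one second of A-440: 30,000 numbers
```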
Mathews stresses the importance of an absolutely uniform rate of sampling, otherwise the equivalent of flutter or wow in an ordinary tape
recorder will result. He also describes the need for a buffer memory and control mechanisms to permit a uniform rate of transmittal of data ([60] pp. 31-33). Computer composition by means of sound synthesis can become a very expensive procedure in terms of computer time, since so much data is needed for even a few seconds of music. Mathews describes some alternative procedures, including the storage of samples in computer memory, a sample being read from memory rather than recomputed when that particular sound is needed. Although stored functions take memory space, they save time. Mathews summarizes this part of the argument with these comments ([60] p. 35): We have considered sound synthesis from the position of the computer and it has led us to stored functions. Now let us look from the composer's standpoint. He would like to have a very powerful and flexible language in which he can specify any sequence of sounds. At the same time he would like a very simple language in which much can be said in a few words, that is, one in which much sound can be described with little work. The most powerful and universal possibility would be to write each of the millions of samples of the pressure wave directly. This is unthinkable. At the other extreme, the computer could operate like a piano, producing one and only one sound each time one of the 88 numbers was inserted. This would be an expensive way to build a piano. The unit-generator building blocks make it possible for the composer to have the best of both these extremes.
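As a rough illustration of the unit-generator idea (a sketch only, not the MUSIC V code): an "instrument" is assembled from small building blocks, here an oscillator reading a stored waveform table and a simple envelope, and a "note" is merely a request to run that instrument for a given duration.

```python
import math

SAMPLE_RATE = 30_000
TABLE = [math.sin(2 * math.pi * i / 512) for i in range(512)]   # stored function: one cycle of a sine

def oscillator(freq, n_samples, table=TABLE):
    """Unit generator: scan the stored table at the increment that yields the requested frequency."""
    phase, out = 0.0, []
    for _ in range(n_samples):
        out.append(table[int(phase) % len(table)])
        phase += freq * len(table) / SAMPLE_RATE
    return out

def envelope(samples, attack=0.05):
    """Unit generator: linear attack and decay so the note begins and ends silently."""
    n, a = len(samples), max(1, int(len(samples) * attack))
    return [s * min(1.0, i / a, (n - i) / a) for i, s in enumerate(samples)]

def play_note(freq, duration):
    """A 'note' simply runs the instrument (oscillator plus envelope) for `duration` seconds."""
    return envelope(oscillator(freq, int(duration * SAMPLE_RATE)))

samples = play_note(440.0, 0.5)
```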
The unit generators in this system are assembled into what is called the orchestra, and the different subprograms are known as instruments. The subprograms perform functions which experience (including, for example, that of users of synthesizers such as the Moog) has shown to be useful. Among these are oscillators, adders, noise generators, attack generators, and similar functions found in electronic sound synthesis. The final principle for specifying sound sequences in the MUSIC V system is the note concept. Mathews pointedly remarks that in music "notes have been around for some time" and argues that for practical reasons it is necessary to retain the idea that the sound sequences of the instruments (of the synthesizer) must be chopped into discrete pieces, each of which has a duration time. Once a note is started, no further information is given to the note; the complexity of the instruments determines the complexity of the sound of the note ([60] p. 36). The above remarks can serve only as an introduction to the MUSIC V programs, which represent the present state of many years of work by this pioneer and leading researcher in the field. More will be said about recent work by Mathews in the area of real-time synthesis in Section 2.3 below. Two other writers, Divilbis and Slawson, give basic information on computer synthesis as well as offering variations or alternatives to the procedures described so far. Divilbis [21], after describing the computer
process needed to generate a sine tone of one second duration, continues: In general, musically interesting sounds are much more complex than the sine wave example . . . for example, a sound may be the sum of several "voices," each with a distinct pitch, waveform, and envelope. Under these conditions even a very high speed computer cannot calculate 10,000 sample points per second. Instead, the computer evaluates the function at the rate of approximately 1000 points per second and stores these calculated values on magnetic tape. The digital-to-analog converter converts 10,000 sample points per second quite independently of the computation necessary to obtain these points. We see now that the magnetic tape is a vital part of the system since it allows the computer to spend ten seconds (or longer) calculating the sample points for one second of music. The ten-to-one ratio cited is typical although the exact ratio depends on the complexity of the sound generated.
Divilbis carried out his work at the Coordinated Science Laboratory at the University of Illinois using a smaller computer than that available to Mathews and has put particular emphasis on economy of operation. In his article he outlines an approach and system somewhat different from Mathews' and one which has certain advantages and disadvantages. It has the advantage of offering some of the elements of a real-time system and the disadvantage of a very limited range of options. Slawson has written a description of a speech-oriented synthesizer of computer music [75] which provides a good introduction to some of the techniques of computer composition. He effectively answers a common question from musicians, namely, how can a synthesizer represent two different "voices" or "instruments" and thus be able to produce polyphonic music. He writes ([75] p. 97): Another troublesome, but on second thought obvious, fact is that the multiple voices in a contrapuntal composition reach the ears, not as "separate" pressure wave forms, but as a single varying pressure (or two varying pressures, one to each ear). At any given time, the instantaneous pressures arising in the several instruments are added algebraically to make up the single pressure that excites our hearing mechanism. Thus, the claim that any sound can, in principle, be synthesized applies as well to "combinations" of sounds. Any waveform is "single-valued"; it has only one pressure at a time.
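Slawson's point, that simultaneous voices reach the ear as one summed pressure, corresponds to a single addition per sample. A minimal fragment (illustrative values only, not Slawson's program):

```python
import math

SAMPLE_RATE = 30_000

def voice(freq, n):
    """One 'voice': a pure tone computed sample by sample."""
    return [0.3 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]

n = SAMPLE_RATE                                     # one second of sound
upper, lower = voice(659.26, n), voice(440.0, n)    # two simultaneous voices (E5 and A4)
mixed = [a + b for a, b in zip(upper, lower)]       # one pressure value per instant
```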
Slawson clearly states that he is writing about synthesis of computer music, and not about composition of music by computer, and he has voiced his objection to some of the claims and work of those who regard themselves as composers of computer music [76]. Slawson's synthesizer design is based on a model of speech production and he summarizes this "source-transfer function model" as follows ([75] pp. 107-108): Basic to the model are the assumptions that acoustic energy originates at a single point (the "source"), that the energy is modified in certain ways by a passive network of resonances, and that the source and the passive network are independent of each other; nothing the source does affects what the resonances do and vice versa. The ratio of the output of the resonances to the output of the source at all frequencies is called
the "transfer function." In other words, the transfer function is a description of what the resonance network does to any acoustic energy reaching it from the source. The template-like appearance of the transfer function, when plotted as a function of frequency, suggests the more graphic term "spectrum envelope," which is almost a synonym for transfer function. In the speech model, the energy source corresponds to the action of the vocal cords; the resonance network, to the throat and mouth. In the model as adapted in the present programs for the synthesis of music, the source is simply a train of pulses unconstrained in either frequency or amplitude. The pulse train may be made to be aperiodic for the production of noisy sounds. The resonance network (the transfer function or spectrum envelope) has four simple resonances (poles whose frequencies and bandwidths are also unconstrained). At present the transfer function can contain no antiresonances (zeros). Each voice in a segment of sound is assigned to one source-transfer function combination. The total number of variables to be controlled in each voice, therefore, is twelve: the source amplitude and frequency, the frequencies and bandwidths of four resonances, a mode variable that determines, among other things, whether or not the source is periodic, and a voice number. Each specification of these variables is fixed temporally by a thirteenth variable, the time in milliseconds between this specification of the variables and the next.
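The thirteen control variables Slawson lists map naturally onto a small record per time-stamped specification. The field names in the sketch below are paraphrases introduced for illustration, not Slawson's own labels.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class VoiceSpec:
    """One time-stamped specification of a voice in a source-transfer-function model."""
    source_amplitude: float
    source_frequency: float                       # pulse-train rate in Hz
    resonances: Tuple[Tuple[float, float], ...]   # four (frequency, bandwidth) pairs for the poles
    mode: int                                     # e.g., whether the source is periodic or noisy
    voice_number: int
    millis_to_next: float                         # time until the next specification takes effect

# 2 source variables + 8 resonance variables + mode + voice number = 12, plus the timing variable.
spec = VoiceSpec(0.8, 220.0, ((500, 60), (1500, 90), (2500, 120), (3500, 150)), 1, 1, 250.0)
```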
A synthesizer has, of course, the possibility of producing a vast range of sounds, many of them outside the scope of traditional music notation and instruments. Slawson, to illustrate a programming procedure, chose a sequence of events illustratable in standard musical notation and describes in detail the steps taken to synthesize the excerpt. Figure 3 shows his example ([75] p. 100).

FIG. 3. Synthesis of computer music (the two-voice excerpt as entered with SETUP, BEATI, VOICE, REST, and FINISH statements; each note specification gives octave and pitch, duration in beats, and loudness). From Slawson [75].

A concise summary of Slawson's extended description of this excerpt ([75] pp. 101-107) follows: The statement "SETUP 2" prepares the preprocessor to accept a two-voice texture. The statement BEATI 594 sets the "beat" to its initial value of 594 msec, or the equivalent of 108 beats
per minute. VOICE 1 signals the beginning of the first voice. The note A in the fifth octave is 5A (A-440); its duration is "1.5" beats, and its beginning and closing loudness is MF (mezzoforte). The dash (-) at the close of the seventh statement calls for a slight accent with a numerical equivalent of 1.5. From the limited information given here one can translate the program as it relates to the music example. Slawson adds much detail to this brief summary, and illustrates how one may call for glissandi, crescendi, and decrescendi over several events, precise control of staccato, and so forth. The various programs for music synthesis developed by Mathews are used by a number of composers. In some cases they have been modified according to the needs of a particular hardware configuration or to the idiosyncratic needs of a particular composer. One of the first variants of the MUSIC IV program was written by Howe and Winham at Princeton. Known as MUSIC 4B, it was designed for the IBM 7094 computer and was later revised into the MUSIC 4BF program (FORTRAN version) for third generation hardware. Randall, who uses this program at Princeton, has written speculative essays on various aspects of acoustics and music [69] and has had computer compositions on recent record releases [68]. Dodge's use of MUSIC 4B [22] stresses performance more than composition. More recently Howe has written a version, MUSIC 7, for use on the X.D.S. Sigma 7 computer [41]. Howe's broad outline of the way the program proceeds is generally descriptive of the MUSIC 4 and MUSIC 5 programs ([41] p. 0.1): The program normally makes three "passes" over the score. Each pass processes all of the data and transmits its results to the next pass by means of a scratch file. Each card in the score specifies one operation and 12 parameters or variables to the program. Pass 1 reads and prints the score, providing a record of the input data. The user may generate or modify his score by means of optional subprograms during this pass. The input data to Pass 1 is normally a deck of punched cards, but it may also be a file prepared at an ASR teletype. Pass 2 sorts the elements in the score into chronological order and performs a number of additional options, such as converting starting times and durations from beats into seconds. Finally, the revised score is printed showing all modifications. Pass 3 now reads the score and calls the user's orchestra to generate the music specified. Pass 3 prints a record of the performance showing optional function displays and a record of the maximum amplitude reached in each output channel during each time segment. Only Pass 3 requires a significant amount of computer time.
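In outline, the three passes amount to read-and-echo, sort-and-convert, and render. The following schematic sketch uses an invented one-line-per-event score format rather than the MUSIC 7 card format, and is meant only to make the flow of data between passes concrete.

```python
def pass1(lines):
    """Pass 1: read and print the score; one event per line: start_beat duration_beats instrument pitch."""
    events = []
    for line in lines:
        print(line)
        start, dur, instr, pitch = line.split()
        events.append({"start": float(start), "dur": float(dur), "instr": instr, "pitch": pitch})
    return events

def pass2(events, beats_per_minute=60.0):
    """Pass 2: sort into chronological order and convert beats into seconds."""
    sec_per_beat = 60.0 / beats_per_minute
    for e in events:
        e["start"] *= sec_per_beat
        e["dur"] *= sec_per_beat
    return sorted(events, key=lambda e: e["start"])

def pass3(events, orchestra):
    """Pass 3: call the user's 'orchestra' to generate the sound for each event."""
    return [orchestra(e) for e in events]

score = ["0 1 osc A4", "1 2 osc E5"]
rendered = pass3(pass2(pass1(score)), orchestra=lambda e: f"render {e['pitch']} for {e['dur']}s")
```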
Batstone [4] uses the MUSIC 4BF program in operation with a CDC 6400, with conversion done via a Librascope L3055 computer and a DACOFIL-V digital-analog converter. Final recording equipment includes a Dolby A-301 noise reduction system, and work is in progress on the ingenious idea of developing a programmed function to approximate the Dolby in record mode at the time the digital tape is written.
Whereas the work of Mathews (in the MUSIC V program) is based on mathematical simulation of oscillators, Hiller and Ruiz [39] propose to set up differential equations describing the motions of vibrating strings, bars, plates, membranes, spheres, and so forth, and to define the boundary conditions mathematically in such a manner as to synthesize a tone produced under varying conditions. Plucked and bowed string tones have already been simulated with varying degrees of success by Ruiz, and the sequence of steps may be summarized as follows [39]:
(1) The physical dimensions of the vibrating object and characteristics such as density and elasticity are used to set up differential equations describing its motion.
(2) Boundary conditions, such as the stiffness of a vibrating string and the position and rigidity of its end supports, are specified.
(3) Transient behavior due to friction and sound radiation is defined.
(4) The mode of excitation is described mathematically.
(5) The resulting differential equations are solved by means of a standard iterative procedure with the aid of a computer.
(6) Discrete values of the solution corresponding to the motion of a selected point of the object are written on a file of digital magnetic tape.
(7) The numerical values on this tape are converted into analog signals by means of a digital-to-analog conversion and recorded on audio tape.
(8) A few cycles of the solution are also plotted by means of a microfilm plotter in order to compare the visual appearance of the vibration with the sound of the tape.
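Step (5), the iterative solution of the differential equations, can be illustrated with the simplest possible case: an ideal string with fixed ends advanced by an explicit finite-difference scheme. This is a sketch under heavy simplifying assumptions; Ruiz's programs treat stiffness, damping, and the mode of excitation in far more detail.

```python
import numpy as np

def plucked_string(n_points=100, n_steps=2000, pluck_at=0.3, courant=1.0):
    """Ideal string, fixed ends: y(i, t+1) = 2y(i,t) - y(i,t-1) + c^2 [y(i+1,t) - 2y(i,t) + y(i-1,t)]."""
    x = np.linspace(0.0, 1.0, n_points)
    y = np.zeros(n_points)
    peak = int(pluck_at * (n_points - 1))
    y[:peak + 1] = x[:peak + 1] / x[peak]                 # triangular initial shape: the "pluck"
    y[peak:] = (1.0 - x[peak:]) / (1.0 - x[peak])
    y_prev = y.copy()                                     # string starts at rest (zero initial velocity)
    c2 = courant ** 2
    output = []
    for _ in range(n_steps):
        y_next = np.zeros_like(y)
        y_next[1:-1] = 2 * y[1:-1] - y_prev[1:-1] + c2 * (y[2:] - 2 * y[1:-1] + y[:-2])
        y_prev, y = y, y_next                             # end points stay fixed at zero
        output.append(y[int(0.2 * n_points)])             # record the motion of one selected point
    return output

samples = plucked_string()      # values that would be written to tape and converted to sound
```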
The programs developed by Ruiz for this project were written in FORTRAN, and more recently in COMPASS, the CDC-6400 assembly language. The project continues at the University of Buffalo under Hiller's direction. In some synthesis systems there is an important role for computer graphics. The "composer" can learn to work with a system of graphic displays which, even if entirely different from standard musical notation, will give him a visual description of the present state of his program or of the components making up a particular sound. Mathews and Rosler [62] developed a graphical language for portraying the "scores" of computer-generated sounds. Mathews' GROOVE system, described in Section 2.3, uses a CRT display as an important element in a real-time situation. Henke [33] has developed an interactive computer graphics and audio system at M.I.T., which emphasizes economy of operation and makes effective use of the CRT display. At present the emphasis is on control of timbre, using a program in FORTRAN IV known as MITSYN (Musical Interactive Tone Synthesis). A totally different approach to music "composition" by computer is
found in the work of Rosenboom. He makes use of the ability (inherent or teachable) of some persons to control their own heart rate, blood pressure, or brain waves. By using electroencephalographic alpha feedback as an input signal he produces a "music" which has been generated by the brain waves of the subject(s). In a summary paragraph he describes one of his procedures as follows ([71] p. 14): The system . . . includes the triggering of a sequence of sound events by the ARP synthesizer and associated equipment, rather than just on-off or amplitude indicating. Each synchronous alpha burst, sensed by a high quality analog multiplier and followed by a threshold detector in an EXACT voltage controlled sweep generator, by pairs of participants triggered a slow rising sweep of the harmonic series which slight, automatically induced sequence changes each time through initiation of voltages that determined the starting pitch of the resonant filter, provided by a sample hold circuit being fed filtered white noise, and some randomly introduced bell like accentuations of various harmonic tones, produced by shocking the resonant filters with narrow pulses, at the attack initiated by each alpha burst and throughout the sequence. The result was indeed a music which is that of inner space, etherial, yet soothing and representative of consciously experiencable universal quantifiers of tension-integrity universe.
2.3 Real-Time Composition by Digital-Analog Computer
In much of the work described thus far there is a significant disadvantage in music synthesis by computer compared to similar work in the electronic studio. This disadvantage has been the lack of real-time control of the resultant timbres, durations, attacks, and other components of the output. The composer in the electronic studio can manipulate his patch cords and various controls until he has the exact effect he wants. The computer composer has had to live with the necessary delays in getting his material processed. The development of real-time composition has been the subject of a major effort by several persons in the field. Two projects, those by Clough [16, 17] and by Mathews and Moore [61], are reported here. Clough [17], assisted by Leonard of Digital Equipment Corporation, has developed a system known as IRMA (Interactive Real-Time Music Assembler). In this system a segment of computer main memory called the "measure" contains acoustic samples that are treated as a digital loop, which is analogous to a tape loop. The contents of the measure are continuously and repeatedly converted to analog form and monitored by the composer-operator, who may affect the contents of the measure from the console typewriter without interrupting the conversion process. The measure may be modified by insertion and deletion of specified events, reverberation, volume control, change of starting point and duration on a specified or random basis, selected erasure of any measure segment, and execution of macro commands which the composer has previously defined. The user may also call, for example, for a "jumble" which delivers a series of loop
segments whose lengths are determined by random selection of values between progressively changing minimum and maximum limits.

FIG. 4. Block diagram of the GROOVE system (magnetic disk file, display, typewriter, A-to-D converter, and sampling oscillator). From Mathews and Moore [61]. Copyright © 1970, Association for Computing Machinery, Inc.

A more elaborate system, and one which requires special hardware, is GROOVE, developed by Mathews and Moore [61] at Bell Telephone Laboratories. Although described as a general purpose program, with potential applications to various automated processes, its initial application has been to control an electronic music synthesizer. Figure 4 ([61] p. 716) shows a block diagram of the GROOVE system. Important elements in the real-time situation of this system are the seven real-time inputs to the multiplexer and analog-to-digital converter. Four of these voltage inputs are controlled by potentiometers. The other three are from a three-dimensional linear wand, nicknamed the "joy-stick," which projects from a square box. Other equipment includes the input typewriter, a specially built piano keyboard input device, and an oscilloscope. Figure 5 shows a block diagram of the steps that can be followed in carrying out a real-time decision in the GROOVE program. The operator can change the value of a specified disk function without, however, losing the older function. After the new function is sampled and approved it can be made permanent and the old function removed. The actual modification of a function is carefully controlled by the various input devices and the results are not only heard but are visually observed on the oscilloscope. One of the important features of GROOVE is the flexible control of
"program time," summarized by Mathews and Moore as follows ([61] pp. 719-720): Coarse control of "program time" may be accomplished by typing TIME N, where N is a disk buffer. If N = 0, the computer will simply go back to the beginning of the disk functions. At any point, we may set a switch which will cause the computer to recycle continuously through the same disk buffer. We may also slow down the progress of program time by reducing the frequency of the interrupt oscillator. Or we may stop the progress of time altogether by throwing a switch which essentially tells the computer: "Don't progress time normally at all, but instead, use the value of a knob to give the current position of time within one disk buffer." The z axis of the 3-dimensional wand is drafted for this task, since moving it from left to right most resembles the perceptual task of moving a time pointer along an abscissa. Along with the visual display of the time functions and the perceptual feedback from our controlled system, we now have a fine control over the exact position of program time. This is a very powerful feature of the editing system.
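The "program time" controls described in this passage can be pictured as a small decision, made at each interrupt, about where the time pointer goes next. The following is a schematic sketch, not the GROOVE code; all names and the scaling of the knob are illustrative only.

```python
def next_time(t, buffer_len, recycle=False, freeze=False, knob=0.0, step=1):
    """Advance the program-time pointer by one interrupt period.
    recycle: loop within the current disk buffer; freeze: position time directly from a knob (0..1)."""
    if freeze:
        return int(knob * (buffer_len - 1))   # fine control: the wand or knob scrubs within one buffer
    t += step                                 # slowing the interrupt oscillator stretches real time per step
    return t % buffer_len if recycle else t

t = 0
for _ in range(5):
    t = next_time(t, buffer_len=100)
print(t)   # 5
```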
FIG. 5. Block diagram of the GROOVE program (entry by interrupt from the sampling-rate oscillator; reading of typewriter instructions and current knob values; unpacking and computation of one sample of the disk, periodic, and output functions; initiation of the CRT display; servicing of disk I/O requests; return from interrupt). From Mathews and Moore [61]. Copyright © 1970, Association for Computing Machinery, Inc.
Mathews and Moore point out that, using these features, it is quite possible to stop in the middle of a run, "tune up a chord," and then go right on. The present short description can outline only a few of the features of this system. The GROOVE program represents an important development in the effort to achieve real-time synthesis of musical sounds.

3. Music Research Using the Computer

3.1 Problems of Music Representation
The computer is a poor musician: it cannot read music. Although flippant, this statement poses one of the first problems confronting a person wishing to use the computer to solve an analytical problem. (An even more basic problem is precisely describing the problem to be solved, although the researcher may not be aware of this fact at first.) The notation of music, as it has gradually evolved in the West, is complex and two-dimensional. Each character depicts pitch, a "vertical" dimension in the notation, and rhythm, a "horizontal" dimension in the notation. The computer, designed for scientific and business applications, has no particular qualifications for use by the musician. The researcher must develop a way of translating musical notation into machine-readable language or, more precisely, into a music representation. There are almost as many music representations as there are researchers. It is interesting to study various representations and to see the solutions to the basic questions which quickly come to the fore. Should the representation be determined by the needs of the programmer and the characteristics of the programming language? Should the researcher encode only those elements of notation of interest to him in his project, or should he represent all elements of notation with the view of meeting possible future needs or the needs of others who might use his data? Finally, should he develop a representation that is easy to program but tedious to encode, or a representation more quickly encoded but more difficult to program? These are difficult questions. Some argue that the questions are not important because translation programs will make it possible to move easily from one representation to another. A comparison of just three languages, ALMA [32], Ford-Columbia [5], and "Plaine and Easie" [12], illustrates the complexities of translation. As might be expected, researchers working with FORTRAN show a preference for numeric representations. Baroni [3] at the University of Bologna encodes six-digit numbers, in fixed fields, for each note, with each digit standing for a different component of the note, e.g., duration, pitch, octave register, etc.
With such an encoding, using FORTRAN, it is relatively easy to make studies of any particular component. Encoding and keypunching a six-digit number for every note is, however, more time-consuming than with most other representations. Most representations attempt to make use of mnemonics wherever possible as a means of speeding encoding and also to make it possible for the researcher to "read" back the representation as a melody. A brief survey of several representations in terms of just two components of notation, pitch and duration, will show the variety of possibilities available to the researcher. If a mnemonic is used for pitch designation it is necessary to indicate an octave designation, since pitch names are repeated in each octave. Thus, for example, Asuar [1a] uses the solfeggio terms for scale degrees (Do, Re, Mi, Fa, etc.) and up to eight octave designations; MI 5 is fourth-space E. The next obvious mnemonic for pitch designation is simply the letter name of the note, and here again octave designations are necessary. In ALMA, a system developed by Gould and Logemann [32] as an expansion of Brook's "Plaine and Easie Code" [12], apostrophes precede notes for registers above middle C and commas precede notes for those below middle C. For example, 'E would designate the same note as MI 5 in Asuar's system. Wenker [80] uses plus or minus signs, instead of commas and apostrophes, to indicate pitches in octaves above or below the octave lying above middle C. In both systems duration is indicated by the numerical mnemonics of 1 for whole note, 2 for half note, 4 for quarter note, etc. These or similar mnemonics for duration are common to several representations. Jackson and Bernzott [46] use an entirely different mnemonic system for designating pitch. They simply number the keys of the piano 1-88 and designate any pitch according to its corresponding position on the piano keyboard. Pitches beyond the piano's range are represented by minus numbers for those in low register, and 89-98 for those in higher register. The number 99 is reserved to designate a rest. In this system, enharmonic equivalents are not distinguished, e.g., G sharp and A flat are numbered the same. If necessary, provision is made for a distinction, but it calls for a second pitch card. Jackson and Bernzott do not encode duration per se, but indicate it by assigning a fixed number of columns to various rhythmic values. For example, one measure of 4/4 meter would take 64 columns. It is apparent that this system calls for many more columns of information per note of music than others under discussion.
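The pitch-and-duration conventions just described are simple enough to express in a few lines of code. The following sketch is not taken from any of the projects above; it is an illustrative decoder for a simplified, hypothetical ALMA-like token in which an optional duration digit is followed by octave marks (apostrophes or commas) and a letter name, e.g. 2'E for a half note on the E above middle C.

```python
# Semitone offsets of the letter names within the octave starting at C.
LETTER_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
MIDDLE_C = 60  # MIDI-style number for middle C (a modern convention, not part of ALMA)

def decode_token(token):
    """Decode a token such as "2'E" or "4,,G" into (pitch_number, duration).

    Duration uses the mnemonics described in the text: 1 = whole, 2 = half,
    4 = quarter, 8 = eighth.  Each apostrophe raises the note by an octave
    above middle C; each comma lowers it below middle C.
    """
    i = 0
    duration = None
    while i < len(token) and token[i].isdigit():       # leading duration digits
        duration = int(token[:i + 1])
        i += 1
    octaves = 0
    while i < len(token) and token[i] in "',":
        octaves += 1 if token[i] == "'" else -1        # ' = up, , = down
        i += 1
    pitch = MIDDLE_C + 12 * octaves + LETTER_TO_SEMITONE[token[i]]
    return pitch, duration

print(decode_token("2'E"))   # (76, 2): half note, E above middle C
print(decode_token("4,,G"))  # (43, 4): quarter note, G two octaves below middle C
```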
One of the better-known music research projects using the computer is the "Josquin Project" at Princeton University, dedicated to study of the Masses of the Renaissance composer Josquin des Prez. Early stages of this project included the development of a programming language for Music Information Retrieval (MIR), first written by Kassler [50] for the IBM 7094 computer, revised by Tobias Robison, a professional programmer, and rewritten for the IBM 360 series by John Selleck, a research assistant at Princeton. The input representation for this language is known as IML (Intermediary Music Language), developed by Jones and Howe at Princeton. In the IML-MIR system the convention of indicating duration by the mnemonics of 1 for whole note, 2 for half note, etc., as already described above, is adhered to. However, pitch is encoded by a numerical designation of staff position, that is, a numbering of staff lines and spaces from ledger lines well below the bottom line of the staff to ones well above the staff ([50], p. 304). This dispenses with the convention of octave registers found in several of the representations already discussed. Probably the most widely used system for encoding music in machine-readable form is DARMS (Digital Alternate Representation of Music Symbols), generally known as the Ford-Columbia Representation, which was developed by Bauer-Mengelberg [5] as part of a larger project in automatic music printing, financed by the Ford Foundation through Columbia University. It is widely used not only for its efficiency and speed of encoding but also because it was the representation taught at the Binghamton Seminar in Music Research and the Computer in 1966. In Ford-Columbia, mnemonics are used for rhythms, but instead of numbers (e.g., 2 for half note) the note values are indicated by letter, for example, Q for quarter note, H for half note, etc. Pitch is indicated by staff-position codes, in a manner similar to IML-MIR (see above), but with a numbering system which is more easily remembered by the non-musician encoder, that is, simply 1, 3, 5, 7, 9 for staff lines and 2, 4, 6, and 8 for staff spaces. With the use of abbreviation conventions, the representation is very compact, with a single number often sufficing for a note. Because it is graphics-oriented, the Ford-Columbia system can be encoded by clerical help unskilled in music, although it is the author's experience that student musicians will do the job much faster, a statement which is probably true for any representation.
FIG. 6. Gould's encoding system for Gregorian chant: the sample string "3C VCBA". From Gould [31].
FIG. 7. Encoding system for Byzantine chant. From Schiødt and Svejgaard [72].
The usefulness of the Ford-Columbia Representation will be greatly increased with the completion of a software package in PL/I currently under preparation by Raymond Erickson under a fellowship at the IBM Systems Research Institute in New York. The three main programs in this package are:

(1) A syntax-error detector, performing a parse on the data to evaluate correctness of form of the raw dataset.

(2) Display of the "valid" (corrected) dataset in some two-dimensional form, illustrating vertical alignments encoded in the data-string and stratifying the information to facilitate checking of pitches, rhythmic values, etc., with the original score.

(3) Transformation of the data (still in string form) into a set of cross-indexed tables for musical analysis purposes.

Certain repertories of music use a notation, or call for transcription in a notation, which differs from the standard Western notation used in the above representations. The Gregorian chant repertory is still notated with special characters on a four-line staff. Gould [31] developed an encoding system for keypunching the Liber Usualis, the principal book of chant in current usage. His system is illustrated in Fig. 6. The 3C means a C clef on the third line. The letter V stands for the notational sign known as the Virga, and VC means a Virga on the note C. The B and A signify the notes by the same letters. Other more complex notational patterns are handled in a similar manner. Two Danish scholars, the musicologist Nanna Schiødt and the mathematician Bjarner Svejgaard [72], used a code system of letters for neumes and numbers for accent and rhythm neumes to encode Byzantine chant melodies. An example of this representation is shown in Fig. 7, in which I stands for Ison, E for Elaphron, P for Petaste, A for Apostrophus, and X for Oxeia. The rhythms are indicated by 1 for Bareia, 6 for Diple, and 8 for Tzakisma. Medieval notation has note values and groupings of notes rarely found later than the sixteenth century. Erickson, in the project described in the preceding paragraphs, has expanded the Ford-Columbia Representation to include symbols for longa, breve, and the various ligatures (groupings of two or three notes) found in this notation. Instrumental music in past centuries was often notated in what is known as tablature, in which notes are not used, but instead numbers and letters
FIG. 8. Wenker's representation (an encoded folk melody). From Wenker [80].
are written on lines of a staff. In the case of string instruments such as the guitar or lute, the lines stand for the strings of the instrument and the symbols indicate the position of the fingers on the string. Thus any system of encoding this music must translate the tablature into modern notation. Certain performance practices create difficulties in this work, since the duration of a tone was not always specifically notated but must be inferred from the notes (and harmonies) immediately following it. Hultberg has developed an intermediate program which reads an encoding of Spanish lute tablature and translates it into Ford-Columbia Representation. A vast repertory of lute music from the Renaissance awaits transcription into modern notation. With the development of automated printing of music, Hultberg's intermediate program and tablature encoding system would make possible more economical publication of this material. The whole field of ethnomusicology raises complex problems in music representation. The field worker and the person transcribing recordings into modern notation must indicate microtones and other instances of performance and musical tradition not encompassed in Western notation. Ethnomusicologists are not in complete agreement among themselves on the notation of music from other cultures. An idea of the complexity of the problem confronting these researchers can be gained from an example of Wenker's representation [80], as illustrated in Fig. 8. The use of portamento (P), hold (H), slightly flatted pitch (-), quarter tone flat (-2), etc., is illustrated. If required, Wenker's representation allows for the designation of pitch in terms of cycles per second, as well as in cents or savarts. The above remarks on music representations have made comparisons only on the basis of the encoding of pitch and duration (rhythm) of a note. The variety of approaches among the different systems becomes even more apparent when considering such other components as accidentals, groupettes (triplets, etc.), slurs, ties, expression marks, etc. Another system of entering music information is the use of a piano or organ keyboard linked to an analog/digital computer. Recent work on this possibility has been carried out at the University of Utah by Ashton [1] and by Knowlton [50a].
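The logarithmic pitch units mentioned above are easy to compute from a frequency measurement, which is essentially what a representation such as Wenker's records when cycles per second are used. The brief sketch below is only an illustration of the arithmetic; the reference frequency of 440 Hz and the function names are choices made here, not details of any of the representations discussed.

```python
import math

A4 = 440.0  # reference frequency in cycles per second (an assumption for this example)

def cents_above_reference(freq, ref=A4):
    """Interval between freq and ref in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(freq / ref)

def savarts_above_reference(freq, ref=A4):
    """Interval between freq and ref in savarts (1000 times log10 of the ratio)."""
    return 1000.0 * math.log10(freq / ref)

# A pitch measured at 446 Hz lies about 23.5 cents (or 5.9 savarts) above A = 440 Hz,
# the kind of microtonal deviation an ethnomusicologist might need to record.
print(round(cents_above_reference(446.0), 1), round(savarts_above_reference(446.0), 1))
```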
3.2 Music Theory and Style Analysis
A wide range of activities is encompassed in the broad area of music theory and style analysis. While there are differences between the two fields, there is much overlapping in procedures and techniques, and for our purposes the two are discussed together. Both the music theorist and the musicologist make use of a variety of analytical tools which lend themselves to computerization. Among these are interval counts; citations and descriptions of such components as particular rhythmic patterns, specified intervals, or uses of accidentals; statistical counts of chord roots, inversions, or variants; and correlations of text accents with metrical or agogical accents, to name but a few of the many attempts to use the computer effectively as a tool for research. The amount of data and the range of analysis may vary greatly from one project to another. The computer can be as useful for simple operations on thousands of short melodies in a thematic index as it is for multiple analyses of one particular composition. To date, most publications of computer-assisted analysis have stressed methodology, as might be expected in a new field. Probably the most difficult problem confronting the theorist is not how to measure, but what to measure. Because of the demands for precision in this work, the researcher is obliged to define his problem in the most specific terms, and it is this precise structuring of the problem and the means of attacking it which has proved a stumbling block in many projects. Music is one of the arts, and objective measurement, while superficially easy (anyone can count intervals), can be meaningful only in a carefully defined and structured problem. Researchers are also handicapped by the lack of a consistent terminology for the components of music. There are several widely used terminologies for descriptions of vertical chord structures, to say nothing of several schools of thought on chord functions. Because of the multiplicity of problems one might be led to agree with Kassler, who writes [49]:

. . . it is easier today to keypunch music than to process significantly the musical data so obtained. Would not effort expended to construct suitable theories be a more progressive investment at present than participation in standards committees [to develop a common music representation]? . . .

The survey which follows will show the range of historical periods, types of problems, and approaches to their solutions which can be found among contemporary researchers. Compositions from all historical periods have been the subject of computer-assisted studies. One of the earliest repertories is Gregorian chant, with some two thousand melodies and a long history of change and development. These melodies were often borrowed by polyphonic composers throughout the middle ages and the renaissance, and the borrowings were not only from the beginnings of melodies but also from internal phrases. Gould's encoding system for keypunching the Liber Usualis has already been discussed. Gould set up tables
of various patterns of two to seven notes (these groupings are known as neumes in chant notation) and showed how a simple alphabetical representation of these patterns could be used as basic material for analytical work. Selleck and Bakeman [73] used a few chants as material for a pilot project in analysis. They were interested in the identification of melodic patterns or "cells" which may recur in various places in a chant melody. Their first analysis compared the content of every phrase with that of every other phrase, yielding an indication of identity (represented by one) or nonidentity (represented by zero). Further tests cited occurrences of large cells made up of groupings of small cells usually two or three notes in length. One of the interesting results of their work showed that "in a larger chant such as the Mode V Credo with 43 phrases, almost none of which are identical, the identification by machine of the cells and their arrangement afterwards reveals the same three 'ur-phrases' for this piece." It was their experience that computer techniques "not only allow the musical analyst to ask questions, the solutions of which would otherwise be beyond the range of practicality, but the data so generated often suggest new approaches, new problems that would not be suggested by the original material itself." Schiødt and Svejgaard, in their work with Byzantine chant as cited above, also looked for configurations of neumes common to various chants. A particular configuration is known as a "formula" and a long chant is made up of certain arrangements of formulas. It was found that the strict laws of the formula scheme were abandoned at the textual and melodic climax of the chant. The computer can be queried for the location of any particular formula in the total body of chant melodies. For example, "if we ask the computer to find A, B, C, and D in the encoded material, the answer from the machine concerning this hymn will be Hymn No. 32 A.a.7, B.d.11, A.a.23, etc., meaning: In hymn No. 32 the formula A arrives on neume (or neume group) No. 7 and starts on the tone a, formula B arrives on neume No. 11 and starts on the tone d, etc." The musicologist working in medieval and renaissance music has a host of problems in collations of manuscript sources, identification of cantus firmi, searches for recurring rhythmic patterns, and questions of borrowings, to say nothing of the broad problem of describing a particular composer's style. The computer is being used in a remarkable variety of projects, ranging from simple statistical counts to complex analyses of harmony, counterpoint, rhythm, and structure.
An example of the varieties of information to be sought in a particular body of music or repertory is provided by the projects of Karp [47]. His recent work with medieval melodies illustrates several types of searches of the simply encoded material. The program is designed to analyze music set to "lyric" texts, i.e., stanzaic forms with not more than 20 lines of verse per strophe. The output includes a line-by-line printout of the music. If the musical structure is different from the poetic structure, the printout is twofold, first by musical structure, then according to poetic structure. Various factors contributing to rhythmic design are analyzed, including the degree of floridity for the entire piece and for each line, a comparison between the floridity of the opening and concluding sections, the degree of rhythmic homogeneity, and the increase or decrease of motion from beginning to end of a phrase. Modal structure is indicated and the melody is identified by class, range, and tessitura, the latter with peak and low point recorded. Percentage summaries of pitch content are provided for each work, and a series of probability profiles will be generated for each class of melody, together with information regarding melodic borrowing. Karp began his work using FORTRAN and has continued with that language, although he recognizes that other languages lend themselves better to the string manipulations called for by his analysis. Brender and Brender also work with medieval music, concentrating on the complex problems of mid-thirteenth-century notation [10]. In this notation the duration of a note is determined not only by its graphic form but also by its context with other notes, and the rules for transcription into modern notation are quite complicated. The Brenders encode the music with a simple representation which describes the graphic arrangement of the notes. The encoder makes no decision regarding the rhythmic pattern to be derived. This is left to the computer program, which uses a technique called "sliding pattern match"; the output cites the rule number which was applied to achieve each grouping. Once the transcription is achieved, several analytical programs are employed. Among these are measurements of voice crossings, tabulation of melodic intervals in each part, and a listing of harmonic intervals on a two-dimensional chart. The study of melodic intervals includes an interesting study of "average rate of movement." Brender and Brender write ([10], p. 206):

Each melodic interval is divided by the duration of the first note of each pair and the average taken over the whole melodic line. By analogy to physical terms, the interval represents a change of position and the duration the time for that change; hence the quotient is a sort of velocity or speed. It was expected that, averaged over the entire piece, this quantity would be close to zero, in some cases slightly negative, i.e., that, on the average, increasing interval rates are as common as decreasing interval rates. It was surprising to observe that this quantity was consistently negative. Only one out of the twenty-one lines analyzed had a positive value. This means that descending lines tend to change or move faster and over larger intervals than ascending lines. The motetus was normally the most negative, followed by the triplum, then the tenor.
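The "average rate of movement" described in this quotation is a simple quotient and average. The sketch below restates it in code; the representation of a melodic line as (pitch, duration) pairs and the sign convention (ascending intervals positive) are assumptions made for the illustration, not details taken from the Brenders' program.

```python
def average_rate_of_movement(line):
    """Average melodic 'velocity' of a line given as (pitch, duration) pairs.

    Each melodic interval (in semitones, ascending positive) is divided by the
    duration of the first note of the pair, and the quotients are averaged
    over the whole line.  A negative result means descending motion tends to
    be faster and/or larger than ascending motion.
    """
    rates = []
    for (p1, d1), (p2, _) in zip(line, line[1:]):
        rates.append((p2 - p1) / d1)
    return sum(rates) / len(rates) if rates else 0.0

# A toy line in quarter-note units: a slow ascent followed by a quick descent.
line = [(60, 2.0), (62, 2.0), (64, 1.0), (60, 0.5), (55, 1.0)]
print(round(average_rate_of_movement(line), 3))   # negative: descent dominates
```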
FIG. 9. Graphic harmonic analysis of Cruci Domini-Cruci Forma-Portare: a two-dimensional tabulation of triads by low-to-inner interval (rows) and low-to-high interval (columns), from unison through the octave. Note that the major triad (circled) occurs eight times, and that an octave with an inner fifth (boxed) occurs thirteen times.
In the harmonic analysis each triad is displayed on a two-dimensional chart (Fig. 9), with the left ordinate representing the low-to-inner interval and the horizontal ordinate the low-to-high interval of the chord. The resultant chart provides a graphic description of overall chord usage in each piece. In another study of medieval music, Erickson [24] has attacked the complex question of rhythm in twelfth-century music as represented in the Magnus liber organi of Leonin. This much-studied collection has been the subject of debate on interpreting the notation of rhythm. Using a modified Ford-Columbia Representation, Erickson developed a "phrase-structure analysis" program which uncovered a number of significant stylistic characteristics of this music. It was discovered "that (1) some ten formulae (incorporating at least three turning points) account for the beginnings of almost half of the more than 900 duplum ordines tested; (2) specific melodic contours are associated with specific pitch levels and tenor notes; (3) families of organa can be grouped according to the use of certain melodic formulations; (4) a given melodic cliche often recurs several times in the course of a single composition; (5) the phenomenon of organum contrafacts, hitherto unrecorded in the literature, is to be found in the Magnus liber." A distinctive feature of Erickson's work is his use of syntax-directed compiling techniques to transform the data as encoded (in string form) into internally stored tables. Specifically, he implemented, in PL/I, the Cheatham-Sattley compiler [23, 25]. Beginning with the Renaissance the problems of notation and rhythm are not as tangled as in medieval music, but there are other vexing questions which seem to lend themselves to computer-oriented procedures. The sheer quantity of manuscript and printed sources and the frequent confusion in attributions have always proved a challenge to maintaining control of data in hand card-file systems. The apparent stylistic homogeneity among composers, and within the total repertory of any single composer, has made the dating of works difficult. One must stress the word apparent in the previous sentence. Much progress has been made in defining the components of a musical style, and new tools for measuring them have been developed by theorists and historians. With carefully structured procedures one can define stylistic differences among compositions which past historians have tended to lump together into one style. In spite of this progress, however, there is still a strong element of intuition which enters into much stylistic analysis. The researcher using the computer does not want to sacrifice this element of intuition, and yet is compelled to define his problems in the very precise terms demanded by computer-assisted analysis. It can probably be argued that this more precise definition of problems has contributed as much to date as have the results showing on computer printouts.
The Josquin project at Princeton is a long-range effort to analyze the Masses of Josquin des Prez in order to define certain stylistic characteristics and help to develop a more accurate chronology of the works than is now known. Lockwood [58] has described efforts to determine those points in the Josquin Masses at which musica ficta might be required. This involves citing the locations of melodic leaps of the augmented fourth or the presence of that interval in a vertical sonority. Other tests are being devised to determine the range of each voice and the median note (pitch) of each voice as well as for the whole composition. There is evidence that a composer may adjust these matters slightly depending upon his geographical location or the singers available to him. By developing various statistical profiles of a genre of composition, one can argue for certain groupings and perhaps, eventually, establish the chronological sequence of a composer's works. This work is very tentative, no matter by whom it is being carried out, because there is yet so much to be learned about the validity of various measurements. An interesting attempt to apply certain standard numerical methods, already used in other disciplines, to musical style analysis has been tried by Crane and Fiehler [20]. They argue that three classes of characters may be distinguished: two-state characters, which are either present or absent in a composition; multistate characters, such as meter in a set of works that show several different meters; and continuous characters, those that can have any value within a certain range, such as beats per chord. Using standard formulas, the coefficients of association, correlation, or distance are measured. "The result of the affinity computations will be a matrix like a mileage-between-cities table, whose columns and rows are headed by the identifications of each work. At the intersection of row i and column j will be entered the affinity between works i and j ([20], p. 215)." Of the various ways available to show the distribution and clustering of points, the authors have chosen a dendrogram to represent this information graphically. Figure 10 illustrates the clustering of a group of twenty chansons. In their article on this work the authors also briefly describe other possible statistical techniques which can reveal new insights from stylistic data. Among these techniques are seriation analysis, factor analysis, and discriminant analysis.
FIG. 10. A dendrogram showing twenty chansons clustered according to style. Each horizontal line shows the coefficient of distance at which the two works or clusters below it join. A lower coefficient of distance indicates a greater affinity.
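The affinity matrix and dendrogram described by Crane and Fiehler can be approximated in a few lines of code. The sketch below is not their program; it assumes each work has already been reduced to a vector of continuous character values, uses Euclidean distance as the affinity measure, and applies a simple single-linkage agglomeration of the kind a dendrogram summarizes.

```python
import math

def distance_matrix(works):
    """Pairwise Euclidean distances between works, each a vector of character values."""
    n = len(works)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = math.dist(works[i], works[j])
    return d

def single_linkage(works):
    """Repeatedly merge the two closest clusters, reporting the joining distances."""
    d = distance_matrix(works)
    clusters = [[i] for i in range(len(works))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                gap = min(d[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or gap < best[0]:
                    best = (gap, a, b)
        gap, a, b = best
        merges.append((clusters[a], clusters[b], gap))   # one branch of the dendrogram
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges

# Four imaginary chansons described by (beats per chord, fraction of stepwise motion).
works = [(2.1, 0.80), (2.0, 0.78), (3.5, 0.55), (3.6, 0.60)]
for left, right, gap in single_linkage(works):
    print(left, "joins", right, "at distance", round(gap, 3))
```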
In an early project, with procedures overlapping those used in thematic indexing (see Section 3.4), Collins [19] encoded a number of madrigals by the English composer Thomas Weelkes. Weelkes' mannerism of borrowing melodic fragments from his own earlier works was the subject of a systematic search of the bass lines of a large group of works to identify borrowings of melodic contours. The computer was directed, in effect, to compare intervals 1 to 6 with intervals 2 to 7, then 3 to 8, and so on, then to compare intervals 2 through 7 throughout. When two segments of intervals were found to be identical, the citation would read, for instance, "Piece A:27, piece B:265," meaning that the 27th to the 32nd intervals of piece A were the same as the 265th to the 270th intervals of piece B. Further first-hand comparisons of the pieces are then necessary to determine whether the borrowing is simply a standard melodic formula or cliche pattern, or the signal of a more important and longer quotation from another work.
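Collins' comparison scheme amounts to sliding a fixed-length window of intervals from one piece along the interval sequence of another. A minimal sketch of that idea follows; the window length of six intervals comes from the description above, while the data representation (lists of signed intervals) and the function name are assumptions of this illustration.

```python
def matching_segments(intervals_a, intervals_b, window=6):
    """Report every place where a run of `window` intervals from piece A
    reappears verbatim in piece B, as (position_in_a, position_in_b) pairs
    (1-based, in the spirit of "piece A:27, piece B:265")."""
    hits = []
    for i in range(len(intervals_a) - window + 1):
        segment = intervals_a[i:i + window]
        for j in range(len(intervals_b) - window + 1):
            if intervals_b[j:j + window] == segment:
                hits.append((i + 1, j + 1))
    return hits

# Two toy bass lines expressed as interval sequences (in semitones).
piece_a = [2, 2, -4, 5, -1, -2, 2, 7, -5]
piece_b = [0, 3, 2, 2, -4, 5, -1, -2, 1, 1]
print(matching_segments(piece_a, piece_b))   # [(1, 3)]: intervals 1-6 of A recur at 3-8 of B
```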
Sixteenth-century secular repertories which have been the subject of computer analysis include the French chanson and the Italian madrigal and frottola. Studies by Hudson [43] and the present author [55, 57] are described under thematic indexing below (Section 3.4). Bernstein [8] began with a thematic indexing project on the chanson repertory but decided that a procedure analogous to the literary concordance would reveal more information than the thematic index. He chose 300 chansons for a pilot project which involved encoding the complete works, using CLML (Chicago Linear Music Language), which is a modification of the Ford-Columbia Representation. He gives the following detailed description of one of several programs developed ([7], p. 159):

The input data, compared to a table of relevant parameters, was decoded and stored in two parallel tables. One such table, in integer form, gives the actual note value, counting each semitone as an integer starting with the lowest note on the piano. The other table contains, in decimal form, the time at which the note will stop sounding. The unit in the latter table may be varied by the user. A third table indicates the ends of phrases and is similar in structure to the table of note duration. In the performance of harmonic analysis, the chords had to be examined each time a note changed in any of the voices. A chord was rejected for either of the following reasons: (1) if it contained less than three notes; or (2) if it included intervals other than thirds and fifths. If neither of these conditions prevailed, the root of the chord was determined by examining the intervals, concentrating on the interval of the fifth, and designating its lower member the root.
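The chord-screening and root-finding rules quoted here are concrete enough to restate in code. The following sketch is only one interpretation of that description, not Bernstein's program: chords are taken as sets of pitch numbers, a chord is rejected unless it contains at least three distinct pitch classes readable as thirds and a fifth above a single tone, and the root is taken as the lower member of the fifth.

```python
THIRD_OR_FIFTH = {3, 4, 7}   # minor third, major third, perfect fifth (in semitones)

def chord_root(notes):
    """Return the root pitch class of a chord, or None if the chord is rejected.

    notes -- integer pitch numbers (semitones), in any octave and voicing.
    A chord is rejected if it has fewer than three distinct pitch classes or
    if it cannot be read as thirds and a fifth above a single candidate tone;
    otherwise the lower member of the fifth (the candidate) is the root.
    """
    pcs = sorted({n % 12 for n in notes})
    if len(pcs) < 3:
        return None
    for candidate in pcs:
        above = {(pc - candidate) % 12 for pc in pcs if pc != candidate}
        if above <= THIRD_OR_FIFTH and 7 in above:
            return candidate
    return None

print(chord_root([48, 55, 64]))   # C-G-E voicing -> 0 (C is the root)
print(chord_root([50, 57, 65]))   # D-A-F -> 2 (D minor)
print(chord_root([60, 62, 67]))   # C-D-G -> None (contains a second; rejected)
```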
Bernstein indicates that he hopes to expand the analysis programs to permit identification of the roots of seventh and diminished chords, or even of incomplete triads in which two tones are given and the third implied. There has been very little use of the computer for analysis of music between the Renaissance and the twentieth century. Although the reasons for this are not clear, they may be due to the length and size of the major genres of these periods: encoding a complete Brahms symphony would be a real task. Also, the harmonic vocabulary becomes increasingly complex, and the analysis of chromaticism in the nineteenth century, for example, is fraught with differences of opinion regarding nomenclature and analytical tools. However, two projects, one in the Baroque and one in the Classical period, have been reported. Logemann [59] explored a problem that seems designed for computer analysis. Since the middle ages composers have written works in the form of puzzles in which the correct performance of the work is hidden by an enigmatic key signature, clefs, or veiled instructions (often in the form of a pun or conundrum) for performance. Sometimes the correct answer can only be found by "trying" a number of possibilities. Logemann worked with two canons (a canon is a form of consistent imitative counterpoint) from Bach's Musical Offering in which Bach indicated the transposition or the inversion of the second voice but omitted the sign indicating the point at which the second voice begins. Pragmatically, one simply has to try beginning the second voice at all possible entry points and selecting those (or the one) which lead to the best sound. After encoding the melody of the canon, Logemann translated the data into a table of numbers representing the pitches by assigning to each note its integer number of semitones above middle C. Tests were made of the distribution of intervals obtained at each possible entry point of the second voice.
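The trial-and-error procedure just described lends itself naturally to a short program: slide the second voice across every possible entry offset, form the simultaneous intervals, and score each offset by how consonant the resulting distribution is. The sketch below illustrates the idea only; the consonance set, the scoring, and the data layout are assumptions of this example, not Logemann's criteria.

```python
CONSONANT = {0, 3, 4, 7, 8, 9}   # interval classes treated as consonant for scoring

def score_entry_points(voice1, voice2):
    """Score each entry offset of voice2 against voice1.

    Both voices are lists of pitches (semitones above middle C), one value per
    beat.  For each offset, the simultaneous intervals are reduced mod 12 and
    the score is the fraction of consonances; higher is 'better sounding'.
    """
    scores = {}
    for offset in range(1, len(voice1)):
        pairs = list(zip(voice1[offset:], voice2))     # beats where both voices sound
        if not pairs:
            continue
        consonances = sum((abs(a - b) % 12) in CONSONANT for a, b in pairs)
        scores[offset] = consonances / len(pairs)
    return scores

# A toy canon: the second voice is the first, delayed and transposed up a fifth.
leader = [0, 2, 4, 5, 7, 5, 4, 2, 0, 2, 4, 0]
follower = [p + 7 for p in leader]
best = max(score_entry_points(leader, follower).items(), key=lambda kv: kv[1])
print("best entry offset:", best)
```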
Logemann illustrates, in music notation, the results obtained for differing degrees of "harmonicity" ([59], pp. 70-71). The project from the classical period to be reported is that of LaRue [52], who had already established the effectiveness of a large hand card system for clarifying the vast confusion in attributions among hundreds of symphonies by contemporaries of Haydn. With the help of Logemann and Berlind at New York University, he devised procedures for style analysis that included a computer program to cite changes in the articulation of rhythm. Points of change were tabulated in reference to measure numbers or finer points on a time line. When applied to the symphonies of Haydn, the initial work pointed to some interesting observations. LaRue writes ([52], p. 200):
Some researchers have attempted to develop computer-assisted systems for musical analysis which are applicable to music of any style. In some instances the emphasis is on the development of a research tool rather than describing its application to the music of a particular historical period. Forte, who is responsible for much of the fundamental work in this area, has developed a program for the analytic reading of scores [27] which, while designed for use with atonal music, illustrates logic and procedures which can be applied to a wide variety of styles. Using the Ford-Columbia Representation and the SNOBOL programming language, Forte has established various procedures to parse strings for blocks and segments of information and to identify accurately their temporal locations in the string. Forte writes ([27] p. 357): [Other programs] yield increasingly higher representations of structural relations. If the collection of pitch-class representatives in each segment of the output strings from the reading program is scanned to remove duplicates the resulting collection may be called a compositional set. Then, from a listing of all the compositional sets the analysis programs (1) determine the class to which the set belongs; (2) list and count all occurrences of each set-class represented; (3) compute, for each pair of set-class representa-
representatives, an index of order-similarity; (4) determine the transposition-inversion relation for each pair of set-class representatives; (5) list, for each set-class represented, those classes which are in one of three defined similarity relations to it and which occur in the work being examined; (6) summarize in matrix format the set-complex structure of classes represented in the work; (7) accumulate and retrieve historical and other informal comments in natural language.
Forte concludes that "any syntactic class or combination of classes could be defined as delimiters in a similar way, thus creating new analytic strata. It might be pointed out in this connection that aspects of an individual composer's 'style' can be investigated to any depth, provided, of course, that the researcher can specify the syntactic conditions with sufficient precision ([27], p. 362)." This concluding remark is, of course, the nub of the problem and the challenge to anyone using the computer for analysis of music. An ambitious set of programs for style analysis was developed by Gabura [29] and tested on compositions by Haydn, Mozart, and Beethoven. Gabura relies heavily on statistical procedures such as the computation of frequency distributions for pitch classes. Music theorists in general appear wary of statistical procedures, although it is the conviction of some that, when properly used, they can provide much helpful general information as well as objective confirmation of intuitive insights. Gabura shows a variety of interesting tabulations of results, including graphs, correlograms, and tabulations of pitch structures. Traditional systems of musical analysis, which work with reasonable satisfaction for earlier music, have proved inadequate for atonal and serial compositions of the twentieth century, to name but two styles which hardly fit the traditional tonal schemes of most music from the seventeenth through the nineteenth centuries. The newer tools of analysis for serial composition, for example, already include studies of combinatoriality, permutation schemes, and other procedures which lend themselves to computation. It is not surprising, then, to find contemporary theorists turning to the computer to test various analytical ideas, some of them of great complexity. Using the year 1910 as a dividing line, Jackson [45] studied a group of compositions in terms of their harmonic structure, with emphasis on the presence or absence of certain harmonies. His comparisons were in three categories: "(1) those involving the chordal or intervallic content of the pieces; (2) those involving the dissonance content; and finally (3) those involving what shall be called the 'recurrent chord content' ([45], pp. 133-134)." The program for chordal and intervallic content used a chord table and an interval vector table against which the various chord forms in a piece were matched. Studies of dissonance content were difficult because
there is no agreement on what constitutes dissonance. There was also a problem in handling sonorities which contained two intervals consonant in themselves, but which produced a dissonant clash because of their combination. Results were shown in terms of the frequency of dissonant intervals, expressed in percentages. The music of Anton Webern is the subject of two recent studies, both of them concerned with some aspect of the composer's use of the twelve-tone (serial) technique in the Piano Variations, Opus 27. Fiore [26], working with the Ford-Columbia Representation and SNOBOL, has analyzed the complete score for harmony as a function of the theme-and-variations form. Among other conclusions she demonstrates "the change in Webern's late works from an emphasis on the preferred tritone, half-an-octave, to the minor third in juxtaposition to the major seventh." Working with the same music, Fuller [28] developed the program CONHAN (CONtextual Harmonic ANalysis) to find the most important roots in a musical passage, arguing that there are "points of harmonic reference which stand out from the dissonant texture." The calculation of permutations of the twelve-tone row gives useful information for the analysis of music based on the twelve-tone system. The twelve transpositions of the row, along with the four basic orderings of the row (original, inversion, retrograde, and retrograde inversion), yield forty-eight permutations of the row. Within these permutations of a particular row one may find segments with identical pitch-class sets, and this information can be of value to both the composer and the theorist. Lefkoff [53] has developed programs which calculate and print recurrent pitch-class sets in various useful formats. Theorists of the twelve-tone system are also interested in properties and relations of sets of fewer than twelve pitch classes. A summary of some of these relations and their applications to combinatorial music systems is given by Howe in an article [42] which includes an analysis of a few measures of the fourth movement of Webern's Fünf Sätze as well as a FORTRAN program to calculate pitch structures of size 2-11 (semitones). In an unusual departure from traditional analysis Howe suggests a reworking of one measure of the piece to make it more representative of the analysis. In a remarkable concluding paragraph Howe writes ([42], p. 59): "The possibility of making such a simple revision in order to clarify the compositional procedures in this passage indicates that either our analysis or the composition itself is inadequate in its present form. In an analysis, we usually attempt to make the best possible sense out of a composition, and we feel free to adjust our concepts to clarify the piece; we can also adjust the piece to clarify our concepts. But in the latter case we have not given a sufficient explanation of the composition as it stands."
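The forty-eight row forms mentioned above are entirely mechanical to generate, which is why this calculation was an early and natural computer application. A minimal sketch follows; the function names and the representation of a row as a list of pitch classes 0-11 are conventions chosen for this illustration, and the example row is invented rather than quoted from any of the works discussed.

```python
def transpose(row, n):
    return [(p + n) % 12 for p in row]

def invert(row):
    # Invert about the first pitch class of the row.
    return [(2 * row[0] - p) % 12 for p in row]

def all_forms(row):
    """The 48 classical forms: prime, inversion, retrograde, and retrograde
    inversion, each at all twelve transposition levels."""
    forms = {}
    for n in range(12):
        forms[("P", n)] = transpose(row, n)
        forms[("I", n)] = transpose(invert(row), n)
        forms[("R", n)] = transpose(row, n)[::-1]
        forms[("RI", n)] = transpose(invert(row), n)[::-1]
    return forms

# A made-up twelve-tone row (not taken from any composition discussed here).
row = [0, 11, 3, 4, 8, 7, 9, 5, 6, 1, 2, 10]
forms = all_forms(row)
print(len(forms))            # 48
print(forms[("RI", 3)])      # one of the forty-eight orderings
```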
For centuries musicians and theorists have struggled with the problem of various tunings of the octave scale found in Western music. The established system of 12 semitones to the octave has often been challenged, and systems offering anywhere from 19 to 72 degrees have been proposed. Stoney [77] developed programs to test various systems and has drawn conclusions of interest to composers and theorists experimenting in this area. He classifies systems of equal temperament as either positive systems, that is, systems with wide fifths, or negative systems having narrow fifths, and further charts the most effective systems from 12 to 72 degrees. He concludes that "the most promising low-order systems are those of 24, 22, and 19 degrees. For these last three systems practical experimentation over an adequate period of time and utilizing suitably designed instruments would be required in order to assess their respective deficiencies and merits ([77], p. 171)."
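The wide-fifth/narrow-fifth distinction can be computed directly: in an n-degree equal temperament the best available fifth is the scale step nearest the fifth, and the system can be labeled by whether that step is wider or narrower than the target. The sketch below is an illustration only; it measures the fifth against the just fifth (a 3/2 frequency ratio, about 702 cents), a threshold convention assumed here rather than taken from Stoney's programs.

```python
import math

JUST_FIFTH_CENTS = 1200.0 * math.log2(3.0 / 2.0)   # about 701.955 cents

def classify_temperament(degrees):
    """Classify an equal temperament of `degrees` steps per octave by its fifth."""
    step = 1200.0 / degrees
    best_fifth = round(JUST_FIFTH_CENTS / step) * step   # nearest available multiple of a step
    error = best_fifth - JUST_FIFTH_CENTS                # positive means a wide fifth
    kind = "positive (wide fifth)" if error > 0 else "negative (narrow fifth)"
    return round(best_fifth, 2), round(error, 2), kind

for n in (12, 19, 22, 24, 31, 53, 72):
    print(n, classify_temperament(n))
```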
3.3 Ethnomusicology

Reference has already been made (see Section 3.1) to the special problems in music representation faced by the ethnomusicologist. Since he works with vast repertories of music, usually monophonic, the ethnomusicologist has often relied on large card-file systems to control his materials. He often works with many subtle variants of the same melody, variants transcribed from field recordings, and must devise systems for distinguishing among these variants as well as grouping them into clusters with common characteristics. An idea of the size of these files can be gained by noting the work of Dr. Ladislav Galko, Director of the Institute of Musicology at the Slovak Academy of Sciences in Bratislava, who has a manual system of 25,000 cards as well as some 200,000 melodies, on tape, awaiting transcription. The computer has had limited use by ethnomusicologists to date, in spite of the fact that, as noted in the Introduction, the earliest use of data processing techniques in music research was by a researcher in folk music. Lieberman [54] has reported on a project at the Institute of Ethnomusicology at UCLA. The purpose of the project was to clarify the concept of patet (mode) in Javanese gamelan music, and to learn how requirements of patet affect and guide group improvisation. While aware of the dangers of overuse of statistical methods, Lieberman points out a rationale for their use in studying the music of other cultures when he writes ([54], p. 187):

Statistics have the advantage, however, of objectivity; and since ethnomusicologists frequently deal with musical cultures as external observers rather than as native practitioners, objective techniques are welcome safeguards against unconscious superimposition of alien values, which seldom apply.
In addition to studies of frequency counts, programs were developed to cite two-, three-, and four-note patterns. To further assist in pattern search, "a program was developed which, once provided with an archetypical four-note formula, could recognize, isolate, and label direct or retrograde patterns with embellishments or extensions of any length." In another example of computer applications to ethnomusicology, Suchoff [78] has continued and extended the work begun by Béla Bartók, who was not only a great composer but also a pioneer in the systematic study of folk music. Suchoff's work in the systematic indexing of these materials is discussed in Section 3.4.

3.4 Thematic Indexing
The thematic index, first developed in the late eighteenth century, has been used in many forms, both in printed indices, usually of the works of a particular composer, and in private card files of the researcher. A thematic index, in addition to listing the composer and title of each piece of music, includes a quotation of the opening few notes of the piece in question, and thus is comparable to an index of first lines of poetry. If the researcher already knows the composer or title he has no problem locating the melody in question. However, in some cases a researcher wishes to trace a particular melody which he may simply have in his mind or, more likely, have found in a manuscript or printed source where it is unidentified. If he can characterize the unknown melody as a pattern described by letters and/or numbers, he should be able to consult a listing of melodies arranged by some ordering of these letters and numbers and find other instances of the use of his particular melody, if they exist. Several systems of ordering themes have been proposed, including simple alphabetization of the letter names of the notes, transposition of the theme into the key of C major and then identifying it by the names of the notes, or, more commonly, some scheme of denoting the sequence of intervals in the melody. The latter is usually preferred because it permits citation of transposed melodies which would otherwise be lost in an alphabetized file. This is illustrated in an example which shows two melodies which are identical except for transposition: the letter names of the notes differ, but the interval sequences are the same. The problem becomes more complex when a composer borrows a melody and makes only a slight change in the contour, or fills in a large interval with one or two smaller ones. Various schemes have been proposed for computing the broad contour of a melody, ignoring repeated notes, designating each note's relation to the first note rather than to the one immediately preceding, etc. Hudson [43] gives a good discussion of some of the alternatives in these procedures in his description of a catalog of the renaissance French chanson.
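The advantage of interval sequences over letter names is easy to demonstrate in code. The sketch below is a generic illustration of that point, not a transcription of any of the indexing programs discussed here; pitches are given as integers in semitones and the interval sequence is used directly as a sort and lookup key.

```python
def interval_sequence(pitches):
    """Successive intervals (in semitones) between the notes of an incipit."""
    return tuple(b - a for a, b in zip(pitches, pitches[1:]))

# The same incipit at two transpositions: letter names differ, intervals agree.
incipit_in_c = [60, 62, 64, 60, 67]        # C D E C G
incipit_in_f = [65, 67, 69, 65, 72]        # F G A F C, a fourth higher

print(interval_sequence(incipit_in_c))     # (2, 2, -4, 7)
print(interval_sequence(incipit_in_c) == interval_sequence(incipit_in_f))   # True

# An index keyed on interval sequences groups transposed statements together.
index = {}
for title, pitches in [("Frottola A", incipit_in_c), ("Madrigal B", incipit_in_f)]:
    index.setdefault(interval_sequence(pitches), []).append(title)
print(index[(2, 2, -4, 7)])                # ['Frottola A', 'Madrigal B']
```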
Some of the fundamental problems of ordering incipits in any repertory are discussed by Meylan [63] in his description of work with the fifteenth-century basse danse. Trowbridge, working with a group of 702 compositions in fifteenth-century chansonniers, has devised effective techniques for finding closely (but not identically) related incipits. He first attempted to represent an entire incipit by a single complex polynomial equation, but due to practical limitations of the mathematical routines as well as the complexity and variety of the melodic curves, the method failed to produce satisfactory results. In one of his conclusions he writes [79]:

Any mathematical representation of a melody must be an approximation based on certain musical assumptions. Some types of relationships must be judged more important than others and weighted accordingly. The solution of this problem poses the most difficult questions of a musical nature, since almost all generalizations about music fail in certain specific instances. One of my assumptions, for instance, has had to be that the direction of a melodic line at a given point is more important than the actual rate of change, but that the latter (the rate of ascent or descent) is more important than a particular rhythmic configuration at this point.
The present author [55, 57] has developed a large file of incipits (opening melodies) of sixteenth-century Italian music. Some 40,000 incipits have been encoded from the frottola, madrigal, and motet repertories. The computer extracts the interval sequence from the encoding (in Ford-Columbia Representation) of an incipit. Interval sequences are listed in numerical order along with information on composer and title. Interesting examples of borrowings within the frottola repertory have been cited [57]. The large data bank of melodies has been used for testing of a new music typography discussed below. A similar plan of indexing is used by Suchoff to catalog Eastern European folk melodies. In addition to questions of borrowings, he is interested in citing variants among peasant melodies as notated first hand by Bartók. Suchoff has used the GRIPHOS program for organizing and annotating the text materials in this large project [78].
3.5 Bibliographies and Information Retrieval Systems

The thematic indexing plans described above, while essentially information retrieval systems, pose problems peculiar to music in their demand for music representations. It is of course possible to use established procedures for developing bibliographies, concordances, and similar data banks and information retrieval systems which can prove useful in many situations. Two such projects are briefly described at this time. Nagosky [64] has developed the Opera Workshop Codex, whose object is "to place into a single volume as much objective information on operas as would be most useful to college and university opera and voice
programs." The data bank for this project is a thoroughly cross-indexed file of information on 250 operas, including details as to title, composer, date, type of opera, duration of the opera, size of chorus, details on the ensembles within the opera, and so forth, for some two dozen items of information. Nagosky proposes that this data file be accessible from terminals throughout the state or a larger area and thus provide information retrieval for large numbers of music students in different institutions. Willis [81] developed a data bank of specific factual information on all performances at Orchestra Hall in Chicago, including details on ticket prices, attendance, type of concert program, persons participating, and so forth. From these data he drew various conclusions about the history of the Hall during periods of changing tastes and, of course, his output provides detailed information for the interested researcher.

4. Automated Music Typography for Composition and Research
The development of automated printing procedures such as photocomposition during the past two decades has left music printing relatively untouched. Most music copy is still prepared by autography, music typewriter, or hand engraving. The fields of music composition and music research would both benefit greatly if economies in the music printing industry were to permit publication of serious works not presently feasible or profitable. Three methods of automated music printing are currently being explored: photocomposition, computer plotter, and special typography for the high-speed computer printer. Photocomposition of music is the subject of a project at Columbia funded by the Ford Foundation. Bauer-Mengelberg and Ferentz developed a music representation called DARMS (Digital Alternate Representation of Music Symbols), more generally known as the Ford-Columbia Representation, for the encoding of music to be printed (see Section 3.1). The Photon disk was engraved with some 1400 symbols for the printing of music, symbols including all the usual music characters, upper- and lower-case alphabets, and a wide range of slurs and ties. Work has begun on the complex programming necessary to justify right margins (a more difficult matter than with printed text), extract parts from scores or build a score from parts, etc., but to date the project remains incomplete. The possibility of using the computer plotter for printing music has attracted the attention of several researchers. Byrd [14], Gabura [30], and Raskin [70] have developed programs which do a very respectable job of music printing. Byrd's system is machine dependent (CDC 3600) and uses a program known as PLOTZ 9.
FIG. 11. Music printing by computer plotter. Courtesy of Professor J. Raskin.
It is, of course, difficult to draw smooth curves or slanting lines on the incremental plotter, but if the original is fairly large, reduced copy minimizes this problem. Figure 11 shows a sample of Raskin's music printing by plotter. Each of the three researchers using the plotter developed his own music representation, and unfortunately none of these representations is currently widely used by other researchers. The plotter also has the disadvantage of being slow (although cheap to run offline) and of course it is not as widely available on campuses as the line printer. A third possibility for the printing of music, the use of special type characters for the line printer, has been the subject of a project currently under the direction of the present writer [56]. Special type slugs have been designed in cooperation with the Glendale Laboratory of the International Business Machines Corporation in Endicott, New York. In designing the typography it was decided to work with the printer set for eight lines to the inch, thus spacing staff lines an eighth of an inch apart. This gives a half-inch staff (the distance between the outer lines), a size easily read in printout and yet permitting a reduction of up to 50% for publishing of thematic indexes by any photo-offset printing process. Figure 12 illustrates an example of the current state of this typography. Since the paper moves in only one direction through the printer, and since each music character is made up of two to six type pieces, the information for a line of music must be stored in a two-dimensional array and then the various pieces of type on each line are printed as the paper passes through the printer. Programming for the project has been carried out by Granger in the Computer Center at the State University of New York at Binghamton. With the development of more type characters the possible uses of the typography might include, in addition to the present thematic indexing project of the author, such applications as special library cards and the printing of scores.
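The two-dimensional buffering described here (compose a whole line of music in memory, then emit it row by row as the paper advances) can be illustrated with ordinary characters standing in for the special type slugs. The staff geometry, the character set, and the note placement below are all simplifications invented for this sketch; only the buffering idea corresponds to the project described.

```python
ROWS, COLS = 9, 48          # 5 staff lines plus the spaces between and around them

def new_staff():
    """A blank staff buffer: rows 0, 2, 4, 6, 8 are staff lines, odd rows are spaces."""
    return [["-" if r % 2 == 0 else " "] * COLS for r in range(ROWS)]

def place_note(staff, column, staff_position, head="O"):
    """Drop a note head into the buffer.

    staff_position uses the Ford-Columbia-like convention mentioned earlier:
    1, 3, 5, 7, 9 are the staff lines (bottom to top), 2, 4, 6, 8 the spaces.
    """
    row = ROWS - staff_position          # position 1 maps to the bottom line (row 8)
    staff[row][column] = head

staff = new_staff()
for col, pos in [(6, 3), (14, 5), (22, 7), (30, 8), (38, 5)]:
    place_note(staff, col, pos)

# The buffer is emitted top row first, as the paper passes the print hammers once.
for row in staff:
    print("".join(row))
```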
FIG. 12. Computer typography for music.
ACKNOWLEDGMENTS

The author gratefully acknowledges helpful discussions with M. Mathews and F. Moore at Bell Telephone Laboratories, L. Hiller at the State University of New York at Buffalo, J. K. Randall at Princeton, and M. Gallery and C. Granger at the State University of New York at Binghamton.

A DISCOGRAPHY OF COMPUTER MUSIC

Prepared by M. Gallery, Computer Center, SUNY/Binghamton, N.Y. The compositions listed with each record are only those using the computer. The letters enclosed in brackets indicate whether the computer was used in composition [C], sound synthesis [S], or both [C, S].

Angel S-36656 [C]
Yannis Xenakis: ST/10-1, 080262 for Ten Instruments
Angel S-36560 [C]
ST/4 for string quartet
Decca DL 79103 [C, S] Music from Mathematics (this was the first computer music record).
M. V. Mathews: May Carol; Numerology; The Second Law; Joy to the World (arrangement); Bicycle Built for Two (arrangement); Molto Amoroso
J. R. Pierce: Variation in Timbre and Attack; Stochatta; Five Against Seven Random Canon; Beat Canon; Melodie; Theme and Variation
S. D. Speeth
D. Lewin: Study #1; Study #2
N. Guttman: Pitch Variations
J. Tenney: Noise Study; Frère Jacques; Fantasia by Orlando Gibbons
Decca 710180 [C, S] Voice of the Computer
M. V. Mathews: Masquerades; Slider; Swansong
J. R. Pierce: Eight-Tone Canon
Jean-Claude Risset: Computer Suite from "Little Boy"
R. N. Shepard: Shepard's Tunes
Wayne Slawson: Wishful Thinking About Winter
James Tenney: Stochastic Quartet
Deutsche Grammophon Gesellschaft DG-2543005 [C]
Lejaren Hiller: Algorithms I, Versions I and IV
Heliodor HS-25053 [C] Computer music from the University of Illinois (note: this record has been deleted by the publisher).
Lejaren Hiller and Leonard Isaacson: Illiac Suite for String Quartet
Lejaren Hiller and Robert Baker: Computer Cantata
Heliodor 2549006 [C] Lejaren Hiller:
Computer Music for Tape and Percussion, Avalanche for Pitchman, Prima Donna, Player Piano, Percussionist and Prerecorded Tape. Nonesuch H-71224 [C, S1 John Cage and HPSCHD (for Harpsichord and ComputerLejaren Hiller: Generated Sound Tapes) Nonesuch H-71245 [C, S] Computer Music Quartets in Pairs J. K. Randall: Quartersines Mudgett : Monologues by a Mass Murderer Barry Vercoe: Synthesism Charles Dodge: Changes Nonesuch H-71250 [C, S] Charles Dodge: Earth’s Magnetic Field Vanguard VCS-10057 [C, S] Lyric Variations for Violin and Computer J. K. Randall: REFERENCES 1 . Ashton, A. C. Electronics, music and computers. P h D . dissertation, Univ. of Utah, Salt Lake City, Utah, 1970. l a . Asuar, J. V. (Director of Tecnologia del Sonido, Facultad de Ciencias y Artes Musicales, Universidad de Chile), interview with Mr. Asuar, February 1971. 2. Barbaud, P., Znitiation a la composition musicale automatique. Dunod, Paris, 1966. 3. Baroni, M., Computer studies of the style of renaissance four-voice instrumental canzone. Unpublished manuscript, Institute for Music History of the University of Bologna, 1970. 4. Batstone, P., letter to author dated February 23, 1971, describing use of MUSIC 4BF, University of Colorado a t Boulder. 6. Bauer-Mengelberg, S., The Ford-Columbia input language, in Musicology and the Computer (B. Brook, ed.), pp. 4 S 5 2 . City Univ. of New York Press, New York, 1970. 6. Beauchamp, J. W., A computer system for time-variant harmonic analysis and synthesis of musical tones, in Music by Computers (H. von Foerster and J. Beauchamp, eds.), pp. 19-62. Wiley, New York, 1969. 7. Bernstein, L., and Olive, J., Computers and the 16th-century chanson, a pilot project a t the University of Chicago. Computers and the Humanities III(3), 153-161 (1969). 8. Bernstein, L., Data processing and the thematic index. Fontes Artis Musicae XI(3), 159-165 (1964). 9. Bowles, E., Musicke’s handmaiden: Or technology in the service of the arts, in The Computer and Music (H. Lincoln, ed.), pp. 3-20. Cornell Univ. Press, Ithaca, New York, 1970. 10. Brender, M., and Brender, R., Computer transcriptions and analysis of midthirteenth century musical notation. Journal of Mt~sicTheory 11(3), 198-221 (1967). 11. Bronson, B. H., Mechanical help in the study of folksong. Journal of A m r i c u n Folklore 63, 81-86 (1949). 12. Brook, B., The simplified “Plaine and Easie Code System” for notating music. F o n t s Artis Musicae XII(2/3), 156-160 (1965).
13. Brooks, F. P., Jr., Hopkins, A. L., Jr., Neumann,
P. G., and Wright, W., An experiment in musical composition. I R E Trans. Electron. Comput. EC-6, 175 (1957). 14. Byrd, D., Transcription by plotter. Random Bits 5(9), 1, 6-8 (1970). Indiana University Research Computing Center, Bloomington, Indiana. 16. Carlos, W., Switched on Bach. Columbia Record No. MS 7194 (1969). 16. Clough, J., TEMPO: A composer’s programming language. Perspectives of New Music Fall/Winter, 113-125 (1970). 17. Clough, J., A report from Oberlin. Computer Music Newsletter No. 1, pp. 2-5 (1971). Music Division, Purdue University, West Lafayette, Indiana. (Mimeographed.) 18. Cohen, J. E., Information theory and music. Behavioral Science VII(2), 137-163 (1962). 19. Collins, W., A new tool for musicology. Music and Letters XLVI, 122-126 (1965). 20. Crane, F., and Fiehler, J., Numerical methods of comparing musical styles, in The Computer and Music (H. Lincoln, ed.), pp. 209-22. Cornell Univ. Press, Ithaca, New York, 1970. 21. Divilbis, J. L., The real-time generation of music with a digital computer. Journal of Music Theory 8(1), 99-111 (1964). 22. Dodge, C., The composition of “Changes” and its computer performance. Ph.D. dissertation, Columbia Univ. School of the Arts, New York, 1970. See D i s cography, above. 23. Erickson, R., A general-purpose system for computer aided musical studies. Journal of Music Theory 13(2), 276-294 (1969). 24. Erickson, R., Rhythmic problems and melodic structure in Organum Purum: A computer assisted study. Ph.D. dissertation, Yale Univ., New Haven, Connecticut, 1970. 26. Erickson, R., Syntax-directed compiling. Proc. AFZPS Eastern Joint Computer Conf. 1964, pp. 31-57. Reprinted in Programming Systems and Languages (S. Rosen, ed.), pp. 264-297. McGraw-Hill, New York, 1967. 26. Fiore, M., Webern’s use of motive in the “Piano Variations,” in The Computer and Music (H. Lincoln, ed.), pp. 115-122, Cornell Univ. Press, Ithaca, New York, 1970. 27. Forte, A., A program for the analytic reading of scores. Journal of Music Theory 10(2), 330-363 (1966). 28. Fuller, R., Toward a theory of Webernian harmony, via analysis with a digital computer, in The Computer and 2clusic (H. Lincoln, ed.), pp. 123-131. Cornell Univ. Press, Ithaca, New York, 1970. 29. Gabura, A. J., Music style analysis by computer, in The Computer and Music (H. Lincoln, ed.), pp. 223-276. Cornell Univ. Press, Ithaca, New York, 1970. SO. Gabura, A. J., Music style analysis by computer. Master’s thesis, Univ. of Toronto, Toronto, Canada, 1967. 31. Gould, M., A keypunchable notation for the Liber Usualis, in Elektronische Date* uerarbeitung in der Musikwissenschuft (H. Heckmann, ed.), pp. 2 5 4 0 . Gustav Borne Verlag, Regensburg, 1967. 3%’.Gould, M., and Logemann, G., ALMA: Alphameric language for music analysis, in Musicology and the Computer (B. Brook, ed.), pp. 57-90. City Univ. of New York Press, New York, 1970. 33. Henke, W., Musical Interactive Tone Synthesis System. Mass. Inst. of Technol., Cambridge, Massachusetts, December, 1970. (Users Manual, Mimeographed.) 34. Hiller, L., Music composed with computers-a historical survey, in The Compi~ter and Music (H. Lincoln, ed.), pp. 42-96. Cornell Univ. Press, Ithaca, New York, 1970,
56. Hiller, L. HPSCHD, Avalanche, and Algorithms. See Discography, above. $6. Hiller, L., Some compositionaltechniques involving the use of computers, in Music by Computers (H. von Foerster and J. Beauchamp, eds.), pp. 71-83. Wiley, New York, 1969. 57. Hiller, L., and Isaacson, L., ZZZim Suite for String Quartet, New Music Ed., Vol. 30, No. 3. Theodore Presser Co., Bryn Mawr, Pennsylvania, 1957. 58. Hiller, L., and Isaacson, L., Experimental Music. McGraw-Hill, New York, 1959. 59. Hiller, L., and Ruiz, P., Synthesizing musical sounds by solving the wave equation for vibrating objects. Unpublished manuscript. (Abstracted from P. Ruiz, Mus.M thesis, Univ. of Illinois, Urbana, Illinois, 1970.) 40. Hiller, L., and Baker, R., Computer Cantata: A study in compositional method. Perspectives of New Music 3(Fall/Winter), 62-90 (1964). 41. Howe, H. S., Jr., Music 7 Reference Manual. Queens College Press, New York, 1970. 42. Howe, H. S., Jr., Some combinational properties of pitch structures. Perspectives of New Music Fall/Winter, 45-61 (1965). @. Hudson, B., Towards a French chanson catalog, in The Computer and Music (H. Lincoln, ed.), pp. 277-287. Cornell Univ. Press, Ithaca, New York, 1970. 4.4. Hultberg, W. E., Transcription of tablature to standard notation, in The Computer and Music (H. Lincoln, ed.), pp. 288-292. Cornell Univ. Press, Ithaca, New York, 1970. 46. Jackson, R., Harmony before and after 1910: A computer comparison, in The Computer and Music (H. Lincoln, ed.), pp. 132-146. Cornell Univ. Press, Ithaca, New York, 1970. 46. Jackson, R., and Bernzott, P., A musical input language and a sample program for musical analysis, in Musicology and the Computer (B. Brook, ed.), pp. 130-150. City Univ. of New York Press, New York, 1970. 47. Karp, T., A test for melodic borrowings among Notre Dame Organa Dupla, in The Computer and Music (€I. Lincoln, ed.), pp. 293-297. Cornell Univ. Press, Ithaca, New York, 1970. 48. Kassler, J., Report from London: Cybernetic serendipity. Current Musicology No. 7, p. 50 (1968). 49.Kassler, M., Review of H. Lincoln, The current state of music research and the computer. [Orig. article published in Computers and the Humanities 5(1), (1970).1 Computing Reviews 11(12), 652-653 (1970). 60. Kassler, M., MIR-A simple programming language for musical information retrieval, in The Computer and Music (H. Lincoln, ed.), pp. 300-327. Cornell Univ. Press, Ithaca, New York, 1970. 60a. Knowlton, P. H., Interactive communication and display of keyboard music. Ph.D. dissertation, Univ. of Utah, Salt Lake City, Utah, 1971. 61. Kostka, S., The Hindemith String Quartets: A computer-assisted study of selected aspects of style. Ph.D. dissertation, Univ. of Wisconsin, Madison, Wisconsin, 1970. 62. LaRue, J., Two problems in musical analysis: The computer lends a hand, in Computers in Humanistic Research: Readings and Perspectives (E. A. Bowles, ed.), pp. 194-203. Prentice-Hall, Englewood Cliffs, New Jersey, 1967. 65. Lefkoff, G., Automated discovery of similar segments in the forty-eight permutations of a twelve-tone row, in The Computer and Music (H. Lincoln, ed.), pp. 147-153. Cornell Univ. Press, Ithaca, New York, 1970. 64. Lieberman, F., Computer-aided analysis of Javanese music, in The Computer and Music (H. Lincoln, ed.), pp. 181-192. Cornell Univ. Press, Ithaca, New York, 1970.
55. Lincoln, H., A computer application to musicology. Information Processing (IFIPS) 68, 957-961 (1968). 56. Lincoln, H., Toward a computer typography for music research: A progress report.
Paper presented to the International Federation Information Processing Societies, Ljubljana, 1971. 67. Lincoln, H., The thematic index: A computer application to musicology. Computers and the Humanities II(5), 215-220 (1968). 68. Lockwood, L., Computer assistance in the investigation of accidentals in renaissance music. Proceedings of the Tenth Cmgress of the Znternatwnal Musicological Society, Ljubljana, 1967. 69. Logemann, G., The canon in the Musical Offering of J. S. Bach: An example of computational musicology, in Elektronische Datenverarbeitung in G?+T Musikwissenschuft (H. Heckmann, ed.), pp. 63-87. Gustav Bosse Verlag, Regensburg, 1967. 60. Mathews, M. V., The Technology of Computer Music. M I T Press, Cambridge, Massachusetts, 1969. 61. Mathews, M. V.,and Moore, F., GROOVE-A program to compose, store, and edit functions of time. Commun. ACM 13(12), 715-721 (1970). 6.8.Mathews, M. V., and Rosler, L., Graphical language for the scores of computergenerated sounds. Perspectives of New Music 6(2), 92-118 (1968). 6s. Meylan, R., Utilisation des calculatrices electroniques pour la comparaison interne de repertoire des basses danses du quinzieme siecle. Fondes Artis Musicae X11(2), 128-134 (1965). 64. Nagosky, J., Opera Workshop Codex. Dept. of Music, Univ. of South Florida, Tampa, Florida, 1969. (Mimeographed.) 66. Olson, H., and Belar, H., Aid to music composition employing a random probability system. J . Acwust. Soc. Amer. 33, 1163 (1961). 66. Olson, H., and Belar, H., Electronic music synthesizer. J . Acwust. SOC.Amer. 26, 595-612 (1955). 67. Pinkerton, R. C., Information theory and melody. Sci. Amer. 194,76 (1956). 68. Randall, J. K., For recordings, see Discography, above. 69. Randall, J. K., Three lectures to scientists. Perspectives of New Music 5(2), 124-128 (1967). 70. Raskin, J., A Hardware independent computer graphics system. Master’s thesis in Computing Science, Pennsylvania State Univ., University Park, Pennsylvania, 1967. 71. Rosenboom, D., Homuncular homophony. Paper presented at the Spring Joint Computer Conference, 1971. (Mimeographed.) 72. Schigdt, N., and Svejgaard, B., Application of computer techniques to the analysis of byzantine sticherarion melodies, in Elektronische Datenverarbeitung in der Musikwissenschuft (H. Heckmann, ed.), pp. 187-201. Gustav Bosse Verlag, Regensburg, 1967. 73. Selleck, J., and Bakeman, R., Procedures for the analysis of form: Two computer applications. Journul of Music Theory 9(2), 281-293 (1965). 74. Simon, H. A., and Newell, A., Heuristic program solving: the next advance in operations research. Oper. Res. 6, 1-10 (1958); quoted in Framkel, A. S., Legd information retrieval, in Advances in Computers (F. L. Alt and M. Rubinoff, eds.), Vol. 9, p. 114. Academic Press, New York, 1969. 76. Slawson, W., A speech-oriented synthesizer of computer music. Journal of Music Theory 13(1), 94-127 (1969).
76. Slawwn, W., review of G. Lefkoff, Computer Applications in Music. (West Virginia Univ. Libr., Morgantown, West Virginia, 1967.) Journal of Music Theorg 12(1), 108-111 (1968). 77. Stoney, W., Theoretical possibilities for equally tempered musical systems, in The Computer and Music (H. Lincoln, ed.), pp. 163-171. Cornell Univ. Press, Ithaca, New York, 1970. 78. Suchoff, B., Computer applications to Bartok’s Serbo-Croatian material. Tempo LXXX,15-19 (1967). 79. Trowbridge, L., A computer programming system for renaissance music. Unpublished paper presented to the American Musicological Society, Toronto, November, 1970. 80. Wenker, J., A computer-oriented music notation, in Musicology and the Computer (B. Brook, ed.), pp. 91-129. City Univ. of New York Press, New York, 1970. 81. Willis, T. C., Music in Orchestra Hall: A pilot study in the use of computers and other data processing equipment for research in the history of music performance. Ph.D. dissertation, Northwestern Univ., Evanston, Illinois, 1966. 82. Xenakis, Y., Achurripsis. Bote & Bock, Berlin, 1959. 83. Xenakis, Y., Free stochastic music from the computer. Gravesaner Blaetter VI, 69-92 (1962).
File Organization Techniques DAVID C. ROBERTS Informatics, Inc.
Rockville, Maryland
1. Introduction . 115
2. Survey of File Organizations . 116
   2.1 Introduction . 116
   2.2 Sequential Organization . 117
   2.3 Random Organization . 118
   2.4 List Organization . 121
   2.5 Tree Structures . 128
3. Random File Structures . 130
   3.1 Introduction . 130
   3.2 Direct Address . 131
   3.3 Dictionary Lookup . 131
   3.4 Calculation Methods . 131
4. List File Structures . 143
   4.1 Introduction . 143
   4.2 List Structures . 145
   4.3 Linear Lists . 146
   4.4 Inverted File and Multilist Organization . 148
   4.5 Ring Organizations . 150
5. Tree File Structures . 151
   5.1 Introduction . 151
   5.2 Tree Structures . 152
   5.3 Minimizing Search Time . 155
   5.4 Representations of Trees . 157
6. Implementation of File Structures . 160
   6.1 Introduction . 160
   6.2 Sequential Data Set Organization . 161
   6.3 Partitioned Data Set Organization . 162
   6.4 Indexed Sequential Data Set Organization . 162
   6.5 Direct Data Set Organization . 164
   6.6 A Procedure for File Design . 164
References . 166
1. Introduction
This paper is a survey of file organization techniques. Section 2 is an elementary introduction to file structures, and also introduces the more detailed discussions of Sections 3 through 5. Section 3 discusses random file structures, including direct address, dictionary lookup, and calculation
methods of file addressing. Section 4 describes list structures: lists, inverted lists, multilists, and rings are discussed. Tree file structures, including symbol trees, immediate decoding trees, and directory trees, conclude the discussion of file structures. In Section 6, the file structure survey material is related to an available computer system, the IBM System/360. The data set organizations that are supported by Operating System/360 are each briefly described, followed by a presentation of a systematic method for file design.
The objective of this paper is to introduce the reader with limited knowledge of file organization techniques to this area in such a manner that he will have some feeling for the tradeoffs that are made in the design of a file organization. For this reason the discussion of each technique, rather than summarizing the methods that have been developed to obtain the utmost in efficiency, emphasizes instead the more fundamental considerations that are important in the selection of a file organization technique.

2. Survey of File Organizations

2.1 Introduction

This section classifies and introduces various techniques of file organization. Extensive work is being performed in this area, and, because of this, the terminology is expanding somewhat faster than it is being standardized. Therefore, the definitions offered here are not absolute, and some

TABLE I
EXAMPLE FILE

                  Automobile                            Owner
Record
number  License  Make        Model year  Name              Street address       City           State
1       477-018  Corvair     1962        John T. Mayo      5559 Oak Lane        Laurel         Md.
2       481-062  Plymouth    1968        Stanley N. Rudd   8003 Rossburg Drive  Waldorf        Md.
3       791-548  Mercury     1967        Patrick J. Berry  1627 Daisy Lane      Silver Spring  Md.
4       521-607  Cadillac    1969        Roger Johnson     1901 Bruce Place     Waldorf        Md.
5       521-608  Volkswagen  1964        Vera C. Johnson   1901 Bruce Place     Waldorf        Md.
FIG. 1. Record format for example file. (Fields: last name; first name, initial; street address; city; state; auto make; model year; license number.)
authors may use different terminology. The overall classification scheme presented here, and the order of presentation, is a modification of Dodd’s
WI.
For the basic terminology, the standard COBOL definitions are available [55]. An elementary data item is a piece of information that is stored, retrieved, or processed. A collection of one or more data items that describes some object is called a record. For handling convenience, records are usually grouped into logical units called files. A data management system, then, deals with records in files. Each record contains some number of elementary data items which are the object of processing operations. An elementary data item that is used to organize the file is called a key.
Table I is a sample file that is used to illustrate this discussion. The file contains data on automobiles and their owners, as might be maintained by a state department of motor vehicles. Figure 1 shows a record layout that might be used for each record of this file. The number of bits, characters, or machine words allocated to each field in the record would depend on the characteristics of the computer to be used, and are therefore ignored here.
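As a concrete illustration of these definitions, the record of Fig. 1 and the file of Table I might be sketched as follows. This is a minimal sketch in Python; the class name, field names, and the use of an in-memory list are inventions of the example, not part of the original design.

    # A minimal sketch of the example record of Fig. 1 and the file of Table I.
    from dataclasses import dataclass

    @dataclass
    class VehicleRecord:
        last_name: str        # owner surname
        first_name: str       # first name and initial
        street_address: str
        city: str
        state: str
        auto_make: str
        model_year: int
        license_number: str   # key used in several of the examples below

    example_file = [
        VehicleRecord("Mayo", "John T.", "5559 Oak Lane", "Laurel", "Md.",
                      "Corvair", 1962, "477-018"),
        VehicleRecord("Rudd", "Stanley N.", "8003 Rossburg Drive", "Waldorf", "Md.",
                      "Plymouth", 1968, "481-062"),
        VehicleRecord("Berry", "Patrick J.", "1627 Daisy Lane", "Silver Spring", "Md.",
                      "Mercury", 1967, "791-548"),
        VehicleRecord("Johnson", "Roger", "1901 Bruce Place", "Waldorf", "Md.",
                      "Cadillac", 1969, "521-607"),
        VehicleRecord("Johnson", "Vera C.", "1901 Bruce Place", "Waldorf", "Md.",
                      "Volkswagen", 1964, "521-608"),
    ]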
2.2 Sequential Organization

Sequential organization was, historically, the first to be developed and is the best known. In a sequential file, records are arranged in position according to some common attribute of the records. Sometimes records in a sequential file are stored in order of their arrival in the file. Figure 2 shows part of the example file organized as a sequential file, where the file is ordered by license number.
The principal advantage offered by sequential organization is rapid access to successive records. That is, if the nth record has just been accessed, then the (n + 1)th record can be accessed very quickly. This is always true if the sequential file is resident on a single-access auxiliary
FIG. 2. Sequential organization of example file.
storage device, such as a tape drive. However, if the file is stored on a multi-access device such as a disk drive, then, in a multiprogramming environment, head motion caused by other users may reduce this advantage.
A sequential file is searched by scanning the entire file until the desired record is found. For a large file, this is a lengthy process. For this reason, transactions for a sequential file are usually accumulated, sorted so that they are in the same order as the file, and then presented all at once for processing, so that only one pass through the file is required. Using such batching techniques, it is possible to process sequential files at a very low cost per transaction. Obviously, a sequential file is not well suited to an online application, where the transaction batch size is small.
When a file is to be processed on the basis of more than one key, efficiencies that are achieved by ordering the file on a single key are impossible unless a duplicate copy of the file, ordered on the second key, is maintained. The alternate copy can then be used to determine the primary keys of the record to be processed, and standard sequential processing techniques can be used. This technique adds many steps to any process accessing the file, greatly reducing efficiency.
In summary, sequential organization permits rapid access to successive records and is the most economical implementation of a file that will have large batches of transactions processed against it according to one key. However, processing and retrieving records out of sequence is very slow, and small-volume updates are inefficient, since any update requires recopying of the entire file.
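The batching just described amounts to one merging pass of the sorted transactions against the sorted master file. The sketch below assumes, purely for illustration, that a transaction either replaces the master record with the same key or is inserted as a new record; the function name, dictionary layout, and sample values are inventions of the example.

    # One-pass processing of a sorted transaction batch against a sorted
    # sequential master file (illustrative sketch only).
    def apply_batch(master, transactions, key):
        """master, transactions: lists of dicts sorted on key; returns a new master."""
        out, i, j = [], 0, 0
        while i < len(master) and j < len(transactions):
            if master[i][key] < transactions[j][key]:
                out.append(master[i]); i += 1                  # unchanged master record
            elif master[i][key] == transactions[j][key]:
                out.append(transactions[j]); i += 1; j += 1    # replacement
            else:
                out.append(transactions[j]); j += 1            # insertion of a new record
        out.extend(master[i:])
        out.extend(transactions[j:])
        return out

    old = [{"license": "477-018", "make": "Corvair"},
           {"license": "521-607", "make": "Cadillac"}]
    batch = [{"license": "481-062", "make": "Plymouth"},       # new record
             {"license": "521-607", "make": "Eldorado"}]       # replacement (invented value)
    print(apply_batch(old, batch, "license"))

Because both inputs are in the same order, the whole update costs a single pass over the master file, which is the point of accumulating and sorting the transactions first.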
2.3 Random Organization

The records in a randomly organized file are arranged according to some established relationship between the key of the record and the location of the record on direct-access storage; records are stored and retrieved through the use of this relationship. There are three types of random organization: direct address, dictionary lookup, and calculation.
FIG. 3. Direct address organization of example file.

2.3.1 Direct Address
When the address of the record is known, this address can be used directly for storage and retrieval. This presumes that some address bookkeeping is performed outside the computer system, or that each record contains some field that is used directly as the key. Figure 3 shows a direct organization of the example file. In this case the last three digits of the vehicle license number must be known to access a record.
A direct address organization cannot often be used because usually the programs that are to access a file must also keep track of the locations assigned to records. Whenever direct addressing can be used, of course, it is optimally efficient, since no accesses to any file or table need to be made in order to access a record in the file.

2.3.2 Dictionary Lookup
A dictionary is a table of two-element entries, each of which specifies a key-to-address transformation. When a record is added to the file, an entry is added to the dictionary; when a record is deleted, a dictionary entry is removed. Retrievals are performed by looking up the desired key in the dictionary, and then using the address obtained from the dictionary to access the record. Figure 4 shows a dictionary for the example file, which could be used for access based on owner’s surname.
FIG. 4. Dictionary for example file.
Since each reference to a record in the file requires a search of the dictionary, the search strategy used has great influence on the effectiveness of the file design. The two search strategies that are commonly employed are the binary search and sequential scan. If the dictionary is not maintained in any collating sequence, a sequential scan is the only method that can be used to obtain an address. If the dictionary has n entries, then, on the average, (n + 1)/2 accesses will be necessary to obtain an address. On the other hand, if the dictionary is maintained in collating sequence of the keys, a binary search is possible. The binary search begins by first testing the key at the location that is a power of 2 nearest to the middle of the dictionary. A comparison indicates whether it is the desired key, and, if not, in which half of the file the key is located. This operation is then repeated, eliminating half the remaining dictionary at each step, until the desired key is located, or its absence is established.
In order to use a dictionary with maximum effectiveness, the entire dictionary should be kept in core memory. But for many files of practical size, the dictionary is so large that this becomes impossible. In this event, it is necessary to segment the dictionary. Segmented dictionaries are often cascaded, so that there is a hierarchy of dictionaries that are searched. Such an organization is really a tree structure, which is discussed below. The degradation in efficiency produced by linear segmentation of a dictionary depends on the frequency of reference to parts of the dictionary that are not resident in main memory. If the most frequently referenced entries are kept in main memory, this degradation may be slight.
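A sketch of such a binary search follows, assuming the in-core dictionary is simply a list of (key, record address) pairs held in collating sequence; the bisection shown is the usual halving search rather than the power-of-two probe described above, and the sample addresses are the record numbers of Table I.

    # Binary search of an in-core dictionary of (key, address) pairs
    # kept in collating sequence (illustrative sketch).
    def lookup(dictionary, key):
        lo, hi = 0, len(dictionary) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            k, address = dictionary[mid]
            if k == key:
                return address          # use this address to read the record
            if k < key:
                lo = mid + 1            # the desired key is in the upper half
            else:
                hi = mid - 1            # the desired key is in the lower half
        return None                     # absence of the key is established

    dictionary = [("Berry", 3), ("Johnson", 4), ("Johnson", 5),
                  ("Mayo", 1), ("Rudd", 2)]
    print(lookup(dictionary, "Mayo"))   # -> 1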
2.3.3 Calculation
In the calculation method, a key is converted into an address by some computational manipulation of the key. Since the set of addresses produced
is much smaller than the set of possible keys, the addresses generated from distinct keys are not always distinct. Two processes, compression and hashing, are discussed here; only hashing, however, is an address calculation method. Compression is included because an important performance measure of both compression and hashing algorithms is the extent to which they map different keys into the same transformed key.
Compression (called “abbreviation” by Bourne and Ford [25]) is the transformation of the key so that it requires as little storage as possible, and yet retains as much as possible the discrimination and uniqueness of the original key [25]. Compression is normally applied to keys that are words or names in natural language, so as to remove characters that add the least to the information content of the keys. Hashing is a transformation on keys that produces a uniform spread of addresses across the available file addresses [IQS]. Thus, hashing is used to transform a key (that may have been compressed) to an address. A popular hashing algorithm is to split the key into two parts, multiply the two halves of the key, and use the middle bits of the product as the hash address.
Compression techniques are especially useful in cases where the keys may contain errors, such as may occur in systems performing retrievals based on names, particularly if the name has been transmitted verbally. Specialized compression techniques based on phonetics have been developed that map various spellings of the same word into the same compressed key [Q9]. A name-based retrieval system is an example where both compression and hashing might be used together.
The main problem connected with the use of calculated addressing concerns the ambiguity that is introduced by the transformation to a shorter key [IQS]. When two keys calculate to the same file address, a “collision” is said to have occurred; some method of resolution of the conflict must be employed. The general effect of incorporating collision handling into a file design is to require a large increase in the number of processing steps that are performed whenever a collision is encountered. This effect can substantially degrade the performance of a calculated addressing scheme, despite its intuitive simplicity. Therefore, in a calculated addressing scheme, careful selection of transformation algorithms and the collision-handling scheme must be made.

2.4 List Organization
A pointer to a record is the address of that record, expressed in a way that permits the direct location of the record. Thus, a pointer can be an actual disk address, or it can be an address relative to the disk address of
the first record of the file, or some other quantity. By the use of pointers to imply linking relationships between records, it is possible to completely divorce the physical and logical arrangement of a file. In fact, through the use of pointers, it is even possible to represent recursive data structures, which have no representation without pointers.
The fundamental component of a list is a record, as defined above, where one or more of the fields may be pointers. Then a list can be defined as a finite sequence of one or more records or lists. Lists that are not recursive and do not have loops or intersections can be represented without the use of pointers, if physical ordering is used to represent the linking relationship between records. This type of allocation is called sequential allocation; in contrast, the use of pointers to join related records is called linked allocation. Linked allocation schemes are easier to update, but require more storage for pointers. In the figures showing file organizations with pointers, pointers are represented by arrows from the record containing the pointer. The end of a list is indicated by a pointer of some special value, often zero; in the figures this end-of-list indicator is represented by the symbol used to represent “ground” in circuit diagrams, after Knuth [107].
2.4.1 Linear Lists
A linear list is a set of records whose only structure is the relative linear positions of the records [107]. There are special names for linear lists in which insertions and deletions are made at one end of the list [107]:
A stack is a linear list for which all insertions and deletions are made at the same end, called the top.
A queue is a linear list for which insertions are made at one end, called the back, and deletions are made at the other end, called the front.
A deque (contraction of “double-ended queue”) is a linear list for which insertions and deletions can be made at either end, called the left and right of the deque.
These three structures are encountered frequently. They are sometimes also called queueing disciplines; a stack is a LIFO (last in first out) queue, a queue is a FIFO (first in first out) queue, and a deque is a queue that can be used in either way. These names reflect the primary use of these structures: the construction of various types of task queues. They also occasionally are useful as intermediate files in applications that require complex retrieval processes on large files. For example, a stack might be used to accumulate the results of a search of a file on a primary key for later qualification by other processes.
A linear list may be implemented using either sequential or linked allocation. Linear lists are not limited to these restricted cases. In general,
FIG. 5. List organization of example file.
additions and deletions may be made at any point in a list, and a record may be on several lists at once. The great ease with which additions and deletions can be made in a linked list is one of the chief advantages of linked allocation.
Figure 5 shows a linear list organization for the example file, where a separate list has been used for each distinct city of residence in the file. The first spare field has been used as the list pointer. The Waldorf list of Fig. 5 could also be implemented by constructing a sequentially allocated list consisting of record No. 2 followed by record No. 4, followed by record No. 5, without any pointers.
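The linking of Fig. 5 can be sketched as follows, assuming each record carries one spare field used as a pointer (represented here as a record number, with None marking the end of a list); the dictionary of list heads is an assumption made for the illustration.

    # Linked allocation: one list per city, threaded through a spare
    # pointer field of each record (illustrative sketch of Fig. 5).
    records = {
        1: {"city": "Laurel",        "license": "477-018", "next": None},
        2: {"city": "Waldorf",       "license": "481-062", "next": 4},
        3: {"city": "Silver Spring", "license": "791-548", "next": None},
        4: {"city": "Waldorf",       "license": "521-607", "next": 5},
        5: {"city": "Waldorf",       "license": "521-608", "next": None},
    }
    heads = {"Laurel": 1, "Waldorf": 2, "Silver Spring": 3}   # start of each list

    def on_list(city):
        """Traverse the linked list for one city, yielding record numbers."""
        n = heads.get(city)
        while n is not None:
            yield n
            n = records[n]["next"]

    print(list(on_list("Waldorf")))   # -> [2, 4, 5]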
2.4.2 Inverted File

An inverted file is composed of a number of inverted lists. Each inverted list is associated with one particular value of some key field of a record and contains pointers to all records in the file that contain that value of the key field [120]. Inverted lists are normally produced for all fields of the record, permitting the file to be accessed on the basis of any field. Figure 6 shows an inverted file structure for the example file. Since the file is inverted on the basis of all fields, any one can be used to access a record.
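A sketch of inverted lists over two of the fields follows, with retrieval on several keys performed by intersecting the appropriate lists; the choice of fields and the function names are assumptions of the example.

    # Inverted lists: for each value of a key field, the numbers of all
    # records containing that value (illustrative sketch).
    records = {
        1: {"city": "Laurel",        "make": "Corvair"},
        2: {"city": "Waldorf",       "make": "Plymouth"},
        3: {"city": "Silver Spring", "make": "Mercury"},
        4: {"city": "Waldorf",       "make": "Cadillac"},
        5: {"city": "Waldorf",       "make": "Volkswagen"},
    }

    def invert(field):
        index = {}
        for number, record in records.items():
            index.setdefault(record[field], []).append(number)
        return index

    inverted = {field: invert(field) for field in ("city", "make")}

    def retrieve(**criteria):
        """Multi-key retrieval: intersect the inverted lists for each criterion."""
        lists = [set(inverted[f].get(v, [])) for f, v in criteria.items()]
        return sorted(set.intersection(*lists)) if lists else []

    print(inverted["city"]["Waldorf"])                 # -> [2, 4, 5]
    print(retrieve(city="Waldorf", make="Cadillac"))   # -> [4]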
FIG. 6. Inverted file organization of example file.
FIG. 7. Multilist organization of example file.
Note that the longest inverted list is the one for “Md.,” which appears in all records in the example file.
An inverted file permits very rapid access to records based on any key. However, updating an inverted file structure is difficult, because all the appropriate inverted lists must be updated. For this reason, an inverted file structure is most useful for retrieval if the update volume is relatively low or if updates can be batched.
A variation of inverted file structure that includes features of lists is multilist structure. A multilist consists of a sequential index that gives, for each key, the location of the start of a list that links together all records characterized by that key value. A multilist can be regarded as an inverted list structure in which all entries after the first in each inverted list have been represented by lists in the file rather than by entries in the inverted list. Figure 7 shows a multilist organization of the example file. Note that one link field is needed for each of the key fields in the original record, since exactly one list will pass through every record for each field in the record.
If the lists are divided into pages, if pointers in the index refer to records by page and record number within page, rather than record number within the file, and if each list is restricted in length to one page, the structure is called a cellular multilist. In a cellular multilist, then, each inverted list is represented by a number of sublists, where a sublist is a linked list within a page. The index points to the first record of each sublist.
A multilist is easier to update than an inverted file because it avoids the necessity for complete reorganization of the sequentially allocated inverted lists, but retrievals are slower than with an inverted file because the lists must be traversed to perform a retrieval. A cellular multilist organization lies midway between the inverted file and multilist, both in updating difficulty and in retrieval speed, because it represents the inverted lists by a structure that is partially linked (the file) and partially sequentially allocated (the index).

2.4.3 Rings
A ring is simply a linear list that closes upon itself. A ring is very easy to search: a search starting at any element of the ring can search the entire ring. This facilitates entry into a ring from other rings. The danger that an unsuccessful search may continue endlessly around the ring is solved very simply if the search program saves the record number of the first record it searches; the record number of each record that is searched is then compared to this stored number, and the search terminates unsuccessfully when it returns to its starting point [195].
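The safeguard against endless circulation can be sketched directly: the search remembers its starting record and stops when the ring brings it back. The record and field layout below is an assumption made for the illustration (a small city ring from the example file).

    # Searching a ring: stop, unsuccessfully, on returning to the start
    # (illustrative sketch).
    ring = {                 # record number -> next record on the city ring
        2: 4,
        4: 5,
        5: 2,                # the ring closes on itself
    }
    licenses = {2: "481-062", 4: "521-607", 5: "521-608"}

    def search_ring(start, wanted_license):
        first = start
        n = start
        while True:
            if licenses[n] == wanted_license:
                return n                 # found
            n = ring[n]
            if n == first:
                return None              # back at the start: not on this ring

    print(search_ring(4, "521-608"))     # -> 5
    print(search_ring(4, "791-548"))     # -> None (record is not on this ring)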
FIG. 8. Ring organization of example file.
One important use of rings is to show classifications of data. A ring is used to represent each class; all records in that class are linked to its ring. If a tag field is included in each record along with each pointer, then a class can be searched for the records it contains, or a record can be searched for the classes which contain it with equal ease. The classification scheme can be hierarchic; if it is, a hierarchic search of the file can be performed very easily, starting at the highest level ring.
Figure 8 shows a ring organization of the example file. This example shows only a city ring; other rings could be incorporated if the file was to be entered using other keys. The organizations of Figs. 7 and 8 are very similar; in fact, the only two differences are the circularity of the ring (rather than having some end) and the ring versus sequential organization of the index information. If a file is to be accessed using a variety of types of data, using a ring structure to classify the index information reduces the average access time.
Updating a ring structure is slightly easier than with other list structures because all the pointers to be modified during an update can always be found by traversing the ring. There are two chief disadvantages associated with the use of ring structures: the overhead introduced by the pointers (which is essentially identical to the overhead associated with any list organization), and the number of operations required to locate a record. With an inverted file, for example, if a record is to be located on the basis of three keys, the three appropriate inverted lists can be used in combination to locate the desired record very quickly; with a ring structure, one of the three rings would have to be searched sequentially. Thus, if the rings are very long, the search time to locate a record can be long.

2.5 Tree Structures
A tree file structure corresponds to the hierarchic segmentation of a dictionary. The highest level dictionary points to a second-level dictionary, and so on, until the bottom level dictionary points to records. A tree is a hierarchic structure that corresponds to the Dewey Decimal notation used for classifying books [107]. The root of the tree might be given the number 1, and the first-level branches would be 1.1, 1.2, and 1.3, and so on. Another example of a tree structure is the numbering system used for this article, which is also a Dewey Decimal notation.
Trees can be named by the information content of each record. If each record contains only one symbol, the tree is called a symbol tree; if each record contains part of a directory, the tree is called a directory tree [120]. Figures 9 and 10 show symbol and directory trees, respectively, for searching the example file by license number. In both figures, *n is used to
FIG. 9. Example file symbol trees.
represent a pointer to the location of the nth record. Three symbol trees are required because there are three starting symbols. With this structure, a record address is obtained in six comparisons; using the directory tree, in two to five comparisons. The correspondence between a tree structure and a binary searching algorithm is very strong; in fact, a tree can be drawn to represent the actions of a binary search. The root of the tree would correspond to the middle record of the file; the two elements on the second level would correspond to the one-quarter and three-quarter elements in the file, and so on [283]. In this way, a tree can be searched with the same number of operations as a binary search, but without the necessity for sequential allocation. Thus, at the cost of storage for pointers, the speed of a binary search can be obtained with the ease of linked updating. This is the primary motivation for the use of tree structures in large random-access files.
FIG. 10. Example file directory tree.
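The combination of binary-search speed with linked updating can be sketched with an ordinary binary search tree over the license-number keys; the node layout and function names are assumptions of the example rather than the specific directory-tree format of Fig. 10.

    # A binary search tree over record keys: searched with the same number of
    # comparisons as a binary search, but updated by relinking nodes rather
    # than by recopying a sequentially allocated table (illustrative sketch).
    class Node:
        def __init__(self, key, record_number):
            self.key, self.record_number = key, record_number
            self.left = self.right = None

    def insert(root, key, record_number):
        if root is None:
            return Node(key, record_number)
        if key < root.key:
            root.left = insert(root.left, key, record_number)
        elif key > root.key:
            root.right = insert(root.right, key, record_number)
        return root

    def search(root, key):
        while root is not None:
            if key == root.key:
                return root.record_number
            root = root.left if key < root.key else root.right
        return None

    root = None
    for key, number in [("521-607", 4), ("477-018", 1), ("791-548", 3),
                        ("481-062", 2), ("521-608", 5)]:
        root = insert(root, key, number)
    print(search(root, "521-608"))   # -> 5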
3. Random File Structures

3.1 Introduction
The availability of direct-access auxiliary storage devices naturally suggests the use of random file organization. The use of a single file address or record number to retrieve any record in a file with only one access to auxiliary storage is intuitively attractive, both in terms of programming simplicity and speed of execution. If some key is available whose number of possible distinct values is equal to the capacity of the file in records, and if some one-to-one mapping of these keys onto record addresses can be constructed, then random organization is very simple to use. In such a case the method called direct address organization, described below, is used. But, more often, the number of possible distinct key values is much greater than the number of record addresses that can be allocated for the file, and the distribution of key values to be expected may not be known. In this case, if random organization is to be used, some method must be developed to transform key values into addresses of records in the file for all possible key values. The two principal methods of performing this transformation, namely, dictionary lookup and calculation, are discussed below.
Note that the contents of a record are also often under the control of the file designer. In particular, each physically addressable storage unit may contain several records. In this case, the addressable unit of auxiliary storage is called a physical record, the file record a logical record. In this discussion, the term record always refers to logical record.
3.2 Direct Address
Direct addressing, when it can be used, is optimally efficient, since only one auxiliary storage access is required to obtain any record. In this case, the file design task is trivial and is completed immediately; the key to record number transformation is programmed, and auxiliary storage for one record is allocated for each possible key.

3.3 Dictionary Lookup
Dictionary lookup is a very effective file accessing method if the dictionary can be kept in main memory and if all keys have the same length. In this situation, the dictionary can be kept in collating order of the keys, and a binary search can be used to find the address of any record very quickly, and only one access to auxiliary storage is needed to obtain any record.
If the keys are not all the same length, two alternatives are available. Space sufficient for the longest key in the dictionary can be allotted to every key, in which case a binary search of the dictionary can still be used, or just sufficient space can be allocated for each key, in which case a sequential scan must replace the binary search. The first alternative increases the size of the dictionary; the second increases the average number of operations needed to access a record.
If the dictionary is too large to be kept in core, it must be segmented. A hierarchically segmented dictionary is a tree, discussed in Section 5. A sequentially segmented dictionary requires complicated programming techniques to make it an efficient addressing mechanism. The basic problem is that if the dictionary is arranged so that its most frequently accessed entries are always kept in main memory in order to minimize auxiliary storage accesses, a binary search of the dictionary is impossible.
Dictionary lookup can be used to provide access to a file based on more than one key: a separate dictionary is used for each key. If a dictionary is provided for every field, the structure becomes an inverted file, as discussed in Section 4. Multiple dictionaries tend to slow down updating. If a record is added to or deleted from a multiple-dictionary file, all the dictionaries must be updated. Since they are normally kept ordered by collating sequence, sorting of the dictionaries is also necessary.

3.4 Calculation Methods
Dictionary lookup imposes the overhead of an index between the key and the address of the desired record. If the key is known imprecisely, and several probes of the dictionary might be necessary to locate the file key
that is closest to the desired key, maximum possible use of the information content of the dictionary has been made. On the other hand, if the key is known precisely, it seems reasonable to structure the file so that the key can be used to retrieve the record without access to any intermediate tables [17]. Calculation addressing methods have been developed to permit access to a file without the use of intermediate tables.
Hash addressing is the use of a key of the desired record to locate the record in storage by means of some transformation, called a hash, on the key. The ideal hash spreads the calculated addresses uniformly throughout the available addresses. A collision is said to have occurred when two keys calculate to the same address. Some method of handling collisions must be provided as part of any file structure incorporating hash addressing techniques [I@]. In the sense that a key representing part of the contents of a record is used to address the record, a file organization using hash addressing acts as an associative memory [117].
Sometimes the key field to be used for calculation may contain some characters that do not contribute to the information content of the key; the classic example of this is the redundancy of words in natural language. If keys with redundancy are used as the basis for hash addressing, the distribution of calculated addresses produced by the hash algorithm will be adversely affected. Of course, the performance of any other scheme of key-to-address transformation will be similarly degraded by redundancy in keys; for example, key redundancy would cause every dictionary entry to be longer than otherwise required. In the case of proper nouns and English language words, compression techniques have been developed to reduce this redundancy. Compression is not the same process as hashing: a compression technique transforms a key into another key, but a hash transforms a key into a file address.

3.4.1 Compression
As stated by Bourne and Ford [25], “the major objective [for a compression technique is] to provide as much condensation as possible while maintaining a maximum amount of discrimination between the [keys].” They tested a large number of compression techniques and tabulated the results of their testing. Each algorithm was tested on a number of different sample data bases, including English words, and various collections of names. To test an algorithm, it was applied to all the members of a set of test keys. The number of unique compressed keys that were produced by the algorithm was counted, and this number was used as a performance measure.
Some of the more important compression measures tested were:
(1) Selective dropout of every nth letter. Starting from the left end of the word, every nth letter is dropped. The algorithm is applied as many times as necessary to shorten the word to the desired length. For example, consider the word “abdication” for n = 3, and compressing to four letters:
    First pass   abictin
    Second pass  abctn
    Third pass   abtn
(2) Selective dropout by character usage rankings for each letter position. Using a separate ranking of character usage for each letter position, eliminate the most popular letters, in order of decreasing popularity, until the desired compressed key length has been reached.
(3) Selective dropout by a single ranking of bigram usage. Using a single ranking of bigram usage, starting with the most popular bigram, delete bigrams in order of popularity until the desired compressed key length is reached. A bigram is a pair of adjacent letters; each letter of a word except the first and last contributes to two bigrams.
(4) Selective dropout by a single ranking of letter usage. From a single ranking of the usage of letters for all letter positions, remove letters in the order of popularity, until the desired compressed key length has been reached. Bourne and Ford used different rankings for common and proper words:
    Common words  EIROATNSLCPMDUHGYBFVKWXZJQ
    Proper words  EARNLOISTHDMCBGUWYJKPFVZXQ
(5) Vowel elimination. Starting from the left, eliminate the vowels a, e, i, o, and u until the desired compressed key length is reached. If there are not enough vowels in the key to reach the desired compressed key length, then truncate from the right to the desired length.
(6) Truncation from the right. Starting from the right, remove letters until the desired compressed key length is reached.
Bourne and Ford augmented one of their techniques by generating a “check” character from the characters that had been removed, and appending it to each compressed key. Generation of a check character is straightforward. For example, if the letters of the alphabet are represented by the integers 1 to 26, the deleted characters can be summed modulo 26 to obtain an integer in the required range for a check letter.
Consider the word “abdication” truncated above to “abtn”:
    a    1
    b    2
    t   20
    n   14
        37 = 11 mod 26
In this case, the check character would be the eleventh letter of the alphabet, “k”.
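A sketch of selective dropout of every nth letter with an appended check character follows; the function names are invented, and the check letter is computed from the letters of the compressed key, as in the worked example above.

    # Selective dropout of every nth letter, repeated until the key fits the
    # desired length, with a check character summed modulo 26 (sketch).
    import string
    VALUE = {c: i + 1 for i, c in enumerate(string.ascii_lowercase)}   # a=1 ... z=26

    def drop_every_nth(word, n):
        return "".join(c for i, c in enumerate(word, start=1) if i % n != 0)

    def compress(word, n, length):
        key = word.lower()
        while len(key) > length:
            shorter = drop_every_nth(key, n)
            key = shorter if shorter != key else key[:length]   # truncate as a last resort
        check = chr(ord("a") + (sum(VALUE[c] for c in key) - 1) % 26)
        return key + check

    print(drop_every_nth("abdication", 3))   # -> 'abictin' (first pass of the example)
    print(compress("abdication", 3, 4))      # -> 'abtnk'  (key 'abtn' plus check letter 'k')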
Tables II through V present some of Bourne and Ford’s results.¹ Table II shows the symbols that are used to identify the compression techniques in the figures that follow. Table III is a ranking of the techniques by their performance on words, for various lengths of compressed keys. Note that selective dropout with n = 2 and a check character appended gave the best results in every case. The techniques of selective dropout by character usage ranking were uneven performers and gave good results only for certain compressed key lengths. Vowel elimination and truncation from the right performed very poorly. Table IV lists the number of unique words remaining in a collection of 2082 words, after compression to various lengths. This table is useful for the selection of a compressed key length. For technique A the number of distinct compressed keys obtained

TABLE II
KEY TO SYMBOLS USED IN TABLES III TO V

Technique symbol   Technique
A                  Selective dropout of every 2nd letter; check character added
B                  Selective dropout by character usage rankings for each letter position
C                  Selective dropout by a single ranking of bigram usage
D                  Selective dropout of every 2nd letter; no check character
E                  Selective dropout of every 3rd letter; no check character
F                  Selective dropout by a single ranking of letter usage
G                  Vowel elimination
H                  Truncation from the right

¹ Copyright 1961, Association for Computing Machinery, Inc.; reprinted by permission of the Association.
TABLE III
PERFORMANCE RANKING OF VARIOUS TECHNIQUES BY PERFORMANCE ON COMMON WORDS
(Rankings given for compressed key lengths of 3, 4, 5, and 6 characters. See Table II for key to symbols.)
with a compressed key length of four characters was nearly identical to the original size of the test set, so that a compressed key length of four characters should suffice for technique A for a key collection similar to the test collection. With the use of any other technique, a compressed key length of at least six characters would be required to obtain the same performance.

TABLE IV
NUMBER OF UNIQUE COMPRESSED KEYS GENERATED FROM COMMON WORDS

Compressed key             Technique symbol*
length, in characters    A      B      C      D      E      F      G      H
 1                       26     26     26     26     26     26     26     26
 2                      511    418    472    401    196    388    300    196
 3                     1831   1511   1611   1576   1060   1545   1087    841
 4                     2056   1997   1965   1991   1912   1871   1653   1456
 5                     2078   1960   2043   2054   2048   1957   1968   1762
 6                     2080   2077   2069   2075   2068   2073   2051   1938
 7                     2082   2081   2076   2077   2077   2079   2078   2012
 8                             2082   2082   2079          2081   2082   2054
 9                                           2080          2082          2073
10                                                                       2081
11                                                                       2082

* See Table II for key to symbols.
TABLE V
NUMBER OF UNIQUE COMPRESSED KEYS GENERATED FROM NAMES

Compressed key             Technique symbol*
length, in characters    A      D      E      H
 1                       25     25     25     25
 2                      606    557    245    245
 3                     7117   6313   2559   1542
 4                     8122   8013   7115   3561
 5                             8171          4914
 6                                           5875
 7                                           6756
 8                                           7377
 9                                           7766
10                                           7953
11                                           8042
12                                           8117

* See Table II for key to symbols.
Table V lists the number of unique proper names remaining in a test collection of 8184 unique names of people. Before the compression algorithms were tested, the names were edited by removing blanks and other nonalphabetic characters to form a single word of no more than 22 characters. In this case, very good performance could be obtained using algorithm A and a compressed key length of four letters. In every case, however, the names, which originally contained six to 22 characters, could be represented with very little loss of uniqueness by a compressed key of ten characters.
In a system such as an airline reservation system, a bank account record system, or a credit card account system, the system must be able to perform retrievals based on names, and the names may be misspelled, especially if they are being communicated verbally. Special compression techniques, such as Soundex and Consonant Coding, have been developed for such systems. These techniques are designed to map, as much as possible, all the possible spellings of any given name into the same compressed key.
The Soundex compression scheme is usually used to convert names into one alphabetic character and three digits, although other compressed keys could be produced. The rules are as follows [25]:
(1) The first character of the name is used for the first character of the compressed key.
(2) Except for the first letter, drop all occurrences of a, e, i, o, u, y, w, and h.
(3) Assign the following digits to the following similar-sounding sets of characters:
    Characters                 Digit
    B, F, P, V                   1
    C, G, J, K, Q, S, X, Z       2
    D, T                         3
    L                            4
    M, N                         5
    R                            6
    Insufficient consonants      0
(4) Delete the second letter in a pair of adjacent identical letters or pair of letters in the same group.
(5) If there are insufficient consonants, fill out with zeros.
The Soundex technique has been used widely and is described in the literature of several computer manufacturers, without reference to its originator.
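A sketch of a direct reading of rules (1) through (5) follows; published Soundex variants differ in small details, but this reading reproduces the codes shown for the sample surnames in Table VI below.

    # Soundex, following rules (1)-(5) above (illustrative sketch).
    DIGIT = {}
    for letters, d in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")]:
        for ch in letters:
            DIGIT[ch] = d

    def soundex(name):
        name = name.upper()
        code = name[0]                      # rule (1): keep the first letter
        prev = DIGIT.get(name[0], "")
        for ch in name[1:]:
            d = DIGIT.get(ch, "")           # rule (2): vowels, y, w, h carry no digit
            if d and d != prev:             # rule (4): collapse adjacent same-group letters
                code += d
            prev = d
            if len(code) == 4:
                break
        return (code + "000")[:4]           # rule (5): pad with zeros

    for name in ["Miller", "Lo", "Korfhage", "Rodgers", "Herring"]:
        print(name, soundex(name))          # M460, L000, K612, R326, H652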
Another compression technique used in a system that tolerates misspelling of names is described by Davidson [49]; he calls this scheme “consonant code,” and cites an IBM publication as its source [ldl]. This method produces a compressed key of five alphabetic characters.
(1) Use the first character of the surname as the first character of the compressed key.
(2) Use the first letter of the first name as the fifth character of the compressed key. If there is no first name, use a dummy character that is programmed to match all characters.
(3) Initialize the second, third, and fourth characters of the compressed key as blanks.
(4) Replace the second, third, and fourth characters of the compressed key using the following rules:
    (a) Delete all vowels and all occurrences of H, W, or Y, except for the first letter of the surname.
    (b) If any letter repeats, delete all but the first occurrence.
    (c) Insert the first three remaining letters after the first letter into the second, third, and fourth positions in the compressed key.
With this compression scheme, the first four characters of the compressed key are called the compressed surname. Table VI shows the results of applying these two algorithms to several spellings of three easily misspelled names.
TABLE VI
RESULTS OF SPECIALIZED COMPRESSION TECHNIQUES APPLIED TO SAMPLE SURNAMES

             Compression technique
Surname      Soundex    Consonant coding
Miller       M460       MLR
Mueller      M460       MLR
Muller       M460       MLR
Lo           L000       L
Loew         L000       L
Lowe         L000       L
Korfhage     K612       KRFG
Korvage      K612       KRVG
Rogers       R262       RGRS
Rodgers      R326       RDGR
Herring      H652       HRNG
Herron       H650       HRN
The vowel-elimination technique enables both algorithms to handle Miller, Mueller, and Muller; the elimination of the letter W makes Lo, Loew, and Lowe map into the same compressed key for both algorithms. The advantage of the use of equivalence classes of similar-sounding letters in Soundex permits it to compress Korfhage and Korvage into the same compressed key, while consonant coding fails in this case. In the last two cases, Rogers and Rodgers and Herring and Herron, both algorithms fail. This shows the necessity of including searches on secondary keys in name-retrieval systems, in case the specialized compression algorithm fails. The airline passenger record system described by Davidson, for example, includes a facility for a search on telephone number for use in such cases.

3.4.2 Hash Addressing
In hash addressing, the key is operated upon by some mathematical algorithm called a hash that generates an address in secondary storage directly from the key. The hash maps elements of the set of keys into the set of auxiliary storage addresses, so it is useful to examine the characteristics of these two sets to develop the requirements for a hashing algorithm. The set of keys is a set of variable-length quantities which may have varying restrictions on the permitted values for portions of the keys (BCD, integer, etc.). Therefore, the key may in fact be a mixed-radix expression; the set of keys will usually be a small subset of the set of possible keys; and the keys will be clustered in various ways, and the way in which they are
grouped may change as deletions and additions to the file are made. In contrast to this, the set of secondary storage addresses is numeric, restricted in range, densely occupied, consecutively ordered, and time-invariant [117].
It is desired that the transformation algorithm enable any record to be located in only one secondary storage access, which is to say, that each record should be stored at its calculated address, called its home address. Due to the unpredictable characteristics of the key set, the best possible performance will be obtained from a hashing algorithm that maps keys into what would correspond to a random selection of addresses [117]. No matter how carefully the hashing algorithm is chosen, collisions will inevitably occur. Therefore, it is necessary to choose some method for handling collisions. Hash addressing, then, consists of two stages: a mapping to allocate calculated addresses as uniformly as possible among available addresses, and overflow procedures to direct collisions to other addresses [117].
Secondary storage addressing is usually described in terms of the addressing of buckets, where a bucket is simply the largest quantity of data that can be retrieved in one auxiliary storage access [132, 141]. Usually, a bucket is large enough to hold more than one record. The effect of transforming to bucket addresses, where each bucket holds several records, rather than to a single record address, is generally to improve the performance of the addressing technique in use. In this discussion, the grouping of records into buckets is ignored, since it does not affect the relative merits of these addressing schemes.

(a) Transformation Algorithms. Several algorithms have been developed for transforming keys to addresses, and a number of them have been in common use for several years.
(1) If the key occupies several machine words, form the product and use the middle bits for the hash address. This technique is not useful when part of the key can be zero, such as may occur when short keys are justified by filling with zeros, or if blanks are represented by zeros [142].
(2) If the key is longer than the desired address, cut the key into several pieces, each as long as the address, and form the sum of the pieces. This method can be used for single-word and multiple-word keys [142].
(3) Square the key, and use some bits from the middle of the square as the address. The middle of the square depends on all the bits of the key, so there is a high probability that different keys will hash to different addresses [142].
(4) Exclusive-or the words of a multiword key, then divide the result
by the table size in records and use the remainder as the calculated address. The table length must be odd, or the rightmost bit of the key will become the rightmost bit of the hash address. This method produces a calculated address within the range of file addresses that has very attractive randomness properties [132].
(5) Interpret the key as a series of 4-bit digits of some radix p (where p ≤ 16) and transform the key representation into some radix q, where p is relatively prime to q, then truncate the result to obtain the calculated address. This method produces a very uniform distribution of calculated addresses, and is especially good in breaking up clusters of adjacent keys [117].
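As a concrete illustration of methods (2) and (4), the following minimal sketch assumes a key held as an array of 32-bit words and a table of TABLE_SIZE record slots; the names and the particular table size are inventions of this sketch, not taken from the text.

    #include <stdint.h>

    #define TABLE_SIZE 1021u     /* number of record slots; odd, as method (4) requires */

    /* Method (2): fold a multiword key into one word by summing word-sized pieces. */
    uint32_t fold_key(const uint32_t key[], int nwords)
    {
        uint32_t sum = 0;
        for (int i = 0; i < nwords; i++)
            sum += key[i];
        return sum;
    }

    /* Method (4): exclusive-or the words of the key, divide by the table size,
       and take the remainder as the calculated (home) address.                 */
    uint32_t hash_divide(const uint32_t key[], int nwords)
    {
        uint32_t x = 0;
        for (int i = 0; i < nwords; i++)
            x ^= key[i];
        return x % TABLE_SIZE;
    }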
Experimentation is presently the best method available for choosing a hashing algorithm. Generally, some sort of representative sample of the keys is obtained, and a selection of hashing algorithms is used to produce several sets of calculated addresses. Each of these sets is then counted, and the number of distinct calculated addresses is compared to the number of distinct sample keys as a measure of the performance of each algorithm. This process is essentially identical to the measures used by Bourne and Ford [25] for the performance of compression algorithms.

(b) Scatter Table Construction. When records are being added to a file using hash addressing, first a calculated address is produced. Then, if there are no records stored at that address, the record is entered at its home address and the process is complete. However, if there is already a record at the home address of the new record, either the record that is already there must be moved, or the new record must be stored at some other location, and methods for later locating both records must be provided. To retrieve a record, first a calculated address is produced, using the same algorithm that was used to load the file, and then that address is accessed. If the desired record is stored in the home address, or if that location is empty, the operation is complete. If another record is stored at the home address, some means of searching the other locations where the record might have been stored in case of a collision must be employed. Three popular organizations for scatter tables are considered here, along with two other specialized techniques [142]. For each method, the insertion procedure is given; the retrieval procedure is virtually identical except that it does not include writing.
(1) Linear Probing. If the calculated address is occupied, then begin scanning forward in the file until an empty location is found, and write the record into that location. This method is very easy to program, but it is the least efficient, because it tends to cluster overflow records. If a collision has occurred at location j, this method raises the probability that
a collision will occur at location j + 1 above the average collision probability of the file.
(2) Random Probing. When a collision occurs, a pseudorandom number generator is used to generate an offset; that offset is added to the calculated address, and the resulting address is probed. This process continues until an empty location is found, or the entire file is full. The pseudorandom number generator must be such that it generates all the addresses in the file once and only once before repeating. This method is more efficient than linear probing because it tends to distribute overflow records. However, deletion of records using random probing is difficult. If other records have collided at the deleted record, they may become unlocatable. The only way to locate all such records is to recompute the calculated address for every record in the file. Alternatively, a special deletion flag may be inserted in the location, and the search program designed so that a deleted record is treated as an unequal probe. Eventually, however, this technique leads to an excessive number of searches, and the file must be reorganized.
(3) Direct Chaining. When a collision occurs, if the record in the home address has the same key as the record to be stored, the new record is stored in any available location obtained by any means, and a pointer to the overflow record is stored in the home location. If the record occupying the home location does not have the same key as the record to be stored, it is moved anywhere else, its overflow chain is adjusted, and the new record is stored in its home location. This is the most efficient scheme for handling collisions. Storage space for a chain pointer is used in each record to permit more rapid search. The greatest programming complexity is encountered in the necessity to move records from one location to another.
(4) Scatter Index Tables. It is possible to separate the data area completely from the scatter table if every record is treated as an overflow record. The scatter table then consists only of pointers to chains of records in the data area. This technique is particularly advantageous in the case of variable-length records; if variable-length records are to be stored in a scatter table, the table must include, for each record, space for the longest record in the file. Deletion is not difficult with the use of scatter index tables; the deleted record is simply removed from its chain, and its space is returned to free space.
(5) Randomized Check Code. When the key is very complex, or is a variable-length string of characters, the comparison of the key with the key of each record encountered during a probe may be very time consuming. In this case, to speed this comparison, a calculated address that consists of more bits than are needed for the address can be computed, and the extra bits can be stored with each record. Then, when a record is encountered during a probe, these extra bits are compared, and the full keys are compared only when these two supplementary keys are equal.
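The following minimal sketch in C illustrates the linear-probing insertion of method (1); it is an editorial example, with the table layout, the wrap-around at the end of the file, and all names being assumptions of the sketch rather than part of the original text.

    #define TABLE_SIZE 1021u

    struct slot { int occupied; unsigned key; /* ... data fields ... */ };
    static struct slot table[TABLE_SIZE];

    /* Linear probing: start at the home address and scan forward, wrapping
       around, until an empty location is found.  Returns the slot used,
       or -1 if the entire file is full.                                    */
    int insert_linear(unsigned key)
    {
        unsigned home = key % TABLE_SIZE;
        for (unsigned i = 0; i < TABLE_SIZE; i++) {
            unsigned addr = (home + i) % TABLE_SIZE;
            if (!table[addr].occupied) {
                table[addr].occupied = 1;
                table[addr].key = key;
                return (int)addr;
            }
        }
        return -1;
    }

Retrieval follows the same probe sequence, stopping at a matching key or at the first empty location.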
TABLE VII
AVERAGE NUMBER OF PROBES TO RETRIEVE A RECORD FOR VARIOUS SCATTER TABLE ORGANIZATIONS

Load factor    E-random probing    E-linear probing    E-direct chaining
.1             1.05                1.06                1.05
.5             1.39                1.50                1.25
.75            1.83                2.50                1.38
.9             2.56                5.50                1.45
If an equal number of bits is used for the calculated address and the supplementary key, it is possible that no two keys will ever be compared unless they are equal. Morris [142] has determined the average number of probes needed to locate a record for each of the first three methods above; his results are tabulated in Table VII and shown as a graph in Fig. 11. The load factor
FIG. 11. Average number of probes to retrieve a record for various scatter table organizations. Copyright 1968, Association for Computing Machinery, Inc.; reprinted by permission of the Association.
is simply the fraction of the available locations in the file that are occupied, and the average number of probes depends only on the load factor, not on the file size. For small load factors, all three methods give practically identical performance; therefore, for a load factor between .1 and .5, linear probing, which is the simplest to program, can economically be used. For higher load factors, however, random probing gives better performance than linear probing, at a moderate increase in programming complexity. For the best possible performance at high load factors, and at the highest cost in programming complexity, direct chaining is used. Thus, the selection of a scatter table organization depends on the occupancy factors to be experienced by the file and the performance requirements.
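For orientation, the entries of Table VII are consistent with the closed-form estimates commonly quoted for these three methods; the expressions below are supplied editorially and are not taken from the text.

    #include <math.h>

    /* Approximate expected probes per successful retrieval at load factor a,
       0 < a < 1; these expressions reproduce the entries of Table VII.       */
    double probes_random(double a)   { return -log(1.0 - a) / a; }
    double probes_linear(double a)   { return 0.5 * (1.0 + 1.0 / (1.0 - a)); }
    double probes_chaining(double a) { return 1.0 + a / 2.0; }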
4. List File Structures

4.1 Introduction
This section describes list structures, which include lists, linear lists, inverted lists, multilists, and rings. For each structure, methods for searching and updating are outlined, along with a discussion of the various tradeoffs that must be made in selecting a particular organization. In dealing with list structures, it is desirable to have a compact notation for specifying algorithms for manipulating them. A convenient notation for this purpose has been introduced by Knuth [107]. Every record consists of a number of fields, some of which are pointers; therefore, all operations on lists of records can be expressed in terms of operations on fields of records. Let the notation also include link variables, whose values are pointers. Then, let a reference to any field of a record consist of the name of the field, followed by the name of a pointer variable pointing to the record, enclosed in parentheses. To use the example of Section 2, let each record consist of the fields LAST, FIRST, STREET, CITY, STATE, MAKE, YEAR, and LICENSE, corresponding to the fields of the record layout shown in Fig. 1. In addition, augment that record by adding one pointer field, and call the pointer field NEXT. Consider the example of Fig. 5, which shows the file with such a field added to each record. Suppose the initial value of the pointer variable LINK is 2; that is, it points to record number 2. Then the value of NEXT(LINK) is 4, the value of NEXT(NEXT(LINK)) is 5, and NEXT(NEXT(NEXT(LINK))) is the null link, which is represented by Λ, the Greek letter lambda. The only operator included in the notation is ←, which is used as an assignment operator. A ← B is read "assign the contents of B to A." The usual mathematical symbols for equal, greater than, less than, and so forth will be used with their conventional meanings. Let the notation also include one reserved
word: LOC. If V is a variable or field, let LOC(V) be the storage address of V. To return to the example given, LOC(NEXT(NEXT(LINK))) = 4. The last notational convention needed is the use of square brackets to indicate relative position within a list. If W is the name of the list in the example above, then
    YEAR(W[1]) = 1968
and
    NEXT(W[3]) = Λ,
where YEAR and NEXT are field names, and W[n] is treated as a link variable. List processing languages were originally developed during work in artificial intelligence, in which the structure of data files, and not just their contents, must be changed drastically during a program, and cannot be predicted. For applications of this type, list structures are used not only for files that are stored on auxiliary storage devices, but also for core memory. A variety of list-processing languages were developed to simplify programming of applications that make extensive use of list structures; these languages typically provide extremely sophisticated capabilities, and usually are rather slow in their execution. Because of the existence of these languages, a common notion has developed that a list-processing programming language must be used to write any program that manipulates list structures. The use of Knuth's method of expressing algorithms in this discussion, which is very similar to several programming languages, shows that these manipulations can easily be performed in a popular programming language, such as FORTRAN, PL/I, or COBOL. Whenever lists are being manipulated by a program, no matter what the type of list, one problem always must be solved: the maintenance of a pool of available space. Before any list manipulation takes place, the storage space that is available for all list structures is first initialized in some way so that all the records in these areas are linked to the pool of available space, called the free list. Then, when a record is to be added to a list, it is unlinked from the free list, its fields are set to their desired values, and the record is linked onto the desired list. The same effect can be achieved by the use of an availability table containing one bit for each record. When a record is used, its bit is set to 1; when it is released, its bit is returned to zero. During execution of a program, an interchange of records between the pool of available space and the lists being processed will take place. Since the processing associated with this function is a pure overhead task, and does not contribute directly to solving a problem, considerable attention has been devoted to the methods of performing this interchange, with the
goal of minimizing the number of instruction executions needed. There are two basic approaches to the maintenance of the free list:

(1) Return a record to the free list whenever it is no longer being used by the program.
(2) When a record is released by the program, do not link it to the free list. Rather, wait until requests for records from the free list exhaust it, and then locate and link to the free list all records that are not currently in use. The procedure of identifying and linking up all free records is known as garbage collection.

Three principal methods have been described, two that use variations on the first method, and one that uses garbage collection [166]. Newell [146] describes the method incorporated in the IPL-V programming language: leave the responsibility to the programmer. Thus, whenever a node is deleted from a list, the program must make a decision regarding its return to the free list. If the only structures being manipulated are relatively simple, this is not an unacceptable solution; however, if, for example, lists are permitted to be parts of other lists, the task of determining when a record should be returned to the free list becomes complicated, and this approach loses much of its appeal. A second approach is due to Gelernter et al. [74]. A reference counter is maintained for each record, showing the number of references to it. When a reference count is decremented to zero during a deletion from a list, the record is returned to the free list. This approach rapidly becomes more complicated as the permitted list structures become more complex. For example, if part of a list is added to another list, a new reference counter for the part must be established. In this way, reference counters tend to proliferate. McCarthy [126] originally proposed the scheme of garbage collection. Using this method, once the free list has been exhausted, processing of the program is temporarily suspended, and the garbage collection routine is initiated. This routine traces through all the lists in use by the program, marking each node that is encountered in a list. Every record in the file is then read in order, and those that have not been marked are placed on the free list. Finally, the marks are removed from the marked nodes, to initialize for the next garbage collection. A variety of algorithms for traversing lists have been proposed; for further details in this area, see Schorr and Waite [166] or Knuth [107].

4.2 List Structures
The completely general definition of a list is a sequence of one or more records or lists. This definition, due to Knuth, permits a list to be an element of itself. Such a facility is useful in applications requiring recursive structures, particularly artificial intelligence applications, but it is not really applicable to the general file organization problem, so it is not considered at any great length in this discussion.

4.3 Linear Lists
As defined in Section 2, a linear list is a set of records whose only structure is the relative linear positions of the records. Note the difference between a list and a linear list; a list can include circular or recursive structures, whose structural relationships are more complex than a simple linear arrangement. The three important classes of linear lists in which insertions and deletions are made at the ends of the list are defined in Section 2 (see page 122). Linear lists can be implemented using either sequential or linked allocation. Sequential allocation simply places the records in the list one after another, so that

    LOC(X[i + 1]) = LOC(X[i]) + c,

where c is the number of words in each record (this implies that all records are the same length). Sequential allocation is especially convenient for the implementation of a stack. The only pointer variable needed is a pointer to the top of the stack, called TOP. To place an element X onto stack Z, two steps are necessary: TOP is incremented, and X is placed into the storage location pointed to by TOP:

    TOP ← TOP + c
    Z[TOP] ← X.

Removing an item from the top of the stack, provided the stack is not empty, is easily accomplished by reversing the above procedure:

    X ← Z[TOP]
    TOP ← TOP - c.

From these two algorithms, it is obvious that a stack is very convenient to implement using sequential allocation. Most stacks that are resident in main memory are implemented in this fashion. Instead of using sequential storage locations, it is possible to arrange a linear list using pointers so that the logical structure and physical structure are completely independent. This is accomplished by representing the logical connection between two records by a pointer that points from one record to another. Linked allocation permits greater flexibility in the use of storage, at the cost of additional storage space for the pointers.
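The sequentially allocated stack above translates directly into code. The following minimal sketch in C assumes c = 1 array element per record and a fixed maximum size; the names and overflow/underflow return conventions are inventions of the sketch.

    #define STACK_MAX 1000

    static int Z[STACK_MAX];     /* sequentially allocated storage for the stack */
    static int TOP = -1;         /* index of the top element; -1 means empty     */

    /* Push: advance TOP, then store the new element at the location it names. */
    int push(int x)
    {
        if (TOP + 1 >= STACK_MAX)
            return -1;           /* overflow */
        Z[++TOP] = x;
        return 0;
    }

    /* Pop: take the element at TOP, then retreat TOP (the reverse of push). */
    int pop(int *x)
    {
        if (TOP < 0)
            return -1;           /* underflow: the stack is empty */
        *x = Z[TOP--];
        return 0;
    }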
Knuth gives these points in a comparison of linked and sequential allocation:

(1) Linked allocation requires more storage, because of the pointers.
(2) Deletion from a linked list is much easier (one pointer is changed) than from a sequential list, which may require shifting of the entire list.
(3) Insertion into the midst of a linked list is much easier.
(4) Random access to any element of the list is much faster for a sequential list.
(5) It is easy to join or break two linked lists.
(6) Intricate structures, such as lists of variable-length records, can be represented more easily with linked lists.

The number of links in a record is not by any means limited to one; a varying number of links may be included in each record for different list organizations. One such structure is a doubly linked list, illustrated in Fig. 12. The extra space required to store the pointers for a doubly linked list provides two advantages: the list can be searched in either direction, and insertion and deletion of records can be accomplished very easily. For example, the algorithm for deletion of the record pointed to by the link variable X is

    RLINK(LLINK(X)) ← RLINK(X)
    LLINK(RLINK(X)) ← LLINK(X).
In this case, the two pointer fields of each record are called RLINK and LLINK. This algorithm permits the deletion of a record, given only its
FIG. 12. Doubly linked linear list.
record number; no tracing through the list is required. Similarly, a record may be inserted to the right or left of any record in the list in an equally simple manner. Consider the insertion of record V to the right of record U:

    LLINK(V) ← U
    RLINK(V) ← RLINK(U)
    LLINK(RLINK(V)) ← V
    RLINK(U) ← V.
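These two algorithms carry over directly to code. The following minimal sketch in C assumes, as in Fig. 12, that the list is circular with a head record, so that neither link is ever null; the structure and function names are inventions of the sketch.

    struct node {
        struct node *llink;      /* LLINK: the record to the left  */
        struct node *rlink;      /* RLINK: the record to the right */
        /* ... data fields ... */
    };

    /* Delete X from its list, given only a pointer to X; no tracing is needed. */
    void delete_node(struct node *x)
    {
        x->llink->rlink = x->rlink;    /* RLINK(LLINK(X)) <- RLINK(X) */
        x->rlink->llink = x->llink;    /* LLINK(RLINK(X)) <- LLINK(X) */
    }

    /* Insert V immediately to the right of U, following the four steps above. */
    void insert_right(struct node *u, struct node *v)
    {
        v->llink = u;                  /* LLINK(V) <- U               */
        v->rlink = u->rlink;           /* RLINK(V) <- RLINK(U)        */
        v->rlink->llink = v;           /* LLINK(RLINK(V)) <- V        */
        u->rlink = v;                  /* RLINK(U) <- V               */
    }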
Singly and doubly linked lists are used in many applications, both for structures in core memory and for auxiliary storage. One example of the use of a singly linked list is the chaining method of scatter table organization described in Section 3. In this structure, the pointers to overflow chains are list pointers. Since the overflow chains are searched in only one direction, and since deletion is performed only during a forward search of the list (so that all the pointer values needed are immediately available), there is no need for double linking in this case.

4.4 Inverted File and Multilist Organizations

The inverted file organization has been developed to minimize the time needed to retrieve from a file, at the expense of update time [17]. The ideal file structure for retrieval purposes would read only the desired records from auxiliary storage and no others, performing all preliminary searching in core. If the complete inverted file can be kept in core, the inverted file structure accomplishes this goal. Even if the inverted lists must be stored on auxiliary storage, one access to auxiliary storage can read an inverted list pointing to many more records than could be read in one access; so even in this situation the number of accesses to auxiliary storage is less than would be required to search the records themselves. In order to facilitate conjunction and disjunction ("and" and "or") operations on inverted lists, the simplified structure of Fig. 6 is usually augmented by the addition to each inverted list of a count giving the number of items in the list. This count is used to select the shortest lists for scanning, if the search permits such a choice. For example, consider the inverted file of Fig. 7. If the file is to be searched for all persons who are residents of Maryland, have the name Johnson, and own Volkswagens, then it is obviously easier to intersect the Volkswagen and Johnson lists first and then intersect the results of that search with the list for Maryland; in this way, only one candidate record remains after the first intersection. This advantage is more striking when the two lists are very different in
length. If one wants to find a sergeant who speaks Urdu and Sanskrit in an army personnel file, it is obviously easier to first intersect the Urdu and Sanskrit lists, and then intersect the result with the list on sergeant, than to start with the sergeant list. Union and intersection operations on inverted lists are facilitated if each list is maintained in collating order of the record addresses. In this case, two lists can be intersected or their union can be found with one pass through both lists. For this reason, inverted files are almost always kept in collating sequence. An inverted file is difficult to update. To add a record, the inverted list corresponding to the value for each field in the record must have a pointer to the new record added. The necessity of keeping each inverted list in order by record number (to facilitate searches and merges) tends to increase the complexity of this operation. Deleting a record from an inverted file organization similarly requires the modification of inverted lists corresponding to the value of every field in the record, with the same problems. A data management system that has been designed around the inverted file organization is TDMS (Time-shared Data Management System) [20] produced by the System Development Corporation. TDMS uses inverted files, with a hierarchically segmented dictionary of several levels that is used to locate the inverted lists. Naturally, the performance characteristics of TDMS are those of an inverted file organization. Retrievals can be performed on the basis of any field, which is a useful feature if the queries are unpredictable as to field requirements, and logical operations in queries can be performed efficiently. Of course, the use of an inverted file organization optimizes the retrieval performance at the expense of updating, so updating TDMS files is very time consuming. The particular organization of the inverted lists chosen for TDMS is rather elaborate, so the storage of the inverted file requires as much storage space as does the original file itself. Thus, a substantial penalty in ease of updating and storage economy has been incurred in order to optimize the response to unpredictable queries. Multilist file organizations are also usually augmented by the addition of a list length to each index entry, so that the shortest lists can be searched. A multilist structure is conceptually an inverted file, where only the heading of each inverted list is kept in the index, and the rest of each list is indicated by links among the records. Therefore, the multilist is easier to update, because the list information is carried in links among the records rather than in the index, but it is more time consuming to search, because now every record on each inverted list must be brought into core. If the lists are long and the queries being processed are conjunctions containing a number of keys, the ratio of records read from auxiliary storage to the number of records satisfying
TABLE VIII
TIMING COMPARISONS OF INVERTED FILE AND MULTILIST ORGANIZATIONS

Transaction                                   Inverted file    Multilist
Retrieval                                     13.9             86.8
Whole record addition                         15.0              4.0
Whole record deletion                         15.8              1.3
Non-key modification (without relocation)      1.3              1.3
Non-key modification (with relocation)        25.8              4.8
the query may be as low as 1/1000, which means that 99.9% of the reads from auxiliary storage retrieve no useful information [115]. Table VIII presents a comparison of timing calculations to perform various operations on multilists and inverted files. The transactions also include an initial decoding using a three-level tree as directory to the index. The time units in the table are in seconds [115]; however, their values for any specific implementation of these structures would depend on the details of the hardware and software used in the implementation. Therefore, these numbers should only be used as a basis for comparing the relative speeds of the two organizations for the transactions presented. The inverted file is much faster in retrieval, but much slower in updating. The choice of one or the other organization depends on the relative volume of updates and retrievals and the response time requirements for retrievals and updates. A ring is no more than a linked list that closes upon itself. The ring structure shown in Fig. 8 can be viewed as a variant of multilist organization. If the multilist index is placed in a linked list that closes upon itself, and if the null link that terminates every list in a multilist is modified to point to the head of the list, the multilist has been converted to a ring organization. The performance characteristics of a ring and a multilist are similar.

4.5 Ring Organizations
Because a ring has no end and no beginning, and because, given some record number x, the record on each side of x can always be located by moving forward through the ring, it is somewhat easier to construct a general-purpose system using rings than using multilists. In fact, two such systems have been constructed. One of these, Integrated Data Store (IDS), produced by General Electric, is available on a variety of GE computers.
IDS is a set of subroutines for processing rings; facilities to create and destroy, add to and delete from rings are provided. IDS provides a convenient way to mechanize a file that may change drastically and that requires the representation of complex interrelationships of records [10]. Rings can also be used to represent matrices; a separate ring can be used to represent each row and column, and each record will therefore reside on exactly two rings. This storage method is useful only for sparsely occupied matrices; for dense matrices, the sequential allocation techniques that have been developed for use with programming languages such as FORTRAN (see, e.g., Knuth [107]) are more economical of storage space. Rings have been used to construct cylinders, where a cylinder is a pair of rings with cross linking between them. Various types of information can be represented by these links. For example, a figure consisting of n points joined by line segments can be represented by a cylinder whose two rings each have n records [195]. A line connecting point j with point k on the figure is represented by a link between the jth record in one ring and the kth record in the other ring of the cylinder. At the present time it is not clear that cylinders are useful for other than research applications; however, useful applications in the area of graphic data representation may be developed.

5. Tree File Structures

5.1 Introduction
A binary search is used for a sequentially allocated random-access file that is stored in order of the collating sequence of its keys. This arrangement reduces search time at the expense of update time. For a file that is updated more often than it is searched, linked allocation can be used to minimize update time at the expense of search time. For a file that is updated and searched with similar frequency, however, neither of these approaches is very practical, and some sort of compromise must be struck. A tree structure is such a compromise, combining the speed of a binary search with the updating ease of linked allocation. A precise definition of a tree structure can be expressed easily with the use of elementary graph theory. The following is a modification of definitions used by Sussenguth [183] and Birkhoff and Bartee [18]:

A graph is a set of nodes and branches joining pairs of nodes.
A path is a sequence of branches such that the terminal node of each branch coincides with the initial node of each succeeding branch.
A graph is said to be strongly connected if there exists a path between any two nodes of the graph.
A circuit is a path in which the initial and final node are identical.
A tree is a strongly connected graph with no circuits.
The root of a tree is a node with no branches entering it. The root of the tree is said to be at the first level; a node which lies at the end of a path containing j branches from the root is said to be at the (j + 1)th level.
A leaf is a node with no branches leaving it. The filial set of a node is the set of nodes which lie at the end of a path of length one from the node. The set of nodes reachable from a node by moving toward the leaves is said to be governed by the node.
From the above definitions, it follows that a tree will always have one and only one root; a tree of n nodes will always have exactly n - 1 branches; and n - 1 is the smallest number of branches required for the tree to be strongly connected. Tree-organized files are used most often as indexes to some other file. Such an arrangement permits a record in the tree file to consist of only keys, pointers to other records in the tree, and addresses in the file. This approach is particularly useful if the records in the file are variable in length. If the file consists of short, fixed-length records, then the entire record can be placed within the tree structure. In this discussion, it is assumed that the tree structures under consideration are being used for key-to-address transformation, also called decoding. A key, of course, can be any information that can be represented in a computer: a series of mixed letters and digits, binary integers, or some other quantity. Since decoding is an application of tree searching, which is the fundamental operation performed in any tree-structured file, this discussion is also applicable to any tree-structured file. It is important to be aware of the distinction between the structure of a tree, which consists of certain items of information and some defined relationship between pairs of these items, and the representation of a tree within a computer, which is composed of words of storage containing numeric values. To index a given file, one of several different tree structures might be used, each of which could be represented in several different ways in the computer. Thus, to design a tree-structured index to a file, first the tree itself, which is the embodiment of certain indexing relationships among the records, must be selected, and then a representation of that tree must be designed. This section first considers various tree organizations, then discusses methods of reducing search time for each of them. Finally, several representations of trees are considered.

5.2 Tree Structures
Many names have been used for various tree structures; no standard terminology exists. For this discussion, trees are classified into three types: symbol trees, immediate decoding trees, and directory trees. These three cate-
gories provide a reasonable basis for a survey of tree organizations. However, the following discussion by no means exhausts all the possible combinations of node contents and linkage arrangements that can be made. Any key can be decomposed into a string of individual symbols. For example, a decimal number can be decomposed into a string of single-digit symbols, a series of double-digit symbols, and so on; a 36-bit binary number can be decomposed into twelve symbols of three bits each, nine symbols of four bits each, and other groups of symbols. In the construction of a symbol tree, the leaves of the tree are placed in one-to-one correspondence to the addresses in the file that is indexed by the tree. The non-leaf nodes of the tree are used only for searching; each leaf node of the tree contains a file address. Each key is broken into a number of symbols, and one node is used for each symbol. The tree will have one node on the first level for each distinct first symbol of a key; one node on the second level for each distinct second symbol; and so on [183]. The filial set for each symbol will have one node for each distinct following symbol in the key set. Figure 13 shows a symbol tree for the words platypus,
FIG. 13. Symbol tree.
platinum, peasant, poet, and playboy. Note that keys with common initial polygrams (sequences of symbols) share nodes in the tree. Such polygram sharing tends to reduce the storage requirements of the tree. A symbol tree is therefore especially useful when the keys are long and share initial polygrams. A further advantage of symbol trees is that variable-length keys present no additional complications. With a symbol tree, decoding is never complete until a leaf is reached. There is another type of tree that has appeared in the literature [8], called in this discussion an immediate decoding tree, in which an entire key is stored at each node, and decoding can be completed without reaching a leaf. In this structure, one node is used for each distinct key value. The tree is searched until an exact match is found, and then the file address in that node is taken as the result of decoding. An immediate decoding tree for keys which take the integral values from one to fifteen is shown in Fig. 14. This structure is particularly suitable for relatively short keys that can be represented in one or two machine words. For long keys, particularly if they share many initial polygrams, a symbol tree makes more efficient use of storage. When an immediate decoding tree is searched, the search key is compared with the key at the root. If the two keys are equal, the file address stored at the root is the decoded address. If the search key is less than the key at the root, the left branch is taken and the previous step is repeated. If the search key is greater than the key at the root, the right branch is taken and the previous step is repeated. If a leaf is reached without an equal comparison, the search terminates unsuccessfully. A directory tree lies in the middle ground between a symbol tree and an
FIG.14. Immediate decoding tree No. 1.
immediate decoding tree. Each node of a directory tree contains several entire keys, but no file addresses. During the search, if the search key is greater than the jth key at a node but less than the (j + 1)th key, then the branch corresponding to the jth node is tested next. When a leaf is reached, that leaf will contain the desired file address, if the search key is in the file. Figure 10 shows a directory tree. A directory tree is most useful if the file keys are all the same length, or can be compressed to the same length. In that situation, the keys stored at each node can be searched using a binary search, greatly speeding the search process.
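The immediate decoding tree search described above translates into a short routine. The following minimal sketch in C assumes integer keys and in-core nodes; the structure, the function name, and the use of -1 to signal an unsuccessful search are inventions of the sketch.

    struct tnode {
        int          key;             /* the entire key stored at this node     */
        long         file_address;    /* the decoded address if the key matches */
        struct tnode *left, *right;
    };

    /* Equal -> decoded; less -> take the left branch; greater -> take the
       right branch; a null branch means the search terminates unsuccessfully. */
    long decode(const struct tnode *root, int search_key)
    {
        const struct tnode *p = root;
        while (p != NULL) {
            if (search_key == p->key)
                return p->file_address;
            p = (search_key < p->key) ? p->left : p->right;
        }
        return -1;
    }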
5.3 Minimizing Search Time
Considerable attention in the literature has been given to methods of reducing the search time for a tree. Various methods have been proposed for the three tree types under consideration here. Sussenguth [183] first addressed the problem of searching a symbol tree. He claims that the searching time requirements for such a tree are only 24% greater than for a binary search. However, his results are limited by his assumptions about the computational steps needed for each process and by his assumption that all leaves lie at the same level of the tree, which amounts to assuming that all keys are of equal length. Patt [150] removes some of the limitations of Sussenguth's analysis. Patt's work has been criticized by Stanfel [178] because of his assumption that no strings share initial polygrams. Nevertheless, one of Patt's theorems is useful in reducing the search time of a symbol tree: The average search length for this type of structure is minimized if the nodes of every filial set are ordered according to the number of terminal nodes reachable from each.
If the symbol tree is constructed in the way Patt suggests, the symbols stored at those nodes that are elements of the largest number of key strings will be the first to be compared with symbols from the search key. Scidmore and Weinberg [167] propose a slight modification of Sussenguth's structure; instead of using terminal nodes for data record pointers they allow such a pointer to be placed in any node. Their analysis therefore does not require that every such pointer lie at the same level of a tree, or even that the pointers terminate a path through the tree. But in their analysis they assume a uniform random distribution of keys. This assumption is open to question; several factors act to cluster keys. For example, if keys are words from natural language, rules of spelling and phonetics act to cluster the keys. Updating a symbol tree is not difficult, and it can be done online. The only additional structure that is needed is a list of available space, maintained as discussed in Section 4.
With an immediate decoding tree, since the root does not correspond to the first symbol in the key, it is possible to completely reorganize the tree for faster searching. In particular, if some paths from the root to the leaves are considerably longer than others, the upper bound on search time can be lowered by reorganizing the tree to make all the path lengths more nearly equal. Consider as an example the immediate decoding tree illustrated in Fig. 15. The longest path from the root to a leaf has length six; the shortest, two. To reorganize this tree, note that it is simply a representation of a total ordering relation of the keys; this can be demonstrated very easily by traversing the tree in postorder [107]:

1. traverse the left subtree (in postorder)
2. visit the root
3. traverse the right subtree (in postorder).

Tracing the immediate decoding tree of Fig. 15 in postorder gives this ordering of the keys: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15. Note that this is the same ordering that is obtained by traversing the immediate decoding tree of Fig. 14 in postorder. Therefore, these two trees are alternative representations of the same total ordering. The simplest method to reorganize an immediate decoding tree is to traverse it in postorder, constructing a sequentially allocated random-access file that corresponds to the tree. This file can then be used to completely reconstruct the tree by
FIG. 15. Immediate decoding tree No. 2.
assigning the middle element to the root, the one-quarter and three-quarters elements to the next level, and so on. More complicated algorithms that avoid complete reconstruction can also be devised. Minimization of search time for directory trees has been treated by Landauer, who suggests the use of balanced directory trees. Using a balanced directory tree of n levels, all keys are decoded in exactly n or n - 1 levels [111]. Landauer made an extensive theoretical investigation of the branching properties of directory trees and derived an optimum branching factor for such trees. However, his analysis does not include the amount of information that can be obtained in one access to auxiliary storage, and in a practical situation, this parameter is of great importance. If the number of keys at each node is selected so that every node exactly fills the amount of auxiliary storage that can be read in one access, decoding will be more rapid than with the use of Landauer's value for the optimal branching factor.

5.4 Representations of Trees
A number of different computer representations of trees have been developed. This discussion by no means exhausts all the representations that have been suggested; even if it did, the number of possible new representations is unlimited. Rather, this discussion is intended only to present the most important representations, to serve as a starting point for a file design. The most popular representation of a symbol tree, suggested by Sussenguth, is double chaining. Each record contains three items: a symbol, a pointer to the node's filial set, and a pointer to the next node within the same filial set as the node. Figure 16 shows a doubly chained representation of the symbol tree of Fig. 13. When such a tree is searched, every time a match is encountered the filial set of that node is searched. If a match is not found, the search continues through the siblings of the node. The search terminates successfully when a * symbol is encountered; that node contains the file address. The search terminates unsuccessfully when a sibling set is exhausted without finding a match. Thus, the two links provide sufficient information for the complete search, and no other links are needed for a symbol tree. An immediate decoding tree never requires the search of a set of siblings; therefore, the double chaining of Fig. 16 would not be the best arrangement for such a tree. Rather, a scheme such as that of Fig. 17 would be used. In this case, each node includes pointers to the two members of its filial set. During the search, either the left or right branch is taken from each node, or if the search key and file key are equal, the file address, which is stored in the fourth field, is taken as the result of the search.
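The doubly chained representation just described can be sketched directly in code. The following minimal sketch in C treats each node as carrying one symbol, a pointer to its filial set, and a pointer to its next sibling, with a '*' node carrying the file address, as in the search description above; the structure and function names are inventions of the sketch.

    struct snode {
        char          symbol;         /* one symbol of a key; '*' marks a terminal node */
        long          file_address;   /* meaningful only when symbol == '*'             */
        struct snode *filial;         /* first node of this node's filial set           */
        struct snode *sibling;        /* next node within the same filial set           */
    };

    /* Decode a key one symbol at a time: at each level, scan the sibling chain
       for a match, then descend to that node's filial set; when the key is
       exhausted, a '*' node supplies the file address.                          */
    long symbol_decode(const struct snode *level, const char *key)
    {
        const struct snode *p = level;
        while (p != NULL) {
            char want = *key ? *key : '*';
            if (p->symbol == want) {
                if (want == '*')
                    return p->file_address;   /* successful decode    */
                p = p->filial;                /* descend one level    */
                key++;
            } else {
                p = p->sibling;               /* try the next sibling */
            }
        }
        return -1;                            /* unsuccessful search  */
    }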
FIG.16. Doubly chained symbol tree.
FIG. 17. Doubly chained immediate decoding tree.
(a) Degree:  0 2 0 2 0 2 0 2 0 2 0 2 0 2 0
    Key:     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

(b) Degree:  0 2 0 2 0 2 0 1 1 2 0 2 0 2 0
    Key:     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

FIG. 18. Postorder sequential representation of immediate decoding trees. (a) Immediate decoding tree No. 1; (b) immediate decoding tree No. 2.
Some tree searching applications require frequent references upward as well as downward in the tree. Although this is not usually the case in decoding, it can arise if the tree structure is being used for the storage of hierarchically organized files. In this case, Knuth has suggested the addition of another pointer to each node: a pointer to the predecessor node. Directory trees are usually multiply chained, with one pointer stored for each key at each node. The above representations of trees have all assumed the use of linked allocation. However, if sequential allocation is used for trees that are stored on auxiliary storage, the saving of the space used for the links can permit the searching of more nodes with one auxiliary storage access than is possible with linked allocation. From the discussion of postorder traversal above, it is clear that a tree structure represents an ordering relation among the nodes. A postorder sequential representation of a tree can be constructed by listing the nodes in postorder and giving the degree of each node [107], where the degree is the number of branches leaving that node. Figure 18 shows the postorder sequential representation of the immediate decoding trees of Figs. 14 and 15.
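The construction of such a representation is a simple traversal. The following minimal sketch in C follows the author's postorder convention (left subtree, node, right subtree); the structure and names are inventions of the sketch.

    struct btnode { int key; struct btnode *left, *right; };

    /* Emit one (degree, key) pair per node in the traversal order above;
       the degree of a node is the number of branches leaving it (0, 1, or 2).
       *n counts the entries emitted so far and should start at zero.         */
    void postorder_sequential(const struct btnode *t, int degree[], int key[], int *n)
    {
        if (t == NULL)
            return;
        postorder_sequential(t->left, degree, key, n);
        degree[*n] = (t->left != NULL) + (t->right != NULL);
        key[*n]    = t->key;
        (*n)++;
        postorder_sequential(t->right, degree, key, n);
    }

Applied to the tree of Fig. 14, this produces exactly the rows shown in Fig. 18(a).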
6. Implementation of File Structures

6.1 Introduction
When a file is implemented, the implementation must be performed within the constraints imposed by an available computer system. These constraints are in two areas: physical constraints arising from the hardware characteristics of the computer and its peripherals, and software constraints imposed by the data management facilities provided by the operating system. This section discusses the latter constraints as imposed by the data management facilities of Operating System/360 (OS/360), and suggests how these facilities can be used to implement a file.
The term “data set” is used in OS/360 documentation to refer to a logically distinct file, and the term is used with that meaning in this discussion. A data set consists of some number of “volumes,” which are standard physical units of auxiliary storage, such as reels of magnetic tape. OS/360 provides for four data set organizations: sequential, partitioned, indexed sequential, and direct-access. This does not at all mean that a user cannot implement some other file organization, such as a tree structure; it does mean, however, that (if OS/360 data management is to be used) any structure must be implemented as a collection of data sets of the four permitted organizations. A tree structure, for example, could be implemented using a direct-access data set or, perhaps, an indexed sequential data set. Thus, the data set organizations that are supported by OS/360 provide an environment in which file organizations can be implemented, rather than limiting in any way which organizations can be implemented. Naturally, any similarity between a desired file organization and one of the data set organizations should be exploited in order to simplify programming. OS/360 has been designed so that data set organizations are, as much as possible, independent of the physical device upon which the data set resides. There are, of course, basic restrictions that arise from the physical characteristics of storage media; for example, a data set residing on a sequential-access peripheral device, such as magnetic tape, must have sequential organization. All four types of data set organization can be used on a direct-access storage device, such as drum, disk, or data cell, while storage media that are essentially sequential in nature, such as magnetic tape, can contain only sequentially organized data sets. Therefore, if a program is to be run with a data set that can be resident on sequential or direct-access devices, the data set organization used should be sequential, or it must be reformatted before some executions of the program. Another type of “independence” that is available using OS/360 is the ability to make a direct-access data set independent of its location within a volume. This is accomplished by making all pointers within the data set relative to the start of the data set. In this way the data set becomes “relocatable,” much as programs that use only addresses relative to the instruction counter are relocatable.

6.2 Sequential Data Set Organization
Sequential data set organization is analogous to a deck of punched cards, where each card is one record. All records must be processed sequentially; that is, to process the (n + 1)th record, the first n records must first be processed.
The average access time to any record in a sequential data set is a linear function of the number of records in the data set, since locating any record requires a linear search of the data set until the desired record is found. For this reason, sequential data sets are not efficiently used for applications that require random-access capabilities.

6.3 Partitioned Data Set Organization
A partitioned data set is a collection of sequential files (called “members”) along with a directory that gives the name of each member and its location within the data set. The members are organized sequentially. The directory is a series of records at the beginning of the data set. The directory contains an entry for each member of the data set. Each entry contains a member name and the starting location of the member within the data set; the entries are arranged in alphabetic collating sequence by member name. By use of the directory, any member of the data set can be retrieved directly. Members are added to a partitioned data set by writing them after the last member of the set; members are deleted by deleting their directory entry. Deleted members leave gaps of unused storage in the data set; when an addition to a partitioned data set cannot be accomplished because of insufficient available space at the end of the data set, it is “reorganized.” Reorganization consists of recopying all members of the data set, discarding gaps between members. Partitioned data sets are particularly useful for program libraries; any program within the library can be loaded without access to any other. A partitioned data set could also be used for files that are loaded into core in their entirety and searched in core, such as dictionaries and inverted lists (see Section 4). Since a member of a partitioned data set cannot be read other than all at once, partitioned data sets are not useful for storage of files that are searched one record at a time.

6.4 Indexed Sequential Data Set Organization
Indexed sequential organization is analogous to a file of punched cards with a table that gives, for each file drawer, the contents of the last card in the drawer. The data set consists of a hierarchical set of these tables called indexes and a data area that contains the records that comprise the data file. The index structure corresponds loosely to a directory tree, with a very large number of keys at each node. The records in the data area are stored in collating sequence of a key field of each record. Each block of storage in the data area is preceded by a key field that gives the key value for the last record in the block.
There are three levels of index: track, cylinder, and master. A track index gives the key of the last record on each track; there is one track index per cylinder in the data area. A cylinder index gives the key of the last record on each cylinder; there is one cylinder index for each data set. When the size of the cylinder index exceeds a user-set threshold, a master index is created, which gives the key of the last record in each track of the cylinder index. Up to three levels of master index can be provided; the number generated is under user control. Each index entry consists of three fields: a count, a key, and a data area. The key field gives the key of the last record that is indexed by the entry, and the data area contains the full address of the track or record, the level of index, and the entry type. Updating an indexed sequential data set is complicated by the necessity of maintaining the records in collating sequence by key. When a record is added, it is written into its correct position. Records after the new record are moved up one position. If the last record in the cylinder will not fit, then it is written onto the first available location in the overflow area. Overflow areas are specified by the user and are reserved to accommodate overflow during updating. They can be allocated either as part of each cylinder in the data area, or as an independent overflow area, into which overflow from all cylinders is written. Use of an independent overflow area reduces the unused space that is used for overflow, but has the disadvantage that the time to search records in the overflow area is higher because of the disk head motion required to access the independent area. When a record is written into an overflow area, a pointer is appended to the record to connect it logically to the correct track, and appropriate adjustments are made to the track indexes. Once overflow to an independent overflow area has occurred for a substantial number of cylinders in a data set, the time needed to locate a record will be greatly increased because more than one head motion will be needed to access each cylinder that has overflowed. For this reason, and also because overflow areas can become full, an indexed sequential data set must occasionally be reorganized. The user can initiate reorganization, and the system makes available three types of information on which that decision can be based: the number of cylinder overflow areas in use, the number of unused tracks in the independent overflow area, and the cumulative number of references to cylinder overflow areas other than the first. Indexed sequential organization has been developed to satisfy a frequent requirement in the design of random-access files: a file that must be accessed randomly according to some fixed key field of each record, where the set of possible keys has a much greater range than the number of records in the file. In order to implement such a file, a method must be
developed to transform all possible keys into record addresses, where the record addresses are a set of compact integers. Various techniques for performing this transformation have been described in this document (see Sections 2, 3, 4, and 5); however, any of these techniques requires special programming, while the indexed sequential organization is readily available. Although the directory approach used in indexed sequential organization requires at least two disk accesses to locate any record (first to the directory, then the record), it greatly simplifies programming. For applications where performance requirements are not critical, the use of indexed sequential organization provides an economical implementation method. The most serious limitation of indexed sequential organization concerns the use of multiple key fields; if record selection based on a part of the record other than a single key field is to be performed, then the entire file must be scanned sequentially. Another limitation concerns updating. If a file is subject to a very large transaction volume, the degradation in performance caused by the necessity to process overflow conditions can be severe.

6.5 Direct Data Set Organization
A data set of direct organization has space for one record allocated for each possible key value. A fixed relationship is established between the key of a record and its address; then, using this relationship, any record can be accessed without the use of any intermediate tables or indexes. System/360 disk storage is divided into tracks, where a track is a bucket, the largest unit of information that can be retrieved in one access. Although track addresses are not contiguous (for example, track 0100 may follow track 0045), the operating system permits reference by relative track address. Thus, the address of a record in a direct data set is a relative track address and a record number within the track. The key of a record can be used as its address by a simple transformation. The key must be numeric, and the records must be fixed in length. The key is divided by the number of records per track; the quotient is used as the relative track address, and the remainder plus one is the record number.
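The key-to-address arithmetic just described amounts to one division. The following one-function sketch in C is an editorial illustration; the parameter and function names are inventions of the sketch.

    /* Direct organization: numeric key -> (relative track, record number).
       The quotient addresses the track; the remainder plus one names the
       record within that track.                                            */
    void key_to_address(unsigned long key, unsigned records_per_track,
                        unsigned long *rel_track, unsigned *record_no)
    {
        *rel_track = key / records_per_track;
        *record_no = (unsigned)(key % records_per_track) + 1u;
    }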
6.6 A Procedure for File Design

Before a file can be implemented, both a file structure and a data set organization (if the implementation is being performed on System/360) must be selected. It is necessary to use a systematic approach to file design, and not to try to specify everything at once. An attempt to settle all aspects of the file design problem simultaneously leads to confusion and poor design. To discuss a procedure for file design, it is necessary to establish terminology for discussing various aspects of the problem. These definitions are offered for that purpose.
Information structures. Representations of the elements of a problem or of an applicable solution procedure for the problem.

Data structures. Representations of the ordering and accessibility relationships among data items, without regard to storage or implementation considerations. Data structures are highly dependent upon the class of problem to be solved and the environment in which the solution is to be used.

Storage structures. Representations of the logical accessibility between data items as stored in a computer memory.

Performance requirements. Requirements specifying the performance that is to be achieved with the file; these are characteristics that are expressed in units of time, such as maximum access time and update time.

The inputs to the file design process are performance requirements, which are determined by the user of the file, and information structures, which are the data to be placed in the file in their original form. Data structures are formulated as an intermediate step toward storage structures. A systematic procedure for designing a file, using the above terminology, is:

(1) Identify the information structures that are to be stored and manipulated.
(2) Determine the performance requirements to be met by the file design.
(3) Identify the ordering and accessibility relationships that must exist among the data items, and use these to determine the data structures that must be incorporated into the file design.
(4) Select representations of the data structures to be used within the machine that will meet the performance requirements; this determines the storage structures.

Steps 1 and 2 are concerned with the problem that is to be solved with the file. A complete understanding of that problem and of the level of performance that will constitute an acceptable solution is a necessary prerequisite to any more detailed work with the design. Once the problem and the solution requirements are definitely established, then the relationships among the data items can be established. These relationships will be such characteristics as which fields will be used as keys, which data items must be directly accessible from other data items, and so on. The data structures embody the linking relationships among the records in the file. These relationships must be represented, either explicitly by pointers or implicitly by position. These relationships are the key to the final file design; a file structure must be selected to embody these relationships.
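As a small illustration of that distinction, the following declarations show the same ordered set of records represented once with the successor relationship held explicitly by pointers and once implied by position; the record layout and sizes are invented for the example.

    /* The same ordering relationship represented two ways; the record layout
       is invented for the example. */

    /* Explicitly, by pointers: records may be stored anywhere, and insertion
       is cheap, at the cost of pointer space and an extra access per step. */
    struct part_linked {
        long key;
        struct part_linked *next;         /* explicit successor relationship */
    };

    /* Implicitly, by position: the successor of entry[i] is entry[i + 1], so
       no pointers are stored, but insertion forces records to be moved. */
    struct part_entry {
        long key;
    };

    struct part_table {
        struct part_entry entry[1000];    /* ordering implied by position */
        int count;
    };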
Sufficient information has now been developed so that the data structures and performance requirements can be used to select the storage structures, which include both a file structure and a data set organization well suited to that file structure.

ACKNOWLEDGMENTS

The author wishes to thank Dr. John Hutchings, Mr. Sy Kaback, and Mr. Malcolm Tanigawa for their support, encouragement, and suggestions, and Dr. Thomas C. Lowe for his assistance in this and many other endeavors.
Systems Programming Languages

R. D. BERGERON,¹ J. D. GANNON,* D. P. SHECTER, F. W. TOMPA,* and A. VAN DAM
Department of Computer and Information Sciences
Brown University
Providence, Rhode Island
1. Introduction . 176
2. Criteria for a Systems Programming Language . 180
2.1 Target Code Efficiency . 180
2.2 Run-Time Environment . 184
2.3 Error Checking . 185
2.4 Debugging Facilities . 186
2.5 Syntax Considerations . 187
2.6 Adaptability . 188
2.7 Program Modularity . 189
2.8 Machine Dependence . 190
2.9 Multiple-User Systems . 190
2.10 Miscellaneous . 191
3. Specific Constructs . 192
3.1 Data Facilities . 192
3.2 Storage Allocation . 192
3.3 Data Manipulation . 194
3.4 Program Control and Segmentation . 194
3.5 I/O and Debugging . 195
4. Reviews of Several Systems Programming Languages . 196
4.1 PL/I . 196
4.2 AED . 211
4.3 BLISS . 214
4.4 PL360 . 232
5. Extensibility and Systems Programming . 236
5.1 Universal Language or Universal Processor . 236
5.2 Facilities of Extensible Languages . 238
6. Language for Systems Development . 239
6.1 Overview of the Base Language . 240
6.2 Variables and the Declaration Statement . 242
6.3 Program Segmentation and Storage Allocation . 247
6.4 Procedures and Their Invocation . 255
6.5 Statements and Operators . 262
6.6 I/O and Debugging Facilities . 269
¹ Present address: Faculteit Wiskunde, Katholieke Universiteit, Nijmegen, The Netherlands.
* University of Toronto, Toronto, Ontario, Canada.
6.7 Compile-Time Facilities . 273
6.8 LSD Extensibility . 276
6.9 Implementation . 278
Annotated Bibliography . 279
References and Additional Bibliography . 283
1. Introduction
The purpose of this paper is to state the rationale for systems programming languages, to enumerate some criteria for judging them, to subject some existing systems programming languages to close scrutiny, and finally to describe an extensible language currently being implemented which is addressed specifically to our criteria. As with so many topics in computer science, the matter of choosing which language to program in or what language features to emphasize is largely a matter of personal taste (bias) and experience. The criteria which are applied here to systems programming languages are a product of five years’ collective experience of the software technology group at Brown University. This experience includes the design and implementation of production systems, particularly in the areas of interactive computer graphics, time sharing, operating systems, information retrieval, and software paging. Furthermore, our viewpoint is colored by having extensive personal experience almost exclusively with IBM hardware and software, specifically with assembler language, PL/I, and several proprietary systems languages for the /360 (both of the assembler language and machine language producing varieties). Thus we really do not offer absolute and incontestable statements in this paper; while some points will be obvious, many others are controversial, and open to vigorous debate. Some of our judgments will not be applicable in some other environment where resources and constraints are widely different. (Consider an industrial environment in which management, because of the frequent changes in hardware systems, has decreed that FORTRAN is the universal programming solvent, or that if programs are inefficient in use of space and cannot run in the available core then it is cheaper to buy more core than to reprogram.) Other arguments may also be inaccurate in the near future, when hardware characteristics change or when newer (and more sophisticated) implementations of languages bear a closer resemblance to their postulated ideals, while retaining the ability to generate “good” code. Before proceeding to justify systems programming languages and to enumerate their characteristics, some attempt will be made to define them. Since a systems programming language is a language in which one programs systems, a definition for the term “system” is useful. Webster’s defines a
system, appropriately enough for our purposes, as an “assemblage of objects united by some form of regular interaction or interdependence; an organized whole. . .” This definition is adapted in common programming parlance to mean that a system program is an integrated set of subprograms, together forming a whole greater than the sum of its parts, and exceeding some threshold of size and/or complexity. Typical examples are systems for multiprogramming, translating, simulating, managing information, and time sharing. Note that the term system program is more inclusive than the term programming system (which is more nearly synonymous with operating system). Having defined the term system program, it is now possible to list some identifying characteristics. The following is a partial set of properties, some of which are found in non-systems, not all of which need be present in a given system.

(1) The problem to be solved is of a broad nature consisting of many, and usually quite varied, sub-problems.
(2) The system program is likely to be used to support other software and applications programs, but may also be a complete applications package itself.
(3) It is designed for continued “production” use rather than a one-shot solution to a single applications problem.
(4) It is likely to be continuously evolving in the number and types of features it supports.
(5) A system program requires a certain discipline or structure, both within and between modules (i.e., “communication”), and is usually designed and implemented by more than one person.

Being rather facetious, one might say that until recently, an operational distinction between system programs and other (application) programs was that the latter were written in high level (i.e., inefficient) FORTRAN by effete applications programmers. On the other hand, system programs, i.e., “real” programs, were handcrafted in assembler language by members of the systems programmer priesthood, to be lovingly tailored to the idiosyncrasies of the actual machine used. (In all fairness to FORTRAN, particularly early implementations, it should be mentioned that great attention was paid to optimizing for the target machine, i.e., taking into consideration drum latency optimization on the IBM 650, generating code to take account of frequency of occurrence of variables, and taking computation out of loops, etc. Furthermore, FORTRAN was not touted for anything but applications programming.) Thus programmers were traditionally faced with these two extremes: potentially efficient, but hard to
write and debug assembler language, versus inefficient, limited facility, but easy to write and debug high level language. For systems programmers this choice was typically resolved in favor of assembler language. As the size and complexity of systems increased over the past decade,¹ systems designing, writing, and debugging time increased alarmingly, while the comprehensibility and maintainability of the assembler code correspondingly decreased. At the same time, cost per memory bit and per logic unit was going down, making substantially faster and bigger machines economically viable, and making the efficiency argument for assembler language less attractive. This was particularly true because, as programmers’ salaries rose, software began to cost more than hardware. Furthermore, it was found in many efficiency experiments that programmer productivity, as measured in lines of code written and debugged per day, was about the same for the more expressive, yet compact high level languages as for assembler language [10]. Thus the greater productivity of high level programming, coupled with lessening requirements for efficiency, has led to increasing use of high level languages that were originally intended primarily for applications programming (FORTRAN and PL/I). A better compromise, of course, would be to design a special purpose language with the systems programmer in mind. This type of effort began with JOVIAL and NELIAC, which were based on ALGOL 58, and has included languages such as EPL,² MOL360, ESPOL, and others. These systems programming languages were usually created for use by professional programmers to write large, complex data structure manipulation programs. The goal of a systems programming language is to provide a language which can be used without undue concern for “bit twiddling” considerations, yet will generate code that is not appreciably worse than that generated by hand. Such a language should combine the conciseness and readability of high level languages with the space and time efficiency and the ability to “get at” machine and operating system facilities obtainable in assembler language. Designing, writing, and debugging time should be minimized without imposing unnecessary overhead on systems resources. The question arises why a modern implementation of a general purpose language like PL/I would not suffice. Simply put, if the language had a special purpose systems programming implementation, then it would. Commercially available compilers, however, tend to be still too general

¹ In ref. [10] (somewhat dated now), for example, the author states that the Multics System Programmer’s Manual ran to 4000 single-spaced typewritten pages, while the system itself was represented by some 3000 pages of (high level, PL/I-like) source language code.
² An early version of PL/I designed for implementing MULTICS [10].
purpose, all inclusive, and inefficient in terms of code they generate or the run-time environment they invoke. Space efficiency is still an issue; while the amount of core on medium and large scale computers has increased markedly during the last five years,³ so have the users’ expectations and requirements. In fact, a variation on Parkinson’s law has taken place. Although the additional core was sometimes made available to larger programs, more commonly it was used to support multiprogramming, “making better use of the system’s resources” to improve throughput. Consequently, many users found themselves, for example, running in a 128K byte region of the larger machine instead of in a 32K 36-bit dedicated machine (less 6 to 8K for the operating system), experiencing almost no space advantage. As more core was attached, the machine supported more simultaneous users, and only a few privileged users were able to get more space for themselves. Thus the ordinary systems programmer still found himself concerned with overlaying his program and software paging (swapping) his data. Only the few hardware paged machines such as the Atlas Titan II, Sigma 7, GE 645, RCA 70/46, PDP10, or IBM 360/67 have facilities for letting programs run in conveniently large (“virtual”) address space, and even then, not without restrictions. To summarize, space is still limited (and is likely to remain so for a few years despite the application of bulk core). Therefore, space efficiency is still a concern for the systems programmer today. Also, as satellite computers with their smaller core become more prevalent, their systems programmers should have the benefit of being able to program in a high level but efficient language. Furthermore, time considerations, as always, may be of even more concern. Critical loops in a multi-user interactive program must be coded as efficiently as possible to allow real-time responses. Another consideration affecting time efficiency is the compatibility of a program and its execution environment. For example, code which allows address pointers to cross hardware or software page boundaries indiscriminately can severely overtax the hardware with excessive I/O activity to retrieve pages. Such “thrashing” could occur by executing a “pointer chasing” program (such as an operating system) in a virtual (paged) memory, although it was designed for contiguous core. A language which allows the responsible systems programmer to address his hardware/operating system environment (and its constraints) judiciously is therefore still required within today’s state of the art.

³ For example, many IBM 7090/7040s with 32K 36-bit words were replaced by IBM System/360s with 256 or 512K 8-bit bytes (64K or 128K 32-bit words).
2. Criteria for a Systems Programming Language
A systems programming language should be designed to be general enough to be used for writing as many different types of systems as possible. The term “system” encompasses such a wide range of constructions that the requirements imposed on a language by two different systems may be quite diverse; in fact, some may be contradictory. For example, a system which is to be used by an industrial organization for computer-aided design should be machine-independent so that it need not be rewritten when the company alters its computer configuration (possibly changing to a completely different machine). On the other hand, an operating system cannot afford to have any unnecessary inefficiencies, and therefore must be hand-tooled to reflect all the peculiarities of the particular machine. Design decisions in a systems programming language will affect the applicability of that language to a particular project, making one language better suited than another. In fact, most existing systems programming languages were designed for a particular class of systems, and then enlarged to provide for other types of systems. Therefore they cannot be classified as general systems programming languages. The criteria which should be involved in designing a systems programming language or in evaluating an already existing language are enumerated below. The considerations begin with those which influence the running of systems written in the language, and end with those which influence the writing and designing of such systems. Naturally, the most important considerations from our point of view are listed first. Later, five languages will be examined in light of these criteria.

2.1 Target Code Efficiency
A systems programming language must produce efficient code for the machine on which the system is to run. The problem of inefficiency exists at all installations in one or more forms, due to limitations on processor speed, core size, or I/O activity. Each systems programmer should know the restrictions of his particular machine configuration and job mix, and tune his program to minimize the demand on the resource which is most scarce. For example, where a machine is predominantly I/O-bound, all input and output must be made as efficient as possible, at the expense of other resources such as memory or processor cycles if necessary. The problem of inefficiency is compounded considerably in a multitasking environment. Consider a program executing in a paging environment. If the program uses its data areas in an inefficient manner, the extra paging activity will certainly slow down the program’s execution.
In addition, other programs will be slowed down because some pages which they require will have been swapped out of core to make room for those of the inefficient user. Therefore, each program in the system will require more than the usual amount of paging and the degradation of execution times will snowball, reducing batch throughput and response time for interactive programs. The target code of a systems language should make full use of all the capabilities of the machine. For example, a systems programming language should not be limited to conventional addressing schemes on a machine which has a hardware mechanism for indirect addressing, only because other machines are not so equipped. A serious attempt should be made to have every statement produce optimal code for the particular machine in the environment in which it is used. While some constructs in the language must produce a large amount of code, generate a large work area, or do a significant amount of input or output, the user of the language must take care that his system does not consist of unnecessary occurrences of this type of construct. Thus a user should be made aware of the fact that compilation of a given statement results in an expensive section of code. For example, the generation of implicit (i.e., not explicitly specified by the user) calls to run-time conversion routines seems like a great freedom for the programmer, yet the use of these routines might be impractical in the system, due to the time and storage space necessary to perform the conversion. For cases such as these, the user should be made aware of the cost of his code. The Users’ Guide for the language should list the estimated storage requirements and execution overhead of all constructs. If a run-time routine will be invoked, he should be informed that something is being done “behind his back.” When an expensive construct is actually detected, the compiler should signal the user by printing out warning messages and a cross-reference table of such routines, including their sizes, in the listing. Through such feedback a systems programmer can have full control over the consequences of his coding techniques.

2.1.1 Programmer Control of Compiler Facilities
For the majority of his program, the user of any high level language should not have to worry about particulars of the implementation of his task, but rather he should be able to describe that function and let the systems language work out the details. On the other hand, there are usually several critical points in the program where a running system’s overall performance can be greatly improved through better code which only the programmer himself can describe. For this reason it is important
that the systems programming language have facilities for the user to “help it do the right thing.” The following example will illustrate the point: Suppose a particular set of variables is accessed by each module in some large system. Since the routines are compiled separately, no global optimizer could discover this condition. (Good optimization of a general purpose high level language is as yet an unsolved problem.) Either passing a pointer to each routine or denoting that pointer to be external may be too inefficient, since it would need to be accessed very often. The user should be given the option of informing the compiler that he will require that particular pointer in most of his code (perhaps by specifying that it should be kept in a register, thus eliminating redundant instructions).
In this case the user knows a peculiarity of his program that the compiler could not have known. The execution time for the system could be reduced (especially in the case of critical loops). Furthermore, when this situation occurs frequently in one program, there may be a significant saving of space due to the deletion of superfluous LOAD and STORE instructions. In addition to register usage, local efficiency with respect to space could be gained if the user were permitted to indicate that a common parameter area be used for all subroutine calls instead of each call establishing its own area. Efficiency requirements may also be specified globally, in less low level detail. In PL/I,⁴ for example, the user may make use of the OPT option, through which he may indicate whether he wants compile-time, run-time, or core usage optimized.
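A later language makes the point concrete: the register storage class of C plays exactly this role, letting the programmer tell the compiler which heavily used pointer deserves a machine register across a critical loop. The structure and routine below are invented for the illustration and are only a sketch of the kind of hint being argued for.

    /* Hypothetical module sharing a common area; the names are invented.
       The register declaration asks the compiler to keep the heavily used
       pointer in a machine register for the duration of the routine. */
    struct common_area { int status; int queue_length; };

    int drain_queue(struct common_area *shared)
    {
        register struct common_area *p = shared;   /* the hint */
        int work = 0;

        while (p->queue_length > 0) {     /* critical loop: p is used constantly */
            p->queue_length--;
            p->status = 1;
            work++;
        }
        return work;
    }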
2.1.2 Optimizing in Assembler Language

It is a highly controversial point whether or not a systems programmer should implement his system on only a high level. There are many who feel that since an algorithm can be coded completely from within a high level language, the programmer has no need to depart from the purely symbolic source code to consider the target code produced. That is, little value may be gained compared to the amount of time and energy needed to code in assembler language. A truly ideal environment is one in which all systems software including the operating system and perhaps even the machine’s “instruction set” is on a high level (i.e., PL/I-like). Thus, all designing, writing, and debugging can be done at this level. (Such an environment might evolve when there exist firmware translators for languages like PL/I.) Certainly in a completely bug-free environment, where a programmer can put all his faith into the operating system, the compiler, the interactive supervisor, and the debugging package, all programs can be written and debugged on a

⁴ Hereafter, unless otherwise noted, PL/I refers to the IBM implementation of PL/I (F compiler).
purely symbolic level with none of the users ever needing to see the target code generated. Unfortunately in today’s world, too many systems tools are not ideal: debugging packages lose control over the testing program, compilers generate inefficient code, operating systems affect programs where they should not, and even the hardware occasionally “goes down.” For example, at Brown University two perplexing I/O problems (dealing with the updating of more than one record per track and “negative track balances”) were solved only after painstaking examination of the assembler language listings of IBM-written access method subroutines. As in this case, it is sometimes necessary to examine a system at its lowest level in order to completely determine its action. A systems language compiler should be able to generate an (optional) assembler language listing in order to help satisfy the criteria described for target code. First of all, an assembler language listing is an ideal method for obtaining feedback on the efficiency or inefficiency of a particular statement. The amount of space required for implementing a given statement can immediately be determined from the assembler language produced. Furthermore, implicit subroutine calls could also be easily detected. For example, the expense of initializing a large array in automatic storage in PL/I can immediately be seen by the pages of assembler language in the listing. In addition, an assembler language listing is a good debugging tool when an obscure bug is detected six months after the compiler has been finished. It may be claimed that producing an assembler listing is too expensive in that it introduces an extra compile-time pass, i.e., an assembly step. However, this step is unnecessary if the compiler generates an assembler listing along with the machine code. One extra lookup in a table of opcodes and one extra I/O operation per instruction is well worth the expense when it can eventually save programmer time or the execution time or space needed by the system itself. Others say that a disassembler which translates from object code back to assembler language is much more practical. Besides the expense, other drawbacks of using this post mortem translation are that mnemonics representing lengths cannot be recovered and the substitution of component names for the displacements into structures is impossible since these are indistinguishable from other constants in the machine code. In a few cases, it is impossible for the user to describe a peculiarity of his system. Furthermore, every compiler will have features which restrict some systems programmers. These unsatisfactory conditions can be relieved by allowing the insertion of assembler language as in-line “open subroutines.” The section of code thus produced will contradict some of the rules of a systems programming language, especially syntactic clarity,
but at times the advantages of low level coding are great enough to compensate for this loss. When an available facility is expensive in the systems language, judicious recoding of critical portions in assembler language may also be valuable. By taking the machine environment into consideration, a programmer may modify the compiler’s target code to raise the overall efficiency of his system appreciably. The following example will show that a tight algorithm on the source code level does not necessarily ensure an efficient program: A linguistics group at Brown University was coding in an early version of PL/I which did not include the TRANSLATE and VERIFY functions. In a key routine, they needed to scan a character string for the first occurrence of one of several eight-bit configurations, returning the index of that character in the string. To implement this, they coded a very tight PL/I loop (which compiled into a large amount of machine code). Finally, a systems programmer realized that they had just simulated the single /360 TRT instruction. By recoding that small, key routine in assembler language, he was able to reduce the execution time of the whole program by two orders of magnitude. It was only through low level knowledge of the machine that the program could be made more efficient.
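The effect of the TRT instruction can be suggested in C: one table lookup per character replaces an inner loop over the candidate delimiters. The sketch below illustrates the technique only; it is not the group's original routine, and the table contents shown are assumptions.

    #include <stddef.h>

    /* Return the index of the first byte of s (length n) that is flagged in
       table[], or n if none is found; this is the table-driven scan that TRT
       performs in a single instruction. */
    size_t scan_first(const unsigned char *s, size_t n,
                      const unsigned char table[256])
    {
        for (size_t i = 0; i < n; i++)
            if (table[s[i]])              /* one table lookup per character */
                return i;
        return n;
    }

    /* Usage: flag the byte values of interest once, then scan.  The chosen
       delimiters here are arbitrary.

           unsigned char stop[256] = {0};
           stop[','] = stop[';'] = stop['\n'] = 1;
           size_t hit = scan_first(buffer, length, stop);
    */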
The trade-off between the efficiency of assembler language and its lack of clarity is easily resolved since such inserts should occur only infrequently when the relevant section of the program is completely thought out. Furthermore, documentation should be provided in great detail to explain the meaning of the code (in fact, the original code in the systems language provides excellent documentation in such a case). 2.2 Run-Time Environment
Since many systems have common run-time needs, such as I/O and dynamic storage allocation, it is helpful for a systems language to provide a set of run-time routines. The storage saved by not duplicating the in-line code every time a construct is used must adequately compensate for the extra instructions needed to link to this run-time environment. (The threshold point is a function of the number of parameters and the length of the code in the subroutine.) As stated above, the disadvantage of a run-time environment provided by the system is that the user is unaware of the cost of the routines he invokes because he does not know the implementation details. He should be informed of the approximate cost, in execution time and core, of each routine so that he can judge whether it is worth using. The routines may be made available to the user through the use of the subroutine library from which he can select those which are applicable to his program. Needless to say, the systems language should not burden the
user with all the system-generated run-time routines, but rather only the few that are actually needed. In this way, provisions in the language for a particular complex routine do not weigh down a user who does not need that facility. By making the run-time routines modular, the system can maximize storage use by allocating and freeing seldom used routines at execution time. 2.3 Error Checking
In general, a system which is implemented using a high level language will have three basic types of error checking. The first of these occurs at compile time when, for example, the compiler checks for legal syntax or for matched data types in an assignment statement. The second (implicit run-time error checking) occurs if the compiler generates code to check for legal contents of some variable, for example, one being used as an index into an array. Finally, there is explicit, problem-oriented error checking, that is, checks actually specified by the user. These might be needed in a time-sharing system to test the legality of its own users' input.

Implicit run-time error checking in a systems programming language should not be automatically generated. When a complicated system is running, it cannot afford to be slowed down by unnecessary error checking in each instruction; the system may have to run very quickly, especially in an interactive environment where response time is crucial. By the time the system is in production use, "the program bugs have surely been removed." Thus most of the implicit checking would be just time-consuming without any positive effect. In those cases where system integrity would be in jeopardy without run-time checking, the user may specify explicit error checking.

The next question that might arise is whether implicit error checking should be done at compile time. First of all, normal symbol table lookup and type checking must be done at compile time to determine which mode of instruction is to be generated. For example, on the /360 the compiler must decide whether to generate a LOAD FULLWORD or a LOAD HALFWORD for a given instruction in the high level language. Second, the time element involved is of an extremely different nature. At compile time, running speed is not crucial since the compilation will not be done as often as the system is run. Thus, as long as compilation is not made significantly slower (even by as much as 50% or more), error checking is manageable. Although the compiler should not terminate the user's job, it can check for abnormal conditions (such as branching into an iterative loop) and print an appropriate warning message. The word "warning" is important here, because in a systems programming language (intended for knowledgeable users) the compiler should not
attempt to correct the user's mistakes. At one time or another, almost every systems programmer has to get around some restriction placed on him by the design of the language. For example, a user may wish to pass a character string of length four to a subroutine which expects a full word. Since its users are aware of what they are doing, the compiler should allow such a "mistake" and merely issue a warning message, as long as it can generate code.5 Note how this differs from PL/I's handling of warnings. The PL/I compiler will attempt to correct syntax errors; will generate conversions where there are actually errors (without generating warnings); will not allow the user to get around restrictions such as printing pointers and offsets; and (in the F-compiler) will assign assumed attributes to undeclared variables without generating warnings.

5 Again the contradiction to semantic clarity is inherent here, but occasionally the trade-off must be settled in favor of "tricky" (but well-documented) code.

If a compiler does not automatically provide run-time error checking, a user of the language must be able to specify explicit error checking from within the language. Provision must be made, for instance, for the user to specify that on a given array he does want a check to be made for an illegal subscript. This facility can be provided in the form of actual instructions that the user must code after the access to the array, or it can be done through a statement which means "whenever the array ARRAY is coded, generate a check for an invalid subscript." Thus the user has the option of having error checking at those few places at which he actually needs it, instead of every place the compiler feels it is needed.
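As a concrete illustration of the first of these two forms (a sketch in C rather than in any language discussed in this chapter; the array name, its size, and the routine are invented), an explicit, user-coded subscript check placed at one sensitive access might look like this; the declarative form would instead have the compiler generate the same test at every access to the named array.

    #include <stdio.h>
    #include <stdlib.h>

    #define TABLE_SIZE 100
    static int table[TABLE_SIZE];

    /* Explicit, user-specified error checking: the test is written out only at
     * the few accesses where the subscript comes from untrusted input. */
    int fetch(int i)
    {
        if (i < 0 || i >= TABLE_SIZE) {   /* check coded by the user          */
            fprintf(stderr, "subscript %d out of range\n", i);
            exit(1);
        }
        return table[i];                  /* no compiler-generated check here */
    }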
2.4 Debugging Facilities

Simple I/O should be provided for the user so that he may debug with a minimum amount of excess work. Facilities for printing out the core contents of small areas of memory or for dumping the values of given variables should be included for this purpose. Such facilities can be provided on demand (that is, the user indicates the point at which he wants the contents of the variable printed), or universally (that is, the user states that whenever a certain identifier is used, the value is to be dumped). Since a systems language should be designed for today's interactive environment, aids should be provided for debugging on-line as well as off-line. One such technique is to allow the user to "snap" selected variables and to have them printed on his console. In order to remain completely symbolic, a systems language could also provide a run-time symbol table (only on request) in order to print out the contents of any variable by name (i.e., the user issues the command PRINT VAR1). In many cases all of the user's core may be dumped symbolically, so that he never needs to worry about such low level considerations as the hexadecimal representation of a character string.
2.5 Syntax Considerations

After the provision of efficiency and cost information, an important consideration in designing a language is to make it as readable as possible. While this seems terribly obvious, it is surprising how many languages violate the principle with awkward or contrived syntax. The instructions themselves should read as easily as if they were part of the documentation; comments within the program should not be the sole method for explaining the algorithm or logical flow, but rather they should enhance the meaning of a set of instructions. One aid in accomplishing self-documentation is to allow long names for identifiers, in order that the labels be self-explanatory. Furthermore, a general character set should be employed in order to avoid trumped-up notation. For example, in specifying that a pointer should be set to point at a ring element, the *1 language construct is certainly not as mnemonic as

    GRAPHICS_POINTER = ADDR(RING_ELEMENT)

even though they may both do the same thing.

In order to understand a program, the statements must not only be readable, but the use of each should clearly convey the corresponding semantics.6 The language should be rich enough to provide all the facilities a systems programmer wants. A user must be able to obtain the desired facilities and efficiencies through straightforward use of the commands of the language. However, these must exist without violations of widely accepted notations. For example, a common symbol such as the plus sign (+) should be used only for arithmetic addition and not for something else such as logical OR. One of the biggest advantages of using a high level language rather than assembler language is that the programmer can state what he wants to do without having to resort to "misusing" constructs. In a high level language an instruction should be used for the purpose for which it was designed, rather than to accomplish some side effect of that command.

6 Designers of languages should be aware that, due to past experience with other programming languages, their users will have a background in which certain syntactic constructs have associated semantics. The constructs of a language should avoid contradicting these "natural" meanings.
This feature seems quite obvious at first glance. However, it is often missing from low level languages. For example, in OS Assembler Language a branch and link instruction, which was intended to branch to a subroutine saving the return address in some register, might be used to load the address of a parameter list into register one before a call is made to a subroutine:

    *
            BAL   1,LABEL        BRANCH AROUND PARM LIST; LOAD ADDR IN R1
            DC    A(PARM1)       FIRST PARM
            DC    A(PARM2)       SECOND PARM
    LABEL   BAL   14,SUBROUTN    CALL THE SUBROUTINE
Since such trickery tends to accumulate, making the logic of the program difficult or impossible to follow, it should be avoided in a systems programming language. It is important to remember that the code must often be examined by other individuals who interface with the system, as well as by the author himself. The readability of a language also tends to make it easier to learn. After an initial, short learning period, a programmer should not have to refer to a manual to determine the uses of various constructs. The rules of syntax and of semantics should be natural to the user, and there should be few special cases (i.e., few exceptions) to be memorized. Furthermore, simple commands should be provided for common tasks. All keywords must be meaningful to the user in his application; the semantics should be unambiguous; and misstating a construct should not yield another construct, but rather should be recognized as a mistake. For example, the IBM CMS program editor has both LOCATE and FIND commands, which mean about the same thing in common parlance but provide different facilities, thus resulting in confusion; OS Assembler Language has both SLR and SRL, which look nearly the same but represent a subtract (Subtract Logical Register) and a shift (Shift Right Logical), thus allowing keypunch errors to become programmers' nightmares.

2.6 Adaptability
No compiler can be expected to provide for every facility that the programming world will require. (The idea of a universal language has finally been discarded by most people interested in systems programming languages.) A language designer cannot even predict all the constructs needed within a particular environment. Thus the compiler ought to support definitions of new types of operations and operands.
A method for achieving a measure of adaptability is through compile-time statements. A good general purpose macroprocessor can aid considerably in making the language adaptable to the user. Furthermore, the ability to include strings of text from user or system libraries (e.g., the %INCLUDE statement in PL/I) is almost essential in an environment in which several programmers work on a large project. Another answer to the problem is for the systems language to be extensible. In this way, a user can include in his version of the language a construct that is somewhat unusual but very useful to his application. Furthermore, he need not be concerned that another user's extensions are included in his version, possibly making his system more bulky and less efficient. Also, a user should be allowed to change keywords in order to make their meaning more natural to him, as sketched below. This almost trivial type of extensibility makes lexical changes possible in order to provide the capabilities described in the section on syntax above. Substitution for operators or delimiters should also be allowed to aid in the legibility of the syntax to the user.
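The following fragment suggests what such a lexical change might look like, using the C preprocessor merely as a stand-in for the general purpose macroprocessor discussed above (the keywords UNLESS and INCREMENT are invented for the example).

    #include <stdio.h>

    /* The user substitutes keywords that read more naturally to him,
     * without touching the compiler itself. */
    #define UNLESS(cond)  if (!(cond))
    #define INCREMENT(x)  ((x) = (x) + 1)

    int main(void)
    {
        int retries = 0;
        UNLESS (retries > 3)        /* reads as the user intended */
            INCREMENT(retries);
        printf("%d\n", retries);    /* prints 1 */
        return 0;
    }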
2.7 Program Modularity

Whenever a complex system is built, sections will be compiled separately, even though they will run at the same time. Parts of a large system may be written by many people who will have to link their modules together. Furthermore, all systems undergo transformations in time as techniques are improved and sections are redesigned, thus making the system's modularity vital. In addition, some parts of a system might be coded in a different language in order to use each language to its best purpose, and thus improve the performance of the final system. For these reasons, the systems programming language must provide facilities for linking the object code modules it produces to each other, and to those produced by FORTRAN, assembler language, or any other language that may be used. In many cases it is not sufficient to provide only for a CALL to a subroutine which will return when completed. The expense of modularity is such that a systems language should provide linking mechanisms which are as flexible as possible. Components of large systems may require subroutines to be loaded dynamically (provided on the /360 by LINK). Occasionally, components may link in such a way that the calling routine does not require a return (provided by the /360 XCTL); thus, the storage space for the calling routine may be returned to free storage. In a multiprogramming environment, routines are called to run asynchronously to the "mainline" (as with the /360's ATTACH). Furthermore, almost all systems have to use some routines which are invoked through the supervisor or
monitor. For these and other cases, various types of run-time linking must be provided by a language, so that the systems may take full advantage of the many modes of call.
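As a rough modern analogue of a LINK-style call (a sketch in POSIX C using dlopen, which is of course not a /360 facility; the module and entry-point names are hypothetical), dynamic loading of a separately compiled module might look like this:

    #include <dlfcn.h>
    #include <stdio.h>

    /* Locate and load a module at run time, invoke a routine in it, and
     * return control to the caller -- the flavor of the /360 LINK call. */
    int call_dynamic(const char *module, const char *entry)
    {
        void *handle = dlopen(module, RTLD_NOW);    /* load the module     */
        if (handle == NULL) {
            fprintf(stderr, "load failed: %s\n", dlerror());
            return -1;
        }
        void (*routine)(void) = (void (*)(void)) dlsym(handle, entry);
        if (routine != NULL)
            routine();                              /* transfer and return */
        dlclose(handle);                            /* free the module     */
        return 0;
    }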
2.8 Machine Dependence

Many people have promoted the idea of complete machine independence of high level languages in order to allow portability of programs between machines. Furthermore, machine independence at the source level insures that the programmer is not merely forced to manipulate code, but allowed to program an algorithm. On the other hand, systems programmers need to use the full capabilities of the machine at hand. Such functions as input and output, though they exist on all machines in one (or more) forms, are completely different in their facilities and effects, and thus the source code should reflect these differences. The language must provide facilities which will generate target code that takes advantage of all the strong features of the machine, in order to allow the programmer to use that machine to the best of its abilities. For example, a language designed for the System/360 should provide attributes which correspond to half words and bytes, and constructs that make use of the assembler language instruction TRT (a character scanning instruction). Hence complete machine independence cannot be a major design criterion of a truly efficient systems programming language. [See Section 2.1 on target code efficiency for a related discussion.]

2.9 Multiple-User Systems
Certain facilities are needed by systems that will have many users simultaneously. Some of these should be included as part of the language itself, whereas the rest should be available to the user only through "hooks" in the compiler. The hooks may be accessible either through syntactic and/or semantic extensions to the language, or by subroutining facilities. The systems programming language should therefore be only one part of a systems programming environment. The systems programming language must provide a compile-time option to generate reentrant code.7 This feature is essential for writing an interactive system. By allocating the data areas dynamically, the system will be able to support multiple users who are likely to use the same routine
concurrently; any location which is likely to change must be allocated separately for each user.

7 Even for non-reentrant code, there should be no facility for the code to modify itself. Self-modification loses the algorithmic clarity that is desired in a program which might have to be understood by many people, and usually gains very little on today's machines.

A facility which should be included as part of the systems programming environment is a software paging mechanism. Even on a machine with hardware paging, software paging is necessary to facilitate handling of large files and to swap users and data in and out of core in a multiple-user system. Of course, this should be an option for the user to specify, so that other systems need not carry that overhead. Another such facility is to allow the user to include protection in his system. In a multiple-user environment, there are certain restricted files and particular areas of storage that can be accessed only by certain individuals. In some cases files need to be protected from modification, at other times from scrutiny or execution by other users. The environment might be able to provide mechanisms in the paging system for setting and inspecting keys for protection from reading, writing, execution, or any combination of these.
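The reentrancy requirement above can be suggested by a small sketch (in C, with invented names): any location likely to change is kept in storage allocated per user, rather than in static storage that concurrent users of the same routine would share.

    /* Per-user state is passed in (or allocated dynamically for each user). */
    struct user_context {
        int  line_number;       /* one copy of the changing data per user */
        char input_buffer[80];
    };

    /* Reentrant: all changing data lives in the caller-supplied context,
     * so any number of users may execute this routine concurrently. */
    void next_line(struct user_context *ctx)
    {
        ctx->line_number = ctx->line_number + 1;
    }

    /* Non-reentrant: two users calling this concurrently would interfere,
     * because the changing location is shared static storage. */
    static int shared_line_number;

    void next_line_unsafe(void)
    {
        shared_line_number = shared_line_number + 1;
    }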
2.10 Miscellaneous
A language designer must remember that the language must satisfy the users' needs and still be pleasant to use. The users of the language will judge it on syntactic and semantic clarity, ease of writing and debugging, and efficiency of target code. A proverb due to A. J. Perlis is: "One man's constant is another man's variable." Since all programmers have their own ideas of clarity and ease, as many default options as possible must be left unbound until the user indicates his preferences. There is great variation among the ideals of programmers who implement systems. As an example, one group may prefer undeclared variables to assume some default attributes, whereas another prefers to have them flagged as errors. As many parts of the system as possible should be adaptable to the user, i.e., implemented either as a set of parameters changeable at each compilation of the system's components, or set in the compiler by the project manager for the duration of the coding of the project. For those parts of the system which cannot be adjusted to the peculiarities of the user, the designers of a systems programming language should obey the "Law of Least Astonishment." In short, this law states that every construct in the system should behave exactly as its syntax suggests. Widely accepted conventions should be followed whenever possible, and exceptions to previously established rules of the language should be minimal.
3. Specific Constructs
The previous section enumerated the global requirements of a good systems programming language. It considered the overall criteria of such a language both at the source and at the target levels. This section lists the specific facilities needed by programmers for implementation of systems which will function nearly as efficiently as if hand-coded in assembler language.

3.1 Data Facilities
Fixed point variables are used in every program to serve as counters, to contain quantities such as the current length of a varying length character string, or to be used as pointers to reference other data items. Naturally, all the basic arithmetic operations must also be available for manipulating these quantities. Some systems need floating point variables and operations as well. These might be used to contain such numerical data as graphic coordinates. Both bit and character data types are required: the first might contain flags and masks; the second, parsed data items and, in a translator writing system or compiler, generated code.

3.2 Storage Allocation
In order to perform data manipulation efficiently, the systems language should provide various methods of storage allocation on three basic levels as discussed below: program (static), segment (implicit dynamic), and statement (explicit dynamic). A data item which is allocated on the program level is accessible by that program (i.e., “module” or whatever name is given to an independent unit of an executing system) at any time. That is, space is reserved for the item for the duration of program execution. A value defined there during compile time or stored at any time during execution may still be obtained by the system at the end of execution or at any time in between. Data items requiring this static type of allocation include any constants used by the system. Many variables are only needed in a single routine, and therefore they need not be accessible throughout the whole program. In order to conserve space, the variables could be allocated on the segment level (implicit dynamic), that is, the space could be obtained from a free storage mechanism upon entry to the block or subroutine in which it is needed and then freed again (returned to free storage) upon exit, after which the variable may not be referenced.
Since segment level allocation is performed on entry to the routine, such variables may be used in a reentrant or recursive environment. If the routine calls itself, either directly or through another routine, the values of the variable will be saved (e.g., in a push down stack) and the variables themselves will be reallocated. Upon exit from the second instantiation, the second copy of the variables would be freed, and the first set reestablished as the current values. Similarly, if two separate users access the system at the same time in a reentrant environment, each would be allocated a separate copy of the variables on entry to the routine. The final mode of storage allocation (explicit dynamic) is done on the statement level and is directly controlled by the user. That is, the user issues a command to allocate a variable and at some time later, he issues an explicit command to free it again. During that period of accessibility, the variable is “based on” (referenced by) a pointer (whose value is an address) that has previously been allocated in the program (on the program, segment, or statement level). Such a variable is accessible regardless of segment nesting, as long as the pointer on which that variable is based is accessible. Every reference to the new variable is evaluated by first locating the basing pointer, then using its value to determine the location of the based variable. Thus the declaration of a based variable specifies the attributes of a template which may be located anywhere in core at run time since the value of the basing pointer is undetermined at compile time. In fact, one may reference any section of core as if it were that variable by changing the contents of the basing variable or by specifying another pointer as the base for the variable. Through the use of the second of these options, a user may have “multiple copies” of the variable in core at once, each one referenced by a different pointer. This type of allocation would be used typically to store the blocks of a linked list. The first copy of the variable (head of the list) would be based on an already existing pointer, and each new element would be referenced by a pointer in the previous block. Elements could be added to the list by allocating a new copy of the variable and basing it on the pointer contained in the last element of the list; others could be deleted by altering the pointer of the previous block to reference the following one, and then freeing the variable itself. The allocation of storage is closely related to the concept of scope. The name of a variable must be made known to the compiler whenever a user wishes to reference it. Local variables are those which are known only within the block in which they were declared (including all statically nested blocks). On the other hand, global variables are those which are known in any block in the system. The option of using variables globally requires that the language provide
two constructs. One of these is used to declare that a variable is to be allocated in the declaring block, but may be referenced in any block in the system. (This is often called entry or global.) The other construct specifies that a variable that is allocated in another block (as a global variable) is to be known by the declaring block. (This is usually referred to as external.)
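Statement-level (explicit dynamic) allocation of the kind described above, used to build a linked list, can be suggested by the following sketch (in C, with malloc and free standing in for explicit ALLOCATE and FREE commands, and the structure playing the role of a based variable; all names are invented).

    #include <stdlib.h>

    /* The template of the "based" variable: each allocated copy is referenced
     * through a pointer held in the previous block of the list. */
    struct element {
        int             value;
        struct element *next;
    };

    /* Add a new element after the one referenced by 'prev'. */
    struct element *insert_after(struct element *prev, int value)
    {
        struct element *e = malloc(sizeof *e);   /* explicit allocation */
        if (e == NULL)
            return NULL;
        e->value   = value;
        e->next    = prev->next;
        prev->next = e;
        return e;
    }

    /* Delete the element following 'prev' by altering the previous block's
     * pointer to reference the following one, then freeing the variable. */
    void delete_after(struct element *prev)
    {
        struct element *e = prev->next;
        if (e != NULL) {
            prev->next = e->next;
            free(e);                              /* explicit freeing */
        }
    }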
3.3 Data Manipulation

All systems must organize data and subsequently access that data as efficiently as possible. The most common method is conventional list processing ("pointer chasing") in which the data is organized into ordered lists, i.e., blocks ("structures" or "beads") randomly stored but linked together in some order by a chain of pointers. Typical operations which must be provided from within the systems language are constructing lists (establishing the pointers), searching lists for a particular data item, and deleting items from a list. Another method is to use arrays for storing data, with list processing replaced by operations on indices. Additional types of data organizations may be implemented through hash coding techniques [13] and set theoretic structures [8]. String manipulation facilities must be provided to manage bit and character data. Typical operations are pattern matching, accessing a substring, concatenating two (or more) strings, inserting, deleting, substituting, rearranging, etc.

3.4 Program Control and Segmentation
Flow of control must be maintained within individual modules, as well as between separate components of the system. A clear program logic, which is probably the most crucial aspect of systems design, can be maintained only through the use of iterative loops and conditional and unconditional jumps. Typical constructs available for this purpose are IF ... THEN ... ELSE, WHILE, FOR, and CASE statements. Others suggested by Dijkstra provide for the elimination of GOTOs through the use of DO loops and escape expressions. A common method of program segmentation is through the use of subroutines. Control is passed to a subroutine (or procedure) which saves the environment of the calling routine (return address and register or accumulator contents). Upon entry, a procedure must allocate all its segment level (local) variables. This routine may invoke several other procedures in turn, but eventually must regain control. Finally the procedure frees its local variables, restores the environment of the calling procedure, and
returns control to that routine. Subroutining requires CALL and RETURN commands for maintaining control. Other useful facilities for run-time linking are the /360 commands ATTACH, XCTL, and LINK described in Section 2.7 on program modularity. In some cases, a user prefers to communicate between two equally important routines. The systems language may provide the facilities of coroutines, where control can pass from one routine to another and back again without freeing the segment level storage of either. An additional method of segmentation is block structure, such as that found in ALGOL. A block is similar to a subroutine in that it signifies a logical unit both at compile time and at run time. Variables and labels may be declared and are addressable within this block and all of its inner blocks. Thus, as is the case for subroutines, data local to the block may be overlaid in storage by that of another block when it is not being used. Whereas a subroutine can only be entered by an explicit "call", a block is entered by passing sequentially through the block head. Thus a block has no need for facilities for returning to the point of entry, but merely passes control sequentially to the next segment. Another method of segmentation is to provide routines which will receive control after an asynchronous interrupt (one which occurs from some outside source at an unpredictable time). A typical example of such an interrupt is a machine check (such as an addressing error) or a lightpen detect in a graphics system. These routines behave similarly to subroutines in that they are usually not entered sequentially, and they return to the point of execution at which the interrupt occurred. (It is often convenient to use a synchronous device for "calling" this type of segment, such as PL/I's SIGNAL facility, which simulates an appropriate asynchronous interrupt.)
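A rough present-day analogue of such an interrupt routine (a sketch using POSIX C signals, not the PL/I SIGNAL facility itself; the handler name is invented) illustrates both the asynchronous entry and the synchronous simulation of the interrupt.

    #include <signal.h>
    #include <stdio.h>

    /* The interrupt routine: it is not entered sequentially, and on return
     * execution resumes at the point at which the interrupt occurred. */
    static void on_interrupt(int sig)
    {
        (void)sig;    /* nothing safe to do here beyond noting the event */
    }

    int main(void)
    {
        signal(SIGINT, on_interrupt);   /* establish the interrupt routine   */
        raise(SIGINT);                  /* synchronously simulate the        */
                                        /* interrupt, as PL/I's SIGNAL does  */
        printf("execution resumes after the interrupt\n");
        return 0;
    }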
3.5 I/O and Debugging

Data may be transmitted to and from files in either stream or record format. Stream I/O may be treated by the user as a continuous flow of information. Although this type of I/O is usually associated with devices which do either input only or output only (such as a card reader, card punch, or printer), it can also be used to access sequential files on disks or tapes. On the other hand, record I/O requires that the data be segmented into logical units, having a fixed or a varying length. This type of I/O may be used for direct access as well as sequential files. In order to allow I/O to overlap program execution and improve efficiency, the systems environment should provide a means for buffering the data to be transmitted. In the case of input, this implies that several records be read into core before they are actually required by t