Advances in Computers, Volume 13

Contributors to This Volume

B. CHANDRASEKARAN, PATRICIA FULTON, JAMES JOYCE, BOZENA HENISZ THOMPSON, FREDERICK B. THOMPSON, RICHARD L. WEXELBLAT
Advances in Computers

EDITED BY

MORRIS RUBINOFF
Moore School of Electrical Engineering, University of Pennsylvania, and Pennsylvania Research Associates, Inc., Philadelphia, Pennsylvania

AND

MARSHALL C. YOVITS
Department of Computer and Information Science, Ohio State University, Columbus, Ohio
VOLUME 13

ACADEMIC PRESS · New York · San Francisco · London · 1975
A Subsidiary of Harcourt Brace Jovanovich, Publishers
COPYRIGHT © 1975, BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.
ACADEMIC PRESS, INC.
111 Fifth Avenue, New York, New York 10003

United Kingdom Edition published by
ACADEMIC PRESS, INC. (LONDON) LTD.
24/28 Oval Road, London NW1
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 59-15761

ISBN 0-12-012113-1

PRINTED IN THE UNITED STATES OF AMERICA
Contents

CONTRIBUTORS
PREFACE

Programmed Control of Asynchronous Program Interrupts
Richard L. Wexelblat

1. Introduction
2. Definition of Terms
3. Attentions and Synchronism
4. Facilities in Current Languages
5. External Attentions
6. Extended Attention Handling
7. Examples
8. Conclusion
Appendix 1. Syntax of the Attention Handling Language
Appendix 2. Detectable Conditions in PL/I, COBOL, and FORTRAN
Appendix 3. Glossary of Terms
References
Poetry Generation and Analysis
James Joyce

1. Introduction
2. Computing Expertise
3. Poetry Generation: The Results
4. Poetry Analysis: Introduction
5. Concordance-Making
6. Stylistic Analysis
7. Prosody
8. Literary Influence: Milton on Shelley
9. A Statistical Analysis
10. Mathematical and Statistical Modeling
11. Textual Bibliography
12. Conclusion
References
Mapping and Computers
Patricia Fulton

1. Introduction
2. History
3. What Is a Map?
4. The Earth Ellipsoid
5. The Geoid
6. Geodetic Datum
7. Geodetic Surveys
8. Satellite Geodesy
9. Photogrammetry
10. Projections
11. Cartography
12. Data Banks
13. Future Trends
14. Conclusions
References
Practical Natural Language Processing: The REL System as Prototype
Frederick B. Thompson and Bozena Henisz Thompson

Introduction
1. Natural Language for Computers
2. What Constitutes a Natural Language?
3. The Prototype REL System
4. Semantics and Data Structures
5. Semantics Revisited
6. Deduction and Related Issues
7. English for the Computer
8. Practical Natural Language Processing
References

Artificial Intelligence: The Past Decade
B. Chandrasekaran

1. Introduction
2. The Objectives of the Review
3. Language Processing
4. Some Aspects of Representation, Inference, and Planning
5. Automatic Programming
6. Game-Playing Programs
7. Some Learning Programs
8. Heuristic Search
9. Pattern Recognition and Scene Analysis
10. Cognitive Psychology and Artificial Intelligence
11. Concluding Remarks
References

AUTHOR INDEX
SUBJECT INDEX
CONTENTS OF PREVIOUS VOLUMES
Contributors to Volume 13

Numbers in parentheses indicate the pages on which the authors' contributions begin.

B. CHANDRASEKARAN, Department of Computer and Information Science, The Ohio State University, Columbus, Ohio (169)
PATRICIA FULTON, U.S. Geological Survey, 12201 Sunrise Valley Drive, Reston, Virginia (73)
JAMES JOYCE, Computer Sciences Division, Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California (43)
BOZENA HENISZ THOMPSON, California Institute of Technology, Pasadena, California (109)
FREDERICK B. THOMPSON, California Institute of Technology, Pasadena, California (109)
RICHARD L. WEXELBLAT, Bell Laboratories, Holmdel, New Jersey (1)
Preface
It gives us great pleasure to welcome Marshall C. Yovits as Co-Editor of Advances in Computers. As Guest Editor of Volume 11, Dr. Yovits commented on how extensive and diverse is the field commonly known as computer information science. The present volume demonstrates that diversity in its five comprehensive articles: four devoted to advanced computer applications ranging from the practicalities of geodetics and mapping to the esthetics of poetry generation and analysis, the fifth directed to the problems that arise when computer operation is asynchronously interrupted by service requests from the "outside" world.

The central theme of each of the articles deserves presentation here. In her article on Mapping and Computers, Patricia Fulton describes many of the recent successes of computerized cartography and related uses of cartographic data bases. Frederick B. Thompson and Bozena Henisz Thompson describe a language system that accommodates natural language communication with computers. B. Chandrasekaran points to the complexity of the processes involved in "creating intelligence" and provides a critical examination of the various branches of research on artificial intelligence. James Joyce describes poetry generation and analysis by computer including concordance-making, stylistic analysis, prosody, literary influence, statistical analysis, and mathematical modeling. Richard L. Wexelblat presents extensions to existing programming languages that provide a capacity for interrupt handling through the use of "on-units" and a facility for the synchronization of independent tasks.

MORRIS RUBINOFF
Programmed Control of Asynchronous Program Interrupts

RICHARD L. WEXELBLAT
Bell Laboratories, Holmdel, New Jersey

1. Introduction
2. Definition of Terms
3. Attentions and Synchronism
4. Facilities in Current Languages
   4.1 PL/I
   4.2 COBOL
   4.3 FORTRAN
5. External Attentions
   5.1 An Example of an Asynchronous Attention
   5.2 External Attentions and Multiprocessing
6. Extended Attention Handling
   6.1 The On-Unit Approach to Attention Handling
   6.2 Multiprocessing Considerations
   6.3 Attention Handling through Multiprocessing: An Alternative Approach
   6.4 Extensions to FORTRAN
7. Examples
8. Conclusion
Appendix 1. Syntax of the Attention Handling Language
Appendix 2. Detectable Conditions in PL/I, COBOL, and FORTRAN
Appendix 3. Glossary of Terms
References
1. Introduction
Few high-level programming languages have sufficient richness to give the user the explicit ability to control asynchronous and independent parallel processes. This article discusses the problems associated with the handling of externally caused asynchronous interrupt conditions and presents extensions to existing languages that provide a capacity for interrupt handling. The PL/I programming language is used as a basis for many of the examples because it already has a rudimentary capacity in this area. However, consideration is given to other common languages such as FORTRAN and COBOL. Two basic control structures are used in the development: "on-units" and a facility achieved through the synchronization of independent tasks.

The primary area of concern is what happens to a running program as the result of an event or condition arising "outside" of the main program stream, such as might result from an I/O device, graphical terminal, or online process controller. Upon encountering this situation, a special segment of code is executed. This might, on the one hand, be a particular subroutine designated by the programmer to be activated when the external event occurs. On the other hand, it might be a separate program or routine previously activated but currently inactive (in a so-called "wait state") awaiting the occurrence of the external event. The response to the event might cause the program to stop, or perhaps to take some special action such as printing some data, calling a subroutine, or signaling some special device.

Although such facilities may be achieved in almost any programming language through calls to machine language subroutines, this discussion is primarily devoted to explicit high-level language. Thus, special interest will be paid to the language a programmer will use to specify the action taken by the program when the external event occurs.

This article begins with a tutorial overview and a brief survey of the current state of the art, followed by a suggestion of a possible extension to existing facilities (Section 6). At present there does not seem to be any single "best way" to process interrupts and to specify the programs to do the processing. Alternative formulations are presented, and potential efficiency considerations are discussed where appropriate.
2. Definition of Terms
This presentation will, for the most part, be concerned with the effect of external events on a single program. This may be a user's problem program or the operating system itself. Although there is usually a substantial difference in complexity between the two, the basic principles involved are the same.

The term task is used to refer to a single running section of code. In the absence of multiprogramming, a task is roughly equivalent to a program. Under multiprogramming, there may be many tasks running concurrently, all associated with a single program or job. Two tasks that are executing independently but which communicate through some common data area and jointly compute some function or provide the solution to some problem are said to be cooperating. In the special case in which the tasks take turns executing, each in turn starting the other and waiting for it to return control, the cooperating tasks are called coroutines.

Let the term attention refer to that which a task experiences as the result of an external event. Depending upon what action the programmer has specified, the attention may or may not interrupt the program. The word condition will be used to refer to that event which gives rise to an attention. Thus, the occurrence of an overflow condition may cause an overflow attention to be raised, and this may in turn interrupt the program. (The term attention has deliberately been introduced to remove the potentially confusing ambiguity in the use of the word interrupt as both verb and noun. This double meaning could lead to the situation where an interrupt (noun) may not interrupt (verb) a program. Interrupt will be used here only as a verb, while attention will be the noun that names the action resulting from the external event.) A brief glossary of definitions of terms is included as Appendix 3.

A few examples from a familiar context will help in this informal definition of terms. Assume that a teacher is before a class and a student in the front row raises a hand. The student has raised an attention. This attention may or may not interrupt the teacher, depending upon just what the teacher is doing. In one case, the teacher may be reading and not looking at the class at that moment. The teacher's attention handling mechanism is said to be disabled. If things continue thus, the student may give up and the attention will have been ignored; or the student may keep his hand up and the attention may then be handled later. It is possible that the teacher might notice the raised hand, stop, and immediately ask the student what is the matter. In this case, the attention handling mechanism was enabled and the raised hand caused an asynchronous or immediate interrupt. Had the teacher noticed the hand, but chosen to finish the current activity before querying the student, the attention would have caused a synchronous or delayed interrupt.

The example can be carried a bit further. Assume the teacher has just begun to ask a question of the class in general. As the teacher talks, some of the students begin raising their hands, ready to answer the question. The teacher may note the order in which the hands are raised but go on to finish the question, later calling on the student whose hand went up first. In this situation the attention handling mechanism was enabled but did not cause an asynchronous interrupt. Rather, the attentions are stacked or queued and then dequeued by the teacher synchronously. Depending upon the answers received as the students are called on one at a time, the teacher may dequeue all the attentions or at any time decide to ignore the remainder of the queued attentions.
One final consideration: suppose the teacher begins a question and suddenly a student with an agonized look raises a hand very high. Although there may be several hands up, the teacher will very likely recognize the existence of a high priority attention and handle it immediately (i.e., asynchronously).
3. Attentions and Synchronism
Although the event that causes an attention is by definition external to a task, it may very well be the result of an action of the executing program. Following are some examples of conditions that can cause attentions to occur:

Computational conditions, such as underflow/overflow and zero divide;
I/O conditions, such as end of record, end of file, and transmission error;
Program conditions, such as subscript out of range and attempted transfer to a nonexistent label or address;
Machine error conditions, such as invalid machine operation code and improper data format.

Other attention interrupts may occur due to completely external events such as might arise in such real-time applications as process control and time sharing. Depending upon the specific hardware involved, some of these conditions can be recognized by the hardware; some must be simulated in the software. Ideally, the methods for programming the handling of all of these should be expressible in some uniform syntax.

Although not without exception, there seems to be a rule that the "more synchronous" the event causing the attention, the "easier" it is to handle. It would be reasonable then to consider the kinds of synchronism that must be dealt with. At the hardware level, a single machine instruction may be considered a basic unit. Even though the hardware may go through several internal cycles to execute the instruction (especially if microprogrammed), there is a tendency among hardware designers to synchronize interrupts with respect to the machine language. As computers grow in speed and capacity, however, it becomes harder and harder to determine precisely when machine conditions actually occur. To get the best use of hardware, it is necessary to permit interrupts to be "imprecise" or asynchronous with respect to the machine language. In some overlapped machines, an overflow condition, for example, may not be detected until many instructions after the execution of the actual instruction that initiated the computation that led to overflow. This problem is further compounded in a machine with a "pipeline" arithmetic section.

In a high level language such as ALGOL or PL/I, a single instruction may correspond to a great many machine instructions. An attention may be synchronous with respect to the machine language but asynchronous with respect to the higher level language. These two degrees of synchronism do not give enough resolution. Consider the PL/I language. At the implementation level of PL/I (F), IBM's original PL/I implementation for OS/360 (IBM, 1970), there are certain types of operations that are intrinsically noninterruptable. For example, in assigning a value to an item with a dope vector, if validity of data is to be maintained, no interrupt should be permitted between updating the data and updating the dope vector. This operation of updating a dope vector, although several machine instructions, must still be considered a primitive operation at the "implementation" level of PL/I.

Levels of interrupt can be specified corresponding to levels of hardware or software. For purposes of this exposition the following levels of interrupt are of interest:

Type 0: Asynchronous with respect to the machine language (although possibly synchronous with respect to an underlying microprogram).
Type 1: Synchronous with respect to the machine language; asynchronous with respect to an implementation.
Type 2: Synchronous with respect to an implementation; asynchronous with respect to the language.
Type 3: Synchronous with respect to the language.
Type 4: Synchronous with respect to a specific program (can occur only at specific points in a program).

Examples of these are given in the following sections.
4. Facilities in Current Languages
Most currently available high-level programming languages have at least a rudimentary facility for permitting the programmer to find out about attentions arising from conditions that result from program execution. The following sections describe what is currently available in PL/I and in the two most popular general purpose languages, COBOL and FORTRAN.

4.1 PL/I
At this time, PL/I is the only (fairly) widely used high-level language that contains explicit language for handling tasks, attentions, and interrupts. The software of the PL/I environment will classify an interrupt according to type (converting the type if necessary) and pass control to the appropriate part of a user's program. Unfortunately, the language as now defined by the IBM implementations tends to force synchronization of all interrupts. Some of the restrictions may be relaxed by the new proposed standard PL/I (ANSI/X3J1, 1973).

The MULTICS system on the Honeywell (formerly GE) 645 is implemented almost entirely in PL/I (Corbato et al., 1972). MULTICS PL/I does not take any special precautions concerning the possibility of unexpected interrupts, except that the number of events that can occur asynchronously is severely limited. It is possible, however, that an attention from an online user's terminal will interrupt a program during a "noninterruptable" operation such as an area assignment. Although this sort of unfortunate timing may occasionally create difficulty with a user's problem program, it cannot bother the operating system itself.

4.1.1 PL/I Attention Handling
The basic attention processing feature of PL/I is the on-statement. A set of predefined conditions such as OVERFLOW, ZERODIVIDE, ENDFILE, CONVERSION, etc., is available and, by executing an on-statement referring to a particular condition name, the programmer may establish the statement or block of statements he wishes to be executed when an attention corresponding to the specified condition actually occurs. The full set of conditions defined in PL/I is given in Appendix 2. A detailed description of the on-statement and its action may be found in a PL/I manual or text (IBM, 1971; Bates and Douglas, 1970). For purposes of this exposition, the use of the on-statement can best be illustrated by example. The first example shows a simple program for copying one file into another, using the end of file condition to print the number of records processed and stop the program.
T: PROCEDURE;
  DCL (A, B) RECORD FILE, STR CHAR(80);
  ON ENDFILE(A) BEGIN; PUT (J); STOP; END;
  DO J = 1 BY 1;
    READ FILE(A) INTO(STR);
    WRITE FILE(B) FROM(STR);
  END;
END T;
The on-statement establishes the action to be taken when an end of file occurs on file A. This action is specified by the block of statements following the ON ENDFILE (from BEGIN to END), called the on-unit, which will be executed when the end of file is detected by the program. The on-unit has been referred to as a "subroutine invoked by the gods." This is meant to reflect the behavior of an on-unit invoked out-of-line by an agency that is only indirectly the result of the program's execution.

It should be noted that although the occurrence of an end of file is in general potentially a type 0 interrupt, completely asynchronous with respect to the CPU, on the System/360 the hardware (or microprogram) converts it to type 1. The PL/I language as currently defined, however, forces the conversion to type 4, and in the IBM PL/I implementations the end of file will appear to have occurred at the end of the read-statement. This has greater significance in the light of asynchronous I/O, as seen below.

On-units may also be used with computational conditions:
T: PROC;
  DCL (X(10), Y(10)) FLOAT;
  ON ENDFILE(SYSIN) STOP;
  ON OVERFLOW BEGIN;
    PUT ('OVERFLOW IN CASE', J);
    GO TO END_LOOP;
  END;
  DO J = 1 BY 1;
    GET (X, Y);
    /* computation on X and Y that might cause overflow */
  END_LOOP: END;
END T;

In this example, the overflow on-unit will be executed when and if overflow occurs. At the machine level, overflow is type 1, although the PL/I definition causes the condition to be treated as type 2. Once the on-unit is entered, however, synchronization with the source program is temporarily lost. If the overflow had occurred in the middle of a computation, the result of the computation would have been undefined. In the given example, the program forces execution back into synchronization with a transfer to the end of the loop. Had the goto-statement been omitted from the on-unit, the program would have continued from the point where the interrupt occurred, with the result of the computation undefined.
An interesting consideration related to fixed point overflow concerns the circumstances in which it can occur. When it results from an explicit programmer specified operation (e.g., I = J + K;), it is type 2, as mentioned above. The same hardware condition occurring as the result of some operation in the environment (the evaluation of an expression specified as an array bound, for example) could be handled in the environment without the programmer ever being aware of it.

In some cases, it is possible to set up attention handling code that can "fix-up and continue."

ON CONVERSION BEGIN;
  IF ONSOURCE = ' ' THEN ONSOURCE = '0';
  ELSE STOP;
END;

When an attention is raised as the result of a conversion error (in PL/I, type 2, generated by the code in the software environment), the specified on-unit will check to see if the offending field was blank. If so, that field will be replaced by zero and execution will continue from the point of interrupt, with the conversion attempted again. If the field is not blank, the stop-statement is executed.

4.1.2 PL/I Asynchronous I/O
All of the above cases have referred to synchronous attentions or to interrupts that have been synchronized by some hardware or software interface. Consider the following examples of how asynchronous operations are affected by attentions. PL/I has the ability to specify that certain input and output operations are to be executed asynchronously, in parallel with program execution. With each such I/O operation is associated an event variable which can record the completion status of the associated operation.
ON ENDFILE(F) BEGIN;
  /* things to do on end of file */
END;
...
DO ...;
  READ FILE(F) INTO(STR) EVENT(E);
  /* things to do that do not require the new input data */
  WAIT(E);  /* when the data are needed */
  /* process the data */
END;

In this example, as above, the endfile on-unit specifies the code the programmer wishes to be executed when an end of file is encountered. The main section of the program is a loop containing an input statement which reads from file F into a variable STR. The presence of the EVENT option in the statement specifies that the read is to be performed asynchronously and associates the event variable E with the operation. E is set "incomplete" at the start of the input. The input activity is then free to go on in parallel with the computation, and E will be set "complete" when the input operation finishes. When the data are required, a wait-statement referencing E is executed and the program will wait, if necessary, until the input completes, whereupon the input data may be processed.

It would be natural to assume that an end of file attention will be raised immediately upon occurrence of the end of file condition, since it is likely of type 0 or type 1 in the hardware. Unfortunately, the definition of the System/360 precludes type 0, and the definition of PL/I forces the implementation to treat the interrupt as type 4; the program does not "find out" about the end of file until the wait-statement is executed. That is, if the condition has occurred, the appropriate on-unit will be invoked at the time the wait-statement is executed. Indeed, all conditions associated with asynchronous input/output operations are, by the definition of PL/I, forced to be treated as type 4, even transmission errors.
4.1.3 Enabling and Disabling in PL/I
On-units in PL/I have block scope. That is, once an on-statement is executed in a given block, that on-unit remains in effect until overridden by the execution of another on-statement for the same condition. When an on-statement is executed in a subroutine or block, the new on-unit will temporarily replace any on-unit for the given condition specified in an outer block. When control returns to the outer block, however, the on-unit for that block is restored. Within a single block level, the execution of an on-statement for a given condition overrides and replaces any existing unit from a previous on-statement executed at the same block level.

It is possible to enable and disable attentions from many of the PL/I conditions at the block or statement level by putting a prefix on the given block or statement. Thus,

(SUBSCRIPTRANGE): S: PROCEDURE;
  ...
END;

will enable the normally disabled subscript range checking for the duration of the execution of the procedure S, except where the checking is explicitly disabled by a NOSUBSCRIPTRANGE prefix on a statement within S. Similarly the statement

(NOOVERFLOW): A = B + C/D*E;

will be executed with floating point overflow ignored. Note that this does not say that overflow will not occur; it merely says that it is not necessary to check for it and that, if it should occur, the program should not be interrupted.

What happens when a given condition occurs while disabled is an interesting question. From the point of view of the programmer and the programming language, the meaning of the program is defined only if the condition does not occur. If it does actually occur while disabled, the resulting execution is highly dependent on the particular condition and on the hardware and implementation involved. It would be unlikely that the behavior of a program under such a circumstance would be the same on two different types of machine. This leads to interesting problems in the area of program interchange and in the area of programming language standards. The situation is ignored in the definition of Standard FORTRAN (ANSI, 1966). The proposed PL/I Standard (ANSI/X3J1, 1973) attempts at least to identify such problems and to make the implementor aware of where the program execution becomes implementation dependent.
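As a concrete illustration of the block-scope rule described at the start of this section, here is a minimal sketch (the file F, the procedure SUB, and the labels are hypothetical, not taken from the original examples):

ON ENDFILE(F) GO TO ALL_DONE;    /* on-unit established in the outer block       */
...
CALL SUB;                        /* while SUB is active, its own on-unit governs */
READ FILE(F) INTO(REC);          /* back here, GO TO ALL_DONE is in effect again */
...
SUB: PROCEDURE;
  ON ENDFILE(F) GO TO PARTIAL;   /* temporarily replaces the outer on-unit       */
  ...
  PARTIAL: ... ;
END SUB;                         /* returning restores the on-unit of the caller */

The inner on-statement overrides the outer unit only for the duration of SUB's activation, exactly as the scope rule states.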
4.2 COBOL

The language for handling machine and computational condition attentions in COBOL is extremely rudimentary compared with what is available in PL/I. For example, the occurrence of an overflow condition may be detected only by an explicit test coded after each computation by the programmer. An optional ON SIZE ERROR clause may be appended to a computation sentence. If the computation results in a value too large to be held by the target variable specified in the sentence, a single imperative sentence specified by the SIZE ERROR clause will be executed. At least one COBOL text suggests using the SIZE ERROR clause on every calculation for which an explicit scaling computation has not previously been done (Coddington, 1971).

Due to the limited variety of control structures available in COBOL, the statement specified in the SIZE ERROR clause is usually a goto-statement. For example,

ADD AMOUNT, COUNT TO TOTAL; ON SIZE ERROR GO TO HELP.
COMPUTE X = Y / Z; ON SIZE ERROR GO TO NEXT-CASE.

Thus, the computational conditions of overflow and zero divide are converted by the COBOL operating environment to an attention class that is in a sense even more restrictive than type 4. The input/output conditions are treated similarly: an AT END clause may be included in a read statement to specify the action to take on encountering an end of file. An INVALID KEY clause is available for use in conjunction with direct access files.

READ SYSINFILE; AT END GO TO NO-MORE.

For some conditions, COBOL does provide a facility similar to the on-unit of PL/I. It is possible to specify out of line a section of code to be executed when an I/O error occurs. This code will be executed following an error, but after any system-provided error handling routines have executed. The USE sentence in the Declaratives Section of the Procedure Division serves to define the so-called USE procedure. For example,

PROCEDURE DIVISION.
DECLARATIVES.
ERROR-UNIT SECTION.
    USE AFTER STANDARD ERROR PROCEDURE ON INPUT.
    ... code to be executed after an i/o error on an input file ...
END DECLARATIVES.
It is possible to specify an error handling routine for all input files, for all output files, for all files, or for a list of files given explicitly by name. This code is established at compile time for the duration of execution, and it is not possible to turn the action off and on dynamically. The USE procedure facility may also be used in conjunction with tape label processing and report generation.

4.3 FORTRAN
There is nothing in FORTRAN that corresponds even remotely to PL/I's on-conditions or COBOL's USE procedures. Indeed, in American National Standard FORTRAN (ANSI, 1966) there is no provision whatever for detecting any form of computational, machine, I/O, or external condition. Almost all implementations of FORTRAN, however, include many extensions in these areas, and the current draft of the proposed revision of Standard FORTRAN includes some of these extensions (ANSI/X3J3, 1973).

Typical of FORTRAN supersets is the IBM FORTRAN IV language (IBM, 1972b). Floating point overflow and underflow may be detected in IBM FORTRAN IV only after the fact. A service subroutine, OVERFL, is provided that may be called with an integer variable as argument. This variable is set to 1 if exponent overflow occurred, 2 if no overflow condition exists, and 3 if exponent underflow was last to occur. A side effect of the call is to reset the indicator. Similarly, there is a DVCHK service subroutine that sets its argument to 1 if a divide check occurred and 2 if not.

I/O conditions are handled in a different manner. The Standard FORTRAN formatted read-statement is of the form

READ(u, f) k

where u is a unit identification, f identifies a format, and k is an input list. IBM FORTRAN IV extends this to

READ(u, f, ERR=s1, END=s2) k

where s1 and s2 are statement numbers; ERR= is an optional parameter specifying a statement to which transfer is made if a transmission error occurs during the read; and END= is an optional parameter specifying a statement to which transfer is made if an end of file is encountered during the read. For some reason, IBM's FORTRAN language implementors chose not to permit the ERR= parameter on the write-statement. Thus, all of IBM FORTRAN IV's conditions are converted to type 4, through either the system subroutines for computational condition detection or the additional I/O parameter mechanism.

Although Standard FORTRAN has no form of asynchronous I/O, the IBM implementation does provide a facility in this area similar to that present in PL/I. If a read- or write-statement contains an ID=n parameter (n is an integer or integer expression whose value must be unique to each read or write), then the transmission occurs asynchronously. A wait-statement is also provided; however, unlike the PL/I wait-statement, the FORTRAN version may specify an I/O list. The FORTRAN wait-statement must be executed to "complete" the I/O operation. If the operation has not finished when the wait-statement is executed, the program will wait for the completion at that time.

The END= and ERR= parameters may not be used on asynchronous read-statements; rather, the function is served by an optional COND=i
parameter. If the parameter is present, the variable i is set to 1, 2, or 3, depending on whether the read completed normally, an error condition occurred, or an end of file was detected. There seems to be no logical reason for the inconsistency between the error and end of file mechanisms of synchronous and asynchronous input.

The FORTRAN implementation for the Honeywell Series 6000 computers (Honeywell, 1971) is supplied with extensions similar to those of IBM FORTRAN IV. In this case, however, the set of detectable extensions is larger and somewhat more consistent. The I/O error parameter may be used in conjunction with a write-statement as well as with a read. In addition, the service subroutines provided, as well as extending the set of computational conditions, permit the user to test for end of file and I/O error conditions. It is also possible, through a system subroutine, to establish the label of a statement to which transfer will be made whenever almost any error detectable by the environment, be it computational or I/O, occurs. Although it would require a bit of initialization, it is possible to specify a unique transfer point for each of the set of roughly 80 distinct conditions. This is approximately equivalent to the PL/I facility with normal return from an on-unit forbidden.
5. External Attentions
All of the potentially interrupting conditions considered so far have one thing in common: they are associated directly with some statement, computation, or reference in the source program. In order to handle the general case of external attentions, additional language is needed, as well as additional semantics for current language.

5.1 An Example of an Asynchronous Attention
Below are two examples of programs that use a simple attention handling facility, written in a terminal oriented dialect of PL/I known as CPS (IBM, 1972a). CPS has a rudimentary attention handling facility of the form:

ON ATTENTION simple-statement;

The on-unit may consist only of a single simple statement. (The attention referred to is the "ATTN" or "BREAK" key of an online user's terminal.) The first example shows the use of an attention to cause the current index of a do-loop to be printed. During the loop's execution, should the programmer get impatient and wish to see how far his program has gone, he may push the attention button and the current value of the index will be printed, after which the program will continue.

ON ATTENTION PUT LIST(J);
DO J = 1 TO N;
  /* computation */
END;

One of the serious problems with online use of low speed terminals is that a verbose program can waste much time and create much annoyance while printing long messages, especially when the same program is run repeatedly. In the next example, the attention key may be used by the programmer at the terminal to cut short the undesired output and to permit the program to continue.
...
ON ATTENTION GO TO SKIP1;
PUT EDIT( ) ( );
SKIP1: ...
ON ATTENTION GO TO SKIP2;
PUT EDIT( ) ( );
SKIP2: ...
ON ATTENTION GO TO SKIP3;
PUT EDIT( ) ( );
SKIP3: ...

(To be completely correct, the ellipsis following each "skip" label should contain an ON ATTENTION ... that nullifies the previous ON ATTENTION GO TO ....)
In the absence of an attention on-unit, the CPS system default action in response to an attention is to stop the running program and return control to the command level. The user's explicit specification of an attention on-unit overrides the system response, at times making it hard to stop a looping program. In the CPS implementation on the IBM System/360, if an attention condition is raised while an attention on-unit is being executed, then the program will be stopped. It is sometimes difficult for the programmer to hit the attention key at the right time.
In one of the earliest time sharing systems, MIT's CTSS, the need for more than one level of terminal attention was recognized and, very early in the development of CTSS, a mechanism was provided to permit the user to specify multiple levels of execution (Corbato et al., 1963). The problem was that the terminal devices used with CTSS usually had only one attention or break mechanism. The simple but quite successful solution adopted was to permit the terminal user to send his break signal a series of times, each successive signal raising the level of execution one step until the top or command level was reached. The most common use of this mechanism was to provide a two level attention facility:

level 0: notify the system that the user pushed the break button;
level 1: notify the problem program that the user pushed the break button.

Thus, running under a program with these two levels implemented, a single break was equivalent to raising an attention condition in the problem program, while two breaks in quick succession served to interrupt the program and return control to the command level.

5.2 External Attentions and Multiprocessing
Before going on to look into generalized attention handling language, it will be necessary to look into one of the potential problems associated with the interaction between attentions and multiprocessing. Although the situation is presented in the context of PL/I, the resulting problem area will be present in any language that involves both multiprocessing and externally generated asynchronous attentions.

When a subroutine is called from a PL/I task, it inherits all of the on-units of its progenitor, even if it is spawned as an independent task (i.e., free to execute in parallel). In the case of overflow, for example, this creates no problems, since an overflow attention can easily be associated with the task in which the overflow occurred. Any overflow will raise the attention only in the task in which it occurred, and only in that task will the on-unit be executed. On the other hand, suppose an attention is associated with an interrupt from an online process controller of some sort. When a task with this condition enabled spawns subtasks, each inherits the main task's on-unit. Suppose several tasks are active, each with this condition enabled, and the condition arises: Which task's on-unit would be invoked? Would they all? In sequence? In parallel? Which first? A sketch of the situation follows.
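The following fragment, a minimal sketch written in the style of the extended syntax of Section 6 (the attention name CONTROLLER and the subroutine names are hypothetical), shows how the ambiguity arises:

ON ATTENTION(CONTROLLER) BEGIN;   /* on-unit established in the parent task  */
  ...
END;
CALL SUB1(X) TASK(T1);            /* T1 inherits the CONTROLLER on-unit      */
CALL SUB2(Y) TASK(T2);            /* so does T2; three tasks now hold copies */
/* If the external CONTROLLER condition is raised here, the language must    */
/* say whether the on-unit runs in the parent, in T1, in T2, or in all three. */

Section 6.2 resolves this by passing inherited attentions to subtasks in the disabled state.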
6. Extended Attention Handling
Following is a discussion of a possible high level language facility that would permit specification of attention handling algorithms. Although similar in form and syntax to the style of PL/I, the concepts involved are general and could be applied to languages such as ALGOL, COBOL, or FORTRAN. While this is not necessarily the only way to achieve the desired end, the statements and options described below seem to provide a reasonable way to go. Two different approaches to a PL/I extension are described: a fairly complex but powerful facility making use of on-units and permitting multiprocessing, and a somewhat limited facility that makes use of multiprocessing alone. A possible FORTRAN extension is also described.

6.1 The On-Unit Approach to Attention Handling
In order for a programmer to write programs to process asynchronous attentions, it will be necessary to provide a new data type: the attention. New statements to operate on attention data are also required. The following sections describe the new data type and the statements that operate on it. The syntax of the new statements may be found in Appendix 1. (Wherever a list of names is mentioned below, it is assumed that commas are used to separate names in the list.)

6.1.1 The Attention Data Type
A name declared with the ATTENTION data attribute will be associated with an external condition and will primarily be used in an on-statement to identify the code to be executed when the attention is raised. Attention data have much in common with conditions in the current PL/I language. Attention handling code specified in an on-unit will be permitted to execute in parallel with the task in which it is invoked. Following is an example of the declaration of an attention:

DECLARE A1 ATTENTION ENVIRONMENT(...);

Each attention is associated in some implementation-dependent manner with an external device, condition, or event. The environment option is used, just as with a file declaration, to specify implementation dependent parameters. For example, the environment may contain the information that identifies the external source of attentions and specifies the maximum depth of the attention queue.
6.1.2 Attention On-Units
The code to be executed when an attention is raised will be specified as an on-unit, in a manner similar to that illustrated in the examples of Sections 4 and 5. In order to increase facility, however, two options will be added to the on-statement: task and event. Either of these is sufficient to specify that the on-unit, when invoked, is to be executed as an independent subtask in parallel with the interrupted task. The event option requires the programmer to specify an event name that will be set incomplete at the start of the on-unit and set complete when that code is finished executing. The event variable may be used by the program in which the attention was raised to determine when the on-unit is complete and to synchronize computations. The task option permits the programmer optionally to specify a name that may be used to refer to the on-unit from another task, as, for example, when it is necessary to terminate an on-unit executing in parallel. Use of the task option alone permits independent execution when there is no need for an explicit name. If neither the task nor the event option is specified, raising the attention will cause the interrupted task to suspend execution until the on-unit completes. Following are some examples of on-statements for attentions:
ON ATTENTION(BREAK_KEY) STOP;

ON ATTENTION(BREAK_KEY) TASK BEGIN;
  ...
END;

ON ATTENTION(EXT1) EVENT(EXT_EVENT_1) BEGIN;
  ...
END;
The key word ATTENTION is used to differentiate between attentions and built-in conditions. While not absolutely necessary, its use makes it possible to extend the set of built-in conditions easily, and improves program documentation by making it easy to identify programmer defined attentions.

It should be noted that permitting task and event options in the on-statement would not be necessary if these options were permitted on the begin-statement, a smoother and more natural extension to a block-oriented language. It should also be noted that if the task and event options were permitted on neither the on-statement nor the begin-statement, the equivalent effect could be achieved by making the on-unit code an out-of-line subroutine and then invoking that subroutine through a call-statement in the on-unit; the call-statement may, of course, contain the task and event options. This last method would appear to be the smoothest way to add the facility discussed here to a language such as FORTRAN.
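As a minimal sketch of this last alternative (the handler name HANDLE_EXT1 is hypothetical), the on-unit itself does nothing but launch the real handler as a parallel subtask, using the task and event options that PL/I tasking already allows on a call-statement:

ON ATTENTION(EXT1) BEGIN;
  /* the on-unit returns at once; the handler runs in parallel */
  CALL HANDLE_EXT1 TASK(TEXT1) EVENT(EXT_EVENT_1);
END;

The interrupted task may later execute WAIT(EXT_EVENT_1) if it needs to know that the handler has finished.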
A system default action is provided just in case an enabled attention is raised when there is no programmer defined on-unit for that attention. The action taken depends upon the particular implementation and will most likely be to print a message and return.
6.1.3 Values of Attention Data
Each attention datum takes on several status values:

a. an activity status: active or inactive;
b. an enablement status: enabled or disabled;
c. an access status: immediate, asynchronous, or queued.

An active attention is one which the program is prepared to process should the associated external condition occur. The result of the occurrence of the external condition associated with an inactive attention is not defined and may vary from implementation to implementation and from condition to condition. This situation may in some circumstances be classified as an error, while in others it may be ignored. An attention is activated when any statement in which it is referenced is executed, and it remains active in all tasks in which it is used. This activity has global scope in the sense that it is not meaningful for an attention to be active in one task and not in another task executing at the same time.

Enablement, on the other hand, is a status that is local to a task. If an attention is enabled within a given task, then the occurrence of any external condition associated with the attention would raise the attention in that task, interrupting the task and causing the on-unit for that attention to be executed. If the attention had been disabled, the occurrence of the condition would be ignored. The difference between activity and enablement is somewhat subtle, activity referring to an external phenomenon and enablement associated with the program state itself. The occurrence of the appropriate external event for an inactive attention might very well not be recognized by the hardware. If the attention were disabled, then this occurrence would indeed be noted by the environment, but no interrupt would occur and no on-unit would be invoked. An implementation might choose to consider the raising of an attention disabled in every active task to be an error.

The access status may have various interpretations, depending upon whether a priority interrupt system is under consideration. Initially it will be assumed that attentions do not have priorities. There are three distinct ways in which an attention may be enabled within a task:
i. Immediate: this attention must be processed immediately upon the occurrence of the corresponding condition. It may interrupt any existing task other than another immediate attention on-unit.
ii. Asynchronous: this attention will be processed in an asynchronous manner, interrupting any existing task other than an attention on-unit.
iii. Queued: this attention will not interrupt any task; rather, the occurrence will be noted and an appropriate entry made into an attention stack.

The access status of an attention is established when the attention is enabled, and asynchronous access is assumed as the default if no status is specified explicitly. (In Section 6.1.5 an alternative formulation that replaces these three attention levels by a set of priorities is described.)

6.1.4 Attention Processing Statements
The access status and enablement of an attention are determined by the enable-statement and its complement, the disable-statement. The former both enables an attention and activates it if it was inactive. It may also be used to change the access status of an enabled attention. The disable-statement, as its name implies, is used to disable attentions. As an attention may be enabled independently in separate tasks, the enable-statement applies only to the task in which it is executed unless otherwise specified. Two further options for the enable-statement are described in Section 6.2. Following are some examples of simple enable- and disable-statements.

ENABLE ATTENTION(BREAK_KEY);
ENABLE ATTENTION(BREAK_KEY) ASYNCHRONOUS;
ENABLE ATTENTION(SENSOR1, SENSOR2) QUEUED,
       ATTENTION(LIMIT) IMMEDIATE,
       ATTENTION(FLOW, RESERVE) ASYNCHRONOUS;
DISABLE ATTENTION(BREAK_KEY, LIMIT, FLOW);

The first two statements are logically equivalent, as asynchronous is the default access class. The flowchart of Fig. 1 illustrates the logical sequence of operations in the execution of an enable-statement. (Figure 6, in Section 6.2, is a more detailed description of the enable-statement, reflecting the additional options presented in that section.)

FIG. 1. Attention enabling (simple form). The sequence of decisions and operations involved in the execution of a simple enable-statement. This process is repeated for each attention named in the enable-statement.

The occurrence of the external condition associated with an attention enabled for asynchronous access will cause that attention's on-unit to be invoked immediately. The task in which the attention was raised will either continue or be interrupted, depending upon the options used in the on-statement for that attention.

If an attention is enabled for queued access, the occurrence of the associated external event will not interrupt the program or invoke the on-unit at once. Rather, the attention will be placed on a stack and remain there until the programmer explicitly interrogates the stack. Each task has its own attention stack. Upon executing a dequeue-statement for that attention, the entry will be removed from the queue and the associated on-unit invoked. For example:

DEQUEUE ATTENTION(SENSOR1);
If an attention for SENSOR1 is enqueued, that stack entry will be removed and the on-unit for the SENSOR1 attention will be invoked. If no such attention is enqueued, the statement will have no effect.
A built-in function, QUEUE, is available to test whether an attention is stacked. Given an attention name as argument, it returns a nonnegative integer that indicates the number of instances of that attention on the queue.

In order to prevent the repeated occurrence of an external event from causing an attention on-unit to recurse upon itself, an immediate or asynchronous attention is treated as if it had been enabled for queued access when its on-unit is entered. Thus, further attentions will be stacked and may be accessed from within the on-unit itself. When the on-unit completes, the attention reverts to its former access status and, if the stack is not empty, the on-unit is immediately invoked again. Of course, the programmer is free to modify the access status within the on-unit. If a queued attention is already on the stack when that attention is enabled for asynchronous or immediate access, the on-unit will be invoked at that time. When an attention is disabled, any stacked interrupts for that attention are removed from the queue and lost.

The flowchart in Fig. 2 illustrates the logical sequence of operations that follows the raising of an attention within a single task. This operation occurs independently in each independent task active at the time the attention is raised. Figure 3 illustrates the action taken upon return from an attention on-unit. A short sketch of how the QUEUE built-in function and the dequeue-statement combine follows.
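As a minimal sketch (SENSOR1 as in the earlier example), a program that has enabled an attention for queued access can drain all pending occurrences at a point of its own choosing:

/* at a convenient point in the main loop, handle every stacked SENSOR1 attention */
DO WHILE(QUEUE(SENSOR1) > 0);
  DEQUEUE ATTENTION(SENSOR1);   /* removes one stack entry and invokes the on-unit */
END;

Because the dequeue-statement has no effect when the stack is empty, the loop could also be written without the QUEUE test; the built-in function is most useful when the program merely wants to know how far behind it has fallen.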
6.1.5 Priorities

In order to program many practical real-time applications, and to mirror many current types of process-control hardware, it might well be necessary to permit the programmer to specify relative "importance" among groups of attentions by associating a priority with an attention. Instead of the three access options described above, the enable-statement would permit a priority option that associates a nonnegative integer with that enablement of the given attention. An attention of some given priority would then be able to interrupt any on-unit for an attention of equal or lower priority, but would not be able to interrupt the on-unit of a higher priority attention. A task spawned from an attention on-unit executes at the priority of that attention unless otherwise specified. Any time an attention cannot interrupt due to the presence of a higher priority interrupt, the low priority attention would be stacked.

With a general priority system, there would be no need to differentiate between asynchronous and immediate attentions; and if the lowest priority were defined always to be stacked, never interrupting until accessed, the priority option could replace the queued as well as the asynchronous and immediate options.
FIG. 2. Interrupt or enqueue? The action of the attention handler following the raising of an attention.
FIG. 3. After return. The action of the attention handler following the return from an attention on-unit.

Figures 4 and 5 are analogous, respectively, to Figs. 2 and 3, and illustrate the actions taken when an attention is raised and upon return from an on-unit in a priority attention handling system.

6.1.6 Comparison of the Three Level and Priority Access
The three level approach is simpler in concept and potentially simpler in implementation than the more general priority system. Although this approach would probably be quite adequate for most online applications that do not require millisecond or microsecond response times (timesharing and information retrieval systems, for example), an online process control application that required instant interrupt response would probably need the full priority facility.

Both approaches are amenable to subsetting. Restriction of attentions to queued (or lowest priority) access would have the effect of converting all interrupts to type 4 (see Section 3).
FIG. 4. Priority handler: interrupt or enqueue? The action of a priority attention handler following the raising of an attention.
Permitting queued and asynchronous access or, alternatively, two priority levels would provide the full generality of interrupt type but is not really sufficient for applications in which there is a difference in “importance” between and among attentions.
FIG. 5. Priority handler: after return. The action of a priority attention handler following the return from an on-unit.

6.2 Multiprocessing Considerations
When a subroutine is called as a task and inherits the on-units of its progenitor, all attentions associated with those on-units will be passed in the disabled state. This avoids the problem of multiple invocation mentioned in Section 5.2 above. The programmer may then enable the attention in the subtask and disable it in the calling task.

To permit the programmer to maintain complete control over attentions in his program, two additional options are available on the enable-statement. These may be used to synchronize attention enabling in a multiprocessing environment: (1) location, which specifies explicitly the task or tasks in which the attention is to be enabled, and (2) exclusion, which restricts the set of tasks in which an attention is enabled.

The location option consists of the keyword IN followed by a list of one or more task names in parentheses. An asterisk may be used in lieu of the task names and implies that the enabling applies to all active tasks. If the location option is omitted, then the statement applies only to the task in which it is executed. For example,

ENABLE ATTENTION(A);
ENABLE ATTENTION(A) IN(T1);
ENABLE ATTENTION(A) IN(T1, T2) ; ENABLE ATTENTION (A) I N (") ; The first statement enables A only in the task in which that statement is executed. In the second example, A is enabled in the task named T1 while in the third, it is enabled in both task T1 and in task T2. The effect of the fourth statement is to enable A in all tasks currently active. The exclusion option consists of the keyword ONLY or the keyword LOCALLY. If ONLY appears then the attention is enabled in the task or tasks specified and simultaneously disabled in all other tasks. The LOCALLY keyword implies a more restricted enabling: At any time during the execution of a program, the set of active tasks may be represented as a tree with the initial task a t the top and each active task represented as a subnode of the task from which it was spawned. Tasks a t the same level in the tree and with the same predecessor are called sibling tasks. When a task is enabled LOCALLY, it is enabled in the specified task and simultaneously disabled in any of its subtasks, any sibling tasks and their subtasks and in the predecessor task. This applies independently to each task named in the location option if it is present. For example, ENABLE ATTENTION ( A ) I N ( T l ) ONLY ; ENABLE ATTENTION (A) I N (Tl) LOCALLY; I n both of these examples A is enabled. I n the first example, however, A is disabled in every task other then T1. In the second example, A is disabled in all tasks of the minimal subtree of the task tree that contains the predecessor of T1 except for T1 itself. If an atkention is enabled locally in a group of tasks which fall in the same subtree then the disabling applies only to the tasks not named in the location option. When an attention is enabled in a task exclusively, any attentions on the stack of that task's predecessor(s) are transferred to the stack of that task. Attentions stacked for other tasks being disabled a t that time are lost. The following fragment illustrates the use of some of these features. ENABLE ATTENTION(L1GHT-PEN) ; CALL SUB(X,Y,Z) TASK(TSUB); ENABLE ATTENTION(L1GHT-PEN) IN(TSUB) ONLY;
    ...

Initially, the LIGHT-PEN attention was enabled in the calling routine
asynchronously. When it was desired to transfer control for light pen attentions to a subroutine, the subroutine was first invoked as an independent task, and then the caller enabled the attention in the subtask, simultaneously disabling the attention in itself. Any light pen attentions that occurred during the transfer of responsibility and up to the completion of the enabling were handled by the caller. Any attentions that were pending on the stack of the caller when the transfer occurred were passed along to the subtask.

In the previous example, it was assumed that the on-unit for the LIGHT-PEN attention was passed along when the subtask was called. If the subtask is not to assume control of attention processing until it has established its own on-unit, responsibility for the transfer could be given to the subroutine. This example is a bit more complicated and is illustrated in the section on sample applications below. Figure 6 illustrates the actions taken during the execution of an enable-statement and is an extension of the flowchart of Fig. 1.
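The LOCALLY rule above is compact but easy to misread, so a sketch may help. The fragment below, written in present-day Python rather than in the extended PL/I of this proposal, computes the set of tasks in which an attention would be disabled by a LOCALLY enabling; the names (parent, children, locally_disabled) are invented for illustration, and the initial task is assumed never to be among those named.

    def descendants(task, children):
        # all tasks below `task` in the task tree
        out = set()
        for c in children.get(task, []):
            out |= {c} | descendants(c, children)
        return out

    def locally_disabled(named, parent, children):
        # The minimal subtree containing each named task's predecessor,
        # minus the named tasks themselves: predecessor, siblings and
        # their subtasks, and the named task's own subtasks.
        disabled = set()
        for t in named:
            pred = parent[t]
            disabled |= {pred} | descendants(pred, children)
        return disabled - set(named)

    # Example: initial task T0 spawned T1 and T2; T1 spawned T11.
    parent = {"T1": "T0", "T2": "T0", "T11": "T1"}
    children = {"T0": ["T1", "T2"], "T1": ["T11"]}
    print(locally_disabled({"T1"}, parent, children))
    # -> the set {'T0', 'T2', 'T11'}: predecessor, sibling, own subtask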
6.3 Attention Handling through Multiprocessing-An Alternative Approach
While the facilities described in Sections 6.1 and 6.2 could be quite smoothly added to a large language such as PL/I or COBOL, current trends in language design are toward smaller, more compact, and more modular languages. Although it is possible to formulate a subset of any language, it is often very difficult to define a subset that is functionally adequate yet still smooth and consistent. Furthermore, without some actual experience with an implementation of the full facility, it is impractical to try to define a subset. Thus, an alternative method of handling attentions-one that does not use on-units and may thus be better suited to small languages-will be presented.

In this formulation there is again an attention data type, but in this case the attention bears resemblance to the event rather than to the on-condition. As above, an attention will be associated in an implementation-defined manner with an external condition. Initially, the attention will have the value incomplete. When the condition is raised, the attention is set complete. If the value is already complete when the attention is raised, the attention will be stacked and kept on the queue until the value again becomes incomplete. A task may test if an attention has been raised through use of the COMPLETION built-in function, and a task may wait for an attention to be raised. These topics are discussed in more detail after some preliminary description of the environment in which they are to be used.
FIG. 6. Complete attention enabling. The complete sequence of decisions and operations involved in the execution of the enable-statement. This process is repeated for each attention named in the enable-statement.
Note 1. When an exclusion option in the enabling of an attention in a given task causes that attention to be disabled in the predecessor of that task, any queue entries for that attention in the predecessor's stack are transferred to the stack of the subtask.
Note 2. The "simple enablement" referred to is that process illustrated in Fig. 1.
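The attention-as-event formulation can be made concrete with a small sketch in present-day Python (nothing of the kind is part of the proposal itself). A counting semaphore gives the behavior described above: raising an attention that is already complete queues the occurrence, and the reset to incomplete after a wait consumes one queued occurrence. The class and method names are invented; completion stands in for the COMPLETION built-in function.

    import threading

    class Attention:
        # Sketch of the attention-as-event model: the value is complete
        # while at least one raising has not yet been waited for.
        def __init__(self):
            self._pending = threading.Semaphore(0)
            self._count = 0              # occurrences not yet waited for
            self._lock = threading.Lock()

        def raise_(self):                # the external condition occurs
            with self._lock:
                self._count += 1         # queued if already complete
            self._pending.release()

        def completion(self):            # analog of the COMPLETION built-in
            with self._lock:
                return self._count > 0   # true while the value is complete

        def wait(self):                  # block until raised, then reset
            self._pending.acquire()
            with self._lock:
                self._count -= 1         # value becomes incomplete again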
6.3.1 Cooperating Tasks

Consider a graphics application in which two routines are cooperating in the processing of the data. The purpose of one routine is to respond to each light pen attention by reading in a set of x-y coordinates. The other routine takes the consecutive coordinate pairs and generates an appropriate display. Although the two routines may communicate through common variables, each is free to execute independently of the other within the bounds of certain constraints. Due to the nature of the physical device involved, it is important that response to a light pen attention be as rapid as possible. It is also necessary that the coordinate-reading routine not get so far ahead of the coordinate-using routine that the common area overflows, and it is necessary that the coordinate-using routine not get ahead of the coordinate-reading routine at all.

The problem of synchronizing data access between these two processes is beyond the scope of this discussion. See Dijkstra (1968) for a full discussion of the topic. Synchronization facilities are included in the proposed Standard PL/I (ANSI/X3J1, 1973) and in ALGOL-68 (Lindsey and van der Meulen, 1971). The synchronization of the task control and the processing of the attentions are, however, of direct interest.

In the simplest formulation, control will reside with the task most recently invoked until that task terminates or until it voluntarily gives up control by invoking another task or by executing a wait-statement. Thus, in order to prepare to accept an attention, a program will invoke a subroutine as a task. After performing any necessary initializations, the subroutine will execute a wait-statement naming the attention and, if the attention has not been raised, give up control. (If the attention has already been raised, control will "fall through" the wait-statement immediately.) When the attention is raised while the subtask is waiting, the main task will be interrupted, the attention set complete, and the subtask will resume execution. Following the wait-statement, the attention is set incomplete again so as to prepare for the next raising. The subtask then does any computation necessary to respond to the attention and returns to the wait-statement to await another occurrence of the condition that raised the attention.

This formulation does not directly permit an attention to interrupt its own attention processing task. If such an interruption is desired, the attention processing task may call itself recursively as a task. Synchronization in this situation would be quite complex.
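Continuing the Python sketch above (and assuming its Attention class), the cooperating-tasks discipline just described reduces to a small service loop in the subtask: wait, which also performs the reset to incomplete, then respond, then wait again. The handler and respond names are again invented.

    import threading, time

    def handler(attn, respond):
        # the subtask: give up control until the attention is raised,
        # respond to the occurrence, then await the next one
        while True:
            attn.wait()
            respond()

    light_pen = Attention()              # from the sketch above
    threading.Thread(target=handler,
                     args=(light_pen, lambda: print("read one x-y pair")),
                     daemon=True).start()
    light_pen.raise_()                   # simulate the external condition
    time.sleep(0.1)                      # let the handler run once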
6.3.2 Enabling, Access, and Priorities

So far nothing has been said about enabling and disabling when handling attentions through tasks. Depending upon the complexity of the facility desired, this form of attention handling could either permit the full complexity of the enable-statement, or it might be assumed that an attention is enabled from the beginning of execution of a block in which the attention is defined.
The concept of access level or priority is tied in with the method of task dispatching. Many simple multiprocessing systems use a one-level round-robin dispatching algorithm, which would not be sufficient for an attention handling environment. It may be necessary that tasks awaiting attentions get a higher priority than other tasks. If only a single level is available, it would be necessary for the programmer to make sure that all attention handling tasks gained control at periodic intervals. While this would perhaps be adequate, such a system very likely would never be able to give very high speed response, and it is likely that at least a two-level system would be necessary. Again, depending on the host environment, the system could provide either the three-level system or the full priority system. If the multiprocessing host system had its own form of priority dispatching control, there is no reason why this control could not be used with the attention handling tasks, provided that a task responding to an external condition of short duration-such as a light pen signal-be able to gain control sufficiently quickly. An example of attention handling through multiprocessing is given below in Section 7.
6.4 Extensions to FORTRAN

Syntactically, FORTRAN is a fairly simple language. In general, there is only a single keyword in each statement type, and excess punctuation and delimitation is minimized. There is no block structuring to speak of, and grouping is used only for iteration. What parts of the attention handling language would fit comfortably into FORTRAN? On the surface, the CPS language mentioned in Section 5.1 bears some resemblance to FORTRAN, and inspiration will be taken from there. Let the on-statement be added to FORTRAN in the following form:
    ON attention-name simple-statement

The simple-statement is one of the following:

    assignment
    call
    goto
    read/write
    stop

The attention-name is the identifier of an attention. In Standard FORTRAN, names containing alphabetic characters are either data variables or entry names. Files and statement labels are represented by integers. It would probably be more in keeping with the style of FORTRAN to make attention identifiers numeric also, but the need for mnemonic significance indicates that permitting alphanumeric names would be better.

Although there is some inconvenience in putting the attention handling code outside of the main program, the use of the call-statement in the on-statement would give the effect of a multiple statement on-unit. In this case, communication between main program and attention handling code would have to occur through COMMON storage.

The attention handling facility is somewhat restricted in power without multiprocessing. It does not, however, seem in the spirit of FORTRAN to permit multiprocessing. There is no reason why simple versions of the enable and disable-statements could not be supplied, and two access levels would probably suffice. On the other hand, it would be possible to use the on-statement itself as the enabling statement and to disable by providing a null on-unit. (As FORTRAN has no explicit null-statement, the continue-statement could be used for this purpose.) The following program fragment is similar to the first CPS example in Section 5.1. A terminal break key is used to determine how far a do-loop has gone.
          ...
    C     ESTABLISH THE ON-UNIT
          ON KEY WRITE(6,30)L
       30 FORMAT(I8)
    C     EXECUTE THE LOOP
          DO 100 L = 1,2000
          ...
      100 CONTINUE
    C     "DISABLE" THE CONDITION
          ON KEY CONTINUE

7. Examples
The first example illustrates the treatment of an external attention associated with the light pen of a graphics terminal. It is assumed that an attention is raised when the user depresses a button on the light pen, and furthermore that as long as the button remains depressed, attentions continue to be raised at (possibly irregular) intervals. COORDS is an external function that returns a two element vector:
the current x-y position of the light pen. Initially the attention is declared:

    DECLARE LIGHT-PEN ATTENTION ENV(. . . parameters . . .);

The parameters in the environment option are implementation dependent and, in addition to associating the declared name with the external device, may, for example, specify an expected maximum queue depth. The main routine will process the coordinates recorded by the on-unit that is processing the attentions. A very large 2-by-many vector, CVEC, will be used to store the successive coordinate pairs, and two indices are used:

TAKE-an integer pointer to the coordinate pair in CVEC most recently processed by the main routine.
PUT-an integer pointer to the coordinate pair in CVEC most recently placed in the vector by the attention on-unit.

The on-unit is established by the following on-statement:
    ON ATTENTION(LIGHT-PEN) BEGIN;
        CVEC(PUT+1,*) = COORDS;
        PUT = PUT+1;
    END;

Thus, when the attention occurs, a new set of coordinates is recorded and the index is advanced. The main routine is ready to process the successive coordinate pairs as they become available. Whenever the value of PUT is greater than the value of TAKE, there are data to be processed. The code in the main routine might be:

    DO WHILE(PUT > TAKE);
        TAKE = TAKE+1;
        /*code to do something with the new coordinate pair*/
    END;
If attentions are arriving faster than the coordinates can be processed by the main routine, then the main program will loop while the PUT > TAKE test is true. It is necessary to provide a mechanism to permit the main program to wait for attentions when it has no data to process. Thus, an event, MORE, is defined which will be set incomplete by the main program when it has processed all of the data and set complete by the on-unit when data are available. The clear and post-statements from the proposed Standard PL/I are used, respectively, to set the event incomplete and
complete. For reasons explained below, the attention is enabled for immediate access. A "toggle," ACTIVE, is set to true ('1'B, in PL/I) initially and then used to control the loop. The mechanism for terminating the loop is described below. The program now looks like:

    ON ATTENTION(LIGHT-PEN) BEGIN;
        CVEC(PUT+1,*) = COORDS;
        PUT = PUT+1;
        POST(MORE);
    END;
    PUT, TAKE = 0;
    CLEAR(MORE);
    ACTIVE = '1'B;
    ENABLE ATTENTION(LIGHT-PEN) IMMEDIATE;
    DO WHILE(ACTIVE);
        WAIT(MORE);
        CLEAR(MORE);
        DO WHILE(PUT > TAKE);
            TAKE = TAKE + 1;
            /*code to process a coordinate pair*/
        END;
    END;
In order for the outer loop to terminate, it is necessary for the variable ACTIVE to be reset in some way to false. Assume another attention, PEN-OFF, is defined to be raised when the light pen button is released. The on-unit for this attention will set ACTIVE to false ('0'B, in PL/I) and also post MORE complete. (This latter action is necessary as the main loop may be waiting for MORE when the release occurs.) As it is necessary to make sure that all light pen attentions are processed before terminating the processing loop, the PEN-OFF attention must never preempt the LIGHT-PEN attention.

    ENABLE ATTENTION(LIGHT-PEN) IMMEDIATE,
           ATTENTION(PEN-OFF) ASYNCHRONOUS;
    ON ATTENTION(PEN-OFF) BEGIN;
        ACTIVE = '0'B;
        POST(MORE);
    END;

Although it would appear that the mechanism described so far would be sufficient to handle all situations in the simple graphics application used in this example, there is one critical area in the program where a
combination of unfortunately timed attentions may result in the loss of some data. If a LIGHT-PEN attention followed immediately by a PEN-OFF attention should occur between the end-statements of the two do-loops, the setting of ACTIVE to false would cause the outer loop to terminate without ever making use of the coordinates recorded in the final LIGHT-PEN on-unit. The most straightforward solution to this problem is to include in the outer loop's while-clause a test to make sure that all coordinates have been processed; that is, the loop continues while ACTIVE is true or while PUT is greater than TAKE. Putting all of this together in a subroutine:
    PROCESS-PEN: PROCEDURE;
        DCL (PUT, TAKE, CVEC(10000,2)) FLOAT,
            COORDS ENTRY EXT RETURNS((2) FLOAT),
            MORE EVENT,
            ACTIVE BIT(1),
            LIGHT-PEN ATTENTION ENV(. . .),
            PEN-OFF ATTENTION ENV(. . .);
        ON ATTENTION(LIGHT-PEN) BEGIN;
            CVEC(PUT+1,*) = COORDS;
            PUT = PUT+1;
            POST(MORE);
        END;
        ON ATTENTION(PEN-OFF) BEGIN;
            ACTIVE = '0'B;
            POST(MORE);
        END;
        PUT, TAKE = 0;
        CLEAR(MORE);
        ACTIVE = '1'B;
        ENABLE ATTENTION(LIGHT-PEN) IMMEDIATE,
               ATTENTION(PEN-OFF) ASYNCHRONOUS;
        DO WHILE(ACTIVE | PUT > TAKE);
            WAIT(MORE);
            CLEAR(MORE);
            DO WHILE(PUT > TAKE);
                TAKE = TAKE+1;
                /*process coordinate pair*/
            END;
            /*critical area referred to in text*/
        END;
        DISABLE ATTENTION(LIGHT-PEN, PEN-OFF);
    END PROCESS-PEN;

In this example, because of the use of PUT and TAKE in the main program and in the on-units, it would not be safe to permit the on-units to execute as tasks. Given suitable facilities for synchronization of access, it might be possible to gain efficiency through the use of the task option on the on-units.

Another example of attention handling will be taken from a time-sharing systems application. When the system is started up, a small "telephone answering" routine is initiated at a high priority. This routine is prepared to respond to attentions raised by a terminal interface each time a data connection is made. The response by the answerer is to call a copy of the terminal monitor routine to handle the user. It is assumed that a variable, LINE, contains an integer that identifies the line on which the connection is made.

    ANSWERER: PROCEDURE;
        DECLARE
            LINE FIXED,        /*phone line of connection*/
            FOREVER EVENT,     /*to permit program to wait*/
            RING ATTENTION ENV(. . .),
                               /*raised when a phone connection is made*/
            MONITOR ENTRY EXT, /*called each time a connection is made*/
            USER(maxno) TASK;  /*to identify a monitor task*/
        CLEAR(FOREVER);
        ENABLE ATTENTION(RING) ASYNCHRONOUS;
        /*the ring attention on-unit will be invoked as an
          independent task, once for each connection*/
        ON ATTENTION(RING) TASK(USER(LINE)) BEGIN;
            /*preliminaries-see text*/
            ENABLE ATTENTION(RING) ASYNCHRONOUS;
            CALL MONITOR(LINE);
            /*termination-see text*/
        END;
        WAIT(FOREVER);
    END ANSWERER;

The main body of the program is easily described-after initializing the event, enabling the attention, and establishing the on-unit, the program goes into a wait state and remains that way except when actually
in the on-unit. Each time a connection is made, the on-unit is invoked, and that on-unit remains active until the user on the line disconnects. When the attention is raised, its access status is changed to queued. This permits the on-unit to execute safely any code that requires exclusive access to any critical data. The attention is then enabled for asynchronous access again to permit other connections to be made. At this point the MONITOR program is called that will take care of the user's requests. The monitor remains active until the user is finished and then, after any termination code that may be necessary, the on-unit terminates. It is assumed that the answerer itself will remain active but in the wait state until terminated by some outside action.

This example could be extended somewhat by assuming another attention that will be raised when a phone line disconnects. This action may occur due to a line fault or to a user hanging up without logging out. The following code would be added before the wait-statement in ANSWERER (D-LINE is the number of the disconnected line):

    DECLARE DISCONNECT ATTENTION ENV(. . .);
    ENABLE ATTENTION(DISCONNECT) IMMEDIATE;
    ON ATTENTION(DISCONNECT) TASK BEGIN;
        STOP TASK(USER(D-LINE));
    END;

The primary purpose of the on-unit is to terminate the instance of MONITOR that corresponds to the disconnected line. The stop-statement could be preceded or followed by any necessary clean-up code.

The last example illustrates a possible application of the attention handling through multiprocessing approach of Section 6.3. The attention in this case is assumed to be a terminal's break key, and it will be used to initiate a debugging printout during the run of a long program. Early in its execution, the main program will execute the following statement:

    CALL DEBUG-PRINT TASK;

The attention task itself will look like:

    DEBUG-PRINT: PROCEDURE;
        DECLARE KEY ATTENTION ENV(. . .);
        ENABLE ATTENTION(KEY);
        WAIT(KEY);
        /*debugging printouts*/
    END DEBUG-PRINT;
The subroutine will enable the attention and then enter a wait state until the attention is raised. If that never happens, the subroutine will be terminated when the calling program terminates. When the attention is raised, the debugging print statements will be executed and then the subroutine will terminate. It would also have been feasible to have the DEBUG-PRINT program loop back to the wait-statement so that the print would be repeated each time the terminal break key was depressed.
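The break-key pattern can be approximated today with an ordinary operating-system facility standing in for the attention. In the Python sketch below, which is an analogy and not the proposal's mechanism, SIGINT plays the role of the break key and the handler plays the role of DEBUG-PRINT; because the handler stays installed, the repeated-printout variant mentioned above comes for free.

    import signal, time

    def debug_print(signum, frame):
        # analog of DEBUG-PRINT's body: dump whatever state is of interest
        print("debug: loop counters, queue depths, etc. would go here")

    signal.signal(signal.SIGINT, debug_print)   # "enable the attention"

    for _ in range(30):      # the long-running program
        time.sleep(1)        # press the interrupt key to trigger a dump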
8. Conclusion
A primary goal of this article was to define and illustrate an attention handling facility embedded in a high level programming language. It is not the first such attempt-as early as 1969, a paper on a process-control language based on PL/I was published (Boulton and Reid, 1969), and in mid-1969 an informal presentation on attention handling in PL/I was made at a SHARE Meeting (Norman, 1969). Although many of the examples were based on extrapolations of PL/I, the concepts involved are not tied to any language in particular and could be applied to FORTRAN, COBOL, PL/I, or any ALGOL-like language. To avoid the impression of strong dependence on any particular application area, it should be noted that the underlying language concepts can find application in areas from time-sharing through systems programming and online process control. Attention handling through on-units has a certain elegance that seems well suited to the top-down structural approach to programming, allowing the programmer to specify his attention handling code in a compact modular fashion, almost as part of a block's declarations.
Appendix 1. Syntax of the Attention Handling Language

The syntax of the attention handling statements is described here in a notation that is an extension of Backus-Naur Form. The ::= operator separates the term being defined (on the left) from its definition (on the right). Square brackets indicate items that are optional. The ' operator separates options that may appear in arbitrary order. The vertical bar separates alternatives, no more than one of which may be used in a single statement. Upper case letters and punctuation other than that defined in the metalanguage represent characters from the statements being defined, while lower case names represent categories defined within the definition itself. Braces are used for grouping, and the ellipsis operator (. . .) is used to denote arbitrary repetition.
For purposes of this Appendix, some informal definitions appear as prose text between # signs. This notation is similar to that devised for the definition of COBOL and is fully defined in the draft PL/I Standard (ANSI/X3J1, 1973).
1. The Enable-Statement

enable-statement ::= ENABLE { attention-part { [access-option] ' [location-option] ' [exclusion-option] } } . . .
attention-part ::= ATTENTION(attention-list) | ATTENTION(*)
access-option ::= IMMEDIATE | ASYNCHRONOUS | QUEUED | PRIORITY(unsigned-integer)
location-option ::= IN(task-list) | IN(*)
exclusion-option ::= LOCALLY | ONLY
attention-list ::= # a list of attention names, separated by commas if more than one #
task-list ::= # a list of task names, separated by commas if more than one #
unsigned-integer ::= # an unsigned integer #
2. The Disable-Statement

disable-statement ::= DISABLE { attention-part [location-option] } . . .

3. The Dequeue-Statement

dequeue-statement ::= DEQUEUE attention-part

4. The On-Statement (as used with attentions)

on-statement ::= ON attention-part { [task-option] [event-option] } on-unit
task-option ::= TASK[(task-name)]
event-option ::= EVENT(event-name)
on-unit ::= # a simple statement or begin-block as for a PL/I on-unit #
Appendix 2. Detectable Conditions in PL/I, COBOL, and FORTRAN
As described in Section 4, the general purpose languages PL/I and COBOL and implemented versions of FORTRAN all have at least some ability to take special action as a result of some potentially interrupting condition. The tables below list the conditions that can be detected in each of these languages and processed without terminating the program run.

1. PL/I
Fixed overflow, floating over- and underflow
Size-number too large for target
Division by zero
String truncation or reference outside of string bounds
Subscript out of range
Conversion error
I/O errors of all sorts
ERROR-a general catchall that is raised for any error condition not explicitly named. This includes errors in built-in functions, machine errors, etc.
FINISH-a condition raised in a routine just as it is about to return, permitting any last minute cleanup
2. COBOL

Overflow, division by zero-combined into a general purpose SIZE ERROR clause
End of file
Invalid key
Error on an I/O device
Start of tape volume (for label processing)
Start of section of Report Program
3. FORTRAN
(IBM FORTRAN IV)

Floating point under- and overflow
Division by zero
End of file
Error on input device
End of asynchronous I/O operation
(Honeywell Series 6000)

Integer overflow, floating point under- and overflow
Division by zero, integer and floating point
I/O errors of all sorts, sequential and random access
Format and conversion errors
Illegal and/or improper arguments to built-in functions (square root of negative, log of zero, etc.)
Appendix 3. Glossary of Terms

The definitions below are in no sense universal, but rather reflect the specialized usage above. (A number in brackets indicates the section in the text in which a term is defined.)
access-status of an attention enablement, determining whether it will interrupt or not. [6.1.3]
active-turned on; an active attention may be raised. [6.1.3]
asynchronous-an asynchronous attention may interrupt a program at any time except when the on-unit of an immediate attention is active. [6.1.3]
attention-the manifestation within a program of an external condition. [2]
condition-a situation, occurrence or event that causes an attention to be raised. [2]
delayed-an attention whose action does not immediately interrupt a program is said to be delayed.
dequeued-removed from a queue or stack. [6.1.4]
disabled-not prepared to be raised. [6.1.3]
enabled-prepared to be raised. [6.1.3]
enqueued-placed on a queue or stack.
event-1. an occurrence outside of a program that may cause an attention to be raised. [2] 2. a data item that takes on the values "complete" and "incomplete". [4.1.2]
immediate-an immediate attention may always interrupt a program, but its attention on-unit may be interrupted only by another immediate attention. [6.1.3]
inactive-turned off; the occurrence of the condition corresponding to an inactive attention may be ignored. [6.1.3]
interrupt-(See Section 2)
on-unit-a statement or block of code to be executed when a program is interrupted by an attention or condition. [4.1.1]
priority-an integer measure of the relative importance of a task or attention. [6.1.5]
queued-1. on a queue or stack. 2. raising a queued attention causes an entry to be made on a queue but does not cause the program to be interrupted. [6.1.3]
raise-an attention is raised when it is enabled and the external condition with which it is associated occurs. [2]
task-(See Section 2)
type (of interruption)-a classification of interruptions differentiated by the level of synchronization with the hardware or software environment. [3]

REFERENCES

ANSI. (1966). "USA Standard FORTRAN." Amer. Nat. Stand. Ass., New York.
ANSI/X3J1. (1973). "BASIS/1 (Working Document for PL/I Standard)." Copies available from CBEMA, 1828 L St. NW, Washington, D.C. 20036.
ANSI/X3J3. (1973). "FORTREV (Working Document for Revised FORTRAN Standard)." Copies available from CBEMA, 1828 L St. NW, Washington, D.C. 20036.
Bates, F., and Douglas, M. L. (1970). "Programming Language/One," 2nd ed. Prentice-Hall, Englewood Cliffs, New Jersey.
Boulton, P. I. P., and Reid, P. A. (1969). A process control language. IEEE Trans. Comput. 18, No. 11, 1049-1053.
Coddington, L. (1971). Quick COBOL. Computer Monographs Series (S. Gill, ed.), Vol. 16. Amer. Elsevier, New York.
Corbato, F. J., Daggett, M. M., Daley, R. C., et al. (1963). "The Compatible Time-Sharing System, A Programmer's Guide." MIT Press, Cambridge, Massachusetts.
Corbato, F. J., Clingen, C. T., and Saltzer, J. H. (1972). MULTICS: The first seven years. Honeywell Comput. J. 6, No. 1, 3-14.
Dijkstra, E. W. (1968). Cooperating sequential processes. In "Programming Languages" (F. Genuys, ed.), pp. 43-112. Academic Press, New York.
Honeywell. (1971). "FORTRAN," Doc. No. CBP-1686. Honeywell, Phoenix, Arizona.
IBM. (1970). "PL/I (F) Language Reference Manual," Form GC28-8201-3. IBM, Armonk, New York.
IBM. (1971). "PL/I Checkout and Optimizing Compilers: Language Reference Manual," Form SC33-0009-1. IBM, Armonk, New York.
IBM. (1972a). "Conversational Programming System (CPS)," Form GH20-0758-1. IBM, Armonk, New York.
IBM. (1972b). "FORTRAN IV Language," Form GC28-6515-9. IBM, Armonk, New York.
Lindsey, C. H., and van der Meulen, S. G. (1971). "Informal Introduction to ALGOL 68." North-Holland Publ., Amsterdam.
Norman, A. B. (1969). Attention handling in PL/I. Unpublished notes-Entry 2108 in the IBM PL/I Language Log.
Poetry Generation and Analysis
JAMES JOYCE
Computer Sciences Division
Department of Electrical Engineering and Computer Sciences
University of California
Berkeley, California
1. Introduction
2. Computing Expertise
3. Poetry Generation: The Results
4. Poetry Analysis: Introduction
5. Concordance-Making
6. Stylistic Analysis
7. Prosody
8. Literary Influence: Milton on Shelley
9. A Statistical Analysis
10. Mathematical and Statistical Modeling
11. Textual Bibliography
12. Conclusion
References
1. Introduction
The progress in literary data processing has neither made human poets obsolete nor rendered literary criticism a matter of having a friendly chat with a HAL 9000. The application of computers to the writing and analysis of poetry is approximately 15 years old and shows great promise for achieving every new subdiscipline's ideal goal-ceasing to be a thing apart from what one normally accepts as the mainstream of activity in literary study. Results sometimes fall short of expectation in the application of computers to the generation and analysis of poetry, but the successful applications have encouraged computer technology to develop beyond the number-crunching stage toward the no less difficult area of natural language analysis.

Poetry generation and analysis have nothing to do with automatic translation (as far as I can determine); and although some studies of
authorship have been done, the bulk of poetry analysis by computer is to understand what is there in the poetry rather than who wrote it. Thus, this survey does not include authorship studies. Poetry generation-the act of programming a computer to produce strings of output in imitation of human poetry-is less often done than poetry analysis, which is programming a computer to search for strings or patterns within human poetry; perhaps this is because most of us would rather read and talk about poems a human writes than about those written by a computer. However, as we shall see, there is at least one good reason for producing poetry by computer. There are also a number of good reasons for going to the effort of converting poems into computer-readable form and writing (or borrowing) programs to search the lines for strings of characters or patterns. Fortunately, the human effort involved in the computational analysis of poetry can be decreased as more material is available in computer-readable form, as programs for stylistic analysis are shared, and as human-oriented output is easier to produce on high-speed printers, terminals, computer-controlled typesetting equipment, and the like.

Because many of the points I make in this survey of the kinds of work going on relate not only to the analysis of poetry, but more generally to literary analysis, I will at times substitute "literary analysis" for "the analysis of poetry" to suggest the more general application. The examples, however, will be drawn from applications of the computer to the writing and study of poetry. The studies mentioned were selected because they are indicative of work done or going on, and I regret that it was not possible at least to mention every worthwhile study. Most of the examples are of applications to English poetry because I expect my readers are more familiar with English and American literature, not because work done in French or German, for example, is any less important.

2. Computing Expertise
Before one can do much data processing there must be some data to process. The task of encoding data for someone engaged in poetry analysis can become monstrous all too quickly, depending in part upon the complexity of the project. Optical readers have not yet progressed to the point where a book of poetry can be "fed in" to the computer-or anything near that desirable but distant state of direct data entry. The person wishing to do poetry generation can (and generally does) content himself with a limited input vocabulary, concentrating his energies on the program logic necessary to create lines of poetry as output. But the person wishing to do an analysis of, say, Milton's Paradise Lost must (if he cannot locate one of the several copies in computer-readable form)
arrange for someone to enter the poem into computer-readable form or do so himself. It is this single point that discourages many novices in literary data processing. Although there is a listing of machine-readable text published each January in Computers and the Humanities, a journal devoted to computer applications to the various fields in the humanities, either the investigator may find the material he desires is not yet in machine-readable form, or the person in possession of the machine-readable text for some reason is not willing to share it. Another source for poetry already in computer-readable form can be found in the articles and books written as a result of computer-aided scholarship; once again, some scholars apparently are not willing to share, whereas others are quite helpful. The situation is such that a group of scholars, among them this author, are at work on plans for a clearinghouse of literary materials in computer-readable form; this archive of material will hopefully eliminate much duplication of effort, and make computer-readable materials as easy to obtain as their printed forms are through interlibrary loan. Unfortunately, much work needs to be done before the archive can be operational, and at present most literary investigators must encode their data as well as design and write (or have written for them) programs.

By and large poetry generation and analysis projects are done in computer languages designed for other purposes, such as FORTRAN, PL/I, ALGOL, and SNOBOL. This is because the traditional lack of money in humanistic disciplines is continued in computer applications. The literary scholar who uses a computer in a project may be more likely to find funds than he was before computers were used in such tasks. But there is nowhere near enough money involved to encourage a major computer manufacturer to offer a supported compiler for a language more suitable for poetry generation or analysis. However, I do not mean that the major computer languages are totally inappropriate for poetry generation and analysis. Although FORTRAN is still best used for numeric calculations, it can be made almost tame for literary data processing; the advantage of using FORTRAN is that it is the most widely available computing language and at a given installation tends to be kept the most problem-free of the installation's languages. In my opinion, the best language for natural language programming is PL/I, a language implemented for IBM, Burroughs, or CDC computers; PL/I allows both text processing and number processing without making the programmer work too hard to use them, although it is sometimes too easy to write inefficient code. Some poetry analysis projects are done in assembler-level languages, but writing in a language so machine-oriented takes someone who is very patient and as interested in
computers as he is interested in literature. SNOBOL has been offered by several people as a good language for literary analysis, but I find it a bit too alien for a confirmed humanist; this objection may well be more personal than professional. One problem with SNOBOL is that it is not available everywhere in a fast version and can be costly to use. Milic uses SNOBOL for his experiments in poetry generation because he does not need to be concerned with programming efficiency as much as obtaining results quickly, which he can then devote his time to analyzing (Milic, 1971); for programs processing large amounts of data, however, I suspect it would behoove the investigator to choose another language. COBOL has been used several times for concordance-making programs and may be an overlooked language for natural language processing; COBOL's advantages are that it looks very much like English and is generally very fast in execution. Some researchers, unfortunately, speak ill of COBOL not because they know the language to be improper for literary data processing but because COBOL is the number one language in business data processing.

Since there is more activity in poetry analysis than in poetry generation, more programs have been written and made available by their authors so that the same program need not be rewritten by independent investigators. Computers and the Humanities publishes in its May and November numbers a "Directory of Scholars Active" which indicates the projects under way and whether the programs being used or developed in a project are available. The January 1973 issue of the journal includes a brief summary of humanities programs available, in addition to the useful registry of machine-readable text. Comments in some of the program summaries indicated machine-to-machine incompatibility problems, but also a sophistication about programming that may indicate a growing away from the technological ignorance which has hindered work in the field.

For a long time investigations into poetry using a computer were done by scholars trained in the humanities but who relied on programmers (generally not trained in the humanities) to instruct the computer. This led to the inevitable communications problems and misestimation of the computer's capabilities, both over and under what the machine could do. I do not know whether the percent of scholars engaged in literary data processing who can program has passed 50%, but I tend to doubt it. One reason it is important for literary scholars to know how to program is that computer center consultants do not generally understand the problems proposed by literary scholars, and there are not enough graduate students in literature who know programming to assume the programming chores that graduate students in physics, economics, and other such
disciplines perform. Another reason literary scholars need to know how to program is that effective and efficient literary data processing techniques can best be developed by people who understand the goals of literary data processing and the capabilities of computer programming. Curiously, those who generate poetry by computer all seem to know how to program (that is, there are none of the references to "my programmer" which one finds in literary data processing reports). Perhaps this can be accounted for by the fact that the person wishing to generate poetry by computer realizes that the experiment in creation requires a close relationship between literary person and computer; another reason may be, as one generator of poetry told me, the whole thing is being done as an exercise in programming.

3. Poetry Generation: The Results
Poetry generated by a computer has appeared from time to time in print and was on exhibit at the Association for Computing Machinery's 1969 Annual Conference in San Francisco, California. More recently two publications have appeared containing a number of computer-produced poems: Cybernetic Serendipity, and a volume devoted to poetry entitled Computer Poems. The poetry in these publications gives a good idea of what is being done in poetry generation, both the successes and the failures.

A most interesting and thought-provoking application of the computer to writing is by Marc Adrian, the first selection of computer poems and texts in Cybernetic Serendipity. Adrian's piece "CT 2 1966" is called a "computer text" in the book, and I do not know to what extent I am justified in including it in this discussion. But since the piece is similar to poetry by, for example, Ian Hamilton Finlay in Scotland, I decided I should include it (see Figs. 1 and 2). An IBM 1620 II was programmed to select the words or syllables at random for Adrian's poem; the program also selected at random a type size for each of the words chosen, and they were set in that size Helvetica type using no capitals. The question of whether the result is a poem is not to be brushed aside lightly. Adrian's (1969) comments on his work sound very much like an artist discussing his medium: "To me the neutrality of the machine is of great importance. It allows the spectator to find his own meaning in the association of words more easily, since their choice, size and disposition are determined at random." The question of whether the product is in the medium of poetry or of graphics is answered hastily at the peril of the reader. I do not believe it is poetry because it is not language, for I hold as an assumption that poetry is a subset
FIG. 1. A computer text. (From Adrian, 1969.)
FIG. 2. I. H. Finlay's poem "Acrobats." (From Smith, 1968.)
of language. Perhaps one value of Adrian's work is that as we judge whether it is poetry or not, we have the opportunity to reassess for ourselves just what the word poetry means to us.

Gaskins chooses the haiku form as the structure from which a poem is to emerge. Gaskins has a fine sense of haiku and has evidently programmed his computer well. His word-choice captures the Oriental spirit well, and the result is a number of believably Japanese haiku. In a conversation with Gaskins I learned he wrote the program to generate the haiku in the SNOBOL language, and, among other things, the program checks to maintain a consistency of seasons to be evoked in the computer-produced poem. The traditional haiku is a three-line poem having five syllables in the first line, seven in the second, and five in the last; the poem is to be evocative rather than discursive, and there should be some pointed reference to the season the haiku suggests in the last line. Though these rules do not seem very complex, a good haiku is not easy to write, whether helped by a computer or not. The following are some of Gaskins' haiku (1973):

Arriving in mist
Thoughts of white poinsettias
Snow leopards watching.
and

Unicorns drowsing
Thoughts of steep green river banks
Scented thorngrasses.
and

Pines among riders
Old man is speaking of frost
Mountain jays watching.
These haiku, while not great poems, are good. Other attempts at haiku generation by computer are not as good; their programmers apparently ignored basic haiku structure and generated random words under the constraint of syntax alone, hoping for output which (if the reader stares at it for a long time) may make sense. Gaskins' efforts make sense and have that sense of abbreviation that is in human-written haiku. The title-"Haiku Are Like Trolleys (There'll Be Another One Along in a Moment)"-suggests their machine origin. Gaskins explained to me that after he had written the haiku-generating program he had modified his computer's operating system so that in the event of machine idle time, haiku began appearing on a cathode ray terminal which was permanently online, the lines moving upward steadily on the screen as more lines appeared, and finally disappearing off the top of the screen, lost; the haiku did remain on the screen long enough for someone to copy it if he happened to be watching the screen and desired to do so. Unfortunately, it is my understanding that the haiku-producing program is no longer such an integral part of the system.
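Gaskins' SNOBOL program is not reproduced in print, so the following Python toy is only a guess at the skeleton of such a generator: it enforces the bare 5-7-5 syllable count from a small hand-annotated vocabulary and ignores the harder constraints, such as the season consistency his program maintained. The word list and names here are invented.

    import random

    # (phrase, syllable count) pairs -- a tiny hand-annotated vocabulary
    WORDS = [("mist", 1), ("frost", 1), ("snow", 1), ("pines", 1),
             ("green", 1), ("watching", 2), ("drowsing", 2),
             ("thoughts of", 2), ("old man", 2), ("river banks", 3),
             ("mountain jays", 3), ("white poinsettias", 5)]

    def line(target):
        # build one line whose phrases total exactly `target` syllables
        chosen, total = [], 0
        while total < target:
            phrase, syl = random.choice(WORDS)
            if total + syl <= target:    # skip phrases that would overshoot
                chosen.append(phrase)
                total += syl
        return " ".join(chosen)

    print("\n".join(line(n) for n in (5, 7, 5)))   # the 5-7-5 haiku form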
Two people who generate poetry by computer, Kilgannon (1973) and Milic (1973), have goals beyond the basic goal of generating poems through a computer program. Kilgannon, using an ALGOL program running on an Elliott 4130 computer, generates "lyrics" and then develops his own poem from the one the computer generated. This fascinating symbiosis of man and machine suggests a new way of thinking about poetry-a way that may indeed become an influence on poetry itself. The computer has become a source of inspiration, a generator of something to revise and make into a poem. Poets for centuries have drawn inspiration from things, but drawing also the terms of inspiration-to share the process of inspiration and creation with the source-is a rather intriguing development. An example of this symbiotic relationship is illustrated in the pair of poems "Lyric 3205," written by the computer,

Lyric 3205

judy gotta want upon someone.
wanna sadly will go about.
sammy gotta want the thief him but the every reason.
real distance carry.
before god wanna remain.
private distance talk indeed baby. an.
diane likta tell the thief him but the every reason.
real distance carry.
before god wanna remain.
private distance talk indeed baby. an
and, Kilgannon's poem developed from "Lyric 3205":

Restlessness

judy needs to need someone
sadly searching everywhere
sammy finds his soul attached
to travel, movement, free as air
diane lusts communication
every life is her domain
private distance talks indeed
and drives us all to search in vain
A poet writing poems based on words or phrases selected at random is not new. As Milic (1971) has brought to our attention, a childhood
friend of Dylan Thomas wrote that Thomas "carried with him a small notebook containing a medley of quite ordinary words." When he wanted to fill in a blank in a poem he would try one word after another from his dictionary, as he called it, until he found one that fit. His aim "was to create pure word-patterns, poems depending upon the sounds of the words and the effect made by words in unusual juxtaposition." Kilgannon's use of the words and ideas from "Lyric 3205" in his "Restlessness" is not unlike Thomas' practice as described by his friend. The difference between a Dylan Thomas poem and one by Kilgannon is not found in whether one used a notebook or a computer to suggest the stuff of which poems are made; the difference lies in the ability of the two poets.

Milic's collaboration with the computer is a pointedly systematic one; he seeks to understand more about what poetry is, and the poems his programs generate serve two functions: (1) they are poems, and (2) they are experiments to see what kinds of freedoms and constraints will produce poetry. He says, "in an important sense, strings of words are interpreted as poetry if they violate two of the usual constraints of prose, logical sequence and semantic distribution categories" (Milic, 1971, p. 169). This suggests that poetry does not have to make quite as much sense as prose does, a statement that is in its way true. As Milic points out, modern poets have exploited the reader's willingness "to interpret a poet, no matter how obscure, until he has achieved a satisfactory understanding" (1971, p. 172). Behind the reader's willingness is a seldom expressed but definite agreement that there is something there to understand; and, beyond that, there is someone to understand. Some computer-generated poetry shows a concern for the first part of the agreement; if any shows a concern for the second I am not aware of it. Marc Adrian's words that the "neutrality of the machine . . . allows the spectator to find his own meaning in the association of words more easily" can be viewed as so much bad faith on the part of the poet if one insists upon the agreement between reader and writer characterized above. Milic's computer-generated poems show a concern for the integrity of the poem: although poetry may violate some of the rules, it is not devoid of some basic sense. An example of this concern is in his "Above, Above":

Above, Above

This is my epistle to the universe:
Above the eager ruffles of the surf,
Above the plain flounces of the shore,
Above the hungry hems of the wave.
There is a basic sense there, something the reader can enter into; it is not terribly profound, but it is not totally obvious (as expository prose is supposed to be) either.
4. Poetry Analysis: Introduction
The agreement between poet and reader that the poet has given the reader something to understand if he will but interpret it is a basic tenet of literary analysis and a major justification for the analysis of a poem by computer; indeed it may not be too much of an exaggeration to say that the analysis of poetry using a computer allows the reader the opportunity to understand the someone who wrote the poetry with a precision usually not attained through the use of noncomputational analysis alone. This is not to say that literary data processing is all one needs to do to understand poetry-far from it; using a computer to analyze poetry provides a reader with knowledge to be used in addition to more conventional methods of analysis. The analysis of poetry is no less difficult than the analysis of any other natural phenomenon.

But what can one analyze in poetry using a computer, a device which is so insensitive and unforgiving to computer programs that the obvious misspelling of a variable name can produce pages of misunderstanding in the form of error messages or bad output? The answer to this, of course, is that it is the human who does the analysis; the computer is his instrument for handling large quantities of data quickly and consistently, especially if the task involves counting or searching.

A digital computer can process two kinds of data, numeric and string. Poetry can be viewed as string-type data, and thus the kinds of operations one performs on strings can be performed on poetry. For example, a line of poetry is generally mapped, character for character, into a fixed-length field and padded to the right with blanks. There may be other fields for identification information to make up the record, depending upon whether or not the project has a need for that information. Most records are card images, reflecting the most common input medium-cards. Most lines of poetry will fit comfortably on a single card along with, say, a line number or identifier to indicate the poem. Programs search the text field for strings or patterns, and if they meet a desired criterion they are counted or noted. Generally the kind of processing done is more like business data processing than scientific data processing, except for a few isolated projects. A common product of literary data processing is a concordance, which can be defined for the moment as an alphabetical list of entries, each entry consisting of a word used in the text being concorded and each line in which the word was used (usually with some indication of the location within the text being concorded). An example of a concordance can be seen in Fig. 3.
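As a toy illustration of the card-image records just described, the Python fragment below packs a verse line into an 80-column record: a fixed-length text field padded to the right with blanks, followed by identification fields. The column boundaries chosen here (64, 10, and 6) are hypothetical, not a standard layout.

    def make_record(text, poem_id, line_no):
        # one 80-column card image: text field, poem id, line number
        assert len(text) <= 64, "line too long for the text field"
        return f"{text:<64}{poem_id:<10}{line_no:>6d}"   # 64+10+6 = 80

    rec = make_record("AND THE LOUD SONG OF THE EVER-SINGING LEAVES,",
                      "SORROW", 11)
    print(len(rec))        # -> 80
    text_field = rec[:64]  # programs search this field for strings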
5. Concordance-Making

Production of a concordance is not a computationally sophisticated task, although there are some aspects of concordance-making that as of now are not possible using a computer. The poetic line is searched and each word found (if the concordance is not selective) is written as a record, containing also the poetic line and identifying information, to some intermediate storage device. If the concordance is selective, the word found is checked against a list of words to be omitted and, if it is on that list, the record for that word is not written out. The intermediate output is read from the temporary storage device and sorted into alphabetical order, the result later input to a print program which accumulates statistics about the number of different words and the distribution of relative frequency for the different words. This sounds simple, and basically is; far more difficult for the literary data processor is the task of entering the data to be concorded into computer-readable form, a task which includes the decisions of how to indicate characters not available on the encoding device or the printer, the fields to be included in the record, etc.

However, if the concordance project consists of only what I have described above, the resulting concordance may be a help to scholars (in that some help is better than none), but it will not really be satisfactory. Homographs, for example, "love" as a verb and "love" as a noun, are not two uses of the same word; they are different words and should be treated and counted as such. True, the human user of the concordance should be able to separate homographs, but to do the job correctly they should be separated in the finished concordance. This process at present requires the human to edit a preliminary form of the concordance, a task which is made considerably easier if one has an online text editor such as the WYLBUR system developed by Stanford University¹; a program designed to regroup lines according to an editor's indication is not complex to write in any event, and allows a number of other desirable regroupings to be made as well. For example, it is also desirable to have the various forms of a verb, such as "coming," grouped together under "come." One of the forms, "went," needs to be placed out of its strictly alphabetical arrangement, but that is a minor problem for the regrouping program sketched above.

¹ A description of Stanford's WYLBUR system is available from the Stanford Center for Information Processing, Stanford University, Stanford, Ca. 94305. Another version of the WYLBUR system is described by Fajman and Borgelt (1973).
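The passes just described can be compressed into a minimal sketch; the Python below stands in for what was then several separate programs, since a concordance program of the period wrote intermediate records to tape and sorted them in a separate step. The stoplist and the two sample lines (adapted from the Yeats page of Fig. 3) are for illustration only.

    import re
    from collections import defaultdict

    STOPLIST = {"the", "and", "of", "a"}    # words omitted if selective

    def concord(lines):
        entries = defaultdict(list)         # word -> [(location, line), ...]
        for loc, text in lines:
            for word in re.findall(r"[a-z']+", text.lower()):
                if word not in STOPLIST:
                    entries[word].append((loc, text))
        for word in sorted(entries):        # the alphabetical sort pass
            print(f"{word.upper()}  ({len(entries[word])})")
            for loc, text in entries[word]:
                print(f"    {text}   {loc}")

    concord([("OISIN 1 5", "FOR LOVE OF USHEEN MY FEET RAN"),
             ("OISIN 1 7", "FOR LOVE OF OISIN FOAM-WET FEET")])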
[Fig. 3: A page from a concordance to the poems of Yeats, printed in upper case on a high-speed printer. Entries run from "loud" through "louder," "loudly," "lough," "Loughlan," "lounging," "lour," "lout," and "love," each occurrence shown with its context line, poem title, and line number. (From Parrish and Painter, 1963.)]
What is important of course is that when the entries for "went" are regrouped with "come," there be a cross-reference entry to indicate that practice. If the cross-reference text, "See under 'come'," is made the context for a single entry under "went," that problem is solved easily enough.

What is described here, it should be said, is not intended to represent the task of making a concordance as ridiculously easy or to pretend there are arcane reasons for editorial decisions made in preparing a concordance which make complete automation impossible; rather, the production of a concordance is basically a computationally simple task which includes some hard work the computer is generally not asked to do. Another purpose for dwelling here on basic concordance-making is to acquaint the reader with the idea of a concordance, as well as the ideas of homographs and the different forms of words. These concepts will be important in understanding the other kinds of literary analysis done by computer.

The concordance is useful to literary scholars in that if a question arises over how a writer uses a particular word, the entry for that word can be looked up in a concordance and all instances in which the writer uses the word evaluated to synthesize what the writer means when he uses the word. By reference to a concordance rather than to the original work, individual words that suggest appeals to the senses can be traced in the writer's work; by finding the words in the concordance the investigator has a better chance of locating all the words he desires, rather than risk a mind-numbing search through a large body of work and having to consider each word every time it appears.

Approaches to concordance-making by computer have shown almost as much variety as the people preparing them. The most famous of the concordance-making enterprises is the Cornell Concordance series. Before the advent of computers, concordances were made at Cornell by teams of graduate students and local housewives working with 3 x 5 in. index cards; such a procedure was lengthy and tedious for those involved, but was an improvement over a single person doing the concordance alone (Smith, 1971). The Cornell Concordances, begun in the early 1960s under the general editorship of Stephen M. Parrish, have produced a series of concordances to the complete works of a number of writers. These concordances are produced camera-ready on the high-speed printer, which with the earlier concordances meant punctuation marks were generally omitted and the letters printed were all in upper case. Subsequent refinements of the machinery involved provide punctuation and the distinction
between upper and lower case, recently seen in the concordance to the poetry of Jonathan Swift (Shinagel, 1972). The distinction between upper and lower case, while not always so important that its absence seriously faults the work, is always desirable. The users of concordances are people, and literary people at that; literary people are used to the upper- and lower-case distinction, as well as the distinction between roman and italic type. The makers of concordances who use a computer, faced with the limited character sets of high-speed print chains, compromised at first because they had to. The alternative was to have the computer output hand-set by linotype, which would have introduced errors and held up production of the concordance as well as adding substantially to the cost.

Recently concordances have been appearing which, although they were produced by computer, have all the advantages and attractiveness of typeset books. These concordances have been filmset, a process by which material is composed on film guided by appropriate programs that sense in the input stream "escape codes" which indicate a change in type font, size, or special characters. As an attractive concordance, Ingram and Swaim's (1972) A Concordance to Milton's English Poetry is a visual delight; not only are capitals and lower case present, but roman, italic, and boldface letters help guide the eye (see Fig. 4). High-frequency words are omitted, but such omissions are few, reflecting the trend in concordance-making to exclude as few words as possible. However, there is no consistency in treating homographs. The concordance also lacks what has become a standard item among computer-produced concordances: a frequency list for all the words. Professor Trevor Howard-Hill's Concordances to the First Folio of Shakespeare, each volume of which is devoted to a different play, is filmset also.² The obvious rightness of producing filmset computer concordances has yet to be universally acknowledged, but it is apparent that the trend is there.

One concordance influenced the text it was keyed to because the project for the concordance was carried out while the edition being concorded was in preparation (Spevack, 1968-1970). It was based on G. Blakemore Evans's The Works of Shakespeare. Concordance and edition were in simultaneous production, and several decisions about the text being edited were influenced by the preliminary output from the concordance. Perhaps it is an error to speak of "the concordance," for Spevack's work is really several concordances, one to each piece by Shakespeare and one to the total corpus. Interestingly, no lines of context are given with the entries in the concordance, which makes the distinction between upper and lower case unnecessary.

² Published one play per volume by Oxford University Press, 1970-1972.
[Fig. 4 appears here: a filmset concordance page in roman, italic, and boldface type; entries run from "silken" through "Siloa," "Silvan," "silver," "Simeon," "similitude," "Simon," "simple," "sin," and "since," each with its context line and a poem and line reference.]
FIG. 4. A page from a concordance to Milton. (From Ingram and Swaim, 1972, © Oxford Univ. Press, by permission of the Clarendon Press, Oxford.)
A nice feature of the work is a concordance for each character, indicating each word the character in the play speaks, the number of speeches, how many of these lines are in verse or prose, the total number of words a character speaks, and the number of different words a character uses. These concordances will undoubtedly help studies of individual characters in the plays; for example, a study of the relative frequency of the kinds of words Hamlet uses in Hamlet, Prince of Denmark may help us better understand this complex character. Are his words primarily self-abusive, despondent, uncertain, calculating, or what? Spevack's concordance also has word-frequency lists and reverse-spelling word indices to facilitate, for example, a study of Shakespeare's use of -ing words, and homographs are clearly indicated.

Misek (1972) shows what can be done in concordance-making with a little imagination and sensitivity towards intended users. The body of the concordance is augmented by an indication, for each line of the text, of the speaker of the line and to whom he is speaking, a tagging Professor Misek did by hand. This work gives every word in the poem with no omissions, and provides a number of excellent charts to summarize information. The work comes very close (through the charts) to computational stylistics, which is the next area of poetry analysis by computer we will investigate.

6. Stylistic Analysis
It may be overstating the point to claim that the single most important result of a computer-produced concordance project is that literary material is thus put into computer-readable form so that more sophisticated kinds of literary analysis can take place. That is, as valuable as the concordance is as a tool for the analysis of poetry, the other kinds of analysis done by computer generally rely on material put into computer-readable form so that a concordance can be made.

Stylistic analysis by computer can be an exercise in what is not characteristic of a poet's style, and although nonresults are actually quite valuable, the literary community does not quite know how to handle them, so they generally go unannounced. The expense of putting a large amount of material into computer-readable form is balanced against the risk of nonresults, and I have known several would-be investigators who felt the encoding of literary data too high a price to pay for the kinds of results they might get back. One reason for such an attitude on the part of literary data processors when the question of stylistic analysis comes up is that the discipline of literature has a rather fuzzy-set notion of what style means, but researchers in literature are generally not formally oriented enough to take
advantage of the work by Zadeh (1965, 1972) and others in characterizing their area of investigation. (Fuzzy sets are "classes of objects in which the transition from membership to non-membership is gradual rather than abrupt.") Elements of style in poetry belong by degrees to that sense of identity we find in the work of an author and which we call the author's style; within an author's lifetime (even one as short as Keats' five active years) critics may speak of changes in the poet's style, but with the usually implicit assumption that there is a thing such as an overall style. There have been notable attempts to establish precisely the concept of style and other attempts to encourage abandonment of the term altogether. Abandonment of the term "style" is of course no solution to the problem we feel which motivates us to use the word.

Two investigators, Ross and Rasche (1972), have assembled a system of routines for the description of style they call EYEBALL. It is their belief that certain kinds of literary statistics are indicators of style and can be the data a literary investigator uses to develop statements about what a writer does. EYEBALL accepts the piece of poetry or prose (the system can handle both) in a form requiring only a minimum of marking: if the input is poetry, the ends of lines are marked with a slash, /. This is because EYEBALL treats the input as a stream of characters, and unless instructed to do otherwise the limitations of each physical record are irrelevant to EYEBALL. This decision regarding input is wise, for one problem in the discipline of literature has been the artificially different ways in which poetry and prose have been treated. My own work, described below, has convinced me that prose and poetry are tendencies in language and need to be studied both for those stylistic measures common to both as manifestations of verbal art and for what formal measures characterize their differentness from one another.

Ross and Rasche (1972) have provided a system which will find and list, for example, clauses with compound subjects, periodic sentences, or phrases containing more than one adjective. This information tells the investigator what the writer has selected to do within the context of language, and a study of such selection can lead us either to describe the individual writer's trademark (which is one thing we mean by "style"), or the presence in a writer's work of patterns characteristic of a group of writers (which is another meaning of the word "style").

This latter meaning of style as the presence of patterns characteristic of a group of poets motivated Green's study (1971) of formulas and syntax in Old English poetry. Old English poetry can roughly be said to be the poetry written in the English language before the Norman invasion, and in the interests of getting on with the discussion we will accept
that characterization. Old English poetry has a German look to it, as this example shows (the lines in parentheses are literal Modern English translations of the Old English lines they follow):

Caedmon's Hymn

Nu sculon herian heofon-rices Weard
(Now we shall praise heaven-kingdom's Guardian,)
Metodes meahta and his mod-gethanc,
(Creator's might and his mind-thought,)
weorc Wuldor-Faeder, swa he wundra gehwaes,
(work of the Glorious Father, as he wonders each one,)
ece Dryhten, or onstealde.
(eternal Lord, beginning established.)
He aerest scop ielda bearnum
(He first shaped for men's children)
Heofon to hrofe, halig Scyppend;
(Heaven as roof, holy Creator;)
tha middan-geard mon-cynnes Weard,
(then Middle-Earth, mankind's Guardian,)
ece Dryhten, aefter teode
(eternal Lord, afterwards made)
firum foldan Frea aelmihtig.
(for men earth Master almighty.)
The Modern English version of the lines is purposely a rather literal rendering of the Old English; in this way the two-part structure of the Old English line is most evident and it is easier to see that the hemistich (or half-line) is the unit of composition rather than the line. Green's study, by focusing on syntax rather than semantics, showed that the poems were constructed by techniques which are suited to and almost certainly grew out of oral composition. For example, in Caedmon's Hymn, the half-line "ece Dryhten" (eternal Lord) is a formula for expressing the concept of God that the poet is able to draw upon as he tells his poem, an idea that is associated with the frame Adjective Noun. Green found 30 such frames in an examination of 12,396 hemistichs from the range of Old English poetry, the repeated frames accounting for nearly 32% of the poetry examined. Such a high instance of a limited number of syntactic frames suggests that these frames were memorized by the poets as forms for expression, much in the same way that those of us who make up limericks remember that the pattern begins "There was . . . ,/Who . . . ," etc. Notably, Green does not claim the computer has helped him decide once and for all that extant Old English poetry was composed orally and later written down; he carefully establishes repetitions of syntactic frames and then attempts to account for such a phenomenon in the poetry.
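The core of such a measurement, a tally of repeated frames over the hemistichs, is brief to sketch. In the fragment below the part-of-speech tags and the sample half-lines are invented for illustration; they do not reproduce Green's actual coding scheme or his data.

    from collections import Counter

    # Each hemistich is reduced to its sequence of part-of-speech tags,
    # e.g. "ece Dryhten" (eternal Lord) becomes the frame ("ADJ", "NOUN").
    hemistichs = [
        ("ADJ", "NOUN"),      # ece Dryhten
        ("NOUN", "NOUN"),     # firum foldan (an assumed tagging)
        ("ADJ", "NOUN"),      # halig Scyppend
        ("VERB", "NOUN"),
    ]

    frames = Counter(hemistichs)
    repeated = {frame: n for frame, n in frames.items() if n > 1}
    coverage = sum(repeated.values()) / len(hemistichs)
    print(f"{len(repeated)} repeated frame(s) cover {coverage:.0%} of the hemistichs")

Run over 12,396 tagged half-lines instead of four, a tally of this shape yields Green's figures of 30 repeated frames covering nearly 32% of the corpus.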
Green's project involved considerable manual editing of the input data: the half-lines had to be analyzed for metrics and syntax, and this information entered along with the text into computer-readable form. Such manual editing of poetry to be processed by computer is fairly common, perhaps because existing systems such as EYEBALL are inappropriate to material such as Old English, or because the investigator did not have access to a machine on which an existing syntax analyzer worked. But there is another way to get from the limitations of the letters of the poem to the music of the poem.
7. Prosody
Dilligan's recent study (1972) employed a multistage approach which combined effective use of the computer with efficient use of the human investigator. His study was of quantitative verse, the basic rhythm of which is determined by the duration of sound in the utterance (long or short syllables) rather than the traditional system of the accent (strong or weak) each syllable takes, characteristic of most English poetry. Dilligan entered a dictionary of British pronunciation based upon Daniel Jones' English Pronouncing Dictionary into machine-readable form, and entered Gerard Manley Hopkins' entire 4800 lines of poetry. A concordance was prepared, and from that output distinctions were made between homographs such as "present" (to offer to someone) and "present" (as in a gift). From the updated text and dictionary a program produced "a fairly broad phonetic transcription of the text which accurately reflects all consonant and vowel sounds, lexical stress, and sentence stress insofar as this last was indicated on the updated text (Dilligan, 1972, p. 40)." This effort did not require quite as much hand-editing as the Old English study, although it should be clear that true recognition of words by a computer is far from an established fact in literary data processing. Still, Dilligan's technique took the computer "just over six minutes to produce a phonetic text of Hopkins's" poetry, the six minutes being the machine's time after 150 hours of work by two research assistants whose job it was to "transcribe, keypunch, and proofread the pronouncing dictionary (1972, p. 40)."

The transcribed text was then input to a scanning program, in which stress patterns were recognized and tabulated, and note taken as well of assonance (repetition of vowel sounds) and alliteration (repetition of consonant sounds). The computer was then used in its familiar role as tireless drudge, and the results sorted, cross-referenced, and listed in various tables. The same process was done for Robert Bridges' Ibant Obscuri, a translation into English quantitative verse of part of Book VI of
Virgil's Aeneid, and the purpose of the study was to use Hopkins' practice as a background against which to view Bridges' experiment with English quantitative verse. For readers whose interests are not prosody, Dilligan's results are not as easily summarized as Green's. After an interesting analysis of Bridges' practice, Dilligan pronounces the experiment with quantitative verse a success.
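The scanning step is easy to picture in miniature. The sketch below tallies alliteration and assonance within a single line; it works on plain spellings rather than on a phonetic transcription of the kind Dilligan used, so the vowel set and the reliance on initial letters are simplifying assumptions.

    from collections import Counter

    VOWELS = set("aeiou")   # stand-ins for the phonetic vowel symbols

    def alliteration(words):
        """Tally repeated initial consonant sounds within a line."""
        initials = Counter(w[0] for w in words if w and w[0] not in VOWELS)
        return {sound: n for sound, n in initials.items() if n > 1}

    def assonance(words):
        """Tally repeated vowel sounds within a line."""
        vowels = Counter(ch for w in words for ch in w if ch in VOWELS)
        return {sound: n for sound, n in vowels.items() if n > 1}

    line = "i caught this morning mornings minion".split()
    print(alliteration(line))   # {'m': 3}
    print(assonance(line))      # {'i': 6, 'o': 3}

Dilligan's scanner worked from phoneme strings, so its "sounds" were genuine speech sounds rather than letters; the tabulating logic, however, is of this simple counting kind.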
8. Literary Influence: Milton on Shelley
Joseph Raben and Philip H. Smith compared two authors to suggest the influence of one upon the other. The results were reported by Raben (1965). The study was a comparison of Milton's Paradise Lost and Shelley's Prometheus Unbound. Verbal correspondences between the two poems were sought "within the confines of single sentences. Even with such limitations, the program has produced tens of thousands of correspondences, most of them consisting of two words, but many ranging up as high as seventeen." Admitting that perhaps the lower orders of correspondence between the two poems "are probably of no great individual importance," Raben does point out that "the first sentence in both Paradise Lost and Prometheus Unbound contain God and world" and "the last correspondence located by the computer, at the end of each poem, is good and great (Raben, 1965, p. 242)."

The technique used was not reported in the Proceedings, however, but can be found in Smith (1971); the technique is of interest in that it illustrated another way in which the computer is taught to perform the kinds of analysis which previously were either very taxing to the human investigator or near impossible. Briefly, in Smith's technique a separate word index (indicating the word and sentence) was generated for Milton and for Shelley, and then the two were merged. The resulting list was then marked by Raben to indicate equivalents for each word which would actually be considered in the computer comparison of the poems. This marking is crucial, since if the scholar indicates an inappropriate word as an equivalent, the study is thus in error. For example, forms of the word "love" (such as "love," "loved," "lovely," etc.) were assigned "love" as an equivalent word, and a word such as "beloved" was also assigned "love" as an equivalent. The marking process served as an indication of the basic meaning of the word. Since common words (such as "and," "or," etc.) were omitted, they were assigned an asterisk (*) as their equivalent word to indicate that. The list of word equivalents, which Smith calls a "canonization dictionary," was processed against Milton's word index, and then against Shelley's word index, the output from each run being canonized forms of the words and the sentence number in which the word appears. These
lists were then matched, the output indicating the words and the number of the sentence within which each word appeared in Shelley and the number of the sentence in which it appeared in Milton. These records were then sorted by the sentence number for Milton, thus bringing together all words shared by the two in the same sentence of (say) Milton's Paradise Lost. This step should provide the evidence for verbal echoes of Milton in Shelley, for if a number of words Milton uses in sentence 4 are also used in sentence 237 of Shelley, there is the chance the shared words indicate Milton's influence on Shelley. Naturally if such a correspondence is very infrequent the best that can be said is that the influence is infrequent, and the worst that the shared words are a matter of chance. But Raben's results showed, as was mentioned earlier, tens of thousands of correspondences.

Sad to say, however, the mere presence of numbers of correspondences can be a matter of chance, and had Raben only the evidence of the numbers his conclusions could only be very tentative indeed. The number of correspondences could best be termed indicative if there were some reason to believe Shelley had knowledge of Milton, and as Raben (1965, pp. 230-232) demonstrates, Shelley knew Milton's work and was fond of it. This terribly important step of motivating the literary data processing done is, of course, a basic part of experimental design. Investigations of poetry by computer by and large have had a rationale behind them; unfortunately, I cannot say that all investigations have, and a somewhat outstanding example of what can result from a lack of sufficiently motivated research will serve as the next topic considered here.
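Smith's matching procedure can be miniaturized as follows. The canonization dictionary, the word indexes, and the sentence numbers below are all invented for the example (echoing the hypothetical sentence 4/sentence 237 pairing above); only the shape of the computation follows his description.

    # Canonization dictionary: word forms mapped to their equivalents;
    # "*" marks common words to be omitted from the comparison.
    canon = {"loved": "love", "lovely": "love", "beloved": "love",
             "and": "*", "or": "*", "the": "*"}

    def canonized_index(word_index):
        """word_index: (word, sentence number) pairs from one poet's word index."""
        pairs = set()
        for word, sentence in word_index:
            equivalent = canon.get(word, word)   # an unlisted word stands for itself
            if equivalent != "*":
                pairs.add((equivalent, sentence))
        return pairs

    milton  = canonized_index([("god", 4), ("world", 4), ("loved", 4), ("and", 4)])
    shelley = canonized_index([("god", 237), ("world", 237), ("lovely", 237)])

    # Match the two canonized indexes, then sort by Milton's sentence number,
    # bringing together all the words shared within one sentence of Milton's.
    matches = sorted((m_sent, word, s_sent)
                     for word, m_sent in milton
                     for word2, s_sent in shelley
                     if word == word2)
    for m_sent, word, s_sent in matches:
        print(f"Milton sentence {m_sent} / Shelley sentence {s_sent}: '{word}'")

The real runs, of course, produced tens of thousands of such matched records rather than three.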
9. A Statistical Analysis
The work I am about to discuss, I should like to point out, is not bad work; indeed, the technical aspect of the work seems quite good. What is unfortunate about the work is that it makes a reality the fears of some opponents of the use of the computer in literary analysis. Central to the analysis of literature is the hope that the results shed some light on what is being analyzed. For this reason literary scholars, at the same time they may admire tables of carefully compiled figures, have a nagging question at the back of their minds which demands to be answered; that question can best be phrased as, "So what?"

Sainte-Marie et al. (1973) present "a brief expository account of the technique [of principal component analysis], together with a summary of the rather striking results obtained from the pilot project." The application of the technique, if I understand what the authors are saying, is useful "for detecting differences and similarities [within an author's work] that are interesting from
a literary point of view, or for confirming the existence of such differences and similarities if they are already expected (Sainte-Marie et al., 1973, p. 137)." The article presents results graphically, and says they are suggestive, but does not attempt to answer the questions raised at all; the authors are pointedly not concerned "with offering conclusions about the work of Molière (Sainte-Marie et al., 1973, p. 137)." Statistics, after all, are not meaningful in and of themselves; they must be interpreted. Granted, their article is based on the results of a "pilot project," but then so was Raben's article regarding Milton's influence on Shelley, and Raben did relate his research to the poetry. The work on Molière was meant to suggest a technique which would be a useful addition to the tools of the literary scholar. The investigators in the Molière project, quite simply, have failed in the most important aspect of their work: to make their activity answer the question every aspect of literary analysis, including literary data processing, must speak to: So what?

A number of literary scholars are somewhat loath to excuse even concordance-making from the additional task of interpreting its results, although a concordance is an obvious aid to understanding what an author means by his words. Literary scholars are even critical of the kind of work bibliographers do in assembling a text which they hope most closely gives us the author's work, a text free of typographical and editorial errors which keep us from reading the author's words. The usefulness of principal component analysis as a statistical technique is not my objection; my objection is to the dangerous example of a presentation that stops short of truly demonstrating usefulness. Any methodology must be justified through its demonstrated usefulness in answering the question, So what?--in telling us, even within the limits of a pilot study, something about the literature being investigated. Such a stopping short of literary conclusions, if it continues and spreads, will surely render literary data processing purely an academic exercise in publication. I apologize to the authors of the Molière article for being so harsh on their work; I am personally very much a proponent of statistical applications to literary analysis and find what they did quite solid, but their lack of interpretation seems to me a most dangerous example, no matter how ultimately useful their method may be.
10. Mathematical and Statistical Modeling
Quite a different approach from Sainte-Marie's statistical inference within Molière is found in Kahn (1973b). Dr. Edward Kahn is a former graduate student in English and mathematics at the University of
California at Berkeley, and his work, unlike most of the work described so far, does not involve the computer processing natural language data; but it may signal a rather important direction for literary data processing in particular and literary studies in general.³ Kahn's work is in mathematical modelling of narrative, specifically the aspect of dramatic allegory in Edmund Spenser's epic, The Faerie Queene. The various characters in the poem were grouped into equivalence classes based on their common qualities, the sets being named Faeries, Titans, Paynims (pagans), etc. The relation among the sets was that of dominance; that is, a Faerie dominates (wins a battle with) a Paynim, and so on. Kahn uses the term "plot" to mean "a network of abstract relations that define the universe of discourse in which the narrative is apprehended," one relation here being that of dominance. A plot is then represented by "a directed graph where each node is understood as an object on which the relations are defined," and the directed graph representation is canonically transformed into a finite state automaton. By identifying his automata as semigroups, Kahn is able to program a computer (in SNOBOL) to construct the semigroups, the interpretation of which indicates the accuracy and success of the simulation performed. The value of Kahn's work is not found only in what it says about Spenser's poem regarding the much-discussed allegory there, but more importantly, in the general applicability of the technique to any narrative for the purpose of characterizing or typing that narrative. The study of types of narratives is of value in helping us understand better just how literature works its spell upon us.

My own work in literary data processing, like Kahn's work, attempts to say something about literature in general as well as the individual material examined. The work I refer to is on a theory of poetry which provides insight into the subtle yet important distinctions between poetry and prose.⁴ The theory basically focuses on the pauses in a poem generally known as caesuras and end-stops; these pauses are of the sort one finds marked by a slash (/) in the following examples:

1. The man who came to dinner/went home./
2. Bill,/ a healthy lad,/ laughed./

³ See Kahn (1973a) for a more lengthy and technical treatment of the work summarized here.
⁴ Presented at the Computer Science Conference, Columbus, Ohio, February 20-22, 1973.
In poetry, it seems, a rhythm of expectation, a periodicity of pause, is set up by the pauses which occur at the ends of the lines. Not every line needs to end in a pause for this to be true, nor does every line have to contain the same number of syllables; being close is enough for the
human ear. These same pauses set up a rhythm of expectation in prose as well, but the period established in most prose is approximately twice as long as the period for poetry. The shorter period of pause for poetry has nothing to do with whether the poetry is good or not; it has everything to do, however, with whether something is poetry or not. By characterizing the poetic line in this fashion one is better able to understand the development of poetry in English; and, even more interestingly, when they are taught the theory, students understand poetry better and have a better appreciation for it than when the importance of periodicity of pause in poetry is not demonstrated.

The role the computer plays in the development of this theory is that of examining bodies of prose and poetry with pauses marked in the text and determining which periodicity (if any) is marked by the pauses, as well as accumulating other information about them. My results, though at an early stage, indicate that periodicity of pause is indeed a fruitful way of discussing poetry and prose alike. Poetry and prose can be viewed as tendencies within a continuum of language, since what basically separates them is the difference in period length for pauses. Further, the approach provides a way of accounting for especially poetic passages in a piece which is obviously prose; one would expect (and it seems to be true) that the periodicity of pause for the "poetical" section of a prose piece should be nearer that for poetry than for prose. Characterizations have also been developed for the adjectives "poetic" and "prosaic" (meaning, respectively, "like poetry" and "like prose") which are used to discuss how regularly the periodicity of pause for the piece occurs; that is, how often the period is carried along by a pause being close to where one would expect it to be rather than at the exact location. This measure is the variance for the periodicity, and if that variance is near zero the piece is poetic; if not near zero, prosaic.

In addition to the theory of poetry just discussed I have been working on a computational model for stanzaic structures in poetry that has promise. The model, presented at the Second Computer Science Conference held in Detroit in February, 1974, is of the generic, or characteristic, stanza; the choice of the word generic is to suggest that particular stanzas in a given poem grow out of the characteristic stanza. Like the theory of poetry, the model of stanzaic structure depends heavily upon pauses in the poetry, this time confining itself to those at the ends of lines where one comes to a complete stop. By observing which lines end in full stops most frequently for a given stanza type one can derive a model for its substanzaic structure. For example, in a Shakespearean sonnet the full-stopped lines occur most often at lines four, eight, twelve, and fourteen, marking the places in the stanza where major shifts in the stanza
take place. Not every sonnet called Shakespearean has stops only at the ends of lines four, eight, twelve, and fourteen; in fact, few sonnets confine themselves so. But taken as a whole the sonnets clearly display the familiar pattern. This principle can be extended to any group of poems written in what one believes to be a single basic stanza form to yield similar results: Petrarchan sonnets, rhyme royal (as in Chaucer's Troilus and Criseyde), and Spenser's stanza in The Faerie Queene.

When one examines a poem composed of variable-length stanzas, as in the anonymous but important fourteenth-century poem Sir Gawain and the Green Knight, the strength and generality of the model show themselves most decidedly. Comparing variable lengths seems a fool's errand until a way is devised so the variable lengths are seen as being the same relative distance: all the way through the stanza. We can represent all the way through as the number 1, and if the stanza has L lines a full stop at the end of line E can be said to have occurred E/L of the way through. By computing E/L for all full-stopped lines in all stanzas of the poem we have a plethora of proportions that, if represented on the interval 0 to 1, show some groupings but apparently without cohesion. But if each proportion is multiplied by the arithmetic mean of the number of lines per stanza and the result grouped by closeness to lines, we can view the distribution of full-stopped lines for variable-length stanzas the same way we view them for fixed-length stanzas; significantly stopped lines then become generic lines of the generic stanza. Of course, when looking within particular stanzas for the lines identified as being significant, the lines of the model stanza are not true lines at all, but indicate the proportion of the way through a stanza the break occurs.

Applying the model outlined above to Sir Gawain and the Green Knight I was able to identify the overall stanza pattern for the poem as well as the various emphases given to the pattern in each of the four parts of the poem. Also, the stanza in the poem that seems to disrupt the orderliness of the poem's structure turns out to be the only stanza in which the audience is treated to an extended description of Morgan le Fay; Morgan le Fay is responsible for disrupting King Arthur's court by sending the Green Knight to challenge it. Gawain's response to the challenge engineered by Morgan le Fay is the heart of Sir Gawain and the Green Knight.
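The E/L computation just described reduces to a few lines of code. In the sketch below the stanza lengths and the full-stopped line numbers are invented for illustration; they are not figures from Sir Gawain and the Green Knight.

    from statistics import mean

    # Each stanza: (number of lines, line numbers within it ending in a full stop).
    stanzas = [(19, [4, 15, 19]), (22, [5, 17, 22]), (25, [6, 19, 25])]

    mean_length = mean(length for length, _ in stanzas)
    proportions = [end / length for length, stops in stanzas for end in stops]

    # Scale each proportion E/L by the mean stanza length and group the
    # results by closeness to a line: the "generic lines" of the generic stanza.
    generic_lines = sorted({round(p * mean_length) for p in proportions})
    print(f"generic stanza of about {mean_length:.0f} lines;"
          f" significant stops near lines {generic_lines}")

For these invented figures the sketch reports a generic stanza of about 22 lines with significant stops near lines 5, 17, and 22.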
11. Textual Bibliography
The last area to be discussed here, in a way, cannot properly be called a topic in the analysis of poetry, although it is certainly the foundation
68
JAMES JOYCE
of good literary analysis. I am referring to that part of bibliography concerned with establishing the text of an author's work. Establishing a text is not the same as establishing the identity of the author of a text. Rather, it means that the scholar attempts to put together a text which represents what the poet produced as a poem. This kind of work takes place when there is reason to believe that a poem has been altered in some way, either through a printer's typographical mistake or an editor's changes over which the poet did not have control. These conditions are surprisingly frequent in literature; it is widely believed that few major poets are represented to us by texts which are what the poet wrote.

What, then, do we have to read, and how inaccurate is it? Both are very good questions. Perhaps it is enough here to say that we do not have Keats' poem to read when we read the version of The Eve of Saint Agnes most often published as Keats' poem, or Shakespeare's play (you pick your favorite, for it is true of all of them). Some poets, such as Emily Dickinson, had their punctuation "corrected" by well-meaning editors who felt it was for the best to do so; but the Dickinson Variorum with the original punctuation shows the dramatic energy in the poetry even better than conventional punctuation of her poetry shows it.

Computers are being used in several of the steps a bibliographer must take to arrive at the best text of a poem, including collation, identification of differences among texts, and production of the text itself. Widmann's (1971) use of the computer to collate "some 80 to 120 editions" of A Midsummer Night's Dream is one approach to the collation problem. The various editions of the play were entered into computer-readable form and, after proofreading, were compared, line for line and within each line word for word (including punctuation). Since Widmann's program evidently restricted itself to the environment of a line in the word-for-word comparison, the problem of determining automatically which words were omitted from a given line was greatly reduced. In producing the output of the collation Widmann printed all versions of the same line together, beginning with the line from a specific edition which serves as the basis from which other comparisons are made. In printing subsequent lines only those words which differed from those in the first version of the line were printed, making it considerably easier for the human editor to see where there were differences among the texts.

Being able to automate the collation process as described above is quite valuable, for it is only after all the editions have been compared that the literary detective work of constructing a critical edition can take place. Construction of a critical edition involves analysis of the corpus of variants produced by collation, an analysis which is frequently done by hand even after the computer has produced the list of variants.
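Widmann's style of output, the base reading followed by only the words that differ in each later edition, can be sketched directly. The edition labels and the readings below are invented; the bracketed figure gives the word's position in the line.

    def collate(base_line, other_editions):
        """Print the base reading, then for each other edition only the
        words that differ from the base, position by position."""
        base = base_line.split()
        print("BASE  " + base_line)
        for edition, line in other_editions:
            words = line.split()
            diffs = [(i + 1, w) for i, w in enumerate(words)
                     if i >= len(base) or w != base[i]]
            if diffs:
                print(f"{edition}    " + "  ".join(f"{w}[{i}]" for i, w in diffs))

    collate("The course of true love never did run smooth",
            [("ED2", "The course of true love never did runne smooth"),
             ("ED3", "The course of true love nere did run smooth")])

Restricting the comparison to the environment of a single line, as Widmann did, is what keeps the word-by-word matching simple; a word dropped from one line never has to be hunted for in the next.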
Peavler (1974) is at work on a statistical analysis by computer of the corpus of variants for several of Chaucer's short poems, using material supplied to him by Professor George Pace, editor of the Variorum Edition of Chaucer's shorter poems. Pace had collated the various manuscripts of the short poems by hand and was curious whether the computer might be of help in showing the degree to which one manuscript was similar to another. Peavler's technique was to transform the manuscript readings into an array, the rows representing the manuscripts and the columns the individual readings, the array entries indicating whether the manuscript had that reading or not. A FORTRAN program then compared each manuscript against every other manuscript and indicated the number of shared readings; this output was both printed and sorted by the number of shared readings so that manuscript pairs were arranged in such a way that the pairs of manuscripts which agreed the most were listed first, and so on, until the pairs of manuscripts which agreed the least were listed. These kinds of groupings could be useful in determining genetic relationships between manuscripts, such as "manuscript A is a direct copy, or descendant, of manuscript B." Peavler has not indicated the extent to which he has tried to have a program suggest genetic relationships among the manuscripts, although he does feel that the work of making editorial decisions should not be done by a program.
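The comparison Peavler describes is a straightforward pass over a binary array. In the sketch below the manuscript names and the reading array are invented for illustration (the original program was in FORTRAN).

    from itertools import combinations

    # Rows are manuscripts, columns individual readings;
    # 1 means the manuscript has that reading, 0 that it does not.
    readings = {
        "A": [1, 0, 1, 1, 0, 1],
        "B": [1, 0, 1, 1, 0, 0],
        "C": [0, 1, 0, 1, 1, 0],
    }

    pairs = []
    for (m1, r1), (m2, r2) in combinations(readings.items(), 2):
        shared = sum(a and b for a, b in zip(r1, r2))
        pairs.append((shared, m1, m2))

    # Sort so that the pairs of manuscripts which agree most come first.
    for shared, m1, m2 in sorted(pairs, reverse=True):
        print(f"{m1}-{m2}: {shared} shared readings")

Here the sketch reports A-B with three shared readings ahead of A-C and B-C with one each, the sort of ordering from which genetic conjectures might begin.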
12. Conclusion
This survey of current work in poetry generation and analysis has attempted to show the spectrum of activities and to convey a sense of why such work is undertaken. To predict where literary data processing is going seems unnecessary and perhaps rash. There are a number of courses at least touching on literary data processing in universities throughout the country, and computer research sections are established within the framework of many literary conferences. In England last year (1973) the Association for Literary and Linguistic Computing was founded, with an East Coast branch (Professor Joseph Raben) and a West Coast branch (Professor Rudolf Hirschmann) in the U.S.A., and the University of Waterloo now has a special interest group, WATCHUM (Waterloo Computing in the Humanities), organized to further humanistic computing at that university. At Stanford recently an interactive, open-ended question-and-answer program has been developed to lead students in the freshman English program to write poetry--not computer-generated poetry, but poetry written by the students in response to conversations they carried out with the program. In none of the work reported above has
there been the least suggestion that computational analysis should replace a human either in writing or in reading the poem, and that is as it should be. The examples of poetry generation and analysis given here demonstrate that the computer can serve both as a medium for an artist and as a useful colleague who does the repetitious shuffling and counting within poetry needed by investigators who want to know better what and how a poet means. Literary data processing is obviously growing, developing more computational techniques which will help the scholar and reader in the quest to understand, to hear, what the poet has to say.

REFERENCES

Adrian, M. (1969). In "Cybernetic Serendipity" (J. Reichardt, ed.), p. 53. Praeger, New York.
Dilligan, R. J. (1972). Ibant Obscuri: Robert Bridges' experiment in English quantitative verse. Style 6, 38-65.
Fajman, R., and Borgelt, J. (1973). WYLBUR: An interactive text editing and remote job entry system. Commun. ACM 16, 314-322.
Gaskins, R. (1973). Haiku are like trolleys (there'll be another one along in a moment). In "Computer Poems" (R. W. Bailey, ed.), pp. 16-19. Potagannissing Press, Drummond Island, Michigan.
Green, D. C. (1971). Formulas and syntax in Old English poetry: A computer study. Comput. Humanities 6, 85-93.
Ingram, W., and Swaim, K. (1972). "A Concordance to Milton's English Poetry." Oxford Univ. Press, London and New York.
Kahn, E. (1973a). Finite state models of plot complexity. Poetics 9, 5-20.
Kahn, E. (1973b). Algebraic analysis for narrative. (Unpublished.)
Kilgannon, P. (1973). In "Computer Poems" (R. W. Bailey, ed.), pp. 22-31. Potagannissing Press, Drummond Island, Michigan.
Milic, L. T. (1971). On the possible usefulness of poetry generation. In "The Computer in Literary and Linguistic Research" (R. A. Wisbey, ed.), p. 170. Cambridge Univ. Press, London and New York.
Milic, L. T. (1973). In "Computer Poems" (R. W. Bailey, ed.), pp. 37-40. Potagannissing Press, Drummond Island, Michigan.
Misek, L. (1972). "Context Concordance to John Milton's 'Paradise Lost'." Andrew R. Jennings Computing Center, Case Western Reserve University, Cleveland, Ohio.
Parrish, S. M., and Painter, J. A. (1963). "A Concordance to the Poems of W. B. Yeats," p. 477. Cornell Univ. Press, Ithaca, New York.
Peavler, J. M. (1974). Analysis of corpora of variations. Comput. Humanities 8, 153-159.
Raben, J. (1965). A computer-aided investigation of literary influence: Milton to Shelley. In "Literary Data Processing" (J. B. Bessinger et al., eds.), pp. 230-274. Materials Center, Modern Language Ass., New York.
Ross, D., Jr., and Rasche, R. H. (1972). EYEBALL: A computer program for description of style. Comput. Humanities 6, 213-221.
Sainte-Marie, P., Robillard, P., and Bratley, P. (1973). An application of principal component analysis to the works of Molière. Comput. Humanities 7, 131.
Shinagel, M., ed. (1972). "A Concordance to the Poems of Jonathan Swift." Cornell Univ. Press, Ithaca, New York.
Smith, B. H. (1968). "Poetic Closure," p. 268. Univ. of Chicago Press, Chicago.
Smith, P. H., Jr. (1971). Concordances and word indexes. In "Literary Data Processing" (V. Dearing et al., eds.), IBM Publ. No. GE20-0383-0, pp. 14; 64-70. IBM, Yorktown Heights, New York.
Spevack, M. (1968-1970). "A Complete and Systematic Concordance to the Works of Shakespeare," 6 vols. George Olms, Hildesheim.
Widmann, R. L. (1971). The computer in historical collation: Use of the IBM 360/75 in collating multiple editions of A Midsummer Night's Dream. In "The Computer in Literary and Linguistic Research" (R. A. Wisbey, ed.), p. 57. Cambridge Univ. Press, London and New York.
Zadeh, L. A. (1965). Fuzzy sets. Inform. Contr. 8, 338-353.
Zadeh, L. A. (1972). "Outline of a New Approach to the Analysis of Complex Systems and Decision Processes," Electron. Res. Lab. Memo ERL-M342. Univ. of California, Berkeley.

BIBLIOGRAPHY

The fine bibliographies which have appeared in Computers and the Humanities scarcely need reproduction here; those who wish to see a list of "everything that has been thought and said" can consult those bibliographies. Rather, I would like to indicate items for those interested in reading more on the topics of poetry generation and analysis. The items given here sometimes repeat items from the reference list, but not all items cited in the article are given here--just those which would help a reader understand more in depth the variety of computer applications to poetry (and prose).

Bailey, R. W., ed. (1973). "Computer Poems." Potagannissing Press, Drummond Island, Michigan. Available from the editor for $2.25 postpaid, 1609 Cambridge Road, Ann Arbor, Michigan 48104. Good collection of computer-produced or inspired poems.
Bessinger, J. B., Parrish, S. M., and Arder, H. F., eds. (1965). "Literary Data Processing Conference Proceedings." Materials Center, Modern Language Association, 4 Washington Place, New York, New York 10003. Good collection of papers illustrating various literary applications.
Dearing, V., Kay, M., Raben, J., and Smith, P. H., eds. (1971). "Literary Data Processing," IBM Publ. No. GE20-0383. IBM, Yorktown Heights, New York. A good nontechnical introduction to the computer as a tool in natural language research.
Doležel, L., and Bailey, R. W. (1969). "Statistics and Style." Amer. Elsevier, New York. Collection of articles concerning the application of mathematical models and statistical techniques to the study of literary style, not all studies computer-related.
Leed, J. (1966). "The Computer and Literary Style." Kent State Univ. Press, Kent, Ohio. Collection of papers reporting computer-assisted investigations of literary style.
Mitchell, J. L., ed. (1974). "Proceedings of the International Conference on Computers in the Humanities." Univ. of Edinburgh Press, Edinburgh (in press). Selected papers of the conference held July 20-22, 1973, at the University of Minnesota.
Reichardt, J., ed. (1969). "Cybernetic Serendipity." Praeger, New York. Interesting collection of computer-produced art.
Sedelow, S. Y. (1970). The computer in the humanities and fine arts. Comput. Surv. 2, 89-110. An overview of the roles the computer plays in art, architecture, music, literature, and language.
Wisbey, R. A., ed. (1971). "The Computer in Literary and Linguistic Research." Cambridge Univ. Press, London and New York. Covers applications to lexicography, textual editing, vocabulary studies, stylistic analysis, and language learning.
Principal Journals
Bulletin of the Association for Literary and Linguistic Computing. Quarterly. A new journal based in England.
Computers and the Humanities (J. Raben, ed.). Five issues a year. Devoted to the use of computers in the humanities. Articles range from surveys of developments to fairly technical applications. Contains an Annual Survey of Recent Developments, an Annual Bibliography of studies in the humanities using a computer, and twice a year a Directory of Scholars Active, which describes ongoing projects.
Computer Studies in the Humanities and Verbal Behavior (S. Y. Sedelow, ed.). Quarterly. Much more exclusively concerned with language than Computers and the Humanities, and its articles strike me as much more of a technical nature.
Mapping and Computers
PATRICIA FULTON
U.S. Geological Survey
12201 Sunrise Valley Drive
Reston, Virginia
1. Introduction
2. History
3. What Is a Map?
4. The Earth Ellipsoid
5. The Geoid
6. Geodetic Datum
7. Geodetic Surveys
   7.1 Horizontal Networks
   7.2 Vertical Networks
   7.3 Early Computations
8. Satellite Geodesy
9. Photogrammetry
10. Projections
11. Cartography
12. Data Banks
13. Future Trends
14. Conclusions
References

1. Introduction
Computers have become such an integral part of the mapping process that it is now virtually impossible to consider mapping without computers. This is true of all areas involved: production of conventional maps, research on new applications, data processing for the auxiliary systems, and nontechnical or administrative utilization. Although this has happened within the span of a few years, this union of computers and mapping is deep rooted. It is a point of pride within the mapping agencies that the earliest use of computers was not only accepted but actually encouraged and promoted by their enthusiastic adoption of what was then an original and startling innovation.
2. History
The first map used by man unquestionably predates all written records.
It was in all likelihood a line drawing made with a sharp stick in soft sand or earth. It undoubtedly gave general information about major landmarks such as mountains and rivers and specific details on dry caves or the salt pans valued by a primitive tribe. As mankind evolved, so did his mapmaking skills, and this same information was conveyed in a more permanent form by being scratched or marked in some way on animal hides, clay tablets, and stone. Judging from some of the maps still in existence today which were made by primitive tribes, it can be assumed that there were symbols for different kinds of plants, animals, seasons of the year, or times of the month. These graphics are rather stylized and can be learned with very little effort. In fact, there is not too much difference between the basic idea of those very early primitive maps and the maps of today.

Maps are more important and more widely used today than they have ever been in the history of mankind. As man's social structure grew more complex, his need for information increased correspondingly. A part of this expanded need included the need for better and more precise maps. The ancient Egyptian civilization which grew up along the banks of the Nile is a prime example of the way this interrelationship grew. The yearly inundation that renewed the fertility of the fields also washed away the markers that identified the ownership of the fields and enabled the tax collectors to procure their revenue. And so, of necessity, surveying was born.

The Egyptians, however, were not the only ancient peoples who contributed to mapmaking. The various nations who occupied early Mesopotamia made a very substantial contribution to mapping. The Chaldeans, with their great interest in astrology, divided the circle into 360° and simultaneously established the sexagesimal system, which we still use today in mapping. Of equal or even perhaps greater importance, the old Sumerians used the same small set of symbols for all the numbers but indicated the exact value by position. This is essentially the same system in use today for computations with arabic numerals. These people also knew the so-called Pythagorean theorem, not only for special cases, but in general.

Pythagoras, born in approximately 582 B.C., considered the world to be a globe. Aristotle also argued that the earth was, of necessity, spherical. Eratosthenes (276-195 B.C.) did something practical about the theory and conducted experiments from which he derived the circumference of the earth. He made use of the fact that in Syene, in upper Egypt, the rays of the sun shone vertically into a well. At this same time, the summer solstice, they fell at an angle in Alexandria, Egypt. Measuring the angle and the distance between the towns, Eratosthenes arrived at a figure remarkably close to the circumference known today. His figure was too large by about 16%.

After Columbus discovered the new world and Magellan's expedition circumnavigated the globe, there was no longer any question about the shape of the earth. In general, everyone agreed that it was spherical. It was not long, however, until men were attempting to define the exact kind of a sphere upon which they lived. In fact, by the 17th and 18th centuries there was a very lively dispute between the English and the French. Was the earth prolate, flattened at the equator, or was it oblate, flattened at the poles? The argument was settled in the 1700's by the French Academy of Sciences, which sent investigative expeditions to South America and to Lapland. The measurements made by these expeditions proved that the earth was an oblate spheroid.

In 1849 Aime Laussedat, a French Army officer in the Corps of Engineers, decided that photography could be used to create maps. He worked for years on this project, and when he was finally done, in the late 1860's or early 1870's, he had defined many of the basic principles that still hold true today. The first application by Americans was probably during the Civil War, when balloons were sent up with cameras to photograph enemy positions. For the most part, balloons have been replaced by airplanes as a vehicle for the aerial cameras. However, a few balloons, which have the combined properties of a kite and a balloon, are still used for this task.

The history of computers as an everyday tool for mapping undoubtedly began with J. Presper Eckert and John W. Mauchly at the University of Pennsylvania, where they worked on the prototype of the electronic computer. This was the ENIAC (Electronic Numerical Integrator and Computer). It was funded through the Ballistics Research Laboratory and the U.S. Army. It took 30 months to complete, and when it was finished it was the only electronic computer in the world. It consisted of 47 panels, each 9 feet high, 2 feet wide, and 1 foot thick. The word size was 10 digits plus a sign. Input and output were chiefly via punched cards, although switches could be set and light displays could be read. Programs were hard wired into it. A stored program memory was later contributed by John von Neumann. Soon a second machine appeared, the EDVAC (Electronic Discrete Variable Computer). Eckert and Mauchly continued developing computers, including the Universal Automatic Computer, or UNIVAC. In time, this machine became available commercially and was shipped to waiting customers. The first unit went to the U.S. Census Bureau and the second to the U.S. Army Map Service.

The UNIVAC retained many of the features of its ancestor, the ENIAC. It was composed of a multitude of resistors, capacitors, and vacuum tubes. The memory was approximately 6 feet long, 5 feet wide, and 5 feet high inside. Repairs were made by simply walking inside the memory and replacing the necessary parts. It was actually a small room with its own door. There was a high console with flashing lights and a bank of 6 to 8 magnetic tape units for auxiliary storage. Input was from cards (round punches) and output on a printer. For most of its life the UNIVAC worked almost 24 hours a day on three shifts. By the time it was retired in 1962, it had made an indelible and irreversible impact on mapping. By that time it seemed that every government agency and every big laboratory of private industry had at least one large computer and dozens of smaller ones. The names of some of these computers have become household words, especially IBM; others, like the BRLESC at the Aberdeen Proving Grounds, were known only to a handful of users.
the earth. He made use of the fact that in Syene, in upper Egypt, the rays of the sun shone vertically into a well at the summer solstice, while at the same moment they fell at an angle in Alexandria, Egypt. Measuring the angle and the distance between the towns, Eratosthenes arrived at a figure remarkably close to the circumference known today; his figure was too large by about 16%.

After Columbus discovered the new world and Magellan circumnavigated the globe, there was no longer any question about the shape of the earth: in general, everyone agreed that it was spherical. It was not long, however, until men were attempting to define the exact kind of sphere upon which they lived. In fact, by the 17th and 18th centuries there was a very lively dispute between the English and the French. Was the earth prolate, flattened at the equator, or was it oblate, flattened at the poles? The argument was settled in the 18th century by the French Academy of Sciences, which sent investigative expeditions to South America and to Lapland. The measurements made by these expeditions proved that the earth is an oblate spheroid.

In 1849 Aime Laussedat, a French Army officer in the Corps of Engineers, decided that photography could be used to create maps. He worked for years on this project, and when he was finally done, in the late 1860's or early 1870's, he had defined many of the basic principles that still hold true today. The first application by Americans was probably during the Civil War, when balloons were sent up with cameras to photograph enemy positions. For the most part, balloons have since been replaced by airplanes as the vehicle for aerial cameras, although a few balloons, which have the combined properties of a kite and a balloon, are still used for this task.

The history of computers as an everyday tool for mapping undoubtedly began with J. Presper Eckert and John W. Mauchly at the University of Pennsylvania, where they worked on the prototype of the electronic computer. This was the ENIAC (Electronic Numerical Integrator and Computer), funded through the Ballistics Research Laboratory and the U.S. Army. It took 30 months to complete, and when it was finished it was the only electronic computer in the world. It consisted of 47 panels, each 9 feet high, 2 feet wide, and 1 foot thick. The word size was 10 digits plus a sign. Input and output were chiefly via punched cards, although switches could be set and light displays could be read. Programs were hard wired into it; the stored-program concept was later contributed by John von Neumann. Soon a second machine appeared, the EDVAC (Electronic Discrete Variable Computer). Eckert and Mauchly continued developing computers, including the Universal Automatic Computer, or UNIVAC. In time, this machine became available
commercially and was shipped to waiting customers. The first unit went to the U.S. Census Bureau and the second to the U.S. Army Map Service. The UNIVAC retained many of the features of its ancestor, the ENIAC. It was composed of a multitude of resistors, capacitors, and vacuum tubes. The memory was approximately 6 feet long, 5 feet wide, and 5 feet high inside; repairs were made by simply walking inside the memory, which was actually a small room with its own door, and replacing the necessary parts. There was a high console with flashing lights and a bank of 6 to 8 magnetic tape units for auxiliary storage. Input was from cards (round punches) and output on a printer. For most of its life the UNIVAC worked almost 24 hours a day on three shifts. By the time it was retired in 1962, it had made an indelible and irreversible impact on mapping. By that time it seemed that every government agency and every big laboratory in private industry had at least one large computer and dozens of smaller ones. The names of some of these computers, especially IBM, have become household words; others, like the BRLESC at the Aberdeen Proving Ground, were known only to a handful of users.
3. What Is a Map?

A map is a graphic representation of a part of the earth's surface. It is also a selective representation, in that some features are emphasized to serve a particular purpose. The familiar road map is a ubiquitous example. There are many types of derivative product maps used by geographers and demographers, for example, maps that show population density by clusters of little black dots, or livestock production by pictures of cows and sheep. Such maps can illustrate a situation or condition that might otherwise be awkward to describe. Weather maps show storm centers, isobars, and isotherms for certain areas. Hydrographic maps are concerned with water: the courses of streams and rivers and the outlines of lakes and ponds. Nautical charts delineate underwater features as well as shoreline details. These charts may show bays and harbors so that ships and boats may travel safely without going aground. They may show contours on the ocean floor which depict mountain ranges and volcanoes that are still submerged. They may outline the oyster beds in Chesapeake Bay, or they may trace the routes of the Chinook salmon on its way to the spawning grounds in the Northwest.

One of the most widely used maps is the topographic map. A topographic map is also a graphic of the earth's surface, but it must be plotted to a definite scale. It represents the slope and character of the terrain; that is, it shows planimetry and terrain relief. Planimetric features are roads, houses, the outlines of fields, and city boundaries. Terrain relief is depicted by contour lines with labeled elevations and spot elevations.
4. The Earth Ellipsoid
The best way to appreciate the all-pervasive influence of computers on mapping is to look at each phase of the mapping process. It all begins when people need to examine part of the earth's surface in detail. The steps should be considered in the natural order of their occurrence, beginning with a look at the earth itself.

The earth can be considered an ellipsoid of revolution, a figure obtained by rotating an ellipse about its shorter axis. The flattening is the amount by which the ellipsoid differs from a sphere. Thus, two dimensions uniquely define an ellipsoid of revolution; by traditional usage, the semimajor axis and the flattening serve this purpose. An ellipsoid of revolution is an ideal mathematical figure, and many of the relationships can be expressed by equations in closed form, which facilitates the computations. The shape of the earth, as everyone is well aware, is far from this ideal figure. Yet to compute distances and directions with any fidelity and still remain manageable, the computations must be reduced to a figure like the ellipsoid. This fact is a primary and basic reason for the merging of computers into mapmaking procedures. In earlier years, the simple expedient of using different ellipsoids for different parts of the earth served the purpose superbly well: for any one area of the earth, its chosen ellipsoid fits, computationally and practically, better than any other method so far devised. Let it now be emphasized that when computations are discussed, they are made on the ellipsoid, the idealized geometric figure of the earth. The following is a list of some of these ellipsoids and where they are used:

Ellipsoid                         Area
Clarke 1866                       North America
Brazil                            South America east of the Andes
Modified Airy                     Ireland
Airy (British)                    Great Britain
Clarke 1880                       Africa south of the Sahara
World Geodetic System 1966        Soviet Union
Everest                           India
International                     Pacific and Atlantic Oceans
As can be seen from the dates, some of these ellipsoids were determined
a century ago. The methods used were quite primitive by modern standards. The newest methods only date from the 1950’s.
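Because the semimajor axis and flattening define the figure completely, every other constant of a reference ellipsoid follows in closed form. A minimal Python sketch (a modern illustration added here, not part of the original computing environment; the Clarke 1866 values are the commonly quoted ones):

    import math

    def ellipsoid_constants(a, inv_f):
        """Derive the semiminor axis and first eccentricity from the
        semimajor axis a and the reciprocal flattening 1/f."""
        f = 1.0 / inv_f     # flattening: departure of the ellipsoid from a sphere
        b = a * (1.0 - f)   # semiminor axis
        e2 = f * (2.0 - f)  # square of the first eccentricity
        return b, math.sqrt(e2)

    # Commonly quoted Clarke 1866 parameters, in meters.
    b, e = ellipsoid_constants(6378206.4, 294.9786982)
    print(b, e)   # about 6356583.8 m and 0.08227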
5. The Geoid
As everyone knows from observation, the earth is not at all an idealized geometric solid. A map, to merit its name under the present definition, must relate to ground truth in some measurable dimensions. Thus, another surface must be defined and utilized; it is denoted as the geoid. The geoid is often likened to the mean sea-level surface. This means that if the oceans were free to flow over the entire surface of the earth and to adjust to the effects of gravity and centrifugal force, their resultant shape would be that of the geoid. It is also known that the earth's mass is not evenly distributed, and for this reason the surface of the geoid itself is irregular. When outlines of the ellipsoid and the geoid are superimposed one upon the other, there are many discrepancies. These are termed geoid separations, geoid heights, or geoid undulations. More precisely, the geoid is an equipotential surface: the potential of gravity is everywhere equal on it, and the direction of gravity is always perpendicular to the surface. This last is particularly important in the positioning of measuring instruments; that is, the vertical axes of the instruments correspond to the direction of gravity. The difference between the perpendicular to the ellipsoid and the perpendicular to the geoid (the commonly known plumb line) is measured by an angle called the deflection of the vertical.

FIG. 1. Relationship of ellipsoid and geoid (the figure labels the perpendicular to the geoid, the center of the earth, and the semimajor axis a).
The science of geodesy is the study of this figure, the geoid.
6. Geodetic Datum
The horizontal datum is composed of the latitude and longitude of the initial point or origin of the datum and the geoid separation at that point. The datum also includes the azimuth, the angle of horizontal deviation measured clockwise from a standard direction (i.e., north or south), and the parameters, usually the semimajor axis and flattening, of the particular ellipsoid assigned to that datum. By this definition, the measurements are consistent within the system of any one datum. This means further that a datum is, by definition, a reference or base for other geodetic measurements.

The historical method of establishing a geodetic datum was to select a first-order triangulation station, preferably near the center of the network, and to designate it as the origin. Then the astronomical coordinates and azimuth at the datum origin were derived. By this method, the geoid and ellipsoid were defined as being coincident at the origin point. Or, to restate it, at the origin the deflection of the vertical and the separation between the ellipsoid and the geoid were zero. Another way of describing this is to say that the plumb line at the origin was normal to both the geoid and the ellipsoid. In point of fact, this is not the case at all: neither the real deflection nor the undulation is usually zero. At any rate, the final situation is one where the points within any one system are correct with respect to each other. Furthermore, although the deflection and the separation could be zero at the origin, this is not the situation at other stations in the network. In fact it is quite possible, and happens more frequently than not, that rather large discrepancies appear between the geodetic latitude and longitude and the corresponding astronomical latitude and longitude of a point. It was quite impossible to handle these problems except by approximations.

Because these differences can become unmanageable in certain situations, a second type of datum orientation, called the astrogeodetic orientation, has been developed. This second method is possible only because of the introduction of computers. In this type of orientation, the deflection of the vertical is not arbitrarily set to zero. Instead, it is the quantity computed from the least squares solution applied to the astronomic station observations within the network. By this method, the discrepancies of all stations are minimized. That is, when computers do the processing, the actual observations are reduced mathematically to give a reasonable and true model of the earth. Thus, geodesists no longer need to rely on a collage of approximations to estimate the earth's surface.

7. Geodetic Surveys
Surveys are the procedures for making measurements on the surface of the earth. Thus, any description of surveys should begin with an explanation of the different types of surveying that are carried out at present. The logical type to start with is geodetic surveying, for two reasons: (1) this is the type of surveying defined by law as the task of the federal mapping agencies; (2) perhaps more pertinently for this article, this is the first type of surveying that utilized the computer.

Geodetic surveying is the process of making measurements for the determination of the size and shape of the earth. The positions of points determined by these methods are of such a high degree of accuracy that they provide the basis for other surveys. In the United States, as in most countries, the government establishes the rules and regulations for maps. The responsibility for establishing geodetic control networks throughout the United States is charged to that branch of the National Oceanic and Atmospheric Administration (NOAA) which was formerly known as the Coast and Geodetic Survey and which is now known as the National Geodetic Survey (NGS).

Control surveys are of two main types: those concerned with horizontal positioning, and those concerned with leveling, or the elevation of points. Horizontal survey control networks can be established in any one of several ways. The preparation for surveying any control network requires a great deal of forethought and many technical decisions. The higher the order of control, the greater the accuracy demanded of it. Thus, first-order surveying necessitates the most care and attention. First-order triangulation must provide point positions which are accurate to one part in 1,000,000 in distance and orientation. A major first-order geodetic network spanning continents will, of course, cross all sorts of international political boundaries. The attendant problems are magnified and complicated in direct proportion to the size of the network.

7.1 Horizontal Networks
7.1.1 Astronomic Stations
The positions at which astronomic measurements are taken are called Laplace stations. Their purpose is to tie the geodetic networks together and fix them at specific points on the surface of the earth. These
measurements are astronomic latitude and longitude. The observations are made with optical instruments which are positioned perpendicular to the geoid; that is, the vertical axis of the instrument is coincident with the direction of gravity. For geodetic work, astronomic longitude is the difference in time between the moment a specific star is directly over the Greenwich meridian and the moment the same star is directly over the station. The most accurate chronometers available are carried along to measure the time at the points in the network. The exact time at which a star is over the prime or Greenwich meridian can be found in the star catalogs, of which there are several. Then the difference in time, or astronomic longitude, can be determined (see the sketch at the end of this subsection). These catalogs have been generated by computer programs, and the computed results are available both in printout form and on magnetic tape. Astronomic latitude is the angle between the perpendicular to the geoid and the plane of the equator. Geodesists who define and compute such parameters are concerned with the exact position of the North Pole (the axis of rotation) at the time of measurement. The situation is aggravated by the fact that the North Pole does not stay in the same spot but roams around in the area contiguous to the point generally considered to be the North Pole.
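The time-to-angle reduction itself is elementary, since the earth turns through 15 degrees of longitude per hour. A Python sketch with hypothetical numbers (real reductions also apply star-catalog and polar-motion corrections):

    def astronomic_longitude(dt_seconds):
        """Convert the chronometer time difference between a star's transit
        of Greenwich and its transit of the station into degrees of longitude."""
        return 15.0 * dt_seconds / 3600.0   # the earth turns 15 degrees per hour

    # A star transiting the station 18504 s after Greenwich: about 77.1 degrees west.
    print(astronomic_longitude(18504.0))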
7.1.2 Triangulation

The triangulation method is probably the oldest type of survey. It depends upon the geometric proposition that if one side of a triangle and all the angles are known, or in this case measured, the values of the remaining sides can be computed. When this process is extended by adding more triangles to form a chain, the area to be surveyed can be covered by a network of these triangles. It should be emphasized that the measurement of an angle at a point will often include the azimuth, the direction of that point relative to the North Pole. When the included side and two angles of the triangle are known, the solution of the triangle is known. For triangulation, however, all three angles of the triangle are usually measured, and multiple measurements are recorded to provide the most probable value. The stations of these triangles must be intervisible to the instruments taking measurements, and many times direct observation of all three points is not feasible; of course, with two angles and the included side, the third angle can be computed, and this is frequently done. The initial measurement of length in a triangulation network is known as the base line. From the earliest days of surveying until quite recently, it has always been easier and more economical to obtain accurate angular measurements than accurate linear measurements, hence the popularity of triangulation.
FIG. 2. Triangulation network. A, north base; B, south base; AB, baseline. Preliminary high-precision data: length of baseline AB; latitude and longitude of points A and B; and azimuth of line AB. Measured data: angles to new control points. Computed data: latitude and longitude of new points; and length and azimuth of new lines.
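The computation summarized in Fig. 2 is an application of the law of sines: with the baseline and the measured angles in hand, the remaining sides follow directly. A Python sketch with hypothetical field values:

    import math

    def solve_triangle(base, angle_a, angle_b):
        """Given the baseline and the two angles adjacent to it (degrees),
        return the remaining sides by the law of sines."""
        angle_c = 180.0 - angle_a - angle_b           # the angles sum to 180 degrees
        k = base / math.sin(math.radians(angle_c))    # common ratio side / sin(opposite angle)
        side_a = k * math.sin(math.radians(angle_a))  # side opposite angle A
        side_b = k * math.sin(math.radians(angle_b))  # side opposite angle B
        return side_a, side_b

    # Hypothetical data: a 10 km baseline and angles of 62.4 and 55.9 degrees.
    print(solve_triangle(10000.0, 62.4, 55.9))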
In flat terrain the requirement that the vertices of a given triangle be intervisible is met by building towers. The instruments are then placed on the tops of these towers, and the requisite measurements made. At present, this practice is seldom used for first-order surveys and is for the most part reserved for second- and third-order surveys.

7.1.3 Trilateration
Another method used for the higher order surveys is called trilateration. It also is expanded in the form of a network (Fig. 3). By this method, the angular values are obtained after measuring all the distances and then solving by the law of cosines. Multiple distance measurements are made at each station to provide the necessary accuracy and precision. Trilateration was not especially practical until the development of electronic distance-measuring equipment. Now it is considered the equal of triangulation; some of its proponents claim it even surpasses triangulation.
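The corresponding reduction for trilateration recovers the angles from the measured distances by the law of cosines; a Python sketch with hypothetical distances:

    import math

    def trilateration_angles(a, b, c):
        """Angles (degrees) of a triangle from its three measured sides,
        via the law of cosines: cos C = (a^2 + b^2 - c^2) / (2ab)."""
        def angle(p, q, r):   # angle opposite side r
            return math.degrees(math.acos((p * p + q * q - r * r) / (2.0 * p * q)))
        return angle(b, c, a), angle(a, c, b), angle(a, b, c)

    # Hypothetical measured distances in meters; the results should sum to 180.
    print(trilateration_angles(8012.5, 9430.2, 10317.8))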
7.1.4 Traverse

A traverse is the easiest way of extending a control network. The procedure starts at a known point with a known azimuth (direction) to
FIG. 3. Trilateration network. A, north base; B, south base; AB, baseline. Preliminary high-precision data: length of line AB; latitude and longitude of points A and B; and azimuth of line AB. Measured data: length of each line. Computed data: latitude and longitude of new points; and length and azimuth between new points.
another point. Then the surveyor measures the angles and distances between the points of this network. The direction of each line of the traverse is computed from the angular measurements, and the position of each control point is then computed from the length measurements of the lines. When the first station and the last station are coincident, it is called a closed traverse; the other type is the open traverse (Fig. 4). Until recently, a traverse was considered only good enough for secondary networks, and because it was an economical method, it was used extensively. However, where the new electronic distance-measuring instruments are used, a traverse can be as accurate as triangulation. In fact, the interior of Australia was measured by this means.

So far, the surveying methods mentioned, triangulation, trilateration, and traversing, have been used exclusively for horizontal control. All the measurements have been made on the apparent or topographic surface of the earth. Note that this is the third separate and distinct surface that is involved.

FIG. 4. Diagram of a traverse. The given traverse is closed; if it extended only to point C, either north or south of the dotted line, it would be an open traverse. A, north base; B, south base; AB, baseline. Preliminary data: latitude and longitude of point B; and azimuth of line AB. Measured data: lengths of the lines; and angles between the lines. Computed data: latitude and longitude of new points; and length and azimuth of new lines.
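The traverse computation reduces to accumulating coordinate changes leg by leg. A Python sketch with hypothetical azimuths and distances; for a closed traverse the computed end point should return to the start, and the small misclosure is then adjusted out of the network:

    import math

    def run_traverse(start, legs):
        """Propagate positions along a traverse; each leg is (azimuth in
        degrees clockwise from north, distance). Returns all the points."""
        x, y = start                  # x east, y north
        points = [start]
        for az, dist in legs:
            x += dist * math.sin(math.radians(az))   # change in easting
            y += dist * math.cos(math.radians(az))   # change in northing
            points.append((x, y))
        return points

    # A hypothetical closed traverse: the last point nearly rejoins the first.
    pts = run_traverse((0.0, 0.0), [(90.0, 100.0), (0.0, 100.0), (225.0, 141.42)])
    print(pts[-1])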
7.1.5 Instruments

The instruments upon which the measurements are made are a vital part of the process and so require some description. The instruments used for measuring have changed drastically within the past few years. Perhaps the best known of the earlier instruments is the theodolite (Fig. 5), an optical tool which is really a high-precision telescope with a
crosshair for sighting and scales for determining horizontal and vertical angles. On a theodolite, provision has been made for the exact leveling of the axes of the telescope, and accurate horizontal angular readings of a graduated circle can be made. The instruments are usually mounted on the familiar tripods seen along the roadside before the start of a housing development or at the construction site of a large shopping center.

Some of the most successful examples of new tools are the electronic distance-measuring devices. One popular type, the geodimeter, emits pulses of ordinary light at controlled frequencies. Another, the tellurometer, uses radar to measure slant distances. More recently, laser beams have been used successfully. With this electronic equipment, greater distances than ever before can be measured; in fact, continent-spanning networks are in operation.

The first computations on measurements taken in the field are done on site to screen out gross errors. To their undying credit, surveyors have always acknowledged the possibility of errors. In fact, they attempt to pinpoint the possible sources of error and exert great care to control them. Among the situations cited in the instrument instruction manuals are details covering possible errors in sighting, mensuration, and instrument settings. All this care is necessary because an error in any part of a control network is propagated throughout the entire network.
FIG. 5. The theodolite, a high-precision telescope.

7.2 Vertical Networks
Just as there are several types of horizontal networks, there are also several types of vertical control networks. The one generally considered the most accurate is established by differential leveling. In this sort of network, two calibrated rods are held vertically at different locations along a planned route, and readings are made with an optical instrument positioned between them (Fig. 6). The reading is the difference in elevation between the points. As the optical instrument (telescope) is leveled by means of a bubble, gravity affects the instrument; therefore, the telescope and bubble are parallel to the geoid.

FIG. 6. A level rod being used in a field survey.

A second type of leveling is called trigonometric leveling. This is accomplished by using a theodolite or similar instrument to measure the vertical angle between two points having a known distance between them. The elevation of the desired point can then be computed (a computational sketch follows at the end of this section). By this method, both horizontal control and vertical control can be established at the same time on the same network. Although this is a much more economical method, it is less accurate than differential leveling. In actual practice the high-order horizontal and vertical control networks are independent and separate from one another.

Barometric leveling is the third type used. The differences in atmospheric pressure at the various elevation control stations are measured, and these measurements are used to determine the elevations of the other points. The accuracy of barometric leveling is less than that of the other two methods. It is used a great deal, however, in preliminary surveys where later and more accurate measurements will be made by either trigonometric or differential leveling.

Just as there are horizontal-control networks and vertical-control networks, there are also horizontal and vertical datums.
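A Python sketch of the trigonometric-leveling computation mentioned above, with hypothetical numbers; the curvature and refraction corrections that real work requires are omitted:

    import math

    def trig_leveling(distance, vertical_angle_deg):
        """Elevation difference from one trigonometric-leveling observation:
        a measured vertical angle and the known distance between the points."""
        return distance * math.tan(math.radians(vertical_angle_deg))

    # A point sighted 1200 m away at a vertical angle of 2.5 degrees
    # lies about 52.4 m higher.
    print(trig_leveling(1200.0, 2.5))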
7.3 Early Computations
These survey networks were solved triangle by triangle, generally in the shape of a quadrilateral. There were printed forms for each phase of the computations with spaces for input, intermediate answers, and final results. Every step had to be done by two different mathematicians as
a check on the accuracy. Angular measurements in degrees, minutes, and seconds were laboriously converted to trigonometric functions by tables and interpolation, and the reverse process was carried out for angular results. The task was performed by rooms full of mathematicians, each with a mechanical calculator on the desk. Even obtaining a square root was a time-consuming process requiring repetitive entries on the calculators. It took years before final adjusted values could be assigned to positions in even moderate-sized networks. It is no wonder that electronic computers with stored-program capability were welcomed so eagerly.

The mathematicians were at last free to solve problems in ways more compatible with the physical situation. Consider the conditions for a geodetic survey network. The astronomic positions are recorded at Laplace stations and referenced to the geoid. Triangulation and/or trilateration measurements are taken on the topographic surface. Remember that all the measurements must be reduced to the ellipsoid for the computations. These observations comprise different combinations of known and unknown variables, sides, and angles, with repeated measurements. All these equations require a simultaneous solution involving thousands of measurements. Needless to say, matrix operations of this magnitude are still a formidable task even with the third- and fourth-generation computers of today. Now, however, by means of computers the solutions to the true situations are being achieved. Rigorous mathematical formulations are applied to the real observations for the earth in its entirety. These yield accurate worldwide solutions to replace the unsatisfying approximations, limited in area, of the precomputer years. Even now, programs are being readied for reducing the measurements of the latest high-precision traverse. A search is under way for the biggest and fastest computer available, and even this will probably not be completely adequate to the task. The detailed formulation and development of the requisite mathematics will be found in the literature cited in the references.
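In modern terms, the simultaneous solution described here is a weighted least-squares adjustment of an overdetermined system of linearized observation equations. A toy Python sketch (the matrix and values are hypothetical; production adjustments involve thousands of equations and exploit their sparsity):

    import numpy as np

    def adjust_network(A, b, w):
        """Weighted least-squares solution of observation equations A x = b,
        where x holds the unknown coordinate corrections, b the observed-
        minus-computed values, and w the observation weights."""
        sw = np.sqrt(w)
        x, *_ = np.linalg.lstsq(sw[:, None] * A, sw * b, rcond=None)
        return x

    # A tiny hypothetical system: four observations, two unknowns.
    A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
    b = np.array([0.012, -0.034, -0.020, 0.043])
    print(adjust_network(A, b, np.ones(4)))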
8. Satellite Geodesy
The use of artificial earth satellites as research vehicles for obtaining geodetic information was promulgated in the 1950's. The earliest versions were primitive modifications of the traditional triangulation and trilateration. These rapidly increased in sophistication and in the quality of the results. One of the most successful systems is a worldwide geocentric model with stations spaced so far apart that they are not intervisible, yet most satellite observations are three-station events. When a satellite
is observed simultaneously from the ends of a base line, the two rays form a plane in space. The reasoning is that the spatial orientation of this plane can be determined from the measured direction cosines of the two rays. The direction of the baseline can then be computed as the line in which two such planes containing the baseline intersect. When three stations form a triangle, five such planes are necessary and sufficient for a unique solution, and all the triangles must have geometric strength. The final positions for points on the earth are expressed in a geocentric Cartesian coordinate system in three dimensions, a model of elegant simplicity.

However, consider the not-so-simple details. The practical application depends upon the successful development of suitable telemetry and advanced tracking algorithms. To an even greater degree, however, it depends upon high-speed computers to monitor the actual orbit and to predict the path of future orbits. Orbit-prediction programs are based on the extremely complicated equations of dynamic astronomy for the gravitational potential of the earth. Nearly always written in FORTRAN, they take years to become fully operational, and even on today's big computers they take hours to run. Once the programming is done, the most advantageous orbits are devised, and the satellites are shot into space (see Fig. 7).

The stellar cameras and all the other equipment require calibration before use. This demands repeated measurements fed as input to computer programs. The satellites are photographed against a star background, and the star positions are also used in the computations of the satellite position by referencing the star catalogs. All this computer processing occurs before the recording of the first image. After the data are acquired, as many as 3500 measurements are made on each photograph. The data reduction requires computer programs which reference the star catalogs, adjust the data for instrument deviations, transform them to the same reference system, and finally compute the most probable positions. It requires many years of programming and computer-years of processing to perform all these tasks properly, but the results provide positional accuracy never before achieved.

Several other techniques are particularly appropriate for satellite geodesy. Some electronic systems measure distances by means of high-frequency signals, as described in Section 7.1.5. Others measure positions by utilizing the Doppler effect; this method takes advantage of the fact that the rate of change in the frequency of a constant signal can be measured as the satellite approaches and recedes from the station. Optical systems use light flashed from a satellite for positioning information. All systems require massive programming efforts and hours of run time to reduce the data.
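A Python sketch of the plane geometry just described, assuming synchronized observations expressed as unit direction cosines (all numbers hypothetical): the two rays of one event span a plane containing the baseline, and the baseline direction is the line in which two such planes intersect:

    import numpy as np

    def baseline_direction(event1, event2):
        """Each event supplies the two observed rays (direction cosines) from
        the baseline endpoints to the satellite; their cross product is the
        normal of the plane through the baseline. The baseline lies in both
        planes, so its direction is the cross product of the two normals."""
        n1 = np.cross(*event1)
        n2 = np.cross(*event2)
        d = np.cross(n1, n2)
        return d / np.linalg.norm(d)

    # Hypothetical direction cosines from two synchronized satellite events.
    e1 = (np.array([0.10, 0.20, 0.97]), np.array([-0.30, 0.10, 0.95]))
    e2 = (np.array([0.40, -0.20, 0.89]), np.array([0.20, -0.50, 0.84]))
    print(baseline_direction(e1, e2))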
FIG. 7. Diagram of a satellite orbit (the original figure labels the orbit, the perigee, the argument of perigee, and the inclination i).
9. Photogrammetry
Originally, photogrammetry was defined as "the science or art of obtaining reliable measurements by means of photography." By the latest definition, photogrammetry is "the art, science, and technology of obtaining reliable information about physical objects and the environment through processes of recording, measuring, and interpreting photographic images and patterns of electromagnetic and acoustical radiant energy and magnetic phenomena."

In its simplest form, the first phase of a photogrammetric system is composed of a certain part of the earth which is to be photographed, a camera with its lens and film, and a vehicle to carry the camera aloft. The flight paths require good design for the same reasons as the survey networks. As a plane travels along the flight path, a series of photographs is shot which overlap in the direction of flight. The next path produces pictures which overlap those of the first path; optimum coverage of the area to be mapped is thus ensured. Map scale and flight height are directly correlated. Low-altitude photography
captures the detail required for large-scale maps. High-altitude photography incorporates more area per picture, as needed for small-scale maps.

The translation of the information on the photograph to digital form is the next step. To perform the necessary operations, an entire series of photogrammetric instruments has been specially designed and built. The comparators can measure the position of a point on a photograph to within plus or minus 1 or 2 micrometers. The stereocomparators permit viewing the images of two overlapping photographs called a stereo pair. When the operator of such an instrument looks through the eyepieces, a three-dimensional scene appears. There are also stereoplotters (Fig. 8) with a "floating mark" mechanism built into the instrument. Viewing the dot through the optical system, the operator can maneuver the dot until it appears to be on the surface of the earth. In this way it is possible to trace out a contour, a line of equal elevation on the surface of the earth. These contours can be traced onto paper or film at the same time by means of a movable arm with a marking device attached to the stereoplotter. For added stability during the measuring process, the image is often transferred from the photographic film to a glass plate; thus the terms "photo" and "plate" are used interchangeably. Before an instrument's initial use, and periodically thereafter, test data are run through calibration programs to maintain accuracy standards. The coordinates of the plate points are usually recorded on punched cards, paper tape, or magnetic tape at the same time that they are measured.

FIG. 8. A stereoplotter.

Analytical photogrammetry is the data-reduction technique. The simplest case is the solution for a single photograph. Next in complexity is the solution for a stereo pair of photographs (Fig. 9), termed a single model. Two types of analytical photogrammetry are used at present: one is derived from the principle of coplanarity; the other makes use of the principle of collinearity. In the coplanarity model, B represents the air base and p1 and p2 are a stereo pair of exposures on which point P appears (Fig. 10). By the illustrated geometry, the ray vectors A1, A2, and B lie in the same plane. The inclusion of condition equations assures the intersection of A1 and A2. The ground coordinates of the points are not implicit within the model; for this reason, a solution can be computed only after additional equations supply these coordinates. This same principle was described in Section 8. The principle of collinearity is shown in Fig. 11. This model is based on the fact that the image of a point, the exposure station,
FIG. 9. Stereo pair of photographs.
FIG. 10. Diagram illustrating the coplanarity model.
and the point itself on the ground all lie on the same straight line. The collinearity condition equations contain all the elements needed for a solution: image coordinates, ground coordinates, and camera orientation. In practice, the condition equations are linearized by use of a Taylor or Maclaurin series. There are six camera orientation parameters for each photograph and three coordinates for each point; the number of equations is then equal to twice the number of photo points. The known values and some approximations are put in matrix form. These are overdetermined systems and are then handled iteratively within a rigorous least-squares solution.

FIG. 11. Diagram illustrating the collinearity model.

From the time the mathematical formulation is translated into computing algorithms and the programs become operational, several years may have elapsed. This is true for the single-camera and single-model solutions, for which the computer run time and core requirements are negligible. In the next step, these models are expanded to permit the solution of a strip of photographs. At this point, computer run time and core limitations become additional program parameters. When these strips are built up so as to form a block solution, the usual expedients of program overlays and auxiliary disk and/or tape storage become mandatory for even the big computers. Similarly, the run time increases to hours.
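A Python sketch of the collinearity condition in one common closed form, projecting a ground point into image coordinates given the exposure station, an orientation matrix R, and the focal length (sign conventions vary among texts; the values here are hypothetical):

    import numpy as np

    def collinearity(ground_pt, exposure_station, R, focal):
        """Image coordinates of a ground point under the collinearity
        condition: image point, exposure station, and ground point lie
        on one straight line."""
        d = np.asarray(ground_pt, float) - np.asarray(exposure_station, float)
        u, v, w = R @ d                  # the vector expressed in the camera frame
        return -focal * u / w, -focal * v / w

    # Hypothetical vertical photograph: camera 2000 m above the terrain,
    # orientation matrix taken as the identity, 152-mm focal length.
    print(collinearity([500.0, 300.0, 100.0], [450.0, 280.0, 2100.0],
                       np.eye(3), 0.152))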
10. Projections

One thing that has remained unchanged over the years is the basic problem of mapping: to show the earth, a curved solid, on a
plane, the flat piece of paper which is the map, with measurable fidelity. The problem is exemplified by the classic case of peeling an orange or apple so that the skin or rind can be laid out flat; this cannot be achieved without some tearing or distortion of the peel. The same is true for mapping: the earth cannot be represented on a flat sheet of paper without some distortion. Because it is impossible to transfer points from a curved surface to a plane surface without distortion, projections have been designed to control the distortions. The most popular projections in use today are either conformal or equal area. Conformal means that angles are preserved; in particular, lines of latitude and longitude that are at right angles on the earth spheroid are also at right angles on the map.
The equal-area projections, just as the name implies, show areas on the map true in relationship to the areas on the spheroid. In most conformal projections, the actual correspondence between points on the ellipsoid and points on the map is defined by means of mathematical equations. In many projections it is extremely difficult, if at all possible, to describe geometrically and to depict graphically the method of projection used. One of the most satisfactory ways of formulating some of these projections is by means of complex variables.

An informative introduction to projections is achieved by use of a sphere and conics, shown graphically. On many occasions, actual computations are carried out using the sphere in preference to the ellipsoid. Sometimes this approximation is the most practical method, considering accuracy requirements and the savings in time and money. This simplified combination of a conic section and a sphere can illustrate several of the more efficacious projections. By this means, the geometric representation of the Mercator projection can be envisioned as a sphere surrounded by a cylinder which is tangent at the equator (Fig. 12).

FIG. 12. An example of a Mercator projection map.

The transverse Mercator projection is popularly depicted as a cylinder which is tangent to the earth at a given meridian. This meridian is designated as the central meridian for a given zone; the whole earth generally is divided into zones of a few degrees (2° to 6°) each. With this relationship, the scale is true only along the central meridian. The closer a point approaches the edge of the zone, the greater the error in scale found in its position. This error is corrected by the application of the proper scale factor. The scale factor can be minimized by reducing the size of the cylinder and allowing it to sink into the sphere. Now, instead of the tangent condition along the central meridian, the cylinder actually cuts into the sphere, and the result is that there are two circles on the cylinder where the scale is true. At the central meridian the scale varies in the direction perpendicular to the meridian, but at the edges of the zone the scale error has been greatly diminished.

FIG. 13. Map illustrating a transverse Mercator projection.
This approximates very well what is done in mapping today. On a map, a graticule is either a series of intersecting lines or a series of tick marks which represent the parallels and meridians of latitude and longitude. A grid is a rectangular system defined on the particular map projection; it too is composed of intersecting lines or tick marks. The transverse Mercator projection (Fig. 13) is generally used for those political areas whose greatest dimension is north to south. It is used officially for those states of the United States with such configurations and in other parts of the world such as Great Britain and Norway. The Universal Transverse Mercator (UTM) grid came into popular use after World War II and is in official use in all of the departments of the Defense Mapping Agency. The other projection used officially for parts of the United States is
FIG. 14. Diagram of sphere and conic projections. The figure labels the upper and lower standard parallels and the limits of the projection; the scale is exact at the standard parallels, too small in the area of compression between them, and too large in the areas of stretching outside them.
FIG. 15. An example of a Lambert conformal conic projection map.
the Lambert conformal conic projection (Figs. 14 and 15). It is especially appropriate for areas that are narrow in the latitudinal direction (north-south) and much wider in the longitudinal direction (east-west). By its definition, the parallels and meridians are, respectively, arcs of concentric circles and the radii of those circles. Many aeronautical charts are based upon it, and it is the official projection of several countries in South America.

The transverse Mercator and the Lambert conformal conic projections provide the legally sanctioned state plane grid systems. They are used for latitudes between 80°N and 80°S; for the regions north of 80°N and south of 80°S the polar stereographic projection is generally used. Because of their extensive use and mathematical definitions, projections, along with survey adjustments, were among the first map elements programmed for computers. As each new generation of computers emerges, these essential programs are modified and entered into the systems. Many are characterized by equations in closed form; others must be solved by means of series expansions. Going from the ellipsoid to the plane is called the direct solution, and from the plane to the ellipsoid, the inverse.
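For the spherical approximation discussed earlier, the Mercator projection has simple closed forms in both directions. A Python sketch of the direct and inverse solutions (ellipsoidal versions replace these with series expansions or iteration):

    import math

    R = 6371000.0   # mean earth radius, meters: the spherical approximation

    def mercator_direct(lat_deg, lon_deg):
        """Direct solution: sphere to plane."""
        x = R * math.radians(lon_deg)
        y = R * math.log(math.tan(math.pi / 4.0 + math.radians(lat_deg) / 2.0))
        return x, y

    def mercator_inverse(x, y):
        """Inverse solution: plane back to latitude and longitude."""
        lat = math.degrees(2.0 * math.atan(math.exp(y / R)) - math.pi / 2.0)
        return lat, math.degrees(x / R)

    print(mercator_inverse(*mercator_direct(38.9, -77.0)))   # round trip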
11. Cartography
Cartography is described as the production of maps, including design, projection, compilation, drafting, and the actual reproduction. By this definition, cartography may be considered the oldest branch of mapmaking; indeed, much of the hand work that goes into creating a map had its beginning hundreds of years ago. Handmade items are excessively expensive in today's marketplace. Even though the finished product may be an example of superb craftsmanship, the time and expense involved are quite often prohibitive, and they are the main reasons why the old methods of cartography are under close scrutiny today. As a result of the investigations of the past few years, some of the newest and most exciting changes in the whole mapping field are taking place in the particular area of cartography. These innovations come under the general heading of digital mapping and automated cartography, which have evolved only because of the existence of computers. The basic premise of computer-aided mapping (CAM) is that a map in digital form on magnetic tape is as truly a map as the conventional form with contours and drainage printed on a sheet of paper.

The translation of the information contained on a photograph or a published map into digital form has become an active project in all government mapping agencies, and various techniques are now being used. The photogrammetric instruments described in Section 9 tabulate discrete points. Many of these comparators and plotters have been modified to record continuous strings of x, y, and z coordinates. Now, when an operator completes a plate, there is the conventional contour tracing on paper and, in addition, a three-dimensional digital terrain model recorded on magnetic tape. Another version of these stereoplotters does the same thing but in a slightly different manner: instead of traveling around the contours, the movement is directed back and forth across the photograph, and x, y, and z coordinates are recorded at predetermined intervals. That is, profiles of the earth's surface are traced, but again the final result is the three-dimensional digital terrain model. The intervals at which the data are recorded are generally specified in one of two ways: either after the recording instrument has traveled over a fixed distance or after the elapse of a set time period.

Another method of data acquisition is the manual digitization of maps (Fig. 16). This is accomplished on instruments of the drafting-table type. A cursor, a pointing tool, is positioned over a point, and the coordinates are recorded. The cursor also can be traced along linear features such as roads or contours, and so record the three-dimensional coordinates. Again, the collection intervals can be set for either distance or time mode.

FIG. 16. Manual digitizer.

The development of automatic line followers is another approach to map digitization. These photooptical instruments carry light sensors (usually diodes) on a framework very similar to that of an automatic plotter suspended over the manuscript. The line followers work quite well on single lines, even those that follow a twisted, convoluted course. If a line should branch or intersect another line, then human intervention, manual or preprogrammed, is required. Again, the digitized coordinates are written on some medium that can be fed into a computer for the actual data processing.

Scanner-digitizers are another answer to the problem. Those scanners that are currently operational have certain features in common: they all make use of some sort of photoelectric device for sensing the data. Most commonly, this
is either a deflectable-beam photomultiplier tube or an image dissector tube, both capable of sensing various gray levels. Ordinarily, a low power of two, such as 32 (2⁵) or 64 (2⁶) gray levels, is found to be adequate. Some of these sensors can also distinguish colors, and this provides an additional way of identifying the various map features.

The scanners generate overwhelming amounts of data. A 3 x 5-inch film chip digitized at 0.001-inch resolution would produce 15 million pixels, discrete picture elements. The data from a standard contour plate digitized at 50 μm fill five 10.5-inch tape reels with approximately 95 million bytes; the same contour plate digitized manually at 1-mm increments would not fill one reel, even though each recorded point carries three dimensions. Because of the tremendous amounts of data involved, a scanner-digitizer system usually has its own dedicated computer. This can be a fairly large general-purpose computer devoted exclusively to this one project, or it can be a minicomputer, or sometimes a series of minicomputers, also dedicated to this one project. At present, there are many diverse opinions on this very subject; the question of which is best, a very large general-purpose computer or a large bank of minicomputers, has yet to be resolved.

Regardless of the collection technique, the data are processed through error-correction routines where known errors are deleted immediately, and the rest of the data are plotted. The plots are then visually inspected for any remaining errors, and this step is repeated until the data meet preset error criteria. Commonly, the next step is to superimpose a grid upon the terrain model. Essentially, a mesh is created, and an elevation value is assigned to each node. Research is still in progress to determine the best mathematical method for this procedure. Many of the better programs make use of a weighted distance function. If a known point falls exactly on a grid intersection, that value is assigned to the intersection. In all other cases, for each known point that falls within the area being examined, a weight is computed as a function of the distance of the known point from the intersection point. The final value for the intersection is then the ratio of the sum of the weighted point elevations to the sum of the weights.
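A Python sketch of the weighted-distance rule just described, using inverse-distance weights as one plausible choice of weight function (the terrain points are hypothetical):

    import math

    def grid_elevation(node, known_points, power=2.0):
        """Elevation for one grid node: a known point falling exactly on the
        node is assigned directly; otherwise the value is the sum of the
        weighted elevations divided by the sum of the weights, the weight
        being a function of distance."""
        num = den = 0.0
        for x, y, z in known_points:
            d = math.hypot(x - node[0], y - node[1])
            if d == 0.0:
                return z
            w = 1.0 / d ** power
            num += w * z
            den += w
        return num / den

    # Hypothetical digitized terrain points (x, y, elevation):
    pts = [(0, 0, 120.0), (10, 0, 125.0), (0, 10, 118.0), (10, 10, 131.0)]
    print(grid_elevation((4.0, 6.0), pts))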
After gridding, the data are ready for some mapping applications, one of which is the now popular orthophoto. The orthophoto is quite different from the conventional line map in several ways. It is an aerial photograph which has been subjected to differential rectification; that is, the image displacements caused by terrain relief and the tilt of the camera at the moment of exposure have been corrected. These corrections are made one at a time, each correction covering a very small part of the photograph. The areas are so small that they blend together and are not discernible on the finished orthophoto. The terrain data in profile form, digital or graphic (Fig. 17), guide the orthophoto instruments during the exposure process. Later, contour lines, grids, graticules, and place names are added just as they are on the traditional line map.

FIG. 17. Examples of terrain profiles.

At this point, after the labor and expense involved in obtaining a digital map in machine-readable form have been described, it is wise to ask why digital mapping is being done. Indeed, over the past 10 years (for digital mapping is no more than 10 years old), this valid question has arisen repeatedly. At some time during the course of a debate on this question, someone will point to an existing line map or aerial photograph and ask, "Look at the vast amount of data recorded here. Does anyone really think there is a better storage medium?" This doubt is understandable. The amount of information on an aerial photograph or printed line map is virtually unlimited. The data are easily stored: a
sheet of paper or film takes very little space, and it is equally easy to retrieve. What reason can there be for going to the trouble and expense of putting a map into digital form and then storing it? Until a few years ago the answer to this question was very much in doubt. Today the answer is apparent and generally accepted: once a map is in digital form it can be retrieved and, most importantly, reproduced in whatever form the user desires. This is really the crux of the whole matter. The user needs a map that serves his particular needs, and more often than not he needs it quickly. In this decade, legislators and others concerned with the conservation and proper development of our natural resources need a variety of specialized maps. Some of these users are fortunate, for they can take their time in formulating long-range plans. Others are faced with emergencies arising from natural phenomena, and they must make vital and immediate decisions, the effects of which may last a long time. In any sort of natural disaster, the necessary information can be presented to policy makers and decision makers quickly and cheaply by maps in digital form. Hence, the effort is worthwhile.
12. Data Banks
Even as map data are being digitized, the problem of their subsequent storage and retrieval arises. The exact form in which these data are to be stored is still in question. For many of the map features, such as buildings, bridges, and airports, no standard form has been agreed upon as yet. The matter of format, that is, the size and length of each record, is also undecided. Many persons believe that certain identifying features must be included but that the details should be left up to those creating each particular data base; the theory behind this is that it allows the flexibility needed in a new and growing field of endeavor. Data bases continue to proliferate even without general agreement on the details. Figures 12, 13, and 15, which illustrate the various map projections, were derived and plotted from such a geographic data base and a library of projection programs.

Display
All maps are graphics, and with increasing frequency they are becoming computer graphics. The significant reason for putting map data in digital form is that such data can be used to derive specific
information with a speed and flexibility unknown in any other medium. This can be as simple as a printout which identifies certain regions and provides statistical information such as the area, the perimeter distance, and the location of the centroid of each region. Many times this information is more useful when accompanied by a plot produced either online or offline from the data bases. Some digital map data can also be reproduced as ordinary text data for bibliographies; the production of bibliographies from computerized data is constantly increasing in both magnitude and importance.

The placement of names on maps involves printing the name of a city, town, lake, etc., close enough to the site that it may be identified with the name, while ensuring that the name does not interfere with any other graphic information on the map. Because of these restrictions, name placement is a very difficult task; indeed, some mapping agencies have both computers and plotters dedicated to this one application.

When plotters are mentioned in this section, the drum type or flatbed type which uses either a ballpoint pen or a liquid-ink pen is meant. These run the gamut from high-speed, medium-precision to low-speed, extremely high-precision plotters. The more demanding specifications can also be satisfied by coordinate plotters with photoheads. On these plotters, a light beam traces out a line or flashes a symbol through a template onto special film. Very fine, very smooth lines can be produced by these plotters. Many plotters also produce scribed copy. Scribing is the process of etching a groove on coated material, usually mylar. This involves positional accuracy, line-weight accuracy, and also line-depth control. Scribed sheets, like the film, are important because they can be used directly to produce the press plates that actually create the maps and other graphics.

Another useful and increasingly popular display device is the cathode ray tube (CRT). Most are black and white, but an increasing number of color sets are in use. These are especially useful for the interactive manipulation of data and for online error correction. They also provide a fast method of combining different kinds of data in one display. The CRT is usually accompanied by a printer so that the pictorial information may be retained in hard copy.
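The region statistics mentioned above (area, perimeter, centroid) are routinely derived from the digitized boundary coordinates. A Python sketch using the standard shoelace formulas (the polygon is hypothetical):

    import math

    def region_statistics(poly):
        """Area, perimeter, and centroid of a closed region given as a
        list of (x, y) boundary vertices."""
        area2 = per = cx = cy = 0.0
        n = len(poly)
        for i in range(n):
            x0, y0 = poly[i]
            x1, y1 = poly[(i + 1) % n]
            cross = x0 * y1 - x1 * y0      # shoelace term
            area2 += cross
            cx += (x0 + x1) * cross
            cy += (y0 + y1) * cross
            per += math.hypot(x1 - x0, y1 - y0)
        area = area2 / 2.0
        return abs(area), per, (cx / (6.0 * area), cy / (6.0 * area))

    # A hypothetical rectangular region: area 12, perimeter 14, centroid (2, 1.5).
    print(region_statistics([(0, 0), (4, 0), (4, 3), (0, 3)]))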
13. Future Trends
Information about the oceans and seas, the surface of the earth, and the atmosphere is being recorded and mapped. Long before the astronauts landed on the moon, a series of observational satellites had been sent
up to orbit it. From their various sensors, maps of the moon were created. These were accurate enough to assist in safe landings and takeoffs. All data acquisition is in some way involved with computers.

The moon was only the beginning of extraterrestrial mapping. Probes have been sent to Venus and to Mars. Even now the surface of Mars is being mapped (Fig. 18), and some of the data are being recorded in digital form and processed on computers. In nearly all terrestrial and extraterrestrial projects, the data so gathered are stored in digital form in some computerized data bank. With increasing frequency, the pictorial output can be described as some form of computer graphic.

Aerial photography is now supplemented by satellite photography and utilizes a greater variety of sensors, including multispectral scanners and side-looking radar (Fig. 19). The use of digitized data from these sensors increases continually. The data are usually transmitted in digital form and undergo computer processing for error deletion and quality control before display.

FIG. 18. Image of Mars. (NASA Photography.)

FIG. 19. Multispectral image from a satellite. (NASA Photography.)
14. Conclusions
The various steps that constitute the mapmaking process have been reviewed. Each step has involved the use of computers, and it has become obvious that their use is constantly expanding.
One additional point is of interest. When programmable desk-top calculators became available, many mapping applications were withdrawn from the large general-purpose computers and placed on these small but powerful devices. In some respects this is the completion of a circle, inasmuch as the same computations which had been transferred from the old mechanical desk-top calculators to the electronic computers have now returned to desk-top instruments. From another viewpoint, however, particularly the viewpoint of the user, the only similarity is the desk-top location. Long, tedious strings of arithmetical manipulations are now performed by depressing one button.

This analogy holds true for mapmaking in general. Its basic purpose, to present pictorial information about the earth in a measurable format, remains unchanged. The methods and procedures have been irrevocably altered by data reduction via computer. The instrumentation of mapmaking has been redesigned by the same explosion in technology that has spawned generation after generation of computers within the span of a few years. The combination of mapping and computers is both happy and successful.

REFERENCES

Anderle, R. J. (1972). Pole position of 1971 based on Doppler satellite observations. U.S. Nav. Weapons Lab., Tech. Rep. TR-2734.
Bickmore, D. P. (1971). Experimental maps of the Bideford area. Commonw. Surv. Officers Conf., Proc.
Bickmore, D. P., and Kelk, B. (1972). Production of a multicolour geological map by automated means. Int. Geol. Congr., Proc. 24th, 1972, Sect. 16, pp. 121-127.
Bomford, Brigadier G. (1962). "Geodesy." Oxford Univ. Press (Clarendon), London and New York.
Boyle, A. R. (1970). Automation in hydrographic charting. Can. Surv. 24(5).
Boyle, A. R. (1971). Computer aided map compilations. Can. Nat. Res. Counc., 2nd Semin. Rep. Man/Mach. Commun.
Breward, R. W. (1972). A mathematical approach to the storage of digitized contours. Brit. Cartog. Soc. J. 9(2), 82-86.
Brown, D. (1969). Computational tradeoffs in a comparator. Photogramm. Eng. 35(2), 185-194.
Brown, D. (1971). Close-range camera calibration. Photogramm. Eng. 37(8), 855-866.
Burkard, R. K. (1964). "Geodesy for the Layman." U.S. Air Force Aeronaut. Chart and Inform. Cent., Chart Res. Div., Geophys. and Space Sci. Br.
Colvocoresses, A. P. (1973). Data referencing by map grid cell. Surv. Mapping 23(1), 57-60.
Connelly, D. S. (1971). An experiment in contour map smoothing with the ECU Automated Contouring System. Cartog. J., Exp. Cartography Unit 8(1).
Deutsch, E. S. (1970). Thinning algorithms on rectangular, hexagonal and triangular arrays. Univ. of Maryland, Computer Sci. Cent., Tech. Rep. 70-115 (NASA Grant NGL-21-002-008; U.S. Atomic Energy Commission Contract AT-(40-1)-3662).
Doyle, F. J. (1966). Analytical photogrammetry. In "Manual of Photogrammetry," 3rd ed., Vol. 1, pp. 461-513.
Practical Natural Language Processing: The REL System as Prototype
FREDERICK B. THOMPSON and BOZENA HENISZ THOMPSON
California Institute of Technology, Pasadena, California

Introduction
1. Natural Language for Computers
2. What Constitutes a Natural Language?
   2.1 The Importance of Context
   2.2 The Idiosyncratic Nature of Practical Languages
   2.3 Language Extension through Definition
3. The Prototype REL System
   3.1 The REL Language Processor
   3.2 Base Languages
   3.3 User Language Packages
   3.4 Command Language and Metalanguage
   3.5 REL Service Utilities
   3.6 REL Operating System
4. Semantics and Data Structures
   4.1 The Importance of Data Structures
   4.2 Are There Universal Data Structures for English?
   4.3 Data Management for Relational Data Systems
5. Semantics Revisited
   5.1 Primitive Words and Semantic Nets
   5.2 The Nature of the Interpretive Routines
   5.3 The Unlimited Complexity of Data Structures
6. Deduction and Related Issues
   6.1 Extension and Intension
   6.2 The Incorporation of Intensional Meaning
   6.3 More Extensive Intensional Processing
   6.4 Inductive Inference
7. English for the Computer
   7.1 Features
   7.2 Case Grammars
   7.3 Verb Semantics
   7.4 The User vs. Linguistic Knowledge
   7.5 Quantifiers
   7.6 Computational Aspects of Quantifiers
   7.7 Steps in Developing English for the Computer
8. Practical Natural Language Processing
   8.1 Natural Language Processors
   8.2 Fluency and Language Learning
   8.3 What Kinds of Systems Can We Expect?
   8.4 Why Natural Languages for Communicating with Computers?
References
Introduction
Much has been written and much good work has been done on natural language as a means of communication with computers. Some of the problems involved in the development of systems with such a capability have been effectively solved; some others, more difficult, have been more clearly delineated. We are now at a point where it is worthwhile to assess the current state of the art and future directions in this area of research. Our own interest has been to achieve an early operational system for natural language communication with computers. Our assessment here will reflect this concern. We will constantly be focusing on the question: What can be done now? Our views are based upon direct and successful experience with the problems discussed. Our system for communicating with the computer in natural language, the REL (Rapidly Extensible Language) System, is in experimental operation and is nearing the transportable, prototype stage. In this paper, we call on examples from this work to make our points as clear and concrete as possible. The question we have sought to answer in our research is this: How can a qualified professional effectively use the computer for relevant research with minimum constraints on the language of communication?
1. Natural Language for Computers
The term "natural language" as used in this paper has the following meaning: a language that is natural for a specific user. This concern is not the only one that motivates research in natural language processing. Some other researchers are interested in the problem of how to build a system that understands language the way humans understand language. In this paper we will not comment on this approach, interesting as it is in itself. We proceed to assess the current state and future directions for natural language processing from an essentially utilitarian, practical point of view. To that end, we will take up the following issues:

What is a natural language?
Semantics and data structures
Deduction and related issues
English for the computer
Practical natural language processing

2. What Constitutes a Natural Language?
In this section we focus on the nature of languages that are natural for communication with computers.

2.1 The Importance of Context
In the common use of the term, "natural language" refers to such languages as English, Polish, French. In some attempts to define natural language in this sense, linguists have introduced the notion of the "fluent native speaker" of a language. This notion is applied in ascertaining what constitutes an acceptable utterance in a natural language. Thus, an utterance is a well-formed sentence of English if it is recognized as such by a fluent native speaker of English. A definition of natural language along such lines may be adequate for some purposes of linguistic discussion of the nature of language in general. But specific uses of language by individuals in specific contexts show characteristics which are not easily subsumed under such a general definition. The language used by professionals in conversations with their colleagues concerning restricted areas of professional interest has a distinctly different character from general usage. Words take on special meanings, and even quite common words may be used in a highly technical way. Elliptical constructions carry meanings that would not be at all apparent or even reconstructible in a wider or more distant context. From the point of view of an individual user, language is almost always highly context dependent. This context is not only linguistic, i.e., the surrounding sentences, but in a very significant degree also extralinguistic, that is, the specific setting of the conversation in a given life situation. The following example, cited by Bross et al. (1972), illustrates the point. They found that surgeons often close their standard write-ups of operations with the sentence: "The patient left the operating room in good condition." When it was pointed out to them that this sentence was ambiguous and could mean that the patient had mopped the floor, put away the equipment, and indeed left the operating room in good condition, they tended to laugh and say that such an interpretation would never occur to anyone reading their reports.

The problem of context goes far beyond the repression of ordinary or possible alternate meanings on the basis of context. Consider the question: "What is the size of chemistry?" Out of context, it seems meaningless. However, one could easily imagine it asked in a high school faculty meeting. One could also imagine it asked of a computer in the process of arranging class schedules for such a school. Further, "size of chemistry" would be interpreted in this latter context not as 'current number of chemistry students,' as it would be in the former context, but rather as 'number of students who have registered for chemistry.' In order that effective communication be achieved in specific situations, the interpretation of sentences has to be highly sensitive to specific contexts. Such interpretations may not be available to the mythical "fluent native speaker" and therefore the sentences may appear meaningless or ill-formed when out of context. It is also true that those context-specific interpretations may not be available to the same individual speaker, for instance, a high school principal with regard to the above examples, in a different contextual situation. The typical professional works in a narrow, highly technical context, a context in which many interlaced meanings have developed in the course of his own work. The clues that can be found in any ordinary discourse are not sufficient to distinguish that highly specific context from all possible contexts. The idea of a single natural language processing system that will ingest all of the world's technical literature and then produce on demand the answer to a specific request may be an appealing one. But at this time there are no useful insights of how context-delimiting algorithms would operate to meet the requirements of such a system.

2.2 The Idiosyncratic Nature of Practical Languages
From the practical point of view of computer applications in the near future it is quite necessary and advantageous to limit the context of the particular application language to the narrow universe of discourse involved in the user's area of interest. There is a wide range of reasons why this is so, and many of them will become apparent in the later sections of this paper. The major reason why the universe of discourse of application languages has to be narrow is the idiosyncratic nature and the rapid rate of change that characterize the interests and fields of bona fide computer users. Suppose one had an online computer system for social sciences, with data banks containing the already vast files of data currently in use by social scientists. The vocabulary of the common query language would presumably include the word "developed." The system might be able to discern from context a difference in meaning between a "fully developed skill," a "developed country," and an "early developed child." However, a particular researcher may not know which of the many meanings existing in the literature for the term "developed country" is used by the system. A sociologist interested in testing a theory of institutional evolution would not find acceptable an economist's notion of development as a particular linear function of per capita GNP and the ratio of industrial capital to governmental expenditure. Who is to say how "developed" is to be defined for such a broadly based system?

The situation is similar with applications of such systems to management. Certainly there are significant differences among firms in the use of such technical terms as "direct salary" or "inventory." Whether "direct salary" is to include vacation salaries or not may depend on whether one is bidding on governmental or commercial contracts. To some, "product inventory" includes items already sold but awaiting shipment; others keep inventory in terms of dollar value rather than the number of items. Even in the same firm, different managers at different levels keep separate and incommensurable accounts. Computer scientists usually think in terms of "the" system of accounts for a firm, perhaps not aware of the fact that these accounts are kept largely for tax accounting purposes, and that management decisions are made on the basis of quite distinct accumulations of the detailed statistics. Further, changes in tax law, union contracts, pricing policy, etc., change the meaning of recorded statistics from year to year. Thus, a general set of semantic meanings built into an insensitive system can be worse than useless to a manager who has to live with change. To conclude, we state this emphatically: For the foreseeable future, natural language systems will be highly idiosyncratic in their vocabulary and the semantic processes involved.
2.3 Language Extension through Definition
These same considerations of the idiosyncratic and evolving nature of a given computer application give rise to a second property of systems that are natural. This is that they have to provide for easy and natural extensibility of languages on the part of the user himself. The introduction of new terms and changing definitions of old ones are ubiquitous aspects of language usage, especially in the technical, intensively worked areas in which computers are applied. In our computer applications today, where programs are written in FORTRAN or COBOL, language change is a matter of reprogramming.
Thus when a social scientist wishes to apply a standard data analysis routine to a new, previously unidentified subset of his data, he writes a short FORTRAN routine that first creates this new subset as a separate entity and then calls for the application of this and perhaps other analysis routines to it. But when he is in a position to communicate with the computer in natural language, he will be able to state the new subset of his data in descriptive terms. If the description turns out to be unwieldy, he may wish to simply name his new subset, thus defining it once and for all. This process can be illustrated from experience with our REL system on the part of the anthropologist Thayer Scudder (Dostert, 1971). In the course of his work this user defined the term "sex ratio" as follows:

def: sex ratio of "sample": (number of "sample" who are male)*100/(number of "sample" who are female)

Subsequently he wanted to examine the structure and properties of families which had had all of their children. He decided to concentrate on the older women. He created the class of older women of the Mazulu village which he was studying in the following way:

def: Mazulu crone: Mazulu female who was born before 1920

He was then in a position to ask such questions as:

What is the sex ratio of the children of Mazulu crones?

Note that without the capability to form definitions, this question would have had to be stated this way:

What is the number of male children of Mazulu females who were born before 1920 times 100 divided by the number of female children of Mazulu females who were born before 1920?

The very essence of intellectual activity is the creation and testing of new conceptual arrangements of basic elements. This is clearly true in the case of research activities. But it is equally true of management, where new organizational groupings, new product lines, and new accounting policies are constantly evolving as means of organizing the business for more efficient operation and more effective market penetration. Changing and extending through definition and redefinition are the language means concomitant to this process.

In terms of today's state of the art in natural language systems and our goal of a language capability natural for the user, language extension by the user is also of considerable theoretical interest. When we supply a "natural language" system to a new user, even if we build a specialized vocabulary around his own data base, he will find the result to be a new dialect with word usage somewhat stilted, with certain syntactic constructions interpreted semantically in ways that do not exactly coincide with his own usage. Our experience indicates that the new user initially engages in considerable language play, paraphrasing his questions in a variety of ways to check whether he gets the same answers and asking questions whose answers he knows to see whether the computer responds correctly. As he gains familiarity and confidence, he begins to change and extend the language in ways that make it feel more natural to him. Any language algorithmically defined is in some sense artificial. For some time to come, as we learn more about language itself, our natural language systems will initially feel artificial to the user. They will become natural for him as he makes them his own. A tool can feel natural to a user and still be quite limited in the capabilities he may desire. In the same way, the natural language systems we can now provide have many limitations, but they can still become natural tools for their users if these users can "fit them to their hand," as it were, through language extension and language change.
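The definitional capability illustrated above behaves, at bottom, like parameterized substitution over phrases: a defined pattern is rewritten into its defining expression wherever it later occurs in a query. The following toy sketch in Python illustrates only that idea; the single-parameter pattern handling and the crude argument capture are assumptions of the example, not a rendering of REL's actual mechanism.

    # Toy sketch of language extension by definition. A defined phrase such
    # as 'sex ratio of "sample"' is expanded, with its argument substituted,
    # wherever it occurs in a later query.
    DEFINITIONS = {}

    def define(pattern, body):
        # register a definition; the parameter is assumed to be spelled "sample"
        DEFINITIONS[pattern] = body

    def expand(query):
        # rewrite a query by expanding defined phrases until none remain
        changed = True
        while changed:
            changed = False
            for pattern, body in DEFINITIONS.items():
                head = pattern.split('"')[0].strip()             # e.g. 'sex ratio of'
                if head in query:
                    argument = query.split(head, 1)[1].strip(' ?')  # crude capture
                    query = body.replace('"sample"', argument)
                    changed = True
        return query

    define('sex ratio of "sample"',
           '(number of "sample" who are male)*100/'
           '(number of "sample" who are female)')

    print(expand('sex ratio of Mazulu crones?'))
    # -> (number of Mazulu crones who are male)*100/(number of Mazulu crones who are female)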
3. The Prototype REL System
In the previous section we stated that each user has to have his own idiosyncratic language package which he can extend and modify according to his needs. How can we design a system that will support these many language packages? How can facilities be provided in such a system so that new language packages may be quickly developed around the individual needs of new users? Each language package will require its own syntax and semantics, will need a parser and a semantic processor, language extension handling procedures, data structure processing utilities, and an operating system; i.e., an entire language/data processing system. How can such complete packages be factored into parts so that the main modules may be shared while the idiosyncratic aspects are isolated? What are the appropriate interfaces between these modules? In what form should the idiosyncratic parts be declared to the underlying common system? These are questions which we have answered in the implementation of REL.

The REL system is a computer system for handling languages which are natural (Dostert, 1971). By designing and implementing the REL System, we have sought a solution to the above and other questions concerned with natural language processing. The early experience we can now get through observation of how the system actually works when used by bona fide users is valuable, we believe, for the further development of natural language systems.

In this section we lay out the major architecture of the REL System. This architecture is diagrammed in Fig. 1. We will discuss separately the following six areas represented in that figure:

1. REL Language Processor
2. Base Languages
3. User Language Packages
4. Command Language and Metalanguage
5. REL Service Utilities
6. REL Operating System

FIG. 1. Architecture of the REL system.

3.1 The REL Language Processor
A detailed description of this part of the REL system, which cannot be gone into here, is found in Thompson (1974). The REL system is designed to support a wide variety of user language packages. Usually a computer language is defined by its own language processor. Thus, for example, FORTRAN exists in the computer in the form of a FORTRAN compiler. We have taken quite a different approach, namely, to provide one language processor for all REL language packages. The REL language processor is a simple syntax-directed interpreter. The notion of syntax-directed interpretation is of considerable theoretical importance in language analysis. Syntax-directed interpretation assumes that the language is defined by its rules of grammar and, corresponding to each of those, associated interpretation rules. The essence of this notion is that when a segment of a sentence is recognized as a grammatical phrase according to an appropriate rule of grammar, the meaning of that phrase can be found by calling on the corresponding interpretation rule.
In view of the importance of syntax-directed interpretation, its operation will be illustrated by an example, i.e., how it would process such an arithmetical expression as:

((3 + 4)*(6 - 5)).

Let us suppose that the system had already recognized "3", "4", "5", and "6" as numbers and that it had been supplied these syntax rules:

R1: (number) → ((number) + (number))
R2: (number) → ((number) * (number))
R3: (number) → ((number) - (number))

Let us suppose also that for each of these rules it had been supplied with an interpretive routine, namely:

with R1 routine T1 which adds two numbers
with R2 routine T2 which multiplies two numbers
with R3 routine T3 which subtracts two numbers

The REL language processor (indeed, any syntax-directed interpreter) first parses the sentence or expression, that is, it matches the syntax rules to the sentence to determine its grammatical structure. The result of the parsing for the above expression is shown in the diagram:

                    (number; T2)
        (number; T1)            (number; T3)
    (number)    (number)    (number)    (number)
    ((  3    +     4    ) * (   6    -     5   ))

With each node in this parsing diagram an interpretive routine is associated which corresponds to the syntax rule being applied. After parsing, the semantic processor uses the parsing diagram to carry out the semantic interpretation of the sentence. It does this by systematically applying the indicated interpretive routines to the arguments supplied by the nodes below, starting always with the bottom nodes and working up toward the top. In the example, it would first carry out the addition and subtraction using the interpretive routines T1 and T3, respectively. It would then apply the interpretive routine T2 to the resulting two numbers, completing the evaluation of the given expression.
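The scheme just described is easy to make concrete in miniature. The sketch below is not the REL processor (the REL parser handles general rewrite-rule grammars); it is a minimal illustration, for fully parenthesized arithmetic only, of syntax rules R1-R3 paired with interpretive routines T1-T3 and of the bottom-up application of those routines over the parse tree.

    # Minimal syntax-directed interpreter for fully parenthesized expressions.
    def T1(a, b): return a + b    # with R1: (number) -> ((number) + (number))
    def T2(a, b): return a * b    # with R2: (number) -> ((number) * (number))
    def T3(a, b): return a - b    # with R3: (number) -> ((number) - (number))

    ROUTINES = {'+': T1, '*': T2, '-': T3}

    def parse(s, i=0):
        # return (node, next index); a node is an int or (op, left, right)
        if s[i] == '(':
            left, i = parse(s, i + 1)
            op = s[i]                      # the operator selects the syntax rule
            right, i = parse(s, i + 1)
            assert s[i] == ')'
            return (op, left, right), i + 1
        return int(s[i]), i + 1            # a "number" already recognized

    def interpret(node):
        # apply the indicated interpretive routines from the bottom nodes up
        if isinstance(node, int):
            return node
        op, left, right = node
        return ROUTINES[op](interpret(left), interpret(right))

    tree, _ = parse('((3+4)*(6-5))')
    print(interpret(tree))                 # 7: T1 and T3 are applied first, then T2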
The simple conceptual scheme has been refined in the REL language processor into highly efficient algorithms. The parser is our own refinement of the Kay (1967) algorithm. It is designed to handle any general rewrite rule grammar. It also has specific mechanisms for handling syntactic features, in the sense of transformational grammar (Chomsky, 1965), and certain simple but useful transformations. These will be discussed more fully below. The parser finds all possible parsings of the input sentence. The semantic processor, although simple in conception, has some rather unique features for handling definitions and bound variables. In these regards, it resembles conceptually the Vienna definition language (Wegner, 1972). It also handles ambiguity (more on that below). By describing the language processor we have also identified how a language is to be defined, namely, it is defined as a set of grammar rules and corresponding interpretive routines.

3.2 Base Languages
As stated above, each user should have a language package built around his individual needs. However, the independent development of a new language of any complexity for each user would still be a major task. Moreover, many of the linguistic and data management aspects are likely to be shared. Thus, technical English augmented by statistical nomenclature and associated routines can be effectively used in a great number of social science and management applications. Such applications will differ from one to another in their vocabulary and their data, but the general, common part of such a family of applied languages can exist. For this purpose, we have implemented "base languages." The most prominent base language in the REL system is REL English (Dostert and Thompson, 1971, 1972, 1974). REL English contains no vocabulary other than the function words, e.g., "have," "what," "of," and such operators as "maximum." It also includes the nomenclature and processing routines of statistical analysis. Each application of REL English adds its own vocabulary and data and possibly some application-specific processing routines. Two other base languages that have been developed are the Animated Film Language (Thompson et al., 1974) for interactive graphics and the REL Simulation Language (Nicolaides, 1974) for designing, testing, and applying discrete simulation models. Other conceivable base languages are REL French (or some other natural language), a base language for applied mathematics, a base language for music composition, etc.

3.3 User Language Packages

The question dealt with here is how a typical user uses the REL system. Consider for example a social scientist who has in hand a body of field data he wants to analyze, say, economic statistics regarding world trade. He would typically make use of REL English as a base language. To do this he would create his own language package, say, under the name TRADE, by typing from his terminal the simple command:

COPY TRADE FROM REL ENGLISH

The basic vocabulary for TRADE would arise naturally from his data; for instance, it might contain the names of all countries, names of economic groupings, e.g., "European Common Market," and relation words associated with the variables in his data, e.g., "gross national product," "governmental expenditures." He could put in his data by typing simple declarative sentences, e.g.,

The population of Iran was 5000 in 1970.

This is one possibility. However, REL English, as a base language, has provisions for bulk data input from standard unit record sources. He could make direct use of these, submitting his boxes of data cards to the computing center operators. He might do this as a batch job overnight. The next morning he could begin interrogation of his data, defining new conceptual notions, etc. To do this, he would avail himself of a terminal, issue the command

ENTER TRADE

and proceed. He might ask, for example,

What is the mean and standard deviation of the populations of European countries?

He could contextually define the notion of "per capita":

def: per capita "area": "area"/population

and then ask:

What is the correlation between per capita gross national product and capital investment of European Common Market countries?

For the situation in the above example, the capabilities of REL English might suffice. For other situations, it may be that no available base language would do. Under such conditions the user would have to seek the help of an applications programmer to create a new language specifically for his needs. Aspects of this task will be discussed below. Suffice it to say here that our objective is to facilitate this task of creating such a new language so that it could be achieved in a matter of weeks, the major concern being the user's needs, both in syntax and semantics, rather than the programming problems of the language processor design and implementation.
trade. He would typically make use of REL English as a base language. To do this he would create his own language package, say, under the name TRADE, by typing from his terminal the simple command: COPY TRADE FROM R E L ENGLISH The basic vocabulary for TRADE would arise naturally from his data; for instance, it might contain the names of all countries, names of economic groupings, e.g., “European Common Market,” and relation words associated with the variables in his data, e.g., “gross national product,” “governmental expenditures.’’ He could put in his data by typing simple declarative sentences, e.g., The population of Iran was 5000 in 1970. This is one possibility. However, R E L English, as a base language, has provisions for bulk data input from standard unit record sources. He could make direct use of these, submitting his boxes of data cards to the computing center operators. He might do this as a batch job overnight. The next morning he could begin interrogation of his data, defining iicw conceptual notions, etc. To do this, he would avail himself of a terminal, issue the command E N T E R TRADE and proceed. H e might ask, for example, What is the mean and standard deviation of the populations of European countries? He could contextually define the notion of “per capita”: def : per capita “area” : “area”/population and then ask: What is the correlation between per capita gross national product and capital investment of European Common Market countries? For the situation in the above example, the capabilities of R E L English might suffice. For other situations, it may be that no available base language would do. Under such conditions the user would have to seek the help of an applications programmer to create a new language specifically for his needs. Aspects of this task will be discussed below. Suffice i t to say here that our objective is to facilitate this task of creating such a new language so that it could be achieved in a matter of weeks, the major concern being the user’s needs, both in syntax and semantics, rather than the programming problems of the language processor design and implementation.
F. B. THOMPSON AND B. H. THOMPSON
120
3.4 Command language and Metalanguage
The command language is that simple language one uses to communicate one’s needs to the system, e.g., the creation of a new language, entering or deleting an existing language, invoking protection keys, etc. The data associated with the command language is the information concerning the various base languages and user language packages in the system. It appears to the language processor just as any other user language package. The research task of implementing a new R E L language for a specific user is a technical task not too different from other research and management tasks. The REL system itself could be used to facilitate this task with a language that would be natural for language writing. The metalanguage is such an R E L base language (Szolovits, 1974). Unlike other R E L languages, it does not stand alone but in essence underlies each of the other R E L languages. It is in the metalanguage that other languages are defined and extendcd. The metalanguage includes the capability to declare new syntax rules and new data structures, and to program associated interpretive routines. Since the metalanguage knows all about the REL language processor and the operating environment, it can perform a variety of diagnostics and carry out a limited amount of optimization. The metalanguage also contains a variety of debugging facilities such as standard break point, snap shot and trace, and it also facilitates tasks directly related to the language implementation such as display of the parsing graph of a sentence. The following sequence illustrates the use of the metalanguage to examine the output of the parsing phase of language processing. Assume that one had entered a user language based upon R E L English, and asked the question “What is 3 2?” After obtaining the answer, one could switch to the metalanguage and ask for the phrase marker and thus obtain the parsing graph. This is illustrated below.
+
WHAT IS 3+2? 5 METALANGUAGE PMARKER LANGUAGE WHAT IS 3+2?
ss _ _ _
VP
- - -
- _ _ _ - - - -- - VP- - - -
- -
-
..
-
-
NU- _ _ CV N U NU ? WHATIS 3 + 2
PRACTICAL NATURAL LANGUAGE PROCESSING
121
It is the metalanguage that facilitates the design and implementation of new user language packages. With it, new languages can be brought into being quickly and efficiently and base languages can be augmented with specialized syntax and processing routines, tailoring them to the needs of particular users. 3.5 RE1 Service Utilities
A variety of service utility routines are provided to the language writer. They embody an answer to the question: Which facilities must be at the discretion of the language writer and which can be subsumed by the system? Two such services will be described to illustrate this. REL is designed to handle large data bases and the needs of a number of users, therefore it makes extensive use of disk storage through a paging subsystem. The allocation of new pages, addressing problems, management of pages in the high speed memory are all handled by this subsystem. However, loading and locking of pages in high speed memory and any conventions concerning the contents of pages are left entirely to the language writer. H e can also ascertain the number of unlocked page slots available in high speed memory. We are convinced that the timc spent in moving material into and out of high speed memory is now and will be for a considerable time into the future a primary consideration for operationally effective systems. Major efficiencies can be realized by optimization; but such optimization can only be carried out by the language writer who is cognizant of thc nature of the data structures he is manipulating. Most of this optimization will be donc a t the level of the data structure design, but it will depend on the programmer’s being aware of page boundaries and availability of paging slots, and his being able to lock pages in high speed memory. These considerations are reflected in the paging utility routines made available to him. A second set of service utilities concerns language extension. The language processing mechanisms for handling dcfinitions are built into the parser and the semantic processor. They require that the various parts of dcfinitions be placed in appropriate structures. However, the syntax for defining can vary from one uscr’s language package to another. Some user languages may have assignments similar t o those in programming languages; others, such as REL English, may have a variety of syntax forms for the creation of verbs. Therefore the updating and deleting from dictionaries and syntax tables is carried out by service utilities, thus allowing the language writer to adopt his own Conventions in these regards while allowing strict system maintenance of internal tables. These two examples-paging and definition handling-illustrate not
122
F. 6. THOMPSON AND 6. H. THOMPSON
only the nature of service utilities provided, but the general approach to the implementation of special purpose applications languages represented by the R E L System. 3.6 RE1 Operating System
The R E L Operating System provides the interfaces with the underlying operating system environments. T o round out the characterization of REL, the specifications of the environments in which it now operates will be given here. The REL systein has been implemented on an IBM370 computer. I n light of the rapid evolution of computer systems and also of our goal of obtaining early operational experience, we have sought to make the system as transportable as possib1e.l The system is now operating in the following operating system environments: MFT, MVT, TSO, CP67/CMS, VM/CMS, VSZ/TSO. The R E L System requires 120K bytes of core memory. The minimum amount of disk space is approximately lo* bytes; however, effective use of the system requires considerably more, depending on the size of data bascs involved. 4. Semantics and Data Structures
There exists a wide variety of approaches to semantics from the fields of linguistics, logic, and computer science. Our problem here ltowevcr is more limited sincc our interest is in how computers are to be programmed to handle natural languages. 4.1 The Importance of Data Structures
It secnis plausiblc to separate computer processing of language into three steps: ( I ) analyzing the input sentence structure, namely parsing; (2) processing this structure in conjunction with some internal data and thcreby developing an appropriate interpretation of the input ; and (3) preparing an appropriate response, for example, the forming of an answer or the rnovemcnt of a robot. The second of these is semantic analysis. The internal data may be conceptualized in a variety of ways-as sequcntial files, as a relational data base, as sentences of a formal language, as a conceptual or semantic net, as a group of procedures. However one may think of it, it has somc structure. T h e various words of the input scntence point, through the dictionary, into this data structure. The seAs an exainple of its transportability, wc were able to demonstratr the system on the University of Pisa computer at the 1973 Inteinationol Meding on Computational Linguistics in Pisa, Italy.
PRACTICAL NATURAL LANGUAGE PROCESSING
123
mantic processing interrelates the structure of the input sentence with the structure of the data. Data structures, a prime consideration in all aspects of computer science today, are central to natural language processing. I n natural language processing, input sentences tend to be short, interpretive routines tend to be complex, and data bases tend to be large and highly interrelated. This is in contrast to, say, FORTRAN “sentences” which comprise entire programs, interpretive routines that are very short, and collections of otherwise independent data items. Because of these basic diff erences, natural language processors need to be considerably less sensitive to the optimization of working storage and considerably more sensitive to the complexities of data structures and their organization with respect to standard storage. Thus in discussing the semantics of natural language processing, a central topic is data structures. 4.2 Are There Universal Data Structures for English?
At first glance it would seem that in the design of a language such as REL English, a data structure set could be adopted that would somehow reflect a general English usage and thus be suitable for all, or essentially all, applications. Several user language packages based upon such a general English would differ only in vocabulary, specific data, and added definitions. Theoretically this can be done; for example, n-ary relations or list structures will do. However, in the processing of large amounts of data, the inner loops of the interpretive routines which have to do with searching, merging, and sorting are performed many thousands of times. Savings of several orders of magnitude in computer time can be realized by a more judicious choice of data structures. This is particularly true because of the large differential in memory access time between main memory where processing is done and peripheral memory where data is stored.

The selection of data structures cannot be made in the abstract. For example, suppose one's data is time dependent, a supposition true of many data bases of importance. How is the time variable attached to each record to be handled in storage? We illustrate this particular problem from our REL experience. REL English is time oriented. Thus one could ask of a data base concerning family relationships and locations such questions as:

Where was Stan Smith when Jill Jones lived in Los Angeles?
How many Smiths lived in New York since June 16, 1968?

In order that time be handled, each item of data carries two times
indicating the interval of time during which the data item is assumed to hold. Thus, if Jill Jones lived in Los Angeles between May 6, 1963 and October 18, 1972, there would be this data entry in the location file:

(Jill Jones, Los Angeles, May 6, 1963, October 18, 1972)

This would be, of course, appropriately coded. But what does "appropriately coded" mean? Well, "May 6, 1963" would be translated into a single number. When talking about data structures in a concrete, implemented system, as opposed to talking about data structures in the abstract, a field size for holding this internal number form of a specific time, together with the units of this number, has to be assigned. The convention we have adopted in REL English is that the unit of time is a day and in the file structure for a relation, two time fields will each be two bytes long. The immediate implication of this is that REL English can only handle time over a 180 year period and only down to units of a day. Therefore, clearly, it could not be immediately applied to airline schedules, which require time in units of minutes but which only cover a span of a week; or, for that matter, to historic data spanning centuries.

To what degree are these conventions built into REL English? Consider only the output routines. Suppose one asks the question:

When did Jill Jones arrive in Los Angeles?

The answer should be a date, "May 6, 1963," rather than some coded number representing May 6, 1963. Yet the internal form of this data must be amenable to rapid comparisons and arithmetical operations. The translation routines from internal form to external form have to be cognizant of the calendar, e.g., information about occurrences of leap years. Phrases like "the previous 5 months" and "three weeks after February 18, 1960" all require knowledge of the calendar for translation. If the conventions concerning time storage were to be modified in some application, interpretive routines and time conversion utilities would have to be reprogrammed accordingly.

Could we have done better in our tasks? What time unit should we have picked? Perhaps minutes? Over what span? In historic time to a thousand years in the future, perhaps a span of 6000 years? That would require approximately 4 bytes per time field, multiplying space requirements by 1.5 and increasing computer time by an even larger factor because of additional paging, especially in conjunction with sort utilities. Perhaps a form of a base REL English could be designed that would allow a user to select the unit of time, and therefore the time span. But generalizations of this sort run into both a sharply rising implementation cost and a sharply falling marginal number of user-oriented applications.
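The day-count convention just described can be made concrete in a few lines. In the sketch below, the epoch is an assumption (the paper does not state which 180-year window REL English used); the 2-byte packing and the span it implies are the point.

    # Sketch of the time convention: a date is a day count packed into a
    # 2-byte field. EPOCH is a hypothetical choice for illustration only.
    from datetime import date, timedelta

    EPOCH = date(1900, 1, 1)

    def encode_day(d):
        # calendar date -> unsigned 16-bit day count (the internal form)
        n = (d - EPOCH).days
        if not 0 <= n <= 0xFFFF:           # 65,536 days is roughly 179 years
            raise ValueError("outside the span a 2-byte day count can hold")
        return n

    def decode_day(n):
        # internal form -> calendar date, as the output routines must do
        return EPOCH + timedelta(days=n)

    entry = ("Jill Jones", "Los Angeles",
             encode_day(date(1963, 5, 6)), encode_day(date(1972, 10, 18)))
    print(entry)                           # the two dates become small integers
    print(decode_day(entry[2]))            # 1963-05-06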
The limitations of REL English are certainly not only in regard to the units and spans of time. Context considerations have to go farther. Consider the following example. We have applied REL English, as it now stands, to analyze interrelationships among scientists in a given data base. For each scientist, the data gives his date of birth, educational background, the institutions and dates of his employment, and his publications. The user of this data, in the process of his investigations, wants to use the terms occurring in the titles of papers as part of his query, for example, to inquire about all authors of papers whose titles include the word "radar." The titles of papers are in the data base, but they are as literal strings in the lexicon. In this form they are unavailable to the interpretive routines of REL English, which know nothing about literal strings as a data structure. In this particular case, new syntax and associated interpretive routines that apply to literal strings and know how to locate them in the lexicon can be added to REL English. But such additions comprise language extension at the application programmer level rather than the user level.

Although it is clear that different applications have their own special requirements for interpretive routines, we still have to examine whether an REL type system can be based upon a single data base management system with its own basic data structures. This is an important question, for there are a number of major efforts and community wide coordinating committees concerned with such general systems. In particular, relational data systems are being pressed forward as being generally applicable to the organization of data. The data structures that underlie these systems are files of contiguously stored unit records, each record consisting of a sequence of numbers, which are the values of the variables that describe the class of individuals studied. Since the range of applications to be supported by a system such as REL is broad, it is obvious that relational data structures are not adequate as the only structural form for the data. Deductive techniques, such as theorem proving, require distinctly different structuring of data, and they will be more and more a part of natural language systems. The REL Animated Film Language uses quite different data structures. In that language a picture is defined in terms of a set of linear transformations on more primitive picture parts, in an iterative way. These linear transformations take the form of a 3 x 3 matrix of numbers. These, of course, can be processed as a 10-ary relation, the 10th component being a pointer to the picture part. However, the interpretive routines that involve presenting these pictures as a moving image on a graphic display require utmost efficiency, efficiency which is obtained by using storage structures and access algorithms specifically designed for this application.
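The iterative picture definition just mentioned can be illustrated briefly. The 3 x 3 matrices below are the standard homogeneous-coordinate form of 2D linear transformations; the particular picture part and transforms are invented for the illustration and are not taken from the Animated Film Language itself.

    # Pictures as 3x3 linear transformations applied to more primitive parts.
    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]

    def apply(m, point):
        x, y = point                       # homogeneous coordinates (x, y, 1)
        return (m[0][0]*x + m[0][1]*y + m[0][2],
                m[1][0]*x + m[1][1]*y + m[1][2])

    translate = [[1, 0, 5], [0, 1, 2], [0, 0, 1]]   # move a part by (5, 2)
    scale     = [[2, 0, 0], [0, 2, 0], [0, 0, 1]]   # double its size

    placement = matmul(translate, scale)   # one composite 3x3 transformation
    print(apply(placement, (1, 1)))        # a corner of the part lands at (7, 4)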
In the future, even more than in the past, there will be families of applications which will dictate their underlying data structures. Natural language systems built for such applications will have to recognize these structure requirements on data storage and data management if they are to achieve the necessary efficiency. Once this fact of computer system design is recognized, the supporting software of operating systems can be designed so as to facilitate such highly adaptive language programming on the part of applications programmers. The success of our REL work in this regard is most encouraging, as evidenced above.

4.3 Data Management for Relational Data Systems
Let us now narrow the range of applications considerably to just those which are restricted to relational data files. REL English, as an REL base language, is designed for such restricted applications. Is a single data management system sufficient in the case of those? This is a moot question. There are two classes of such applications that may well dictate separate data management utilities. These two classes of relational systems are exemplified by the United States census data, on the one hand, and the files of information on scientists, mentioned above, on the other. In the former, the number of individuals is of the order 10^6 to 10^8, and typical queries involve an appreciable number of the variables in each record considered independently. In the latter, the number of individuals is of the order 10^3 to 10^5, and typical queries involve only a few of the variables in each record, but interrelate separate records in nonsequential ways. In the former, processing can typically (though not exclusively) be done sequentially through the file; in fact it has to be done in this way, since file size is so large as to make random link following from one individual's record to another prohibitive from the point of view of processing time. Just consider querying the census data as to all persons who have a cousin living in Milwaukee! In the other case, exemplified by the data on scientists, the nature of the investigation demands random link following in processing. For example, one might wish to run a cluster analysis which would tend to group scientists on the basis of common authorship of papers, or on the basis of joint institutional affiliation. A general relational data management system could certainly handle both kinds of applications, but each requires quite distinct processing optimization that must be internal to the system.

In order to get a more concrete feel for these issues, consider the following data processing problem from our REL experience. REL English data structures are essentially classes and binary relations. Suppose that in dealing with the "scientist" data, a subclass of scientists had been formed consisting of all linguists, numbering, say, 2100. Suppose the relation of "institutional affiliation" was carried as a file, say, with 9000 entries. In the REL system, data is stored on "pages" of 2000 bytes each. REL English classes require 8 bytes per entry, a member field and two half-word time fields. Thus a class page holds approximately 250 entries; a relation page holds 150. Thus the "linguist" class uses 9 pages, and the "institutional affiliation" relation uses 60. If we want to compute the meaning of the phrase "institutional affiliation of linguists" we can do it in two ways:

a. Consider each linguist, one at a time, and find all of his institutional affiliations. In this method a page of the "linguist" class is brought into main memory, and then each "institutional affiliation" page is loaded. This is repeated for each linguist in turn, a total of over 120,000 page loadings. At roughly 60 msec per page, this would require over an hour and a half.

b. First determine how many page slots in main memory are available; say, in this case, the number is 9. Use one for the output page, one for relation pages, and the remaining 7 for class pages. Having brought in the first seven class pages, lock them in main memory and go through each "institutional affiliation" page in turn, finding the institutional affiliations of all linguists that are in the first 7 class pages. Repeat for the remaining 2 class pages. In this method, 129 pages are moved from disk memory to main memory, consuming less than 8 seconds.

Thus the ratio of computing time for method (a) relative to method (b) is 3 orders of magnitude. This makes the importance of the problem involved clear. REL English, of course, makes use of optimizing methods illustrated by method (b). These methods were developed by Greenfield (1972).

At this stage in the development of relational data base systems, two questions need clarification. (1) Are relational data base applications rather uniformly distributed between the relatively small, highly interdependent data bases and the huge data files with their completely independent records, or is the distribution of applications definitely bimodal? (2) Can internal sort, merge, and other basic file processing optimization techniques be so programmed as to meet the efficiency requirements for the whole class of relational data base applications, or must two distinct relational data base management systems necessarily evolve? Basic hardware and operating system decisions are involved, especially as far as virtual memory philosophies are concerned. Because of the large and growing number and the importance of such applications, these are significant research questions.
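The arithmetic behind the two methods is worth setting out explicitly. The sketch below merely counts simulated page loads under the figures given in the text; it is an illustration of the optimization, not the Greenfield (1972) algorithms.

    # Counting page loads for "institutional affiliation of linguists"
    # under the two methods described above (figures from the text).
    LINGUISTS, CLASS_PAGES, RELATION_PAGES = 2100, 9, 60
    MS_PER_LOAD = 60

    def method_a_loads():
        # each linguist in turn; every relation page is reloaded for each one
        return LINGUISTS * RELATION_PAGES

    def method_b_loads(slots=9):
        # one slot for output, one for the current relation page;
        # the remaining slots hold class pages locked in main memory
        class_slots = slots - 2
        loads, remaining = 0, CLASS_PAGES
        while remaining > 0:
            batch = min(class_slots, remaining)
            loads += batch + RELATION_PAGES    # lock a batch, sweep the relation file
            remaining -= batch
        return loads

    for name, loads in (("a", method_a_loads()), ("b", method_b_loads())):
        print("method (%s): %6d loads, about %.0f seconds"
              % (name, loads, loads * MS_PER_LOAD / 1000.0))
    # method (a): 126000 loads, about 7560 seconds
    # method (b):    129 loads, about 8 seconds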
5. Semantics Revisited
What is the character of the basic decisions one must make in designing the semantics of a natural language system? To answer this question, we will examine in some detail the semantics of REL English. This discussion will also be a useful foundation for the following section on deduction.

5.1 Primitive Words and Semantic Nets
A relational data base may abstractly be considered as referring to individuals, predicating certain relations among these, and assigning values to each of them for certain attributes. In putting such data into a natural language system we must have ways of: (a) introducing new words into the lexicon for referring to new individuals, classes, relations, and attributes; and (b) declaring that certain individuals are in certain classes and relationships and have certain values for attributes. Words introduced by process (a) and interlinked by process (b) will be called primitive. How nonprimitive words may be introduced by definitions will be shown shortly. First, what is provided in REL English for the user to carry out (a) and (b)?

There are four ways for the user of REL English to introduce new primitive nouns into the lexicon, which are illustrated by the five expressions:

John: = name
Mary: = name
male: = class
parent: = relation
age: = number relation

Once words such as "Mary," "male," "parent," and "age" are introduced into the lexicon, they may be interrelated by declarative sentences such as these:

John is a male.
John is a parent of Mary.
The age of John is 35.

The computer interprets the expression:

male: = class

in the following way. First, it allocates a new page in the disk memory, say, the page with disk address α. It puts certain header information at the top of this page, including an indication that it is a class, but otherwise leaving it blank. Second, it puts "male" into the dictionary with the definition: (noun phrase, α), indicating its part of speech and a pointer to its assigned page. The expression:

John: = name

results in similar actions, allocating, say, page β. The sentence:

John is a male.

results in the pointer β being written on page α, thus indicating that John is a member of the class α.

An alternate way of introducing a noun into the vocabulary is by definition. The following illustrates this:

def: father: male parent

In this case there are no new data pages assigned. Rather, "father" is put into the dictionary with the definition: (noun phrase, D) where D is an internal (compiled) form of the phrase "male parent." In this internal form, there are pointers to the data pages for the primitive words "male" and "parent" or to their definitions if they are not primitive but are in turn defined.

In natural language processing systems, there is a level of primitive data structures that correspond to primitive expressions of the language (words, phrases, or syntactic forms) which serve as the semantic atoms of the system. More complex expressions refer to complex relationships among these atoms. Data is carried in the system as linkages in one form or another between these atoms. It is the sum total of the semantic atoms of the system together with the linkages existing between them that has been given the name of the "semantic net." The following conversation illustrates how a very simple semantic net is built up in REL English:

Who are males?
eh?
male: = class
Who are males?
none
John: = name
John is a male.
Who are males?
John
F. B. THOMPSON AND B. H. THOMPSON
Bob: =name Bob is a male. Who are males? John Bob age: =number relation Bob’s age is 14. parent: =relation Who are Bob’s parents? insufficient data John is a parent of Bob. Who are Bob’s parents? John Sue: = name Sue is a parent of Bob. Who arc Bob’s parents? John Sue
It shows how the words “male,” “age,” and “parent” and the names of individuals “John,” “Bob,” and “Sue” are introduced. It also shows how data becomes part of the system. The data structure t h a t results from this conversation is illustrated by the semantic net in Fig. 2. The difference between a primitive noun and a defined noun is illustrated in Fig. 3a,h. (The two conversations in Fig. 3 a,b, are assumed to be continuations of the above conversation.) The same word, “boy” is defined in Fig. 3a, but is entered as a primitive noun in Fig. 3b. Figure 3b is disturbing. It illustratcs that the information held by the system concerning a primitive class is indeed strictly limited to that which is explicitly contained in its linkages to other primitive individuals, classes,
FIG.2. A semantic net in REL English
PRACTICAL NATURAL LANGUAGE PROCESSING def: boy: male whose age is less than 16 Who are boys? Bob Bill: =name Bill is male. BiIl’s age is 8. Who are boys? Bob Bill
boy: =class All males whose age is less than 16 are boys. Who are boys? Bob Bill: =name Bill is male. Bill’s age is 8. Who are boys? Bob
(a)
(b)
131
FIG.3. Introduction of a word by definition (a) and primitively (b) in REL English.
and relations. The issues involved here are the subject of Section 6 on deduction. 5.2 The Nature of the Interpretive Routines
The discussion of how nouns are introduced into the R E L English system has illustrated some aspects of the basic decisions that must be made in designing the semantics of a natural language system, namely, the identification of the atoms and linkages that constitute the data structures and how they are tied to the primitive words. We now turn to the character of the decisions one makes in designing the interpretive routines, the semantic counterparts of the rules of grammar. T o this end, let us consider a spccific rule of syntax from the REL English grammar: (noun phrase) -+(noun phrase)(noun phrase). Examples of phrases to which this rule is applicable are: Boston ships male parent male dogs author Scott We will speak of the two constituents as the left and the right nouns. Our task, in designing the semantics corresponding to this rule is to describe an interpretive routine which will operate on the data structures referenced by the left and the right nouns and produce a data structure which expresses the appropriate meaning of the whole phrase. But what are the data structures referenced by the left and the right nouns? As we have seen above, in REL English a noun refers either to a n individual, a class, or a relation. Thus we have the following nine cases to consider.
    Case 1: class-class ("male dogs")
    Case 2: class-individual ("biologist Jones")
    Case 3: class-relation ("male student")
    Case 4: individual-class ("Boston ships")
    Case 5: individual-individual (no example)
    Case 6: individual-relation ("Harvard students")
    Case 7: relation-class ("student employee")
    Case 8: relation-individual ("author Scott")
    Case 9: relation-relation ("student owner")
Case 1: class-class ("male dogs"). In this case the answer is very easy: the intersection of the two classes involved. "Male dogs" refers to the class of all things which are members of both the class "male" and the class "dogs."

Case 2: class-individual ("biologist Jones"). This expression might be used if there are several individuals with the same name and the noun modifier, i.e., the left noun, is to be used for disambiguation. "Biologist Jones" is that one of the people called Jones who is a member of the class of biologists. Thus the semantic referent of the class-individual phrase is the individual itself if the individual is a member of the given class. If the individual is not a member of the class, the phrase is construed as meaningless.

Case 3: class-relation ("male student"). We must distinguish an expression such as "male student of Stanford," which may be grouped male (student of Stanford) and thus reduces to case 1, from "male students will report to Room 16." It is the latter that illustrates the usage covered by this case. Clearly what is meant by "male students" is "those who are both male and students of some school." Thus we simply compute the range of the student relation, the class of things that are students, and go back to case 1.
It is easy to see that in all cases where one or the other of the constituents refers to a relation, this same procedure applies. Thus:

    Case 6 reduces to Case 4
    Case 7 reduces to Case 1
    Case 8 reduces to Case 2
    Case 9 reduces to Case 3.
Cases 4 and 5 remain. We consider them in reverse order.
Case 5: individual-individual. Constructions of this type do not exist in common English usage. Thus any accidental occurrence of this case will be construed as meaningless.
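Before taking up case 4, the dispatch logic described so far can be summarized in a short sketch. The Python below is illustrative only, not REL's implementation: classes are modeled as sets, relations as sets of pairs, and case 4 is left as a stub since its treatment follows.

    # A sketch of the noun-noun interpretive routine's dispatch under the
    # case analysis above. Classes are Python sets; relations are sets of
    # (x, y) pairs.

    def range_of(rel):
        # The class of things standing in the relation, e.g., the class
        # of things that are students (the "range" in the text above).
        return {x for (x, y) in rel}

    def interpret(left, right):
        lk, lv = left       # (kind, value); kind is "class",
        rk, rv = right      # "individual", or "relation"
        if lk == "relation":                        # cases 7, 8, 9
            return interpret(("class", range_of(lv)), right)
        if rk == "relation":                        # cases 3 and 6
            return interpret(left, ("class", range_of(rv)))
        if lk == "class" and rk == "class":         # case 1: intersection
            return lv & rv
        if lk == "class" and rk == "individual":    # case 2: disambiguation
            return rv if rv in lv else None         # None marks "meaningless"
        if lk == "individual" and rk == "class":    # case 4: see below
            raise NotImplementedError("intervening relations; see case 4")
        return None                                 # case 5: meaningless

    male = {"Rex", "Spot"}
    student_of = {("Rex", "Obedience School")}      # (student, school) pairs
    print(interpret(("class", male), ("relation", student_of)))
    # case 3 reduces to case 1 -> {'Rex'}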
Case 4: individual-class ("Boston ships"). The phrase "Boston ships," on the face of it, appears to have a clear meaning: the subclass of ships that are in Boston. Surely that is its meaning in the sentence:

    What Boston ships will leave for New York tonight?

However, consider the sentences:

    What Boston ships will leave London for their homeport tonight?
    Ships made in Brooklyn are fast but Boston ships last longer.

On the basis of these examples we define the notion of an intervening relation: R is an intervening relation between an individual and a class if there are members of that class which are related by R to the given individual. In the above three sentences, "Boston ships" is being interpreted in terms of three different intervening relations, namely "location," "homeport," and "place of manufacture." We will subdivide case 4 into three subcases.

Case 4a: the individual and the class have no intervening relations. Under these circumstances, the phrase will be construed as meaningless.

Case 4b: the individual and the class have exactly one intervening relation R. Then the meaning of the phrase is the class of those elements of the given class related by R to the given individual.

Case 4c: the individual and the class have several intervening relations. Then the phrase is ambiguous, its several meanings corresponding as in case 4b to the several intervening relations. Possible redundant subclasses which have the same members even though arising from different intervening relations are suppressed.

The above discussion of the (noun phrase)(noun phrase) rule identifies the semantic analyses performed by the corresponding interpretive routines in REL English.

5.3 The Unlimited Complexity of Data Structures
The basic role of data structures in the semantics of natural language processing systems is apparent from the above discussion. REL English is limited by the choice of individuals, classes, and binary relations as its basic data structures. It gains a significantly greater capability to reflect our normal usage of English by including with each entry in a class or relation two time fields, that is to say, fields that are specifically interpreted as time and thus structurally identified in semantic processing. This addition makes it possible to give semantic responses to all of the time-related syntax of English, from adverbs of time to the tense of verbs. REL English could be further augmented with a data
structure for literal strings and semantic routines that would manipulate them. Then it could deal with such phrases as:

    city whose name is Boston
    papers whose titles contain the word radar.

Another class of English expressions which we handle are expressions which arise from statistics, for example:

    correlation between age and income of employees.

Here, the correlation calculation matches the age of an employee with the income of that employee. This kind of matching is required for the proper interpretation of a number of expressions. For these purposes, REL English, internal to its interpretive routines, uses a data structure we call a labeled number class. Thus each of the phrases:

    age of employees
    income of employees

gives rise, internally, to a class of numbers, the members of which are "labeled" with the corresponding employee. The correlation calculation matches on this employee label. Without the addition of such structural means, the processing of many phrases would be intolerably slow.

Surely there will be applications of natural language systems to domains of greater conceptual complexity than can be efficiently represented by individuals, classes, and binary relations, even with the additional structures built into REL English. If such systems are to respond effectively in their semantics to the subtle clues of English syntax that alert our minds to these conceptual complexities, astute design of complex data structures will be the key. A theoretical elaboration of these ideas is found in Thompson (1966).

The human mind appears to have at its disposal memory structures of arbitrary complexity. From all of these it chooses those which best give meaning to experience. At any given instant of time, we use our current cognitive structures, including the linguistic structures we have perceived in our speech community, to frame our actions and our verbal responses. As the flow of moment-to-moment experiences carries us along, it is imperative that these structures which we have imposed change so that we adapt to a changing world. This restructuring is not just an extension and reinforcement of our old structures. It is indeed the ability to form new conceptualizations which grasp more cogently what is significant that we revere as the quality of an able mind.

In a language for the computer in which the primitive data structures
are fixed, there can be no transmutation to more significant forms. It is at the time when the application programmer selects those data structures that can best sustain his user's domain that man's ingenuity guarantees an effective language processing system.
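Returning to the labeled number class mentioned above, the following Python sketch illustrates, under our own assumed representation (a dictionary from label to number), how a correlation calculation matches values on their shared labels. It is an illustration of the idea, not REL's internal structure.

    # "age of employees" and "income of employees" each become a class of
    # numbers labeled by employee; correlation matches on the label.

    def labeled_class(relation):
        """Turn a relation, e.g., (employee, age) pairs, into {label: number}."""
        return dict(relation)

    def correlation(xs, ys):
        """Pearson correlation, matching values on their shared labels."""
        labels = xs.keys() & ys.keys()
        n = len(labels)
        a = [xs[l] for l in labels]
        b = [ys[l] for l in labels]
        ma, mb = sum(a) / n, sum(b) / n
        cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        va = sum((x - ma) ** 2 for x in a) ** 0.5
        vb = sum((y - mb) ** 2 for y in b) ** 0.5
        return cov / (va * vb)

    age = labeled_class([("Smith", 34), ("Jones", 51), ("Wu", 45)])
    income = labeled_class([("Smith", 9000), ("Jones", 16000), ("Wu", 12000)])
    print(correlation(age, income))   # close to 1 for this invented data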
6. Deduction and Related Issues
In this section we examine the extent to which inference-making capability can be usefully incorporated into practical natural language processing systems.

6.1 Extension and Intension
Semantic theory makes an important distinction between extensional meaning and intensional meaning. Synonymous pairs of terms are "denotation" and "connotation," and "referential" and "intensional" meaning. An expression in a language usually has both an extensional and an intensional meaning. Its extensional meaning is the object, relationship, action, or the like to which the expression refers. Its intensional meaning is the conceptual framework that relates this expression with other expressions of the language in a way that reflects the conditions under which it is used. Thus the extension of the word "city" is the class of all cities, of which Boston is a particular member. The intension of the word "city" includes the facts that cities are geographic locations and that they are relatively dense and self-contained accumulations of domestic, commercial, and industrial human enterprises.

This distinction of extension and intension is also useful when considering natural language processing systems. However, the similarity between general semantic theory and computer language semantics is not as direct as some would have it to be. In the first place, a computer must find the meaning of a phrase in its own data structures. Consider the meanings of the word "city." We could say that "city" has an extensional meaning for the computer if it interprets the string CITY as referring to a file in the data base that consists of the internalized form of such strings as BOSTON, NEW YORK, etc. "City" would have an intensional meaning if the interpretation of the string CITY referred to a node in a complex list structure which would link it to nodes associated with the words "human," "geographical entity," etc., or if it referred to the internalized form of sentences affirming general characteristics of cities. But this distinction is by no means clear. In the "extensional" structure, we also find links to the population relation files,
etc.; the extensional files may have the same data structure as the intensional structures. As a matter of fact, the "semantic nets" of Quillian (1969) and the "conceptual nets" of Schank (1973), both of which are rightfully considered "intensional systems," have essentially the same linked structures as the ring structures of Sutherland's (1963) "Sketchpad" and earlier versions of REL English, both of which are "extensional systems."

An apparent difference between systems with "extensional" semantics and those with "intensional" or "conceptual" semantics is illustrated by looking at how these systems would answer such a question as:

    Does any man have two wives?

An "extensional" system would process the file of "man" against the file for the "wife" relation and answer "yes" if some man was an argument for two entries in the "wife" file. Note that the meaning of "man" is construed as "entries actually existing in the man-file at the time of query." An "intensional" system would check whether the concept "wife" had the property of being a function on the class "man." It might do this by checking linkages and labels that could be found between nodes of a conceptual net, or by evoking a theorem-proving program, using sentences stored in some structural form as the data, to attempt to prove as a theorem the internalized form of the sentence: some man has at least two wives. If this sentence were shown to be a contradiction, the result would presumably apply to all men, whether to be found in the data base or not.

The distinction between "extensional" and "intensional" systems that is apparent in the example above is certainly a valid one when considering the system as a black box functioning in a specific application. However, it is a difficult one to characterize in terms of system operation. Extreme forms of each can be recognized; a system using theorem-proving techniques where the universe of discourse may be considered arbitrarily large is clearly intensional. In general, though, the distinction has little significance for how the system works internally. Perhaps that is how it should be, for in semantic theory "extensional meaning" and "intensional meaning" were never supposed to be mutually exclusive; rather, they are just two ways of looking at meaning. The notions of conceptual nets and of cognitive as opposed to syntactic systems reflect more on the orientation of the researcher than on the internal operations of the system he may develop. The block manipulating system of Winograd (1972) has a strictly extensional relationship with its universe of discourse, while externally having many features with an intensional feel. REL English, a highly extensional system, allows the user to impose and use an intensional structure through definitions.
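The extensional half of the comparison is easy to make concrete. The sketch below, in Python and with invented data, mimics the file scan described above; nothing here reflects REL's actual file layout.

    # A purely "extensional" answer to "Does any man have two wives?":
    # scan the wife-relation file against the man file. Only entries
    # actually present at the time of query count.

    man_file = {"Al", "Bo", "Cy"}
    wife_rel = [("Al", "Di"), ("Al", "Eve"), ("Bo", "Flo")]  # (husband, wife)

    def has_two_wives(men, wives):
        counts = {}
        for husband, _ in wives:
            if husband in men:
                counts[husband] = counts.get(husband, 0) + 1
        return any(c >= 2 for c in counts.values())

    print("yes" if has_two_wives(man_file, wife_rel) else "no")   # yes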
Deduction, the main topic of this section, is closely related to intension. Deduction is the process by which one reasons on the basis of known facts to conclusions which follow logically from them. This is in contrast to checking all possible instances to see if the conclusions hold whenever the statement of facts applies. In a formal logical system, deduction can be given a precise definition. However, in practice, when dealing with natural language processing systems, deduction can more usefully be construed as making use of intensional as well as extensional meaning.

6.2 The Incorporation of Intensional Meaning
Intensional meaning can enter into semantic processing in a variety of ways, and systems can vary all the way from purely extensional to intensional systems based upon theorem-proving algorithms that do not use extensional information at all. From the practical point of view, the problem with systems that use intensional information (deduction, if you will) is that computing time rises inordinately, indeed to the point where there is at this time no possibility of applying them to real life problems. We are confident that purely extensional natural language systems, such as REL English, can be effectively applied in the near future. We are also confident that systems incorporating general deductive capabilities are at least a decade away. What needs to be examined here is how far and in what directions we can now move to incorporate intensional information in semantic processing.

To this end we will first examine in more detail the limitations of purely extensional systems, calling on REL English for illustration. In a purely extensional system, the primitive words of the language are totally independent of one another as far as the internal semantics of the system is concerned. This is illustrated by the words "boy" and "male" in Fig. 3b. After introducing both "boy" and "male" as primitive classes, nothing prevents one from adding the statements:
    Tom: = name
    Tom is a male.
    Tom's age is 12.

without adding, possibly by oversight:

    Tom is a boy.
If one were then to ask "Is Tom a boy?" one would get the answer "No." There is no way the system can make
use of the intensional information embodied in the statement "All males whose age is less than 16 are boys." Adding definitional power may provide a limited intensional capability. For example, if we define "boy" as in Fig. 3a, then REL English would respond correctly to the questions:
    Is Tom a boy?
    Are boys male?

Internally, to answer the latter question, it would first construct the class of boys. It would do this by going through the class of males and picking
out those whose age is less than 16. It would then check to see whether each member of this constructed class of boys is also a member of the male class. Clearly the system should instead have answered the question by making use of the obvious intensional meaning contained in the definition, namely that "boy" is a subclass of "male."

Nevertheless, definitions can add significant deductive power to a natural language system. In a simple application of REL English to family relationships, one can quickly define the usual relationships in terms of the single primitive relation "parent" and the primitive classes of "male" and "female":

    def: child: converse of parent
    def: sibling: child of parent but not identity
    def: sister: female sibling
    def: aunt: sister of parent

Information about a person's aunts can then be "deduced" even though only data about parents is included in the data base.

Structural means can be added that will incorporate an essentially greater step. Suppose, for example, that we supply the additional lexical statement, here applied to the word "location":

    location: = transitive relation.

The result would differ from a simple relation only in the setting of a flag in the data structure. However, the following deduction would be possible:

    John's location is room 514.
    Room 514's location is T building.
    T building's location is ABC Company.
    ABC Company's location is New York.
    Is John in New York?
    Yes.
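The chain-following this flag enables can be sketched as follows. The Python below is our illustration, not the REL mechanism, and it assumes for simplicity that each entity has at most one recorded location.

    # Deduction over a transitive relation: follow the chain of location
    # links until the target is reached or the chain runs out.

    location = {"John": "room 514", "room 514": "T building",
                "T building": "ABC Company", "ABC Company": "New York"}

    def located_in(x, target, rel):
        seen = set()
        while x in rel and x not in seen:     # guard against cycles
            seen.add(x)
            x = rel[x]
            if x == target:
                return True
        return False

    print(located_in("John", "New York", location))   # True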
John is construed to be in New York if John's location is New York, or if the location of John's location is New York, etc.

Techniques of this kind, some considerably more complex, provide the means of handling many aspects of ordinary English usage. When one is interested in a particular domain of application, the interpretive routines of the language can reflect a great deal of the intensional knowledge of the user. Under these conditions, very powerful deductive capabilities can be built into the system itself. One can imagine a system for handling airline reservations where one could ask for a route from A to B, and the computer, in responding, would first look for direct flights and, failing that, seek one-stop routes, two-stop routes, etc., maintaining a check to prevent cycles, ensure adequate transfer times, and so on. The Navy has a computer program that computes the length of the shortest sea route between two points on the seas; this could be incorporated in the interpretive routine corresponding to a grammar rule that recognizes the phrase "sea distance." In the process of determining the sea distance from New York to Los Angeles, it would have to deduce that a ship would need to go through the Panama Canal.

Sophisticated and subtle use of procedures of this sort by Winograd (1972) has given his system an impressive capability to handle intensional expressions of natural language concerning block structures. Woods (see Woods et al., 1972) has incorporated a good deal of intensional information concerning geology and chemistry into the procedures underlying the vocabulary and syntax of his Lunar Rocks query language. This, together with the general excellence of his system, makes it the most fluent of the natural language processing systems in operation today.

Providing the user with the capability to form definitions, and using interpretive routines that incorporate intensional information concerning a specific universe of discourse, can result in long response times to some questions. However, in the case of these two methods of using intensional meanings, the user himself is aware that his query will entail lengthy computations and thus is more willing to tolerate a long wait for his answer. This is our experience with the users of the REL System. When a definition is put into the system which obviously entails processing of an appreciable portion of the data base, or if a statistical analysis is invoked which abstracts a great deal of detailed data, the aware user is prepared for a delay in response. In fact, it has been suggested to us by our users that we incorporate the ability to "peel off" a question whose response is going to take some time and thus free the terminal, so that they can investigate in detail the results of the previous "long" query while waiting for the response.
6.3 More Extensive Intensional Processing
We now turn to the consideration of deductive methods that require computing times which, at the current state of the art, are too long to permit their employment in practical natural language processing systems. The paradigm, of course, is theorem-proving techniques. However, abstractly similar problems are encountered in much simpler intensional contexts. We will first illustrate the problems involved in such a simple context.

Let us return to the consideration of phrases such as "Boston ships." Recall that we defined R to be an intervening relation between "Boston" and "ships" if there is some ship that is related by R to Boston. Now consider the following two possibilities:

a. A phrase of the form (individual noun)-(class noun) is meaningful if there is a primitive intervening relation between them.

b. A phrase of the form (individual noun)-(class noun) is meaningful if there is some intervening relation between them. Further, its meaning derives from the simplest such relation, i.e., the relation involving the fewest primitive relations.
For example, suppose that the Maru is a ship that is owned by Jones and that Jones lives in Boston. Then the Maru is related to Boston by the intervening relation "home of owner." If there are ships located in Boston, owned by the city of Boston, or with homeport Boston, that is to say, related to Boston by some direct, primitive relation, the "home of owner" relation will be ignored by (a). But if no primitive relation exists between any ship and Boston, rather than immediately construing "Boston ships" as meaningless, the computer would look further for some assignable meaning by (b).

In REL English we restrict ourselves to (a), that is, we consider only primitive intervening relations. A worrisome consequence is that the meaningfulness of a phrase depends on the initial selection of primitive relations for the language. For example, suppose someone wishes to investigate the relationships between students and courses at some university. Suppose that included in his data is the instructor of each course and the department of each instructor. In putting his data into the REL system, using REL English, it would be natural to use the words "instructor" and "department" to refer to primitive relations, for these correspond directly to relationships in the raw data. Now consider the phrase "mathematics course." Using (b) above, a mathematics course would be a course taught by someone in the mathematics department. Using (a) above, "mathematics course" would be meaningless unless further defined.
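The difference between interpretations (a) and (b) can be made concrete. The following Python sketch is our own illustration, with invented data; it models relations as sets of pairs and shows why (b) amounts to a breadth-first search whose fan-out grows with chain length, which is the cost problem discussed next.

    def intervening_primitive(ships, boston, relations):
        """Method (a): primitive relations R with some ship R-related to Boston."""
        return [name for name, pairs in relations.items()
                if any((s, boston) in pairs for s in ships)]

    def intervening_chain(ships, boston, relations, max_len=3):
        """Method (b): breadth-first search for the shortest relation chain."""
        frontier = {boston}
        for depth in range(1, max_len + 1):
            # Everything related to the frontier by one more primitive step.
            step = {x for pairs in relations.values()
                      for (x, y) in pairs if y in frontier}
            if step & ships:
                return depth          # length of the shortest chain found
            frontier = step           # fan-out grows rapidly with depth
        return None                   # construed as meaningless

    relations = {"owner": {("Maru", "Jones")},
                 "lives in": {("Jones", "Boston")}}
    ships = {"Maru"}
    print(intervening_primitive(ships, "Boston", relations))   # [] -- none
    print(intervening_chain(ships, "Boston", relations))       # 2 ("home of owner")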
Now there are arguments on both sides, for if the computer has the ability to look far afield, it may find meanings quite unintended by the user. Since the user can define such notions, e.g.,

    def: "mathematics" course: course that is taught by a "mathematics" instructor

we have accepted in REL English interpretation (a). This choice was greatly influenced by the fact that interpretation (b) incurs unacceptable computing time. Suppose there is no primitive relation between some ship and Boston. How should we proceed to look for a two-step relation? We could construct the class of all those things that are related by some primitive relation to Boston and then examine each of these to see if some ship is related to it by some primitive relation. The number of relation pages that would have to be brought from disk to main memory would be enormous. And if no two-step relation were found, the computing time would escalate exponentially.

It is this characteristic of the more profound deductive methods that presents the primary problem in their incorporation in practical natural language processing. In each of these methods one is trying to find the simplest relationship between two entities. In trying to find such a pathway between them, one is faced at each step along the way with too many pathways to explore further. Research into deductive processes generally takes the form of finding means to select more discriminately the pathways one should follow. The work on semantic nets and conceptual processing, especially the work of Quillian (1969), explores in depth problems which, though more sophisticated, are closely similar to method (b) above for finding the meaning of an (individual)-(class) phrase.

Most theorem-proving techniques are based upon the resolution principle of Robinson (1968) and Green and Raphael (1968). One adds to the set of axioms of the system the negation of the theorem to be proved and seeks to derive a contradiction from this expanded set. Along the way, one generates a larger and larger family of statements that can be derived from this expanded set. Insightful methods have been developed to control the growing size of this family. But in practical applications where the set of axioms, or meaning postulates, is large, this intermediate family of statements becomes enormous, far too large to be handled in any reasonable computing time. We have heard it estimated by competent workers in the field that it will take an improvement of an order of magnitude in technique and two orders of magnitude in the speed of computer hardware to bring to
a practical level deductive techniques of reasonable sophistication; furthermore, that this will take a decade to accomplish. We substantially agree with this estimate.

6.4 Inductive Inference
There is another class of problems related to deduction which involve decisions concerned with intensional meaning, namely, the problems of inductive inference. By an inductive inference we mean arriving at a conclusion on the basis of insufficient evidence. Inferences of this kind are quite necessary if our natural language processing systems are going to be useful. Inductive inference occurs in REL English in connection with the interpretation of time-oriented data. Consider the following conversation:

    John arrived in Boston on December 6, 1965.
    John departed Boston on May 18, 1970.
    Where was John in June 1968?
    Boston
    John was in New York on July 16, 1967.
    Where was John in June 1968?
    Insufficient data
It is assumed, of course, that no other information concerning John's location is in the data. After the first two statements, the system infers that John's location is Boston throughout the interval from December 6, 1965 to May 18, 1970. When other information is added, namely, that he was in New York on July 16, 1967, this inference is broken. In general, if an individual A is related by a relation R to an individual B at two times t1 and t2, and A is related by R to no other individual during the interval between t1 and t2, then it is inferred that A's relation to B holds throughout the interval.

Certainly, building inferences of this kind into the system is a worrisome thing. However, in practice, a user puts almost all of his data into his language package before he starts his detailed investigation, and he quite naturally establishes the intervals in which relations are to hold, e.g.,

    The location of John was Boston from December 6, 1965 to May 18, 1970.

On the other hand, if this inference concerning time had not been built into the system, the usefulness of the system would be greatly reduced.
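The interval inference just stated can be sketched directly. The Python below is illustrative only, with dates simplified to decimal years; it is not the REL implementation.

    # If A stands in relation R to B at times t1 and t2, and to no other
    # individual in between, infer that R holds throughout the interval.

    def infer_at(entries, query_time):
        """entries: (time, place) facts for one person; R = location."""
        entries = sorted(entries)
        for (t1, p1), (t2, p2) in zip(entries, entries[1:]):
            if t1 <= query_time <= t2:
                return p1 if p1 == p2 else None   # None: insufficient data
        return None                               # outside all known intervals

    facts = [(1965.93, "Boston"), (1970.38, "Boston")]
    print(infer_at(facts, 1968.5))                 # Boston
    facts.append((1967.54, "New York"))            # breaks the inference
    print(infer_at(facts, 1968.5))                 # None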
For example, suppose the system is applied to the personnel files of a typical business, and the general manager wants to know how many engineers are employed in each of his facilities. Suppose the data shows that engineer Smith was assigned to facility ABC the previous June. The manager would find the system useless if it answers his query:

    How many engineers are assigned to each facility?

with:

    Insufficient data.

Inductive inferences of a variety of sorts will be required in most applications of natural language processing systems. With regard to such general aspects as time, they can be built into the major base languages such as REL English. However, in narrower matters, they must be added with care and a knowledgeable appreciation for the application at hand. Such problems and methods are well known in the field of artificial intelligence, where they are referred to as heuristics. Experience from that field, especially with the heuristics involved in making complex decisions, will be useful in guiding the incorporation of inductive inferences in applied natural language processing systems.
7. English for the Computer
So far we have stressed the idiosyncrasy of natural languages, their dependence on context, and the great variety in their function as tools for communicating with the computer. However, there does exist a common body of language mechanisms, a vocabulary of function words such as "and," "of," and "who," and a richly extended family of syntactic forms, which we share as a ubiquitous part of our culture and refer to as English. To what extent can we build natural languages for computers on this English?

This is not a matter of all or nothing. At one extreme is the use of a few English words in a programming language like COBOL. At the other is the ability to handle highly elliptical constructions, indirect discourse, conditionals, and other subtle and complex forms of colloquial language. Somewhere along this continuum there is a threshold beyond which one would say "this is natural English." A few systems are now beyond, though not far beyond, this threshold. More will be said about this question in Section 8.2, Fluency and Language Learning. In the present section, we ask: What are the general techniques that have been used by the systems incorporating English?
In the first place, these systems handle most of the normal constructions found in technical English: complex noun phrases, subordinate and relative clauses, verbs and auxiliaries, tense, conjunctions, passives, question and negative forms. Here are some examples of sentences which have been processed by the REL System:

    Were IBM's sales greater than the average sales of electronics companies in 1965?
    The per capita gross national product of which South American nations exceeded 50 in the last three years?
    What was the average number of employees of companies whose gross profits exceeded 1000 in 1970 and 1971?
    How many students who take biology take each mathematics course?
    Which language courses were taken by more than 10 students?

7.1 Features
Some computational linguistic techniques have been useful in implementing English syntax in natural language processing systems. One commonly used technique is features. In a grammar for English, one categorizes words and expressions into various parts of speech, e.g., noun phrase, verb phrase, conjunction. In writing computational grammar rules that express the structure of English constructions, one needs to make more refined distinctions. How, for example, are we to allow "the boy" but exclude "the the boy," and indeed properly reflect the role of determiners like "the" in such phrases as "the big boy" but not "big the boy"? For these purposes we need to distinguish determiner-modified noun phrases from noun phrases not so modified. The role of features is to subcategorize parts of speech. Thus a word or phrase may have the part of speech "noun phrase" and the features "plural," "nominative," and "determiner modified."

Features in REL English are binary, that is, each feature may be plus or minus (on or off). Thus the plural feature (PLF) is on for plural noun phrases (+PLF) and off for singular noun phrases (-PLF). The following rule of grammar:

    (noun phrase)1+PLF → (noun phrase)-PLF s

allows the plural "s" to go on a singular noun phrase. The "1" means that the features of the first constituent of the right-hand side are also carried over and assigned to the resulting left-hand side. The determiner (DTF) and quantifier (QNF) features are set by such rules as:
    (noun phrase)1+DTF → the (noun phrase)-DTF-QNF
    (noun phrase)1+QNF-PLF → some (noun phrase)-DTF-QNF
    (noun phrase)1+QNF → all (noun phrase)+PLF-QNF
    (noun phrase)1+QNF → all of (noun phrase)+PLF+DTF-QNF
accounting for such phrases as:

    some boy
    some boys
    all boys
    all of the boys

but excluding:

    the all boys
    all boy
    all of boys

The primary role of features in natural language processing systems is the ordering of the hierarchical organization of syntactic constituents with the aim of controlling syntactic ambiguity. Noun phrases, for example, are hierarchically structured, i.e., some constituents serve as modifiers of others. Some phrases are genuinely ambiguous; for instance, "Jane's children's books" can mean either "the books of Jane's children" or "a children's book which is in some relation to Jane, e.g., owned or authored by her." But in computational analysis, phrases which are normally unambiguous also turn out to have alternate analyses: "wealthy benefactor's statue" may parse as "wealthy (benefactor's statue)" or as "(wealthy benefactor)'s statue." Similarly for "crowded New York's subways."
To illustrate how features are employed in such cases, we use the adjectival (APF), possessive (POF), and possessive modified (PSF) features and the rules:

    (noun phrase)1+POF → (noun phrase)-PLF-POF 's
        example: New York's
    (noun phrase)2+APF → (noun phrase)-APF-DTF-QNF-POF-PSF (noun phrase)-DTF-POF-QNF-PSF
        example: wealthy benefactor
    (noun phrase)2+PSF → (noun phrase)-DTF+POF (noun phrase)-DTF-QNF-POF-PSF
        example: New York's subways
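The feature checking these rules rely on reduces to subset tests, as the following Python sketch illustrates; the representation of phrases and rules is our own, not REL's.

    # A rule applies only when each constituent's required feature signs
    # are met: required-on features present, required-off features absent.

    def applies(rule, phrases):
        """rule: list of (required_on, required_off) sets per constituent."""
        return all(on <= p["features"] and not (off & p["features"])
                   for (on, off), p in zip(rule, phrases))

    # (noun phrase)2+PSF -> (noun phrase)-DTF+POF (noun phrase)-DTF-QNF-POF-PSF
    psf_rule = [({"POF"}, {"DTF"}),
                (set(), {"DTF", "QNF", "POF", "PSF"})]

    new_yorks = {"pos": "noun phrase", "features": {"POF"}}   # "New York's"
    subways   = {"pos": "noun phrase", "features": {"PLF"}}   # "subways"
    print(applies(psf_rule, [new_yorks, subways]))            # True

    the_city  = {"pos": "noun phrase", "features": {"DTF"}}   # "the city"
    print(applies(psf_rule, [new_yorks, the_city]))           # False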
The three rules above allow, for example, the following phrases:

    (wealthy benefactor)'s statue
    (crowded New York)'s subways
    (John's son)'s teacher
    good (old uncle)
    John's (old uncle)

and in each case disallow the alternative grouping. In the case of "good old uncle" we are not dealing with a genuine disambiguation, for either grouping would result in the same semantic value. Excluding one or the other grouping has the sole function of preventing multiple parsings. "John's old uncle" represents a construction where exclusion of one grouping, i.e., "(John's old) uncle," seems in line with good English usage. In these last two examples we see a valid and important use of features to exercise control of syntactic ambiguity.

One is tempted to go farther, as indeed we have. The groupings:

    John's (son's teacher)
    wealthy (benefactor's statue)

are not allowed. As a result, we do not recognize the ambiguity of:

    Jane's children's book
    stout major's wife.

At some points we have chosen to exclude certain ambiguous forms, even at some loss in fluency, on the grounds of computational efficiency. We are not sure we are correct in these decisions. Experience with actual users and experimentation with alternate decisions will be invaluable for the improvement of REL English.

7.2 Case Grammars
Another technique that is used in natural language processing systems is the application of case grammars, following the linguistic work of Fillmore (1968). The essential ideas can be grasped from the following illustration. Consider the sentences:

    John gave Mary the book.
    John gave the book to Mary.
    Mary was given the book by John.
    The book was given to Mary by John.
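All four sentences share one underlying structure. The following Python sketch, using Fillmore-style case labels of our own choosing, illustrates how a single case frame can underlie the several surface forms; the mapping rules are purely illustrative.

    # One case frame for "give"; surface forms are generated from it.
    frame = {"verb": "give",
             "agentive": "John",      # the giver
             "dative": "Mary",        # the recipient
             "objective": "the book"}

    def active_indirect(f):
        return f"{f['agentive']} gave {f['dative']} {f['objective']}."

    def passive(f):
        return (f"{f['objective'].capitalize()} was given to "
                f"{f['dative']} by {f['agentive']}.")

    print(active_indirect(frame))   # John gave Mary the book.
    print(passive(frame))           # The book was given to Mary by John.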